Understanding neural networks for better collaboration with AI
Applications of AI tools are very broad, ranging from image generation to book summarization. Despite these successes, our understanding of the generalization capabilities of technologies such as neural networks is still incomplete. Often described as black boxes, the mathematical models behind neural networks must be understood in order to judge the validity of the results AI provides. Thanks to Damien Teney and his colleagues, that understanding may be a step closer. They have published an article offering a new perspective on why neural networks perform so well across different tasks.
Challenging common assumptions
Among deep learning technologies, neural networks are the most successful across a variety of tasks. While they can be tailored to specific domains, such as image recognition, even relatively simple, general-purpose neural networks can be remarkably effective on specialized tasks. This adaptability is surprising given that these algorithms “learn” from a finite set of examples. Until now, the scientific community has mainly sought the explanation for these generalization capabilities in how neural networks are trained and in how they are built from the start.
“To study generalization capabilities, we compared different models and how they performed while adapting various parameters. The aim was to see how these properties at initialization correlate with the performance of trained networks,” Damien Teney, head of Idiap’s Machine Learning research group, explains. Among those properties, parameters called weights and biases are fundamental: they adjust the mathematical functions the network computes so that it processes information more effectively. Training a neural network refines these weights and biases.
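To make the notion of weights and biases concrete, here is a minimal, self-contained sketch in Python with NumPy. The tiny two-layer network, its dimensions, and the single training step are illustrative assumptions, not the authors' setup; the sketch only shows how a network's output is determined by its weights and biases, and how one step of gradient descent nudges those parameters towards a target.

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny two-layer network: f(x) = W2 @ relu(W1 @ x + b1) + b2.
# The weights (W1, W2) and biases (b1, b2) are the parameters that
# training adjusts; at initialization they are simply random numbers.
W1, b1 = rng.normal(size=(16, 1)), np.zeros(16)
W2, b2 = rng.normal(size=(1, 16)), np.zeros(1)

def forward(x):
    h = np.maximum(W1 @ x + b1, 0.0)   # hidden activations (ReLU)
    return W2 @ h + b2                 # network output

# One step of gradient descent on a single example (x, y): the gradients
# indicate how to nudge every weight and bias so the prediction moves
# closer to the target.
x, y, lr = np.array([0.5]), np.array([1.0]), 0.1
h = np.maximum(W1 @ x + b1, 0.0)
pred = W2 @ h + b2
err = pred - y                         # derivative of 0.5 * (pred - y)^2
grad_W2, grad_b2 = np.outer(err, h), err
grad_pre = (W2.T @ err) * (h > 0)      # backpropagate through the ReLU
grad_W1, grad_b1 = np.outer(grad_pre, x), grad_pre

W2 -= lr * grad_W2; b2 -= lr * grad_b2
W1 -= lr * grad_W1; b1 -= lr * grad_b1
print("prediction before:", pred, "after one step:", forward(x))
```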
Contrary to what many researchers believed, the results obtained by Teney and his colleagues suggest that neural networks are not universally biased towards simpler mathematical functions. Instead, the results suggest that networks inherit their biases from the properties of their building blocks.
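The idea that a network's building blocks already shape the functions it computes can be illustrated before any training takes place. The sketch below is a toy illustration, not the paper's experimental protocol: it evaluates randomly initialized networks built from different activation functions on a 1D grid and uses the average frequency of the output signal as a rough proxy for the complexity of the function each network represents.

```python
import numpy as np

rng = np.random.default_rng(0)
xs = np.linspace(-1.0, 1.0, 512)

def random_network_output(activation, width=64, depth=4):
    """Evaluate one randomly initialized MLP on a 1D grid (no training)."""
    h, fan_in = xs[:, None], 1
    for _ in range(depth):
        W = rng.normal(scale=1.0 / np.sqrt(fan_in), size=(fan_in, width))
        b = rng.normal(scale=0.1, size=width)
        h, fan_in = activation(h @ W + b), width
    W = rng.normal(scale=1.0 / np.sqrt(fan_in), size=(fan_in, 1))
    return (h @ W).ravel()

def mean_frequency(y):
    """Crude complexity proxy: the average frequency of the output signal."""
    spectrum = np.abs(np.fft.rfft(y - y.mean()))
    freqs = np.arange(spectrum.size)
    return (freqs * spectrum).sum() / spectrum.sum()

activations = {
    "relu": lambda z: np.maximum(z, 0.0),
    "tanh": np.tanh,
    "sine": np.sin,
}
for name, act in activations.items():
    scores = [mean_frequency(random_network_output(act)) for _ in range(20)]
    print(f"{name:>5}: mean frequency ~ {np.mean(scores):.1f}")
```

In a toy setting like this, the complexity of the random functions typically differs markedly between activation functions, which is the kind of initialization-time property the study relates to the behavior of trained networks.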
Towards better collaboration between humans and AI
By varying the parameters of neural networks and modulating their architecture, researchers hope to better understand how these networks produce their results. Explaining the networks’ behavior could improve their generalization capabilities and, ultimately, our trust in their results. “If a neural network identifies a cancer in medical images, most people will expect that doctors know why it was identified as such. To be trusted, the process is expected to be explainable, especially for sensitive domains. We hope that this work will contribute to improving trust in AI systems,” Teney concludes.
More information
- Machine Learning research group
- "Neural Redshift: Random Networks are not Random Functions", Damien Teney, Armand Nicolicioiu, Valentin Hartmann, Ehsan Abbasnejad, IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2024
- Human-AI teaming research program