Training neural networks more difficult than expected
Artificial intelligence: Research team delivers new insights
Training neural networks is even more difficult than previously thought. This has been demonstrated by mathematician Dr Linda Kleist from the Algorithms Division at Technische Universität Braunschweig together with computer scientists Mikkel Abrahamsen, University of Copenhagen, and Tillman Miltzow, Utrecht University. In December, the research team presented their results at “NeurIPS”, one of the world’s most important conferences on machine learning.
Artificial intelligence methods and applications have made impressive progress in recent years. One particularly successful approach is neural networks. Loosely modelled on the human brain, they process input data in several intermediate steps until a result is calculated. Neural networks can perform complex and diverse tasks at a superhuman level.
Typical examples are playing complex games such as chess and Go and translating languages, but also recognising human faces, detecting cancer in computed tomography scans, or keeping track of lanes in autonomous driving. They are also widely used in social media to model users’ behaviour and show them personalised ads.
Before a neural network is of any use, it needs to be trained, for example using a large collection of computed tomography scans in which a human expert has previously marked which ones show signs of cancer and which do not. By learning these patterns, the neural network can then independently detect cancer in new scans. Once the training is complete, the network can be used quickly and efficiently over and over again.
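This train-then-reuse workflow can be sketched in miniature. The toy example below (purely illustrative, not the researchers' construction) trains a single artificial neuron on a handful of labelled points and then applies the trained model to new, unseen inputs; the data and labelling rule are invented for the sketch:

```python
import math
import random

def predict(w, b, x):
    """Single artificial neuron: weighted sum of inputs, squashed to (0, 1)."""
    s = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-s))

# Toy "expert-labelled" data: label 1 whenever x0 + x1 > 1.
data = [([0.2, 0.1], 0), ([0.9, 0.8], 1), ([0.4, 0.3], 0),
        ([1.0, 0.6], 1), ([0.1, 0.5], 0), ([0.7, 0.9], 1)]

# Training: repeatedly nudge the parameters to reduce the error.
random.seed(0)
w, b = [random.uniform(-1, 1) for _ in range(2)], 0.0
lr = 0.5
for _ in range(2000):
    for x, y in data:
        p = predict(w, b, x)
        grad = p - y  # gradient of the cross-entropy loss w.r.t. the weighted sum
        w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
        b -= lr * grad

# Once trained, the network is cheap to reuse on new inputs.
print(predict(w, b, [0.95, 0.9]))  # close to 1
print(predict(w, b, [0.1, 0.2]))   # close to 0
```

Training is the expensive part; each prediction afterwards is just one pass of arithmetic, which is why a trained network can be applied "over and over again" at negligible cost.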
Challenge: Minimising errors
In order to rely on neural networks for complex problems, they must work with as few errors as possible. During the training phase, the parameters of a neural network are adjusted step by step to gradually minimise the error. According to the researchers’ findings, even deciding whether a solution with a small error exists at all is a very difficult problem.
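Why is "does a small-error solution exist?" harder than simply training? Step-by-step training only ever improves a solution locally and can get stuck. The sketch below uses a made-up one-parameter "loss" (not from the study) to show gradient descent settling into a local minimum even though a parameter value with a much smaller error exists:

```python
def loss(w):
    # Hypothetical non-convex training error with two valleys of different depth.
    return (w * w - 1.0) ** 2 + 0.3 * w

def grad(w):
    # Derivative of the loss above.
    return 4.0 * w * (w * w - 1.0) + 0.3

w = 0.5                  # starting guess
for _ in range(500):
    w -= 0.05 * grad(w)  # plain gradient descent

# Descent settles near w ≈ 0.96 with loss ≈ 0.29, yet a far better
# parameter exists: loss(-1.04) ≈ -0.31. The descent found *a* low point,
# but it cannot tell us whether an even smaller error is achievable --
# and that existence question is the hard decision problem.
print(round(w, 2), round(loss(w), 2))
```

In real networks the loss depends on millions of parameters rather than one, which makes the existence question correspondingly harder, not easier.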
Most of the challenges addressed by AI methods, such as complex optimisation problems, are difficult to solve. For many of them, however, it is at least easy to recognise a solution once it has been found, as with tricky puzzles. There are problems that are even harder, such as deciding whether a polynomial in many unknowns has real zeros, for which only very slow algorithms are known. In their mathematical analysis, the researchers proved that minimising the error of a neural network is exactly as difficult as this problem. They therefore concluded in the study that training neural networks is even more challenging than previously thought.
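The asymmetry the researchers point to can be made concrete with a deliberately naive brute-force search (an illustration only, nothing like a practical algorithm): a candidate zero is trivial to verify, but ruling out the existence of any real zero is what makes the problem hard.

```python
def has_approx_real_zero(p, lo=-2.0, hi=2.0, step=0.01, tol=0.05):
    """Naive grid search for an approximate real zero of p(x, y).

    A hit certifies existence immediately; an exhaustive miss only rules
    out zeros on this grid in this box -- it proves nothing in general,
    and the cost of the search explodes with the number of unknowns.
    """
    n = int((hi - lo) / step) + 1
    for i in range(n):
        x = lo + i * step
        for j in range(n):
            y = lo + j * step
            if abs(p(x, y)) < tol:
                return True
    return False

# x^2 + y^2 - 1 vanishes on the unit circle -> a zero is found.
print(has_approx_real_zero(lambda x, y: x**2 + y**2 - 1))  # True
# x^2 + y^2 + 1 is always at least 1 -> no zero anywhere.
print(has_approx_real_zero(lambda x, y: x**2 + y**2 + 1))  # False
```

With two unknowns this grid already takes about 160,000 evaluations; with dozens or millions of unknowns, as in the parameters of a neural network, such exhaustive approaches are hopeless, which is why only very slow algorithms are known for the general problem.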
Looking ahead: Much basic research needed
“With the currently known algorithms, one could easily use up the entire worldwide energy production of many years for the required computing power without reaching the optimum,” says Dr Linda Kleist, a research assistant in the Algorithms Division at TU Braunschweig. A lot of basic research is still needed to find faster and more efficient algorithms, or to prove that such algorithms may not exist at all. “So there will still be a lot for computer science researchers to do in the future,” says Kleist.