MIT researchers claim to have created an AI model that sets a new standard for understanding how a neural network makes decisions.
The team from MIT Lincoln Laboratory’s Intelligence and Decision Technologies Group has developed a neural network that performs human-like reasoning to answer questions about the content of images.
As it solves problems, the Transparency by Design Network (TbD-net) shows its workings by visually rendering its decision-making process, allowing the researchers to see the reasoning behind its conclusions.
Unusually, not only does this model achieve new levels of transparency, it also outperforms most of today’s best visual-reasoning neural networks.
The research is presented in a paper called Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning.
The complexity of brain-inspired neural networks makes them remarkably capable, yet it also renders them opaque to human understanding, turning them into so-called ‘black-box’ systems. In some cases, it’s impossible for researchers to trace the course of a neural network’s calculations.
A transparent neural network
In the case of TbD-net, its transparency allows researchers to correct any erroneous assumptions the system may have made. Its developers say this type of corrective mechanism is missing from other leading neural networks today.
Self-driving cars, for example, must be able to rapidly and accurately distinguish pedestrians from road signs. Creating a suitable AI to do that is hugely challenging, given the opacity of many systems. Even with a capable enough neural network, its reasoning process may be unclear to developers, a problem that MIT’s new approach aims to address.
Ryan Soklaski, who created TbD-net with fellow researchers Arjun Majumdar, David Mascharka, and Philip Tran, said:
“Progress on improving performance in visual reasoning has come at the cost of interpretability.”
The team took a modular approach to their neural network – building small sub-networks that are specialised to carry out subtasks. TbD-net breaks down a question and assigns it to the relevant module. Each sub-network builds on the previous one’s conclusion.
“Breaking a complex chain of reasoning into a series of smaller sub-problems, each of which can be solved independently and composed, is a powerful and intuitive means for reasoning,” said Majumdar.
The neural network’s approach to problem solving is similar to a human’s reasoning process. As a result, it is able to answer complex spatial reasoning questions such as, “What colour is the cube to the right of the large metal sphere?”
The model breaks this question down into its component concepts, identifying which sphere is the large metal one, understanding what it means for an object to be to the right of another one, and then finding the cube and interpreting its colour.
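To make the idea of chained modules concrete, here is a minimal Python sketch of how the example question could be broken into steps. It is a toy illustration, not TbD-net's actual implementation: the real system uses learned neural modules operating on image features, whereas this sketch works on a hand-written symbolic scene. All names (Obj, SCENE, attend, relate_right, query_colour, answer_question) and the scene contents are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Obj:
    shape: str
    size: str
    material: str
    colour: str
    x: float  # horizontal position; a larger value means further right


# Hypothetical toy scene, invented for this illustration.
SCENE = [
    Obj("sphere", "large", "metal", "grey", 2.0),
    Obj("sphere", "small", "rubber", "blue", 1.0),
    Obj("cube", "small", "rubber", "red", 3.5),
    Obj("cube", "large", "metal", "green", 0.5),
]


def attend(scene, attention, **attrs):
    """Keep attention only on objects whose attributes all match."""
    return [
        a if all(getattr(o, k) == v for k, v in attrs.items()) else 0.0
        for o, a in zip(scene, attention)
    ]


def relate_right(scene, attention):
    """Move attention to every object to the right of an attended one."""
    anchors = [o.x for o, a in zip(scene, attention) if a > 0]
    return [1.0 if any(o.x > ax for ax in anchors) else 0.0 for o in scene]


def query_colour(scene, attention):
    """Report the colour of the most strongly attended object."""
    best = max(range(len(scene)), key=lambda i: attention[i])
    return scene[best].colour


def answer_question(scene):
    """What colour is the cube to the right of the large metal sphere?"""
    trace = []  # each step's attention is recorded so it can be inspected
    att = [1.0] * len(scene)
    att = attend(scene, att, shape="sphere", size="large", material="metal")
    trace.append(("attend[large metal sphere]", att))
    att = relate_right(scene, att)
    trace.append(("relate[right]", att))
    att = attend(scene, att, shape="cube")
    trace.append(("attend[cube]", att))
    return query_colour(scene, att), trace


if __name__ == "__main__":
    colour, steps = answer_question(SCENE)
    for name, att in steps:
        print(name, att)
    print("answer:", colour)
```

Because each step passes an explicit attention vector to the next, the intermediate results can be examined one by one, which is the property the researchers highlight.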
The network renders each module’s output visually as an ‘attention mask’. A heat-map is layered over objects in the image to show researchers how the module is interpreting it, allowing them to understand the neural network’s decision-making process at each step.
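The article does not say how the heat-map is drawn, so the following Python sketch shows only one common way to layer an attention mask over an image: blending each greyscale pixel towards red in proportion to the attention it receives. The function name overlay_attention and the alpha parameter are assumptions made for illustration and are not taken from the TbD-net paper.

```python
def overlay_attention(image, mask, alpha=0.6):
    """Blend an attention mask over a greyscale image as a red heat-map.

    image: 2D list of greyscale values in [0, 1]
    mask:  2D list of attention values in [0, 1], same shape as image
    alpha: maximum opacity of the heat-map (assumed parameter)
    Returns a 2D list of (r, g, b) tuples with values in [0, 1].
    """
    if len(image) != len(mask) or any(
        len(img_row) != len(mask_row) for img_row, mask_row in zip(image, mask)
    ):
        raise ValueError("image and mask must have the same shape")

    out = []
    for img_row, mask_row in zip(image, mask):
        row = []
        for grey, m in zip(img_row, mask_row):
            w = alpha * m  # how strongly this pixel is tinted
            row.append((
                (1 - w) * grey + w,  # red channel moves towards 1
                (1 - w) * grey,      # green channel is dimmed
                (1 - w) * grey,      # blue channel is dimmed
            ))
        out.append(row)
    return out
```

Pixels with zero attention keep their original grey value, while strongly attended pixels turn red, so a viewer can see at a glance which region a module focused on.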
Despite being designed for greater transparency, TbD-net also achieved state-of-the-art accuracy of 99.1 percent on a dataset known as CLEVR. And thanks to the system’s transparency, the researchers were able to address faults in its reasoning and redesign some modules accordingly.
The research team hopes that such insights into a neural network’s operation may help build user trust in future visual reasoning systems.
Internet of Business says
The opaque nature of many neural networks risks creating systemic and ethical problems, not to mention allowing bias into the system unchecked.
However, when visual reasoning neural networks are made more transparent, they typically perform poorly on complex tasks, such as those in the CLEVR dataset.
Past efforts to overcome the problem of black-box AI models, such as Cornell University’s use of transparent model distillation, have gone some way to tackling these issues. But TbD-net’s overt rendering of its reasoning process takes neural network transparency to a new level without sacrificing the accuracy of the model.
The system is capable of performing complex reasoning tasks in an explicitly interpretable manner, closing the performance gap between interpretable models and state-of-the-art visual reasoning methods.
With computer vision and visual reasoning systems set to play a huge part in autonomous vehicles, satellite imagery, surveillance, smart city monitoring, and many other applications, this represents a major breakthrough in creating highly accurate, transparent-by-design neural networks.