MIT research advances robotic object recognition and manipulation

Breakthrough research at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) is teaching robots to use computer vision to work out how to pick up objects.

After training, robots have proved capable of picking up the same object over and over in a factory, but when faced with an unfamiliar object they must be retrained, or fall back on a rudimentary grasping algorithm.

Recent advances in computer vision have allowed robots to distinguish between objects, but their ability to truly understand their shapes has remained lacking.

However, researchers at MIT have now developed a system in which a robot can visually inspect random objects and understand their 3D properties, before picking them up to accomplish specific tasks – all without human help and without having seen the objects before.

The Dense Object Nets (DON) system views objects as collections of points, which it maps onto a 3D shape. This enables a robot to pick out a specific item from among similar ones.

In the video above, the Kuka robot is instructed to pick up a shoe by its tongue. Based on this, the robot can look at a shoe it has never seen before and grab its tongue.

None of the data used by the robot was labelled by humans. The system is self-supervised, moving around an object to look at it from several angles in order to recognise its shape.
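The point-matching idea behind this approach can be sketched in code. In the sketch below – an illustrative example, not the researchers' implementation – a trained network is assumed to have already produced a per-pixel "descriptor image" for each view; corresponding points on an object (say, a shoe's tongue) then map to nearby descriptors, so finding the point in a new view reduces to a nearest-neighbour search in descriptor space:

```python
import numpy as np

def match_descriptor(ref_descriptors, ref_point, query_descriptors):
    """Find the pixel in a new view whose descriptor best matches
    the descriptor at a chosen point in a reference view.

    ref_descriptors:   (H, W, D) descriptor image for the reference view
    ref_point:         (row, col) of the chosen point, e.g. a shoe tongue
    query_descriptors: (H, W, D) descriptor image for the new view
    """
    target = ref_descriptors[ref_point[0], ref_point[1]]         # (D,)
    # Euclidean distance from the target descriptor to every pixel's descriptor
    dists = np.linalg.norm(query_descriptors - target, axis=-1)  # (H, W)
    # The closest descriptor marks the corresponding point in the new view
    return np.unravel_index(np.argmin(dists), dists.shape)
```

The hard part, of course, is learning descriptors that make this search reliable across object poses – which is what the self-supervised, multi-angle training described above provides.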

More independent robots

PhD student Lucas Manuelli wrote the paper about the system with lead author and fellow PhD student Pete Florence, alongside MIT Professor Russ Tedrake. Speaking about the significance of the research to MIT News, he said:

“Many approaches to manipulation can’t identify specific parts of an object across the many orientations that object may encounter. For example, existing algorithms would be unable to grasp a mug by its handle, especially if the mug could be in multiple orientations, like upright, or on its side.”

Similar systems, such as UC Berkeley's Dex-Net, can pick up a variety of items but are unable to follow nuanced requests, such as grasping a particular item at a specific point.

The MIT researchers demonstrated their system’s capabilities by having it pick up a caterpillar toy by its right ear, showing its ability to distinguish between left and right on symmetrical objects.

Likewise, when faced with multiple similar baseball caps, DON could identify and grasp the desired hat – despite never having seen that specific hat during training.

“In factories robots often need complex part feeders to work reliably,” says Florence. “But a system like this that can understand objects’ orientations could just take a picture and be able to grasp and adjust the object accordingly.”

The team hopes to further its research to a point where the system can perform tasks with a deeper understanding of corresponding objects, such as grasping an object and moving it to clean a desk.

Internet of Business says

The DON system allows the accuracy and nuance of task-specific methods to be applied to new objects without retraining the robot for each one.

Research in this area is vital if we are to give robots human levels of dexterity in non-rote tasks.

Future applications include object picking in warehouses and manufacturing scenarios but, given the flexibility of such a system, potential use-cases are enormous – both in enterprise deployments and, in the long term, perhaps the home too.

MIT has also been exploring other avenues to limit the need to train robots, even using brain waves as a means of control.

Elsewhere, OpenAI research is helping robots to learn how to handle new objects with surprising dexterity.