Evolving humanoid robotic dexterity from toddler to adult

Figure AI’s humanoid general-purpose robot. Figure is one of many robotics companies racing to get humanoid robots out into the market. | Source: Figure AI

In the race to build general-purpose humanoid robots, roboticists are forgetting to answer a basic question: What does it mean to be general purpose?

For better or worse, this probably means replicating what people can do physically in their day-to-day lives and perhaps extending to some semi-skilled labor. In both scenarios, people are most valued for what they can do with their hands. The holy grail of general-purpose robotics is to replicate human dexterity – we need robots that can use their hands like people do. Yet the industry at large tends to focus on the macro of movement, with demonstrations of robots walking, for example, while robotic dexterity and hand movement often come second. As a result, many general-purpose humanoid robots are still very clumsy and infantile with their hands compared to their human counterparts.

The Robot Report will be hosting a keynote panel at RoboBusiness 2023 to discuss the state of humanoids. Jeff Cardenas, co-founder and CEO of Apptronik, Jonathan Hurst, co-founder and chief robot officer of Agility Robotics, and Geordie Rose, co-founder and CEO of Sanctuary, will explore the technological breakthroughs that are propelling humanoids into the real world. They’ll share their firsthand insights into the challenges and opportunities that lie ahead and discuss the industries poised to be early adopters of these remarkable creations.

A greater shift in design thinking

Current robotics design thinking is focused on building a precise actuator – motors with high specifications, and joints and linkages with tight tolerances. The motivation is to know the location of every part with high precision and certainty. Not much thought is given to sensing.

In contrast, the human body can be described as an imprecise machine that is capable of performing very precise tasks. Human muscles (actuators) are imprecise, but we have a rich network of sensors that provide feedback, and our brain uses that feedback to react, make decisions, and learn, and so to apply precise control. That is why we are able to perform very precise tasks – this is particularly true of our hands.

Human dexterity refers to our skillful use of our hands in performing various tasks. But what does it take to be dexterous? Although we are born with specific hardware – sensors (vision, touch, and proprioception), actuators (muscles in the shoulders, arms, wrists, and fingers) and a processor (the brain) – we aren’t necessarily born with dexterity.

Have you ever watched a baby grasping things? It’s a far cry from the dexterity we see in adults, wherein fingers can seemingly effortlessly pinch, grasp, and manipulate even the smallest of day-to-day objects – we can slide a button through a slit along the collar of a linen shirt and turn a miniature screwdriver to delicately adjust the metal frame of our eyeglasses.

In the robotics industry at large, there is a clear need to design robots starting with rich sensing. Rich sensing allows us to work with less precise actuation and looser-tolerance parts, which could also make robots more cost-effective to build, and it gives robots the ability to acquire new manipulation skills and achieve human-like dexterity.

The fundamental components of human dexterity

There are 29 muscles in the hand and forearm, giving rise to 27 degrees of freedom. Degrees of freedom refer to the number of ways all the joints of the hand and fingers can move independently. The arms and shoulders are also involved in dexterity, including 14 muscles in the shoulder and another 5 muscles in the upper arm.

While vision is commonly used for locating an object (the subject of the manipulation task), it may or may not be involved in reaching for the object (in some cases, proprioception alone is used), and in most simple manipulation tasks, the role of vision ends once contact is made with the fingers/hand, at which point tactile sensing takes over. Consider also that people can perform quite a lot of manipulation tasks in the dark or even blindfolded, so it’s clear that we don’t rely solely on vision.

Proprioception, often referred to as our “sixth sense,” allows us to perceive the location of our body parts in space, understand joint forces, angles, and movements, and interact effectively with our environment. It encompasses sensors like muscle spindles and Golgi tendon organs, critical for dexterous manual behavior and the ability to sense an object’s three-dimensional structure.

There are approximately 17,000 tactile mechanoreceptors (receptors sensitive to mechanical stimulation) in the non-hairy skin (i.e., the grasping surfaces) of one hand. These receptors individually measure vibration, strain, and compression, and as a population can measure force and torque magnitude and direction, slip, friction, and texture. All these parameters are critical for controlling how we hold and manipulate an object in our grasp – when an object is heavier, or slipperier, or the center of mass is further from the center of grip, we apply larger grip forces to prevent the object from slipping from our grasp.
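To make the link between weight, slipperiness, and grip force concrete, here is a rough sketch based on a simple Coulomb friction model of a pinch grasp. It is an illustration only, not a model taken from this article: the function name, the safety margin, and the example values are assumptions, and the extra torque caused by an off-center mass is ignored.

```python
G = 9.81  # gravitational acceleration in m/s^2


def min_grip_force(mass_kg, friction_coeff, safety_margin=1.2, contacts=2):
    """Normal force per contact (in newtons) needed to hold an object still.

    Assumes Coulomb friction: each contact resists at most
    friction_coeff * normal_force of tangential load, and the weight
    is shared equally between the contacts. All parameters are
    hypothetical illustration values, not measured data.
    """
    if mass_kg < 0 or friction_coeff <= 0 or contacts <= 0:
        raise ValueError("mass must be >= 0; friction and contacts must be > 0")
    weight_n = mass_kg * G
    return safety_margin * weight_n / (contacts * friction_coeff)


# A heavier or slipperier object needs a larger grip force.
light_grippy = min_grip_force(mass_kg=0.2, friction_coeff=0.8)
heavy_grippy = min_grip_force(mass_kg=1.0, friction_coeff=0.8)
light_slippery = min_grip_force(mass_kg=0.2, friction_coeff=0.2)
print(light_grippy, heavy_grippy, light_slippery)
```

Under this model the required force scales with weight and inversely with friction, which is why a sense of both is needed to avoid either dropping or crushing an object.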

There’s a lot of pre-processing of data in the peripheral nervous system between the sensors and the brain, and the brain dedicates a significant proportion of the somatosensory cortex to processing tactile and proprioceptive sensory data from the hand, fingers, and thumb. Similarly, a significant proportion of the motor cortex is dedicated to controlling the muscles of the hand, fingers, and thumb.

On top of the “dexterity hardware” we are born with, we start to learn our basic dexterity during infancy. Every time we interact with a new physical tool, we add new skills to our dexterity repertoire. Babies grasp and hold toys, press buttons, and hold things between their forefinger and thumb to develop their dexterity.

Toddlers continue to refine these skills through everyday activities like learning to use utensils, holding pens or crayons to draw, and stacking blocks. Even as adults, we can learn new dexterity skills. Whenever we attempt a task, we have a plan for how to execute it – this is known as a feedforward mechanism. As we execute it, our sensory system tells us when we deviate from our expected path or performance, so we can use that information to correct our actions (known as feedback control) as well as update the plan for next time (learning). For dexterous tasks, most of the sensory information that we rely on for feedback control is tactile.
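The interplay of plan, correction, and learning can be sketched in a few lines. This is a deliberately tiny one-dimensional toy, assuming a single unknown constant disturbance and a proportional correction; the function names, gain, and values are hypothetical and are not a model of human motor control.

```python
def run_trial(target, plan, disturbance, gain=0.5, steps=20):
    """One attempt at the task; returns the corrected command and final error."""
    command = plan  # feedforward: act on the plan before any sensing
    for _ in range(steps):
        outcome = command + disturbance  # what the sensors report happened
        error = target - outcome         # deviation from the expected result
        command += gain * error          # feedback: correct toward the target
    return command, target - (command + disturbance)


def practice(target, disturbance, trials=3):
    """Repeat the task, carrying the corrected command over as the next plan."""
    plan = target  # naive first plan: assume there is no disturbance
    initial_errors = []
    for _ in range(trials):
        # Error the plan alone would produce, before feedback kicks in.
        initial_errors.append(abs(target - (plan + disturbance)))
        corrected, _ = run_trial(target, plan, disturbance)
        plan = corrected  # learning: update the plan for next time
    return initial_errors


errors = practice(target=1.0, disturbance=-0.3)
print(errors)
```

The first attempt starts with the full error and relies on feedback to fix it; later attempts start from a better plan, so there is far less for feedback to correct.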

The missing piece in designing for robotic dexterity

For this complex system of touch and evolution to translate to autonomous robots, we need to build a hardware platform that is designed with the capability of acquiring new skills. Analogous to the human dexterity hardware, there are fundamental hardware components necessary for achieving robotic dexterity.

Those include actuators and sensors. Actuators come into play for dexterity as motors are used to move the arms, wrists, and fingers via several potential mechanisms such as tendons, shaft drives, and even pumps for pneumatic-based actuation. With regard to sensors, computer vision and sometimes also proximity sensing are used as a proxy for human vision for the purpose of dexterity.

To emulate human proprioception, position encoders, accelerometers, and gyroscopes are used.
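As a small illustration of the encoder part, joint angles combined with known link lengths give the fingertip position through forward kinematics. The sketch below assumes a planar finger whose joints all bend in one plane; the function name and the link lengths are made up for the example.

```python
import math


def fingertip_position(joint_angles_rad, link_lengths_m):
    """Planar forward kinematics: encoder angles in, fingertip (x, y) out.

    Each angle is measured relative to the previous link, as a joint
    encoder would typically report it. Values are hypothetical.
    """
    if len(joint_angles_rad) != len(link_lengths_m):
        raise ValueError("need exactly one angle per link")
    x = y = 0.0
    heading = 0.0  # absolute orientation of the current link
    for angle, length in zip(joint_angles_rad, link_lengths_m):
        heading += angle
        x += length * math.cos(heading)
        y += length * math.sin(heading)
    return x, y


# Straight finger versus the same finger bent 90 degrees at the base joint.
straight = fingertip_position([0.0, 0.0], [0.04, 0.03])
bent = fingertip_position([math.pi / 2, 0.0], [0.04, 0.03])
print(straight, bent)
```

This only tells the robot where its own finger is; it says nothing about the object being touched, which is where tactile sensing comes in.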

When it comes to tactile sensing, however, despite the overwhelming evidence (and general agreement from roboticists) that it is crucial for achieving dexterity in robots, in most cases only a force/torque sensor (on the robot wrist) and sometimes pressure-sensing films or force-sensitive resistors (on the finger pads and perhaps the palm) are included.

This is often a result of tactile sensing being an afterthought in the design process – but if a robot cannot feel how heavy or how slippery an object is, how can it pick it up? And if it can’t feel the weight distribution of an object and the resistance, how can it manipulate it? These are properties that can only be sensed through touch (or perhaps X-ray or some other ionizing radiation).

Processors are also important here. Edge computing can be used to perform pre-processing of sensor data, much like the peripheral nervous system, and coordinate simple subsystem control. In lockstep, a central processor is required to make sense of data from multiple sensor types (sensor fusion) and coordinate complex actions and reactions to the received data.
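One way to picture this split is a two-stage pipeline: a cheap filter running next to each sensor, and a central step that combines estimates from different sensor types. The sketch below uses an exponential moving average for the edge stage and inverse-variance weighting for the fusion stage. Both are common textbook techniques chosen here as assumptions for illustration, and the function names and numbers are hypothetical.

```python
def edge_smooth(samples, alpha=0.3):
    """Exponential moving average, cheap enough for a sensor-side processor."""
    if not samples:
        raise ValueError("need at least one sample")
    smoothed = samples[0]
    for sample in samples[1:]:
        smoothed = alpha * sample + (1 - alpha) * smoothed
    return smoothed


def fuse(estimates):
    """Inverse-variance weighted average of (value, variance) pairs.

    Estimates with lower variance (more trusted sensors) get more weight.
    Returns the fused value and its variance.
    """
    if not estimates or any(var <= 0 for _, var in estimates):
        raise ValueError("need at least one estimate, all with variance > 0")
    weights = [1.0 / var for _, var in estimates]
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, estimates)) / total
    return value, 1.0 / total


# Edge stage: reduce noisy tactile samples to one reading.
tactile_reading = edge_smooth([1.9, 2.2, 2.0, 2.1, 1.8])
# Central stage: combine it with a less certain vision-based estimate.
fused_value, fused_variance = fuse([(tactile_reading, 0.01), (2.5, 0.25)])
print(tactile_reading, fused_value, fused_variance)
```

The fused estimate stays close to the more trusted tactile reading while still using the vision estimate, and its variance is lower than that of either input.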

Let’s help robots acquire new skills

One could think of many of today’s existing robots as adult-sized toddlers – out of the box, we may expect them to do some basic tasks like walking along flat ground, avoiding large obstacles such as walls and furniture, picking up tennis-ball-sized objects, and perhaps understanding some simple commands in natural language.

New skills must be developed through “embodied learning.” It is impossible to do this purely inside a virtual environment. To build intuition about an environment, an agent must first interact with it, and it must be able to measure the physical properties of this interaction and the success or outcome of the interaction. Much like the human baby or toddler, our robot must learn through trial and error in the physical realm, and through actuation and sensing start to build an understanding of physical cause and effect.

Perhaps one reason roboticists have avoided the sense of touch is its complexity. We have simplified the sensory input of vision to a two-dimensional grid of pixels encoded in RGB, which we can capture using a camera. But we don’t really have similar models for touch, and historically, we haven’t had devices that capture touch.

So, for a long time, this area has been neglected. Now, however, we are seeing more of this work. We’re focused on this at Contactile. We’ve developed tactile sensors – inspired by human tactile physiology – that measure all the critical tactile parameters for dexterity, including 3D forces and torques, slip, and friction. Measuring these properties and closing the control loop (using feedback control) enables even a simple two-finger robotic gripper to apply the exact grip force required to hold any object, regardless of its size, shape, weight, and slipperiness – enabling this imprecise machine to perform a precise task, at last.
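To show what closing the loop on touch can look like in principle, here is a minimal sketch in which a gripper raises its grip force until a slip signal stops. It is not Contactile’s method or API: the simulated slip sensor, the function names, and the step size are all assumptions made for illustration. The point is that the controller never sees the object’s weight or friction, only the slip signal.

```python
def make_slip_sensor(weight_n, friction_coeff, contacts=2):
    """Simulated tactile sensor: reports slip when the grip is too weak.

    Stands in for real hardware; the object's weight and friction are
    hidden inside the sensor and never passed to the controller.
    """
    min_force = weight_n / (contacts * friction_coeff)

    def slip_sensor(grip_force_n):
        return grip_force_n < min_force

    return slip_sensor


def close_grip_loop(slip_sensor, start_force_n=0.2, increase=1.1,
                    max_iterations=200):
    """Feedback loop: tighten the grip in small steps until slip stops."""
    force = start_force_n
    for _ in range(max_iterations):
        if not slip_sensor(force):
            return force
        force *= increase
    raise RuntimeError("still slipping after the maximum number of steps")


# Same controller, two unknown objects with the same weight.
grippy_force = close_grip_loop(make_slip_sensor(weight_n=2.0, friction_coeff=0.5))
slippery_force = close_grip_loop(make_slip_sensor(weight_n=2.0, friction_coeff=0.2))
print(grippy_force, slippery_force)
```

The loop settles just above the minimum holding force for each object, so the slipperier object ends up gripped harder without the controller being told anything about it. A sensor that measures friction and force directly, as described above, could reach that force without the step-by-step search.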

Sensing capabilities for the future of robotics

There is overwhelming evidence and general agreement from roboticists that tactile sensing is crucial for achieving dexterity in robots. There is also an argument that without this kind of sensing in an embodied AI, true artificial general intelligence cannot be achieved. A shift in design thinking is required to ensure that these robots are designed with sensing as a core requirement, rather than as an afterthought. After all, we want that toddler to be able to grasp its spoon and eat without fumbling.

Author Bio

Heba Khamis is co-founder of Contactile, a Sydney-based technology company focused on enabling robotic dexterity with a human sense of touch. She has a Ph.D. in Engineering from the University of Sydney.  

The post Evolving humanoid robotic dexterity from toddler to adult appeared first on The Robot Report.