Mark Sagar is changing the way we look at computers by giving them faces—disconcertingly realistic human faces. Sagar first gained widespread recognition for his pioneering work in rendering faces for Hollywood movies, including Avatar and King Kong. With a Ph.D. in bioengineering and two Academy Awards under his belt, he directs a research lab at the University of Auckland, New Zealand, a hub where artificial intelligence, neuroscience, computer science, philosophy, and cognitive psychology intersect in creating interactive and intelligent technologies. He’s also the CEO of Soul Machines, a company in Auckland that designs emotionally intelligent systems.
The most high-profile of his designs is BabyX, which simulates an infant with downy hair and rosy cheeks. BabyX can read a word you hold up in front of it, and can smile back at you. It will get upset if you leave the room, like a real baby.
BabyX’s algorithms are based on human biological processes, specifically the neural systems involved in learning and behavior, and the virtual baby is programmed to learn through interaction with its users.
IEEE Pulse chatted with Sagar to find out more about the design of BabyX and what he’s up to next at his lab.
Let’s talk about BabyX. Why a baby and not, say, an adult?
Essentially, it’s to try to make a human simulation as simple as possible. One of the things we want to do is understand the basic building blocks of human behavior, and so you really want to start at the beginning of social learning and social contact. With a baby, you’re not dealing with the complex psychological masking that adults do.
Your goal is to create a baby that appears as lifelike as possible. But have you ever felt that in the process of designing BabyX, you might be able to create something even better than a human?
We haven’t gone down that path. Essentially, what we can hope for out of this is to make an intuitively teachable computer. If we humans receive input from our eyes and ears and so forth, something in the computer can also get input from a variety of sources. That could be a feed from the Internet associated with a feed from a sensor.
That’s how we can start abstracting BabyX to create a digital nervous system. You could call the approach biomimetic. Instead of taking a more computer-science approach, in which you can apply any type of idea you want and build up a system, we are trying to create a biologically inspired system.
Eventually BabyX could interact as a new type of creature. One of the potential applications for this could be connecting it to the Internet of Things.
Tell us about the potential applications for BabyX as well as your other project, the Auckland Face Simulator.
We’re doing two different types of technology approaches in the lab. BabyX is an exploration of how we humans tick. On the other side, we’re building adult faces. With the Face Simulator, we’re exploring what happens if we put a face on Siri or connect a face to information technology.
For example, if you and I have a face-to-face conversation, my facial movements will actually give information about what might be important in what I’m saying. I might highlight that importance with an eyebrow raise or something similar, or it might be what I’m looking at when I’m referring to something. Those are examples in which all the nonverbal aspects of communication are important. Imagine Siri with a face and all the extra information you’d get from that interaction.
Applications of the technology could be an educational system for children in which collaboration or assistance might really improve their learning, or it could be for elderly people who don’t use computers. If we make something a bit more humanlike, then they can relate to it in a more intuitive, easy-to-understand way.
Your work has been focused for quite some time on the face. Do you have a favorite feature of the face to work on?
Probably the thing I’ve studied the most would be the smile. And it’s probably the nicest thing you can study, too. When you’re doing biomechanical simulation, the smile is really complex, because the face has got so many layers of tissue with different material properties, and when you smile, your whole face actually slides and bunches up. When you’re trying to simulate that, it’s quite challenging. The other thing with the smile is that it is very subtle, so the dynamics matter just as much as the form. But we did manage to make models that can simulate realistic smiles.
The other area that has been of real interest is the eyes. What we’ve tried to do in our latest model is to make eyes that appear realistic. We have the neural networks controlling pupil dilation and constriction and where the eyes are looking. It takes a lot of attention to detail to create these models.
A gauge of success of your AI models could be whether humans empathize with them. Have you found that those who interact with the models express emotions toward them that they’d normally feel for other people?
That’s one of the interesting things when people experience BabyX in person. When I do a live demonstration, I normally get a very empathetic reaction from the audience. If I abandon BabyX and it starts crying, people get upset. It’s interesting because they know they’re looking at an artificial model. They know it because I’ve just explained to them how we’ve been building it. But then they react like that, and they get a sense of relief when I comfort the computer baby. It’s like people create this emotional connection that is independent of intellect or technology.
What is it about BabyX that you think gets people to respond that way?
It’s really the whole thing. What we’re showing with BabyX is how all these different elements come together to make something that appears alive. As AI develops and becomes more and more part of our lives, these elements are going to become increasingly important to put into computer systems.
Have you had any moments that have felt absurd as you’ve worked with your prototypes?
We constantly have bizarre moments in the lab. We’re creating faces and, when things go wrong, it’s not just a bug in which numbers need to be fixed—it actually looks like something. The face might have exploded in some way, for example, or there’s some extremely comical expression that occurs. Every time that happens, I encourage people in the lab to take a screenshot of it. At some point we’ll create a blooper video.
This is an edited excerpt from the November 2015 issue of IEEE Pulse.