Rajesh Rao: Deciphering a 4000-Year-Old Script

Computer analysis might decode messages carved by one of the world's oldest civilizations

27 July 2011

For some people, the inscriptions on artifacts found near the Indus River in South Asia are just pictures. But for others, including IEEE Member Rajesh Rao, those inscriptions could bear great meaning. If they can be decoded, he says, the symbols could shed light on a civilization that existed about 4000 years ago.

Rao, an associate professor of computer science and engineering at the University of Washington in Seattle, is using computational modeling and statistical analysis to look for patterns and regularities in the inscriptions on traders’ seals, pottery, stoneware, and other items left behind by the Indus civilization, which existed in what is now Pakistan and India between 2600 and 1900 B.C. He presented his research in March at TED, an annual conference held in Long Beach, Calif., that brings experts together to speak about advances in technology, entertainment, and design. A video of Rao’s talk was recently posted on TED’s website.

“I love puzzles and mysteries in general, but being originally from India, I have always been fascinated by the puzzle posed by the Indus civilization—the oldest civilization on the Indian subcontinent,” he says.

Many scholars consider the Indus civilization to have been as sophisticated as its Mesopotamian and Egyptian contemporaries. The Indus people left few clues behind about their language, however. Some linguists say that the inscriptions, which are based on about 400 symbols, some of them resembling humans and animals, do not correspond with a spoken language at all.

While on sabbatical from the University of Washington in 2007, Rao teamed up with colleagues from the Tata Institute of Fundamental Research in Mumbai and the Institute of Mathematical Sciences in Chennai to investigate whether the inscriptions encode a language. They scanned several thousand inscriptions discovered near the Indus River into a computer and used software that looked for regularities in the symbols’ sequences. The program computed how often certain symbols occurred together and which symbols tended to follow which. It could also help reconstruct missing or damaged pieces of an inscription.
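The kind of counting described above can be illustrated with a minimal Python sketch. It is not the researchers' software: the symbol names and the four toy inscriptions are invented for illustration, and the gap-filling rule (pick the symbol that most often follows the left neighbor and precedes the right one) is an assumed simplification.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: each inscription is a list of made-up symbol names.
inscriptions = [
    ["fish", "jar", "arrow"],
    ["fish", "jar", "man"],
    ["wheel", "jar", "arrow"],
    ["fish", "comb", "arrow"],
]

def bigram_counts(seqs):
    """Count how often each symbol is immediately followed by another."""
    follows = defaultdict(Counter)
    for seq in seqs:
        for a, b in zip(seq, seq[1:]):
            follows[a][b] += 1
    return follows

def cooccurrence_counts(seqs):
    """Count how many inscriptions contain both symbols of each pair."""
    pairs = Counter()
    for seq in seqs:
        symbols = sorted(set(seq))
        for i, a in enumerate(symbols):
            for b in symbols[i + 1:]:
                pairs[(a, b)] += 1
    return pairs

def guess_missing(prev_sym, next_sym, follows):
    """Guess a damaged symbol from its two neighbors (assumed scoring rule)."""
    scores = Counter()
    for cand, n_prev in follows.get(prev_sym, Counter()).items():
        n_next = follows.get(cand, Counter())[next_sym]
        if n_next:
            scores[cand] = n_prev * n_next
    return scores.most_common(1)[0][0] if scores else None

follows = bigram_counts(inscriptions)
pairs = cooccurrence_counts(inscriptions)
best = guess_missing("fish", "arrow", follows)
```

In this toy corpus, the best guess for a damaged symbol between "fish" and "arrow" is "jar," because that sequence is attested more often than the alternative through "comb."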

Rao and his colleagues reported their findings in several publications, including IEEE Computer magazine.

“The computer analysis revealed interesting patterns in the script that are similar to linguistic scripts,” Rao says. Some symbols occurred more often at the beginning or end of the text, while others were frequently paired with certain other symbols. The symbols did not appear to be paired at random; their order tended to be somewhat predictable. A quantitative analysis of these patterns showed them to be similar to those found in written languages.
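The article does not name the measure used. One standard way to quantify how predictable the next symbol is, given the current one, is conditional entropy, and the sketch below assumes that choice. Low values mean a rigid order, high values mean near-random pairing. The function names and test sequences are hypothetical.

```python
import math
from collections import Counter

def edge_counts(seqs):
    """Count which symbols appear first and last in each sequence."""
    starts = Counter(seq[0] for seq in seqs if seq)
    ends = Counter(seq[-1] for seq in seqs if seq)
    return starts, ends

def conditional_entropy(seqs):
    """Entropy (in bits) of the next symbol given the current symbol."""
    pair_counts = Counter()
    first_counts = Counter()
    for seq in seqs:
        for a, b in zip(seq, seq[1:]):
            pair_counts[(a, b)] += 1
            first_counts[a] += 1
    total = sum(pair_counts.values())
    h = 0.0
    for (a, b), n in pair_counts.items():
        p_pair = n / total                # P(a followed by b)
        p_next = n / first_counts[a]      # P(b | a)
        h -= p_pair * math.log2(p_next)
    return h

# A strictly alternating sequence is fully predictable.
rigid = conditional_entropy([["a", "b", "a", "b"]])
# Here each symbol is followed equally often by either symbol.
mixed = conditional_entropy([["a", "a", "b", "b", "a"]])
```

Comparing such a score for an undeciphered script against scores for known languages and for non-linguistic sequences is one way to make "similar to written languages" a measurable claim.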

“This analysis, along with other pieces of evidence, led us to conclude that the script might be versatile enough to encode an unknown language,” Rao says.

But even if the symbols do represent a language, it’s difficult to determine what they actually mean. Many more pieces must fall into place before the historical puzzle is solved.

For one thing, the longest inscription discovered by archeologists so far contains only 17 symbols. And researchers have almost no knowledge about the actual language spoken by the Indus people.

“It would help to have new artifacts with longer inscriptions or maybe even a bilingual inscription, with the Indus script presented side by side with another known script, much like the Rosetta Stone found in Egypt,” Rao says. “There are a number of Indus sites yet to be excavated along the India-Pakistan border, so one can still harbor some hope that such artifacts may eventually be unearthed.”

Rao says good math and science teachers, as well as encouragement from his parents, inspired him to pursue science and engineering. He moved from his home in Hyderabad, India, to the United States in 1988 after receiving a scholarship to study at Angelo State University, in San Angelo, Texas. He earned bachelor’s degrees in computer science and in mathematics in 1992. Rao says he realized the vast potential of artificial intelligence and computational neuroscience during a summer internship at the University of Rochester, in New York, where he earned master’s and doctoral degrees in computer science in 1994 and 1998.

“My thesis advisor convinced me that the fields of AI, machine learning, and neuroscience were rich in problems whose solutions could have far-reaching consequences,” he says. “I jumped at the opportunity, and I’ve enjoyed working at the intersection of these areas ever since.”

In 2000, after a two-year fellowship at the Salk Institute, in La Jolla, Calif., Rao joined the faculty of the computer science and engineering department at the University of Washington, where his research focuses on computational neuroscience, humanoid robotics, and brain-computer interfaces.

He’s currently taking a break from the Indus script project to explore his other passion: using computational models to understand how the brain works. He and his students are studying the role that Bayesian inference, a method of estimating the probability of an outcome by combining prior experience with new evidence, plays in perception and decision-making in the human brain. “If successful, such a computational understanding of the brain could suggest new ways of building artificially intelligent systems, help treat neurological disorders, and help us to better understand ourselves as human beings,” Rao says.
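For readers unfamiliar with Bayesian inference, the following minimal Python sketch shows the basic update: prior beliefs are multiplied by how well each hypothesis explains new evidence, then rescaled to sum to 1. The hypotheses and numbers are invented for illustration and are not drawn from Rao's research.

```python
def posterior(prior, likelihood):
    """Apply Bayes' rule over a finite set of hypotheses."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalized.values())
    return {h: v / evidence for h, v in unnormalized.items()}

# Hypothetical perception example: a blurry shape seen in a home
# where cats are expected far more often than dogs.
prior = {"cat": 0.8, "dog": 0.2}        # belief before looking
likelihood = {"cat": 0.3, "dog": 0.6}   # how well each explains the shape
belief = posterior(prior, likelihood)
```

Even though the shape looks more like a dog, the strong prior keeps "cat" the more probable interpretation, at two chances in three.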
