Among the most difficult tasks for agencies managing disasters—natural or otherwise—is rapidly and accurately identifying the status and whereabouts of potential victims. Manual door-to-door investigation methods are cumbersome and take time, sometimes months, impairing relief efforts and placing hardship on the families and friends of those affected. But with the proliferation of social media and online public records, a wealth of information is available about people—where they live and work, who their friends are, what event they might be attending—allowing for the status of potential victims to be more easily determined. What’s needed is a system for accessing information about the affected population and making it useful and actionable to rescue agencies.
Researchers from MIT Lincoln Laboratory, in Lexington, Mass., are working toward such a system. Their Semi-Automated Family Estimation (SAFE) system puts all that information in one place. Details about the emergency, primarily its location, are keyed into the system. Then information culled from publicly available and open-source sites on the people, businesses, and social media users in the vicinity is swept into SAFE. The system combines all the data, analyzes their relationships, and displays the findings on graphs to help responders.
The prototype information management system is described in the article “A Network Science Approach to Open Source Data Fusion and Analytics for Disaster Response,” available in the IEEE Xplore Digital Library.
“SAFE has two purposes: to identify people who may be affected and determine methods for communicating with them, and to use the latest data about the affected population to identify other at-risk individuals and their locations to prioritize response and research efforts,” says Danelle Shah, the lead researcher. She is a technical staff member for Lincoln Laboratory’s Intelligence and Decision Technologies Group. “The social network is the enabler and open-source data provides a means by which the network can be constructed.”
Social networking sites, aid organizations, and private companies have for many years offered tools to help find people in a crisis. For example, Facebook’s Safety Check tool, introduced in 2014, lets users indicate on their profile pages that they and others with them are safe. The American Red Cross Safe and Well website allows people to let loved ones know about their welfare. And Ushahidi, an open-source software company in Nairobi, Kenya, offers first responders up-to-the-minute reports on an affected area. It relies on maps from the United Nations and similar sources but also sweeps up text and e-mail messages, as well as Twitter posts from volunteers at the site. Ushahidi is designed to help first responders better understand the situation on the ground so they can deliver medical and humanitarian relief faster.
Although such tools are helpful, they rely largely on specific information generated after an event has occurred, and do not make use of the abundant pre-disaster data available about the affected population. Disaster management agencies would find it helpful to understand how people in an affected area are related, Shah says, to help them better identify victims: Are they employees, shopkeepers, neighbors, relatives, friends, coworkers, or customers?
“What we’ve proposed is a process by which such links can be discovered more quickly,” she says.
First, data are collected from public records about people, businesses, and residences in the affected area. The system uses Whitepages for home addresses and telephone numbers, and Google Places and Foursquare for business addresses and numbers. Data also come in from Facebook, LinkedIn, and Twitter, which provide details such as where people work; their ages; the names of their relatives, friends, and coworkers; and places they frequent. Using a predefined database format, data are fed in either manually or automatically through an application programming interface.
Then an inference procedure attempts to discover relationships within the population, such as “spouse of,” “works at,” “attends school at,” and “friends with” to zero in on the identities of victims and those who might be missing, and to identify lines of communication to those affected.
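To make the idea of relationship inference concrete, here is a minimal Python sketch. It is not SAFE's procedure, which the article does not detail; it assumes a hypothetical record layout (name, home, employer) and two simple rules: people at the same address are tagged as sharing a household, and people with the same employer are tagged as coworkers. All names, fields, and relation labels below are illustrative assumptions.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records; the field names are assumptions, not SAFE's schema.
people = [
    {"name": "Ana Ruiz", "home": "12 Elm St", "employer": "Corner Cafe"},
    {"name": "Luis Ruiz", "home": "12 Elm St", "employer": "City Library"},
    {"name": "Mei Chen", "home": "40 Oak Ave", "employer": "Corner Cafe"},
]

def infer_relationships(records):
    """Return a set of (subject, relation, object) triples from simple rules."""
    triples = set()
    by_home = defaultdict(list)
    by_employer = defaultdict(list)

    # Relationships stated directly in each record.
    for r in records:
        triples.add((r["name"], "lives at", r["home"]))
        triples.add((r["name"], "works at", r["employer"]))
        by_home[r["home"]].append(r["name"])
        by_employer[r["employer"]].append(r["name"])

    # Relationships estimated from shared attributes.
    for residents in by_home.values():
        for a, b in combinations(sorted(residents), 2):
            triples.add((a, "shares household with", b))
    for staff in by_employer.values():
        for a, b in combinations(sorted(staff), 2):
            triples.add((a, "coworker of", b))
    return triples

relationships = infer_relationships(people)
for triple in sorted(relationships):
    print(triple)
```

A real system would likely weigh such estimates with confidence scores rather than treat every shared address as a household.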
A social graph of the disaster-struck population is generated and displayed, where each node corresponds to a person, a location, or a social media user. Links are drawn between nodes representing the relationships observed in or estimated from the data. For example, one woman might be linked to her spouse, home, and workplace.
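A graph of this kind can be sketched in a few lines of Python. The class below is an illustrative assumption, not the researchers' implementation: it stores a type for each node (person, location, or social media user) and labeled, undirected links between nodes. The names and the example handle are hypothetical.

```python
from collections import defaultdict

class SocialGraph:
    """Typed nodes joined by labeled, undirected links (illustrative only)."""

    def __init__(self):
        self.node_type = {}             # node name -> "person", "location", ...
        self.edges = defaultdict(set)   # node name -> {(relation, neighbor)}

    def add_node(self, name, kind):
        self.node_type[name] = kind

    def add_link(self, a, relation, b):
        # Store the link in both directions so either end can be queried.
        self.edges[a].add((relation, b))
        self.edges[b].add((relation, a))

    def neighbors(self, name):
        return sorted(neighbor for _, neighbor in self.edges[name])

g = SocialGraph()
g.add_node("Ana Ruiz", "person")
g.add_node("Luis Ruiz", "person")
g.add_node("12 Elm St", "location")
g.add_node("Corner Cafe", "location")
g.add_node("@ana_r", "social media user")  # hypothetical handle

# One woman linked to her spouse, home, workplace, and account.
g.add_link("Ana Ruiz", "spouse of", "Luis Ruiz")
g.add_link("Ana Ruiz", "lives at", "12 Elm St")
g.add_link("Ana Ruiz", "works at", "Corner Cafe")
g.add_link("Ana Ruiz", "owns account", "@ana_r")

print(g.neighbors("Ana Ruiz"))
```

Keeping the relation label on each link lets a responder ask not only who is connected to a person but how.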
Visually representing relationships that way contributes valuable social context to agencies during an emergency and provides contact information for potential victims, Shah says. Additionally, the social network can be leveraged to learn how a specific disaster has affected a population and to estimate which other people might be at high risk.
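One simple way to picture how a network could point to other at-risk people is to spread a score outward from nodes known to be affected. The Python sketch below is an assumption for illustration, not the method used in SAFE: it runs a breadth-first search from the affected nodes and halves the score at each hop. The graph, the names, and the decay value of 0.5 are all hypothetical.

```python
from collections import deque

# Hypothetical adjacency list mixing people and places.
graph = {
    "Ana Ruiz": ["12 Elm St", "Corner Cafe", "Luis Ruiz"],
    "Luis Ruiz": ["12 Elm St", "Ana Ruiz", "City Library"],
    "Mei Chen": ["Corner Cafe", "40 Oak Ave"],
    "12 Elm St": ["Ana Ruiz", "Luis Ruiz"],
    "Corner Cafe": ["Ana Ruiz", "Mei Chen"],
    "City Library": ["Luis Ruiz"],
    "40 Oak Ave": ["Mei Chen"],
}

def risk_scores(graph, affected, decay=0.5):
    """Score each reachable node by decay ** (hops to the nearest affected node)."""
    scores = {node: 1.0 for node in affected}
    queue = deque(affected)
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in scores:  # first visit is the shortest path
                scores[neighbor] = scores[node] * decay
                queue.append(neighbor)
    return scores

# Suppose the cafe is reported to be inside the disaster zone.
scores = risk_scores(graph, ["Corner Cafe"])
for node, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{node}: {score}")
```

In this toy case the cafe's two employees score highest, followed by their homes and relatives, which is the kind of ordering a responder could use to decide whom to contact first.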
“Experiments on simulated data have demonstrated promising results; incorporating this rich social media information in the prioritization of a response can significantly increase its speed and effectiveness over a naïve approach,” Shah says.
The process is further complicated by the sheer size and diversity of the data involved. Public record databases and online location sources are full of entries that have slightly different names and addresses for the same business or person. Twitter handles, for example, often contain partial names, nicknames, initials, symbols, or phrases. To address that problem, the researchers developed algorithms to match data across different sources. Classifiers were developed to merge duplicate entities, match social media accounts to their owners, and estimate where people live and work.
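The article does not describe the researchers' classifiers, but the underlying problem of spotting two spellings of the same entity can be illustrated with a much simpler stand-in. The Python sketch below normalizes names and compares them with the standard library's difflib.SequenceMatcher; the normalization rules, the list of business suffixes, and the 0.85 threshold are assumptions chosen for the example.

```python
import re
from difflib import SequenceMatcher

def normalize(text):
    """Lowercase, drop punctuation and common business suffixes, squeeze spaces."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9 ]", "", text)
    text = re.sub(r"\b(inc|llc|co|corp)\b", "", text)
    return " ".join(text.split())

def similarity(a, b):
    """Similarity ratio between 0 and 1 for two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

def is_duplicate(a, b, threshold=0.85):
    # The threshold is an arbitrary illustrative choice, not a tuned value.
    return similarity(a, b) >= threshold

print(is_duplicate("Joe's Pizza, Inc.", "Joes Pizza"))
print(is_duplicate("Joe's Pizza, Inc.", "City Library"))
```

A trained classifier would typically combine several such signals, for example name similarity, address distance, and phone number, instead of relying on a single cutoff.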
For Shah, the next step is to connect with organizations interested in using her system so the researchers can develop their prototype further.