Collective memory - World leading higher education information and services

$\"Graphic:$

As computing power continues to move from the desktop to portable devices, the nature of communications networks will change radically. A network in which devices are regularly being added and removed, and where the strength of the connections between the devices fluctuates with their movement, requires much different protocols from those that govern relatively stable networks, like the Internet.

In a paper appearing in the December issue of the journal Distributed Computing, a group of researchers, including NEC Professor of Software Science and Engineering Nancy Lynch, describe a long-term project to give unstable networks stable, shared memory. Begun at MIT in 2001, and largely funded by the National Science Foundation, the project was initially envisioned as a means to preserve vital information for teams of soldiers operating in hostile environments, as, say, Tora Bora, Afghanistan, was at the time. But the work could also have ramifications for sensor networks, networks of mobile devices, peer-to-peer services on the Internet, and the large server farms that host heavily trafficked websites.

Because it was originally envisioned for military applications, the system has a very military-sounding acronym: RAMBO, for Reconfigurable Atomic Memory for Basic Objects. Suppose that soldiers are trying to secure a small village in hostile territory for a period of weeks. The composition of the teams patrolling the village will change constantly, but the soldiers need to stay abreast of the latest intelligence — which parts of the village are safe, for instance, and which could be dangerous. RAMBO is designed for situations in which the soldiers don’t have access to some central server that could store that type of information. The system distributes the information among the soldiers themselves, in such a way that, no matter who joins or leaves the network, everyone will always have access to the most recently updated information. The same system, however, could also preserve information about road conditions for ad hoc networks of sensor-laden cars, or allow emergency workers to quickly set up a shared data network when, for instance, a natural disaster has disrupted existing communications systems.

Virtual server

“It’s supposed to look like an instantaneously accessible memory, like if you have one machine in one location,” Lynch says. “We wanted to have that same appearance, but really, it’s running on mobile devices out in the field. There is no one machine.” If someone on the network updates some shared information, rather than storing the update on a server, as an office network would, RAMBO sends multiple copies of the information to other devices on the network. But not all the devices, a simple majority of them. Similarly, when a device needs to retrieve information, it polls a simple majority of the other devices on the network. “The thing about majorities is that any two of them have some copy in common,” Lynch says. Suppose, for example, that there are five devices on the network, and an important piece of data is stored on three of them. If you randomly query any three of the five devices, at least one of them will have the data.

Difficulties arise, however, as the structure of the network changes. If just two devices storing the same piece of data leave a network — even a very large one — then querying a majority no longer guarantees the retrieval of that piece of data. The same is true if just two devices that aren’t storing the piece of data join the network. So RAMBO includes algorithms that recognize such changes and copy data to new devices in order to preserve majorities.

Earlier research in the field had investigated majority-based distributed memory systems, says Idit Keidar, an associate professor of electrical engineering at Technion, in Israel, who specializes in distributed systems. But “the papers that presented these solutions, they didn’t deal with reconfiguration,” she says. “Reconfiguration, you don’t want to do it offline. You don’t want to bring the whole system down and then fix all the tables and then bring everything online again. What you want to do is to do it while the system’s working, with no down time. And that’s the problem that RAMBO’s trying to solve.”

E pluribus unum

In some sense, Keidar says, RAMBO’s memory system and its reconfiguration system solve opposed problems. The memory system is designed so that each device on the network can store or retrieve information autonomously. It has to communicate with a majority of the other devices, but it doesn’t need to worry about what those devices are doing. It simply sends them data or requests data from them. But the reconfiguration system, Keidar says, requires the network to make some decisions as a whole. If a device at the edge of the network recognizes that several new devices have appeared nearby, it could start using them as part of its storage majority, but the rest of the network might not even know that they’re there. The researchers have actually proposed several different versions of RAMBO, optimized for different types of network. But during reconfiguration, they generally have to do some coordinating among devices.

Lynch’s two co-authors on the Distributed Computing paper have continued to build on the work they did on RAMBO at MIT. Seth Gilbert, who was a graduate student in Lynch’s lab and is now an assistant professor at the National University of Singapore, is working out the technical details of implementing RAMBO in wireless networks. Alex Shvartsman, a professor of computer science at the University of Connecticut who was a research associate and visiting scientist at MIT, is investigating applications of RAMBO for Internet file sharing.

Keidar, too, is working on a distributed-memory system, one that can reconfigure itself without having to do any coordinating among devices. That approach could prove more practical for some types of networks. But “RAMBO is seminal in inspiring that type of later work,” Keidar says. “RAMBO defined the problem.”