SciTech

CMU creates super-efficient, energy-saving computer clusters

Anyone walking around Carnegie Mellon’s campus can attest to the sheer amount of computer workspace available, which, while very helpful to a student in need of a printout in five minutes, makes the school’s energy consumption staggeringly high. However, Carnegie Mellon’s computer science department, in conjunction with Intel Labs Pittsburgh, has developed a new system called Fast Array of Wimpy Nodes, or FAWN, to increase energy efficiency in computer clusters. In this system, each node acts as a small, individual computer with its own network interface; because the nodes consume much less power than conventional machines, they are called “wimpy nodes.”

The plan was developed by a research team headed by David Andersen, assistant professor of computer science, and Michael Kaminsky, a senior research scientist at Intel. The team began this research over a year ago and gradually built up an infrastructure of nodes, eventually reaching a point at which the nodes could process 10 to 100 times more queries than traditional disk-based clusters at the same energy cost. A query here is a lookup in a key-value storage system: a user supplies a key and gets the associated value in return. Kaminsky, who is involved in distributed systems and networking operating systems, compares a query to the search features on Facebook and Twitter.

Research in the computer cluster field shows how far the technology has come in the last 20 years. At one time, computer clusters were reserved for those with supercomputers. Now that the technology has become more widely accessible and more and more people are building clusters, energy usage has become a much larger issue for everyone.
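To make the idea of a key-value query concrete, here is a minimal sketch in Python. It is an illustration only, not the research team's code: the class names and the rule for choosing a node (hashing the key and taking the remainder by the node count) are assumptions made for this example, and the article does not say how FAWN actually assigns keys to nodes.

```python
import hashlib

NUM_NODES = 21  # the article mentions a 21-node arrangement


class ToyNode:
    """One small node holding its own slice of the key-value data."""

    def __init__(self):
        self.data = {}


class ToyCluster:
    """Hypothetical cluster that spreads keys across many small nodes."""

    def __init__(self, num_nodes=NUM_NODES):
        self.nodes = [ToyNode() for _ in range(num_nodes)]

    def node_for(self, key):
        # Assumed routing rule: hash the key, then take it modulo the node count.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.nodes)

    def put(self, key, value):
        self.nodes[self.node_for(key)].data[key] = value

    def get(self, key, default=None):
        # A query: supply a key, get the stored value back from one node.
        return self.nodes[self.node_for(key)].data.get(key, default)


cluster = ToyCluster()
cluster.put("friends:alice", ["bob", "carol"])
print(cluster.get("friends:alice"))  # ['bob', 'carol']
```

Each lookup touches only the one node that holds the key, which is why many small, low-power nodes can share the work of answering queries.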

“In two years, data centers as a whole are going to comprise three percent of all energy usage in the United States,” said Vijay Vasudevan, a computer science Ph.D. candidate on the research team. “FAWN is a very timely work, in the sense of its tackling a problem that’s become a very big issue in the past few years.” Each node draws about three watts of power, a huge reduction from the 100 watts many computers draw. The nodes are arranged in a 21-node array, and each is matched with a storage device known as a flash card, which holds about four gigabytes. In addition to the flash card, each FAWN node contains a low-power processor.
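The power figures above lend themselves to a quick back-of-the-envelope calculation, sketched here in Python. The three-watt, 100-watt, and 21-node numbers come from the article; treating them as constant, around-the-clock draw is an assumption made for illustration, and the function name is hypothetical.

```python
NODE_WATTS = 3.0            # approximate draw of one wimpy node (from the article)
CONVENTIONAL_WATTS = 100.0  # draw of many ordinary computers (from the article)
NODES = 21                  # size of the node array (from the article)


def kwh_per_day(watts):
    """Energy used in 24 hours at a constant power draw (assumed constant)."""
    return watts * 24 / 1000


cluster_watts = NODE_WATTS * NODES
print(f"Whole 21-node array: {cluster_watts:.0f} W")                            # 63 W
print(f"Array energy per day: {kwh_per_day(cluster_watts):.2f} kWh")            # 1.51 kWh
print(f"One 100 W computer per day: {kwh_per_day(CONVENTIONAL_WATTS):.2f} kWh")  # 2.40 kWh
print(f"Nodes per 100 W budget: {CONVENTIONAL_WATTS / NODE_WATTS:.1f}")         # 33.3
```

Under these assumptions, the entire 21-node array draws less power than a single 100-watt machine.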

Flash memory is the same technology used in digital camera memory cards. It is faster than traditional magnetic disks and requires less power. The flash memory is coupled with Intel’s Atom processor, noted for its low power consumption. FAWN is suited to random-access workloads like those behind Facebook and Twitter. Amar Phanishayee, another Ph.D. student on the team, makes a comparison: When you log on to Facebook, the information the website stores about every user in the world must be narrowed down to only your friends.

In such a case, it is far easier to access the small data set that includes your friends than the larger data set that includes everyone on Facebook. FAWN works in a similar manner, narrowing down to a few data points from a much larger set.

The process has not been without its problems, however. Kaminsky says that the right network architecture (or arrangement of the nodes) must still be found and that the cluster’s responsiveness must remain high. The difficulty with the latter is consistency within the cluster, which means keeping the nodes synchronized with one another.

Without adequate consistency, the cluster will not perform as quickly as it otherwise would. Another problem faced by the research team has been the relatively slow speed of the process cards. They are not the quickest cards available, but the team has been able to counteract the slowness by balancing them with input/output bandwidth.

Looking to the future, Vasudevan and Phanishayee predict that new green industries will be particularly interested in the FAWN technology.

Companies that may show interest in the future include Microsoft, Google, and any other corporation that has a large data center. Google’s data center, for example, hosts the search engine, the e-mail service Gmail, and many other applications. A low-energy plan could make these services more desirable than those of Google’s competitors.

Kaminsky, however, cautions that FAWN has not been released commercially and is still in the research phase; he adds that the team is “not in contact with any groups trying to roll it out.” So until the day comes when the technology is available to the general campus, spending day and night on homework, Facebook, or Skype will not reduce your carbon footprint. But, Kaminsky notes, “The door is open.”