Computing in Concert


When the problem is a monstrously intensive computational task — describing a whole system like the brain or the entire human genome or reconstructing highly detailed images, for example — a single computer is a clumsy and slow tool. More elegant and speedier is the “supercomputer,” actually many dedicated processors working together in close proximity to save time moving data around. Not one giant bee, but a hive of many workers all dedicated to the same problem.

Fred W. Prior, PhD, professor of radiology and director of Mallinckrodt’s Electronic Radiology Laboratory, recalls receiving an email from department director R. Gilbert Jost, MD, about five years ago saying, “There’s this NIH (National Institutes of Health) grant, and one of the things you can buy is a supercomputer. Why don’t you see if anybody’s interested?”

Prior, who directs the Center for High Performance Computing (CHPC), says he was skeptical but called a meeting of people he thought might be interested, assuming a small turnout. “Everybody I invited came,” he says, “and they brought their friends. There was tremendous interest.” He attributes it to a “pent-up demand for high-performance computing” since an earlier center in the Department of Physics had closed.

So Prior wrote the grant, which was accepted, but, he says, “The government didn’t get a budget that year, so there wasn’t any funding.” He notes that reviewers asked them “to prove that the algorithms you’re talking about actually would be improved if you had this computer.” To resubmit the grant, he partnered with IBM, which ran the simulations and gave him the results.

When the system eventually was funded, it was purchased from IBM and housed in the Genome Institute’s data center. For the first couple of years, they ran it in what Prior calls “family and friends” mode. He says, “If you were in radiology, were collaborating with radiology, had some link, however tenuous, we’d give you an account and it was free.”

When the CHPC wanted to expand access to the rest of the university, two issues needed to be addressed. One was to develop a sustainable funding model, and the other was to ensure there was high enough network bandwidth to move large data sets between the medical and Danforth campuses.

“We put our high-performance computer on the highest bandwidth network at Washington University,” Prior says. “And now there’s a high-speed link dedicated for research between the campuses, and we’re hoping to maintain and grow that.”

Today, researchers across the university — from medicine to engineering to economics to music — are taking advantage of the supercomputer to advance their research, which in many cases helps them attract grant money to support it. To date, the CHPC has processed more than 3.4 million jobs that have consumed a total of 19.8 million processor hours. To do that much computing on a single processing core would require approximately 2,260 years. The system earns its “super” appellation.
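The “single core” comparison above follows directly from the numbers the article gives; a quick check (using only the figures quoted, 19.8 million processor hours and a 365-day year):

```python
# Convert the CHPC's total processor hours into single-core years.
processor_hours = 19_800_000
hours_per_year = 24 * 365        # 8,760 hours in a non-leap year
years = processor_hours / hours_per_year

print(round(years))              # roughly 2,260 years, matching the article
```

The division lands at about 2,260 years, so the article’s figure is consistent with its own usage numbers.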

The funding model was developed “with a lot of help from the dean’s office” at the School of Medicine. Prior says, “We convinced the school’s executive faculty and the deans on the other campus that this should be a fee levied against departments so that all of their faculty could have free access to the system.”

Another important step was when the School of Engineering & Applied Science “made us a deal we couldn’t refuse” and became co-operators of the center. Prior remains the director in charge of day-to-day operations, and his co-director is Rohit V. Pappu, PhD, professor of biomedical engineering and a member of CHPC’s executive committee.

Instead of having each engineering school department pay a fee to use the supercomputer, the school upgraded the computer with Graphics Processing Units (GPUs). GPU-accelerated computing offloads the compute-intensive portions of an application to the GPU, while the remainder of the code runs on the CPU. The result? Applications run significantly faster.
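The CPU/GPU split described above can be sketched in code. This is a minimal illustration, not any CHPC application: the function names are hypothetical, and the “GPU” kernel is a plain-Python stand-in for what would really be, say, a CUDA kernel or a CuPy array expression. The point is the pattern: the element-wise, compute-intensive portion is isolated behind a function boundary so it can be dispatched to an accelerator, while control flow and bookkeeping stay on the CPU.

```python
def intensity_kernel_cpu(pixels):
    # Compute-intensive portion: each output depends on only one input,
    # which is exactly the shape of problem GPUs accelerate well.
    return [p * p for p in pixels]

def intensity_kernel_gpu(pixels):
    # Stand-in for a GPU implementation; here it just delegates so the
    # sketch stays runnable without GPU hardware.
    return intensity_kernel_cpu(pixels)

def mean_intensity(pixels, use_gpu=False):
    # The hot kernel is the only part that moves to the accelerator.
    kernel = intensity_kernel_gpu if use_gpu else intensity_kernel_cpu
    squared = kernel(pixels)               # offloaded portion
    return sum(squared) / len(squared)     # remainder runs on the CPU
```

Because the kernel sits behind one function boundary, swapping the CPU version for a GPU version changes nothing else in the application, which is why such ports can pay off without a full rewrite.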

For investigators like Pappu, access to the supercomputer has solved the issue of trying to find money for annual purchases of computer hardware. He explains that funding agencies will not fund a grant for computing equipment and will take out any line item on computing infrastructure, which posed a “significant limitation” to his laboratory’s research.

The paradox, he says, is that “funding agencies are extremely happy giving you money for your work if you tell them you have access to a supercomputing resource.”

This was one thing that led Ralph S. Quatrano, PhD, dean of the School of Engineering and the Spencer T. Olin Professor, to invest in the GPUs. “The idea, then, was it would enable more grants to come through and provide greater collaboration,” Pappu says. It would also give the engineering school “a fertile playground” in the area of GPU computing.

But the GPUs have benefited far more than the engineering school. They have been integral to the Human Connectome Project, which is one of the biggest users of the supercomputer, as well as to investigators in the Department of Radiology, the system’s other superuser.

Overseeing the work of the CHPC and keeping it running is Malcolm Tobias, PhD, senior system administrator. Prior describes him as having a “tremendous knowledge of high-performance computing,” as “keeping the users happy,” and as “a wizard at adjusting how the scheduling system works.”

Tobias also works closely with what Prior calls the “invisible network” — graduate students and postdocs. Prior says they are “the people who actually figure out how to use this thing, and then they talk to each other.”

Key too, Prior says, is an executive committee of leaders from both campuses. According to Pappu, a primary role of the executive committee is to determine how to fund the CHPC and how to sustain it, particularly given the dynamic of Moore’s Law (named for Intel co-founder Gordon Moore), which observes that transistor counts, and with them processing power, roughly double every 18 to 24 months.
That, of course, requires ongoing investment. The CHPC leaders are looking at the kind of analysis that’s being done on the supercomputer by the Human Connectome Project, and then projecting what will happen. They are seeking funding to add higher performance components with a focus on leading-edge GPU computing elements. They also plan to hire a GPU programmer.
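The pressure that doubling puts on a fixed installation can be made concrete. A short sketch, using the 18-month doubling figure the article cites, shows how quickly today’s hardware falls behind newly purchased systems:

```python
# Relative performance of brand-new hardware versus today's,
# assuming performance doubles every 18 months (1.5 years).
DOUBLING_PERIOD_YEARS = 1.5

def relative_performance(years):
    # After `years`, new hardware is 2^(years / 1.5) times faster.
    return 2 ** (years / DOUBLING_PERIOD_YEARS)

for years in (1.5, 3, 6):
    print(f"{years} years: {relative_performance(years):g}x")
```

At that rate a machine is 4x behind the state of the art after three years and 16x behind after six, which is why the committee treats funding as an ongoing commitment rather than a one-time purchase.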

“Lots of applications do well on GPU hardware,” Pappu says, “but converting what is designed to work on a CPU to work on a GPU takes a lot of know-how and technical skill.” He envisions that “as this enterprise grows, we hope to have about half a dozen people between the two campuses who will enable the work of about 30, 40, 50 — maybe even 100 — scientists who are all using high-performance computing to advance their particular areas of research.”

Pappu believes two things are needed to advance researchers’ work. “There’s innovation that we bring to the table in terms of state-of-the-art models that enable computations in ways that are unique. But all that would add up to naught if we didn’t have the Center for High Performance Computing. You need both. You need the innovation, but you also need the resources.”

He praises Jost, director of Mallinckrodt Institute of Radiology, as being “genuinely instrumental in the early days, in recognizing the importance of high-performance computing, and then putting his support behind the venture,” with further support from Larry J. Shapiro, MD, Dean of the School of Medicine.

Without them, he says, “I don’t think any of this would have happened. For me, at least, it has been a transformative entity.”