Inside Iowa State
March 21, 1997
Breaking the code
by Skip Derra
It appears that Suresh Kothari and his students have done "the impossible." They've found a way to convert computer programs written for slower (serial) computers into programs that can operate on superfast, parallel computers.
Until now, the only way to convert a program from serial to parallel form was to go through the code line by line, a task as time-consuming as it is labyrinthine.
What they have done has raised a few eyebrows, as Kothari's graduate student Simanta Mitra can attest. In his preliminary oral exam, Mitra described the new parallelization tool he and fellow student Youngtae Kim had been working on under Kothari's direction.
But Ames Laboratory computer scientist John Gustafson, who had been there and done that, wasn't buying Mitra's story.
"I couldn't believe he was going to solve this problem," Gustafson said. "I thought he was just another inventor of a perpetual motion machine."
"He had lots of questions for Simanta," recalled Kothari, a professor of computer science. "It was very intimidating. Gustafson eventually told him this was a scientifically intractable problem -- a nice way of saying it cannot be done."
Mitra refined his explanation and faced his reviewers again.
"It was like seeing light," Gustafson admitted. "I thought, 'Yes, yes you can do this.'"
Over the past 15 years, millions of dollars have been poured into efforts to develop tools to make code conversion easier. It has been the focus of many of the best computer minds in the country. Nobody has been able to do it.
Now, Kothari, Mitra and Kim are working with Gustafson and Gene Takle, professor of agronomy, on the new parallelization tool. They recently received a $480,000 grant from the Environmental Protection Agency to develop a prototype system targeting codes such as MM5, a computer code used in weather forecasting.
MM5 can be kindly described as the "mother of all code." Extremely complex and detailed, MM5 has 100,000 lines of code. Printed out, it would fill four reams of paper, single-spaced. Other teams have tried to automatically convert this code for parallel machines. All have failed, or are still trying.
Computers dating back to the ABC, invented by Iowa State's John Atanasoff, perform their functions in sequence. That is, they perform an operation, get a result, perform another operation, get a new result and so on. In the early 1980s, the quest for faster high-level computers focused on design of parallel machines.
In a dramatic departure in computing philosophy, parallel machines divide a large problem into smaller pieces, obtain answers to the smaller pieces, then combine the results into a final answer. Parallel machines can perform billions of operations in a heartbeat. Problems that take a serial computer weeks to solve now can be done in five minutes.
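The divide, solve and combine pattern described above can be sketched in a few lines of Python. This is only an illustration: the function names and the sum-of-squares problem are assumptions chosen for the example, not anything taken from MM5 or from Kothari's tool. It also uses a thread pool for simplicity, which shows the structure of a parallel program but will not by itself deliver the speedups of a true parallel machine.

```python
from concurrent.futures import ThreadPoolExecutor


def sum_of_squares(values):
    """Serial version: one operation, one result, then the next."""
    total = 0
    for v in values:
        total += v * v
    return total


def split(values, pieces):
    """Divide the problem into roughly equal contiguous pieces."""
    size = max(1, (len(values) + pieces - 1) // pieces)
    return [values[i:i + size] for i in range(0, len(values), size)]


def parallel_sum_of_squares(values, workers=4):
    """Solve each piece independently, then combine the partial answers."""
    chunks = split(values, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)


data = list(range(1000))
serial_result = sum_of_squares(data)
parallel_result = parallel_sum_of_squares(data)
```

The two results agree because the pieces do not depend on one another. Real scientific codes are harder precisely because their pieces usually do share data, and working out where is what makes conversion so laborious.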
The problem is, the programs that make computers work are written exclusively to run on serial machines.
"Scientists want to use parallel machines, but there isn't any good software for them," said Kothari, who became interested in automatic code conversion in 1989. "The proven code is written in serial."
Literally trillions of dollars' worth of serial code is still being used by scientists. They use it to design airplanes and automobiles, to understand how a drug is metabolized in the body, to predict long-term climate patterns. The choice for programmers is simple.
"Either we convert the serial code to run on parallel machines or throw it all away and write new code," Kothari said.
Kothari's method is very different from other approaches to converting code. Rather than trying to "compile" the code -- a brute force conversion from beginning to end -- Kothari builds "knowledge" into his conversion tool. This knowledge consists of a general feel, or road map, for what the program is trying to accomplish. Kothari's tool uses this knowledge to provide pre-processing of information before converting the code.
"Before this tool, it was like asking someone to find a house in Chicago in the evening without a map," Kothari explains. "Our tool now gives that person a map to work with."
The tool finds discrepancies in what it sees and what it thinks it should find in the code. Sorting out these discrepancies early on makes the conversion to parallel code much smoother.
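The idea of comparing what the code contains against what prior knowledge says it should contain can be illustrated with a toy check. Everything here is hypothetical: the article does not describe the tool's internals, so the "knowledge" table, the Fortran-style loop body and the function names below are assumptions made purely for illustration. The sketch records which neighboring elements each array is expected to read, extracts the accesses actually present, and reports anything unexpected for a code expert to review.

```python
import re

# Hypothetical "knowledge": the index offsets each array is expected to use.
EXPECTED_OFFSETS = {"t": {-1, 0, 1}, "tnew": {0}}

# Matches accesses such as t(i), t(i-1) or t(i + 2) on loop index i.
ACCESS = re.compile(r"\b([a-z]\w*)\(\s*i\s*(?:([+-])\s*(\d+))?\s*\)", re.IGNORECASE)


def found_offsets(source_lines):
    """Collect the offsets actually used for each array in the code."""
    found = {}
    for line in source_lines:
        for name, sign, digits in ACCESS.findall(line):
            offset = int(digits) if digits else 0
            if sign == "-":
                offset = -offset
            found.setdefault(name.lower(), set()).add(offset)
    return found


def discrepancies(expected, found):
    """List accesses that the expected pattern does not account for."""
    report = []
    for name, offsets in sorted(found.items()):
        unexpected = offsets - expected.get(name, set())
        if unexpected:
            report.append((name, sorted(unexpected)))
    return report


# Hypothetical loop body; the second line reads an unexpected far neighbor.
loop_body = [
    "tnew(i) = 0.25 * (t(i-1) + 2.0 * t(i) + t(i+1))",
    "tnew(i) = tnew(i) + 0.1 * t(i+5)",
]
issues = discrepancies(EXPECTED_OFFSETS, found_offsets(loop_body))
```

Here `issues` flags the read of `t(i+5)`, the kind of surprise that is far cheaper to resolve with an expert up front than to discover after the code has been split across processors.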
"When it finds a problem, you go back to a code expert to get answers," Kothari said. "It's easier to make those corrections than to go through every line of code by hand."
"Kothari doesn't try to solve the entire problem, just part of the problem," Gustafson said. "But it's a very important part."
The EPA project began in January and Kothari expects to have automatic conversion of MM5 done by the end of August. Not bad for something many thought could not be done.
Kothari and Mitra demonstrated the parallelization tool for NASA officials last June, converting a 2,000-line portion of MM5 in about 30 minutes. Conversion by hand would have taken two days. Last week, after presenting a paper in Minneapolis, Kothari made a side trip to Cray Computer, where supercomputing was pioneered.
"They have a whole division working just on converting different types of scientific code," he said.
Kothari is converting more than code. He also is convincing people of a new approach to the problem.
"I really didn't think it could ever be done," Gustafson said. "But when I sat down and saw how it works, I found it is a very simple and powerful tool. I know it will work."
Copyright © 1997, Iowa State University, all rights reserved