Pentium flaw could spark scramble for bug-coping strategies
Experts advise cross-checking critical designs using different processors and software. The tradeoff? Time to market.
Newton, MA-- By now, engineers worried about potential Pentium division errors have contacted Intel and swapped their flawed chips for new ones. But while that problem may be "solved," the Pentium flap has raised a broader question: Are other chips or software also making errors that some mathematics professor may not catch until next year? Should engineers consider strategies for overcoming potential flaws, including delaying time to market while checking a design one more time?
"Microprocessors have become so complex that it is no longer possible to completely debug them, or even to determine every bug that exists in one," according to Dr. Thomas R. Nicely, the mathematics professor at Lynchburg College, VA, who discovered the Pentium flaw. His advice: Perform "mission-critical" computations multiple times, preferably with different CPUs, operating systems, and software algorithms.
"This is a direct assault on speed to market," says Michael Schrage, a research associate for the Sloan School at MIT who also writes the Innovation column for the Los Angeles Times. "But you retain no market-share advantage if you're being sued for $250 million by 10 of your largest customers for a flawed product."
The potential for hardware flaws shouldn't be overstated. After all, chips and software go through multiple quality-control checks before hitting the market. Still, caution is advisable. "There's probably a greater chance of error in formulating the problem and writing the code than there is in a microprocessor," says Rich Partridge, a microprocessor market analyst with D.H. Brown Associates, Port Chester, NY. "But they're all tied together. If you can't trust anything, what do you do? You do cross checking."
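One way to picture the cross-checking that Nicely and Partridge recommend is to verify a result by a second, independent route instead of trusting a single computation. The Python sketch below is only an illustration, not a method described by anyone quoted here: it divides two numbers, then multiplies the quotient back in exact rational arithmetic and rejects it if the product strays from the numerator. The function name, the `divide` parameter, and the tolerance are all assumptions made for the example.

```python
from fractions import Fraction
import operator


def cross_checked_divide(a, b, divide=operator.truediv, rel_tol=1e-12):
    """Divide a by b, then verify the quotient by multiplying back exactly.

    `divide` and `rel_tol` are hypothetical parameters chosen for this sketch.
    """
    quotient = divide(a, b)
    # Fraction converts each float to an exact integer ratio, so the
    # back-multiplication adds no rounding error of its own.
    residual = Fraction(quotient) * Fraction(b) - Fraction(a)
    if abs(residual) > Fraction(rel_tol) * abs(Fraction(a)):
        raise ArithmeticError(
            f"quotient {quotient!r} fails the back-multiplication check "
            f"for {a!r} / {b!r}"
        )
    return quotient


# Example operands are an assumption, not taken from the article.
result = cross_checked_divide(4195835.0, 3145727.0)
print(result)
```

A check like this costs extra run time on every divide, which is the same accuracy-versus-speed tradeoff the experts describe at the scale of a whole design project.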
Pandora's box. The catalyst for this uneasiness is Intel Corp.'s Pentium microprocessor. Its floating-point unit had a flaw that caused it to return reduced-precision results for division involving certain numerators and denominators. The flaw was due to five missing entries in a 4,000-entry look-up table the iterative algorithm uses to perform the divide instruction.
Intel officials say that in about 1 of every 9 billion possible divides, any digit from the fifth digit on could be incorrect. However, the error could be magnified by multiplying the result by a large number or by subtracting two results of division.
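The magnification effect is easy to see with arithmetic. The sketch below uses a pair of operands that was widely circulated as a test case at the time, along with the quotient a flawed chip was commonly reported to return; neither figure comes from Intel's statement above, and the flawed value is hard-coded because a correctly working processor will not reproduce it. It assumes the machine running it divides correctly.

```python
# Operands and the flawed quotient are assumptions for illustration:
# a commonly cited test case, not figures from this article.
NUMERATOR = 4195835.0
DENOMINATOR = 3145727.0
REPORTED_FLAWED_QUOTIENT = 1.33373906890203  # hard-coded, as reported

# Quotient as computed by the machine running this script.
correct_quotient = NUMERATOR / DENOMINATOR

# The two quotients first differ in the fifth significant digit.
quotient_error = correct_quotient - REPORTED_FLAWED_QUOTIENT

# Multiplying by a large number scales the error by the same factor.
magnified_error = quotient_error * DENOMINATOR

print(f"correct quotient : {correct_quotient:.14f}")
print(f"flawed quotient  : {REPORTED_FLAWED_QUOTIENT:.14f}")
print(f"quotient error   : {quotient_error:.3e}")
print(f"magnified error  : {magnified_error:.3f}")
```

An error of less than one part in ten thousand in the quotient grows to a discrepancy in the hundreds once the result is multiplied by a seven-digit number.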
The impact of the flaw varied by the rate of use of floating-point instructions, the input data fed to them, the use of the results in further calculations, and the accuracy needed. One of the most intensive uses of floating-point math is in engineering software.
"CAE users need to be jolted into the realization that computers are not infallible," says Charles Foundyller, president of CAE market research firm Daratech in Cambridge, MA. He doubts that engineers will be doing CAE on multiple platforms because of the time involved, and advises moving to the latest computer technology with caution.
Schrage predicts that companies will be sharing more hardware testing data and publishing the results. Disclosures from such companies as Boeing, Ford, GM, and HP on their