The storage industry this month celebrates the 15th anniversary of RAID, a now-fundamental technology that supports the great ocean of data served up to clients within enterprises as well as to ordinary users across the Internet.
RAID focuses on the integrity and survivability of data. Perhaps it's also time for IT managers to consider a similar plan for applications, operating systems and computing hardware.
In the muggy June weather of Chicago, the 1988 annual meeting of the Association for Computing Machinery's Special Interest Group on Management of Data (ACM SIGMOD) featured a paper: “A Case for Redundant Arrays of Inexpensive Disks.” Authored by a team of University of California researchers (Garth Gibson, Randy Katz and David Patterson), the paper described a number of configurations for hard disk drive arrays, which became known in the business as “Berkeley RAID” and formed the basis for an entire industry.
Of course, the team didn't just write about the technology; they created several prototypes. Here's a photo of the first prototype, which has an aged, slightly torn sign taped to the side declaring “RAID the First.”
Of course, not all the RAID levels were created equal, especially Level 0, or striping, which isn't really RAID at all, since it doesn't provide any redundancy. This is why it was called Level Zero, but not everyone gets the joke. Randy Katz and David Patterson always seemed pained to point out this fact in conversations. Striping is an enabling technology, an expression that could lead the uninitiated to think that these levels are some kind of storage 12-step program.
RAID employs “parity,” a common data term for determining whether data is lost when being moved between computers—or here, between different storage devices. For example, this RAID parity information lets the system restore “missing” data in a striped array when one of the drives in the set fails.
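For readers who want to see the trick, here is a minimal sketch of parity-based recovery, assuming the simple XOR parity commonly used in parity RAID levels. The three "drives," their contents and the helper name `xor_blocks` are hypothetical, chosen only for illustration; real arrays work on much larger blocks and handle the bookkeeping in the controller.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data drives, each holding one 4-byte block of a stripe.
data_blocks = [b"RAID", b"1988", b"ACM!"]

# The parity block is the XOR of all the data blocks in the stripe.
parity = xor_blocks(data_blocks)

# Simulate the failure of drive 1, then rebuild its block by XOR-ing
# the surviving data blocks with the parity block.
lost_index = 1
survivors = [blk for i, blk in enumerate(data_blocks) if i != lost_index]
rebuilt = xor_blocks(survivors + [parity])

print(rebuilt == data_blocks[lost_index])
```

The recovery works because XOR-ing a value with itself cancels it out: combine the parity with every surviving block and the only thing left is the missing one.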
However, I would point readers to a different definition for parity: the state or condition of being the same in power, value, rank, etc. And I would apply that principle beyond data storage to computing itself.
We often describe computing as a resource, so why shouldn't it be afforded the same concept of redundancy that data gained with RAID storage?
Today, most personal and business computing is done from a single operating system: some form of Microsoft Windows. And the predominant hardware running that OS is usually based on an Intel chip set. IT managers see this homogeneity as a positive value, ensuring compatibility. Still, like your data, the computing platform is vulnerable to threats from within and without the enclosure.
Here are a few recent examples:
- A new variant of the dangerous Bugbear virus last week spread rapidly over the Internet. (For more information, see High Risk Virus Spreading Rapidly.)
- According to a report in Instat/MDR's Microprocessor Watch newsletter, Intel last month had planned to release a 3-GHz version of the Pentium 4 chip with an 800-MHz front-side bus, alongside the launch of the Intel 875 chip set, known as Canterwood. However, an anomaly cropped up that prevented shipment. Later, a BIOS tweak fixed the problem.
- Or consider last week's story of a problem with Intel's latest Centrino processor. The system crashes when trying to run VPN client software. (For more information, see Intel Investigating Centrino Glitch.)
These problems sound minor, and they are mostly minor. But similar problems in the past have blocked computing sitewide. And any hardware problem that could attack your data or introduce errors would be very problematic indeed.
The processors mentioned above aren't bad because they may or may not have had a problem; all processors can have a problem. But it's likely that processors from different vendors with different architectures would have different problems or have them at a different time.
While the story about the Bugbear variant described the virus as infecting PCs, the bug doesn't infect machines running Linux or the Mac OS. Of course, computers running Linux or Mac OS X or some embedded Java OS are susceptible to attacks of their own. But they are different attacks.
In my own work, I've found times when my Mac was able to print when my Windows machine couldn't find a printer on the network. Or it delivered important documents when everyone else's mailbox was waiting on tech support for a fix.
Does that make the Mac or Linux or Java better? Now that's a religious argument. For the purposes of this discussion, the alternate platform was just different enough to do the job that needed to be done.
June is a month for weddings and alternatives, including this week's JavaOne conference and (later this month) Apple's Worldwide Developers Conference. Managers would do well to take a look at the announcements and see if the alternatives they present apply to their enterprises.
David Morgenstern is a longtime reporter of the storage industry as well as a veteran of the dotcom boom in the storage-rich fields of professional content creation and digital video.