Salesforce.com’s daylong outage Tuesday was caused by “an extremely rare, undocumented bug” that neither the company nor its database vendor had ever seen before, a company executive said Thursday.
There was no indication that the outage was caused by a lack of capacity or scalability in Salesforce.com’s IT architecture, said Bruce Francis, the company’s vice president of corporate strategy.
Users were not able to access Salesforce.com servers Tuesday from 9:30 a.m. to 12:41 p.m. EST and between 2 p.m. and 4:45 p.m. EST because of the bug, which caused a database cluster error in one of the company’s four global network nodes, according to company officials.
Salesforce.com runs its hosted CRM (customer relationship management) system on Oracle databases, but Francis declined to confirm that Oracle was the source of the bug.
“The vendor has one of the largest and most sophisticated development teams on the planet, and they had never encountered this bug before,” he said. Salesforce.com is working with the vendor to make sure that the bug doesn’t crop up again, he said.
“We are working around the clock to take every possible step and take any new ones that we could to prevent this from happening again,” Francis said.
Analysts said they didn’t believe that Salesforce.com would lose its customers’ confidence as long as the outage isn’t repeated.
“If this was happening on an ongoing basis I would be concerned,” said Sheryl Kingstone, CRM technology analyst with the Yankee Group in Boston. “But hopefully they will learn from the experience and it will be a one-time thing.”
Kingstone said she also didn’t believe that it was the result of a system scalability problem. “I know they have been making major investments in expanding their data centers” and building more redundancy into their network, she said.
“Scalability problems are very easily fixed by adding more equipment” and by putting in more fault-tolerant software, she said.
This is something that can happen to any company whether they run critical business applications in their own IT departments or use a hosted service, she said. No IT department managers “in their right mind would ever say this could never happen to us,” she said.
Denis Pombriant, principal analyst with Beagle Research Group in Stoughton, Mass., agreed that Salesforce.com should be able to shake off a one-day outage.
“It will certainly give some people pause. It will make some people say, ‘Well, I always felt that something like this could happen,’” Pombriant said.
However, “a company like Salesforce.com can take a hit like this and not sustain serious damage to their reputation. But they don’t want to make a habit of it,” Pombriant said.
This is a problem that is inherent in any large-scale system, whether it is an IT service utility or a power utility, he said. “Because we are now relying on information utilities like Salesforce.com, it’s becoming something to be aware of,” he said.
“The issue for us all is to build redundancy into our systems and try to fail-safe them in every way that we can,” Pombriant said.
Salesforce.com said it is serving more than 350,000 users working for more than 18,000 separate customers. However, Pombriant noted that since the end of 2004 the company has invested in at least one new mirror data center site and other capacity to meet the demand.