Outages Hit Microsoft's Web-Based Productivity Service

Microsoft's Business Productivity Online Services (BPOS) experienced multiple outages this week, the company admitted on Thursday.

The outages were first reported on Thursday afternoon by ZDNet, after it received complaints from angry BPOS customers. ZDNet's article pointed to a Microsoft Online Services TechCenter forum, where several BPOS customers said they had experienced service interruptions.

The outages appear to have impacted Exchange Online, with some customers reporting message delays as well as lost e-mails. One Microsoft partner said in a phone interview that messages were delayed up to six hours.

"It's been resolved in that there are not currently delays but it's been in and out since Tuesday afternoon," the partner said. "I'm hoping that it [the outages] won't come back up again but I don't know for certain that it is permanently resolved."

Users of Exchange Online fumed on the online forum. "We're a worldwide corporation using this. If it doesn't improve, we may have to go back to in-house Exchange," said one poster.

Said another: "I migrated our company to Exchange Online from in-house Exchange 2003 last October, and I'm sorry to say that I regret everything I ever said about how this would be better. It has been far worse in terms of both performance and reliability. I hate to be so harsh, but I am deeply frustrated...We're actively looking at migration paths back to in-house e-mail."

Later on Thursday, Microsoft posted a Twitter update: "Cause of #bpos mail delays mitigated; message queues are draining; watch the dashboard for updates." A Microsoft spokeswoman confirmed the outages, saying the company will issue a formal statement.

Just before 6 p.m. PDT on Thursday, Dave Thompson, corporate vice president of Microsoft Online Services, posted a detailed explanation for the BPOS outages on the Microsoft Online Services Team Blog. According to Thompson, the intermittent access reported by BPOS users resulted from "malformed email traffic" that hit the service on Tuesday and Thursday.

Thompson wrote:

"On Tuesday at 9:30am PDT, the BPOS-S Exchange service experienced an issue with one of the hub components due to malformed email traffic on the service. Exchange has the built-in capability to handle such traffic, but encountered an obscure case where that capability did not work correctly. The result was a growing backlog of email. By 12:00am PDT, the malformed traffic was isolated and the mail queues cleared. The delays encountered by customers varied, on the order of 6-9 hours. Short term mitigation was implemented and a fix was under development.

"At 9:10am PDT today, service monitoring again detected malformed email traffic on the service. The problem was resolved at 10:03am, but users experienced up to 45 minute email delays during this time. A second, but related issue was detected via monitoring at 11:35am PDT, resulting in email stuck in some end users' outboxes. The issue was remediated at 12:04pm PDT. During this time, more than 1.5 million messages had queued on the service awaiting delivery. The backlog was 90% clear by 4:12 PM, but because of this large backlog of email, customers may have experienced delays of as long as 3 hours. We are implementing a comprehensive fix to both problems."

Thompson's post also noted that an unrelated DNS issue on Thursday "prevented users from accessing Outlook Web Access hosted in the Americas, and partially impacted some functionality of Microsoft Outlook and Microsoft Exchange ActiveSync devices." That issue was resolved a few hours later, Thompson said.

The outages come as Microsoft is looking to promote the next generation of Exchange Online, Office 365, which it released for beta testing last month. The Microsoft spokeswoman said the outages "only affected BPOS and did not affect Office 365 whatsoever."

About the Author

Jeffrey Schwartz is editor of Redmond magazine and also covers cloud computing for Virtualization Review's Cloud Report. In addition, he writes the Channeling the Cloud column for Redmond Channel Partner. Follow him on Twitter @JeffreySchwartz.

