[DM-MUG] Fwd: Re: [MacLaw] Sidekick Customer Data Lost in the Clouds
Jon Thompson
jon at mac-consultant.com
Tue Oct 13 16:59:58 CDT 2009
I just love that some of the words are censored, and some are not.
This certainly has more info than I have heard before. It also is the
type of thing we SAN admins fear. It takes three days to transfer my
data, and that is only my tiny 9 TB SAN (also with a single LTO-3 tape
drive, which slows things down.)
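For a rough sense of why one tape drive is the bottleneck, here is a back-of-the-envelope sketch. It assumes the LTO-3 native (uncompressed) rate of 80 MB/s and a drive that streams flat out the whole time; the function name and parameters are made up for illustration. Tape changes, verify passes, and a source that can't keep the drive fed all push the real figure well above this floor.

```python
def transfer_hours(data_tb: float, drive_mb_per_s: float, drives: int = 1) -> float:
    """Best-case hours to stream data_tb terabytes through `drives` tape drives in parallel."""
    total_mb = data_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    seconds = total_mb / (drive_mb_per_s * drives)
    return seconds / 3600


# Assumption: LTO-3 native rate, no compression, drive never stalls.
LTO3_NATIVE_MB_S = 80

best_case = transfer_hours(9, LTO3_NATIVE_MB_S)
print(f"9 TB on one LTO-3 drive: about {best_case:.0f} hours at the very best")
```

The same function gives a feel for larger restores: pass a bigger `data_tb` and a hypothetical `drives` count to see how many drives it takes to bring the time down.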
I'm quite surprised. All I've heard is MS is saying that the data is
gone. I realize that it is better to under-promise and over-deliver,
but also figure that they would have under-promised a little less,
considering they have people believing their data is _gone_.
As for taking out the parity: that's what would keep me awake, if I
didn't have backups. Hardware failures can cause other hardware
failures. Many hardware failures spell trouble. SANs are a _lot_ of
hardware. Mine is 24 hard drives in two RAID arrays, with a Fibre
Channel switch, an Ethernet switch, and three Xserves with two hard
drives apiece. And that's a little SAN.
As for not telling the client that things died for four days: that is
unconscionable. They should have had a second device on order within
ten minutes, so when they did tell MS, they could say that it was
already on order and would be here within X number of hours.
Screwing up this badly _should_ put EMC out of business. I would
cancel any contracts with them, if I had them. Anyone want to buy the
Absolut building?
On Oct 13, 2009, at 4:25 PM, Victoria L. Herring wrote:
>> Here's what my source inside the situation is saying. Pretty sure it's
>> not *entirely* MS' fault, even if they're getting the majority of the
>> blame:
>> Danger, purchased by Microsoft, was moved into a Verizon Business
>> datacenter in Kent, WA a short while ago. While this had to do with
>> the MS assimilation, it was done as a one for one move from Danger to
>> a DC that MS uses heavily. (MS didn't re-write, port, migrate to
>> winblows, etc.) The backend service uses a variety of hardware, load
>> balancers, firewalls, web and application servers, and an EMC SAN
>> (Storage Area Network, think huge drive array connected with fiber.)
>>
>> Well, last Tuesday the EMC SAN took a dump on itself. What I mean by
>> that is the backplane let the magic blue smoke out. Usually, in the
>> heavy iron class of datacenter products like an EMC SAN, this means
>> you fail over to the redundant backplane and life continues on. Not
>> this time, folks. In the process of dying, it took out the parity
>> drives. What does that mean? It means the fancy RAID lost its ability
>> to actually be a RAID. How much data got eaten by this mega-oops?
>> 800TB. Why wasn't it backed up? It was, to offsite tape, like it's
>> supposed to. But when the array is toast, you can't just start
>> copying shit back.
>>
>> Apparently EMC has been on site since Tuesday, but didn't actually
>> inform Danger/MS that their data is in the crapper until Friday
>> afternoon. On top of that, EMC has done nothing to bring in
>> replacement equipment between Tuesday and Friday. (In the Enterprise
>> support world, that's fucking retarded, multi-million dollar support
>> contracts are that expensive for a reason.)
>>
>> So what's being done? Well, the good news is that the complex was
>> slated to be migrated into the Verizon Business cloud services (not
>> MS's cloud per se, but it's MS's effort.) And as a part of that
>> migration a newer, shinier SAN array was in process of being
>> implemented. But space isn't ready for it on the datacenter floor,
>> and you can't just toss the EMC RAID and place this one in its place:
>> it's a different vendor and is 2 racks instead of one. This means
>> it's being shoehorned into a different part of the datacenter than
>> was originally planned, one that doesn't have the necessary 3-phase
>> power installed. So there's a bit of work to be done. Not to mention
>> the restoral of 800TB of backup data from offsite tape.
>>
>> Time to restoral? Looking like Wednesday at the earliest with techs
>> working all weekend.
>>
>> Lessons to be learned?
>> *Buy a f'n phone that doesn't store its address book and your
>> personal data somewhere else, and one you personally can back up
>> yourself.
>>
>> *Don't expect EMC to actually respond to fixing your core business
>> application in any reasonable amount of time. They've gotten lazy,
>> consider other vendors.
>>
>> *Just because your phone says T-Mobile on it, and T-Mobile is
>> crediting you a month's service, doesn't mean they fucked up.
>>
>> *Just because Microsoft is involved, doesn't mean MS f**ked up.
>>
>> *And lastly, it's not always a "server" that f**ks up.
>>
>
> --
> Victoria L. Herring, Des Moines, Iowa. Blogs:
> http://blog.JourneyZing.com [photography];
> http://www.herringlaw.com [civilrights/discrimination];
> http://victorialherring.typepad.com/serendipity/ [personal].
> _______________________________________________
> DMMUG mailing list
> Use this Address to send mail to the list:
> DMMUG at dmmug.org
> Use this page to modify subscription options:
> http://cialug.org/mailman/listinfo/dmmug