
|  ender78 join:2002-10-17 Mississauga, ON | Re: Reliable servers haha The best data centers have their customers run off conditioned power all the time. You're never really on street power; everything goes through conditioners and/or batteries.
That said, the UPS you have at home is no more than 800 VA. A large data centre will have tons of 300 kVA units. Slightly different scale. | |
|  rebus9 join:2002-03-26 Tampa Bay Reviews:
·RoadRunner Cable
·Verizon FiOS
| said by iLive4Fusion: I have 2 $50 APC UPS backup batteries connected to my desktop computer and one for the internet modem/VOIP to provide power for 25 seconds until my whole-house Generac generator kicks in to provide power to everything but the stove and A/C. Seems like a big datacenter like 365 could at least afford a battery backup until the generators kick in.
That's not the type of backup power system 365Main has. They use a flywheel-based continuous power system (CPS). Short version: the power system is based on a huge flywheel (IIRC 12,500 pounds each) which spins a generator. 365Main has, I believe, 10 of these units.
Under utility power, the generators do not actually generate power; they are used to filter/condition the utility power before it's fed into the DCs. When utility power fails, the generators switch from power conditioners to power producers, driven by the energy of the spinning flywheel. The flywheel has enough stored energy to spin the generator for approximately 60 seconds. Within a few seconds of utility power loss, the diesel engine starts, and in under 3 seconds it clutches onto the flywheel assembly and assumes the task of driving the flywheel. Basically, there is a one-minute window of opportunity after the utility drops out to have the diesel engine online and clutched in.
There are NO battery UPSes at 365Main-- it is a "green" facility, and we all know that huge quantities of old batteries are environmental problems. Batteries do go bad and need a regular cycle of replacement. (How often has a rack gone dark because of a UPS battery failure on a battery that "tested OK" during the weekly test?) The CPS eliminates the battery hassles, and under all but unusual circumstances, is a very reliable source of backup power.
In 365Main's case, however, the engines failed to come online (and *stay* online), hence several colo rooms went dark. | |
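To make the one-minute window concrete, here is a minimal Python sketch of the timing logic described in the post above. The 60-second ride-through comes from the post; the `EngineStart` class, its field names, and the example delays are hypothetical and are not taken from 365Main or Hitec documentation.

```python
from dataclasses import dataclass

RIDE_THROUGH_S = 60.0  # approximate flywheel ride-through quoted in the post


@dataclass
class EngineStart:
    """Hypothetical description of one diesel start attempt."""
    start_delay_s: float   # seconds after utility loss before the engine fires
    clutch_delay_s: float  # seconds from engine start to clutch engagement
    stays_running: bool    # False models an engine that starts, then shuts down


def load_stays_powered(attempt: EngineStart,
                       ride_through_s: float = RIDE_THROUGH_S) -> bool:
    """True if the engine is clutched in, and still running, before the flywheel runs down."""
    clutched_at_s = attempt.start_delay_s + attempt.clutch_delay_s
    return attempt.stays_running and clutched_at_s <= ride_through_s


# Example delays are illustrative assumptions only.
normal = EngineStart(start_delay_s=3.0, clutch_delay_s=3.0, stays_running=True)
shut_down = EngineStart(start_delay_s=3.0, clutch_delay_s=3.0, stays_running=False)
print(load_stays_powered(normal))     # True
print(load_stays_powered(shut_down))  # False
```

The second case is the one that matters here: an engine that starts on time but does not stay online leaves the load with nothing once the flywheel has spent its inertia.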
|  |  | | Re: Reliable servers haha Oh good that they are thinking green, my home generator has a 12 volt car battery and when the power goes out for more than 10 seconds it is started up and fueled by natural gas. | |
|  |  |  rebus9 join:2002-03-26 Tampa Bay
| Re: Reliable servers haha said by iLive4Fusion: Oh good that they are thinking green, my home generator has a 12 volt car battery and when the power goes out for more than 10 seconds it is started up and fueled by natural gas.
The generators failed to run, so once the flywheels spinning the generators lost their inertia there was nothing left to spin the generators, since the diesel engines were not running. It was NOT a matter of the diesels not having batteries to start.
For those not clearly picturing this system, the generators I'm talking about ARE NOT what you normally associate with a "generator"-- i.e., an enclosure containing an engine and generator. In the case of a CPS, it's something like:
[generator] -- [flywheel] -- [clutch] -- [engine]
Under normal operation, the clutch is disengaged and the engine is off. Utility power drives the flywheel assembly to keep it spinning. The generator part isn't actually generating power; rather, it's being used as a power conditioner. When utility power drops out, the generator immediately switches from being a power conditioner to being an actual producer of power, and the inertia in the flywheel keeps the assembly rotating for up to 60 seconds. During that time the engine starts, the clutch is engaged, and the engine provides the power to spin the flywheel, which in turn spins the generator. It's a really slick system when it works.
In 365Main's case (they have 10 of these CPS systems), 3 engines started but immediately shut down (for reasons that are still unclear and being investigated), so the flywheels on those 3 assemblies eventually stopped spinning, and thus no more power was generated from those units. This generation deficit caused a 4th generation unit to become overloaded, and it shut down as well. | |
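The overload of the 4th unit can be illustrated with a little arithmetic. The sketch below is a rough Python illustration only: the 10 units come from the post and the 2.1 MW rating from the article quoted later in this thread, while the 15 MW load and the even split of load across online units are assumptions made for the example, not facts about 365Main's electrical design.

```python
UNIT_CAPACITY_MW = 2.1   # per-unit rating quoted in the article later in this thread
TOTAL_UNITS = 10         # unit count from the post
ASSUMED_LOAD_MW = 15.0   # hypothetical facility load, NOT a 365Main figure


def share_per_unit(load_mw: float, online_units: int) -> float:
    """Load carried by each unit if the load is split evenly (an assumption)."""
    if online_units <= 0:
        raise ValueError("no units online")
    return load_mw / online_units


def is_overloaded(load_mw: float, online_units: int,
                  capacity_mw: float = UNIT_CAPACITY_MW) -> bool:
    """True when the even share exceeds a single unit's rating."""
    return share_per_unit(load_mw, online_units) > capacity_mw


for failed in (0, 3):
    online = TOTAL_UNITS - failed
    print(failed,
          round(share_per_unit(ASSUMED_LOAD_MW, online), 2),
          is_overloaded(ASSUMED_LOAD_MW, online))
# 0 1.5 False
# 3 2.14 True
```

With these made-up numbers, losing 3 of 10 units pushes each survivor just past its rating, which is the kind of generation deficit the post describes.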
|  |  |  |  Reviews:
·AT&T U-Verse
·T-Mobile US
·AT&T Wireless Br..
·ViaTalk
·Verizon Broadban..
| Re: Reliable servers haha Oh, so it doesn't use batteries to start; it just uses the already spinning flywheel.
That's how the gas engine in my car starts: it uses a flywheel to get up to RPM and injects fuel, except it uses the available battery to spin it up to speed. Are a lot of data centers using these types of systems? The last few I have been to were UPS with diesel backups, which, if I remember correctly, had their own battery to crank them up. | |
|  |  |  |  |  rebus9 join:2002-03-26 Tampa Bay
| Re: Reliable servers haha said by iLive4Fusion: Oh so it doesn't use batteries to start it just uses the already spinning flywheel.
You have the option of using power generated by the already-spinning flywheel/generator to power the starter on the diesel engine. This is obviously a bad thing if you have to start the diesel after the flywheel has stopped spinning. Or you can install a Dark Start kit, which is basically the same thing as your car: the starter on the diesel engine is powered from batteries. From the limited information I have, I believe 365Main has dark start ability. The problem is the engines DID start as they were supposed to, but then shut down almost immediately for reasons that are not yet known. Or if they are known, nobody is talking yet.
said by iLive4Fusion: Are a lot of data centers using these types of system? Last few I have been to were UPS with diesel backups which if I remember correctly had its own battery to crank it up.
CPS systems aren't widely deployed. Most facilities still use battery-driven UPS systems, which supply power until the generator cranks up and assumes the load. | |
|  |  |  |  |  |  |
 |  | | Several generators at 365 Main's San Francisco data center failed to start when the facility lost grid power Tuesday afternoon, causing an outage that knocked many of the web's most popular destinations offline for several hours. The disruption, which began at 1:45 pm PST, occurred during a grid outage for Pacific Gas & Electric, which left significant portions of San Francisco in the dark. Parts of 365 Main's data center lost power, causing downtime for customer sites including CraigsList, Technorati, LiveJournal, TypePad, AdBrite, the 1Up gaming network, Second Life and Yelp, among others.
Wild rumors circulated about why 365 Main's backup systems failed to maintain power to key systems, including reports of employee sabotage or a possible triggering of the facility's emergency power off (EPO) button, a frequent cause of outages at mission-critical facilities. While less sensational, the actual cause of the outage was the failure of backup diesel generators.
"An initial investigation has revealed that certain 365 Main back-up generators did not start when the initial power surge hit the building," the company said in an incident report. "On-site facility engineers responded and manually started affected generators allowing stable power to be restored at approximately 2:34 pm across the entire facility."
"As a result of the incident, continuous power was interrupted for up to 45 minutes for certain customers," the report continued. "We are certain 3 of the 8 colocation rooms were directly affected, and impact on other colocation rooms is still being investigated."
The 365 Main data center is supported by 10 Hitec 2.1 megawatt generators, which are tested every month. The 277,000 square foot 365 facility is partitioned into eight data center "pods," some of which remained online while others went dark.
The facility's backup systems use flywheel UPS systems - rather than batteries - to provide "ride-through" electricity to keep servers online until the diesel generator can start up and begin powering the facility. A flywheel is a spinning cylinder which generates power from kinetic energy, and continues to spin when grid power is interrupted. In most data centers, the UPS (uninterruptible power supply) system draws power from a bank of large batteries. AboveNet, the original builder/owner of the 365 Main data center, was an early adopter of flywheel UPS systems, which have recently gained attention as a "greener" alternative to batteries.
Some customers speculated about a flywheel issue. Troubleshooting the exact reason for the generator failure will take some time, according to 365 Main. "Due to the complexity and specialization of data center electrical systems, we are currently working with Hitec, Valley Power Systems, Cupertino Electric and PG&E to further investigate the incident and determine the root cause of why certain generators did not start," the company said in its incident report.
The downtime quickly became a public relations setback for 365 Main, as the blogosphere pounced on a failure that knocked many of its leading hosts and services offline. The outage was highlighted at O'Reilly Radar, Scobleizer and TechCrunch, among others.
Earlier in the day the company issued a press release noting two consecutive years of uptime for a customer at the San Francisco data center, RedEnvelope. The press release was noted on Slashdot and Techdirt and has since been removed from 365 Main's web site.
Misinformation spread swiftly, propelled by the blogs and forums not affected by the outage. CNet, which hosts its servers at 365 Main, debunked reports from ValleyWag that a drunk employee had gone on a rampage and that a "mob of angry customers" assembled outside the 365 Main building. The "mob" was actually a line of customers who were forced to enter through the front door and have badges checked manually to get into the building because the parking garage gate was affected by the power outage, according to CNet. ValleyWag's "drunk employee" post quickly became one of the most popular posts on the front page at Digg.
The problems began when parts of PG&E's San Francisco area network began experiencing voltage fluctuations, which apparently caused a transformer to fail in a manhole under 560 Mission St. Witnesses told the San Francisco Chronicle they heard a blast shortly before 2 p.m. and then saw flames licking up through the manhole grate. PG&E could not confirm that an explosion had occurred, but said that 30,000 to 50,000 customers were affected.
The 365 Main data center was originally built by AboveNet, which spent $125 million to construct and "earthquake proof" the facility. After AboveNet filed for bankruptcy, 365 Main bought the property for $2.6 million in a court-approved deal. 365 Main has since expanded its network to seven data centers, including facilities in Oakland, Phoenix, Chantilly, Va. and two centers in Los Angeles (El Segundo and Vernon/Irvine). | |
|  |  |  KrKHeavy Artillery For The Little GuyPremium join:2000-01-17 Tulsa, OK | Re: Reliable servers haha Wow what a steal.
AboveNet spends $125 million on the facility and upgrades; 365 Main buys the whole shebang for $2.6 million.
One wonders why "On site engineers" took 49 minutes to manually start the generators. | |
|  |  |  Reviews:
·Verizon Online DSL
·Optimum Online
·EarthLink
| said by doncute18:AboveNet, the original builder/owner of the 365 Main data center, was an early adopter of flywheel UPS systems, which have recently gained attention as a "greener" alternative to batteries. How much electricity is wasted and how much pollution is generated to keep the flywheels spinning 24/7? The flywheels are only used rarely, so that's a big waste of power 99.9999% of the time. | |
|  |  |  |  rebus9 join:2002-03-26 Tampa Bay
| Re: Reliable servers haha said by Bobcat: How much electricity is wasted and how much pollution is generated to keep the flywheels spinning 24/7? The flywheels are only used rarely, so that's a big waste of power 99.9999% of the time.
It's my understanding that once the flywheels are in motion, it doesn't take very much energy to keep them spinning. Like in your car, it takes a lot of fuel to move you from a dead stop to 50 mph, but once you're in motion it only takes a feather's push on the accelerator to keep you in motion.
Also, consider the power lost within UPSes during normal operation. You do not put, for example, 50kW into a UPS and derive a full 50kW out the other end. There are internal losses from not only the conversion/filtering operations, but also in maintaining the charge in the batteries (which immediately begin to discharge when you turn off the charger).
I upgraded the UPS under my desk from 400 VA to a new 1500 VA model recently. The heat dissipated by the 1500 is so great (even with only the 120 watt load of my PC and monitor) that it made my feet uncomfortably hot and I had to relocate it. Imagine the heat generated by the UPSes in a datacenter. And of course that heat is simply electrical energy being converted to heat and lost into the air. (i.e., wasted energy)
At the risk of drifting off topic.... for those looking for a new UPS, this one is really slick. Digital readout for utility voltage, load%, load watts, charge level, estimated runtime, etc. and it's cheap. (I paid only $159 + tax, which is less than I paid for my 400 VA model back in 1995.) With my connected load of 1 PC and flat screen monitor, the runtime is about 75 minutes. Very cool. »www.circuitcity.com/ssm/APC-1500···etail.do | |
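The point about UPS conversion losses can be put into numbers. This is only a rough Python sketch: the 50 kW input is the figure used in the post above, while the 92% efficiency is a hypothetical value picked for illustration and is not a measured figure for any particular UPS.

```python
HOURS_PER_YEAR = 24 * 365
ASSUMED_EFFICIENCY = 0.92  # hypothetical; real units vary with design and load


def ups_output_kw(input_kw: float, efficiency: float) -> float:
    """Power delivered to the load for a given input power."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return input_kw * efficiency


def ups_loss_kw(input_kw: float, efficiency: float) -> float:
    """Power lost inside the UPS, ending up as heat."""
    return input_kw - ups_output_kw(input_kw, efficiency)


def annual_loss_kwh(input_kw: float, efficiency: float) -> float:
    """Energy lost over a year of continuous operation."""
    return ups_loss_kw(input_kw, efficiency) * HOURS_PER_YEAR


print(round(ups_output_kw(50.0, ASSUMED_EFFICIENCY), 1))  # 46.0
print(round(ups_loss_kw(50.0, ASSUMED_EFFICIENCY), 1))    # 4.0
print(round(annual_loss_kwh(50.0, ASSUMED_EFFICIENCY)))   # 35040
```

Under that assumption, 50 kW in yields about 46 kW out, and the missing 4 kW is heat that the cooling plant then has to remove as well.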
|
 |  davoice join:2000-08-12 Saxapahaw, NC Reviews:
·Comporium
1 edit | And these SAME flywheels... designed by the SAME company failed in *exactly* the same way at e^Deltacom/Quality Technology Services in Atlanta, GA last year. I think there's a design flaw in relying solely on the flywheels.
We were lucky... we didn't buy their claims of "impossible to have a power outage here" and we had rack mount APCs in our racks. We didn't immediately lose power but many other customers had major system failures due to the hard shutdown and subsequent blips. Power was out in the facility for almost 3 hours.
Oddly enough... the original network admins apparently didn't trust the flywheels either b/c all their core routing, switching, phone switches, etc. had independent battery backup too. Those of us smarties who brought our own UPSes had minimal downtime and were able to safely power down while they worked on the problem.
}Davoice | |
|  |  |  rebus9 join:2002-03-26 Tampa Bay
| Re: Reliable servers haha said by davoice:And these SAME flywheels... designed by the SAME company failed in *exactly* the same way at e^Deltacom/Quality Technology Services in Atlanta, GA last year. I think there's a design flaw in relying solely on the flywheels. IMO, the design flaw is the short window of opportunity to get the engine online and clutched in. I believe it's 60 seconds, which under normal circumstances is way more than needed to get a diesel cranked, spun to operating RPM, and the load transferred.
However, in a failure mode, that is not enough time for human intervention after a startup failure, before the flywheel runs out of inertia and the assembly stops rotating.
Good in concept (in that it completely eliminates the costs, risks, and disposal problems of batteries) but in practice it's a lot of risk to assume when the loads MUST NOT lose power. | |
|
|  |  |  |
| Re: Reliable servers haha My web host had a power failure today, and our server wasn't affected at all. Not even a hiccup:
quote: Aug 9, 2007, 4:24 PM - Utility Power Outage
Earlier this afternoon, as a series of powerful thunderstorms moved through the Pittsburgh area, our datacenter lost utility power for approximately 20 minutes. Our power needs were handled seamlessly by our UPS and generator systems. More storms are expected throughout this afternoon and evening -- our UPSes and generators are ready to again take over instantly if utility power is lost.
»www.pair.com/support/system_notices.html
| |
|
 public join:2002-01-19 Santa Clara, CA | said by iLive4Fusion: Seems like a big datacenter like 365 could at least afford a battery backup until the generators kick in. Well, if the available funding is used to pay the senior management, there is not enough left to actually hire staff with adequate skills to set up and test a backup power system. | |
|  |  rebus9 join:2002-03-26 Tampa Bay
| Re: Reliable servers haha said by public: said by iLive4Fusion: Seems like a big datacenter like 365 could at least afford a battery backup until the generators kick in. Well, if the available funding is used to pay the senior management, there is not enough left to actually hire staff with adequate skills to set up and test a backup power system.
It's not a matter of putting in a battery backup system. With their CPS already in place, talking about a datacenter-wide UPS system is like saying you need to carry a Toyota Corolla around in the trunk of your Impala in case the Impala breaks down. Or it's like carrying around 4 spare tires because there exists the possibility of getting more than 1 flat tire (i.e., you run through a pile of nails that falls off a truck in front of you on the freeway).
UPS systems are very expensive, not only to purchase, but to maintain. In the case of 365Main, it would be a backup system to a backup system. I personally am not a big fan of CPS systems, mainly because the window of opportunity is so small. (nor am I defending 365Main)
But at the same time, there has to be a line drawn in the sand. You either invest heavily in a CPS, or invest heavily in UPSes. When Abovenet & MFN built the facility, no expense was spared. If there was a premium option to be had, it was bought. What they didn't do, however, was install backup systems for their backup systems. | |
|  |  |  | | Re: Reliable servers haha If they adequately tested their current system though it shouldn't have failed or was it just an odd fluke? | |
|  |  |  |  rebus9 join:2002-03-26 Tampa Bay
| Re: Reliable servers haha said by iLive4Fusion:If they adequately tested their current system though it shouldn't have failed or was it just an odd fluke? From the information I've seen, it was a fluke. I don't know how many times 365Main has lost utility power at that facility, but you never hear about those events because the CPS makes power events invisible to their customers. The colo facility I'm in (not 365Main) has power issues several times per year, but the backup systems work. Usually. There was an event about 2 years ago where a UPS freaked out and caused an EPO throughout the facility. Our equipment was dark for about an hour while the electricians sorted it out.
Basically, we (customers) never even know when power events happen, because the backup systems just quietly do their job without fanfare. However, when a backup power system fails (no system is perfect) it makes headline news, we all scream and threaten to take our servers elsewhere. | |
|  |  |  |  |  | | Re: Reliable servers haha Correct, but it seems like a big company like Netflix would have 2 different providers or centers for backup instead of just one. It would be quite expensive though. | |
|  |  |  |  |  |  rebus9 join:2002-03-26 Tampa Bay
| Re: Reliable servers haha said by iLive4Fusion: Correct, but it seems like a big company like Netflix would have 2 different providers or centers for backup instead of just one. It would be quite expensive though.
Very expensive, and complex. As someone else said, if $Cost_Of_Redundancy > $Cost_Of_Downtime, then redundant systems are omitted since a business case for them cannot be made. Taking the Netflix example, I doubt they lost much business. Subscribers were inconvenienced, but no doubt tried back later and got things done. How many people do you think cancelled their subscription because of this isolated downtime event?
Now balance that minimal loss against the high cost of duplicating their infrastructure in a 2nd facility (along with the technical hassles of replication, etc.) to protect against a rare outage.
Now that doesn't apply to everyone. A major site like Amazon.com would lose a pile of money from an hour's downtime, so multiple sites is a necessity from a sales standpoint (even if we ignore the geographical load-time advantages). | |
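The $Cost_Of_Redundancy versus $Cost_Of_Downtime rule of thumb above can be sketched in a few lines of Python. Every dollar figure and outage estimate below is made up for illustration; none of them come from Netflix, Amazon, or 365Main, and the function names are hypothetical.

```python
def cost_of_downtime(expected_outage_hours_per_year: float,
                     loss_per_hour: float) -> float:
    """Expected yearly loss from outages, using a simple hours times rate model."""
    return expected_outage_hours_per_year * loss_per_hour


def redundancy_is_justified(cost_of_redundancy_per_year: float,
                            expected_outage_hours_per_year: float,
                            loss_per_hour: float) -> bool:
    """True when a second site costs less than the downtime it is meant to prevent."""
    downtime = cost_of_downtime(expected_outage_hours_per_year, loss_per_hour)
    return cost_of_redundancy_per_year < downtime


# Hypothetical subscription service: customers come back later, so little is lost.
print(redundancy_is_justified(500_000, 1.0, 10_000))     # False
# Hypothetical large retailer: every hour offline is lost sales.
print(redundancy_is_justified(500_000, 1.0, 1_000_000))  # True
```

A real business case would also weigh things the comparison ignores, such as reputation damage and the engineering cost of keeping two sites in sync.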
|
 | |
|