
gweidenh

join:2002-05-18
Houston, TX
kudos:3

1 edit

1 recommendation

reply to Blunderbuss

Re: [voip.ms] Is it rotten with bugs or is it incompetent users?

said by PX Eliezer70:

Why should servers be going down at all?

Just FYI...

Industry standard servers have an average incident rate of about 8%. How you should read that is that given a large enough sample size, you can expect 8% of your servers to have an 'incident' in any given 12 month period.
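To put numbers on that 8% figure, here is a minimal sketch. It assumes 8% is the per-server chance of at least one incident per year and that servers fail independently of each other, which is a simplification (shared power or cooling problems break that assumption). The function names are made up for illustration.

```python
def expected_incidents(servers: int, annual_rate: float = 0.08) -> float:
    """Expected number of servers that see at least one incident in a year."""
    return servers * annual_rate


def prob_any_incident(servers: int, annual_rate: float = 0.08) -> float:
    """Chance that at least one server has an incident in a year.

    Assumes incidents on different servers are independent.
    """
    return 1.0 - (1.0 - annual_rate) ** servers


if __name__ == "__main__":
    for n in (1, 10, 50):
        print(n, expected_incidents(n), round(prob_any_incident(n), 3))
```

The takeaway is that even a modest fleet should plan on seeing some incident every year, so the question becomes how the design absorbs it.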

Now, what defines an incident? It can range from a simple error message to a more catastrophic failure.

DIMM, power supply, fan, and HDD failures are the most common physical failure points. Not all of these bring a server down; that depends on the design of the server and how resilient to failure it is.

This of course does not take into consideration external factors (cooling issues, AC power disruption, IP connectivity issues, etc.).

And then of course, you have to take into account the infrastructure design. For example, when you hear the nebulous term 'cloud', one attribute is that failures are accounted for in the system architecture. A complete server can go down with no impact to the application or service.

Each provider will likely have their own architecture and the characteristics of the design will dictate whether the end user has any idea that there was a failure or not.

PX Eliezer70
Premium
join:2008-08-09
Hutt River
kudos:13
Excellent information, thank you.

----------------------

Indeed, it is clear that all situations are subject to this (not only VoIP providers).

So the metric we would have to be talking about is disruptions visible to the end user.

nitzan
Premium,VIP
join:2008-02-27
kudos:8

1 recommendation

reply to gweidenh
said by gweidenh:

Industry standard servers have an average incident rate of about 8%. How you should read that is that given a large enough sample size, you can expect 8% of your servers to have an 'incident' in any given 12 month period.

And that's why they invented high availability and (automatic!) fail over.
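As a rough illustration of what automatic failover means at its simplest, here is a sketch that walks a priority list and picks the first server that passes a health check. The server names and the `is_healthy` callback are hypothetical; a real setup would probe the server (for example with a TCP connect or an application-level ping) and is not necessarily how any provider in this thread does it.

```python
from typing import Callable, Optional, Sequence


def pick_server(servers: Sequence[str],
                is_healthy: Callable[[str], bool]) -> Optional[str]:
    """Return the first server, in priority order, that passes its health check."""
    for name in servers:
        if is_healthy(name):
            return name
    return None  # nothing healthy: time to page a human


# Hypothetical health state standing in for real probes.
status = {"primary": False, "secondary": True, "tertiary": True}
chosen = pick_server(["primary", "secondary", "tertiary"], lambda s: status[s])
print(chosen)
```

The interesting engineering is in the health check and in how quickly clients learn about the new target, which this sketch leaves out.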

MartinM
VoIP.ms
Premium,VIP
join:2008-07-21
kudos:4
said by nitzan:

said by gweidenh:

Industry standard servers have an average incident rate of about 8%. How you should read that is that given a large enough sample size, you can expect 8% of your servers to have an 'incident' in any given 12 month period.

And that's why they invented high availability and (automatic!) fail over.

Not to bash on you, Nitzan, but I don't think you're handling the same amount of traffic that we have to handle. Anyway, let's not get sidetracked.

Our failover will be automatic soon. The problems we've experienced with Toronto in the last few days are a capacity issue with the bandwidth routing at the data center level (out of our control). We're moving away from them. Each time Toronto went down, it did so erratically (back/gone/back/gone) because the provider was experiencing a DDoS attack on one of their customers (it happened many times, which is why we're moving away). Automatic failover / "High Availability" would not have done much here. It required human intervention each time. But we won't shift the blame. We are responsible for being hosted there in the first place.
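The back/gone/back/gone pattern is what makes naive automatic failover awkward: it flips traffic on every health-check change. One common mitigation is a hold-down, where traffic returns to the primary only after several consecutive good checks. The sketch below is a hypothetical illustration of that idea with made-up names and check data; it is not a description of VoIP.ms's setup, and it would not fix the upstream capacity problem itself.

```python
from typing import Iterable, List


def route_with_holddown(checks: Iterable[bool], recover_after: int = 3) -> List[str]:
    """For each health-check result of the primary, report where traffic goes.

    Traffic leaves the primary on the first failed check and returns only
    after `recover_after` consecutive successful checks.
    """
    routes: List[str] = []
    on_primary = True
    good_streak = 0
    for ok in checks:
        good_streak = good_streak + 1 if ok else 0
        if on_primary and not ok:
            on_primary = False
        elif not on_primary and good_streak >= recover_after:
            on_primary = True
        routes.append("primary" if on_primary else "backup")
    return routes


# Hypothetical flapping primary: up, down, up, down, ... then stable.
flapping = [True, False, True, False, True, False, True, True, True]
naive = route_with_holddown(flapping, recover_after=1)   # flips on every change
damped = route_with_holddown(flapping, recover_after=3)  # waits for stability
print(naive)
print(damped)
```

With `recover_after=1` the route bounces along with the primary; with a hold-down of 3 it stays on the backup until the primary has been stable for three checks.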

Last time Seattle went down for a whole weekend we didn't even get a single thread in that forum because we handled the situation well by redirecting the traffic to another POP in a timely and transparent manner.

However, it's always pleasant when another provider sneaks into one of the threads for a cheap poke.

There are now about 10 redundant threads about Toronto. I guess we can all choose our flavour and post in any of them.
--
Martin - VoIP.ms

nitzan
Premium,VIP
join:2008-02-27
kudos:8
said by MartinM:

Not to bash on you, Nitzan, but I don't think you're handling the same amount of traffic that we have to handle.

However, it's always pleasant when another provider sneaks into one of the threads for a cheap poke.

Pot, meet Kettle.

I wasn't trying to "cheap poke" by the way - I was just commenting on a technical aspect. I don't even know what kind of failover mechanisms you have (or don't have) in place.

NefCanuck

join:2007-06-26
Mississauga, ON
Reviews:
·voip.ms
reply to gweidenh
said by gweidenh:

said by PX Eliezer70:

Why should servers be going down at all?

Just FYI...

Industry standard servers have an average incident rate of about 8%. How you should read that is that given a large enough sample size, you can expect 8% of your servers to have an 'incident' in any given 12 month period.

Interesting average incident rate you quote there, 8%, covering everything from the minor glitch to the "Lucy, you got some 'splaining to do" level.

I just wonder what kind of percentages are seen in other industries by comparison, and what users' reactions would be in those cases?

I'm going to dig up some reliability stats for the auto industry now and see what kind of numbers I get (I'm betting the incident percentage there is higher, and I don't see calls for a return to the horse & buggy).

NefCanuck

nonymous
Premium
join:2003-09-08
Glendale, AZ
said by NefCanuck:

said by gweidenh:

said by PX Eliezer70:

Why should servers be going down at all?

Just FYI...

Industry standard servers have an average incident rate of about 8%. How you should read that is that given a large enough sample size, you can expect 8% of your servers to have an 'incident' in any given 12 month period.

Interesting average incident rate you quote there, 8%, covering everything from the minor glitch to the "Lucy, you got some 'splaining to do" level.

I just wonder what kind of percentages are seen in other industries by comparison, and what users' reactions would be in those cases?

I'm going to dig up some reliability stats for the auto industry now and see what kind of numbers I get (I'm betting the incident percentage there is higher, and I don't see calls for a return to the horse & buggy).

NefCanuck

But for critical needs like, say, banking or Visa cards, failovers are in place. The end users will not see any long failures.

NefCanuck

join:2007-06-26
Mississauga, ON
Reviews:
·voip.ms
That depends on the severity of the situation. You can try to plan for everything, but nothing is 100% guaranteed in life (other than death and taxes).

For example, last year during the holiday season, one of the major banks in Canada had its entire ATM network crap its pants for the better part of the day. That meant no access to cash via ATM or debit card services. It was horribly inconvenient, and I know a lot of businesses lost sales that day that they never made up, but it does happen.

The question is how much more are you willing to pay for the service to get closer to that impossible-to-achieve 100% uptime?
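For a sense of scale, here is a quick sketch of how much downtime per year each uptime level actually allows. The availability targets are just the usual "nines" picked for illustration, not figures from any provider in this thread, and the function name is made up.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_hours_per_year(availability_percent: float) -> float:
    """Hours of downtime per year permitted by a given uptime percentage."""
    return HOURS_PER_YEAR * (1.0 - availability_percent / 100.0)


if __name__ == "__main__":
    for pct in (99.0, 99.9, 99.99, 99.999):
        hours = downtime_hours_per_year(pct)
        print(f"{pct}% uptime allows about {hours:.2f} hours "
              f"({hours * 60:.1f} minutes) of downtime per year")
```

Each extra nine cuts the allowed downtime by a factor of ten, which is roughly why each one costs noticeably more than the last.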

NefCanuck