
|
  whiteshp
join:2002-03-05 Xenia, OH
| reply to KatOak Re: Packet Loss/Latency Reports
Anyone seeing ping spikes on the Dayton NAP, or is it just me? Things look near normal, then pings climb to 400+ for 10-15 seconds, then drop back down. I have asked tech support twice, just out of curiosity, and they act surprised and say there should be no problem here on our NAP. | |   Endorphine Endorphine Premium join:2002-09-30 Bremerton, WA
| reply to Yourself I also have seen a huge degradation in service this last week on the Seattle POP. I have been keeping ping plotter logs, and my first hop used to be 12-13ms all day long. As you can see from the picture below, once you get past 9am it goes to around 30-34ms. I also understood that they were going to be doing a major upgrade "this fall", but when is that going to be? I am also in the Speakeasy Gaming Beta test, but that will be a joke if I keep getting so much packet loss/drops in most of the games I play. I will try to be as patient as possible, but I hope Kat can give us an update soon.
Marc | |   Carl F
@Stanford.EDU
| reply to KatOak
Hi SSIO -
Please don't slip into being an apologist for SpeakEasy.
From every report I've read lately, the mandate given to Tech Support seems to be to avoid mentioning the systemic problems, and to make each caller feel as though they are the only one having the problem, or that the problem is normal and not a problem or what-have-you.
I agree with what you said on another thread about SpeakEasy having been great in the past. But companies change, especially when the board decides to bring in a new CEO.
SpeakEasy is working on the Quality of Service problems, but the Quality of Support has clearly degraded into a circling of the wagons, rather than the honest and open exchange that I felt I could always count on.
Please note that Kat is working hard on our Quality of Service issues, but is remarkably silent on our Quality of Support issues. Wouldn't you expect Quality of Support to be part of her charter, as Customer Experience Manager? With Quality of Service, she seems to be managing the SpeakEasy side, but with Quality of Support, she seems to be putting more energy into managing *us* and how we interpret and feel about SpeakEasy's support behavior.
Cheers,
Carl F. | |  Yourself
join:2000-08-10 Lynnwood, WA
| I, for one, have no intention of being an apologist for SE. My contract is up with one more payment and if this isn't resolved soon, I will move on to another ISP. This is the 4th trouble ticket on my circuit in 10 months. When I had a Verizon circuit through an independent ISP, I had a total of 4 service incidents in just under 4 years. So this is getting a bit tiresome....
Self | |   koitsu Premium join:2002-07-16 Mountain View, CA
| reply to KatOak Socket stalling problem has re-appeared tonight at about 23:30-23:45. Line has been working "flawlessly" (i.e. no socket stalls) since my last post. Ping Plotter shows happiness regardless (my ping packets are 56 byte, however).
I'm hoping this is being caused by some unmentioned maintenance somewhere, otherwise I'm going to get quite irate.
Back to the cable modem for the time being... -- Making life hard for others since 1977. | |   bhan261
join:2001-02-12 New York, NY | reply to whiteshp In the past day or two, the horrible spikes in P/L in NYC have been accompanied by huge ping spikes. In the words of Gilda Radner, "if it's not one thing, it's another." | |   Carl F
@speakeasy.n
| reply to KatOak
I have a question.
What kind of customer/application would fit the following profile?
1. Traffic is generated outbound, that is, for upload, in large bursts, but with substantial intervals in between.
2. The traffic is time sensitive, such that the customer would be willing to pay a premium to have the traffic delivered onto the network/internet as quickly as possible.
I'm asking this question because SpeakEasy seems to have made solid progress on both download bandwidth and download variability, but on the upload side, the progress has been adequate only on bandwidth.
One plausible reason is that they have a premium class of customer whose overall upload output allows SpeakEasy to maintain its promised upload bandwidth to all customers. That is, the customers in this premium class do not need disproportionate bandwidth. They just need to have other customers' traffic throttled when they are uploading, to assure high priority for their traffic, whenever they are uploading.
Customers like us, who are not in this premium upload class, would get the leftovers. We'd get our advertised upload bandwidth, maybe even more. However, whenever the premium customers were online, our upload variability would be unsuitable for many real-time interactive purposes, including those that SpeakEasy has advertised like Gaming.
Do such applications / customers exist? If so, what flavor of animal or vegetable are they?
Cheers,
Carl F. | |   LowJack
join:2000-07-19 Seattle, WA
| reply to Yourself said by Yourself : said by borborpa : Kat has said that Seattle has not yet been corrected, but will be in a week (or something). I'm sure she will let everyone know as soon as it's fixed, so we can see results.
The problem with that is that the tech support rep I spoke with 2 days ago said the interim fix had already been completed, which supposedly included my circuit. Obviously that is not the case.
Exactly. Tech support keeps lying to me.
I'm somewhat considering simply paying the $300 fee to cancel, and then going to cable, which is $20 a month for the next 6 months (a deal in my neighborhood) and then goes up to around $40 a month.
$300 + ($20*6) + ($40*6) = $660 / year
or I could continue with speakeasy
$105 X 12 = $1,260 / year | |   bhan261
join:2001-02-12 New York, NY
| No doubt many of us will start (or are already) doing this math. And no doubt some of us will come to the same conclusion that even with the $300 cancellation fee, it will be cheaper AND you get a more reliable circuit by going elsewhere. As I said a couple of months ago, the myth and the reality of Speakeasy being "worth more" are starting to diverge in the minds of many users.
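For anyone who wants to redo that math with their own numbers, here is a minimal sketch. The dollar figures are the ones quoted above; the 12-month comparison window and the function names are assumptions made for illustration, not anything Speakeasy or the cable company publishes.

```python
# Figures quoted in the thread; the 12-month horizon is an assumption.
CANCEL_FEE = 300          # one-time early-cancellation fee, USD
CABLE_PROMO = 20          # cable price per month during the promotion
CABLE_REGULAR = 40        # cable price per month after the promotion
SPEAKEASY_MONTHLY = 105   # current DSL price per month


def first_year_cost_switching(promo_months=6, months=12):
    """Cost of cancelling now and paying for cable over `months` months."""
    regular_months = months - promo_months
    return CANCEL_FEE + CABLE_PROMO * promo_months + CABLE_REGULAR * regular_months


def first_year_cost_staying(months=12):
    """Cost of keeping the current DSL line over `months` months."""
    return SPEAKEASY_MONTHLY * months


switching = first_year_cost_switching()
staying = first_year_cost_staying()
print(f"switch: ${switching}  stay: ${staying}  difference: ${staying - switching}")
```

With the quoted figures, switching comes to $660 and staying to $1,260 for the first year. Change the constants to match your own contract before drawing conclusions.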
And let me anticipate the response from certain SE fans....
Yes, we know your circuit is fine but do you "get" that ours isn't?
Yes, we know that RADSL is a "best efforts" service but do you "get" that there's good reason to suspect that "best efforts" aren't being given?
Yes, we like Kat and appreciate the work she does but it does not compensate for the fact that we're paying a premium price for what is a sub-premium service right now. (Again, yes, we KNOW your circuit is perfect.) | |   whiteshp
join:2002-03-05 Xenia, OH
| reply to KatOak I've done a little digging. I won't condone how the techs have treated some of you, though I doubt they were given a choice; I'm sure someone in upper management thought delay would help stem the hemorrhage of customers. I don't think it is the best decision, but it sounds like someone in upper management is hoping that, for the general user, what is not known is not there. Thing is, Speakeasy has many propeller heads who are well connected to each other.
If you log your pings like I do, you can see that ICMP pings are flooding into all IPs. With routers having to check every packet to decide where it goes, millions of pings multiplied by thousands of infected computers is a nightmare for any router CPU. Add in that Microsoft Exchange servers must have ICMP, and you can't just block it or you lose your business.
For those wondering whether premium customers are protected while the others are left to the wolves: I was told by tech that my SDSL priority would help shield me. But my SDSL line is affected exactly as, and at the same time as, my ADSL line. I can watch pings on both lines grow at the same time and even hit packet loss. If you think about it, it makes sense. Packet priority only works when your CPU has enough time slices to check all the packets (and decide which one should go first). The CPUs in this case (in the routers connecting Speakeasy and some other ISPs to Internap) can't keep up. So I can vouch, as an SDSL customer with business class routing, that my packets are getting missed too.
I have been told a (hoped) ETA that some new network equipment will be going into CHI at the end of this month to the early part of October. This should help a lot as everyone goes through CHI one way or another.
To stay or go? A lot of other providers are having problems with these worms as well. This is not isolated to just Speakeasy, so if I left I'd need to be careful. The other side of the coin is that I LIKED my service quite well before all this garbage happened, and I hate to lose it. I've been with quite a few providers who really did SUCK. I can't speak for the rest of you, but the ETA is fairly close. It would take me longer to sign up for new DSL service. So I'll wait and watch the date. If it gets fixed, good, life goes on as expected. If not, I may be forced to check my alternatives. | |   Carl F
@speakeasy.n
| reply to KatOak Hi Whiteshp -
This does go round and round. One after another engineer "connected" to SpeakEasy offering similar scenarios that do not make contact with critical data available on these threads.
SpeakEasy's own "propeller heads" need to be telling us the story, directly. Leaking off the record to you is not nearly so helpful as would be talking on the record to all of us, the user base.
1. I'm sorry to hear that business class SDSL is giving you grief as well. But that is a far cry from a definitive statement that there are NO premium classes of the kind I describe.
Mind you, I have no data that such a class exists. It just occurred to me that such a premium class would be consistent with observation, and that someone with much more experience than I have might be able to tell me if such an animal exists, and what kind of customer/application it might be. If no one makes any concrete suggestions over the next few days, I'll take it that this is just pointless speculation on my part.
2. Interesting story about CPU time-slicing... Are routers really implemented with general purpose processors in this way? I would have thought this would have been done primarily in ASICs, to be able to keep up. In either case, whether the algorithm is in software or silicon, it is not rocket science to build an arbiter that assigns priority on any of a number of criteria.
Again, I don't know nearly enough about network infrastructure equipment to know what traffic management policies are available to the ISPs who own and operate the equipment. I'm asking questions to which I'm looking for answers. Your time-slicing story doesn't fit, because I would expect much greater packet loss as opposed to just stalls, if queues were thrashing in the way you suggest. If I think about it, it doesn't make sense to me. But maybe I'm just another asshole with an opinion...
3. The infection rate for Blaster was about 1%. If the propeller heads you're connected to indicate that SpeakEasy's infection rate was much higher, then SpeakEasy owes us some explanation as to how the virus targeted SpeakEasy's customers with so much greater precision than other ISPs' customers, for example, a security breach of some broadly construed kind.
I have no data that says that the SpeakEasy customer base had significantly more infections than average. Personally, I think stories about bizarrely high infection rates are just red herrings, and that SpeakEasy was not infected at a significantly higher rate than other ISPs. At the same time, SpeakEasy does seem in the 90th percentile or higher in terms of impact incurred, and that does need explaining...but in terms of network architecture, not in terms of infection rates that imply a security or some other egregious issue. On this one, I trust SpeakEasy, and reject conjectures about high infection rates at SpeakEasy.
4. Again, here are the data that need explaining (round numbers, from memory, as my ability to get to these forums has been all wonky for the last few days).
6Sep03, Midday:
================
Download 350Kbps +/- 180Kbps, Upload 290Kbps +/- 1700Kbps (not a typo!!)
19Sep03, Midday:
================
Download 635Kbps +/- 26Kbps, Upload 640Kbps +/- 300Kbps
This behavior shows a remarkable amount of structure, especially for a system under stress. Degraded performance is usually progressively more chaotic, less predictable performance. Why not in this case, with SpeakEasy?
A. Why do download and upload bandwidths track each other so closely, both improving by about 1X (that is, roughly doubling)?
B. How do upload and download bandwidths track each other so closely, when the upload variability is about 10X the download variability? (This is really the most striking data point, given the randomness of traffic on the internet.)
C. Why do variabilities (the +/-) track each other so closely, both improving by about 6X?
D. Why is upload variability consistently so much greater than download variability?
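The ratios behind questions A through D can be checked directly from the two measurements. This is only a sketch: the numbers are the round figures quoted from memory above, and the dictionary layout and function names are assumptions made for illustration.

```python
# (mean, spread) in Kbps, copied from the two midday measurements above.
sep06 = {"download": (350, 180), "upload": (290, 1700)}
sep19 = {"download": (635, 26), "upload": (640, 300)}


def ratios(before, after):
    """Per direction: how much the mean rose and how much the spread shrank."""
    out = {}
    for direction in ("download", "upload"):
        mean_before, spread_before = before[direction]
        mean_after, spread_after = after[direction]
        out[direction] = {
            "bandwidth_gain": mean_after / mean_before,
            "spread_reduction": spread_before / spread_after,
        }
    return out


def upload_vs_download_spread(sample):
    """How many times larger the upload spread is than the download spread."""
    return sample["upload"][1] / sample["download"][1]


for direction, r in ratios(sep06, sep19).items():
    print(direction, round(r["bandwidth_gain"], 1), round(r["spread_reduction"], 1))
print(round(upload_vs_download_spread(sep06), 1),
      round(upload_vs_download_spread(sep19), 1))
```

On these figures, bandwidth rises by a factor of roughly 1.8 (download) and 2.2 (upload), the spread shrinks by roughly 6.9 and 5.7 times, and the upload spread is roughly 9 to 12 times the download spread on both dates. That is consistent with the "about 6X" and "about 10X" figures in the questions.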
To me these look, not like the random failures of a system under stress, but rather like traffic management policies that remained remarkably stable despite the high stress of the Blaster worm. But what do I know, being fairly far away from my areas of expertise?
To be very clear here, I'm not saying I know the answers to any of these questions. I'm not a networking engineer. I do believe these are all good questions, and that SpeakEasy would do best to give us the answers to these questions, in the voice of their own propeller heads. Those propeller heads built what was a great ISP until just very recently, so I am sure that they are a really competent bunch of folks with some substantial grip on what is really happening, whether or not they are allowed to say.
Cheers,
Carl F. | |   borborpa Slipping Slowly Into Oblivion Premium join:2002-02-20 New Cumberland, PA clubs:
·Speakeasy
| said by Carl F: I have no data that says that the SpeakEasy customer base had significantly more infections than average. Personally, I think stories about bizarrely high infection rates are just red herrings, and that SpeakEasy was not infected at a significantly higher rate than other ISPs.
Actually, SE is still at the high end of it all, along with rr.com. I use MyNetWatchman to filter my Zone Alarm logs and send them to the proper ISPs. I would say 75% of the reports go to RR. Because of the way the worms choose their IP address, SE is at considerable risk from RR infected computers. According to »www.mynetwatchman.com/ListIncide···ider.asp , RR has 3.5 TIMES more incidents than any other ISP...so that definitely takes its toll on SE. I'm still seeing over 3000 hits per day, per IP (I have 8). That's a LOT of extra overhead just tramping its way around the internet all day pointlessly.
Of course, SE definitely needs to address the issue by sending out information to infected subscribers (which I'm sure they are doing). They need to start getting a little strong-arm though, and if they see after 2-3 days the infected computer is STILL going at it, suspend service until that person corrects the problem. RR REALLY needs to get on the ball, and get their systems fixed as well... -- There are no stupid questions, but there are a LOT of inquisitive idiots.[AIM - BoyBandsMakeUGay] | |   Carl F
@speakeasy.n
| reply to KatOak Hi SSIO -
For those of us who are relative newbies, your post needs a little unpacking. 
1. What's the relationship between "incidents" and "infected computers"?
2. Are you saying that SpeakEasy has more infected computers within its customer base, or that SpeakEasy IP addresses are more frequently targeted from infected computers outside of SpeakEasy's customer base?
3. Could you elaborate a little more how the worm chooses its target IP addresses, and why that makes SpeakEasy more vulnerable?
4. What is the relationship of rr.com to SpeakEasy? Why would rr.com's incidents take a toll on SpeakEasy, more than any other ISP? (Also, what's the home URL for rr.com?)
5. My understanding is that there are decent, basic firewall programs available for free. What's the disincentive to SpeakEasy to do exactly what you're saying, that is, give folks a reasonable time to resolve the problem, under penalty of suspension of service? I should think that losing income from a small percentage of customers would not begin to approach the costs of the equipment required to carry the extra load. Not to mention the costs of all the calls to Tech Support, and the clear damage to their reputation.
6. Better yet, can SpeakEasy, as part of its "load-balancing", write a Perl-script or something that will dynamically segregate all symptomatic computers to their own router, until a computer becomes asymptomatic? I think that would definitely qualify to match each customer's embodiment of "best effort", segregating the bad citizens to their own little community, where they can degrade each other's service, in isolation from the rest of us. 
7. Why, under random stress, does SpeakEasy's service degrade and then recover in such a well-structured way, if, as SpeakEasy claims, no traffic throttling or other traffic management policies are in force? Apart from everything you've said, these data still need explanation from SpeakEasy. To me, at least, the Blaster worm was the stressor that clearly exposed this well-structured behavior. But whatever is imposing this structure on SpeakEasy's traffic flow is, I think, the root cause of the degraded service. Traffic generators like worms amplify the damaging impact, but the root cause looks to me like some kind of traffic management policy that assures average bandwidth at the expense of increased variability.
Thanks ahead of time for your elaborations.
Cheers,
Carl F. | |   Grocers Pet
@speakeasy.n
| 5. My understanding is that there are decent, basic firewall programs available for free. What's the disincentive to SpeakEasy to do exactly what you're saying, that is, give folks a reasonable time to resolve the problem, under penalty of suspension of service? I should think that losing income from a small percentage of customers would not begin to approach the costs of the equipment required to carry the extra load. Not to mention the costs of all the calls to Tech Support, and the clear damage to their reputation.
What's a reasonable amount of time? I suggest 3 days after notification. The reason is the brain-numbing amount of apathy I experience when telling people that having a 24/7 always-on net connection means a bit of responsibility. The response I often hear is: I am not into security, it won't happen to me. Until it's too late.
It's not "if", it's "when".
6. Better yet, can SpeakEasy, as part of its "load-balancing", write a Perl-script or something that will dynamically segregate all symptomatic computers to their own router, until a computer becomes asymptomatic? I think that would definitely qualify to match each customer's embodiment of "best effort", segregating the bad citizens to their own little community, where they can degrade each other's service, in isolation from the rest of us.
Computers just don't become 'asymptomatic' like getting over a cold.
Segregation, as we have seen through history, has never really solved a thing. | |   borborpa Slipping Slowly Into Oblivion Premium join:2002-02-20 New Cumberland, PA clubs:
·Speakeasy
| reply to Carl F
said by Carl F: 1. What's the relationship between "incidents" and "infected computers"?
The incidents are abuse E-Mails sent to rr.com for infected computers, after they receive a certain number of complaints about a specific IP address. There are only 1400 people that use MyNetWatchman, so the 650+ reports to rr is a lot.
quote: 2. Are you saying that SpeakEasy has more infected computers within its customer base, or that SpeakEasy IP addresses are more frequently targeted from infected computers outside of SpeakEasy's customer base?
No, SE is more frequently targeted by rr.com addresses. See more below.
quote: 3. Could you elaborate a little more how the worm chooses its target IP addresses, and why that makes SpeakEasy more vulnerable?
The full explanation is at »www.f-secure.com/v-descs/msblast.shtml , but in basic terms, it takes your IP, and goes out from there. SE IPs start with 66.92 and 66.93. rr's major IP space sits at 66.91, so since SE is so close, it becomes an instant target.
quote: 4. What is the relationship of rr.com to SpeakEasy? Why would rr.com's incidents take a toll on SpeakEasy, more than any other ISP? (Also, what's the home URL for rr.com?)
See above. RoadRunner.com should be their main page.
quote: 5. My understanding is that there are decent, basic firewall programs available for free. What's the disincentive to SpeakEasy to do exactly what you're saying, that is, give folks a reasonable time to resolve the problem, under penalty of suspension of service? I should think that losing income from a small percentage of customers would not begin to approach the costs of the equipment required to carry the extra load. Not to mention the costs of all the calls to Tech Support, and the clear damage to their reputation.
Aside from bitchy customers, there's no disincentive at all. That's why I suggested it.
quote: 6. Better yet, can SpeakEasy, as part of its "load-balancing", write a Perl-script or something that will dynamically segregate all symptomatic computers to their own router, until a computer becomes asymptomatic? I think that would definitely qualify to match each customer's embodiment of "best effort", segregating the bad citizens to their own little community, where they can degrade each other's service, in isolation from the rest of us.
That is actually a little too hard to try and do. Since nothing exists to do that, they would have to put time, money, and effort into creating that...which would be a bit too much.
quote: 7. Why, under random stress, does SpeakEasy's service degrade and then recover in such a well-structured way, if, as SpeakEasy claims, no traffic throttling or other traffic management policies are in force? Apart from everything you've said, these data still need explanation from SpeakEasy. To me, at least, the Blaster worm was the stressor that clearly exposed this well-structured behavior. But whatever is imposing this structure on SpeakEasy's traffic flow is, I think, the root cause of the degraded service. Traffic generators like worms amplify the damaging impact, but the root cause looks to me like some kind of traffic management policy that assures average bandwidth at the expense of increased variability.
Well, once you begin maxing out a connection, it does become very unstable, and variable. Routers are "trained" to drop certain types of packet when they become overloaded. I'm not the pro on this end of things, so my knowledge is limited to what I have seen and heard. -- There are no stupid questions, but there are a LOT of inquisitive idiots.[AIM - BoyBandsMakeUGay] | |   Carl F
@speakeasy.n
| reply to KatOak Hi Grocers Pet -

I didn't mean that computers become asymptomatic by themselves, as though their immune system eventually overcomes an infection.
I did mean that after infected users do a virus clean-up and install a firewall, their computers will observably stop generating lots of recognizably bogus traffic--using particular protocols and ports--and thereby be observably asymptomatic.
Also, I think I'd like to narrow your statement about segregation just a little. Segregation based upon arbitrary, uncorrelated, superstitious, etc. criteria has never solved anything, and almost always makes things worse.
In this case by contrast, we'd be imposing a temporary segregation in order to get a behavior change, that is, installation of a firewall and antiviral software. That makes my proposal more like giving a child a time-out, until they're ready to behave in a responsible, well-socialized way.
Perhaps 'segregation' was not the best choice of words, given the many injustices with which it's associated.
Cheers,
Carl F. | |  Bondman
join:2001-08-24 Livonia, MI
| It looks like the Detroit NAP to Chicago has gone to ***** again. I guess I spoke too soon in my message earlier in this topic. I am getting packet loss and double ping times since this morning.....
:(
Ping statistics for 64.81.159.2:
Packets: Sent = 76, Received = 73, Lost = 3 (3% loss),
Approximate round trip times in milli-seconds:
Minimum = 14ms, Maximum = 44ms, Average = 32ms
:( | |   Carl F
@speakeasy.n
| reply to KatOak Hi SSIO -
Thanks for your detailed, informative post.
I now have some understanding as to why SpeakEasy is in the 90+th percentile in terms of Blaster impact.
Perhaps someone with relevant expertise can chime in and model how the degradation comes to be so uniform, if there are no restraining traffic management policies in place. (As Bateson said, if he walked into a room and saw a group of monkeys banging out Shakespeare on typewriters, he'd look inside the typewriters. )
As to your statement that no switching products provide the information to do analyses of traffic for behavioral signatures:
While it would make no sense to build any specific heuristics into hardware, ASICs are these wonderfully parallelized processing engines, where you could capture and buffer attributes of the stream, packet by packet, as it flows through, without impeding the stream. You just fan-out copies of the attributes into parallel pipeline stages that write to dual-port buffers, that yet other processes read out.
You then write the software that uses heuristics to scan the stream, in this case, looking for IPs exhibiting problematic behavior. Once a worm was captured and reverse-engineered, heuristics could be very tailored. Since I would guess that processing would need to make multiple passes over the data, the software itself could also be pipelined, to run across multiple machines in the kind of cheapo, multi Linux-on-Intel server farm that is used for ASIC development.
The hardware buffers would likely overflow doing a 24/7 data capture. But intermittent sampling of an hour or so would seem feasible, and could be triggered during meltdown traffic conditions. Even if the offline heuristic analysis took a little while, I'm guessing it would be faster, more sensitive and more reliable than whatever ISPs are doing now to identify infected machines.
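To make the offline pass concrete, here is a minimal sketch of the kind of heuristic I have in mind, run over a captured sample. Everything about it is an assumption for illustration: the (source, destination, destination port) record layout, the threshold of 50 targets, and the synthetic addresses. The one fixed point is TCP port 135, which is the port Blaster probes.

```python
from collections import defaultdict

# Assumptions for illustration only: the record layout and the threshold.
SUSPECT_PORT = 135           # TCP port probed by Blaster
MIN_DISTINCT_TARGETS = 50    # hypothetical cut-off for "scanning" behavior


def find_scanners(records, port=SUSPECT_PORT, threshold=MIN_DISTINCT_TARGETS):
    """Return the source IPs that contacted at least `threshold` distinct
    destinations on `port` within the sampled records."""
    targets = defaultdict(set)
    for src, dst, dst_port in records:
        if dst_port == port:
            targets[src].add(dst)
    return sorted(src for src, dsts in targets.items() if len(dsts) >= threshold)


# Synthetic sample: one host sweeping a /24 on port 135, one host browsing.
sample = [("66.92.1.10", f"66.92.7.{i}", 135) for i in range(1, 101)]
sample += [("66.92.1.11", "64.81.159.2", 80)] * 20

print(find_scanners(sample))
```

Counting distinct destinations rather than raw packets is a deliberate choice here: a busy but well-behaved host talks to a few addresses many times, while a scanning worm touches many addresses once each. A real deployment would need tuning against false positives, which this sketch does not attempt.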
I have to say that it would astonish me to think that nothing like this exists, and that a total newbie like me could invent something like this in a newsgroup, especially since various flavors of DOS attack have been a problem for years now. If the ability to capture this flow-through data does not exist in the current generation of switching products, perhaps this is the beginning of a lucrative business plan... 
Cheers,
Carl F. | |   Grocers Pet
@speakeasy.n
| reply to Carl F Carl,
I was just razzin ya on the choice of words (re: segregation)

"In this case by contrast, we'd be imposing a temporary segregation in order to get a behavior change, that is, installation of a firewall and antiviral software. That makes my proposal more like giving a child a time-out, until they're ready to behave in a responsible, well-socialized way. "
That sounds like a good plan. I think instead of just speakeasy-wide, this should be internet wide. Whoever is still blasting my home apache testing server with code red should be sent to their room for 6 months  | |   koitsu Premium join:2002-07-16 Mountain View, CA
| reply to Carl F Carl,
Keep one thing in mind: Juniper routers are NOT hardware-based. FE or ATM cards are used in the boxes, sure, but the actual processing is NOT done via hardware (re: ASICs). Junipers are pure software routers, running a BSD derivative. Ciscos, on the other hand, do a lot of hardware-based routing, and IOS simply interfaces with the appropriate chip (while some other features are handled purely by software).
I'm not going to say whether this is the cause of what's going on, but in my eyes, software routers are absolute garbage for exactly the reason you see here. Junipers are a strong contender in a router market that's presently dominated by Cisco, for one sole reason: price. They're much more affordable, which is why so many ISPs and backbone providers are steering away from Cisco.
Sadly, cutting costs will simply bite them in the ass later when their customers cancel due to the performance impact of the newly-deployed equipment (oops, did I say that out loud?). We went through this exact situation at Verio back in 2001 or so; the Junipers made router administration much easier, and the unit itself was a lot easier to use. However, the performance... ugh ugh ugh... -- Making life hard for others since 1977. | |
|