
 | arp info overwritten (OpenBSD) I've made an interesting observation with my broadband connection (Mediacom)... I am using OpenBSD 3.3 for a router/firewall, and have been receiving the following notices after having been connected for a time...
arp info overwritten for [gateway IP address] by [mac address 2] on [NIC]
arp info overwritten for [gateway IP address] by [mac address 1] on [NIC]
arp info overwritten for [gateway IP address] by [mac address 2] on [NIC]
arp info overwritten for [gateway IP address] by [mac address 1] on [NIC]
arp info overwritten for [gateway IP address] by [mac address 2] on [NIC]
arp info overwritten for [gateway IP address] by [mac address 1] on [NIC]
and so on...
The gateway IP address remains the same (duh), but the MAC address alternates between two addresses only (and only for a split second before it switches back).
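For anyone who wants to check their own logs for the same pattern, here is a minimal Python sketch. It assumes the notices look like the ones quoted above; the IP address, MAC addresses, and interface name in the sample are made up for illustration.

```python
import re
from collections import defaultdict

# Pattern follows the notice text quoted in the post.
NOTICE = re.compile(
    r"arp info overwritten for (?P<ip>\S+) by (?P<mac>[0-9a-fA-F:]+) on (?P<nic>\S+)"
)

def flapping_ips(log_lines):
    """Return {ip: [MACs in order first seen]} for IPs overwritten by more than one MAC."""
    seen = defaultdict(list)
    for line in log_lines:
        match = NOTICE.search(line)
        if match is None:
            continue  # not an arp overwrite notice
        mac = match.group("mac").lower()
        if mac not in seen[match.group("ip")]:
            seen[match.group("ip")].append(mac)
    return {ip: macs for ip, macs in seen.items() if len(macs) > 1}

# Hypothetical sample data, not taken from the real logs.
sample = [
    "arp info overwritten for 10.0.0.1 by 00:00:0c:aa:aa:02 on fxp0",
    "arp info overwritten for 10.0.0.1 by 00:00:0c:aa:aa:01 on fxp0",
    "arp info overwritten for 10.0.0.1 by 00:00:0c:aa:aa:02 on fxp0",
]
print(flapping_ips(sample))
```

An IP that shows up in the result is being claimed by more than one MAC address, which is the symptom described here.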
I know it is possible to spoof the gateway, thus enabling the spoofing box to listen to traffic from a customer's misled connection. However, I am wondering if Mediacom has some type of misconfiguration with a backup gateway server (this would cause the symptom as well, no?).
The annoying thing about this is that sometimes I will not be able to connect for a few minutes after I get the notice. In fact, there have been numerous times that I've still had to reset the modem in order to get the connection up again.
I've emailed them about this, but have not received a response (so far, they wouldn't even verify whether or not both MAC addresses officially belong to them).
Any thoughts or ideas on this?
Also, I previously used FreeBSD as the gateway, and I remember seeing the error on numerous occasions as well...
Thx,
WizLayer | |  | After a while of examining _many_ tcpdumps, I have determined that there is no other plausible explanation for this except that Mediacom has a misconfig in one of their routers.
This _shouldn't_ cause a loss of Internet accessibility though, even if the current connections are reset, because the arp info is rewritten (switched back) within a second or two.
Other times (and this may be coincidence, because this also seems to happen regardless of the arp overwrites), I find that I am able to ping the gateway, access the DNS server, but nothing beyond. Of course, when I cycle power to the modem, all seems to return to normal.
That leads me to my next question (and this is for you Mediacom, tech folks)... Why is it that if I can ping the gateway and get response, ping 'www.yahoo.com,' get the IP resolution, but no echo reply, ping the IP address which your DNS server sends to me, get no echo, and then get on the phone with my mother and have her ping the same IP address and get reply, that I need to "reset the modem?"
Because I can ping the gateway, there's nothing wrong between the gateway and my box (including the modem).
Because I can ping www.yahoo.com and get resolution, there's nothing wrong between your DNS server, your gateway, and my box (including the modem). It _does_ suggest that there's a problem between your gateway and www.yahoo.com, though (because I get no return from the outgoing echo requests).
Because I get no reply when pinging the IP address of www.yahoo.com directly, it shows that there is a definite problem _somewhere_ and it has nothing to do with your DNS server.
Because my Mother can get on her box (from one of those _other_ ISPs ) and get reply from pinging that same IP address, it shows that www.yahoo.com is not dropping echo requests.
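The elimination steps above can be written down as a small decision function. This is only a sketch of the reasoning in this post, in Python; the function name and the result strings are invented for illustration, and it assumes nobody along the path is filtering ICMP echo traffic, which would make the ping results misleading.

```python
def locate_fault(gateway_ping, dns_resolves, remote_ping, remote_ping_from_elsewhere):
    """Narrow down where a connectivity fault lies from four yes/no checks."""
    if not gateway_ping:
        return "local box, modem, or link to the gateway"
    if not dns_resolves:
        return "DNS server or the path to it"
    if remote_ping:
        return "no fault found"
    if remote_ping_from_elsewhere:
        # The remote host answers other people, so it is up and replying.
        return "somewhere between the ISP gateway and the remote host"
    return "remote host is down or dropping echo requests"

# The situation described in the post: gateway and DNS fine, remote ping fails
# from here but works from another ISP.
print(locate_fault(True, True, False, True))
```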
I'm not trying to sound cynical or anything, but "you need to reset the modem" isn't the solution here. If it were a problem with the modem, then I wouldn't be able to interact with the gateway, right? It sounds more like I'm having to back out of the system altogether and connect again, which would be a jury-rig, not a fix (kind of like a reset button on a M$ box).
You've got to be aware of the problems, because I know for a fact that I'm not the only one who has noticed it, and I'm not the only one who has reported it either.
The last time I reported it to support, I included an excerpt from tcpdump, with system messages, dhclient configs (showing that nothing bogus was written to them by your DHCP server/s), and outputs from the previously mentioned pings, showing good connection with the gateway, and the email was forwarded to upper management for further review. I say this because I don't want anyone to think I'm being negative here. I'm just applying good troubleshooting techniques with a bit of common sense to help identify the problem so it can get fixed.
Also, I appreciate all the timely responses and positive attitudes when I call tech support, but for real... What's going on?
thx
WizLayer | | |
|  DewiPremium join:2001-09-28 united kingd | Are you sure this is not indicative of ethernet frame collisions caused by two NICs with the same MAC ADDRESS on the same net? | |  | This _would_ bring on the arp info overwritten notices and cause some major connection problems, yes (and I would hope that Mediacom would immediately notice such an attempt and deal with it)... however, when arp info is overwritten the first time, it wouldn't almost immediately switch back if that were the case.
If someone were to "hijack" my connection, then I would lose connection altogether until he/she was done using it... that is, unless the hijacker _also_ allowed my box to connect through his (that was my initial concern going into this).
Unfortunately, my knowledge of cisco routers is lacking (it's not like I have one at the home to play with ). btw... If you know of a good book on cisco, let me know... I'd love to learn it.
When I first started putting the pieces together, I figured that _somehow_ I must have set my box up wrong. I posted the specifics to a newbies forum for OpenBSD (go to »mailman.theapt.org/pipermail/ope···522.html to see my last post to them regarding this).
What you'll find in the tcpdump is that during regular operation, [MAC address 1] is sent requests and [MAC address 2] replies to each request (that suggests routing). The MAC addresses themselves are Cisco's. Sometimes, [MAC address 2] (as opposed to [MAC address 1]) decides it wants to know who I am and that is when arp info is overwritten. BUT... It is immediately written back.
This is Greek to me for the most part because like I said, I know nothing of cisco. I DO know that nothing is misconfigured in my system, though. And the little I have read on this (long live google!), arp messages are either due to someone hijacking the account (which would have been obvious enough that I would have picked up on it even if Mediacom didn't), a network card being changed out (a one-time switch only), or a misconfigured cisco router. The misconfigured cisco router is the only one that fits this scenario.
As for the rest of the problem (connection problems), this may be something different altogether. I'm not so sure that these two are related. I can take my LAN to a friend's house, connect it directly to his DSL service, and the problems simply don't exist. Kinda weird (that was a 'peace of mind' test for me), but it at least indicates where the problem/s is/are at.
I was just wanting to know if any tech reps would give me more than, "we've sent your email to upper management" because surely they know of this (and if they don't, then goodness... why not?)...
thx
Mike | | UHF join:2002-05-24
| reply to WizLayer I've been getting ARP floods here for a long time. None of the Mediacom techs that post here seem to want to touch the posts about them with a 10 foot pole... I'm not sure why that is, but I suspect a router problem is causing it. 99.7% of the traffic seen by my modem is ARP requests, according to my Snort reports. | | BAINCH join:2003-04-02 Middletown, NY | I'll touch it, not that I'm a tech. We are investigating. It doesn't appear to be a router issue. I will report more soon. | |
| reply to UHF said by UHF: I've been getting ARP floods here for a long time. None of the Mediacom techs that post here seem to want to touch the posts about them with a 10 foot pole... I'm not sure why that is, but I suspect a router problem is causing it. 99.7% of the traffic seen by my modem is ARP requests, according to my Snort reports.
Greetz, mon...
Arp traffic is a necessity with DHCP service. ARP = "Address Resolution Protocol" = the protocol used to help maintain meaningful conversation between a gateway server and every client that connects via DHCP. In more detail, the ARP protocol is used to verify whether or not an IP address is in use (so the server knows whether or not it can lease the IP address out to someone else). As you know, Mediacom has hundreds of customers connected to the same gateway that you are connected to. Therefore, there are lots of ARP requests flying around...
When you do a tcpdump (snort, or whatever), you will notice hundreds of such packets being broadcast ("broadcast" = sent to the entire network [ff:ff:ff:ff:ff:ff], and not an individual connection). Unless you are getting hundreds of arp type packets addressed _only_ to you (ie, addressed specifically to _your_ IP address), then you are in no way being flooded. That's just the way this type of network setup works.
Check out »www.rfc-editor.org/rfc/rfc2131.txt. It's everything you wanted to know (and probably more) about how DHCP works. If you find that the link didn't make sense, then I'm sure there are other sites out there that can explain it to an easier crowd... Google is your friend.
"Arp info" is changed _only_ when the MAC addresses of the connected computers have changed. Any such change which could imply a security breach _should_ be reported, so the reports I'm getting are actually a good thing.
To update the thread, I just received a reply from the tech reps the other day, and it turns out there was a problem with one of their routers (it was reloading 3-4 times a day for some reason). They've replaced the defunct router, and the connection has gotten _much_ better.
I'm still having to reset the modem (even though I can ping the gateway and get reply). That leads me to believe that there are still issues, but it's nothing compared to what it was. I don't mind the problems so much, as long as I know that they're being worked on... (and I like to know what's going on too because I'm still learning).
Oh... And the pings I mentioned earlier in the thread... The reason I couldn't get echo requests beyond the gateway is because mediacom drops them both ways (kind of a bummer, but hey... I drop _everything_ incoming unless it's return, so I can certainly understand and even appreciate their measures).
Cheers,
WizLayer
btw... If I wanted to visit the local Mediacom station (which, for me is in Moultrie), does anyone know what the policy on that would be? Does Mediacom allow people to come in and get a tour of the place? I'd love to look around and ask lots of questions. [text was edited by author 2003-10-24 02:24:55] | | 
| I've posted about the ARP flooding before - never got any rational response.
What ARP is for is mapping an IP address to a physical (Ethernet in most cases) address. When you see an ARP REQUEST, that's a host asking:
I'm at 1.2.3.4, Ethernet 12:34:56:78:90:12 - what is the ethernet addr for ip address A.B.C.D?
These are always BROADCAST. Most hosts will see the broadcast and remember and cache the source information. The host that has ip address A.B.C.D sends an ARP RESPONSE (typically directly to the sender, but sometimes in broadcast as well)
hey you at 1.2.3.4, Ethernet 12:34:56:78:90:12 - I'm ip addr A.B.C.D, ethernet addr AB:CD:EF:GH:IJ:KL
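To make the exchange above concrete, here is a rough Python sketch that packs and unpacks the 28-byte ARP payload for Ethernet and IPv4 as laid out in RFC 826. The target address 1.2.3.5 and the replying MAC ab:cd:ef:01:02:03 are made-up stand-ins for the A.B.C.D and AB:CD:EF:GH:IJ:KL placeholders, and the helper names are my own. Nothing is sent on the wire; the broadcast itself happens in the Ethernet frame header (destination ff:ff:ff:ff:ff:ff), which is not part of this payload.

```python
import socket
import struct

ARP_FORMAT = "!HHBBH6s4s6s4s"  # RFC 826 field layout for Ethernet + IPv4
REQUEST, REPLY = 1, 2

def mac_to_bytes(mac):
    return bytes.fromhex(mac.replace(":", ""))

def build_arp(oper, sender_mac, sender_ip, target_mac, target_ip):
    """Pack an ARP payload (without the Ethernet frame header)."""
    return struct.pack(
        ARP_FORMAT,
        1,       # hardware type: Ethernet
        0x0800,  # protocol type: IPv4
        6, 4,    # MAC and IPv4 address lengths in bytes
        oper,
        mac_to_bytes(sender_mac), socket.inet_aton(sender_ip),
        mac_to_bytes(target_mac), socket.inet_aton(target_ip),
    )

def parse_arp(packet):
    """Unpack an ARP payload back into readable fields."""
    _, _, _, _, oper, sha, spa, tha, tpa = struct.unpack(ARP_FORMAT, packet)
    return {
        "oper": oper,
        "sender_mac": sha.hex(":"),
        "sender_ip": socket.inet_ntoa(spa),
        "target_mac": tha.hex(":"),
        "target_ip": socket.inet_ntoa(tpa),
    }

# Request: the target MAC is unknown, so it is left as all zeros.
request = build_arp(REQUEST, "12:34:56:78:90:12", "1.2.3.4", "00:00:00:00:00:00", "1.2.3.5")
# Reply: the owner of 1.2.3.5 answers with its own MAC.
reply = build_arp(REPLY, "ab:cd:ef:01:02:03", "1.2.3.5", "12:34:56:78:90:12", "1.2.3.4")
print(parse_arp(request))
print(parse_arp(reply))
```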
I think this really has very little to do with DHCP (which is a way for a host to discover its IP address).
Anyway, most hosts keep caches of ARP mappings - i.e. they remember that IP address A.B.C.D mapped to Ethernet address AB:CD:EF:GH:IJ:KL - usually for 20 minutes or so. As traffic passes, these mappings are updated, so if a host is relatively active, its ARP mapping should essentially never time out. However, this cache is limited in size, so if it fills up, entries will drop out, requiring them to be re-discovered. Routers are no different in this - they should cache ARP table mappings for a period of time.
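A toy model of such a cache, in Python, may help show why a small table leads to repeated re-discovery. The class name, the eviction policy (drop the entry refreshed longest ago), and the sample addresses are assumptions for illustration; real hosts and routers differ in both timeout and eviction behaviour.

```python
from collections import OrderedDict

class ArpCache:
    """Size-limited IP-to-MAC cache with a per-entry timeout."""

    def __init__(self, max_entries, ttl_seconds=20 * 60):
        self.max_entries = max_entries
        self.ttl = ttl_seconds
        self._entries = OrderedDict()  # ip -> (mac, last_seen)

    def update(self, ip, mac, now):
        """Record or refresh a mapping, evicting the stalest entry if full."""
        self._entries.pop(ip, None)
        self._entries[ip] = (mac, now)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)

    def lookup(self, ip, now):
        """Return the cached MAC, or None if missing or timed out."""
        entry = self._entries.get(ip)
        if entry is None:
            return None
        mac, last_seen = entry
        if now - last_seen > self.ttl:
            del self._entries[ip]
            return None
        return mac

# Hypothetical addresses; times are plain seconds so the example is deterministic.
cache = ArpCache(max_entries=2)
cache.update("10.0.0.1", "aa:aa:aa:aa:aa:01", now=0)
cache.update("10.0.0.2", "aa:aa:aa:aa:aa:02", now=1)
cache.update("10.0.0.3", "aa:aa:aa:aa:aa:03", now=2)  # pushes out 10.0.0.1
print(cache.lookup("10.0.0.1", now=3))  # evicted, would need a new ARP request
print(cache.lookup("10.0.0.2", now=3))
```

Every lookup that returns None corresponds to another ARP request on the wire, so a table that is too small for the sub-net keeps generating broadcasts.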
I expect there are some tweaks to this process in a Cable Modem DOCSIS environment, but I haven't downloaded the DOCSIS documents to figure out what they might be.
I believe the ARP storms we see are probably due to the fact that many cable modem nodes are running with fairly large sub-nets - we've got a /22 here - which means there can be up to 1024 addresses (1022 usable hosts) on the same sub-net. I expect this is too large for the local routers to handle - their ARP caches fill up, so they end up dropping nodes out, and having to constantly rediscover them.
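The /22 arithmetic can be checked quickly with Python's standard ipaddress module. The 10.0.0.0 network address below is just an example, not the real sub-net.

```python
import ipaddress

def subnet_size(prefix_len):
    """Total IPv4 addresses in a sub-net with the given prefix length."""
    return 2 ** (32 - prefix_len)

net = ipaddress.ip_network("10.0.0.0/22")  # example network, assumed
print(net.num_addresses)      # 1024 addresses in total
print(net.num_addresses - 2)  # 1022 once network and broadcast addresses are excluded
print(subnet_size(22) == net.num_addresses)
```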
This could also be due to a misconfiguration: if the router thinks its sub-nets are larger than they really are, then if it thinks it has traffic for an IP address in one of its sub-nets, it'll try to ARP for that host. But if the host is not there, either it's off or it's on a different sub-net, it'll get no response and will keep trying. This is the only way I can explain why the gateways are ARP'ing for the same IP address over and over again frequently.
And, with the various viruses running around, probing every possible IP address, that really pushes up the problem as now the routers are seeing traffic for *every* IP address in the sub-net, regardless of if the host is there or not. This really kills the ARP cache.
I would still like Mediacom to do something about it. The ARP traffic, while small, chews up bandwidth. Since cable modems are shared, we're all paying the price for this in more latency and less overall available bandwidth. It also takes processing time on the routers, hurting performance there as well.
Oh, one other thing: I've had @Home/Mediacom cable modem at the same location since it first became available, almost 5 years ago. The ARP storm problem only really became noticeable in the last year or so, which is why I think it's a configuration/router problem. Overall, I'm quite happy with Mediacom, although I wish they'd give us a little more upstream BW. [text was edited by author 2003-10-28 09:50:20] | | UHF join:2002-05-24
| said by TazMainiac: I've posted about the ARP flooding before - never got any rational response.
What ARP is for is mapping an IP address to a physical (Ethernet in most cases) address.
I know what ARP is, what I want to know is, why do I see ARP requests for the same host multiple times per minute?? That indicates a problem. Since Bainch is looking into it, I assume it will eventually get better. I don't know who that guy is, but he seems to be able to get things done around there. | |  | said by UHF:
I know what ARP is, what I want to know is, why do I see ARP requests for the same host multiple times per minute??
That's the same thing I posted about before and got no response. I can frequently count a dozen or more requests for the SAME host from the gateway in under a minute. My conjecture is that the viruses running around are ping sweeping various IP address ranges and the routers don't have enough ARP cache RAM to handle all the hosts in their sub-nets. So ARP tables overflow all the time, causing lots of ARP request storms.
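The "dozen or more requests for the same host in under a minute" observation can be checked mechanically once the ARP requests have been pulled out of a capture. The Python sketch below assumes that extraction is already done and the input is a time-sorted list of (timestamp in seconds, target IP) pairs; the function name, the threshold of 12, and the sample addresses are illustrative choices, not anything from tcpdump or Snort.

```python
from collections import defaultdict, deque

def repeated_targets(requests, window=60.0, threshold=12):
    """Return target IPs ARPed for at least `threshold` times within `window` seconds.

    `requests` is an iterable of (timestamp_seconds, target_ip), sorted by time.
    """
    recent = defaultdict(deque)  # target_ip -> timestamps still inside the window
    flagged = set()
    for timestamp, target in requests:
        times = recent[target]
        times.append(timestamp)
        while timestamp - times[0] > window:
            times.popleft()  # forget requests older than the window
        if len(times) >= threshold:
            flagged.add(target)
    return sorted(flagged)

# Hypothetical capture: 10.0.0.7 is asked for every 4 seconds, 10.0.0.9 once.
capture = [(4.0 * i, "10.0.0.7") for i in range(12)] + [(50.0, "10.0.0.9")]
capture.sort()
print(repeated_targets(capture))
```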
Hopefully BAINCH will see something... | | BAINCH join:2003-04-02 Middletown, NY | Well, the issue seems to be caused by a particular type of home gateway/router that reports a generic MacID when it has an error. Get two on the same part of the network (and thus two devices with the same MacID) and it causes an arp issue. We are trying to determine what material impact this has on customer performance (if any) and what the best course of action would be to resolve it. | | UHF join:2002-05-24
| Nice. You would think manufacturers would try to stick to the 802.3 specs for ethernet devices and not do screwy things with the MACs. I sure would like to know which device this is so I can be sure I don't buy one.
Thanks for the update Bainch. | | | reply to BAINCH said by BAINCH: Well, the issue seems to be caused by a particular type of home gateway/router that reports a generic MacID when it has an error. Get two on the same part of the network (and thus two devices with the same MacID) and it causes an arp issue. We are trying to determine what material impact this has on customer performance (if any) and what the best course of action would be to resolve it.
Is this something that Mediacom is seeing, or is this something that I would be seeing? I can tell you for certain that what I am seeing is a misconfig of Mediacom's router service (I've already proven that to them, and I've also verified it by asking other people who are knowledgeable enough to explain it). I say that because this _sounds_ like upper management's answer to something that they don't want to spend the time/money on hiring someone who knows what they're doing to fix it.
What do you mean, "generic MacID?" Furthermore, can you identify "a particular type of home gateway/router?" Are you talking about a hardware router, a gateway, a firewall system with NAT, etc? That sounds a little like upper management's bs as well... (I figured I would ask because perhaps you could shed some light on that).
As for the post suggesting that ARP has little to do with DHCP... Perhaps you should read a little more into it (the link provided in my last post explains this very clearly... Read the Friendly Manuals). Without ARP, DHCP simply won't work efficiently. That's just the way it was designed.
Furthermore, _if_ there is a virus problem as someone has suggested, then it would be a problem with either lazy sysadmin or mediacom trying to do a NIX job with a MS product (not trying to sound anti-MS or anything, but... well... sometimes the truth hurts).
I haven't run a scan to find out what OSes mediacom is running locally, but I would at least _hope_ that they're not that bad off (Out of I-don't-know-how-many, I actually got to hold conversation with two people who at least knew what they were talking about, so I still give them the benefit of the doubt, there).
If there _are_ needless ARP requests for a single IP address being broadcast as you have suggested, then what you're seeing is a client that has either been mis-configured or compromised (both of which are common, btw, because people in general are clueless). If Mediacom doesn't have the know-how (or at least the desire) to pinpoint this, then perhaps you would be wise to bring the specifics to their attention (such as firewall logs, explaining what each entry means)... Perhaps they'll be able to figure out who it is from that and get them straightened out.
Reiteration... (not trying to be too obvious, here) The original problem posted in this thread has been resolved. Mediacom had a defunct router, the router was replaced (I guess they couldn't figure out how to fix it), and the problem that I had (past tense) no longer exists. The rest of this thread has nothing to do with it...
Cheers,
WizLayer | | BAINCH join:2003-04-02 Middletown, NY | The router issue was a bad I/O card and was just responding, badly, to the ARP issue. It wasn't the cause. The problem was created by three Linksys BEFW11S4v4s. It has been acknowledged by Linksys and they have a fix available on their website (a firmware upgrade).
We have been calling the customers with the offending device and helping to resolve it. | |  | reply to WizLayer i've been having the arp flood problem for over a month now. sometimes 2000+ arp requests per second. here's a snort log from a couple of weeks ago: »www.quantumfish.com/files/snort.log.txt i don't have any linksys hardware. my cablemodem is a motorola and it doesn't matter what device is plugged into it for the arps to be received. so, any other ideas? | |  | BAINCH was not saying that the people who were getting the floods had linksys hardware. What was stated is that the particular routers were the cause of the floods. | |  | reply to sigsegv Sig,
These logs look about normal... except... I would wonder why there are so many dhcpds at the helm... A couple, okay... but my quick look through it shows 8. That seems a little much. Mediacom would be better to split this up a little more (especially if they expect growth in their customer base).
It could also be that they are not filtering the noise between clusters, which would be an easy fix.
Of course, it could also be due to idiot clients' gateways who haven't yet learned to point dhcpd in the right direction.
If you're getting 2000+ arp requests per second at any time, then it should _definitely_ get split up... or they should round up the idiot clients and hold a public hazing. (Whichever the case may be...)
Try to get a dump when arp traffic is like you say. Then send it to Mediacom and ask them to explain it. If they tell you that you need to reset your modem, that you must be configured incorrectly, or that they have no idea what you're talking about, then insist on talking to someone else.
Good luck,
WizLayer
| | | reply to BAINCH said by BAINCH: The problem was created by three Linksys BEFW11S4v4s. It has been acknowledged by Linksys and they have a fix available on their website (a firmware upgrade).
I did a quick search for it, and can't seem to find anything. Can you post a link that explains what the issue is with this hardware?
Thx,
WizLayer | | BAINCH join:2003-04-02 Middletown, NY | I'm away for a few days but I will see if I can look it up from here. | | wth join:2002-02-20 Iowa City,IA
| reply to WizLayer I see that Linksys is now up to Ver 1.45.7 dated 9/16/03 firmware for the BEFW11S4 V4. That's typical for linky (I have one also). Mine was purchased in June and had Ver 1.45.1 firmware which would just die during data transfer. Tried to upgrade to 1.45.3 and locked up the unit. Sent it back to linksys for new replacement which came with 1.45.3. If Linksys would spend 2-3 months additional testing before releasing new products, they would make a killing, but their profits are going to customer support issues. | |
|