edil join:2010-06-16 Bayamon, PR |
edil
Member
2014-Nov-1 6:50 pm
Please help me to discern router/interface WAN performance?
Hi, I have several remote sites connected to the central office, which receives the WAN links (Ethernet VLANs) on a trunk port in my router. The service provider's equipment and my router (Cisco 3800) have the interface set at 1 Gbps (SM SFP/GBIC). So it is like a point-to-point scenario, but all of the links (VLANs) are received on one interface. The remote sites have the same router and the interface speed set accordingly, so the aggregated bandwidth on the trunk port is around 500 Mbps.
Now, the specs of the router say that it can process from 350 Kpps to 500 Kpps but that the wirespeed is equivalent to T3/DS-3. That roughly means that it can handle 200 Mbps of traffic on an interface of 45 Mbps. How does that work? Anyway, when I use iperf I cannot get even close to the 45 Mbps value measuring from the fastest remote site (I did use multiple connections and different TCP parameters in order to saturate the link).
So I decided to set our network management platform to collect historical data on the router's trunk interface and the sub-interfaces (also on the remote sites; remember, same router). I found that the average sent+received bps on the trunk interface is not higher than 35 Mbps, but then the sub-interfaces report average values of 40 Mbps. I don't understand that. Do I pay attention only to the "parent" interface values and not the sub-interfaces?
Something else: again, the interface is T3/DS-3 wirespeed, but then I have sporadic peak values going up to 150 Mbps. How can that be? Why am I getting values that don't match the router specs? Is the average sent+received Mbps meaningful?
Please help me understand the concepts of wirespeed vs. bandwidth and even throughput, and enlighten me on how to properly measure link performance.
meta
Member
2014-Nov-1 7:46 pm
to edil
...also see routerperformance.pdf for the go-to baseline performance of a piece of Cisco kit BAREMETAL. Generally speaking, it'll give you a ROUGH idea of what to expect for performance... but it also has the caveat of "test according to your setup / config."
...and without knowing what you have configured, no way to tell.
The other one is whether the carrier themselves is doing any rate-limiting / policing once it leaves your device... unless you're running your own dedicated fiber.
tl;dr - if you think you're getting screwed for performance, rip it out, stick it on a bench, iperf it to heck and compare to routerperformance.pdf.
My 00000010bits
Regards
edil join:2010-06-16 Bayamon, PR |
edil
Member
2014-Nov-1 10:37 pm
Thanks for your responses.
I'm checking the post.
Can you also explain the relation between wirespeed and bandwidth? How can the specs say that it can handle 200 Mbps but then the wirespeed is only 45 Mbps? Should I care about the values of the trunked interface or the virtual sub-interfaces? I'm not running any VPN or tunnel, just routing with OSPF.
to edil
said by edil: Can you also explain the relation between wirespeed and bandwidth?
In relation to... what exactly? To me, "wirespeed" generally indicates it can go as fast as the wire itself, be it a copper FE or GigE interface, a fiber GigE interface, an xWIC card, etc. "Bandwidth" is rather a general term...
said by edil: How can the specs say that it can handle 200 Mbps but then the wirespeed is only 45 Mbps?
If you're referring to the specs in routerperformance.pdf, then as I said earlier: the paper gives a rough approximation of what the equipment will do BAREMETAL... likely between the two built-in 1000BASE-T copper interfaces in the 3845 chassis. If you plug this into a T-3/DS-3, well, it's no longer simply passing traffic from one 1000BASE-T copper interface to another, so the limiting factor is the size of the T-3/DS-3 circuit itself.
...where it gets REALLY interesting, as I alluded to earlier, is if the carrier is handing you a physical T-3/DS-3, but you're paying for a fractional circuit and/or a certain guaranteed Committed Information Rate, e.g. 10 Mbps CIR, burst to 25 Mbps, say. If so, all that means is the carrier guarantees you can pump 10 Mbps all day long and never drop a packet, but if your traffic need / load increases, you can "burst" to 25 Mbps, and anything above that 10 Mbps CIR is not guaranteed. If you want to read more into this material, may I suggest looking into what carrier SLAs are.
said by edil: Should I care about the values of the trunked interface or the virtual sub-interfaces?
Depends on your needs / setup of your environment. Offhand, if you have a FE or GigE LAN interface entirely controlled and managed by you, and it can't move traffic at FE or GigE speeds, something ain't right...
said by edil: I'm not running any VPN or tunnel, just routing with OSPF.
May help to supply your config... minus any non-RFC1918 addresses and passwords.
My 00000010bits
Regards
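The CIR / burst arrangement described above can be pictured as splitting an offered load into three slices. This is only a rough sketch: the function name `split_offered_load` is made up for illustration, the 10 / 25 Mbps defaults are the example figures from the post, and what a carrier actually does with traffic above the burst rate (drop, remark, queue) depends on the contract and is an assumption here.

```python
def split_offered_load(offered_mbps, cir_mbps=10.0, burst_mbps=25.0):
    """Split an offered load (Mbps) against a CIR/burst contract.

    Hypothetical helper: the defaults are the example numbers from the
    post, not values from any real carrier contract.
    """
    # Traffic up to the CIR is the part the carrier commits to deliver.
    guaranteed = min(offered_mbps, cir_mbps)
    # Traffic between the CIR and the burst rate may pass, with no guarantee.
    burstable = max(0.0, min(offered_mbps, burst_mbps) - cir_mbps)
    # Anything above the burst rate is assumed not to get through at all.
    excess = max(0.0, offered_mbps - burst_mbps)
    return {"guaranteed": guaranteed, "burstable": burstable, "excess": excess}


for load in (8.0, 20.0, 30.0):
    print(load, split_offered_load(load))
```

An iperf run that plateaus well below the port speed is consistent with this kind of policing, although it does not prove it; the carrier's SLA is the place to confirm the actual rates.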
edil join:2010-06-16 Bayamon, PR |
edil
Member
2014-Nov-2 4:02 pm
Hellfire, thanks for your comments.
said by HELLFIRE: In relation to... what exactly?
said by HELLFIRE: If you plug this into a T-3/DS-3
said by HELLFIRE: Depends on your needs / setup of your environment.
said by HELLFIRE: May help to supply your config
The specs that I originally had (now confirmed by the PDF you provided) say that it can handle pps equivalent to 180 Mbps with 45 Mbps wirespeed. The service provider gives me an Ethernet service with VLANs that I set on a 1 Gbps (single-mode fiber SFP) dot1q port. The aggregated bandwidth received from the remote locations on the trunk interface should be 500 Mbps, each VLAN CBR. So following what you said, with those specs, I do have a bottleneck on the router, correct?
Now, if the physical interface (no matter how many dot1q VLAN sub-interfaces) can only manage 45 Mbps, then how do I get sporadic average peak values up to 150 Mbps sent+received? If that's my scenario, should I measure the sub-interfaces or the main interface? Are the average sent+received bps values significant? Are those values meaningful to determine interface utilization?
As I say, the configuration is pretty straightforward, just routing with OSPF. Here it is:
version 12.4
service nagle
no service pad
service tcp-keepalives-in
service tcp-keepalives-out
service timestamps debug datetime msec localtime show-timezone
service timestamps log datetime msec localtime show-timezone
service password-encryption
service sequence-numbers
boot-start-marker
boot-end-marker
security authentication failure rate X log
security passwords min-length X
logging buffered XXXXX debugging
no aaa new-model
resource policy
clock timezone GMT XXX
no ip source-route
ip cef
ip tcp synwait-time 10
no ip bootp server
ip domain name XXXXXXXX
ip name-server XXXXXXXX
ip name-server XXXXXXXX
ip dhcp-server XXXXXXXX
interface LoopbackB
ip address XXXX
interface GigabitEthernetA/B/A
no ip address
ip route-cache flow
negotiation auto
interface GigabitEthernetA/B/A.BG
encapsulation dot1Q BG
ip address XXXXXX
ip nbar protocol-discovery
ip flow ingress
ip flow egress
no snmp trap link-status
no cdp enable
interface GigabitEthernetA/B/A.AH
encapsulation dot1Q AH
ip address XXXXXX
ip nbar protocol-discovery
ip flow ingress
ip flow egress
no snmp trap link-status
no cdp enable
interface GigabitEthernetA/B/A.BA
encapsulation dot1Q BA
ip address XXXXX
ip nbar protocol-discovery
ip flow ingress
ip flow egress
ip pim dense-mode
ip ospf cost XXXX
ip ospf hello-interval X
ip ospf dead-interval X
ip ospf retransmit-interval X
ip ospf transmit-delay X
no snmp trap link-status
no cdp enable
interface GigabitEthernetA/B/A.BC
encapsulation dot1Q BC
ip address XXXXXXXXX
ip nbar protocol-discovery
ip flow ingress
ip flow egress
no snmp trap link-status
no cdp enable
interface GigabitEthernetA/B/A.AF
encapsulation dot1Q AF
ip address XXXXX
ip nbar protocol-discovery
ip flow ingress
ip flow egress
no snmp trap link-status
no cdp enable
router ospf X
log-adjacency-changes
area X stub no-summary
redistribute static subnets
network XXXXXX area X
network XXXXXX area X
network XXXXXX area X
network XXXXXX area X
network XXXXXX area X
ip route 0.0.0.0 0.0.0.0 XXXXXXX
ip route XXXXXX XXXXXX XXXXXX
ip route XXXXXX XXXXXX XXXXXX
ip route XXXXXX XXXXXX XXXXXX
ip flow-export version 5
ip http server
ip http authentication local
no ip http secure-server
ip http timeout-policy idle 60 life 86400 requests 10000
logging trap debugging
logging facility localX
logging XXXXX
snmp-server community XXXXX
snmp-server community XXXXX
no cdp run
control-plane
scheduler allocate 20000 1000
ntp clock-period XXXX
ntp master
ntp update-calendar
ntp server XXXXX
ntp server XXXXX prefer
said by edil: The specs that I originally had (now confirmed by the PDF you provided), says that it can handle pps equivalent to 180 Mbps with 45 Mbps wirespeed.
With no services. You have OSPF and SNMP running. They will soak up processor cycles and put a dent in those figures. Not much of one, but a dent nonetheless.
to edil
said by edil: The specs that I originally had (now confirmed by the PDF you provided), says that it can handle pps equivalent to 180 Mbps with 45 Mbps wirespeed.
Okay, I'm going to stop you right there and ask "which line(s) are you looking at to get that info?" I should've also asked "exactly which make / model of 3800 are you using?"... as there's the 3845 and the 3825, which are two different beasts altogether. Rereading routerperformance.pdf, these are the numbers Cisco publishes for this platform:
Platform    Process Switching     Fast/CEF Switching
            PPS       Mbps        PPS        Mbps
ISR 3825    25,000    12.8        350,000    179.20    No
ISR 3845    35,000    17.92       500,000    256.00    No
Also when you read the PDF, it indicates at the top "Numbers are given with 64 byte packet size, IP only, and are only an indication of raw switching performance." So exactly HOW you got "45Mbps wirespeed," I'm a little lost. If you're asking how CISCO got those PPS : Mbps numbers, it's a straight formula of PPS x 64 (bytes) x 8 (bytes to bits conversion) = throughput in bps; divide by 1,000,000 to get Mbps. Make sense?
said by edil: The Service Provider give me an ethernet service with VLANs that I set on a 1 Gbps (single mode fiber SFP), dot1q port. The aggregated bandwidth received from the remote locations on the trunk interface should be 500 Mbps, each vlan CBR.
said by edil: So following what you said, with those specs, I do have a bottleneck on the router, correct?
So if I understand, you're asking if any 3800 platform can move 500Mbps... well, based on the numbers above, the answer is a clear "No." Taking a look at your config seems to confuse things as much as it clarifies them. It may help to get us those numbers you're looking at in your "Network Management Platform," but offhand, I'd say a) you've got a carrier and the carrier's gear in the mix, which adds variables to your speedtest, b) what make / model of device wires into your GigA/B/A interface exactly? and c) moving traffic between subinterfaces throws things for a loop, but I'd definitely monitor on the subints rather than the physical interface.
If you want the least amount of variables, take out the 3800 from the head office, take one of the branch devices from a spoke site, wire them up with a Cat5E / 6 cable between 'em, THEN iperf it.
My 00000010bits
Regards
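The PPS-to-Mbps formula quoted above is easy to check with a few lines of Python. The sketch below reproduces the four published figures and then turns the formula around to ask how many packets per second a 500 Mbps load would need at two packet sizes. The function names are invented for illustration, and the calculation ignores Ethernet framing overhead (preamble, inter-frame gap), so treat the results as approximations.

```python
def pps_to_mbps(pps, packet_bytes=64):
    """Throughput in Mbps for a packet rate at a fixed packet size."""
    return pps * packet_bytes * 8 / 1_000_000


def mbps_to_pps(mbps, packet_bytes):
    """Packet rate needed to carry a given Mbps at a fixed packet size."""
    return mbps * 1_000_000 / (packet_bytes * 8)


# Published 64-byte figures from the table in the post.
published_pps = {
    "ISR 3825 process": 25_000,
    "ISR 3825 fast/CEF": 350_000,
    "ISR 3845 process": 35_000,
    "ISR 3845 fast/CEF": 500_000,
}
for name, pps in published_pps.items():
    print(f"{name}: {pps_to_mbps(pps):.2f} Mbps at 64 bytes")

# Packets per second needed to carry 500 Mbps at small vs. large packets.
for size in (64, 1500):
    print(f"500 Mbps at {size} bytes needs {mbps_to_pps(500, size):,.0f} pps")
```

The second loop shows why packet size matters so much: the same 500 Mbps needs roughly 977,000 pps at 64 bytes but only about 42,000 pps at 1500 bytes. The published Mbps column is therefore a worst-case figure for minimum-size packets, not a ceiling that applies to every traffic mix.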
edil join:2010-06-16 Bayamon, PR |
to markysharkey
Oh OK, if that's the case then add NetFlow, NBAR and the NTP server.
edil |
to HELLFIRE
said by HELLFIRE: "exactly which make / model of 3800 are you using...
said by HELLFIRE: SO exactly HOW you got "45Mbps wirespeed,"
It is the 3825. From the Cisco web site: "The Cisco 3825 Integrated Services Router provides the following support: Wire-speed performance for concurrent services such as security and voice, and advanced services at up to half T3/E3 rates"
Adding more to the "soup," the 3825 has reached "end of sale" status. I took note of your testing procedure; it's just that it is hard with equipment that is in production.
Hey guys, again, you have been very helpful. Keep it up!
to edil
I'll add a few notes.
First of all, what is your network management platform and how are you collecting data? Something like Cacti or Spiceworks with a low poll period will work fine. You will want to graph all interfaces and subinterfaces to get as much information as possible, and also, crucially, CPU usage.
For your bandwidth tests, are you doing a "show proc cpu" while running them? You should be; this will tell you where your CPU sits. If it's maxed or near max, you know that's an issue.
NetFlow and NBAR will hog a lot of CPU and should be disabled if possible; nothing else I see in your notes or config will use much of anything. We have a 2821 we use only as a default gateway for a few internal VLANs due to some jumbo frame issues, so it only does that and EIGRP, nothing else, and I can do right at 500 Mbps on it, granted this is large-packet traffic.
With nothing besides basically OSPF, there is no reason the 3825 can't do more than 500 Mbps.
Given that, as long as your needs aren't or won't be more advanced (firewall/VPN/etc.), this type of setup would be perfect for a layer 3 switch. We have a 3750-X stack we use for some VLANs with a 500 Mbps L2 PTP on it, and the CPU usage rise when we peg out the 500 Mbps is literally not noticeable. If at some point you wanted to move in this direction, you could look for a used 3560G or something along those lines.
So, IMHO you need to look closely at your CPU usage while doing some testing, see what it looks like, and go from there.
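If you capture the output of "show proc cpu" during each iperf run, the summary line can be pulled apart with a short script. A minimal sketch follows, assuming the usual IOS summary format ("five seconds: total%/interrupt%"); the sample line is invented for illustration and is not output from the router in this thread, and the function name is hypothetical.

```python
import re

# Summary line printed at the top of "show processes cpu" on IOS.
CPU_LINE = re.compile(
    r"CPU utilization for five seconds: (\d+)%/(\d+)%; "
    r"one minute: (\d+)%; five minutes: (\d+)%"
)


def parse_cpu_summary(text):
    """Return the CPU percentages from a 'show processes cpu' capture."""
    match = CPU_LINE.search(text)
    if match is None:
        raise ValueError("no CPU utilization summary line found")
    total, interrupt, one_min, five_min = map(int, match.groups())
    return {
        "five_sec_total": total,
        # The figure after the slash is time spent at interrupt level,
        # which is where fast/CEF-switched packets are handled.
        "five_sec_interrupt": interrupt,
        "five_sec_process": total - interrupt,
        "one_min": one_min,
        "five_min": five_min,
    }


# Invented sample line, for illustration only.
sample = "CPU utilization for five seconds: 92%/85%; one minute: 60%; five minutes: 41%"
print(parse_cpu_summary(sample))
```

A high interrupt figure during a test suggests the box is busy forwarding packets; a high process figure points more toward a feature or process that is eating cycles.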
edil join:2010-06-16 Bayamon, PR |
edil
Member
2014-Nov-3 9:43 am
Cooldude, indeed I'm using show proc cpu while using iperf. In the past I found that certain access lists and access groups were using a lot of CPU, and I got rid of them.
I will check without NetFlow and NBAR.
aryoba
MVM
2014-Nov-3 10:27 am
said by cooldude9919: First of all what is your network management platform and how are you collecting data? Something like cacti or spiceworks with a low poll period will work fine, you will want to graph all interfaces and subinterfaces to get as much information as possible, and also crucially, CPU usage.
edil, am I correct to assume that you have no network management platform proactively collecting data?
to edil
Okay, thanks for confirming which make / model of 3800 we're looking at.
said by edil: Wire-speed performance for concurrent services such as security and voice, and advanced services at up to half T3/E3 rates"
...and the overall lesson from that is "there's MARKETING numbers, and there's ACTUAL numbers." Of which a) I'd take routerperformance.pdf over anything, and b) be aware of and follow the caveats routerperformance.pdf mentions.
Don't know if you have any spare identical gear that you can match the code and config to test with. Otherwise, PITP*
You may also want to look at the following FAQ item for some more ideas to try.
I can agree NBAR may be a CPU hog -- I've never used it before to be sure -- but NetFlow IIRC is already built by CEF; all the device has to do at that point is send a copy of the CEF table periodically, no?
My 00000010bits
Regards
*Proof's In The Pudding
edil join:2010-06-16 Bayamon, PR |
to aryoba
said by aryoba: First of all what is your network management platform and how are you collecting data?
Orion NPM; indeed I'm polling all interfaces. I'm getting sporadic peak values that go up to 150 Mbps, but those are so few among thousands of samples (data has been collected for months) that they don't affect the average values.
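Averages and peaks like these both come from the same thing: differences between interface octet counters divided by the time between polls. The sketch below shows how a short burst disappears into a longer average. The counter values are invented so that one 10-second interval runs at 150 Mbps among 29 intervals at 30 Mbps; it assumes 64-bit counters that do not wrap, and it is not a description of how Orion NPM computes its figures.

```python
def rates_mbps(samples):
    """Convert (timestamp_seconds, octet_counter) samples to Mbps per interval.

    Assumes samples are in time order and the counter never wraps.
    """
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        rates.append((c1 - c0) * 8 / (t1 - t0) / 1_000_000)
    return rates


# Invented traffic: 29 ten-second intervals at 30 Mbps, one at 150 Mbps.
interval_s = 10
per_interval_mbps = [30] * 29 + [150]

samples = [(0, 0)]
for i, mbps in enumerate(per_interval_mbps, start=1):
    octets = mbps * 1_000_000 * interval_s // 8
    samples.append((i * interval_s, samples[-1][1] + octets))

fine = rates_mbps(samples)                        # one rate per 10 s poll
coarse = rates_mbps([samples[0], samples[-1]])    # one rate per 300 s poll

print("peak over 10 s polls:", max(fine), "Mbps")
print("average over one 300 s poll:", coarse[0], "Mbps")
```

With these made-up numbers the 10-second view peaks at 150 Mbps while the single 300-second poll reports 34 Mbps for the same traffic. Both are correct; they answer different questions, which is why the poll interval needs to be known before comparing an average against a rated speed.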
edil |
to HELLFIRE
said by HELLFIRE: ...and the overall lesson from that is "there's MARKETING numbers, and there's ACTUAL numbers."
Got it!