[DSL] Need Help with Linux/MLPPP with 3 or more lines

I want to eventually get MLPPP working with 5 lines, but right now I'm having trouble getting it to work with 3. Using 2 lines works great, with speeds almost doubling as expected. However, when the third line is added, everything stops working.

I downloaded Linux/MLPPP (» fixppp.org/downloads/lin ··· .tar.bz2) and installed it on an openSUSE 11.3 x86_64 system. Then I grabbed the source from the git repo and recompiled all the binaries for x86_64 (db, db_dump, db_set, ..., pppd, redial-helper, rp-pppoe.so). The kernel on the system is 2.6.34.7, so I did not apply the 2.6.27 patch that came with Linux/MLPPP (this is probably my next step).

Using db_set, I configured the MLPPP parameters:

mlppptest:/opt/mlppp/bin # ./db_dump
ppp0_defaultroute=1
ppp0_username=username@teksavvy.com
ppp0_multilink=eth2,eth3,eth4
ppp0_mtu=1486
ppp0_mrru=1486
I added the firewall rules for MSS suggested by DSL_Ricer at this link: » 3-Line MLPPP on Linux

When I bring up the link, everything seems to connect fine:

Oct 1 23:38:24 mlppptest pppd[3096]: Plugin /opt/mlppp/lib.2.4.4/rp-pppoe.so loaded.
Oct 1 23:38:24 mlppptest pppd[3096]: pppd 2.4.4 started by root, uid 0
Oct 1 23:38:24 mlppptest pppd[3096]: PPP session is 3454
Oct 1 23:38:24 mlppptest pppd[3096]: Starting negotiation on eth2
Oct 1 23:38:25 mlppptest pppd[3096]: PAP authentication succeeded
Oct 1 23:38:25 mlppptest pppd[3096]: peer from calling number xx:xx:xx:xx:xx:xx authorized
Oct 1 23:38:25 mlppptest pppd[3096]: Using interface ppp0
Oct 1 23:38:25 mlppptest pppd[3096]: New bundle ppp0 created
Oct 1 23:38:27 mlppptest pppd[3096]: local IP address 76.xxx.xxx.xxx
Oct 1 23:38:27 mlppptest pppd[3096]: remote IP address 76.xxx.xxx.xxx
Oct 1 23:38:27 mlppptest pppd[3096]: primary DNS address 76.10.191.198
Oct 1 23:38:27 mlppptest pppd[3096]: secondary DNS address 76.10.191.199
Oct 1 23:38:27 mlppptest pppd[3394]: Plugin /opt/mlppp/lib.2.4.4/rp-pppoe.so loaded.
Oct 1 23:38:27 mlppptest pppd[3401]: Plugin /opt/mlppp/lib.2.4.4/rp-pppoe.so loaded.
Oct 1 23:38:27 mlppptest pppd[3394]: pppd 2.4.4 started by root, uid 0
Oct 1 23:38:27 mlppptest pppd[3401]: pppd 2.4.4 started by root, uid 0
Oct 1 23:38:27 mlppptest pppd[3401]: PPP session is 3458
Oct 1 23:38:27 mlppptest pppd[3401]: Starting negotiation on eth4
Oct 1 23:38:27 mlppptest pppd[3394]: PPP session is 3457
Oct 1 23:38:27 mlppptest pppd[3394]: Starting negotiation on eth3
Oct 1 23:38:27 mlppptest pppd[3561]: Plugin /opt/mlppp/lib.2.4.4/rp-pppoe.so loaded.
Oct 1 23:38:27 mlppptest pppd[3561]: pppd 2.4.4 started by root, uid 0
Oct 1 23:38:27 mlppptest pppd[3561]: PPP session is 3459
Oct 1 23:38:27 mlppptest pppd[3561]: Starting negotiation on eth2
Oct 1 23:38:28 mlppptest pppd[3401]: PAP authentication succeeded
Oct 1 23:38:28 mlppptest pppd[3401]: peer from calling number xx:xx:xx:xx:xx:xx authorized
Oct 1 23:38:28 mlppptest pppd[3401]: Using interface ppp0
Oct 1 23:38:28 mlppptest pppd[3401]: Link attached to ppp0
Oct 1 23:38:28 mlppptest pppd[3394]: PAP authentication succeeded
Oct 1 23:38:28 mlppptest pppd[3394]: peer from calling number xx:xx:xx:xx:xx:xx authorized
Oct 1 23:38:28 mlppptest pppd[3394]: Using interface ppp0
Oct 1 23:38:28 mlppptest pppd[3394]: Link attached to ppp0
Oct 1 23:38:28 mlppptest pppd[3561]: PAP authentication succeeded
Oct 1 23:38:28 mlppptest pppd[3561]: peer from calling number xx:xx:xx:xx:xx:xx authorized
Oct 1 23:38:28 mlppptest pppd[3561]: Using interface ppp0
Oct 1 23:38:28 mlppptest pppd[3561]: Link attached to ppp0
Oct 1 23:38:28 mlppptest pppd[3096]: Terminating link on signal 2
Oct 1 23:38:28 mlppptest pppd[3096]: Link terminated.
mlppptest:/opt/mlppp/bin # ./db_dump
ppp0_defaultroute=1
ppp0_username=username@teksavvy.com
ppp0_multilink=eth2,eth3,eth4
ppp0_mtu=1486
ppp0_mrru=1486
ppp0-key=AA8BBCB1
ppp0-link=5177
ppp0_remote_endpoint=local:34.36.30.38.33.32.30.30.32.32.00.00.00.00.00
_ppp0-2_lcp_status=1
_ppp0-0_lcp_status=1
_ppp0-1_lcp_status=1
ppp0_multilink_top=3
Now, when I try to use the link, things start to get weird. If I try to ping an IP address, it seems to work, so ICMP packets seem to work fine. However, DNS lookups don't work, and most other IP traffic moves very slowly. After playing around with dig, I also noticed that I could get DNS to reply if I forced a TCP request instead of UDP (dig @76.10.191.198 +tcp teksavvy.com). Does anyone have any idea what I might be doing wrong? Do I have to use the kernel patch to get 3 or more lines working properly? Any suggestions would be greatly appreciated. |
|
scorpido Premium Member join:2009-11-02 Kitchener, ON |
2010-Oct-2 9:06 pm
I am having the same issues. I tried Tomato/MLPPP and also a Mikrotik 493, and it's the same thing: two lines work fine, three and above makes it slow as crap. |
|
Guspaz MVM join:2001-11-05 Montreal, QC |
to westavmax
I'll try to get DSL_Ricer to take a look at this thread.
One thing you should keep an eye on is MSS clamping; Linux/MLPPP doesn't manage that for you, I believe, and if you don't configure it properly, packets above a certain size won't get through. |
|
|
| |
Thanks for the info, Guspaz.
I do have the iptables rules for MSS clamping, but you are probably right that the values are not properly tuned. I tried several different MSS values (1402, 1411, 1446) found in some other threads, but it did not seem to help. Do you have a formula for calculating this based on the number of lines being used?

I'm also in the process of building a patched 2.6.27 kernel to see if that will help. However, I won't be able to test that until Monday. |
|
| westavmax |
I'm having a little more luck with a patched 2.6.27 kernel, but have not had a chance to test the speed yet. Currently I have 4 lines in the MLPPP bundle without the previous DNS weirdness and overt slowness. If anyone could provide some information on how to properly tune MTU, MRRU, and MSS, that would be great. Right now I'm using the settings suggested by DSL_Ricer for 3 lines:

MTU=1500
MRRU=1500
MSS=1402 for outbound packets
MSS=1459 for inbound packets
I want to know how to tune this for 4 and 5 lines. |
|
westavmax |
Ok, we took our production DSL line offline and added it to the MLPPP bundle, for a total of 5 lines. I ran some speed tests and we are getting about 25 Mbps download and 3.4 Mbps upload. These speeds are quite good but seem a bit shy of the maximum (6 Mbps * 5 = 30 Mbps, 800 kbps * 5 = 4.0 Mbps).
So, can anyone provide any information on how to tune the MTU, MRRU, and MSS values to maximize the usage of the 5 lines? Or are these the speeds I should be expecting due to PPPoE overhead? |
|
| |
to westavmax
Your speed is good; it's about in line with the performance I would expect to see. |
|
Guspaz MVM join:2001-11-05 Montreal, QC |
to westavmax
You have to take into account overhead, which is about 15%. The maximum speed you should be expecting would be calculated by:
6 * 0.85 * 5 = 25.5 Mbps down, 0.8 * 0.85 * 5 = 3.4 Mbps up
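Guspaz's back-of-the-envelope math, as a quick shell sketch. The 0.85 factor is his ~15% overhead estimate (an approximation, not a measured value), and the per-line rates are the 6M/800k profile discussed in this thread:

```shell
# Expected MLPPP aggregate throughput after ~15% PPPoE/MLPPP overhead.
lines=5
down=$(awk -v n="$lines" 'BEGIN { printf "%.1f", n * 6.0 * 0.85 }')
up=$(awk -v n="$lines" 'BEGIN { printf "%.1f", n * 0.8 * 0.85 }')
echo "expected: ${down} Mbps down, ${up} Mbps up"
# prints: expected: 25.5 Mbps down, 3.4 Mbps up
```

Which matches the ~25/3.4 Mbps westavmax measured.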
You appear to be more or less getting the full aggregate performance with these lines. Linux/MLPPP should automatically tune most of the values for optimal performance if you set them to automatic, I think the MSS is the only thing it doesn't do?
It looks like you're up and running well, any lingering issues? |
|
| |
said by Guspaz:
"You appear to be more or less getting the full aggregate performance with these lines. Linux/MLPPP should automatically tune most of the values for optimal performance if you set them to automatic, I think the MSS is the only thing it doesn't do?"

No; it doesn't do lots of things. Zeroshell/MLPPP did most things automatically. Looking at my "configuration" doc file (which I can't seem to find on fixppp, funny that), the config options are as follows (abridged):

<if>: any non-empty sequence of letters. The sequence can't start with _ and may not end with -\d*.
Required fields:
- One of <if>_multilink, <if>_ifname
- <if>_username
Settings:
<if>_multilink: List of interfaces for multilink. Leave empty for non-multilink.
<if>_ifname: Interface name for non-multilink.
<if>_username: Username.
<if>_defaultroute: Make this the default route/get DNS.
<if>_mtu: Link MTU.
<if>_mru: Link MRU.
<if>_mrru: Link MRRU (multi-link only).
<if>_force_multilink: Require connection to be multi-link.
<if>_ignore_remote_endpoint: Ignore the name returned by the remote end for bundle creation.
debug_pppd: Add the debug option to pppd.
My recommendations are (based on what I can find are the defaults in ZeroShell/MLPPP):

_mrru = 1442
_mtu = 1459 (for 5 lines)

You probably also want:

_force_multilink = 1
_defaultroute = 1
_ignore_remote_endpoint = 1

Beyond this, you need to set the MSS. Optimal values are different for incoming and outgoing (and you set the incoming MSS on outgoing packets, and the outgoing MSS on incoming packets). A good default is 40 under the MTU/MRRU. We've had issues with certain applications not liking reduced maximum packet sizes, so the largest you can put is 1500 on both MTU and MRRU. However, still use the reduced MSS values: the apps that have problems use UDP, and MSS is TCP-only. |
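The "40 under the MTU/MRRU" rule, sketched with DSL_Ricer's recommended 5-line values (the 40 bytes being the standard 20-byte IP header plus 20-byte TCP header); the results reproduce the 1419 and 1402 clamps in his iptables rules:

```shell
# MSS clamp values derived from the recommended 5-line settings.
mtu=1459    # <if>_mtu
mrru=1442   # <if>_mrru
in_clamp=$((mtu - 40))    # applied to SYNs arriving on ppp0
out_clamp=$((mrru - 40))  # applied to SYNs leaving on ppp0
echo "incoming clamp: ${in_clamp}, outgoing clamp: ${out_clamp}"
# prints: incoming clamp: 1419, outgoing clamp: 1402
```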
|
| DSL_Ricer |
BTW, my suggested rules for MSS (you'll have to change ppp0 to whatever's appropriate):

FW=iptables
$FW -t mangle -N mss
$FW -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN -j mss
$FW -t mangle -A OUTPUT -o ppp0 -p tcp --tcp-flags SYN,RST SYN -j mss
$FW -t mangle -A INPUT -i ppp0 -p tcp --tcp-flags SYN,RST SYN -j mss
$FW -t mangle -A mss -i ppp0 -m tcpmss --mss 1420: -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss 1419
$FW -t mangle -A mss -i ppp0 -p tcp --tcp-option \! 2 --tcp-flags SYN,RST SYN -j TCPMSS --set-mss 1419
$FW -t mangle -A mss -o ppp0 -m tcpmss --mss 1403: -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss 1402
$FW -t mangle -A mss -o ppp0 -p tcp --tcp-option \! 2 --tcp-flags SYN,RST SYN -j TCPMSS --set-mss 1402
|
|
| |
Thanks for the recommendations and iptables rules, DSL_Ricer.

said by Guspaz:
"It looks like you're up and running well, any lingering issues?"

I am having one issue, but I'm not sure if it is MLPPP-related or DSL-related. We seem to be experiencing some very bad latency/stalls when doing interactive activities. For example, when using ssh over the MLPPP link, occasionally the connection will pause for 2-15 seconds. When it unfreezes, everything you typed during the pause shows up and is executed as normal. I also see similar behaviour when browsing files with Windows Explorer over a VPN: the window will freeze while refreshing the file list, then properly refresh after 10-20 seconds. I did not see this stalling issue prior to switching out our cable connection for this MLPPP connection. |
|
| |
said by westavmax:
"For example when using ssh over the MLPPP link, occasionally, the connection will pause for 2-15 seconds. When it unfreezes, everything you typed during the pause shows up and is executed as normal."

SSH runs over TCP, a protocol with delivery guarantees. In the case of packet drop, it will try to resend, so you'd either get everything transmitted or you'd be disconnected. What you're describing could be one of 3 issues:

1) Marginal lines. If possible, connect to each of your DSL modems and check your line stats to see if any of them are marginal: a DSL resync takes 10-20 seconds, so depending on when you notice it, that could match. You may also wish to make sure that all the lines are on the same profile and all have (or don't have) interleaving.

2) Reassembly buffers overflowing. This is not impossible, though quite unlikely with TekSavvy's setup. If you wish to observe this at a higher level, it's best shown by pinging the next hop: (unix options) ping -s 400 next-hop, where next-hop is the remote IP of your ppp connection. It will show up as bursts of packet drops with otherwise stable pings.

3) Traffic. If the bursts of packet drops are preceded by high latencies, then that's traffic related; consider QoS. Even if it isn't, consider it anyway. If you're just getting non-load-related random packet drops, at a percentage >20x higher than your CRC error count over successful ATM cell transmission count, then it's likely the problem Caneris was getting. Contact me again if that's the case. |
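DSL_Ricer's next-hop ping test can be wrapped in a small script. This is only a sketch: the 127.0.0.1 default is a placeholder, and you would pass the remote IP of your ppp connection instead.

```shell
#!/bin/sh
# Ping the PPP next hop with 400-byte payloads and print the loss summary.
# Bursts of drops with otherwise stable round-trip times would point at
# reassembly trouble; drops preceded by rising latency point at load.
NEXT_HOP="${1:-127.0.0.1}"   # placeholder default; use your ppp remote IP
ping -s 400 -c 60 -i 0.5 "$NEXT_HOP" | tail -2
```

Run it with the remote IP from your pppd log as the first argument.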
|
to westavmax
A fast, easy way to test whether you're suffering from the issue Caneris discovered:

Start a continuous ping to TekSavvy's LNS, then launch a speedtest.

If you are suffering from the issue, you will notice the following:
1) When starting the speedtest, you will get full speed; then it will drop and continue to drop.
2) Your ping shows packet loss while the speedtest is running.
3) If you just do the ping, you should still see packet loss, just not as much.
4) Your ppp interface shows a ton of errors.
Repeat this 3 times if it doesn't show up the first 2 times; sometimes MLPPP would work properly for brief periods.
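For item 4 (watching the ppp interface for errors), the counters can be read straight from sysfs on a Linux router; ppp0 is an assumption here, so substitute your bundle's interface name:

```shell
#!/bin/sh
# Print RX/TX error counters for the bundle interface. Run this before and
# after a speedtest; a rapidly growing count matches the symptom described.
IF="${1:-ppp0}"   # assumed bundle interface name
if [ -d "/sys/class/net/$IF" ]; then
    echo "$IF RX errors: $(cat "/sys/class/net/$IF/statistics/rx_errors")"
    echo "$IF TX errors: $(cat "/sys/class/net/$IF/statistics/tx_errors")"
else
    echo "no such interface: $IF" >&2
fi
```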
Caneris/Acanac switched to round robin rather than fragmentation/reassembly to work around this problem.
Erik said it was caused by something in the 2.6 kernel; I'm not sure of the details. Testing I did with ZeroShell showed the exact same problem with my TekSavvy login and my Caneris login.
I think I was the first person to observe this issue, with my 4-line MLPPP setup. (It's 8 lines now; I'm going to be increasing this to 12 soon.) As you add more lines, the problem's severity increases as well.
I tried to contact Guspaz with this issue when I first observed it; he seemed to have no interest in fixing it or investigating it (from what I gathered he was busy with other stuff). I never contacted Ricer; he will probably be able to get you and the Linux/MLPPP patch fixed up in no time, if this is the issue. |
|
| |
said by grayfox:
"I tryed to contact guspaz with this issue when I first observed it he seemed to have no interest in fixing it or investigating it. (From what I gathered he was busy with other stuff), I never contacted ricer, he will probably be able to get you and the linux ml-ppp patch fixed up in no time if this is the issue."

You probably contacted us at a bad time: when we were busy with other stuff. The main issue is that I'm unable to reproduce the problem in my test environment. |
|
| |
said by DSL_Ricer:
"You probably contacted us at a bad time: when we were busy with other stuff. The main issue is that I'm unable to reproduce the problem in my test environment."

That's fine. Are you guys in canada or quebec? Do you need access to a setup? I have a 4x6 meg line setup in Port Hope, Ontario that I am presently not using. (I will be using it in the next week or so though, once a part I ordered arrives.) |
|
CanerisErik Premium Member join:2007-10-03 Toronto, ON |
said by grayfox:
"Are you guys in canada or quebec ?"

ROFL |
|
Guspaz MVM join:2001-11-05 Montreal, QC |
to westavmax
We're in Canada. |
|
| |
to CanerisErik
Evidently, Canada and Quebec are separate entities lol! |
|
to CanerisErik
Oh wow, that was a major typo; Ontario or Quebec is what I meant.

edit: I feel even more stupid now that I see next to Guspaz it says he's in Quebec. |
|
Guspaz MVM join:2001-11-05 Montreal, QC |
to westavmax
We haven't been able to do any direct testing with 3+ lines. Rocky is very generous to provide us with a second DSL line to do development (which admittedly we do very slowly), but asking him for a third DSL line would be a bit much!
Actually, up until recently, DSL_Ricer wasn't even able to get more than one DSL line. He just moved; I'm not sure if his new place can take more than one either, but the old place definitely couldn't (short of spending hundreds of dollars to run new cable outside the building, which wasn't going to happen). |
|
| |
to DSL_Ricer
DSL_Ricer and grayfox, thanks for all your input. I gave the continuous-ping-during-a-speedtest test a try. I don't seem to be seeing huge packet loss while performing the test. The ppp connection does show some errors, but nothing huge:

ppp0  Link encap:Point-to-Point Protocol
      inet addr:xxx.xxx.xxx.xxx P-t-P:xxx.xxx.xxx.xxx Mask:255.255.255.255
      UP POINTOPOINT RUNNING NOARP MULTICAST MTU:1442 Metric:1
      RX packets:20746183 errors:167 dropped:167 overruns:0 frame:0
      TX packets:41984252 errors:0 dropped:0 overruns:0 carrier:0
      collisions:0 txqueuelen:3
      RX bytes:3887138792 (3707.0 Mb) TX bytes:31548130466 (30086.6 Mb)
Things do seem to be a bit more stable today; the line does not seem to be stalling as frequently. If there is a kernel patch to try for 2.6.x, I'm more than willing to give that a go. I also checked the stats on each DSL modem, as DSL_Ricer suggested:

modem 1
Link Information
Uptime: 6 days, 17:48:00
Modulation: G.992.5 annex A
Bandwidth (Up/Down) [kbps/kbps]: 1.020 / 6.141
Data Transferred (Sent/Received) [KB/KB]: 0,00 / 0,00
Output Power (Up/Down) [dBm]: 12,0 / 5,0
Line Attenuation (Up/Down) [dB]: 15,5 / 30,0
SN Margin (Up/Down) [dB]: 7,5 / 6,5
Vendor ID (Local/Remote): TMMB / BDCM
Loss of Framing (Local/Remote): 0 / 0
Loss of Signal (Local/Remote): 4 / 0
Loss of Power (Local/Remote): 0 / 0
Loss of Link (Remote): 0
Error Seconds (Local/Remote): 4 / 0
FEC Errors (Up/Down): 6.573 / 5.233.176
CRC Errors (Up/Down): 20 / 197
HEC Errors (Up/Down): 82 / 160

modem 2
Link Information
Uptime: 6 days, 17:52:16
Modulation: G.992.5 annex A
Bandwidth (Up/Down) [kbps/kbps]: 1.020 / 6.141
Data Transferred (Sent/Received) [KB/KB]: 0,00 / 0,00
Output Power (Up/Down) [dBm]: 12,0 / 5,5
Line Attenuation (Up/Down) [dB]: 15,5 / 29,5
SN Margin (Up/Down) [dB]: 8,0 / 9,0
Vendor ID (Local/Remote): TMMB / BDCM
Loss of Framing (Local/Remote): 0 / 0
Loss of Signal (Local/Remote): 5 / 0
Loss of Power (Local/Remote): 0 / 0
Loss of Link (Remote): 0
Error Seconds (Local/Remote): 4 / 0
FEC Errors (Up/Down): 1.649 / 2.736.630
CRC Errors (Up/Down): 0 / 14
HEC Errors (Up/Down): 69 / 4

modem 3
Link Information
Uptime: 6 days, 17:41:02
Modulation: G.992.5 annex A
Bandwidth (Up/Down) [kbps/kbps]: 1.020 / 6.141
Data Transferred (Sent/Received) [KB/KB]: 0,00 / 0,00
Output Power (Up/Down) [dBm]: 12,0 / 14,0
Line Attenuation (Up/Down) [dB]: 15,5 / 29,5
SN Margin (Up/Down) [dB]: 7,5 / 12,0
Vendor ID (Local/Remote): TMMB / BDCM
Loss of Framing (Local/Remote): 0 / 0
Loss of Signal (Local/Remote): 0 / 0
Loss of Power (Local/Remote): 0 / 0
Loss of Link (Remote): 0
Error Seconds (Local/Remote): 0 / 0
FEC Errors (Up/Down): 5.851 / 595.902
CRC Errors (Up/Down): 41 / 17
HEC Errors (Up/Down): 115 / 8

modem 4
Link Information
Uptime: 6 days, 17:56:08
Modulation: G.992.5 annex A
Bandwidth (Up/Down) [kbps/kbps]: 1.020 / 6.141
Data Transferred (Sent/Received) [KB/KB]: 0,00 / 0,00
Output Power (Up/Down) [dBm]: 12,0 / 5,0
Line Attenuation (Up/Down) [dB]: 15,5 / 29,5
SN Margin (Up/Down) [dB]: 7,5 / 6,5
Vendor ID (Local/Remote): TMMB / BDCM
Loss of Framing (Local/Remote): 0 / 0
Loss of Signal (Local/Remote): 1 / 0
Loss of Power (Local/Remote): 0 / 0
Loss of Link (Remote): 0
Error Seconds (Local/Remote): 1 / 0
FEC Errors (Up/Down): 4.866 / 17.871.123
CRC Errors (Up/Down): 2 / 34
HEC Errors (Up/Down): 43 / 25
The only thing that stands out to me is that the FEC errors are very high on a couple of lines. Is this anything to worry about? |
|
| |
What I find slightly odd with those line stats is that, while all your lines have roughly the same attenuation, the SNR varies from 6.5 dB (marginal) to 12 dB (quite good). Others would probably be better at debugging this issue, if it is one.

Don't worry too much about the FEC errors. As long as your CRC error count remains under a few per second, it's good (unless you need to sustain a single high-bandwidth, high-latency TCP connection, in which case you need ~0 CRC errors). |
|
| |
So, do you guys think that my current stalling/freezing issue is more likely due to line quality than a configuration problem? |
|