westavmax
join:2010-10-01
Calgary, AB


westavmax

Member

[DSL] Need Help with Linux/MLPPP with 3 or more lines

I want to eventually get MLPPP working with 5 lines but right now, I'm having trouble getting it to work with 3. Using 2 lines works great, with speeds almost doubling as expected. However, when the third line is added, everything stops working.

I downloaded Linux/MLPPP (»fixppp.org/downloads/lin ··· .tar.bz2) and installed it on an openSUSE 11.3 x86_64 system. Then I grabbed the source from the git repo and recompiled all the binaries for x86_64 (db, db_dump, db_set, ..., pppd, redial-helper, rp-pppoe.so). The kernel on the system is 2.6.34.7, so I did not apply the 2.6.27 patch that came with Linux/MLPPP (this is probably my next step).

Using db_set, I configured the MLPPP parameters:

mlppptest:/opt/mlppp/bin # ./db_dump
ppp0_defaultroute=1
ppp0_username=username@teksavvy.com
ppp0_multilink=eth2,eth3,eth4
ppp0_mtu=1486
ppp0_mrru=1486
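
For anyone trying to reproduce this: the dump above corresponds to setting each key with db_set. Assuming a plain "key value" argument form (an assumption on my part; check the tool's own usage output), the invocations would look roughly like:

```shell
# Hypothetical db_set calls matching the db_dump output above;
# the "key value" argument form is assumed, not confirmed syntax
cd /opt/mlppp/bin
./db_set ppp0_defaultroute 1
./db_set ppp0_username username@teksavvy.com
./db_set ppp0_multilink eth2,eth3,eth4
./db_set ppp0_mtu 1486
./db_set ppp0_mrru 1486
```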


I added the firewall rules for MSS suggested by DSL_Ricer at this link: »3-Line MLPPP on Linux

When I bring up the link, everything seems to connect fine:

Oct 1 23:38:24 mlppptest pppd[3096]: Plugin /opt/mlppp/lib.2.4.4/rp-pppoe.so loaded.
Oct 1 23:38:24 mlppptest pppd[3096]: pppd 2.4.4 started by root, uid 0
Oct 1 23:38:24 mlppptest pppd[3096]: PPP session is 3454
Oct 1 23:38:24 mlppptest pppd[3096]: Starting negotiation on eth2
Oct 1 23:38:25 mlppptest pppd[3096]: PAP authentication succeeded
Oct 1 23:38:25 mlppptest pppd[3096]: peer from calling number xx:xx:xx:xx:xx:xx authorized
Oct 1 23:38:25 mlppptest pppd[3096]: Using interface ppp0
Oct 1 23:38:25 mlppptest pppd[3096]: New bundle ppp0 created
Oct 1 23:38:27 mlppptest pppd[3096]: local IP address 76.xxx.xxx.xxx
Oct 1 23:38:27 mlppptest pppd[3096]: remote IP address 76.xxx.xxx.xxx
Oct 1 23:38:27 mlppptest pppd[3096]: primary DNS address 76.10.191.198
Oct 1 23:38:27 mlppptest pppd[3096]: secondary DNS address 76.10.191.199
Oct 1 23:38:27 mlppptest pppd[3394]: Plugin /opt/mlppp/lib.2.4.4/rp-pppoe.so loaded.
Oct 1 23:38:27 mlppptest pppd[3401]: Plugin /opt/mlppp/lib.2.4.4/rp-pppoe.so loaded.
Oct 1 23:38:27 mlppptest pppd[3394]: pppd 2.4.4 started by root, uid 0
Oct 1 23:38:27 mlppptest pppd[3401]: pppd 2.4.4 started by root, uid 0
Oct 1 23:38:27 mlppptest pppd[3401]: PPP session is 3458
Oct 1 23:38:27 mlppptest pppd[3401]: Starting negotiation on eth4
Oct 1 23:38:27 mlppptest pppd[3394]: PPP session is 3457
Oct 1 23:38:27 mlppptest pppd[3394]: Starting negotiation on eth3
Oct 1 23:38:27 mlppptest pppd[3561]: Plugin /opt/mlppp/lib.2.4.4/rp-pppoe.so loaded.
Oct 1 23:38:27 mlppptest pppd[3561]: pppd 2.4.4 started by root, uid 0
Oct 1 23:38:27 mlppptest pppd[3561]: PPP session is 3459
Oct 1 23:38:27 mlppptest pppd[3561]: Starting negotiation on eth2
Oct 1 23:38:28 mlppptest pppd[3401]: PAP authentication succeeded
Oct 1 23:38:28 mlppptest pppd[3401]: peer from calling number xx:xx:xx:xx:xx:xx authorized
Oct 1 23:38:28 mlppptest pppd[3401]: Using interface ppp0
Oct 1 23:38:28 mlppptest pppd[3401]: Link attached to ppp0
Oct 1 23:38:28 mlppptest pppd[3394]: PAP authentication succeeded
Oct 1 23:38:28 mlppptest pppd[3394]: peer from calling number xx:xx:xx:xx:xx:xx authorized
Oct 1 23:38:28 mlppptest pppd[3394]: Using interface ppp0
Oct 1 23:38:28 mlppptest pppd[3394]: Link attached to ppp0
Oct 1 23:38:28 mlppptest pppd[3561]: PAP authentication succeeded
Oct 1 23:38:28 mlppptest pppd[3561]: peer from calling number xx:xx:xx:xx:xx:xx authorized
Oct 1 23:38:28 mlppptest pppd[3561]: Using interface ppp0
Oct 1 23:38:28 mlppptest pppd[3561]: Link attached to ppp0
Oct 1 23:38:28 mlppptest pppd[3096]: Terminating link on signal 2
Oct 1 23:38:28 mlppptest pppd[3096]: Link terminated.


mlppptest:/opt/mlppp/bin # ./db_dump

ppp0_defaultroute=1
ppp0_username=username@teksavvy.com
ppp0_multilink=eth2,eth3,eth4
ppp0_mtu=1486
ppp0_mrru=1486
ppp0-key=AA8BBCB1
ppp0-link=5177
ppp0_remote_endpoint=local:34.36.30.38.33.32.30.30.32.32.00.00.00.00.00
_ppp0-2_lcp_status=1
_ppp0-0_lcp_status=1
_ppp0-1_lcp_status=1
ppp0_multilink_top=3


Now, when I try to use the link, things start to get weird. If I ping an IP address, it seems to work, so ICMP packets seem to work fine. However, DNS lookups don't work, and most other IP traffic moves very slowly.

After playing around with dig, I also noticed that I could get DNS to reply if I forced a TCP request instead of UDP (dig @76.10.191.198 +tcp teksavvy.com).

Does anyone have any idea of what I might be doing wrong? Do I have to use the kernel patch to get 3 or more lines working properly?

Any suggestions would be greatly appreciated.
scorpido
Premium Member
join:2009-11-02
Kitchener, ON

scorpido

Premium Member

I am having the same issues. I tried Tomato/MLPPP and also a Mikrotik 493, with the same result: two lines work fine, but three and above make it slow as crap.

Guspaz
Guspaz
MVM
join:2001-11-05
Montreal, QC

Guspaz to westavmax

MVM

to westavmax
I'll try to get DSL_Ricer to take a look at this thread.

One thing you should keep an eye on is MSS clamping; Linux/MLPPP doesn't manage that for you, I believe, and if you don't configure it properly, packets above a certain size won't get through.
westavmax
join:2010-10-01
Calgary, AB

westavmax

Member

Thanks for the info Guspaz.

I do have the iptables rules for MSS clamping, but you are probably right that the values are not properly tuned. I tried several different MSS values (1402, 1411, 1446) found in some other threads, but it did not seem to help. Do you have a formula for calculating this based on the number of lines being used?

I'm also in the process of building a patched 2.6.27 kernel to see if that will help. However, I won't be able to test that until Monday.
westavmax

westavmax

Member

I'm having a little more luck with a patched 2.6.27 kernel but have not had a chance to test the speed yet. Currently I have 4 lines in the MLPPP bundle without the previous DNS weirdness and overt slowness.

If anyone could provide some information on how to properly tune MTU, MRRU and MSS that would be great. Right now I'm using the settings suggested by DSL_Ricer for 3 lines:

MTU=1500
MRRU=1500
MSS=1402 for outbound packets
MSS=1459 for inbound packets

I want to know how to tune this for 4 and 5 lines.
westavmax


westavmax

Member

Ok, we took our production DSL line offline and added it to the MLPPP bundle for a total of 5 lines. I ran some speed tests and we are getting about 25 Mbps download and 3.4 Mbps upload. These speeds are quite good but seem a bit shy of the maximum (6 Mbps * 5 = 30 Mbps, 800 kbps * 5 = 4.0 Mbps).

So, can anyone provide information on how to tune the MTU, MRRU, and MSS values to maximize the use of the 5 lines? Or are these the speeds I should expect due to PPPoE overhead?

grayfox
join:2007-12-10
Whitby, ON

grayfox to westavmax

Member

to westavmax
Your speed is good; it's about in line with the performance I would expect to see.

Guspaz
Guspaz
MVM
join:2001-11-05
Montreal, QC

Guspaz to westavmax

MVM

to westavmax
You have to take into account overhead, which is about 15%. The maximum speed you should be expecting would be calculated by:

6 * 0.85 * 5 = 25.5 Mbps down, 0.8 * 0.85 * 5 = 3.4 Mbps up
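
The same arithmetic as a quick one-liner (the 0.85 factor is the ~15% overhead estimate above, not a measured figure):

```shell
# Expected aggregate for 5 lines at 6.0/0.8 Mbps with ~15% overhead
awk 'BEGIN { printf "down %.1f Mbps, up %.1f Mbps\n", 5 * 6.0 * 0.85, 5 * 0.8 * 0.85 }'
# prints: down 25.5 Mbps, up 3.4 Mbps
```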

You appear to be more or less getting the full aggregate performance with these lines. Linux/MLPPP should automatically tune most of the values for optimal performance if you set them to automatic; I think the MSS is the only thing it doesn't handle?

It looks like you're up and running well, any lingering issues?
DSL_Ricer
Premium Member
join:2007-07-22

DSL_Ricer

Premium Member

said by Guspaz:

You appear to be more or less getting the full aggregate performance with these lines. Linux/MLPPP should automatically tune most of the values for optimal performance if you set them to automatic, I think the MSS is the only thing it doesn't do?
No, there are lots of things it doesn't do. ZeroShell/MLPPP did most things automatically.

Looking at my "configuration" doc file (which I can't seem to find on fixppp, funny that), the config options are as follows (abridged):
<if>: any non-empty sequence of letters. The sequence can't start with _ and may not end with -\d*.
 
Required fields:
- One of <if>_multilink, <if>_ifname
- <if>_username
 
Settings:
<if>_multilink          in      List of interfaces for multilink. Leave empty for non-multilink
<if>_ifname             in      Interface name for non-multilink
<if>_username           in      Username
<if>_defaultroute       in      Make this the default route/get DNS
<if>_mtu                in      Link MTU
<if>_mru                in      Link MRU
<if>_mrru               in      Link MRRU (Multi-link only)
<if>_force_multilink    in      Require connection to be multi-link
<if>_ignore_remote_endpoint     in      Ignore the name returned by the remote end for bundle creation
debug_pppd              in      Add the debug option to pppd
 

My recommendations are (based on what I can find are the defaults in ZeroShell/MLPPP):
_mrru = 1442
_mtu = 1459 (for 5 lines)

You probably also want:
_force_multilink = 1
_defaultroute = 1
_ignore_remote_endpoint = 1

Above this, you need to set the MSS. Optimal values are different for incoming and outgoing (and you set the incoming MSS on outgoing packets, and the outgoing MSS on incoming packets). A good default is 40 under the MTU/MRRU.

We've had issues with certain applications not liking reduced maximum packet sizes; in that case, the largest you can put is 1500 on both MTU and MRRU. Still use the reduced MSS values, though: the apps that have problems use UDP, and MSS applies to TCP only.
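
That "40 under the MTU/MRRU" figure is just the standard 20-byte IP header plus 20-byte TCP header; for example, with the recommended MRRU of 1442:

```shell
# MSS rule of thumb: subtract the 20-byte IP header and 20-byte TCP header
MTU=1442            # the recommended MRRU value above
MSS=$((MTU - 40))
echo "MSS: $MSS"    # prints: MSS: 1402
```

Note that 1402 matches the outbound MSS used in the clamping rules for this setup.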
DSL_Ricer

DSL_Ricer

Premium Member

BTW, my suggested rules for MSS (you'll have to change ppp0 to whatever's appropriate):
FW=iptables
$FW -t mangle -N mss
$FW -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN -j mss
$FW -t mangle -A OUTPUT -o ppp0 -p tcp --tcp-flags SYN,RST SYN -j mss
$FW -t mangle -A INPUT -i ppp0 -p tcp --tcp-flags SYN,RST SYN -j mss
$FW -t mangle -A mss -i ppp0 -m tcpmss --mss 1420: -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss 1419
$FW -t mangle -A mss -i ppp0 -p tcp --tcp-option \! 2 --tcp-flags SYN,RST SYN -j TCPMSS --set-mss 1419
$FW -t mangle -A mss -o ppp0 -m tcpmss --mss 1403: -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss 1402
$FW -t mangle -A mss -o ppp0 -p tcp --tcp-option \! 2  --tcp-flags SYN,RST SYN -j TCPMSS --set-mss 1402
 
westavmax
join:2010-10-01
Calgary, AB

westavmax

Member

Thanks for the recommendations and iptables rules DSL_Ricer.
said by Guspaz:

It looks like you're up and running well, any lingering issues?
I am having one issue, but I'm not sure if it is MLPPP related or DSL related. We seem to be experiencing some very bad latency/stalls during interactive activities. For example, when using ssh over the MLPPP link, occasionally the connection will pause for 2-15 seconds. When it unfreezes, everything typed during the pause shows up and executes as normal. I also see similar behaviour when browsing files with Windows Explorer over a VPN: the window freezes while refreshing the file list, then properly refreshes after 10-20 seconds.

I did not see this stalling issue prior to switching out our cable connection for this MLPPP connection.
DSL_Ricer
Premium Member
join:2007-07-22

DSL_Ricer

Premium Member

said by westavmax:

For example, when using ssh over the MLPPP link, occasionally the connection will pause for 2-15 seconds. When it unfreezes, everything typed during the pause shows up and executes as normal.
SSH runs over TCP, a protocol with delivery guarantees. In the case of packet drop, it will try to resend. So, you'd either get everything transmitted, or you'd be disconnected.

What you're describing could be one of three issues.
First, if possible, connect to each of your DSL modems and check the line stats to see if any of them are marginal: a DSL resync takes 10-20 seconds, so depending on when you notice the stall, it could match.
Second, make sure all the lines are on the same profile and that all of them have (or all don't have) interleaving. Reassembly buffers overflowing is not impossible, though quite unlikely with TekSavvy's setup.

If you wish to observe this on a higher level, it's best shown by pinging the next hop: (unix options)
ping -s 400 next-hop
where next-hop is the remote IP of your ppp connection.

It will show up as bursts of packet drops with otherwise stable pings.

If the burst of packet drops is preceded by high latencies, then it's traffic related. Consider QoS. Even if it isn't, consider it anyway.

If you're just getting random, non-load-related packet drops at a rate more than 20x your CRC error rate (CRC errors over successful ATM cell transmissions), then it's likely the problem Caneris was getting. Contact me again if that's the case.

grayfox
join:2007-12-10
Whitby, ON


grayfox to westavmax

Member

to westavmax
A fast, easy way to test whether you're suffering from the issue Caneris discovered is the following.

Start a continuous ping to teksavvy's LNS then launch a speedtest.

If you are suffering from the issue, you will notice the following:

1) When starting the speed test, you will get full speed, then it will drop and continue to drop.
2) Your ping shows packet loss while the speed test is running.
3) If you just run the ping, you should still see packet loss, just not as much.
4) Your ppp interface shows a ton of errors.

Repeat this 3 times if it doesn't show up the first 2 times; sometimes MLPPP would work properly for brief periods.
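
For item 4, a small helper makes it easy to watch the ppp interface's error counters while the speed test runs (this parses the standard Linux /proc/net/dev layout; treat it as a sketch):

```shell
# Print the RX/TX error counters for an interface from /proc/net/dev.
# The optional second argument names an alternate stats file (handy for testing).
iface_errs() {
  sed 's/:/ /' "${2:-/proc/net/dev}" |
    awk -v ifc="$1" '$1 == ifc { print "RX errs:", $4, "TX errs:", $12 }'
}
# e.g. run "iface_errs ppp0" every few seconds during the speed test
```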

Caneris/Acanac switched to round robin rather than fragmentation/reassembly to work around this problem.

Erik said it was caused by something in the 2.6 kernel; I'm not sure of the details. Testing I did with ZeroShell showed the exact same problem with both my TekSavvy login and my Caneris login.

I think I was the first person to observe this issue with my 4-line MLPPP setup (it's 8 lines now, and I'm going to be increasing this to 12 soon). As you add more lines, the problem's severity increases as well.

I tried to contact Guspaz about this issue when I first observed it; he seemed to have no interest in fixing or investigating it (from what I gathered, he was busy with other stuff). I never contacted Ricer; he will probably be able to get you and the Linux MLPPP patch fixed up in no time if this is the issue.
DSL_Ricer
Premium Member
join:2007-07-22

DSL_Ricer

Premium Member

said by grayfox:

I tried to contact Guspaz about this issue when I first observed it; he seemed to have no interest in fixing or investigating it (from what I gathered, he was busy with other stuff). I never contacted Ricer; he will probably be able to get you and the Linux MLPPP patch fixed up in no time if this is the issue.
You probably contacted us at a bad time: when we were busy with other stuff.

The main issue is that I'm unable to reproduce the problem in my test environment.

grayfox
join:2007-12-10
Whitby, ON

grayfox

Member

said by DSL_Ricer:

said by grayfox:

I tried to contact Guspaz about this issue when I first observed it; he seemed to have no interest in fixing or investigating it (from what I gathered, he was busy with other stuff). I never contacted Ricer; he will probably be able to get you and the Linux MLPPP patch fixed up in no time if this is the issue.
You probably contacted us at a bad time: when we were busy with other stuff.

The main issue is that I'm unable to reproduce the problem in my test environment.
That's fine. Are you guys in canada or quebec ?

Do you need access to a setup? I have a 4x6 meg line setup in Port Hope, Ontario that I am presently not using. (I will be using it in the next week or so, though, once a part I ordered arrives.)

CanerisErik
Caneris
Premium Member
join:2007-10-03
Toronto, ON

CanerisErik

Premium Member

said by grayfox:

Are you guys in canada or quebec ?
ROFL

Guspaz
Guspaz
MVM
join:2001-11-05
Montreal, QC

Guspaz to westavmax

MVM

to westavmax
We're in Canada.
MajorPewPew
join:2010-09-19

MajorPewPew to CanerisErik

Member

to CanerisErik
said by CanerisErik:

said by grayfox:

Are you guys in canada or quebec ?
ROFL
Evidently, Canada and Quebec are separate entities lol!

grayfox
join:2007-12-10
Whitby, ON


grayfox to CanerisErik

Member

to CanerisErik
said by CanerisErik:

said by grayfox:

Are you guys in canada or quebec ?
ROFL
Oh wow, that was a major typo; Ontario or Quebec is what I meant.

edit: I feel even more stupid now that I see that next to Guspaz it says he's in Quebec.

Guspaz
Guspaz
MVM
join:2001-11-05
Montreal, QC

Guspaz to westavmax

MVM

to westavmax
We haven't been able to do any direct testing with 3+ lines. Rocky is very generous to provide us with a second DSL line for development (which, admittedly, we do very slowly), but asking him for a third DSL line would be a bit much!

Actually, up until recently, DSL_Ricer's place was not even capable of getting more than one DSL line. He just moved; I'm not sure about his new place, but the old place definitely couldn't support more than one line (short of spending hundreds of dollars to run new cable outside the building, which wasn't going to happen).
westavmax
join:2010-10-01
Calgary, AB

westavmax to DSL_Ricer

Member

to DSL_Ricer
DSL_Ricer and grayfox, thanks for all your input.

I gave the continuous-ping-during-a-speed-test thing a try. I don't seem to be seeing huge packet loss while performing the test. The ppp connection does show some errors, but nothing huge:

ppp0 Link encap:Point-to-Point Protocol
inet addr:xxx.xxx.xxx.xxx P-t-P:xxx.xxx.xxx.xxx Mask:255.255.255.255
UP POINTOPOINT RUNNING NOARP MULTICAST MTU:1442 Metric:1
RX packets:20746183 errors:167 dropped:167 overruns:0 frame:0
TX packets:41984252 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:3
RX bytes:3887138792 (3707.0 Mb) TX bytes:31548130466 (30086.6 Mb)

Things do seem to be a bit more stable today; the line does not seem to be stalling as frequently. If there is a kernel patch to try for 2.6.x, I'm more than willing to give that a go.

I also checked the stats on each DSL modem as DSL_Ricer suggested:

modem 1
Link Information
Uptime: 6 days, 17:48:00
Modulation: G.992.5 annex A
Bandwidth (Up/Down) [kbps/kbps]: 1.020 / 6.141
Data Transferred (Sent/Received) [KB/KB]: 0,00 / 0,00
Output Power (Up/Down) [dBm]: 12,0 / 5,0
Line Attenuation (Up/Down) [dB]: 15,5 / 30,0
SN Margin (Up/Down) [dB]: 7,5 / 6,5
Vendor ID (Local/Remote): TMMB / BDCM
Loss of Framing (Local/Remote): 0 / 0
Loss of Signal (Local/Remote): 4 / 0
Loss of Power (Local/Remote): 0 / 0
Loss of Link (Remote): 0
Error Seconds (Local/Remote): 4 / 0
FEC Errors (Up/Down): 6.573 / 5.233.176
CRC Errors (Up/Down): 20 / 197
HEC Errors (Up/Down): 82 / 160

modem 2
Link Information
Uptime: 6 days, 17:52:16
Modulation: G.992.5 annex A
Bandwidth (Up/Down) [kbps/kbps]: 1.020 / 6.141
Data Transferred (Sent/Received) [KB/KB]: 0,00 / 0,00
Output Power (Up/Down) [dBm]: 12,0 / 5,5
Line Attenuation (Up/Down) [dB]: 15,5 / 29,5
SN Margin (Up/Down) [dB]: 8,0 / 9,0
Vendor ID (Local/Remote): TMMB / BDCM
Loss of Framing (Local/Remote): 0 / 0
Loss of Signal (Local/Remote): 5 / 0
Loss of Power (Local/Remote): 0 / 0
Loss of Link (Remote): 0
Error Seconds (Local/Remote): 4 / 0
FEC Errors (Up/Down): 1.649 / 2.736.630
CRC Errors (Up/Down): 0 / 14
HEC Errors (Up/Down): 69 / 4

modem 3
Link Information
Uptime: 6 days, 17:41:02
Modulation: G.992.5 annex A
Bandwidth (Up/Down) [kbps/kbps]: 1.020 / 6.141
Data Transferred (Sent/Received) [KB/KB]: 0,00 / 0,00
Output Power (Up/Down) [dBm]: 12,0 / 14,0
Line Attenuation (Up/Down) [dB]: 15,5 / 29,5
SN Margin (Up/Down) [dB]: 7,5 / 12,0
Vendor ID (Local/Remote): TMMB / BDCM
Loss of Framing (Local/Remote): 0 / 0
Loss of Signal (Local/Remote): 0 / 0
Loss of Power (Local/Remote): 0 / 0
Loss of Link (Remote): 0
Error Seconds (Local/Remote): 0 / 0
FEC Errors (Up/Down): 5.851 / 595.902
CRC Errors (Up/Down): 41 / 17
HEC Errors (Up/Down): 115 / 8

modem 4
Link Information
Uptime: 6 days, 17:56:08
Modulation: G.992.5 annex A
Bandwidth (Up/Down) [kbps/kbps]: 1.020 / 6.141
Data Transferred (Sent/Received) [KB/KB]: 0,00 / 0,00
Output Power (Up/Down) [dBm]: 12,0 / 5,0
Line Attenuation (Up/Down) [dB]: 15,5 / 29,5
SN Margin (Up/Down) [dB]: 7,5 / 6,5
Vendor ID (Local/Remote): TMMB / BDCM
Loss of Framing (Local/Remote): 0 / 0
Loss of Signal (Local/Remote): 1 / 0
Loss of Power (Local/Remote): 0 / 0
Loss of Link (Remote): 0
Error Seconds (Local/Remote): 1 / 0
FEC Errors (Up/Down): 4.866 / 17.871.123
CRC Errors (Up/Down): 2 / 34
HEC Errors (Up/Down): 43 / 25

The only thing that stands out to me is that the FEC errors are very high on a couple of lines. Is this anything to worry about?
DSL_Ricer
Premium Member
join:2007-07-22

DSL_Ricer

Premium Member

What I find slightly odd in those line stats is that, while attenuation is roughly the same on all lines, your SNR margin varies from 6.5 dB (marginal) to 12 dB (quite good). Others would probably be better at debugging this issue, if it is one.

Don't worry too much about the FEC errors. As long as your CRC error count remains under a few per second, it's good (unless you need to sustain a single high-bandwidth, high-latency TCP connection, in which case you need ~0 CRC errors).
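
As a concrete example, modem 1 above works out to a vanishingly small CRC rate:

```shell
# CRC error rate for modem 1: 197 downstream CRC errors over an uptime of
# 6 days 17:48:00 (582,480 seconds) -- far below "a few per second"
awk 'BEGIN { up = 6*86400 + 17*3600 + 48*60
             printf "%.6f CRC errors/sec (uptime %d s)\n", 197 / up, up }'
# prints: 0.000338 CRC errors/sec (uptime 582480 s)
```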
westavmax
join:2010-10-01
Calgary, AB

westavmax

Member

So, do you guys think that my current stalling/freezing issue is more likely due to line quality than a configuration problem?