Thinkpad T60 - Network latency with FC5

smithrs
Posts: 11
Joined: Thu Jul 13, 2006 12:54 pm
Location: Los Angeles, CA

#1 Post by smithrs » Sat Jul 15, 2006 4:02 pm

Hi everyone. Just got a Thinkpad T60 (2007-66U) and am very happy
except for the network performance under Fedora FC5 Linux.

What I'm seeing is that network performance from the laptop going out
is fine, but incoming traffic is subjected to a bizarre form of
latency where inbound pings look like this:

64 bytes from 192.168.1.20: icmp_seq=2197 ttl=64 time=1000 ms
64 bytes from 192.168.1.20: icmp_seq=2198 ttl=64 time=0.555 ms
64 bytes from 192.168.1.20: icmp_seq=2199 ttl=64 time=1000 ms
64 bytes from 192.168.1.20: icmp_seq=2200 ttl=64 time=0.416 ms
64 bytes from 192.168.1.20: icmp_seq=2201 ttl=64 time=1000 ms
64 bytes from 192.168.1.20: icmp_seq=2202 ttl=64 time=0.325 ms
64 bytes from 192.168.1.20: icmp_seq=2203 ttl=64 time=1000 ms
64 bytes from 192.168.1.20: icmp_seq=2204 ttl=64 time=0.218 ms
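
The every-other-packet pattern in the output above can be checked mechanically rather than by eye. A minimal sketch in Python, assuming ping output in the format shown; the function names and the 500 ms cut-off are illustrative choices for this example, not anything defined by ping or the e1000 driver.

```python
import re

# Matches the "icmp_seq=N ... time=T ms" part of a ping reply line.
PING_RE = re.compile(r"icmp_seq=(\d+).*?time=([\d.]+) ms")

def parse_ping(text):
    """Return a list of (sequence number, round-trip time in ms) tuples."""
    return [(int(m.group(1)), float(m.group(2))) for m in PING_RE.finditer(text)]

def alternates(samples, threshold_ms=500.0):
    """True if consecutive replies flip between slow and fast every time.

    threshold_ms is an arbitrary cut-off chosen for this example.
    """
    slow = [rtt >= threshold_ms for _, rtt in samples]
    return len(slow) >= 2 and all(a != b for a, b in zip(slow, slow[1:]))

# First four replies copied from the post above.
sample = """\
64 bytes from 192.168.1.20: icmp_seq=2197 ttl=64 time=1000 ms
64 bytes from 192.168.1.20: icmp_seq=2198 ttl=64 time=0.555 ms
64 bytes from 192.168.1.20: icmp_seq=2199 ttl=64 time=1000 ms
64 bytes from 192.168.1.20: icmp_seq=2200 ttl=64 time=0.416 ms
"""

samples = parse_ping(sample)
print(samples)
print("alternating:", alternates(samples))
```

Feeding it a longer capture (for example, saved ping output from before and after a driver change) gives a quick yes/no on whether the strict slow/fast alternation is still there.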

So far I have tried the following:

- Reboot into Windows XP (I run dual-boot), go into driver properties,
turn off "Deep Smart Power Down" since that seems to be causing
problems for people.

- Download and re-build the e1000 module.

- Update Kernel to latest version (2.6.17-2157-smp).

The other thing I noticed is that when I'm downloading a bunch of
stuff (for example during a yum update), the inbound performance seems
to smooth out. I don't know what this means, maybe full/half
negotiation is getting messed up when the interface is relatively
idle. In case it matters I'm using a Linksys 10/100 switch.

Thanks, as always.

xtern0
Freshman Member
Posts: 50
Joined: Fri Jul 14, 2006 8:01 pm
Location: Fort Worth, Texas

That is weird...

#2 Post by xtern0 » Tue Jul 18, 2006 8:07 am

That is strange...

Since you mentioned the e1000 module, I'm guessing that the only interface up is your ethernet. What's the traceroute show? I assume you have iptables running; but did you enable SELinux as well?

What is really strange is that it's every other packet. You may want to check your IP address allocations on the network as well.

What about a ping to 127.0.0.1 (localhost)?

djc

#3 Post by smithrs » Wed Jul 19, 2006 1:38 pm

- SELinux is disabled

- Nothing going on with iptables

- Pings to localhost work fine

- No IP address collisions

- All basic network setup is sane

I've been poking around the Linux kernel mailing lists and there just
seem to be a few people like me who run into e1000 performance issues.
Do a google search on terms like "t60 e1000 network latency" and
you'll see what I'm talking about. There seems to be some hint that
it might be a kernel issue, although I've tried 2.6.15 thru 17 with
the same result.

I'll post back if I run into anything else. Thanks.

Fedora Issue?

#4 Post by xtern0 » Sun Jul 23, 2006 3:35 am

So I take it this is a Fedora issue? I haven't experienced it in any of the other distros. I haven't put FC5 on the T60 yet; probably won't bother at this point.

I used to be, and still am in some areas, hardcore (and I mean gung-ho) redhat/FP, so don't take this the wrong way: have you considered installing another distro geared more towards the desktop/functionality environment and less towards the development environment? I just think your time might be better spent on something more productive than efficiently tethering yourself to a drop. Of course, if you're hacking the kernel for the same reason men climb mountains, then don't pay any attention to me. ;)

On a side note, I am surprised by this problem.

But definitely post when you resolve the issue.

Cheers,

djc

#5 Post by smithrs » Sun Jul 23, 2006 11:22 pm

No it's not just FC5. Take a look here, for example:

http://www.nabble.com/T60-and-Network-t1619775.html

I'm not enough of a kernel hacker to "get" what the guy announced as a solution.

There are also a bunch of e1000 fixes coming in 2.6.18.

#6 Post by xtern0 » Mon Jul 24, 2006 12:52 am

OK. I apologize if you've already tried both of these things....

1. Have you updated your BIOS to the latest release from IBM/Lenovo?
http://www-307.ibm.com/pc/support/site. ... MIGR-63024

2. Have you downloaded the latest driver from sourceforge?
http://sourceforge.net/projects/e1000 and then reviewed the README file for instructions on how to install and troubleshoot known issues?

After looking at that dude's post and the other posts talking about updating the BIOS, I'm guessing it's a NAPI issue or a WoL (Wake on LAN) combined with LKM (Loadable Kernel Module) issue. I really don't think it's kernel support for e100... I just don't buy that, we're talking about loadable kernel modules; however, I could be wrong.

IF THE BIOS UPDATE DOESN'T WORK:
When/if you decide to compile the new driver and load it (or if you already have), try typing this (instead of just 'make'):
make CFLAGS_EXTRA=-DE1000_NO_NAPI install

Of course, I just got that from googling and the readmes, so if you've already tried them, please disregard.

Hopefully, I'll run into a similar issue when we start loading Debian on our other laptops. If I do, I will post the resolution if I figure it out.

Again, this is likely an LKM (loadable kernel module) issue; you should not have to bind the driver to the base kernel in order to get it to work right (that is, recompile the kernel). You also shouldn't have to remove support for other modules to get it to work right.

I think the reason the kernel rebuilds are all working is that you would bind the driver into the base kernel, thereby allowing the driver to load properly. The more I think about this, the more I am leaning towards a WoL problem and the LKM not loading at the right time or something along those lines. Hence my recommendation to make absolutely sure the BIOS is up to date.

I may be wrong in this area, so if someone would politely correct me, I'd greatly appreciate it.

Cheers,

X

#7 Post by smithrs » Mon Jul 24, 2006 2:42 am

- BIOS is at 1.06.

- Intel driver is at 7.1.9

I hear ya on the question of when and if the module is binding. The
README mentions that the e1000 module is written on purpose to have a
zero reference count when you use lsmod. I don't know if there is a
more clever use of lsmod or some other command to see what the
module is really doing.

#8 Post by xtern0 » Mon Jul 24, 2006 3:08 am

Latest BIOS release is 1.07; however, the controller version says they are the same. I wouldn't trust it. Try updating anyway and see what happens. Can't hurt.

Try dmesg and see if that gives you any ideas.

Also check /proc/modules.
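
For reference, each line of /proc/modules has the form "name size refcount dependencies state address", so the e1000 entry can be picked out with a few lines of Python. This is only a minimal sketch: the sample lines are made up for illustration (sizes and addresses will differ on a real box), and find_module is a hypothetical helper name.

```python
from collections import namedtuple

ModuleInfo = namedtuple("ModuleInfo", "name size refcount used_by state")

def find_module(proc_modules_text, name):
    """Return a ModuleInfo for `name`, or None if the module is not listed."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[0] != name:
            continue
        # The dependency field is "-" when nothing uses the module,
        # otherwise a comma-terminated list such as "uhci_hcd,".
        used_by = [] if fields[3] == "-" else [d for d in fields[3].split(",") if d]
        return ModuleInfo(fields[0], int(fields[1]), int(fields[2]), used_by, fields[4])
    return None

# Illustrative sample only; on a real system use open("/proc/modules").read().
sample = (
    "e1000 118208 0 - Live 0xf8a1c000\n"
    "usbcore 112644 2 uhci_hcd, Live 0xf88a0000\n"
)
print(find_module(sample, "e1000"))
```

Note that a refcount of 0 for e1000 is what the README describes as intentional, so on its own it does not show that the module failed to bind.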

Beyond that, without being at the box, and without having an identical configuration in front of me, there isn't a whole lot more I can suggest beyond hunting through the README and running a bare minimum make on the driver module, seeing if excluding certain functionality gets rid of the problem, and then remaking the module (and testing each build) bit by bit until you identify exactly what area the problem is originating in.

I really don't feel that binding the driver to the base kernel is an acceptable solution; it's just a work-around in this particular case.

However, learning to build your own kernel is a very valuable skill; it comes in very handy if you are ever an admin that is striving for the greatest efficiency out of your servers. So fixing the matter by compiling a custom kernel would be a worthwhile endeavor. You may just discover that you prefer to customize your kernel to obtain the greatest performance from your Thinkpad. It's really not all that tough once you've done it a couple of times. More like changing the oil in your car.

Good luck resolving the problem and be sure to post the solution.

Cheers,

X

#9 Post by smithrs » Mon Jul 24, 2006 1:04 pm

One last question...

I don't see the procedure for building the intel e1000 driver into the kernel.

Do I just symlink it to the /usr/src/kernels/<version>/net/ directory and then build as usual?

#10 Post by xtern0 » Mon Jul 24, 2006 5:25 pm

That's because the patch is for the module. Once you compile the new 3rd party driver, the module should load properly. You will need the kernel-development package to do this.

To build it statically into the kernel you need to research how to build a custom kernel. Specifically, look for how to copy your existing configuration into the new one (the one you edit prior to compiling), and then how to use a config program to dictate the behavior of each of the kernel modules (not available, LKM, static).
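
As a rough illustration of those three states (not available, LKM, static): in a kernel .config file they normally appear as "# CONFIG_X is not set", "CONFIG_X=m" and "CONFIG_X=y". The sketch below reports which one applies to a given symbol. It assumes the usual .config syntax; the sample text and the option_state name are made up for the example, and CONFIG_E100, CONFIG_E1000 and CONFIG_E1000_NAPI are the symbol names I would expect in a 2.6-era tree, so verify them against your own source.

```python
import re

def option_state(config_text, symbol):
    """Classify a Kconfig symbol as 'static', 'module', 'disabled' or 'absent'."""
    for line in config_text.splitlines():
        line = line.strip()
        if line == "# %s is not set" % symbol:
            return "disabled"
        m = re.match(r"%s=(.*)$" % re.escape(symbol), line)
        if m:
            # y = built into the kernel image, m = loadable module;
            # any other value (numbers, strings) is returned unchanged.
            return {"y": "static", "m": "module"}.get(m.group(1), m.group(1))
    return "absent"

# Illustrative fragment, not taken from any real kernel configuration.
sample = """\
CONFIG_E100=m
CONFIG_E1000=m
# CONFIG_E1000_NAPI is not set
"""

for sym in ("CONFIG_E100", "CONFIG_E1000", "CONFIG_E1000_NAPI"):
    print(sym, "->", option_state(sample, sym))
```

Running it against the .config of the currently booted kernel before and after editing is a cheap way to confirm that the config tool actually recorded the change you intended.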

Good places to start:
http://www.kernelnewbies.org
http://www.digitalhermit.com/linux/Kern ... HOWTO.html

Given the terms and administration you are discussing, I highly recommend that you walk through the exercise of building your own kernel and having it boot successfully at least once. It will clear up a lot of your current questions, while filling your head up with twice as many other questions as you had before. Ah, the beauty of learning....

Heck, this thread alone has reminded me that I am overdue for playing around with a Linux kernel using the included tools again; been stuck in the OpenBSD world for too long...

Have fun!

X

#11 Post by xtern0 » Tue Jul 25, 2006 6:50 am

Ahhhhhh.....

make gconfig

You kids don't know how good you have it nowadays! No eye-assaulting ncurses menus, and most of all, no more friggin' having to answer Y, N, M 500 times with the vain hope that you won't screw up and have to start all over again!

If you do start experimenting with kernel hacking, type "make config" just once to start the process I had to go through the very first time I tried to build my own kernel on a Toshiba laptop with a 500MB hard drive ($4k at the time; quite a little prize). Time yourself and let me know just how far you make it before you Ctrl+C while saying "eff this!" I remember discovering menuconfig and being in heaven at that time.

Cheers,

X

#12 Post by smithrs » Thu Jul 27, 2006 12:56 am

Thanks for your help on this. I'll let you know how it goes.

jaddle
Posts: 14
Joined: Fri Jul 28, 2006 2:19 pm
Location: Montreal, Canada

#13 Post by jaddle » Thu Aug 03, 2006 9:37 am

Have you had any luck with this? I'm seeing the same problem on my X60s, using debian. It's very annoying!

The most annoying thing is that I remember seeing a web page that described this problem exactly, along with a fix.. but it was about a month ago, and I can't find it now, despite spending ages poking around google.

frankausmtank
Freshman Member
Posts: 111
Joined: Thu Aug 03, 2006 5:06 am
Location: Berlin, Germany

#14 Post by frankausmtank » Thu Aug 03, 2006 9:23 pm

Exact same problem here, T60, latest BIOS (1.09a), using debian sarge with a 2.6.16.20 custom kernel. I followed the advice at http://www.nabble.com/T60-and-Network-t1619775.html , deactivating the e100 support and activating NAPI (whatever that means), but without any change in behaviour. It seems that not only are ping times horrid (exactly like smithrs' times - every 2nd one is around 1000ms), but the data transfer rates within the network are also limited to about 100kbps. I also quick-checked with the xubuntu 6.06 live cd (afaik essentially a debian etch system) without any difference.

I'll try again when 2.6.18 final is out and keep you posted. I also have yet to try the sourceforge driver.

#15 Post by frankausmtank » Wed Sep 20, 2006 4:43 am

I'll try again when 2.6.18 final is out and keep you posted.
So..
I just compiled 2.6.18, but there's no difference in behaviour. STILL haven't tried the sourceforge driver.

#16 Post by smithrs » Thu Sep 21, 2006 2:24 am

I'm seeing some improvement with kernel-smp-2.6.17-1.2187 under FC5.

It went from this (2.6.17-1.2174):

64 bytes from helium (192.168.1.20): icmp_seq=357 ttl=64 time=351 ms
64 bytes from helium (192.168.1.20): icmp_seq=358 ttl=64 time=1000 ms
64 bytes from helium (192.168.1.20): icmp_seq=359 ttl=64 time=0.353 ms
64 bytes from helium (192.168.1.20): icmp_seq=360 ttl=64 time=1000 ms
64 bytes from helium (192.168.1.20): icmp_seq=361 ttl=64 time=0.692 ms
64 bytes from helium (192.168.1.20): icmp_seq=362 ttl=64 time=1000 ms
64 bytes from helium (192.168.1.20): icmp_seq=363 ttl=64 time=0.545 ms
64 bytes from helium (192.168.1.20): icmp_seq=364 ttl=64 time=1000 ms
64 bytes from helium (192.168.1.20): icmp_seq=365 ttl=64 time=0.418 ms
64 bytes from helium (192.168.1.20): icmp_seq=366 ttl=64 time=1000 ms

To this (2.6.17-1.2187):

64 bytes from helium (192.168.1.20): icmp_seq=502 ttl=64 time=97.3 ms
64 bytes from helium (192.168.1.20): icmp_seq=503 ttl=64 time=97.2 ms
64 bytes from helium (192.168.1.20): icmp_seq=504 ttl=64 time=0.266 ms
64 bytes from helium (192.168.1.20): icmp_seq=505 ttl=64 time=0.193 ms
64 bytes from helium (192.168.1.20): icmp_seq=506 ttl=64 time=0.611 ms
64 bytes from helium (192.168.1.20): icmp_seq=507 ttl=64 time=0.508 ms
64 bytes from helium (192.168.1.20): icmp_seq=508 ttl=64 time=97.4 ms
64 bytes from helium (192.168.1.20): icmp_seq=509 ttl=64 time=97.3 ms
64 bytes from helium (192.168.1.20): icmp_seq=510 ttl=64 time=97.2 ms
64 bytes from helium (192.168.1.20): icmp_seq=511 ttl=64 time=97.1 ms
64 bytes from helium (192.168.1.20): icmp_seq=512 ttl=64 time=97.6 ms
64 bytes from helium (192.168.1.20): icmp_seq=513 ttl=64 time=25.5 ms
64 bytes from helium (192.168.1.20): icmp_seq=514 ttl=64 time=97.4 ms
64 bytes from helium (192.168.1.20): icmp_seq=515 ttl=64 time=97.3 ms

It seems to pick a number between 90ms and 130ms and just hover there
while the CPU is idle. And when CPU activity perks up the ping times
drop down to sub-millisecond.

jon
Posts: 42
Joined: Thu May 20, 2004 9:24 am
Location: Coonabarabran

#17 Post by jon » Thu Sep 21, 2006 9:05 am

Can you post the .config for that kernel?

I just tried 2.6.18 and I'm getting the large 1000ms delays. It's worse than in XP. :(

#18 Post by smithrs » Thu Sep 21, 2006 1:32 pm

I'm using a pre-built kernel.

http://fedoranews.org/cms/node/1604

#19 Post by smithrs » Wed Oct 11, 2006 2:23 pm

Found more info and a possible workaround (read all the way to the bottom):

http://bugme.osdl.org/show_bug.cgi?id=6929

#20 Post by smithrs » Wed Oct 11, 2006 2:31 pm

Just verified the fix.

Added this line to modprobe.conf:

options e1000 RxIntDelay=5

Looks good.
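
To double-check that the line is present and spelled correctly, the options for a module can be parsed out of modprobe.conf. A minimal sketch, assuming the simple "options <module> key=value ..." syntax shown above; module_options is a hypothetical helper, and the sample text stands in for the real file (the "alias" line in it is illustrative only). The setting is read when the module is loaded, so the module has to be reloaded, or the machine rebooted, before a change takes effect.

```python
def module_options(conf_text, module):
    """Collect key=value options declared for `module` in modprobe.conf text."""
    opts = {}
    for line in conf_text.splitlines():
        fields = line.split("#", 1)[0].split()  # ignore comments
        if len(fields) >= 3 and fields[0] == "options" and fields[1] == module:
            for item in fields[2:]:
                key, sep, value = item.partition("=")
                if sep:  # keep only well-formed key=value items
                    opts[key] = value
    return opts

# Stand-in for the contents of modprobe.conf; only the options line is from the post.
sample = """\
alias eth0 e1000
options e1000 RxIntDelay=5
"""

print(module_options(sample, "e1000"))
```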

#21 Post by frankausmtank » Thu Oct 12, 2006 4:38 am

Thanks smithrs.
I currently can't try the fix because my driver is not compiled as a module. Is there a way to set such options when the driver is compiled directly into the kernel?
