pflogBueller? Bueller? MVM join:2001-09-01 El Dorado Hills, CA 1 edit |
pflog
MVM
2008-Sep-23 11:18 am
Warning to Intel e1000e owners
I thought I'd pass this on: » www.heise-online.co.uk/n ··· -/111583
It appears the Intel e1000e card can be bricked by the 2.6.27-rc1 kernel. So be careful if you have one of these cards and are planning on installing a newer distro that may be using a 2.6.27 kernel! I admit I haven't fully read the details yet. Reading now, but just a heads up! |
|
pflog 1 edit |
pflog
MVM
2008-Sep-23 11:22 am
Re: New OS feature - brick your hardware!
Here's someone from the e1000 driver team commenting on the issue:
I work on the e1000 team (including the e1000e driver) and here is what we know. A panic in another driver (believed to be the gfx driver, but uncertain) scribbles over the NIC/LOM non-volatile memory (NVM). This is only happening with the 2.6.27-rc kernels on ICHx systems. Since the NIC/LOM NVM is part of the whole BIOS image, other things in the system could be affected by this driver panic as well. An update of the system BIOS will restore the NIC/LOM to be operational. We have some patches under test right now that we will be releasing later today to protect the NIC/LOM NVM. That should help narrow down who is scribbling over NVM.
And here's the link to the OpenSUSE mailing list from an Intel dev: » lists.opensuse.org/opens ··· 017.html |
|
jdongEat A Beaver, Save A Tree. Premium Member join:2002-07-09 Rochester, MI |
to pflog
Re: Warning to Intel e1000e owners
Nice heads-up. I posted a warning earlier this morning to the UbuntuForums regarding this issue, with a more complete writeup: » ubuntuforums.org/showthr ··· t=927943
While it appears to be a random event, the consequences are pretty serious. As I commented at the end, it puts a whole new perspective on what it means when vendors give warnings about testing prerelease software. |
|
joako Premium Member join:2000-09-07 /dev/null |
to pflog
FWIW the SLED 11.0 beta has e1000e on blacklist.... |
|
jdongEat A Beaver, Save A Tree. Premium Member join:2002-07-09 Rochester, MI |
jdong
Premium Member
2008-Sep-27 4:18 pm
said by joako: FWIW the SLED 11.0 beta has e1000e on blacklist....
We've taken the same approach at Ubuntu until this is resolved. The Intel folks are working on an EEPROM reflasher to reverse this damage, though I must caution people not to go around looking for random tools like IBAUTIL to fix this, as you may cause more damage than you have now. |
|
kleemanReduce blood pressure. Ignore trolls join:2000-07-29 Nyack, NY 884.6 923.7
|
to pflog
In reading through several threads on this issue I was unclear about which was the first kernel version with this issue. The e1000e driver was first introduced in the 2.6.26 version kernel. » lwn.net/Articles/278016/
Ubuntu Intrepid uses the 2.6.27 kernel which is where many reports started but I use the 2.6.26 kernel on hardy...... Any info on this issue jdong? |
|
jdongEat A Beaver, Save A Tree. Premium Member join:2002-07-09 Rochester, MI |
jdong
Premium Member
2008-Sep-28 10:00 pm
said by kleeman: In reading through several threads on this issue I was unclear about which was the first kernel version with this issue. The e1000e driver was first introduced in the 2.6.26 version kernel. » lwn.net/Articles/278016/ Ubuntu Intrepid uses the 2.6.27 kernel which is where many reports started but I use the 2.6.26 kernel on hardy...... Any info on this issue jdong?
Technically the fundamental problem exists in 2.6.26's e1000e driver too. The issue is that e1000e maps the registers controlling the flashing of NVRAM and LOM of the chipset into memory space. It only surfaced in 2.6.27 because some unknown component (most likely a graphics driver) spews random garbage into memory space when it crashes, and that garbage just happens to flip the NVRAM registers in exactly the way needed to write nonsense into the NVRAM. Technically, if you used some syscalls to zero all the RAM space on 2.6.26 in Hardy, you could hurt your NIC the same way, but I don't think anyone sane would do that. So the practical answer is that this is a 2.6.27 problem as far as the likelihood of "bricking" the NIC goes, but a 2.6.26+ problem as far as the fundamental design flaw of the driver goes. |
|
kleemanReduce blood pressure. Ignore trolls join:2000-07-29 Nyack, NY 884.6 923.7
|
kleeman
Member
2008-Sep-28 10:14 pm
Thanks. I am still a bit unclear about this though. Couldn't this (unknown) driver also be acting the same way in 2.6.26, or is there other info that rules that possibility out? BTW I got a bit paranoid and manually blacklisted the driver |
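For readers wondering what manual blacklisting can look like, here is a minimal sketch, not necessarily what was done in this post. It assumes a modprobe-based system where a `blacklist <module>` line in a file under /etc/modprobe.d/ keeps the module from being auto-loaded. The helper name `blacklist_module` and the file name in the usage note are made up for illustration.

```shell
# Hypothetical helper: add a "blacklist MODULE" line to a modprobe config file.
# Usage: blacklist_module MODULE CONF_FILE
blacklist_module() {
    module="$1"
    conf="$2"   # assumed path, e.g. a file under /etc/modprobe.d/

    # Skip the write if the exact line is already there, so reruns are harmless.
    if [ -f "$conf" ] && grep -qx "blacklist $module" "$conf"; then
        return 0
    fi

    printf 'blacklist %s\n' "$module" >> "$conf"
}
```

Run as root, something like `blacklist_module e1000e /etc/modprobe.d/blacklist-e1000e.conf` would do it. Note that a blacklist entry only stops automatic loading; an explicit `modprobe e1000e` still loads the driver, and a copy of the module in the initramfs may need separate handling.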
|
jdongEat A Beaver, Save A Tree. Premium Member join:2002-07-09 Rochester, MI |
jdong
Premium Member
2008-Sep-28 10:30 pm
said by kleeman: Thanks. I am still a bit unclear about this though. Couldn't this (unknown) driver also be acting the same way in 2.6.26 or is there other info that rules that possibility out? BTW I got a bit paranoid and manually blacklisted the driver
Well, the kernel devs are pretty confident the crashy driver was introduced in 2.6.27. But yeah, it is somewhat possible your worries are justified. Given that so many people using 2.6.26 have not reported any issues until moving to 2.6.27, I'm more inclined to believe it's a 2.6.27 problem. |
|
kleemanReduce blood pressure. Ignore trolls join:2000-07-29 Nyack, NY 884.6 923.7
1 edit |
kleeman
Member
2008-Sep-29 12:29 pm
Thanks for the additional info. I am assuming Intel will fix this before too long and hopefully provide a script to repair any damage.
Edit: Here is the kernel bug thread. » bugzilla.kernel.org/show ··· id=11382
Reading through it, it does appear that an Intel employee is on the case. |
|
jdongEat A Beaver, Save A Tree. Premium Member join:2002-07-09 Rochester, MI |
jdong to pflog
Premium Member
2008-Sep-29 12:45 pm
Yeah, Intel has been on the ball since the original report. An EEPROM reflasher utility is apparently in the works. |
|
rolfp5 join:2001-09-12 Oakland, CA |
to pflog
Intel devels have created an interim kernel patch that provides for re-enabling the driver: » lkml.org/lkml/2008/10/1/368
Mandriva has implemented it:
[root@localhost /]# rpm -q --changelog kernel-desktop-2.6.27-0.rc8.2mnb-1-1mnb2 | head
* Wed Oct 01 2008 Pascal Terjan 2.6.27-0.rc8.2mnb
  o Herton Ronaldo Krzesinski
    - Add fix for e1000e corruption bug and re-enable it (»lkml.org/lkml/2008/10/1/368). Closes #44147
* Wed Oct 01 2008 Pascal Terjan 2.6.27-0.rc8.1mnb
  o Herton Ronaldo Krzesinski
    - Fix sis190 ethernet device support on Asus P5SD2-VM motherboard (kernel.org bug #11073).
    - Add fix for sata_nv regression in latest 2.6.27 rcs (kernel.org
|
|
jdongEat A Beaver, Save A Tree. Premium Member join:2002-07-09 Rochester, MI |
jdong
Premium Member
2008-Oct-3 12:41 pm
Ubuntu's latest (post-beta) kernel upload adds this upstream fix too. |
|
salahx join:2001-12-03 Saint Louis, MO |
to pflog
The culprit has probably been found. Turns out it was due to a bug in part of ftrace (CONFIG_DYNAMIC_FTRACE). ftrace wasn't added until the 2.6.27 merge window (which is why no one with any earlier kernel saw it). There is already a fix for it, but it's been held off for 2.6.28 since there are quite a few changes involved. So, as a workaround, for 2.6.27.1 CONFIG_DYNAMIC_FTRACE is now marked BROKEN to prevent any further unintentional foot bullets. |
|
rolfp5 join:2001-09-12 Oakland, CA |
rolfp5
Member
2008-Oct-18 12:26 pm
That's some interesting, if largely incomprehensible to me, reading. In that thread I see CONFIG_DYNAMIC_FTRACE repeatedly; however,
[rolf@localhost ~]$ grep CONFIG_DYNAMIC_FTRACE /boot/config
[rolf@localhost ~]$
while,
[rolf@localhost ~]$ grep -i ftrace /boot/config
CONFIG_HAVE_FTRACE=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
# CONFIG_FTRACE is not set
[rolf@localhost ~]$ uname -r
2.6.27-desktop-0.rc8.2mnb
[rolf@localhost ~]$
so, I wonder, additionally, about the disparity... |
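One plausible reading of the disparity, offered as an assumption about how Kconfig normally behaves rather than something stated in the thread: the CONFIG_HAVE_* symbols only say that the architecture can support a feature, while CONFIG_DYNAMIC_FTRACE itself depends on CONFIG_FTRACE. With CONFIG_FTRACE unset, the dependent option would not be written to the config file at all, which would explain the empty grep. The sketch below tells apart "enabled", "explicitly disabled" and "not present"; the helper name `kconfig_state` is hypothetical, and /boot/config is simply the path used in the post.

```shell
# Hypothetical helper: report how a kernel config file treats one option.
# Usage: kconfig_state OPTION CONFIG_FILE
# Prints one of: enabled, module, disabled, absent
kconfig_state() {
    opt="$1"
    file="$2"

    if grep -q "^${opt}=y\$" "$file"; then
        echo enabled        # built into the kernel
    elif grep -q "^${opt}=m\$" "$file"; then
        echo module         # built as a loadable module
    elif grep -q "^# ${opt} is not set\$" "$file"; then
        echo disabled       # offered during configuration, but turned off
    else
        echo absent         # never offered, e.g. because a dependency is unmet
    fi
}
```

For example, `kconfig_state CONFIG_DYNAMIC_FTRACE /boot/config` could be compared with `kconfig_state CONFIG_FTRACE /boot/config` to see whether the option is off or simply never offered.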
|
SUMware Premium Member join:2002-05-21 1 edit |
to pflog
openSUSE Fix
Intel e1000e Corruption Fixed - Already in openSUSE 11.1 Beta2 (with exception of Debug, Vanilla Kernels)
October 16th, 2008 by Andreas Jaeger
The patches we did for the Intel e1000e network card for Beta2 protect the chip so that the NVRAM could not get corrupted anymore, and we indeed did not receive any new bug reports and could not reproduce the bug anymore on our systems. Further investigation by Intel has found the root cause of the problem, as Steven Rostedt wrote on the linux kernel mailing list: The dynamic ftrace code contained some fragile code that could write to ioremap-ed memory and thus corrupt the NVRAM. The issue could happen when the init functions of a module are freed and the nvram is vmapped there as well. The full story can be found on LKML.
Since 24th of September, we have disabled the dynamic ftrace code in our kernel of the day for all flavors except the debug and vanilla kernels (on x86 and x86-64 - it was not enabled on other architectures). We have also added the NVRAM protection patches to all kernel flavors. Therefore Beta2 already contains - by pure luck - not only the NVRAM protection but also no longer the broken code. Beta3 will contain the same fixes - and the kernel of the day has just been updated with dynamic ftrace code disabled also for the debug and vanilla kernels (with the update to 2.6.27.1).
So, if you're running a debug or vanilla kernel, I advise - to be on the safe side - updating to the 2.6.27.1 kernel of the day. For everybody else: The Beta2 and Beta3 kernels should not corrupt your Intel e1000e NVRAM.
I'd like to thank all that were involved in debugging and fixing the issues around this, including our kernel developers Karsten Keil and Jiri Kosina who debugged and worked on a solution, testers that fried their machine and helped debugging like Stephan Binner and Vladimir Botka, and the team at Intel for developing protection code and finding and fixing the root cause. |
|
|
MTB join:2007-06-29 Newport Beach, CA |
MTB to pflog
Member
2008-Oct-20 11:11 am
Re: Warning to Intel e1000e owners
Is this a 2.6.27 problem or did it start in 2.6.17? Some posts indicate trouble with some sort of merge at this point.
I am not sure that this is only related to the e1000 card, since I have one that works fine with openSUSE 11.0 (2.6.25), but the ipw2200 cards are now stinking up the place.
If the e1000 card is handled the way the ipw2200 is, I would not be surprised by what is going on, and it is possible that unstable drivers are getting into the mix as they are in openSUSE 11.0.
Intel and the DISTROS need to make a post as to what is really going on, how to get and/or install any Intel cards. At least package some of the drivers in a tested/recommended fashion and not the random free for all that is currently in place.
I personally do not plan on buying any more Intel products. These guys claim to offer support for the ipw2200 but not for the hardware. Sounds like more of a windows standard than a linux standard and just a bad direction to go. |
|
jdongEat A Beaver, Save A Tree. Premium Member join:2002-07-09 Rochester, MI |
jdong
Premium Member
2008-Oct-20 12:05 pm
This has nothing at all to do with ipw2200. The "problem" was introduced with the e1000e driver (NOT to be confused with e1000) but it couldn't be triggered until CONFIG_DYNAMIC_FTRACE from 2.6.27. |
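Since e1000 and e1000e are easy to mix up, one way to see which driver an interface is actually bound to is `ethtool -i`, whose output includes a `driver:` line. A minimal sketch follows; the function names are hypothetical and the interface name `eth0` in the usage note is an assumption.

```shell
# Hypothetical helper: read `ethtool -i IFACE` style output on stdin
# and print the value of its "driver:" line.
driver_name() {
    sed -n 's/^driver: *//p' | head -n 1
}

# Succeeds only when the driver is exactly e1000e (so plain e1000 does not match).
uses_e1000e() {
    [ "$(driver_name)" = "e1000e" ]
}
```

Usage would be along the lines of `ethtool -i eth0 | uses_e1000e && echo "this interface uses e1000e"`.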
|
MTB join:2007-06-29 Newport Beach, CA |
MTB
Member
2008-Oct-20 12:43 pm
Thanks for the info.
Was the "problem" triggered by CONFIG_DYNAMIC_FTRACE, the use of unstable drivers in distros, or both?
and
The fact that Intel does not appear to make hardware information available for open source projects. |
|
MTB 1 edit |
MTB to jdong
Member
2008-Oct-20 12:44 pm
. |
|
| MTB |
MTB to jdong
Member
2008-Oct-20 12:54 pm
The problem appears to extend further than stated.
said by openSUSE comment : Comment by bonux 2008-10-17 14:58:08
Where can I get the fix for 11.0 I need to download it and fix my Lenovo T60. It isn't working at the moment.. I am desperate, please help
I know this might be the wrong place for this but I can't help myself. The eth0 does not even appear on my list of interfaces
I will stand behind what I have said. |
|
pflogBueller? Bueller? MVM join:2001-09-01 El Dorado Hills, CA |
to MTB
said by MTB: The fact that Intel does not appear to make hardware information available for open source projects.
Huh? Their ethernet drivers are pretty much fully open source. |
|
Cabal Premium Member join:2007-01-21 |
Cabal to MTB
Premium Member
2008-Oct-20 1:32 pm
said by MTB: The fact that Intel does not appear to make hardware information available for open source projects.
Not only are Intel's specs (ethernet, video, chipset) open, but they fund their OSS driver development. |
|
MTB join:2007-06-29 Newport Beach, CA 2 edits |
MTB
Member
2008-Oct-20 1:59 pm
I will have to read up on the e1000e card, but here is how other projects go down: "NO Hardware Documentation"
said by »ipw2200.sourceforge.net/ : This project was created by Intel to enable support for the Intel PRO/Wireless 2915ABG Network Connection and Intel PRO/Wireless 2200BG Network Connection mini PCI adapters. This project (IPW2200) is intended to be a community effort as much as is possible given some working constraints (mainly, no HW documentation is available)
It should also be noted that the e1000 and e1000e may in fact be the same. I could only find an e1000 project.
said by »https://bugs.launchpad.n ··· ug/42572 : Citing Ben Collins from #256555: "The 2.6.26 kernel and 2.6.27 kernel have the exact same e1000e driver (one which we downloaded from Intel's e1000 sf.net project)."
So, although this problem has been fixed since months (patch posted by an Intel employee in Oct 07, patch applied upstream Jan 08, released with Linux 2.6.25), it obviously hasn't been incorporated into the version of e1000e which was downloaded from sf.net and integrated into Ubuntu.
Why is Ubuntu not using the upstream version at all?
The point here is that this may be a bigger issue than it looks, and there appears to be room for improvement by all parties involved. I am not a driver expert, but it seems to me that drivers would be extremely hard to write w/o hardware documentation, and hence random behavior should be expected. |
|
jdongEat A Beaver, Save A Tree. Premium Member join:2002-07-09 Rochester, MI |
to pflog
The project is called e1000 but the e1000e and e1000 are separate drivers.
The problem is inherently in e1000e, but nothing triggered it until CONFIG_DYNAMIC_FTRACE, which contained a bug that accidentally allowed memory writes to certain arbitrary locations. That bug could also have introduced other kinds of nasty corruption that go away with a reboot (or, more scarily, corrupted files if the writes landed in the page cache area)... |
|
pflogBueller? Bueller? MVM join:2001-09-01 El Dorado Hills, CA |
to MTB
I found this page in about 10 seconds. » www.intel.com/design/net ··· docs.htm |
|
| |
Apparently Gentoo's kernel devs have applied the patch to this "bug" in the 2.6.27 gentoo-sources. According to portage, the bug still exists but it will no longer damage the hardware. |
|
jdongEat A Beaver, Save A Tree. Premium Member join:2002-07-09 Rochester, MI |
jdong
Premium Member
2008-Oct-20 7:02 pm
said by KodiacZiller: Apparently Gentoo's kernel devs have applied the patch to this "bug" in the 2.6.27 gentoo-sources. According to portage, the bug still exists but it will no longer damage the hardware.
Yes, the dynamic FTRACE bug is still present and the fix for it will be nontrivial, but now that it's marked BROKEN, competent kernel configurers won't be enabling it by accident. In addition, the e1000e driver now locks the registers shortly after initializing, so this shouldn't ever be a problem. |
|