LinuxMCE Forums

General => Installation issues => Topic started by: pigdog on May 12, 2009, 08:38:50 pm

Title: RESOLVED: Finally! rtl8139 nic pxe/gpxe/grub boot after alpha 2.15
Post by: pigdog on May 12, 2009, 08:38:50 pm
Hi all,

Prior to alpha 2.15 I was pxe booting a MD with a Realtek 8139 (10ec:8139) in a PCI slot because the Nvidia Boot ROM (forcedeth) was toast.

After my rebuild that system would not pxe boot. I get a big trace list, a 60 second timeout, "can't open /tmp/eth0-conf" and a kernel panic message.

The eth0 link says it's up before the trace list and again before the kernel panic.

There is a comment about rtl8139/8139C/8139C+ (rev.10) and chipset incompatible. (Modified - corrected a typo 8138C+ to 8139C+.)

I've tried 8139C & D chipsets.

I've tried pxe, gpxe and grub boot disks and flavours of R8139 and RTL8139 builds.

All with the same results.

Just before 2.15 I was running the .14 generic.  So, I upgraded to the .14 generic from .11 but still have the same thing happening.

This was rock solid prior to 2.15.

Before I spend any more time on this I was wondering if anyone else has experienced the same thing?

I've read a bunch of stuff about Realtek chipsets in other forums but this thing was good to go up to 8.10 alpha 2.15.

Thanks all.
Title: Re: rtl8139 nic pxe/gpxe/grub boot after alpha 2.15
Post by: colinjones on May 13, 2009, 01:37:40 am
hmm... perhaps this is a regression due to the Realtek 8168/8169 bug. Are you aware of this? There is a wiki article on resolving it. By first removing the 8168 ID from the 8169 driver, then recompiling a new version of the 8168 driver. Perhaps something has even removed the 8169 ID from the 8169 driver?! If you are not using the 8168 chipset in any way, then perhaps you should try downloading a new copy of the 8169 driver and replacing the old version in the diskless folder for your MD image?
Title: Re: rtl8139 nic pxe/gpxe/grub boot after alpha 2.15
Post by: Zaerc on May 13, 2009, 02:14:56 am
Just to point out the obvious: 8139 != 8168/8169, the 8139 is a pretty old (and widespread) 10/100 mbit chipset and has been supported by the Linux kernel since like forever and a half.
Title: Re: rtl8139 nic pxe/gpxe/grub boot after alpha 2.15
Post by: colinjones on May 13, 2009, 04:57:37 am
oops! misread it! Disregard my comments, thanks Zaerc!
Title: Re: rtl8139 nic pxe/gpxe/grub boot after alpha 2.15
Post by: rafik24 on May 13, 2009, 05:06:56 pm
 Hi Pigdog,

 Check that the dhcp daemon is bound to your core interface using: netstat -anop | grep dhcp

I had the same issue many times where my MD would not pxe boot because the dhcp server got reconfigured by lmce and the dhcp range arg was missing in /etc/dhcp/dhcpd.conf

 Have a look
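
For anyone following along, here is a rough sketch of the second half of that check as a small shell helper. The function names dhcp_ranges and has_dhcp_range are made up for illustration and are not part of LinuxMCE. The config path is the one named above; it is an assumption that it matches a given install (the daemon may keep its config elsewhere).

```shell
#!/bin/sh
# Sketch only: hypothetical helpers, not part of LinuxMCE.

# Print every active "range" statement in a dhcpd config file.
# Lines that start with "#" are not matched, so commented-out ranges are ignored.
dhcp_ranges() {
    grep -E '^[[:space:]]*range[[:space:]]' "$1"
}

# Succeed only if the config defines at least one address range.
has_dhcp_range() {
    [ -n "$(dhcp_ranges "$1")" ]
}

# Example, run on the core as root:
#   netstat -anop | grep dhcp      # is the daemon listening on UDP port 67?
#   has_dhcp_range /etc/dhcp/dhcpd.conf || echo "no range statement found"
```

If has_dhcp_range fails, the daemon may still be listening but will have no addresses to hand out to a new MD.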

Regards,

Rafik

Title: Re: rtl8139 nic pxe/gpxe/grub boot after alpha 2.15
Post by: pigdog on May 14, 2009, 12:48:45 am
Hey rafik24,

Thanks for the heads up.

I'll try it tomorrow.  Soccer season started Monday night, one kid one night at 6, the other the next - Mon - Thurs.

Then Sat. morning/afternoon leagues.

They're getting exercise and I'm getting West Nile from the mosquitos!

Thanks.
Title: Re: rtl8139 nic pxe/gpxe/grub boot after alpha 2.15
Post by: pigdog on May 14, 2009, 05:36:20 pm
Hi rafik24,

This is my output...

dcerouter_112566:/etc/default# netstat -anop | grep dhcp
udp        0      0 0.0.0.0:67              0.0.0.0:*                           5305/dhcpd3      off (0.00/0/0)
raw        0      0 0.0.0.0:1               0.0.0.0:*               7           5305/dhcpd3      off (0.00/0/0)
unix  2      [ ]         DGRAM                    16517    5305/dhcpd3

I've got ranges...

                 allow unknown-clients;
                 range 192.168.80.129 192.168.80.130;
                 range 192.168.80.132 192.168.80.254;

and webadmin shows my pluto device range of 192.168.80.2 thru 192.168.80.128

If I try to use another model of Realtek chip when I build my pxe boot disk it won't load.

The 8139 pxe .zdsk I created matches the lspci report (10ec-8139)

When I try to pxe boot off the dhcp server it runs the 8139cp.ko file from the server and tells me the 8139c chipset is not compatible.

He then runs the 8139too.ko as secondary driver with no success.
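
A possible way to see why 8139cp gets tried first: as far as I know, both 8139cp and 8139too claim PCI ID 10ec:8139 in kernels of this era, and 8139cp backs off when the chip is not an 8139C+. The sketch below looks up which modules claim a PCI ID in modules.alias. The function name modules_for_pci_id is hypothetical, and the default alias file path assumes you run it against the same kernel the MD boots.

```shell
#!/bin/sh
# Sketch only: hypothetical helper for inspecting modules.alias.

# modules_for_pci_id VENDOR DEVICE [ALIAS_FILE]
# Prints the names of all kernel modules whose alias matches the PCI ID.
modules_for_pci_id() {
    vendor=$(printf '%s' "$1" | tr 'a-f' 'A-F')
    device=$(printf '%s' "$2" | tr 'a-f' 'A-F')
    alias_file="${3:-/lib/modules/$(uname -r)/modules.alias}"
    # modules.alias lines look like:
    #   alias pci:v000010ECd00008139sv*sd*bc*sc*i* 8139too
    awk -v pat="pci:v0000${vendor}d0000${device}" \
        '$1 == "alias" && index($2, pat) == 1 { print $3 }' "$alias_file" | sort -u
}

# Example:
#   modules_for_pci_id 10ec 8139
```

If both drivers show up for 10ec:8139, the "not compatible" line from 8139cp would be expected noise rather than the actual failure.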

I've checked the .7, .11 and .14 generics but both .ko are the same file size.

I've also tried one 8139C and two 8139D NICs.

Cheers
Title: Re: rtl8139 nic pxe/gpxe/grub boot after alpha 2.15
Post by: pigdog on May 15, 2009, 01:13:36 am
Hi,

I thought I'd see what would happen if I forced 8139too to boot first instead of 8139cp.

I went into mkinitramfs as per Unrecognized NIC but still no joy. :(

Tried 8139cp and 8139too
Title: Re: rtl8139 nic pxe/gpxe/grub boot after alpha 2.15
Post by: pigdog on May 15, 2009, 05:51:05 am
Hi,

O.K.  So I installed Ubuntu onto a usb memory stick and booted the MD.

I did a lshw and it told me my Realtek was using the 8139too driver version 0.9.28 and I had an IP address of 192.168.80.13.
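
Besides lshw, the bound driver can also be read straight from sysfs. This is a minimal sketch; nic_driver is a made-up helper name, and the optional second argument exists only so the function can be pointed at a different sysfs root.

```shell
#!/bin/sh
# Sketch only: hypothetical helper that reads the driver symlink in sysfs.

# nic_driver IFACE [SYSFS_ROOT]
# Prints the name of the kernel driver bound to a network interface.
nic_driver() {
    iface="$1"
    sysfs="${2:-/sys}"
    link="$sysfs/class/net/$iface/device/driver"
    # No driver symlink means the interface is missing or has no bound driver.
    [ -e "$link" ] || return 1
    basename "$(readlink -f "$link")"
}

# Example:
#   nic_driver eth0
```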

I browsed the net a bit, looked at the latest BBC news headlines (some monk wrote a book about sex for couples - wtf?).

So my card works.  I'm going on the core and blacklist the 8139cp.ko.

Thanks.

p.s. forget that.  I tried adding it to the blacklist, still ran 8139cp.  Tried creating a local-8139cp blacklist, still ran 8139cp.

I hate to give up, but, I just might have to get another NIC!

Title: Re: rtl8139 nic pxe/gpxe/grub boot after alpha 2.15
Post by: pigdog on May 16, 2009, 10:05:55 pm
O.K.

When I upgraded to 810 alpha 2.20 my boot up acted a little differently.

8139cp loaded but I didn't get the comment about rtl8139/8139C/8139C+ (rev.10) and chipset incompatibility.

8139cp was v1.3 Mar 22, 2004.

8139too did not try to load.

Everything else was the same.  Eth0, traceback, 60 second timeout, can't open /tmp/eth0-conf plus a kernel panic message.

So, I re-built from scratch, again, tried to boot the MD again, same result as above.

Now, I will try to force the 8139too to run and see what happens.

Cheers.
Title: Re: rtl8139 nic pxe/gpxe/grub boot after alpha 2.15
Post by: pigdog on May 17, 2009, 12:53:07 pm
Hi,

I tried to blacklist 8139cp but it didn't stop it from booting.

Since the system reported during boot that 8139cp was v1.3 Mar 22, 2004, I thought I'd try booting with different versions of etherboot.

eb-5.4.4-rtl8139.zdsk
eb-5.4.3-rtl8139.zdsk
eb-5.2.6-rtl8139.zdsk

5.4.4 & 5.4.3 failed to boot while 5.2.6 reported that /tftpboot/prelinux.0 ...error:not a valid image.

I'll scratch my head for a while longer.
Title: Re: rtl8139 nic pxe/gpxe/grub boot after alpha 2.15
Post by: pigdog on May 18, 2009, 02:08:01 am
Well,

Since it was booting without the compatibility message I thought I'd try GRUB PXE booting again (wiki).

I made a disk, selected new media director and had the same thing happen.
Title: Re: rtl8139 nic pxe/gpxe/grub boot after alpha 2.15
Post by: Zaerc on May 18, 2009, 03:02:25 pm
Just a wild stab in the dark, but maybe your /tftpboot/pxelinux.0 is not a valid image.
Title: Re: rtl8139 nic pxe/gpxe/grub boot after alpha 2.15
Post by: pigdog on May 18, 2009, 03:18:21 pm
Hi Zaerc,

The only time I received /tftpboot/prelinux.0 ...error:not a valid image was doing the old 5.2.6 etherboot.

I've been trying to force the 8139too .ko to run but DHCP keeps running the 8139cp .ko

I've tried blacklisting the 8139cp and I even removed it.

rm /lib/modules/2.6.27-14-generic/kernel/drivers/net/8139cp.ko

Then I ...

depmod -a

... rebuilt the initrd so that the module won't be included anymore ...

mkinitramfs -o /boot/initrd.img-2.6.27-14-generic 2.6.27-14-generic

... rebooted the core and the MD still ran the 8139cp

Unless I did something wrong.  When I check /lib/modules/2.6.27-14-generic/kernel/drivers/net I have no 8139cp.ko listed!

Is it somewhere else because it's a dhcp client or MD?
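
One way to answer that question is to look inside the initrd the MD actually receives, rather than the core's own /boot image. The sketch below lists any 8139 driver modules packed into an initrd. The function names are hypothetical, and it assumes the initrd is a gzip-compressed cpio archive, which may not hold for every image.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers for inspecting an initrd.

# Read a file listing on stdin and keep only the 8139 driver modules.
filter_nic_modules() {
    grep -E '(^|/)8139(cp|too)\.ko$'
}

# initrd_nic_modules INITRD
# Assumes INITRD is a gzip-compressed cpio archive.
initrd_nic_modules() {
    zcat "$1" | cpio -it 2>/dev/null | filter_nic_modules
}

# Example, comparing the core's initrd with one served over TFTP:
#   initrd_nic_modules /boot/initrd.img-2.6.27-14-generic
#   initrd_nic_modules /tftpboot/default/initrd
```

If 8139cp.ko still shows up in the TFTP-served image, removing it from the core's /lib/modules and rebuilding /boot/initrd.img would not change what the MD loads.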

Thanks.
Title: Re: rtl8139 nic pxe/gpxe/grub boot after alpha 2.15
Post by: pigdog on May 18, 2009, 04:09:57 pm
Hi,

Since the 8139too was the only guy in /lib/modules/2.6.27-14-generic/kernel/drivers/net/8139too

and I couldn't seem to get anything other than the 8139cp to boot I tried this...

nano /etc/initramfs-tools-interactor/modules

added 8139too

then...

/usr/pluto/bin/Diskless_BuildDefaultImage.sh

and...

mkinitramfs -d /etc/initramfs-tools-interactor/ -o /tftpboot/default/initrd

I ended up with    .No IP address    on boot of the MD.
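
For repeat attempts, the "added 8139too" step can be made idempotent so the modules file does not collect duplicate entries. A minimal sketch follows; ensure_module is a made-up helper name and the file path in the example is the one used above.

```shell
#!/bin/sh
# Sketch only: hypothetical helper for editing an initramfs modules file.

# ensure_module FILE MODULE
# Appends MODULE on its own line unless an identical line already exists.
ensure_module() {
    file="$1"
    module="$2"
    grep -qxF "$module" "$file" 2>/dev/null || printf '%s\n' "$module" >> "$file"
}

# Example, run on the core as root before rebuilding the initrd:
#   ensure_module /etc/initramfs-tools-interactor/modules 8139too
```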

Removed everything, rebooted - now something is broken - still get .No IP address on MD.

hmm.

So I did

/usr/pluto/bin/Diskless_BuildDefaultImage.sh

and then

mkinitramfs -o /boot/initrd.img-2.6.27-14-generic 2.6.27-14-generic

Hoping this would run the 8139too.  Nope - .No IP address.

So I copied 8139cp.ko back into /lib/modules/2.6.27-14-generic/kernel/drivers/net

did

/usr/pluto/bin/Diskless_BuildDefaultImage.sh

and then

mkinitramfs -o /boot/initrd.img-2.6.27-14-generic 2.6.27-14-generic

and still .No IP address.   Something's broken.  Time to reload from scratch - again.
Title: Re: rtl8139 nic pxe/gpxe/grub boot after alpha 2.15
Post by: pigdog on May 19, 2009, 09:18:58 pm
Hi,

I've tried different stuff over the last couple of days and I'm getting nowhere.

Everything worked prior to alpha 2.15 and now, no matter what I have tried, it won't boot.

I know the hardware works under ubuntu, I'm out of ideas.

I'll just have to wait and see if anyone else hits the same snag.

Cheers.


Title: Re: rtl8139 nic pxe/gpxe/grub boot after alpha 2.15
Post by: pigdog on May 20, 2009, 04:08:58 pm
Hi,

Well I was going thru my parts bin and I found a D-Link DFE-538TX 10/100.

It had a rtl8139 on board but instead of 10ec:8139 it was 1186:1300.

I tried it but same kernel panic and junk.
Title: Re: rtl8139 nic pxe/gpxe/grub boot after alpha 2.15
Post by: pigdog on May 22, 2009, 09:40:38 pm
O.K.  I managed to get the RTL8139 running but I changed a bunch of stuff, even MTU junk, so I don't know if it is a combination of things or just the last resort wtf thing I tried.

So, I'm going to reload from scratch and see if it can be replicated.

Cheers.
Title: Re: rtl8139 nic pxe/gpxe/grub boot after alpha 2.15
Post by: pigdog on May 23, 2009, 05:10:01 pm
O.K. now this is getting frustrating.

I rebuilt from scratch loaded the latest and when I went to try I got a completely different error.

So I tried a couple of things and got nowhere.

So I reloaded from scratch and tried things again.

System (A)
EP-8KDA3+ NVIDIA nForce3-250 Socket 754 1GbE LAN - not using the on board Nvidia chipset/BootROM because it's toast.  Using RTL8139 card.

System (B)
ECS-P4S5A/DX+ Socket 478 RTL8201BL 10/100 LAN - not using the on board Realtek because it doesn't PXE boot and testing with the RTL8139 card to match above for troubleshooting.

System (C)
Foxconn Winfast 6150K8MA-8EKRS and Nvidia MCP51 LAN - boots from Core no problem.

System A & B were tested with 8139C and 8139D Realtek LAN Cards  ( I have 2 of each type).

On PXE Boot using ROM-O-Matic Etherboot 5.4.4 both reported same error:

0200 AX:0212 BX:2400 CX:001 DX:0100

I've rebuilt the 5.4.4 version twice and tried the 5.4.3 variant with the same result.

I've built a grub pxe boot disk as per the wiki and the original error type was returned (traceback, kernel panic, etc).

I'm going to stick with the grub pxe boot disk for testing.  I don't need this other variable.

Cheers.

Title: Re: rtl8139 nic pxe/gpxe/grub boot after alpha 2.15
Post by: posde on May 23, 2009, 09:36:50 pm
pigdog,

make sure you disable your onboard NICs completely when you are using plugin NICs (PCI or USB) on a media director.
Title: Re: rtl8139 nic pxe/gpxe/grub boot after alpha 2.15
Post by: pigdog on May 23, 2009, 10:06:56 pm
Hi posde,

Yes it's been checked, checked, double-checked, what happens if I change this, what happens with that, what about this?

I'm shaking my head here.  What is it with the 8139 chipset? 

I could easily walk away and buy another NIC card but it would not help someone else.

So once more into the breach - and it'll probably turn out to be something really dumb, but, it worked prior to 2.15 and not afterwards.

What changed?  I thought maybe I was getting mapped to the wrong driver at one time but someone else should have reported problems by now if that was the case.

I will keep at it.

Thanks for the input - Cheers.
Title: Re: rtl8139 nic pxe/gpxe/grub boot after alpha 2.15
Post by: pigdog on May 25, 2009, 03:44:16 am
Hi,

The saga continues...

System (B): the ECS-P4S5A/DX+ Socket 478 that I put an rtl8139 into.

In the BIOS settings of the AMIBIOS Version 1.21.12 I changed Plug and Play Aware OS to no.

Using the grub pxe boot disk (as per the wiki) I selected new media director and managed to receive "We announce ourselves to the router" at which point I shut it off and checked the Core/Hybrid.

The Core/Hybrid shows #31 as arch i386, the IP and MAC.

I rebooted with the grub pxe boot disk and no joy - typical error message.

I swapped the NIC to System (A) EP-8KDA3+ NVIDIA nForce3-250 Socket 754 and no joy.

I created an etherboot ROM-O-Matic disk version 5.4.4 and tried it in system B but I get RX errors and the system reboots.

In system A with etherboot ROM-O-Matic I get

-Trying to load:pxelinux.cfg /01-00-08-54-11-98-e3
-could not find kernel image:31/vmlinux
boot:
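
The file name in that output follows the PXELINUX convention: the ARP hardware type (01 for Ethernet) followed by the MAC address in lowercase hex with dashes. A small sketch to work out which per-MD file to look for under pxelinux.cfg; the function name pxelinux_cfg_name is made up for illustration.

```shell
#!/bin/sh
# Sketch only: hypothetical helper, not part of PXELINUX or LinuxMCE.

# pxelinux_cfg_name MAC
# Converts a colon-separated MAC into the per-host PXELINUX config file name.
pxelinux_cfg_name() {
    # Lowercase the hex digits and turn ":" into "-", then prefix "01-".
    printf '01-%s\n' "$(printf '%s' "$1" | tr 'A-F:' 'a-f-')"
}

# Example:
#   ls -l "/tftpboot/pxelinux.cfg/$(pxelinux_cfg_name 00:08:54:11:98:E3)"
```

Since the loader got as far as asking for 31/vmlinux, the per-MD config was presumably found and the missing piece is the kernel image it points to.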

So this is looking a little better.

Time for sleepy.
Title: Re: rtl8139 nic pxe/gpxe/grub boot after alpha 2.15
Post by: pigdog on May 27, 2009, 12:44:39 am
O.K.  I've been going nuts the last couple of days trying to figure out why System B had RX and UDP errors.

Why System A didn't have the errors but had the traceback, kernel init, eth0, blah, blah.

I've used Wireshark, ethtool, looked at MTU's and other junk.

I've been through four different NIC cards, four different CAT5 cables, different pxe etherboot and grub gpxe disk versions.

I eliminated the 8139cp driver complaint by forcing the 8139too driver to run.

Eventually I read about the difference between "noacpi" and "pci=noacpi".

I went into /tftpboot/pxelinux.cfg and edited the default boot script to remove acpi=off.

I booted System A and away we go!
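
For reference, the edit can be scripted instead of done by hand. This is a sketch under assumptions: strip_acpi_off is a hypothetical helper, it prints the modified config to stdout rather than changing the file, and it assumes acpi=off appears as a separate whitespace-delimited option on the kernel command line.

```shell
#!/bin/sh
# Sketch only: hypothetical helper for a PXELINUX config file.

# strip_acpi_off FILE
# Prints FILE with every standalone "acpi=off" option removed.
strip_acpi_off() {
    sed -E 's/[[:space:]]+acpi=off([[:space:]]|$)/\1/g' "$1"
}

# Example, run on the core as root (keeps a backup to restore later):
#   cp /tftpboot/pxelinux.cfg/default /tftpboot/pxelinux.cfg/default.bak
#   strip_acpi_off /tftpboot/pxelinux.cfg/default.bak > /tftpboot/pxelinux.cfg/default
```

Keeping the .bak copy makes it easy to put acpi=off back once the MD has been set up.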

So now that it has booted up and is running (moon48) I changed the default back to acpi=off.

My MD hooked up to my LCD (moon31) did not have any issues with acpi=off.

The BIOS in system B has no settings for turning on or off acpi.

I'm glad that's over with.