Topic: Diskless MDs not PXE booting

m3freak

  • Veteran
  • ***
  • Posts: 125
    • View Profile
Re: Diskless MDs not PXE booting
« Reply #30 on: July 28, 2011, 04:32:32 pm »
purps, you are a God amongst men: the driver was the f*&king culprit!!  I installed the r8168 module and boom, the Jetway Ecomini pulled in an IP and started the PXE boot.  Thanks man! I would not have considered the Realtek module in 0810 as being the source of the issue - at least not for a while.

So, back to the question of what's in the MD.  Guess what?  It's a Realtek PCIe Gb NIC.  Sweet!!   :'(

I rebuilt the diskless image after installing the r8168 module, but the MD still kernel panics.  From memory, the error is something about an eth0 file not being present.

What's my next step?  I'm going to search the forums and wiki in the meantime.

Note: my switch is a Dell PowerConnect Gb 8-port, probably 6 years old now.

tkmedia

  • wants to work for LinuxMCE
  • **
  • Posts: 937
    • View Profile
    • LMCECompatible
Re: Diskless MDs not PXE booting
« Reply #31 on: July 28, 2011, 05:38:46 pm »
My Setup http://wiki.linuxmce.org/index.php/User:Tkmedia

For LinuxMCE-compatible systems and accessories
http://lmcecompatible.com/

purps

  • NEEDS to work for LinuxMCE
  • ***
  • Posts: 1402
  • If it ain't broke, tweak it
    • View Profile
Re: Diskless MDs not PXE booting
« Reply #32 on: July 28, 2011, 06:49:49 pm »
Quote from: m3freak
I would not have considered the Realtek module in 0810 as being the source of the issue - at least not for a while.

The fact that I said there are known problems with the r8169 driver in 810 should have been your first clue ;)

Quote from: m3freak
So, back to the question of what's in the MD.  Guess what?  It's a Realtek PCIe Gb NIC.  Sweet!!   :'(

I rebuilt the diskless image after installing the r8168 module, but the MD still kernel panics.  From memory, the error is something about an eth0 file not being present.

Don't worry, you should be able to get the MD working. Read my user page, check out the Living Room MD, it will be a similar process for you. The various instructions mentioned here are all based on the wiki pages that Tim mentioned.

To give you a nudge in the right direction, you need to manually place a copy of the r8168 driver that you installed on your core amongst the MD gubbins also; at the moment, it is literally just on the core, and the MD can't use it.

Cheers,
Matt.
« Last Edit: July 28, 2011, 06:55:56 pm by purps »
1004 RC :: looking good :: upgraded 01/04/2013
my setup :: http://wiki.linuxmce.org/index.php/User:Purps

m3freak

Re: Diskless MDs not PXE booting
« Reply #33 on: July 29, 2011, 05:02:09 am »
So, I looked at the stuff tkmedia linked to and now I have a MD that no longer kernel panics, but instead dies a bit later in the boot. The error message printed is:

"Error: cannot connect to router: rebooting in 5 seconds."

The MD then reboots, and the same thing (as above) happens again.

I searched the forums and came across this thread:

http://forum.linuxmce.org/index.php?topic=8959.0

I tried out what Murcel suggested, and it did indeed get the boot even further along.  The problem now appears to be the MD pausing indefinitely after printing out something about "we've announced ourselves to the router" - I can't remember the exact error.  I let the MD sit like that for 45 minutes and saw no change. So, I rebooted the core for shits and giggles, and powered the MD back up after the core reboot was complete.  Unfortunately, the original error (Error: cannot connect to router: rebooting in 5 seconds.) came back.

I'm assuming I can get past this error if I run the startup-script.sh script again.

Questions:

1. Why do I have to keep running the startup-script.sh script?
2. I don't see any "diskless" script running on the core when the MD gets to the "announced ourselves to the router" message.  How do I fix this?
3. Why is my install so broken? Did one bad NIC driver really introduce this many problems?

Marie.O

  • Administrator
  • LinuxMCE God
  • *****
  • Posts: 3675
  • Wastes Life On LinuxMCE Since 2007
    • View Profile
    • My Home
Re: Diskless MDs not PXE booting
« Reply #34 on: July 29, 2011, 07:37:08 am »
m3freak,

please remember rule #1 - MD creation can take longer than 45 minutes.


m3freak

Re: Diskless MDs not PXE booting
« Reply #35 on: July 29, 2011, 01:23:22 pm »
Quote from: Marie.O
m3freak,

please remember rule #1 - MD creation can take longer than 45 minutes.



Ok, fair enough.  But why do I have to run the startup_script.sh script every time I reboot the core? Well, that's been the case for the Jetway Ecomini MD, anyway.

Also, when I do run the startup_script.sh script, the external interface of the core stops responding.

purps

Re: Diskless MDs not PXE booting
« Reply #36 on: July 29, 2011, 01:42:36 pm »
Assuming you did a lot of cocking around before you found the solution to your core's NIC problems, reinstalling might be an idea, just to eliminate the possibility of a messed up setting somewhere.

What steps did you take exactly for the unrecognised NIC on your MD?

Cheers,
Matt.

m3freak

Re: Diskless MDs not PXE booting
« Reply #37 on: July 29, 2011, 02:29:42 pm »
Quote from: purps
Assuming you did a lot of cocking around before you found the solution to your core's NIC problems, reinstalling might be an idea, just to eliminate the possibility of a messed up setting somewhere.

Nah, did nothing of the sort.  Actually, the install I'm working with right now is new.

Quote from: purps
What steps did you take exactly for the unrecognised NIC on your MD?

I did what was in those links.  I'd already installed the r8168 driver, so I did the other stuff:

- included the r8168 module in /etc/initramfs-tools-interactor/modules
- recreated the diskless image (the way it's described on the 0810 install page)
- I don't have a /usr/pluto/diskless dir, so I searched around until I found the post about running "startup_script.sh" to get past the MD boot error.
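
The first item in the list above can be double-checked by confirming that the module name really appears in the modules file. This is only a sketch: `module_listed` is a made-up helper name, and the default path is simply the one given in the list.

```shell
#!/bin/sh
# module_listed MODULE [FILE]
# Hypothetical helper: succeeds if MODULE is the first word on some line
# of FILE. Commented-out entries do not count. The default path is the
# one mentioned in this thread.
module_listed() {
    mod="$1"
    file="${2:-/etc/initramfs-tools-interactor/modules}"
    # Return 2 (not 1) when the file cannot be read at all.
    [ -r "$file" ] || return 2
    awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }' "$file"
}
```

For example, `module_listed r8168 && echo listed || echo "not listed"` prints which case applies on the core.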

So that's where I am.

BTW, how can I tell if the diskless image is actually being created?  I looked on the core and can't find any running process that would indicate the MD's image is being created.

One other thing: although I can't ping or ssh to my core from the external network, I can definitely ping the external network from the core.  I didn't look at the iptables rules - maybe they're messed up?
« Last Edit: July 29, 2011, 02:31:45 pm by m3freak »

purps

Re: Diskless MDs not PXE booting
« Reply #38 on: July 29, 2011, 02:47:25 pm »
Run this on your core...
Code: [Select]
modprobe r8168
depmod -a
/usr/pluto/bin/Diskless_BuildDefaultImage.sh

The diskless image thing is already created, so I don't think that running the script from the 810 install page will do very much. The script above would be more appropriate. The depmod command should be run because you have changed some modules (that's my understanding anyway).

Try another reboot once you've done that, we want a directory to appear in /usr/pluto/diskless, as I am sure you are aware.
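
Before rebuilding the image it may be worth confirming that the module is actually loaded on the core. A minimal sketch, assuming a helper name (`module_loaded`) that is invented here; it reads /proc/modules, which is the same data that lsmod prints.

```shell
#!/bin/sh
# module_loaded MODULE [FILE]
# Hypothetical helper: succeeds if MODULE appears as the first field of a
# line in the kernel's module list. FILE defaults to /proc/modules and is
# only a parameter so the check can be tried against saved output.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}
```

`module_loaded r8168 && echo loaded` is the typical call. Separately, `modinfo -n r8168` prints the path of the module file that modprobe would pick up, which helps to spot a module that was built but never installed.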

Cheers,
Matt.

uplink

  • Administrator
  • Guru
  • *****
  • Posts: 192
  • Linux and LinuxMCE witchdoctor
    • View Profile
Re: Diskless MDs not PXE booting
« Reply #39 on: July 29, 2011, 05:15:36 pm »
Quote from: m3freak
BTW, how can I tell if the diskless image is actually being created?  I looked on the core and can't find any running process that would indicate the MD's image is being created.

The script Diskless_Setup.sh creates the image. If that is running, the Diskless MD filesystems are being created. If you don't see it, it's not happening. It creates the /usr/pluto/diskless directory and MD subdirectory.

If this is not the case for you, here's how it all works:

1. New MD PXE boots default boot image.
2. Default boot image connects to the Core and tells it to create a new MD device.
3. Default boot image displays "announced ourselves to the router" and waits for messages from the Core.
4. Core creates a MD device (check your device tree)
5. Core allocates IP address to new MD, tells new MD about it (you get "Allocated permanent IP" message on MD).
6. Core runs Diskless_Setup.sh, tells new MD about it (you get "Running Diskless_Setup.sh" message on MD).
7. When Diskless_Setup.sh finishes, Core tells MD about it. If it fails, the MD will display "Diskless_Setup.sh failed" message. If it succeeds, you'll get a success message and the Core will also tell the MD to reboot.
8. MD reboots into its new filesystem.

At no point should Diskless_Setup.sh die without the MD getting a message (error or success).

If you don't have the MD device in your tree after the router announcement, you have a different problem. If you do have the device in the tree, and MD says Diskless_Setup is running, but you don't see Diskless_Setup on the core, run /usr/pluto/bin/Diskless_Setup.sh yourself on the Core and see what's happening.
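
The two things to look for on the Core (a running Diskless_Setup.sh and a subdirectory under /usr/pluto/diskless) can be sketched as a pair of shell helpers. The function names are invented for illustration; only the script name and the directory come from the description above.

```shell
#!/bin/sh
# count_md_dirs [ROOT]
# Hypothetical helper: prints how many subdirectories exist under ROOT
# (default: /usr/pluto/diskless). Prints 0 if ROOT does not exist.
count_md_dirs() {
    root="${1:-/usr/pluto/diskless}"
    count=0
    for d in "$root"/*/; do
        if [ -d "$d" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

# diskless_setup_running
# Succeeds if some process has Diskless_Setup.sh in its command line.
# pgrep never reports itself; the [D] also avoids matching a wrapper
# shell whose command line happens to contain this pattern text.
diskless_setup_running() {
    pgrep -f '[D]iskless_Setup\.sh' >/dev/null 2>&1
}
```

While the MD shows the "Running Diskless_Setup.sh" message, `diskless_setup_running && echo running` should print `running` on the Core, and `count_md_dirs` should go above 0 once the filesystem has been created.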

m3freak

Re: Diskless MDs not PXE booting
« Reply #40 on: July 31, 2011, 03:32:39 pm »
Quote from: purps
Run this on your core...
Code: [Select]
modprobe r8168
depmod -a
/usr/pluto/bin/Diskless_BuildDefaultImage.sh

The first two steps were already done by the install script for the r8168 module.  I ran the script in the final step and rebooted - no change.  The MD just sat at the same screen, and there was no diskless image being created on the core.

I'm going to reinstall LinuxMCE.  Before I run the final install script from the desktop, I'm going to install the r8168 module.  If things still don't work, I'll try a new DVD snapshot.  If shit still fails, I'll dance on the computer.

m3freak

Re: Diskless MDs not PXE booting
« Reply #41 on: July 31, 2011, 04:28:52 pm »
FAIL!  Reinstall of LinuxMCE and install of r8168 module right from the get go did not fix anything:

1. MD PXE boot dies after saying it can't contact the router.  The ONLY way to get past this step is to run "startup_script.sh" after EVERY SINGLE CORE REBOOT.
2. The MD's diskless image never runs.  The MD might say it's announced itself to the router, but the core doesn't actually do anything.

There is some seriously broken shit in the LinuxMCE snapshot I'm using.
« Last Edit: July 31, 2011, 07:48:45 pm by m3freak »

m3freak

Re: Diskless MDs not PXE booting
« Reply #42 on: August 01, 2011, 01:21:47 am »
Downloaded and installed the latest snapshot.  This is what I did:

1. Ran install
2. Appeared to complete successfully, so rebooted.
3. Logged into Kubuntu desktop
4. Tried to stop network services.  Got an error about an unknown device, even though both NICs were up.  eth0 had 192.168.80.1 IP, and eth1 had IP from my external DHCP server.
5. I unloaded the eth0 module, r8169.
6. Installed r8168 module.  After install script finished, eth0 was back up with old IP.
7. I added the r8168 module to /etc/initramfs-tools-interactor/modules
8. Created the default diskless image
9. Ran the final install script by double clicking the icon on the desktop
10. Rebooted after the install finished.
11. Last step of install ran and completed after reboot
12. Powered up the Jetway Ecomini MD
13. After it got an IP, it began the PXE boot.
14. After eth0 apparently came up, the Jetway reported it couldn't connect to the router, so it rebooted.
15.  60 minutes later, it's still rebooting because it can't find the router.

WTF.

Marie.O

Re: Diskless MDs not PXE booting
« Reply #43 on: August 01, 2011, 08:02:00 am »
Are other devices on your internal network (192.168.80.0/24) able to receive DHCP addresses and connect to the core? If yes, then your core's NIC seems to work okay.

If you can connect to the core, check if dcerouter is running on the core,

Code: [Select]
ps ax|grep DCERouter.log|grep -v grep

and have a look in the /home/coredump/1 directory to see whether there are any coredumps in there.
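
Both checks can be wrapped in small helpers so they are easy to repeat after every reboot. This is a sketch only: the function names are invented, and nothing is assumed about what the files in the coredump directory are called.

```shell
#!/bin/sh
# dcerouter_lines
# Same idea as the one-liner above, but reads `ps ax` style output on
# stdin, so it also works on output saved earlier.
dcerouter_lines() {
    grep 'DCERouter\.log' | grep -v grep
}

# count_coredump_files [DIR]
# Hypothetical helper: prints the number of regular files directly inside
# DIR (default: /home/coredump/1). Prints 0 if DIR does not exist.
count_coredump_files() {
    dir="${1:-/home/coredump/1}"
    count=0
    for f in "$dir"/*; do
        if [ -f "$f" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

Typical use would be `ps ax | dcerouter_lines` followed by `count_coredump_files`; no output from the first, or a non-zero count from the second, is worth reporting back here.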

uplink

Re: Diskless MDs not PXE booting
« Reply #44 on: August 01, 2011, 06:56:54 pm »
Quote from: m3freak
13. After it got an IP, it began the PXE boot.
14. After eth0 apparently came up, the Jetway reported it couldn't connect to the router, so it rebooted.
15.  60 minutes later, it's still rebooting because it can't find the router.

WTF.

Began the PXE boot in what way? Does it load the "default" PXE config file, then the kernel, then the initrd.img files, or does it not get that far? If it doesn't get that far, check syslog on the Core and tell us what it says.

A normal default image boot log looks like this:

Code: [Select]
Aug  1 15:39:37 dcerouter dhcpd: DHCPDISCOVER from 08:00:27:51:34:0e via eth1
Aug  1 15:39:38 dcerouter dhcpd: DHCPOFFER on 192.168.80.129 to 08:00:27:51:34:0e via eth1
Aug  1 15:39:39 dcerouter dhcpd: DHCPREQUEST for 192.168.80.129 (192.168.80.1) from 08:00:27:51:34:0e via eth1
Aug  1 15:39:39 dcerouter dhcpd: DHCPACK on 192.168.80.129 to 08:00:27:51:34:0e via eth1
Aug  1 15:39:40 dcerouter in.tftpd[14552]: connect from 192.168.80.129 (192.168.80.129)
Aug  1 15:39:40 dcerouter atftpd[14552]: Advanced Trivial FTP server started (0.7)
Aug  1 15:39:40 dcerouter atftpd[14552]: Serving /tftpboot/pxelinux.0 to 192.168.80.129:2001
Aug  1 15:39:40 dcerouter atftpd[14552]: Serving /tftpboot/pxelinux.cfg/56424f58-0000-0000-0000-08002751340e to 192.168.80.129:49152
Aug  1 15:39:40 dcerouter atftpd[14552]: Serving /tftpboot/pxelinux.cfg/01-08-00-27-51-34-0e to 192.168.80.129:49153
Aug  1 15:39:40 dcerouter atftpd[14552]: Serving /tftpboot/pxelinux.cfg/C0A85081 to 192.168.80.129:49154
Aug  1 15:39:40 dcerouter atftpd[14552]: Serving /tftpboot/pxelinux.cfg/C0A8508 to 192.168.80.129:49155
Aug  1 15:39:40 dcerouter atftpd[14552]: Serving /tftpboot/pxelinux.cfg/C0A850 to 192.168.80.129:49156
Aug  1 15:39:40 dcerouter atftpd[14552]: Serving /tftpboot/pxelinux.cfg/C0A85 to 192.168.80.129:49157
Aug  1 15:39:40 dcerouter atftpd[14552]: Serving /tftpboot/pxelinux.cfg/C0A8 to 192.168.80.129:49158
Aug  1 15:39:40 dcerouter atftpd[14552]: Serving /tftpboot/pxelinux.cfg/C0A to 192.168.80.129:49159
Aug  1 15:39:40 dcerouter atftpd[14552]: Serving /tftpboot/pxelinux.cfg/C0 to 192.168.80.129:49160
Aug  1 15:39:40 dcerouter atftpd[14552]: Serving /tftpboot/pxelinux.cfg/C to 192.168.80.129:49161
Aug  1 15:39:40 dcerouter atftpd[14552]: Serving /tftpboot/pxelinux.cfg/default to 192.168.80.129:49162
Aug  1 15:39:40 dcerouter atftpd[14552]: Serving /tftpboot/default/vmlinuz to 192.168.80.129:49163
Aug  1 15:39:45 dcerouter atftpd[14552]: Serving /tftpboot/default/initrd to 192.168.80.129:49164
Aug  1 15:39:57 dcerouter dhcpd: DHCPDISCOVER from 08:00:27:51:34:0e via eth1
Aug  1 15:39:57 dcerouter dhcpd: DHCPOFFER on 192.168.80.129 to 08:00:27:51:34:0e via eth1
Aug  1 15:39:57 dcerouter dhcpd: DHCPREQUEST for 192.168.80.129 (192.168.80.1) from 08:00:27:51:34:0e via eth1
Aug  1 15:39:57 dcerouter dhcpd: DHCPACK on 192.168.80.129 to 08:00:27:51:34:0e via eth1

See where in the above sequence your boot process breaks down.
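
To make that comparison less error-prone, the stages in the log above can be looked for mechanically. A rough sketch, assuming a made-up helper name (`pxe_boot_stage`) and the exact message wording shown in the sample log; a different dhcpd or atftpd version may word things differently.

```shell
#!/bin/sh
# pxe_boot_stage LOGFILE
# Hypothetical helper: prints the furthest stage of the sequence above
# that appears in LOGFILE. It does not filter by client, so use it on an
# excerpt that covers a single boot attempt of one MD.
pxe_boot_stage() {
    log="$1"
    [ -r "$log" ] || { echo "unreadable"; return 1; }
    stage="none"
    grep -q 'dhcpd: DHCPACK' "$log" && stage="dhcp"
    grep -q 'Serving /tftpboot/pxelinux\.0 ' "$log" && stage="pxelinux"
    grep -q 'Serving /tftpboot/pxelinux\.cfg/default ' "$log" && stage="config"
    grep -q 'Serving /tftpboot/default/vmlinuz ' "$log" && stage="kernel"
    grep -q 'Serving /tftpboot/default/initrd ' "$log" && stage="initrd"
    # Two or more DHCPACK lines together with a served initrd suggest the
    # booted default image brought its own network interface up.
    if [ "$stage" = "initrd" ] && [ "$(grep -c 'dhcpd: DHCPACK' "$log")" -ge 2 ]; then
        stage="second-dhcp"
    fi
    echo "$stage"
}
```

On the Core this might be run as `pxe_boot_stage /var/log/syslog`, assuming that is where dhcpd and atftpd log on your install. A result of `kernel`, for instance, would mean the initrd was never requested.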
« Last Edit: August 01, 2011, 07:00:18 pm by uplink »