Hi,
I'm getting a lot of kernel oopses, and sometimes my LMCE 704 core also hard-locks (only a hardware reset helps).
I've noticed one odd thing and have a few questions:
1. I see kernel Oops or Eeek messages only if I'm connected via ssh at the time. If I search the syslog, messages, or dmesg logs afterwards, I can't find any Oops or Eeek messages in them. Is there a setting that controls this? I get 1-2 lockups per day on my core but can't see any kernel messages describing the cause of the hard lockup. How do I deal with this?
2. How can I tell whether a kernel message points to a pure kernel error causing the hard lockups rather than to a hardware failure?
3. What would be the procedure for ruling out hardware failure as the cause of the hard lockups? I ran memtest for one day and it found no memory errors. I'm not sure how to check whether anything is wrong with the disks; they don't seem to support SMART health checks.
4. All of my kernel messages have already been reported in the proper place (https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.20/+bug/107325). Do I just need to wait and try new kernels, or is there anything else that needs to be done?
Thanks in advance,
regards,
Bulek.
dcerouter_42077:~#
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] Eeek! page_mapcount(page) went negative! (-1)
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] page pfn = 23232
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] page->flags = 40000004
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] page->count = 1
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] page->mapping = 00000000
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] vma->vm_ops = 0x0
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] ------------[ cut here ]------------
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] invalid opcode: 0000 [#1]
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] SMP
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] CPU: 0
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] EIP: 0060:[page_remove_rmap+224/240] Tainted: P VLI
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] EFLAGS: 00210246 (2.6.20-15-generic #2)
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] EIP is at page_remove_rmap+0xe0/0xf0
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] eax: 00000000 ebx: c1464640 ecx: 00200046 edx: 00000000
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] esi: de27f128 edi: b7800000 ebp: c1464640 esp: daaf5eb8
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] ds: 007b es: 007b ss: 0068
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] Process perl (pid: 29700, ti=daaf4000 task=eb80b030 task.ti=daaf4000)
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] Stack: c036966b 00000000 00000000 e6fc5000 c0163a8d 00000000 b7b63fff 00000000
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] de27f128 daaf5f44 00000000 00000001 b7b64000 f280fb78 cbe74200 c1806640
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] 00000000 ffffffff c14df8ac e6fc5000 f280fb78 23232323 c14df8ac 00362bf3
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] Call Trace:
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] [unmap_vmas+733/1472] unmap_vmas+0x2dd/0x5c0
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] [exit_mmap+119/240] exit_mmap+0x77/0xf0
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] [mmput+56/160] mmput+0x38/0xa0
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] [do_exit+242/2048] do_exit+0xf2/0x800
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] [do_page_fault+831/1520] do_page_fault+0x33f/0x5f0
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] [do_group_exit+38/128] do_group_exit+0x26/0x80
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] [sysenter_past_esp+105/169] sysenter_past_esp+0x69/0xa9
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] [xfrm_state_find+1251/1392] xfrm_state_find+0x4e3/0x570
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] =======================
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] Code: c0 74 0d 8b 50 08 b8 fc a1 36 c0 e8 3b ca fd ff 8b 46 48 85 c0 74 14 8b 40 10 85 c0 74 0d 8b 50 2c b8 1c a2 36 c0 e8 20 ca fd ff <0f> 0b eb fe 8b 53 0c eb 95 8d b4 26 00 00 00 00 55 57 56 89 d6
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] EIP: [page_remove_rmap+224/240] page_remove_rmap+0xe0/0xf0 SS:ESP 0068:daaf5eb8
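Since these traces only show up as syslogd broadcasts on the ssh terminal, a small script can help condense a pasted capture before attaching it to a bug report. The following is only a rough sketch in Python: the function name `summarize` and the regular expressions are my own assumptions, modelled on the exact line layout shown above, and they may need adjusting for other syslogd or kernel versions. It drops the "Message from syslogd@..." wrapper lines, keeps the kernel text, and picks out the symbol named in each "EIP is at" line.

```python
import re

# Wrapper line that syslogd prints before every broadcast kernel message
# (pattern is an assumption based on the capture in this post).
WRAPPER = re.compile(r"^Message from syslogd@\S+ at .+ \.\.\.$")
# Kernel line layout: "<host> kernel: [<uptime>] <text>"
KERNEL = re.compile(r"^\S+ kernel: \[\s*(\d+\.\d+)\] ?(.*)$")
# Faulting location as printed by the i386 oops code: "EIP is at sym+0xoff/0xlen"
FAULT = re.compile(r"^EIP is at (\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize(console_text):
    """Return (kernel_lines, faulting_symbols) from pasted console output."""
    kernel_lines = []
    faults = []
    for raw in console_text.splitlines():
        line = raw.strip()
        if not line or WRAPPER.match(line):
            continue  # skip blanks and the syslogd wrapper lines
        m = KERNEL.match(line)
        if not m:
            continue  # skip shell prompts and terminal-wrapped fragments
        text = m.group(2)
        kernel_lines.append(text)
        f = FAULT.match(text)
        if f:
            faults.append(f.group(1))
    return kernel_lines, faults


# Four lines copied from the first trace in this post.
SAMPLE = """\
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] Eeek! page_mapcount(page) went negative! (-1)
Message from syslogd@dcerouter at Mon Jan 14 23:39:48 2008 ...
dcerouter kernel: [14373.108000] EIP is at page_remove_rmap+0xe0/0xf0
"""

if __name__ == "__main__":
    lines, faults = summarize(SAMPLE)
    for text in lines:
        print(text)
    print("faulting symbols:", faults)
```

Lines that the terminal wrapped in the middle of a word are skipped rather than rejoined, so a capture taken in a wide terminal gives the cleanest result.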
dcerouter_42077:~#
dcerouter kernel: [100495.644000] Oops: 0002 [#1]
Message from syslogd@dcerouter at Wed Jan 23 12:36:15 2008 ...
dcerouter kernel: [100495.644000] SMP
Message from syslogd@dcerouter at Wed Jan 23 12:36:15 2008 ...
dcerouter kernel: [100495.644000] CPU: 0
Message from syslogd@dcerouter at Wed Jan 23 12:36:15 2008 ...
dcerouter kernel: [100495.644000] EIP: 0060:[cache_alloc_refill+298/1360] Tainted: P VLI
Message from syslogd@dcerouter at Wed Jan 23 12:36:15 2008 ...
dcerouter kernel: [100495.644000] EFLAGS: 00010046 (2.6.20-15-generic #2)
Message from syslogd@dcerouter at Wed Jan 23 12:36:15 2008 ...
dcerouter kernel: [100495.644000] EIP is at cache_alloc_refill+0x12a/0x550
Message from syslogd@dcerouter at Wed Jan 23 12:36:15 2008 ...
dcerouter kernel: [100495.644000] eax: df90bdc0 ebx: 00000006 ecx: 00000035 edx: a7b4bbc0
Message from syslogd@dcerouter at Wed Jan 23 12:36:15 2008 ...
dcerouter kernel: [100495.644000] esi: 0000001d edi: dbdfe000 ebp: df909c00 esp: e545be68
Message from syslogd@dcerouter at Wed Jan 23 12:36:15 2008 ...
dcerouter kernel: [100495.644000] ds: 007b es: 007b ss: 0068
Message from syslogd@dcerouter at Wed Jan 23 12:36:15 2008 ...
dcerouter kernel: [100495.644000] Process fuser (pid: 13555, ti=e545a000 task=d54a5560 task.ti=e545a000)
Message from syslogd@dcerouter at Wed Jan 23 12:36:15 2008 ...
dcerouter kernel: [100495.644000] Stack: 00000010 000000d0 00000004 000000d0 df90e9c0 df908000 df90bdc0 00000000
Message from syslogd@dcerouter at Wed Jan 23 12:36:15 2008 ...
dcerouter kernel: [100495.644000] 479726af 0e3c60b5 e56b9560 c0189dba e56b9560 dbdfe01c 00000246 000000d0
Message from syslogd@dcerouter at Wed Jan 23 12:36:15 2008 ...
dcerouter kernel: [100495.644000] df90e9c0 da1657c8 c0172c00 00000000 00000423 00000004 c018811b e545bf08
Message from syslogd@dcerouter at Wed Jan 23 12:36:15 2008 ...
dcerouter kernel: [100495.644000] Call Trace:
Message from syslogd@dcerouter at Wed Jan 23 12:36:15 2008 ...
dcerouter kernel: [100495.644000] [alloc_inode+202/384] alloc_inode+0xca/0x180
Message from syslogd@dcerouter at Wed Jan 23 12:36:15 2008 ...
dcerouter kernel: [100495.644000] [kmem_cache_alloc+128/144] kmem_cache_alloc+0x80/0x90
Message from syslogd@dcerouter at Wed Jan 23 12:36:15 2008 ...
dcerouter kernel: [100495.644000] [d_alloc+27/400] d_alloc+0x1b/0x190
Message from syslogd@dcerouter at Wed Jan 23 12:36:15 2008 ...
dcerouter kernel: [100495.644000] [proc_fill_cache+257/320] proc_fill_cache+0x101/0x140
Message from syslogd@dcerouter at Wed Jan 23 12:36:15 2008 ...
dcerouter kernel: [100495.644000] [filldir64+0/224] filldir64+0x0/0xe0
Message from syslogd@dcerouter at Wed Jan 23 12:36:15 2008 ...
dcerouter kernel: [100495.644000] [proc_readfd+223/480] proc_readfd+0xdf/0x1e0
Message from syslogd@dcerouter at Wed Jan 23 12:36:15 2008 ...
dcerouter kernel: [100495.644000] [proc_fd_instantiate+0/352] proc_fd_instantiate+0x0/0x160
Message from syslogd@dcerouter at Wed Jan 23 12:36:15 2008 ...
dcerouter kernel: [100495.644000] [filldir64+0/224] filldir64+0x0/0xe0
Message from syslogd@dcerouter at Wed Jan 23 12:36:15 2008 ...
dcerouter kernel: [100495.644000] [filldir64+0/224] filldir64+0x0/0xe0