Topic: 7.10 Core : deadlocks, DCERouter keeps restarting

bulek

  • Administrator
  • wants to work for LinuxMCE
  • *****
  • Posts: 883
  • Living with LMCE
    • View Profile
7.10 Core : deadlocks, DCERouter keeps restarting
« on: May 10, 2011, 10:51:25 pm »
Hi,

My rock-solid 7.10 Core has started to crawl. DCERouter keeps restarting, everything is very slow, and the main Orbiter works for a few minutes and then goes back to the startup procedure. I've tried to figure out what is causing the problems, but couldn't find a good reason (I'm not an experienced Linux user).

The DCERouter log shows several interesting entries; the main thing that caught my attention was that SQL queries took a long time. I've run mysqlcheck and everything seems fine. I'm confused and would kindly ask for any opinion or pointer on where to start checking things. As said, I suspect MySQL. I also get a lot of errors from mythbackend (I don't use it anymore, and I have tried to disable MythPlugin, but nothing changed).

Thanks in advance,

regards,

Bulek.

posde

  • Administrator
  • LinuxMCE God
  • *****
  • Posts: 2846
  • Wastes Life On LinuxMCE Since 2007
    • View Profile
    • My Home
Re: 7.10 Core : deadlocks, DCERouter keeps restarting
« Reply #1 on: May 10, 2011, 11:18:50 pm »
Did you add an external USB drive?

How large is your media collection?

How much free diskspace do you have?

bulek

Re: 7.10 Core : deadlocks, DCERouter keeps restarting
« Reply #2 on: May 11, 2011, 01:27:24 am »
Quote
Did you add an external USB drive?
How large is your media collection?
How much free diskspace do you have?

1. I didn't add an external drive, but it did detect a new (but empty) Windows share.
2. Not sure, but it is pretty large.
3. Disk usage is at 64%.
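
For anyone who wants to check the disk usage figure from a script rather than by hand, here is a minimal Python sketch. It assumes Python 3 is available (it may not be on a stock 7.10 Core), the path "/" is only an example, and `percent_used` is a name made up for this illustration.

```python
import shutil


def percent_used(path="/"):
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    return 100.0 * usage.used / usage.total
```

Calling `percent_used("/")` returns a number between 0 and 100. The figure may differ slightly from what `df` reports, because `df` leaves blocks reserved for root out of its percentage.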

Do I understand correctly that letting the Core finish 1. or 2. would help? The situation seems better now. But is it normal for DCERouter to keep restarting during 1. or 2.?

Thanks in advance,

regards,

Bulek.

golgoj4

  • NEEDS to work for LinuxMCE
  • ***
  • Posts: 1046
  • hrumpf!
    • View Profile
    • La Smarthomes - SoCal smarthome site
Re: 7.10 Core : deadlocks, DCERouter keeps restarting
« Reply #3 on: May 11, 2011, 01:57:30 am »
I experienced similar problems when I had an HD slowly on its way out. I did not determine it was the HD until I ran 'top' and checked the %wa (I/O wait), which was abnormally high (40% - 80%). I then installed 'sysstat' and gleaned that there were extremely long times between a request for data and when the request was completed. Based on all this, I determined I had a dysfunctional drive, as the SMART status also essentially said 'you're on borrowed time'.
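
The %wa figure that 'top' shows can also be sampled directly from /proc/stat. The following is a minimal Python sketch of that idea, not what 'top' or 'sysstat' do internally. It assumes Python 3 and a Linux kernel whose aggregate "cpu" line lists iowait as the fifth counter; the function names and the 5 second interval are made up for this illustration.

```python
import time


def parse_cpu_line(stat_text):
    """Return the counters from the aggregate 'cpu' line of /proc/stat as ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Share of CPU time spent waiting on I/O between two /proc/stat snapshots."""
    old = parse_cpu_line(before)
    new = parse_cpu_line(after)
    deltas = [n - o for o, n in zip(old, new)]
    # Only the first eight counters (user .. steal) are summed; guest time,
    # where present, is already included in user time.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    # Field order: user, nice, system, idle, iowait, ... so iowait is index 4.
    return 100.0 * deltas[4] / total


def sample_iowait(interval=5.0):
    """Take two snapshots `interval` seconds apart and return the iowait share."""
    with open("/proc/stat") as handle:
        first = handle.read()
    time.sleep(interval)
    with open("/proc/stat") as handle:
        second = handle.read()
    return iowait_percent(first, second)
```

Calling `sample_iowait()` a few times while the system is crawling gives a rough idea of whether the CPUs are mostly waiting on the disk. Values persistently in the 40% to 80% range would match what is described above.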

HTH
golgoj4
Those people who tell you not to take chances, they are all missing what life's all about.

Wiki Hardware Page http://wiki.linuxmce.org/index.php/User:Langstonius

bulek

Re: 7.10 Core : deadlocks, DCERouter keeps restarting
« Reply #4 on: May 11, 2011, 05:32:40 pm »
Thanks for the responses,

it seems quiet now and is working OK. Maybe there was just a huge amount of work that needed to be done. I remember similar situations from the past, but none with such an obvious impact.

Thanks for the hint, I will check the HD.

Regards,

Bulek.

bulek

Re: 7.10 Core : deadlocks, DCERouter keeps restarting
« Reply #5 on: June 08, 2011, 11:52:53 pm »
Hi,

I'm back with the same problems after some time of everything being OK. I get a lot of "Query failed: mysql server has gone away" or "query xxxx took xxxxmsecs" messages. One of the Orbiters crashed whenever I tried to go into the Media Audio list. I also get weird warnings in /var/log/monit.log about not being able to read stats for a drive, etc.

I've run a short test with smartctl, but no errors appeared.
I've tried to recreate the Orbiter, but it doesn't help. I wonder if I can somehow rebuild the media content in the MySQL database (wipe it out and re-read all media files)?

Any other advice on how to clean up this mess a little bit?

Thanks,
regards,
Bulek.

bulek

Re: 7.10 Core : deadlocks, DCERouter keeps restarting
« Reply #6 on: June 22, 2011, 08:17:00 am »
Hi,

I've made some progress. I've checked the disks with the smartctl long offline test and everything seems fine.
I've reproduced the crash several times and it seems like the Datagrid plugin is causing problems (or possibly something in the media database).

When I press the Audio button on the main Orbiter menu I get a black screen and the Orbiter then restarts.
I have this in DCERouter's log:
Quote
08      06/22/11 7:09:47.378            Received Message from 20 (Main OnScreen Orbiter / Living Room/Family Room) to 10 (Media Plug-in / Living Room/Family Room), type 1 id 74 Command:Bind to Media Remote, retry none, parameters: <0xaeffab90>
08      06/22/11 7:09:47.383              Parameter 2(PK_Device):  <0xaeffab90>
08      06/22/11 7:09:47.383              Parameter 3(PK_DesignObj):  <0xaeffab90>
08      06/22/11 7:09:47.383              Parameter 8(On/Off): 0 <0xaeffab90>
08      06/22/11 7:09:47.383              Parameter 39(Options):  <0xaeffab90>
08      06/22/11 7:09:47.383              Parameter 45(PK_EntertainArea): 1 <0xaeffab90>
08      06/22/11 7:09:47.383              Parameter 63(PK_Text_Synopsis):  <0xaeffab90>
08      06/22/11 7:09:47.383              Parameter 159(PK_Screen):  <0xaeffab90>
08      06/22/11 7:09:47.400            Received Message from 20 (Main OnScreen Orbiter / Living Room/Family Room) to 6 (Datagrid Plug-in / Living Room/Family Room), type 1 id 35 Command:Populate Datagrid, retry none, parameters: <0xaeffab90>
08      06/22/11 7:09:47.400              Parameter 4(PK_Variable): 0 <0xaeffab90>
08      06/22/11 7:09:47.400              Parameter 5(Value To Assign):  <0xaeffab90>
08      06/22/11 7:09:47.400              Parameter 10(ID): 1 <0xaeffab90>
08      06/22/11 7:09:47.400              Parameter 15(DataGrid ID): resetav_20 <0xaeffab90>
08      06/22/11 7:09:47.400              Parameter 38(PK_DataGrid): 30 <0xaeffab90>
08      06/22/11 7:09:47.400              Parameter 39(Options): 1 <0xaeffab90>
08      06/22/11 7:09:47.400              Parameter 40(IsSuccessful): 1 <0xaeffab90>
08      06/22/11 7:09:47.400              Parameter 44(PK_DeviceTemplate): 0 <0xaeffab90>
08      06/22/11 7:09:47.400              Parameter 60(Width): 7 <0xaeffab90>
08      06/22/11 7:09:47.400              Parameter 61(Height): 6 <0xaeffab90>
08      06/22/11 7:09:47.539            Received Message from 20 (Main OnScreen Orbiter / Living Room/Family Room) to 6 (Datagrid Plug-in / Living Room/Family Room), type 1 id 34 Command:Request Datagrid Contents, retry none, parameters: <0xaeffab90>
08      06/22/11 7:09:47.539              Parameter 10(ID): 2 <0xaeffab90>
08      06/22/11 7:09:47.539              Parameter 15(DataGrid ID): resetav_20 <0xaeffab90>
08      06/22/11 7:09:47.539              Parameter 32(Row): 0 <0xaeffab90>
08      06/22/11 7:09:47.539              Parameter 33(Column): 0 <0xaeffab90>
08      06/22/11 7:09:47.539              Parameter 34(Row count): 6 <0xaeffab90>
08      06/22/11 7:09:47.539              Parameter 35(Column count): 7 <0xaeffab90>
08      06/22/11 7:09:47.539              Parameter 36(Keep Row Header): 0 <0xaeffab90>
08      06/22/11 7:09:47.539              Parameter 37(Keep Column Header): 0 <0xaeffab90>
08      06/22/11 7:09:47.539              Parameter 49(Add Up-Down Arrows): 1 <0xaeffab90>
08      06/22/11 7:09:47.539              Parameter 73(Seek):  <0xaeffab90>
08      06/22/11 7:09:47.539              Parameter 74(Offset): 0 <0xaeffab90>

....

08      06/22/11 7:09:55.393            Received Message from 20 (Main OnScreen Orbiter / Living Room/Family Room) to 6 (Datagrid Plug-in / Living Room/Family Room), type 1 id 35 Command:Populate Datagrid, retry none, parameters: <0xaeffab90>
08      06/22/11 7:09:55.393              Parameter 4(PK_Variable): 0 <0xaeffab90>
08      06/22/11 7:09:55.393              Parameter 5(Value To Assign):  <0xaeffab90>
08      06/22/11 7:09:55.393              Parameter 10(ID): 4 <0xaeffab90>
08      06/22/11 7:09:55.393              Parameter 15(DataGrid ID): MediaFile_20 <0xaeffab90>
08      06/22/11 7:09:55.393              Parameter 38(PK_DataGrid): 63 <0xaeffab90>
08      06/22/11 7:09:55.393              Parameter 39(Options): 4||||1,2|0|2|0 | 2 | <0xaeffab90>
08      06/22/11 7:09:55.393              Parameter 40(IsSuccessful): 1 <0xaeffab90>
08      06/22/11 7:09:55.393              Parameter 44(PK_DeviceTemplate): 0 <0xaeffab90>
08      06/22/11 7:09:55.393              Parameter 60(Width): 1 <0xaeffab90>
08      06/22/11 7:09:55.393              Parameter 61(Height): 8 <0xaeffab90>

....

01      06/22/11 7:10:24.396            Socket::ReceiveData-a 0x99e87f0 failed ret 0 <0xaeffab90>
01      06/22/11 7:10:24.397            Socket 0x99e87f0 failure waiting for response to message from device 6 type 1 id 35 <0xaeffab90>
01      06/22/11 7:10:24.397            Plugin 6 stopped responding <0xaeffab90>
07      06/22/11 7:10:26.075            Event #12 has no handlers <0xb33c7b90>
07      06/22/11 7:10:26.075            Received Message from 22 (Xine Player Main / Living Room/Family Room) to -1001 (unknown / ), type 2 id 12 Event:Playback Completed, retry none, parameters: <0xb33c7b90>
07      06/22/11 7:10:26.075              Parameter 4(MRL):  <0xb33c7b90>
07      06/22/11 7:10:26.075              Parameter 9(Stream ID): 0 <0xb33c7b90>
07      06/22/11 7:10:26.075              Parameter 37(With Errors): 0 <0xb33c7b90>
07      06/22/11 7:10:27.316            Event #12 has no handlers <0xab5b2b90>
07      06/22/11 7:10:27.316            Received Message from 25 (MPlayer Player / Living Room/Family Room) to -1001 (unknown / ), type 2 id 12 Event:Playback Completed, retry none, parameters: <0xab5b2b90>
07      06/22/11 7:10:27.316              Parameter 4(MRL):  <0xab5b2b90>
07      06/22/11 7:10:27.316              Parameter 9(Stream ID): 0 <0xab5b2b90>
07      06/22/11 7:10:27.316              Parameter 37(With Errors): 0 <0xab5b2b90>
07      06/22/11 7:10:33.398            Event #9 has no handlers <0x99cfbb90>
07      06/22/11 7:10:33.399            Received Message from 581 (BM_motion_living_00 / ) to -1000 (unknown / ), type 2 id 9 Event:Sensor Tripped, retry none, parameters: <0x99cfbb90>
07      06/22/11 7:10:33.399              Parameter 25(Tripped): 1 <0x99cfbb90>
01      06/22/11 7:10:44.808            debug_stream_end MediaStream::~MediaStream c1 1010/0x9af3ad0 <0xb6bcf6c0>
01      06/22/11 7:10:44.809            debug_stream_end MediaStream::~MediaStream c1 1009/0x7b7735b8 <0xb6bcf6c0>
08      06/22/11 7:11:16.058            Received Message from 11 (Telecom Plug-in / Living Room/Family Room) to 17 (Asterisk / Living Room/Family Room), type 1 id 922 Command:Send Asterisk Status, retry retry, parameters: <0x9dbf6b90>




and also one on the MD in the living room (this time on the Video button):
Quote
08      06/22/11 7:50:36.521            Received Message from 793 (OnScreen Orbiter / Office) to 6 (Datagrid Plug-in / Living Room/Family Room), type 1 id 35 Command:Populate Datagrid, retry none, parameters: <0x5ddfab90>
08      06/22/11 7:50:36.521              Parameter 4(PK_Variable): 0 <0x5ddfab90>
08      06/22/11 7:50:36.521              Parameter 5(Value To Assign):  <0x5ddfab90>
08      06/22/11 7:50:36.522              Parameter 10(ID): 2 <0x5ddfab90>
08      06/22/11 7:50:36.522              Parameter 15(DataGrid ID): MediaFile_793 <0x5ddfab90>
08      06/22/11 7:50:36.522              Parameter 38(PK_DataGrid): 63 <0x5ddfab90>
08      06/22/11 7:50:36.522              Parameter 39(Options): 5||||1,2|0|13|0 | 2 | <0x5ddfab90>
08      06/22/11 7:50:36.522              Parameter 40(IsSuccessful): 1 <0x5ddfab90>
08      06/22/11 7:50:36.522              Parameter 44(PK_DeviceTemplate): 0 <0x5ddfab90>
08      06/22/11 7:50:36.522              Parameter 60(Width): 1 <0x5ddfab90>
08      06/22/11 7:50:36.522              Parameter 61(Height): 20 <0x5ddfab90>
...

01      06/22/11 7:51:05.528            Socket::ReceiveData-a 0x8190b90 failed ret 0 <0x5ddfab90>
01      06/22/11 7:51:05.596            Socket 0x8190b90 failure waiting for response to message from device 6 type 1 id 35 <0x5ddfab90>
01      06/22/11 7:51:05.596            Plugin 6 stopped responding <0x5ddfab90>



It seems that the Datagrid plugin doesn't respond, or at least not in time. Is something wrong in the database?
Any other advice on how to solve this?

Do I have any option to debug this more deeply?
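
One way to dig a little deeper is to measure how long the router waited before giving up on the plugin. The Python sketch below pairs each "Plugin N stopped responding" line with the last "Received Message ... to N" line carrying the same trailing <0x...> token and reports the gap in seconds. It assumes Python 3, assumes the log layout is exactly as in the excerpts above, and assumes that the trailing token identifies the thread handling the message. All names in the code are made up for this illustration.

```python
import re
from datetime import datetime

# Layout assumed from the excerpts: level, mm/dd/yy, h:mm:ss.mmm, text, <0x...>
LINE_RE = re.compile(
    r"^(?P<level>\d+)\s+(?P<stamp>\d{2}/\d{2}/\d{2} \d{1,2}:\d{2}:\d{2}\.\d{3})"
    r"\s+(?P<text>.*?)\s*<(?P<tag>0x[0-9a-fA-F]+)>\s*$"
)
REQUEST_RE = re.compile(
    r"Received Message from \d+ .*? to (?P<target>-?\d+) \(.*"
    r"(?:Command|Event):(?P<name>.*?), retry"
)
FAILURE_RE = re.compile(r"Plugin (?P<target>\d+) stopped responding")


def stalled_requests(lines):
    """Return (plugin id, last command, seconds waited) for each plugin stall."""
    last_request = {}  # (tag, target device) -> (timestamp, command name)
    stalls = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # blank lines, "...." markers and anything unrecognised
        when = datetime.strptime(match.group("stamp"), "%m/%d/%y %H:%M:%S.%f")
        text, tag = match.group("text"), match.group("tag")
        request = REQUEST_RE.search(text)
        if request:
            key = (tag, request.group("target"))
            last_request[key] = (when, request.group("name"))
            continue
        failure = FAILURE_RE.search(text)
        if failure:
            key = (tag, failure.group("target"))
            if key in last_request:
                sent, name = last_request[key]
                waited = (when - sent).total_seconds()
                stalls.append((int(failure.group("target")), name, waited))
    return stalls
```

Passing the lines of the DCERouter log (for example `stalled_requests(open(path_to_log))`, where `path_to_log` is a placeholder) returns one entry per stall. In the two excerpts as posted, the gap between the last Populate Datagrid request and the failure is about 29 seconds both times (7:09:55.393 to 7:10:24.397, and 7:50:36.521 to 7:51:05.596). That would be consistent with a fixed timeout while the plugin is busy, although this is only a guess and the elided lines could hide other traffic.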

Thanks in advance,

regards,

Bulek.

klovell

  • Guru
  • ****
  • Posts: 205
    • View Profile
Re: 7.10 Core : deadlocks, DCERouter keeps restarting
« Reply #7 on: June 22, 2011, 03:27:53 pm »
I'm just throwing this out there, not really telling you where the problem is, because I'm not sure.

I had a RAID setup (not LMCE) that kept degrading because of a failed drive. I ran SMART reports and tests a few times and got positive results, but the drive was actually bad. I was able to rebuild the array twice after a reboot, but not the third time. After I replaced the drive I was able to rebuild and everything worked fine.

I've heard other people say similar things. I also worked at a web hosting company that had so little faith in SMART that they didn't even bother to turn it on. They would just replace the drive at the first sign of failure.

Just throwing it out there....