Author Topic: Zwave device stops working and fails to restart  (Read 5040 times)

mythtified

  • Guru
  • ****
  • Posts: 176
    • View Profile
Re: Zwave device stops working and fails to restart
« Reply #15 on: December 08, 2010, 07:59:54 pm »
Having the port disappear due to removing the dongle was not my case.  The machine would see the dongle early in the day.  I could watch the log as devices were polled/responding then later in the evening a light would fail to turn on on schedule or by event.  The dongle had not been removed during that time.  I would look at the logs and see no activity.  I could not control lights.  i would have to reboot the machine for the dongle to start working again.  Whether the usb ports were going to sleep or something else was going on I could not determine.  After playing with it for a while I decided to move to a different motherboard, Asus A8N32SLI.  Something I had laying around and have not had that problem again.  I never pursued the problem further.  I presently am using the board as a mythtv server and have little to no trouble from it.  Of course I am not using the usb ports either.

totallymaxed

  • LinuxMCE God
  • ****
  • Posts: 4412
  • Smart Home Consulting
    • View Profile
    • Dianemo - at home with technology
Re: Zwave device stops working and fails to restart
« Reply #16 on: December 08, 2010, 09:00:14 pm »
Having the port disappear due to removing the dongle was not my case.  The machine would see the dongle early in the day.  I could watch the log as devices were polled/responding then later in the evening a light would fail to turn on on schedule or by event.  The dongle had not been removed during that time.  I would look at the logs and see no activity.  I could not control lights.  i would have to reboot the machine for the dongle to start working again.  Whether the usb ports were going to sleep or something else was going on I could not determine.  After playing with it for a while I decided to move to a different motherboard, Asus A8N32SLI.  Something I had laying around and have not had that problem again.  I never pursued the problem further.  I presently am using the board as a mythtv server and have little to no trouble from it.  Of course I am not using the usb ports either.

The port does not disappear when removing a dongle - but you do need to re-insert it into the same port when PnP is disabled. What I was describing is not specific to the ASrock but a general problem.

It sounds more likely that your ZWave binary was dying after a few hours - it may have been that the usb chip used in the MCV was not happy with the usb ports on the ASrock. I know that MCV have had problems with their dongle when used in conjunction with their Vera product.

We have no such problems with ZWave or the ZWave interfaces we use and never have had either on any hardware let alone the ASrocks.

Anyway glad you found a solutions for your problem.

all the best

Andrew
Andy Herron,
Convergent Home Technologies Ltd
United Kingdom

Read My Blog; http://ellipticalcurve.com

Contact me for Smart Home consulting advice here;
@herron on Twitter, totallymaxed+consulting@gmail.com via email or PM me here.

Get a Dianemo S License: http://forum.linuxmce.org/index.php?topic=8880.0
iOS Orbiter: http://wiki.linuxmce.org/index.php/Dianemo_iOS_Orbiter
Follow us on Facebook: https://www.facebook.com/pages/Dianemo-Home-Automation/226019387454465

Sales & Info:
http://www.dianemo.co.uk

mythtified

  • Guru
  • ****
  • Posts: 176
    • View Profile
Re: Zwave device stops working and fails to restart
« Reply #17 on: December 08, 2010, 10:10:58 pm »
Whoops.  I did not mean to imply that the port actually disappeared. Thx for the reply

tortho

  • Regular Poster
  • **
  • Posts: 45
    • View Profile
Re: Zwave device stops working and fails to restart
« Reply #18 on: December 12, 2010, 11:55:19 pm »
Well then back to the original thread :-)

Believe this is where it "dies"...
Have been away for a few days and now it is no0t working again..
If it were a sleep issue I believe it would have appeared the first night...

Quote from log:
Code: [Select]
36 12/08/10 13:04:31.171 Got basic report from node 2, value: 99 <0xb71a2b90>
40 12/08/10 13:04:31.275 Sending job 0xa3df048 (cb 243) - 0x1 0x9 0x0 0x13 0xff 0x2 0x31 0x4 0x5 0xf3 0xdb (######1####) <0xb71a2b90>
41 12/08/10 13:04:31.280 0x1 0x4 0x1 0x13 0x1 0xe8 (######) <0xb71a2b90>
36 12/08/10 13:04:31.280 ZW_SEND delivered to Z-Wave stack <0xb71a2b90>
41 12/08/10 13:04:31.300 0x1 0x5 0x0 0x13 0xf3 0x0 0x1a (#######) <0xb71a2b90>
36 12/08/10 13:04:31.301 ZW_SEND Response with callback 243 received <0xb71a2b90>
36 12/08/10 13:04:31.301 ZW_SEND was successful, removing job <0xb71a2b90>
36 12/08/10 13:05:01.451 We have been idle for 30 seconds, polling device states <0xb71a2b90>
40 12/08/10 13:05:01.547 Sending job 0xa3df268 (cb 244) - 0x1 0x9 0x0 0x13 0xff 0x2 0x20 0x2 0x5 0xf4 0xcb (###### ####) <0xb71a2b90>
05 12/08/10 13:05:04.655 No callback received: await_callback: 244 timer: 31 <0xb71a2b90>
40 12/08/10 13:05:04.755 Sending job 0xa3df268 (cb 244) - 0x1 0x9 0x0 0x13 0xff 0x2 0x20 0x2 0x5 0xf4 0xcb (###### ####) <0xb71a2b90>
05 12/08/10 13:05:07.855 No callback received: await_callback: 244 timer: 31 <0xb71a2b90>
40 12/08/10 13:05:07.955 Sending job 0xa3df268 (cb 244) - 0x1 0x9 0x0 0x13 0xff 0x2 0x20 0x2 0x5 0xf4 0xcb (###### ####) <0xb71a2b90>
05 12/08/10 13:05:11.055 No callback received: await_callback: 244 timer: 31 <0xb71a2b90>
01 12/08/10 13:05:11.063 ERROR: Dropping command, no callback received after three resends <0xb71a2b90>
40 12/08/10 13:05:11.159 Sending job 0xa3df488 (cb 245) - 0x1 0xc 0x0 0x13 0xff 0x5 0x60 0x6 0x2 0x20 0x2 0x5 0xf5 0xac (######`## ####) <0xb71a2b90>
05 12/08/10 13:05:14.267 No callback received: await_callback: 245 timer: 31 <0xb71a2b90>
40 12/08/10 13:05:14.715 Sending job 0xa3df488 (cb 245) - 0x1 0xc 0x0 0x13 0xff 0x5 0x60 0x6 0x2 0x20 0x2 0x5 0xf5 0xac (######`## ####) <0xb71a2b90>
05 12/08/10 13:05:17.819 No callback received: await_callback: 245 timer: 31 <0xb71a2b90>
40 12/08/10 13:05:17.919 Sending job 0xa3df488 (cb 245) - 0x1 0xc 0x0 0x13 0xff 0x5 0x60 0x6 0x2 0x20 0x2 0x5 0xf5 0xac (######`## ####) <0xb71a2b90>
05 12/08/10 13:05:21.039 No callback received: await_callback: 245 timer: 31 <0xb71a2b90>
01 12/08/10 13:05:21.039 ERROR: Dropping command, no callback received after three resends <0xb71a2b90>
40 12/08/10 13:05:21.139 Sending job 0xa3df6a8 (cb 246) - 0x1 0x9 0x0 0x13 0xff 0x2 0x31 0x4 0x5 0xf6 0xde (######1####) <0xb71a2b90>
05 12/08/10 13:05:24.243 No callback received: await_callback: 246 timer: 31 <0xb71a2b90>
40 12/08/10 13:05:24.343 Sending job 0xa3df6a8 (cb 246) - 0x1 0x9 0x0 0x13 0xff 0x2 0x31 0x4 0x5 0xf6 0xde (######1####) <0xb71a2b90>
05 12/08/10 13:05:27.463 No callback received: await_callback: 246 timer: 31 <0xb71a2b90>
40 12/08/10 13:05:27.567 Sending job 0xa3df6a8 (cb 246) - 0x1 0x9 0x0 0x13 0xff 0x2 0x31 0x4 0x5 0xf6 0xde (######1####) <0xb71a2b90>
05 12/08/10 13:05:30.671 No callback received: await_callback: 246 timer: 31 <0xb71a2b90>
01 12/08/10 13:05:30.671 ERROR: Dropping command, no callback received after three resends <0xb71a2b90>
01 12/08/10 13:05:30.671 ERROR: Three dropped commands in a row, soft-resetting controller <0xb71a2b90>
36 12/08/10 13:05:30.671 Soft-resetting the Z-Wave chip <0xb71a2b90>
40 12/08/10 13:05:30.771 Sending job 0xa3df8c8 (cb 0) - 0x1 0x3 0x0 0x8 0xf4 (#####) <0xb71a2b90>
36 12/08/10 15:34:55.061 Received command for child <0xb61a0b90>
36 12/08/10 15:34:55.061 ON RECEIVED FOR CHILD 19/0 <0xb61a0b90>
36 12/09/10 6:00:00.001 Received command for child <0xb61a0b90>
36 12/09/10 6:00:00.051 ON RECEIVED FOR CHILD 19/0 <0xb61a0b90>
36 12/09/10 6:00:00.051 Received command for child <0xb61a0b90>
36 12/09/10 6:00:00.051 ON RECEIVED FOR CHILD 19/0 <0xb61a0b90>
36 12/09/10 6:00:00.052 Received command for child <0xb61a0b90>
36 12/09/10 6:00:00.052 ON RECEIVED FOR CHILD 19/0 <0xb61a0b90>
36 12/09/10 6:00:00.052 Received command for child <0xb61a0b90>
36 12/09/10 6:00:00.052 ON RECEIVED FOR CHILD 19/0 <0xb61a0b90>

esev

  • Veteran
  • ***
  • Posts: 87
    • View Profile
    • Eric Severance
Re: Zwave device stops working and fails to restart
« Reply #19 on: December 13, 2010, 04:45:38 pm »
Well then back to the original thread :-)

Believe this is where it "dies"...
Have been away for a few days and now it is no0t working again..
If it were a sleep issue I believe it would have appeared the first night...

Quote from log:
Code: [Select]
01 12/08/10 13:05:30.671 ERROR: Three dropped commands in a row, soft-resetting controller <0xb71a2b90>
36 12/08/10 13:05:30.671 Soft-resetting the Z-Wave chip <0xb71a2b90>
40 12/08/10 13:05:30.771 Sending job 0xa3df8c8 (cb 0) - 0x1 0x3 0x0 0x8 0xf4 (#####) <0xb71a2b90>
36 12/08/10 15:34:55.061 Received command for child <0xb61a0b90>
36 12/08/10 15:34:55.061 ON RECEIVED FOR CHILD 19/0 <0xb61a0b90>

Is it possible that the serial port goes away temporarily when the Z-Wave chip is reset?  I never encountered this when testing the soft-reset on Tricklestar hardware.  Do you see any Plug-and-play events happening around the same time as the soft-reset took place?

Just curious, do you have any wireless devices in your house operating around 900 MHz?  Maybe a cordless phone, wireless speakers, or a video monitor?
Eric Severance
My setup

tortho

  • Regular Poster
  • **
  • Posts: 45
    • View Profile
Re: Zwave device stops working and fails to restart
« Reply #20 on: December 13, 2010, 09:51:15 pm »
Been through all log files in /var/log and in /var/log/pluto and none of them has anything happening 8th December at 13:05 which was the time of failure.

Yes I do have a wireless phone operating at 900Mhz, but Zwave european model uses 868.42MHz.

so I'm still a little lost..

hari

  • Administrator
  • LinuxMCE God
  • *****
  • Posts: 2420
    • View Profile
    • ago control
Re: Zwave device stops working and fails to restart
« Reply #21 on: December 13, 2010, 11:32:47 pm »
I wonder why the last send attempt does not run into a timeout (after the soft reset)..
rock your home - http://www.agocontrol.com home automation

tortho

  • Regular Poster
  • **
  • Posts: 45
    • View Profile
Re: Zwave device stops working and fails to restart
« Reply #22 on: December 14, 2010, 12:05:33 am »
FYI,

- There is no timed events running at that time of the day (which I'm aware of or can see in the setup)
- No one at home used the zwave from the hybrid or md.
- The only thing that might have happened is that someone physically turned on or off one of the lights at home locally (Not through any remotes or computers)

esev

  • Veteran
  • ***
  • Posts: 87
    • View Profile
    • Eric Severance
Re: Zwave device stops working and fails to restart
« Reply #23 on: December 14, 2010, 03:12:07 am »
I wonder why the last send attempt does not run into a timeout (after the soft reset)..

Another question is why are there no more "Sending job" messages after the soft reset and no more 30 second polls.  It's like the main send/receive thread stopped working.
Eric Severance
My setup

hari

  • Administrator
  • LinuxMCE God
  • *****
  • Posts: 2420
    • View Profile
    • ago control
Re: Zwave device stops working and fails to restart
« Reply #24 on: December 14, 2010, 03:43:39 pm »
i'd suspect that we should not try to acquire the mutex a 2nd time in dropSendQueueJob()...

br Hari
rock your home - http://www.agocontrol.com home automation

esev

  • Veteran
  • ***
  • Posts: 87
    • View Profile
    • Eric Severance
Re: Zwave device stops working and fails to restart
« Reply #25 on: December 14, 2010, 05:59:23 pm »
i'd suspect that we should not try to acquire the mutex a 2nd time in dropSendQueueJob()...

When I added dropSendQueueJob(), I changed the mutex so that it is recursive.  dropSendQueueJob() should be able to acquire the mutex a second time from the same thread, as should zwSoftReset().
Eric Severance
My setup

hari

  • Administrator
  • LinuxMCE God
  • *****
  • Posts: 2420
    • View Profile
    • ago control
Re: Zwave device stops working and fails to restart
« Reply #26 on: December 14, 2010, 10:09:51 pm »
i'd suspect that we should not try to acquire the mutex a 2nd time in dropSendQueueJob()...

When I added dropSendQueueJob(), I changed the mutex so that it is recursive.  dropSendQueueJob() should be able to acquire the mutex a second time from the same thread, as should zwSoftReset().
seen that but it looks like it hangs there..

br Hari
rock your home - http://www.agocontrol.com home automation

esev

  • Veteran
  • ***
  • Posts: 87
    • View Profile
    • Eric Severance
Re: Zwave device stops working and fails to restart
« Reply #27 on: December 14, 2010, 10:35:25 pm »
i'd suspect that we should not try to acquire the mutex a 2nd time in dropSendQueueJob()...

When I added dropSendQueueJob(), I changed the mutex so that it is recursive.  dropSendQueueJob() should be able to acquire the mutex a second time from the same thread, as should zwSoftReset().
seen that but it looks like it hangs there..

I'm not understanding.  Are you saying the mutex is getting corrupted?

I see in the log it prints out "ERROR: Three dropped commands in a row, soft-resetting controller".  In dropSendQueueJob() this happens after the mutex is unlocked.  Then there is the "Soft-resetting the Z-Wave chip" from zwSoftReset().  Then I see the "Sending job..." which occurs from WriteSerialStringEx() back in the main receiveFunction() loop.

I was thinking the mutex was okay, otherwise the receiveFunction() wouldn't be able to acquire it again before calling WriteSerialStringEx().  But maybe, since this is all happening in the same thread, the mutex doesn't get unlocked as many times as it is locked - and then outside threads cannot add new things to the queue.

I was pretty sure there was a matching unlock for every mutex lock, but I'll take another look at the code again tonight.  Hari, if you could take a look too I'd appreciate having a second set of eyes looking at the code too.  It was from changeset 23385
Eric Severance
My setup

valent

  • Guru
  • ****
  • Posts: 380
    • View Profile
    • /kernel_reloaded/
Re: Zwave device stops working and fails to restart
« Reply #28 on: December 15, 2010, 05:00:55 pm »
How do you disable PnP signature for your dongles?
LinuxMCE - If it was easy, everybody would be doing it!!
My setup - http://wiki.linuxmce.org/index.php/User:Valent

tortho

  • Regular Poster
  • **
  • Posts: 45
    • View Profile
Re: Zwave device stops working and fails to restart
« Reply #29 on: December 20, 2010, 04:47:30 pm »
No soft reset on this "freeze"....

Code: [Select]
01 12/19/10 13:42:07.680 ERROR: Dropping command, no callback received after three resends <0xb7236b90>
36 12/19/10 13:42:37.828 We have been idle for 30 seconds, polling device states <0xb7236b90>
40 12/19/10 13:42:37.928 Sending job 0x859e190 (cb 128) - 0x1 0x9 0x0 0x13 0xff 0x2 0x20 0x2 0x5 0x80 0xbf (###### ####) <0xb7236b90>
05 12/19/10 13:42:41.033 No callback received: await_callback: 128 timer: 31 <0xb7236b90>
40 12/19/10 13:42:41.133 Sending job 0x859e190 (cb 128) - 0x1 0x9 0x0 0x13 0xff 0x2 0x20 0x2 0x5 0x80 0xbf (###### ####) <0xb7236b90>
05 12/19/10 13:42:44.237 No callback received: await_callback: 128 timer: 31 <0xb7236b90>
40 12/19/10 13:42:44.333 Sending job 0x859e190 (cb 128) - 0x1 0x9 0x0 0x13 0xff 0x2 0x20 0x2 0x5 0x80 0xbf (###### ####) <0xb7236b90>
05 12/19/10 13:42:47.437 No callback received: await_callback: 128 timer: 31 <0xb7236b90>
01 12/19/10 13:42:47.437 ERROR: Dropping command, no callback received after three resends <0xb7236b90>
01 12/19/10 13:42:47.437 ERROR: Three dropped commands in a row, soft-resetting controller <0xb7236b90>
36 12/19/10 13:42:47.437 Soft-resetting the Z-Wave chip <0xb7236b90>
40 12/19/10 13:42:47.541 Sending job 0x859e3b0 (cb 129) - 0x1 0xc 0x0 0x13 0xff 0x5 0x60 0x6 0x2 0x20 0x2 0x5 0x81 0xd8 (######`## ####) <0xb7236b90>
05 12/19/10 13:42:50.641 No callback received: await_callback: 129 timer: 31 <0xb7236b90>
40 12/19/10 13:42:50.745 Sending job 0x859e3b0 (cb 129) - 0x1 0xc 0x0 0x13 0xff 0x5 0x60 0x6 0x2 0x20 0x2 0x5 0x81 0xd8 (######`## ####) <0xb7236b90>
05 12/19/10 13:42:53.849 No callback received: await_callback: 129 timer: 31 <0xb7236b90>
40 12/19/10 13:42:53.948 Sending job 0x859e3b0 (cb 129) - 0x1 0xc 0x0 0x13 0xff 0x5 0x60 0x6 0x2 0x20 0x2 0x5 0x81 0xd8 (######`## ####) <0xb7236b90>
05 12/19/10 13:42:57.053 No callback received: await_callback: 129 timer: 31 <0xb7236b90>
01 12/19/10 13:42:57.053 ERROR: Dropping command, no callback received after three resends <0xb7236b90>
40 12/19/10 13:42:57.153 Sending job 0x859e5d0 (cb 130) - 0x1 0x9 0x0 0x13 0xff 0x2 0x31 0x4 0x5 0x82 0xaa (######1####) <0xb7236b90>
05 12/19/10 13:43:00.265 No callback received: await_callback: 130 timer: 31 <0xb7236b90>
40 12/19/10 13:43:00.365 Sending job 0x859e5d0 (cb 130) - 0x1 0x9 0x0 0x13 0xff 0x2 0x31 0x4 0x5 0x82 0xaa (######1####) <0xb7236b90>
05 12/19/10 13:43:03.473 No callback received: await_callback: 130 timer: 31 <0xb7236b90>
40 12/19/10 13:43:03.573 Sending job 0x859e5d0 (cb 130) - 0x1 0x9 0x0 0x13 0xff 0x2 0x31 0x4 0x5 0x82 0xaa (######1####) <0xb7236b90>
05 12/19/10 13:43:06.681 No callback received: await_callback: 130 timer: 31 <0xb7236b90>
01 12/19/10 13:43:06.681 ERROR: Dropping command, no callback received after three resends <0xb7236b90>
40 12/19/10 13:43:06.776 Sending job 0x859e7f0 (cb 0) - 0x1 0x3 0x0 0x8 0xf4 (#####) <0xb7236b90>
05 12/19/10 14:22:28.360 Going to rotate logs... <0xb60ffb90>
36 12/19/10 15:34:10.034 Received command for child <0xb60ffb90>
36 12/19/10 15:34:10.035 ON RECEIVED FOR CHILD 19/0 <0xb60ffb90>
36 12/20/10 5:59:59.627 Received command for child <0xb60ffb90>
36 12/20/10 5:59:59.891 ON RECEIVED FOR CHILD 19/0 <0xb60ffb90>
36 12/20/10 6:00:00.001 Received command for child <0xb60ffb90>
« Last Edit: December 20, 2010, 04:49:12 pm by tortho »