bhlonewolf Senior Member
Joined: March 29 2007
Online Status: Offline Posts: 198
Posted: December 18 2007 at 10:17
I somewhat answered one of my questions in an earlier thread about "nak accounting"...but figured I'd share my adventures here.
My network is extremely reliable. But about once a day or so, I take a look at the PH comm status, and I always see a few NAKs here and there. This has kind of bugged me, so I wanted to see if I could find a pattern.
Some devices seem a bit more sensitive than others. For example, my ControLincs and KPLs seem to register more NAKs than other devices, but that's a casual observation.
The second thing I've discovered is that only 3 NAKs immediately in a row represent true death, since the device will try up to 3 times. If you try to status scan an offline device, for example, PH will register this as 3 NAKs for the single status scan. (That's the missing piece to my earlier question.) BTW, I don't think this is wrong (or right), but it has helped me understand the network better.
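To make the "3 in a row" rule concrete, here is a rough Python sketch (not the app itself) that flags a device only when it racks up three back-to-back NAKs. The function name, the 10-second gap limit, and the tuple layout are my own assumptions for illustration; the sample times are the ones from the log further down.

```python
from datetime import datetime, timedelta

def t(hms):
    """Parse an HH:MM:SS string into a datetime (the date part is a placeholder)."""
    return datetime.strptime(hms, "%H:%M:%S")

def find_dead_devices(naks, retries=3, max_gap=timedelta(seconds=10)):
    """Return (address, first_nak_time) for each run of `retries` consecutive NAKs.

    `naks` is a time-ordered list of (datetime, address) tuples. A run is
    broken by a NAK from another device or by a gap longer than `max_gap`
    (an assumed value; the retries in the log are about 5 seconds apart).
    """
    dead = []
    run_addr, run_start, run_last, run_len = None, None, None, 0
    for when, addr in naks:
        if addr == run_addr and when - run_last <= max_gap:
            run_len += 1
        else:
            run_addr, run_start, run_len = addr, when, 1
        run_last = when
        if run_len == retries:
            dead.append((addr, run_start))
            run_addr, run_len = None, 0  # a further NAK starts a fresh run
    return dead

naks = [
    (t("05:24:35"), "0B.9D.E4"),
    (t("05:26:45"), "0B.9F.AD"),
    (t("08:57:36"), "0B.9D.AD"),
    (t("08:57:41"), "0B.9D.AD"),
    (t("08:58:09"), "09.4C.A2"),
    (t("08:58:14"), "09.4C.A2"),
    (t("08:58:19"), "09.4C.A2"),
]

for addr, start in find_dead_devices(naks):
    print(addr, start.strftime("%H:%M:%S"))  # prints: 09.4C.A2 08:58:09
```

With this rule only the ControLinc burst counts as a failed request; the single and double NAKs are treated as retries that eventually got through.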
Since the functionality isn't there yet to trigger off of NAKs (hopefully some day, *wink* *wink*), I wrote a super cheesy app that just listens for NAKs. It doesn't do anything interesting with them, but I could use my socket client to fire a macro, so I have at least one way to trigger off of a NAK!
So I ran it last night when I went to bed, and this morning I saw these NAKs:
05:24:35 NAK (0B.9D.E4)
05:26:45 NAK (0B.9F.AD)
08:57:36 NAK (0B.9D.AD) [ToggleLinc Relay]
08:57:41 NAK (0B.9D.AD)
08:58:09 NAK (09.4C.A2) [Control Linc]
08:58:14 NAK (09.4C.A2)
08:58:19 NAK (09.4C.A2)
What's interesting about this is the proximity in time. All the NAKs are clustered, which to me suggests either a problem with the PLC, or a "universal problem" not related to any specific device. The ControLinc is actually plugged into the very same outlet that the PLC is, so that's even more peculiar. At 8:57, after the second NAK from the TL, there was a successful status request, but then the next status scan hit all NAKs. There was a lot of activity in the house around then -- garage doors opening, toaster and coffee pot running, etc. Impossible to know if it's coincidence or not.
Even the isolated ones at 5:24 and 5:26 are suspiciously close together.
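For anyone who wants to check the clustering rather than eyeball it, a minimal Python sketch along these lines would group the NAK lines above by time gap. The 5-minute window and the function names are assumptions I picked for illustration, and the line format is just what my app printed above.

```python
from datetime import datetime, timedelta

LOG = """\
05:24:35 NAK (0B.9D.E4)
05:26:45 NAK (0B.9F.AD)
08:57:36 NAK (0B.9D.AD) [ToggleLinc Relay]
08:57:41 NAK (0B.9D.AD)
08:58:09 NAK (09.4C.A2) [Control Linc]
08:58:14 NAK (09.4C.A2)
08:58:19 NAK (09.4C.A2)
"""

def parse(line):
    """Turn 'HH:MM:SS NAK (AA.BB.CC) ...' into (datetime, address)."""
    time_str, _, addr = line.split()[:3]
    return datetime.strptime(time_str, "%H:%M:%S"), addr.strip("()")

def cluster_by_time(events, window=timedelta(minutes=5)):
    """Group time-ordered events; a gap longer than `window` starts a new cluster."""
    clusters = []
    for when, addr in events:
        if clusters and when - clusters[-1][-1][0] <= window:
            clusters[-1].append((when, addr))
        else:
            clusters.append([(when, addr)])
    return clusters

events = [parse(line) for line in LOG.splitlines()]
clusters = cluster_by_time(events)
for c in clusters:
    devices = sorted({addr for _, addr in c})
    print(c[0][0].strftime("%H:%M:%S"), len(c), "NAKs,", len(devices), "devices")
```

A cluster that spans several different devices points more toward the PLC or the powerline than toward any one device.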
Anyway, just thought I'd share. If anyone has tips/suggestions/comments about troubleshooting or diagnosing comm issues in their house, post 'em, I'd like to see what you've done.
|
cmhardwick Senior Member
Joined: July 08 2006 Location: United States
Online Status: Offline Posts: 290
Posted: December 18 2007 at 16:13
Care to share your little app? This could be useful to all of us wanting to better isolate the time of NAK events, etc.
Edited by cmhardwick - December 18 2007 at 16:13
__________________ Cicero, Enjoying automation!
|
bhlonewolf Senior Member
Joined: March 29 2007
Online Status: Offline Posts: 198
Posted: December 18 2007 at 16:24
Sure -- let me at least put in 1 pass of cleanup -- it's pretty bad. I mean, I've written some bad code (I can admit it) and truly this was only for experimental reasons. It requires a PLC/SDM and PH to be running (not like this is much of an issue for us here, but just fyi). I'll post it later tonight.
|
bhlonewolf Senior Member
Joined: March 29 2007
Online Status: Offline Posts: 198
Posted: December 18 2007 at 17:49
Alrighty, well behold a very cheesy app. I take some liberties in assuming a default installation folder, and it requires SDM v308. It won't work with the previous SDM. It also requires v2.0 or higher of the .NET Framework.
Anyway, just unzip to a folder of your choice. I have no idea if it will work well for anyone else as I haven't tested it on any other machine, but it should just start working. If you want log files kept (txt files in same directory) check the box. Also, if you want a user message sent to the PH socket server, check that too. You'd need to update the PowerHomeSocketClient.dll.config file appropriately with the address, port, user/pass.
That's it! Let me know if it works.
No warranties expressed or implied :)
2007-12-18_174330_naksnooper.zip
EDIT: If you do record log files, depending on the amount of traffic, bear in mind they will grow pretty big. Zip 'em up or purge them regularly.
Edited by bhlonewolf - December 18 2007 at 20:03
|
bhlonewolf Senior Member
Joined: March 29 2007
Online Status: Offline Posts: 198
Posted: December 18 2007 at 20:01
I caught another couple of interesting ones but I think I have an explanation. Actually, in this case, it wasn't due to a status scan so PH didn't record it as a NAK, and there wasn't the "flurry" of Event 03's you typically see leading up to a NAK.
What I did see though was a ridiculous amount of traffic around that time (I log device changes in the PH log). On/off, on/off, about a dozen or so times. Then I found the root cause: my 2-year-old playing with the RemoteLinc in the living room :)
Maybe someone knows the INSTEON protocol better than me, but I believe it's possible to see "legit" NAKs in some situations? I thought I remembered someone talking about this a while ago -- perhaps it had to do with simultaneous commands, or duplicate conflicting commands, I'm not sure. Dave, does that make any sense at all?
|
bhlonewolf Senior Member
Joined: March 29 2007
Online Status: Offline Posts: 198
Posted: December 19 2007 at 10:41
Well, I think I'm generating more questions than answers. :)
I saw this line in my app corresponding with this log file:
12/19/2007 10:08:27.156 NAK from [CF.EC.A2]
10:08:24.390 sentinsteon=07 FA 17 CF EC A2 0F 19 00
10:08:24.953 eventraw=03
10:08:25.015 eventraw=03
10:08:25.468 eventraw=03
10:08:25.828 eventraw=03
10:08:25.890 eventraw=03
10:08:26.750 eventraw=03
10:08:26.796 eventraw=03
10:08:26.859 eventraw=03
10:08:27.031 eventraw=02
10:08:27.203 eventraw=05
10:08:27.421 receiveinsteonraw=02 09 4C A2 00 00 FF CF 13 00
10:08:27.546 eventraw=03
10:08:27.609 eventraw=08
10:08:27.671 receiveX10raw=00 6A
10:08:27.859 receiveX10=A4
10:08:27.921 eventraw=08
10:08:27.984 receiveX10raw=01 63
10:08:28.125 receiveX10=A Off
10:08:29.359 srq:SRIR=send failed
The weird part about this is that CF.EC.A2 isn't even a device on my network! Nor am I using X10, so not sure what that is about. It looks like it was sent from my PLC, so I'm honestly confused....
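One way to sanity-check where that address came from is to split the sentinsteon bytes into fields. The sketch below is Python and assumes the nine bytes follow the standard-length INSTEON layout (3-byte from address, 3-byte to address, flags, command 1, command 2); whether the SDM log line really uses that layout is my assumption, not something I've confirmed, and the function name is made up for illustration.

```python
def decode_standard_message(raw):
    """Split nine space-separated hex bytes into assumed INSTEON fields.

    Assumed layout: from (3 bytes), to (3 bytes), flags, cmd1, cmd2.
    """
    parts = [p.upper() for p in raw.split()]
    if len(parts) != 9:
        raise ValueError("expected 9 bytes, got %d" % len(parts))
    return {
        "from": ".".join(parts[0:3]),
        "to": ".".join(parts[3:6]),
        "flags": parts[6],
        "cmd1": parts[7],
        "cmd2": parts[8],
    }

msg = decode_standard_message("07 FA 17 CF EC A2 0F 19 00")
print(msg["from"], "->", msg["to"], "cmd1", msg["cmd1"])
# prints: 07.FA.17 -> CF.EC.A2 cmd1 19
```

If the layout assumption holds, the "to" field is the same CF.EC.A2 that shows up in the NAK, which would fit the idea that the request went out from the PLC to that address.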
|
bhlonewolf Senior Member
Joined: March 29 2007
Online Status: Offline Posts: 198
Posted: December 19 2007 at 18:56
And from this evening:
12/19/2007 16:58:43.031 NAK from [08.48.C5] [kpl]
12/19/2007 16:58:48.031 NAK from [08.48.C5]
12/19/2007 16:58:53.062 NAK from [08.48.C5]
12/19/2007 17:01:09.593 NAK from [0B.9D.2C] [ tl office ]
12/19/2007 17:04:21.062 NAK from [09.4C.A2] [ cl office ]
12/19/2007 17:04:26.140 NAK from [09.4C.A2]
12/19/2007 17:04:31.015 NAK from [09.4C.A2]
12/19/2007 17:05:01.156 NAK from [0A.90.BD] [ ll office light ]
12/19/2007 17:05:06.203 NAK from [0A.90.BD]
12/19/2007 17:05:11.171 NAK from [0A.90.BD]
(cl = ControLinc, ll = LampLinc, tl = ToggleLinc, kpl = KeypadLinc)
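A quick per-device tally of a log like the one above can be sketched in Python as follows. The regex and the `tally` helper are only an illustration that assumes the line format shown above, with the bracketed device note treated as an optional annotation.

```python
import re
from collections import Counter

LOG = """\
12/19/2007 16:58:43.031 NAK from [08.48.C5] [kpl]
12/19/2007 16:58:48.031 NAK from [08.48.C5]
12/19/2007 16:58:53.062 NAK from [08.48.C5]
12/19/2007 17:01:09.593 NAK from [0B.9D.2C] [ tl office ]
12/19/2007 17:04:21.062 NAK from [09.4C.A2] [ cl office ]
12/19/2007 17:04:26.140 NAK from [09.4C.A2]
12/19/2007 17:04:31.015 NAK from [09.4C.A2]
12/19/2007 17:05:01.156 NAK from [0A.90.BD] [ ll office light ]
12/19/2007 17:05:06.203 NAK from [0A.90.BD]
12/19/2007 17:05:11.171 NAK from [0A.90.BD]
"""

# Address in the first brackets, optional free-text label in the second.
LINE = re.compile(r"NAK from \[([0-9A-F.]+)\](?:\s*\[\s*(.*?)\s*\])?")

def tally(lines):
    """Count NAKs per address and remember any label seen for that address."""
    counts, labels = Counter(), {}
    for line in lines:
        m = LINE.search(line)
        if not m:
            continue
        addr, label = m.groups()
        counts[addr] += 1
        if label:
            labels[addr] = label
    return counts, labels

counts, labels = tally(LOG.splitlines())
for addr, n in counts.items():
    print(addr, labels.get(addr, "?"), n)
```

Three of the four devices show exactly 3 NAKs, i.e. one fully failed request each, while the ToggleLinc shows a single NAK that was presumably followed by a successful retry.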
While these aren't clustered completely back to back (I had about a 20 or 25 second scan rate) they are certainly close enough that it seems peculiar. Also of note is that each device listed above, _AND_ the PLC, are all on the same breaker.
What I don't quite get is why other devices, on different breakers, seemed to respond just fine to the status scan.
I had an appointment precisely at 5pm today in the house with a contractor, however, nothing out of the ordinary was running (no equipment or lights). I think the microwave was being used around then -- perhaps that's the culprit, but then, it wasn't used for 6 whole minutes.
Anyway, sorry for rambling on in this thread ... feel like I'm talking to myself here :) But I figure it may ultimately help myself and others out!