I guess anything is possible. However, I can't really think of any
reason why a direct MIS POLL 21:1/100 process would lock up,
unless the connection went bad and hung, or the connection's handshake didn't finish properly.
Here's a recent poll I can see. As far as I can tell there is no issue.
If you send me a date/time when your logs show a problem I'm happy to track it back and look at my logs if you like. :)
Correct. It hasn't happened since upgrading to the April release so far.
This one is going to be hard to pinpoint, methinks.
Seems to happen about once a week, and I only poll Paul's system (21:1/100) with Mystic. 21:1/100 appears to be running binkd-111, but I can't find anything in my logs explaining why the process sticks. The logs say the poll ended, yet the process remains until I manually kill it. While the process is locked, no further polls occur, and as soon as I kill it, polling continues normally.
Anyone else have issues on Linux where MIS POLL gets stuck and you have to manually kill the process for polling to continue?
That is strange Nick. Would it help if I post a copy of the BinkD logs here?
I used to have those problems: mis eating all the CPU, and I had to kill it, usually once a week or so. I switched to running mis through:
timeout -k 600 --preserve-status -v 600 ./mis poll forced
But, I recently switched to polling through binkd so I'm immune now. :) But it worked fine for at least a year or so.
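For anyone wanting to try the same workaround, here is a minimal sketch of what that `timeout` wrapper does. This assumes GNU coreutils `timeout`; the `./mis` path is site-specific, so the demo below uses a harmless stand-in command instead of actually invoking mis:

```shell
#!/bin/sh
# In production the wrapper would be (path is site-specific):
#   timeout -k 600 --preserve-status -v 600 ./mis poll forced
#     600                after 600s, send SIGTERM
#     -k 600             if still alive 600s after SIGTERM, send SIGKILL
#     -v                 log a message when a signal is sent
#     --preserve-status  exit with the child's status, not timeout's own

# Demonstrate the semantics with a stand-in long-running command.
# Without --preserve-status, timeout exits 124 when it kills the child:
timeout 1 sleep 3
echo "plain exit: $?"       # 124

# With --preserve-status, the exit code reflects how the child died
# (128 + SIGTERM = 143):
timeout --preserve-status 1 sleep 3
echo "preserved exit: $?"   # 143
```

The upshot for a stuck poll: after at most 20 minutes (TERM window plus KILL window) the hung process is gone and the next scheduled poll can run, without anyone logging in to kill it by hand.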
Well, that doesn't fix anything except the fact you don't have to deal with it any more. ;)
Only if you can produce the exact log snippet from where the process locked.
Maybe I'll be able to get a timeframe when it happens next time. So at least you have a starting point as to where to look.
> timeout -k 600 --preserve-status -v 600 ./mis poll forced

I may have to give that a shot, although I would like to fix the underlying issue, especially if it's something Mystic is doing, as none of my other systems are experiencing this.
I'm wondering if it could have something to do with time adjustment
(clock going backwards? or does it just slow down to adjust backwards?), as I believe strace output suggested that it was calling gettimeofday() repeatedly and waiting for some kind of timer...
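If it happens again, one low-effort way to capture evidence before killing the process is to attach strace to it. This is a hypothetical session, not something from the thread; the process-name pattern and PID are placeholders:

```shell
#!/bin/sh
# Find the stuck process; pgrep -af prints the PID plus full command line.
pgrep -af 'mis poll' || true

# Attach to it (replace 12345 with the real PID):
#   -p PID  attach to an already-running process
#   -tt     microsecond timestamps, useful for spotting busy loops
#   -f      follow child processes/threads too
# strace -p 12345 -tt -f -o /tmp/mis-stuck.log
#
# A log full of back-to-back gettimeofday()/clock_gettime() calls with no
# read()/select()/poll() on the socket would point at a timer busy-loop
# (consistent with the clock-adjustment theory) rather than a hung
# network connection, which would instead show a blocked read.
```

A few minutes of that log alongside the MIS log timestamps would give Nick a concrete starting point.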