
Custom heartbeat rules for problematic devices



I am monitoring two Nest thermostats and getting a lot of device down/device up alerts. The Nest devices seem to "bounce" quite a bit while all the other devices are stable, so there is no obvious network configuration or other problem.


I would like to keep monitoring the Nest devices: if they actually go down for more than a few minutes, I want to be aware of it.


I don't want to design the solution, but I would like some way to configure more relaxed "heartbeat lost" time intervals for problematic devices like this, or perhaps to define how many minutes a device must be "down" before a "device down" alert is reported. Essentially, I would like to keep the Nest thermostats listed as "important" devices, but not receive up or down alerts for relatively transient behavior (e.g. if the thermostat went down for 5 minutes or more, I would consider that an alertable event).


On a related note, I have a few specialized IoT devices with embedded processors that have very low computing power. Their TCP/IP stack is not very tolerant of multiple pings, so I know from prior testing that pings will often fail. I would like to monitor these devices as important but, again, not receive a stream of up/down alerts unless the device is truly down, as defined by a long period of time (5 or 10 minutes) rather than the normal Domotz heartbeat detection limits. For these devices, the interval between heartbeat checks should ideally be longer than 30 seconds, as they will consistently miss heartbeats if the default interval is used.


Good news: we have a customizable "period" for considering a device down in the works. We are not sure yet when it will be available, but it should be before the end of the year.


With regard to heartbeat lost: each heartbeat is actually a train of 5 consecutive pings (one per second). So for the IoT devices you mentioned as well, simply lengthening the period after which they are considered down should do the job.
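To illustrate the idea, here is a minimal sketch in Python of how a heartbeat made of 5 pings could be combined with a per-device "down" period. This is not Domotz's actual implementation. The names `heartbeat_ok`, `DownDetector`, and `down_after_s` are hypothetical, and the rule that a single answered ping is enough for a heartbeat to count as received is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

PINGS_PER_HEARTBEAT = 5  # from the post above: a train of 5 pings, one per second


def heartbeat_ok(ping_results: Sequence[bool]) -> bool:
    """Assumption: a heartbeat counts as received if any ping in the train is answered."""
    return any(ping_results)


@dataclass
class DownDetector:
    """Reports a device as down only after heartbeats have been missed for a set period."""

    down_after_s: float  # hypothetical per-device setting, e.g. 300 for 5 minutes
    first_miss_at: Optional[float] = None
    reported_down: bool = False

    def observe(self, now_s: float, ok: bool) -> Optional[str]:
        """Feed one heartbeat result; return "down", "up", or None when no alert is due."""
        if ok:
            # Any received heartbeat resets the outage timer.
            self.first_miss_at = None
            if self.reported_down:
                self.reported_down = False
                return "up"
            return None
        if self.first_miss_at is None:
            self.first_miss_at = now_s
        outage_s = now_s - self.first_miss_at
        if not self.reported_down and outage_s >= self.down_after_s:
            self.reported_down = True
            return "down"
        return None


# A device that misses two heartbeats and then recovers raises no alert.
detector = DownDetector(down_after_s=300)
for t, ok in [(0, True), (30, False), (60, False), (90, True)]:
    print(t, detector.observe(t, ok))
```

With a 5-minute period, short "bounces" like the ones described for the Nest thermostats never reach the alert threshold, while an outage that lasts the full period produces one "down" alert followed by one "up" alert on recovery.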


  • 7 months later...

Dear Imanuk,


I confirm that the customizable period has been available for a few months now.


You can configure the status check to any range from a few minutes up to 24 hours. See attachment.




Screen Shot 2017-05-18 at 12.13.37 PM.png


Brilliant :-D  

I had tried clicking the status cog in Chrome to no avail; having now tried this in the app and via Internet Explorer, it works. So perhaps it is a Chrome issue. If you want to look at Chrome compatibility, I'm running version 57.0.2987.133 on Windows 10.


Also, it would be great if this could be changed for several devices in one go rather than individually.


