Logging to detect unhealthy agents on agent side

Elias · December 22, 2023, 5:24pm

We’ve found a number of agents that are in an unhealthy (not checking in) state. Some are just temporarily hung and need a service restart. Some require a reinstall because they were accidentally/aggressively cleaned out from ImmyBot. We’d typically account for such scenarios by scheduling a task on the agent side that checks for unhealthy conditions in the agent and responds accordingly. Unfortunately, according to our own research and responses from ImmyBot support, we can’t detect from the agent side whether the agent is healthy, because the log output of a “bad” agent is essentially identical to the log output of a “good” agent.

Given that we rely on ImmyBot to install/update software that is critical to our clients’ security, it’s highly important that we make sure ImmyBot is functioning at all times.

Thanks and happy holidays,
Elias

William_Swartz · January 16, 2024, 2:50pm

Being able to auto-heal immybot agents after detecting they’re unhealthy could be a really solid safety net to be sure that we’re not relying on an agent that isn’t actually able to run maintenance.

I’ve had success with restarting the ImmyBot service on a system in question through ScreenConnect, so having a way to leverage control to restart the service, or maybe another method to auto-heal this issue, could be a useful advantage.

Barry_Trotman · March 7, 2024, 3:37am

We are having the same issues currently with support; agents have been onboarded via our RMM, and we find machines in a disconnected state and have to rerun the onboarding script to fix them.
Not ideal, as we want to move away from our RMM but can’t if the ImmyBot agent/services have issues and the computer is “disconnected”.

Elias · April 17, 2024, 12:14pm

Still seeing issues with agents staying healthy. We could really use a reliable way to detect the health state of agents on the system/agent side so we can build some accurate repair scripts. We’ve requested help with this from support and don’t seem to be getting anywhere. On the last request, they just directed us to this community post.

Maybe it would be worth setting a flag ourselves that denotes healthy status (recent checkin) on a regular basis and then acting on the status of that flag.
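A rough sketch of that flag idea in Python, in case it helps the discussion. Everything here is an assumption rather than anything ImmyBot provides: the flag file name, the two age thresholds, and the idea that something proving a successful check-in (for example a recurring task delivered through ImmyBot itself) calls `write_heartbeat`, while a separate locally scheduled task calls `decide_action`. The actual restart or reinstall step is left out because it depends on your environment.

```python
import json
import time
from pathlib import Path

# Hypothetical values, not ImmyBot defaults: adjust to your environment.
FLAG_PATH = Path("immybot_heartbeat.json")      # assumed flag file location
MAX_AGE_SECONDS = 24 * 60 * 60                  # assumed: stale after 1 day
REINSTALL_AGE_SECONDS = 7 * 24 * 60 * 60        # assumed: escalate after 7 days


def write_heartbeat(path=FLAG_PATH, now=None):
    """Record a successful check-in. Meant to be run by something that only
    executes when the agent is actually reachable (an assumption)."""
    now = time.time() if now is None else now
    path.write_text(json.dumps({"last_checkin": now}))


def read_heartbeat(path=FLAG_PATH):
    """Return the last recorded check-in time, or None if the flag is
    missing or unreadable."""
    try:
        return float(json.loads(path.read_text())["last_checkin"])
    except (OSError, ValueError, KeyError, TypeError):
        return None


def decide_action(path=FLAG_PATH, now=None,
                  restart_after=MAX_AGE_SECONDS,
                  reinstall_after=REINSTALL_AGE_SECONDS):
    """Map the age of the flag to 'healthy', 'restart', 'reinstall',
    or 'unknown' (no usable flag yet)."""
    now = time.time() if now is None else now
    last = read_heartbeat(path)
    if last is None:
        return "unknown"
    age = now - last
    if age >= reinstall_after:
        return "reinstall"
    if age >= restart_after:
        return "restart"
    return "healthy"


if __name__ == "__main__":
    # The local repair task would branch on this value.
    print(decide_action())
```

The point of the design is that the flag is only refreshed by work that reaches the machine through the agent, so a stale flag is evidence of a missed check-in even when the agent's own logs look normal. Whether a stale flag should trigger a restart first and a reinstall later is a judgment call; the two-threshold escalation above is just one way to do it.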

Brandon_Garrett · July 31, 2024, 5:05pm

Also bumping this due to continued unhealthy agents. There is no simple way for us to track the difference between DattoRMM and ImmyBot. Many agents are missing or not checking in properly, which causes maintenance to be missed, and customers are catching this in the Microsoft Security console. This needs to be addressed urgently, as agent stability is paramount.