I've started to look for a monitoring infrastructure for all IT things, because:
- I'm self-hosting this blog and want to know when it fails
- I have home automation (KNX + NodeRed) and I also want to know if there are issues
- I have periodic back-ups set up
- I have a host of docker-based services (e.g. SonarQube, paperless ngx, outline)
- I want to set up a Raspberry Pi cluster with RPI4s and the Compute blade
This gives me a bunch of requirements:
- Cross-architecture
- Multi-site
- Multi-framework (bare metal, VM, Docker, Kubernetes)
- Capable of sending data to grafana (via prometheus, influxdb)
- Capable of sending notifications (e.g. via callback, api...)
- Pretty graphs (optional because of grafana)
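To make the notification requirement a bit more concrete, here is a rough sketch of what "probe a service, then call back" could look like using only the Python standard library. None of the candidates below necessarily work this way; the payload shape, the function names and the webhook URL are all made up for illustration.

```python
import json
import urllib.error
import urllib.request


def check_http(url, timeout=5.0):
    """Probe a URL once with HTTP GET and return (ok, detail)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400, f"HTTP {resp.status}"
    except (urllib.error.URLError, OSError) as exc:
        # HTTPError (4xx/5xx) is a URLError subclass, so it lands here too.
        return False, str(exc)


def build_alert(service, ok, detail):
    """Build a JSON-serialisable payload (hypothetical schema)."""
    return {"service": service, "status": "up" if ok else "down", "detail": detail}


def notify(webhook_url, payload, timeout=5.0):
    """POST the payload as JSON to a callback URL and return the HTTP status."""
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status
```

Usage would be something like `ok, detail = check_http("https://blog.example")` followed by `notify("https://hooks.example/alert", build_alert("blog", ok, detail))` when `ok` is false (both URLs are placeholders). A real monitoring tool adds the parts that matter most: scheduling, retries and alert de-duplication.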
So, I have a list of initial candidates:
- NetData - good old infrastructure monitoring framework
  "Netdata is high-fidelity infrastructure monitoring and troubleshooting. Open-source, free, preconfigured, opinionated, and always real-time."
- CheckMK - monitors everything via agents
  "Quickly gain a complete view of your IT infrastructure, no matter how complex."
- Zabbix - another monitoring tool
  "Get a single pane of glass view of your whole IT infrastructure stack"
There are also other tools that would be worth investigating:
- Prometheus + grafana - I already have grafana (+influxdb right now) and it's great. We'd need something to feed data to prometheus though.
- Glances - It's a python-based framework to get OS-level metrics. It looks like I'd need to complement it with some docker/kubernetes-level info.
- Nagios - infrastructure monitoring. A long time ago, when I did network monitoring, this was the go-to tool.
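On the "something to feed data to prometheus" point: Prometheus pulls (scrapes) metrics over HTTP from a `/metrics` endpoint in a simple text format. In practice I'd expect to use an existing exporter such as node_exporter or the official `prometheus_client` Python library, but a minimal hand-rolled sketch shows how little is involved. The metric name, the port and the helper names here are assumptions picked for illustration, and `os.getloadavg()` is Unix-only.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_gauge(name, help_text, value, labels=None):
    """Render one gauge sample in the Prometheus text exposition format."""
    label_str = ""
    if labels:
        def esc(v):
            # Label values must escape backslash, double quote and newline.
            return str(v).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        pairs = ",".join(f'{k}="{esc(v)}"' for k, v in sorted(labels.items()))
        label_str = "{" + pairs + "}"
    return (
        f"# HELP {name} {help_text}\n"
        f"# TYPE {name} gauge\n"
        f"{name}{label_str} {value}\n"
    )


def collect():
    """Gather current metrics; here just the 1-minute load average."""
    load1 = os.getloadavg()[0]
    return render_gauge("demo_load1", "1-minute load average (demo metric)", load1)


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = collect().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=9200):
    """Serve /metrics until interrupted (the port is an arbitrary choice)."""
    HTTPServer(("0.0.0.0", port), MetricsHandler).serve_forever()
```

Calling `serve()` and pointing a Prometheus scrape job at `host:9200` would be enough to get the value into grafana. The same `render_gauge` helper could expose something like the timestamp of the last successful back-up.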
Now, let the games begin!