Simpler alternative to prometheus-alertmanager and/or Grafana?

@talkingpumpkin · 5 months ago

Simpler alternative to prometheus-alertmanager and/or Grafana?

@[email protected] · 5 months ago

Have you played around with Grafana? It really is quite simple if you already have Prometheus working.

For a home lab environment you don't even need to use prometheus-alertmanager. Grafana can handle alerts as well.

Grafana also has hundreds of pre-made dashboards you can import. Node monitoring is quite straightforward.

Assuming you have Prometheus good to go, all you need to do is go to Grafana - Datasources, create a new datasource, and point it at your Prometheus instance.

Then you can import the dashboards you want.

Now you can set up your alerts. You can use SMTP, Telegram, or Slack, among others, for your notifications.
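If you would rather script the datasource step than click through the UI, here is a minimal sketch against Grafana's HTTP API (`POST /api/datasources`). The host names, the datasource name, and the token are placeholders, not values from this thread, and the sketch assumes you have created a service account token with permission to manage datasources.

```python
import json
import urllib.request


def build_datasource_payload(name: str, prometheus_url: str) -> dict:
    """Build the JSON body for Grafana's POST /api/datasources endpoint."""
    return {
        "name": name,
        "type": "prometheus",
        "url": prometheus_url,
        "access": "proxy",  # Grafana's backend queries Prometheus, not the browser
        "isDefault": True,
    }


def create_datasource(grafana_url: str, api_token: str, payload: dict) -> int:
    """POST the payload to Grafana and return the HTTP status code."""
    request = urllib.request.Request(
        f"{grafana_url.rstrip('/')}/api/datasources",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_token}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


# Hypothetical usage; replace the hosts and token with your own:
# payload = build_datasource_payload("Prometheus", "http://localhost:9090")
# create_datasource("http://localhost:3000", "YOUR_TOKEN", payload)
```

Importing dashboards and setting up alert contact points can stay in the UI; this only covers the "point Grafana at Prometheus" step.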

@[email protected] · 5 months ago

The easiest solution I found and use is Beszel.

https://github.com/henrygd/beszel

Just a hub with the most important stats and some simple agents on the servers.

@pageflight · 5 months ago

Is there a self hosted OpenTelemetry consumer?

TheHolm · 5 months ago

Icinga/Nagios? You can even feed data already collected by Prometheus to it if you want.
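One way to do that feeding, as a rough sketch: a small check plugin that runs a PromQL instant query against Prometheus's HTTP API (`/api/v1/query`) and maps the result onto the usual Nagios/Icinga exit codes. The Prometheus URL, the `node_load1` query, and the thresholds are assumptions chosen for illustration.

```python
import json
import urllib.parse
import urllib.request

# Standard Nagios/Icinga plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def first_value(response: dict) -> float:
    """Extract the first sample from a Prometheus instant-query response."""
    results = response["data"]["result"]
    if not results:
        raise ValueError("query returned no series")
    # Each vector sample is [unix_timestamp, "value-as-string"].
    return float(results[0]["value"][1])


def classify(value: float, warn: float, crit: float) -> int:
    """Map a value onto a plugin status; higher values are worse."""
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def query_prometheus(base_url: str, promql: str) -> dict:
    """Run an instant query and return the decoded JSON response."""
    params = urllib.parse.urlencode({"query": promql})
    url = f"{base_url.rstrip('/')}/api/v1/query?{params}"
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def main() -> int:
    # Assumed URL, metric, and thresholds; adjust for your setup.
    try:
        value = first_value(query_prometheus("http://localhost:9090", "node_load1"))
    except Exception as exc:
        print(f"UNKNOWN - {exc}")
        return UNKNOWN
    status = classify(value, warn=4.0, crit=8.0)
    label = ["OK", "WARNING", "CRITICAL"][status]
    print(f"{label} - load1 is {value:.2f} | load1={value:.2f}")
    return status


# When saved as a plugin script, finish with: raise SystemExit(main())
```

The part after the `|` in the output line is Nagios-style performance data, so the value can still be graphed on the Nagios/Icinga side.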

SuperiorOne · 5 months ago

I’m currently using the InfluxDB + Telegraf + Grafana combination to monitor Linux systems and k3s pods. It’s basically the same as Prometheus, but InfluxDB uses a push model, which makes it easier to develop tools for collecting custom time series data.

For alerts and dashboards, I think Grafana is the simplest and most hassle-free solution available at the moment.
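To illustrate why the push model is convenient for custom data: a tool only has to format a point in InfluxDB's line protocol and send it. Below is a minimal sketch of the formatting half; the measurement, tag, and field names in the usage note are made up, and the escaping covers only the common cases.

```python
import time


def _escape(text: str, chars: str) -> str:
    """Backslash-escape each of the given special characters."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def _format_field(value) -> str:
    """Render a field value using line-protocol type conventions."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # trailing "i" marks an integer field
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_line_protocol(measurement, tags, fields, timestamp_ns=None) -> str:
    """Format one point as: measurement,tag=v field=v timestamp."""
    if not fields:
        raise ValueError("at least one field is required")
    if timestamp_ns is None:
        timestamp_ns = time.time_ns()
    head = _escape(measurement, ", ")
    for key in sorted(tags):
        head += f",{_escape(key, ',= ')}={_escape(str(tags[key]), ',= ')}"
    body = ",".join(
        f"{_escape(key, ',= ')}={_format_field(value)}" for key, value in fields.items()
    )
    return f"{head} {body} {timestamp_ns}"
```

For example, `to_line_protocol("disk", {"host": "nas"}, {"used_percent": 71.5}, 1700000000000000000)` gives `disk,host=nas used_percent=71.5 1700000000000000000`. The resulting line can then be POSTed to InfluxDB's write endpoint or handed to Telegraf.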

@[email protected] · 5 months ago

SNMP does what you want. You just need a good monitoring solution that’s not as involved as Prometheus+Grafana (I feel you, I’ve been there).

I really enjoy PRTG, but it’s way too expensive for a home lab. Still throwing it out there if you feel like you have money to burn.

I hear good things about LibreNMS; it’s next on my list when my PRTG licence runs out.

Be warned that monitoring is ultimately a fickle thing: whatever you don’t write in YAML config for Grafana, you get to dig out of obscure SNMP libs for other tools (though I find that easier, ymmv).

I recommend against: Nagios (I like it, but if you hate Prometheus it’s definitely not for you), Checkmk (throw Checkmk into the sun please, it just fucking sucks), Cacti (NO!), SolarWinds (why?)

If you feel like you want to become a datacenter admin: Zabbix scales very, very well, both in performance and in ease of admin across hundreds of servers, but it’s overkill for a home lab, and it can get you lost in configs for hours.

@horse_battery_staple · edit-2 5 months ago

Edit: found better resources

https://linuxhandbook.com/syslog-guide/

https://github.com/linuxserver/docker-syslog-ng

That should be a good place to start. Syslog will do what you want.

@[email protected] · 5 months ago

Syslog is considerable overkill for home lab monitoring.

@horse_battery_staple · edit-2 5 months ago

It’s as complex as you make it, is native to Linux, is scriptable, doesn’t use YAML, and is free as in beer, just like SNMP. However, they’ll also get logs at a central server they can drill into if needed.

Which I believe fulfills the requirements of OP’s post.

Sidenote: self-hosting is absolutely overkill just as a theory and process. I often read responses saying that this or that suggestion is overkill, complicated, or a non-trivial effort.

The self-hosting community is a broad spectrum of users, from those with home labs on an old dying laptop to those with a full rack setup. People have different needs and interests. Some are learning infra and devops for work or to get into a new job. Some are privacy minded. Some are trying to get the most bang for their buck. Some just want to pay for a cloud hosted solution. Some just want an automated home. Some run a home business.

Edit: to the point of your valid and helpful SNMP post, most syslog servers will also ingest and report on SNMP traffic. The container I linked does exactly that. If they find they want to automate processes in the future, they can also trigger on the syslog stream. But that complexity is only there if they want it. Otherwise it’s just a stream they can parse to trigger an alert, just like SNMP. So OP could have an extensible solution if they want to expand. Also, Grafana/Prometheus will take in syslog natively with a couple of standard YAML configs if they decide to look at that solution again in the future.

/Rant
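To make the "stream they can parse and trigger an alert" part concrete, here is a minimal sketch that reads the `<PRI>` header of a raw syslog line (priority = facility * 8 + severity) and flags anything at severity `err` or worse. The threshold and the function names are my own choices for illustration; what happens after a line is flagged is left open.

```python
import re

# Severity names in numeric order; lower numbers are more severe.
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]
PRI_PATTERN = re.compile(r"^<(\d{1,3})>(.*)$", re.DOTALL)


def parse_priority(line: str):
    """Split a raw syslog line into (facility, severity, rest_of_line)."""
    match = PRI_PATTERN.match(line)
    if match is None:
        raise ValueError("line has no <PRI> header")
    facility, severity = divmod(int(match.group(1)), 8)
    return facility, severity, match.group(2)


def should_alert(line: str, max_severity: int = 3) -> bool:
    """True for err (3) and anything more severe."""
    _, severity, _ = parse_priority(line)
    return severity <= max_severity
```

A line such as `<34>Oct 11 22:14:15 mymachine su: 'su root' failed` decodes to facility 4 and severity 2 (`crit`), so it would be flagged, while a `<30>` line (daemon, `info`) would not.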

@linearchaos · 5 months ago

I mean, you get a lot of advantages from fluffy pretty systems. But extracting data from df and systemctl and curling it into telegram is going to be like a 10 line bash script called from a one-line cron job.
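A rough sketch of that idea, written in Python rather than bash: check disk usage and a few systemd units, and send a message through the Telegram Bot API's `sendMessage` method if anything looks wrong. The 90% limit, the unit names, the token, and the chat id are placeholders.

```python
import shutil
import subprocess
import urllib.parse
import urllib.request


def disk_used_percent(path: str = "/") -> float:
    """Percentage of the filesystem in use, roughly what df reports."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100


def unit_is_active(unit: str) -> bool:
    """`systemctl is-active --quiet` exits 0 only when the unit is active."""
    result = subprocess.run(["systemctl", "is-active", "--quiet", unit], check=False)
    return result.returncode == 0


def build_report(disk_percent: float, failed_units: list, disk_limit: float = 90.0) -> str:
    """Return a one-line summary of problems, or "" when all is well."""
    problems = []
    if disk_percent >= disk_limit:
        problems.append(f"disk at {disk_percent:.0f}%")
    problems.extend(f"{unit} is not active" for unit in failed_units)
    return "; ".join(problems)


def send_telegram(token: str, chat_id: str, text: str) -> None:
    """Send a message via the Telegram Bot API sendMessage method."""
    data = urllib.parse.urlencode({"chat_id": chat_id, "text": text}).encode()
    url = f"https://api.telegram.org/bot{token}/sendMessage"
    with urllib.request.urlopen(url, data=data, timeout=10):
        pass


def check_and_notify(token: str, chat_id: str, units: list) -> None:
    """Run all checks and notify only when something is wrong."""
    failed = [unit for unit in units if not unit_is_active(unit)]
    report = build_report(disk_used_percent("/"), failed)
    if report:
        send_telegram(token, chat_id, report)


# Hypothetical call, e.g. from a script run by cron:
# check_and_notify("BOT_TOKEN", "CHAT_ID", ["nginx.service", "docker.service"])
```

You get no graphs or history this way, but for "tell me when something is broken" it covers a lot.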

I pump a lot of complicated metrics through Prometheus/Grafana to get graphs and history.

Most of my critical stuff is still in Nagios, and instead of using Nagios’s standardized plugins I just query the operating system directly in bash.

@static09 · 5 months ago

Zabbix
https://www.zabbix.com/

Netdata
https://www.netdata.cloud/

Nagios
https://www.nagios.org/