Hi all. May the Fourth be with you! :)
I've run into a problem with an installation of LibreNMS on a system at work. I've carefully followed the instructions here...
https://docs.librenms.org/Installation/Install-LibreNMS/
... for a Debian 10 system, with the occasional peeking at the Ubuntu 20 installation instructions (since I wanted to use an Apache server already on that host, rather than follow the nginx-based install, which was the only option they showed for Debian 10).
When I run the validate.php script from the command line, everything looks fine. When I run the validate under the web interface, it's all fine too, except after a day or more, when it complains under Updates that the install is over 24 hours out of date. The curious thing is I didn't change the default setting for updates, so it should be automatically updating. (That's only the first problem, though...)
When I add a device, it does the discovery, but the results seem incomplete. For example, I don't see any of the network ports, and most of the other things that should be auto-discovered aren't. I've tried this discovery on both a Cisco switch stack, and a Linux host, with the same lack of results. (That's problem #2.)
I've also manually added in a bunch of the Apps on one Linux host (an RHEL clone), to monitor particular services. I've done all the SNMP setup on that host to have the "extend" scripts run for the Apps I want, following these instructions...
https://docs.librenms.org/Extensions/Applications/
... and I've tested that with snmpbulkwalk, to make sure they're producing output. (In some cases, that took some digging through the scripts, and making sure paths were correct and additional software was set up, since the instructions above weren't always complete.)
However, when I try to view the pages for those Apps in LibreNMS, I get either broken image icons, or a place-holder image that says "Error Drawing Graph". (This is problem #3.)
I've searched online for this last problem, but the proposed solutions I've found are mostly red herrings, or not applicable in my case. I haven't found anything useful.
Anyone else here played around with LibreNMS? (I know Adam has...)
If so, any ideas what might be causing any of the problems I've mentioned above? Any tips for better setup documentation to follow than the "official" ones I have? (There were details missing about particular Apps, so maybe the Install instructions also are missing something?)
BTW, I have Netdisco running on the same Debian host as LibreNMS, and it has no trouble getting information from the other devices via SNMP. (I've also got MRTG on another host, and it also has been working fine for years, querying the same devices I'm trying under LibreNMS.)
Thanks, Gilbert
If you suspect php problems, can you seek out and post any errors php is giving you?
php-fpm does logging a bit differently and you should have a /var/log/php-fpm dir. When your problem hits, check for recently updated files in that dir, and tail, and post. If you're lucky, it'll have something there.
If you don't see something there, you can try turning up the logging in php.ini.
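The log check suggested above can be scripted. A minimal Python sketch, assuming the logs are plain text files sitting directly in one directory; `recent_logs` and `tail` are illustrative helper names, not part of PHP or LibreNMS:

```python
from pathlib import Path


def recent_logs(log_dir, limit=3):
    """Return up to `limit` regular files in log_dir, newest first by mtime."""
    files = [p for p in Path(log_dir).iterdir() if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:limit]


def tail(path, lines=20):
    """Return the last `lines` lines of a text file (lines must be positive)."""
    with open(path, "r", errors="replace") as fh:
        return fh.read().splitlines()[-lines:]
```

For example, `recent_logs("/var/log/php-fpm")` gives the files to look at first, and `tail(path)` returns the lines worth posting. The directory may differ between distributions, and reading it may need root.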
php 7.4 does fatal a couple of previously-deprecateds. If l-NMS isn't updated for those yet (i.e. expects 7.3), they will blow up scripts. And if you want it to work, you'll have to tweak those bits of code yourself.
php isn't quite Python insanity for new versions, but it isn't as easy as (almost never break-backwards) perl.
OK, I'm a bit sheepish to admit what the problem appears to have been: wrong ownership on the crontab file. I've made it owned by root now, and this seems to have fixed my issues.
Thanks to Alberto for getting me to re-examine my cron file, which got me thinking about permissions and ownership of the file, after seeing no problem with the content itself.
Gilbert
On 2021-05-11 8:40 p.m., Trevor Cordes wrote:
If you suspect php problems, can you seek out and post any errors php is giving you?
php-fpm does logging a bit differently and you should have a /var/log/php-fpm dir. When your problem hits, check for recently updated files in that dir, and tail, and post. If you're lucky, it'll have something there.
If you don't see something there, you can try turning up the logging in php.ini.
php 7.4 does fatal a couple of previously-deprecateds. If l-NMS isn't updated for those yet (i.e. expects 7.3), they will blow up scripts. And if you want it to work, you'll have to tweak those bits of code yourself.
php isn't quite Python insanity for new versions, but it isn't as easy as (almost never break-backwards) perl.
On 2021-05-11 Gilbert E. Detillieux wrote:
OK, I'm a bit sheepish to admit what the problem appears to have been: wrong ownership on the crontab file. I've made it owned by root now, and this seems to have fixed my issues.
That would explain the updating issue, but the other weird things you were seeing would not be related. (?)
On 2021-05-11 10:30 p.m., Trevor Cordes wrote:
That would explain the updating issue, but the other weird things you were seeing would not be related. (?)
Yeah, it does, 'cause almost all of the back-end processing is initiated through cron. So, I wasn't getting any data collection, hence no graphs either.
On 2021-05-11 10:27 p.m., Gilbert E. Detillieux wrote:
OK, I'm a bit sheepish to admit what the problem appears to have been: wrong ownership on the crontab file. I've made it owned by root now, and this seems to have fixed my issues.
"he who is without sin among you, let him be the first to throw a stone..."
If anyone here thinks Gilbert should be ashamed, show your face and we'll be sure to point and laugh when your turn comes.
Because *it will*.
Thanks to Alberto for getting me to re-examine my cron file, which got me thinking about permissions and ownership of the file, after seeing no problem with the content itself.
I did not do a thing!
I was reading an anecdote about how some coders, when baffled that something does not work as it should, ask another coder to listen while they explain what they are trying to do.
Most of the time, during the explanation, the baffled coder ends up finding something he/she missed on his/her own, usually trivial, but perfectly OK to skip, considering all that is going on while they're "in the zone".
You fixed your own problem. I was just trying to recap what threw me for a loop when I was setting up LibreNMS. In my case - if I recall correctly - I skipped the cron file entirely. But that was a while ago, and it was past 10PM after a looong day (yep, that I do remember... sigh), so... yeah.
I am glad it helped, but all I really did was take you out of "the zone" for a little while.
For the record, without the cron file with all the housekeeping duties, LibreNMS will cough up a hairball and not give you a clue of what's really going on.
Thus, make sure you have your /etc/cron.d/librenms file properly set up, both content (hi!) and permissions/ownership (hi Gilbert!).
:)
Kind regards, Alberto Abrao
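The ownership and permission side of that advice can be checked mechanically. A minimal Python sketch, assuming the rule described in Debian's cron(8) (files in /etc/cron.d must be owned by root and must not be writable by group or others); other cron implementations may differ, and `cron_file_problems` is a hypothetical helper name:

```python
import os
import stat


def cron_file_problems(path):
    """Return a list of reasons cron might ignore `path` (empty if none found)."""
    st = os.stat(path)
    problems = []
    if not stat.S_ISREG(st.st_mode):
        problems.append("not a regular file")
    if st.st_uid != 0:
        # Assumption: cron only trusts root-owned files in /etc/cron.d.
        problems.append(f"owner uid is {st.st_uid}, expected 0 (root)")
    if st.st_mode & (stat.S_IWGRP | stat.S_IWOTH):
        problems.append("file is writable by group or others")
    return problems
```

Calling `cron_file_problems("/etc/cron.d/librenms")` returns an empty list when neither condition trips. It says nothing about whether the entries themselves are correct.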
Rubber Ducky Debugging. Because you may as well be explaining the problem to your rubber duck bath toy, for all the actual assistance you actually needed from the other person. -Adam
On May 12, 2021 7:14:20 a.m. CDT, Alberto Abrao alberto@abrao.net wrote:
On 2021-05-11 10:27 p.m., Gilbert E. Detillieux wrote:
OK, I'm a bit sheepish to admit what the problem appears to have been: wrong ownership on the crontab file. I've made it owned by root now, and this seems to have fixed my issues.
"he who is without sin among you, let him be the first to throw a stone..."
If anyone here thinks Gilbert should be ashamed, show your face and we'll be sure to point and laugh when your turn comes.
Because *it will*.
Thanks to Alberto for getting me to re-examine my cron file, which got me thinking about permissions and ownership of the file, after seeing no problem with the content itself.
I did not do a thing!
I was reading an anecdote about how some coders, when baffled that something does not work as it should, ask another coder to listen while they explain what they are trying to do.
Most of the time, during the explanation, the baffled coder ends up finding something he/she missed on his/her own, usually trivial, but perfectly OK to skip, considering all that is going on while they're "in the zone".
You fixed your own problem. I was just trying to recap what threw me for a loop when I was setting up LibreNMS. In my case - if I recall correctly - I skipped the cron file entirely. But that was a while ago, and it was past 10PM after a looong day (yep, that I do remember... sigh), so... yeah.
I am glad it helped, but all I really did was take you out of "the zone" for a little while.
For the record, without the cron file with all the housekeeping duties, LibreNMS will cough up a hairball and not give you a clue of what's really going on.
Thus, make sure you have your /etc/cron.d/librenms file properly set up, both content (hi!) and permissions/ownership (hi Gilbert!).
:)
Kind regards, Alberto Abrao
Roundtable mailing list Roundtable@muug.ca https://muug.ca/mailman/listinfo/roundtable
On 2021-05-12 9:27 a.m., Adam Thompson wrote:
Rubber Ducky Debugging. Because you may as well be explaining the problem to your rubber duck bath toy, for all the actual assistance you actually needed from the other person.
I thought I had another great original quote from Adam, but I found out this is an actual thing!...
https://en.wikipedia.org/wiki/Rubber_duck_debugging
But in fairness to Alberto, he was more than a passive rubber duck in this case. He pointed me back in the direction of cron, when I had already dismissed it out of hand as the problem. By doing so, he encouraged me to look more carefully at the cron entries themselves (which hadn't changed), allowing me to notice the explicit username in the cron entries, which made me think "this thing needs to run as root to begin with" (so it can setuid), and "will cron reject a non-root-owned file as a security risk?" And bingo!
So, yeah, talking things through (even with a rubber ducky, or your favourite stuffie) definitely helps with the thinking process, but a persistent line of questioning back is even more helpful.
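For reference, entries in /etc/cron.d carry a user field between the schedule and the command, which is the explicit username mentioned above. A rough Python sketch that pulls it out, assuming the common layout of five time fields (or a single @keyword), then the user, then the command; `cron_d_user` is a hypothetical helper, and it only recognises environment lines written without spaces around the equals sign:

```python
def cron_d_user(line):
    """Return the user field of a system crontab entry, or None.

    Assumes the /etc/cron.d layout: five time fields (or one @keyword),
    then a user name, then the command.
    """
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None  # blank line or comment
    fields = stripped.split()
    if "=" in fields[0]:
        return None  # environment assignment such as PATH=/usr/bin
    # "@daily user cmd" has one schedule field; the usual form has five.
    user_index = 1 if fields[0].startswith("@") else 5
    if len(fields) < user_index + 2:
        return None  # no room for both a user and a command
    return fields[user_index]
```

With an illustrative entry (not copied from the LibreNMS cron file) such as `*/5 * * * * librenms /opt/librenms/poller-wrapper.py 16`, the function returns `librenms`.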
Two take-homes for me:
1. Don't quickly dismiss things as irrelevant or too obvious. Look at it from different angles. (The answer may not be blowing in the wind, but stuck to the bottom of your shoe!)
2. When deviating from exact instructions (because I think I know better), be careful of subtle side effects. (Using "cp -p" is usually preferable to just "cp", but not always!)
Gilbert
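One reading of the second take-home: `cp -p` preserves the source file's ownership, whereas a plain `cp` run as root leaves the copy owned by root. A way to sidestep the question is to set the metadata explicitly after copying. A minimal Python sketch of that idea, with `install_root_owned` as a hypothetical helper and mode 0644 as an assumed default:

```python
import os
import shutil


def install_root_owned(src, dst, mode=0o644):
    """Copy src to dst, then set mode and (when running as root) root ownership."""
    shutil.copyfile(src, dst)   # contents only; no metadata is carried over
    os.chmod(dst, mode)
    if os.geteuid() == 0:
        os.chown(dst, 0, 0)     # root:root, regardless of who owns src
    return os.stat(dst)
```

Run as root, this leaves the destination owned by root whatever the source's owner was. Run as another user, it skips the ownership step rather than failing.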
On 2021-05-12 Gilbert E. Detillieux wrote:
- When deviating from exact instructions (because I think I know better), be careful of subtle side effects. (Using "cp -p" is usually preferable to just "cp", but not always!)
I reflexively use cp -a for *everything* now. I might as well make it an alias... -a is -p on steroids.
On 2021-05-12 Adam Thompson wrote:
Rubber Ducky Debugging. Because you may as well be explaining the problem to your rubber duck bath toy, for all the actual assistance you actually needed from the other person. -Adam
Yes, Adam & I know this all too well. The best thing to do when you have a problem is start typing it up in an email to a buddy and 3/4ths of the way through typing you realize what the solution is.
Doesn't work *all* the time... but more than enough to surprise you.
Glad you solved it Gilbert. Also, I would call that a bug, assuming the default install didn't set the cron file properly (i.e. not all just Gilbert's fault). The user-facing stuff should give an indication of a lack of the files/data rather than just leave you mystified.
You won't be the only one faced with this problem...