server monitoring script


kalidust
06-11-2005, 01:06 PM
Is anyone using a server monitoring script(s) and if so, which one?

Thanks :)

Chris
06-11-2005, 08:43 PM
I utilize the free rushland.net script to show service status on my site for my customers. Works pretty well and the price is right.

Robert
06-12-2005, 12:42 AM
Using a script that runs inside your VPS is not always a good idea. cPanel, for example, already comes with such a script (chkservd). However, if your VPS hits its memory limit, the script itself may fail and stop running, just as other services such as named (BIND) can fail.

We typically install SIM (http://www.rfxnetworks.com/sim.php). It checks for failed services and restarts any that it finds down. (Some cPanel systems don't have it included by default, as cPanel has its own monitoring service, as mentioned earlier.) However, if the server hits its memory limit, SIM can fail along with, say, Apache or named.

You would need a script that monitors from outside your VPS and, if it detects a failed service, connects to your server to TRY to restart it. (Though if the resource limits are still being hit, the service will not restart. It will just get an error message like "Not enough memory" or something.)
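As a rough illustration of the outside-in idea, here is a minimal Python sketch meant to run on a different machine than the VPS. It only tests whether a TCP port accepts connections. The function names and the `on_failure` hook are made up for this example, and what the hook does (send an alert, SSH in and attempt a restart) is left to you.

```python
import socket
from typing import Callable, Iterable, List, Tuple


def service_is_up(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


def check_services(
    host: str,
    services: Iterable[Tuple[str, int]],
    on_failure: Callable[[str, int], None],
    timeout: float = 5.0,
) -> List[str]:
    """Check each (name, port) pair and call on_failure for every one that is down.

    Returns the names of the services that failed the check.
    """
    failed = []
    for name, port in services:
        if not service_is_up(host, port, timeout):
            failed.append(name)
            # Hypothetical hook: alert someone or try a remote restart here.
            on_failure(name, port)
    return failed
```

A call such as `check_services("vps.example.com", [("http", 80), ("smtp", 25)], my_hook)` (placeholder host name) could be run from cron every few minutes. An open port only shows that something is listening, not that the service behind it is healthy.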

Hope that helps!

kalidust
06-12-2005, 12:49 AM
Thanks Rob - that certainly does help :)

esc
06-12-2005, 01:43 AM
I've been running NetSaint on our local office development server for a few years to watch several of our client accounts. The program has since changed its name and is now called Nagios (www.nagios.org). It is Open Source and comes with a GNU type license. I'm very pleased with the functionality and will definitely update to Nagios, which is much easier to configure than NetSaint, when I find time.

Here are some quotations from the program description (http://www.nagios.org/about/): Nagios® is a host and service monitor designed to inform you of network problems before your clients, end-users or managers do. It has been designed to run under the Linux operating system, but works fine under most *NIX variants as well. The monitoring daemon runs intermittent checks on hosts and services you specify using external "plugins" which return status information to Nagios. When problems are encountered, the daemon can send notifications out to administrative contacts in a variety of different ways (email, instant message, SMS, etc.). Current status information, historical logs, and reports can all be accessed via a web browser.

Features:
Nagios has a lot of features, making it a very powerful monitoring tool. Some of the major features are listed below:

Monitoring of network services (SMTP, POP3, HTTP, NNTP, PING, etc.)
Monitoring of host resources (processor load, disk and memory usage, running processes, log files, etc.)
Monitoring of environmental factors such as temperature
Simple plugin design that allows users to easily develop their own host and service checks
Ability to define network host hierarchy, allowing detection of and distinction between hosts that are down and those that are unreachable
Contact notifications when service or host problems occur and get resolved (via email, pager, or other user-defined method)
Optional escalation of host and service notifications to different contact groups
Ability to define event handlers to be run during service or host events for proactive problem resolution
Support for implementing redundant and distributed monitoring servers
External command interface that allows on-the-fly modifications to be made to the monitoring and notification behavior through the use of event handlers, the web interface, and third-party applications
Retention of host and service status across program restarts
Scheduled downtime for suppressing host and service notifications during periods of planned outages
Ability to acknowledge problems via the web interface
Web interface for viewing current network status, notification and problem history, log file, etc.
Simple authorization scheme that allows you to restrict what users can see and do from the web interface

Erich

charles
06-12-2005, 01:31 PM
I would recommend Zabbix over Nagios. http://www.zabbix.com

charles

esc
06-12-2005, 02:38 PM
Thank you, Charles, for the pointer to Zabbix. I didn't know this software yet. It looks very neat and well thought out. When Googling for a comparison between Nagios and Zabbix I found a very interesting thread (http://www.zabbix.com/forum/showthread.php?t=577) :) in the Zabbix forum. I will probably try this program too.

Erich

charles
06-12-2005, 02:50 PM
I am not affiliated with Zabbix in any way, just a happy user who used to use nagios before. It's not perfect, but better than nagios if you want to graph data and monitor anything outside of uptime and std ports.

charles

Chris Imrie
06-12-2005, 05:41 PM
Hey Charles,

Does anyone know of a monitoring system which can SMS / message the end user on his/her mobile/cell phone if a service fails or the whole system goes down completely?

I know Alertra does this, but I am in the UK and I don't have a Visa card to pay for it; the best I have is a Maestro/Switch debit card.

Regards,

Chris

charles
06-12-2005, 06:28 PM
We are about to partner with hyperspin, and when we do, we will be giving a free check per VPS. You can pay for additional checks from them if you like it.

If you're running your own system like nagios/zabbix, both support flexible event notifications.

charles

canuck
06-12-2005, 06:34 PM
Charles, what exactly does this free check at Hyperspin give me ?

As a reseller with a VPS-1 plan, do I need these additional systems like zabbix ?

Doesn't CPANEL send something out to me when something is wrong ?

charles
06-12-2005, 06:44 PM
It allows you to check that some service is up (every 10 minutes for the free check, but you can pay for faster checks). So let's say you really care about your web server being up. You can use your free check to make sure your web server responds. You can choose a port, but nothing fancy like you could do with nagios/zabbix, such as requesting a specific page and checking that the results you get are valid. Still very useful.
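To make the difference between a plain port check and a content check concrete, here is a small Python sketch of the "request a specific page and check the result" idea. It is not how Nagios, Zabbix or the hosted service implement it. The function names are invented for the example, and the URL and expected text are whatever you choose.

```python
import http.client
import urllib.error
import urllib.request


def response_is_valid(status: int, body: str, expected_text: str) -> bool:
    """A response counts as valid only if it is a 200 and contains the expected text."""
    return status == 200 and expected_text in body


def page_is_valid(url: str, expected_text: str, timeout: float = 10.0) -> bool:
    """Fetch url and validate both the HTTP status and the page content."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
            body = response.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Unreachable server, HTTP error status, timeout or malformed URL.
        return False
    return response_is_valid(status, body, expected_text)
```

For example, `page_is_valid("http://www.example.com/status.php", "All systems OK")` (placeholder URL and text) would fail if the web server answers but the page behind it is broken, which a simple port check would miss.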

These are external checks done from several locations around the world to make sure the checking machine doesn't have a problem. The cpanel daemon checks that services are running, but not necessarily that they are working properly, and if there is a problem you may not get an email from the server (if, say, you run out of memory), but you would from the external service.

You don't need zabbix unless you're an ISP or have a number of servers, or you want to monitor and graph various metrics like cpu/memory/network/disk etc etc. as well

hth
charles

vps-vince
08-18-2005, 06:01 PM
Hi,
just seen this new monitoring service for free:

http://dotuptime.com/products.php

BornOnline
08-18-2005, 08:49 PM
http://host-tracker.com

Starchild
08-19-2005, 01:22 AM
http://www.pingability.com

ozgreg
08-19-2005, 03:00 AM
Hi,
just seen this new monitoring service for free:

http://dotuptime.com/products.php

I use dotuptime. They have a nice little JS applet that you can include (and I do) on an internal status page.

Hvu
08-19-2005, 05:15 PM
I used Alertra before; now I use Cacti and do site monitoring from my other server and my desktop. ;D Having a Treo really helps: I can ssh into my server from my cellphone or check my site for uptime. That monitoring solution Charles mentioned seems interesting. I might have to do a test install and play around.

charles
08-19-2005, 06:18 PM
having a treo really helps i can ssh into my server from my cellphone or check my site for uptime.

Which ssh client do you use? I use TG ssh, but without setting up an appropriate termcap (which I have not), I don't find it very useful.

charles

Hvu
08-19-2005, 09:34 PM
Treo 600 series.

pssh, http://sealiesoftware.com/pssh/

charles
08-19-2005, 10:43 PM
Yeah I have the 600. Will give this a try next week.

thanks
charles

capnqwest
08-24-2005, 07:00 PM
Has anyone installed Zabbix on a VPS here? I'm kinda worried about resource usage. Considering purchasing a Cpanel 1 just to run it if necessary. I would like to see some trending on load, mail and b/w usage.

charles
08-24-2005, 07:22 PM
Yes, on a webmin power-1. Zabbix itself has low resource use (load or bandwidth). It's the number of servers and items you check that will put load on it. Also limit the amount of data you store (the length of time you retain data for), because as the database grows, performance will slow.

Zabbix rocks.

charles

capnqwest
08-24-2005, 07:39 PM
So do you recommend putting it on a dedicated Webmin 1 VPS or could I get away with putting it on my current vps?

charles
08-24-2005, 08:43 PM
I'd start on your current vps. It's a simple job to move it if/when you need to. Put it on its own IP though, or you'll have to update your agents if you move it.

charles

vps-vince
08-31-2005, 04:43 PM
Will Zabbix reports go as far as showing stats on a domain basis, and even drill-down further?

I had a situation whereby my Power-3 VPS used low bandwidth (less than 2Gb per day) and low CPU (avg 0.4), yet the RAM usage went over my hard limits continuously (privvmpages).


This apparently also causes other vital services to fail (apache, cppop, ftpd, imap, cpsrvd, ...), all of them in fact, at various times. Even cPanel and WHM could not be accessed.

This is very surprising, as I would have thought Virtuozzo (or any others) would somehow ensure that vital services had priority so as to keep things going!

Anyway, my point is that it would have been useful to know what exactly was causing this and hogging all my memory. Would Zabbix help with this?

Using 'top' via SSH is very confusing in tracking down the culprit.

- Vince

charles
08-31-2005, 05:10 PM
Vince, it can help track overall memory use with standard agent monitoring. It will not track individual application memory use by default, but it is possible with custom defined UserParameters.
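As a hedged sketch of the kind of script a custom item could call, the Python below totals resident memory per process name by reading `/proc/PID/status`. The function names are invented for this example and nothing here comes from Zabbix itself. Resident memory (VmRSS) is also not the same figure that the privvmpages limit counts, so treat the numbers as a guide to who the big consumers are rather than an exact match for the limit.

```python
import os
from collections import defaultdict
from typing import Dict, Optional, Tuple


def parse_status(text: str) -> Tuple[Optional[str], int]:
    """Extract the process name and resident memory in kB from /proc/PID/status text."""
    name = None
    rss_kb = 0
    for line in text.splitlines():
        if line.startswith("Name:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("VmRSS:"):
            # Line looks like "VmRSS:     1234 kB"; kernel threads have no such line.
            rss_kb = int(line.split()[1])
    return name, rss_kb


def memory_by_name(proc_root: str = "/proc") -> Dict[str, int]:
    """Return total resident memory in kB for each process name under proc_root."""
    totals: Dict[str, int] = defaultdict(int)
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "status")) as handle:
                name, rss_kb = parse_status(handle.read())
        except OSError:
            continue  # the process exited between listing and reading
        if name is not None:
            totals[name] += rss_kb
    return dict(totals)
```

A tiny wrapper that prints `memory_by_name().get("httpd", 0)` could then serve as the command part of a `UserParameter=<key>,<command>` line in the agent configuration, with the key name being your own choice.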

When running top, press 'M'. This will sort by memory and the hogs will float to the top.

As far as virtuozzo goes, it has no way of determining which apps should be killed and which should not. Once you hit hard limits, anything is fair game.

hth
charles

vps-vince
08-31-2005, 06:05 PM
As far as virtuozzo goes, it has no way of determining which apps should be killed and which should not. Once you hit hard limits, anything is fair game.


Shame, seems a much needed feature. One bad app hogging RAM and all the website-critical services on that server start failing. :(

I tried top with shift M (as per support ticket), but couldn't understand a thing!

When will the language used be more 'human friendly' huh?

There must be something out there created for this use. How do you manage shared hosting without knowing who/what you need to warn or ban?

- Vince

elix
08-31-2005, 06:10 PM
run PHP as CGI and you can see abusive PHP users.

vps-vince
08-31-2005, 06:20 PM
run PHP as CGI and you can see abusive PHP users.

No thanks, too many other issues with that, and many PHP apps don't like it.

- Vince

charles
08-31-2005, 06:24 PM
SIM might do a little of what you want as far as killing the right apps, but I think the bottom line is you have to stop hitting the limits - everything else is a bandaid or going to be wrong sometime.

http://rfxnetworks.com/sim.php

You can look at details about high resource processes (things like httpd) using 'ls -l /proc/PID' and get insights as to what it's doing (things like the cwd).
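The same `/proc/PID` information can be pulled out with a few lines of Python, which may be easier to read than the raw `ls -l` listing. This is only a sketch: `describe_process` is a name made up here, and reading another user's process details normally requires root.

```python
import os
from typing import Dict


def describe_process(pid: int, proc_root: str = "/proc") -> Dict[str, str]:
    """Return the working directory, executable path and command line of a process."""
    base = os.path.join(proc_root, str(pid))
    info: Dict[str, str] = {}
    for link in ("cwd", "exe"):
        try:
            # cwd and exe are symlinks; their targets are the useful part.
            info[link] = os.readlink(os.path.join(base, link))
        except OSError:
            info[link] = "(unavailable)"
    try:
        with open(os.path.join(base, "cmdline"), "rb") as handle:
            raw = handle.read()
        # Arguments are separated by NUL bytes; join them with spaces for display.
        info["cmdline"] = raw.replace(b"\0", b" ").decode("utf-8", errors="replace").strip()
    except OSError:
        info["cmdline"] = "(unavailable)"
    return info
```

For an httpd process sitting at the top of the memory-sorted `top` output, the `cwd` value often points at the site directory the request is being served from, which hints at which account is responsible.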

charles