6/19/2014 The Nagios Setup Explained - Open Source For You

http://www.opensourceforu.com/2011/07/nagios-setup-guide/ 1/7
The Nagios Setup Explained
By Varad Gupta on July 1, 2011 in How-Tos, Sysadmins, Tools / Apps
In this article, we shall discuss Nagios, open source software that is deployed in many data centres to monitor various system and network parameters.

The practice of appointing prefects is a time-honoured one. Whether in schools, colleges, armies or societies, overseers (or prefects) such as district magistrates or priests have, from time immemorial, played an important part in regulating and monitoring the performance and day-to-day activities of the groups of people they oversee.

Information Technology is no different: an overseer/monitor is required to constantly regulate and monitor the health of the hardware, software and network in modern-day data centres. These prefects of the data centre provide vital information to systems administrators, such as the amount of free disk space, network outages, application and hardware downtime, and even server-room temperatures.

Nagios (a recursive acronym for Nagios Ain't Gonna Insist On Sainthood) has been one of the most favoured prefects of the data centre, monitoring parameters such as system status (whether a system is up and running, CPU/memory/disk usage, etc.), service status (whether a service such as DNS, a Web server or a mail server is up and running), and many other factors, including room temperature and even humidity. It can generate alerts (through email/SMS) when the monitored parameters exceed preset thresholds.
As I sit down to write this article, I am glad to share with you that this perfect prefect has saved
many of my clients hundreds of hours of downtime. Just recently, a customer decided to move a
problematic database from a central database server host, because Nagios had alerted us about
a possible problem with one of the schemas, which was adversely affecting the overall health of
the database server, and could have severely affected other mission-critical production
database schemas on the same host.
In this article, I shall try to dispel a commonly held myth that Nagios is difficult to install. I
distinctly remember that about five years back, a senior manager in a big IT firm called me and
mentioned this as one of the reasons why the company planned to outsource the installation to
us. It might have required a bit of tweaking then, but that is no longer the case: you can
easily install and configure it to meet your requirements.
Let us install and configure Nagios to monitor a sample service, and hence get an idea of how
Nagios can benefit you and your organisation.
We'll install Nagios on an RHEL 5 host called prefect.knafl.org. We will use it to monitor its
own availability and to send alerts to nagios-admin@localhost in case of an
outage. In a future article, perhaps, we will look at monitoring remote hosts and services.
Installation

On Red Hat Enterprise Linux, Nagios can be easily installed using the EPEL repository. For the uninitiated, EPEL is "Extra Packages for Enterprise Linux (or EPEL), a Fedora Special Interest Group that creates, maintains, and manages a high-quality set of additional packages for Enterprise Linux, including, but not limited to, Red Hat Enterprise Linux (RHEL), CentOS and Scientific Linux (SL)."

To ensure that Nagios is available in the EPEL repository, let's browse the relevant repository (since ours is a 64-bit host, we're looking at the x86_64 EPEL repository). On jumping to packages whose names begin with N, we can see that (as of this writing) there are 65 Nagios packages (RPMs) available for 64-bit RHEL 5. We can check this using the following command (on the URL for the group of packages we just mentioned):

[vbg@vbg ~]$ elinks --dump http://download.fedora.redhat.com/pub/epel/5/x86_64/repoview/letter_n.group.html | grep -i nagios | grep -v html | wc -l
65
[vbg@vbg ~]$

To install Nagios from EPEL, add the EPEL repository to yum, and then install the RPMs. The instructions to add the EPEL repository (clearly mentioned on the EPEL site) are as follows:

1. Download the relevant RPM to set up the repo (the package is noarch, so the copy under the i386 directory works on a 64-bit host too):

[vbg@prefect downloads]$ wget -c http://download.fedoraproject.org/pub/epel/5/i386/epel-release-5-4.noarch.rpm

2. Install it:

[root@prefect ~]# rpm -Uvh /home/vbg/downloads/epel-release-5-4.noarch.rpm
warning: /home/vbg/downloads/epel-release-5-4.noarch.rpm: Header V3 DSA signature: NOKEY, key ID 217521f6
Preparing... ########################################### [100%]
1:epel-release ########################################### [100%]

On listing the files installed by the RPM, you will see a GPG key (for checking package signatures) and the repo files that identify the package source:

[root@prefect ~]# rpm -ql epel-release-5-4
/etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL
/etc/yum.repos.d/epel-testing.repo
/etc/yum.repos.d/epel.repo
/usr/share/doc/epel-release-5
/usr/share/doc/epel-release-5/GPL

Let us now install Nagios:

[root@prefect ~]# yum clean all
Loaded plugins: rhnplugin, security
Cleaning up Everything
[root@prefect ~]# yum list 'nagios*'

Experience tells us never to install every available package on a host; only the ones you actually need should be installed. Therefore, begin with the basic packages:

[root@prefect ~]# yum install nagios nagios-common nagios-plugins \
nagios-plugins-http nagios-plugins-disk nagios-plugins-ping

Hopefully, the above packages will be installed on your system without any hiccups. Any doubts about Nagios installation being complex should now be removed.

To configure Nagios, you first need to find its configuration files, which is simple with the rpm tool's switches. To locate the configuration files provided by the nagios package, simply run the following command:

[root@prefect ~]# rpm -qc nagios
/etc/httpd/conf.d/nagios.conf
/etc/logrotate.d/nagios
/etc/nagios/cgi.cfg
/etc/nagios/commands.cfg
/etc/nagios/localhost.cfg
/etc/nagios/nagios.cfg
/etc/nagios/private/resource.cfg

Let's have a look at the various configuration files (each with a specific purpose), and understand how Nagios uses them. They are:

The main configuration file: /etc/nagios/nagios.cfg
Object definition files: /etc/nagios/commands.cfg and /etc/nagios/localhost.cfg
Resource configuration file: /etc/nagios/private/resource.cfg
CGI configuration file: /etc/nagios/cgi.cfg

The Apache configuration file (/etc/httpd/conf.d/nagios.conf) contains the directives for the URLs http://<nagios-host>/nagios/ and http://<nagios-host>/nagios/cgi-bin/, whereas /etc/logrotate.d/nagios is a log rotation configuration file.

The main configuration file

The /etc/nagios/nagios.cfg file controls the behaviour of the Nagios process and also the CGIs. There are many configuration directives in this file, and all of them are well documented. Let us look at some of the more important ones to get our basic configuration going:

Log file (log_file): This should be the first directive in the file. It names the log file where host and service events are logged. Be careful that the file is accessible and writeable by the nagios user:

log_file=<path-of-log-file>

Nagios user and group (nagios_user, nagios_group): These are the user and group names under which the nagios process runs. The yum installation, as above, creates both a user and a group named nagios, which we will use:

nagios_user=nagios
nagios_group=nagios

Object definition file(s) (cfg_file): This parameter can be specified multiple times. These files contain definitions for each host and service, as well as groups of hosts and services. As an example, the yum installation creates two object configuration files: commands.cfg and localhost.cfg. We will look at these a little later. The parameter syntax is as follows:

cfg_file=<path-of-object-definition-file_1>
cfg_file=<path-of-object-definition-file_2>

Object cache file (object_cache_file): To speed up operations, the nagios service caches the object definitions it has read in a cache file, which is then read by the CGIs. This also prevents inconsistencies, such as when an object file is being modified and is saved before all the changes are complete.

Status file (status_file): This file is where the status of all monitored hosts and services is stored by Nagios, to be processed by the CGI scripts.

Resource file (resource_file): This parameter too can be specified multiple times. Resource files contain macros that are expanded by Nagios when executing a command found in the commands file. We will see an example of such a macro later. The CGIs do not read these files, and they can contain sensitive information such as user names and passwords. Therefore, restrictive permissions such as 600 (only the owner can read/write) should be placed on these files. As you can see, the Nagios RPMs install these files in a separate directory, /etc/nagios/private, which is owned by the root user and readable by the nagios group:

[root@prefect ~]# ls -ld /etc/nagios/
drwxr-xr-x 3 root root 4096 May 21 08:31 /etc/nagios/
[root@prefect ~]# ls -ld /etc/nagios/private/
drwxr-x--- 2 root nagios 4096 May 20 07:35 /etc/nagios/private/

The object and resource definition files

Objects are entities that need to be monitored, or are used for monitoring. Some examples are commands, hosts, groups, services and contacts. Let us explore a host object and a command object in this article.

Host object definitions are used to define a particular host that is being monitored; the mandatory directives are:

host_name: a short name for the host. Multiple services can be monitored on a single host. Normally, the FQDN is used.
alias: a longer description.
address: the IP address of the host being monitored.
max_check_attempts: the number of times the host check is retried if it returns a non-OK state.
check_period: the name of the time period (defined separately) during which checks should be made.
contact_groups: the contact groups (people to be contacted) in case of problems (or recoveries) with this host.
notification_interval: the time interval (by default, in minutes) after which notifications are re-sent if the host is still down.
notification_period: the time period in which notifications may be sent. If the host goes down outside this period, no notifications are sent.
notification_options: This directive can have the following values:
    d: send notifications when the host is down
    u: send notifications if the host is unreachable
    r: send notifications on recoveries
    f: send notifications when the host starts and stops flapping (flapping occurs when a service/host changes state too frequently; flap detection is used to judge whether a service/host is stable)
    n: no notifications will be sent
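
To make the directive list above concrete, here is a minimal shell sketch that reports which of the listed directives are absent from a host definition. The function name missing_host_directives and the sample definition are hypothetical, made up for illustration. The check does not follow "use" templates, so it is only meaningful for a definition that spells out every directive itself.

```shell
#!/bin/sh
# Hypothetical helper: print each directive from the list above that does
# not appear in the given file. Template inheritance ("use") is ignored.
missing_host_directives() {
    file=$1
    for d in host_name alias address max_check_attempts check_period \
             contact_groups notification_interval notification_period \
             notification_options; do
        # A directive counts as present if some line starts with its name
        # followed by whitespace (leading indentation is allowed).
        if ! grep -Eq "^[[:space:]]*${d}[[:space:]]" "$file"; then
            printf '%s\n' "$d"
        fi
    done
}

# Illustrative input: a deliberately incomplete host definition.
sample=$(mktemp)
cat > "$sample" <<'EOF'
define host{
    host_name            prefect.knafl.org
    alias                Nagios server
    address              127.0.0.1
    max_check_attempts   10
    check_period         24x7
    contact_groups       admins
}
EOF

echo "Missing directives:"
missing_host_directives "$sample"
rm -f "$sample"
```

For this sample, the three notification directives are reported as missing. In practice most of these values come from templates, so a real check would have to resolve inheritance first.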
A more efficient way to use host definitions is to define templates and use them. A snippet from the file /etc/nagios/localhost.cfg that uses a template for a host object definition is shown below:

define host{
    use                             linux-server        ; Name of host template to use
    ; This host definition will inherit all variables that are defined
    ; in (or inherited by) the linux-server host template definition.
    host_name                       localhost
    alias                           localhost
    address                         127.0.0.1
}

The use statement above specifies that this host definition uses a template called linux-server. It is defined in the same file, as follows:

define host{
    name                            linux-server        ; The name of this host template
    use                             generic-host        ; This template inherits other values from the generic-host template
    check_period                    24x7                ; By default, Linux hosts are checked round the clock
    max_check_attempts              10                  ; Check each Linux host 10 times (max)
    check_command                   check-host-alive    ; Default command to check Linux hosts
    notification_period             workhours           ; Linux admins hate to be woken; only notify in the day
    ; Note that notification_period overrides the value
    ; inherited from the generic-host template!
    notification_interval           120                 ; Resend notification every 2 hours
    notification_options            d,u,r               ; Only send notifications for specific host states
    contact_groups                  admins              ; Notifications get sent to the admins by default
    register                        0                   ; DONT REGISTER THIS TEMPLATE DEFINITION!
}

This template further uses a template called generic-host, which is also defined in the same file, as:

define host{
    name                            generic-host        ; The name of this host template
    notifications_enabled           1                   ; Host notifications are enabled
    event_handler_enabled           1                   ; Host event handler is enabled
    flap_detection_enabled          1                   ; Flap detection is enabled
    failure_prediction_enabled      1                   ; Failure prediction is enabled
    process_perf_data               1                   ; Process performance data
    retain_status_information       1                   ; Retain status information across program restarts
    retain_nonstatus_information    1                   ; Retain non-status information across program restarts
    notification_period             24x7                ; Send host notifications at any time
    register                        0                   ; DONT REGISTER THIS TEMPLATE DEFINITION!
}

Other objects referenced in the above snippets are:

contact_groups called admins
notification_period called workhours and 24x7
check_command called check-host-alive
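
The template chain (localhost uses linux-server, which uses generic-host) can be traced mechanically. The sketch below is a simplified illustration, not how Nagios itself resolves objects. The function name effective_value is hypothetical, it reads only "define host" blocks, and it assumes each "use" line names a single template. The sample file is an abbreviated copy of the three definitions above.

```shell
#!/bin/sh
# Hypothetical helper: print the value a host (or host template) ends up
# with for one directive, walking up the "use" chain until it is found.
# Usage: effective_value <config-file> <host-or-template-name> <directive>
effective_value() {
    awk -v start="$2" -v want="$3" '
        /^[ \t]*define[ \t]+host[ \t]*\{/ { inblock = 1; n++; next }
        inblock && /^[ \t]*\}/            { inblock = 0; next }
        inblock {
            line = $0
            sub(/;.*/, "", line)          # drop trailing comments
            sub(/^[ \t]+/, "", line)      # trim leading whitespace
            sub(/[ \t]+$/, "", line)      # trim trailing whitespace
            if (line == "") next
            key = line; sub(/[ \t].*/, "", key)
            val = line; sub(/^[^ \t]+[ \t]*/, "", val)
            attr[n, key] = val
            # Hosts are looked up by host_name, templates by name.
            if (key == "name" || key == "host_name") id[val] = n
        }
        END {
            cur = start
            # The depth limit guards against accidental "use" loops.
            for (depth = 0; depth < 10 && (cur in id); depth++) {
                b = id[cur]
                if ((b, want) in attr) { print attr[b, want]; exit 0 }
                if (!((b, "use") in attr)) break
                cur = attr[b, "use"]
            }
            exit 1
        }
    ' "$1"
}

# Abbreviated copy of the three definitions shown above.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
define host{
    use                     linux-server
    host_name               localhost
    alias                   localhost
    address                 127.0.0.1
}
define host{
    name                    linux-server
    use                     generic-host
    check_period            24x7
    notification_period     workhours   ; overrides generic-host
    register                0
}
define host{
    name                    generic-host
    notifications_enabled   1
    notification_period     24x7
    register                0
}
EOF

effective_value "$cfg" localhost notification_period      # prints: workhours
effective_value "$cfg" localhost notifications_enabled    # prints: 1
effective_value "$cfg" generic-host notification_period   # prints: 24x7
rm -f "$cfg"
```

The first call shows the override mentioned in the linux-server comments: the nearer template's workhours wins over the 24x7 value in generic-host. The second shows a value that is found only at the top of the chain.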
The contact_group object called admins is also defined in the same file:

define contactgroup{
    contactgroup_name               admins
    alias                           Nagios Administrators
    members                         nagios-admin
}

The member nagios-admin of the above contact group is defined as:

define contact{
    contact_name                    nagios-admin
    alias                           Nagios Admin
    service_notification_period     24x7
    host_notification_period        24x7
    service_notification_options    w,u,c,r
    host_notification_options       d,r
    service_notification_commands   notify-by-email
    host_notification_commands      host-notify-by-email
    email                           nagios-admin@localhost
}

The time period workhours is defined as:

define timeperiod{
    timeperiod_name                 workhours
    alias                           "Normal" Working Hours
    monday                          09:00-17:00
    tuesday                         09:00-17:00
    wednesday                       09:00-17:00
    thursday                        09:00-17:00
    friday                          09:00-17:00
}

The time period 24x7 is defined as:

define timeperiod{
    timeperiod_name                 24x7
    alias                           24 Hours A Day, 7 Days A Week
    sunday                          00:00-24:00
    monday                          00:00-24:00
    tuesday                         00:00-24:00
    wednesday                       00:00-24:00
    thursday                        00:00-24:00
    friday                          00:00-24:00
    saturday                        00:00-24:00
}

Command definitions are used to define commands that Nagios will use. They can include macros from resource definition files. The command used in the localhost.cfg file for localhost is defined in /etc/nagios/commands.cfg as:

define command{
    command_name                    check-host-alive
    command_line                    $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 1
}

$USER1$ is a macro defined in /etc/nagios/private/resource.cfg as a file system path:

$USER1$=/usr/lib64/nagios/plugins

Once the host, host groups, commands and time periods have been defined, it is time to define services. For the purpose of this introductory article, we will use only the ping service. Again, the service definition in the sample configuration file, listed below, is self-explanatory:

define service{
    use                             local-service   ; Name of service template to use
    host_name                       localhost
    service_description             PING
    check_command                   check_ping!100.0,20%!500.0,60%
}

This definition uses a template, local-service, defined as:

define service{
    name                            local-service   ; The name of this service template
    use                             generic-service ; Inherit default values from the generic-service definition
    check_period                    24x7            ; The service can be checked at any time of the day
    max_check_attempts              4               ; Re-check up to 4 times to determine final (hard) state
    normal_check_interval           5               ; Check service every 5 minutes normally
    retry_check_interval            1               ; Re-check every minute until a hard state can be determined
    contact_groups                  admins          ; Send notifications to all in the 'admins' group
    notification_options            w,u,c,r         ; Send warning, unknown, critical, and recovery notifications
    notification_interval           60              ; Re-notify about problems every hour
    notification_period             24x7            ; Notifications can be sent out at any time
    register                        0               ; DONT REGISTER THIS TEMPLATE DEFINITION!
}

The local-service template further uses a template, generic-service. For our use-case scenario, please ensure that you comment out all other service definitions in this configuration file.

Therefore, to sum up the various files used, based on the default configuration, our Nagios instance is set up thus:

It will monitor localhost (IP address 127.0.0.1).
The host will be monitored 24x7.
The host is checked using the command /usr/lib64/nagios/plugins/check_ping.
Notifications are sent if the host is down, is unreachable or has recovered.
Notifications go to nagios-admin@localhost, but are sent only during workhours, and will be re-sent every two hours if the host is still down or unreachable.

The CGI configuration file

The CGI configuration file (/etc/nagios/cgi.cfg) configures the CGI scripts and the Web GUI of Nagios. The significant parameters are:

main_config_file: The path of the main Nagios configuration file, which is where the CGI scripts should find it.
physical_html_path: The filesystem path for Nagios HTML files.
url_html_path: The URL portion, appended to the base URL, that will access the Nagios HTML files.
refresh_rate: Specifies the refresh rate for various CGIs such as status.
use_authentication: Specifies that the CGI scripts should use authentication.

Once Nagios has been configured, you will need to add an authentication file to be able to access Nagios pages. By default, the Apache configuration directives (specified in /etc/httpd/conf.d/nagios.conf) rely on basic authentication, and allow access only from localhost. The user authentication file /etc/nagios/passwd needs to be created. You can do this using the htpasswd command:

[root@prefect ~]# htpasswd -bc /etc/nagios/passwd nagios-admin admin@123

This creates the nagios-admin user with the password admin@123, and stores the details in the file /etc/nagios/passwd.

Hopefully, we are ready to test our base Nagios installation now. Restart the Apache service, start the nagios service and check the logs:

[root@prefect nagios]# /etc/init.d/httpd restart
[root@prefect nagios]# /etc/init.d/nagios start
[root@prefect nagios]# tailf /var/log/nagios/nagios.log

If the Nagios logs are fine, you should now open your browser and connect to http://localhost/nagios/, authenticate as nagios-admin and check the Host summary. The configured host, localhost, should be up.

There is a wealth of information available on Nagios, and the documentation provided along with the installation is also quite good. Go on, build your prefect and manage your data centre.
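
The final steps above (creating /etc/nagios/passwd and starting the services) assume that a few files are already in place. As a rough aid, the following shell sketch checks for them before you start anything. The function name nagios_preflight and its path-prefix argument are hypothetical, and the prefix exists only so that the function can be tried against a scratch directory. The file paths and the nagios-admin user name are the ones used in this article.

```shell
#!/bin/sh
# Hypothetical pre-flight check for the files this article relies on.
# $1 is an optional path prefix (leave it empty on a real host).
nagios_preflight() {
    prefix=$1
    rc=0
    for f in /etc/nagios/nagios.cfg /etc/nagios/cgi.cfg /etc/nagios/passwd; do
        if [ ! -r "$prefix$f" ]; then
            echo "missing or unreadable: $f"
            rc=1
        fi
    done
    # The Web login used in this article is nagios-admin, so the htpasswd
    # file should contain a line that starts with that user name.
    if [ -r "$prefix/etc/nagios/passwd" ] &&
       ! grep -q '^nagios-admin:' "$prefix/etc/nagios/passwd"; then
        echo "no nagios-admin entry in /etc/nagios/passwd"
        rc=1
    fi
    return "$rc"
}

# Try it against a scratch tree that deliberately lacks cgi.cfg.
scratch=$(mktemp -d)
mkdir -p "$scratch/etc/nagios"
: > "$scratch/etc/nagios/nagios.cfg"
echo 'nagios-admin:placeholder-hash' > "$scratch/etc/nagios/passwd"

if nagios_preflight "$scratch"; then
    echo "pre-flight OK"
else
    echo "pre-flight found problems"
fi
rm -rf "$scratch"
```

On a real host you would call nagios_preflight with an empty prefix. The check only confirms that the files exist and are readable, not that their contents are valid.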
Varad Gupta
Varad has over 15 years of experience as a system architect, where his
bread and butter business is architecture design, configuration and
deployment of Linux clusters, load balancers, messaging solutions,
Linux-based domain controllers and LDAP, Virtualisation (KVM),
databases, network architectures, et al. When he has some time off from his
18-hours-a-day work grind, he loves to fill in as a trainer and writer.
All published articles are released under Creative Commons Attribution-NonCommercial 3.0 Unported License, unless otherwise noted.