%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%
%% This default.mgp is "TrueType fonts" oriented.
%% First, you should create "~/.mgprc" whose contents are:
%%	tfdir "/path/to/truetype/fonts"
%%
%% To visualize English, install "standard.ttf", "thick.ttf", and
%% "typewriter.ttf" into the "tfdir" directory above:
%%	ftp://ftp.mew.org/pub/mgp/ttf-us.tar.gz
%%
%% To visualize Japanese, install "kochi-mincho.ttf" and "goth.ttf"
%% into the "tfdir" directory above:
%%	ftp://ftp.mew.org/pub/mgp/ttf-jp.tar.gz
%%
%deffont "standard" tfont "standard.ttf", tmfont "kochi-mincho.ttf"
%deffont "thick" tfont "thick.ttf", tmfont "goth.ttf"
%deffont "typewriter" tfont "typewriter.ttf", tmfont "goth.ttf"
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%
%% Default settings per each line numbers.
%%
%default 1 leftfill, size 2, fore "white", back "black", font "thick"
%default 2 size 7, vgap 10, prefix " "
%default 3 size 2, bar "gray70", vgap 10
%default 4 size 5, fore "white", vgap 30, prefix " ", font "standard"
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%
%% Default settings that are applied to TAB-indented lines.
%%
%tab 1 size 5, vgap 40, prefix "  ", icon box "green" 50
%tab 2 size 4, vgap 40, prefix "      ", icon arc "yellow" 50
%tab 3 size 3, vgap 40, prefix "            ", icon delta3 "white" 40
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%nodefault
%size 7, font "standard", fore "white", vgap 20, back "black"
%bquality 10
%center

Monitoring the world with NetBSD

%size 4
Alan Horn    Jennifer Davis
ahorn@inktomi.com    sigje@caltech.edu
%font "typewriter"
http://www.deorth.org/papers/monitoring
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%bgrad

A scenario

	Hello, I'm a sysadmin who likes sleep.
%pause
	But what happens to stop this ?
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%bgrad

The Problem

%pause
	Management don't understand, or care about, an outage
%pause
	Bottom line: it's about loss of revenue.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%bgrad

Monitoring

	Latin, from monere, to warn.
%pause
	Being warned is a good thing !
%pause
	To check the quality or content
	To keep track of systematically with a view to collecting information
	To test or sample, especially on a regular or ongoing basis
	To keep close watch over; supervise
%pause
	Reasonable Definition
		Regularly sample some sort of content, systematically track the state of that content, and warn where appropriate.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%bgrad

What should we be monitoring ?

	Uptime and Availability
	Performance
	Security
%pause
	Anything else you think is appropriate
%pause
	It's about making life easier
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%bgrad

Selecting your tools (1)

	Dependable, stable, consistent.
	Rich feature set
	Clean design
	Easily understandable by others with only small effort
	Rapidly deployable
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%bgrad

Selecting your tools (2)

	Hardware
	Toolset
	Operating system
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%bgrad

Why NetBSD ?

	A stable, clean design and implementation, a well documented OS.
	A great network stack.
	A comprehensive base Unix system with good analysis tools for simple monitoring.
	Readily available packages in the pkgsrc system.
	Multi-platform on cheap commodity hardware.
	Personal comfort with the OS, brand loyalty.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%bgrad

The pkgsrc system - a brief overview

	'P. K. G. source' or 'package source' ?
	3300 potential tools
	Much good stuff in net/, security/, and sysutils/ subdirs
%pause
	Simple install process
		$ su
		# cd /usr/pkgsrc/net/nocol
		# make
		# make install
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%bgrad

Availability monitoring

%pause
	Definition
		When something is available it means that folks can get to it when they need it, retrieve what it offers, avail themselves of the service.
%pause
	Strategies
%pause
		Request operation
		Types of tests
		Frequency of measurement
		Exit status or parse output
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%bgrad

Sample Tools for availability monitoring

	Simple system built-in tools (ping, ftp)
	Pkgsrc installed simple tools (fping, wget, lynx)
	Pkgsrc installed complex suites (snips/nocol, tkined)
	Non-pkgsrc tools (nagios, bigbrother, bigsister)
	Your own perl script (Net:: modules etc...)
	Built-in software testing tools for a given device
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%bgrad

Performance Monitoring

%pause
	Definition
		How promptly the machine's normal action is concluded.
%pause
	Strategies
%pause
		Extended information
		Baseline first
		Predict failure modes and degradation
		Figure out counters and alarms
		Historical data
		Minimum stuff to monitor
		The case for SNMP vs NRPE
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%bgrad

Sample Tools for performance monitoring

	System tools (ps, df, uptime, iostat, vmstat, netstat)
	Data storage tools (mrtg, rrdtool)
	Rrdtool frontends (cricket, flowscan, smokeping)
	Perl scripts for glue
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%bgrad

Security Monitoring

%pause
	This is complex !!
%pause
	Definitions
		Confidentiality - Prevent unauthorized disclosure of data
		Integrity - Prevent unauthorized modification of data
		Availability - Ensure reliable and timely access to data by authorized personnel
%pause
	Strategies (nowhere near complete !!)
%pause
		Baseline system
		MD5/tripwire for files
		parse security.conf output
		systems log analysis
		Asynchronous event triggers
		Guard critical points
		Glue code required
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%bgrad

Sample Tools for security monitoring

	nmap
	nessus
	md5
	swatch
	/etc/security
	portsentry
	logsentry
	snort
	tcpwrappers
	ipfilter
	Perl scripts
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%bgrad

Nagios (formerly known as 'Netsaint')

%pause
	Central monitoring host
%pause
	Notification tool with plugins to perform monitoring
%pause
	Webserver with CGI required, SSL recommended.
%pause
	Not (yet) in pkgsrc
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%bgrad

Nagios Demo

	install and configure
%pause
	Things I've already done :
		Install gmake from pkgsrc
		Create nagios user and group
		Download and install nagios
		Install nagios monitoring plugins that you need
		Configure apache to see the appropriate htdocs and CGI dirs
		Copy sample nagios files to 'real' files (remove -sample extension)
%pause
	What we're going to do here :
		Configure nagios to monitor a single service
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%bgrad

Additional Nagios tools

	Nagios Administration Tool (NAGAT)
	Nagios Service Check Acceptor (NSCA) 2.1
	nagios_statd
	NTray 0.91
	Remote Execution Layer (REL)
	remote_ctl
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%bgrad

Nagios 'gotchas'

	Process space
	Always do reload, never stop and start
	Use dependencies and parents
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%bgrad

Quis custodiet ipsos custodes

	The watched shall watch the watchers
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%bgrad

Other stuff

	Out of band notifications
	Physical world
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%bgrad

Questions ?

%center
Alan Horn (personal) (work)
Monitoring the world with NetBSD
%font "typewriter"
http://www.deorth.org/papers/monitoring
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%