What is Performance Management?

Monitor and measure various aspects of performance so that overall
performance can be maintained at an acceptable level.

- How would you do it?
- What can you measure?
- How would you measure it?
- How would you store it?
- How would you report it?

Why use Performance Management?

- Forensic analysis of failures
- Predicting future failures

We most often use these graphs when a failure has occurred, or a problem is
detected by a monitoring system, such as:

- a server crashed
- a network connection is slow/congested/lossy
- Nagios detected a server running out of disk space or memory
- a server is/was running slowly (out of spec)

Nagios checks (yes/no answers) are our most important performance spec.
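
A yes/no check such as the Nagios disk-space check mentioned above can be sketched as follows. This is a minimal illustration, not how any particular Nagios plugin is implemented: the function names and the 20%/10% thresholds are hypothetical. The only convention borrowed from Nagios is the plugin exit status (0 = OK, 1 = WARNING, 2 = CRITICAL).

```python
import shutil

# Nagios plugin exit-status convention: 0 = OK, 1 = WARNING, 2 = CRITICAL.
OK, WARNING, CRITICAL = 0, 1, 2


def classify_free_space(free_fraction, warn_below=0.20, crit_below=0.10):
    """Turn a measured free-space fraction into a check result.

    The default thresholds are hypothetical and arbitrary; tune them
    to trade early warning against noise.
    """
    if free_fraction < crit_below:
        return CRITICAL
    if free_fraction < warn_below:
        return WARNING
    return OK


def check_disk(path="/"):
    """Measure free space on the filesystem holding `path` and classify it."""
    usage = shutil.disk_usage(path)
    return classify_free_space(usage.free / usage.total)
```

A wrapper script would call `check_disk()` and exit with the returned status, so the monitoring system sees a simple OK/WARNING/CRITICAL answer rather than the raw number.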
We set thresholds for things like:

- "too slow" page loading
- "too slow" email delivery
- "too much" packet loss
- "too little" disk space free

All these thresholds are arbitrary, but they often detect problems
before they occur. They also often create noise. Sometimes we have to
change the thresholds to reduce noise and keep the system useful.

By looking at the performance graphs we can establish correlations:

- The disk filled up suddenly, overnight: someone left a job running?
- The network connection has been getting more congested for a while.
- Available bandwidth suddenly dropped two weeks ago when the ISP
  sent an engineer out to resolve another issue.
- The server's memory is nearly full every night: a scheduled cron job?

Usually these correlations either help us to create new hypotheses, or rule
out some hypotheses, to help us identify the cause. They rarely answer the
question by themselves.

Graphs are really useful for quickly analysing large quantities of performance
data that would otherwise be useless, to pick out correlations and trends
by engaging the brain's visual circuitry.

Case study: Diagnosis using baseline data

Useful questions for diagnosing performance-related problems:

- What is "normal"?
- Are we within a "normal" range?
- When did it change?
- What happened at the same time?

All of these require collecting and storing historical data (a baseline).

Careful with Performance Management!

- It's possible to measure everything;
- Everything has a cost;
- The benefits are limited:
  - post-mortem analysis,
  - early warning of problems.

Performance data can require:

- a lot of CPU time and bandwidth to collect and aggregate;
- a lot of disk space to store;
- a lot of CPU to produce graphs.

For example, our monitoring system with 55 hosts, 650 services in Nagios
and 31 hosts, 1085 services in Munin, uses:

- 93% of one virtual CPU
- 5.7 GB disk space
- 1.8 GB per day bandwidth

Performance management tools

Some software tools return numerical information about services, or even
archive and graph it:

- Smokeping: collects and graphs low-level packet loss and latency data from
  wireless and modulated Layer 2 links (quality of service/QoS).
- Munin: collects and graphs information from Unix servers, such as disk,
  memory and CPU usage, mail queue size, etc.
- Windows Performance Counters: collect and graph information from multiple
  (newer) Windows servers across a network.
- Nagios: collects performance metrics as well as up/down decision data
  from each service; it can derive service uptime, and the data can be
  graphed using add-ons.
- Zenoss and Cacti: collect performance metrics from devices with SNMP
  support and graph them.
- pmacct, pmgraph, argus and nfsen: collect network traffic information
  broken down by flow for analysis.
- squid cache manager, webalizer and Google Analytics: collect web traffic
  logs for analysis.
- rrdtool: stores time-sequence data with high performance, automatic history
  purging, and decent graphs. Used by smokeping, munin and cacti.
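
The automatic history purging attributed to rrdtool above comes from its round-robin design: an archive has a fixed number of slots, and once it is full each new sample overwrites the oldest one, so the store never grows. The sketch below illustrates only that idea. It is not rrdtool's API or file format, and the class and method names are hypothetical.

```python
from collections import deque


class RoundRobinArchive:
    """Fixed-size store: once full, each new sample evicts the oldest."""

    def __init__(self, slots):
        # deque(maxlen=...) silently drops the oldest item when full.
        self.samples = deque(maxlen=slots)

    def update(self, timestamp, value):
        """Record one (timestamp, value) sample."""
        self.samples.append((timestamp, value))

    def average(self):
        """Mean of the values currently retained, or None if empty."""
        if not self.samples:
            return None
        return sum(value for _, value in self.samples) / len(self.samples)
```

With `slots=3`, feeding five samples leaves only the three most recent in the archive, which is why disk usage for this kind of store is known in advance.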