2002 Called; They want their rrdtool shell scripts back

       Dave Josephsen

        dave@dbg.com
 A Brief History of Time-Series Data Visualization Architectures

         A Tale of 3 Sysadmins
Jer, Per, and Quitter (aka Dave)




                        2012       4
Jer: traditional needs for a Fortune 500 company

                  Suitcorp
                        >5000 hosts
                        >20,000 services
                        One 9-story office building
                        Plenty of budget
                               Beefy hardware
                        1.5m/1000 hosts




Nagios + NG + Drraw (ho-hum)




Per: near-real-time data from lots of hosts

                  Singularity.gov
                        80,000 hosts in 80 clusters
                        No budget
                        Mad scientists
                               No measurable impact allowed
                        15-second polling interval (max)
                               CPU, Mem, Disk, Net
                               Needs to alert on performance thresholds



Enter Ganglia




That's all fine, but what about Nagios?
 Awesome Nagios integration
       Easily send data from Nagios to Ganglia with gmetric
       Monitor server metrics stored in Ganglia from Nagios, using a series
        of included Nagios plug-ins
             Check host heartbeat
             Check a single metric on a specific host
             Check multiple metrics on a specific host
             Check multiple metrics on a set of hosts
             Verify a single metric is the same on a set of hosts
       Display Ganglia graphs in Nagios via the gweb URL interface
       Monitor Ganglia with Nagios (duh)
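
The "send data from Nagios to Ganglia with gmetric" bullet can be sketched roughly as follows. This is a minimal illustration, not the integration the talk ships: it assumes the standard Nagios perfdata layout (label=value[UOM];warn;crit;min;max), the function name and the "nagios" metric prefix are made up for the example, and every value is submitted as a gmetric double. It only builds the gmetric command lines; running them is left to the caller.

```python
import re

# One Nagios perfdata item looks like: label=value[UOM];warn;crit;min;max
PERFDATA_ITEM = re.compile(
    r"(?P<label>'[^']+'|[^\s=]+)=(?P<value>-?[0-9.]+)(?P<uom>[^;\s]*)"
)


def perfdata_to_gmetric_args(perfdata, prefix="nagios"):
    """Return one gmetric argv list per perfdata item (hypothetical helper)."""
    commands = []
    for match in PERFDATA_ITEM.finditer(perfdata):
        # Quoted labels may contain spaces; make them metric-name friendly.
        label = match.group("label").strip("'").replace(" ", "_")
        argv = [
            "gmetric",
            "--name", f"{prefix}.{label}",  # prefix is an assumption
            "--value", match.group("value"),
            "--type", "double",
        ]
        if match.group("uom"):
            argv += ["--units", match.group("uom")]
        commands.append(argv)
    return commands
```

Each returned list could then be handed to subprocess.run() on a host where gmetric is installed and pointed at the right gmond.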

Not just for mad scientists with supercomputers
                    Ganglia is a great fit if
                              You want to offload performance data
                               processing
                                    You're worried about scale
                                    You want a super-lightweight
                                     metric-gathering agent
                              You need near-real-time data
                              You want a really great rrdtool
                               front end
                                    Drag scaling, trend lines,
                                     Holt-Winters forecasting,
                                     time shifts
                                    Lots more

Quitter.. er.. Dave: graph everything, always
                 Massive Ginormic
                      DevOps “paradise” (nightmare)
                      Visualize datapoints on irregular
                        intervals
                              Code promotions
                              Function calls
                      LOTS of metrics (millions)
                      Centralized time-series
                       visualization for LOTS of very
                       different data sources
                              Nagios
                              Application instrumentation
                              Sales... thingies
Enter Graphite
                              Life after RRDTool
                                          Carbon
                                                 Trivial remote updates
                                                 Smart buffering/caching
                                                 Horizontal scalability
                                          Whisper
                                                 Automatic provisioning
                                                 Interval-agnosticism
                                                 Type agnosticism
                                          Graphite
                                                 Functions!
                                                 Typeglobs!
       Graphic stolen from: http://www.aosabook.org/en/graphite.html
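
To give a feel for "trivial remote updates": Carbon's plaintext listener accepts one line per datapoint, "metric.path value timestamp", by default on TCP port 2003. The sketch below assumes that default listener; the hostname and metric names are placeholders, and the helper names are invented for the example.

```python
import socket
import time


def carbon_line(path, value, timestamp=None):
    """Format one datapoint for Carbon's plaintext protocol."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return f"{path} {value} {ts}\n"


def send_to_carbon(lines, host="graphite.example.com", port=2003):
    """Send pre-formatted lines to a Carbon plaintext listener.

    host is a placeholder; 2003 is Carbon's default plaintext port.
    """
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)
```

Because the timestamp travels with each value, a one-off event such as a code promotion can be sent whenever it happens, with no fixed polling interval to agree on in advance.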
Not just for billion dollar mega-giants
                  Graphite works great if
                       You want to combine data from
                        multiple monitoring systems
                                Nagios, Ganglia, collectd, etc.
                       You want to assimilate data from
                        other groups or business units
                                Dev, Sales, etc.
                       You want really flexible centralized
                        visualization that scales
                       You want to empower non-ops
                        groups to explore their own data


Functions!

 Say you have counter data:

      &target=router1.bytes&target=router2.bytes
      OR: &target=router[12].bytes

 Rate is the derivative of the counter:

      &target=derivative(router1.bytes)

 But actually, the raw counter data is kind of interesting if
 we visualize it correctly:

      &target=router1.bytes&target=secondYAxis(router2.bytes)
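
What "rate is the derivative of the counter" means in practice can be shown with a few lines of Python. This is only a sketch of the idea, written under the assumption that the rate is taken as the change between consecutive datapoints (which is how Graphite's derivative() function is documented to behave, without dividing by elapsed time); it is not Graphite's own code.

```python
def derivative(values):
    """Change between consecutive datapoints of a counter series.

    The first point has no predecessor, and any pair that touches a
    gap (None) produces a gap, so both come back as None.
    """
    if not values:
        return []
    deltas = [None]
    for previous, current in zip(values, values[1:]):
        if previous is None or current is None:
            deltas.append(None)
        else:
            deltas.append(current - previous)
    return deltas
```

For an ever-growing byte counter such as [100, 150, 225, 225] this yields [None, 50, 75, 0]: the flat stretch of the counter shows up as a rate of zero.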


Moar functions!




        &target=user.registrations

        &target=summarize(user.registrations,"1h")

        &target=summarize(user.registrations,"1h")&target=threshold(400,"goal")

        &target=summarize(user.registrations,"1h")&target=timeShift(summarize(user.registrations,"1h"),"30d")&target=threshold(400,"goal")
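
The target strings on these slides are query parameters for Graphite's render endpoint, so stacking functions is just a matter of building a URL. A minimal sketch, assuming a Graphite web host at a placeholder address; the helper name is invented, and the extra "from" parameter is only an illustration of passing a time range.

```python
from urllib.parse import urlencode


def render_url(base, targets, params=None):
    """Build a Graphite /render URL with one target= pair per series."""
    query = [("target", target) for target in targets]
    query += sorted((params or {}).items())
    # urlencode percent-escapes the parentheses, commas and quotes
    # that appear in function calls such as summarize(...).
    return f"{base}/render?{urlencode(query)}"


url = render_url(
    "http://graphite.example.com",  # placeholder host
    [
        'summarize(user.registrations,"1h")',
        'timeShift(summarize(user.registrations,"1h"),"30d")',
        'threshold(400,"goal")',
    ],
    {"from": "-7d"},
)
```

Anyone who can edit a URL can try a different function or time shift, which is a large part of what lets non-ops groups explore their own data.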



OK BYE!
     http://ganglia.sourceforge.net

     https://launchpad.net/graphite

     http://www.aosabook.org/en/graphite.html

      (and speaking of “buy”...)

Nagios Conference 2012 - Dave Josephsen - 2002 called; they want their rrd shell scripts back
