collectd 
An introduction
About me 
● Florian "octo" Forster 
● Open-source work since 2001 
● Started collectd in 2005
Agenda 
● collectd 
● Aggregation of metrics 
● Alerting with Icinga
collectd 
● Daemon 
● collect metrics 
● mangle / transport metrics 
● store metrics (no retrieve)
collectd 
● Open-source project 
○ MIT and GPL licensed 
● Platform independent 
○ Linux, BSD, Solaris, AIX, HP-UX, … 
○ Windows via SSC Serv (non-free)
collectd 
● Agent based design 
○ Runs on each host 
● Extensible via plugins 
○ Language bindings (Perl, Python, Java) 
○ "exec" plugin, e.g. shell scripts
collectd 
● 95+ "read" (input) plugins 
○ System metrics (e.g. CPU, memory) 
○ Application metrics (e.g. MySQL) 
○ Other (Xeon Phi, SNMP, OneWire)
collectd 
● 15+ "write" (output) plugins 
○ Graphite 
○ RRDtool 
○ RRDCacheD 
○ Riemann 
○ MongoDB 
○ HTTP (generic)
collectd 
# Input
LoadPlugin cpu
LoadPlugin memory
LoadPlugin df
<Plugin df>
  MountPoint "/"
  ValuesPercentage true
</Plugin>
# Output
LoadPlugin write_graphite
<Plugin write_graphite>
  <Node "default">
    Host "graphite.example.com"
  </Node>
</Plugin>
Example configuration
collectd 
● collectd's write_graphite plugin 
○ Sends metrics to Graphite
○ TCP or UDP transport 
○ Metric names somewhat adjustable 
→ Monitoring mit Graphite 
(15:30 in this room, German)
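As a rough illustration of what travels over that TCP or UDP connection: Graphite's plaintext protocol expects one "path value timestamp" line per data point. The Python sketch below builds such a line. The "collectd." prefix, the dot-to-underscore escaping of the host name, and the helper name graphite_line are assumptions made for illustration; the names write_graphite actually emits depend on its configuration.

```python
def graphite_line(host, plugin, type_, value, timestamp, prefix="collectd."):
    """Build one line of Graphite's plaintext protocol: path, value, timestamp."""
    # Dots separate path components in Graphite, so escape them in the host name.
    # (Assumed escaping scheme, chosen for this sketch.)
    safe_host = host.replace(".", "_")
    path = f"{prefix}{safe_host}.{plugin}.{type_}"
    return f"{path} {value} {int(timestamp)}\n"


# Hypothetical data point: 97.5 % idle on CPU 0.
line = graphite_line("example.com", "cpu-0", "cpu-idle", 97.5, 1416400000)
print(line, end="")
```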
Agenda 
● collectd 
● Aggregation of metrics 
● Alerting with Icinga
Aggregation 
● Aggregates often more useful for alerting 
○ e.g. sum over CPUs, minimum RTT 
● Metric storage often I/O bound 
● Dashboards require a "sane" amount of information
Aggregation 
[Diagram: collectd, Aggregation, Graphite; metrics shown: CPU, Disk, Memory, …]
Aggregation 
● Load the Aggregation plugin 
● Select (filter) applicable metrics 
● Group by metric type and other fields 
● Aggregate functions (e.g. sum)
Aggregation 
LoadPlugin aggregation
<Plugin aggregation>
  <Aggregation>
  </Aggregation>
</Plugin>
example.com/battery/percent-charged 
example.com/cpu-0/cpu-idle 
example.com/cpu-0/cpu-user 
example.com/cpu-0/cpu-wait 
example.com/cpu-1/cpu-idle 
… 
example.com/df-root/df_complex-free 
example.com/df-root/df_complex-used 
example.com/df-root/df_complex-rsvd 
… 
Load the aggregation plugin
Aggregation: Selection 
● Five fields usable for selection 
○ Host 
○ Plugin 
○ PluginInstance 
○ Type (mandatory) 
○ TypeInstance
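To make the five fields concrete, here is a minimal Python sketch that splits an identifier of the form host/plugin-instance/type-instance into those fields and applies a selection. The names Identifier, parse_identifier and selected are hypothetical helpers, and the sketch only does exact string matching, which is a simplification of whatever matching the plugin itself offers.

```python
from typing import NamedTuple


class Identifier(NamedTuple):
    host: str
    plugin: str
    plugin_instance: str
    type: str
    type_instance: str


def parse_identifier(name: str) -> Identifier:
    """Split 'host/plugin[-instance]/type[-instance]' into its five fields."""
    host, plugin_part, type_part = name.split("/")
    # The instance is everything after the first hyphen; it may be empty.
    plugin, _, plugin_instance = plugin_part.partition("-")
    type_, _, type_instance = type_part.partition("-")
    return Identifier(host, plugin, plugin_instance, type_, type_instance)


def selected(ident: Identifier, type_: str, **fields: str) -> bool:
    """Type is mandatory; every other field that is given must match exactly."""
    if ident.type != type_:
        return False
    return all(getattr(ident, key) == want for key, want in fields.items())


names = [
    "example.com/battery/percent-charged",
    "example.com/cpu-0/cpu-idle",
    "example.com/cpu-1/cpu-idle",
    "example.com/df-root/df_complex-free",
]
cpu_metrics = [n for n in names if selected(parse_identifier(n), type_="cpu", plugin="cpu")]
print(cpu_metrics)
```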
Aggregation: Selection 
LoadPlugin aggregation
<Plugin aggregation>
  <Aggregation>
    Plugin "cpu"
    Type "cpu"
  </Aggregation>
</Plugin>
example.com/cpu-0/cpu-idle 
example.com/cpu-0/cpu-user 
example.com/cpu-0/cpu-wait 
example.com/cpu-1/cpu-idle 
example.com/cpu-1/cpu-user 
example.com/cpu-1/cpu-wait 
example.com/cpu-2/cpu-idle 
example.com/cpu-2/cpu-user 
example.com/cpu-2/cpu-wait 
… 
Select metrics
Aggregation: Grouping 
● Four fields usable for grouping
○ Host
○ Plugin
○ PluginInstance
○ TypeInstance
● At least one field must be left unspecified
Aggregation: Grouping 
LoadPlugin aggregation
<Plugin aggregation>
  <Aggregation>
    Plugin "cpu"
    Type "cpu"
    GroupBy Host
    GroupBy TypeInstance
  </Aggregation>
</Plugin>
example.com/cpu-???/cpu-idle 
example.com/cpu-???/cpu-user 
example.com/cpu-???/cpu-wait 
Configure grouping
Aggregation: Functions 
● Up to six aggregate functions 
○ Count 
○ Sum 
○ Minimum 
○ Maximum 
○ Average 
○ Standard deviation
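For reference, the six functions can be written down in a few lines of Python. This is only a sketch of the arithmetic, not the plugin's code; in particular, using the population standard deviation (rather than the sample one) is an assumption, and the function name aggregate is hypothetical.

```python
import statistics


def aggregate(values):
    """Compute the six aggregates over one group of values."""
    return {
        "count": len(values),
        "sum": sum(values),
        "minimum": min(values),
        "maximum": max(values),
        "average": statistics.mean(values),
        # Assumption: population standard deviation.
        "stddev": statistics.pstdev(values),
    }


# Hypothetical idle percentages of four CPUs.
result = aggregate([1.0, 2.0, 3.0, 4.0])
print(result)
```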
Aggregation 
LoadPlugin aggregation
<Plugin aggregation>
  <Aggregation>
    Plugin "cpu"
    Type "cpu"
    GroupBy Host
    GroupBy TypeInstance
    CalculateSum true
  </Aggregation>
</Plugin>
example.com/cpu-sum/cpu-idle 
example.com/cpu-sum/cpu-user 
example.com/cpu-sum/cpu-wait 
Select aggregate function(s)
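The effect of this configuration (select cpu metrics, group by Host and TypeInstance, sum over the remaining PluginInstance) can be imitated in Python as follows. The sample readings and the function name are made up for illustration, and the result names simply follow the style shown on the slide.

```python
from collections import defaultdict

# Hypothetical sample readings: (identifier, value).
readings = [
    ("example.com/cpu-0/cpu-idle", 90.0),
    ("example.com/cpu-1/cpu-idle", 80.0),
    ("example.com/cpu-0/cpu-wait", 2.0),
    ("example.com/cpu-1/cpu-wait", 6.0),
    ("example.com/df-root/df_complex-free", 1.0e9),
]


def sum_by_host_and_type_instance(readings, plugin="cpu", type_="cpu"):
    sums = defaultdict(float)
    for name, value in readings:
        host, plugin_part, type_part = name.split("/")
        got_plugin = plugin_part.partition("-")[0]
        got_type, _, type_instance = type_part.partition("-")
        if got_plugin != plugin or got_type != type_:
            continue  # not selected
        # Grouping key is Host and TypeInstance; PluginInstance is collapsed.
        sums[(host, type_instance)] += value
    # Name the results in the style used on the slide.
    return {
        f"{host}/{plugin}-sum/{type_}-{instance}": total
        for (host, instance), total in sums.items()
    }


result = sum_by_host_and_type_instance(readings)
print(result)
```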
Aggregation 
● Creates additional metrics 
● Use chains to filter out unwanted "raw" metrics.
● Usable on client and/or server.
Agenda 
● collectd 
● Aggregation of metrics 
● Alerting with Icinga
Alerting 
● Load the Unixsock plugin 
● Query and check values with collectd-nagios 
● Both come with collectd
Alerting 
Load the Unixsock plugin 
LoadPlugin unixsock
<Plugin unixsock>
  SocketFile "/var/run/collectd-unixsock"
  SocketGroup "collectd-nagios"
  SocketPerms "0660"
  DeleteSocket true
</Plugin>
Alerting 
Query values with the Unixsock plugin 
-> GETVAL example.com/cpu-average/cpu-wait 
<- 1 Value found 
<- value=8.540017e+00
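The exchange above is simple enough to script. The Python sketch below parses a GETVAL reply (a status line whose leading number gives the count of name=value lines that follow, or is negative on error) and shows one way to send the command over the socket. The function names are hypothetical, and getval needs a running collectd with the unixsock plugin loaded, so only the parser is exercised here.

```python
import socket


def parse_getval(response: str) -> dict:
    """Parse a reply such as '1 Value found' followed by 'value=8.540017e+00'."""
    lines = response.splitlines()
    status, _, message = lines[0].partition(" ")
    count = int(status)
    if count < 0:
        raise RuntimeError(f"collectd reported an error: {message}")
    values = {}
    for line in lines[1:1 + count]:
        name, _, raw = line.partition("=")
        values[name] = float(raw)
    return values


def getval(identifier: str, path: str = "/var/run/collectd-unixsock") -> dict:
    """Send GETVAL over the Unix socket and return the parsed values."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        sock.sendall(f'GETVAL "{identifier}"\n'.encode())
        reader = sock.makefile("r")
        header = reader.readline()
        count = int(header.split(" ", 1)[0])
        body = "".join(reader.readline() for _ in range(max(count, 0)))
        return parse_getval(header + body)


print(parse_getval("1 Value found\nvalue=8.540017e+00\n"))
```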
Alerting 
● collectd-nagios queries and checks metrics 
● Ranged -w (warn) and -c (critical) options 
● Conforms to Icinga's best practices
Alerting 
Example: collectd-nagios 
$ collectd-nagios -s /var/run/collectd-unixsock \
> -n cpu-average/cpu-wait -H example.com \
> -w '0:10' -c '0:25'
OKAY: 0 critical, 0 warning, 1 okay | value=8.540017;;;;
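How a ranged threshold such as '0:10' turns into a check result can be sketched in Python. The range semantics below follow the usual Nagios plugin convention as understood here (a value outside min:max raises the alert, a bare number means 0:max, an empty side or "~" is unbounded, a leading "@" inverts the test); treat it as an approximation rather than a copy of collectd-nagios. The function names are hypothetical.

```python
import math

OK, WARNING, CRITICAL = 0, 1, 2  # conventional plugin exit codes


def parse_range(spec: str):
    """Return (low, high, invert) for a range such as '0:10', '10', '10:' or '@0:10'."""
    invert = spec.startswith("@")
    if invert:
        spec = spec[1:]
    if ":" in spec:
        low_text, high_text = spec.split(":", 1)
        low = -math.inf if low_text in ("", "~") else float(low_text)
    else:
        # A bare number N is shorthand for 0:N.
        low, high_text = 0.0, spec
    high = math.inf if high_text in ("", "~") else float(high_text)
    return low, high, invert


def outside(value: float, spec: str) -> bool:
    """True if the value violates the range (taking '@' inversion into account)."""
    low, high, invert = parse_range(spec)
    return (value < low or value > high) != invert


def check(value: float, warn: str, crit: str) -> int:
    """Critical takes precedence over warning."""
    if outside(value, crit):
        return CRITICAL
    if outside(value, warn):
        return WARNING
    return OK


print(check(8.540017, "0:10", "0:25"))
```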
Alerting 
commands.cfg
define command{
  command_name check_cpuio_collectd
  command_line collectd-nagios -s /var/run/collectd-unixsock -H $HOSTNAME$ -n cpu-average/cpu-wait -w $ARG1$ -c $ARG2$
}
services.cfg
define service{
  use generic-service
  host_name example.com
  service_description I/O wait
  check_command check_cpuio_collectd!0:10!0:25
}
Alerting 
● What's next? 
○ Use "passive checks" 
○ Let collectd push metrics to Icinga 2? 
○ Bring on the patches!
Thank you! 
Questions? 
It's time for 
Questions

OSMC 2014: Introduction into collectd | Florian Forster
