Application Performance Management for Blackboard Learn
Danny Thomas
Noriaki Tatsumi
7/15/2014
Who We Are – Blackboard Performance Team
Teams
• Program
• Server
• Database
• Frontend
Tools
• Monitoring
• APM
• Profiler
• HTTP load generator
• HTTP replay
• Micro-benchmark
• Performance CI
Development
Recent highlights:
• B2 framework stabilization
• Frames elimination
• Server concurrency optimizations
• New Relic instrumentation
APMs at Blackboard
Production Support Development
Without a Tool, You Are Running a Black Box!
APM Objectives
• Monitoring for visibility
– Centralize
– Improve Dev and Ops communication
• Identify what constitutes performance issues
– Abnormal behaviors
– Anti-patterns
• Detect and diagnose root cause quickly
• Translate findings into end-user experience
Keys to Success
• Choosing the right tool
• Deployment automation
• Alert policies
• Instrumentation
Keys to Success: Choosing the Right Tool
Features
• Real user monitoring (RUM)
• Application and database monitoring and profiling
• Servers, network, and filer monitoring
• Application runtime architecture discovery
• Transaction tracing
• Alert policies
• Reports - SLA, error tracking, custom
• Extension and customization framework
Deployment: SaaS
Deployment: Self-hosting
Data Retention
• Objectives
– Load/hardware forecast
– Business insights via data exploration
• Data types
– Time-series metrics
– Transaction traces
– Slow SQL samples
– Errors
• Data format
– Raw/sampled data
– Aggregated data
• Flexibility: Self-hosted vs. SaaS
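The difference between raw and aggregated data can be sketched as follows. This is a minimal illustration, not how any particular APM product stores data: the `aggregate` function, the 60-second bucket width, and the sample values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def aggregate(samples, bucket_seconds=60):
    """Roll raw (timestamp, value) samples up into fixed-width buckets.

    Returns a dict mapping each bucket start time to
    (count, min, mean, max) for the samples in that bucket.
    """
    buckets = defaultdict(list)
    for timestamp, value in samples:
        bucket_start = int(timestamp // bucket_seconds) * bucket_seconds
        buckets[bucket_start].append(value)
    return {
        start: (len(values), min(values), mean(values), max(values))
        for start, values in sorted(buckets.items())
    }


# Hypothetical raw response times in milliseconds, keyed by seconds since start.
raw = [(0, 120.0), (15, 80.0), (59, 100.0), (60, 300.0), (119, 500.0)]
rollup = aggregate(raw)
```

Aggregates are much cheaper to retain for forecasting, but the individual samples can no longer be recovered from them, which is why the retention policy for each format matters.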
Extension Framework
• Custom metrics
– https://github.com/ntatsumi/newrelic-postgresql
– https://github.com/ntatsumi/appdynamics-blackboard-learn
• Custom dashboards
Keys to Success: Deployment Automation
Deployment Automation
Keys to Success: Constructing Alert Policies
Alert Policies – Design Considerations
• Minimize noise and false positives
• Use thresholds (e.g. >90% for 3 minutes)
• Use multiple data points (e.g. CPU + response times)
• Use event types based on severity (e.g. warning, critical)
• Send only notifications that require action
• Test your alerts and notifications
• Tune continuously
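A sustained threshold combined with a second data point might be sketched like this. It is an illustration only, not any vendor's alerting API: the function names, the 3-second response-time threshold, and the mapping to "warning" and "critical" are assumptions; only the ">90% for 3 minutes" rule comes from the list above.

```python
def sustained_breach(samples, threshold, duration_seconds):
    """Return True if the value stayed above threshold for a continuous
    run of samples spanning at least duration_seconds.

    samples: list of (timestamp_seconds, value), sorted by timestamp.
    """
    breach_start = None
    for timestamp, value in samples:
        if value > threshold:
            if breach_start is None:
                breach_start = timestamp
            if timestamp - breach_start >= duration_seconds:
                return True
        else:
            # Any sample at or below the threshold resets the run.
            breach_start = None
    return False


def severity(cpu_samples, response_samples):
    """Combine two signals into one event type (hypothetical policy)."""
    cpu_high = sustained_breach(cpu_samples, 90.0, 180)   # >90% for 3 minutes
    slow = sustained_breach(response_samples, 3.0, 180)   # assumed 3 s threshold
    if cpu_high and slow:
        return "critical"
    if cpu_high or slow:
        return "warning"
    return "ok"


cpu_sustained = [(0, 95.0), (60, 96.0), (120, 97.0), (180, 98.0)]
cpu_spike = [(0, 95.0), (60, 50.0), (120, 95.0), (180, 95.0)]
slow_responses = [(0, 4.0), (60, 5.0), (120, 4.5), (180, 6.0)]
fast_responses = [(0, 0.5), (60, 0.6), (120, 0.4), (180, 0.5)]
```

Requiring the breach to persist filters out short spikes, and requiring both signals before raising the highest severity helps keep notifications limited to those that need action.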
Alert Policies - Rule Conditions
• Application: Downtime, errors, application resource metrics, Apdex score
• Server: Downtime, CPU usage, disk space, disk IO, memory usage
• Key transactions: Errors, Apdex score
Alert Policies - Apdex
• Industry-standard way to measure users' perceptions of satisfactory application responsiveness
• Converts many measurements into one number on a uniform scale of 0 to 1 (0 = no users satisfied, 1 = all users satisfied)
• Apdex Score = (Satisfied Count + Tolerating Count / 2) / Total Samples
• Example: 100 samples with a target time of 3 seconds, where 60 are below 3 seconds, 30 are between 3 and 12 seconds, and the remaining 10 are above 12 seconds:
(60 + 30 / 2) / 100 = 0.75
http://en.wikipedia.org/wiki/Apdex
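The Apdex calculation can be reproduced in a few lines. This is a minimal sketch: the `apdex` function name and the synthetic sample values are illustrative, while the 4 × target boundary between "tolerating" and "frustrated" follows the Apdex definition and matches the 12-second figure in the example.

```python
def apdex(response_times, target):
    """Apdex score for response times (seconds) against a target time T.

    Satisfied: t <= T. Tolerating: T < t <= 4T. Frustrated: t > 4T.
    """
    if not response_times:
        raise ValueError("at least one sample is required")
    satisfied = sum(1 for t in response_times if t <= target)
    tolerating = sum(1 for t in response_times if target < t <= 4 * target)
    return (satisfied + tolerating / 2) / len(response_times)


# Synthetic samples matching the example: 60 satisfied, 30 tolerating, 10 frustrated.
samples = [1.0] * 60 + [5.0] * 30 + [20.0] * 10
score = apdex(samples, target=3.0)
print(score)  # 0.75
```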
Keys to Success: Instrumentation
Instrumentation Entry Points
• APM tools generally require an entry point to treat other activity as ‘interesting’:
Web
• HTTP requests
• Request URI, parameters
Non-Web
• Scheduled tasks
• Background threads
Event / Counter
• Message Queuing
• JMX
• Application
Common Instrumentation
• Once an entry point is reached, default instrumentations typically include:
– Servlets (Filters, Requests)
– Web frameworks (Spring, Struts, etc.)
– Database calls (JDBC)
– Errors via logging frameworks and uncaught exceptions
– External HTTP services
Custom Instrumentation
• Varies by APM tool, from custom entry points to a more flexible but more complex sensor approach
• New Relic supports both a native API and XML-based configuration
– The April release of Learn ships with New Relic capabilities
– Including instrumentation for:
• Errors
• Real-user monitoring
• Scheduled (bb-task) and queued tasks
• ‘Default’ servlet requests for static files
– Additional XML-based configuration, for features such as message queue handlers, is available from:
https://github.com/blackboard/newrelic-blackboard-learn
Real User Monitoring (RUM)
• Real-user monitoring inserts JavaScript snippets into pages
• Allows the APM tool to measure end to end:
– Web application contribution, as transactions are uniquely identified
– Network time
– DOM processing and page rendering time
– JavaScript Errors
– AJAX Requests
• By browser
• By location
System Monitoring
• Some tools may have no support for system-level statistics, as they’re application focused
• If not available, the application's contribution in terms of CPU usage and heap and native memory utilisation can be accounted for by JVM statistics
• Where available, typically provided by a separate daemon process
Demonstration – New Relic
Best Practices
Deployment
• Start slowly:
– APM can add performance overhead (typically ~5%, but potentially much higher if misconfigured)
– Allow enough time to establish a baseline to compare changes against
• Deploy end-to-end; avoid the temptation to instrument only some hosts
• Follow APM vendor best practices
Sizing/Scaling
• Oversizing application resources can be as harmful as undersizing
• Metrics of most interest:
– Tomcat executor threads
– Connection pool sizing (available via JMX in the April release; can be inferred from executor usage)
– Heap utilisation, garbage collection time
Troubleshooting Issues
• Compare with your baseline
• Trust the data
• Use APM as a starting point; dig deeper into suspected components
• Provide as much data as possible when reporting an issue (e.g. screenshots)
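Comparing against a baseline can be as simple as the sketch below. It is an assumption-laden illustration rather than a method from the talk: the `deviates_from_baseline` name, the three-standard-deviation cutoff, and the sample response times are all hypothetical.

```python
from statistics import mean, stdev


def deviates_from_baseline(baseline, current, max_sigma=3.0):
    """Return True if the mean of current is more than max_sigma baseline
    standard deviations away from the baseline mean.

    baseline needs at least two values so a standard deviation exists.
    """
    base_mean = mean(baseline)
    base_sd = stdev(baseline)
    if base_sd == 0:
        # A perfectly flat baseline: any change at all is a deviation.
        return mean(current) != base_mean
    return abs(mean(current) - base_mean) / base_sd > max_sigma


# Hypothetical response times in milliseconds.
baseline_ms = [200.0, 210.0, 190.0, 205.0, 195.0]
after_change_ms = [400.0, 420.0, 410.0]
normal_ms = [198.0, 205.0, 202.0]

regressed = deviates_from_baseline(baseline_ms, after_change_ms)
steady = deviates_from_baseline(baseline_ms, normal_ms)
```

A check like this only says that something changed; the APM's transaction traces and component breakdowns are still needed to find out why.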
Q&A
