Performance Test Driven Development
with Oracle Coherence
Alexey Ragozin
London, 18 Jul 2013
Presentation outline
• Motivation
• PTDD philosophy
• Oracle Coherence under test
  - Coherence vs. testing
  - Small cluster vs. big cluster
  - Areas to keep an eye on
• Automation challenge
• Common pitfalls of performance testing
This code works
// Count the entries whose keys are in keySet
Filter keyFilter = new InFilter(new KeyExtractor(), keySet);
EntryAggregator aggregator = new Count();
Object count = cache.aggregate(keyFilter, aggregator);

// Project several POF attributes of each matching entry
// (filter is assumed to be defined elsewhere)
ValueExtractor[] extractors = {
    new PofExtractor(String.class, TRADE_ID),
    new PofExtractor(String.class, SIDE),
    new PofExtractor(String.class, SECURITY),
    new PofExtractor(String.class, CLIENT),
    new PofExtractor(String.class, TRADER),
    new PofExtractor(String.class, STATUS),
};
MultiExtractor projecter = new MultiExtractor(extractors);
ReducerAggregator reducer = new ReducerAggregator(projecter);
Object projection = cache.aggregate(filter, reducer);
This code also works
public static class MyNextExpiryExtractor implements ValueExtractor {
    @Override
    public Object extract(Object obj) {
        MyPorfolio pf = (MyPorfolio) obj;
        long nextExpiry = Long.MAX_VALUE;
        for (MyOption opt : pf.getOptions()) {
            if (opt.getExpiry() < nextExpiry) {
                nextExpiry = opt.getExpiry();
            }
        }
        return nextExpiry;
    }

    @Override
    public String toString() {
        return getClass().getSimpleName();
    }
}
And this also looks Ok
@LiveObject
public static class MyLiveObject implements PortableObject {
    // ...
    @EventProcessorFor(events={EntryInsertedEvent.class})
    public void inserted(
            EventDispatcher dispatcher,
            EntryInsertedEvent event) {
        DefaultCommandSubmitter.getInstance()
            .submitCommand(
                "subscription",
                new MySubscribeCommand(event.getEntry().getKey()));
    }
}
Another slide to scare you
[Figure: approximate sequence diagram for a cache get operation. Labels: client thread, API, cache service, service thread, worker thread, packet publisher, packet speaker, packet listener, packet receiver, OS, serialization, deserialization.]
Functional vs. Fast
• You have paid for Coherence
• You have paid for gigabytes of RAM
• You have spent time developing a solution
and you want to be REALLY FAST
• Do not be a hostage to your beliefs
• Test early
• Test often
PTDD Philosophy
Working cycle
• Write the simplest functional code
• Benchmark it
• Improve based on test measurements
Never optimize unless you can measure the outcome
Never speculate, measure
Saves time and improves work/life balance :)
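The working cycle above can be sketched as a tiny measuring harness. This is a minimal sketch in plain Java, not the author's tooling: the class name `MeasureFirst`, the warm-up and iteration counts, and the two candidate implementations are all made up for illustration. For serious JVM measurements a dedicated harness such as JMH is the safer choice.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.function.Supplier;

final class MeasureFirst {

    /** Runs the task unmeasured a few times, then returns the mean time per call in nanoseconds. */
    static long meanNanos(Supplier<?> task, int warmup, int iterations) {
        Object sink = null;
        for (int i = 0; i < warmup; i++) {
            sink = task.get();               // give the JIT a chance to compile the hot path
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink = task.get();
        }
        long elapsed = System.nanoTime() - start;
        if (sink == null) {
            throw new IllegalStateException("task returned null"); // keeps the result in use
        }
        return elapsed / iterations;
    }

    /** Hypothetical workload: sum a list by index. */
    static long sumByIndex(List<Integer> data) {
        long sum = 0;
        for (int i = 0; i < data.size(); i++) {
            sum += data.get(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> arrayList = new ArrayList<>();
        List<Integer> linkedList = new LinkedList<>();
        for (int i = 0; i < 1_000; i++) {
            arrayList.add(i);
            linkedList.add(i);
        }
        // Step 1: simplest functional code. Step 2: benchmark it. Step 3: decide from the numbers.
        long a = meanNanos(() -> sumByIndex(arrayList), 100, 200);
        long b = meanNanos(() -> sumByIndex(linkedList), 100, 200);
        System.out.println("ArrayList  mean ns/call: " + a);
        System.out.println("LinkedList mean ns/call: " + b);
    }
}
```

The point of the sketch is the order of work: both variants are functionally identical, and only the measured numbers decide which one is worth keeping.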
Testing Coherence
Challenges
• Cluster required
• Sensitive to network
• Database is usually part of the solution
• Massively parallel load generation required
Testing Coherence
Benefits
• Pure Java, less hassle with OS tweaking etc.
• Nodes are usually plain J2SE processes
  - you can avoid app server setup pain
• No disk persistence
  - managing data is usually the hardest part of test setup
Benchmarking and cluster size
Single-node cluster may reveal
• server-side processing issues
Small cluster: 2-8 physical servers
• latency-related problems
• scalability anomalies
• partitioning anomalies
Large cluster: > 8 physical servers
• my terra incognita, so far
Areas to keep an eye on
Extractors, queries, indexes
• query index usage
• query plans for complex filters
• POF extractors
Server side processing
• backing map listeners
• storage side transformations
• cross cache access
Network
• effective network throughput
Capacity
• large messages in cluster
• Coherence*Extend buffers
Mixed operation loads
• cache service thread pool saturation
• cache service lock contention
Scale out
• broadcast requests
• hash quality
• 100% utilization of network thread
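One of the scale-out items above, hash quality, can be checked offline before any cluster is started. The sketch below is only an illustration under stated assumptions: it uses `hashCode()` modulo the partition count as a stand-in for the real partitioning function (Coherence partitions on the serialized key and the strategy is configurable), and the class name `HashQualityCheck` and the key sets are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class HashQualityCheck {

    /** Ratio of the fullest partition to the mean partition size; 1.0 means perfectly even. */
    static double skew(List<?> keys, int partitionCount) {
        int[] perPartition = new int[partitionCount];
        for (Object key : keys) {
            // Stand-in for the real partitioning function (assumption, see the text above)
            perPartition[Math.floorMod(key.hashCode(), partitionCount)]++;
        }
        int max = 0;
        for (int n : perPartition) {
            max = Math.max(max, n);
        }
        double mean = (double) keys.size() / partitionCount;
        return max / mean;
    }

    public static void main(String[] args) {
        int partitions = 257;                // example partition count
        List<Integer> sequential = new ArrayList<>();
        List<Integer> strided = new ArrayList<>();
        for (int i = 0; i < 25_700; i++) {
            sequential.add(i);
            strided.add(i * 257);            // pathological: every key is a multiple of the partition count
        }
        System.out.println("sequential keys skew: " + skew(sequential, partitions));
        System.out.println("strided keys skew:    " + skew(strided, partitions));
    }
}
```

A skew close to 1.0 means load can spread across nodes; a large skew means a few partitions, and therefore a few nodes, will carry most of the data no matter how many servers are added.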
Automation
“Classic” approach
• bash + SSH + log scraping
Problems
• not reusable
• short “half-life” of test harness
• Java and bash/awk are totally different skill sets
Automation
Stock performance test tools
• Deployment is not covered
• Often “web oriented”
• Insufficient performance of the tool itself
Automation – Game changer
cloud = CloudFactory.createSimpleSshCloud();
cloud.node("cbox1");
cloud.node("cbox2");
cloud.node("cbox3");
cloud.node("**").touch();
// Say hello
cloud.node("**").exec(new Callable<Void>() {
    @Override
    public Void call() throws Exception {
        String jvmName =
            ManagementFactory.getRuntimeMXBean().getName();
        System.out.println("My name is '" + jvmName + "'. Hello!");
        return null;
    }
});
Automation – Game changer
NanoCloud - http://code.google.com/p/gridkit/wiki/NanoCloudTutorial
• Managing slave nodes
  - in-process, local JVM process, remote JVM process
• Deployment-free remote execution
• Classpath management
  - automatic master classpath replication
  - include/exclude classpath elements
• Pain-free master/slave communications
• Just works! :)
Automation – Game changer
Full stack
• Maven – ensure test portability
• Nanocloud – painless remote execution
• JUnit – test entry point
• Java – all test logic
  starting nodes, starting clients, load generation, result processing …
• Jenkins – execution scheduling
Simple microbenchmark
@Before
public void setupCluster() {
    // Simple cluster configuration template
    // Single host cluster config preset
    cloud.all().presetFastLocalCluster();
    cloud.all().pofEnabled(true);
    cloud.all().pofConfig("benchmark-pof-config.xml");
    // DSL for cache config XML generation
    DistributedScheme scheme = CacheConfig.distributedSheme();
    scheme.backingMapScheme(CacheConfig.localScheme());
    cloud.all().mapCache("data", scheme);
    // Configuring roles
    cloud.node("storage*").localStorage(true);
    cloud.node("client").localStorage(false);
    // Storage nodes will run as separate processes
    cloud.node("storage*").outOfProcess(true);
}
* https://github.com/gridkit/coherence-search-common/blob/master/src/test/java/org/gridkit/coherence/search/bench/FilterPerformanceMicrobench.java
Simple microbenchmark
@Test
public void verify_full_vs_index_scan() {
    // Tweak JVM arguments for storage nodes
    JvmProps.addJvmArg(cloud.node("storage-*"),
        "|-Xmx1024m|-Xms1024m|-XX:+UseConcMarkSweepGC");
    // Generating data for benchmark
    // ...
    cloud.node("client").exec(new Callable<Void>() {
        @Override
        public Void call() throws Exception {
            NamedCache cache = CacheFactory.getCache("data");
            System.out.println("Cache size: " + cache.size());
            // First call is not measured (presumably a warm-up run)
            calculate_query_time(tagFilter);
            long time =
                TimeUnit.NANOSECONDS.toMicros(calculate_query_time(tagFilter));
            System.out.println("Exec time for [tagFilter] no index - " + time);
            // ...
            return null;
        }
    });
}
* https://github.com/gridkit/coherence-search-common/blob/master/src/test/java/org/gridkit/coherence/search/bench/FilterPerformanceMicrobench.java
Monitoring
• Server CPU usage
• Process CPU usage
• Network bandwidth usage
• Coherence threads CPU usage
  - Packet Publisher/Speaker/Receiver
  - Cache service thread
  - Cache service thread pool
• Coherence MBeans
  - Cache service task backlog
  - TCMP, *Extend IO throughput
etc.
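Per-thread CPU usage, as listed above, can be read from inside a JVM with the standard `java.lang.management` API. The following is a minimal sketch, not the author's monitoring setup: the class name `ThreadCpuReport` and the name-fragment filter are assumptions, and the fragment `"main"` is used only so the example prints something in any JVM. Real thread names on a cluster node should be taken from a thread dump, not from this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Map;
import java.util.TreeMap;

final class ThreadCpuReport {

    /** CPU time in nanoseconds used so far by live threads whose name contains the fragment. */
    static Map<String, Long> cpuNanosByThread(String nameFragment) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        Map<String, Long> result = new TreeMap<>();
        if (!threads.isThreadCpuTimeSupported()) {
            return result;                   // this JVM cannot report per-thread CPU time
        }
        for (long id : threads.getAllThreadIds()) {
            ThreadInfo info = threads.getThreadInfo(id);
            long cpu = threads.getThreadCpuTime(id); // -1 if the thread is gone or measurement is off
            if (info != null && cpu >= 0 && info.getThreadName().contains(nameFragment)) {
                result.merge(info.getThreadName(), cpu, Long::sum);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        cpuNanosByThread("main").forEach(
            (name, nanos) -> System.out.println(name + ": " + nanos / 1_000_000 + " ms CPU"));
    }
}
```

Sampling this map twice and dividing the difference by the wall-clock interval gives a utilization figure per thread, which is what reveals a single saturated thread behind a machine that looks mostly idle.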
Flavors of testing
• Distributed micro benchmarks
  - Micro benchmark using a real cluster
  - Proving ideas
  - To be run manually by a developer
• Performance regression tests
  - To be run by CI
  - Execute several stable test scenarios
  - Fixed load scenarios, not for stress testing
  - GOAL: track impact of code changes
  - GOAL: keep test harness compatible with code base
• Bottleneck analysis
  - Testing through an N-dimensional space of parameters
  - Fully autonomous execution of the whole test grid!
  - Analyzing correlations to pinpoint the bottleneck
  - To be performed regularly to prioritize optimization
  - Also used to measure/prove the effect of optimizations
• Performance sign-off
  - Execution of performance acceptance tests aligned to release goals
  - Activity driven by QA
  - Can share infrastructure with dev-team-owned tests
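Testing through an N-dimensional space of parameters starts with enumerating that space. The sketch below shows one way to expand named parameter axes into a full test grid; it is an illustration only, and the class name `TestGrid` as well as the axis names and values are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TestGrid {

    /** Expands named parameter axes into every combination (the cartesian product). */
    static List<Map<String, Object>> combinations(Map<String, List<?>> axes) {
        List<Map<String, Object>> grid = new ArrayList<>();
        grid.add(new LinkedHashMap<>());     // start with one empty point
        for (Map.Entry<String, List<?>> axis : axes.entrySet()) {
            List<Map<String, Object>> next = new ArrayList<>();
            for (Map<String, Object> partial : grid) {
                for (Object value : axis.getValue()) {
                    Map<String, Object> point = new LinkedHashMap<>(partial);
                    point.put(axis.getKey(), value);
                    next.add(point);
                }
            }
            grid = next;
        }
        return grid;
    }

    public static void main(String[] args) {
        // Hypothetical axes; a real grid would use whatever parameters the system exposes
        Map<String, List<?>> axes = new LinkedHashMap<>();
        axes.put("storageNodes", Arrays.asList(2, 4, 8));
        axes.put("clientThreads", Arrays.asList(1, 8, 32));
        axes.put("indexEnabled", Arrays.asList(false, true));

        for (Map<String, Object> point : combinations(axes)) {
            System.out.println("would run scenario with " + point);
        }
    }
}
```

Three axes with 3, 3 and 2 values already give 18 scenarios, which is why the grid has to run without manual steps.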
Common pitfalls
• Measuring “exception generation” performance
  - always validate operation results
  - write functional tests for your performance tests
• Fixed users vs. fixed request rate
  - serious problems may go unnoticed
• Ignoring environment health and side load
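The first pitfall can be avoided by validating every result inside the measuring loop, so that a fast failure is never counted as a fast operation. A minimal sketch, assuming a hypothetical `ValidatedTimer` helper that is not part of any library mentioned in this deck:

```java
import java.util.concurrent.Callable;
import java.util.function.Predicate;

final class ValidatedTimer<T> {
    private long okCount;
    private long okNanos;
    private long failedCount;

    /** Times one operation, but counts it as a success only if the result passes validation. */
    void measure(Callable<T> operation, Predicate<T> isValid) {
        long start = System.nanoTime();
        T result = null;
        boolean completed;
        try {
            result = operation.call();
            completed = true;
        } catch (Exception e) {
            completed = false;               // a quick exception must not look like a quick success
        }
        long elapsed = System.nanoTime() - start;
        if (completed && isValid.test(result)) {
            okCount++;
            okNanos += elapsed;
        } else {
            failedCount++;
        }
    }

    long okCount() { return okCount; }
    long failedCount() { return failedCount; }
    long meanOkNanos() { return okCount == 0 ? 0 : okNanos / okCount; }

    public static void main(String[] args) {
        ValidatedTimer<Integer> timer = new ValidatedTimer<>();
        timer.measure(() -> 40 + 2, r -> r == 42);                                    // valid result
        timer.measure(() -> { throw new IllegalStateException("boom"); }, r -> true); // exception
        timer.measure(() -> 0, r -> r == 42);                                         // wrong answer
        System.out.println("ok=" + timer.okCount() + ", failed=" + timer.failedCount());
    }
}
```

Reporting the failure count next to the timing makes a run that measured nothing but exceptions obvious at a glance.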
Common pitfalls
Fixed users vs. fixed request rate
Case:
• Operation mean time: 1 ms
• Throughput: 5k ops/s
• Test time: 1 minute
• GC pause of 10 seconds in the middle of the run
Fixed users
• 5 threads
• 5 operations out of 50k
will fall out of the time envelope
• 99th percentile would be ~1 ms
Fixed request rate
• 300k operations in total
• 250k around 1 ms
• 50k between 1 ms and 10 s
• 99th percentile would be ~9.4 s
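The case on this slide can be replayed as a small deterministic simulation. This is a sketch under simplifying assumptions that are not in the original: the pause is placed at seconds 25 to 35, every operation needs exactly 1 ms of work, and in the fixed-rate model all requests delayed by the pause complete right after it ends (unlimited concurrency). The class name `LoadModelDemo` and its methods are made up for illustration.

```java
import java.util.Arrays;

final class LoadModelDemo {
    static final long OP_US = 1_000;               // 1 ms of work per operation
    static final long TEST_US = 60_000_000;        // 1 minute test
    static final long PAUSE_START_US = 25_000_000; // 10 s stop-the-world pause
    static final long PAUSE_END_US = 35_000_000;   // in the middle of the run

    /** Completion time of an operation that starts at startUs, given the pause. */
    static long finish(long startUs) {
        if (startUs >= PAUSE_END_US || startUs + OP_US <= PAUSE_START_US) {
            return startUs + OP_US;                // untouched by the pause
        }
        long doneBeforePause = Math.max(0, PAUSE_START_US - startUs);
        return PAUSE_END_US + (OP_US - doneBeforePause);
    }

    /** Closed loop: each thread issues the next operation only after the previous one returns. */
    static long[] fixedUsers(int threads) {
        long[] buf = new long[threads * (int) (TEST_US / OP_US)];
        int n = 0;
        for (int t = 0; t < threads; t++) {
            for (long start = 0; start < TEST_US; ) {
                long end = finish(start);
                buf[n++] = end - start;
                start = end;
            }
        }
        return Arrays.copyOf(buf, n);
    }

    /** Open loop: a request is due every intervalUs; latency is counted from the due time. */
    static long[] fixedRate(long intervalUs) {
        int n = (int) (TEST_US / intervalUs);
        long[] latencies = new long[n];
        for (int i = 0; i < n; i++) {
            long due = i * intervalUs;
            latencies[i] = finish(due) - due;
        }
        return latencies;
    }

    /** Nearest-rank percentile; sorts the array in place. */
    static long percentile(long[] latencies, double p) {
        Arrays.sort(latencies);
        int rank = (int) Math.ceil(p / 100.0 * latencies.length);
        return latencies[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        long[] users = fixedUsers(5);
        long[] rate = fixedRate(200);              // one request every 200 us = 5k requests/s
        System.out.println("fixed users: " + users.length + " ops, p99 = "
                + percentile(users, 99) + " us");
        System.out.println("fixed rate:  " + rate.length + " ops, p99 = "
                + percentile(rate, 99) + " us");
    }
}
```

In this model the closed loop records 250k operations (50k per thread) and only five of them see the pause, so its 99th percentile stays at 1 ms. The open loop records all 300k due requests, 50k of which were due during the pause, and its 99th percentile lands near 9.4 s.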
Links
• Nanocloud
  - http://code.google.com/p/gridkit/wiki/NanoCloudTutorial
• ChTest – Coherence oriented wrapper for Nanocloud
  - http://code.google.com/p/gridkit/wiki/ChTest
  - http://blog.ragozin.info/2013/03/chtest-is-out.html
  - https://speakerdeck.com/aragozin/chtest-feature-outline
• GridKit @ GitHub
  - https://github.com/gridkit
Thank you
Alexey Ragozin
alexey.ragozin@gmail.com
http://blog.ragozin.info
- my blog about JVM, Coherence and other stuff
