TUNING N1QL QUERY PERFORMANCE & SCALE
IN COUCHBASE SERVER 4.0
Cihan Biyikoglu
Dir. Product Management
©2015 Couchbase Inc.
Goals
• Deeper look at query performance and scale
• Look at Query and Index Service scale characteristics
• Understand query execution flow
• Understand index usage
• Tune queries with a few techniques
Agenda
• Part I - Architectural Overview
  • New Cluster Architecture with Couchbase Server 4.0
  • Query Processing & Indexing
• Part II - Optimizing Queries
  • Execution Plans and Operators
  • Optimizing Queries - Filtering, Index Selection and Joins
  • Optimizing Apps - Consistency Dials
• Q&A
Demos & More Demos…
Disclaimer
Couchbase Server 4.0 and ForestDB are still in development, and the final versions of the products may differ in detail from what is discussed in this session.
Architecture Overview
Part I
Couchbase Server Cluster Architecture
[Diagram: a six-node cluster (Couchbase Server 1 to 6). Each node has a Cluster Manager and runs the Data Service, Index Service and Query Service, with a managed cache over storage that holds the node's shards.]
Query Processing Overview
Query Execution
• Submitting Queries in N1QL
  • Stateless Connectivity through REST
  • Load-Balance across Query Service nodes
  • Prepared vs Ad-hoc Query Execution
  • Consistency Dials – more on this later…
Query Execution
• Parallelization factor is the number of cores on the Query Service node
[Diagram: Execution Flow]
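The stateless REST connectivity and load balancing listed under Query Execution can be sketched in Python. This is a minimal sketch under stated assumptions, not how an SDK does it: the host names and the round-robin helper `build_query_request` are hypothetical. The `/query/service` endpoint on port 8093 and the `statement` and `scan_consistency` request parameters belong to the Query Service REST API. The sketch only builds a request; it does not send it.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical node addresses; 8093 is the Query Service's default REST port.
QUERY_NODES = ["node1.example.com", "node2.example.com"]


def build_query_request(statement, request_number, scan_consistency="not_bounded"):
    """Return (url, form_body) for one stateless N1QL request.

    Requests are spread round-robin over the Query Service nodes. Each
    request carries everything it needs, so any node can serve it.
    """
    node = QUERY_NODES[request_number % len(QUERY_NODES)]
    url = f"http://{node}:8093/query/service"
    body = urlencode({"statement": statement, "scan_consistency": scan_consistency})
    return url, body


first_url, first_body = build_query_request("SELECT 1", 0)
second_url, _ = build_query_request("SELECT 1", 1)
```

Because no session state lives on a node, consecutive requests can go to different nodes, which is what makes simple load balancing possible.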
Query Service - Capacity Management
Scaling the Query Service
• Pro: Load balance queries across all nodes
• Con: Compete with Index and Data Workloads
[Diagram: Couchbase Cluster, node1 through node8, with the Index, Query and Data Services]
Query Service - Capacity Management
Scaling the Query Service
• Added CPU: higher intra-query parallelization
• Added RAM: improved caching with larger result sets
• Added Node: better availability and load balancing
[Diagram: Couchbase Cluster, node1 through node8, with the Data, Index and Query Services]
Indexing Overview
Indexing in Couchbase Server 4.0
• Multiple Indexers
  • GSI – Index Service
    New indexing for N1QL, for low-latency queries without compromising mutation performance (insert/update/delete).
    Independently partitioned and independently scalable indexes in the Index Service.
  • Map/Reduce Views – Data Service
    Powerful programmable indexer for complex reporting and indexing logic.
    Full partition alignment and paired scalability with the Data Service.
  • Spatial View – Data Service
    Incremental R-tree indexing for powerful bounding-box queries.
    Full partition alignment and paired scalability with the Data Service.
[Diagram labels: New; Index Scan]
Which to choose – GSI vs. Views
Workloads | New GSI in v4.0 | Map/Reduce Views
Complex Reporting | Just in Time | Pre-aggregated
Workload Optimization | Optimized for Scan Latency & Throughput | Optimized for Insertion
Flexible Index Logic | N1QL Functions | JavaScript
Secondary Lookups | Single Node Lookup | Scatter-Gather
Tunable Consistency | Staleness false or ok or everything in between | Staleness false or ok
Which to choose – GSI vs. Views
Capabilities | New GSI in v4.0 | Map/Reduce Views
Partitioning Model | Independent – Index Service | Aligned to Data – Data Service
Scale Model | Independently Scale Index Service | Scale with Data Service
Fetch with Index Key | Single Node | Scatter-Gather
Range Scan | Single Node | Scatter-Gather
Grouping, Aggregates | With N1QL | Built-in with Views API
Caching | Managed | Not Managed
Storage | ForestDB | Couchstore
Availability | Multiple Identical Indexes, load balanced | Replica Based
Index Service - Capacity Management
Scaling the Index Service
• Pro: Load balance scans across all nodes
• Con: Compete with Query and Data Workloads
[Diagram: Couchbase Cluster, node1 through node8, with the Index, Query and Data Services]
Index Service - Capacity Management
Scaling the Index Service
• Added RAM: better caching of indexes
• Added CPU: faster index maintenance & parallelized index scans
• Added Faster IO Path: faster index persistence
• Added Node: better availability and load balancing
[Diagram: Couchbase Cluster, node1 through node8, with the Data, Index and Query Services]
Optimizing Queries
Part II
Execution Plans & Explain
• EXPLAIN query
  • The plan is assembled into an execution flow expressed through operators
  • Operators stream results up and down the pipeline
[Diagram: a Sequence of Primary Scan, then a Parallel group of Fetch and Initial Project pipelines, then Limit]
Operators
• Main Operations
  • Scans
    PrimaryScan: Scan of the Primary Index based on document keys
    IndexScan: Scan of a Secondary Index based on a predicate
  • Fetch
    Fetch: Reach into the Data Service with a document key
• Projection Operations
  InitialProject: Reduces the stream to the fields involved in the query
  FinalProject: Final shaping of the result into the requested JSON shape
Operators cont.
• Operator Assembly
  Parallel: Execute all child operations in parallel
  Sequence: Execute child operations in sequence
• Filtering Operators
  Filter: Apply a filter expression (ex. WHERE field = "value")
  Limit: Limit the number of items returned to N
  Offset: Start returning items from a specified item count
Operators cont.
• Join Operators
  Join: Join left and right keyspaces on attributes and document key
  Unnest: Join operation between a parent and a child with a nested array, where the parent is repeated for each child array item
  Nest: Grouping operation between a parent and a child array, where the child array is embedded into the parent
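The difference between Unnest and Nest, as described above, can be sketched in Python. This is only an illustration of the row shapes, not the Query Service's implementation; the helper names `unnest` and `nest`, the result field names and all documents are made up.

```python
def unnest(parent, array_field):
    """Yield one joined row per element of parent[array_field];
    the parent is repeated in every row."""
    for child in parent.get(array_field, []):
        yield {"parent": parent, "child": child}


def nest(parent, keyspace, keys_field):
    """Return one row in which the documents referenced by
    parent[keys_field] are collected into a single embedded array."""
    children = [keyspace[k] for k in parent.get(keys_field, []) if k in keyspace]
    return {"parent": parent, "children": children}


# Made-up documents for illustration.
brewery = {"name": "Brewery One", "address": ["1 Main St", "Suite 2"]}
rows = list(unnest(brewery, "address"))

beers = {"beer_1": {"name": "Amber"}, "beer_2": {"name": "Citra"}}
order = {"id": "order_1", "beer_ids": ["beer_1", "beer_2"]}
nested = nest(order, beers, "beer_ids")
```

Unnest multiplies rows (two address lines give two rows), while Nest keeps one row per parent and grows the embedded array instead.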
DEMO
Execution Plans
Demo #1
Common Techniques for Tuning Queries
Minimize Items Scanned
• Primary Index Scan vs. Index Scan
  • The Primary Index can only filter on document keys, so using it typically means a "full scan" of the bucket
  • A Secondary Index scan is typically driven by predicates, and the index is smaller, so it is cheaper to scan
Index Selection: Based on matching the expressions in the index against those in the WHERE clause
DEMO #2
SELECT name,updated FROM `beer-sample` WHERE type="beer" AND abv>0 ORDER BY name LIMIT 10;
vs.
CREATE INDEX i_type ON `beer-sample`(type) USING GSI;
SELECT name,updated FROM `beer-sample` WHERE type="beer" AND abv>0 ORDER BY name LIMIT 10;
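Why the secondary index helps in the demo above can be sketched with a toy model in Python. It is an assumption-laden illustration, not the Index Service: the bucket contents are made up, the index is a plain dictionary, and "items scanned" is approximated by the number of documents fetched.

```python
# Toy bucket: document key -> document (made-up data).
BUCKET = {
    "brewery_1": {"type": "brewery", "name": "Brewery One"},
    "brewery_2": {"type": "brewery", "name": "Brewery Two"},
    "beer_1": {"type": "beer", "name": "Amber", "abv": 5.2},
    "beer_2": {"type": "beer", "name": "Blank", "abv": 0},
}


def build_index(bucket, field):
    """Secondary index sketch: indexed value -> list of document keys."""
    index = {}
    for key, doc in bucket.items():
        if field in doc:
            index.setdefault(doc[field], []).append(key)
    return index


def query_via_primary_scan(bucket):
    """Fetch every document, then filter: the 'full scan' case."""
    fetched = [bucket[key] for key in bucket]
    names = sorted(d["name"] for d in fetched
                   if d.get("type") == "beer" and d.get("abv", 0) > 0)
    return names, len(fetched)


def query_via_index_scan(bucket, type_index):
    """Use the index for type="beer", fetch only those keys, then filter on abv."""
    fetched = [bucket[key] for key in type_index.get("beer", [])]
    names = sorted(d["name"] for d in fetched if d.get("abv", 0) > 0)
    return names, len(fetched)


i_type = build_index(BUCKET, "type")  # plays the role of the i_type index
primary_result = query_via_primary_scan(BUCKET)
index_result = query_via_index_scan(BUCKET, i_type)
```

Both paths return the same names, but the index path fetches only the documents whose type is "beer"; the abv predicate is still applied after the fetch because the index covers only `type`.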
Minimize Items Scanned
• HINT index usage to queries
  • There can be multiple indexes to choose from, and you can hint your index choice to the Query Service.
SELECT name,updated FROM `beer-sample` USE INDEX(i_type USING GSI)
WHERE type="beer" AND abv>0 ORDER BY name LIMIT 10;
Minimize Items Scanned
• Limit & Filters help eliminate rows early in the execution plan
  • With Limit, upstream operators are signaled to stop once enough rows have accumulated
  • Ex: Remember to filter on document type in buckets that contain multiple types.
DEMO #3
SELECT b1.name AS beer_name, b2.name AS brewery_name, b2.country
FROM `beer-sample` AS b1 JOIN `beer-sample` AS b2 ON KEYS b1.brewery_id
WHERE b1.abv>0;
vs.
SELECT b1.name AS beer_name, b2.name AS brewery_name, b2.country
FROM `beer-sample` AS b1 JOIN `beer-sample` AS b2 ON KEYS b1.brewery_id
WHERE b1.type="beer" AND b1.abv>0;
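How Limit stops upstream operators early can be sketched with Python generators. This is a simplified model rather than the Query Service's operator code: the function `run_query`, the bucket contents and the fetch counter are assumptions made for illustration, and there is no ORDER BY, since a sort would need every qualifying row before the limit could apply.

```python
from itertools import islice

# Toy bucket: document key -> document (made-up data).
BUCKET = {
    "brewery_1": {"type": "brewery", "name": "Brewery One"},
    "beer_1": {"type": "beer", "name": "Amber", "abv": 5.2},
    "beer_2": {"type": "beer", "name": "Blank", "abv": 0},
    "beer_3": {"type": "beer", "name": "Citra", "abv": 6.0},
    "beer_4": {"type": "beer", "name": "Dunkel", "abv": 4.8},
}


def run_query(bucket, limit):
    """Return (names, fetches): the first `limit` beers with abv > 0 and
    the number of documents fetched to produce them."""
    stats = {"fetches": 0}

    def scan():                      # PrimaryScan: stream document keys
        yield from bucket

    def fetch(keys):                 # Fetch: one lookup per key
        for key in keys:
            stats["fetches"] += 1
            yield bucket[key]

    def keep_beers(docs):            # Filter: type = "beer" AND abv > 0
        for doc in docs:
            if doc.get("type") == "beer" and doc.get("abv", 0) > 0:
                yield doc

    # Limit: islice stops pulling from upstream once `limit` rows are produced.
    rows = islice(keep_beers(fetch(scan())), limit)
    names = [doc["name"] for doc in rows]
    return names, stats["fetches"]


limited = run_query(BUCKET, 2)
unlimited = run_query(BUCKET, 10)
```

With a limit of 2 the pipeline stops after four fetches; with a limit larger than the result set every document is fetched.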
Joins
• Joins are efficient by nature
  • The left-hand value is joined to the right-hand document key with a nested loop.
Query: Get brewery location for each beer:
SELECT …
FROM `beer-sample` AS b1
JOIN `beer-sample` AS b2 ON KEYS b1.brewery_id
WHERE b1.type="beer";
For each document with type="beer", take b1.brewery_id and look for an equal document key in b2.
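The key-based nested-loop join described above can be sketched in Python. This is an illustration of the idea only: `join_on_keys` is a hypothetical helper, the documents are made up, and a dictionary lookup stands in for a fetch by document key.

```python
def join_on_keys(left_docs, right_keyspace, key_field):
    """Nested-loop key join: for each left document, look up the right
    document whose key equals left[key_field]; skip rows with no match."""
    for left in left_docs:
        right = right_keyspace.get(left.get(key_field))
        if right is not None:
            yield left, right


# Made-up documents in one keyspace, keyed by document key.
KEYSPACE = {
    "brewery_1": {"type": "brewery", "name": "Brewery One", "country": "United States"},
    "beer_1": {"type": "beer", "name": "Amber", "brewery_id": "brewery_1"},
    "beer_2": {"type": "beer", "name": "Orphan", "brewery_id": "missing_key"},
}

# Left side: only documents with type = "beer", as in the WHERE clause.
beers = (doc for doc in KEYSPACE.values() if doc.get("type") == "beer")
result = [
    {"beer_name": b1["name"], "brewery_name": b2["name"], "country": b2["country"]}
    for b1, b2 in join_on_keys(beers, KEYSPACE, "brewery_id")
]
```

Each left row costs one direct key lookup on the right side, so the work grows with the number of left rows; this is why filtering the left side first matters.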
Optimizing Applications
New Consistency Settings!
• View Staleness
  • Ok: unbounded – query what's available in the index/view now
  • False: query after all changes up to the request timestamp (and maybe more) have been indexed for a given index or view.
• New Indexes with Couchbase Server 4.0
  • Improves granularity of the consistency logical timestamp.
  • New: ScanConsistency can be set to any logical timestamp
    Ranges from stale=false to stale=ok and everything in between
Flexible Consistency Settings
• Time
  t1 insert (k1, v1)
  …
  t2 do other business logic computation
  …
  t3 issue query/read on (k1,v1)
Querying with t3 vs. t1:
  With t3: catch up all the indexes to t3, then issue the query. Identical to "stale=false".
  With t1: catch up all the indexes to t1, then issue the query. Improved efficiency over "stale=false".
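The efficiency difference between waiting for t3 and waiting for t1 can be sketched with a toy model in Python. It rests on loud assumptions: a single integer counter stands in for the logical timestamp, `pending_mutations` is a hypothetical helper, and the numbers are made up.

```python
def pending_mutations(mutation_timestamps, indexed_up_to, required_timestamp):
    """Count the mutations the index must still apply before a query
    that requires `required_timestamp` can run."""
    return sum(1 for ts in mutation_timestamps
               if indexed_up_to < ts <= required_timestamp)


# Made-up logical timestamps: the application's insert happens at t1 = 3,
# other writers keep mutating data, and the query is issued at t3 = 10.
mutations = list(range(1, 11))
indexed_up_to = 2
t1, t3 = 3, 10

wait_for_t3 = pending_mutations(mutations, indexed_up_to, t3)  # like stale=false
wait_for_t1 = pending_mutations(mutations, indexed_up_to, t1)  # own write only
wait_for_ok = pending_mutations(mutations, indexed_up_to, 0)   # like stale=ok
```

In this model the query that only needs its own write at t1 waits for one mutation, while the stale=false equivalent waits for everything up to the request time.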
Recap
• New Unique Query and Indexing Architecture
  • Workload isolation with MDS gives you a great performance and scale advancement.
• Familiar Concepts from your past life will help tune queries
  • Understand Execution Plans
  • Understand Indexes and Index Selection
  • Filter & Limit aggressively
  • Understand JOINs
• Use powerful new Consistency Dials for best efficiency
Couchbase.com/beta
Q&A
Cihan Biyikoglu
cihan@couchbase.com
@cihangirb
Thank you.


Editor's Notes

  • #10: Application has a single logical connection to the cluster (client object). Data is automatically sharded, resulting in even document distribution across the cluster. Each vBucket is replicated 1, 2 or 3 times ("peer-to-peer" replication). Docs are automatically hashed by the client to a shard. The cluster map provides the location of the server a shard is on. Every read/write/update/delete for a given key goes to the same node. Strongly consistent data access ("read your own writes"). A single Couchbase node can achieve 100k's of ops/sec, so there is no need to scale reads.
  • #16: Flip m/r index vs gsi on the graph
  • #17: Flip m/r index vs gsi on the graph
  • #21: ParentScan KeyScan ValueScan DummyScan CountScan IntersectScan
  • #23: Discard Stream Collect Channel