Introduction to Couchbase Server
Dipti Borkar
Director, Product Management
Anil Kumar
Product Management

Couchbase Server
NoSQL Document Database

Couchbase Open Source Project
• Leading NoSQL database project focused on distributed database technology and the surrounding ecosystem
• Supports both key-value and document-oriented use cases
• All components are available under the Apache 2.0 license
• Available as packaged software in both Enterprise and Community editions
Couchbase Server

Easy Scalability
Grow the cluster without application changes and without downtime, with a single click

Always On, 24x365
No downtime for software upgrades, hardware maintenance, etc.

Consistent High Performance
Consistent sub-millisecond read and write response times with consistent high throughput

Flexible Data Model
JSON document model with no fixed schema
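
To make "no fixed schema" concrete, here is a minimal sketch of two JSON documents of the same kind that carry different fields. The document contents, field names, and the `primary_email` helper are all hypothetical and chosen only for illustration; the point is that the application, not the database, decides how to interpret each shape.

```python
import json

# Two hypothetical "user" documents; neither is checked against a declared schema.
user_v1 = {"type": "user", "name": "Ada", "email": "ada@example.com"}
user_v2 = {"type": "user", "name": "Grace",
           "emails": ["grace@example.com"], "tags": ["admin"]}

def primary_email(doc):
    """Read an email address from either document shape (illustrative helper)."""
    if "email" in doc:
        return doc["email"]
    emails = doc.get("emails", [])
    return emails[0] if emails else None

for doc in (user_v1, user_v2):
    # Each document serializes to JSON independently of the other's fields.
    print(json.dumps(doc, sort_keys=True), "->", primary_email(doc))
```

A newer document shape can be introduced alongside the old one without migrating existing documents, as long as readers tolerate both.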

Core Couchbase Server Features
• Built-in clustering: all nodes equal
• Append-only storage layer
• Data replication with auto-failover
• Online compaction
• Zero-downtime maintenance
• Monitoring and admin API & UI
• Built-in managed cache
• SDKs for a variety of languages

2.0 introduced
• JSON support
• Indexing and querying
• Incremental Map Reduce
• Cross data center replication

2.1 introduced

New in 2.2
• Multi-threaded persistence engine
• New XDCR protocol based on memcached
• Optimistic XDCR
• Read-only admin user
• CBHealthcheck: cluster health check tool
• Automated and optimized purge management
• Hostname management
• CBRecovery: tool for recovering data from remote clusters
• Rebalance progress indicators
• Non-root, non-sudo install

Couchbase Server Architecture
[Diagram: components and ports of a Couchbase Server node]

Data Manager
• Moxi (port 11211, Memcapable 1.0)
• Memcached (port 11210, Memcapable 2.0)
• Couchbase EP Engine
• Storage interface
• New persistence layer
• Query engine and Query API (port 8092, HTTP)

Cluster Manager (Erlang/OTP)
• REST management API / Web UI (port 8091, HTTP)
• On each node: heartbeat, process monitor, configuration manager
• One per cluster: global singleton supervisor, rebalance orchestrator, node health monitor, vBucket state and replication manager
• Erlang port mapper (port 4369)
• Distributed Erlang (ports 21100 - 21199)
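
The port numbers on this slide can be collected into a small lookup table, which is handy when checking firewall rules between nodes. The sketch below is only an illustration: the role labels paraphrase the slide, and `port_open` and `check_node` are hypothetical helpers, not part of any Couchbase tool.

```python
import socket

# Port numbers as listed on the architecture slide; labels paraphrase the slide.
COUCHBASE_PORTS = {
    8091: "REST management API / Web UI",
    8092: "Query API",
    11210: "Data access (memcached)",
    11211: "Data access (Moxi)",
    4369: "Erlang port mapper",
}
ERLANG_DIST_PORTS = range(21100, 21200)  # 21100 - 21199 inclusive

def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def check_node(host):
    """Map each listed port to whether it is reachable on the given host."""
    return {port: port_open(host, port) for port in COUCHBASE_PORTS}
```

Calling `check_node("localhost")` on a machine running a node would be one way to confirm that the listed ports accept connections.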

Couchbase Server Architecture
[Diagram: simplified view of a Couchbase Server node]

Data Manager
• Data access ports: 11210 / 11211
• Object-managed cache
• Multi-threaded persistence engine
• Query engine and Query API (port 8092, HTTP)

Cluster Manager (Erlang/OTP)
• REST management API / Web UI and Admin Console (port 8091, HTTP)
• Replication, rebalance, shard state manager

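Single node - Couchbase Write Operation
[Diagram: App Server writes Doc 1 to a Couchbase Server node]
1. The app server sends Doc 1 to the Couchbase Server node
2. Doc 1 is stored in the managed cache
3. Doc 1 is placed on the replication queue (to other node) and on the disk queue (to disk)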
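
The write path on this slide (managed cache first, then a replication queue and a disk queue) can be modelled in a few lines. This is a toy sketch, not Couchbase's implementation: the `Node` class and its method names are hypothetical, and it assumes the write is acknowledged as soon as it is in the cache, which the slide does not state.

```python
from collections import deque

class Node:
    """Toy model of one node: managed cache plus replication and disk queues."""

    def __init__(self):
        self.cache = {}                    # managed cache (in memory)
        self.disk = {}                     # stands in for the persistence layer
        self.replication_queue = deque()   # mutations bound for another node
        self.disk_queue = deque()          # mutations waiting to be persisted

    def set(self, key, value):
        # Step 2: the document lands in the managed cache.
        self.cache[key] = value
        # Step 3: it is queued for replication and for persistence.
        self.replication_queue.append((key, value))
        self.disk_queue.append((key, value))
        return "stored"  # assumption: acknowledged before reaching disk

    def flush_disk_queue(self):
        # Drain the disk queue in order; a later write to the same key wins.
        while self.disk_queue:
            key, value = self.disk_queue.popleft()
            self.disk[key] = value

    def drain_replication_queue(self, replica):
        # Ship queued mutations to the replica's cache and its own disk queue.
        while self.replication_queue:
            key, value = self.replication_queue.popleft()
            replica.cache[key] = value
            replica.disk_queue.append((key, value))

node, replica = Node(), Node()
node.set("doc1", {"rev": 1})
node.set("doc1", {"rev": 2})           # update: cache is newer than disk for now
node.flush_disk_queue()
node.drain_replication_queue(replica)
```

An update follows the same path as a first write: the cache holds the new version immediately, while the disk copy catches up when the queue is flushed.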

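Single node - Couchbase Update Operation
[Diagram: App Server updates Doc 1 to Doc 1’ on a Couchbase Server node]
1. The app server sends the updated document, Doc 1’, to the node
2. Doc 1’ replaces Doc 1 in the managed cache
3. Doc 1’ is placed on the replication queue (to other node) and on the disk queue; the disk still holds Doc 1 until the queue is flushed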
Single node - Couchbase Read Operation
[Diagram: App Server issues GET Doc 1 to a Couchbase Server node]
1. The app server sends GET Doc 1 to the node
2. Doc 1 is returned from the managed cache; the copy of Doc 1 on disk is not read

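Single node – Couchbase Cache Miss
[Diagram: App Server issues GET Doc 1; the managed cache holds Doc 2 to Doc 5 but not Doc 1; the disk holds Doc 1 to Doc 6]
1. The app server sends GET Doc 1 to the node
2. Doc 1 is not in the managed cache, so it is fetched from disk
3. Doc 1 is loaded into the managed cache and returned to the app server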
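
The difference between a cache hit and a cache miss can be sketched as a read-through cache. This is a simplified model under stated assumptions: `ReadThroughCache` and its `misses` counter are hypothetical names, a plain dict stands in for the disk, and eviction from the cache is not modelled.

```python
class ReadThroughCache:
    """Toy read path: serve from the managed cache, fall back to disk on a miss."""

    def __init__(self, disk):
        self.disk = disk      # dict standing in for documents persisted on disk
        self.cache = {}       # managed cache, initially empty
        self.misses = 0       # number of reads that had to go to disk

    def get(self, key):
        if key in self.cache:
            # Cache hit: the disk is not touched.
            return self.cache[key]
        # Cache miss: fetch from disk, keep a copy in the cache, then return it.
        self.misses += 1
        value = self.disk[key]   # raises KeyError if the document does not exist
        self.cache[key] = value
        return value

disk = {"doc1": {"title": "first"}, "doc2": {"title": "second"}}
node = ReadThroughCache(disk)
node.get("doc1")   # miss: loaded from disk
node.get("doc1")   # hit: served from the cache
```

After the first read the document stays in the cache, so repeated reads of the same key avoid the disk.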


Basic Operation
[Diagram: App Server 1 and App Server 2 each use the Couchbase client library, with a cluster map, to send read/write/update requests to a Couchbase Server cluster of three servers. Each server holds a set of active docs and a set of replica docs. User-configured replica count = 1.]
• Docs distributed evenly across servers
• Each server stores both active and replica docs
  - Only one server is active for a given doc at a time
• Client library provides app with simple interface to database
• Cluster map tells the client which server a doc is on
  - App never needs to know
• App reads, writes, updates docs
• Multiple app servers can access the same document at the same time
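
The role of the cluster map can be illustrated with a small sketch: the client hashes the document key to a partition and looks up which server is active for that partition. Everything here is an assumption made for illustration: the partition count, the use of `zlib.crc32`, the round-robin layout, and the function names are not taken from the slides and do not reproduce the exact behaviour of any Couchbase client library.

```python
import zlib

NUM_PARTITIONS = 1024  # assumed partition count for this sketch

def partition_for(key):
    """Hash a document key to a partition number (illustrative hash choice)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS

def build_cluster_map(servers, replicas=1):
    """Assign each partition an [active, replica, ...] chain, round-robin."""
    cluster_map = []
    for p in range(NUM_PARTITIONS):
        first = p % len(servers)
        chain = [servers[(first + i) % len(servers)] for i in range(replicas + 1)]
        cluster_map.append(chain)
    return cluster_map

def locate(cluster_map, key):
    """Return (active server, replica servers) for a document key."""
    chain = cluster_map[partition_for(key)]
    return chain[0], chain[1:]

servers = ["server1", "server2", "server3"]
cluster_map = build_cluster_map(servers, replicas=1)
active, replica_servers = locate(cluster_map, "doc1")
```

Because every app server holds the same map and uses the same hash, two app servers asking for the same key reach the same active server without consulting a coordinator.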

Add Nodes to Cluster
[Diagram: App Server 1 and App Server 2, each with a cluster map, send read/write/update requests to a cluster that grows from three servers to five (Server 4 and Server 5 added). Active and replica docs are spread across all five servers.]
• Two servers added
  - One-click operation
• Docs automatically rebalanced across cluster
  - Even distribution of docs
  - Minimum doc movement
• Cluster map updated
• App database calls now distributed over larger number of servers
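
The two rebalance goals named on this slide, even distribution and minimum doc movement, can be sketched as a quota calculation: each server gets an equal share of partitions, and only the excess on over-full servers is moved. This is a simplified illustration, not Couchbase's rebalance algorithm; the `rebalance` function and the partition-list representation are hypothetical, and the sketch assumes servers are only added, never removed.

```python
from collections import Counter

def rebalance(assignment, servers):
    """Spread partitions evenly over `servers`, moving only the excess.

    assignment[p] is the server that currently owns partition p. Every current
    owner must still be in `servers` (this sketch handles adding nodes only).
    Returns (new_assignment, number_of_partitions_moved).
    """
    counts = Counter(assignment)
    base, extra = divmod(len(assignment), len(servers))
    # Give the "+1" slots to the fullest servers so that fewer partitions move.
    by_load = sorted(servers, key=lambda s: -counts[s])
    quota = {s: base + (1 if i < extra else 0) for i, s in enumerate(by_load)}
    # One entry per free slot on each under-quota server.
    free_slots = [s for s in servers for _ in range(max(0, quota[s] - counts[s]))]
    new_assignment = list(assignment)
    moved = 0
    for p, owner in enumerate(assignment):
        if counts[owner] > quota[owner]:
            new_assignment[p] = free_slots.pop()
            counts[owner] -= 1
            moved += 1
    return new_assignment, moved

old_servers = ["server1", "server2", "server3"]
before = [old_servers[p % 3] for p in range(1024)]
after, moved = rebalance(before, old_servers + ["server4", "server5"])
```

Partitions that already sit on a server within its quota stay where they are, so the cluster map changes only for the partitions that moved.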

Fail Over Node
[Diagram: App Server 1 and App Server 2, each with a cluster map, access a five-server cluster. Server 3 fails; replicas of its active docs on other servers are promoted to active.]
• App servers accessing docs
• Requests to Server 3 fail
• Cluster detects server failed
  - Promotes replicas of docs to active
  - Updates cluster map
• Requests for docs now go to appropriate server
• Typically rebalance would follow
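
Failover as described here amounts to an edit of the cluster map: wherever the failed server was active, its replica is promoted. The sketch below models the map as a dict from partition to an `[active, replica, ...]` list; that representation and the function names are assumptions made for illustration, not Couchbase's internal format.

```python
def fail_over(cluster_map, failed_server):
    """Return a new cluster map with the failed server removed from every chain.

    Each chain is [active, replica, ...]. Dropping the failed server from the
    front of a chain promotes the first surviving replica to active.
    """
    return {
        partition: [s for s in chain if s != failed_server]
        for partition, chain in cluster_map.items()
    }

def active_server(cluster_map, partition):
    """Look up the active server for a partition, if any copy survives."""
    chain = cluster_map[partition]
    if not chain:
        raise LookupError(f"no surviving copy of partition {partition}")
    return chain[0]

cluster_map = {
    0: ["server1", "server2"],
    1: ["server3", "server4"],   # server3 is active here
    2: ["server5", "server3"],   # server3 holds only the replica here
}
updated_map = fail_over(cluster_map, "server3")
```

After failover, partitions that lost a copy have fewer replicas than configured, which is one reason a rebalance typically follows.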

Indexing and Querying – The basics
• Define materialized views on JSON documents and then query across the data set
• Using views you can define
  - Primary indexes
  - Simple secondary indexes (most common use case)
  - Complex secondary, tertiary and composite indexes
  - Aggregations (reduction)
• Documents are eventually indexed
• Queries are eventually consistent
• Built using Map/Reduce technology
• Map and Reduce functions are written in JavaScript
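
The view model in these bullets (a map function that emits index rows, an optional reduce function that aggregates them) can be imitated in a few lines. In Couchbase the map and reduce functions are written in JavaScript; the Python below is only an analogue of the idea, and the document contents, `build_view`, `query`, and the emit callback signature are all hypothetical.

```python
from collections import defaultdict

docs = {
    "u1": {"type": "user", "city": "SF"},
    "u2": {"type": "user", "city": "NY"},
    "u3": {"type": "user", "city": "SF"},
    "o1": {"type": "order", "total": 10},
}

def map_by_city(doc_id, doc, emit):
    """Map function: a simple secondary index on the city of user documents."""
    if doc.get("type") == "user" and "city" in doc:
        emit(doc["city"], 1)

def reduce_count(values):
    """Reduce function: aggregate the emitted values for one key."""
    return sum(values)

def build_view(docs, map_fn):
    """Run the map function over every document and sort the emitted rows by key."""
    rows = []
    for doc_id, doc in docs.items():
        map_fn(doc_id, doc, lambda k, v, _id=doc_id: rows.append((k, v, _id)))
    rows.sort(key=lambda row: (row[0], row[2]))
    return rows

def query(rows, key=None, reduce_fn=None):
    """Filter rows by key; with a reduce function, return one value per key."""
    if key is not None:
        rows = [row for row in rows if row[0] == key]
    if reduce_fn is None:
        return rows
    grouped = defaultdict(list)
    for k, v, _ in rows:
        grouped[k].append(v)
    return {k: reduce_fn(vs) for k, vs in grouped.items()}

view = build_view(docs, map_by_city)
users_in_sf = query(view, key="SF")
users_per_city = query(view, reduce_fn=reduce_count)
```

The sketch rebuilds the whole view on each call to `build_view`; it does not model incremental updates or eventual consistency.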

Indexing and Querying
[Diagram: App Server 1 and App Server 2, each with a cluster map, send a query to a three-server cluster. Each server holds active and replica docs.]
• Indexing work is distributed amongst nodes
  - Large data set possible
  - Parallelize the effort
• Each node has index for data stored on it
• Queries combine the results from required nodes
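
Since each node indexes only its own data, a query has to gather partial results from the nodes and merge them. The sketch below shows that scatter-gather pattern with per-node indexes held as sorted lists of `(key, doc_id)` rows; the data, the function names, and the in-process "nodes" are assumptions for illustration only.

```python
import heapq
from itertools import islice

def query_node(local_index, start_key, end_key):
    """Return this node's rows whose key lies in [start_key, end_key]."""
    return [row for row in local_index if start_key <= row[0] <= end_key]

def scatter_gather(node_indexes, start_key, end_key, limit=None):
    """Query every node's local index and merge the sorted partial results."""
    partials = [query_node(ix, start_key, end_key) for ix in node_indexes]
    merged = heapq.merge(*partials)      # inputs are sorted, so output is sorted
    return list(islice(merged, limit))   # limit=None returns every row

# Each list is one node's index over the documents stored on that node.
server1_index = [("apple", "doc5"), ("pear", "doc2")]
server2_index = [("banana", "doc9"), ("zebra", "doc4")]
server3_index = [("cherry", "doc1")]

rows = scatter_gather([server1_index, server2_index, server3_index], "b", "q")
```

Each node filters its own rows independently, which is where the parallelism comes from; the merge step only has to interleave lists that are already sorted.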

Cross Data Center Replication – The basics
• Replicate your Couchbase data across clusters
• Clusters may be spread across geos
• Configured on a per-bucket (per-database) basis
• Supports unidirectional and bidirectional operation
• Application can read and write from both clusters
  - Active-active replication
• Replication throughput scales out linearly
• Different from intra-cluster replication
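
One way to picture per-bucket configuration and the two operating modes is to treat bidirectional replication as a pair of unidirectional replications, one in each direction. The sketch below only models that configuration as data; the `Replication` class, the function names, and the cluster and bucket names are hypothetical and are not a Couchbase API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Replication:
    """One unidirectional replication of a single bucket between two clusters."""
    source_cluster: str
    target_cluster: str
    bucket: str

def unidirectional(source, target, bucket):
    """Replicate one bucket from source to target only."""
    return [Replication(source, target, bucket)]

def bidirectional(cluster_a, cluster_b, bucket):
    """Replicate one bucket both ways, so apps can write to either cluster."""
    return (unidirectional(cluster_a, cluster_b, bucket)
            + unidirectional(cluster_b, cluster_a, bucket))

# Replication is set up bucket by bucket: one bucket active-active, one one-way.
config = (bidirectional("san-francisco", "new-york", "profiles")
          + unidirectional("san-francisco", "new-york", "logs"))
```

Keeping each direction as its own entry matches the per-bucket granularity in the bullets: a second bucket can use a different mode between the same two clusters.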

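Cross data center replication – Data flow
[Diagram: App Server writes Doc 1 to a Couchbase Server node; an XDCR engine forwards it to another cluster]
1. The app server sends Doc 1 to the node
2. Doc 1 is stored in the managed cache
3. Doc 1 is placed on the replication queue (to other node) and on the disk queue (to disk)
The XDCR engine, shown after the disk in the diagram, sends Doc 1 to the other cluster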

Cross Data Center Replication (XDCR)
[Diagram: Couchbase Server – San Francisco (Server 1, Server 2, Server 3) replicating to Couchbase Server – New York (Server 1, Server 2, Server 3)]
• Optimistic replication
• Per-replication tunable parameters
• Optimized protocol based on memcached
• Reliability and performance at scale

Couchbase Query Language

N1QL
Pronounced “Nickel”

Our next-generation query language for JSON
In Dev Preview

Couchbase Server

www.couchbase.com/download

Thank you!
anil@couchbase.com
@anilkumar1129
Download Couchbase Server 2.2
http://www.couchbase.com/download
