Databases On AWS: Raul Hugo, Solutions Architect
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
AWS Database Services
Amazon RDS: Managed Relational Database Service
Amazon Redshift: Petabyte-scale Data Warehouse
Amazon ElastiCache: In-Memory Key Value Store
Amazon Timestream [Preview]: Fully Managed Time Series Database
Amazon QLDB [Preview]: Fully Managed Ledger Database
Amazon DocumentDB (Now GA): MongoDB Compatible Document Database
Traditional Database Architecture
Client Tier
App/Web Tier
RDBMS: one database for all workloads
Traditional Database Architecture
Client Tier
App/Web Tier
RDBMS
Workloads all forced into the relational database:
• Key-value access
• Complex queries
• OLAP transactions
• Analytics
AWS Data Tier Architecture
Client Tier
App/Web Tier
Data Tier
On AWS, choose the best database service for each workload
Workload Driven Data Store Selection
Data Tier workloads:
• hot reads
• simple query
• NoSQL
• complex queries & transactions
• Graph / Key Value / Document
• analytics
• Periodic data
• Untampered data
• logging
• rich search
AWS Database Services for the Data Tier
Data Tier workloads:
• hot reads
• simple query
• NoSQL
• complex queries & transactions
• Graph / Key Value / Document
• analytics
• Periodic data
• Untampered data
• logging
• rich search
Services shown: Amazon ElastiCache, Amazon Redshift, Amazon Timestream, Amazon S3
Amazon RDS
Managed relational database service with a choice of popular database engines
Easy to administer: Easily deploy and maintain hardware, OS and DB software; built-in monitoring
Performant & scalable: Scale compute and storage with a few clicks; minimal downtime for your application
Available & durable: Automatic Multi-AZ data replication; automated backup, snapshots, and failover
Secure and compliant: Data encryption at rest and in transit; industry compliance and assurance programs
If you host your databases on-premises…
App optimization
Scaling
High availability
Database backups
DB s/w patches
DB s/w installs
OS patches
OS installation
Server maintenance
Rack & stack
Power, HVAC, net
(all managed by you)
If you host your databases in Amazon EC2…
Managed by you:
• App optimization
• Scaling
• High availability
• Database backups
• DB s/w patches
• DB s/w installs
• OS patches
Managed by AWS:
• OS installation
• Server maintenance
• Rack & stack
• Power, HVAC, net
If you choose Amazon RDS…
Managed by you:
• App optimization
Managed by AWS:
• Scaling
• High availability
• Database backups
• DB s/w patches
• DB s/w installs
• OS patches
• OS installation
• Server maintenance
• Rack & stack
• Power, HVAC, net
Key Amazon RDS Features
Amazon RDS configuration options:
• Push-Button Scaling
• Multi-AZ
• Read Replicas
• Provisioned IOPS
Goals compared for each option: Improve Availability, Increase Throughput, Reduce Latency
[Figure: Multi-AZ deployment across two Availability Zones in one Region]
Amazon Aurora
MySQL and PostgreSQL compatible relational database built for the cloud
Performance and availability of commercial-grade databases at 1/10th the cost
Scale-out, distributed, multi-tenant architecture
• Purpose-built log-structured distributed storage system designed for databases
[Figure: one Master and two Replicas, each with SQL and Transactions layers]
Aurora MySQL performance
[Figure: two bar charts, WRITE PERFORMANCE and READ PERFORMANCE]
MySQL SysBench results; R4.16XL: 64 cores / 488 GB RAM; Aurora MySQL 5.6
Aurora PostgreSQL performance
While running pgbench at load, throughput is 3x more consistent than PostgreSQL
[Figure: pgbench throughput over time, 150 GiB, 1024 clients; x-axis: Minutes; y-axis: Throughput, tps; series: PostgreSQL (Single AZ) and Amazon Aurora (Three AZs)]
Everything you get from Amazon RDS…
[Figure: three side-by-side stacks, each listing the same responsibilities, with the split between "Managed by you" and "Managed by AWS" marked]
• App optimization
• Scaling
• High availability
• Database backups
• DB software patches
• DB software installs
• OS patches
• OS installation
• Server maintenance
• Rack and stack
• Power, HVAC, net
…and more
up to 64 TB
Database backtrack
[Figure: timeline t0 to t4 showing rewinds to t1 and to t3; the rewound portions of the timeline are marked "Invisible"]
Backtrack brings the database to a point in time without requiring restore from backups
• Backtracking from an unintentional DML or DDL operation
• Backtrack is not destructive. You can backtrack multiple times to find the right point in time
How does backtrack work?
[Figure: Segments 1 to 3, each with a segment snapshot and log records, aligned against a recovery point on the time axis]
We keep periodic snapshots of each segment; we also preserve the redo logs
For backtrack, we identify the appropriate segment snapshots
Apply log streams to segment snapshots in parallel and asynchronously
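
The three steps above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Aurora's implementation: the `Segment` class, its `snapshots` and `redo_log` fields, and the `restore_segment` and `backtrack` helpers are hypothetical names introduced for illustration. Each segment picks its latest snapshot at or before the target time and replays redo records up to that time; segments are processed in parallel by a thread pool (the sketch waits for all of them, whereas the slide describes asynchronous application).

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class Segment:
    # Periodic snapshots: list of (time, state) pairs.
    snapshots: list = field(default_factory=list)
    # Redo log: list of (time, key, value) records, sorted by time.
    redo_log: list = field(default_factory=list)


def restore_segment(segment, target_time):
    """Rebuild one segment's state as of target_time."""
    # Step 1: find the latest snapshot taken at or before the target time.
    snap_time, snap_state = max(
        (s for s in segment.snapshots if s[0] <= target_time),
        key=lambda s: s[0],
    )
    state = dict(snap_state)
    # Step 2: replay redo records after the snapshot, up to the target time.
    for time, key, value in segment.redo_log:
        if snap_time < time <= target_time:
            state[key] = value
    return state


def backtrack(segments, target_time):
    """Restore all segments in parallel, one task per segment."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda s: restore_segment(s, target_time), segments))


segments = [
    Segment(snapshots=[(0, {}), (10, {"a": 1})],
            redo_log=[(5, "a", 1), (12, "a", 2), (20, "a", 3)]),
    Segment(snapshots=[(0, {})],
            redo_log=[(3, "b", 7), (18, "b", 8)]),
]
restored = backtrack(segments, target_time=15)
print(restored)
```

Because the snapshots and logs are only read, never modified, calling `backtrack` again with a different `target_time` is safe, which mirrors the non-destructive behavior described on the previous slide.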
Zero downtime patching
Before ZDP: user sessions terminate during patching
With ZDP: user sessions remain active through patching
[Figure: old DB engine and new DB engine on top of the storage service, with application state and networking state shown for each case]
Fast database cloning
Aurora Multi-Master (Preview)
First relational database service with scale-out reads and writes across multiple data centers
Global database
Faster disaster recovery and enhanced data locality
[Figure: Aurora primary instance and Aurora replicas (optional), with a replication server and a replication agent connected by asynchronous replication]
Aurora Serverless
On-demand, auto-scaling database for applications with variable workloads
Performance Insights for Aurora
Analyze and troubleshoot your database performance
Amazon Redshift
• Petabyte scale
• Massively parallel
• Columnar store
• For as low as $934/TB per year
Amazon Redshift – Data Warehousing
Fast, powerful, and simple data warehousing at 1/10 the cost
Massively parallel, petabyte scale
• Columnar storage technology to improve I/O efficiency and parallelize queries. Data load scales linearly.
• As low as $1000 per terabyte per year, 1/10th the cost of traditional data warehouse solutions
• Resize your cluster up and down as your performance and capacity needs change
• Data encrypted at rest and in transit. Isolate clusters with VPC. Manage your own keys with KMS
Redshift cluster architecture
Massively parallel, shared nothing architecture
[Figure: Redshift cluster with a leader node and compute nodes; clients connect over JDBC/ODBC; efficient data loads and streaming backup/restore from S3]
Leader node
• SQL endpoint
• Stores metadata
• Coordinates parallel SQL processing
Compute nodes
• Local, columnar storage
• Executes queries in parallel
• Load, backup, restore
• 2, 16, or 32 slices
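
As a rough illustration of how a leader node can coordinate parallel processing across slices, here is a toy scatter-gather aggregate in Python. It is not Redshift code: `distribute`, `slice_sum`, and `leader_sum` are hypothetical names, and the round-robin row distribution and the thread pool are assumptions made for this sketch.

```python
from concurrent.futures import ThreadPoolExecutor


def distribute(rows, num_slices):
    """Spread rows across slices round-robin (assumed distribution style)."""
    slices = [[] for _ in range(num_slices)]
    for i, row in enumerate(rows):
        slices[i % num_slices].append(row)
    return slices


def slice_sum(rows, column):
    """Work done on one slice: a partial aggregate over its local rows."""
    return sum(row[column] for row in rows)


def leader_sum(slices, column):
    """Leader role: fan the query out to every slice, then merge partials."""
    with ThreadPoolExecutor(max_workers=len(slices)) as pool:
        partials = list(pool.map(lambda rows: slice_sum(rows, column), slices))
    return sum(partials)


rows = [{"order_id": i, "amount": i * 10} for i in range(1, 9)]
slices = distribute(rows, num_slices=2)
total = leader_sum(slices, "amount")
print(total)
```

Each slice only touches its own rows, so adding slices shrinks the work per slice; the leader's merge step stays small because it combines one partial result per slice.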
Redshift Spectrum
Run SQL queries directly against data in S3 using thousands of nodes
• High concurrency: Multiple clusters access same data
• No ETL: Query data in-place using open file formats
• Full Amazon Redshift SQL support
Amazon DynamoDB
• NoSQL database
• Seamless scalability
• Zero admin
• Single-digit millisecond latency
• Multi-Master
• Multi-Region
Amazon DynamoDB
• Fully managed
• Consistently fast at any scale
• Highly available and durable: designed to support 99.99% availability; built for high durability
[Figure: WRITES and READS charts]
Highly available and durable
3-way replication
Data is always replicated to three Availability Zones
[Figure: an item in CustomerOrdersTable (OrderId: 1, CustomerId: 1, ASIN: [B00X4WHP5E]) is placed using Hash(1) = 7B and stored in three Availability Zones]
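
The `Hash(1) = 7B` step on this slide, followed by replication to three Availability Zones, can be sketched as follows. This is a minimal illustration, assuming a stand-in hash (the first byte of SHA-256) and equal-width partition ranges; DynamoDB's real hash function and partition layout are internal and not shown in the slides. The AZ names and the `partition_hash`, `find_partition`, and `put_item` helpers are hypothetical.

```python
import hashlib

AVAILABILITY_ZONES = ["az-a", "az-b", "az-c"]  # hypothetical AZ names


def partition_hash(partition_key):
    """Toy hash: first byte of SHA-256 as two hex digits (not DynamoDB's hash)."""
    digest = hashlib.sha256(str(partition_key).encode("utf-8")).hexdigest()
    return digest[:2].upper()


def find_partition(hash_hex, num_partitions):
    """Map the 00..FF hash space onto equal-width partition ranges."""
    return int(hash_hex, 16) * num_partitions // 256


def put_item(storage, item, key_name, num_partitions=4):
    """Write the item to the same partition in every Availability Zone."""
    hash_hex = partition_hash(item[key_name])
    partition = find_partition(hash_hex, num_partitions)
    for az in AVAILABILITY_ZONES:
        storage.setdefault(az, {}).setdefault(partition, []).append(item)
    return hash_hex, partition


storage = {}
order = {"OrderId": 1, "CustomerId": 1, "ASIN": ["B00X4WHP5E"]}
hash_hex, partition = put_item(storage, order, key_name="OrderId")
print(hash_hex, partition, sorted(storage))
```

The same key always hashes to the same partition, so a read for `OrderId` 1 knows where to look without scanning, and any one of the three copies can serve it.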
Backup and restore
The only cloud database to provide on-demand and continuous backups
Global Tables
The first fully-managed, multi-master, multi-region database
Global Table
DynamoDB On-Demand
Features
• No capacity planning, provisioning, or reservations; simply make API calls
Key benefits
• Eliminates tradeoffs of over- or under-provisioning
Capacity managed for you
DynamoDB Accelerator (DAX)
• High performance
• Even faster: microsecond latency
• Scales to millions of requests per second
Fully managed auto scaling
• Automated scaling policies
• Scheduled auto scaling
• Scales up when you need it
• Scales down when you don’t
• $$$ Savings
NoSQL vs. SQL for a new app: how to choose?
Amazon DynamoDB
• Want simplest possible DB management?
• Want app to manage DB integrity?
Amazon RDS
• Need joins, transactions, frequent table scans?
• Want DB engine to manage DB integrity?
• Team has SQL skills?
Introducing Amazon ElastiCache
Fully-managed, Redis or Memcached compatible, low-latency, in-memory data store
µs is the new ms
Internet-scale apps need low latency and high concurrency
Users: 1M+
Locality: Global
Performance: Milliseconds to microseconds
Request Rate: Millions
Amazon ElastiCache
• In-memory cache in the cloud
• Improve latency and throughput for read-heavy workloads
• Supports open-source caching engines
  • Memcached
  • Redis
• Fully managed
• Multi-AZ
Examples
• Caching of MySQL database query results
• Caching of post-processing results
• Caching of user session and frequently accessed data
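
The first example, caching database query results, usually follows the cache-aside pattern: look in the cache first and only query the database on a miss. The sketch below is a minimal illustration, assuming an in-process `TTLCache` class as a stand-in for a Redis or Memcached client and a hypothetical `query_database` function in place of a real MySQL call; none of these names come from the slides.

```python
import time


class TTLCache:
    """Minimal in-process stand-in for a Redis or Memcached client."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, time.monotonic() + ttl_seconds)


stats = {"db_calls": 0}


def query_database(sql):
    """Hypothetical slow database call; counts how often it is hit."""
    stats["db_calls"] += 1
    return [("sue1942", 3185400)]


def cached_query(cache, sql, ttl_seconds=60):
    """Cache-aside: try the cache first, fall back to the database on a miss."""
    result = cache.get(sql)
    if result is None:
        result = query_database(sql)
        cache.set(sql, result, ttl_seconds)
    return result


cache = TTLCache()
sql = "SELECT username, hi_score FROM players LIMIT 1"
first = cached_query(cache, sql)
second = cached_query(cache, sql)
print(first == second, stats["db_calls"])
```

The time-to-live bounds how stale a cached result can get; a shorter value trades more database reads for fresher data.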
ElastiCache Redis
• #1 Key-Value Store*: Fast in-memory data store in the cloud. Use as a database, cache, message broker, queue
• Highly Available & Reliable: Read replicas, multiple primaries, multi-AZ with automatic failover
*: https://db-engines.com/en/ranking
ElastiCache Memcached
Amazon Neptune
• Fully managed graph database
• Supports open graph APIs
• Scalable
• ACID compliant
• Multi-AZ
Amazon Neptune
Fully managed graph database for highly connected data
• Supports Apache TinkerPop™ & W3C RDF graph models
• Store billions of relationships; query with millisecond latency
• 6 replicas of your data across 3 AZs with full backup and restore
• Build powerful queries easily with Gremlin and SPARQL
• GraphQL with AppSync
Use cases for highly connected data
• Social networking
• Recommendations
• Knowledge graphs
• Fraud detection
• Life sciences
• Up to 15 read replicas
(Preview)
Amazon Quantum Ledger Database (QLDB) (Preview)
• Immutable: Maintains a sequenced record of all changes to your data, which cannot be deleted or modified; you have the ability to query and analyze the full history
• Cryptographically verifiable: Uses cryptography to generate a secure output file of your data’s history
• Highly scalable: Executes 2–3X as many transactions as ledgers in common blockchain frameworks
• Easy to use: Lets you use familiar database capabilities like SQL APIs for querying the data
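
To make "immutable" and "cryptographically verifiable" concrete, here is a toy hash-chained journal in Python. It only illustrates the general idea of chaining entries with SHA-256 so that edits to history become detectable; it is not QLDB's implementation or API, and the `Journal` class with its `append` and `verify` methods is hypothetical.

```python
import hashlib
import json


class Journal:
    """Append-only log where each entry's hash covers the previous hash."""

    def __init__(self):
        self.entries = []

    def append(self, data):
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        payload = json.dumps(data, sort_keys=True)
        entry_hash = hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()
        self.entries.append({"data": data, "prev_hash": prev_hash, "hash": entry_hash})
        return entry_hash

    def verify(self):
        """Recompute the chain; an edited entry no longer matches its stored hash."""
        prev_hash = "0" * 64
        for entry in self.entries:
            payload = json.dumps(entry["data"], sort_keys=True)
            expected = hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()
            if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True


journal = Journal()
journal.append({"account": "A", "balance": 100})
journal.append({"account": "A", "balance": 80})
ok_before = journal.verify()
journal.entries[0]["data"]["balance"] = 1000000  # tamper with history
ok_after = journal.verify()
print(ok_before, ok_after)
```

Because every hash depends on the one before it, changing an old entry invalidates the chain from that point on, so anyone holding the latest hash can check the full history.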
How Amazon QLDB works
Common customer use cases
(Now GA)
Performance at scale
Why use a document database?
Use cases for document databases
{
  id: 181276,
  username: 'sue1942',
  name: {first: 'Susan',
         last: 'Benoit'},
}

{
  id: 181276,
  username: 'sue1942',
  name: {first: 'Susan',
         last: 'Benoit'},
  tankfight: {
    hi_score: 3185400,
    global_rank: 5139
  }
}
MongoDB – #1 NoSQL database engine
Source: http://db-engines.com
MongoDB Architecture
Sharded cluster scaling dramatically increases operational complexity
[Figure: three shards, each with a primary and replication; reads / writes are routed to the shards; shard balancing runs across them]
Running MongoDB is difficult…
TCO
Time to scale
Amazon DocumentDB
Fast, scalable, highly available, fully managed MongoDB-compatible database service
DocumentDB Architecture
Separate compute and storage provide 2x throughput of current MongoDB managed services
[Figure: AWS Region with instances handling reads and writes]
• Fast: Millions of requests per second with millisecond latency; scale-out up to 15 read replicas
• More throughput: Separation of storage and compute offloads replication, providing 2x the throughput of current MongoDB managed services
• Automatic storage scaling: DocumentDB will automatically grow the size of your storage volume as your cluster storage needs grow.
• Analytics: Launch instances in minutes for analytical queries and shut them down at the end of the day
Reliable
Fast, reliable, and fully-managed MongoDB-compatible database service
• Failing instances are automatically detected and recovered; no cache warm-up needed
• Replicas are automatically promoted to primary
• Continuous backups with point-in-time recovery. Scheduled snapshots. No performance impact.
• Data is replicated six ways across three AZs
Fully-managed
Fast, reliable, and fully-managed MongoDB-compatible database service
• Up-to-date with the latest patches
• Provision production-ready clusters in minutes
• Over 20 key operational metrics for your clusters at no extra charge
• Deeply integrated with AWS services such as CloudFormation, CloudTrail, CloudWatch, DMS, IAM, VPC, and more.
MongoDB-compatible
Fast, reliable, fully-managed MongoDB-compatible database
• Compatible with MongoDB Community Edition 3.6
• Use the same MongoDB drivers and tools with DocumentDB; as simple as changing an application connection string
• Live migrations with DMS; free for 6 months
• Read scaling is easy with automatic replica set configurations
It’s all about choice
• Performance-oriented
• Cost-oriented
Any questions?
Thank you!