Google's Bigtable: Name: Tunahan YILDIRIM Number: 2195303 Paper: A Distributed Storage System for Structured Data

Google's Bigtable is a distributed storage system for managing structured data at large scale. It provides a simple data model and an API that give clients dynamic control over data layout and format. Bigtable stores data across many machines in large clusters and scales to petabytes of data and trillions of rows. It offers high performance for read and write operations and remains reliable in the face of hardware and software failures.

GOOGLE'S BIGTABLE

Name: Tunahan YILDIRIM
Number: 2195303
Paper: Bigtable: A Distributed Storage System for Structured Data
Outline
Introduction
Data Model
Design
API
Implementation
Performance
Conclusion
What is Bigtable?
Bigtable is a distributed storage system for managing structured data
Sparse
Distributed
Provides high performance
Scales reliably
What is Bigtable?
Not a full relational database
Used in many Google products
Google Earth
Google Documents
Google Finance
Orkut
Personalized Search
Each application uses a different configuration
Goal
Bigtable provides clients with a simple data model that supports dynamic control over data layout and format
Data Model
Sparse, distributed, persistent, multidimensional sorted map
Map indexed by:
Row key
Column key
Timestamp
Row: com.cnn.www
Columns: contents: (the page contents), anchor: (text of links from websites that reference the page)
Timestamps: each of the two anchor cells has a single version, whereas the contents column has three versions
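The map described above can be sketched in a few lines of Python. This is only a toy in-memory illustration, not Bigtable's API: the `put` and `get` helpers, the anchor column names, and the sample values and timestamps are assumptions chosen to mirror the com.cnn.www example.

```python
# Toy model of the data model: (row key, column key, timestamp) -> value.
# put() and get() are hypothetical helpers, not part of any real Bigtable client.
table = {}

def put(row, column, timestamp, value):
    table[(row, column, timestamp)] = value

def get(row, column):
    """Return the most recent version of a cell, or None if the cell is empty."""
    versions = [(ts, v) for (r, c, ts), v in table.items() if r == row and c == column]
    return max(versions)[1] if versions else None

# Three versions of the page contents, one version of each anchor cell.
put("com.cnn.www", "contents:", 3, "<html>v1")
put("com.cnn.www", "contents:", 5, "<html>v2")
put("com.cnn.www", "contents:", 6, "<html>v3")
put("com.cnn.www", "anchor:cnnsi.com", 9, "CNN")
put("com.cnn.www", "anchor:my.look.ca", 8, "CNN.com")

latest = get("com.cnn.www", "contents:")
```

Because the map is sparse, cells that were never written simply have no entry and cost no storage.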
Row:
Row keys are arbitrary strings (up to 64 KB)
Rows are kept in lexicographic order by row key
Each range of rows is called a tablet
URLs are stored with their hostname components reversed for efficiency
maps.google.com/index.html ->
com.google.maps/index.html
Pages from the same domain are grouped into contiguous rows.
Tablets are the unit of distribution and load balancing
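The hostname reversal can be illustrated with a short helper. This is a minimal sketch that assumes URLs are given without a scheme, as in the example above; `row_key_for_url` is a hypothetical name.

```python
def row_key_for_url(url):
    """Reverse the hostname components of a scheme-less URL, keeping the path."""
    host, sep, path = url.partition("/")
    return ".".join(reversed(host.split("."))) + sep + path

# Sorting the keys places pages from the same domain next to each other.
keys = sorted(row_key_for_url(u) for u in [
    "maps.google.com/index.html",
    "www.cnn.com",
    "mail.google.com/inbox",
])
```

After sorting, the two google.com pages are adjacent, so they are likely to fall into the same tablet.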
Column
Column families are the unit of access control
All data stored in the same column family is usually of the same type
A column key is named using the syntax
family:qualifier
The qualifier can be an arbitrary string, but the family name must be printable.
Disk and memory accounting are also performed at the column-family level.
Timestamp
Each cell in Bigtable can contain multiple versions of the same data
Timestamps are 64-bit integers.
To avoid collisions, applications that assign their own timestamps must generate unique ones
Garbage collection automatically deletes older versions.
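One garbage-collection setting described in the Bigtable paper is to keep only the last n versions of a cell. A minimal sketch of that policy, assuming a cell is represented as a plain dictionary from timestamp to value; `keep_last_n` is a hypothetical name.

```python
def keep_last_n(versions, n):
    """versions maps timestamp -> value for one cell; keep only the n newest."""
    newest = sorted(versions, reverse=True)[:n]
    return {ts: versions[ts] for ts in newest}

cell = {3: "v1", 5: "v2", 6: "v3"}
trimmed = keep_last_n(cell, 2)
```

Here the version with timestamp 3 is dropped and the two most recent versions survive.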
API
Provides functions for creating and deleting tables and column families.
Provides functions for changing cluster, table, and column-family metadata
Supports single-row transactions
Sawzall programming language
The Sawzall-based API cannot write data back into Bigtable, but it allows many filtering-based operations.
Writing to Bigtable:
Reading from Bigtable:
Design
Bigtable uses the Google File System (GFS)
GFS stores the log and data files
A Bigtable cluster operates in a shared pool of machines
that run a wide variety of other distributed applications.
Bigtable processes often share the same machines with
processes from other applications.
SSTable
This file format is used internally to store Bigtable data
Provides a persistent, ordered, immutable map from keys to values.
Blocks are typically 64 KB in size, but this is configurable
A lookup needs a single disk seek: a binary search of the in-memory block index finds the right block, which is then read from disk
[Figure: an SSTable made up of 64 KB blocks followed by a block index]
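The lookup described above can be sketched with Python's `bisect` module. This is a toy illustration under stated assumptions: the blocks are short in-memory lists rather than 64 KB disk blocks, the index stores the last key of each block, and `lookup` is a hypothetical name.

```python
import bisect

# Toy SSTable: key-sorted blocks of (key, value) pairs.
blocks = [
    [("apple", 1), ("banana", 2)],
    [("cherry", 3), ("grape", 4)],
    [("melon", 5), ("peach", 6)],
]
# In-memory block index: the last key stored in each block.
block_index = [block[-1][0] for block in blocks]

def lookup(key):
    """Binary-search the in-memory index, then read exactly one block."""
    i = bisect.bisect_left(block_index, key)
    if i == len(blocks):
        return None             # key sorts after every block
    block = blocks[i]           # stands in for the single disk seek
    return dict(block).get(key)
```

Only the block that could contain the key is ever touched, which is why one disk seek is enough.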
Chubby Service
Highly available and persistent distributed lock service
Consists of five active replicas
One of them is elected master and actively serves requests.
Provides a namespace that consists of directories and small files
Implementation
Three major components
A library that is linked into every client
One master server
Many tablet servers (dynamically added or removed)
Master Server
The master is responsible for assigning tablets to tablet servers.
Balancing tablet-server load
Garbage collection of files in GFS
Handling schema changes
Tablet Servers
Each tablet server manages between ten and a thousand tablets
Handles read and write requests
Splits tablets that have grown too large
Each tablet is approximately 100-200 MB in size
Three-Level Hierarchy (Tablet Locations)
Three-Level Hierarchy
First Level
A file stored in Chubby that contains the location of the root tablet
Second Level
The root tablet contains the locations of all METADATA tablets.
Third Level
Each METADATA tablet contains the locations of a set of user tablets
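The three-level lookup can be sketched as follows. This is a toy model under stated assumptions: every tablet is identified by the last row key it covers, `"\xff"` stands in for "end of table", and the tablet and server names as well as the `find` and `locate` helpers are hypothetical.

```python
import bisect

# Level 1: a Chubby file naming the root tablet.
chubby_file = {"root_tablet": "root"}

# Each entry maps the LAST row key a tablet covers to that tablet's location.
tablets = {
    "root": [("m", "meta-1"), ("\xff", "meta-2")],            # root -> METADATA tablets
    "meta-1": [("com.cnn", "server-a"), ("m", "server-b")],   # METADATA -> user tablets
    "meta-2": [("org", "server-c"), ("\xff", "server-d")],
}

def find(entries, row_key):
    """Return the location of the first tablet whose end key is >= row_key."""
    end_keys = [end for end, _ in entries]
    return entries[bisect.bisect_left(end_keys, row_key)][1]

def locate(row_key):
    root = chubby_file["root_tablet"]       # first level: Chubby file
    meta = find(tablets[root], row_key)     # second level: root tablet
    return find(tablets[meta], row_key)     # third level: METADATA tablet
```

A client library would cache the result of `locate`, so most requests skip all three levels.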
Implementation
Each METADATA row stores approximately 1 KB of data in memory
The client library caches tablet locations.
Each tablet is assigned to one tablet server at a time.
The master keeps track of the set of live tablet servers
If a tablet is unassigned, the master assigns it by sending a tablet load request to a tablet server with enough room.
Implementation
Chubby is used to keep track of tablet servers
Starting a tablet server:
The tablet server creates, and acquires an exclusive lock on, a uniquely named file in a specific Chubby directory.
If a tablet server loses its exclusive lock, it stops serving tablets.
It tries to reacquire the exclusive lock as long as the file still exists.
If the file no longer exists, the tablet server will never be able to serve again, so it kills itself.
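Master Startup
Grabs a unique master lock in Chubby to prevent concurrent master instantiations
Scans the servers directory in Chubby to find the live tablet servers.
Communicates with every live tablet server to discover which tablets are already assigned
Scans the METADATA table to learn the set of tablets.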
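Tasks of the Master
Periodically asks each tablet server for the status of its lock, to detect when it is no longer serving its tablets.
Reassigns the tablets of servers that are no longer serving.
If a tablet server does not answer, the master tries to acquire that server's lock itself.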
Tablet Serving
The persistent state of a tablet is stored in GFS.
Updates are committed to a commit log that stores redo records.
Recently committed updates are stored in memory in the memtable
Older updates are stored in a sequence of SSTables.
Tablet Serving
Tablet operations

Read
Write
Read Operation
The tablet server checks that the request is well-formed and that the sender is authorized.
When a read operation arrives at a tablet server, it is executed on a merged view of the SSTables and the memtable.
Write Operation
The server checks that the request is well-formed and that the sender is authorized.
Authorization is performed by reading the list of permitted writers from a Chubby file
The mutation is written to the commit log
Its contents are then inserted into the memtable
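The write and read paths can be sketched together. This is a simplified in-memory illustration that assumes one value per row and leaves out the well-formedness and authorization checks; the `write` and `read` names and the sample rows are hypothetical.

```python
commit_log = []                                # redo records, appended before the memtable changes
memtable = {}                                  # recent writes, held in memory
sstables = [{"row1": "old", "row2": "kept"}]   # older data, newest SSTable last

def write(row, value):
    commit_log.append((row, value))   # 1. log the mutation for recovery
    memtable[row] = value             # 2. apply it to the memtable

def read(row):
    """Merged view: the memtable wins, then SSTables from newest to oldest."""
    if row in memtable:
        return memtable[row]
    for sstable in reversed(sstables):
        if row in sstable:
            return sstable[row]
    return None

write("row1", "new")
```

After the write, a read of row1 is answered from the memtable, while row2 still comes from the SSTable.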
Compactions
Minor compaction
As write operations execute, the size of the memtable increases.
When the memtable size reaches a threshold, the memtable is frozen, a new memtable is created, and the frozen memtable is converted to an SSTable.

Advantages
It shrinks the memory usage of the tablet server, and it reduces the amount of data that has to be read from the commit log during recovery.
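A minor compaction can be sketched as follows. This is a toy version under stated assumptions: the threshold counts entries rather than bytes, an SSTable is modelled as an immutable sorted tuple, and `MEMTABLE_LIMIT` and `write` are hypothetical names.

```python
MEMTABLE_LIMIT = 3     # hypothetical threshold, counted in entries rather than bytes

memtable = {}
sstables = []          # each SSTable is an immutable, key-sorted tuple of pairs

def write(row, value):
    memtable[row] = value
    if len(memtable) >= MEMTABLE_LIMIT:
        frozen = dict(memtable)                          # freeze the current contents
        memtable.clear()                                 # start a fresh memtable
        sstables.append(tuple(sorted(frozen.items())))   # convert the frozen memtable to an SSTable

for i in range(4):
    write(f"row{i}", i)
```

The third write triggers the compaction, so the fourth write lands in an almost empty memtable.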
Merging Compaction
A merging compaction reads the contents of a few SSTables and the memtable, and writes out a new SSTable.
Executed periodically.
Major Compaction
Rewrites all SSTables into exactly one SSTable
This SSTable contains no deleted data
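A major compaction can be sketched as a merge that discards deleted entries. This is a minimal sketch assuming each SSTable is a dictionary and deletions are recorded with a marker object; `TOMBSTONE` and `major_compaction` are hypothetical names.

```python
TOMBSTONE = object()   # hypothetical marker recording that a row was deleted

def major_compaction(sstables):
    """Merge SSTables (oldest first) into one, dropping deleted entries."""
    merged = {}
    for sstable in sstables:       # newer SSTables overwrite older ones
        merged.update(sstable)
    return {k: merged[k] for k in sorted(merged) if merged[k] is not TOMBSTONE}

old = {"a": 1, "b": 2, "c": 3}
new = {"b": TOMBSTONE, "c": 30, "d": 4}
compacted = major_compaction([old, new])
```

Row b disappears entirely from the result, which is how a major compaction reclaims the space used by deleted data.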
Refinements
Locality Groups:
Segregating column families that are not typically accessed together into separate locality groups enables more efficient reads.
Compression:
Compression is very fast and is applied to each SSTable block.
Bloom Filters:
Reduce the number of disk accesses.
A Bloom filter allows us to ask whether an SSTable might contain any data for a specified row/column pair
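A Bloom filter of this kind can be sketched as below. This is a generic illustration, not Bigtable's implementation: the bit-array size, the number of hashes, the use of salted SHA-256, and the row/column key encoding are all assumptions.

```python
import hashlib

class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-256 with the hash number.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key):
        # False means definitely absent; True means possibly present.
        return all(self.bits[pos] for pos in self._positions(key))

bloom = BloomFilter()
bloom.add("com.cnn.www/contents:")
```

When `might_contain` returns False, the SSTable does not need to be read from disk at all; a True answer can occasionally be a false positive.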
Refinements
Caching:
To improve read performance, tablet servers use two levels of caching.
Scan Cache:
The higher-level cache stores the key-value pairs returned by the SSTable interface.
Block Cache:
The lower-level cache stores SSTable blocks that were read from GFS
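The two cache levels can be sketched as follows. This is a toy illustration with unbounded caches and no eviction; the simulated GFS blocks, the `key_to_block` mapping, and the `read` helper are hypothetical.

```python
gfs_reads = []     # records which blocks were fetched from (simulated) GFS

# Simulated SSTable blocks stored in GFS: block id -> {key: value}.
gfs_blocks = {0: {"a": 1, "b": 2}, 1: {"c": 3, "d": 4}}
key_to_block = {"a": 0, "b": 0, "c": 1, "d": 1}

scan_cache = {}    # higher level: key -> value
block_cache = {}   # lower level: block id -> block

def read(key):
    if key in scan_cache:                   # hit in the Scan Cache
        return scan_cache[key]
    block_id = key_to_block[key]
    if block_id not in block_cache:         # miss in the Block Cache: go to GFS
        gfs_reads.append(block_id)
        block_cache[block_id] = gfs_blocks[block_id]
    value = block_cache[block_id][key]
    scan_cache[key] = value
    return value

results = [read("a"), read("a"), read("b"), read("c")]
```

The Scan Cache helps applications that read the same data repeatedly, while the Block Cache helps those that read keys close to ones they read recently, as with "b" after "a" here.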
Performance
Conclusion
More than sixty Google products use Bigtable.
Bigtable provides performance and high availability.
Some users had difficulty adapting to the interface, and new features are planned for the coming years.
THANKS
