NoSQL Databases (MongoDB-Cassandra)

The document provides an overview of NoSQL databases and MongoDB. It discusses the CAP theorem and how different databases achieve different levels of consistency, availability, and partition tolerance. It then covers MongoDB's features like scalability, flexibility, indexing and querying capabilities. The document also discusses MongoDB's data model, CRUD operations, administration and installation.

NoSQL Databases – MongoDB, Cassandra

Session One
CAP Theorem
 Three properties characterize any distributed system:
– Consistency: every read returns the most recent write, so all users see the
same data at the same time
– Availability: every request receives a response indicating whether it
succeeded or failed
– Partition Tolerance: the system continues to operate despite arbitrary
message loss or failure of part of the system
 Theorem: a distributed system can guarantee at most two of the three at the
same time.
 The three properties are the vertices of a triangle; its three sides are CA,
AP and CP.
 CA -> RDBMS, Teradata, Greenplum etc.
 AP -> Cassandra, Voldemort, DynamoDB, Riak, CouchDB etc.
 CP -> HBase, Bigtable, MongoDB, Hypertable etc.
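The CP versus AP trade-off can be illustrated with a toy model. The sketch below is not how any of the listed databases is implemented; the class names `Replica` and `TwoNodeStore` and the `mode` parameter are hypothetical, and the CP behaviour is simplified (real CP systems usually let the majority side keep accepting writes).

```python
class Replica:
    """One copy of a single stored value."""

    def __init__(self, name):
        self.name = name
        self.value = None


class TwoNodeStore:
    """Toy two-replica register; `mode` is "CP" or "AP" (hypothetical names)."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False  # True means a and b cannot talk to each other

    def write(self, replica, value):
        """Return True if the write was accepted, False if it was refused."""
        other = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # Cannot reach the other replica: refuse rather than diverge.
                return False
            # AP: accept locally; the replicas may now disagree.
            replica.value = value
            return True
        # No partition: replicate the write to both copies.
        replica.value = value
        other.value = value
        return True

    def read(self, replica):
        return replica.value


cp = TwoNodeStore("CP")
cp.write(cp.a, 1)
cp.partitioned = True
print(cp.write(cp.a, 2), cp.read(cp.a), cp.read(cp.b))  # False 1 1

ap = TwoNodeStore("AP")
ap.write(ap.a, 1)
ap.partitioned = True
print(ap.write(ap.a, 2), ap.read(ap.a), ap.read(ap.b))  # True 2 1
```

During the partition the CP store gives up availability (the write is refused) but both replicas still agree, while the AP store stays available and the replicas diverge until they are reconciled.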
NoSQL
 NoSQL means "Not only SQL", i.e. not only relational.
 A NoSQL database provides a simple, lightweight mechanism for storing
and retrieving data, with higher scalability and availability than a
traditional RDBMS.
 Horizontal Scalability/ Scale out.
 Schema Free/ Flexible Schema.
 High Write/Read throughput.
 Multiple Data Models.
 Different Interfaces like CLI, HQL, CQL, Language API, REST API etc.
 Handles all varieties of data.
 Programmer friendly.
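Horizontal scaling means spreading the data over several machines. A minimal sketch of the idea, assuming simple hash-modulo placement: the function names `node_for_key` and `distribute` and the node names are hypothetical, and real systems typically use consistent hashing or range-based partitioning instead, because modulo placement moves most keys whenever a node is added.

```python
import hashlib


def node_for_key(key, nodes):
    """Pick a node for `key` by hashing it (simple modulo placement)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


def distribute(keys, nodes):
    """Group keys by the node they are placed on."""
    placement = {node: [] for node in nodes}
    for key in keys:
        placement[node_for_key(key, nodes)].append(key)
    return placement


keys = [f"user:{i}" for i in range(1000)]
three = distribute(keys, ["node-1", "node-2", "node-3"])
four = distribute(keys, ["node-1", "node-2", "node-3", "node-4"])

# Adding a fourth node lowers the number of keys each node has to hold.
print({node: len(ks) for node, ks in three.items()})
print({node: len(ks) for node, ks in four.items()})
```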
NoSQL vs. SQL
SQL Databases | NoSQL Databases
Scale up (vertical scaling). | Scale out (horizontal scaling).
Typically CA: Consistency, Availability. | Typically AP or CP: Partition tolerance plus either Availability or Consistency.
Single data model, i.e. relational. | Multiple data models, e.g. columnar, document, key-value, graph and more.
Single query language, i.e. SQL. | Multiple query interfaces, e.g. simple CLI, HQL, CQL, REST, Thrift, DSLs.
Rigid schema. | Schema-free / flexible schema.
Joins are expensive. | Free from joins.
Good for real-time querying, i.e. point queries. | Good for real-time querying as well as real-time decisioning.
Scales up to a few terabytes. | Scales up to petabytes.
Good for low data traffic. | Good for high volumes of data traffic.
Managing distributed databases is complex, i.e. adding/removing machines is hard. | Most are distributed by nature; it is very easy to add/remove machines in existing clusters.
Good for non-volatile data. | Good for volatile data.
Hard to implement, i.e. schema design, data integrity. | Simple to implement.
Good for transactions, i.e. OLTP. | Good for decisioning, i.e. OLAP.
Write throughput is very low. | Write throughput is very high.
Good for handling structured data. | Good for handling unstructured and semi-structured data.
Not programmer friendly. | Programmer friendly.
NoSQL Data Stores
 NoSQL databases fall into several categories according to their data model,
such as the key-value model, document model, graph model and container
model. Some of them are listed below.
• Columnar stores
• HBase, Cassandra, Hypertable, Bigtable, Accumulo, Cloudata etc.
• Key-Value Stores
• Redis, Voldemort, Riak, Tokyo Cabinet, DynamoDB etc.
• Document Stores
• MongoDB, CouchDB, Couchbase Server, Terrastore etc.
• Graph Databases
• Neo4j, InfiniteGraph, InfoGrid etc.
• XML Databases
• EMC Documentum xDB, Berkeley DB XML etc.
• Many more…
MongoDB

 MongoDB is a distributed, scalable, high-performance, open source
NoSQL database.
 It handles humongous amounts of data and is written in C++.
 It is a document-oriented database.
 It is good for volatile data.
 It scales to terabytes of data.
 It has connectors to Apache Solr and Apache Hadoop.
 It is good for logging systems, storing news data and social network content.
 It has drivers for most popular programming languages.
 It has a RESTful API.
 It is simple to implement and administer.
Installation
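 Download the binary distribution from the MongoDB website.
 Create the data directory for MongoDB (default: /data/db); this is the "directory path" used below.
 Start the MongoDB server:
– cd <MongoDB home directory>/bin
– ./mongod --rest --dbpath="directory path"
 Start the MongoDB client:
– ./mongo <database> (the default database is test)
 Open the web interface:
– http://hostname:28017
 Note: the --rest option and the web interface on port 28017 exist only in older MongoDB releases; both were removed in MongoDB 3.6.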
Features
 A JSON/BSON document is the basic unit of data.
 Programmer friendly.
 Document-Oriented Storage
– JSON-style documents with dynamic schemas offer simplicity and power.
 Full Index Support
– Index on any attribute, primary as well as secondary indexes.
 Replication & High Availability
– Mirror across LANs and WANs for scale and peace of mind.
 Auto Sharding
– Scale horizontally without compromising functionality.
 Querying
– Rich, document-based queries.
 Map/Reduce
– Flexible aggregation and data processing.
 GridFS
– Store files of any size without complicating your stack
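The Map/Reduce feature can be pictured with a plain-Python sketch of the pattern. This is not MongoDB's own map/reduce API; the helper `map_reduce`, the functions `emit_tags` and `sum_counts`, and the sample `posts` documents are all hypothetical.

```python
from collections import defaultdict


def map_reduce(documents, map_fn, reduce_fn):
    """Apply map_fn to every document, group emitted values by key, then reduce."""
    groups = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}


posts = [
    {"author": "ann", "tags": ["nosql", "mongodb"]},
    {"author": "bob", "tags": ["nosql"]},
    {"author": "ann", "tags": ["cassandra", "nosql"]},
]


def emit_tags(doc):
    """Map step: emit (tag, 1) for every tag in the document."""
    for tag in doc["tags"]:
        yield tag, 1


def sum_counts(key, values):
    """Reduce step: add up the counts emitted for one tag."""
    return sum(values)


tag_counts = map_reduce(posts, emit_tags, sum_counts)
print(tag_counts)  # {'nosql': 3, 'mongodb': 1, 'cassandra': 1}
```

The map step works on one document at a time and the reduce step on one key at a time, which is what lets a database run both steps in parallel across shards.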
Data Model
 Basic Unit is a Document
 A document is a collection of key-value pairs.
 A key is a string and a value is a primitive data type, an array or another document.
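A document of this shape maps directly onto a Python dictionary. The sketch below uses only the standard `json` module; the field names and values are made up for illustration, except `_id`, which is the field MongoDB uses as a document's primary key.

```python
import json

# Keys are strings; values are primitives, arrays, or nested documents.
document = {
    "_id": 1,
    "name": "Ann",
    "age": 31,
    "active": True,
    "tags": ["nosql", "mongodb"],                       # array value
    "address": {"city": "Springfield", "zip": "00000"},  # embedded document
}

# Serialize to JSON text and parse it back; the structure survives unchanged.
as_json = json.dumps(document, indent=2)
print(as_json)
round_tripped = json.loads(as_json)
```

BSON, the binary form MongoDB stores, follows the same nesting but adds extra value types such as dates and binary data.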
Analogy of MongoDB to SQL Databases

MongoDB | SQL Databases
Database | Database
Collection | Table
Document | Record/Tuple
CRUD Operations

 Create databases
 Create collections
 Insert documents
 Update documents
 Delete documents
 Drop collections
 Drop databases
 Create indexes
 Drop indexes
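The operations listed above can be sketched with the PyMongo driver. This is one possible walk-through, not the deck's own example: it assumes the third-party `pymongo` package is installed and a server is listening on the default port 27017, and the names `crud_demo`, `demo_db` and `people` are hypothetical.

```python
def crud_demo(uri="mongodb://localhost:27017"):
    """Run each listed operation once against a scratch database."""
    # Imported here so the module loads even where pymongo is not installed.
    from pymongo import ASCENDING, MongoClient

    client = MongoClient(uri)
    db = client["demo_db"]    # a database is created lazily on first write
    people = db["people"]     # so is a collection

    people.insert_one({"name": "Ann", "age": 31})               # insert document
    people.update_one({"name": "Ann"}, {"$set": {"age": 32}})   # update document
    found = people.find_one({"name": "Ann"})                    # read it back
    index_name = people.create_index([("name", ASCENDING)])     # create index
    people.drop_index(index_name)                               # drop index
    people.delete_one({"name": "Ann"})                          # delete document
    people.drop()                                               # drop collection
    client.drop_database("demo_db")                             # drop database
    client.close()
    return found
```

Calling `crud_demo()` requires a running server. Note that there is no explicit "create" call for databases or collections: both come into existence with the first insert.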
Administration

 Create users
 Change database permissions
 Dump data
 Export data
 Import data
 Check load
 Loading files to GridFS
 Checking stats
 Setting up a MongoDB cluster
 Replication
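Several of these tasks are usually done with MongoDB's command-line tools. The sketch below only builds and prints the command lines, it does not run them; the function name `admin_commands` and the database, collection and file names are hypothetical, and the flags should be checked against each tool's `--help` for the installed version.

```python
import shlex


def admin_commands(db, collection, out_dir, data_file):
    """Map administration tasks to the argv of the usual MongoDB tool."""
    return {
        "dump data": ["mongodump", "--db", db, "--out", out_dir],
        "export data": ["mongoexport", "--db", db, "--collection", collection,
                        "--out", data_file],
        "import data": ["mongoimport", "--db", db, "--collection", collection,
                        "--file", data_file],
        "check load": ["mongostat"],
        "load file to GridFS": ["mongofiles", "--db", db, "put", data_file],
    }


commands = admin_commands("demo_db", "people", "dump/", "people.json")
for task, argv in commands.items():
    # shlex.join quotes arguments so the line can be pasted into a shell.
    print(f"{task}: {shlex.join(argv)}")
```

Each list could be passed to `subprocess.run` to execute the tool from a script.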
Questions & Answers
