NoSQL Databases (MongoDB-Cassandra)

The document provides an overview of NoSQL databases and MongoDB. It discusses the CAP theorem and how different databases achieve different levels of consistency, availability, and partition tolerance. It then covers MongoDB's features like scalability, flexibility, indexing and querying capabilities. The document also discusses MongoDB's data model, CRUD operations, administration and installation.

Uploaded by

Enkh-Erdene Lkhagvasuren


NoSQL Databases – MongoDB, Cassandra

Session One
CAP Theorem
• Three properties characterize any distributed system:
– Consistency: every read sees the most recent write, so all users see the same data at the same time.
– Availability: every request receives a response indicating whether it succeeded or failed.
– Partition tolerance: the system continues to operate despite arbitrary message loss or failure of part of the system.
• Theorem: a distributed system can guarantee at most two of these three properties at the same time.
• The three properties form the vertices of a triangle. Its three sides are the possible pairs: CA, AP, and CP.
• CA -> RDBMS, Teradata, Greenplum, etc.
• AP -> Cassandra, Voldemort, DynamoDB, Riak, CouchDB, etc.
• CP -> HBase, Bigtable, MongoDB, Hypertable, etc.
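The CP versus AP trade-off can be sketched with a toy two-replica store. This is only an illustration under simplifying assumptions: the classes TwoNodeStore and Replica are hypothetical, there is a single value, and all writes arrive at replica "a". It does not model how any of the databases listed above actually replicate data.

```python
class Replica:
    """One copy of a single value, held on one node."""

    def __init__(self, name):
        self.name = name
        self.value = None


class TwoNodeStore:
    """Hypothetical two-replica store used only to contrast CP and AP."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False  # True when a and b cannot exchange messages

    def write(self, value):
        """Write through replica a; return True if the write was accepted."""
        if not self.partitioned:
            self.a.value = value
            self.b.value = value  # replication to b succeeds
            return True
        if self.mode == "CP":
            return False  # refuse the write rather than let replicas diverge
        self.a.value = value  # AP: accept locally, b keeps the old value
        return True

    def read(self, replica_name):
        """Read the value held by replica 'a' or 'b'."""
        return getattr(self, replica_name).value


for mode in ("CP", "AP"):
    store = TwoNodeStore(mode)
    store.write("v1")
    store.partitioned = True          # the network splits
    accepted = store.write("v2")      # CP rejects, AP accepts
    print(mode, accepted, store.read("a"), store.read("b"))
```

During the partition the CP store gives up availability (the write is refused) and the AP store gives up consistency (the two replicas return different values).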
NoSQL
• NoSQL stands for "Not only SQL", i.e. not only relational.
• A NoSQL database provides a simple, lightweight mechanism for storing and retrieving data, with higher scalability and availability than a traditional RDBMS.
• Horizontal scalability (scale out).
• Schema-free / flexible schema.
• High write and read throughput.
• Multiple data models.
• Different interfaces such as CLI, HQL, CQL, language APIs, REST APIs, etc.
• Handles all varieties of data.
• Programmer friendly.
NoSQL vs. SQL
SQL Databases | NoSQL Databases
Scale up (vertical scaling). | Scale out (horizontal scaling).
Consistency and availability (CA). | Partition tolerance combined with either consistency or availability (CP or AP).
Single data model, i.e. relational. | Multiple data models, i.e. columnar, document, key-value, graph, and many more.
Single query language, i.e. SQL. | Multiple query interfaces, i.e. simple CLI, HQL, CQL, REST, Thrift, DSLs.
Rigid schema. | Schema-free / flexible schema.
Joins are expensive. | Free from joins.
Good for real-time querying, i.e. point queries. | Good for real-time querying as well as real-time decisioning.
Scales up to a few terabytes. | Scales up to petabytes.
Good for low data traffic. | Good for high volumes of data traffic.
Managing distributed databases is complex, i.e. adding or removing machines is difficult. | Most are distributed by nature; it is very easy to add machines to or remove them from existing clusters.
Good for non-volatile data. | Good for volatile data.
Hard to implement, i.e. schema design and data integrity. | Simple to implement.
Good for transactions, i.e. OLTP. | Good for decisioning, i.e. OLAP.
Write throughput is very low. | Write throughput is very high.
Good for handling structured data. | Good for handling unstructured and semi-structured data.
Not programmer friendly. | Programmer friendly.
NoSQL Data Stores
• There are several categories of NoSQL databases. They use data models such as the key-value model, document model, graph model, and container model. Some of them are listed below.
• Columnar stores: HBase, Cassandra, Hypertable, Bigtable, Accumulo, Cloudata, etc.
• Key-value stores: Redis, Voldemort, Riak, Tokyo Cabinet, DynamoDB, etc.
• Document stores: MongoDB, CouchDB, Couchbase Server, Terrastore, etc.
• Graph databases: Neo4j, InfiniteGraph, InfoGrid, etc.
• XML databases: EMC Documentum xDB, Berkeley DB XML, etc.
• Many more…
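To make the categories above more concrete, the sketch below shows how the same small record might be laid out under four of the data models. The record, the key formats, and the field names are all made up for illustration, and plain Python dictionaries stand in for the actual stores.

```python
import json

# Key-value store: one opaque value looked up by a single key.
key_value = {"user:42": json.dumps({"name": "Asha", "city": "Pune"})}

# Document store: a nested, self-describing document.
document = {
    "_id": 42,
    "name": "Asha",
    "address": {"city": "Pune"},
    "follows": [7],
}

# Columnar (column-family) store: row key -> column family -> column -> value.
columnar = {
    "42": {
        "profile": {"name": "Asha", "city": "Pune"},
        "follows": {"7": ""},
    }
}

# Graph database: nodes plus labeled edges between them.
graph = {
    "nodes": {42: {"name": "Asha"}, 7: {"name": "Ben"}},
    "edges": [(42, "FOLLOWS", 7)],
}


def city_from_each_model():
    """Read the user's city back out of the three models that store it."""
    return (
        json.loads(key_value["user:42"])["city"],
        document["address"]["city"],
        columnar["42"]["profile"]["city"],
    )


def followed_ids(user_id):
    """Follow the FOLLOWS edges that start at user_id in the graph model."""
    return [dst for src, label, dst in graph["edges"]
            if src == user_id and label == "FOLLOWS"]


print(city_from_each_model())
print(followed_ids(42))
```

The key-value layout can only fetch the whole value by key, while the document, columnar, and graph layouts expose progressively more structure to queries.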
MongoDB

• MongoDB is a distributed, scalable, high-performance, open source NoSQL database.
• It handles humongous amounts of data and is written in C++.
• It is a document-oriented database.
• It is good for volatile data.
• It scales to terabytes of data.
• It has connectors to Apache Solr and Apache Hadoop.
• It is good for logging systems, storing news data, and social network content.
• It has drivers for most popular programming languages.
• It has a RESTful API.
• It is simple to implement and administer.
Installation
• Download the binary distribution from the MongoDB website.
• Create the data directory for MongoDB (default: /data/db), referred to below as "directory path".
• Start the MongoDB server:
– cd <mongo home directory>/bin
– ./mongod --rest --dbpath="directory path"
• Start the MongoDB client:
– ./mongo database (the default database is test).
• Open the web interface:
– hostname:28017
Features
• A JSON/BSON document is the basic unit of data.
• Programmer friendly.
• Document-oriented storage
– JSON-style documents with dynamic schemas offer simplicity and power.
• Full index support
– Index on any attribute; primary as well as secondary indexes.
• Replication and high availability
– Mirror across LANs and WANs for scale and peace of mind.
• Auto-sharding
– Scale horizontally without compromising functionality.
• Querying
– Rich, document-based queries.
• Map/Reduce
– Flexible aggregation and data processing.
• GridFS
– Store files of any size without complicating your stack.
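The idea behind horizontal scaling through sharding can be sketched as routing each document key to one of several shards. The sketch below uses simple hash-modulo routing as an assumption for illustration. It is not MongoDB's actual sharding mechanism, and the function names shard_for and distribute are hypothetical.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def distribute(keys, num_shards):
    """Group keys into num_shards lists according to shard_for."""
    shards = [[] for _ in range(num_shards)]
    for key in keys:
        shards[shard_for(key, num_shards)].append(key)
    return shards


user_keys = [f"user{i}" for i in range(20)]
for index, shard in enumerate(distribute(user_keys, 4)):
    print(index, shard)
```

Because the hash is stable, the same key always lands on the same shard, so reads can be routed to the node that received the write. A limitation of plain modulo routing is that changing the number of shards moves most keys, which is one reason real systems use more elaborate schemes.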
Data Model
• The basic unit is a document.
• A document is a collection of key-value pairs.
• A key is a string, and a value is a primitive data type, an array, or another document.
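The data model described above maps naturally onto nested dictionaries. The sketch below checks the stated rules (string keys; values that are primitives, arrays, or documents) on a made-up example document. The helper names and the choice of primitive types are assumptions for illustration. BSON supports additional types, such as dates and binary data, that are not modeled here.

```python
import json

# Primitive types allowed in this simplified model (an assumption).
PRIMITIVES = (str, int, float, bool, type(None))


def is_valid_value(value) -> bool:
    """A value is a primitive, an array of values, or a nested document."""
    if isinstance(value, PRIMITIVES):
        return True
    if isinstance(value, list):
        return all(is_valid_value(item) for item in value)
    return is_valid_document(value)


def is_valid_document(doc) -> bool:
    """A document is a mapping from string keys to valid values."""
    if not isinstance(doc, dict):
        return False
    return all(isinstance(key, str) and is_valid_value(value)
               for key, value in doc.items())


# Hypothetical document: primitives, an array, and an embedded document.
post = {
    "title": "Intro to NoSQL",
    "published": True,
    "tags": ["mongodb", "cassandra"],
    "author": {"name": "Asha", "followers": 12},
}

print(is_valid_document(post))
print(json.dumps(post, indent=2))
```

A document that passes this check also survives a JSON round trip unchanged, which is what makes JSON-style documents convenient to exchange between programs.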
Analogy of MongoDB to SQL Databases

MongoDB | SQL Databases
Database | Database
Collection | Table
Document | Record/Tuple
CRUD Operations

• Create databases
• Create collections
• Insert documents
• Update documents
• Delete documents
• Drop collections
• Drop databases
• Create indexes
• Drop indexes
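Several of the operations listed above can be illustrated with a small in-memory stand-in. ToyDatabase and ToyCollection are hypothetical classes written for this sketch. They are not the MongoDB shell or any driver API, they match documents only by exact field equality, and they leave out indexes entirely.

```python
import itertools


class ToyCollection:
    """In-memory stand-in for a collection of documents."""

    def __init__(self):
        self._docs = {}
        self._ids = itertools.count(1)

    def insert(self, doc):
        """Store a copy of doc, assigning an _id if it has none."""
        stored = dict(doc)
        stored.setdefault("_id", next(self._ids))
        self._docs[stored["_id"]] = stored
        return stored["_id"]

    def find(self, query=None):
        """Return copies of documents whose fields equal those in query."""
        query = query or {}
        return [dict(doc) for doc in self._docs.values()
                if all(doc.get(k) == v for k, v in query.items())]

    def update(self, query, changes):
        """Apply changes to every matching document; return the count."""
        matched = [doc["_id"] for doc in self.find(query)]
        for doc_id in matched:
            self._docs[doc_id].update(changes)
        return len(matched)

    def delete(self, query):
        """Remove every matching document; return the count."""
        matched = [doc["_id"] for doc in self.find(query)]
        for doc_id in matched:
            del self._docs[doc_id]
        return len(matched)


class ToyDatabase:
    """In-memory stand-in for a database holding named collections."""

    def __init__(self):
        self._collections = {}

    def collection(self, name):
        """Return the named collection, creating it on first use."""
        return self._collections.setdefault(name, ToyCollection())

    def drop_collection(self, name):
        """Drop the named collection; return True if it existed."""
        return self._collections.pop(name, None) is not None


db = ToyDatabase()
users = db.collection("users")                       # create collection
users.insert({"name": "Asha", "city": "Pune"})       # insert documents
users.insert({"name": "Ben", "city": "Oslo"})
users.update({"name": "Asha"}, {"city": "Mumbai"})   # update a document
users.delete({"name": "Ben"})                        # delete a document
print(users.find())
```

As in the sketch, a collection needs no schema declared up front: it comes into existence when first used, and each inserted document carries its own fields.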
Administration

• Create users
• Change database permissions
• Dump data
• Export data
• Import data
• Check load
• Load files into GridFS
• Check stats
• Set up a MongoDB cluster
• Replication
Questions & Answers
