Aggregation Framework
in MongoDB
Overview - Part-1
What is Aggregation
English definition: The act of gathering
something together.
Database definition: Aggregations are
operations that process data sets,
group records, perform computations
on those records, and return the
computed results.
Aggregation Approaches in
MongoDB
● Aggregation Pipeline
● Map-Reduce
● Single Purpose Aggregation Operations
● Hadoop Connector
Aggregation Pipeline
Similar to a pipeline in UNIX.
In UNIX:
cat file.txt | grep abc | wc -l
In MongoDB:
Collection -> Group -> Limit -> Sort -> Output
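
The pipeline idea can be sketched in plain JavaScript, without a MongoDB server. This is only an illustration of stages feeding one another, not how MongoDB implements it; `runPipeline`, `match`, `sortBy`, `limit`, and the sample data are hypothetical names introduced here.

```javascript
// Plain-JavaScript sketch of the pipeline idea; this is not MongoDB code.
// Each stage is a function from an array of documents to a new array.
const runPipeline = (docs, stages) =>
  stages.reduce((current, stage) => stage(current), docs);

// Hypothetical stand-ins for $match, $sort and $limit.
const match = (predicate) => (docs) => docs.filter(predicate);
const sortBy = (field, direction) => (docs) =>
  [...docs].sort((a, b) => direction * (a[field] - b[field]));
const limit = (n) => (docs) => docs.slice(0, n);

// Illustrative sample data (not from the slides).
const employees = [
  { name: "a", salary: 300 },
  { name: "b", salary: 100 },
  { name: "c", salary: 200 },
];

// The output of each stage becomes the input of the next one,
// just as with the UNIX pipe above.
const topTwo = runPipeline(employees, [
  match((doc) => doc.salary >= 200),
  sortBy("salary", -1),
  limit(2),
]);
console.log(topTwo.map((doc) => doc.name));
```

As with the UNIX pipe, no stage needs to know about the others; it only consumes documents and emits documents.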
MapReduce
● Two-phase process: mapper and reducer.
● Uses JavaScript functions for the map-reduce
job.
● Less efficient, as it runs on only one thread.
● Optional finalize stage to make final modifications.
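
The map, reduce, and finalize phases listed above can be sketched in plain JavaScript. This mimics the idea only and assumes nothing about MongoDB internals; the `mapReduce` helper, its callback signatures, and the sample data are hypothetical.

```javascript
// Plain-JavaScript sketch of the map, reduce and finalize phases.
// It mimics the idea only; it is not MongoDB's mapReduce command.
function mapReduce(docs, map, reduce, finalize) {
  const groups = new Map();
  // Map phase: each document emits zero or more (key, value) pairs.
  for (const doc of docs) {
    map(doc, (key, value) => {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    });
  }
  // Reduce phase, followed by the optional finalize step for each key.
  const result = {};
  for (const [key, values] of groups) {
    const reduced = reduce(key, values);
    result[key] = finalize ? finalize(key, reduced) : reduced;
  }
  return result;
}

// Illustrative data: total salary per department.
const staff = [
  { dept: "hr", salary: 100 },
  { dept: "it", salary: 250 },
  { dept: "it", salary: 150 },
];

const totals = mapReduce(
  staff,
  (doc, emit) => emit(doc.dept, doc.salary),
  (key, values) => values.reduce((sum, v) => sum + v, 0),
  (key, total) => ({ total })
);
console.log(totals);
```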
Aggregation vs. MapReduce
Ease of use:
  Aggregation Pipeline: Easy to use; you only need the built-in operators.
  MapReduce: Complex, with a steep learning curve.
Sharding:
  Aggregation Pipeline: Supports non-sharded and sharded input collections.
  MapReduce: Supports non-sharded and sharded input collections.
Output:
  Aggregation Pipeline: Returns results inline.
  MapReduce: Returns results inline, or writes them to a collection (new, merge, replace, or reduce).
Flexibility:
  Aggregation Pipeline: Limited to the operators and expressions supported by the aggregation pipeline.
  MapReduce: Custom map, reduce, and finalize JavaScript functions offer flexibility in the aggregation logic.
Single Purpose Aggregation
Operations
● Each serves a single aggregation purpose,
such as count, distinct, or group.
● Limited in scope compared to the aggregation
pipeline and map-reduce.
● group does not support data in sharded
collections, and its result must
not exceed 16 MB.
Aggregation Pipeline Operators
● $group
● $match
● $project
● $limit
● $sort
● $unwind
● $skip
SQL to Aggregation Mapping Chart
SQL Operator    Aggregation Operator
WHERE           $match
GROUP BY        $group
SELECT          $project
LIMIT           $limit
ORDER BY        $sort
SUM()           $sum
COUNT()         $sum
JOIN            $unwind (not an exact equivalent of JOIN; it unwinds
                an array embedded in the document)
Examples
SQL query:
SELECT COUNT(*) AS COUNT
FROM EMPLOYEE
Aggregation query:
db.employee.aggregate([
  { $group: { _id: null, count: { $sum: 1 } } }
])
Explanation:
Group by: nothing (_id: null puts every document in one group).
Sum: adds 1 to the count field for each
record.

SQL query:
SELECT SUM(SALARY) AS TOTAL
FROM EMPLOYEE
Aggregation query:
db.employee.aggregate([
  { $group: { _id: null, total: { $sum: "$salary" } } }
])
Explanation:
Group by: nothing.
Sum: adds up the salary field of each
document and puts the result in total.
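
What the two `$group` stages above compute can be sketched in plain JavaScript. `groupAll` is a hypothetical helper written for this illustration, not a MongoDB API, and the sample documents are made up.

```javascript
// Plain-JavaScript sketch of a $group stage with _id: null and one $sum.
// groupAll is a hypothetical helper, not a MongoDB API.
function groupAll(docs, sumOf) {
  // _id: null means every document lands in the same single group.
  let acc = 0;
  for (const doc of docs) {
    // A number (such as 1) is added once per document;
    // a "$field" string adds the value of that field instead.
    acc += typeof sumOf === "string" ? doc[sumOf.slice(1)] : sumOf;
  }
  return { _id: null, value: acc };
}

// Illustrative sample data (not from the slides).
const sampleEmployees = [{ salary: 1000 }, { salary: 2500 }, { salary: 1500 }];

const countResult = groupAll(sampleEmployees, 1);         // like COUNT(*)
const totalResult = groupAll(sampleEmployees, "$salary"); // like SUM(SALARY)
console.log(countResult.value, totalResult.value);
```

This is why COUNT() and SUM() both map to `$sum`: counting is just summing the constant 1.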
Example Cont.
SQL query:
SELECT DEPARTMENT_ID,
SUM(SALARY) AS TOTAL FROM
EMPLOYEE GROUP BY
DEPARTMENT_ID ORDER BY
TOTAL
Aggregation query:
db.employee.aggregate( [
  { $group: { _id: "$department_id", total: { $sum: "$salary" } } },
  { $sort: { total: 1 } }
] )
Explanation:
Group by: department.
Sum: salary.
Order by: total salary of each
department, ascending.
Example Cont.
SQL query:
SELECT DEPT_ID, SUM(SALARY)
AS TOTAL FROM EMPLOYEE
WHERE AGE > 25 GROUP BY
DEPT_ID HAVING TOTAL > 5000
Aggregation query:
db.employee.aggregate( [
  { $match: { age: { $gt: 25 } } },
  { $group: { _id: "$dept_id", total: { $sum: "$salary" } } },
  { $match: { total: { $gt: 5000 } } }
] )
Explanation:
Where: age greater than 25 (the first $match, applied before grouping).
Group by: department id.
Sum: salary.
Having: total salary of each department
greater than 5000 (the second $match, applied after grouping).
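
The difference between the two `$match` stages (one acting like WHERE, the other like HAVING) can be traced in plain JavaScript. This is a sketch over made-up sample data, not MongoDB code; all variable names are illustrative.

```javascript
// Plain-JavaScript sketch of match -> group -> match; sample data is illustrative.
const people = [
  { dept_id: 1, age: 30, salary: 4000 },
  { dept_id: 1, age: 40, salary: 3000 },
  { dept_id: 1, age: 22, salary: 9000 }, // dropped by the first match (age <= 25)
  { dept_id: 2, age: 35, salary: 2000 },
];

// First $match: behaves like WHERE and runs before grouping.
const olderThan25 = people.filter((doc) => doc.age > 25);

// $group: one running salary total per dept_id.
const totalsByDept = new Map();
for (const doc of olderThan25) {
  totalsByDept.set(doc.dept_id, (totalsByDept.get(doc.dept_id) || 0) + doc.salary);
}

// Second $match: behaves like HAVING and runs on the grouped totals.
const bigDepts = [...totalsByDept]
  .map(([deptId, total]) => ({ _id: deptId, total }))
  .filter((group) => group.total > 5000);

console.log(bigDepts);
```

The 22-year-old's salary never reaches the total because filtering happens before grouping; department 2 is removed only after its total has been computed.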
$unwind Operator
Decomposes an embedded array into flat
documents, pairing each entry in the
array with the outer fields.
Example:
Input:
{ _id: "blog", tags: [ "social", "economic" ] }
Output:
{ _id: "blog", tags: "social" }
{ _id: "blog", tags: "economic" }
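
The decomposition described above can be sketched in plain JavaScript. `unwind` here is a hypothetical helper, and it simply skips documents whose field is not an array, which is a simplification that may differ from how a given MongoDB version handles that case.

```javascript
// Plain-JavaScript sketch of the $unwind idea; unwind is a hypothetical helper.
function unwind(docs, field) {
  const out = [];
  for (const doc of docs) {
    const values = doc[field];
    // Simplification: documents without an array in this field are skipped.
    if (!Array.isArray(values)) continue;
    // Emit one copy of the outer document per array entry.
    for (const value of values) out.push({ ...doc, [field]: value });
  }
  return out;
}

const unwound = unwind([{ _id: "blog", tags: ["social", "economic"] }], "tags");
console.log(unwound);
```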
Mongo Aggregation Optimization
MongoDB rearranges the pipeline
operations to optimize aggregation
performance.
● Pipeline Sequence Optimization.
● Projection Optimization.
Pipeline Sequence Optimization
● $sort + $skip + $limit
As written:
{ $sort: { salary: -1 } },
{ $skip: 10 },
{ $limit: 10 }
After optimization:
{ $sort: { salary: -1 } },
{ $limit: 20 },
{ $skip: 10 }
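
That the reordered sequence returns the same documents can be checked in plain JavaScript. This is an illustrative sketch over generated sample salaries, not MongoDB code; the benefit of the rewritten form is that the sort only has to keep track of the top 20 documents.

```javascript
// Plain-JavaScript check that skip 10 + limit 10 equals limit 20 + skip 10
// after the same sort. The 50 sample salaries are generated and all distinct.
const salaries = Array.from({ length: 50 }, (_, i) => ({ salary: (i * 37) % 101 }));

// $sort: { salary: -1 }
const sortedDesc = [...salaries].sort((a, b) => b.salary - a.salary);

// As written: skip 10 documents, then keep the next 10.
const asWritten = sortedDesc.slice(10).slice(0, 10);

// Rewritten: keep the first 20 documents, then skip 10 of them.
const rewritten = sortedDesc.slice(0, 20).slice(10);

const sameResult = JSON.stringify(asWritten) === JSON.stringify(rewritten);
console.log(sameResult);
```

The limit grows from 10 to 20 because the skipped documents must still be inside the limited window before they are discarded.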
Projection Optimization
● $project
● Reduces the amount of data passing
through the pipeline stages, which
improves performance.
● The example below emits only salary.
db.employee.aggregate(
  [ { $match: { "name": "xyz" } },
    { $project: { salary: 1, _id: 0 } } ]
)
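
The effect of the `$project` stage above can be sketched in plain JavaScript. `project` is a hypothetical helper that handles only simple inclusion specs (`field: 1`) and the `_id: 0` exclusion; the sample document is made up.

```javascript
// Plain-JavaScript sketch of an inclusion-style $project; project is hypothetical.
function project(docs, spec) {
  const fields = Object.keys(spec).filter((key) => key !== "_id" && spec[key] === 1);
  // _id is kept unless it is explicitly excluded with _id: 0.
  const keepId = spec._id !== 0;
  return docs.map((doc) => {
    const out = {};
    if (keepId && "_id" in doc) out._id = doc._id;
    for (const key of fields) if (key in doc) out[key] = doc[key];
    return out;
  });
}

// Illustrative document that already passed the $match on name.
const matched = [{ _id: 7, name: "xyz", salary: 4200, age: 31 }];

const slim = project(matched, { salary: 1, _id: 0 });
console.log(slim);
```

Every later stage now handles one small field per document instead of the whole document.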
Restrictions
● An output BSON document cannot exceed
16 MB; exceeding this limit raises an error.
● If a single aggregation operation consumes
more than 10 percent of system RAM, the
operation produces an error.
Aggregation Examples
● Download the data from the link below:
http://media.mongodb.org/zips.json
● Practice the sample problems from the link
below:
http://docs.mongodb.org/manual/tutorial/aggregation-zip-code-data-set/
References
● http://docs.mongodb.org/
Good Examples:
● http://rubayeet.wordpress.com/2013/12/29/web-analytics-using-mongodb-aggregation-framework/
● http://derickrethans.nl/aggregation-framework.html
● http://architects.dzone.com/articles/using-mongodb-aggregation
Anuj Jain
● ajain@equalexperts.com
● anuj2jain@gmail.com

Aggregation Framework in MongoDB Overview Part-1

  • 2. What is Aggregation English Definition:- The act of gathering something together. Database Definition:- Aggregation are operations that process on the data sets, group some data records, do some computation on that records and return computed results.
  • 3. Aggregration Approach in MongoDB ● Aggregation Pipeline ● Map-Reduce ● Single Purpose Aggregation Operations ● Hadoop Connector
  • 4. Aggregation-Pipeline Similar to pipeline in UNIX. In Unix - Cat file.txt | grep abc | wc -l In MongoDB - Group Limit SortCollection Output
  • 5. MapReduce ● Two Phase process – Mapper & Reducer. ● Use JavaScript function for map-reduce job. ● Less efficient as it run over only one thread. ● Finalise stage to make final modification.
  • 7. Aggregarion Vs. MapReduce Easy to use, just need to use the build in operators. Complex and steep learning curve. Supports non-sharded and sharded input collections. Supports non-sharded and sharded input collections. Returns results inline. Return result in inline, new collection, merge, replace, reduce. Limited to the operators and expressions supported by the aggregation pipeline. Custom map, reduce and finalize JavaScript functions offer flexibility to aggregation logic.
  • 8. Single Purpose Aggregation Operations ● Serve a single aggregation purpose, such as count, distinct, or group. ● Limited in scope compared to the aggregation pipeline and map-reduce. ● group does not support data in sharded collections, and its result must not exceed 16 MB.
  • 9. Aggregation Pipeline Operators ● $group ● $match ● $project ● $limit ● $sort ● $unwind ● $skip
  • 10. SQL to Aggregation Mapping Chart
      SQL Operators | Aggregation Operators
      WHERE | $match
      GROUP BY | $group
      SELECT | $project
      LIMIT | $limit
      ORDER BY | $sort
      SUM() | $sum
      COUNT() | $sum
      JOIN | $unwind (not an exact equivalent of JOIN; it unwinds an array embedded in the document)
  • 11. Examples (SQL query, then the equivalent aggregation query)
      SQL: SELECT COUNT(*) AS COUNT FROM EMPLOYEE
      Aggregation: db.employee.aggregate([ { $group: { _id: null, count: { $sum: 1 } } } ])
      Explanation: Group by: nothing (_id: null). Sum: add 1 to the count field for each record.
      SQL: SELECT SUM(SALARY) AS TOTAL FROM EMPLOYEE
      Aggregation: db.employee.aggregate([ { $group: { _id: null, total: { $sum: "$salary" } } } ])
      Explanation: Group by: nothing (_id: null). Sum: add up the salary field of each document and put the result in total.
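To make the difference between `$sum: 1` and `$sum: "$salary"` concrete, here is a minimal plain-JavaScript sketch of what the two `$group` stages compute when `_id` is `null`. It runs on a hypothetical in-memory array rather than a real collection, and the variable names are illustrative only.

```javascript
// Hypothetical in-memory stand-in for the employee collection.
const staff = [{ salary: 1000 }, { salary: 2500 }, { salary: 1500 }];

// { $group: { _id: null, count: { $sum: 1 } } }
// One group for all documents; add 1 for each document seen.
const countResult = staff.reduce(
  (acc) => ({ _id: null, count: acc.count + 1 }),
  { _id: null, count: 0 }
);

// { $group: { _id: null, total: { $sum: "$salary" } } }
// One group for all documents; add the salary field of each document.
const totalResult = staff.reduce(
  (acc, doc) => ({ _id: null, total: acc.total + doc.salary }),
  { _id: null, total: 0 }
);

console.log(countResult, totalResult);
```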
  • 12. Example Cont.
      SQL: SELECT DEPARTMENT_ID, SUM(SALARY) AS TOTAL FROM EMPLOYEE GROUP BY DEPARTMENT_ID ORDER BY TOTAL
      Aggregation: db.employee.aggregate([ { $group: { _id: "$department_id", total: { $sum: "$salary" } } }, { $sort: { total: 1 } } ])
      Explanation: Group by: department. Sum: salary. Order by: total salary of each department, ascending.
  • 13. Example Cont.
      SQL: SELECT DEPT_ID, SUM(SALARY) AS TOTAL FROM EMPLOYEE WHERE AGE > 25 GROUP BY DEPT_ID HAVING TOTAL > 5000
      Aggregation: db.employee.aggregate([ { $match: { age: { $gt: 25 } } }, { $group: { _id: "$dept_id", total: { $sum: "$salary" } } }, { $match: { total: { $gt: 5000 } } } ])
      Explanation: Where: the first $match filters documents before grouping. Group by: department id. Sum: salary. Having: the second $match filters on the total salary of each department.
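The role of the two `$match` stages (one acting as WHERE, one as HAVING) can be traced with a plain-JavaScript sketch over in-memory data. The function name, its parameters, and the sample rows are hypothetical; this mirrors the logic of the pipeline above rather than calling MongoDB.

```javascript
// Hypothetical sample rows standing in for the employee collection.
const employeeRows = [
  { dept_id: 1, age: 30, salary: 4000 },
  { dept_id: 1, age: 40, salary: 3000 },
  { dept_id: 1, age: 22, salary: 9000 }, // dropped by the age filter
  { dept_id: 2, age: 35, salary: 2000 },
];

function departmentTotals(docs, minAgeExclusive, minTotalExclusive) {
  // First $match (WHERE): filter individual documents before grouping.
  const filtered = docs.filter((doc) => doc.age > minAgeExclusive);

  // $group (GROUP BY + SUM): accumulate salary per department id.
  const totals = new Map();
  for (const doc of filtered) {
    totals.set(doc.dept_id, (totals.get(doc.dept_id) || 0) + doc.salary);
  }

  // Second $match (HAVING): filter the grouped results on the computed total.
  return [...totals]
    .map(([_id, total]) => ({ _id, total }))
    .filter((group) => group.total > minTotalExclusive);
}

const havingResult = departmentTotals(employeeRows, 25, 5000);
console.log(havingResult);
```

The same `$match` operator plays both roles; only its position relative to `$group` decides whether it filters raw documents or grouped totals.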
  • 15. $unwind Operator Decomposes an embedded array into flat documents, pairing each array entry with the outer fields.
      Example input: { _id: "blog", tags: [ "social", "economic" ] }
      Output after unwinding tags: { _id: "blog", tags: "social" } and { _id: "blog", tags: "economic" }
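The unwinding step can be sketched in a few lines of plain JavaScript. This is a simplified illustration that assumes the named field is always an array; the real `$unwind` stage has additional rules for missing or non-array values that are not modeled here. The `unwind` helper name is hypothetical.

```javascript
// Simplified sketch of unwinding: emit one output document per array entry,
// copying the outer fields. Assumes doc[field] is always an array.
function unwind(docs, field) {
  return docs.flatMap((doc) =>
    doc[field].map((entry) => ({ ...doc, [field]: entry }))
  );
}

const unwound = unwind(
  [{ _id: "blog", tags: ["social", "economic"] }],
  "tags"
);
console.log(unwound);
```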
  • 16. Mongo Aggregation Optimization MongoDB rearranges the pipeline operations to optimize aggregation performance. ● Pipeline Sequence Optimization. ● Projection Optimization.
  • 18. Pipeline Sequence Optimization ● $sort + $skip + $limit
      As written: { $sort: { salary: -1 } }, { $skip: 10 }, { $limit: 10 }
      After optimization: { $sort: { salary: -1 } }, { $limit: 20 }, { $skip: 10 }
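A quick way to see why this rewrite is safe: skipping 10 and then taking 10 yields the same documents as taking 20 and then skipping 10. The plain-JavaScript check below uses array slicing on hypothetical generated data; it illustrates the equivalence only and does not show how MongoDB performs the rewrite internally.

```javascript
// Hypothetical data: 50 documents with salaries 100, 200, ..., 5000.
const salaryDocs = Array.from({ length: 50 }, (_, i) => ({ salary: (i + 1) * 100 }));

// { $sort: { salary: -1 } }: highest salary first.
const sortedDocs = [...salaryDocs].sort((a, b) => b.salary - a.salary);

// As written: { $skip: 10 }, { $limit: 10 }
const asWritten = sortedDocs.slice(10).slice(0, 10);

// After optimization: { $limit: 20 }, { $skip: 10 }
const afterRewrite = sortedDocs.slice(0, 20).slice(10);

console.log(JSON.stringify(asWritten) === JSON.stringify(afterRewrite));
```

Placing the limit directly after the sort means only the top 20 documents ever need to be kept, instead of the fully sorted collection.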
  • 19. Projection Optimization ● $project ● Reduces the amount of data passing between pipeline stages, which helps improve performance. ● The example below emits only the salary field: db.employee.aggregate([ { $match: { "name": "xyz" } }, { $project: { salary: 1, _id: 0 } } ])
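  • 20. Restrictions ● The output BSON document cannot exceed 16 MB; exceeding this limit raises an error. ● If a single aggregation operation consumes more than 10 percent of system RAM, the operation fails with an error. (This applies to older releases such as MongoDB 2.4; from 2.6 on, each pipeline stage is instead limited to 100 MB of RAM unless allowDiskUse is enabled.)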
  • 21. Aggregation Examples ● Download the sample data from: http://media.mongodb.org/zips.json ● Practice the sample problems at: http://docs.mongodb.org/manual/tutorial/aggregation-zip-code-data-set/
  • 22. References ● http://docs.mongodb.org/ Good examples: ● http://rubayeet.wordpress.com/2013/12/29/web-analytics-using-mongodb-aggregation-framework/ ● http://derickrethans.nl/aggregation-framework.html ● http://architects.dzone.com/articles/using-mongodb-aggregation