Aggregation and Indexing
Assignment No. 11
Problem Statement: Implement aggregation and indexing with a suitable example
using MongoDB.
Aggregation operations process data records and return computed results. They
group values from multiple documents together and can perform a variety of operations on the
grouped data to return a single result. In SQL, count(*) used with group by is an equivalent of MongoDB
aggregation.
The aggregate() Method
For aggregation in MongoDB, use the aggregate() method. Its basic syntax is as follows:
>db.COLLECTION_NAME.aggregate(AGGREGATE_OPERATION)
Example
_id: ObjectId(7df78ad8902c)
url: 'http://www.tutorialspoint.com',
_id: ObjectId(7df78ad8902d)
url: 'http://www.tutorialspoint.com',
'NoSQL'], likes: 10
},
Now, to display a list stating how many tutorials are written by each user in the above
collection, use the following aggregate() method:
>db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$sum : 1}}}])
"result" : [
},
{
"_id" : "Neo4j","num_tutorial" : 1
}],
"ok" : 1
}
The SQL equivalent of the above query is: select by_user, count(*) from mycol
group by by_user.
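The grouping step described above can be sketched in plain JavaScript, without a running MongoDB server. This is only an illustration of what a count-per-key grouping computes, not MongoDB's implementation. The sample documents, the user names, and the helper name countBy are assumptions made for this sketch.

```javascript
// Hypothetical sample documents; only the by_user field matters here.
const mycol = [
  { title: "A", by_user: "alice" },
  { title: "B", by_user: "alice" },
  { title: "C", by_user: "bob" },
];

// Plain-JavaScript stand-in for a grouping stage that counts documents
// per distinct value of one field (like count(*) ... group by in SQL).
function countBy(docs, field) {
  const counts = new Map();
  for (const doc of docs) {
    const key = doc[field];
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  // Emit one result document per distinct key.
  return Array.from(counts, ([key, n]) => ({ _id: key, num_tutorial: n }));
}

const result = countBy(mycol, "by_user");
console.log(result);
```

Each result document carries the grouping key in _id and the count in num_tutorial, which mirrors the shape of the "result" array shown above.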
Pipeline Concept
In UNIX, a shell pipeline executes an operation on some input and passes its output
as the input to the next command, and so on. MongoDB supports the
same concept in its aggregation framework. There is a set of possible stages, and each
stage takes a set of documents as input and produces a resulting set of documents (or the
final resulting JSON document at the end of the pipeline). This output can in turn be used as
the input for the next stage, and so on.
$match: This is a filtering operation, so it can reduce the number of documents that
are given as input to the next stage.
$skip: This skips forward in the list of documents by a given
number of documents.
$limit: This limits the number of documents to look at to the given number, starting
from the current position.
$unwind: This is used to unwind documents that use arrays. When using an array, the
data is in effect pre-joined, and this operation undoes that to produce individual
documents again. This stage therefore increases the number of documents for the next
stage.
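The four stages above can be imitated with ordinary array functions to show how documents flow from one stage to the next. This is a minimal sketch in plain JavaScript under stated assumptions: the helper names (match, skip, limit, unwind, runPipeline) and the sample documents are hypothetical, and the real stages are expressed as pipeline documents passed to aggregate(), not as functions.

```javascript
// Each helper returns a stage: a function from an array of documents
// to a new array of documents.
const match = (pred) => (docs) => docs.filter(pred);
const skip = (n) => (docs) => docs.slice(n);
const limit = (n) => (docs) => docs.slice(0, n);
// unwind emits one copy of the document per element of the array field;
// a document whose field is missing produces no output in this sketch.
const unwind = (field) => (docs) =>
  docs.flatMap((doc) => (doc[field] || []).map((v) => ({ ...doc, [field]: v })));

// Run the stages in order, feeding each output into the next stage.
const runPipeline = (docs, stages) =>
  stages.reduce((acc, stage) => stage(acc), docs);

// Hypothetical documents, not the data from the example above.
const docs = [
  { title: "A", likes: 5, tags: ["x"] },
  { title: "B", likes: 50, tags: ["x", "y"] },
  { title: "C", likes: 70, tags: ["y", "z"] },
  { title: "D", likes: 90, tags: ["z"] },
];

const out = runPipeline(docs, [
  match((d) => d.likes > 10), // keeps B, C, D
  skip(1),                    // drops B
  limit(1),                   // keeps only C
  unwind("tags"),             // C becomes two documents, one per tag
]);
console.log(out);
```

Note how the filtering, skipping and limiting stages shrink the document set, while the unwind stage grows it again.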
Indexes support the efficient resolution of queries. Without indexes, MongoDB must scan every
document of a collection to select those documents that match the query statement. This scan is
highly inefficient and requires MongoDB to process a large volume of data.
Indexes are special data structures that store a small portion of the data set in an
easy-to-traverse form. The index stores the value of a specific field or set of fields, ordered by the
value of the field as specified in the index.
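The difference between a full scan and an ordered index can be illustrated with a toy model. The sketch below is plain JavaScript and makes several simplifying assumptions: the index is a sorted array searched with binary search (MongoDB itself uses B-tree structures), the array position stands in for a document's location, and the function names and generated titles are hypothetical.

```javascript
// Hypothetical collection of 8 documents with titles t7, t6, ..., t0.
const collection = Array.from({ length: 8 }, (_, i) => ({ title: "t" + (7 - i) }));

// Without an index: every document has to be examined.
function scan(docs, field, value) {
  let examined = 0;
  let pos = -1;
  docs.forEach((doc, i) => {
    examined++;
    if (doc[field] === value) pos = i;
  });
  return { pos, examined };
}

// Toy index: (key, position) entries ordered by the value of the field.
function buildIndex(docs, field) {
  return docs
    .map((doc, pos) => ({ key: doc[field], pos }))
    .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

// Binary search over the ordered entries, counting entries examined.
function indexLookup(index, value) {
  let lo = 0;
  let hi = index.length - 1;
  let examined = 0;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    examined++;
    if (index[mid].key === value) return { pos: index[mid].pos, examined };
    if (index[mid].key < value) lo = mid + 1;
    else hi = mid - 1;
  }
  return { pos: -1, examined };
}

const titleIndex = buildIndex(collection, "title");
const viaScan = scan(collection, "title", "t5");
const viaIndex = indexLookup(titleIndex, "t5");
console.log(viaScan, viaIndex);
```

Both lookups locate the same document, but the indexed lookup examines fewer entries, and the gap widens as the collection grows.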
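To create an index, use the ensureIndex() method of MongoDB. (ensureIndex() has been deprecated
since MongoDB 3.0 in favor of createIndex(), which accepts the same key specification.) The basic
syntax of the ensureIndex() method is as follows: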
>db.COLLECTION_NAME.ensureIndex({KEY:1})
Here KEY is the name of the field on which you want to create the index, and 1 specifies ascending
order. To create an index in descending order, use -1.
Example
>db.mycol.ensureIndex({"title":1})
You can pass multiple fields to the ensureIndex() method to create an index on multiple fields.
>db.mycol.ensureIndex({"title":1,"description":-1})
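The ordering implied by a multi-field key specification such as {"title":1,"description":-1} can be sketched as a comparator: entries are ordered by title ascending, and entries with equal titles are ordered by description descending. This is an illustration in plain JavaScript, not MongoDB's implementation. The helper name compareBySpec and the sample documents are assumptions.

```javascript
// Build a comparator from a key specification: 1 = ascending, -1 = descending.
function compareBySpec(spec) {
  const fields = Object.entries(spec); // keeps the order the fields were written in
  return (a, b) => {
    for (const [field, dir] of fields) {
      if (a[field] < b[field]) return -dir;
      if (a[field] > b[field]) return dir;
    }
    return 0; // equal on every indexed field
  };
}

// Hypothetical documents.
const unsorted = [
  { title: "b", description: "x" },
  { title: "a", description: "x" },
  { title: "a", description: "z" },
];

// Copy before sorting so the original array is left untouched.
const ordered = [...unsorted].sort(compareBySpec({ title: 1, description: -1 }));
console.log(ordered);
```

The order of the fields in the specification matters: the second field only breaks ties left by the first.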
The ensureIndex() method also accepts a list of options, all of which are optional. Following is the list:
Conclusion: Thus we have studied the use and implementation of aggregation and indexing
in MongoDB.