CMSC-691-Assignment-3

The assignment for CMSC 691 involves two parts: a market basket analysis using the Breadbasket dataset and a MongoDB project using movie data. Students must describe the dataset, perform association rule mining, and implement various data manipulations in Python with MongoDB. Submissions include code, a summary of added data, and reflections on the learning experience.

Uploaded by

Bharath Narasimha

CMSC 691 – Introduction to Data Science

Assignment 3
Total Points – 30
Due: April 23, 2025

This assignment consists of two parts. For each part, please answer all the questions in a single
document. Also submit your R or Python source files. You will find both datasets in the
CourseDataSet folder on Blackboard.

Part 1: [15]
Using the Breadbasket dataset, do the following:
1. Describe the dataset and list the distinct items in the dataset.
2. Do a market basket analysis and uncover the association rules. Make sure your rules have
at least two and at most five items. Then filter the rules so that “Coffee” doesn’t appear
on the right-hand side.
3. Sort the rules using a metric of your choice (e.g., lift).
4. Choose a rule from the top 3 rules and describe it and explain what information it provides
you.
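
Steps 2 to 4 can be illustrated with a minimal Python sketch. It is not the prescribed method: it enumerates itemsets by brute force rather than with Apriori, and the transactions are a made-up toy sample, not the Breadbasket data. The thresholds (min_support, min_confidence) and all function names are assumptions chosen for the example. For the real dataset, a library such as mlxtend (Python) or arules (R) would likely be a better fit.

```python
from itertools import combinations

# Hypothetical toy baskets standing in for the Breadbasket transactions.
TRANSACTIONS = [
    {"Coffee", "Bread"},
    {"Coffee", "Cake"},
    {"Bread", "Cake", "Coffee"},
    {"Tea", "Cake"},
    {"Bread", "Pastry"},
    {"Coffee", "Pastry", "Bread"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= basket for basket in transactions) / len(transactions)


def mine_rules(transactions, min_support=0.3, min_confidence=0.6,
               min_len=2, max_len=5):
    """Brute-force rule mining; each rule has between min_len and max_len items."""
    items = sorted(set().union(*transactions))
    rules = []
    for size in range(min_len, min(max_len, len(items)) + 1):
        for itemset in map(frozenset, combinations(items, size)):
            s = support(itemset, transactions)
            if s == 0 or s < min_support:
                continue
            # Split the frequent itemset into every non-empty lhs -> rhs pair.
            for k in range(1, size):
                for lhs in map(frozenset, combinations(sorted(itemset), k)):
                    rhs = itemset - lhs
                    confidence = s / support(lhs, transactions)
                    if confidence < min_confidence:
                        continue
                    lift = confidence / support(rhs, transactions)
                    rules.append({"lhs": lhs, "rhs": rhs, "support": s,
                                  "confidence": confidence, "lift": lift})
    return rules


rules = mine_rules(TRANSACTIONS)
# Step 2: drop rules with "Coffee" on the right-hand side.
no_coffee_rhs = [r for r in rules if "Coffee" not in r["rhs"]]
# Step 3: sort by lift, highest first.
ranked = sorted(no_coffee_rhs, key=lambda r: r["lift"], reverse=True)

for rule in ranked[:3]:
    print(sorted(rule["lhs"]), "->", sorted(rule["rhs"]),
          f"support={rule['support']:.2f}",
          f"confidence={rule['confidence']:.2f}",
          f"lift={rule['lift']:.2f}")
```

A lift above 1 means the right-hand side appears more often in baskets containing the left-hand side than in baskets overall, which is the kind of statement step 4 asks you to interpret.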

Part 2: [15]
1. You will use MongoDB for this part. First, install MongoDB on your computer. Then
do the following.
2. For this assignment, it is best that you use Python.
3. Please download the movies, tags and ratings files. Write a program to read the three
given CSV files (movies, ratings, tags) and insert all the records into three separate
collections (movies, ratings, tags).
4. Next, write a program to add five movies that you have watched this year, or would
like to watch, to the collection “movies”. Make sure that you assign unique movie IDs
and specify the genres (genres need not be completely accurate).
5. Corresponding to the movies that you added, write a program to add some suitable
ratings to the collection “ratings” and some suitable tags to the collection “tags”.
Make sure that you use a unique userid for yourself.
6. For the following items, you must use the aggregation pipeline. If you use any other
method, no credit will be given.
a. Develop code to find the number of movies released per year.
b. Develop code to find the number of movies per genre.
c. Develop code to find the number of movies per rating.
d. Develop code to find the number of movies tagged.
e. Develop code to find the most popular tag.
7. What to submit?
a. Jupyter Notebook file that contains all the above code.
b. Summarize all the data you added.
c. A document that summarizes what you learnt while doing the Assignment.
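
The five aggregation tasks in item 6 can be sketched as pipeline definitions in Python. This is one possible reading of each task, not a model answer. It assumes MovieLens-style fields that the assignment does not confirm: a title ending in a year such as "(1995)", a pipe-separated genres string, and movieId, rating and tag fields. The function names, the database handle and the YEAR_REGEX pattern are hypothetical, and $regexFind requires MongoDB 4.2 or later.

```python
# Assumed title format: "Some Movie (1995)"; the year is the final parenthesised group.
YEAR_REGEX = r"\((\d{4})\)\s*$"


def movies_per_year():
    """Count movies per release year, extracted from the title (assumption)."""
    return [
        {"$project": {"m": {"$regexFind": {"input": "$title", "regex": YEAR_REGEX}}}},
        {"$match": {"m": {"$ne": None}}},  # skip titles without a year
        {"$group": {"_id": {"$arrayElemAt": ["$m.captures", 0]},
                    "count": {"$sum": 1}}},
        {"$sort": {"_id": 1}},
    ]


def movies_per_genre():
    """Count movies per genre, assuming genres look like 'Action|Comedy'."""
    return [
        {"$project": {"genre": {"$split": ["$genres", "|"]}}},
        {"$unwind": "$genre"},
        {"$group": {"_id": "$genre", "count": {"$sum": 1}}},
        {"$sort": {"count": -1}},
    ]


def movies_per_rating():
    """Count distinct movies that received each rating value."""
    return [
        {"$group": {"_id": "$rating", "movies": {"$addToSet": "$movieId"}}},
        {"$project": {"count": {"$size": "$movies"}}},
        {"$sort": {"_id": 1}},
    ]


def movies_tagged():
    """Count distinct movies that have at least one tag."""
    return [
        {"$group": {"_id": "$movieId"}},
        {"$count": "moviesTagged"},
    ]


def most_popular_tag():
    """Return the single most frequently used tag."""
    return [
        {"$group": {"_id": "$tag", "count": {"$sum": 1}}},
        {"$sort": {"count": -1}},
        {"$limit": 1},
    ]


# Task name -> (collection name, pipeline).
PIPELINES = {
    "movies_per_year": ("movies", movies_per_year()),
    "movies_per_genre": ("movies", movies_per_genre()),
    "movies_per_rating": ("ratings", movies_per_rating()),
    "movies_tagged": ("tags", movies_tagged()),
    "most_popular_tag": ("tags", most_popular_tag()),
}


def run_all(db):
    """Run every pipeline against a PyMongo database handle."""
    return {name: list(db[collection].aggregate(pipeline))
            for name, (collection, pipeline) in PIPELINES.items()}
```

With a running server, run_all(MongoClient()["movielens"]) would return one result list per task; the database name "movielens" is a placeholder. Item 6c is ambiguous: counting rating documents instead of distinct movies would replace the $addToSet and $size stages with a plain {"$sum": 1}.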

For this part, it may be easier to set up a virtual environment
(https://pypi.org/project/virtualenv/).

Use PyMongo: https://pypi.org/project/pymongo/
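
As a rough sketch of how items 3 and 4 of Part 2 might look with PyMongo, the code below reads each CSV file into dictionaries and inserts them into a collection of the same name. The column names (movieId, userId, rating, tag, timestamp) follow the MovieLens layout and are an assumption about the course files. The connection string, the database name "movielens", the placeholder titles and all helper names are hypothetical.

```python
import csv

# Assumed column types per file (MovieLens-style layout, not confirmed by the handout).
CONVERTERS = {
    "movies": {"movieId": int},
    "ratings": {"userId": int, "movieId": int, "rating": float, "timestamp": int},
    "tags": {"userId": int, "movieId": int, "timestamp": int},
}


def read_csv_records(path, converters=None):
    """Read a CSV file into a list of dicts, converting the listed columns."""
    converters = converters or {}
    records = []
    with open(path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            for column, convert in converters.items():
                if row.get(column) not in (None, ""):
                    row[column] = convert(row[column])
            records.append(row)
    return records


def load_all(db, data_dir="."):
    """Insert movies.csv, ratings.csv and tags.csv into same-named collections."""
    for name, converters in CONVERTERS.items():
        records = read_csv_records(f"{data_dir}/{name}.csv", converters)
        if records:  # insert_many rejects an empty list
            db[name].insert_many(records)


def build_new_movies(start_id, entries):
    """Assign consecutive movie IDs, starting at start_id, to (title, genres) pairs."""
    return [{"movieId": start_id + offset, "title": title, "genres": genres}
            for offset, (title, genres) in enumerate(entries)]


def main():
    # Imported here so the helpers above can be used without the driver installed.
    from pymongo import MongoClient

    db = MongoClient("mongodb://localhost:27017/")["movielens"]
    load_all(db)

    # Start above the largest existing ID so the new IDs stay unique.
    last = db["movies"].find_one(sort=[("movieId", -1)])
    start_id = (last["movieId"] if last else 0) + 1
    new_movies = build_new_movies(start_id, [
        ("Example Movie A (2025)", "Drama"),          # placeholder entries
        ("Example Movie B (2025)", "Comedy|Romance"),
    ])
    db["movies"].insert_many(new_movies)
```

Calling main() requires a MongoDB server listening on the default local port. Ratings and tags for the added movies (item 5) can be built the same way, using one userId that does not already occur in the ratings file.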

Links:
• https://docs.mongodb.com/manual/administration/install-community/
• https://docs.mongodb.com/manual/installation/
• https://docs.mongodb.com/drivers/pymongo/
• https://www.mongodb.com/developer/quickstart/python-quickstart-aggregation/
• https://www.analyticsvidhya.com/blog/2020/08/how-to-create-aggregation-pipelines-in-a-mongodb-database-using-pymongo/
• https://www.mongodb.com/docs/manual/core/aggregation-pipeline/
• https://www.mongodb.com/basics/aggregation-pipeline
• https://www.mongodb.com/docs/v6.0/core/aggregation-pipeline/
• https://www.mongodb.com/docs/manual/reference/operator/aggregation/count/
