B.tech Minor Syllabus-CSE (Data Science) - Final
VISION
To create and nurture competent engineers and managers who would be enterprise
leaders throughout the world with a sound background in ethics and societal
responsibilities.
MISSION
QUALITY POLICY
Course objectives: The main objective of the course is to make students learn about
What is data, types of data, What is data science, Fundamentals of data science, Data science
life cycle, Why data science is important, Applications of data science, Why Python is necessary
for data science
Setup and installation of Jupyter, PyCharm, Spyder, or any other Python tool. Basics of Python
including data types, operators, variables, expressions, control structures using a sample dataset,
objects and functions. Python sequence data structures including String, Array, List, Tuple, Set
and Dictionary. Introduction to various Python libraries for data science.
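The collection types named in this unit can be illustrated with a minimal sketch. The variable names and sample values below are assumptions made up for illustration and are not part of the syllabus.

```python
# Illustrative sample values; not taken from any prescribed dataset.
marks = [72, 45, 88, 45, 60]          # list: ordered and mutable
marks.append(91)                      # insert an element at the end
sorted_marks = sorted(marks)          # new list in ascending order

lowest_two = tuple(sorted_marks[:2])  # tuple: ordered and immutable
unique_marks = set(marks)             # set: unordered, no duplicates

# dict: maps keys to values
student = {"name": "Asha", "city": "Hyderabad", "marks": marks}
student["city"] = "Pune"              # change the value of a key
del student["marks"]                  # delete a key

# Remove duplicates from a list while keeping first-seen order.
deduplicated = list(dict.fromkeys(marks))

print(sorted_marks)
print(deduplicated)
print(student)
```

Using `dict.fromkeys` for de-duplication preserves the original order, whereas converting to a `set` does not guarantee any order.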
Data loading, dealing with missing values and outliers, data wrangling, filtering data, data
normalization, data formatting, data cleaning, web scraping with Beautiful Soup.
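A minimal sketch of three of the steps listed above (missing values, outliers, normalization), assuming pandas is installed. The column names, the sample values, the median/mode fill strategy, and the 1.5 * IQR outlier rule are illustrative choices, not requirements of the syllabus.

```python
import pandas as pd

# Hypothetical sample data; column names are assumptions for illustration.
df = pd.DataFrame({
    "age": [22, 25, None, 24, 23, 95],
    "city": ["Pune", "Delhi", "Pune", None, "Delhi", "Pune"],
})

# Missing values: fill numeric gaps with the median, categorical with the mode.
df["age"] = df["age"].fillna(df["age"].median())
df["city"] = df["city"].fillna(df["city"].mode()[0])

# Outliers: keep rows within 1.5 * IQR of the first and third quartiles.
q1, q3 = df["age"].quantile([0.25, 0.75])
iqr = q3 - q1
in_range = df["age"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
clean = df[in_range].copy()

# Min-max normalization of the remaining values to the range [0, 1].
age_min, age_max = clean["age"].min(), clean["age"].max()
clean["age_scaled"] = (clean["age"] - age_min) / (age_max - age_min)

print(clean)
```

Other strategies (dropping rows with missing values, z-score outlier detection, standardization instead of min-max scaling) are equally valid depending on the dataset.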
Basic visualizations with Matplotlib, Advanced visualizations with Seaborn, Plotting images,
graphs and grids of charts.
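A grid of charts can be sketched with Matplotlib as follows. This is only an illustration under stated assumptions: the data values are made up, the non-interactive "Agg" backend is chosen so that no display is needed, and the figure is written to an in-memory buffer (a file name could be passed to `savefig` instead).

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: run without a display
import matplotlib.pyplot as plt

# Made-up values for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]

# A 2 x 2 grid of axes, one basic chart type per cell.
fig, axes = plt.subplots(2, 2, figsize=(8, 6))
axes[0, 0].plot(x, y)
axes[0, 0].set_title("Line")
axes[0, 1].scatter(x, y)
axes[0, 1].set_title("Scatter")
axes[1, 0].bar(x, y)
axes[1, 0].set_title("Bar")
axes[1, 1].hist(y, bins=3)
axes[1, 1].set_title("Histogram")
fig.tight_layout()

# Render the figure as PNG bytes.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```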
1. Apply various Python data structures to effectively manage various types of data.
2. Learn the fundamentals of some of the widely used Python packages and apply them
to data analytics.
3. Design applications applying various operations for data cleansing and transformation.
4. Describe the various areas where data science is applied
5. Design visualizations using various Python Libraries.
Text Book:
1. Python for Data Science For Dummies, 2nd Edition, John Paul Mueller and Luca Massaron,
Wiley.
2. Programming through Python, M. T. Savaliya, R. K. Maurya, G. M. Magar, STAREDU
Solutions.
Reference Books:
2. Introducing Data Science: Big Data, Machine Learning, and More, Using Python Tools, Davy
Cielen, Arno D.B. Meysman, et al., Manning
3. Applied Data Science with Python and Jupyter: Use powerful industry-standard tools to
unlock new, actionable insights from your data.
III Year B.Tech. CSE (DS) I Sem L T P C
- - 3 1.5
DATA SCIENCE LABORATORY
1. How to program in Python and how to use its packages for effective data analysis.
2. Install and configure software necessary for a statistical programming environment.
3. Carry out basic data pre-processing and wrangling in order to explore practical issues.
4. Read data into the Python environment from different sources.
5. Implement supervised/unsupervised machine learning techniques
List of Practicals/Tutorials:
WEEK-1. Write a program to create a list, insert elements into the list and sort it in ascending
order.
WEEK-2. Write a program to create a dictionary of 10 elements, change/delete the values of a
few keys and display the dictionary before and after the updates.
WEEK-3. Write a program to create a tuple and a list. Convert the list to a tuple and display
the elements of both. Write a program to remove the duplicate elements of the list.
WEEK-4. Write a program to perform all basic data pre-processing steps on the given
dataset.
WEEK-5. Write a program to perform exploratory data analysis on the given dataset.
WEEK-6. Develop programs to learn the concept of Modules and packages.
WEEK-7. Develop a program to learn the concept of arrays and the NumPy module.
WEEK-8. Write a NumPy program to convert a list of numeric values into a one-dimensional
NumPy array, and perform all operations on that array.
WEEK-9. Write a NumPy program to find the union of two arrays. Union will return the unique,
sorted array of values that are in either of the two input arrays.
WEEK-10. Write a Pandas program to convert a NumPy array to a Pandas series. Also write a Pandas
program to calculate the frequency counts of each unique value of a given series.
WEEK-11. Write a Pandas program to read a dataset from the diamonds DataFrame, modify the
default column values and print the first 6 rows. Also find the number of rows and
columns and the data type of each column of the diamonds DataFrame.
WEEK-12. Consider a dataset with student name, gender, enrollment number, 4 semester results with
marks of each subject, mobile number, and city. Implement the following in Python:
plot various graphs and charts to visualize the students' data.
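Two of the exercises above (WEEK-9 and WEEK-10) could be approached along the following lines. This is one possible sketch, assuming NumPy and pandas are installed; the input values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative input values.
a = np.array([10, 20, 30, 40])
b = np.array([30, 40, 50, 10])

# WEEK-9: unique, sorted values that appear in either array.
union = np.union1d(a, b)

# WEEK-10: convert a NumPy array to a Series, then count each unique value.
series = pd.Series(np.array([1, 2, 2, 3, 3, 3]))
counts = series.value_counts()

print(union)
print(counts)
```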
COURSE OUTCOMES: After completion of the course, the students would be able to:
2. John Mueller and Luca Massaron, “Machine Learning for Dummies”, John Wiley & Sons,
2016.
III Year B.Tech. CSE (DS) II Sem L T P C
3 - - 3
DATA VISUALIZATION TECHNIQUES
Basics - Relationship between Visualization and Other Fields - The Visualization Process -
Pseudo code Conventions - The Scatter plot. Data Foundation - Types of Data - Structure
within and between Records - Data Preprocessing - Data Sets
Visualization stages - Semiology of Graphical Symbols - The Eight Visual Variables - Historical
Perspective - Taxonomies - Experimental Semiotics based on Perception. Gibson's Affordance
theory - A Model of Perceptual Processing.
Text and Document Visualization: Introduction - Levels of Text Representations - The Vector
Space Model - Single Document Visualizations - Document Collection Visualizations - Extended
Text Visualizations. Interaction Concepts: Interaction Operators - Interaction Operands and
Spaces - A Unified Framework. Interaction Techniques: Screen Space - Object Space - Data Space
- Attribute Space - Data Structure Space - Visualization Structure - Animating Transformations -
Interaction Control.
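The Vector Space Model listed in this unit represents each document as a vector of term weights, so that documents can be compared numerically. A minimal sketch using only the standard library is given below. It assumes raw term counts as weights and whitespace tokenization (weighting schemes such as TF-IDF are also common); the sample sentences and function names are made up for illustration.

```python
import math
from collections import Counter


def term_vector(text):
    """Map each lowercase word to its raw count (term frequency)."""
    return Counter(text.lower().split())


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Made-up documents for illustration.
doc_a = term_vector("data visualization reveals patterns in data")
doc_b = term_vector("visualization of text data")
doc_c = term_vector("hadoop stores large files")

print(cosine_similarity(doc_a, doc_b))  # shared terms give a positive score
print(cosine_similarity(doc_a, doc_c))  # no shared terms give 0.0
```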
TEXT BOOKS:
1. Matthew Ward, Georges Grinstein and Daniel Keim, “Interactive Data Visualization
Foundations, Techniques, Applications”, 2010.
2. Colin Ware, “Information Visualization Perception for Design”, 2nd edition, Morgan
Kaufmann Publishers, 2004.
REFERENCE BOOKS:
1. Robert Spence “Information visualization – Design for interaction”, Pearson Education, 2nd
Edition, 2007.
2. Alexandru C. Telea, “Data Visualization: Principles and Practice,” A. K. Peters Ltd, 2008
III Year B.Tech. CSE (DS) II Sem L T P C
- - 3 1.5
DATA VISUALIZATION TECHNIQUES LAB
Course Objectives:
1. Understand the various types of data, apply and evaluate the principles of data
visualization.
2. Acquire skills to apply visualization techniques to a problem and its associated dataset.
3. Discuss various design issues that arise when assembling data visualizations.
4. Build data visualizations, dashboards and stories to support relevant communication
for diverse audiences.
List of Experiments:
Course Outcomes:
1. Identify the different data types, visualization types to bring out the insight.
2. Relate the visualization towards the problem based on the dataset to analyze and bring out
valuable insight on a large dataset.
3. Demonstrate the analysis of a large dataset using various visualization techniques and tools.
4. Identify the different attributes and showcasing them in plots. Identify and create various
visualizations for geospatial and table data.
5. Ability to create and interpret plots using Python and various data visualization tools as well.
TEXT BOOKS:
1. Matthew Ward, Georges Grinstein and Daniel Keim, “Interactive Data Visualization
Foundations, Techniques, Applications”, 2010.
2. Colin Ware, “Information Visualization Perception for Design”, 2nd edition, Morgan
Kaufmann Publishers, 2004.
REFERENCE BOOKS:
2. Alexandru C. Telea, “Data Visualization: Principles and Practice,” A. K. Peters Ltd, 2008.
IV Year B.Tech. CSE (DS) I Sem L T P C
3 - - 3
BIG DATA ENGINEERING
Course Objectives:
1. Learn the fundamental components of big data storage and processing techniques.
2. The purpose of this course is to provide the students with the knowledge of Big Data
Analytics principles and techniques.
3. This course is also designed to give exposure to the frontiers of Big Data Analytics.
4. Explore the Hadoop Distributed File System and the YARN resource manager.
Four V’s of Big Data – Drivers for Big Data –Introduction to Big Data Analytics – Big Data Analytics
applications.
Hadoop’s Parallel World – Data discovery – Open source technology for Big Data Analytics – cloud
and Big Data –Predictive Analytics – Mobile Business Intelligence and Big Data
Big Data – Apache Hadoop & Hadoop Eco System – Moving Data in and out of Hadoop –
Understanding inputs and outputs of MapReduce - Data Serialization.
Hadoop: RDBMS vs Hadoop, Hadoop Overview, Hadoop distributors, HDFS, HDFS Daemons,
Anatomy of File Write and Read, NameNode, Secondary NameNode, and DataNode, HDFS
Architecture, Hadoop Configuration, MapReduce Framework, Role of HBase in Big Data
processing, Hive, Pig.
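The inputs and outputs of the MapReduce framework mentioned above can be illustrated with a word count. The sketch below is a single-process simulation in plain Python, not the Hadoop API: it only shows how the map, shuffle, and reduce phases pass key-value pairs along. The function names and the sample lines are assumptions for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)


# Sample input; in Hadoop these lines would come from file splits in HDFS.
lines = ["big data needs big storage", "hadoop stores big data"]

mapped = [pair for line in lines for pair in map_phase(line)]
grouped = shuffle_phase(mapped)
word_counts = dict(reduce_phase(k, v) for k, v in grouped.items())

print(word_counts)
```

In a real cluster the mappers and reducers run in parallel on different nodes, and the framework performs the shuffle between them.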
1. Explain the foundations, definitions, and challenges of Big Data and various Analytical
tools.
2. Understand the differences between Hadoop, MapReduce, and NoSQL.
3. Understand the fundamentals of various big data analytics techniques.
4. Describe distributed data storage and management in HDFS
5. Ability to understand the importance of Big Data in Social Media and Mining.
TEXT BOOKS:
REFERENCE BOOKS:
1. Big Data and Business Analytics, Jay Liebowitz, Auerbach Publications, CRC press (2013)
2. Using R to Unlock the Value of Big Data: Big Data Analytics with Oracle R Enterprise and
Oracle R Connector for Hadoop, Tom Plunkett, Mark Hornick, McGraw-Hill/Osborne Media
(2013), Oracle press.
3. Professional Hadoop Solutions, Boris Lublinsky, Kevin T. Smith, Alexey Yakubovich, Wiley,
ISBN: 9788126551071, 2015.
4. Understanding Big Data, Chris Eaton, Dirk deRoos et al., McGraw Hill, 2012.
IV Year B.Tech. CSE (DS) I Sem L T P C
3 - - 3
WEB AND SOCIAL MEDIA ANALYTICS
HTML5 Web Workers. Client-side scripting: introduction to JavaScript, the JavaScript language, declaring
variables, scope of variables, functions, event handlers (onclick, onsubmit, etc.), Document Object
Model, form validation.
Introduction to XML, Defining XML tags, their attributes and values, Document Type Definition,
Displaying XML documents with CSS, XML Schemas, DOM and SAX Parser
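The difference between the DOM and SAX parsers named above can be sketched with Python's standard library: a DOM parser loads the whole document into a tree that is then navigated, whereas a SAX parser streams through the document and calls handler methods as elements are encountered. The XML document, its tag names, and the handler class below are hypothetical examples.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical document; tag and attribute names are illustrative.
XML_TEXT = """<library>
  <book id="b1"><title>Python Basics</title></book>
  <book id="b2"><title>Big Data</title></book>
</library>"""

# DOM: parse the whole document into an in-memory tree, then navigate it.
dom = minidom.parseString(XML_TEXT)
dom_titles = [
    node.firstChild.data for node in dom.getElementsByTagName("title")
]


# SAX: receive callbacks while the parser streams through the document.
class TitleHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.in_title = False
        self.titles = []

    def startElement(self, name, attrs):
        if name == "title":
            self.in_title = True
            self.titles.append("")

    def characters(self, content):
        # Text may arrive in several chunks, so append to the current title.
        if self.in_title:
            self.titles[-1] += content

    def endElement(self, name):
        if name == "title":
            self.in_title = False


handler = TitleHandler()
xml.sax.parseString(XML_TEXT.encode("utf-8"), handler)

print(dom_titles)
print(handler.titles)
```

DOM is convenient for small documents that need random access; SAX uses less memory on large documents because it never builds the full tree.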
Social media landscape, Need for SMA; SMA in Small organizations; SMA in large organizations;
Application of SMA in different areas
Introduction, parameters, demographics. Analyzing page audience. Reach and Engagement analysis.
Post-performance on FB, Use of Facebook Business Manager; Social campaigns. Measuring and
Analyzing social campaigns, defining goals and evaluating outcomes, Network Analysis.
(LinkedIn, Instagram, YouTube, Twitter etc.)
Course objectives:
Learning various data science tools to equip with the technologies that fulfil the requirement of
the current industries.
Conditional Formatting, Sparkline and Number Formats, macros, drop down lists, Mastering
charting techniques, Create an Interactive Dashboard
SQL and MySQL (basic DDL and DML statements, joins and views) and MongoDB database
(importing/exporting and querying data, creating and manipulating documents, CRUD
operations, indexing and aggregation pipeline).
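The SQL topics in this unit (DDL, DML, joins, views) can be tried out without a database server. The sketch below assumes Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for MySQL; the table names, column names, and rows are hypothetical, and some syntax details differ between SQLite and MySQL. The MongoDB topics are not covered here.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for a MySQL server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: define the schema (hypothetical tables).
cur.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
cur.execute("CREATE TABLE result (student_id INTEGER, subject TEXT, marks INTEGER)")

# DML: insert and update rows.
cur.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", "Pune"), (2, "Ravi", "Delhi")],
)
cur.executemany(
    "INSERT INTO result VALUES (?, ?, ?)",
    [(1, "Maths", 80), (1, "Physics", 70), (2, "Maths", 65)],
)
cur.execute("UPDATE student SET city = ? WHERE id = ?", ("Mumbai", 2))

# View built on a join: average marks per student.
cur.execute("""
    CREATE VIEW student_average AS
    SELECT s.name, s.city, AVG(r.marks) AS avg_marks
    FROM student AS s
    JOIN result AS r ON r.student_id = s.id
    GROUP BY s.id, s.name, s.city
""")

rows = cur.execute("SELECT * FROM student_average ORDER BY name").fetchall()
print(rows)
conn.close()
```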
Importing data, data inputting, data visualization, manipulating and managing data, statistical
modeling, R and database.
Introduction to Power BI, Power BI Desktop, Data Analysis Expressions, Data Visualization,
QlikView Workbench
Course Outcomes (CO): After completion of this course, students will be able to
Text Books:
1. Analyzing Data with Microsoft Power BI and Power Pivot for Excel, Marco Russo, Alberto
Ferrari, PHI.
2. Learn Power BI: A beginner guide to developing interactive business intelligence solutions
using Microsoft Power BI, Greg Deckler.
Reference Books: