CAB112: Introduction to Data Science: Session 2024-25

CAB112:INTRODUCTION TO DATA SCIENCE

L:3 T:0 P:2 Credits:4

Course Outcomes: Through this course students should be able to:

CO1 :: gain proficiency in Python programming relevant to data science.

CO2 :: apply exploratory data analysis (EDA) techniques to understand data distributions,
relationships, and patterns among problems and solutions.

CO3 :: execute machine learning algorithms for regression, classification, and predictive
modeling.

CO4 :: develop a comprehensive data science project from start to finish.

Unit I
Introduction to Data Science and Python for Data Analysis: Foundations of Data Science :
Overview of data science and its importance, The data science process and lifecycle
Python Libraries for Data Science : NumPy for numerical computations, Pandas for data
manipulation and analysis, Matplotlib and Seaborn for data visualization
Data Collection and Cleaning : Data collection techniques, Handling missing data and outliers, Data
transformation and normalization
Unit II
Exploratory Data Analysis and Data Visualization: Exploratory Data Analysis (EDA) :
Descriptive statistics, Data visualization techniques, Correlation and covariance
Advanced Data Visualization : Advanced plotting with Matplotlib and Seaborn, Interactive
visualizations with Plotly and Dash
Unit III
Supervised Learning Algorithms: Regression Algorithms : Linear regression, Polynomial
regression, Ridge and Lasso regression
Classification Algorithms : Logistic regression, k-Nearest Neighbors (k-NN), Decision Trees and
Random Forests, Support Vector Machines (SVM)
Model Evaluation : Metrics for regression: MAE, MSE, RMSE, R², Metrics for classification: accuracy,
precision, recall, F1-score, ROC-AUC, Cross-validation techniques
Unit IV
Unsupervised Learning Algorithms: Clustering Algorithms : k-Means clustering, Hierarchical
clustering, DBSCAN
Dimensionality Reduction : Principal Component Analysis (PCA), t-Distributed Stochastic Neighbor
Embedding (t-SNE)
Association Rule Learning : Apriori algorithm, Eclat algorithm
Unit V
Advanced Topics in Machine Learning: Neural Networks and Deep Learning : Introduction to
neural networks, Deep learning basics with TensorFlow and Keras, Convolutional Neural Networks
(CNNs), Recurrent Neural Networks (RNNs)
Unit VI
Natural Language Processing : Text preprocessing and normalization, Tokenization and
stemming/lemmatization, Bag-of-Words and TF-IDF, Sentiment analysis and text classification

List of Practicals / Experiments:

Practicals
• Perform complex data manipulations such as merging, joining, and group operations on a real-world
dataset.
• Implement matrix operations and linear algebraic computations.

• Visualize data distributions using histograms, box plots, and scatter plots.

• Generate pair plots and heatmaps for correlation analysis.

• Normalize features using Min-Max scaling and standardize features using Z-score scaling.

• Use advanced imputation techniques (e.g., KNN imputation) and evaluate their impact on the
dataset.

• Generate new features using polynomial transformations, interaction terms, and domain-specific
knowledge.
• Build and evaluate a decision tree classifier for predicting fraudulent transactions.

• Implement logistic regression, SVM, and k-NN classifiers on different datasets.

• Implement linear regression, polynomial regression, and support vector regression (SVR).

• Evaluate regression models using metrics such as MAE, MSE, and RMSE.

• Implement Random Forest, AdaBoost, and Gradient Boosting models on a classification problem.

• Apply k-means and hierarchical clustering to segment customers.

• Build and evaluate a convolutional neural network (CNN) for image classification.

• Implement sentiment analysis on social media data using machine learning algorithms.
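
As an illustration of two of the practicals above (feature scaling and regression metrics), the following is a minimal sketch using only the Python standard library. It is not the course's prescribed solution: the function names and the toy data are hypothetical, and in the lab these steps would more likely be done with NumPy, Pandas, or scikit-learn.

```python
import math
import statistics


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1] (Min-Max scaling)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature has no spread; map it to 0.0 by convention.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score_scale(values):
    """Standardize values to zero mean and unit population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, and RMSE for paired true and predicted values."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    return {"MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse)}


# Hypothetical toy data, for illustration only.
feature = [2.0, 4.0, 6.0, 8.0]
scaled = min_max_scale(feature)
standardized = z_score_scale(feature)
metrics = regression_metrics([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
```

The sketch uses the population standard deviation (`statistics.pstdev`) for Z-score scaling, which matches the convention of scikit-learn's `StandardScaler`; using the sample standard deviation instead would give slightly different values on small datasets.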

Text Books:
1. INTRODUCTION TO DATA SCIENCE: PRACTICAL APPROACH WITH R AND PYTHON by B. UMA MAHESWARI, R. SUJATHA
2. MACHINE LEARNING USING PYTHON by MANARANJAN PRADHAN, U DINESH KUMAR

References:
1. DATA SCIENCE AND MACHINE LEARNING USING PYTHON by REEMA THAREJA, MCGRAW HILL
