Course Outline

The Foundations of Data Science course, taught by Ms. Sumaira Saeed in Spring 2025, focuses on data mining theory and algorithms, emphasizing practical skills using tools like KNIME and Python. Key topics include classification, regression, clustering, and data preprocessing, with a grading scheme based on midterms, finals, quizzes, and assignments. Students will gain knowledge in data-driven decision making and practical skills in data cleaning, transformation, and predictive modeling.

Course: Foundations of Data Science

Spring 2025
Name: Ms. Sumaira Saeed
Email: [email protected]
Counselling hours: Tues/Thurs 2:30 to 3:30

Course Description
The course introduces students to the fundamentals of data mining theory and algorithms. In addition to
building a strong mathematical foundation, the course puts heavy emphasis on analysis and mining of
actual data sets via popular data mining tools such as KNIME and Python. Covered topics
include classification (k-nearest neighbors, classification trees, naïve Bayes, random forests),
regression, clustering (k-means, fuzzy c-means, hierarchical clustering), association rules and text mining.
Feature selection, data cleaning, data transformation, model evaluation and data visualization are also
covered in detail.
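
As a rough illustration of the kind of classification method named above, the following is a minimal sketch of k-nearest-neighbor prediction in plain Python. It is not taken from the course material: the function name `knn_predict`, the toy data points and their labels are all hypothetical, and Euclidean distance with majority voting is assumed.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbors and count how often each label occurs
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated groups in 2-D
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["low", "low", "low", "high", "high", "high"]

prediction_near_low = knn_predict(points, labels, (1.1, 1.0), k=3)
prediction_near_high = knn_predict(points, labels, (5.1, 5.0), k=3)
print(prediction_near_low, prediction_near_high)
```

The sketch stores the training data and defers all work to prediction time, which is why k-nearest neighbors is usually described as a lazy learner.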

Course Objectives
• To excite students about the potential that resides in data and the value that data analytics can
add to business processes
• To impart skills related to data cleaning/wrangling, data transformation/preprocessing, and data
comprehension through statistical analysis
• To impart skills related to analytical (mathematical) data modeling

Learning Outcomes
• Thorough knowledge of data-driven decision making in data science and its relationship to
solving core business problems, along with success stories
• Knowledge of, and practical skills in, data cleaning/wrangling
• Knowledge of, and practical skills in, data transformation/preprocessing
• Practical skills in extracting initial insights from data to facilitate data comprehension (through
hands-on activity)
• Theoretical mathematical knowledge of standard predictive modeling algorithms (supervised
learning)
• Practical skills in generating predictions from data

Grading Scheme
Midterm: 30
Final Exam: 40
Quiz: 10
Assignments: 15
CP: 5
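
The component weights add up to 100, so a final mark could be computed as a weighted sum. The sketch below assumes the numbers are percentage weights and that each component score is expressed as a fraction between 0 and 1; the syllabus does not state this, and the function name `weighted_total` and the example student are hypothetical.

```python
# Weights as listed in the grading scheme (assumed to be percentages of the final grade)
WEIGHTS = {"Midterm": 30, "Final Exam": 40, "Quiz": 10, "Assignments": 15, "CP": 5}


def weighted_total(scores):
    """Combine per-component scores (each a fraction in [0, 1]) into a total out of 100."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student: fraction of available marks earned in each component
example_scores = {
    "Midterm": 0.8,
    "Final Exam": 0.7,
    "Quiz": 0.9,
    "Assignments": 1.0,
    "CP": 1.0,
}
total = weighted_total(example_scores)
print(round(total, 1))
```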
Course Outline

Week  Topics
1     Course Overview, What is Data Mining and its Origin, Typical Data
      Mining Tasks, Data Mining Applications/Examples, Data Mining vs.
      OLAP, Statistics and Machine Learning
2     CRISP-DM Model, Data Preparation, Data Cleaning, Introduction to
      Decision Trees
3     Handling Continuous Variables, Avoiding Overfitting in Decision Trees,
      Python Demo of DT
4     Variance-Bias Tradeoff, Receiver Evaluation Metrics
5     Lazy Learner vs. Eager Learner, k-Nearest Neighbor: Pros and Cons,
      Hold-out Method vs. Cross-Validation
6     ROC Curve, Feature Selection and Correlation Analysis through
      Hypothesis Testing, Scatterplots
7     Naïve Bayes Classifier, Feature Selection: Filter vs. Wrappers, Forward
      and Backward Selection
8     Ensemble Methods: Bagging vs. Boosting, Working of Random Forest
      and AdaBoost
9     Stacking, Revisiting Variance-Bias Tradeoff, Feature Reduction using
      Principal Component Analysis (PCA), Python Code
10    Multiple Linear Regression, Regression Diagnostics and Evaluation
11    kNN Regression, Regression Tree and Tree Ensemble Regression
12    Clustering: Agglomerative vs. Partitional
13    Association Rule Mining
14    Project Presentations
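
Week 4 lists evaluation metrics without saying which ones are covered. As a sketch under that caveat, the following plain-Python example computes accuracy, precision, recall and F1 from a binary confusion matrix; the choice of these four metrics, the function names and the label vectors are assumptions made for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem with the given positive label."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if actual == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1, returning 0.0 where a ratio is undefined."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical true labels and model predictions
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Precision and recall are reported separately because a model can trade one for the other by moving its decision threshold, which is the trade-off an ROC curve (Week 6) visualizes.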

Reference Books
Principles of Data Mining by Max Bramer (2020)

Data Mining – Concepts, Models, Methods, and Algorithms by Mehmed Kantardzic (2020)

Data Mining for Business Analytics – Concepts, Techniques and Applications in Python (2020)
