Worksheet Name
DATA SCIENCE UNIT 1 QUIZ
Class
Total questions: 50
Worksheet time: 25 mins
Date
Instructor name: Mrs. Getzi Jeba 495
1. Which of the following is an example of structured data?
a) A CSV file with rows and columns b) A JSON file with nested objects
c) A video recording of a lecture d) A collection of social media posts
2. A dataset containing product ratings (1 to 5 stars) is best classified as:
a) Continuous b) Ordinal
c) Discrete d) Nominal
3. "Temperature in Celsius" is an example of:
a) Quantitative discrete data b) Quantitative continuous data
c) Nominal data d) Qualitative data
4. A list of employee job titles (e.g., Manager, Analyst, Engineer) is:
a) Nominal b) Ordinal
c) Continuous d) Discrete
5. Which of the following is NOT unstructured data?
a) Audio files b) PDF documents
c) Database tables d) Social media comments
6. "Customer satisfaction levels (Low, Medium, High)" are:
a) Discrete b) Ordinal
c) Nominal d) Continuous
7. "Number of students in a class" is:
a) Qualitative b) Discrete
c) Continuous d) Ordinal
8. "Blood types (A, B, AB, O)" are:
a) Discrete b) Ordinal
c) Continuous d) Nominal
9. "Car speed (mph)" is:
a) Qualitative b) Ordinal
c) Continuous d) Nominal
10. "Movie genres (Action, Comedy, Drama)" are:
a) Ordinal b) Continuous
c) Discrete d) Nominal
11. The phase where missing values are handled is:
a) Model Deployment b) Data Cleaning
c) Visualization d) Data Collection
12. Identifying trends in sales data is part of:
a) Model Building b) Deployment
c) Data Collection d) EDA
13. Feature engineering is done in which phase?
a) Data Collection b) Model Evaluation
c) Data Preparation d) Deployment
14. Training a machine learning model is part of:
a) Model Building b) Visualization
c) Data Collection d) Data Cleaning
15. Integrating a model into a live system is:
a) Deployment b) EDA
c) Feature Engineering d) Data Cleaning
16. Evaluating a model using accuracy metrics is part of:
a) Data Collection b) Model Evaluation
c) Visualization d) Data Cleaning
17. Removing duplicate records is done in:
a) Deployment b) EDA
c) Model Building d) Data Cleaning
18. Creating a dashboard to display insights is part of:
a) Feature Engineering b) Visualization
c) Model Building d) Data Collection
19. Retraining a model with new data is part of:
a) Deployment b) Continuous Improvement
c) Data Cleaning d) Data Collection
20. The first step in a data science project is typically:
a) Deployment b) Data Collection
c) Model Building d) Problem Understanding
21. Which library is best for data manipulation?
a) Pandas b) Matplotlib
c) SciPy d) TensorFlow
22. Which library is used for machine learning?
a) Scikit-learn b) OpenCV
c) Flask d) Seaborn
23. Which library is best for numerical computing?
a) Scikit-learn b) Matplotlib
c) Pandas d) NumPy
24. Which library is used for advanced statistical visualizations?
a) NumPy b) Pandas
c) Seaborn d) SciPy
25. Which library is used for deep learning?
a) SciPy b) TensorFlow
c) Pandas d) Matplotlib
26. Which library is used for signal processing?
a) Scikit-learn b) SciPy
c) Matplotlib d) Pandas
27. Which library is used for web scraping?
a) Pandas b) Matplotlib
c) NumPy d) BeautifulSoup
28. Which library is used for natural language processing?
a) Matplotlib b) NLTK
c) SciPy d) OpenCV
29. Which library is used for interactive visualizations?
a) Plotly b) Scikit-learn
c) NumPy d) Pandas
30. Which Python standard-library module is used for handling dates and times?
a) Pandas b) Matplotlib
c) NumPy d) datetime
31. Predicting stock prices is an example of:
a) Classification b) Natural Language Processing
c) Time Series Forecasting d) Clustering
32. Grouping customers based on purchasing behavior is:
a) Classification b) Regression
c) Anomaly Detection d) Clustering
33. Detecting fraudulent transactions is:
a) Sentiment Analysis b) Recommendation
c) Computer Vision d) Anomaly Detection
34. Recommending products based on past purchases is:
a) Recommender Systems b) Clustering
c) NLP d) Regression
35. Analyzing customer reviews to determine sentiment is:
a) Time Series b) Sentiment Analysis
c) Clustering d) Regression
36. Identifying objects in an image is:
a) Time Series b) NLP
c) Computer Vision d) Clustering
37. Predicting house prices is:
a) NLP b) Regression
c) Clustering d) Classification
38. Classifying emails as spam or not spam is:
a) Classification b) Time Series
c) Clustering d) Regression
39. Forecasting weather patterns is:
a) Classification b) Time Series Analysis
c) Clustering d) NLP
40. Automatically summarizing long articles is:
a) Regression b) NLP
c) Computer Vision d) Clustering
41. Which of the following is NOT a supervised learning task?
a) Clustering b) Classification
c) Sentiment Analysis d) Regression
42. Which algorithm is used for classification?
a) PCA b) Logistic Regression
c) K-Means d) Linear Regression
43. Which algorithm is used for clustering?
a) SVM b) Decision Trees
c) K-Means d) Linear Regression
44. Which metric is used for evaluating classification models?
a) Accuracy b) R-squared
c) MSE d) MAE
45. Which metric is used for regression models?
a) Precision b) Recall
c) F1-Score d) RMSE
46. Which technique reduces overfitting?
a) Cross-Validation b) All of the other options
c) Feature Selection d) Regularization
47. Which method is used for feature selection?
a) Correlation Analysis b) All of the other options
c) Random Forest Importance d) PCA
48. Which Python library is used for SQL queries?
a) SQLAlchemy b) Matplotlib
c) Pandas d) NumPy
49. Which tool is used for big data processing?
a) Matplotlib b) Pandas
c) Apache Spark d) NumPy
50. Which language is NOT typically used in data science?
a) Java b) R
c) HTML d) Python