DATA analytics previous solved

The document is a solved data analytics examination paper for T.Y.B.Sc. (CS) students, covering definitions and concepts in data analytics, machine learning, and natural language processing. It includes questions on topics such as tokenization, clustering, confusion matrices, and types of data analytics, along with practical applications and challenges in the field. The exam consists of short-answer and descriptive questions that assess students' understanding and application of data analytics concepts.

Uploaded by

borsesumit02


T.Y.B.Sc. (CS)
CS-364: Data Analytics
(2019 Credit Pattern) (Semester VI)

[Max. marks: 35]

Q1. Attempt any eight of the following. [8x1=8]
a) Define data analytics.
➢ Data analytics is the process of examining and interpreting large sets of data to uncover
patterns, trends, and other meaningful information.

b) Define tokenization.

➢ Tokenization is the process of breaking down a stream of text or data into smaller units called
tokens, such as words or sentences.

c) Define machine learning.

➢ Machine learning is a branch of artificial intelligence that enables computer systems to learn and
improve from experience without being explicitly programmed.

d) What is clustering?
➢ Clustering is an unsupervised machine learning technique that groups similar data points
together, so that points in the same cluster are more alike than points in different clusters.

e) What is a frequent itemset?

➢ A frequent itemset is a set of items that occur together in the transactions of a dataset at least
as often as a given minimum support threshold.

f) What is data characterization?

➢ Data characterization, also known as data profiling or data summarization, is the process
of analyzing and summarizing the main features, properties, and structure of a dataset.

g) What is an outlier?
➢ An outlier is an observation that lies an abnormal distance from the other values in a random
sample from a population.
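
The definition above leaves "abnormal distance" open. One common convention, assumed here purely for illustration, is the 1.5 x IQR rule: a value is flagged when it falls more than 1.5 interquartile ranges below the first quartile or above the third. The function name `iqr_outliers` and the sample data are hypothetical.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    # quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


if __name__ == "__main__":
    sample = [10, 12, 11, 13, 12, 11, 95]  # made-up readings
    print(iqr_outliers(sample))
```

Other conventions (for example z-scores) draw the line differently, so the threshold `k` should be treated as a tunable choice rather than a fixed rule.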

h) What is Bag of Words?

➢ Bag of words is a text representation technique used in natural language processing for tasks
such as text classification and document analysis. It counts how often each word occurs in a
document, ignoring grammar and word order, and uses these counts as features for further analysis.
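
As a minimal sketch of tokenization followed by bag-of-words counting, the snippet below uses only the Python standard library. The regular expression used as a tokenizer and the names `tokenize` and `bag_of_words` are illustrative assumptions, not a standard API.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def bag_of_words(text):
    """Map each token to the number of times it occurs in the text."""
    return Counter(tokenize(text))


if __name__ == "__main__":
    doc = "The cat sat on the mat. The mat was flat."
    print(tokenize(doc))
    print(bag_of_words(doc).most_common(2))
```

Word order is discarded: two documents with the same words in a different order produce the same counts.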

i) What is text analytics?

➢ Text analytics is the process of analyzing unstructured text data to extract meaningful insights
and patterns. It involves techniques such as natural language processing, machine learning, and
statistical analysis.

j) Define trend analytics.

➢ Trend analytics is the process of analyzing data over time to identify patterns and trends. It
involves techniques such as time series analysis, forecasting, and anomaly detection.

Q2. Attempt any FOUR of the following. [4x2=8]
a) What is a confusion matrix?
➢ A confusion matrix is a table that summarizes the performance of a classification model. It shows
the counts or proportions of true positive, true negative, false positive, and false negative
predictions. From these counts one can compute the model's accuracy, precision, recall,
specificity, and F1 score. The matrix shows which classes the model confuses and so reveals
systematic errors or biases. It is particularly useful for imbalanced datasets, where accuracy
alone can be misleading, and it helps guide decisions about model improvements.
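
The metrics named above follow directly from the four counts. The sketch below assumes a binary problem with labels 1 (positive) and 0 (negative); the helper names and the sample labels are made up for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN for a binary classification problem."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        else:
            if actual == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def safe_div(num, den):
    """Divide, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def metrics(tp, fp, tn, fn):
    """Derive the usual summary scores from the four counts."""
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": safe_div(tn, tn + fp),
        "f1": safe_div(2 * precision * recall, precision + recall),
    }


if __name__ == "__main__":
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # made-up ground truth
    y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]  # made-up predictions
    print(metrics(*confusion_counts(y_true, y_pred)))
```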

b) Define support and confidence in association rule mining.

➢ Support:
Support is the fraction of all transactions that contain every item in both {A} and {B}. It
measures how frequently the items of the rule occur together.
Formula:
Support(A -> B) = (number of transactions containing A and B) / (total number of transactions)

Confidence:
Confidence is the ratio of the number of transactions that contain all items in both {A} and {B}
to the number of transactions that contain all items in {A}. It measures how often B appears in
transactions that contain A.
Formula:
Confidence(A -> B) = Support(A -> B) / Support(A)
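
A small sketch of the two formulas, assuming transactions are stored as Python sets. The grocery items and the function names `support` and `confidence` are hypothetical.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support(A -> B) / Support(A) for the rule A -> B."""
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)


if __name__ == "__main__":
    baskets = [  # made-up transactions
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
        {"bread", "milk"},
    ]
    print(support({"bread", "milk"}, baskets))       # rule support
    print(confidence({"bread"}, {"milk"}, baskets))  # rule confidence
```

Here bread and milk appear together in 3 of 5 baskets and bread appears in 4, so the rule bread -> milk has support 0.6 and confidence 0.75.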

c) Explain any two machine learning applications.

➢ Two machine learning applications are:
o 1. Recommender systems: Recommender systems use machine learning algorithms to
suggest products, services, or content to users based on their past behavior, preferences,
and interests. They are used in e-commerce, social media, and entertainment platforms
to personalize user experiences and increase engagement and sales.
o 2. Image recognition: Image recognition is a type of computer vision that uses machine
learning algorithms to identify objects, people, and scenes in digital images or videos. It
has numerous applications in the security, healthcare, transportation, and entertainment
industries. For example, it can be used to detect faces in photos, diagnose medical
conditions from X-rays, or identify traffic signs in self-driving cars.

d) Write a short note on stop words.

➢ Stop words are common words that are often removed from text data during natural language
processing to improve the efficiency and accuracy of the analysis. Examples of stop words
include "the," "and," "a," "an," "in," and "to." These words usually carry little meaning on
their own and can often be ignored without losing the essence of the text. However, some stop
words may be important in certain contexts, and their removal may affect the accuracy of the
analysis. Therefore, it is important to choose an appropriate stop word list based on the
specific needs of the analysis.
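
A minimal sketch of stop word removal, assuming the text is already tokenized. The stop list below contains only the six examples given above; real lists are longer and task-specific.

```python
# Stop list limited to the examples named in the answer above.
STOP_WORDS = {"the", "and", "a", "an", "in", "to"}


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop every token that appears in the stop list (case-insensitive)."""
    return [t for t in tokens if t.lower() not in stop_words]


if __name__ == "__main__":
    print(remove_stop_words("the cat and the dog went to a park".split()))
    print(remove_stop_words("to be or not to be".split()))
```

The second call illustrates the caveat about context: with a longer stop list that also included "be," "or," and "not," the whole phrase would disappear.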

e) Define supervised learning and unsupervised learning.

o Supervised learning: Supervised learning is a type of machine learning where the
algorithm learns to make predictions from labeled data. The labeled data contains both
input features and the desired output, which is used to train the model. The goal of
supervised learning is to learn a mapping function from input variables to output
variables, so that the model can make accurate predictions on new, unseen data.

o Unsupervised learning: Unsupervised learning is a type of machine learning where the
algorithm learns to identify patterns and relationships in unlabeled data. The data does
not contain any predefined output, so the model must find the underlying structure on
its own. The goal of unsupervised learning is to discover hidden or latent variables that
explain the observed data, and to group similar data points together. Clustering and
dimensionality reduction are common examples of unsupervised learning.

Q3. Attempt any two of the following. [2x4=8]

a) What is prediction? Explain any one regression model in detail.
➢ Prediction is the process of using a model learned from a labeled training dataset to estimate
the value of a new, unseen data point based on the patterns and relationships found in that
data. Regression is a type of supervised machine learning used for predicting continuous
numeric values from input features.

One popular regression model is linear regression, which models the relationship between a
dependent variable and one or more independent variables by fitting a linear equation to the
observed data. The goal of linear regression is to find the best-fitting line, the one that
minimizes the sum of squared errors between the predicted values and the actual values. The line
is defined by its slope and intercept, which are estimated from the training data using the
method of least squares.

The equation of a simple linear regression model is: y = b0 + b1*x, where y is the dependent
variable, x is the independent variable, b0 is the intercept, and b1 is the slope. The slope
represents the change in y for every one-unit increase in x, and the intercept represents the
value of y when x is zero.

The model can be extended to multiple linear regression, where there is more than one
independent variable. The equation becomes: y = b0 + b1*x1 + b2*x2 + ... + bn*xn, where n is
the number of independent variables. The slope coefficients b1 to bn represent the change in y
for every one-unit increase in the corresponding x variable, holding all other variables constant.

Linear regression is widely used in fields such as economics, finance, engineering, and the social
sciences to model and predict various phenomena, such as stock prices, housing prices, sales,
and customer behavior.
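
For simple linear regression, the least-squares estimates have a closed form: b1 = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2) and b0 = mean_y - b1 * mean_x. The sketch below applies it to made-up data; the function name `fit_simple_linear` is hypothetical.

```python
def fit_simple_linear(xs, ys):
    """Return (b0, b1) minimizing the sum of squared errors of y = b0 + b1*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    b0 = mean_y - b1 * mean_x
    return b0, b1


if __name__ == "__main__":
    xs = [1, 2, 3, 4, 5]    # made-up inputs
    ys = [3, 5, 7, 9, 11]   # made-up outputs lying exactly on y = 1 + 2x
    b0, b1 = fit_simple_linear(xs, ys)
    print(b0, b1, b0 + b1 * 6)  # intercept, slope, prediction at x = 6
```

Because the sample points lie exactly on a line, the fit recovers intercept 1 and slope 2; with noisy data the same formulas give the line with the smallest squared error.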

b) Differentiate between stemming and lemmatization.

Stemming                                        | Lemmatization
------------------------------------------------|------------------------------------------------
Reduces words to a stem by stripping suffixes   | Reduces words to their base form (the lemma)
(and sometimes prefixes).                       | based on the word's context and part of speech.
Uses simple and fast rule-based approaches.     | Uses more advanced linguistic, language-specific
                                                | algorithms and a vocabulary.
Stemmed words may not always be actual words.   | Lemmas are valid words found in a dictionary.
Can lose meaning or create ambiguity through    | Aims to preserve the meaning and context of
aggressive truncation.                          | words.
Example: "running" and "runs" are both reduced  | Example: "running," "runs," and "ran" are all
to "run," but the irregular form "ran" is       | reduced to the base form "run."
typically left unchanged.                       |

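The contrast between the two approaches can be sketched with toy stand-ins: a three-rule suffix stripper in place of a real stemmer, and a four-entry lookup table in place of a real lemmatizer. Both are deliberate simplifications, and every name below is hypothetical.

```python
# Toy suffix rules, tried in order; real stemmers use many more.
SUFFIXES = ("ing", "es", "s")

# Toy lemma dictionary; real lemmatizers use a full vocabulary plus part of speech.
LEMMA_TABLE = {"running": "run", "runs": "run", "ran": "run", "studies": "study"}


def toy_stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            stem = word[: -len(suffix)]
            # Undo a doubled final consonant, e.g. "runn" -> "run".
            if len(stem) >= 2 and stem[-1] == stem[-2] and stem[-1] not in "aeiouls":
                stem = stem[:-1]
            return stem
    return word


def toy_lemmatize(word):
    """Look the word up in the lemma table; unknown words are returned as-is."""
    return LEMMA_TABLE.get(word, word)


if __name__ == "__main__":
    for w in ("running", "runs", "ran", "studies"):
        print(w, toy_stem(w), toy_lemmatize(w))
```

The rule-based stemmer cannot relate "ran" to "run" and turns "studies" into the non-word "studi," while the dictionary lookup returns valid base forms for both.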
c) Describe the types of data analytics.

➢ There are three main types of data analytics: descriptive, predictive, and prescriptive.

o Descriptive analytics: Descriptive analytics is the simplest type of analytics. It
summarizes historical data to provide insights into what happened in the past. It
answers questions such as "What happened?" and "How many?" Examples of descriptive
analytics include summary statistics, frequency distributions, and data visualization.

o Predictive analytics: Predictive analytics uses statistical
models and machine learning algorithms to analyze historical data and make predictions
about future events. It answers questions such as "What is likely to happen?" and "How
likely?" Examples of predictive analytics include regression analysis, time series
forecasting, and classification.

o Prescriptive analytics: Prescriptive analytics is the most advanced type of analytics. It
uses optimization and simulation techniques to recommend actions that will achieve the
best possible outcome. It answers questions such as "What should we do?" and "How
can we optimize?" Examples of prescriptive analytics include linear programming,
decision trees, and Monte Carlo simulation.

Each type of analytics has its own strengths and limitations, and the choice of which type to use
depends on the specific business problem and the available data.

Q4. Attempt any two of the following. [2x4=8]

a) Consider the following transactional database and find the frequent itemsets using the Apriori
algorithm with minimum support count = 2.

TID   List_of_item_IDs
T1    I1, I2, I5
T2    I2, I4
T3    I2, I3
T4    I1, I2, I4
T5    I1, I3
T6    I2, I3
T7    I1, I3
T8    I1, I2, I3, I5
T9    I1, I2, I3

b) What are the challenges in social media analytics?

➢ Social media analytics faces several challenges due to the unique characteristics of social media
data and the dynamic nature of online platforms. Some key challenges include:
o 1. Volume and Velocity: The vast amount of data generated on social media platforms
presents a challenge in terms of data collection, storage, and processing. The high
velocity of data, with constant updates and real-time interactions, requires efficient and
scalable analytics solutions.
o 2. Data Quality and Noise: Social media data can be noisy, containing spam, irrelevant
content, and user-generated noise. Ensuring data quality and filtering out noise are
crucial for accurate analysis and insights.
o 3. Data Privacy and Ethics: Social media analytics raises concerns about data privacy,
consent, and ethical considerations. Balancing the need for data access and analysis with
user privacy rights and ethical guidelines is an ongoing challenge.
o 4. Textual Analysis and Natural Language Processing: Analyzing unstructured text data
from social media poses challenges in understanding language nuances, performing
sentiment analysis, and dealing with slang, abbreviations, and informal language.
o 5. User Bias and Representativeness: Social media data may have biases due to
self-selection, algorithmic filtering, or the characteristics of active users. Ensuring the
representativeness of data and mitigating biases is essential for drawing accurate
conclusions and avoiding skewed insights.
o 6. Multi-Modality and Multimedia Content: Social media platforms include various types
of content, including text, images, videos, and audio. Analyzing and extracting insights
from multi-modal and multimedia content adds complexity to the analytics process.
o 7. Real-Time Monitoring and Crisis Management: Social media analytics often involves
monitoring and managing brand reputation, crisis situations, and emerging trends in
real time. Quickly identifying and responding to online events or sentiments is critical
but challenging.
o 8. Data Integration and Platform Heterogeneity: Integrating data from multiple social
media platforms and sources, each with its own APIs, data formats, and access
restrictions, can be complex. Dealing with platform heterogeneity and ensuring data
consistency pose integration challenges.

Addressing these challenges requires a combination of technical expertise, domain
knowledge, data processing capabilities, and ethical considerations to extract valuable
insights from social media data while navigating the complexities and limitations of the
platforms.

c) Explain reinforcement learning.

➢ Reinforcement learning is a type of machine learning that is used to train an agent to make
decisions in an environment. The agent learns by interacting with the environment and receiving
feedback in the form of rewards or penalties. The goal of reinforcement learning is to find the
optimal policy that maximizes the cumulative reward over time.

The reinforcement learning process starts with the agent in a particular state of the environment.
The agent takes an action in response to the state, and the environment transitions to a new
state and provides a reward to the agent based on the action taken. The agent then updates its
policy based on the feedback received and repeats the process.

The key components of reinforcement learning are the policy, the reward function, and the value
function. The policy determines the action to take given the current state of the environment.
The reward function provides feedback to the agent in the form of a scalar value that indicates
how good or bad the action was. The value function estimates the expected cumulative reward
from a particular state.

Reinforcement learning has been successfully applied to a wide range of applications, including
robotics, game playing, and autonomous vehicles. However, it can be challenging to apply in
practice due to the need for extensive training and the difficulty of designing a reward function
that accurately reflects the desired behavior.
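
The state, action, reward, and update loop described above can be sketched with tabular Q-learning on a made-up five-state corridor, where the agent earns a reward of 1 for reaching the rightmost state. Everything here is an illustrative assumption: the environment, the constants, and the simplification of sweeping over every state-action pair instead of exploring at random, which keeps the run deterministic.

```python
N_STATES = 5                 # states 0..4 in a corridor; state 4 is the terminal goal
ACTIONS = ("left", "right")
GAMMA = 0.9                  # discount factor for future rewards
ALPHA = 0.5                  # learning rate


def step(state, action):
    """Environment: return (next_state, reward) for taking `action` in `state`."""
    nxt = max(0, state - 1) if action == "left" else state + 1
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward


def train(sweeps=500):
    """Learn action values Q[state][action] with the Q-learning update rule."""
    q = [{a: 0.0 for a in ACTIONS} for _ in range(N_STATES)]
    for _ in range(sweeps):
        for s in range(N_STATES - 1):          # the terminal state is never updated
            for a in ACTIONS:
                nxt, r = step(s, a)
                # Target: immediate reward plus discounted best value of the next state.
                target = r + GAMMA * max(q[nxt].values())
                q[s][a] += ALPHA * (target - q[s][a])
    return q


def greedy_policy(q):
    """Policy: in each non-terminal state pick the action with the highest value."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    q_table = train()
    print(greedy_policy(q_table))
```

In this toy setting the learned policy moves right from every state, and the value of moving right shrinks by the discount factor with each extra step from the goal. A practical agent would gather experience by acting in the environment, for example with epsilon-greedy exploration, rather than sweeping all pairs.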

Q5. Attempt any one of the following. [1x3=3]

a) Write a short note on support vector machines.
➢ Support Vector Machine (SVM) is a popular supervised machine learning algorithm used for
classification and regression analysis. In its basic form it is a binary classifier that separates
data points into two classes based on their features. SVM finds the best hyperplane that separates
the classes by maximizing the margin between the closest data points from each class. The data
points that are closest to the hyperplane are called support vectors.

SVM can handle both linear and non-linear data by using a technique called the kernel trick. The
kernel trick implicitly maps the data points into a higher-dimensional space where they can be
separated by a hyperplane. SVM is widely used in applications such as image classification,
bioinformatics, text classification, and fraud detection. However, SVM can be computationally
expensive on large datasets, and it may not perform well when the classes overlap or the data
is noisy.
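
To make the terms hyperplane, margin, and support vector concrete, the sketch below evaluates a fixed linear separator on four made-up 2-D points. The weights `W` and bias `B` were worked out by hand for this toy data, scaled so the closest points score exactly +1 or -1; the sketch does not train an SVM, which in practice is done by an optimization solver.

```python
import math

# Made-up training points: ((x1, x2), label) with labels +1 and -1.
POINTS = [((2.0, 2.0), 1), ((3.0, 3.0), 1), ((0.0, 0.0), -1), ((-1.0, 0.0), -1)]

# Hand-derived separating hyperplane w.x + b = 0 for the toy points (an assumption).
W = (0.5, 0.5)
B = -1.0


def decision(x, w=W, b=B):
    """Signed score w.x + b; its sign gives the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(x):
    """Predict +1 or -1 from the side of the hyperplane the point falls on."""
    return 1 if decision(x) >= 0 else -1


def margin_width(w=W):
    """Distance between the two margin boundaries: 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


def support_vectors(points, tol=1e-9):
    """Points lying exactly on a margin boundary, i.e. y * (w.x + b) == 1."""
    return [x for x, y in points if abs(y * decision(x) - 1.0) <= tol]


if __name__ == "__main__":
    print([classify(x) for x, _ in POINTS])
    print(support_vectors(POINTS))
    print(margin_width())
```

Only the two closest points, (2, 2) and (0, 0), sit on the margin boundaries; moving either of the other points slightly would leave this hyperplane unchanged, which is why the support vectors alone determine the separator.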

b) Explain the lifecycle of data analytics.

Phase 1: Discovery –
The data science team learns about and investigates the problem.
It develops context and understanding.
It identifies the data sources needed and available for the project.
The team formulates initial hypotheses that can later be tested with data.

Phase 2: Data Preparation –
This phase covers the steps to explore, preprocess, and condition data prior to modeling and analysis.
It requires an analytic sandbox; the team performs extract, load, and transform (ELT) to get data
into the sandbox.
Data preparation tasks are likely to be performed multiple times and not in a predefined order.
Tools commonly used in this phase include Hadoop, Alpine Miner, and OpenRefine.

Phase 3: Model Planning –
The team explores the data to learn about relationships between variables and then selects the key
variables and the most suitable models.
It decides on the methods, techniques, and workflow to follow in the model building phase.
Tools commonly used in this phase include Matlab and STATISTICA.

Phase 4: Model Building –
The team develops datasets for testing, training, and production purposes.
It builds and executes models based on the work done in the model planning phase.
The team also considers whether its existing tools will suffice for running the models or whether
it needs a more robust environment for executing them.
Free or open-source tools: R and PL/R, Octave, WEKA.
Commercial tools: Matlab, STATISTICA.

Phase 5: Communicate Results –
After executing the model, the team compares the outcomes of modeling to the criteria established
for success and failure.
The team considers how best to articulate findings and outcomes to the various team members and
stakeholders, taking into account caveats and assumptions.
The team should identify key findings, quantify the business value, and develop a narrative to
summarize and convey the findings to stakeholders.

Phase 6: Operationalize –
The team communicates the benefits of the project more broadly and sets up a pilot project to deploy
the work in a controlled way before broadening it to the full enterprise of users.
This approach enables the team to learn about the performance and related constraints of the model
in a production environment on a small scale and to make adjustments before full deployment.
The team delivers final reports, briefings, and code.
Free or open-source tools: Octave, WEKA, SQL, MADlib.