DM Chapter 1

The document provides an overview of data mining, detailing its processes, architecture, functionalities, technologies, and applications. It outlines the steps involved in the knowledge discovery process, the types of data that can be mined, and the major issues faced in the field. Additionally, it compares data warehouses and data marts, discusses various applications across sectors, and contrasts classification and regression techniques.

Uploaded by

maharanadebu

Chapter 1

Introduction

Data Mining: Data mining is the process of extracting useful information from large sets of
data. It involves using various techniques from statistics, machine learning, and database
systems to identify patterns, relationships, and trends in the data. This information can then be
used to make data-driven decisions, solve business problems, and uncover hidden insights.

KDD Process in Data Mining: Data mining is also referred to as knowledge discovery from data
(KDD). The knowledge discovery process is depicted in Figure 1.

Figure 1: Data mining as a step in the process of knowledge discovery

The knowledge discovery process consists of the following steps:

(a) Data Cleaning − In this step, noise and inconsistent data are removed.
(b) Data Integration − In this step, multiple data sources are combined.
(c) Data Selection − In this step, data relevant to the analysis task are retrieved from the
database.
(d) Data Transformation − In this step, data are transformed or consolidated into forms
appropriate for mining by performing summary or aggregation operations.
(e) Data Mining − In this step, intelligent methods are applied in order to extract data patterns.
(f) Pattern Evaluation − In this step, the extracted patterns are evaluated against
interestingness measures to identify those that truly represent knowledge.
(g) Knowledge Presentation − In this step, the mined knowledge is presented to the user, for
example through visualization.
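
The seven steps above can be sketched as a chain of small functions. This is only an illustrative sketch: the store records, the function names, and the spending threshold are all hypothetical, and the "mining" step is deliberately trivial so that the flow of the process stays visible.

```python
from collections import Counter

# Hypothetical raw records from two sources; None marks a missing value.
store_a = [
    {"customer": "c1", "item": "bread", "amount": 2.0},
    {"customer": "c2", "item": "milk", "amount": None},  # noisy record
    {"customer": "c1", "item": "milk", "amount": 1.5},
]
store_b = [
    {"customer": "c3", "item": "bread", "amount": 2.5},
    {"customer": "c3", "item": "milk", "amount": 1.0},
]

def clean(records):
    """(a) Data cleaning: drop records whose amount is missing."""
    return [r for r in records if r["amount"] is not None]

def integrate(*sources):
    """(b) Data integration: combine several sources into one list."""
    return [r for source in sources for r in source]

def select(records, items):
    """(c) Data selection: keep only records relevant to the task."""
    return [r for r in records if r["item"] in items]

def transform(records):
    """(d) Data transformation: aggregate spending per customer."""
    totals = Counter()
    for r in records:
        totals[r["customer"]] += r["amount"]
    return dict(totals)

def mine(totals, threshold):
    """(e) Data mining: a toy pattern, customers spending at least threshold."""
    return {c: t for c, t in totals.items() if t >= threshold}

def evaluate(patterns, min_count=1):
    """(f) Pattern evaluation: keep the result only if enough customers match."""
    return patterns if len(patterns) >= min_count else {}

def present(patterns):
    """(g) Knowledge presentation: render the result as readable lines."""
    return [f"{c} spent {t:.2f}" for c, t in sorted(patterns.items())]

data = integrate(clean(store_a), clean(store_b))
data = select(data, {"bread", "milk"})
report = present(evaluate(mine(transform(data), threshold=3.0)))
print(report)
```

In a real system each step would be far more involved, but the order of the calls mirrors steps (a) to (g).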

Architecture of Data Mining System: The architecture of a typical data mining system is
shown in Figure 2.

Figure 2: Architecture of a typical data mining system

A detailed description of parts of data mining architecture is given below:

(a) Data Sources: Databases, data warehouses, and the World Wide Web (WWW) are typical data
sources. The data in these sources may be in the form of plain text, spreadsheets, or other
media such as photos or videos. The WWW is one of the biggest sources of data.
(b) Database Server: The database server contains the actual data ready to be processed. It
retrieves the relevant data in response to the user’s request.
(c) Data Mining Engine: It is one of the core components of the data mining architecture. It
performs data mining tasks such as association, classification, characterization,
clustering, and prediction.
(d) Pattern Evaluation Modules: They are responsible for finding interesting patterns in the
data. They sometimes also interact with the database server to produce the result of a
user request.
(e) Graphical User Interface: Since the user cannot be expected to understand the full
complexity of the data mining process, the graphical user interface helps the user to
communicate effectively with the data mining system.
(f) Knowledge Base: The knowledge base is an important part of the data mining system that
guides the search for result patterns. The data mining engine may also take inputs from
the knowledge base. The knowledge base may contain data from user experiences. Its
objective is to make the results more accurate and reliable.

What Kinds of data can be mined?

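• Relational Databases − A relational database is a collection of tables, each consisting of a
set of attributes (columns) and storing a set of tuples (rows). It is managed by a database
system, also called a database management system, which includes a set of interrelated
data, called a database, and a set of software programs to handle and access the data.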
• Data warehouse – A data warehouse is a repository of information collected from multiple
sources, stored under a unified schema, and usually residing at a single site. Data
warehouses are constructed via a process of data cleaning, data integration, data
transformation, data loading, and periodic data refreshing.
• Transactional Databases − A transactional database includes a file where each record
represents a transaction. A transaction generally contains a unique transaction identity number
(trans ID) and a list of the items making up the transaction (such as items purchased in a
store).
• Object-Relational Databases − Object-relational databases are built on an
object-relational data model. This model extends the relational model by supporting
rich data types for managing complex objects and object orientation.
• Temporal Database − A temporal database generally stores relational data that contains
time-related attributes. These attributes can include multiple timestamps, each having
different semantics.
• Sequence Database − A sequence database stores sequences of ordered events, with or
without a concrete notion of time. Examples are customer shopping sequences, Web click
streams, and biological sequences.
• Time-Series Database − A time-series database stores sequences of values or events
obtained over repeated measurements of time (e.g., hourly, daily, weekly). Examples
include data collected from the stock exchange, inventory control, and the measurement of
natural phenomena (like temperature and wind).
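
To make three of these kinds of data concrete, the following sketch shows how a transactional record, a time series, and a sequence might be represented in memory. The class name, field names, and sample values are assumptions chosen for illustration, not a standard schema.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Transaction:
    """One record of a transactional database: an ID plus its items."""
    trans_id: str
    items: tuple

# Transactional data: each record lists the items bought together.
transactions = [
    Transaction("T100", ("bread", "milk")),
    Transaction("T200", ("bread", "butter", "milk")),
]

# Time-series data: values measured at regular (here daily) intervals.
temperatures = [
    (date(2024, 1, 1), 21.5),
    (date(2024, 1, 2), 22.0),
    (date(2024, 1, 3), 20.5),
]

# Sequence data: ordered events with no explicit timestamps.
click_sequence = ["home", "search", "product", "cart"]

# Distinct items across all transactions.
items_sold = sorted({item for t in transactions for item in t.items})

# Day-to-day change in the time series.
daily_change = [
    round(later[1] - earlier[1], 2)
    for earlier, later in zip(temperatures, temperatures[1:])
]

print(items_sold)
print(daily_change)
```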

Data Mining Functionalities: Data mining functionalities specify the kinds of patterns to
be found in data mining tasks. The different data mining functionalities are:
(a) Data characterization – It is a summarization of the general characteristics of a target
class of data. The data corresponding to the user-specified class is generally collected by a
database query. The output of data characterization can be presented in multiple forms.
(b) Data discrimination – It is a comparison of the general characteristics of target class data
objects with the general characteristics of objects from one or a set of contrasting classes.
The target and contrasting classes can be specified by the user, and the corresponding data
objects fetched through database queries.
(c) Association Analysis − It analyses the sets of items that frequently occur together in a
transactional dataset.
(d) Classification − Classification is the procedure of discovering a model that describes and
distinguishes data classes or concepts, with the objective of using the model to
predict the class of objects whose class label is unknown. The derived model is
based on the analysis of a set of training data (i.e., data objects whose class label is
known).
(e) Prediction − Prediction estimates unavailable data values or upcoming trends. A value
can be predicted for an object based on its own attribute values and the attribute values of the
classes. It can be a prediction of missing numerical values or of increase/decrease trends in
time-related information.
(f) Clustering − In cluster analysis, similar data objects are grouped (or clustered) together
without known class labels. Clustering algorithms split the data into groups based on
similarity, so that objects within a group are more similar to each other than to objects in
other groups.
(g) Outlier analysis − Analysing outliers helps in understanding data quality. An outlier is a
data object that does not comply with the general behaviour of the data. A large number of
outliers in a data set can indicate lower data quality, so such a data set should be used with
caution when finding patterns or drawing conclusions. This analysis is helpful when an
algorithm fails to classify data and we encounter objects whose attributes do not match any
class or general model.
(h) Evolution analysis − It describes the trends for objects whose behaviour changes over
time.
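
As one concrete instance of these functionalities, outlier analysis can be sketched with the interquartile range (IQR) rule. The IQR rule is a common statistical convention chosen here for illustration; the notes do not prescribe a particular method, and the function name and sample readings are hypothetical.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # three quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical sensor readings with one obviously anomalous value.
readings = [10, 12, 11, 13, 12, 11, 95, 12, 10, 13]
outliers = iqr_outliers(readings)
print(outliers)
```

Whether a flagged value is an error to discard or an interesting rare event depends on the application, so the output is best treated as a list of candidates to inspect.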

Which Technologies are used in data mining?

As a highly application-driven domain, data mining has incorporated many techniques from
other domains such as statistics, machine learning, pattern recognition, database and data
warehouse systems, information retrieval, visualization, algorithms, high performance
computing, and many application domains (Figure 3).

Figure 3: Data mining adopts techniques from many domains

(a) Statistics: Statistics studies the collection, analysis, interpretation or explanation, and
presentation of data. Data mining has an inherent connection with statistics. A statistical
model is a set of mathematical functions that describe the behaviour of the objects in a
target class in terms of random variables and their associated probability distributions.
(b) Machine Learning: Machine learning investigates how computers can learn (or improve
their performance) based on data. A main research area is for computer programs to
automatically learn to recognize complex patterns and make intelligent decisions based on
data.
(c) Database Systems and Data Warehouses: Many data mining tasks need to handle large
data sets or even real-time, fast streaming data. Therefore, data mining can make good use
of scalable database technologies to achieve high efficiency and scalability on large data
sets. Moreover, data mining tasks can be used to extend the capability of existing database
systems to satisfy advanced users’ sophisticated data analysis requirements.
(d) Information Retrieval: Information retrieval (IR) is the science of searching for
documents or information in documents.

Major Issues in Data Mining: Data mining is a dynamic and fast-expanding field with great
strengths. However, there are some major issues.
(a) Mining Methodology: Researchers have been vigorously developing new data mining
methodologies. This involves the investigation of new kinds of knowledge, mining in
multidimensional space, integrating methods from other disciplines, and the consideration
of semantic ties among data objects. In addition, mining methodologies should consider
issues such as data uncertainty, noise, and incompleteness. Some mining methods explore
how user specified measures can be used to assess the interestingness of discovered patterns
as well as guide the discovery process.
(b) User Interaction: The user plays an important role in the data mining process. Interesting
areas of research include how to interact with a data mining system, how to incorporate a
user’s background knowledge in mining, and how to visualize and comprehend data mining
results.
(c) Efficiency and Scalability: Efficiency and scalability are always considered when
comparing data mining algorithms. As data amounts continue to multiply, these two factors
are especially critical.
(d) Diversity of Database Types: The wide diversity of database types brings about challenges
to data mining. These include:
• Handling complex types of data
• Mining dynamic, networked, and global data repositories
(e) Data Mining and Society: How does data mining impact society? What steps can data
mining take to preserve the privacy of individuals? Do we use data mining in our daily lives
without even knowing that we do? These questions raise the following issues:
• Social impacts of data mining
• Privacy-preserving data mining
• Invisible data mining

Difference between Data Warehouse and Data Mart

No. | Data Warehouse | Data Mart
1. | It is a centralised system. | It is a decentralised system.
2. | Light denormalization takes place. | Heavy denormalization takes place.
3. | It is a top-down model. | It is a bottom-up model.
4. | It is difficult to build. | It is easy to build.
5. | Fact constellation schema is used. | Star and snowflake schemas are used.
6. | Designing schemas and views is a complicated process. | Designing schemas and views is an easy process.
7. | It is flexible. | It is not flexible.
8. | It is data-oriented in nature. | It is project-oriented in nature.
9. | It has a long life. | It has a shorter life than a warehouse.
10. | Data are held in detailed form. | Data are held in summarized form.
11. | It is vast in size. | It is smaller than a warehouse.
12. | It collects data from various data sources. | It generally stores data from a data warehouse.
13. | Processing takes a long time because of the large volume of data. | Processing takes less time because only a small amount of data is handled.
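
Rows 10 and 12 of the comparison can be illustrated with a small sketch in which a data mart is derived from detailed warehouse rows by filtering to one department and summarizing. The row layout, the department names, and the `build_mart` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical detailed rows, as they might be held in a warehouse.
warehouse_sales = [
    {"dept": "grocery", "region": "north", "amount": 120.0},
    {"dept": "grocery", "region": "south", "amount": 80.0},
    {"dept": "electronics", "region": "north", "amount": 900.0},
    {"dept": "grocery", "region": "north", "amount": 50.0},
]

def build_mart(rows, dept):
    """Summarize one department's sales per region (a toy data mart)."""
    summary = defaultdict(float)
    for row in rows:
        if row["dept"] == dept:
            summary[row["region"]] += row["amount"]
    return dict(summary)

grocery_mart = build_mart(warehouse_sales, "grocery")
print(grocery_mart)
```

The mart is smaller and quicker to query than the warehouse rows, but the individual sales can no longer be recovered from it.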

Data Mining Applications: A large number of applications use data mining concepts.
Some of them are depicted in Figure 4.
(a) Scientific Analysis: Scientific simulations generate large volumes of data every day. This
includes data collected from nuclear laboratories, data about human psychology, etc. Data
mining techniques are capable of analysing these data. New data can now be captured and
stored faster than the old, already accumulated data can be analysed. Examples of
scientific analysis are:
• Sequence analysis in bioinformatics
• Classification of astronomical objects
• Medical decision support

Figure 4: Data Mining Applications

(b) Intrusion Detection: A network intrusion refers to any unauthorized activity on a digital
network. Network intrusions often involve stealing valuable network resources. Data
mining techniques play a vital role in detecting intrusions, network attacks, and
anomalies. These techniques help in selecting and refining useful and relevant information
from large data sets and in classifying relevant data for an intrusion detection
system, which raises alarms about suspicious network traffic
entering the system. Examples are:
• Detect security violations
• Misuse Detection
• Anomaly Detection

(c) Business Transactions: Every business transaction is recorded for perpetuity. Such
transactions are usually time-related and can be inter-business deals or intra-business
operations. Using these data effectively and within a reasonable time frame for
competitive decision-making is one of the most important problems for
businesses that struggle to survive in a highly competitive world. Data mining helps to
analyse these business transactions, identify marketing approaches, and support
decision-making. Examples are:
• Direct mail targeting
• Stock trading
• Customer segmentation
• Churn prediction (Churn prediction is one of the most popular Big Data use cases
in business)

(d) Market Basket Analysis: Market basket analysis is a technique for the careful
study of the purchases made by customers in a supermarket. It identifies patterns
of items that customers frequently purchase together. Companies can use this analysis to
promote deals, offers, and sales, and data mining techniques help to carry out this analysis.
Examples are:
• Data mining concepts are used in sales and marketing to provide better customer
service, improve cross-selling opportunities, and increase direct mail response
rates.
• Customer retention, in the form of pattern identification and prediction of likely
defections, is possible through data mining.
• Risk assessment and fraud detection also use data mining concepts to identify
inappropriate or unusual behaviour.
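
Market basket analysis is usually quantified with support and confidence. The sketch below computes both for the rule "bread => milk" over four hypothetical baskets; the basket contents and function names are assumptions made for illustration.

```python
# Hypothetical supermarket baskets, one set of items per customer visit.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Estimated probability of the consequent given the antecedent."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)

pair_support = support({"bread", "milk"}, baskets)
rule_confidence = confidence({"bread"}, {"milk"}, baskets)
print(pair_support, round(rule_confidence, 3))
```

Here bread and milk appear together in 2 of 4 baskets (support 0.5), and 2 of the 3 baskets containing bread also contain milk (confidence about 0.667).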

(e) Education: For analysing the education sector, data mining uses the Educational Data Mining
(EDM) method. This method generates patterns that can be used by both learners and
educators. Using EDM, we can perform educational tasks such as:
• Predicting students’ admission to higher education
• Student profiling
• Predicting student performance
• Evaluating teachers’ teaching performance
• Curriculum development
• Predicting student placement opportunities

(f) Research: Data mining techniques can perform prediction, classification, clustering,
association, and grouping of data in the research area. Rules generated by
data mining are used to derive results. In most technical research in data mining, we
create a training model and a testing model. The training/testing approach is a strategy to
measure the accuracy of the proposed model. It is called Train/Test because we split the
data set into two sets: a training data set and a testing data set. The training data set is used to
build the model, whereas the testing data set is used to evaluate it. Examples
are:
• Classification of uncertain data.
• Information-based clustering.
• Decision support system
• Web Mining
• Domain-driven data mining
• IoT (Internet of Things) and Cybersecurity
• Smart farming IoT (Internet of Things)
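
The Train/Test strategy described in (f) can be sketched as a shuffled split of the data set. The 25% test ratio, the fixed seed, and the function name are assumptions for illustration; libraries such as scikit-learn offer their own utilities for the same job.

```python
import random

def train_test_split(rows, test_ratio=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    rng = random.Random(seed)   # fixed seed keeps the split reproducible
    shuffled = rows[:]          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]

dataset = list(range(20))       # stand-in for 20 labelled examples
train, test = train_test_split(dataset)
print(len(train), len(test))
```

The model is fitted on `train` only and scored on `test`, so the score reflects performance on data the model has not seen.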

(g) Healthcare and Insurance: A pharmaceutical company can examine its recent sales force
activity and its outcomes to improve the targeting of high-value physicians and to determine
which marketing activities will have the best effect in the upcoming months.
In the insurance sector, data mining can help to predict which customers will buy
new policies, identify behaviour patterns of risky customers, and identify fraudulent
behaviour of customers. Examples are:
• Claims analysis, i.e., which medical procedures are claimed together
• Identifying successful medical therapies for different illnesses
• Characterizing patient behaviour to predict office visits

(h) Transportation: A diversified transportation company with a large direct sales force can
apply data mining to identify the best prospects for its services. A large consumer
goods company can apply data mining to improve its sales process to
retailers. Examples are:
• Determine the distribution schedules among outlets
• Analyse loading patterns

(i) Financial/Banking Sector: A credit card company can leverage its vast warehouse of
customer transaction data to identify customers most likely to be interested in a new credit
product. Examples are:
• Credit card fraud detection
• Identify ‘Loyal’ customers
• Extraction of information related to customers
• Determine credit card spending by customer groups

Comparison between Classification and Regression

No. | Classification | Regression
1. | The target variable is discrete. | The target variable is continuous.
2. | Problems like spam email classification and disease prediction are solved using classification algorithms. | Problems like house price prediction and rainfall prediction are solved using regression algorithms.
3. | We try to find the best possible decision boundary, one that separates the classes with the maximum possible separation. | We try to find the best-fit line, one that represents the overall trend in the data.
4. | Evaluation metrics like Accuracy, Precision, Recall, and F1-Score are used to evaluate the performance of classification algorithms. | Evaluation metrics like Mean Squared Error (MSE), Mean Absolute Percentage Error (MAPE), and R2-Score are used to evaluate the performance of regression algorithms.
5. | Problems are either binary classification or multi-class classification problems. | Models are either linear regression models or non-linear models.
6. | Input data consist of independent variables and a categorical dependent variable. | Input data consist of independent variables and a continuous dependent variable.
7. | The algorithm’s task is to map the input value (x) to a discrete output variable (y). | The algorithm’s task is to map the input value (x) to a continuous output variable (y).
8. | Output is categorical labels. | Output is continuous numerical values.
9. | The objective is to predict categorical/class labels. | The objective is to predict continuous numerical values.
10. | Examples of classification algorithms are: Logistic Regression, Decision Trees, Random Forest, Support Vector Machines (SVM), K-Nearest Neighbours (KNN), Naive Bayes, Neural Networks, Multi-layer Perceptron (MLP), etc. | Examples of regression algorithms are: Linear Regression, Polynomial Regression, Ridge Regression, Lasso Regression, Support Vector Regression (SVR), Decision Trees for Regression, Random Forest Regression, K-Nearest Neighbours (KNN) Regression, Neural Networks for Regression, etc.
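
The contrast in rows 1, 4, and 8 can be seen in a short sketch: a least-squares line predicts a continuous value and is scored with MSE, while a one-nearest-neighbour rule predicts a discrete label and is scored with accuracy. Both models, the toy data, and all function names are assumptions picked for brevity, not the only choices.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for one input variable."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

def mse(y_true, y_pred):
    """Mean squared error, a regression metric."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def nearest_neighbour(labelled, x):
    """Label of the training point closest to x (1-NN classification)."""
    return min(labelled, key=lambda pair: abs(pair[0] - x))[1]

def accuracy(y_true, y_pred):
    """Fraction of correct labels, a classification metric."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Regression: continuous target (hypothetical house size -> price).
sizes = [50, 60, 80, 100]
prices = [100.0, 120.0, 160.0, 200.0]
slope, intercept = fit_line(sizes, prices)
predicted = [slope * x + intercept for x in sizes]
reg_error = mse(prices, predicted)

# Classification: discrete target (hypothetical score -> ham or spam).
labelled = [(1, "ham"), (2, "ham"), (8, "spam"), (9, "spam")]
queries = [0, 3, 7, 10]
expected = ["ham", "ham", "spam", "spam"]
guessed = [nearest_neighbour(labelled, q) for q in queries]
clf_accuracy = accuracy(expected, guessed)

print(slope, intercept, reg_error, clf_accuracy)
```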
