Department of CE/IT & AIDS Engineering
A.Y. 2024-25
LAB MANUAL
SUB: Data Mining and Warehousing
Experiment No.: 1
Title:
Create an Employee Table with the help of Data Mining Tool WEKA.
Objective:
To create and visualize an Employee dataset using WEKA and understand how the ARFF file
format is used to represent structured data for data mining.
Theory:
WEKA (Waikato Environment for Knowledge Analysis) is a powerful suite of machine
learning software written in Java, developed at the University of Waikato, New Zealand. It
supports the standard data mining tasks of data preprocessing, classification, regression,
clustering, association rule mining, and visualization.
In WEKA, data is typically stored in ARFF (Attribute-Relation File Format) files. An ARFF
file consists of a header section that defines the attributes (fields) and a data section that
contains the records (instances).
ARFF File Format for Employee Table:
Below is a sample ARFF file content for the Employee table.
@relation employee
@attribute emp_id numeric
@attribute name string
@attribute age numeric
@attribute gender {Male, Female}
@attribute department {HR, IT, Sales, Finance}
@attribute salary numeric
@data
101, 'John', 30, Male, IT, 55000
102, 'Sara', 25, Female, HR, 48000
103, 'Mike', 35, Male, Sales, 62000
104, 'Anna', 28, Female, Finance, 50000
105, 'David', 40, Male, IT, 70000
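Sample Code (loading the ARFF via the WEKA Java API, optional):
The sketch below is a minimal example, assuming the content above is saved as employee.arff in the working directory; it loads the file and prints a per-attribute summary.
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class LoadEmployee {
    public static void main(String[] args) throws Exception {
        // Read the ARFF file into WEKA's in-memory dataset representation
        Instances data = DataSource.read("employee.arff");
        System.out.println("Instances loaded: " + data.numInstances());
        // Per-attribute summary (type, missing, distinct values, etc.)
        System.out.println(data.toSummaryString());
    }
}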
Procedure:
1. Open the WEKA Explorer.
2. Go to the Preprocess tab.
3. Click on Open file… and browse to your .arff file containing the Employee data.
4. Once loaded, you will see attributes like emp_id, name, age, gender, etc.
5. You can now:
o View the summary of data.
o Apply filters to preprocess data.
o Proceed with classification or clustering, if needed.
Result:
The Employee table was successfully created and loaded in WEKA using an ARFF file. The
dataset is now ready for data mining operations.
Viva Questions:
1. What is the full form of ARFF?
2. What types of attributes can WEKA handle?
3. How do you define categorical vs numerical attributes in ARFF?
4. What is the purpose of WEKA?
5. Can WEKA be used for real-time data mining?
Experiment No.: 2
Title:
Create a Weather Table with the help of Data Mining Tool WEKA.
Objective:
To create and visualize a Weather dataset using WEKA and understand how categorical and
numeric attributes are defined in the ARFF format.
Theory:
WEKA is a Java-based open-source tool used for data preprocessing, classification,
regression, clustering, and association rules. It uses ARFF (Attribute-Relation File Format) to
load datasets.
The Weather dataset is a classic dataset used in data mining for classification problems (e.g.,
predicting whether to play based on weather conditions). The dataset includes the attributes
outlook, temperature, humidity, and windy, plus the target class play.
ARFF File Format for Weather Table:
@relation weather
@attribute outlook {sunny, overcast, rainy}
@attribute temperature numeric
@attribute humidity numeric
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}
@data
sunny, 85, 85, FALSE, no
sunny, 80, 90, TRUE, no
overcast, 83, 78, FALSE, yes
rainy, 70, 96, FALSE, yes
rainy, 68, 80, FALSE, yes
rainy, 65, 70, TRUE, no
overcast, 64, 65, TRUE, yes
sunny, 72, 95, FALSE, no
sunny, 69, 70, FALSE, yes
rainy, 75, 80, FALSE, yes
sunny, 75, 70, TRUE, yes
overcast, 72, 90, TRUE, yes
overcast, 81, 75, FALSE, yes
rainy, 71, 91, TRUE, no
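Sample Code (checking the class distribution via the WEKA Java API, optional):
A minimal sketch, assuming the content above is saved as weather.arff; it counts how many instances carry each value of the target class play.
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class LoadWeather {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("weather.arff");
        data.setClassIndex(data.numAttributes() - 1); // play is the last attribute
        // Tally instances per class value (yes/no)
        int[] counts = new int[data.classAttribute().numValues()];
        for (int i = 0; i < data.numInstances(); i++) {
            counts[(int) data.instance(i).classValue()]++;
        }
        for (int j = 0; j < counts.length; j++) {
            System.out.println(data.classAttribute().value(j) + ": " + counts[j]);
        }
    }
}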
Procedure:
1. Open WEKA Explorer.
2. Navigate to the Preprocess tab.
3. Click on Open file… and select the .arff file containing the Weather data.
4. The dataset will be loaded, and attributes such as outlook, temperature, humidity,
windy, and play will be visible.
5. Use the interface to:
o Explore statistics.
o Visualize attribute distributions.
o Apply classification/clustering if required.
Result:
The Weather table was successfully created and visualized using WEKA. The dataset is now
ready for classification or other data mining operations.
Viva Questions:
1. What are the different data types supported by WEKA?
2. Explain the structure of an ARFF file.
3. What is the use of the “@relation” keyword?
4. How do you represent categorical values in ARFF?
5. Why is the Weather dataset commonly used in classification problems?
Experiment No.: 3
Title: Apply Pre-Processing techniques to the training data set of Weather Table
Objective:
To apply various data pre-processing techniques on the weather dataset to improve data
quality and prepare it for further analysis.
Software / Tool Used:
WEKA (Waikato Environment for Knowledge Analysis)
Theory:
Data preprocessing is an essential step in data mining that involves cleaning and transforming
raw data into an understandable format. It includes the following techniques:
Handling Missing Values
Normalization/Standardization
Discretization
Attribute Removal
Reordering Attributes
WEKA provides a GUI to perform these preprocessing tasks using its Preprocess tab.
Dataset Used:
Weather.arff – A small dataset included with WEKA having the attributes outlook,
temperature, humidity, windy, and play.
Procedure:
1. Open WEKA GUI Chooser → Select Explorer.
2. Click on Open file and load the weather.arff dataset.
3. Under the Preprocess tab:
o To remove an attribute: Select the attribute → Click Remove.
o To normalize: Click Filter → Choose
weka.filters.unsupervised.attribute.Normalize.
o To standardize: Click Filter → Choose
weka.filters.unsupervised.attribute.Standardize.
o To replace missing values: Choose
weka.filters.unsupervised.attribute.ReplaceMissingValues.
o To discretize numeric attributes: Choose
weka.filters.unsupervised.attribute.Discretize.
4. Click Apply after selecting any filter to view changes in the dataset.
Sample Code (for scripting in WEKA command line, optional):
java weka.filters.unsupervised.attribute.ReplaceMissingValues -i weather.arff -o weather_cleaned.arff
java weka.filters.unsupervised.attribute.Normalize -i weather_cleaned.arff -o weather_normalized.arff
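The same two steps can also be written against the WEKA Java API. A minimal sketch, assuming weather.arff is in the working directory (the output file name is arbitrary):
import java.io.File;
import weka.core.Instances;
import weka.core.converters.ArffSaver;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.Normalize;
import weka.filters.unsupervised.attribute.ReplaceMissingValues;

public class PreprocessWeather {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("weather.arff");

        // Step 1: replace missing values with attribute means/modes
        ReplaceMissingValues rmv = new ReplaceMissingValues();
        rmv.setInputFormat(data);
        Instances cleaned = Filter.useFilter(data, rmv);

        // Step 2: scale all numeric attributes to [0, 1]
        Normalize norm = new Normalize();
        norm.setInputFormat(cleaned);
        Instances normalized = Filter.useFilter(cleaned, norm);

        // Write the transformed dataset to a new ARFF file
        ArffSaver saver = new ArffSaver();
        saver.setInstances(normalized);
        saver.setFile(new File("weather_normalized.arff"));
        saver.writeBatch();
    }
}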
Result:
Various preprocessing techniques were successfully applied to the weather dataset using
WEKA. The dataset is now cleaned, normalized, and ready for further data mining tasks.
Conclusion:
Pre-processing is a crucial step that improves the quality of data and helps in extracting
meaningful patterns during data mining. Using WEKA, these operations can be performed
efficiently.
Viva Questions:
1. What is the importance of data pre-processing in data mining?
2. What is the difference between normalization and standardization?
3. How does WEKA handle missing values?
4. What is discretization, and when is it used?
5. Why might we choose to remove or reorder attributes?
6. Can you name a few filters available in WEKA for preprocessing?
Experiment No.: 4
Title: Apply Pre-Processing techniques to the training data set of Employee Table
Objective:
To apply data pre-processing techniques on an employee dataset using the WEKA tool in
order to clean and prepare the data for further analysis.
Software / Tool Used:
WEKA (Waikato Environment for Knowledge Analysis)
Theory:
Data preprocessing is a data mining technique used to transform raw data into a clean
dataset. Key preprocessing tasks include:
Handling Missing Values
Normalization/Standardization
Discretization
Removing/Reordering Attributes
Data Type Conversion
WEKA provides built-in filters under the Preprocess tab to implement these techniques
easily.
Dataset Used:
Employee.arff (user-created or sample dataset with attributes like: EmpID, Name, Age,
Department, Salary, Experience, City)
Procedure:
1. Launch WEKA GUI Chooser → Open Explorer.
2. Load the dataset employee.arff using the Open file option.
3. Under the Preprocess tab:
o To handle missing values: Use filter
weka.filters.unsupervised.attribute.ReplaceMissingValues.
o To normalize numeric attributes: Use filter
weka.filters.unsupervised.attribute.Normalize.
o To standardize: Use weka.filters.unsupervised.attribute.Standardize.
o To discretize attributes like Age or Salary: Use
weka.filters.unsupervised.attribute.Discretize.
o To remove unnecessary attributes: Select the attribute → Click Remove.
o To reorder attributes: Use weka.filters.unsupervised.attribute.Reorder.
4. Click Apply after each filter to observe the transformed data.
Sample WEKA Command Line (Optional):
java weka.filters.unsupervised.attribute.ReplaceMissingValues -i employee.arff -o employee_cleaned.arff
java weka.filters.unsupervised.attribute.Normalize -i employee_cleaned.arff -o employee_normalized.arff
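As a concrete illustration of the Discretize step, attributes such as Age and Salary can be binned through the Java API as well. A minimal sketch, assuming Age and Salary are the 3rd and 5th attributes of employee.arff as in the dataset description above (adjust the indices to match your file):
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.Discretize;

public class DiscretizeEmployee {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("employee.arff");

        // Bin Age (attribute 3) and Salary (attribute 5) into 3 equal-width intervals
        Discretize disc = new Discretize();
        disc.setAttributeIndices("3,5"); // 1-based indices, as in the Explorer
        disc.setBins(3);
        disc.setInputFormat(data);
        Instances binned = Filter.useFilter(data, disc);

        System.out.println(binned);
    }
}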
Result:
Data preprocessing techniques were successfully applied to the Employee dataset using
WEKA. The dataset is now cleaned, normalized, and prepared for analysis and modeling.
Conclusion:
Preprocessing ensures that the dataset is clean, consistent, and suitable for applying data
mining algorithms. WEKA simplifies preprocessing through its graphical interface and
filters.
Viva Questions:
1. Why is data preprocessing essential in data mining?
2. What are common preprocessing techniques?
3. How can you handle missing values in WEKA?
4. What is the difference between normalization and standardization?
5. Why would you discretize a continuous attribute like salary?
6. How can you remove or reorder attributes in WEKA?
Experiment No.: 5
Title: Normalize Weather Table data using Knowledge Flow
Objective:
To normalize the Weather dataset using the Knowledge Flow interface of WEKA.
Software / Tool Used:
WEKA (Knowledge Flow Interface)
Theory:
Normalization is the process of scaling numeric data to a standard range, typically [0, 1],
which improves the performance of many machine learning algorithms.
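WEKA's Normalize filter performs min-max scaling: each value x of a numeric attribute is mapped to x' = (x - min) / (max - min). For example, using the temperature values from the Weather dataset of Experiment 2 (minimum 64, maximum 85), a temperature of 70 becomes (70 - 64) / (85 - 64) ≈ 0.286.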
Knowledge Flow in WEKA is a visual programming environment that allows users to design
and execute data flows using graphical components instead of the command line or the
Explorer interface.
Advantages of Knowledge Flow:
Visual representation of data processing
Modular design of data flow
Easy experimentation and customization
Dataset Used:
Weather.arff (default dataset in WEKA containing the attributes outlook, temperature,
humidity, windy, and play)
Procedure:
1. Open WEKA GUI Chooser → Click on Knowledge Flow.
2. From the left panel, drag and drop the following components onto the canvas:
o ArffLoader (to load the dataset)
o Normalize (filter for normalization)
o DataViewer (to view results)
3. Connect the components:
o ArffLoader → Normalize (via dataSet)
o Normalize → DataViewer (via dataSet)
4. Double-click on ArffLoader → Load weather.arff file.
5. Double-click Normalize → Set options if needed.
6. Click Start Loading on ArffLoader.
7. Click Play button on the toolbar to execute the flow.
8. Double-click DataViewer to view the normalized dataset.
Result:
The Weather dataset was successfully normalized using the Knowledge Flow interface in
WEKA. Numeric attributes were scaled to the range [0, 1].
Conclusion:
Knowledge Flow provides a visual and modular way to perform data preprocessing. Using it,
the Weather dataset was normalized efficiently, preparing it for further data mining tasks.
Viva Questions:
1. What is normalization and why is it used?
2. What are the different ways to normalize data?
3. What is the range of data after normalization using WEKA?
4. How does the Knowledge Flow interface differ from Explorer in WEKA?
5. Which filter is used for normalization in Knowledge Flow?
6. Can non-numeric data be normalized? Why or why not?
Experiment No.: 6
Title: Normalize Employee Table data using Knowledge Flow
Objective:
To normalize numeric attributes of the Employee dataset using the Knowledge Flow
interface in WEKA.
Software / Tool Used:
WEKA (Knowledge Flow Interface)
Theory:
Normalization is a data preprocessing technique used to scale numeric values to a common
range, typically between 0 and 1. It is essential for improving the performance of algorithms
that rely on distance metrics or gradient-based optimization.
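For example, in a Euclidean distance computed on the raw Employee data, a salary difference of 5,000 would dwarf an age difference of 10, so a distance-based learner would effectively ignore Age; after normalization, both attributes contribute on the same [0, 1] scale.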
Knowledge Flow in WEKA offers a graphical environment to design, visualize, and
execute data processing workflows. It supports drag-and-drop components and clear flow
connections.
Dataset Used:
Employee.arff (user-defined dataset with attributes like EmpID, Name, Age, Salary,
Experience, City, etc.)
Procedure:
1. Launch WEKA GUI Chooser → Select Knowledge Flow.
2. From the left component panel, drag and drop the following components:
o ArffLoader (to load the dataset)
o Normalize (to apply normalization)
o DataViewer (to view results)
3. Connect components:
o ArffLoader → Normalize using dataSet
o Normalize → DataViewer using dataSet
4. Double-click on ArffLoader → Load the file employee.arff.
5. Double-click on Normalize to configure settings if needed (optional).
6. Click Start Loading on ArffLoader.
7. Click the Run (Play) button on the toolbar.
8. Double-click on DataViewer to view the normalized data output.
Result:
The Employee dataset was successfully normalized using the Knowledge Flow interface.
Numeric attributes like Age, Salary, and Experience were scaled to a uniform range.
Conclusion:
Normalization helps eliminate the bias caused by different attribute ranges. Knowledge Flow
in WEKA provides an intuitive, visual way to apply normalization on datasets like
Employee.arff.
Viva Questions:
1. What is the purpose of normalization in data preprocessing?
2. What is the difference between normalization and standardization?
3. Which filter is used for normalization in WEKA Knowledge Flow?
4. Can categorical data be normalized? Why or why not?
5. What range is used for normalized values in WEKA?
6. How is Knowledge Flow useful compared to WEKA Explorer?
Experiment No.: 7
Title: Finding Association Rules for Buying Data
Objective:
To discover association rules from a transactional buying dataset using the Apriori
algorithm in WEKA.
Software / Tool Used:
WEKA (Explorer Interface)
Theory:
Association rule mining is used to uncover relationships between items in large transactional
datasets. A classic example is Market Basket Analysis, which extracts rules such as:
If a customer buys Bread, then they are likely to buy Butter.
Key terms:
Support: the fraction of transactions that contain the itemset; for a rule X => Y, support = P(X and Y).
Confidence: the likelihood of the consequent given the antecedent; confidence(X => Y) = support(X and Y) / support(X).
Lift: the ratio of observed support to the support expected under independence; lift(X => Y) = confidence(X => Y) / support(Y).
The Apriori algorithm finds frequent itemsets and generates rules from them that satisfy
minimum support and confidence thresholds.
Dataset Used:
BuyingData.arff (A dataset containing transactions like: milk, bread, butter, tea, coffee, etc.)
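Because WEKA's Apriori operates on nominal attributes, market-basket data is typically encoded with one TRUE/FALSE attribute per item. A minimal hypothetical ARFF sketch for such a file (the item names follow the dataset description above; the two data rows are purely illustrative):
@relation buying
@attribute milk {TRUE, FALSE}
@attribute bread {TRUE, FALSE}
@attribute butter {TRUE, FALSE}
@attribute tea {TRUE, FALSE}
@attribute coffee {TRUE, FALSE}
@data
TRUE, TRUE, TRUE, FALSE, FALSE
FALSE, TRUE, TRUE, TRUE, FALSE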
Procedure:
1. Open WEKA GUI Chooser → Click Explorer.
2. Under the Preprocess tab, click Open File and load BuyingData.arff.
3. Go to the Associate tab.
4. Select the algorithm Apriori from the list.
5. (Optional) Click on Choose → Apriori to modify parameters like:
o Minimum Support (default: 0.1)
o Minimum Confidence (default: 0.9)
o Number of Rules (default: 10)
6. Click Start to run the algorithm.
7. View the generated rules in the result window under Associator Output.
Sample Output:
1. butter=TRUE => bread=TRUE conf:(0.85)
2. milk=TRUE bread=TRUE => tea=TRUE conf:(0.78)
...
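Sample Code (running Apriori via the WEKA Java API, optional):
A minimal sketch, assuming BuyingData.arff contains only nominal attributes; the parameter values mirror the defaults listed in the procedure above.
import weka.associations.Apriori;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class MineBuyingRules {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("BuyingData.arff");

        Apriori apriori = new Apriori();
        apriori.setNumRules(10);              // number of rules to report
        apriori.setMinMetric(0.9);            // minimum confidence
        apriori.setLowerBoundMinSupport(0.1); // minimum support
        apriori.buildAssociations(data);

        // Prints the frequent itemsets and the best rules found
        System.out.println(apriori);
    }
}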
Result:
Association rules were successfully extracted from the buying data using the Apriori
algorithm in WEKA, revealing patterns in customer purchasing behavior.
Conclusion:
Association rule mining helps in identifying item co-occurrence patterns in large datasets.
The Apriori algorithm is efficient and widely used for such tasks in retail and
recommendation systems.
Viva Questions:
1. What is an association rule? Provide an example.
2. Define support, confidence, and lift.
3. What is the purpose of the Apriori algorithm?
4. What does a high confidence value indicate in a rule?
5. How can association rules help in retail businesses?
6. How do you set minimum support and confidence in WEKA?
Experiment No.: 8
Title: Finding Association Rules for Banking Data
Objective:
To extract meaningful association rules from banking transactional data using the Apriori
algorithm in WEKA.
Software / Tool Used:
WEKA (Explorer Interface)
Theory:
Association rule mining helps uncover interesting relationships among variables in large
datasets. In the context of banking data, it can reveal patterns like:
If a customer has a savings account, they are likely to apply for a loan.
Key concepts:
Support: How frequently an itemset appears in the dataset.
Confidence: Likelihood that the consequent holds when the antecedent is present.
Lift: Strength of a rule over random co-occurrence.
WEKA’s Apriori algorithm identifies frequent itemsets and generates rules using defined
thresholds for support and confidence.
Dataset Used:
BankingData.arff
(Sample attributes may include: Has_Savings_Account, Applies_For_Loan,
Owns_Credit_Card, Has_Mobile_Banking, etc.)
Procedure:
1. Open WEKA GUI Chooser → Select Explorer.
2. Go to the Preprocess tab and load BankingData.arff.
3. Navigate to the Associate tab.
4. Choose the algorithm Apriori.
5. (Optional) Adjust parameters such as:
o Minimum support (e.g., 0.2)
o Minimum confidence (e.g., 0.8)
o Number of rules to display (e.g., 10)
6. Click Start to run the algorithm.
7. Review the Associator Output pane to analyze the discovered rules.
Sample Output:
1. Has_Savings_Account=TRUE => Applies_For_Loan=TRUE conf:(0.82)
2. Owns_Credit_Card=TRUE Has_Mobile_Banking=TRUE => Applies_For_Loan=TRUE
conf:(0.77)
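Sample WEKA Command Line (Optional):
The same run can be launched from the command line, in the style of the filter commands in Experiments 3 and 4; here -t names the input file, -N the number of rules, -C the minimum confidence, and -M the lower bound on support (values match the example parameters above):
java weka.associations.Apriori -t BankingData.arff -N 10 -C 0.8 -M 0.2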
Result:
The Apriori algorithm successfully mined association rules from the banking dataset,
revealing patterns in customers’ financial product usage.
Conclusion:
Association rule mining is a powerful tool to analyze customer behavior. In banking, it can be
used for cross-selling, risk analysis, and personalized marketing.
Viva Questions:
1. What are association rules and how are they used in banking?
2. Define support and confidence in the context of data mining.
3. Why is the Apriori algorithm suitable for rule mining?
4. How can association rules help with cross-selling banking products?
5. What does a confidence value of 0.9 imply?
6. Can WEKA generate association rules for numeric attributes directly?