Assignment 12

Uploaded by

riteshpc13
Copyright © All Rights Reserved

Assignment No. 12

Write a Python program that creates a DataFrame from a CSV file and applies one-hot encoding.

import csv

# Define the header and rows
header = ['Name', 'Color', 'Size']
rows = [
    ['Alice', 'Red', 'Small'],
    ['Bob', 'Blue', 'Medium'],
    ['Charlie', 'Red', 'Large'],
    ['David', 'Green', 'Medium'],
    ['Eva', 'Blue', 'Small'],
]

# Specify the file name
filename = 'data.csv'

# Write to the CSV file (newline='' avoids extra blank lines on Windows)
with open(filename, 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerow(header)  # Write the header
    writer.writerows(rows)   # Write the rows

print(f"CSV file '{filename}' created successfully.")

import pandas as pd
# Load the CSV file into a DataFrame

df = pd.read_csv('data.csv')

# Display the original DataFrame

print("Original DataFrame:")

print(df)

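# Perform one-hot encoding on categorical columns.
# dtype=int produces the 0/1 values shown in the output below;
# pandas 2.0 and later default to True/False.

df_encoded = pd.get_dummies(df, columns=['Color', 'Size'], dtype=int)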

# Display the DataFrame with one-hot encoding

print("\nDataFrame with One-Hot Encoding:")

print(df_encoded)

# Optionally, save the encoded DataFrame to a new CSV file

df_encoded.to_csv('data_encoded.csv', index=False)

Output:

Original DataFrame:
      Name  Color    Size
0    Alice    Red   Small
1      Bob   Blue  Medium
2  Charlie    Red   Large
3    David  Green  Medium
4      Eva   Blue   Small

DataFrame with One-Hot Encoding:
      Name  Color_Blue  Color_Green  ...  Size_Large  Size_Medium  Size_Small
0    Alice           0            0  ...           0            0           1
1      Bob           1            0  ...           0            1           0
2  Charlie           0            0  ...           1            0           0
3    David           0            1  ...           0            1           0
4      Eva           1            0  ...           0            0           1

[5 rows x 7 columns]
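
The "..." in the output above is pandas abbreviating the table to fit the terminal width; no column has been dropped from the DataFrame. A minimal sketch of one way to print every column follows. It rebuilds the same sample data in memory instead of reading data.csv, and the width of 200 is an arbitrary choice made for this illustration.

```python
import pandas as pd

# Same sample data as the assignment, built in memory so the sketch
# runs without data.csv (an assumption made for self-containment).
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
    'Color': ['Red', 'Blue', 'Red', 'Green', 'Blue'],
    'Size': ['Small', 'Medium', 'Large', 'Medium', 'Small'],
})
df_encoded = pd.get_dummies(df, columns=['Color', 'Size'], dtype=int)

# Temporarily lift the column limit and widen the display so that
# print() does not abbreviate the middle columns.
with pd.option_context('display.max_columns', None, 'display.width', 200):
    print(df_encoded)

# to_string() renders every column regardless of the display options.
full_text = df_encoded.to_string()
print(full_text)

# The column labels can also be listed directly.
column_names = list(df_encoded.columns)
print(column_names)
```

Listing the column labels is often the quickest check, since it shows the name of every generated indicator column without printing any rows.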

To perform one-hot encoding on categorical data from a CSV file using Pandas, follow these steps:

1. Load the CSV File: Use Pandas to read the data from a CSV file into a DataFrame.

2. Perform One-Hot Encoding: Use Pandas' get_dummies() function to encode categorical features into one-hot vectors.

3. Display or Save the Result: Show the DataFrame with the one-hot encoded features, and optionally save it to a new CSV file.
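
The three steps can also be wrapped in a single reusable function. The sketch below is one possible arrangement, not part of the original assignment: the function name one_hot_encode_csv and its parameters are hypothetical, and the two-row CSV is throwaway data written to a temporary directory so the example runs on its own.

```python
import os
import tempfile

import pandas as pd


def one_hot_encode_csv(input_path, columns, output_path=None):
    """Hypothetical helper: load a CSV, one-hot encode `columns`, optionally save."""
    df = pd.read_csv(input_path)                               # step 1: load
    encoded = pd.get_dummies(df, columns=columns, dtype=int)   # step 2: encode
    if output_path is not None:                                # step 3: save
        encoded.to_csv(output_path, index=False)
    return encoded


# Usage with a small throwaway CSV in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'data.csv')
    dst = os.path.join(tmp, 'data_encoded.csv')
    with open(src, 'w', newline='') as f:
        f.write('Name,Color,Size\nAlice,Red,Small\nBob,Blue,Medium\n')

    result = one_hot_encode_csv(src, ['Color', 'Size'], dst)
    saved = pd.read_csv(dst)  # read the saved file back to check it
    print(result)
```

Passing the column list as an argument keeps the helper from encoding identifier columns such as Name by accident.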

Explanation

1. Load CSV File:

   o pd.read_csv('data.csv') reads the CSV file into a DataFrame.

2. One-Hot Encoding:

   o pd.get_dummies(df, columns=['Color', 'Size']) performs one-hot encoding on the specified categorical columns. The columns parameter specifies which columns to encode; the Name column is left unchanged.

   o Each unique value in an encoded column becomes its own indicator column named <column>_<value> (for example, Color_Blue). The column holds 1 in rows that have that value and 0 elsewhere. In pandas 2.0 and later the indicators are True/False unless dtype=int is passed.

3. Display the Result:

   o Print the original and the one-hot encoded DataFrames to compare them.

4. Save to CSV (Optional):

   o df_encoded.to_csv('data_encoded.csv', index=False) saves the one-hot encoded DataFrame to a new CSV file. index=False leaves the row index out of the file.
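
To make the "one indicator column per unique value" idea concrete, the following sketch builds the Color indicators by hand and compares them with what get_dummies returns. It is an illustration only: the variable names manual and auto are made up here, and the Series repeats the Color values from the sample data.

```python
import pandas as pd

# The Color values from the sample data.
colors = pd.Series(['Red', 'Blue', 'Red', 'Green', 'Blue'], name='Color')

# Build one 0/1 indicator column per unique value, by hand.
# sorted() mirrors the alphabetical column order seen in the output.
manual = pd.DataFrame({
    f'Color_{value}': (colors == value).astype(int)
    for value in sorted(colors.unique())
})

# The same encoding produced by pandas.
auto = pd.get_dummies(colors, prefix='Color', dtype=int)

print(manual)
print(auto)

# Each row contains exactly one 1, which is what "one-hot" means.
row_sums = manual.sum(axis=1).tolist()
print(row_sums)
```

Writing the comparison out this way shows that get_dummies is a shortcut for an equality test against each category, not a separate kind of transformation.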
