Pca Implementation Notebook

This document demonstrates how to perform principal component analysis (PCA) on economic data with 6 variables and 16 observations. It loads and explores the data, applies PCA to reduce the data to 3 principal components, visualizes the first two components, and calculates that the first component explains 75.6% of the variance in the data.

Uploaded by

Walid Sassi

13/09/2023, 21:06 principal-component-analysis

Import all the libraries :

In [22]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

Loading Data :

In [23]:

df = pd.read_csv('/kaggle/input/principal-component-analysis/Longley (1).csv')

In [24]:

df

Out[24]:

    GNP.deflator      GNP  Unemployed  Armed.Forces  Population  Employed
 0          83.0  234.289       235.6         159.0     107.608    60.323
 1          88.5  259.426       232.5         145.6     108.632    61.122
 2          88.2  258.054       368.2         161.6     109.773    60.171
 3          89.5  284.599       335.1         165.0     110.929    61.187
 4          96.2  328.975       209.9         309.9     112.075    63.221
 5          98.1  346.999       193.2         359.4     113.270    63.639
 6          99.0  365.385       187.0         354.7     115.094    64.989
 7         100.0  363.112       357.8         335.0     116.219    63.761
 8         101.2  397.469       290.4         304.8     117.388    66.019
 9         104.6  419.180       282.2         285.7     118.734    67.857
10         108.4  442.769       293.6         279.8     120.445    68.169
11         110.8  444.546       468.1         263.7     121.950    66.513
12         112.6  482.704       381.3         255.2     123.366    68.655
13         114.2  502.601       393.1         251.4     125.368    69.564
14         115.7  518.173       480.6         257.2     127.852    69.331
15         116.9  554.894       400.7         282.7     130.081    70.551


In [25]:

df.dtypes

Out[25]:

GNP.deflator float64
GNP float64
Unemployed float64
Armed.Forces float64
Population float64
Employed float64
dtype: object

In [26]:

X = df.drop('Employed', axis=1)
Y = df['Employed']

In [27]:

correlation = df.corr()
correlation

Out[27]:

              GNP.deflator        GNP  Unemployed  Armed.Forces  Population  Employed
GNP.deflator      1.000000   0.991589    0.620633      0.464744    0.979163  0.970899
GNP               0.991589   1.000000    0.604261      0.446437    0.991090  0.983552
Unemployed        0.620633   0.604261    1.000000     -0.177421    0.686552  0.502498
Armed.Forces      0.464744   0.446437   -0.177421      1.000000    0.364416  0.457307
Population        0.979163   0.991090    0.686552      0.364416    1.000000  0.960391
Employed          0.970899   0.983552    0.502498      0.457307    0.960391  1.000000
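
The correlation matrix above can also be scanned programmatically for strongly correlated pairs, which is the kind of redundancy PCA compresses. This is a minimal sketch: the values are copied from Out[27], while the 0.95 cut-off and the names `THRESHOLD` and `strong_pairs` are assumptions made for illustration, not part of the notebook.

```python
import numpy as np

# Column names and correlation values copied from Out[27] above.
columns = ["GNP.deflator", "GNP", "Unemployed", "Armed.Forces", "Population", "Employed"]
corr = np.array([
    [1.000000, 0.991589, 0.620633, 0.464744, 0.979163, 0.970899],
    [0.991589, 1.000000, 0.604261, 0.446437, 0.991090, 0.983552],
    [0.620633, 0.604261, 1.000000, -0.177421, 0.686552, 0.502498],
    [0.464744, 0.446437, -0.177421, 1.000000, 0.364416, 0.457307],
    [0.979163, 0.991090, 0.686552, 0.364416, 1.000000, 0.960391],
    [0.970899, 0.983552, 0.502498, 0.457307, 0.960391, 1.000000],
])

THRESHOLD = 0.95  # assumed cut-off for "strongly correlated"; not from the notebook

# Walk the upper triangle so that each pair is reported only once.
strong_pairs = [
    (columns[i], columns[j], float(corr[i, j]))
    for i in range(len(columns))
    for j in range(i + 1, len(columns))
    if abs(corr[i, j]) > THRESHOLD
]

for a, b, r in strong_pairs:
    print(f"{a} / {b}: r = {r:.3f}")
```

With this cut-off, every reported pair is drawn from GNP.deflator, GNP, Population and Employed, so these four columns carry largely overlapping information.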

Apply PCA :

In [28]:

from sklearn.preprocessing import StandardScaler

In [29]:

# Scale data before applying PCA

scaling=StandardScaler()

In [30]:

# Fit the scaler on all six columns (including 'Employed'), then standardize them

scaling.fit(df)
Scaled_data=scaling.transform(df)
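
To make the scaling step concrete, the sketch below standardizes a single column by hand with NumPy. It assumes the GNP values shown in Out[24]; the names `gnp` and `gnp_scaled` are illustrative only. StandardScaler divides by the population standard deviation (`ddof=0`), so the same convention is used here.

```python
import numpy as np

# GNP column copied from Out[24] above.
gnp = np.array([
    234.289, 259.426, 258.054, 284.599, 328.975, 346.999, 365.385, 363.112,
    397.469, 419.180, 442.769, 444.546, 482.704, 502.601, 518.173, 554.894,
])

# Standardize: subtract the mean, then divide by the population
# standard deviation (ddof=0), the convention StandardScaler uses.
gnp_scaled = (gnp - gnp.mean()) / gnp.std(ddof=0)

# After scaling, the column has mean ~0 and standard deviation ~1.
print(abs(gnp_scaled.mean()) < 1e-12, abs(gnp_scaled.std() - 1.0) < 1e-12)
```

Without this step, columns measured on larger scales (such as GNP) would dominate the principal components simply because of their units.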

In [31]:

from sklearn.decomposition import PCA

In [32]:

# Keep the first 3 principal components (n_components=3)

principal=PCA(n_components=3)
principal.fit(Scaled_data)
x=principal.transform(Scaled_data)

In [33]:

# Check the dimensions of data after PCA

print(x.shape)

(16, 3)
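
The fit and transform steps can be reproduced with plain NumPy to see what they compute. This is a sketch under stated assumptions, not scikit-learn's actual implementation: the data rows are copied from Out[24], and the names `data`, `Z`, `components`, `scores` and `explained_ratio` are illustrative. The signs of individual components may be flipped relative to scikit-learn's output, since an eigenvector is only defined up to sign.

```python
import numpy as np

# Rows copied from Out[24]; columns are GNP.deflator, GNP, Unemployed,
# Armed.Forces, Population, Employed.
data = np.array([
    [83.0, 234.289, 235.6, 159.0, 107.608, 60.323],
    [88.5, 259.426, 232.5, 145.6, 108.632, 61.122],
    [88.2, 258.054, 368.2, 161.6, 109.773, 60.171],
    [89.5, 284.599, 335.1, 165.0, 110.929, 61.187],
    [96.2, 328.975, 209.9, 309.9, 112.075, 63.221],
    [98.1, 346.999, 193.2, 359.4, 113.270, 63.639],
    [99.0, 365.385, 187.0, 354.7, 115.094, 64.989],
    [100.0, 363.112, 357.8, 335.0, 116.219, 63.761],
    [101.2, 397.469, 290.4, 304.8, 117.388, 66.019],
    [104.6, 419.180, 282.2, 285.7, 118.734, 67.857],
    [108.4, 442.769, 293.6, 279.8, 120.445, 68.169],
    [110.8, 444.546, 468.1, 263.7, 121.950, 66.513],
    [112.6, 482.704, 381.3, 255.2, 123.366, 68.655],
    [114.2, 502.601, 393.1, 251.4, 125.368, 69.564],
    [115.7, 518.173, 480.6, 257.2, 127.852, 69.331],
    [116.9, 554.894, 400.7, 282.7, 130.081, 70.551],
])

# Standardize each column (population standard deviation, ddof=0).
Z = (data - data.mean(axis=0)) / data.std(axis=0)

# SVD of the standardized data: the rows of Vt are the principal axes.
U, S, Vt = np.linalg.svd(Z, full_matrices=False)

n_components = 3
components = Vt[:n_components]          # shape (3, 6)
scores = Z @ components.T               # shape (16, 3), the projected data
explained_ratio = S**2 / np.sum(S**2)   # share of total variance per axis

print(scores.shape)
print(explained_ratio[:n_components])
```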

Check Components :

In [34]:

# Check the values of the eigenvectors
# produced by the principal components
principal.components_

Out[34]:

array([[-0.46695493, -0.46748987, -0.30646472, -0.21200613, -0.46560556,
        -0.45579661],
       [ 0.02628724,  0.02306569, -0.62227098,  0.77353962, -0.07624745,
         0.08589854],
       [-0.04906877, -0.16405382,  0.67228378,  0.58400807, -0.09179226,
        -0.41136586]])
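
Each row of `components_` is a unit-length direction in the six-dimensional feature space, and the rows are mutually orthogonal. The sketch below checks this on the values printed in Out[34] and reports which variable carries the largest absolute weight in each component. The names `gram` and `dominant` are illustrative, and reading a component by its largest loading is a common heuristic rather than something the notebook itself does.

```python
import numpy as np

# Values copied from Out[34] above (one row per principal component).
components = np.array([
    [-0.46695493, -0.46748987, -0.30646472, -0.21200613, -0.46560556, -0.45579661],
    [ 0.02628724,  0.02306569, -0.62227098,  0.77353962, -0.07624745,  0.08589854],
    [-0.04906877, -0.16405382,  0.67228378,  0.58400807, -0.09179226, -0.41136586],
])
columns = ["GNP.deflator", "GNP", "Unemployed", "Armed.Forces", "Population", "Employed"]

# For orthonormal rows, components @ components.T is the 3x3 identity matrix.
gram = components @ components.T
print(np.allclose(gram, np.eye(3), atol=1e-5))

# Variable with the largest absolute weight in each component.
dominant = [columns[i] for i in np.abs(components).argmax(axis=1)]
print(dominant)
```

The first component spreads nearly equal weight over GNP.deflator, GNP, Population and Employed, while the second and third are dominated by Armed.Forces and Unemployed respectively.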

Plot the components (Visualization) :

In [35]:

# plt.figure(figsize=(10,10))
plt.scatter(x[:,0],x[:,1],c=df['Employed'],cmap='plasma')
plt.xlabel('pc1')
plt.ylabel('pc2')

Out[35]:

Text(0, 0.5, 'pc2')

Calculate variance ratio :

In [36]:

# check how much variance is explained by each principal component

print(principal.explained_variance_ratio_)

[0.75584735 0.19778211 0.0419845 ]
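
The ratios above can be accumulated to see how much variance the first k components retain together. This sketch uses the three printed values; the 95% target and the names `TARGET` and `n_needed` are assumptions chosen for illustration.

```python
import numpy as np

# Explained variance ratios copied from the output above.
ratios = np.array([0.75584735, 0.19778211, 0.0419845])

# Running total of variance retained by the first k components.
cumulative = np.cumsum(ratios)
print(cumulative)

TARGET = 0.95  # assumed retention target; not from the notebook

# Index of the first cumulative value that reaches the target, plus one.
n_needed = int(np.searchsorted(cumulative, TARGET) + 1)
print(n_needed)
```

The first two components together retain about 95.4% of the variance and all three about 99.6%, so the two-dimensional scatter plot above already shows most of the structure in the data.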
