Week - 5 Pandas essentials

The document explains the differences between Series and DataFrame objects in Pandas, highlighting that a Series is a one-dimensional labeled array while a DataFrame is a two-dimensional table. It also discusses correlation and covariance, noting that covariance indicates the direction of a relationship between variables, whereas correlation measures both strength and direction in a standardized way. Additionally, it describes the .loc and .iloc functions for selecting data in DataFrames, emphasizing that .loc uses label-based indexing while .iloc uses position-based indexing.

Uploaded by

nghiemhoa4895

Pandas functions

Series and Dataframe Objects in Pandas

In Python's pandas library, the main difference between a DataFrame and a
Series object is their structure and intended use:

1. Series:

A Series is essentially a one-dimensional labeled array capable of holding any data type (integers, strings, floats, etc.).

It has an index that labels each element in the series.

It can be thought of as a single column of data.

Example:

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])
print(s)
```

Output:

```
a    1
b    2
c    3
d    4
dtype: int64
```
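
As a small sketch of how the index labels are used in practice, the same Series can be read by label or by integer position. The `mixed` Series is a made-up illustration of holding several data types at once, not part of the original example.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])

by_label = s['b']         # access by index label
by_position = s.iloc[1]   # access by integer position (second element)
print(by_label, by_position)

# Hypothetical mixed-type Series: pandas falls back to the generic 'object' dtype.
mixed = pd.Series([1, 'two', 3.0])
print(mixed.dtype)
```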

2. DataFrame:

A DataFrame is a two-dimensional labeled data structure with columns of potentially different types (like a table with rows and columns).

Each column in a DataFrame is a Series, and the DataFrame is a collection of Series aligned along a common index.

It can be thought of as a table (or spreadsheet) where rows and columns can hold different types of data.

Example:

```python
import pandas as pd

data = {'Column1': [1, 2, 3, 4], 'Column2': ['A', 'B', 'C', 'D']}
df = pd.DataFrame(data)
print(df)
```

Output:

```
   Column1 Column2
0        1       A
1        2       B
2        3       C
3        4       D
```

In summary:

A Series is a one-dimensional labeled array, while a DataFrame is a two-dimensional labeled table.

A DataFrame is essentially a collection of Series aligned in a tabular format.
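
The relationship between the two objects can be checked directly. The sketch below selects one column of a DataFrame and confirms it is a Series, then builds a DataFrame from two Series to show how they are aligned on their index labels. The names `s1`, `s2` and `aligned` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'Column1': [1, 2, 3, 4], 'Column2': ['A', 'B', 'C', 'D']})

# Selecting a single column returns a Series that shares the DataFrame's index.
col = df['Column1']
print(type(col).__name__)
print(col.index.equals(df.index))

# Building a DataFrame from Series aligns them on index labels;
# labels missing from one Series are filled with NaN.
s1 = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
s2 = pd.Series([1.5, 2.5], index=['b', 'c'])
aligned = pd.DataFrame({'s1': s1, 's2': s2})
print(aligned)
```

Here `s2` has no value for label 'a', so that cell in `aligned` is missing rather than shifted.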

Correlation and Covariance

Correlation and Covariance are both measures used in statistics and data
analysis to describe the relationship between two variables. However, they differ
in their interpretation, scale, and how they measure this relationship:

1. Covariance:
Definition: Covariance measures the direction of the linear relationship
between two variables. It tells us whether two variables tend to increase or
decrease together.

Formula:

$$\mathrm{Cov}(X, Y) = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})$$

Interpretation:

If the covariance is positive, both variables tend to increase or decrease together.

If the covariance is negative, when one variable increases, the other tends to decrease.

If the covariance is zero, there is no linear relationship between the variables.

Scale: The magnitude of covariance is not standardized, meaning it can take any value and is sensitive to the units of the variables. This makes it difficult to compare across different datasets.
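
A minimal sketch of the covariance calculation, using made-up height and weight values. It computes the 1/n form from the formula and the n - 1 (sample) form, which is what pandas returns by default from `Series.cov`. It then rescales one variable to show the sensitivity to units.

```python
import pandas as pd

# Hypothetical data: heights in metres and weights in kilograms.
x = pd.Series([1.50, 1.60, 1.70, 1.80, 1.90])
y = pd.Series([50.0, 58.0, 66.0, 77.0, 84.0])
n = len(x)

# Sum of products of deviations from the means.
cross = ((x - x.mean()) * (y - y.mean())).sum()

cov_population = cross / n        # the 1/n formula
cov_sample = cross / (n - 1)      # the n - 1 convention
cov_pandas = x.cov(y)             # pandas default matches cov_sample

# Expressing height in centimetres multiplies the covariance by 100,
# even though the underlying relationship is unchanged.
cov_cm = (x * 100).cov(y)

print(cov_population, cov_sample, cov_pandas, cov_cm)
```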

2. Correlation:
Definition: Correlation measures both the strength and the direction of the
linear relationship between two variables, but it is normalized and unit-free,
making it easier to interpret and compare across different datasets.

Formula (for Pearson correlation):

$$\mathrm{Corr}(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$$

Where $\sigma_X$ and $\sigma_Y$ are the standard deviations of X and Y.

Interpretation:

The correlation coefficient (r) ranges from -1 to 1:

r = 1: Perfect positive correlation (variables increase together).

r = -1: Perfect negative correlation (one variable increases while the other decreases).

r = 0: No linear correlation.

Scale: Correlation is standardized, meaning the values range between -1 and 1, making it easier to compare relationships across different variables or datasets.
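
The Pearson formula can be reproduced in a few lines. This is a sketch on made-up height and weight values. It divides the covariance by the product of the standard deviations and compares the result with `Series.corr`. It also shows that rescaling a variable leaves the correlation unchanged.

```python
import pandas as pd

# Hypothetical data: heights in metres and weights in kilograms.
x = pd.Series([1.50, 1.60, 1.70, 1.80, 1.90])
y = pd.Series([50.0, 58.0, 66.0, 77.0, 84.0])

# Series.cov and Series.std both default to the n - 1 convention,
# so the ratio is the Pearson correlation coefficient.
r_manual = x.cov(y) / (x.std() * y.std())
r_pandas = x.corr(y)

# Changing units (metres to centimetres) does not change r.
r_cm = (x * 100).corr(y)

# A perfectly decreasing linear function of x gives r = -1 (up to rounding).
r_negative = x.corr(-2 * x + 3)

print(r_manual, r_pandas, r_cm, r_negative)
```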

Key Differences:

| Feature | Covariance | Correlation |
|---|---|---|
| Meaning | Measures the direction of the relationship between variables | Measures the strength and direction of the relationship |
| Range | Unbounded (can be any positive or negative number) | Always between -1 and 1 |
| Interpretation | Difficult to interpret due to scale | Easier to interpret due to standardization |
| Units | Depends on the units of the variables | Unit-free |
| Comparison | Difficult to compare across datasets | Easier to compare across datasets |

In summary:

Covariance tells you if two variables move together (positively or negatively) but not how strongly.

Correlation tells you both the direction and strength of the linear relationship in a more interpretable and standardized form.

.loc and .iloc

In pandas, .loc and .iloc are used to select rows and columns from a DataFrame,
but they differ in how they index the data:

1. .loc : Label-based indexing

Definition: .loc is primarily used for selecting data by label or index name. It
includes both the start and end labels when selecting a range.

Behavior:

It allows for label-based indexing, which means you can select rows and
columns based on their explicit labels (index names or column names).

It supports slicing and selecting specific rows and columns by their labels.

Example:

```python
import pandas as pd

data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}
df = pd.DataFrame(data, index=['row1', 'row2', 'row3'])

# Using .loc to select by labels
df_loc = df.loc['row1', 'A']  # Selects the value in 'row1' and column 'A'
print(df_loc)
```

Output:

```
1
```

Selecting rows by label:

```python
df.loc['row2']  # Selects all columns of row2
```

Selecting multiple rows and columns:

```python
# Slices rows from 'row1' to 'row2' (inclusive) and columns 'A' and 'B'
df.loc['row1':'row2', ['A', 'B']]
```

2. .iloc : Position-based indexing

Definition: .iloc is used for selecting data by integer position. It excludes the
end index when selecting a range.

Behavior:

It allows for position-based indexing, which means you select rows and
columns based on their integer index positions, regardless of the actual
labels.

Like Python slicing, .iloc excludes the ending position when slicing
ranges.

Example:

```python
# Using .iloc to select by positions
df_iloc = df.iloc[0, 0]  # Selects the value at the 0th row and 0th column (first row, first column)
print(df_iloc)
```

Output:

```
1
```

Selecting rows by position:

```python
df.iloc[1]  # Selects all columns of the second row (index 1)
```

Selecting multiple rows and columns:

```python
# Slices rows from position 0 to 1 and columns from position 0 to 1
df.iloc[0:2, 0:2]
```

Key Differences:

| Feature | .loc (Label-based) | .iloc (Position-based) |
|---|---|---|
| Indexing method | Label-based (index/column names) | Position-based (integer positions) |
| Start/end inclusion | Includes both start and end in ranges | Excludes the end in ranges |
| Type of input | Row and column labels | Integer index positions |
| Use case | When you know the label/index of rows/columns | When you want to select by integer positions |

Summary:

Use .loc when you want to select rows and columns by labels.

Use .iloc when you want to select rows and columns by position.
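
The difference is easiest to see when the index labels are themselves integers that do not match the row positions. The frame below is a hypothetical illustration rather than one from the examples above. The label slice includes its end label, while the position slice excludes its end position.

```python
import pandas as pd

# Hypothetical frame whose integer labels differ from the row positions 0..3.
df = pd.DataFrame({'A': [1, 2, 3, 4]}, index=[10, 20, 30, 40])

by_label = df.loc[10:30]     # label slice: includes labels 10, 20 and 30
by_position = df.iloc[0:2]   # position slice: positions 0 and 1 only

print(by_label.index.tolist())
print(by_position.index.tolist())

# Position 0 exists, but there is no row labelled 0, so .loc raises KeyError.
try:
    df.loc[0]
    label_zero_found = True
except KeyError:
    label_zero_found = False
print(label_zero_found)
```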
