Introduction to Pandas
www.intellipaat.com
Python Certification Course
Agenda for Today's Session
• What is Pandas
• Features of Pandas
What is Pandas?
• Open-source Python library
• Simple yet powerful and expressive tool
Where did the name Pandas come from?
Panel Data
Who created Pandas?
Features of Pandas:
• Handling of missing data
• Group by functionality
• Reshaping
• Robust Input Output tool
Pandas vs NumPy
• Pandas performs better than NumPy for 500k rows or more.
• NumPy performs better for 50k rows or less.
• A Pandas Series Object is more flexible, as you can define your own labeled index to index and access elements of an array.
• Elements in NumPy arrays are accessed by their default integer position.
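The difference between labeled and positional access can be sketched as follows. The values and the labels "x", "y", "z" are hypothetical, chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical values and labels, used only for illustration.
values = [10, 20, 30]

arr = np.array(values)
ser = pd.Series(values, index=["x", "y", "z"])

by_position = arr[1]    # NumPy array: access by integer position
by_label = ser["y"]     # Series: access by a user-defined label
by_iloc = ser.iloc[1]   # Series: positional access is still available via .iloc
```

All three expressions pick the same element; only the Series lets you name it.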
India : +91-7847955955
How to import Pandas in Python?
import pandas as pd
What kind of data suits Pandas the most?
Tabular data
Data-set in Pandas
• One Dimensional: Series Object
• Multi Dimensional: DataFrame
What is a Series object?
Example:
data = [1, 2, 3, 4]
series1 = pd.Series(data)
series1
How to check the type?
type(series1)
Create a Series Object from different datatypes
• Array
• Dictionary
• Scalar
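A minimal sketch of building a Series from an array, a dictionary, and a scalar. The variable names and values are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# From a NumPy array: the index defaults to 0, 1, 2, ...
from_array = pd.Series(np.array([1, 2, 3, 4]))

# From a dictionary: the keys become the index labels.
from_dict = pd.Series({"a": 1, "b": 2, "c": 3})

# From a scalar: the value is repeated for every label in the given index.
from_scalar = pd.Series(5, index=["a", "b", "c"])
```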
How to create a Series object?
pd.Series(data)
How to change the index name?
series1 = pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])
series1
What is a DataFrame?
Features of DataFrame
• Mutable Size
• Different Column types
• Labeled axes
• Arithmetic operations on rows and columns
How to create a DataFrame?
pd.DataFrame(data)
Create a DataFrame from a List
df = pd.DataFrame(data)
df
Create a DataFrame from a Dictionary
df = pd.DataFrame(dict1)
df
Create a DataFrame from a Series
df = pd.DataFrame([data])
df
Create a DataFrame from a numpy ND array
df = pd.DataFrame({'A': data[:, 0], 'B': data[:, 1]})
df
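The slides do not show the input objects, so the sketch below supplies hypothetical ones (list_data, dict1, series_data, nd) to make the four constructions runnable.

```python
import numpy as np
import pandas as pd

# All input values below are hypothetical placeholders.
list_data = [1, 2, 3, 4]
df_list = pd.DataFrame(list_data)            # one column, four rows

dict1 = {"fruit": ["apple", "mango"], "count": [3, 5]}
df_dict = pd.DataFrame(dict1)                # keys become column names

series_data = pd.Series([6, 12], index=["a", "b"])
df_series = pd.DataFrame([series_data])      # one row; Series labels become columns

nd = np.array([[1, 2], [3, 4], [5, 6]])
df_nd = pd.DataFrame({"A": nd[:, 0], "B": nd[:, 1]})  # one column per array slice
```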
Understanding Pandas Operations with an Example
Hands-on Demonstration
• Import Convention
• Data Analysis
• Data Manipulation
• Data Visualization
The Data-set
• Dataset is based on product reviews from Amazon
Importing Data-set with Pandas
First read the data:
import pandas as pd
Product_Review = pd.read_csv("Amazon_Products_Review.csv")
Let's explore the type:
type(Product_Review)
For files other than CSV format:
pd.read_table("filename")
pd.read_excel("filename")
pd.read_sql(query, connection_object)
pd.read_json(json_string)
Read from SQL Query or Database Table
>>> from sqlnew import create_table
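Reading a query result into a DataFrame might look like the sketch below. It does not use the slide's sqlnew module; instead it assumes an in-memory SQLite database with a hypothetical reviews table, so that the example is self-contained.

```python
import sqlite3

import pandas as pd

# Assumption: an in-memory SQLite database with a hypothetical "reviews" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reviews (product TEXT, rating INTEGER)")
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?)",
    [("shoe", 4), ("lamp", 2)],
)
conn.commit()

# pd.read_sql(query, connection_object) returns the query result as a DataFrame.
reviews = pd.read_sql("SELECT product, rating FROM reviews", conn)
conn.close()
```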
Analyzing Data-set
Basic DataFrame Functionality
Print the first 5 rows of the DataFrame:
Product_Review.head()
Print the last 5 rows of the DataFrame:
Product_Review.tail()
Print the number of rows and columns:
Product_Review.shape
Information on index, datatypes and memory:
Product_Review.info()
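Since the Amazon CSV file is not available here, the following sketch runs the same inspection calls on a small stand-in DataFrame. Its column names follow the slides, but its contents are invented for illustration.

```python
import io

import pandas as pd

# Stand-in data; the real Product_Review DataFrame comes from the CSV file.
reviews = pd.DataFrame({
    "Product_Title": [f"item{i}" for i in range(8)],
    "Product_Rating": [5, 3, 4, 2, 5, 1, 4, 3],
})

first_rows = reviews.head()      # first 5 rows by default
last_rows = reviews.tail(3)      # an explicit row count is also accepted
n_rows, n_cols = reviews.shape   # shape is an attribute, not a method

# info() prints its report; passing a buffer captures it as text instead.
buf = io.StringIO()
reviews.info(buf=buf)
info_text = buf.getvalue()
```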
DataFrame for Pandas Merge
DataFrame-1:
player = ['Player1', 'Player2', 'Player3']
DataFrame-2:
player = ['Player1', 'Player5', 'Player6']
Merge types for df1 and df2:
• Inner Merge
• Left Merge
• Right Merge
• Outer Merge
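The four merge types can be sketched with the player lists from the slides. The second column in each DataFrame ("Points" and "Power") is an assumption, because the rest of the original DataFrame definitions did not survive.

```python
import pandas as pd

# Player names come from the slides; "Points" and "Power" are hypothetical columns.
df1 = pd.DataFrame({
    "Player": ["Player1", "Player2", "Player3"],
    "Points": [8, 9, 6],
})
df2 = pd.DataFrame({
    "Player": ["Player1", "Player5", "Player6"],
    "Power": ["Punch", "Kick", "Elbow"],
})

inner = df1.merge(df2, on="Player", how="inner")  # only players present in both
left = df1.merge(df2, on="Player", how="left")    # every player from df1
right = df1.merge(df2, on="Player", how="right")  # every player from df2
outer = df1.merge(df2, on="Player", how="outer")  # every player from either
```

Rows without a match on the other side get NaN in the columns that side contributes.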
DataFrame for Pandas Join
DataFrame-1:
player = ['Player1', 'Player2', 'Player3']
df1 = df1.set_index('Player')
df1
DataFrame-2:
player = ['Player1', 'Player5', 'Player6']
df2 = df2.set_index('Player')
df2
Join types for df1 and df2:
• Inner Join
• Left Join
• Right Join
• Outer Join
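Unlike merge, join aligns on the index, which is why the DataFrames are indexed by player first. A minimal sketch follows; the "Points" and "Power" columns are assumptions.

```python
import pandas as pd

# Player names come from the slides; "Points" and "Power" are hypothetical columns.
df1 = pd.DataFrame({
    "Player": ["Player1", "Player2", "Player3"],
    "Points": [8, 9, 6],
}).set_index("Player")
df2 = pd.DataFrame({
    "Player": ["Player1", "Player5", "Player6"],
    "Power": ["Punch", "Kick", "Elbow"],
}).set_index("Player")

inner = df1.join(df2, how="inner")  # index labels present in both
left = df1.join(df2, how="left")    # all index labels of df1 (the default)
right = df1.join(df2, how="right")  # all index labels of df2
outer = df1.join(df2, how="outer")  # union of both indexes
```

Note that set_index returns a new DataFrame, so its result has to be assigned for the change to persist.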
Concatenate
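The slide gives no code for concatenation, so the following is only a sketch using pd.concat on two hypothetical DataFrames with the same columns.

```python
import pandas as pd

# Hypothetical DataFrames sharing the same column names.
df1 = pd.DataFrame({"Player": ["Player1", "Player2"], "Points": [8, 9]})
df2 = pd.DataFrame({"Player": ["Player5", "Player6"], "Points": [7, 5]})

# Stack the rows of df2 under df1 and renumber the index from 0.
stacked = pd.concat([df1, df2], ignore_index=True)

# Place the DataFrames next to each other, aligned on the row index.
side_by_side = pd.concat([df1, df2], axis=1)
```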
Pandas DataFrame Methods
Mean:
Product_Review.mean(numeric_only=True)
Median:
Product_Review.median(numeric_only=True)
Standard Deviation:
Product_Review.std(numeric_only=True)
Maximum of each column:
Product_Review.max()
Minimum of each column:
Product_Review.min()
Count of non-null values in each column:
Product_Review.count()
Summary statistics for numerical columns:
Product_Review.describe()
Pandas Mathematical Operations
e.g.: Divide every value in the Product_Rating column by 2
Product_Review["Product_Rating"] / 2
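The statistics and arithmetic calls can be tried on a small stand-in DataFrame. The data below is invented; only the column names follow the slides.

```python
import pandas as pd

# Stand-in data with one text column and one numeric column.
reviews = pd.DataFrame({
    "Product_Title": ["a", "b", "c", "d"],
    "Product_Rating": [2, 4, 4, 5],
})

# numeric_only=True skips the text column, which has no meaningful mean.
means = reviews.mean(numeric_only=True)
medians = reviews.median(numeric_only=True)
counts = reviews.count()        # non-null values per column
summary = reviews.describe()    # covers numeric columns by default

# Arithmetic on a column is applied element by element.
halved = reviews["Product_Rating"] / 2
```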
Manipulating Data-set
DataFrame Indexing
01 Selecting by Position
02 Selecting by Label
Selecting by Position
Product_Review.iloc[:, 0]
Product_Review.iloc[0:5, 4]
Product_Review.iloc[:, :]
Product_Review.iloc[6:, 4:]
Product_Reviews = Product_Review.iloc[:, 1]
Product_Reviews.head()
Selecting by Label
Product_Review.loc[:5, "Product_Title"]
Product_Review.loc[:5, ["Product_Title", "Product_Rating"]]
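One detail worth seeing in code is that position slices exclude the end point while label slices include it. The sketch below uses invented stand-in data with the default integer index.

```python
import pandas as pd

# Stand-in data; the row labels are the default integers 0..6.
reviews = pd.DataFrame({
    "Product_Title": ["a", "b", "c", "d", "e", "f", "g"],
    "Product_Rating": [5, 3, 4, 2, 5, 1, 4],
})

first_col = reviews.iloc[:, 0]                 # every row, first column
by_position = reviews.iloc[0:5, 1]             # rows 0..4: the end is excluded
by_label = reviews.loc[:5, "Product_Rating"]   # labels 0..5: the end is included

# Several columns are selected by passing a list of labels.
two_cols = reviews.loc[:5, ["Product_Title", "Product_Rating"]]
```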
DataFrame Setting
Setting a value to one specific column:
Product_Review['Platform'] = 6
Product_Review
Applying Functions on DataFrame
Double up all numeric values using a lambda function:
f = lambda x: x*2
df.apply(f)
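A minimal sketch of setting a column and applying a function, assuming a small all-numeric DataFrame. The columns "p" and "q" are hypothetical.

```python
import pandas as pd

# Hypothetical all-numeric DataFrame.
df = pd.DataFrame({"p": [1, 2], "q": [0.5, 1.5]})

# Assigning a scalar creates (or overwrites) the column with that value in every row.
df["Platform"] = 6

# apply() passes each column to the function as a Series.
f = lambda x: x * 2
doubled = df.apply(f)
```

On a column of strings, x * 2 would repeat each string rather than double a number, so it is safer to select the numeric columns first.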
Pandas DataFrame Sorting
By default, sorting is in ascending order:
Product_Review.sort_values(by='Product_Rating')
For descending order, set ascending=False:
Product_Review.sort_values('Product_Rating', ascending=False)
Pandas DataFrame Ranking
Rank the Product_Rating column:
Product_Review["Product_Rating"].rank()
Pandas DataFrame Dropping
Drop the Product_Rating column from the dataset:
Product_Review.drop('Product_Rating', axis=1)
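Sorting, ranking and dropping all return new objects and leave the original DataFrame unchanged, as the sketch below shows on invented stand-in data.

```python
import pandas as pd

# Stand-in data with a tie in the ratings.
reviews = pd.DataFrame({
    "Product_Title": ["a", "b", "c", "d"],
    "Product_Rating": [3, 5, 3, 1],
})

ascending = reviews.sort_values(by="Product_Rating")
descending = reviews.sort_values("Product_Rating", ascending=False)

# rank() averages the ranks of tied values by default.
ranks = reviews["Product_Rating"].rank()

# axis=1 drops a column; the result is a copy without it.
without_rating = reviews.drop("Product_Rating", axis=1)
```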
Pandas DataFrame Filtering
Filtering the column by value:
filter1 = Product_Review["Product_Rating"] > 3
filter1.head()
filtered_new = Product_Review[filter1]
filtered_new.head()
Filtering the column by numeric and Boolean value:
filter2 = (Product_Review["Product_Rating"] > 3)
filtered_review
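The second condition of the combined filter is not recoverable from the slides, so the sketch below assumes a hypothetical Boolean column "In_Stock" to show how a numeric and a Boolean condition can be combined with &.

```python
import pandas as pd

# Stand-in data; "In_Stock" is a hypothetical Boolean column.
reviews = pd.DataFrame({
    "Product_Title": ["a", "b", "c", "d"],
    "Product_Rating": [5, 2, 4, 3],
    "In_Stock": [True, True, False, True],
})

# A comparison yields a Boolean Series (a mask) with one entry per row.
filter1 = reviews["Product_Rating"] > 3
filtered_new = reviews[filter1]

# Combine masks with & (and) or | (or); each condition needs its own parentheses.
combined = (reviews["Product_Rating"] > 3) & (reviews["In_Stock"])
filtered_both = reviews[combined]
```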
Data-set Visualization
Data Visualization Using Pandas
Histogram:
%matplotlib inline
Product_Review[Product_Review["Product_Category"] == "Footwear"]["Product_Rating"].plot(kind="hist")
Scatter plot:
Product_Review.plot.scatter(x="Product_Launch_Year", y="Product_Rating")
DataFrame.plot():
import numpy as np
df = pd.DataFrame(np.random.randn(20, 4), index=pd.date_range('1/1/2019', periods=20), columns=list('PQRS'))
df.plot()
DataFrame.plot.bar():
import numpy as np
df = pd.DataFrame(np.random.rand(20, 4), columns=['p', 'q', 'r', 's'])
df.plot.bar()
DataFrame.plot.bar(stacked=True):
df = pd.DataFrame(np.random.rand(20, 4), columns=['p', 'q', 'r', 's'])
df.plot.bar(stacked=True)
DataFrame.plot.barh(stacked=True):
df = pd.DataFrame(np.random.rand(20, 4), columns=['p', 'q', 'r', 's'])
df.plot.barh(stacked=True)
DataFrame.plot.hist():
import pandas as pd
import numpy as np
df = pd.DataFrame({'p':np.random.randn(500)+1,'q':np.random.randn(500),'c':
df.plot.hist(bins=20)
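Outside a notebook, the plotting calls can be run as a script along the lines of this sketch. It assumes matplotlib is installed and selects the non-interactive "Agg" backend so that no display is needed; the random data and the seed are arbitrary.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: render off-screen instead of opening a window

import numpy as np
import pandas as pd

# Arbitrary random data with a fixed seed so repeated runs match.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((20, 4)), columns=["p", "q", "r", "s"])

# Each plotting call returns a matplotlib Axes object that can be customized further.
ax_bar = df.plot.bar(stacked=True)
ax_bar.set_title("Stacked bar chart")

ax_hist = df.plot.hist(bins=20, alpha=0.5)
ax_hist.set_title("Histogram")
```

In a script, calling ax_bar.figure.savefig with a file name would write the chart to disk.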
Summary
Data Manipulation