Data Analysis With Pandas - Aggregates in Pandas Cheatsheet - Codecademy

Pandas can calculate aggregate statistics on DataFrame columns using functions such as mean(), std(), median(), max(), min(), count(), and nunique(). These return summary information about a column: its average, standard deviation, median, maximum, minimum, number of values, and number of unique values. Aggregate functions can also be applied across multiple rows using groupby(), which groups rows that share the same value in one column and computes an aggregate statistic, such as mean(), of another column for each group.

Uploaded by

Utsav Soi

5/6/2020 Data Analysis with Pandas: Aggregates in Pandas Cheatsheet | Codecademy

Aggregates in Pandas
Pandas DataFrame Aggregate Function
Pandas’ aggregate statistics functions calculate summary statistics on a column of a
DataFrame. For example, df.columnName.mean() computes the mean of the column columnName
of the DataFrame df. The code block below shows how to calculate statistics on the
column columnName of df using these functions.

df.columnName.mean()    # Average of all values in column
df.columnName.std()     # Standard deviation of column
df.columnName.median()  # Median value of column
df.columnName.max()     # Maximum value in column
df.columnName.min()     # Minimum value in column
df.columnName.count()   # Number of non-null values in column
df.columnName.nunique() # Number of unique values in column
df.columnName.unique()  # Array of unique values in column
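
As a minimal sketch of these calls on concrete data, the example below builds a small DataFrame and applies several of the functions to it. The column name score and its values are made up for illustration and do not come from the cheatsheet. The missing value is included to show that, by default, these functions skip it.

```python
import pandas as pd

# Hypothetical data: the column name "score" and its values are illustrative only.
df = pd.DataFrame({"score": [10, 20, 20, 30, None]})

mean_score = df.score.mean()       # missing value is skipped: (10 + 20 + 20 + 30) / 4
median_score = df.score.median()   # median of the four non-null values
max_score = df.score.max()
min_score = df.score.min()
n_values = df.score.count()        # counts non-null entries only
n_unique = df.score.nunique()      # missing values are not counted by default
unique_values = df.score.unique()  # NumPy array; here it also contains NaN

print(mean_score, median_score, max_score, min_score, n_values, n_unique)
print(unique_values)
```

Note that count() and nunique() ignore the missing entry, while unique() includes it in the returned array.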

Pandas’ Groupby
In a pandas DataFrame, aggregate statistics functions can be applied across multiple rows
by using the groupby() method. In the example, the code groups all rows that have the
same value in Name and computes the mean of Grade for each group. Instead of mean(), any
aggregate statistics function, such as median() or max(), can be used. Note that a
grouped aggregation like this involves at least two columns: one to group by and one to
aggregate.

import pandas as pd

df = pd.DataFrame([
    ["Amy", "Assignment 1", 75],
    ["Amy", "Assignment 2", 35],
    ["Bob", "Assignment 1", 99],
    ["Bob", "Assignment 2", 35]
], columns=["Name", "Assignment", "Grade"])

# Mean Grade for each distinct Name
df.groupby('Name').Grade.mean()

# output of the groupby command

|Name | Grade|
| - | - |
|Amy | 55|
|Bob | 67|
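
To illustrate that other aggregate functions can take the place of mean(), the sketch below reuses the example data and applies max() and median() per group. The call to reset_index(), which turns the grouped result back into a DataFrame, is an addition that the cheatsheet text does not cover.

```python
import pandas as pd

df = pd.DataFrame([
    ["Amy", "Assignment 1", 75],
    ["Amy", "Assignment 2", 35],
    ["Bob", "Assignment 1", 99],
    ["Bob", "Assignment 2", 35],
], columns=["Name", "Assignment", "Grade"])

# Highest grade per student: a Series indexed by Name.
max_grades = df.groupby("Name").Grade.max()

# Median grade per student.
median_grades = df.groupby("Name").Grade.median()

# reset_index() converts the grouped Series into a DataFrame
# with Name and Grade as regular columns.
mean_table = df.groupby("Name").Grade.mean().reset_index()

print(max_grades)
print(median_grades)
print(mean_table)
```

Without reset_index(), the result is a Series whose index holds the group labels. With it, Name becomes an ordinary column, which is often more convenient for further processing.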

https://www.codecademy.com/learn/paths/data-science/tracks/data-processing-pandas/modules/dspath-agg-pandas/cheatsheet
