Learn Data Analysis With Pandas - Aggregates in Pandas

The document discusses how to aggregate statistics in Pandas using groupby and aggregate functions. groupby allows calculating statistics across multiple rows that are the same in certain columns. Aggregate functions like mean(), median(), max() can then be applied. DataFrame aggregate functions directly calculate statistics on a single column, such as mean, standard deviation, median, minimum, maximum, count and number of unique values.


Aggregates in Pandas
Pandas’ Groupby
In a pandas DataFrame, aggregate statistic functions can be applied across multiple rows by using the groupby() method. In the example, the code takes all of the rows that share the same value in Name and groups them, replacing the values in Grade with their mean. Instead of mean(), any aggregate statistics function, like median() or max(), can be used. Note that a groupby aggregation like this one involves at least two columns: one to group by (here Name) and one to aggregate (here Grade).

    import pandas as pd

    # One row per student and assignment
    df = pd.DataFrame([
      ["Amy","Assignment 1",75],
      ["Amy","Assignment 2",35],
      ["Bob","Assignment 1",99],
      ["Bob","Assignment 2",35]
      ], columns=["Name", "Assignment", "Grade"])

    # Mean Grade for each distinct Name
    df.groupby('Name').Grade.mean()

Output of the groupby command:

| Name | Grade |
| --- | --- |
| Amy | 55 |
| Bob | 67 |
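
As a rough sketch of how other aggregate functions slot into the same pattern, the snippet below reuses the sample data from the example above. The variable names (`median_by_name`, `max_by_name`, `mean_table`) are illustrative and not part of the original cheatsheet, and the use of `reset_index()` to get a regular DataFrame back is one common option rather than the only one.

```python
import pandas as pd

# Same sample data as in the groupby example above.
df = pd.DataFrame([
    ["Amy", "Assignment 1", 75],
    ["Amy", "Assignment 2", 35],
    ["Bob", "Assignment 1", 99],
    ["Bob", "Assignment 2", 35],
], columns=["Name", "Assignment", "Grade"])

# Swap mean() for another aggregate function.
median_by_name = df.groupby("Name").Grade.median()
max_by_name = df.groupby("Name").Grade.max()

# The result of a groupby aggregation is a Series indexed by Name.
# reset_index() turns it back into a DataFrame with Name as a regular column.
mean_table = df.groupby("Name").Grade.mean().reset_index()

print(median_by_name)
print(max_by_name)
print(mean_table)
```

Whether to keep the grouped result as a Series or convert it with `reset_index()` depends on what comes next: the Series form is convenient for lookups by name, while the DataFrame form is easier to merge or export.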

Pandas DataFrame Aggregate Function


Pandas’ aggregate statistics functions can be used to calculate statistics on a column of a DataFrame. For example, df.columnName.mean() computes the mean of the column columnName of the DataFrame df. The code block shows how to calculate statistics on the column columnName of df using Pandas’ aggregate statistics functions.

    df.columnName.mean()     # Average of all values in column
    df.columnName.std()      # Standard deviation of column
    df.columnName.median()   # Median value of column
    df.columnName.max()      # Maximum value in column
    df.columnName.min()      # Minimum value in column
    df.columnName.count()    # Number of non-null values in column
    df.columnName.nunique()  # Number of unique values in column
    df.columnName.unique()   # Array of unique values in column
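
To make the list above concrete, here is a minimal sketch that applies the same functions to an actual column. The column name `Grade` and its values are borrowed from the groupby example purely for illustration; any numeric column would work the same way.

```python
import pandas as pd

# Illustrative data: the Grade values from the groupby example.
df = pd.DataFrame({"Grade": [75, 35, 99, 35]})

print(df.Grade.mean())     # 61.0
print(df.Grade.median())   # 55.0
print(df.Grade.max())      # 99
print(df.Grade.min())      # 35
print(df.Grade.count())    # 4 (null values would not be counted)
print(df.Grade.nunique())  # 3 (35 appears twice)
print(df.Grade.unique())   # unique values in order of first appearance: 75, 35, 99

# std() computes the sample standard deviation by default (ddof=1).
print(df.Grade.std())
```

Note that `nunique()` returns a single number while `unique()` returns the values themselves, so the two are not interchangeable.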
