List of pandas articles
Contents
- Versions and settings
- Basics of DataFrame and Series
- File inputs and outputs
- Converting to other types
- Extracting rows, columns, and elements
- Replacing values
- Adding, concatenating, and merging rows/columns
- Removing rows/columns
- Applying functions to values, rows, and columns
- for loops
- Transposing, sorting, and shuffling
- Handling strings
- Handling numeric values
- Handling missing values (NaN)
- Handling index and columns
- Data analysis
- Other tips
Versions and settings
Basics of DataFrame and Series
- Get the number of rows, columns, all elements (size) of DataFrame
- Cast DataFrame to a specific dtype with astype()
- Convert between DataFrame and Series
- Views and copies in DataFrame
- Check if DataFrame/Series is empty
File inputs and outputs
CSV
JSON
Converting to other types
- Convert between pandas DataFrame/Series and NumPy array
- Convert between pandas DataFrame/Series and Python list
Extracting rows, columns, and elements
By row/column names and numbers
- Get/Set element values with at, iat, loc, iloc
- Select rows/columns in DataFrame by indexing "[]"
- Get first/last n rows of DataFrame with head(), tail(), slice
By conditions
- Query DataFrame with query()
- Select rows with multiple conditions
- Extract rows that contain specific strings from a DataFrame
- Find and remove duplicate rows of DataFrame, Series
- Extract columns from pandas.DataFrame based on dtype
- Extract rows/columns from DataFrame according to labels
Random sampling
Replacing values
- Replace values in DataFrame and Series with replace()
- Replace Series values with map()
- Replace values based on conditions with where(), mask()
Adding, concatenating, and merging rows/columns
- Add rows/columns to DataFrame with assign(), insert()
- Concat multiple DataFrame/Series with concat()
- Merge DataFrame with merge(), join() (INNER, OUTER JOIN)
Removing rows/columns
Applying functions to values, rows, and columns
for loops
Transposing, sorting, and shuffling
- Transpose DataFrame (swap rows and columns)
- Sort DataFrame, Series with sort_values(), sort_index()
- Shuffle rows/elements of DataFrame/Series
Handling strings
- Split string columns by delimiters or regular expressions
- Handle strings (replace, strip, case conversion, etc.)
- Slice substrings from each element in columns
Handling numeric values
Handling missing values (NaN)
- Missing values in pandas (nan, None, pd.NA)
- Remove missing values (NaN) with dropna()
- Replace missing values (NaN) with fillna()
- Extract rows/columns with missing values (NaN)
- Detect and count missing values (NaN) with isnull(), isna()
- Interpolate NaN with interpolate()
Handling index and columns
- Rename columns/index names (labels) of DataFrame
- Assign existing column to the DataFrame index with set_index()
- Reset index of DataFrame, Series with reset_index()
- Reorder rows and columns in DataFrame with reindex()
Data analysis
- Get summary statistics for each column with describe()
- Get unique values and their counts in a column
- Count DataFrame/Series elements matching conditions
- Grouping data with groupby()
- Aggregate data with agg(), aggregate()
- Get the mode (the most frequent value) with mode()
- Find the quantile with quantile()
- Cumulative calculations (cumsum, cumprod, cummax, cummin)
- Get dummy variables with pd.get_dummies()
- Data binning with cut() and qcut()