Pandas Data Wrangling Cheatsheet Datacamp PDF

Uploaded by

Pamungkas Aji


Python For Data Science Cheat Sheet
Pandas
Learn Python for Data Science Interactively at www.DataCamp.com

Reshaping Data

Pivot
>>> df3 = df2.pivot(index='Date',          Spread rows into columns
                    columns='Type',
                    values='Value')

df2 (input):
   Date        Type  Value
0  2016-03-01  a     11.432
1  2016-03-02  b     13.031
2  2016-03-01  c     20.784
3  2016-03-03  a     99.906
4  2016-03-02  a      1.303
5  2016-03-03  c     20.784

df3 (result):
Type        a       b       c
Date
2016-03-01  11.432  NaN     20.784
2016-03-02   1.303  13.031  NaN
2016-03-03  99.906  NaN     20.784

Pivot Table
>>> df4 = pd.pivot_table(df2,              Spread rows into columns
                         values='Value',
                         index='Date',
                         columns='Type')

Stack / Unstack
>>> stacked = df5.stack()                  Pivot a level of column labels
>>> stacked.unstack()                      Pivot a level of index labels

Unstacked:
      0         1
1 5   0.233482  0.390959
2 4   0.184713  0.237102
3 3   0.433522  0.429401

Stacked:
1 5 0   0.233482
    1   0.390959
2 4 0   0.184713
    1   0.237102
3 3 0   0.433522
    1   0.429401

Melt
>>> pd.melt(df2,                           Gather columns into rows
            id_vars=["Date"],
            value_vars=["Type", "Value"],
            value_name="Observations")

df2 (input):
   Date        Type  Value
0  2016-03-01  a     11.432
1  2016-03-02  b     13.031
2  2016-03-01  c     20.784
3  2016-03-03  a     99.906
4  2016-03-02  a      1.303
5  2016-03-03  c     20.784

Result:
    Date        Variable  Observations
0   2016-03-01  Type      a
1   2016-03-02  Type      b
2   2016-03-01  Type      c
3   2016-03-03  Type      a
4   2016-03-02  Type      a
5   2016-03-03  Type      c
6   2016-03-01  Value     11.432
7   2016-03-02  Value     13.031
8   2016-03-01  Value     20.784
9   2016-03-03  Value     99.906
10  2016-03-02  Value     1.303
11  2016-03-03  Value     20.784

Advanced Indexing (Also see NumPy Arrays)

Selecting
>>> df3.loc[:,(df3>1).any()]               Select cols with any vals >1
>>> df3.loc[:,(df3>1).all()]               Select cols with all vals >1
>>> df3.loc[:,df3.isnull().any()]          Select cols with NaN
>>> df3.loc[:,df3.notnull().all()]         Select cols without NaN

Indexing With isin
>>> df[(df.Country.isin(df2.Type))]        Find same elements
>>> df3.filter(items=["a","b"])            Filter on values
>>> df.select(lambda x: not x%5)           Select specific elements (df.select was removed in pandas 1.0; use df.loc[lambda d: d.index % 5 == 0])

Where
>>> s.where(s > 0)                         Subset the data

Query
>>> df6.query('second > first')            Query DataFrame

Setting/Resetting Index
>>> df.set_index('Country')                Set the index
>>> df4 = df.reset_index()                 Reset the index
>>> df = df.rename(index=str,              Rename DataFrame
                   columns={"Country":"cntry",
                            "Capital":"cptl",
                            "Population":"ppltn"})

Reindexing
>>> s2 = s.reindex(['a','c','d','e','b'])

Forward Filling
>>> df.reindex(range(4), method='ffill')
   Country  Capital    Population
0  Belgium  Brussels   11190846
1  India    New Delhi  1303171035
2  Brazil   Brasília   207847528
3  Brazil   Brasília   207847528

Backward Filling
>>> s3 = s.reindex(range(5), method='bfill')
0  3
1  3
2  3
3  3
4  3

MultiIndexing
>>> arrays = [np.array([1,2,3]),
              np.array([5,4,3])]
>>> df5 = pd.DataFrame(np.random.rand(3, 2), index=arrays)
>>> tuples = list(zip(*arrays))
>>> index = pd.MultiIndex.from_tuples(tuples,
                                      names=['first', 'second'])
>>> df6 = pd.DataFrame(np.random.rand(3, 2), index=index)
>>> df2.set_index(["Date", "Type"])

Duplicate Data
>>> s3.unique()                                Return unique values
>>> df2.duplicated('Type')                     Check duplicates
>>> df2.drop_duplicates('Type', keep='last')   Drop duplicates
>>> df.index.duplicated()                      Check index duplicates

Grouping Data

Aggregation
>>> df2.groupby(by=['Date','Type']).mean()
>>> df4.groupby(level=0).sum()
>>> df4.groupby(level=0).agg({'a':lambda x:sum(x)/len(x),
                              'b': np.sum})

Transformation
>>> customSum = lambda x: (x+x%2)
>>> df4.groupby(level=0).transform(customSum)

Dates
>>> df2['Date']= pd.to_datetime(df2['Date'])
>>> df2['Date']= pd.date_range('2000-1-1',
                               periods=6,
                               freq='M')
>>> dates = [datetime(2012,5,1), datetime(2012,5,2)]
>>> index = pd.DatetimeIndex(dates)
>>> index = pd.date_range(datetime(2012,2,1), end, freq='BM')

Visualization (Also see Matplotlib)
>>> import matplotlib.pyplot as plt
>>> s.plot()
>>> plt.show()
>>> df2.plot()
>>> plt.show()

Combining Data

data1:
X1  X2
a   11.432
b    1.303
c   99.906

data2:
X1  X3
a   20.784
b   NaN
d   20.784

Merge
>>> pd.merge(data1,
             data2,
             how='left',
             on='X1')
X1  X2      X3
a   11.432  20.784
b    1.303  NaN
c   99.906  NaN

>>> pd.merge(data1,
             data2,
             how='right',
             on='X1')
X1  X2      X3
a   11.432  20.784
b    1.303  NaN
d   NaN     20.784

>>> pd.merge(data1,
             data2,
             how='inner',
             on='X1')
X1  X2      X3
a   11.432  20.784
b    1.303  NaN

>>> pd.merge(data1,
             data2,
             how='outer',
             on='X1')
X1  X2      X3
a   11.432  20.784
b    1.303  NaN
c   99.906  NaN
d   NaN     20.784

Join
>>> data1.join(data2, how='right')

Concatenate

Vertical
>>> s.append(s2)                           (Series.append was removed in pandas 2.0; use pd.concat([s, s2]))

Horizontal/Vertical
>>> pd.concat([s,s2],axis=1, keys=['One','Two'])
>>> pd.concat([data1, data2], axis=1, join='inner')
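
The four pd.merge calls in the Combining Data section differ only in the how argument. The following is a minimal sketch of that comparison, assuming data1 and data2 hold exactly the values printed in the cheat sheet's two small tables; the dictionary name merged is introduced here for illustration and is not part of the original sheet.

```python
import numpy as np
import pandas as pd

# data1 and data2 rebuilt from the tables in the Combining Data section.
data1 = pd.DataFrame({"X1": ["a", "b", "c"], "X2": [11.432, 1.303, 99.906]})
data2 = pd.DataFrame({"X1": ["a", "b", "d"], "X3": [20.784, np.nan, 20.784]})

# One merge per join type, keyed by the value passed to `how`.
# `merged` is a hypothetical helper name, not from the cheat sheet.
merged = {
    how: pd.merge(data1, data2, how=how, on="X1")
    for how in ("left", "right", "inner", "outer")
}

# Print which X1 keys survive each join type.
for how, frame in merged.items():
    print(how, list(frame["X1"]))
```

A left join keeps every key of data1, a right join every key of data2, an inner join only the keys present in both, and an outer join the union of the two key sets; columns with no match on the other side are filled with NaN.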

Iteration
>>> df.iteritems()                         (Column-index, Series) pairs (removed in pandas 2.0; use df.items())
>>> df.iterrows()                          (Row-index, Series) pairs

Missing Data
>>> df.dropna()                            Drop NaN values
>>> df3.fillna(df3.mean())                 Fill NaN values with a predetermined value (here, each column's mean)
>>> df2.replace("a", "f")                  Replace values with others
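
The three Missing Data calls can be compared side by side on a small frame. This is only a sketch: the frame below and the Series named types are made-up illustrative data, not the df3 or df2 of the cheat sheet.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with gaps; the values are illustrative only.
df3 = pd.DataFrame({
    "a": [1.0, 2.0, 3.0],
    "b": [np.nan, 4.0, np.nan],
    "c": [5.0, 6.0, np.nan],
})

# dropna() keeps only the rows that contain no NaN at all.
dropped = df3.dropna()

# fillna(df3.mean()) replaces each NaN with the mean of its own column.
filled = df3.fillna(df3.mean())

# replace() swaps one value for another; `types` is a hypothetical Series.
types = pd.Series(["a", "b", "a"])
replaced = types.replace("a", "f")

print(dropped)
print(filled)
print(replaced.tolist())
```

Dropping is the simplest option but can discard most of a sparse table, while filling with the column mean keeps every row at the cost of inventing values.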
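
Pivot and melt, both listed under Reshaping Data, are roughly inverse operations. The sketch below rebuilds df2 from the values printed in the cheat sheet, pivots it to wide form, and melts it back. Note that this melt call differs from the one on the sheet: the arguments var_name="Type" and value_name="Value" and the names wide and long are choices made here for the round trip.

```python
import pandas as pd

# df2 rebuilt from the Date/Type/Value table shown in the Pivot entry.
df2 = pd.DataFrame({
    "Date": ["2016-03-01", "2016-03-02", "2016-03-01",
             "2016-03-03", "2016-03-02", "2016-03-03"],
    "Type": ["a", "b", "c", "a", "a", "c"],
    "Value": [11.432, 13.031, 20.784, 99.906, 1.303, 20.784],
})

# Long to wide: one row per Date, one column per Type.
wide = df2.pivot(index="Date", columns="Type", values="Value")

# Wide to long: gather the Type columns back into rows, then drop the
# NaN cells that pivot created for missing Date/Type combinations.
long = (
    wide.reset_index()
        .melt(id_vars="Date", var_name="Type", value_name="Value")
        .dropna()
)

print(wide)
print(long)
```

The row order after the round trip differs from df2, so any comparison with the original should sort both frames first.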
