Scraping Your Mailbox with Python & Basics in Parsing
Index
● Installing Jupyter Notebook
● Authentication (OAuth2 & app passwords)
● Accessing your mailbox (IMAP)
For new users, it’s highly recommended to install Anaconda, which conveniently
installs both Python and Jupyter Notebook. Use the following installation steps:
Once the installation is complete, Anaconda Navigator will appear in your Launchpad.
Once Anaconda Navigator is open, click “Launch” under the Jupyter Notebook tile,
as shown in the screenshot. This is what will happen:
An app password is required when accessing your Gmail via Python scripts or third-party
apps that don’t support 2-Step Verification directly. You must have 2-Step Verification
enabled on your Google account as a prerequisite.
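Since an app password grants access to the mailbox, it is worth keeping it out of the notebook itself. The following is a minimal sketch, assuming the credentials are stored in environment variables; the names `GMAIL_USER` and `GMAIL_APP_PASSWORD` and the helper `load_credentials` are hypothetical and not part of the original script.

```python
import os


def load_credentials(env=os.environ):
    """Return (user, password) read from environment variables.

    GMAIL_USER and GMAIL_APP_PASSWORD are hypothetical names chosen for
    this example; use whatever names suit your setup.
    """
    user = env.get("GMAIL_USER")
    password = env.get("GMAIL_APP_PASSWORD")
    if not user or not password:
        raise RuntimeError("Set GMAIL_USER and GMAIL_APP_PASSWORD first")
    # Assumption: the app password may have been copied with spaces between
    # groups of characters, so the spaces are dropped before use.
    return user, password.replace(" ", "")
```

Calling `user, password = load_credentials()` in a notebook cell would then supply the two values that the login step needs, without the password appearing in the saved notebook.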
You’ll first need to install all the required third-party libraries before running the script to
scrape your mailbox. Run the command below in a Jupyter Notebook cell.
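After installing, a quick check can confirm that the libraries are importable from the notebook's kernel. This is only a sketch: the helper `missing_packages` is a hypothetical name, and the list of packages is an assumption based on the third-party imports the script uses (numpy, pandas, tqdm, pytz).

```python
import importlib.util


def missing_packages(names):
    """Return the names in `names` that cannot be imported in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed list of third-party libraries used by the script
required = ["numpy", "pandas", "tqdm", "pytz"]
print("Missing packages:", missing_packages(required))
```

An empty list means everything is in place; any name that is printed still needs to be installed.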
This section explains the code used to connect to your Gmail inbox using Python
and retrieve emails securely.
# Import the necessary libraries
import numpy as np
import pandas as pd
import re
import imaplib
import email
from datetime import datetime, timedelta
from tqdm import tqdm
import pytz
These libraries handle different parts of the process:
● re: Used for pattern matching (e.g., extracting info from email text).
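To illustrate the kind of pattern matching `re` is used for, here is a small sketch that pulls a few fields out of an email body. The sample text and the three patterns are made up for illustration; real emails will need patterns tailored to their wording.

```python
import re

# Made-up email body used only for this example
sample_body = "Your order #48213 has shipped. Total charged: $59.90 on 2024-03-15."

# Each search returns a match object, or None when the pattern is absent
order_match = re.search(r"#(\d+)", sample_body)
amount_match = re.search(r"\$(\d+\.\d{2})", sample_body)
date_match = re.search(r"\d{4}-\d{2}-\d{2}", sample_body)

order_id = order_match.group(1) if order_match else None
amount = amount_match.group(1) if amount_match else None
date_text = date_match.group(0) if date_match else None

print(order_id, amount, date_text)
```

Checking for `None` before calling `.group()` keeps the cell from failing on emails that do not contain the expected field.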
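user = 'your_email@gmail.com'  # placeholder: replace with your Gmail address
password = 'your_app_password'  # placeholder: the app password generated for your Google account
imap_url = 'imap.gmail.com'
# Connect to Gmail
my_mail = imaplib.IMAP4_SSL(imap_url)  # open an SSL connection to Gmail's IMAP server
my_mail.login(user, password)  # sign in with your credentials
my_mail.select('Inbox')  # select the folder you want to retrieve emails from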
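Once a folder is selected, messages can be searched and fetched over the same connection. The sketch below is one possible way to do this, not necessarily how the rest of the script proceeds: the function name `fetch_recent_subjects` and the seven-day default are assumptions, while `search`, `fetch`, and `email.message_from_bytes` are standard-library calls.

```python
import email
from datetime import datetime, timedelta
from email.header import decode_header, make_header


def fetch_recent_subjects(mail, days=7):
    """Return the decoded Subject of each message received in the last `days` days.

    `mail` is expected to be a logged-in imaplib connection with a folder selected.
    """
    # IMAP expects dates like 15-Mar-2024; %b assumes an English locale
    since = (datetime.now() - timedelta(days=days)).strftime("%d-%b-%Y")
    status, data = mail.search(None, "SINCE", since)
    if status != "OK":
        return []
    subjects = []
    for num in data[0].split():  # data[0] is a space-separated list of message numbers
        status, msg_data = mail.fetch(num, "(RFC822)")  # full raw message
        if status != "OK":
            continue
        for part in msg_data:
            if isinstance(part, tuple):  # (envelope info, raw message bytes)
                msg = email.message_from_bytes(part[1])
                raw_subject = msg.get("Subject", "")
                # Decode encoded headers such as =?UTF-8?...?= into plain text
                subjects.append(str(make_header(decode_header(raw_subject))))
    return subjects
```

With the connection from the cell above, `fetch_recent_subjects(my_mail, days=3)` would return a list of subject lines from the last three days.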