How to write Pandas DataFrame to PostgreSQL table?
Last Updated :
27 Jan, 2022
In this article, we will look at two methods for writing a Pandas DataFrame to a PostgreSQL table in Python.
Method 1: Using to_sql() function
The to_sql() function writes the records stored in a DataFrame to a SQL database table.
Syntax:
df.to_sql('data', con=conn, if_exists='replace', index=False)
Parameters:
- 'data': the name of the target table (the first positional argument, name).
- con: the SQLAlchemy engine or connection to the database.
- if_exists: what to do if the table already exists; one of "fail", "replace", or "append".
- index: whether to write the DataFrame index as a column (True or False).
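Before the full example, here is a minimal, self-contained sketch of the call. The connection string, credentials, and the table name demo_table below are placeholders for illustration only, not values from the article's dataset.
Python3
# Minimal sketch: connection string, credentials and table name are placeholders.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('postgresql://user:password@localhost:5432/mydb')

df = pd.DataFrame({'id': [1, 2], 'airline': ['AirA', 'AirB']})

# Create (or replace) the table and insert the rows; index=False keeps
# the DataFrame index from being written as an extra column.
df.to_sql('demo_table', con=engine, if_exists='replace', index=False)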
Example:
In the example below, we import the required packages, establish a connection to the PostgreSQL database, and write the DataFrame to a PostgreSQL table using the to_sql() method. Finally, all rows of the new table are fetched with the fetchall() method and printed.
The code expects a CSV file named airlines_final.csv in the working directory.
Python3
# import packages
import psycopg2
import pandas as pd
from sqlalchemy import create_engine

# establish connections
# note: SQLAlchemy 1.4+ requires the 'postgresql://' dialect name
conn_string = 'postgresql://postgres:pass@127.0.0.1/Airlines_Database'

db = create_engine(conn_string)
conn = db.connect()

conn1 = psycopg2.connect(
    database="Airlines_Database",
    user='postgres',
    password='pass',
    host='127.0.0.1',
    port='5432'
)

conn1.autocommit = True
cursor = conn1.cursor()

# drop table if it already exists
cursor.execute('drop table if exists airlines_final')

sql = '''CREATE TABLE airlines_final(id int,
day int, airline char(20), destination char(20));'''
cursor.execute(sql)

# import the csv file to create a dataframe
data = pd.read_csv("airlines_final.csv")
data = data[["id", "day", "airline", "destination"]]
print(data)

# write the dataframe to the PostgreSQL table
data.to_sql('airlines_final', conn, if_exists='replace', index=False)

# fetch and print all rows of the new table
sql1 = '''select * from airlines_final;'''
cursor.execute(sql1)
for i in cursor.fetchall():
    print(i)

conn1.commit()
conn1.close()
Output:

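As an optional follow-up (not part of the original listing), one way to confirm the write worked is to read the table back into a DataFrame with pandas.read_sql_query, reusing the SQLAlchemy connection from the example above.
Python3
# Read the freshly written table back into a DataFrame for verification.
# 'conn' is the SQLAlchemy connection created in the example above.
check = pd.read_sql_query('SELECT * FROM airlines_final;', con=conn)
print(check.head())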
Method 2: Using execute_values() function
The execute_values() function from the psycopg2.extras module inserts the rows of a given DataFrame into a PostgreSQL table in a single, efficient batch.
Syntax:
psycopg2.extras.execute_values(cur, sql, argslist, template=None, page_size=100, fetch=False)
Parameters:
- cur – the cursor used to run the query.
- sql – the query to run; it must contain a single %s placeholder, which is expanded into a VALUES list.
- argslist – a sequence of sequences or dictionaries containing the query arguments; their number and format must be consistent with the template.
- template – the snippet merged into every item of argslist to compose the query; if None, a simple positional VALUES clause is assumed.
- page_size – the maximum number of argslist items included in each statement.
- fetch – if True, the query results are returned (similar to fetchall()); defaults to False.
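As a quick illustration of these parameters before the full example, the sketch below inserts two hard-coded rows and uses fetch=True together with a RETURNING clause to get the generated rows back. The table demo_rows, its columns, and the connection details are assumptions made for this sketch only.
Python3
import psycopg2
import psycopg2.extras as extras

# Connection details are placeholders for this sketch.
conn = psycopg2.connect(database="Airlines_Database", user="postgres",
                        password="pass", host="127.0.0.1", port="5432")
cur = conn.cursor()
cur.execute("CREATE TABLE IF NOT EXISTS demo_rows(id int, airline text);")

rows = [(1, "AirA"), (2, "AirB")]

# The single %s in the query is expanded into a VALUES list built from `rows`;
# fetch=True returns the rows produced by the RETURNING clause.
returned = extras.execute_values(
    cur,
    "INSERT INTO demo_rows (id, airline) VALUES %s RETURNING id;",
    rows,
    page_size=100,
    fetch=True,
)
conn.commit()
print(returned)  # e.g. [(1,), (2,)]
cur.close()
conn.close()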
Example:
In this example, the same CSV file is used. The code begins by importing the packages, then defines a custom function execute_values() that takes the connection, the DataFrame, and the table name as arguments. The DataFrame rows are inserted into the PostgreSQL table with the execute_values() method. The function wraps the call in an exception-handling block; if no exception is raised, "execute_values() done" is printed.
Python3
# import packages
import psycopg2
import psycopg2.extras as extras
import pandas as pd


def execute_values(conn, df, table):
    tuples = [tuple(x) for x in df.to_numpy()]
    cols = ','.join(list(df.columns))

    # SQL query to execute
    query = "INSERT INTO %s(%s) VALUES %%s" % (table, cols)
    cursor = conn.cursor()
    try:
        extras.execute_values(cursor, query, tuples)
        conn.commit()
    except (Exception, psycopg2.DatabaseError) as error:
        print("Error: %s" % error)
        conn.rollback()
        cursor.close()
        return 1
    print("execute_values() done")
    cursor.close()


# establishing connection
conn = psycopg2.connect(
    database="Airlines_Database",
    user='postgres',
    password='sherlockedisi',
    host='127.0.0.1',
    port='5432'
)

sql = '''CREATE TABLE airlines_final1(id int, day
char(20), airline char(20), destination char(20));'''

# creating a cursor
cursor = conn.cursor()
cursor.execute(sql)

data = pd.read_csv("airlines_final.csv")
data = data[["id", "day", "airline", "destination"]]

# using the function defined
execute_values(conn, data, 'airlines_final1')
Output:
execute_values() done
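The listing above only prints the status message. As an optional check (not part of the original code), the same psycopg2 connection could be reused to count the inserted rows before closing it.
Python3
# Optional check: count the rows inserted into airlines_final1.
cur = conn.cursor()
cur.execute("SELECT count(*) FROM airlines_final1;")
print("rows inserted:", cur.fetchone()[0])
cur.close()
conn.close()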