
Published at Feb 25, 2025

How to Store Video in PostgreSQL Using BYTEA


Written by Semab Tariq

PostgreSQL is a powerful open-source relational database widely used by businesses and developers for managing diverse types of data. 

One practical use case is managing video content, which involves storing both video files and related metadata such as titles, descriptions, and user engagement data. While databases are crucial for organizing metadata and enabling fast searches, they can also store videos directly. However, handling large video files efficiently requires thoughtful storage strategies.

In this blog post, I will walk you through how to store videos in PostgreSQL, explore different approaches, and discuss best practices for efficient video storage.

Different Ways to Store Videos in a PostgreSQL Database

There are three main ways to store videos in a PostgreSQL database:

  • Storing videos directly in PostgreSQL

  • Using external storage with database references

  • Hybrid approach: balancing database and external storage

Storing videos directly in PostgreSQL

Storing a video directly in PostgreSQL means saving the video file as binary data within a table. This is done using data types like BYTEA (which stores binary data in a single row) or large objects (LOBs) (which store data in chunks across multiple rows). When a video is stored in the database, it is converted into a binary format and inserted into a column. Later, when needed, the video data can be retrieved, decoded, and played back. This approach allows PostgreSQL to manage the video as part of its structured data, keeping it within the database rather than storing it separately in a filesystem.

Using external storage with database references

Using external storage with database references means saving only the location (file path or URL) of the video file instead of the actual video data. The video itself is kept in an external filesystem, cloud storage, or a media server. In PostgreSQL, this is done by storing the file path in a TEXT or VARCHAR column. When accessing the video, the application retrieves its path from the database, fetches the file from its storage location, and then processes it as needed, such as displaying the video or performing other operations.
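As a sketch of this reference-based pattern (the table name videos_meta and function names are my own; `conn` can be any DB-API-style PostgreSQL connection such as one from psycopg2), the database keeps only the path while the file stays on disk or in object storage:

```python
import os

# `conn` is any DB-API connection (psycopg2, psycopg, etc.);
# the table name `videos_meta` is illustrative.
def store_video_reference(conn, tag: str, file_path: str) -> int:
    """Store only the video's metadata and location; the file stays external."""
    with conn.cursor() as cur:
        cur.execute(
            """
            CREATE TABLE IF NOT EXISTS videos_meta (
                id SERIAL PRIMARY KEY,
                tag TEXT NOT NULL,
                video_path TEXT NOT NULL
            )
            """
        )
        cur.execute(
            "INSERT INTO videos_meta (tag, video_path) VALUES (%s, %s) RETURNING id",
            (tag, file_path),
        )
        video_id = cur.fetchone()[0]
    conn.commit()
    return video_id

def open_video(conn, video_id: int):
    """Fetch the stored path, then read the file from its external location."""
    with conn.cursor() as cur:
        cur.execute("SELECT video_path FROM videos_meta WHERE id = %s", (video_id,))
        row = cur.fetchone()
    if row is None or not os.path.exists(row[0]):
        raise FileNotFoundError("metadata row missing or file gone from external storage")
    return open(row[0], "rb")
```

Note that the database knows nothing about the file's contents here, which is exactly why orphaned files become possible if a metadata row is deleted without removing the file.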

Hybrid approach: Balancing database and external storage

A hybrid approach to storing videos in PostgreSQL involves combining both direct storage and path-based storage. In this method, the database stores essential video metadata along with a small preview or thumbnail using the BYTEA data type. Meanwhile, the full video file is kept in an external filesystem or cloud storage, with its file path saved in a TEXT or VARCHAR column. When needed, the application retrieves the metadata and preview from the database while fetching the actual video from its storage location. 
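A minimal sketch of the hybrid layout (table and column names are illustrative, and `conn` is any DB-API-style connection) stores the small preview inline and the large file by reference:

```python
# `conn` is any DB-API connection; `videos_hybrid` and its columns are illustrative.
def store_hybrid(conn, tag: str, thumbnail: bytes, video_path: str) -> int:
    """Keep a small BYTEA preview in Postgres while the full video stays external."""
    with conn.cursor() as cur:
        cur.execute(
            """
            CREATE TABLE IF NOT EXISTS videos_hybrid (
                id SERIAL PRIMARY KEY,
                tag TEXT NOT NULL,
                thumbnail BYTEA,          -- small preview lives in the database
                video_path TEXT NOT NULL  -- full video lives in external storage
            )
            """
        )
        cur.execute(
            "INSERT INTO videos_hybrid (tag, thumbnail, video_path) "
            "VALUES (%s, %s, %s) RETURNING id",
            (tag, thumbnail, video_path),
        )
        new_id = cur.fetchone()[0]
    conn.commit()
    return new_id
```

A listing page can then render thumbnails straight from the database and only touch external storage when a user actually plays a video.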

Comparison of Video Storage Approaches

Here are the pros and cons of each approach:

Storing videos directly in PostgreSQL

  • Advantages: Guaranteed data consistency (ACID compliance); simplified backups (single system); built-in security/access controls for files.

  • Disadvantages: Significant database bloat (large files increase storage costs); slow queries (scanning binary data is resource-heavy); backups become slower and bulkier.

Using external storage with database references

  • Advantages: Minimal database size (only text paths stored); independent scalability (scale storage separately from the database); cheaper storage (use cost-effective object storage).

  • Disadvantages: No transactional guarantees for external files; manual cleanup required (orphaned files if metadata is deleted); security depends on external storage configuration.

Hybrid approach: balancing database and external storage

  • Advantages: Performance optimization (small previews/thumbnails in the database, full videos external); partial ACID benefits for critical metadata; flexibility to balance cost and performance.

  • Disadvantages: Complexity (synchronization between the database and external storage must be managed); partial consistency risk (e.g., previews may not match external video versions); higher maintenance overhead (two systems to monitor).

To store videos in PostgreSQL, we can use a data type called BYTEA.

What Is the BYTEA Data Type?

BYTEA (short for "byte array") is a PostgreSQL data type designed to store raw binary data, such as images, documents, or videos. 

Think of it as a "digital container" that holds data in its purest form: a sequence of bytes. 

Each byte is a tiny unit of data represented by a combination of zeros (0) and ones (1)—binary code.
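To make "a sequence of bytes" concrete, here is a quick look at raw bytes in Python, using the well-known PNG file signature as sample data (a BYTEA column stores exactly this kind of byte sequence):

```python
# The first four bytes of any PNG file, shown as raw bytes, as integers,
# and as the binary digits of a single byte.
png_header = bytes([0x89, 0x50, 0x4E, 0x47])

print(png_header)                    # b'\x89PNG'
print(list(png_header))              # [137, 80, 78, 71]
print(format(png_header[0], "08b"))  # 10001001 -- one byte as eight bits
```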

When you save a video to a BYTEA column, here is what happens:

  • Conversion: Video gets converted into a long string of bytes.

  • Storage: These bytes are stored directly in the database.

When we fetch a video stored in a BYTEA column, this happens:

  • PostgreSQL returns the raw bytes exactly as they were stored.

  • Application (e.g., Python, Java, etc.) is responsible for converting those bytes back into a usable video file.
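That application-side step can be as small as writing the fetched bytes back to disk. A minimal sketch (the function name and temp-file handling are my own):

```python
import tempfile

def bytes_to_video_file(video_bytes: bytes, suffix: str = ".mp4") -> str:
    """Write raw bytes fetched from a BYTEA column to a temp file a player can open."""
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
        tmp.write(video_bytes)
        return tmp.name

# Usage: path = bytes_to_video_file(row[0]); then hand `path` to a media player.
```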

Note: If images, videos, or audio files are too large to fit within BYTEA's 1 GB size limit, we can still store them directly in PostgreSQL using large objects (LOBs).
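As a sketch of the large-object route (assuming psycopg2, whose connections expose a lobject() handle; the function names and 1 MB chunk size are my own), the file is streamed in chunks rather than passed as one huge BYTEA value:

```python
# `conn` is assumed to be a psycopg2 connection (conn.lobject is psycopg2-specific).
def store_large_video(conn, path: str) -> int:
    """Stream a file into a PostgreSQL large object; returns its OID."""
    lobj = conn.lobject(0, "wb")  # oid=0 asks the server to allocate a new OID
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            lobj.write(chunk)  # written chunk by chunk, not as one value
    oid = lobj.oid
    lobj.close()
    conn.commit()
    return oid

def read_large_video(conn, oid: int) -> bytes:
    """Read a large object's contents back as bytes."""
    lobj = conn.lobject(oid, "rb")
    data = lobj.read()
    lobj.close()
    conn.commit()
    return data
```

You would store the returned OID in a regular table column so the video can be looked up alongside its metadata.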

Methods to Store Video in PostgreSQL

Storing video using an SQL query

First, we need to create a database table where we will store the entire video using the BYTEA data type.

CREATE TABLE IF NOT EXISTS videos (
    id SERIAL PRIMARY KEY,
    time TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    tag TEXT NOT NULL,
    video BYTEA
);

This creates a table called videos with four columns:

  • id: A unique identifier (auto-incremented).

  • time: Time at which the video was inserted.

  • tag: Video tags.

  • video: Binary data of the video file.

Next, we need to write an SQL query to insert the video into the database by converting it into bytes and then storing it in the table.

INSERT INTO videos (tag, video) VALUES ('comedy', pg_read_binary_file('/path/to/video/file'));

Here, pg_read_binary_file is a PostgreSQL function that lets you read all or part of a file, returning the content in binary format (BYTEA). Unlike pg_read_file, it doesn't check text encoding, making it suitable for reading any type of file, including images or videos.

Note: This function is available only to superusers by default, but execute permission can be granted to other roles (e.g., GRANT EXECUTE ON FUNCTION pg_read_binary_file(text) TO app_user).

Let's see what our data looks like in the table:

SELECT * FROM videos ORDER BY id ASC;


To retrieve and play a video stored in a PostgreSQL table, you need an application to handle the extraction and conversion process. The upcoming code examples will demonstrate how to achieve this.

Next, you will find coding examples in Python, Java, and C++ demonstrating how to complete the following tasks:

  • Create a table

  • Insert a video into the table

  • Retrieve the video from the table

  • Play the retrieved video

Before running the code, ensure you have the following dependencies installed:

  • Python: psycopg2

  • Java: PostgreSQL JDBC Driver

  • C++: libpqxx

Additionally, update the database credentials to match your system and provide the correct file path for the video on your filesystem.

Storing video using Python

import os

import psycopg2

# Database connection
connection = psycopg2.connect(
    dbname="postgres",
    user="postgres",
    password="<PASSWORD>",
    host="localhost",
    port="5432"
)
cursor = connection.cursor()

# Create the table if it does not exist
cursor.execute("""
    CREATE TABLE IF NOT EXISTS videos (
        id SERIAL PRIMARY KEY,
        time TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
        tag TEXT NOT NULL,
        video BYTEA
    )
""")
connection.commit()

# Path to the video file (update as needed)
video_path = "<VIDEO_PATH>"

# Ensure the file exists before reading
if not os.path.exists(video_path):
    print("Error: Video file not found!")
    exit()

# Read the video as binary
with open(video_path, "rb") as video_file:
    video_data = video_file.read()

# Insert into the database
cursor.execute("""
    INSERT INTO videos (tag, video)
    VALUES (%s, %s) RETURNING id
""", ("comedy", video_data))
video_id = cursor.fetchone()[0]  # Get the inserted row's ID
connection.commit()

# Verify the video is saved in the database
cursor.execute("SELECT COUNT(*) FROM videos WHERE tag = 'comedy'")
count = cursor.fetchone()[0]

if count > 0:
    print(f"Video successfully stored in database with ID: {video_id}")
else:
    print("Error: Video was not stored in database!")
    exit()

# Retrieve the video data from the database
cursor.execute("SELECT video FROM videos WHERE id = %s", (video_id,))
video_data = cursor.fetchone()[0]

# Check that the video exists in the database
if not video_data:
    print("Error: No video found in database!")
    exit()

# Save the video data to a temporary file
temp_video_path = "<TMP_VIDEO_PATH>"
with open(temp_video_path, "wb") as temp_file:
    temp_file.write(video_data)

# Verify the file was saved
if os.path.exists(temp_video_path):
    print(f"Video retrieved and saved as: {temp_video_path}")
else:
    print("Error: Video file could not be created!")
    exit()

# Start the video (os.startfile is Windows-only; use subprocess with a player elsewhere)
os.startfile(temp_video_path)

# Cleanup
cursor.close()
connection.close()

Storing video using Java

import java.awt.*;
import java.io.*;
import java.sql.*;

public class Main {
    public static void main(String[] args) {
        storeVideo();
        retrieveVideo();
    }

    private static void storeVideo() {
        try {
            Class.forName("org.postgresql.Driver");
            Connection connection = DriverManager.getConnection(
                    "jdbc:postgresql://localhost:5432/postgres", "postgres", "<PASSWORD>");

            // Create the table if it does not exist
            String createTableQuery = "CREATE TABLE IF NOT EXISTS videos ("
                    + "id SERIAL PRIMARY KEY, "
                    + "time TIMESTAMP DEFAULT CURRENT_TIMESTAMP, "
                    + "tag TEXT NOT NULL, "
                    + "video BYTEA NOT NULL)";
            Statement statement = connection.createStatement();
            statement.executeUpdate(createTableQuery);

            // Load the video file
            File videoFile = new File("<VIDEO_PATH>");
            if (!videoFile.exists()) {
                System.out.println("Error: Video file not found!");
                return;
            }
            FileInputStream fis = new FileInputStream(videoFile);

            // Insert the video
            String insertQuery = "INSERT INTO videos (tag, video) VALUES (?, ?)";
            PreparedStatement preparedStatement = connection.prepareStatement(insertQuery);
            preparedStatement.setString(1, "comedy");
            preparedStatement.setBinaryStream(2, fis, (int) videoFile.length());
            if (preparedStatement.executeUpdate() > 0) {
                System.out.println("Video stored successfully!");
            }

            fis.close();
            preparedStatement.close();
            statement.close();
            connection.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    private static void retrieveVideo() {
        try {
            Class.forName("org.postgresql.Driver");
            Connection connection = DriverManager.getConnection(
                    "jdbc:postgresql://localhost:5432/postgres", "postgres", "<PASSWORD>");

            // Fetch the most recent video
            String query = "SELECT video FROM videos WHERE tag = 'comedy' ORDER BY time DESC LIMIT 1";
            Statement statement = connection.createStatement();
            ResultSet resultSet = statement.executeQuery(query);

            if (resultSet.next()) {
                byte[] videoData = resultSet.getBytes("video");

                // Save to a file
                FileOutputStream fos = new FileOutputStream("retrieved_video.mp4");
                fos.write(videoData);
                fos.close();

                // Auto-play the video with the default application
                Desktop.getDesktop().open(new File("retrieved_video.mp4"));
                System.out.println("Video playing automatically!");
            }

            resultSet.close();
            statement.close();
            connection.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Storing video using C++

#include <cstdlib>
#include <fstream>
#include <iostream>
#include <vector>

#include <pqxx/pqxx>

using namespace std;

void createTable(pqxx::connection &c) {
    pqxx::work w(c);
    w.exec(R"(
        CREATE TABLE IF NOT EXISTS videos (
            id SERIAL PRIMARY KEY,
            time TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
            tag TEXT NOT NULL,
            video BYTEA
        )
    )");
    w.commit();
    cout << "Table checked/created successfully." << endl;
}

void insertVideo(pqxx::connection &c, const string &videoPath, const string &tag) {
    ifstream file(videoPath, ios::binary);
    if (!file) {
        cerr << "Error: Unable to open file " << videoPath << endl;
        return;
    }
    vector<char> videoData((istreambuf_iterator<char>(file)), istreambuf_iterator<char>());
    file.close();

    pqxx::work w(c);
    w.exec_params("INSERT INTO videos (tag, video) VALUES ($1, $2)",
                  tag, pqxx::binarystring(videoData.data(), videoData.size()));
    w.commit();
    cout << "Video inserted successfully." << endl;
}

void retrieveAndPlayVideo(pqxx::connection &c, const string &tag) {
    pqxx::work w(c);
    pqxx::result r = w.exec_params(
        "SELECT video FROM videos WHERE tag = $1 ORDER BY id ASC", tag);

    if (r.empty()) {
        cout << "No video found with tag: " << tag << endl;
        return;
    }

    pqxx::binarystring videoData(r[0][0]);
    ofstream videoFile("output_video.mp4", ios::binary);
    videoFile.write(reinterpret_cast<const char *>(videoData.data()), videoData.size());
    videoFile.close();

    cout << "Video retrieved and saved as output_video.mp4" << endl;
    system("mpv output_video.mp4");
}

int main() {
    try {
        // Connect to PostgreSQL
        pqxx::connection c("dbname=postgres user=postgres password=<PASSWORD> host=localhost");
        if (!c.is_open()) {
            cerr << "Can't open database" << endl;
            return 1;
        }
        cout << "Opened database successfully: " << c.dbname() << endl;

        createTable(c);

        string videoPath = "<VIDEO_PATH>";
        string tag = "comedy";
        insertVideo(c, videoPath, tag);
        retrieveAndPlayVideo(c, tag);
    } catch (const exception &e) {
        cerr << "Error: " << e.what() << endl;
        return 1;
    }
    return 0;
}

Note: This approach writes the video to a file every time it is read from the database, which can be inefficient due to repeated disk writes and the resulting file dependency.

Storing Video as Timestamped Data in PostgreSQL

Yes, you read that right! We can store video as timestamped data in a PostgreSQL table. Here's how:

For this example, I downloaded a random five-minute, six-second traffic video from YouTube and split it into one-second segments, resulting in a total of 306 segments. Each segment was then stored as BYTEA in a PostgreSQL table, along with its corresponding timestamp in the video.

Why is this useful?

Imagine needing to retrieve a specific frame from a video at an exact timestamp. By simply querying the database with the desired timestamp, we can instantly retrieve the relevant frame. This approach can be invaluable in scenarios like accident investigations, where identifying events and culprits at a specific moment is crucial.

import os
from datetime import datetime

import cv2
import psycopg

# Database connection
conn = psycopg.connect("dbname=postgres user=postgres password=<PASSWORD> host=localhost")
cur = conn.cursor()

# Create the table if it does not exist
cur.execute("""
    CREATE TABLE IF NOT EXISTS segments (
        id SERIAL PRIMARY KEY,
        segment BYTEA,
        segment_number INT,
        created_at TIMESTAMP,
        timeline TEXT
    )
""")
conn.commit()

# Load the video
video_path = "D:\\traffic.mp4"
cap = cv2.VideoCapture(video_path)
fps = int(cap.get(cv2.CAP_PROP_FPS))
frame_count = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
duration = int(frame_count / fps)
frames_per_segment = fps

# Split the video into 1-second segments
for segment_number in range(duration):
    frames = []
    for _ in range(frames_per_segment):
        ret, frame = cap.read()
        if not ret:
            break
        frames.append(frame)

    segment_path = f"segment_{segment_number}.mp4"
    out = cv2.VideoWriter(
        segment_path,
        cv2.VideoWriter_fourcc(*"mp4v"),
        fps,
        (int(cap.get(3)), int(cap.get(4)))
    )
    for frame in frames:
        out.write(frame)
    out.release()

    # Check that the file was created
    if not os.path.exists(segment_path) or os.path.getsize(segment_path) == 0:
        print(f"Error: Segment file {segment_path} not created correctly.")
        continue

    # Read and store the segment in the database
    with open(segment_path, "rb") as file:
        segment_bytes = file.read()

    # Calculate the timeline (HH:MM:SS)
    hours = segment_number // 3600
    minutes = (segment_number % 3600) // 60
    seconds = segment_number % 60
    timeline = f"{hours:02d}:{minutes:02d}:{seconds:02d}"

    cur.execute(
        "INSERT INTO segments (segment, segment_number, created_at, timeline) VALUES (%s, %s, %s, %s)",
        (segment_bytes, segment_number, datetime.now(), timeline)
    )
    conn.commit()
    os.remove(segment_path)

cap.release()

# Display stored timelines for verification
cur.execute("SELECT timeline FROM segments ORDER BY timeline")
rows = cur.fetchall()
print("Stored timelines:", [row[0] for row in rows])

# Display a specific segment by timeline
timeline = "00:01:40"
cur.execute("SELECT segment FROM segments WHERE timeline = %s", (timeline,))
segment_data = cur.fetchone()
if segment_data:
    segment_bytes = segment_data[0]
    temp_video_path = f"temp_segment_{timeline.replace(':', '-')}.mp4"
    with open(temp_video_path, "wb") as temp_file:
        temp_file.write(segment_bytes)
    print(f"Playing segment at timeline {timeline}...")
    os.startfile(temp_video_path)  # Opens with the default media player (Windows-only)
else:
    print(f"Segment with timeline {timeline} not found.")

cur.close()
conn.close()

When to Store Videos in PostgreSQL 

Here are a few reasons you might consider storing videos directly in PostgreSQL, even though it’s generally not recommended for large files.

Simplified architecture

Avoid managing separate storage systems (e.g., S3, file servers). Everything (data + videos) lives in one place, reducing operational complexity.

Transactional consistency

Ensure atomicity: If a transaction involves both video metadata and the video itself (e.g., uploading a video and updating user records), PostgreSQL guarantees they succeed or fail together.
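As a sketch of that guarantee (assuming a psycopg2-style connection, and a hypothetical users table with an upload_count column), the metadata update and the video insert share one transaction, so a failure in either rolls back both:

```python
# `conn` is a psycopg2-style connection; `users.upload_count` is illustrative.
def upload_video_atomically(conn, user_id: int, tag: str, video_bytes: bytes) -> int:
    """Insert the video and update user metadata in a single transaction."""
    try:
        with conn.cursor() as cur:
            cur.execute(
                "INSERT INTO videos (tag, video) VALUES (%s, %s) RETURNING id",
                (tag, video_bytes),
            )
            video_id = cur.fetchone()[0]
            cur.execute(
                "UPDATE users SET upload_count = upload_count + 1 WHERE id = %s",
                (user_id,),
            )
        conn.commit()  # both changes become visible together...
        return video_id
    except Exception:
        conn.rollback()  # ...or neither does
        raise
```

With path-based external storage, no such guarantee exists: the file write and the database insert are two separate operations that can fail independently.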

Built-in security

Leverage PostgreSQL’s encryption, role-based access control (RBAC), and auditing features to secure videos without configuring external storage permissions.

Backup/restore simplicity

Backups include videos and metadata in a single snapshot, simplifying disaster recovery compared to syncing databases with external storage.

Small file use cases

For tiny videos (e.g., short clips, thumbnails), the overhead of external storage might outweigh the benefits. PostgreSQL’s BYTEA or large objects work for limited-scale needs.

When to Store Videos Externally and Keep Only Metadata in PostgreSQL

Scalability for large volumes

When: Videos are large (e.g., HD/4K) or numerous.

Why: External storage (e.g., AWS S3, Google Cloud Storage) scales effectively without limit for massive files.

Optimized performance

When: Fast read/write operations are critical (e.g., streaming platforms).

Why: External storage systems handle parallel access and large file transfers more efficiently.

Separation of concerns

When: You need to manage metadata (e.g., titles, tags, user access) separately from raw video files.

Why: PostgreSQL excels at querying structured metadata, while external storage handles unstructured data. This simplifies architecture and maintenance.

Leverage specialized features

When: Advanced video processing (e.g., transcoding, CDN delivery) is required.

Why: Tools like Cloudflare, Cloudinary, or AWS MediaConvert integrate seamlessly with external storage, offering features databases can't provide (e.g., global caching, adaptive streaming).

Conclusion

Storing videos in PostgreSQL using BYTEA provides a convenient way to manage binary data directly within the database. We explored how to achieve this using SQL, Python, Java, and C++, demonstrating different approaches for inserting and retrieving videos. 

While this method ensures data integrity and centralized access, it may not be ideal for large-scale video storage due to performance and scalability concerns. In such cases, external storage with metadata in PostgreSQL is a more efficient alternative.

Now that you've seen how to store video data in PostgreSQL, you're ready to tackle other challenges in your applications, such as real-time analytics on time-series data. While storing videos directly in PostgreSQL works for certain use cases, managing high-volume, real-time data requires optimized solutions.

This is where TimescaleDB comes in.

Built on PostgreSQL, TimescaleDB gives you the best of both worlds: the reliability and SQL interface you love about PostgreSQL, with the performance and scalability needed for demanding workloads.

Whether you're building video analytics platforms, IoT applications, or monitoring systems, TimescaleDB helps you:

  • Store and query billions of data points efficiently

  • Maintain high write and read performance at scale

  • Use familiar PostgreSQL syntax and tools

  • Keep your entire data stack within one database ecosystem

Try TimescaleDB for free, or check out our documentation to see how it can enhance your specific use case.
