glassflow/clickhouse-etl

Join our weekly office hours every Wednesday 15:00-18:00 CET

Slack Email Support Twitter

GlassFlow for ClickHouse Streaming ETL

GlassFlow is an open-source ETL tool that enables real-time data processing from Kafka to ClickHouse with features like deduplication and temporal joins.

Quick Start

  1. Clone the repository:
     git clone https://github.com/glassflow/clickhouse-etl.git
     cd clickhouse-etl
  2. Start the services:
     docker-compose up
  3. Access the web interface at http://localhost:8080 to configure your pipeline.
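
The containers can take a moment to start, so the web interface may not answer right away. The following is a minimal sketch of a readiness check, not part of GlassFlow itself. The helper names `is_up` and `wait_until_ready` are hypothetical, and the only detail taken from the steps above is the URL http://localhost:8080.

```python
import time
import urllib.error
import urllib.request


def is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with any HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with a non-2xx status, so it is reachable.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or DNS failure: not up yet.
        return False


def wait_until_ready(url, attempts=30, delay=2.0, probe=is_up, sleep=time.sleep):
    """Poll `url` up to `attempts` times, pausing `delay` seconds between tries.

    `probe` and `sleep` are injectable so the loop can be exercised
    without a running server (hypothetical helper, for illustration only).
    """
    for _ in range(attempts):
        if probe(url):
            return True
        sleep(delay)
    return False
```

Calling `wait_until_ready("http://localhost:8080")` after `docker-compose up` would return `True` once the interface responds, or `False` if it does not respond within the given number of attempts.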

Demo

GlassFlow Overview Video

Documentation

For detailed documentation, visit docs.glassflow.dev.

Features

  • Real-time data processing from Kafka to ClickHouse
  • Deduplication with configurable time windows
  • Temporal joins between multiple Kafka topics
  • Web-based UI for pipeline management
  • Docker-based deployment
  • Local development environment
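
To make the deduplication and temporal join features above more concrete, here is a small conceptual sketch in Python. It is not GlassFlow's implementation or API. The names `WindowedDeduplicator` and `temporal_join` are hypothetical, events are assumed to arrive with non-decreasing timestamps in seconds, and the join runs over in-memory lists rather than live Kafka topics.

```python
from dataclasses import dataclass, field


@dataclass
class WindowedDeduplicator:
    """Drop events whose key was already accepted within `window_s` seconds."""

    window_s: float
    _first_seen: dict = field(default_factory=dict, init=False, repr=False)

    def accept(self, key, ts):
        """Return True if the event should be kept, False if it is a duplicate."""
        # Forget keys whose window has fully elapsed.
        expired = [k for k, t in self._first_seen.items() if ts - t >= self.window_s]
        for k in expired:
            del self._first_seen[k]
        if key in self._first_seen:
            return False
        self._first_seen[key] = ts
        return True


def temporal_join(left, right, window_s):
    """Pair left and right events that share a key and lie within `window_s` seconds.

    Each event is a (key, timestamp, value) tuple.
    """
    right_by_key = {}
    for key, ts, value in right:
        right_by_key.setdefault(key, []).append((ts, value))

    joined = []
    for key, ts, value in left:
        for r_ts, r_value in right_by_key.get(key, []):
            if abs(ts - r_ts) <= window_s:
                joined.append((key, value, r_value))
    return joined
```

With a 60-second window, `accept("a", 0)` keeps the event, `accept("a", 30)` drops it as a duplicate, and `accept("a", 60)` keeps it again because the window has elapsed. A real streaming join would also need to evict old state and handle out-of-order events, which this sketch leaves out.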

Support

License

This project is licensed under the Apache License 2.0.

About

Real-time deduplication and temporal joins for streaming data
