Search Results for "linux for windows"

Showing 35 open source projects for "linux for windows"

1

Best-of Python

A ranked list of awesome Python open-source libraries

This curated list contains 390 awesome open-source projects with a total of 1.4M stars grouped into 28 categories. All projects are ranked by a project-quality score, which is calculated based on various metrics automatically collected from GitHub and different package managers. If you would like to add or update projects, feel free to open an issue, submit a pull request, or directly edit the projects.yaml. Contributions are very welcome! Ranked list of awesome Python libraries for web...

Downloads: 4 This Week

Last Update: 17 hours ago
See Project
2

Backstage

Backstage is an open platform for building developer portals

Powered by a centralized software catalog, Backstage restores order to your infrastructure and enables your product teams to ship high-quality code quickly, without compromising autonomy. At Spotify, we've always believed in the speed and ingenuity that comes from having autonomous development teams. But as we learned firsthand, the faster you grow, the more fragmented and complex your software ecosystem becomes. And then everything slows down again. By centralizing services and...

Downloads: 3 This Week

Last Update: 2025-12-29
See Project
3

Kestra

Kestra is an infinitely scalable orchestration and scheduling platform

Build reliable workflows, blazingly fast, deploy in just a few clicks. Kestra is an open-source, event-driven orchestrator that simplifies data operations and improves collaboration between engineers and business users. By bringing Infrastructure as Code best practices to data pipelines, Kestra allows you to build reliable workflows and manage them with confidence. Thanks to the declarative YAML interface for defining orchestration logic, everyone who benefits from analytics can participate...

Downloads: 2 This Week

Last Update: 3 days ago
See Project
4

Union Pandera

Light-weight, flexible, expressive statistical data testing library

The open-source framework for precision data testing for data scientists and ML engineers. Pandera provides a simple, flexible, and extensible data-testing framework for validating not only your data but also the functions that produce them. A simple, zero-configuration data testing framework for data scientists and ML engineers seeking correctness. Access a comprehensive suite of built-in tests, or easily create your own validation rules for your specific use cases. Validate the functions...

Downloads: 2 This Week

Last Update: 17 hours ago
See Project
5

StarRocks

StarRocks is a next-gen sub-second MPP database for full analytics

StarRocks is the next generation of real-time SQL engines for enterprise analytics. Real-time analytics is notoriously difficult. Complex data pipelines and de-normalized tables have always been a necessary evil. Processing updates or deletes once data arrives has not been possible, until now. StarRocks solves these challenges and makes real-time analytics easy. Get amazing query performance on Star or Snowflake Schemas directly. From canceled orders to updated items, your analytics...

Downloads: 2 This Week

Last Update: 2025-12-29
See Project
6

rudderstack

Privacy and Security focused Segment-alternative, in Golang

Quickly deploy flexible, powerful customer data pipelines, then send the data to your entire stack, without the engineering headache. Our complete toolset makes it easy to level up your customer data stack. Spare your data engineers the headache. Our 180+ integrations, along with custom webhook sources and destinations, save data teams hundreds of hours. Say goodbye to different versions of the truth. Our SDKs track anonymous and known users at the source and reconcile users in your warehouse...

Downloads: 1 This Week

Last Update: 15 hours ago
See Project
7

lakeFS

lakeFS - Git-like capabilities for your object storage

Increase data quality and reduce the painful cost of errors. Data engineering best practices using git-like operations on data. lakeFS is an open-source data version control system for data lakes. It enables zero-copy Dev / Test isolated environments, continuous quality validation, atomic rollback on bad data, reproducibility, and more. Data is dynamic; it changes over time. Dealing with that without a data version control system is error-prone and labor-intensive. With lakeFS, your data lake is...

Downloads: 1 This Week

Last Update: 2025-12-24
See Project
8

Conduit

Conduit streams data between data stores. Kafka Connect replacement

Conduit is a data streaming tool written in Go. It aims to provide the best user experience for building and running real-time data pipelines. Conduit comes with batteries included: it provides a UI, common connectors, processors, and observability data out of the box. Sync data between your production systems using an extensible, event-first experience with minimal dependencies that fits within your existing workflow. Eliminate the multi-step process you go through today. Just download the...

Downloads: 1 This Week

Last Update: 2025-08-11
See Project
9

go-streams

A lightweight stream processing library for Go

A lightweight stream processing library for Go. go-streams provides a simple and concise DSL to build data pipelines. In computing, a pipeline, also known as a data pipeline, is a set of data processing elements connected in series, where the output of one element is the input of the next one. The elements of a pipeline are often executed in parallel or in time-sliced fashion. Some amount of buffer storage is often inserted between elements.
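The series-connected pipeline described above can be sketched with plain Python generators. This is only an illustration of the "output of one element is the input of the next" idea; it is not the go-streams API (go-streams is a Go library), and the stage names `source`, `keep_even`, `square`, and `run_pipeline` are hypothetical.

```python
from typing import Callable, Iterable, Iterator, List

# A stage takes an iterable of ints and lazily yields ints.
Stage = Callable[[Iterable[int]], Iterator[int]]


def source(n: int) -> Iterator[int]:
    """Hypothetical source stage: emit the integers 0..n-1."""
    yield from range(n)


def keep_even(items: Iterable[int]) -> Iterator[int]:
    """Hypothetical filter stage: pass through even values only."""
    for x in items:
        if x % 2 == 0:
            yield x


def square(items: Iterable[int]) -> Iterator[int]:
    """Hypothetical map stage: square each value."""
    for x in items:
        yield x * x


def run_pipeline(src: Iterable[int], *stages: Stage) -> List[int]:
    """Connect stages in series and collect the final output."""
    stream = src
    for stage in stages:
        # The output of the previous element becomes the input of the next.
        stream = stage(stream)
    return list(stream)


result = run_pipeline(source(6), keep_even, square)
print(result)
```

Because each stage is a generator, values flow through one at a time rather than being materialized between stages. This sketch runs sequentially; it does not model the parallel or time-sliced execution or the buffering between elements mentioned above.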

Downloads: 0 This Week

Last Update: 2025-05-10
See Project
10

The Tengo Language

A fast script language for Go

Tengo is a small, dynamic, fast, secure script language for Go. Tengo is fast and secure because it's compiled/executed as bytecode on a stack-based VM that's written in native Go. Securely embeddable and extensible. Compiler/runtime written in native Go (no external deps or cgo). Executable as a standalone language / REPL. Use cases: rules engine, state machine, data pipeline, transpiler. If you need to evaluate a simple expression, you can use the Eval function instead.

Downloads: 0 This Week

Last Update: 2025-05-24
See Project
11

Elementary

Open-source data observability for analytics engineers

Elementary is an open-source data observability solution for data & analytics engineers. Monitor your dbt project and data in minutes, and be the first to know of data issues. Gain immediate visibility, detect data issues, send actionable alerts, and understand the impact and root cause. Generate a data observability report, host it or share with your team. Monitoring of data quality metrics, freshness, volume and schema changes, including anomaly detection. Elementary data monitors are...

Downloads: 0 This Week

Last Update: 2025-12-07
See Project
12

GenStage

Producer and consumer actors with back-pressure for Elixir

GenStage is a specification and set of behaviours for building demand-driven data pipelines on the BEAM. It formalizes the roles of producers, consumers, and producer-consumers, using back-pressure so that fast producers don’t overwhelm downstream stages. Developers implement callbacks like handle_demand and handle_events to control how items are emitted, transformed, and consumed across asynchronous boundaries. Because stages are OTP processes, you gain fault tolerance, supervised restarts,...
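The demand-driven idea above can be sketched in a few lines of Python. This is not GenStage (an Elixir library) and does not model OTP processes or supervision; the classes `CounterProducer` and `CollectingConsumer` and the `run` loop are hypothetical, and only the method names `handle_demand` and `handle_events` are borrowed from the description. The point is that the producer emits nothing until the consumer asks, and never more than was asked for.

```python
class CounterProducer:
    """Hypothetical producer: emits consecutive integers only on demand."""

    def __init__(self, limit):
        self.next_value = 0
        self.limit = limit

    def handle_demand(self, demand):
        # Emit at most `demand` events, never more than the consumer asked for.
        end = min(self.next_value + demand, self.limit)
        events = list(range(self.next_value, end))
        self.next_value = end
        return events


class CollectingConsumer:
    """Hypothetical consumer: asks for `max_demand` events at a time."""

    def __init__(self, max_demand):
        self.max_demand = max_demand
        self.seen = []
        self.batch_sizes = []

    def handle_events(self, events):
        self.batch_sizes.append(len(events))
        self.seen.extend(events)


def run(producer, consumer):
    """Pull loop: the consumer's demand, not the producer's speed, sets the pace."""
    while True:
        events = producer.handle_demand(consumer.max_demand)
        if not events:
            break
        consumer.handle_events(events)


producer = CounterProducer(limit=7)
consumer = CollectingConsumer(max_demand=3)
run(producer, consumer)
print(consumer.batch_sizes)
```

Here the consumer never receives a batch larger than its `max_demand`, which is the back-pressure property in miniature.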

Downloads: 0 This Week

Last Update: 2025-09-01
See Project
13

whylogs

The open standard for data logging

whylogs is an open-source library for logging any kind of data. With whylogs, users are able to generate summaries of their datasets (called whylogs profiles), which they can use to track changes in their dataset, create data constraints to know whether their data looks the way it should, and quickly visualize key summary statistics about their datasets. whylogs profiles are the core of the whylogs library. They capture key statistical properties of data, such as the distribution (far beyond...

Downloads: 0 This Week

Last Update: 2024-12-03
See Project
14

memphis

Next-Generation Event Processing Platform

Memphis enables building modern queue-based applications that require large volumes of streamed and enriched data, modern protocols, zero ops, up to 9x faster development, up to 46x lower costs, and significantly lower dev time for data-oriented developers and data engineers. Queues and brokers are a mission-critical component in the modern application architecture and should be as highly available and stable as possible. Provide great performance while maintaining efficient resource...

Downloads: 0 This Week

Last Update: 2024-05-27
See Project
15

Apache SeaTunnel

SeaTunnel is a distributed, high-performance data integration platform

SeaTunnel is a very easy-to-use, ultra-high-performance distributed data integration platform that supports real-time synchronization of massive data. It can synchronize tens of billions of data stably and efficiently every day, and has been used in production at nearly 100 companies. There are hundreds of commonly used data sources whose versions are incompatible. With the emergence of new technologies, more data sources are appearing. It is difficult for users to find a tool that can...

Downloads: 0 This Week

Last Update: 2025-09-05
See Project
16

Dolphin Scheduler

A distributed and extensible workflow scheduler platform

Apache DolphinScheduler is a distributed and extensible workflow scheduler platform with powerful DAG visual interfaces, dedicated to solving complex job dependencies in the data pipeline and providing various types of jobs available `out of the box`. It has a decentralized multi-master and multi-worker design with built-in HA support and overload handling. All process...

Downloads: 0 This Week

Last Update: 2025-10-26
See Project
17

Automated Tool for Optimized Modelling

Automated Tool for Optimized Modelling

During the exploration phase of a machine learning project, a data scientist tries to find the optimal pipeline for their specific use case. This usually involves applying standard data cleaning steps, creating or selecting useful features, trying out different models, etc. Testing multiple pipelines requires many lines of code, and writing it all in the same notebook often makes it long and cluttered. On the other hand, using multiple notebooks makes it harder to compare the results and to...

Downloads: 0 This Week

Last Update: 2024-07-05
See Project
18

gusty

Making DAG construction easier

gusty allows you to control your Airflow DAGs, Task Groups, and Tasks with greater ease. gusty manages collections of tasks, represented as any number of YAML, Python, SQL, Jupyter Notebook, or R Markdown files. A directory of task files is instantly rendered into a DAG by passing a file path to gusty's create_dag function. gusty also manages dependencies (within one DAG) and external dependencies (dependencies on tasks in other DAGs) for each task file you define. All you have to do is...

Downloads: 0 This Week

Last Update: 2025-05-14
See Project
19

Mage.ai

Build, run, and manage data pipelines for integrating data

Open-source data pipeline tool for transforming and integrating data. The modern replacement for Airflow. Effortlessly integrate and synchronize data from 3rd party sources. Build real-time and batch pipelines to transform data using Python, SQL, and R. Run, monitor, and orchestrate thousands of pipelines without losing sleep. Have you met anyone who said they loved developing in Airflow? That’s why we designed an easy developer experience that you’ll enjoy. Each step in your pipeline is a...

Downloads: 0 This Week

Last Update: 2025-09-18
See Project
20

AutoGluon

AutoGluon: AutoML for Image, Text, and Tabular Data

AutoGluon enables easy-to-use and easy-to-extend AutoML with a focus on automated stack ensembling, deep learning, and real-world applications spanning image, text, and tabular data. Intended for both ML beginners and experts, AutoGluon enables you to quickly prototype deep learning and classical ML solutions for your raw data with a few lines of code. Automatically utilize state-of-the-art techniques (where appropriate) without expert knowledge. Leverage automatic hyperparameter tuning,...

Downloads: 0 This Week

Last Update: 2025-12-19
See Project
21

Luigi

Python module that helps you build complex pipelines of batch jobs

Luigi is a Python (3.6, 3.7, 3.8, 3.9 tested) package that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, handling failures, command line integration, and much more. The purpose of Luigi is to address all the plumbing typically associated with long-running batch processes. You want to chain many tasks, automate them, and failures will happen. These tasks can be anything, but are typically long running things like Hadoop...

Downloads: 0 This Week

Last Update: 2024-12-06
See Project
22

Pentaho

Pentaho offers a comprehensive data integration and analytics platform.

Pentaho couples data integration with business analytics in a modern platform to easily access, visualize and explore data that impacts business results. Use it as a full suite or as individual components that are accessible on-premise, in the cloud, or on-the-go (mobile). Pentaho enables IT and developers to access and integrate data from any source and deliver it to your applications all from within an intuitive and easy to use graphical tool. The Pentaho Enterprise Edition Free Trial...

69 Reviews

Downloads: 1,475 This Week

Last Update: 2025-02-06
See Project
23

PipeRider

Code review for data in dbt

PipeRider automatically compares your data to highlight the difference in impacted downstream dbt models so you can merge your Pull Requests with confidence. PipeRider can profile your dbt models and obtain information such as basic data composition, quantiles, histograms, text length, top categories, and more. PipeRider can integrate with dbt metrics and present the time-series data of metrics in the report. PipeRider generates a static HTML report each time it runs, which can be viewed...

Downloads: 0 This Week

Last Update: 2023-11-22
See Project
24

Datapipe

Real-time, incremental ETL library for ML with record-level depend

Datapipe is a real-time, incremental ETL library for Python with record-level dependency tracking. Datapipe is designed to streamline the creation of data processing pipelines. It excels in scenarios where data is continuously changing, requiring pipelines to adapt and process only the modified data efficiently. This library tracks dependencies for each record in the pipeline, ensuring minimal and efficient data processing.
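One way to picture record-level change tracking is the minimal sketch below. It is not Datapipe's API; the `fingerprint` helper and the `IncrementalStep` class are hypothetical, and the approach shown (hashing each input record and reprocessing only records whose hash changed) is just one simple way to process only modified data.

```python
import hashlib
import json


def fingerprint(record):
    """Stable hash of a JSON-serializable record (hypothetical helper)."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class IncrementalStep:
    """Hypothetical pipeline step that reprocesses only new or changed records."""

    def __init__(self, transform):
        self.transform = transform
        self.seen = {}     # record id -> fingerprint of the last processed input
        self.outputs = {}  # record id -> last computed output

    def run(self, records):
        """Process `records` (id -> record); return the ids actually recomputed."""
        processed = []
        for record_id, record in records.items():
            fp = fingerprint(record)
            if self.seen.get(record_id) != fp:
                self.outputs[record_id] = self.transform(record)
                self.seen[record_id] = fp
                processed.append(record_id)
        return processed


step = IncrementalStep(lambda r: r["value"] * 2)
first = step.run({"a": {"value": 1}, "b": {"value": 2}})
second = step.run({"a": {"value": 1}, "b": {"value": 5}})
print(first, second, step.outputs)
```

On the second run only record `b` is recomputed, because its input changed while `a` did not. The sketch ignores deletions and downstream dependencies between steps, which a real incremental ETL library would need to handle.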

3 Reviews

Downloads: 3 This Week

Last Update: 1 day ago
See Project
25

DataGym.ai

Open source annotation and labeling tool for image and video assets

DATAGYM enables data scientists and machine learning experts to label images up to 10x faster. AI-assisted annotation tools reduce manual labeling effort, give you more time to fine-tune ML models, and speed up time to market for new products. Accelerate your computer vision projects by cutting data preparation time by up to 50%. A machine learning model is only as good as its training data. DATAGYM is an end-to-end workbench to create, annotate, manage, and export the right training data...

Downloads: 0 This Week

Last Update: 2023-06-01
See Project


