State of the Art Natural Language Processing
A Spark library for Amazon SageMaker
A unified analytics engine for large-scale data processing
Docker image used to run data processing workloads
A free, open-source, and cross-platform big data analytics framework
Apache Spark to Apache Cassandra connector
Web-based, cross-platform and full-featured Remote Administration Tool
Deequ is a library built on top of Apache Spark
Simple and distributed Machine Learning
Jupyter magics and kernels for working with remote Spark clusters
A unified interface for distributed computing
Apache Kyuubi is a distributed and multi-tenant gateway
Command-line tool from the Alire project and supporting library
A Cloud Native Batch System (Project under CNCF)
An end-to-end, realtime and cloud native Lakehouse framework
Concurrently chat with ChatGPT, Bing Chat, Bard, Alpaca, Vicuna, etc.
Open source platform for the machine learning lifecycle
Scalable and Flexible Gradient Boosting
R interface for Apache Spark
Distributed DataFrame for Python designed for the cloud
Series (one-dimensional) and dataframes (two-dimensional)
NumPy aware dynamic Python compiler using LLVM
Python Stream Processing