State-of-the-Art Natural Language Processing
A unified analytics engine for large-scale data processing
A Spark library for Amazon SageMaker
Docker image used to run data processing workloads
Apache Spark to Apache Cassandra connector
A free, open-source, and cross-platform big data analytics framework
Web-based, cross-platform and full-featured Remote Administration Tool
Deequ is a library built on top of Apache Spark
Simple and distributed Machine Learning
A unified interface for distributed computing
A Cloud Native Batch System (Project under CNCF)
Apache Kyuubi is a distributed and multi-tenant gateway
Command-line tool from the Alire project and supporting library
Concurrently chat with ChatGPT, Bing Chat, Bard, Alpaca, Vicuna, etc.
Open source platform for the machine learning lifecycle
Distributed DataFrame for Python designed for the cloud
Jupyter magics and kernels for working with remote Spark clusters
NumPy-aware dynamic Python compiler using LLVM
R interface for Apache Spark
Series (one-dimensional) and dataframes (two-dimensional)
Scalable and Flexible Gradient Boosting
A Scala kernel for Jupyter
Apache Iceberg