Prevision
Building a model is an iterative process that can take weeks, months, or even years, and reproducing model results, maintaining version control, and auditing past work are complex. Ideally, you record not only each step but also how you arrived there. A model shouldn’t be a file hidden away somewhere, but a tangible object that all parties can track and analyze consistently. Prevision.io lets you record each experiment as you train it, along with its characteristics, automated analyses, and versions as your project progresses, whether you created it using our AutoML or your own tools. Automatically experiment with dozens of feature engineering strategies and algorithm types to build highly performant models. In a single command, the engine tries out different feature engineering strategies for every type of data (e.g. tabular, text, images) to maximize the information in your datasets.
Union Cloud
Union.ai is an award-winning, Flyte-based data and ML orchestrator for scalable, reproducible ML pipelines. With Union.ai, you can write your code locally and easily deploy pipelines to remote Kubernetes clusters.
“Flyte’s scalability, data lineage, and caching capabilities enable us to train hundreds of models on petabytes of geospatial data, giving us an edge in our business.”
— Arno, CTO at Blackshark.ai
“With Flyte, we want to give the power back to biologists. We want to stand up something that they can play around with different parameters for their models because not every … parameter is fixed. We want to make sure we are giving them the power to run the analyses.”
— Krishna Yeramsetty, Principal Data Scientist at Infinome
“Flyte plays a vital role as a key component of Gojek’s ML Platform by providing exactly that.”
— Pradithya Aria Pura, Principal Engineer at Gojek
Keepsake
Keepsake is an open-source Python library designed to provide version control for machine learning experiments and models. It enables users to automatically track code, hyperparameters, training data, model weights, metrics, and Python dependencies, ensuring that all aspects of the machine learning workflow are recorded and reproducible. Keepsake integrates seamlessly with existing workflows by requiring minimal code additions, allowing users to continue training as usual while Keepsake saves code and weights to Amazon S3 or Google Cloud Storage. This facilitates the retrieval of code and weights from any checkpoint, aiding in re-training or model deployment. Keepsake supports various machine learning frameworks, including TensorFlow, PyTorch, scikit-learn, and XGBoost, by saving files and dictionaries in a straightforward manner. It also offers features such as experiment comparison, enabling users to analyze differences in parameters, metrics, and dependencies across experiments.
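The tracking-and-comparison idea described above can be illustrated with a minimal sketch using only the Python standard library. This is not Keepsake's API: the function names (`save_experiment`, `load_experiment`, `diff_experiments`), the JSON-per-experiment layout, and the sample parameters and metrics are all assumptions made for illustration, and a local temporary directory stands in for Amazon S3 or Google Cloud Storage.

```python
import json
import tempfile
import uuid
from pathlib import Path


def save_experiment(root: Path, params: dict, metrics: dict) -> str:
    """Write one experiment record to <root>/<id>.json and return its id."""
    exp_id = uuid.uuid4().hex[:8]
    record = {"id": exp_id, "params": params, "metrics": metrics}
    (root / f"{exp_id}.json").write_text(json.dumps(record, indent=2))
    return exp_id


def load_experiment(root: Path, exp_id: str) -> dict:
    """Read a previously saved experiment record back from disk."""
    return json.loads((root / f"{exp_id}.json").read_text())


def diff_experiments(a: dict, b: dict) -> dict:
    """Return {section: {key: (value_in_a, value_in_b)}} for values that differ."""
    changes = {}
    for section in ("params", "metrics"):
        keys = sorted(set(a[section]) | set(b[section]))
        delta = {
            key: (a[section].get(key), b[section].get(key))
            for key in keys
            if a[section].get(key) != b[section].get(key)
        }
        if delta:
            changes[section] = delta
    return changes


# Hypothetical example: two runs that differ only in learning rate.
root = Path(tempfile.mkdtemp())  # stand-in for a storage bucket
first = save_experiment(root, {"lr": 0.01, "epochs": 10}, {"accuracy": 0.91})
second = save_experiment(root, {"lr": 0.001, "epochs": 10}, {"accuracy": 0.94})
changes = diff_experiments(load_experiment(root, first), load_experiment(root, second))
print(changes)
```

Storing each experiment as its own plain JSON record keeps the comparison step simple: a diff only needs two dictionaries, regardless of which framework produced the metrics.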
MLflow
MLflow is an open source platform to manage the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry. MLflow currently offers four components, which let you: record and query experiments (code, data, config, and results); package data science code in a format that reproduces runs on any platform; deploy machine learning models in diverse serving environments; and store, annotate, discover, and manage models in a central repository. The MLflow Tracking component is an API and UI for logging parameters, code versions, metrics, and output files when running your machine learning code and for later visualizing the results. MLflow Tracking lets you log and query experiments using the Python, REST, R, and Java APIs. An MLflow Project is a format for packaging data science code in a reusable and reproducible way, based primarily on conventions. In addition, the Projects component includes an API and command-line tools for running projects.
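As a rough illustration of the Tracking component's Python API, the sketch below logs hyperparameters and per-epoch metrics for one run. It assumes the `mlflow` package is installed; the helper name `log_training_run`, its arguments, and the experiment name `demo-experiment` are hypothetical and chosen only for this example.

```python
def log_training_run(params: dict, metrics_per_epoch: list,
                     experiment: str = "demo-experiment") -> str:
    """Log hyperparameters and per-epoch metrics to MLflow; return the run ID."""
    # Imported lazily so this module still loads where MLflow is not installed.
    import mlflow  # third-party: pip install mlflow

    # Select the experiment by name (hypothetical name; created if missing).
    mlflow.set_experiment(experiment)
    with mlflow.start_run() as run:
        mlflow.log_params(params)
        for epoch, metrics in enumerate(metrics_per_epoch):
            for name, value in metrics.items():
                # `step` lets the UI plot each metric against the epoch number.
                mlflow.log_metric(name, value, step=epoch)
        return run.info.run_id
```

A call such as `log_training_run({"lr": 0.01}, [{"loss": 0.9}, {"loss": 0.6}])` would record one run. When no tracking server is configured, MLflow stores runs locally, and the `mlflow ui` command opens the UI for browsing them.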