Red Hat OpenShift AI
Red Hat® OpenShift® AI is a platform for managing the lifecycle of predictive and generative AI (gen AI) models, at scale, across hybrid cloud environments.
What is Red Hat OpenShift AI?
Built using open source technologies, OpenShift AI provides trusted, operationally consistent capabilities for teams to experiment, serve models, and deliver innovative applications.
OpenShift AI enables data acquisition and preparation, model training and fine-tuning, model serving and model monitoring, and hardware acceleration. With an open ecosystem of hardware and software partners, OpenShift AI delivers the flexibility you need for your specific use cases.
Bring AI-enabled applications to production faster
Combine the proven capabilities of Red Hat OpenShift AI and Red Hat OpenShift in a single enterprise-ready AI application platform that brings teams together. Data scientists, engineers, and app developers can collaborate in a single destination that promotes consistency, security, and scalability.
The latest release of OpenShift AI includes a curated collection of optimized, production-ready, third-party models, validated for Red Hat OpenShift AI. Access to this third-party model catalog gives your team more control over model accessibility and visibility to help meet security and policy requirements.
Additionally, OpenShift AI helps manage costs of inferencing with distributed serving through an optimized vLLM framework. To further reduce operational complexity, it offers advanced tooling to automate deployments and self-service access to models, tools, and resources.
Features & benefits
Less time managing AI infrastructure
Get on-demand access to high-performing models that make it easier to self-service, scale, and serve. Developers can skip the complexity, maintain control, and optimize costs.
Models-as-a-Service (MaaS), currently in developer preview, offers access to AI models in the form of API endpoints, an approach that encourages private and faster AI at scale.
Tested and supported AI/ML tooling
Red Hat tests, integrates, and supports AI/ML tooling and model serving, so you don’t have to. OpenShift AI draws from years of incubation in our Open Data Hub community project and open source projects like Kubeflow.
Our experience and open source expertise allow us to provide a generative AI-ready foundation, so customers can have greater choice and confidence in their gen AI strategies.
Flexibility across the hybrid cloud
Offered as either self-managed software or as a fully managed cloud service on top of OpenShift, Red Hat OpenShift AI provides a secure and flexible platform that gives you the choice of where you develop and deploy your models, whether on-premises, in the public cloud, or even at the edge.
Leverage our best practices
Red Hat Services provides expertise, training, and support that helps you overcome the challenges of AI no matter where you are in your adoption journey.
Whether you’d like to prototype an AI solution, streamline the deployment of your AI platform, or advance your MLOps strategies, Red Hat Consulting will provide support and mentorship.
llm-d provides well-lit paths for developers
Red Hat OpenShift AI includes llm-d, an open source framework that solves the challenges of distributed AI inference at scale.
Scaling models across a distributed fleet of GPUs offers a new level of control and observability. By disaggregating the inference pipeline into modular, intelligent services, enterprises can optimize complex LLMs at scale.
MCP servers for Red Hat OpenShift AI
Explore our curated collection of MCP servers from technology partners that integrate with Red Hat OpenShift AI.
Model Context Protocol (MCP) is an open source protocol that enables two-way connection and standardized communication between AI applications and external services.
Now, you can take advantage of these MCP servers to integrate enterprise tools and resources into your AI applications and agentic workflows.
Optimize with vLLM for fast and cost-effective inference at scale.
Red Hat AI Inference Server is part of the Red Hat AI platform. It is available as a standalone product and is included in both Red Hat Enterprise Linux® AI and Red Hat OpenShift® AI.
Red Hat AI use cases
Generative AI
Produce new content, like text and software code.
Red Hat AI lets you run the generative AI models of your choice, faster, with fewer resources, and lower inference costs.
Predictive AI
Connect patterns and forecast future outcomes.
With Red Hat AI, organizations can build, train, serve and monitor predictive models, all while maintaining consistency across the hybrid cloud.
Operationalized AI
Create systems that support the maintenance and deployment of AI at scale.
With Red Hat AI, manage and monitor the lifecycle of AI-enabled applications while saving on resources and ensuring compliance with privacy regulations.
Agentic AI
Build workflows that perform complex tasks with limited supervision.
Red Hat AI provides a flexible approach and stable foundation for building, managing and deploying agentic AI workflows within existing applications.
Partnerships
Get more from the Red Hat OpenShift AI platform by extending it with other integrated services and products.
NVIDIA and Red Hat offer customers a scalable platform that accelerates a diverse range of AI use cases with unparalleled flexibility.
Intel® and Red Hat help organizations accelerate AI adoption and rapidly operationalize their AI/ML models.
IBM and Red Hat provide open source innovation to accelerate AI development, including through IBM watsonx.ai™, an enterprise-ready AI studio for AI builders.
Starburst Enterprise and Red Hat support better and more timely insights through rapid data analysis across multiple disparate and distributed data platforms.
Scalable Kubernetes Infrastructure for AI platforms
Learn how to apply machine learning operations (MLOps) principles and practices to build AI-powered applications.
Collaborating through model workbenches
AI hub and gen AI studio allow both platform engineers and AI engineers to collaboratively bring gen AI models to production faster.
AI hub allows platform engineers to manage LLMs with centralized AI workloads alongside performance insights from third-party model validation. Gen AI studio gives AI engineers a hands-on environment to interact with models and rapidly prototype gen AI applications through the playground. Verify model suitability prior to lifecycle integration with access to a sandbox to experiment with chat and retrieval-augmented generation (RAG) flows.
Plus, data scientists can access pre-built or customized cluster images to work with models using the preferred IDE and frameworks. Red Hat OpenShift AI tracks changes to Jupyter, PyTorch, Kubeflow and other open source AI technologies.
Scaling model serving and safety with Red Hat OpenShift AI
Models can be served using an optimized version of vLLM (or other model servers of your choice) for integration into AI-enabled applications on-premises, in the public cloud, or at the edge. These models can be rebuilt, redeployed, and monitored based on changes to the source notebook.
Hallucination and bias can compromise the integrity of your models and make it harder to scale. To help maintain fairness, safety, and scalability, OpenShift AI allows data practitioners to monitor alignment between model outputs and training data.
Drift detection tools can monitor when live data used for model inference deviates from original training data. AI guardrails are also included to protect your model inputs and outputs from harmful information such as abusive and profane speech, personal data, or domain-specific restrictions.
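To make the idea of drift concrete, here is a minimal sketch of one common, generic way to quantify it: the population stability index (PSI), which compares how a feature is distributed in training data versus live data. This is an illustration only and does not describe how OpenShift AI implements drift detection; the function name, the bin count, and the 0.2 alert threshold (a widely used rule of thumb) are all assumptions.

```python
import math
from typing import List, Sequence


def population_stability_index(
    expected: Sequence[float], actual: Sequence[float], bins: int = 10
) -> float:
    """PSI between a training (expected) sample and a live (actual) sample."""
    lo, hi = min(expected), max(expected)
    # Equal-width buckets over the training range; fall back to 1.0 if all values are equal.
    width = (hi - lo) / bins or 1.0

    def bucket_shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp live values outside the training range
            counts[idx] += 1
        # A small floor avoids log(0) when a bucket is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    e = bucket_shares(expected)
    a = bucket_shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical data: live traffic that matches training, and live traffic that has shifted.
training = [i / 100 for i in range(1000)]
live_same = list(training)
live_shifted = [x + 5 for x in training]

psi_same = population_stability_index(training, live_same)
psi_shifted = population_stability_index(training, live_shifted)
drift_alert = psi_shifted > 0.2  # assumed threshold, tune per use case
```

An identical distribution yields a PSI of zero, while the shifted sample produces a large value and trips the assumed threshold.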
Solution Pattern
Red Hat AI applications with NVIDIA AI Enterprise
Create a RAG application
Red Hat OpenShift AI is a platform for building data science projects and serving AI-enabled applications. You can integrate all the tools you need to support retrieval-augmented generation (RAG), a method for getting AI answers from your own reference documents. When you connect OpenShift AI with NVIDIA AI Enterprise, you can experiment with large language models (LLMs) to find the optimal model for your application.
Build a pipeline for documents
To make use of RAG, you first need to ingest your documents into a vector database. In our example app, we embed a set of product documents in a Redis database. Since these documents change frequently, we can create a pipeline for this process that we’ll run periodically, so we always have the latest versions of the documents.
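The ingestion step described above can be sketched roughly as follows. This is a simplified illustration, not the example app's code: an in-memory class stands in for the Redis vector database, and a word-count function stands in for a real embedding model. All names here (embed, InMemoryVectorStore, run_ingest) and the sample documents are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts. A real pipeline would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class InMemoryVectorStore:
    """Hypothetical stand-in for the vector database."""

    def __init__(self) -> None:
        self._docs: Dict[str, Tuple[str, Counter]] = {}

    def upsert(self, doc_id: str, text: str) -> None:
        # Writing the same id again replaces the older version of the document.
        self._docs[doc_id] = (text, embed(text))

    def search(self, query: str, k: int = 1) -> List[Tuple[str, str]]:
        q = embed(query)
        ranked = sorted(
            self._docs.items(), key=lambda kv: cosine(q, kv[1][1]), reverse=True
        )
        return [(doc_id, text) for doc_id, (text, _) in ranked[:k]]


def run_ingest(store: InMemoryVectorStore, documents: Dict[str, str]) -> None:
    """One pipeline run: embed and store every document it is given."""
    for doc_id, text in documents.items():
        store.upsert(doc_id, text)


store = InMemoryVectorStore()
run_ingest(store, {
    "widget": "The widget ships with a two year warranty.",
    "gadget": "The gadget battery lasts ten hours.",
})
# A later periodic run picks up a revised document.
run_ingest(store, {"widget": "The widget ships with a three year warranty."})

top = store.search("How long is the widget warranty?")
```

Because each run upserts by document id, re-running the pipeline on a schedule keeps only the latest version of each document available for retrieval.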
Browse the LLM catalog
NVIDIA AI Enterprise gives you access to a catalog of different LLMs, so you can try different choices and select the model that delivers the best results. The models are hosted in the NVIDIA API catalog. Once you’ve set up an API token, you can deploy a model using the NVIDIA NIM model serving platform directly from OpenShift AI.
Choose the right model
As you test different LLMs, your users can rate each generated response. You can set up a Grafana monitoring dashboard to compare the ratings, as well as latency and response time for each model. Then you can use that data to choose the best LLM to use in production.
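As a rough sketch of the comparison behind such a dashboard, the snippet below aggregates user ratings and latency per model and picks a candidate. It is illustrative only: the Feedback record, the 1-to-5 rating scale, the model names, and the tie-breaking rule are assumptions, and a real setup would read these values from your monitoring stack rather than a hard-coded list.

```python
from dataclasses import dataclass
from statistics import mean, median
from typing import Dict, List


@dataclass
class Feedback:
    model: str
    rating: int        # assumed scale: 1 (poor) to 5 (excellent)
    latency_ms: float  # time taken to generate the response


def summarize(records: List[Feedback]) -> Dict[str, dict]:
    """Group feedback by model and compute per-model summary statistics."""
    by_model: Dict[str, List[Feedback]] = {}
    for r in records:
        by_model.setdefault(r.model, []).append(r)
    return {
        model: {
            "mean_rating": mean(x.rating for x in rs),
            "median_latency_ms": median(x.latency_ms for x in rs),
            "samples": len(rs),
        }
        for model, rs in by_model.items()
    }


def pick_best(summary: Dict[str, dict]) -> str:
    """Highest mean rating wins; lower median latency breaks ties."""
    return max(
        summary,
        key=lambda m: (summary[m]["mean_rating"], -summary[m]["median_latency_ms"]),
    )


# Hypothetical feedback collected while testing three models.
records = [
    Feedback("model-a", 4, 800.0),
    Feedback("model-a", 5, 900.0),
    Feedback("model-b", 4, 400.0),
    Feedback("model-b", 5, 500.0),
    Feedback("model-c", 3, 200.0),
]

summary = summarize(records)
best = pick_best(summary)
```

Here model-a and model-b tie on rating, so the lower median latency decides in favor of model-b. A different weighting of quality against speed may suit your application better.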
AI customer stories from Red Hat Summit and AnsibleFest 2025
Turkish Airlines doubled its deployment speed with organization-wide data access.
JCCM improved the region's environmental impact assessment (EIA) processes using AI.
Denizbank sped up time to market from days to minutes.
Hitachi operationalized AI across its entire business with Red Hat OpenShift AI.
How to try Red Hat OpenShift AI
Developer Sandbox
For developers and data scientists who want to experiment with building AI-enabled applications in a preconfigured and flexible environment.
60-day trial
When your organization is ready to evaluate the full capabilities of OpenShift AI, explore them with a 60-day product trial. An existing Red Hat OpenShift cluster is required.