
Red Hat Brings Distributed AI Inference To Production AI Workloads With Red Hat AI 3
(MENAFN- Mid-East Info) Red Hat's hybrid cloud-native AI platform streamlines AI workflows and offers powerful new inference capabilities, building the foundation for agentic AI at scale and empowering IT teams and AI engineers to innovate faster and more efficiently
October, 2025 – Red Hat, the world's leading provider of open-source solutions, today announced Red Hat AI 3, a significant evolution of its enterprise AI platform. Bringing together the latest innovations of Red Hat AI Inference Server, Red Hat Enterprise Linux AI (RHEL AI) and Red Hat OpenShift AI, the platform helps simplify the complexities of high-performance AI inference at scale, enabling organizations to more readily move workloads from proofs-of-concept to production and improve collaboration around AI-enabled applications. As enterprises move beyond AI experimentation, they face significant hurdles, including data privacy, cost control and managing diverse models.“The GenAI Divide: State of AI in Business” from the Massachusetts Institute of Technology NANDA project, highlights the reality of production AI, with approximately 95% of organizations failing to see measurable financial returns from ~$40 billion in enterprise spending. Red Hat AI 3 focuses on directly addressing these challenges by providing a more consistent, unified experience for CIOs and IT leaders to maximize their investments accelerated computing technologies. It makes it possible to rapidly scale and distribute AI workloads across hybrid, multi-vendor environments while simultaneously improving cross-team collaboration on next-generation AI workloads like agents, all on the same common platform. With a foundation built on open standards, Red Hat AI 3 meets organizations where they are on their AI journey, supporting any model on any hardware accelerator, from datacentres to public cloud and sovereign AI environments to the farthest edge. From training to“doing”: The shift to enterprise AI inference As organizations move AI initiatives into production, the emphasis shifts from training and tuning models to inference, the“doing” phase of enterprise AI. Red Hat AI 3 emphasizes scalable and cost-effective inference, by building on the wildly-successful vLLM and llm-d community projects and Red Hat's model optimization capabilities to deliver production-grade serving of large language models (LLMs). To help CIOs get the most out of their high-value hardware acceleration, Red Hat OpenShift AI 3.0 introduces the general availability of llm-d, which reimagines how LLMs run natively on Kubernetes. llm-d enables intelligent distributed inference, tapping the proven value of Kubernetes orchestration and the performance of vLLM, combined with key open source technologies like Kubernetes Gateway API Inference Extension, the NVIDIA Dynamo low latency data transfer library (NIXL), and the DeepEP Mixture of Experts (MoE) communication library, allowing organizations to:
- Lower costs and improve response times with intelligent inference-aware model scheduling and disaggregated serving.
- Improve response times and reduce latency with an intelligent, inference-aware load balancer built to handle the variable nature of AI workloads (a minimal sketch of this routing idea follows the list).
- Deliver operational simplicity and maximum reliability with prescriptive “Well-lit Paths” that streamline the deployment and optimization of massive models at scale.
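To make the routing idea above concrete, here is a deliberately simplified Python sketch of inference-aware scheduling: requests are steered toward the replica most likely to already hold the prompt's prefix in its KV cache, with queue depth as the tiebreaker. This illustrates the general technique only, not llm-d's actual scheduler; every name and data structure here is hypothetical.

```python
# Toy inference-aware router: prefer the replica whose KV cache likely
# already holds this prompt's prefix; break ties on queue depth.
# Hypothetical sketch -- not llm-d's real scheduling logic.
from dataclasses import dataclass, field

@dataclass
class Replica:
    name: str
    queue_depth: int = 0
    cached_prefixes: set = field(default_factory=set)

def prefix_key(prompt: str, block: int = 64) -> str:
    # Approximate KV-cache reuse by keying on the first `block` characters.
    return prompt[:block]

def route(prompt: str, replicas: list) -> Replica:
    key = prefix_key(prompt)
    # Cache affinity dominates; shorter queues win ties.
    best = max(replicas, key=lambda r: (key in r.cached_prefixes, -r.queue_depth))
    best.queue_depth += 1
    best.cached_prefixes.add(key)
    return best

replicas = [Replica("pod-a"), Replica("pod-b"), Replica("pod-c")]
for prompt in ["Summarize this contract ...", "Summarize this other contract ..."]:
    print(prompt[:30], "->", route(prompt, replicas).name)
```

In a real deployment, logic of this kind lives in the serving stack itself (for example, via the Gateway API Inference Extension mentioned above) rather than in application code.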
Model as a Service (MaaS) capabilities build on distributed inference and enable IT teams to act as their own MaaS providers, serving common models centrally and delivering on-demand access for both AI developers and AI applications. This allows for better cost management and supports use cases that cannot run on public AI services due to privacy or data concerns.
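As an illustration of what consuming such an internal MaaS endpoint can look like, the sketch below posts a chat request to a centrally hosted, OpenAI-compatible API of the kind vLLM-based servers expose. The URL, model id and token variable are placeholders for illustration, not part of Red Hat's announcement.

```python
# Minimal sketch of an internal "Model as a Service" client. The endpoint
# URL, model id and MAAS_TOKEN environment variable are hypothetical;
# vLLM-based servers accept this OpenAI-compatible request shape.
import os
import requests

MAAS_URL = "https://models.internal.example/v1/chat/completions"  # placeholder

payload = {
    "model": "granite-3-8b-instruct",  # placeholder id from the central catalog
    "messages": [{"role": "user", "content": "Triage this ticket: 'VPN drops hourly.'"}],
    "max_tokens": 64,
}

resp = requests.post(
    MAAS_URL,
    json=payload,
    headers={"Authorization": f"Bearer {os.environ['MAAS_TOKEN']}"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

Because the model is served centrally, individual teams get metered, on-demand access without standing up their own GPU-backed serving stacks.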
AI hub empowers platform engineers to explore, deploy and manage foundational AI assets. It provides a curated catalog of models, including validated and optimized gen AI models; a registry to manage model lifecycles; and a deployment environment to configure and monitor all AI assets running on OpenShift AI.
Gen AI studio provides a hands-on environment for AI engineers to interact with models and rapidly prototype new gen AI applications. With the AI assets endpoint feature, engineers can easily discover and consume available models and Model Context Protocol (MCP) servers, which are designed to streamline how models interact with external tools. The built-in playground offers an interactive, stateless environment to experiment with models, test prompts and tune parameters for use cases such as chat and retrieval-augmented generation (RAG).
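The kind of quick RAG prototype the playground targets can be reduced to a few lines. The sketch below is a dependency-free toy: it retrieves the snippet with the highest word overlap and grounds the prompt in it. A real prototype would use an embedding model and a vector store; everything here is illustrative and not part of the product.

```python
# Toy RAG flow: pick the most relevant snippet by word overlap, then
# build a grounded prompt. Deliberately minimal; real setups use an
# embedding model and a vector store.
def overlap(a: str, b: str) -> int:
    return len(set(a.lower().split()) & set(b.lower().split()))

docs = [
    "Red Hat AI 3 combines RHEL AI, OpenShift AI and the AI Inference Server.",
    "llm-d brings distributed vLLM inference natively to Kubernetes.",
    "The gen AI studio includes a stateless playground for prompt testing.",
]

question = "How does llm-d relate to Kubernetes?"
context = max(docs, key=lambda d: overlap(d, question))

prompt = (
    "Answer using only the context below.\n"
    f"Context: {context}\n"
    f"Question: {question}"
)
print(prompt)  # ready to send to any OpenAI-compatible chat endpoint
```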
New Red Hat validated and optimized models are included to simplify development. The curated selection includes popular open-source models like OpenAI's gpt-oss, DeepSeek-R1, and specialized models such as Whisper for speech-to-text and Voxtral Mini for voice-enabled agents.
