NVIDIA has announced the availability of NVIDIA NIM, inference microservices that provide models as optimized containers. These microservices enable the world’s 28 million developers to build generative AI applications easily. These applications can be deployed on clouds, data centers, or workstations, significantly reducing development time from weeks to minutes.
As generative AI applications grow more complex, often combining multiple models to generate text, images, video, and speech, NVIDIA NIM boosts developer productivity by offering a standardized way to integrate generative AI into applications. NIM also helps enterprises maximize their infrastructure investments: running Meta Llama 3 8B as a NIM, for example, produces up to three times more generative AI tokens on accelerated infrastructure than running the model without NIM, letting enterprises generate more responses from the same compute resources.
Broad Industry Adoption
Nearly 200 technology partners, including Cadence, Cloudera, Cohesity, DataStax, NetApp, Scale AI, and Synopsys, are integrating NVIDIA NIM to expedite generative AI deployments for domain-specific applications such as copilots, code assistants, and digital human avatars. Hugging Face is also offering NIM, starting with Meta Llama 3.
Jensen Huang, NVIDIA’s founder and CEO, emphasized NIM’s accessibility and impact, stating, “Every enterprise is looking to add generative AI to its operations, but not every enterprise has a dedicated team of AI researchers.” NVIDIA NIM aims to put generative AI within reach of nearly every organization.
Enterprises can deploy AI applications using NIM through the NVIDIA AI Enterprise software platform. Starting next month, members of the NVIDIA Developer Program can access NIM for free for research, development, and testing on their preferred infrastructure.
Powering Generative AI Across Modalities
NIM containers are pre-built for GPU-accelerated inference and bundle NVIDIA CUDA software, NVIDIA Triton Inference Server, and NVIDIA TensorRT-LLM. Over 40 models, including Databricks DBRX, Google’s Gemma, Meta Llama 3, Microsoft Phi-3, and Mistral Large, are available as NIM endpoints on ai.nvidia.com.
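The hosted NIM endpoints follow an OpenAI-compatible chat completions convention, so they can be queried with a few lines of Python. Below is a minimal sketch, assuming the `openai` client library, an API key from ai.nvidia.com stored in the `NVIDIA_API_KEY` environment variable, and `meta/llama3-8b-instruct` as the model ID; the base URL and model ID reflect NVIDIA API catalog conventions and may differ for your account.

```python
# Minimal sketch: querying a hosted NIM endpoint through its
# OpenAI-compatible chat completions API.
import os

from openai import OpenAI

client = OpenAI(
    # Assumed NVIDIA API catalog base URL for hosted NIM endpoints.
    base_url="https://integrate.api.nvidia.com/v1",
    api_key=os.environ["NVIDIA_API_KEY"],  # key obtained from ai.nvidia.com
)

response = client.chat.completions.create(
    model="meta/llama3-8b-instruct",  # assumed ID for one of the 40+ hosted models
    messages=[{"role": "user", "content": "Summarize NVIDIA NIM in one sentence."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```

Because the interface is OpenAI-compatible, the same client code should work against a self-hosted NIM container by pointing `base_url` at the local server instead.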
Developers can also access NVIDIA NIM microservices for Meta Llama 3 models via the Hugging Face AI platform, deploying and running Llama 3 NIM on Hugging Face Inference Endpoints powered by NVIDIA GPUs.
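Once such an endpoint is deployed, it can be queried over HTTPS. Here is a minimal sketch, assuming a placeholder endpoint URL taken from the Hugging Face console, a Hugging Face access token in `HF_TOKEN`, and that the NIM container serves the OpenAI-style `/v1/chat/completions` route:

```python
# Minimal sketch: calling a Llama 3 NIM running on a Hugging Face
# Inference Endpoint. The endpoint URL below is a placeholder; use the
# URL shown in the Hugging Face console after the endpoint is created.
import os

import requests

ENDPOINT_URL = "https://YOUR-ENDPOINT.endpoints.huggingface.cloud/v1/chat/completions"

resp = requests.post(
    ENDPOINT_URL,
    headers={"Authorization": f"Bearer {os.environ['HF_TOKEN']}"},
    json={
        "model": "meta/llama3-8b-instruct",  # assumed model ID inside the container
        "messages": [{"role": "user", "content": "Hello, Llama 3!"}],
        "max_tokens": 64,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```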
Extensive Ecosystem Support
Platform providers such as Canonical, Red Hat, Nutanix, and VMware support NIM on open-source KServe or enterprise solutions. AI application companies, including Hippocratic AI, Glean, Kinetica, and Redis, are deploying NIM to power generative AI inference. Leading AI tools and MLOps partners like Amazon SageMaker, Microsoft Azure AI, Dataiku, DataRobot, and others have embedded NIM into their platforms, enabling developers to build and deploy domain-specific generative AI applications with optimized inference.
Global system integrators and service delivery partners like Accenture, Deloitte, Infosys, Latentview, Quantiphi, SoftServe, TCS, and Wipro have developed NIM competencies to help enterprises quickly develop and deploy production AI strategies. Enterprises can run NIM-enabled applications on NVIDIA-Certified Systems from manufacturers such as Cisco, Dell Technologies, Hewlett Packard Enterprise, Lenovo, and Supermicro, as well as on servers from ASRock Rack, ASUS, GIGABYTE, Ingrasys, Inventec, Pegatron, QCT, Wistron, and Wiwynn. NIM microservices are also integrated into major cloud platforms, including Amazon Web Services, Google Cloud, Microsoft Azure, and Oracle Cloud Infrastructure.
Industry Adoption and Use Cases
Leading companies are leveraging NIM for diverse applications across industries. Foxconn uses NIM for domain-specific LLMs in AI factories, smart cities, and electric vehicles. Pegatron employs NIM for Project TaME to advance local LLM development for various industries. Amdocs uses NIM for a customer billing LLM, significantly reducing costs and latency while improving accuracy. ServiceNow integrates NIM microservices within its Now AI multimodal model, providing customers with fast and scalable LLM development and deployment.
Availability
Developers can experiment with NVIDIA microservices at ai.nvidia.com at no charge. Enterprises can deploy production-grade NIM microservices with NVIDIA AI Enterprise on NVIDIA-Certified Systems and leading cloud platforms. Developers can also register for free access to NIM for research, development, and testing, which is expected to open next month.