
ProteinMPNN
Predicts amino acid sequences from 3D structure of proteins.

AlphaFold2
A widely used model for predicting the 3D structures of proteins from their amino acid sequences.

AlphaFold2-Multimer
A widely used model for predicting the 3D structures of proteins from their amino acid sequences. This version of the container supports multimers, i.e. proteins made up of 2 or more polypeptide chains.

Deepseek-R1-Distill-Qwen-7B
GPU accelerated DeepSeek-R1-Distill-Qwen-7B inference through OpenAI compatible APIs. This distill model is fine-tuned based on open-source Qwen2.5-Math-7B model, using samples generated by DeepSeek-R1.

Eye Contact
Maxine Eye Contact NIM leverages state-of-the-art AI models to dynamically redirect a user's eye position towards the camera in real-time to simulate natural eye contact and enhance remote digital engagement.

VISTA-3D NIM
The VISTA-3D NIM accelerates ground truth data creation by streamlining segmentation and annotations with an interactive foundation model.

NVIDIA Retrieval QA E5 Embedding v5
GPU accelerated NVIDIA Retrieval QA E5 Embedding v5 inference

NVIDIA Retrieval QA Mistral 7B Embedding v2
GPU accelerated NVIDIA Retrieval QA Mistral 7B Embedding v2 inference

Snowflake Arctic Embed Large Embedding
GPU accelerated Snowflake Arctic Embed Large Embedding inference

NVIDIA Retrieval QA Mistral 4B Reranking v3
GPU accelerated NVIDIA Retrieval QA Mistral 4B Reranking v3 inference

NV-CLIP
NV-CLIP NIM microservice for multimodal embeddings model for image and text

ASR Parakeet CTC Riva 1.1b
RIVA ASR NIM delivers accurate English speech-to-text transcription and enables easy-to-use optimized ASR inference for large scale deployments.

NMT Megatron Riva 1b
Riva NMT NIM provide easy access to state-of-the-art neural machine translation (NMT) models, capable of translating text from one language to another with exceptional accuracy.

TTS FastPitch HifiGAN Riva
RIVA TTS NIM provide easy access to state-of-the-art text to speech models, capable of synthesizing English speech from text

MolMIM
MolMIM is a transformer-based model developed by NVIDIA for controlled small molecule generation.

Earth-2 FourCastNet
FourCastNet predicts global atmospheric dynamics of various weather / climate variables.

DeepSeek-R1
GPU accelerated DeepSeek-R1 inference through OpenAI compatible APIs

llama-3.1-8b-instruct
GPU accelerated Llama 3.1 8B inference through OpenAI compatible APIs

DiffDock
Diffdock predicts the 3D structure of the interaction between a molecule and a protein.

meta-llama-2-13b-chat
GPU accelerated Llama 2 13B inference through OpenAI compatible APIs

Llama-3.1-405b-instruct
GPU accelerated Llama 3.1 405B inference through OpenAI compatible APIs

meta-llama-2-7b-chat
GPU accelerated Llama 2 7B inference through OpenAI compatible APIs

Llama-3.2-90B-Vision-Instruct
The Llama 3.2 Vision instruction-tuned models are optimized for visual recognition, image reasoning, captioning, and answering general questions about an image.

ASR Parakeet 1.1b RNNT Multilingual
RIVA Parakeet 1.1b RNNT Multilingual ASR NIM delivers accurate speech-to-text transcription for 25 languages

Phi-3-Mini-4K-Instruct
GPU supported Phi-3-Mini-4K-Instruct inference through OpenAI compatible APIs

Cosmos-Predict1-7B-Text2World
GPU accelerated Cosmos-Predict1-7B-Text2World inference through OpenAI compatible APIs

StarCoderBase 15.5B
GPU accelerated StarCoderBase15.5B inference through OpenAI compatible APIs

Phind-CodeLlama-34B-v2-Instruct
Phind-CodeLlama-34B-v2 is a large language AI model based on CodeLlama, capable of generating code and proficient in Python, C/C++, TypeScript, Java, and more.

qwen-2.5-72b-instruct
GPU accelerated Qwen-2.5-7B-Instruct inference through OpenAI compatible APIs

Riva TTS NIM
RIVA TTS NIM provide easy access to state-of-the-art text to speech models, capable of synthesizing English speech from text

NVIDIA Retrieval QA E5 Embedding v5 PB October 2024 (PB 24h2)
NVIDIA Retrieval QA E5 Embedding v5 NIM Production Branch October 2024 (PB 24h2) offers a 9-month lifecycle for API stability, with monthly patches for high and critical software vulnerabilities.

NemoGuard JailbreakDetect
Container for classifying jailbreak attempts using NemoGuard JailbreakDetect

Mistral-7B-Instruct-v0.3
GPU accelerated Mistral-7B-Instruct-v0.3 inference through OpenAI compatible APIs

Llama-3.1-Swallow-70B-Instruct-v0.1
GPU accelerated Llama-3.1-Swallow-70B-Instruct inference through OpenAI compatible APIs

Llama3-8b-instruct
GPU accelerated Llama 3 8B inference through OpenAI compatible APIs

Llama-3.1-8b-base
GPU accelerated Llama 3.1 8B inference through OpenAI compatible APIs

Mixtral-8x7B-Instruct-v0.1
GPU accelerated Mixtral-8x7B-Instruct-v0.1 inference through OpenAI compatible APIs

Mixtral-8x22B-Instruct-v0.1
GPU accelerated Mixtral-8x22B-Instruct-v0.1 inference through OpenAI compatible APIs

nemotron-4-340b-instruct
GPU accelerated Nemotron-4-340B-Instruct inference through OpenAI compatible APIs

ASR Parakeet 1.1b CTC en-US
Parakeet 1.1b CTC en-US ASR NIM delivers accurate English speech-to-text transcription and enables easy-to-use optimized ASR inference for large scale deployments.

Gemma-2-9B-IT
Gemma-2-9B-IT comes from the family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models.

Nemotron-4-340B-Reward
GPU accelerated Nemotron-4-340B-Reward inference through OpenAI compatible APIs

NVIDIA Retrieval QA Llama 3.2 1B Embedding v2
The NVIDIA Retrieval QA Llama3.2 1b Embedding NIM is an embedding NIM optimized for multilingual and crosslingual text question-answering retrieval.

DeepSeek-R1-Distill-Llama-70B
GPU accelerated DeepSeek-R1-Distill-Llama-70B inference through OpenAI compatible APIs

Magpie TTS Multilingual NIM
Natural and expressive voices in multiple languages. For voice agents and brand ambassadors.

Evo 2 40B
Evo 2 is a biological foundation model that is able to integrate information over long genomic sequences.

CodeLlama-13B-Instruct
GPU accelerated CodeLlama-13B inference through OpenAI compatible APIs

Mistral-NeMo-Minitron-8B-Instruct
GPU accelerated Mistral-NeMo-Minitron-8B-Instruct through OpenAI compatible APIs

Deepseek-R1-Distill-Llama-8B
GPU accelerated DeepSeek-R1-Distill-Llama-8B inference through OpenAI compatible APIs

Earth-2 CorrDiff
Correction Diffusion (CorrDiff) is a generative AI model that downscales surface and atmospheric variables to improve the accuracy and resolution of weather data.

Llama-3.2-3B-Instruct
GPU accelerated Llama-3.2-3B-Instruct inference through OpenAI compatible APIs

Llama-3.3-70b-Instruct
GPU accelerated Llama 3.3 70B inference through OpenAI compatible APIs

Studio Voice
Studio Voice NIM leverages state-of-the-art AI models to enhance the input speech recorded through low quality microphones in noisy and reverberant environments to studio-recorded quality speech.

RFdiffusion
Generates new protein structures (binder designs, motif scaffoldings, etc.)

NeMo Retriever YOLOX Structured Images v1
NVIDIA NeMo™ Retriever NIM for YOLOX structured images v1 is a fine-tuned object detection model, trained specifically for detecting charts and tables in documents

Llama 3.1 NemoGuard 8B Content Safety
GPU accelerated Llama 3.1 NemoGuard 8B Content Safety inference through OpenAI compatible APIs

Llama-3.1-Nemotron-Ultra-253B-v1
GPU accelerated Llama-3.1-Nemotron-Ultra-253B-v1 inference through OpenAI compatible APIs

Qwen-2.5-7B-Instruct
GPU accelerated Qwen-2.5-7B-Instruct inference through OpenAI compatible APIs

MAISI NIM
MAISI NIM generated high-quality synthetic CT images with or without anatomical annotations.

Llama-3.3-Nemotron-Super-49B-v1
GPU accelerated Llama-3.3-Nemotron-Super-49B-v1 inference through OpenAI compatible APIs

ai-generated-image-detection
Hives AI Generated Image Detection model analyzes images and returns a confidence score on how likely the image is AI generated or modified in some way.

flux.1-dev
FLUX.1 is a state-of-the-art suite of image generation models

Llama-3.1-Nemotron-Nano-8B-v1
GPU accelerated Llama-3.1-Nemotron-Nano-8B-v1 inference through OpenAI compatible APIs

NeMo Retriever Page Elements v2
NVIDIA NeMo™ Retriever NIM for page elements v2 is a fine-tuned object detection model, trained specifically for detecting charts, tables, infographics, and titles on a document page.

Audio2Face-2D NIM
Audio2Face-2D animates a person's portrait photo using a driving audio.

Riva ASR NIM
RIVA ASR NIM delivers accurate English speech-to-text transcription and enables easy-to-use optimized ASR inference for large scale deployments.

Deepseek-R1-Distill-Qwen-14B
GPU accelerated DeepSeek-R1-Distill-Qwen-14B inference through OpenAI compatible APIs

CodeLlama-34B-Instruct
GPU accelerated CodeLlama-34B inference through OpenAI compatible APIs

Llama-3-SQLCoder-8B
GPU accelerated Llama-3-SQLCoder-8B inference through OpenAI compatible APIs

msa-search
The Multiple Sequence Alignment (MSA) Search NIM enables protein structure prediction models by providing fast, accurate sequence alignments of a query amino acid sequence against large databases of known proteins.

Starcoder2-7B
GPU accelerated StarCoder2-7B inference through OpenAI compatible APIs

Cosmos-Predict1-7B-Video2World
GPU accelerated Cosmos-Predict1-7B-Video2World inference through OpenAI compatible APIs

Openfold2
OpenFold2 is a protein structure prediction model from the OpenFold Consortium and the Alquraishi Laboratory.

NeMo Retriever Table Structure v1
NVIDIA NeMo™ Retriever NIM for table structure v1 is a fine-tuned object detection model, trained specifically for detecting the structure of complex tables.

nemoretriever-parse
nemoretriever-parse is a tiny autoregressive Vision Language Model (VLM) designed for document transcription from images. It outputs text in reading order.

Llama-3.2-1B-Instruct
GPU accelerated Llama-3.2-1B-Instruct inference through OpenAI compatible APIs

NeMo Retriever Graphic Elements v1
NVIDIA NeMo™ Retriever NIM for graphic elements v1 is a fine-tuned object detection model, trained specifically for detecting the elements of charts and tables in documents

DoMINO-Automotive-Aero NIM
This NIM serves as a demonstration of the potential of foundation models for early stage design evaluation in automotive aerodynamics.

bge-m3
GPU accelerated BAAI/bge-m3 inference.

bge-large-zh-v1-5
GPU accelerated BAAI/bge-large-zh-v1.5 inference

NVIDIA Retrieval QA Llama 3.2 1B Reranking v2
The NVIDIA Retrieval QA Llama 1B Reranking NIM is a NIM optimized for providing a logit score that represents how relevant a document(s) is to a given query, fine-tuned for multilingual and cross-lingual text question-answering retrieval.

Llama-3-Swallow-70B-Instruct-v0.1
GPU accelerated Llama-3-Swalow-70B-Instruct-v0.1 inference through OpenAI compatible APIs

Llama-3.1-8b-instruct PB October 2024 (PB 24h2)
GPU accelerated Llama 3.1 8B inference through OpenAI compatible APIs

Gemma-2-2B-IT
GPU accelerated Gemma-2-2B-IT inference through OpenAI compatible APIs

meta-llama-2-70b-chat
GPU accelerated Llama 2 70B inference through OpenAI compatible APIs

Llama-3.1-Swallow-8B-Instruct-v0.1
GPU accelerated Llama 3.1 Swallow 8B inference through OpenAI compatible APIs

Mistral-Nemo-12B-Instruct
GPU accelerated Mistral-NeMo-12B-Instruct inference through OpenAI compatible APIs

deepfake-image-detection
Hive’s Deepfake Image Detection model analyzes images and returns a confidence score on how likely the image contains a deepfake.

Llama-3.1-70b-instruct PB October 2024 (PB 24h2)
Llama 3.1 70B-Instruct NIM Production Branch October 2024 (PB 24h2) offers a 9-month lifecycle for API stability, with monthly patches for high and critical software vulnerabilities.
- 1