NVIDIA has optimized OpenAI's new open-weight gpt-oss models to run on its GPUs, delivering fast, capable inference from the cloud to PCs. The new models, gpt-oss-20b and gpt-oss-120b, enable agentic AI applications such as web search, in-depth research, and more.
The models are flexible, open-weight reasoning models with adjustable reasoning-effort levels, and they support features such as instruction following and tool use. Trained on NVIDIA H100 GPUs, they can reason over long contexts, making them well suited to tasks like coding assistance and document comprehension.
Users can access the models through popular tools and frameworks such as Ollama, llama.cpp, and Microsoft AI Foundry Local. The new Ollama app provides a user-friendly interface for testing the models on RTX AI PCs with at least 24GB of VRAM; a minimal sketch of programmatic access through Ollama follows below.
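As an illustrative sketch (not taken from the NVIDIA post), the snippet below queries a locally served gpt-oss-20b through Ollama's OpenAI-compatible endpoint. The model tag, port, and the "Reasoning: high" system message are assumptions that depend on the local Ollama setup and how the serving stack exposes effort levels.

```python
# Hypothetical sketch: chat with gpt-oss-20b served locally by Ollama,
# using the OpenAI Python client against Ollama's OpenAI-compatible API.
from openai import OpenAI

# Ollama's local server; the API key is ignored but the client requires one.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

response = client.chat.completions.create(
    model="gpt-oss:20b",  # assumed Ollama model tag after pulling the model
    messages=[
        # Assumption: effort level requested via a system message; support varies by stack.
        {"role": "system", "content": "Reasoning: high"},
        {"role": "user", "content": "Summarize the key steps of an agentic web-search workflow."},
    ],
)
print(response.choices[0].message.content)
```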
Developers can also use the models through command-line interfaces or software development kits (SDKs) to power their applications and workflows; a brief SDK sketch follows below. NVIDIA continues to collaborate with the open-source community to optimize performance on RTX GPUs.
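For the SDK route, one possible sketch uses the community llama-cpp-python bindings on top of llama.cpp; the GGUF file name, context size, and generation settings below are placeholders, not values from the NVIDIA post.

```python
# Hypothetical sketch: loading a local GGUF build of gpt-oss-20b with llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="gpt-oss-20b.gguf",  # placeholder path to a local GGUF build
    n_gpu_layers=-1,                # offload all layers to the RTX GPU
    n_ctx=8192,                     # context window; adjust to available VRAM
)

result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain what an agentic AI workflow is."}],
    max_tokens=256,
)
print(result["choices"][0]["message"]["content"])
```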
The release of these open-weight models marks the next wave of AI innovation, enabling enthusiasts and developers to add reasoning capabilities to their AI-accelerated Windows applications.
Source: https://blogs.nvidia.com/blog/rtx-ai-garage-openai-oss