Tutorial emka

Is OpenAI’s New Open Responses API a Game Changer for Open Models?

Posted on January 20, 2026

Everyone is talking about OpenAI’s latest move to support open models, but we need to ask if this is a genuine helping hand or just a marketing strategy. In this lesson, we are going to explore the new “Open Responses” initiative, compare it with how Anthropic is doing things, and actually look at the code to see how you can build agents with it.

To understand why this matters, we first need to look at how we talk to AI models. Until recently, if you wanted to build an app, you mostly used the OpenAI “chat completions” style. It was the de facto standard. However, technology moves fast, and developers now want more control. We have seen this with Gemini’s Interactions API, and more recently Anthropic’s API has become very popular. Many developers are switching to the Claude API because of tools like “Claude Code.” In fact, even Chinese model providers like Moonshot AI and Z.ai are making their models compatible with Claude’s style just so they can work with these new coding tools. This created a problem for OpenAI because not everyone was using its standard anymore.

This is where the new “Open Responses” initiative comes in. OpenAI could not force everyone to use its proprietary method, so it proposed a new standard designed specifically for open models. The goal is that whether you use a model through Hugging Face, Ollama, or vLLM, the way you send requests stays the same, especially for complex tasks like calling tools or analyzing images. This is excellent news for us because it means we do not have to learn a completely new API for every new model that gets released. It is designed to be multi-provider by default, and big community players like Hugging Face and OpenRouter are already supporting it.
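
To make the multi-provider idea concrete, here is a minimal sketch in Python. It assumes each provider exposes an OpenAI-style /responses route under its base URL. The base URLs, the `PROVIDERS` table, and the `build_request` helper are illustrative assumptions rather than anything taken from the standard, so check each provider's documentation before relying on them.

```python
import json

# Assumed base URLs, for illustration only; verify them in each provider's docs.
PROVIDERS = {
    "huggingface": "https://router.huggingface.co/v1",
    "ollama": "http://localhost:11434/v1",
    "vllm": "http://localhost:8000/v1",
}


def build_request(provider: str, model: str, prompt: str) -> tuple[str, str]:
    """Return the URL and JSON body for a Responses-style request."""
    url = PROVIDERS[provider].rstrip("/") + "/responses"
    body = json.dumps({"model": model, "input": prompt})
    return url, body


if __name__ == "__main__":
    # The body is built the same way for every provider; only the URL differs.
    for name in PROVIDERS:
        url, body = build_request(name, "example-model", "Say hello.")
        print(url, body)
```

The point of the sketch is that only the URL changes from provider to provider, while the request body stays the same.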

Let’s dig into the technical details of how this standard actually works. The core concept here is an “agentic loop.” Instead of just sending text and getting text back, the system breaks things down into “Items.” An item can be a message, a function call, or even a reasoning state. This is very helpful because it allows your code to track whether a task is still in progress or has been completed. For example, if you are asking the AI to solve a math problem, the API can now handle the “reasoning tokens” (the AI’s internal thoughts) in a standard way. Previously, extracting these thoughts from different open models meant rewriting your code for each one. Now, the API supports both raw reasoning and summaries out of the box, making it much easier to see how the model arrived at an answer.
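
A small Python model can help picture what an “Item” is. This is a simplified sketch, not the schema from the specification: the `Item` class, its `content` dictionary, and the `summary` and `raw` keys are hypothetical names chosen for illustration, and the status strings are assumed to follow the style of OpenAI's Responses API.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """Simplified stand-in for one entry in a response's output list."""
    type: str                    # "message", "function_call", or "reasoning"
    status: str = "in_progress"  # becomes "completed" when the item is done
    content: dict = field(default_factory=dict)


def pending(items: list[Item]) -> list[Item]:
    """Return the items that are not completed yet."""
    return [item for item in items if item.status != "completed"]


def reasoning_text(items: list[Item], prefer_summary: bool = True) -> list[str]:
    """Collect reasoning from items, using the summary or the raw trace."""
    key = "summary" if prefer_summary else "raw"
    return [
        item.content[key]
        for item in items
        if item.type == "reasoning" and key in item.content
    ]


if __name__ == "__main__":
    items = [
        Item("reasoning", "completed", {"summary": "Add 2 and 2.", "raw": "2 + 2 = 4"}),
        Item("function_call", "in_progress", {"name": "calculator"}),
        Item("message"),
    ]
    print([item.type for item in pending(items)])
    print(reasoning_text(items))
```

Because every item carries its own type and status, a client can loop over the list and decide what to do next without guessing from free-form text.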

Another major feature of this standard is how it handles tools. We are moving toward a future where model providers act more like system providers. This means the tools might be hosted internally on the server, such as a sandbox for running code or a direct Google Search integration. The Open Responses standard supports these internal tools as well as external tools that you might build yourself. It also includes “tool choice,” which gives you control over whether the AI must use a tool, cannot use a tool, or can decide for itself. This level of control is something open models struggled with before, but this standard provides a blueprint for training them to handle these complex instructions reliably.
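
Here is a rough sketch of what “tool choice” could look like in a request body. It assumes Open Responses keeps the field names used by OpenAI's Responses API (`tools`, `tool_choice`, and the values `"required"`, `"none"`, and `"auto"`). The `get_weather` tool and the `build_payload` helper are hypothetical and exist only for this example.

```python
# Hypothetical external tool, described with a JSON Schema for its arguments.
WEATHER_TOOL = {
    "type": "function",
    "name": "get_weather",
    "description": "Look up the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

# Assumed values: "required" = must call a tool, "none" = must not call one,
# "auto" = the model decides for itself.
TOOL_CHOICES = ("required", "none", "auto")


def build_payload(model: str, prompt: str, tool_choice: str = "auto") -> dict:
    """Build a request body that states how the model may use tools."""
    if tool_choice not in TOOL_CHOICES:
        raise ValueError(f"tool_choice must be one of {TOOL_CHOICES}")
    return {
        "model": model,
        "input": prompt,
        "tools": [WEATHER_TOOL],
        "tool_choice": tool_choice,
    }


if __name__ == "__main__":
    print(build_payload("example-model", "What is the weather in Paris?", "required"))
```

Validating `tool_choice` on the client side is a design choice of this sketch: it catches typos before the request ever leaves your machine.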

Now, let us look at how to implement this in code using Python and Hugging Face. To start, you set up your client just like you normally would, but you point it to a model supported by the Open Responses standard, such as Kimi K2 or the Qwen models. Instead of the old chat completions method, you call client.responses.create. In your code, you can define instructions and inputs, and enabling features like event-based streaming is very straightforward. You ask for a streamed response, and the API sends back events one by one. This allows you to see the text appearing on your screen in real time. If you want to use tool calling, you simply define the tools in your request, and the model will return a “function calling item” with the name of the function and the arguments needed to run it.
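
The steps above can be sketched roughly as follows, using the `openai` Python package. Several things here are assumptions and not taken from the article: the router URL, the model ID, the `HF_TOKEN` environment variable, and the hypothetical `add` tool. The event type `response.output_text.delta` and the `function_call` item fields follow OpenAI's Responses API, and a given provider may differ.

```python
import json
import os

HF_BASE_URL = "https://router.huggingface.co/v1"  # assumption: Hugging Face router endpoint
MODEL = "moonshotai/Kimi-K2-Instruct"  # assumption: substitute any supported model ID

# Hypothetical tool, written in the flat function format of OpenAI's Responses API.
ADD_TOOL = {
    "type": "function",
    "name": "add",
    "description": "Add two integers.",
    "parameters": {
        "type": "object",
        "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
        "required": ["a", "b"],
    },
}


def make_client():
    """Create an OpenAI SDK client pointed at Hugging Face (needs `pip install openai`)."""
    from openai import OpenAI  # imported lazily so the helpers below work without the SDK
    return OpenAI(base_url=HF_BASE_URL, api_key=os.environ["HF_TOKEN"])


def stream_text(events) -> str:
    """Print text deltas as they arrive and return the full text."""
    chunks = []
    for event in events:
        if event.type == "response.output_text.delta":
            print(event.delta, end="", flush=True)
            chunks.append(event.delta)
    print()
    return "".join(chunks)


def extract_function_calls(output_items) -> list[tuple[str, dict]]:
    """Return (name, arguments) for each function-call item in a response."""
    return [
        (item.name, json.loads(item.arguments))
        for item in output_items
        if item.type == "function_call"
    ]


def demo() -> None:
    """Live example; requires network access and a valid HF_TOKEN."""
    client = make_client()
    events = client.responses.create(
        model=MODEL,
        instructions="You are a concise assistant.",
        input="Explain what an agentic loop is in one sentence.",
        stream=True,
    )
    stream_text(events)

    response = client.responses.create(
        model=MODEL,
        input="What is 19 + 23? Use the add tool.",
        tools=[ADD_TOOL],
    )
    for name, arguments in extract_function_calls(response.output):
        print(name, arguments)
```

To try it, set `HF_TOKEN` in your environment and call `demo()`. The two helper functions only read attributes from the events and items, so you can also test them offline with stand-in objects.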

If you prefer to run things locally on your own computer, you can use Ollama. The process is almost identical, which shows how useful this standard is. You initialize your client by pointing it to your localhost address, usually something like localhost:11434. You do not even need a real API key for Ollama; you can just put placeholder text there. Once your client is ready, you can write a script to check if the specific model you have loaded supports the Open Responses format. If it does, you can run the exact same client.responses commands. You might notice that loading the model takes a moment if it is not already running, but once it is active, you can stream reasoning traces and tool calls right from your own machine. This bridges the gap between powerful cloud models and the private models running on your laptop.
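
A local setup along these lines might look like the sketch below. It assumes Ollama serves an OpenAI-compatible route under /v1 on the default port and that the `openai` Python package is installed. The `supports_responses` helper is a hypothetical probe written for this example, not an official Ollama feature, and the model name you pass in is up to you.

```python
OLLAMA_BASE_URL = "http://localhost:11434/v1"  # assumption: OpenAI-compatible route on the default port


def make_local_client():
    """Create an OpenAI SDK client that talks to a local Ollama server."""
    from openai import OpenAI  # imported lazily so the probe below works without the SDK
    # Ollama does not check the key, so any placeholder text works here.
    return OpenAI(base_url=OLLAMA_BASE_URL, api_key="ollama")


def supports_responses(client, model: str) -> bool:
    """Probe the server with a tiny request.

    Returns False if the call fails for any reason: the server is not running,
    the model is not downloaded, or the endpoint is not supported.
    """
    try:
        client.responses.create(model=model, input="ping")
    except Exception:
        return False
    return True


def demo(model: str) -> None:
    """Live example; requires a running Ollama server with `model` available."""
    client = make_local_client()
    if supports_responses(client, model):
        response = client.responses.create(model=model, input="Say hello.")
        print(response.output_text)
    else:
        print(f"{model} did not answer a Responses-style request.")
```

Call `demo("your-model-name")` with a model you have already pulled. The first request can be slow while Ollama loads the model into memory.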

In conclusion, the Open Responses initiative is a significant step forward for the open-source community. By creating a unified standard, it allows powerful open models to behave more like the top-tier proprietary ones, specifically regarding agentic workflows and tool usage. While some companies might still lean toward Anthropic’s style due to the popularity of Claude Code, having a robust standard supported by OpenAI, Hugging Face, and Ollama ensures that developers have a reliable way to build complex systems. I highly recommend you try running a local model with Ollama using this new format to see the “reasoning tokens” in action. It gives you a fascinating look into how the AI “thinks” before it speaks.
