Our system did one thing, and it did it well: it turned natural-language questions into API calls.

The users were analysts, account managers, and operations leads. They knew what data they needed, but assembling it manually meant pulling from four dashboards, two BI tools, and a Salesforce report builder. With our system, they typed the request in plain English. A request like “Compile a report on sales volume for January through March 2026 for the Northeast region, broken down by city” was translated into an API call that the system could act on:

```json
{
  "description": "User requested sales volume for the given date range, here is the API call to get the response",
  "api_call": "/api/sales_volume",
  "post_body": {
    "start_date": "2026-01-01",
    "end_date": "2026-03-31",
    "region": "northeast"
  }
}
```

The rest of the pipeline was conventional engineering. The system dispatched the call to the right backend (we had integrations with internal reporting portals, Salesforce, and several homegrown services), applied a JSON query generated by a large language model (LLM) to filter and shape the response, and delivered the result via email, as a Drive document, or rendered as a chart in the browser.

By mid-2025, the system was generating several hundred reports a month. These reports were consumed by leadership and analysts and circulated to external stakeholders. The system had become the default way most teams pulled ad-hoc data.

The contract between the LLM and the rest of the system was the structured JSON object shown in the example above: a `description`, an `api_call` path, and a `post_body` of parameters.

We built it on Claude Sonnet 3.5 in early 2025. We upgraded to 3.7 without incident, and to 4.0 without incident. By the time Sonnet 4.5 shipped, we had …