Today’s LLMs excel at reasoning, but they can still struggle with context. This is particularly true in real-time ordering systems like Instacart. Instacart CTO Anirban Kundu calls it the “brownie recipe problem.” It’s not as simple as telling an LLM “I want to make brownies.” To be truly assistive in planning the meal, the model must go beyond that simple directive to understand what’s available in the user’s market based on their preferences (say, organic eggs versus regular eggs) and factor in what’s deliverable in their geography so food doesn’t spoil, among other critical factors.

For Instacart, the challenge is balancing latency against the right mix of context, with the goal of delivering experiences in, ideally, under one second. “If reasoning itself takes 15 seconds, and if every interaction is that slow, you’re gonna lose the user,” Kundu said at a recent VB event.

Mixing reasoning, real-world state, personalization

In grocery delivery, there’s a “world of reasoning” and a “world of state” (what’s available in the real world), Kundu noted, both of which an LLM must understand alongside user preference. But it’s not as simple as loading the entirety of a user’s purchase history and known interests into a reasoning model. “Your LLM is gonna blow up into a size that will be unmanageable,” said Kundu.

To get around this, Instacart splits processing into chunks. First, data is fed into a large foundation model that can understand intent and categorize products. That processed data is then routed to small language models (SLMs) designed for catalog context (the types of food or other items that work together) and semantic understanding …
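Kundu’s description suggests a fan-out pattern: one large-model call to interpret intent and categorize products, then parallel calls to narrow SLMs so the whole interaction stays inside the latency budget. Below is a minimal sketch of that pattern in Python; the names (understand_intent, catalog_slm, semantic_slm) are hypothetical stand-ins for the models involved, and the stubbed returns are placeholders. This illustrates the routing idea, not Instacart’s actual implementation.

```python
# A minimal sketch of the split-processing pattern described above.
# All model clients are hypothetical stubs, not Instacart APIs.
import asyncio
from dataclasses import dataclass


@dataclass
class Intent:
    """Output of the large foundation model: what the user wants, categorized."""
    goal: str                      # e.g. "make brownies"
    product_categories: list[str]  # e.g. ["eggs", "flour", "cocoa"]


async def understand_intent(utterance: str) -> Intent:
    """Stage 1: a large foundation model parses intent and categorizes
    products. Stubbed here; in practice this would be an LLM call."""
    return Intent(goal="make brownies",
                  product_categories=["eggs", "flour", "cocoa", "sugar"])


async def catalog_slm(intent: Intent, region: str) -> dict:
    """Stage 2a: a small model scoped to catalog context -- which items
    exist, work together, and are deliverable in the user's region."""
    return {c: f"{region}-available {c}" for c in intent.product_categories}


async def semantic_slm(intent: Intent, preferences: dict) -> dict:
    """Stage 2b: a small model for semantic understanding of preferences,
    e.g. mapping 'eggs' to 'organic eggs' for this particular user."""
    return {c: preferences.get(c, c) for c in intent.product_categories}


async def plan_order(utterance: str, region: str, preferences: dict) -> dict:
    # Large model first: intent plus product categorization.
    intent = await understand_intent(utterance)
    # Route the processed data to the SLMs concurrently, so the second
    # stage does not stack latencies on top of the reasoning pass.
    catalog, semantics = await asyncio.gather(
        catalog_slm(intent, region),
        semantic_slm(intent, preferences),
    )
    return {"goal": intent.goal, "catalog": catalog, "personalized": semantics}


if __name__ == "__main__":
    result = asyncio.run(plan_order(
        "I want to make brownies",
        region="seattle",
        preferences={"eggs": "organic eggs"},
    ))
    print(result)
```

The design point the sketch tries to capture is that each small model sees only the slice of context it needs (catalog state or user preference), rather than one model carrying the user’s entire history, and the SLM calls run in parallel to keep the interaction near the sub-second target.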