Recursive language models (RLMs) are an inference technique developed by researchers at MIT CSAIL that treat a long prompt as an environment external to the model. Instead of forcing the entire prompt into the model’s context window, the framework allows the LLM to programmatically examine, decompose, and recursively call itself over snippets of the text.

Rather than expanding context windows or summarizing old information, the MIT team reframes long-context reasoning as a systems problem. By letting models treat prompts as something they can inspect with code, RLMs allow LLMs to reason over millions of tokens without retraining. This offers enterprises a practical path to long-horizon tasks like codebase analysis, legal review, and multi-step reasoning that routinely break today’s models.

Because the framework is designed as a wrapper around existing models, it can serve as a drop-in replacement for applications that make direct calls to LLMs.

The LLM context problem

While frontier models are becoming increasingly sophisticated at reasoning, their ability to process massive amounts of information is not scaling at the same rate. This bottleneck is driven by two distinct limitations: the hard physical constraint on how much text a model can process at once (context length) and “context rot,” the degradation in performance that sets in as more tokens accumulate in the context.

The challenge, the researchers argue, is whether it’s possible to scale the effective context size of general-purpose LLMs by orders of magnitude without retraining them. This capability is becoming increasingly important for enterprise applications, where LLMs are adopted for long-horizon tasks requiring the processing of millions of tokens — a challenge Zhang argues can’t be solved by simply expanding context windows.

“There is an entropy argument that implies you …