    Machine Learning & Research

Bringing Engineering Discipline to Prompts—Part 2 – O'Reilly

By Oliver Chambers | August 24, 2025 | 11 Mins Read


The following is Part 2 of 3 from Addy Osmani's original post "Context Engineering: Bringing Engineering Discipline to Prompts." Part 1 can be found here.

Great context engineering strikes a balance: include everything the model truly needs, but avoid irrelevant or excessive detail that could distract it (and drive up cost).

As Andrej Karpathy described it, context engineering is a delicate mixture of science and art.

The "science" part involves following certain principles and techniques to systematically improve performance. For example, if you're doing code generation, it's almost scientific that you should include relevant code and error messages; if you're doing question answering, it's logical to retrieve supporting documents and provide them to the model. There are established techniques like few-shot prompting, retrieval-augmented generation (RAG), and chain-of-thought prompting that we know (from research and trial) can boost results. There's also a science to respecting the model's constraints: every model has a context length limit, and overstuffing that window can not only increase latency and cost but potentially degrade quality if the important pieces get lost in the noise.

Karpathy summed it up well: "Too little or of the wrong form and the LLM doesn't have the right context for optimal performance. Too much or too irrelevant and the LLM costs might go up and performance might come down."

So the science is in techniques for selecting, pruning, and formatting context optimally: for instance, using embeddings to find the most relevant docs to include (so you're not inserting unrelated text) or compressing long histories into summaries. Researchers have even catalogued failure modes of long contexts, such as context poisoning (where an earlier hallucination in the context leads to further errors) and context distraction (where too much extraneous detail causes the model to lose focus). Knowing these pitfalls, a good engineer will curate the context carefully.
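As a minimal sketch of that selection-and-pruning step, the snippet below ranks candidate snippets by word overlap with the query (a crude stand-in for embedding similarity) and greedily packs the best ones into a word budget. The function names and the budget heuristic are illustrative, not any particular library's API.

```python
def rank_snippets(query: str, snippets: list[str]) -> list[str]:
    """Order snippets by how many query words they share (a crude
    stand-in for embedding similarity)."""
    q_words = set(query.lower().split())
    return sorted(
        snippets,
        key=lambda s: len(q_words & set(s.lower().split())),
        reverse=True,
    )

def pack_context(ranked: list[str], budget_words: int) -> list[str]:
    """Greedily include snippets until the word budget is exhausted."""
    chosen, used = [], 0
    for snip in ranked:
        cost = len(snip.split())
        if used + cost <= budget_words:
            chosen.append(snip)
            used += cost
    return chosen

snippets = [
    "The parser raises SyntaxError on unclosed brackets.",
    "Our deploy pipeline runs on every merge to main.",
    "SyntaxError handling lives in parser/errors.py.",
]
ranked = rank_snippets("why does the parser raise SyntaxError", snippets)
context = pack_context(ranked, budget_words=15)
```

A real system would swap the overlap score for embedding similarity, but the shape is the same: score, sort, and stop at the budget.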

Then there's the "art" side: the intuition and creativity born of experience.

This is about knowing LLMs' quirks and subtle behaviors. Think of it like a seasoned programmer who "just knows" how to structure code for readability: An experienced context engineer develops a feel for how to structure a prompt for a given model. For example, you might sense that one model tends to do better if you first outline a solution approach before diving into specifics, so you include an initial step like "Let's think step by step…" in the prompt. Or you notice that the model often misunderstands a particular term in your domain, so you preemptively clarify it in the context. These aren't in a manual; you learn them by observing model outputs and iterating. This is where prompt-crafting (in the old sense) still matters, but now it's in service of the larger context. It's similar to software design patterns: There's science in knowing common solutions but art in knowing when and how to apply them.

Let's explore a few common strategies and patterns context engineers use to craft effective contexts:

Retrieval of relevant knowledge: One of the most powerful techniques is retrieval-augmented generation. If the model needs facts or domain-specific data that isn't guaranteed to be in its training memory, have your system fetch that information and include it. For example, if you're building a documentation assistant, you might vector-search your documentation and insert the top matching passages into the prompt before asking the question. This way, the model's answer will be grounded in real data you provided rather than in its sometimes outdated internal knowledge. Key skills here include designing good search queries or embedding spaces to get the right snippet, and formatting the inserted text clearly (with citations or quotes) so the model knows to use it. When LLMs "hallucinate" facts, it's often because we failed to provide the actual fact; retrieval is the antidote to that.
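The formatting half of that skill can be sketched as a small prompt-assembly function. Here the retrieved passages are hardcoded stand-ins for real vector-search results; the instruction wording and the `(source)` citation style are just one reasonable convention.

```python
def build_rag_prompt(question: str, passages: list[tuple[str, str]]) -> str:
    """Format retrieved (source, text) passages with citations, then
    append the question, so the model grounds its answer in them."""
    lines = ["Answer using ONLY the documentation excerpts below.", ""]
    for i, (source, text) in enumerate(passages, 1):
        lines.append(f"[{i}] ({source}) {text}")
    lines += ["", f"Question: {question}"]
    return "\n".join(lines)

# Stand-ins for what a vector search over the docs might return.
passages = [
    ("auth.md", "API keys must be sent in the X-Api-Key header."),
    ("limits.md", "Requests are limited to 100 per minute per key."),
]
prompt = build_rag_prompt("How do I authenticate requests?", passages)
```

Numbering the excerpts also lets you ask the model to cite `[1]`, `[2]`, etc. in its answer, which makes grounding auditable.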

Few-shot examples and role instructions: This hearkens back to classic prompt engineering. If you want the model to output something in a particular style or format, show it examples. For instance, to get structured JSON output, you might include a couple of example inputs and outputs in JSON in the prompt, then ask for a new one. Few-shot context effectively teaches the model by example. Likewise, setting a system role or persona can guide tone and behavior ("You are an expert Python developer helping a user…"). These techniques are staples because they work: They bias the model toward the patterns you want. In the context-engineering mindset, prompt wording and examples are just one part of the context, but they remain crucial. In fact, you could say prompt engineering (crafting instructions and examples) is now a subset of context engineering; it's one tool in the toolkit. We still care a lot about phrasing and demonstrative examples, but we're also doing all these other things around them.
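A few-shot prompt for JSON output might be assembled like this. The persona line comes from the text above; the task/priority schema and the `Input:`/`Output:` framing are invented for illustration.

```python
import json

SYSTEM = "You are an expert Python developer helping a user."

def few_shot_prompt(examples: list[tuple[str, dict]], new_input: str) -> str:
    """Build a prompt: persona, worked input/output pairs in JSON,
    then the new input with a trailing 'Output:' for the model."""
    parts = [SYSTEM, ""]
    for text, expected in examples:
        parts.append(f"Input: {text}")
        parts.append(f"Output: {json.dumps(expected)}")
    parts.append(f"Input: {new_input}")
    parts.append("Output:")
    return "\n".join(parts)

examples = [
    ("Fix login bug ASAP", {"task": "Fix login bug", "priority": "high"}),
    ("Tidy the README sometime", {"task": "Tidy the README", "priority": "low"}),
]
prompt = few_shot_prompt(examples, "Upgrade the database this week")
```

Ending on a bare `Output:` nudges the model to complete the pattern rather than chat about it.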

Managing state and memory: Many applications involve multiple turns of interaction or long-running sessions. The context window isn't infinite, so a major part of context engineering is deciding how to handle conversation history or intermediate results. A common technique is summary compression: after every few interactions, summarize them and use the summary going forward instead of the full text. For example, Anthropic's Claude assistant automatically does this when conversations get lengthy, to avoid context overflow. (You'll see it produce a "[Summary of previous discussion]" that condenses earlier turns.) Another tactic is to explicitly write important facts to an external store (a file, database, etc.) and then later retrieve them when needed rather than carrying them in every prompt. This is like an external memory. Some advanced agent frameworks even let the LLM generate "notes to self" that get saved and can be recalled in future steps. The art here is figuring out what to keep, when to summarize, and how to resurface past information at the right moment. Done well, it lets an AI maintain coherence over very long tasks, something that pure prompting would struggle with.
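Summary compression can be sketched as a small history-compaction step. The `summarize` stub below just truncates turns; a real system would make an LLM call here to condense the old turns, and `keep_recent` is an invented knob.

```python
def summarize(turns: list[str]) -> str:
    """Stub summarizer: in practice this would be an LLM call that
    condenses the turns into a short paragraph."""
    return "[Summary of previous discussion] " + "; ".join(
        " ".join(t.split()[:4]) for t in turns
    )

def compact_history(history: list[str], keep_recent: int = 2) -> list[str]:
    """Replace all but the most recent turns with a single summary."""
    if len(history) <= keep_recent:
        return history
    old, recent = history[:-keep_recent], history[-keep_recent:]
    return [summarize(old)] + recent

history = [
    "User: My build fails with a missing dependency error.",
    "Assistant: Try pinning the package version in requirements.txt.",
    "User: That worked, but now tests are flaky.",
    "Assistant: Flaky tests often come from shared state.",
]
compacted = compact_history(history, keep_recent=2)
```

The recent turns stay verbatim (they usually carry the live task), while everything older is paid for once, at summary price.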

Tool use and environmental context: Modern AI agents can use tools (e.g., calling APIs, running code, web browsing) as part of their operations. When they do, each tool's output becomes new context for the next model call. Context engineering in this scenario means instructing the model when and how to use tools, and then feeding the results back in. For example, an agent might have a rule: "If the user asks a math question, call the calculator tool." After using it, the result (say 42) is inserted into the prompt: "Tool output: 42." This requires formatting the tool output clearly and possibly adding a follow-up instruction like "Given this result, now answer the user's question." A lot of work in agent frameworks (LangChain, etc.) is essentially context engineering around tool use: giving the model a list of available tools, along with syntactic guidelines for invoking them, and templating how to incorporate results. The key is that you, the engineer, orchestrate this dialogue between the model and the external world.
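The calculator example from the paragraph might look like this as code. The dispatch rule (any arithmetic-looking input triggers the tool) and the follow-up prompt wording are deliberately simplistic; real agents let the model itself decide when to invoke a tool.

```python
def calculator(expression: str) -> str:
    """Extremely restricted evaluator: digits and + - * / . only."""
    allowed = set("0123456789+-*/. ")
    if not set(expression) <= allowed:
        raise ValueError("unsupported expression")
    return str(eval(expression))  # acceptable only because input is filtered

TOOLS = {"calculator": calculator}

def answer_with_tools(question: str) -> str:
    """If the question looks like math, run the tool and build the
    follow-up prompt the model would see on the next call."""
    expr = "".join(ch for ch in question if ch in "0123456789+-*/. ").strip()
    if expr and any(op in expr for op in "+-*/"):
        result = TOOLS["calculator"](expr)
        return (f"Tool output: {result}\n"
                f"Given this result, now answer the user's question.")
    return "No tool needed."

followup = answer_with_tools("What is 6 * 7?")
```

The important part is the shape: tool output is serialized into labeled text and re-enters the context alongside an instruction telling the model what to do with it.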

Information formatting and packaging: We've touched on this, but it deserves emphasis. Often you have more information than fits or is useful to include in full, so you compress or format it. If your model is writing code and you have a large codebase, you might include just function signatures or docstrings rather than entire files, to give it context. If the user query is verbose, you might highlight the main question at the end to focus the model. Use headings, code blocks, tables: whatever structure best communicates the data. For example, rather than "User data: [massive JSON]… Now answer question." you might extract the few fields needed and present "User's Name: X, Account Created: Y, Last Login: Z." This is easier for the model to parse and also uses fewer tokens. In short, think like a UX designer, but your "user" is the LLM: design the prompt for its consumption.
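The field-extraction example might be implemented as below. The record keys and labels are invented for illustration; the point is that only the fields the task needs reach the prompt, as compact labeled lines.

```python
user_record = {
    "name": "Ada",
    "account_created": "2024-03-01",
    "last_login": "2026-01-20",
    "preferences": {"theme": "dark", "emails": False},
    "internal_flags": ["beta", "migrated"],  # noise for this task
}

def package_user(record: dict, fields: dict[str, str]) -> str:
    """fields maps record keys to the labels shown to the model;
    anything not listed stays out of the prompt."""
    return "\n".join(f"{label}: {record[key]}" for key, label in fields.items())

context = package_user(
    user_record,
    {"name": "User's Name",
     "account_created": "Account Created",
     "last_login": "Last Login"},
)
```

An explicit field map like this also doubles as documentation of exactly what user data the prompt exposes, which helps with privacy review.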

The impact of these techniques is huge. When you see an impressive LLM demo solving a complex task (say, debugging code or planning a multistep process), you can bet it wasn't just a single clever prompt behind the scenes. There was a pipeline of context assembly enabling it.

For instance, an AI pair programmer might implement a workflow like:

1. Search the codebase for relevant code.
2. Include those code snippets in the prompt with the user's request.
3. If the model proposes a fix, run tests in the background.
4. If tests fail, feed the failure output back into the prompt for the model to refine its solution.
5. Loop until tests pass.

Each step has carefully engineered context: The search results, the test outputs, etc., are each fed into the model in a controlled way. It's a far cry from "just prompt an LLM to fix my bug" and hoping for the best.

    The Problem of Context Rot

As we get better at assembling rich context, we run into a new problem: Context can actually poison itself over time. This phenomenon, aptly termed "context rot" by developer Workaccount2 on Hacker News, describes how context quality degrades as conversations grow longer and accumulate distractions, dead ends, and low-quality information.

The pattern is frustratingly common: You start a session with a well-crafted context and clear instructions. The AI performs beautifully at first. But as the conversation continues, especially if there are false starts, debugging attempts, or exploratory rabbit holes, the context window fills with increasingly noisy information. The model's responses gradually become less accurate and more confused, or it starts hallucinating.

Why does this happen? Context windows aren't just storage; they're the model's working memory. When that memory gets cluttered with failed attempts, contradictory information, or tangential discussions, it's like trying to work at a desk covered in old drafts and unrelated papers. The model struggles to identify what's currently relevant versus what's historical noise. Earlier mistakes in the conversation can compound, creating a feedback loop where the model references its own poor outputs and spirals further off track.

This is especially problematic in iterative workflows, which are exactly the kind of complex tasks where context engineering shines. Debugging sessions, code refactoring, document editing, and research projects naturally involve false starts and course corrections. But each failed attempt leaves traces in the context that can interfere with subsequent reasoning.

Practical strategies for managing context rot include:

    • Context pruning and refresh: Workaccount2's solution is "I work around it by regularly making summaries of instances, and then spinning up a new instance with fresh context and feed in the summary of the previous instance." This approach preserves the essential state while discarding the noise. You're essentially doing garbage collection on your context.
    • Structured context boundaries: Use clear markers to separate different phases of work. For example, explicitly mark sections as "Previous attempts (for reference only)" versus "Current working context." This helps the model understand what to prioritize.
    • Progressive context refinement: After significant progress, consciously rebuild the context from scratch. Extract the key decisions, successful approaches, and current state, then start fresh. It's like refactoring code: occasionally you need to clean up the accumulated cruft.
    • Checkpoint summaries: At regular intervals, have the model summarize what's been accomplished and what the current state is. Use these summaries as seeds for fresh context when starting new sessions.
    • Context windowing: For very long tasks, break them into phases with natural boundaries where you can reset context. Each phase gets a clean start with only the essential carry-over from the previous phase.
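Several of these strategies share one mechanism: rebuild a fresh context from only the durable facts. A minimal sketch, assuming an invented `DECISION:`/`STATE:` tagging convention for context lines:

```python
def checkpoint(context: list[str]) -> list[str]:
    """Keep decisions and the latest state line; drop everything else
    (failed attempts, tangents) as context garbage collection."""
    decisions = [line for line in context if line.startswith("DECISION:")]
    states = [line for line in context if line.startswith("STATE:")]
    return ["[Checkpoint] Fresh context rebuilt."] + decisions + states[-1:]

def maybe_refresh(context: list[str], max_lines: int = 6) -> list[str]:
    """Refresh only once the working context grows past a threshold."""
    return checkpoint(context) if len(context) > max_lines else context

context = [
    "DECISION: Use SQLite for the prototype.",
    "Tried ORM migration, failed with version error.",
    "Retried migration with --fake flag, still failing.",
    "STATE: Schema v1 applied.",
    "Long tangent about CI runner images.",
    "DECISION: Pin alembic to 1.13.",
    "STATE: Schema v2 applied, tests green.",
]
fresh = maybe_refresh(context)
```

In practice the "keep" rule would itself be an LLM summarization call, but deciding what counts as durable (decisions, current state) versus disposable (attempts, tangents) is the engineering judgment this section is about.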

This challenge also highlights why "just dump everything into the context" isn't a viable long-term strategy. Like good software architecture, good context engineering requires intentional information management: deciding not just what to include but also when to exclude, summarize, or refresh.


AI tools are quickly moving beyond chat UX to sophisticated agent interactions. Our upcoming AI Codecon event, Coding for the Agentic World, will highlight how developers are already using agents to build innovative and effective AI-powered experiences. We hope you'll join us on September 9 to explore the tools, workflows, and architectures defining the next era of programming. It's free to attend. Register now to save your seat.
