AI agents are quickly evolving from mere chat interfaces into sophisticated autonomous workers that handle complex, time-intensive tasks. As organizations deploy agents to train machine learning (ML) models, process large datasets, and run extended simulations, the Model Context Protocol (MCP) has emerged as a standard for agent-server integrations. But a critical challenge remains: these operations can take minutes or hours to complete, far exceeding typical session timeframes. By using Amazon Bedrock AgentCore and Strands Agents to implement persistent state management, you can enable seamless, cross-session task execution in production environments. Imagine your AI agent initiating a multi-hour data processing job, your user closing their laptop, and the system seamlessly retrieving completed results when the user returns days later, with full visibility into task progress, outcomes, and errors. This capability transforms AI agents from conversational assistants into reliable autonomous workers that can handle enterprise-scale operations. Without these architectural patterns, you’ll encounter timeout errors, inefficient resource utilization, and potential data loss when connections terminate unexpectedly.
In this post, we provide a comprehensive approach to achieve this. First, we introduce a context messaging strategy that maintains continuous communication between servers and clients during extended operations. Next, we develop an asynchronous task management framework that allows your AI agents to initiate long-running processes without blocking other operations. Finally, we demonstrate how to bring these strategies together with Amazon Bedrock AgentCore and Strands Agents to build production-ready AI agents that can handle complex, time-intensive operations reliably.
Common approaches to handle long-running tasks
When designing MCP servers for long-running tasks, you might face a fundamental architectural decision: should the server maintain an active connection and provide real-time updates, or should it decouple task execution from the initial request? This choice leads to two distinct approaches: context messaging and async task management.
Using context messaging
The context messaging approach maintains continuous communication between the MCP server and client throughout task execution. This is achieved by using MCP’s built-in context object to send periodic notifications to the client. This approach is optimal for scenarios where tasks are typically completed within 10–15 minutes and network connectivity remains stable. The context messaging approach offers these advantages:
- Straightforward server implementation
- No additional polling logic required
- Straightforward client implementation
- Minimal overhead
Using async task management
The async task management approach separates task initiation from execution and result retrieval. When the MCP tool is invoked, it immediately returns a task initiation message while executing the task in the background. This approach excels in demanding enterprise scenarios where tasks might run for hours, users need flexibility to disconnect and reconnect, and system reliability is paramount. The async task management approach provides these benefits:
- True fire-and-forget operation
- Safe client disconnection while tasks continue processing
- Data loss prevention through persistent storage
- Support for long-running operations (hours)
- Resilience against network interruptions
- Asynchronous workflows
Context messaging
Let’s begin by exploring the context messaging approach, which provides a straightforward solution for handling moderately long operations while maintaining active connections. This approach builds directly on existing capabilities of MCP and requires minimal additional infrastructure, making it an excellent starting point for extending your agent’s processing time limits.
Imagine you’ve built an MCP server for an AI agent that helps data scientists train ML models. When a user asks the agent to train a complex model, the underlying process might take 10–15 minutes, far beyond the typical 30-second to 2-minute HTTP timeout limit in most environments. Without a proper strategy, the connection would drop, the operation would fail, and the user would be left frustrated. In an MCP client implementation that uses the Streamable HTTP transport, these timeout constraints are particularly limiting: when task execution exceeds the timeout limit, the connection aborts and the agent’s workflow is interrupted.
This is where context messaging comes in. The following diagram illustrates the workflow when implementing the context messaging approach. Context messaging uses the built-in context object of MCP to send periodic signals from the server to the MCP client, effectively keeping the connection alive throughout longer operations. Think of it as sending “heartbeat” messages that help prevent the connection from timing out.
Here is a code example that implements context messaging:
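A minimal sketch of such a tool follows. It uses only the standard library so it can run anywhere: `RecordingContext` is a hypothetical stand-in for the FastMCP `Context` object, exposing the same two awaitable methods the text refers to (`info()` and `report_progress()`), and the tool name, parameters, and simulated epoch loop are illustrative assumptions. In a real FastMCP server, the function would be registered with `@mcp.tool()` and `ctx` would be annotated as `Context`.

```python
import asyncio


class RecordingContext:
    """Hypothetical stand-in for the FastMCP Context object (local testing only)."""

    def __init__(self):
        self.events = []

    async def info(self, message):
        self.events.append(("info", message))

    async def report_progress(self, progress, total=None):
        self.events.append(("progress", progress, total))


async def model_training(model_name, epochs, ctx, seconds_per_epoch=0.0):
    """Simulate training and send one progress notification per epoch."""
    await ctx.info(f"Starting training for {model_name}")
    for epoch in range(1, epochs + 1):
        await asyncio.sleep(seconds_per_epoch)  # placeholder for one epoch of real work
        # Each notification doubles as a heartbeat that keeps the connection active.
        await ctx.report_progress(progress=epoch, total=epochs)
    await ctx.info("Training complete")
    return f"{model_name} trained for {epochs} epochs"


if __name__ == "__main__":
    demo_ctx = RecordingContext()
    print(asyncio.run(model_training("demo-model", 3, demo_ctx)))
    print(demo_ctx.events)
```

Because the function only relies on `ctx.info()` and `ctx.report_progress()`, the same body can be exercised locally with the stub and later attached to a real server context.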
The key element here is the Context parameter in the tool definition. When you include a parameter with the Context type annotation, FastMCP automatically injects this object, giving you access to methods such as ctx.info() and ctx.report_progress(). These methods send messages to the connected client without terminating tool execution.
The report_progress() calls within the training loop serve as the critical heartbeat messages, making sure the MCP connection stays active throughout the extended processing period.
For many real-world scenarios, exact progress can’t be easily quantified, such as when processing unpredictable datasets or making external API calls. In these cases, you can implement a time-based heartbeat system:
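One way to sketch a time-based heartbeat, assuming only that the context object offers an awaitable `info()` method as described earlier. The helper names `heartbeat` and `run_with_heartbeat`, and the default 5-second interval, are illustrative choices rather than part of any library.

```python
import asyncio


async def heartbeat(ctx, stop_event, interval=5.0):
    """Send a status message every `interval` seconds until stop_event is set."""
    elapsed = 0.0
    while not stop_event.is_set():
        try:
            # Wake early if the main task finishes; otherwise time out and report.
            await asyncio.wait_for(stop_event.wait(), timeout=interval)
        except asyncio.TimeoutError:
            elapsed += interval
            await ctx.info(f"Still working ({elapsed:.0f}s elapsed)")


async def run_with_heartbeat(ctx, work, interval=5.0):
    """Await the coroutine `work` while a heartbeat task runs alongside it."""
    stop_event = asyncio.Event()
    timer = asyncio.create_task(heartbeat(ctx, stop_event, interval))
    try:
        return await work
    finally:
        stop_event.set()  # signal the timer to stop
        await timer       # wait for it to exit cleanly
```

Inside a tool, the long-running call would be wrapped as `await run_with_heartbeat(ctx, do_the_work())`, where `do_the_work` is whatever coroutine performs the actual processing.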
This pattern creates an asynchronous timer that runs alongside your main task, sending regular status updates every few seconds. Using asyncio.Event() for coordination allows a clean shutdown of the timer when the main work is completed.
When to use context messaging
Context messaging works best when:
- Tasks take 1–15 minutes to complete*
- Network connections are generally stable
- The client session can remain active throughout the operation
- You need real-time progress updates during processing
- Tasks have predictable, finite execution times with clear termination conditions
*Note: “15 minutes” is based on the maximum time for synchronous requests that Amazon Bedrock AgentCore provides. More details about Bedrock AgentCore service quotas can be found at Quotas for Amazon Bedrock AgentCore. If the infrastructure hosting the agent doesn’t enforce hard time limits, be extremely careful when using this approach for tasks that might hang or run indefinitely. Without proper safeguards, a stuck task could maintain an open connection indefinitely, leading to resource depletion, unresponsive processes, and potentially system-wide stability issues.
Here are some important limitations to consider:
- Continuous connection required – The client session must remain active throughout the entire operation. If the user closes their browser or the network drops, the work is lost.
- Resource consumption – Keeping connections open consumes server and client resources, potentially increasing costs for long-running operations.
- Network dependency – Network instability can still interrupt the process, requiring a full restart.
- Ultimate timeout limits – Most infrastructures have hard timeout limits that can’t be circumvented with heartbeat messages.
Therefore, for truly long-running operations that might take hours, or for scenarios where users need to disconnect and reconnect later, you’ll need the more robust asynchronous task management approach.
Async process administration
Unlike the context messaging approach, where clients must maintain continuous connections, the async task management pattern follows a “fire and forget” model:
- Task initiation – The client makes a request to start a task and immediately receives a task ID
- Background processing – The server executes the work asynchronously, with no client connection required
- Status checking – The client can reconnect at any time to check progress using the task ID
- Result retrieval – After the task is completed, results remain available for retrieval whenever the client reconnects
The following figure illustrates the workflow in the asynchronous task management approach.
This pattern mirrors how you interact with batch processing systems in enterprise environments: submit a job, disconnect, and check back later when convenient. Here’s a practical implementation that demonstrates these principles:
This implementation creates a task management system with three distinct MCP tools:
- model_training() – The entry point that initiates a new task. Rather than performing the work directly, it:
  - Generates a unique task identifier using a universally unique identifier (UUID)
  - Creates an initial task record in the storage dictionary
  - Launches the actual processing as a background task using asyncio.create_task()
  - Returns immediately with the task ID, allowing the client to disconnect
- check_task_status() – Allows clients to monitor progress at their convenience by:
  - Looking up the task by ID in the storage dictionary
  - Returning current status and progress information
  - Providing appropriate error handling for missing tasks
- get_task_results() – Retrieves completed results when ready by:
  - Verifying the task exists and is completed
  - Returning the results stored during background processing
  - Providing clear error messages when results aren’t ready
The actual work happens in the private _execute_model_training() function, which runs independently in the background after the initial client request is completed. It updates the task’s status and progress in the shared storage as it progresses, making this information available for subsequent status checks.
Limitations to consider
Although the async task management approach helps solve connectivity issues, it introduces its own set of limitations:
- User experience friction – The approach requires users to manually check task status, remember task IDs across sessions, and explicitly request results, increasing interaction complexity.
- Volatile memory storage – Using in-memory storage (as in our example) means the tasks and results are lost if the server restarts, making the solution unsuitable for production without persistent storage.
- Serverless environment constraints – In ephemeral serverless environments, instances are automatically terminated after periods of inactivity, causing the in-memory task state to be permanently lost. This creates a paradoxical situation where the solution designed to handle long-running operations becomes vulnerable to the very duration it aims to support. Unless users check in regularly to keep the session from timing out, both tasks and results could vanish.
Moving toward a robust solution
To address these critical limitations, you need external persistence that survives both server restarts and instance terminations. This is where integration with dedicated storage services becomes essential. By using external agent memory storage systems, you can fundamentally change where and how task information is maintained. Instead of relying on the MCP server’s volatile memory, this approach uses persistent external agent memory storage services that remain available regardless of server state.
The key innovation in this enhanced approach is that when the MCP server runs a long-running task, it writes the interim or final results directly into external memory storage that the agent can access, such as Amazon Bedrock AgentCore Memory, as illustrated in the following figure. This helps create resilience against two types of runtime failures:
- The instance running the MCP server can be terminated due to inactivity after task completion
- The instance hosting the agent itself can be recycled in ephemeral serverless environments
With external memory storage, when users return to interact with the agent (whether minutes, hours, or days later), the agent can retrieve the completed task results from persistent storage. This approach minimizes runtime dependencies: even if both the MCP server and agent instances are terminated, the task results remain safely preserved and accessible when needed.
The next section explores how to implement this robust solution using Amazon Bedrock AgentCore Runtime as a serverless hosting environment, AgentCore Memory for persistent agent memory storage, and the Strands Agents framework to orchestrate these components into a cohesive system that maintains task state across session boundaries.
Amazon Bedrock AgentCore and Strands Agents implementation
Before diving into the implementation details, it’s important to understand the deployment options available for MCP servers on Amazon Bedrock AgentCore. There are two primary approaches: Amazon Bedrock AgentCore Gateway and AgentCore Runtime. AgentCore Gateway has a 5-minute timeout for invocations, making it unsuitable for hosting MCP servers that provide tools requiring extended response times or long-running operations. AgentCore Runtime offers significantly more flexibility, with a 15-minute request timeout (for synchronous requests), an adjustable maximum session duration (for asynchronous processes; the default duration is 8 hours), and an adjustable idle session timeout. Although you could host an MCP server in a traditional serverful environment for unlimited execution time, AgentCore Runtime provides an optimal balance for most production scenarios. You gain serverless benefits such as automatic scaling, pay-per-use pricing, and no infrastructure management, while the adjustable maximum session duration covers most real-world long-running tasks, from data processing and model training to report generation and complex simulations. You can use this approach to build sophisticated AI agents without the operational overhead of managing servers, reserving serverful deployments only for the rare cases that genuinely require multiday executions. For more information about AgentCore Runtime and AgentCore Gateway service quotas, refer to Quotas for Amazon Bedrock AgentCore.
Next, we walk through the implementation, which is illustrated in the following diagram. This implementation consists of two interconnected components: the MCP server that executes long-running tasks and writes results to AgentCore Memory, and the agent that manages the conversation flow and retrieves those results when needed. This architecture creates a seamless experience where users can disconnect during extended processes and return later to find their results waiting for them.
MCP server implementation
Let’s examine how our MCP server implementation uses AgentCore Memory to achieve persistence:
The implementation relies on two key components that enable persistence and session management.
- The agentcore_memory_client.create_event() method serves as the bridge between tool execution and persistent memory storage. When a background task is completed, this method saves the results directly to the agent’s memory in AgentCore Memory using the specified memory ID, actor ID, and session ID. Unlike traditional approaches where results might be stored temporarily or require manual retrieval, this integration allows task results to become permanent parts of the agent’s conversational memory. The agent can then reference these results in future interactions, creating a continuous knowledge-building experience across multiple sessions.
- The second critical component involves extracting session context through ctx.request_context.request.headers.get("mcp-session-id", ""). The "Mcp-Session-Id" header is part of the standard MCP protocol. You can use this header to pass a composite identifier containing three essential pieces of information in a delimited format: session_id@@@memory_id@@@actor_id. This approach allows our implementation to retrieve the necessary context identifiers from a single header value. Headers are used instead of environment variables by necessity: these identifiers change dynamically with every conversation, whereas environment variables remain static from container startup. This design choice is particularly important in multi-tenant scenarios where a single MCP server concurrently handles requests from multiple users, each with their own distinct session context.
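The composite header described above can be unpacked with a few lines of standard-library Python. This is a sketch: the `SessionContext` type and `parse_session_header` helper are hypothetical names, and the validation rules (exactly three non-empty parts) are an assumption. In the tool, the input string would be the value read from the "mcp-session-id" header.

```python
from typing import NamedTuple

DELIMITER = "@@@"


class SessionContext(NamedTuple):
    session_id: str
    memory_id: str
    actor_id: str


def parse_session_header(header_value):
    """Split 'session_id@@@memory_id@@@actor_id' into its three identifiers."""
    parts = header_value.split(DELIMITER)
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected session_id@@@memory_id@@@actor_id")
    return SessionContext(*parts)
```

Failing fast on a malformed or missing header avoids writing task results under the wrong memory, actor, or session.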
Another important aspect of this example involves proper message formatting when storing events. Each message saved to AgentCore Memory requires two components: the content and a role identifier. These two components must be formatted in a way that the agent framework can recognize. Here is an example for the Strands Agents framework:
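A sketch of that formatting, using only the standard library. The helper name `build_memory_message` is hypothetical, and the exact key layout of the inner JSON (`message`, `role`, `content`, `message_id`) is an assumption modeled on the description in this section; verify it against the format your agent framework version expects.

```python
import json


def build_memory_message(text, role="user", message_id=0):
    """Return a (content, role) pair: inner JSON payload plus outer role identifier."""
    content = json.dumps({
        # Inner message in the shape the agent framework is assumed to read back.
        "message": {"role": role, "content": [{"text": text}]},
        "message_id": message_id,
    })
    # Outer role identifier used to categorize the message source.
    return (content, role.upper())


if __name__ == "__main__":
    print(build_memory_message("Model training finished: accuracy report ready"))
```

A list of such pairs is what would then be handed to `create_event()` together with the memory ID, actor ID, and session ID; the name of that argument is not shown in this section, so check the SDK reference before relying on it.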
The content is an inner JSON object (serialized with json.dumps()) that contains the message details, including role, text content, and message ID. The outer role identifier (USER in this example) helps AgentCore Memory categorize the message source.
Strands Agents implementation
Integrating Amazon Bedrock AgentCore Memory with Strands Agents is straightforward using the AgentCoreMemorySessionManager class from the Bedrock AgentCore SDK. As shown in the following code example, implementation requires minimal configuration: create an AgentCoreMemoryConfig with your session identifiers, initialize the session manager with this config, and pass it directly to your agent constructor. The session manager transparently handles the memory operations behind the scenes, maintaining conversation history and context across interactions while organizing memories using the combination of session_id, memory_id, and actor_id. For more information, refer to AgentCore Memory Session Manager.
Session context management is the key to this design. The agent receives session identifiers through the payload and context parameters supplied by AgentCore Runtime. These identifiers form a crucial contextual bridge that connects user interactions across multiple sessions. The session_id can be extracted from the context object (generating a new one if needed), and the memory_id and actor_id can be retrieved from the payload. These identifiers are then packaged into a custom HTTP header (Mcp-Session-Id) that is passed to the MCP server during connection establishment.
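The packaging step can be sketched as follows. The helper name `build_session_header` is hypothetical, and the payload keys `memory_id` and `actor_id` are assumed to match the identifiers named above. The resulting dictionary would be supplied as extra HTTP headers when the agent opens its connection to the MCP server.

```python
import uuid


def build_session_header(payload, session_id=None):
    """Pack session_id, memory_id, and actor_id into the Mcp-Session-Id header."""
    # Generate a fresh session ID when the runtime context did not provide one.
    session_id = session_id or str(uuid.uuid4())
    composite = "@@@".join([session_id, payload["memory_id"], payload["actor_id"]])
    return {"Mcp-Session-Id": composite}
```

A missing `memory_id` or `actor_id` raises `KeyError` here, which surfaces a misconfigured invocation early rather than sending an incomplete header.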
To maintain this persistent experience across multiple interactions, clients must consistently provide the same identifiers when invoking the agent:
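The sketch below builds the request arguments for two invocations on different days and shows that the identifiers stay the same. It deliberately stops short of calling AWS. The helper `build_invocation` is hypothetical, the payload keys (`prompt`, `memory_id`, `actor_id`) are assumptions about what the agent expects, and the argument names `agentRuntimeArn`, `runtimeSessionId`, and `payload` are assumed to match the runtime invocation API; confirm them against the current SDK reference before use.

```python
import json


def build_invocation(agent_runtime_arn, runtime_session_id, memory_id, actor_id, prompt):
    """Assemble keyword arguments for an agent invocation (no network call)."""
    payload = {"prompt": prompt, "memory_id": memory_id, "actor_id": actor_id}
    return {
        "agentRuntimeArn": agent_runtime_arn,    # assumed parameter name
        "runtimeSessionId": runtime_session_id,  # same value on every visit
        "payload": json.dumps(payload).encode("utf-8"),
    }


# Placeholder identifiers for illustration only.
ARN = "arn:aws:bedrock-agentcore:region:account:runtime/example"
SESSION = "example-session-id-0123456789-abcdefghij"

first_visit = build_invocation(ARN, SESSION, "mem-1", "user-1", "Train the model")
later_visit = build_invocation(ARN, SESSION, "mem-1", "user-1", "Is training done?")
```

Only the prompt differs between the two requests; because the session, memory, and actor identifiers match, the second visit resolves to the same stored conversation and task results.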
By consistently providing the same memory_id, actor_id, and runtimeSessionId across invocations, users can create a continuous conversational experience where task results persist independently of session boundaries. When a user returns days later, the agent can automatically retrieve both the conversation history and the task results that were completed during their absence.
This architecture represents a significant advancement in AI agent capabilities, transforming long-running operations from fragile, connection-dependent processes into robust, persistent tasks that continue working regardless of connection state. The result is a system that can deliver truly asynchronous AI assistance, where complex work continues in the background and results are seamlessly integrated whenever the user returns to the conversation.
Conclusion
In this post, we’ve explored practical ways to help AI agents handle tasks that take minutes or even hours to complete. Whether you use the more straightforward approach of keeping connections alive or the more advanced technique of injecting task results into the agent’s memory, these techniques enable your AI agent to handle valuable, complex work without frustrating time limits or lost results.
We invite you to try these approaches in your own AI agent projects. Start with context messaging for moderate tasks, then move to async management as your needs grow. The solutions we’ve shared can be quickly adapted to your specific needs, helping you build AI that delivers results reliably, even when users disconnect and return days later. What long-running tasks could your AI assistants handle better with these techniques?
To learn more, see the Amazon Bedrock AgentCore documentation and explore our sample notebook.
About the Authors
Haochen Xie is a Senior Data Scientist at the AWS Generative AI Innovation Center. He is an ordinary person.
Flora Wang is an Applied Scientist at the AWS Generative AI Innovation Center, where she works with customers to architect and implement scalable generative AI solutions that address their unique business challenges. She specializes in model customization techniques and agent-based AI systems, helping organizations harness the full potential of generative AI technology.
Yuan Tian is an Applied Scientist at the AWS Generative AI Innovation Center, where he works with customers across diverse industries (including healthcare, life sciences, finance, and energy) to architect and implement generative AI solutions such as agentic systems. He brings a unique interdisciplinary perspective, combining expertise in machine learning with computational biology.
Hari Prasanna Das is an Applied Scientist at the AWS Generative AI Innovation Center, where he works with AWS customers across different verticals to expedite their use of generative AI. Hari holds a PhD in Electrical Engineering and Computer Sciences from the University of California, Berkeley. His research interests include generative AI, deep learning, computer vision, and data-efficient machine learning.




