    8 billion tokens a day forced AT&T to rethink AI orchestration and cut costs by 90%

    By Sophia Ahmed Wilson, February 26, 2026

    When your average token usage is 8 billion a day, you have a massive scale problem.

    This was the case at AT&T, and chief data officer Andy Markus and his team recognized that it simply wasn’t feasible (or economical) to push everything through large reasoning models.

    So, when building out an internal Ask AT&T personal assistant, they rebuilt the orchestration layer. The result: a multi-agent stack built on LangChain in which large language model “super agents” direct smaller, underlying “worker” agents that perform more concise, purpose-driven work.
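
    The super-agent and worker pattern described above can be sketched in a few lines. This is a minimal illustration, not AT&T's implementation: the article says the stack is built on LangChain, but the sketch uses only the Python standard library, and the worker names, the `SuperAgent` class and the keyword-based `classify` rule are all hypothetical stand-ins for decisions a large language model would make.

```python
from dataclasses import dataclass
from typing import Callable, Dict


# Hypothetical worker agents: each stands in for a small, purpose-built model.
def summarize_worker(task: str) -> str:
    return f"[summary-slm] {task}"


def sql_worker(task: str) -> str:
    return f"[sql-slm] {task}"


def general_worker(task: str) -> str:
    # Fallback path for tasks that no specialised worker covers.
    return f"[large-model] {task}"


@dataclass
class SuperAgent:
    """Routes each task to a worker.

    The keyword rules in classify() are an assumption made for this sketch;
    they stand in for the routing decision a large language model would make.
    """
    workers: Dict[str, Callable[[str], str]]
    fallback: Callable[[str], str]

    def classify(self, task: str) -> str:
        text = task.lower()
        if "summarize" in text or "summary" in text:
            return "summarize"
        if "how many" in text or "table" in text:
            return "sql"
        return "general"

    def run(self, task: str) -> str:
        # Unknown labels fall through to the fallback worker.
        worker = self.workers.get(self.classify(task), self.fallback)
        return worker(task)


agent = SuperAgent(
    workers={"summarize": summarize_worker, "sql": sql_worker},
    fallback=general_worker,
)
print(agent.run("Summarize this outage report"))
print(agent.run("How many tickets were opened last week?"))
print(agent.run("Draft a note to the team"))
```

    The point of the shape is that only the routing step needs a capable model; the bulk of the work lands on cheaper, narrower workers.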

    This flexible orchestration layer has dramatically improved latency, speed and response times, Markus told VentureBeat. Most notably, his team has seen up to 90% cost savings.
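
    To see how routing alone can produce savings of that order, consider a back-of-the-envelope model. The traffic share and price ratio below are invented for illustration and are not AT&T's figures. If a fraction f of tokens goes to small models that cost a fraction r of the large model's per-token price, spend relative to an all-large-model baseline is (1 - f) + f·r, so the saving is f·(1 - r).

```python
def blended_savings(share_to_small: float, small_price_ratio: float) -> float:
    """Fraction of spend saved versus sending every token to the large model.

    share_to_small: fraction of tokens handled by small models (0 to 1).
    small_price_ratio: small-model price per token divided by the
        large-model price per token (0 means free, 1 means same price).
    """
    if not 0.0 <= share_to_small <= 1.0:
        raise ValueError("share_to_small must be between 0 and 1")
    if small_price_ratio < 0.0:
        raise ValueError("small_price_ratio must be non-negative")
    # Relative spend is (1 - f) * 1 + f * r, so the saving is f * (1 - r).
    return share_to_small * (1.0 - small_price_ratio)


# Hypothetical inputs: 95% of tokens routed to models priced at 5% of the large one.
saving = blended_savings(share_to_small=0.95, small_price_ratio=0.05)
print(f"{saving:.1%} saved")
```

    Under these assumed inputs the saving lands near 90%; the formula also shows that both levers matter, since a cheap small model saves little if most traffic still reaches the large one.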

    “I believe the future of agentic AI is many, many, many small language models (SLMs),” he said. “We find small language models to be just about as accurate, if not as accurate, as a large language model on a given domain area.”

    Most recently, Markus and his team used this re-architected stack, along with Microsoft Azure, to build and deploy Ask AT&T Workflows, a graphical drag-and-drop agent builder that employees use to automate tasks.

    The agents pull from a suite of proprietary AT&T tools that handle document processing, natural language-to-SQL conversion, and image analysis. “As the workflow is executed, it's AT&T’s data that's really driving the decisions,” Markus said. Rather than asking general questions, “we're asking questions of our data, and we bring our data to bear to make sure it focuses on our information as it makes decisions.”

    Still, a human always oversees the “chain reaction” of agents. All agent actions are logged, data is isolated throughout the process, and role-based access is enforced when agents pass workloads off to one another.

    “Things do happen autonomously, but the human on the loop still provides a check and balance of the entire process,” Markus said.
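
    A rough sketch of what logged, role-checked handoffs between agents could look like follows. The `Agent`, `Workload` and `hand_off` names, and the role strings, are hypothetical; the article does not describe how AT&T implements these controls.

```python
import logging
from dataclasses import dataclass, field
from typing import List

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent-audit")


@dataclass(frozen=True)
class Agent:
    name: str
    roles: frozenset  # roles this agent holds (hypothetical role strings)


@dataclass
class Workload:
    description: str
    required_role: str
    audit_trail: List[str] = field(default_factory=list)


def hand_off(sender: Agent, receiver: Agent, work: Workload) -> bool:
    """Pass work between agents only if the receiver holds the required role.

    Every attempt, allowed or denied, is recorded so a human reviewer can
    reconstruct the chain afterwards.
    """
    allowed = work.required_role in receiver.roles
    outcome = "allowed" if allowed else "denied"
    entry = f"{sender.name} -> {receiver.name}: {work.description} ({outcome})"
    work.audit_trail.append(entry)
    log.info(entry)
    return allowed


triage = Agent("triage", frozenset({"network:read"}))
patcher = Agent("patcher", frozenset({"network:read", "network:write"}))
work = Workload("apply config patch", required_role="network:write")

print(hand_off(triage, patcher, work))  # receiver holds network:write
print(hand_off(patcher, triage, work))  # receiver lacks network:write
print(work.audit_trail)
```

    Recording denied attempts as well as allowed ones is a deliberate choice here: the reviewer sees what the agents tried to do, not only what they did.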

    Not overbuilding, using ‘interchangeable and selectable’ models

    AT&T doesn’t take a “build everything from scratch” mindset, Markus noted; it relies instead on models that are “interchangeable and selectable” and on “never rebuilding a commodity.” As functionality matures across the industry, the team will deprecate homegrown tools in favor of off-the-shelf options, he explained.

    “Because in this space, things change every week, if we're lucky, sometimes multiple times a week,” he said. “We need to be able to pilot, plug in and plug out different components.”
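
    One common way to keep components “interchangeable and selectable” is to code against a small interface and swap implementations behind it. The sketch below assumes that approach; `TextModel`, `ModelRegistry` and both model classes are hypothetical names, not anything the article attributes to AT&T.

```python
from typing import Dict, Protocol


class TextModel(Protocol):
    """The only contract callers rely on."""

    def generate(self, prompt: str) -> str: ...


class HomegrownModel:
    def generate(self, prompt: str) -> str:
        return f"homegrown: {prompt}"


class OffTheShelfModel:
    def generate(self, prompt: str) -> str:
        return f"off-the-shelf: {prompt}"


class ModelRegistry:
    """Maps a named slot (a capability) to whichever model currently fills it."""

    def __init__(self) -> None:
        self._models: Dict[str, TextModel] = {}

    def plug_in(self, slot: str, model: TextModel) -> None:
        # Replaces any model already in the slot.
        self._models[slot] = model

    def plug_out(self, slot: str) -> None:
        self._models.pop(slot, None)

    def generate(self, slot: str, prompt: str) -> str:
        if slot not in self._models:
            raise KeyError(f"no model plugged into slot {slot!r}")
        return self._models[slot].generate(prompt)


registry = ModelRegistry()
registry.plug_in("text-to-sql", HomegrownModel())
first = registry.generate("text-to-sql", "count open tickets")

# Later, a commodity option matures and replaces the homegrown tool.
registry.plug_in("text-to-sql", OffTheShelfModel())
second = registry.generate("text-to-sql", "count open tickets")
print(first)
print(second)
```

    Callers only ever name the slot, so deprecating a homegrown tool becomes a one-line change at registration time.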

    They do “really rigorous” evaluations of available options as well as their own; for instance, their Ask Data with Relational Knowledge Graph has topped the Spider 2.0 text-to-SQL accuracy leaderboard, and other tools have scored highly on the BERT SQL benchmark.

    In the case of homegrown agentic tools, his team uses LangChain as a core framework, fine-tunes models with standard retrieval-augmented generation (RAG) and other in-house algorithms, and partners closely with Microsoft, using the tech giant’s search functionality for their vector store.

    Ultimately, though, it’s important not to fuse agentic AI or other advanced tools into everything just for the sake of it, Markus advised. “Sometimes we over complicate things,” he said. “Sometimes I've seen a solution over engineered.”

    Instead, builders should ask themselves whether a given tool actually needs to be agentic. This could include questions like: What accuracy level could be achieved if it were a simpler, single-turn generative solution? How could they break it down into smaller pieces where each piece could be delivered “way more accurately,” as Markus put it?

    Accuracy, cost and tool responsiveness should be core principles. “Even as the solutions have gotten more complicated, those three pretty basic principles still give us a lot of direction,” he said.
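
    Those three principles can be turned into a simple selection rule: among the designs that clear the accuracy, cost and responsiveness bars, pick the least complex one. The sketch below assumes that rule; the candidate designs and every number in them are made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    name: str
    complexity: int       # lower means simpler to build and operate
    accuracy: float       # measured on an evaluation set, 0.0 to 1.0
    cost_per_call: float  # arbitrary currency units
    latency_s: float      # seconds per response


def simplest_acceptable(
    candidates: List[Candidate],
    min_accuracy: float,
    max_cost: float,
    max_latency_s: float,
) -> Optional[Candidate]:
    """Return the least complex candidate that meets all three thresholds,
    or None if nothing qualifies."""
    acceptable = [
        c for c in candidates
        if c.accuracy >= min_accuracy
        and c.cost_per_call <= max_cost
        and c.latency_s <= max_latency_s
    ]
    return min(acceptable, key=lambda c: c.complexity, default=None)


# Hypothetical evaluation results for three ways of building the same tool.
OPTIONS = [
    Candidate("single_turn", complexity=1, accuracy=0.92, cost_per_call=0.002, latency_s=0.8),
    Candidate("decomposed", complexity=2, accuracy=0.96, cost_per_call=0.004, latency_s=1.5),
    Candidate("agentic", complexity=3, accuracy=0.97, cost_per_call=0.030, latency_s=6.0),
]

choice = simplest_acceptable(OPTIONS, min_accuracy=0.90, max_cost=0.01, max_latency_s=2.0)
print(choice.name if choice else "nothing qualifies")
```

    With the thresholds shown, the single-turn design already qualifies, which is the situation Markus warns about when a solution gets over-engineered.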

    How 100,000 employees are actually using it

    Ask AT&T Workflows has been rolled out to 100,000-plus employees. More than half say they use it every day, and active adopters report productivity gains as high as 90%, Markus said.

    “We're looking at, are they using the system repeatedly? Because stickiness is a good indicator of success,” he said.

    The agent builder offers “two journeys” for employees. One is pro-code, where users can program in Python behind the scenes, dictating rules for how agents should work. The other is no-code, featuring a drag-and-drop visual interface for a “pretty light user experience,” Markus said.

    Interestingly, even proficient users are gravitating toward the latter option. At a recent hackathon geared to a technical audience, participants were given a choice of both, and more than half chose low code. “This was a surprise to us, because these people were all very competent in the programming aspect,” Markus said.

    Employees are using agents across a variety of functions; for instance, a network engineer may build a series of them to address alerts and reconnect customers when they lose connectivity. In this scenario, one agent can correlate telemetry to identify the network issue and its location, pull change logs and check for known issues. Then, it can open a trouble ticket.

    Another agent could then come up with ways to solve the issue and even write new code to patch it. Once the problem is resolved, a third agent can write up a summary with preventative measures for the future.

    “The [human] engineer would watch over it all, making sure the agents are performing as expected and taking the right actions,” Markus said.
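
    The three-agent chain in this scenario, with an engineer watching over it, might be wired together roughly as follows. All function names and the `approve` callback are hypothetical, and each agent body is a placeholder for work a model would do.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Incident:
    alert: str
    notes: List[str] = field(default_factory=list)
    ticket_open: bool = False
    resolved: bool = False


def triage_agent(incident: Incident) -> Incident:
    # Placeholder for correlating telemetry, pulling change logs, opening a ticket.
    incident.notes.append(f"triage: correlated telemetry for '{incident.alert}'")
    incident.ticket_open = True
    return incident


def remediation_agent(incident: Incident) -> Incident:
    # Placeholder for proposing a fix or writing a patch.
    incident.notes.append("remediation: proposed and applied fix")
    incident.resolved = True
    return incident


def summary_agent(incident: Incident) -> Incident:
    # Placeholder for the write-up with preventative measures.
    incident.notes.append("summary: wrote up preventative measures")
    return incident


STEPS = [triage_agent, remediation_agent, summary_agent]


def run_chain(
    incident: Incident,
    steps: List[Callable[[Incident], Incident]],
    approve: Callable[[str, Incident], bool],
) -> Incident:
    """Run agents in order; the engineer's approve() callback can halt the chain."""
    for step in steps:
        if not approve(step.__name__, incident):
            incident.notes.append(f"halted by engineer before {step.__name__}")
            break
        incident = step(incident)
    return incident


result = run_chain(Incident("link down"), STEPS, approve=lambda name, inc: True)
for note in result.notes:
    print(note)
```

    Passing an `approve` callback that returns False for a given step stops the chain there, leaving the ticket open and the notes intact for the engineer to pick up.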

    AI-fueled coding is the future

    That same engineering discipline, breaking work into smaller, purpose-built pieces, is now reshaping how AT&T writes code itself, through what Markus calls “AI-fueled coding.”

    He compared the process to RAG; devs use agile coding methods in an integrated development environment (IDE) along with “function-specific” build archetypes that dictate how code should interact.

    The output is not loose code; the code is “very close to production grade,” and could reach that quality in one turn. “We've all worked with vibe coding, where we have an agentic kind of code editor,” Markus noted. But AI-fueled coding “eliminates a lot of the back and forth iterations that you might see in vibe coding.”

    He sees this coding technique as “tangibly redefining” the software development cycle, ultimately shortening development timelines and increasing output of production-grade code. Non-technical teams can also get in on the action, using plain-language prompts to build software prototypes.

    His team, for instance, has used the technique to build an internal curated data product in 20 minutes; without AI, building it would have taken six weeks. “We develop software with it, modify software with it, do data science with it, do data analytics with it, do data engineering with it,” Markus said. “So it's a game changer.”
