    Stress Testing FastAPI Application – KDnuggets

    By Oliver Chambers, August 15, 2025
    # Introduction

     
    Stress testing is essential for understanding how your application behaves under heavy load. For machine learning-powered APIs, it is especially important because model inference can be CPU-intensive. By simulating a large number of users, we can identify performance bottlenecks, determine the capacity of our system, and ensure reliability.

    In this tutorial, we will be using:

    • FastAPI: A modern, fast (high-performance) web framework for building APIs with Python.
    • Uvicorn: An ASGI server to run our FastAPI application.
    • Locust: An open-source load testing tool. You define user behavior with Python code, and swarm your system with hundreds of simultaneous users.
    • Scikit-learn: For our example machine learning model.

     

    # 1. Project Setup and Dependencies

    Set up the project structure and install the required dependencies.

    1. Create a requirements.txt file and add the following Python packages:

      fastapi==0.115.12
      locust==2.37.10
      numpy==2.3.0
      pandas==2.3.0
      pydantic==2.11.5
      scikit-learn==1.7.0
      uvicorn==0.34.3
      orjson==3.10.18

    2. Open your terminal, create a virtual environment, and activate it. The activation command shown here is for Windows; on macOS or Linux, use source venv/bin/activate instead.

      python -m venv venv
      venv\Scripts\activate

    3. Install all the Python packages using the requirements.txt file:

      pip install -r requirements.txt

       

     

    # 2. Building the FastAPI Application

    In this section, we will create three files: one for training the regression model, one for the Pydantic models, and one for the FastAPI application.

    The ml_model.py file handles the machine learning model. It uses a singleton pattern to ensure only one instance of the model is loaded per process. The model is a Random Forest Regressor trained on the California housing dataset. If a pre-trained model (model.pkl and scaler.pkl) doesn't exist, it trains and saves a new one.

    app/ml_model.py:

    import os
    import threading

    import joblib
    import numpy as np
    from sklearn.datasets import fetch_california_housing
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler

    class MLModel:
        _instance = None
        _lock = threading.Lock()

        def __new__(cls):
            # Double-checked locking: only one instance is ever created
            if cls._instance is None:
                with cls._lock:
                    if cls._instance is None:
                        cls._instance = super().__new__(cls)
            return cls._instance

        def __init__(self):
            # __init__ runs on every MLModel() call, so guard the setup
            if not hasattr(self, "initialized"):
                self.model = None
                self.scaler = None
                self.model_path = "model.pkl"
                self.scaler_path = "scaler.pkl"
                self.feature_names = None
                self.initialized = True
                self.load_or_create_model()

        def load_or_create_model(self):
            """Load existing model or create a new one using California housing dataset"""
            if os.path.exists(self.model_path) and os.path.exists(self.scaler_path):
                self.model = joblib.load(self.model_path)
                self.scaler = joblib.load(self.scaler_path)
                housing = fetch_california_housing()
                self.feature_names = housing.feature_names
                print("Model loaded successfully")
            else:
                print("Creating new model...")
                housing = fetch_california_housing()
                X, y = housing.data, housing.target
                self.feature_names = housing.feature_names

                X_train, X_test, y_train, y_test = train_test_split(
                    X, y, test_size=0.2, random_state=42
                )

                self.scaler = StandardScaler()
                X_train_scaled = self.scaler.fit_transform(X_train)

                self.model = RandomForestRegressor(
                    n_estimators=50,  # Reduced for faster predictions
                    max_depth=8,  # Reduced for faster predictions
                    random_state=42,
                    n_jobs=1,  # Single thread for consistency
                )
                self.model.fit(X_train_scaled, y_train)

                joblib.dump(self.model, self.model_path)
                joblib.dump(self.scaler, self.scaler_path)

                X_test_scaled = self.scaler.transform(X_test)
                score = self.model.score(X_test_scaled, y_test)
                print(f"Model R² score: {score:.4f}")

        def predict(self, features):
            """Make prediction for house price"""
            features_array = np.array(features).reshape(1, -1)
            features_scaled = self.scaler.transform(features_array)
            prediction = self.model.predict(features_scaled)[0]
            # The dataset's target is in units of $100,000; convert to dollars
            return prediction * 100000

        def get_feature_info(self):
            """Get information about the features"""
            return {
                "feature_names": list(self.feature_names),
                "num_features": len(self.feature_names),
                "description": "California housing dataset features",
            }

    # Initialize model as singleton
    ml_model = MLModel()
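
    The double-checked locking in MLModel.__new__ can be tried out in isolation. The following is a minimal sketch using only the standard library; SingleInstance and build_from_threads are hypothetical names introduced for illustration, and no model is loaded.

```python
import threading


class SingleInstance:
    """Hypothetical stand-in for MLModel: double-checked locking in __new__."""

    _instance = None
    _lock = threading.Lock()

    def __new__(cls):
        # Fast path: skip the lock once the instance exists.
        if cls._instance is None:
            with cls._lock:
                # Re-check inside the lock: another thread may have won the race.
                if cls._instance is None:
                    cls._instance = super().__new__(cls)
        return cls._instance

    def __init__(self):
        # __init__ runs on every SingleInstance() call, so guard the setup.
        if not hasattr(self, "initialized"):
            self.initialized = True


def build_from_threads(n_threads=8):
    """Construct the class from several threads and collect the object ids."""
    ids = []

    def worker():
        ids.append(id(SingleInstance()))

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ids


if __name__ == "__main__":
    ids = build_from_threads()
    print("distinct instances:", len(set(ids)))
```

    Every thread receives the same object. Note that the hasattr guard in __init__ is not itself protected by the lock; in the tutorial's module this is not an issue as long as the first MLModel() call happens once at import time.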

     

    The pydantic_models.py file defines PredictionRequest, the Pydantic model that validates the request body.

    app/pydantic_models.py:

    from typing import List

    from pydantic import BaseModel, Field

    class PredictionRequest(BaseModel):
        # Exactly 8 floats, in the dataset's column order
        features: List[float] = Field(
            ...,
            description="List of 8 features: MedInc, HouseAge, AveRooms, AveBedrms, Population, AveOccup, Latitude, Longitude",
            min_length=8,
            max_length=8,
        )

        model_config = {
            "json_schema_extra": {
                "examples": [
                    {"features": [8.3252, 41.0, 6.984, 1.024, 322.0, 2.556, 37.88, -122.23]}
                ]
            }
        }
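
    To make the request contract concrete, here is a small standard-library sketch that builds the JSON body for /predict and applies the same "exactly eight values" rule. It is only an illustration of the payload shape, not how Pydantic or FastAPI perform validation; build_predict_payload and FEATURE_ORDER are hypothetical names.

```python
import json

# The eight inputs, in the order given in the field description.
FEATURE_ORDER = [
    "MedInc", "HouseAge", "AveRooms", "AveBedrms",
    "Population", "AveOccup", "Latitude", "Longitude",
]


def build_predict_payload(features):
    """Return the JSON body for POST /predict; raise ValueError on a wrong length."""
    if len(features) != len(FEATURE_ORDER):
        raise ValueError(
            f"Expected {len(FEATURE_ORDER)} features, got {len(features)}"
        )
    # float() raises on entries that cannot be read as numbers.
    return json.dumps({"features": [float(x) for x in features]})


if __name__ == "__main__":
    body = build_predict_payload(
        [8.3252, 41.0, 6.984, 1.024, 322.0, 2.556, 37.88, -122.23]
    )
    print(body)
```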

     

    app/main.py: This file is the core FastAPI application, defining the API endpoints.

    import asyncio
    from contextlib import asynccontextmanager

    from fastapi import FastAPI, HTTPException
    from fastapi.responses import ORJSONResponse

    from .ml_model import ml_model
    from .pydantic_models import (
        PredictionRequest,
    )

    @asynccontextmanager
    async def lifespan(app: FastAPI):
        # Pre-load the model
        _ = ml_model.get_feature_info()
        yield

    app = FastAPI(
        title="California Housing Price Prediction API",
        version="1.0.0",
        description="API for predicting California housing prices using Random Forest model",
        lifespan=lifespan,
        default_response_class=ORJSONResponse,
    )

    @app.get("/health")
    async def health_check():
        """Health check endpoint"""
        return {"status": "healthy", "message": "Service is operational"}

    @app.get("/model-info")
    async def model_info():
        """Get information about the ML model"""
        try:
            # Run the synchronous call in a worker thread
            feature_info = await asyncio.to_thread(ml_model.get_feature_info)
            return {
                "model_type": "Random Forest Regressor",
                "dataset": "California Housing Dataset",
                "features": feature_info,
            }
        except Exception:
            raise HTTPException(
                status_code=500, detail="Error retrieving model information"
            )

    @app.post("/predict")
    async def predict(request: PredictionRequest):
        """Make house price prediction"""
        if len(request.features) != 8:
            raise HTTPException(
                status_code=400,
                detail=f"Expected 8 features, got {len(request.features)}",
            )
        try:
            prediction = ml_model.predict(request.features)
            return {
                "prediction": float(prediction),
                "status": "success",
                "features_used": request.features,
            }
        except ValueError as e:
            raise HTTPException(status_code=400, detail=str(e))
        except Exception:
            raise HTTPException(status_code=500, detail="Prediction error")

     

    Key points:

    • lifespan manager: Ensures the ML model is loaded during application startup.
    • asyncio.to_thread: Runs a synchronous call in a separate thread so that it does not block FastAPI's asynchronous event loop, allowing the server to handle other requests concurrently. This matters most for scikit-learn's predict method, which is synchronous and CPU-bound. Note that the listing above applies asyncio.to_thread only in /model-info; /predict calls ml_model.predict directly, so that call runs on the event loop.
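
    The effect of asyncio.to_thread can be observed without FastAPI or scikit-learn. The sketch below is an illustration under stated assumptions: slow_predict is a hypothetical stand-in that simply sleeps, and a heartbeat coroutine counts how often the event loop gets to run while the call is in progress.

```python
import asyncio
import time


def slow_predict(features):
    """Hypothetical stand-in for a synchronous model call."""
    time.sleep(0.3)  # blocks the calling thread for 300 ms
    return sum(features)


async def heartbeat(ticks, stop):
    """Record a tick every 50 ms whenever the event loop is free to run."""
    while not stop.is_set():
        ticks.append(time.perf_counter())
        await asyncio.sleep(0.05)


async def run_demo(offload):
    """Call slow_predict directly or via to_thread; return (result, tick count)."""
    ticks = []
    stop = asyncio.Event()
    beat = asyncio.create_task(heartbeat(ticks, stop))
    await asyncio.sleep(0)  # give the heartbeat a chance to start
    if offload:
        result = await asyncio.to_thread(slow_predict, [1.0, 2.0, 3.0])
    else:
        result = slow_predict([1.0, 2.0, 3.0])  # runs on the event loop thread
    stop.set()
    await beat
    return result, len(ticks)


if __name__ == "__main__":
    for offload in (False, True):
        result, ticks = asyncio.run(run_demo(offload))
        print(f"offload={offload}: result={result}, heartbeat ticks={ticks}")
```

    With the direct call, the heartbeat is starved until slow_predict returns; with asyncio.to_thread it keeps ticking. time.sleep releases the GIL, so this is the most favorable case. How much a real predict call benefits depends on how much of its work releases the GIL, which this sketch does not measure.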

    Endpoints:

    • /health: A simple health check.
    • /model-info: Provides metadata about the ML model.
    • /predict: Accepts a list of features and returns a house price prediction.

    run_server.py: This script runs the FastAPI application using Uvicorn.

    import uvicorn

    if __name__ == "__main__":
        # Start 4 worker processes serving the app defined in app/main.py
        uvicorn.run("app.main:app", host="localhost", port=8000, workers=4)

     

    All the files and configurations are available in the GitHub repository: kingabzpro/Stress-Testing-FastAPI

     

    # 3. Writing the Locust Stress Test

    Now, let's create the stress test script using Locust.

    tests/locustfile.py: This file defines the behavior of simulated users.

    import json
    import logging
    import random

    from locust import HttpUser, task

    # Reduce logging to improve performance
    logging.getLogger("urllib3").setLevel(logging.WARNING)

    class HousingAPIUser(HttpUser):
        def generate_random_features(self):
            """Generate random but realistic California housing features"""
            return [
                round(random.uniform(0.5, 15.0), 4),  # MedInc
                round(random.uniform(1.0, 52.0), 1),  # HouseAge
                round(random.uniform(2.0, 10.0), 2),  # AveRooms
                round(random.uniform(0.5, 2.0), 2),  # AveBedrms
                round(random.uniform(3.0, 35000.0), 0),  # Population
                round(random.uniform(1.0, 10.0), 2),  # AveOccup
                round(random.uniform(32.0, 42.0), 2),  # Latitude
                round(random.uniform(-124.0, -114.0), 2),  # Longitude
            ]

        @task(1)
        def model_info(self):
            """Test model info endpoint"""
            with self.client.get("/model-info", catch_response=True) as response:
                if response.status_code == 200:
                    response.success()
                else:
                    response.failure(f"Model info failed: {response.status_code}")

        @task(3)
        def single_prediction(self):
            """Test single prediction endpoint"""
            features = self.generate_random_features()

            with self.client.post(
                "/predict", json={"features": features}, catch_response=True, timeout=10
            ) as response:
                if response.status_code == 200:
                    try:
                        data = response.json()
                        if "prediction" in data:
                            response.success()
                        else:
                            response.failure("Invalid response format")
                    except json.JSONDecodeError:
                        response.failure("Failed to parse JSON")
                elif response.status_code == 503:
                    response.failure("Service unavailable")
                else:
                    response.failure(f"Status code: {response.status_code}")

     

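    Key points:

    1. The listing sets no wait_time, so each simulated user starts its next task as soon as the previous one finishes. To have users wait between 0.5 and 2 seconds between tasks, add wait_time = between(0.5, 2) to the class and import between from locust.
    2. generate_random_features creates realistic random feature data for the prediction requests.
    3. The @task weights make single_prediction three times as likely to be picked as model_info, so each user sends roughly three prediction requests for every model-info request.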
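
    Locust picks each user's next task at random, weighted by the number passed to @task. The sketch below imitates that selection with the standard library to show what a 1:3 weighting means in practice; simulate_task_picks is a hypothetical helper, and this is not Locust's own scheduling code.

```python
import random
from collections import Counter


def simulate_task_picks(weights, n_picks, seed=0):
    """Pick task names at random, in proportion to their weights."""
    rng = random.Random(seed)
    names = list(weights)
    picks = rng.choices(names, weights=[weights[n] for n in names], k=n_picks)
    return Counter(picks)


# Same weights as the locustfile: @task(1) and @task(3).
counts = simulate_task_picks({"model_info": 1, "single_prediction": 3}, 10_000)
share = counts["single_prediction"] / sum(counts.values())
print(f"single_prediction share of picks: {share:.2f}")
```

    Over many picks, about three quarters of the requests go to /predict and one quarter to /model-info, which is worth keeping in mind when comparing per-endpoint request counts in the results.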

     

    # 4. Running the Stress Test

     

    1. To evaluate the performance of your application under load, begin by starting your asynchronous machine learning application in one terminal:

      python run_server.py

      Model loaded successfully
      INFO:     Started server process [26216]
      INFO:     Waiting for application startup.
      INFO:     Application startup complete.
      INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)

    2. Open your browser and navigate to http://localhost:8000/docs. Use the interactive API documentation to test your endpoints and ensure they are functioning correctly.

    3. Open a new terminal window, activate the virtual environment, and navigate to your project's root directory to run Locust with the web UI:

      locust -f tests/locustfile.py --host http://localhost:8000

      Access the Locust web UI at http://localhost:8089 in your browser.

    4. In the Locust web UI, set the total number of users to 500 and the spawn rate to 10 users per second, then run the test for a minute.

    5. During the test, Locust will display real-time statistics, including the number of requests, failures, and response times for each endpoint.

    6. Once the test is complete, click on the Charts tab to view interactive graphs showing the number of users, requests per second, and response times.

    7. To run Locust without the web UI and automatically generate an HTML report, use the following command:

      locust -f tests/locustfile.py --host http://localhost:8000 --users 500 --spawn-rate 10 --run-time 60s --headless --html report.html

       

    After the test finishes, an HTML report named report.html will be saved in your project directory for later review.
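
    When reading the results, it helps to know how much of the run happened at full load. The arithmetic below is a rough sketch that assumes Locust spawns users at exactly the configured rate and that the run time is counted from the start of the test, ramp-up included; both function names are hypothetical.

```python
def ramp_up_seconds(users, spawn_rate):
    """Seconds needed to reach the target user count at a steady spawn rate."""
    return users / spawn_rate


def seconds_at_full_load(users, spawn_rate, run_time):
    """Time left in the run once every user has been spawned (never negative)."""
    return max(0.0, run_time - ramp_up_seconds(users, spawn_rate))


# Settings from the commands above: 500 users, 10 users/s, 60 s run.
ramp = ramp_up_seconds(500, 10)
steady = seconds_at_full_load(500, 10, 60)
print(f"ramp-up: {ramp:.0f}s, full load: {steady:.0f}s")
# prints "ramp-up: 50s, full load: 10s"
```

    Under these assumptions, the 500-user level is reached only in the last 10 seconds of a 60-second run, so a longer run time or a higher spawn rate gives a clearer picture of steady-state behavior.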

     


     

    # Final Thoughts

    Our app can handle a large number of users because we are using a simple machine learning model. The results show that the model-info endpoint has a higher response time than the prediction endpoint, which is an impressive result for an endpoint that runs model inference. This is the best-case scenario for testing your application locally before pushing it to production.

    If you would like to experience this setup firsthand, please visit the kingabzpro/Stress-Testing-FastAPI repository and follow the instructions in the documentation.
     
     

    Abid Ali Awan (@1abidaliawan) is a certified data scientist professional who loves building machine learning models. Currently, he is focusing on content creation and writing technical blogs on machine learning and data science technologies. Abid holds a Master's degree in technology management and a bachelor's degree in telecommunication engineering. His vision is to build an AI product using a graph neural network for students struggling with mental illness.
