    Machine Learning & Research

Building a Custom PDF Parser with PyPDF and LangChain

By Oliver Chambers | June 12, 2025 | 15 Mins Read



Image by Author | Canva

     

PDF files are everywhere. You’ve probably seen them in all sorts of places, such as school papers, electricity bills, office contracts, product manuals, and more. They’re extremely common, but working with them isn’t as easy as it looks. Let’s say you want to extract useful information from a PDF, like reading the text, splitting it into sections, or getting a quick summary. That may sound simple, but you’ll find it isn’t so smooth once you try.

Unlike Word or HTML files, PDFs don’t store content in a neat, readable way. Instead, they’re designed to look good, not to be read by programs. The text can be all over the place: split into odd blocks, scattered across the page, or mixed up with tables and images. This makes it hard to get clean, structured data out of them.

In this article, we’re going to build something that can handle this mess. We’ll create a custom PDF parser that can:

• Extract and clean text from PDFs at the page level, with optional layout preservation for better formatting
• Handle image metadata extraction
• Remove unwanted headers and footers by detecting repeated lines across pages to reduce noise
• Retrieve detailed document- and page-level metadata, such as author, title, creation date, rotation, and page size
• Chunk the content into manageable pieces for further NLP or LLM processing

Let’s get started.

     

Folder Structure

     
Before starting, it’s good to organize your project files for readability and scalability.

custom_pdf_parser/
│
├── parser.py
├── langchain_loader.py
├── pipeline.py
├── example.py
├── requirements.txt     # Dependencies list
└── __init__.py          # (Optional) to mark the directory as a Python package

     
You can leave the __init__.py file empty, since its main purpose is simply to indicate that this directory should be treated as a Python package. I’ll explain the purpose of each of the remaining files step by step.

     

Tools Required (requirements.txt)

     
The required libraries are:

• PyPDF: A pure-Python library to read and write PDF files. It will be used to extract the text from PDF files
• LangChain: A framework for building context-aware applications with language models (we’ll use it to process and chain document tasks). It will be used to process and organize the text properly.

Install them with:

pip install pypdf langchain

     

If you want to manage dependencies neatly, create a requirements.txt file with:

pypdf
langchain

    And run:

pip install -r requirements.txt

     

Step 1: Set Up the PDF Parser (parser.py)

     
The core class CustomPDFParser uses PyPDF to extract text and metadata from each PDF page. It also includes methods to clean text, extract image information (optional), and remove repeated headers or footers that often appear on every page.

• It supports preserving layout formatting
• It extracts metadata like page number, rotation, and media box dimensions
• It can filter out pages with too little content
• Text cleaning removes excessive whitespace while preserving paragraph breaks

The logic that implements all of this is:

import os
import logging
from pathlib import Path
from typing import List, Dict, Any
import pypdf
from pypdf import PdfReader

# Configure logging to show info-level and above messages
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


class CustomPDFParser:
    def __init__(
        self, extract_images: bool = False, preserve_layout: bool = True, remove_headers_footers: bool = True, min_text_length: int = 10
    ):
        """
        Initialize the parser with options to extract images, preserve layout, remove repeated headers/footers, and a minimum text length for pages.
        Args:
            extract_images: Whether to extract image info from pages
            preserve_layout: Whether to keep layout spacing in text extraction
            remove_headers_footers: Whether to detect and remove headers/footers
            min_text_length: Minimum length of text for a page to be considered valid
        """
        self.extract_images = extract_images
        self.preserve_layout = preserve_layout
        self.remove_headers_footers = remove_headers_footers
        self.min_text_length = min_text_length

    def extract_text_from_page(self, page: pypdf.PageObject, page_num: int) -> Dict[str, Any]:
        """
        Extract text and metadata from a single PDF page.
        Args:
            page: PyPDF page object
            page_num: zero-based page number
        Returns:
            dict with keys:
                - 'text': extracted and cleaned text string,
                - 'metadata': page metadata dict,
                - 'word_count': number of words in the extracted text
        """
        try:
            # Extract text, optionally preserving the layout for better formatting
            if self.preserve_layout:
                text = page.extract_text(extraction_mode="layout")
            else:
                text = page.extract_text()
            # Clean text: remove extra whitespace and normalize paragraphs
            text = self._clean_text(text)
            # Gather page metadata (page number, rotation angle, mediabox)
            metadata = {
                "page_number": page_num + 1,  # 1-based numbering
                "rotation": getattr(page, "rotation", 0),
                "mediabox": str(getattr(page, "mediabox", None)),
            }
            # Optionally, extract image info from the page if requested
            if self.extract_images:
                metadata["images"] = self._extract_image_info(page)
            # Return a dictionary with text and metadata for this page
            return {
                "text": text,
                "metadata": metadata,
                "word_count": len(text.split()) if text else 0
            }
        except Exception as e:
            # Log the error and return empty data for problematic pages
            logger.error(f"Error extracting page {page_num}: {e}")
            return {
                "text": "",
                "metadata": {"page_number": page_num + 1, "error": str(e)},
                "word_count": 0
            }

    def _clean_text(self, text: str) -> str:
        """
        Clean and normalize extracted text, preserving paragraph breaks.
        Args:
            text: raw text extracted from a PDF page
        Returns:
            cleaned text string
        """
        if not text:
            return ""
        lines = text.split('\n')
        cleaned_lines = []
        for line in lines:
            line = line.strip()  # Remove leading/trailing whitespace
            if line:
                # Non-empty line; keep it
                cleaned_lines.append(line)
            elif cleaned_lines and cleaned_lines[-1]:
                # Preserve a paragraph break by keeping an empty line only if the previous line exists
                cleaned_lines.append("")
        cleaned_text = "\n".join(cleaned_lines)
        # Reduce any instances of more than two consecutive blank lines to two
        while '\n\n\n' in cleaned_text:
            cleaned_text = cleaned_text.replace('\n\n\n', '\n\n')
        return cleaned_text.strip()

    def _extract_image_info(self, page: pypdf.PageObject) -> List[Dict[str, Any]]:
        """
        Extract basic image metadata from a page, if available.
        Args:
            page: PyPDF page object
        Returns:
            List of dictionaries with image info (index, name, width, height)
        """
        images = []
        try:
            # PyPDF pages can have an 'images' attribute listing embedded images
            if hasattr(page, 'images'):
                for i, image in enumerate(page.images):
                    images.append({
                        "image_index": i,
                        "name": getattr(image, 'name', f"image_{i}"),
                        "width": getattr(image, 'width', None),
                        "height": getattr(image, 'height', None)
                    })
        except Exception as e:
            logger.warning(f"Image extraction failed: {e}")
        return images

    def _remove_headers_footers(self, pages_data: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        """
        Remove repeated headers and footers that appear on many pages.
        This is done by identifying lines appearing on over 50% of pages
        at the start or end of the page text, then removing those lines.
        Args:
            pages_data: List of dictionaries representing each page's extracted data.
        Returns:
            Updated list of pages with headers/footers removed
        """
        # Only attempt removal if there are enough pages and the option is enabled
        if len(pages_data) < 3 or not self.remove_headers_footers:
            return pages_data
        # Collect the first and last lines from each page's text for analysis
        first_lines = [page["text"].split('\n')[0] if page["text"] else "" for page in pages_data]
        last_lines = [page["text"].split('\n')[-1] if page["text"] else "" for page in pages_data]
        threshold = len(pages_data) * 0.5  # More than 50% of pages
        # Identify candidate headers and footers that appear frequently
        potential_headers = [line for line in set(first_lines)
                             if first_lines.count(line) > threshold and line.strip()]
        potential_footers = [line for line in set(last_lines)
                             if last_lines.count(line) > threshold and line.strip()]
        # Remove identified headers and footers from each page's text
        for page_data in pages_data:
            lines = page_data["text"].split('\n')
            # Remove the header if it matches a frequent header
            if lines and potential_headers:
                for header in potential_headers:
                    if lines[0].strip() == header.strip():
                        lines = lines[1:]
                        break
            # Remove the footer if it matches a frequent footer
            if lines and potential_footers:
                for footer in potential_footers:
                    if lines[-1].strip() == footer.strip():
                        lines = lines[:-1]
                        break

            page_data["text"] = '\n'.join(lines).strip()
        return pages_data

    def _extract_document_metadata(self, pdf_reader: PdfReader, pdf_path: str) -> Dict[str, Any]:
        """
        Extract metadata from the PDF document itself.
        Args:
            pdf_reader: PyPDF PdfReader instance
            pdf_path: path to the PDF file
        Returns:
            Dictionary of metadata including file info and PDF document metadata
        """
        metadata = {
            "file_path": pdf_path,
            "file_name": Path(pdf_path).name,
            "file_size": os.path.getsize(pdf_path) if os.path.exists(pdf_path) else None,
        }
        try:
            if pdf_reader.metadata:
                # Extract common PDF metadata keys if available
                metadata.update({
                    "title": pdf_reader.metadata.get('/Title', ''),
                    "author": pdf_reader.metadata.get('/Author', ''),
                    "subject": pdf_reader.metadata.get('/Subject', ''),
                    "creator": pdf_reader.metadata.get('/Creator', ''),
                    "producer": pdf_reader.metadata.get('/Producer', ''),
                    "creation_date": str(pdf_reader.metadata.get('/CreationDate', '')),
                    "modification_date": str(pdf_reader.metadata.get('/ModDate', '')),
                })
        except Exception as e:
            logger.warning(f"Metadata extraction failed: {e}")
        return metadata

    def parse_pdf(self, pdf_path: str) -> Dict[str, Any]:
        """
        Parse the complete PDF file. Opens the file, extracts text and metadata page by page, removes headers/footers if configured, and aggregates the results.
        Args:
            pdf_path: Path to the PDF file
        Returns:
            Dictionary with keys:
                - 'full_text': combined text from all pages,
                - 'pages': list of page-wise dicts with text and metadata,
                - 'document_metadata': file and PDF metadata,
                - 'total_pages': total pages in the PDF,
                - 'processed_pages': number of pages kept after filtering,
                - 'total_words': total word count of the parsed text
        """
        try:
            with open(pdf_path, 'rb') as file:
                pdf_reader = PdfReader(file)
                doc_metadata = self._extract_document_metadata(pdf_reader, pdf_path)
                pages_data = []
                # Iterate over all pages and extract data
                for i, page in enumerate(pdf_reader.pages):
                    page_data = self.extract_text_from_page(page, i)
                    # Only keep pages with sufficient text length
                    if len(page_data["text"]) >= self.min_text_length:
                        pages_data.append(page_data)
                # Remove repeated headers and footers
                pages_data = self._remove_headers_footers(pages_data)
                # Combine all page texts with a double newline as a separator
                full_text = "\n\n".join(page["text"] for page in pages_data if page["text"])
                # Return the final structured data
                return {
                    "full_text": full_text,
                    "pages": pages_data,
                    "document_metadata": doc_metadata,
                    "total_pages": len(pdf_reader.pages),
                    "processed_pages": len(pages_data),
                    "total_words": sum(page["word_count"] for page in pages_data)
                }
        except Exception as e:
            logger.error(f"Failed to parse PDF {pdf_path}: {e}")
            raise
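
To sanity-check the parser on its own before adding LangChain, a minimal sketch like the following works; "sample.pdf" is just a placeholder for any PDF you have locally and is not part of the project files above:

from parser import CustomPDFParser

# Parse a local PDF and inspect the aggregated results ("sample.pdf" is a placeholder path)
parser = CustomPDFParser(preserve_layout=False, remove_headers_footers=True)
result = parser.parse_pdf("sample.pdf")

print(result["document_metadata"].get("title", ""))                       # document title, if present
print(result["total_pages"], "pages,", result["processed_pages"], "kept") # pages found vs. pages kept
print(result["full_text"][:300])                                          # first 300 characters of the cleaned text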

     

     

Step 2: Integrate with LangChain (langchain_loader.py)

     
The LangChainPDFLoader class wraps the custom parser and converts parsed pages into LangChain Document objects, which are the building blocks for LangChain pipelines.

• It allows chunking of documents into smaller pieces using LangChain’s RecursiveCharacterTextSplitter
• You can customize chunk sizes and overlap for downstream LLM input
• This loader supports clean integration between raw PDF content and LangChain’s document abstraction

The logic behind this is:

from typing import List, Optional, Dict, Any
from langchain.schema import Document
from langchain.document_loaders.base import BaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from parser import CustomPDFParser  # import the parser defined above


class LangChainPDFLoader(BaseLoader):
    def __init__(
        self, file_path: str, parser_config: Optional[Dict[str, Any]] = None, chunk_size: int = 500, chunk_overlap: int = 50
    ):
        """
        Initialize the loader with the PDF file path, parser configuration, and chunking parameters.
        Args:
            file_path: path to the PDF file
            parser_config: dictionary of parser options
            chunk_size: chunk size for splitting long texts
            chunk_overlap: chunk overlap for splitting
        """
        self.file_path = file_path
        self.parser_config = parser_config or {}
        self.chunk_size = chunk_size
        self.chunk_overlap = chunk_overlap
        self.parser = CustomPDFParser(**self.parser_config)

    def load(self) -> List[Document]:
        """
        Load the PDF, parse its pages, and convert each page to a LangChain Document.
        Returns:
            List of Document objects with page text and combined metadata.
        """
        parsed_data = self.parser.parse_pdf(self.file_path)
        documents = []
        # Convert each page dict to a LangChain Document
        for page_data in parsed_data["pages"]:
            if page_data["text"]:
                # Merge document-level and page-level metadata
                metadata = {**parsed_data["document_metadata"], **page_data["metadata"]}
                doc = Document(page_content=page_data["text"], metadata=metadata)
                documents.append(doc)
        return documents

    def load_and_split(self) -> List[Document]:
        """
        Load the PDF and split large documents into smaller chunks.
        Returns:
            List of Document objects after splitting large texts.
        """
        documents = self.load()
        # Initialize a text splitter with the desired chunk size and overlap
        text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=self.chunk_size,
            chunk_overlap=self.chunk_overlap,
            separators=["\n\n", "\n", " ", ""]  # hierarchical splitting
        )
        # Split documents into smaller chunks
        split_docs = text_splitter.split_documents(documents)
        return split_docs
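
Before moving on, here is a small usage sketch of the loader; "my_document.pdf" is a placeholder path and the parser_config values are only examples:

from langchain_loader import LangChainPDFLoader

# Load one Document per page, then split into overlap-aware chunks
loader = LangChainPDFLoader("my_document.pdf", parser_config={"preserve_layout": False}, chunk_size=500, chunk_overlap=50)
pages = loader.load()
chunks = loader.load_and_split()

print(len(pages), "pages ->", len(chunks), "chunks")
print(chunks[0].metadata.get("page_number"), chunks[0].page_content[:200])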

     

Step 3: Build a Processing Pipeline (pipeline.py)

     
The PDFProcessingPipeline class provides a higher-level interface for:

• Processing a single PDF
• Selecting the output format (raw dict, LangChain documents, or plain text)
• Enabling or disabling chunking with configurable chunk sizes
• Handling errors and logging

This abstraction allows easy integration into larger applications or workflows. The logic behind this is:

from typing import List, Optional, Dict, Any
from langchain.schema import Document
from parser import CustomPDFParser
from langchain_loader import LangChainPDFLoader
import logging

logger = logging.getLogger(__name__)


class PDFProcessingPipeline:
    def __init__(self, parser_config: Optional[Dict[str, Any]] = None):
        """
        Args:
            parser_config: dictionary of options passed to CustomPDFParser
        """
        self.parser_config = parser_config or {}

    def process_single_pdf(
        self, pdf_path: str, output_format: str = "langchain", chunk_documents: bool = True, chunk_size: int = 500, chunk_overlap: int = 50
    ) -> Any:
        """
        Args:
            pdf_path: path to the PDF file
            output_format: "raw" (dict), "langchain" (Documents), or "text" (string)
            chunk_documents: whether to split LangChain documents into chunks
            chunk_size: chunk size for splitting
            chunk_overlap: chunk overlap for splitting
        Returns:
            Parsed content in the requested format
        """
        if output_format == "raw":
            # Use the raw CustomPDFParser output
            parser = CustomPDFParser(**self.parser_config)
            return parser.parse_pdf(pdf_path)
        elif output_format == "langchain":
            # Use the LangChain loader, optionally chunked
            loader = LangChainPDFLoader(pdf_path, self.parser_config, chunk_size, chunk_overlap)
            if chunk_documents:
                return loader.load_and_split()
            else:
                return loader.load()
        elif output_format == "text":
            # Return the combined plain text only
            parser = CustomPDFParser(**self.parser_config)
            parsed_data = parser.parse_pdf(pdf_path)
            return parsed_data.get("full_text", "")
        else:
            raise ValueError(f"Unknown output_format: {output_format}")
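
As a quick illustration of how the three output formats differ, here is a short sketch; "report.pdf" is a placeholder path and the config values are only examples:

from pipeline import PDFProcessingPipeline

pipeline = PDFProcessingPipeline({"remove_headers_footers": True, "min_text_length": 20})

# Combined plain text only
text = pipeline.process_single_pdf("report.pdf", output_format="text")

# Chunked LangChain Documents, ready for embedding or LLM input
chunks = pipeline.process_single_pdf("report.pdf", output_format="langchain", chunk_documents=True, chunk_size=800)

print(len(text), "characters of plain text,", len(chunks), "chunks")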

     

Step 4: Test the Parser (example.py)

     
Let’s test the parser as follows:

import os
from pathlib import Path
from pipeline import PDFProcessingPipeline


def main():
    print("👋 Welcome to the Custom PDF Parser!")
    print("What would you like to do?")
    print("1. View full parsed raw data")
    print("2. Extract full plain text")
    print("3. Get LangChain documents (no chunking)")
    print("4. Get LangChain documents (with chunking)")
    print("5. Show document metadata")
    print("6. Show per-page metadata")
    print("7. Show cleaned page text (header/footer removed)")
    print("8. Show extracted image metadata")
    choice = input("Enter the number of your choice: ").strip()
    if choice not in {'1', '2', '3', '4', '5', '6', '7', '8'}:
        print("❌ Invalid option.")
        return
    file_path = input("Enter the path to your PDF file: ").strip()
    if not Path(file_path).exists():
        print("❌ File not found.")
        return
    # Initialize the pipeline
    pipeline = PDFProcessingPipeline({
        "preserve_layout": False,
        "remove_headers_footers": True,
        "extract_images": True,
        "min_text_length": 20
    })
    # Raw data is needed for most options
    parsed = pipeline.process_single_pdf(file_path, output_format="raw")
    if choice == '1':
        print("\nFull Raw Parsed Output:")
        for k, v in parsed.items():
            print(f"{k}: {str(v)[:300]}...")
    elif choice == '2':
        print("\nFull Cleaned Text (truncated preview):")
        print("Previewing the first 1000 characters:\n" + parsed["full_text"][:1000], "...")
    elif choice == '3':
        docs = pipeline.process_single_pdf(file_path, output_format="langchain", chunk_documents=False)
        print(f"\nLangChain Documents: {len(docs)}")
        print("Previewing the first 500 characters:\n", docs[0].page_content[:500], "...")
    elif choice == '4':
        docs = pipeline.process_single_pdf(file_path, output_format="langchain", chunk_documents=True)
        print(f"\nLangChain Chunks: {len(docs)}")
        print("Sample chunk content (first 500 chars):")
        print(docs[0].page_content[:500], "...")
    elif choice == '5':
        print("\nDocument Metadata:")
        for key, value in parsed["document_metadata"].items():
            print(f"{key}: {value}")
    elif choice == '6':
        print("\nPer-page Metadata:")
        for i, page in enumerate(parsed["pages"]):
            print(f"Page {i+1}: {page['metadata']}")
    elif choice == '7':
        print("\nCleaned Text After Header/Footer Removal.")
        print("Showing the first 3 pages and the first 500 characters of the text from each page.")
        for i, page in enumerate(parsed["pages"][:3]):  # First 3 pages
            print(f"\n--- Page {i+1} ---")
            print(page["text"][:500], "...")
    elif choice == '8':
        print("\nExtracted Image Metadata (if available):")
        found = False
        for i, page in enumerate(parsed["pages"]):
            images = page["metadata"].get("images", [])
            if images:
                found = True
                print(f"\n--- Page {i+1} ---")
                for img in images:
                    print(img)
        if not found:
            print("No image metadata found.")


if __name__ == "__main__":
    main()

     

Run this and you’ll be prompted to enter your choice number and the path to the PDF. The PDF I’m using is publicly accessible, and you can download it using the link.

👋 Welcome to the Custom PDF Parser!
What would you like to do?
1. View full parsed raw data
2. Extract full plain text
3. Get LangChain documents (no chunking)
4. Get LangChain documents (with chunking)
5. Show document metadata
6. Show per-page metadata
7. Show cleaned page text (header/footer removed)
8. Show extracted image metadata
Enter the number of your choice: 4
Enter the path to your PDF file: /content/articles.pdf
    
Output:
LangChain Chunks: 16
First chunk preview:
San José State University Writing Center
www.sjsu.edu/writingcenter
Written by Ben Aldridge

Articles (a/an/the), Spring 2014.                                                                                   1 of 4
Articles (a/an/the)

There are three articles in the English language: a, an, and the. They are placed before nouns
and show whether a given noun is general or specific.

Examples of Articles

     

    Conclusion

     
In this guide, you’ve learned how to build a flexible and powerful PDF processing pipeline using only open-source tools. Because it’s modular, you can easily extend it: maybe add a search bar using Streamlit, store chunks in a vector database like FAISS for smarter lookups, or even plug this into a chatbot. You don’t have to rebuild anything, you just connect the next piece. PDFs don’t have to feel like locked boxes anymore. With this approach, you can turn any document into something you can read, search, and understand on your own terms.
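
As one example of that kind of extension, here is a minimal sketch of the FAISS idea, assuming the classic LangChain API used above plus faiss-cpu and sentence-transformers installed; the file path, embedding model, and query string are placeholders:

from langchain.vectorstores import FAISS
from langchain.embeddings import HuggingFaceEmbeddings
from langchain_loader import LangChainPDFLoader

# Index the parsed chunks in a local FAISS store ("my_document.pdf" is a placeholder path)
chunks = LangChainPDFLoader("my_document.pdf").load_and_split()
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vector_store = FAISS.from_documents(chunks, embeddings)

# Ask a natural-language question against the indexed chunks
for doc in vector_store.similarity_search("What is this document about?", k=3):
    print(doc.metadata.get("page_number"), doc.page_content[:200])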
     
     

Kanwal Mehreen is a machine learning engineer and a technical writer with a profound passion for data science and the intersection of AI with medicine. She co-authored the book “Maximizing Productivity with ChatGPT”. As a Google Generation Scholar 2022 for APAC, she champions diversity and academic excellence. She is also recognized as a Teradata Diversity in Tech Scholar, Mitacs Globalink Research Scholar, and Harvard WeCode Scholar. Kanwal is an ardent advocate for change, having founded FEMCodes to empower women in STEM fields.
