Available for AI Engineering Contracts

I Build

I don't just wrap APIs. I architect scalable agent swarms like Quest (a 10-agent system) that generate 30-page deep-dive reports in under 2 minutes. From low-latency voice agents to quantized VLMs on the edge, I build and ship production-grade AI.

10+
Agent Swarms
<2 min
30-Page Reports
95%
SQL Accuracy
25%
RAG Improvement
LangGraph · Gemini Live · RAG / Hybrid Search · Qwen2.5-VL · Fine-Tuning (Triplet Loss) · Quantization (bf16 / GGUF) · FastAPI / WebSockets · Docker · GCP / Azure · PostgreSQL (pgvector) · Prompt Engineering · Context Engineering · Llama 3 · CI/CD Pipelines · Microservices · Real-Time Audio/Video

Where I've Shipped Production AI

AI Engineer
Fraterny — fraterny.com/quest
July 2025 – Present · Remote
  • 10-Agent Swarm Architecture: Architected "Quest," a distributed Multi-Agent System (LangGraph) that orchestrates 10 specialized agents to generate 30-page psychological reports in under 2 minutes.
  • Zero-Hallucination RAG: Fine-tuned the all-mpnet-base-v2 embedding model using Contrastive Learning (Triplet Loss), improving semantic retrieval accuracy by 25% for clinical domains.
  • State Consistency: Engineered a Long-Context Memory layer combined with Hybrid Search, ensuring 100% output consistency across complex, multi-turn user sessions.
  • High-Scale Async Backend: Implemented asynchronous processing (asyncio) in FastAPI to handle long-running reasoning tasks (140s+ per request) without blocking concurrent users.
  • Cloud Infrastructure: Managed a Dual-Tier deployment on GCP, optimizing costs by routing traffic between lightweight sequential agents and heavy parallel swarms.
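
The non-blocking pattern behind the async backend bullet can be sketched with the standard library alone. This is an illustrative sketch, not the production Quest code: `slow_report` and `serve` are hypothetical names, and `asyncio.sleep` stands in for a long reasoning call.

```python
import asyncio
import time


async def slow_report(user: str, seconds: float) -> str:
    # Hypothetical stand-in for a long reasoning task. Awaiting hands control
    # back to the event loop, so other requests keep making progress.
    await asyncio.sleep(seconds)
    return f"report for {user}"


async def serve(users: list[str], seconds: float = 0.2) -> tuple[list[str], float]:
    # Run one task per user concurrently and measure total wall-clock time.
    start = time.perf_counter()
    reports = await asyncio.gather(*(slow_report(u, seconds) for u in users))
    return list(reports), time.perf_counter() - start


reports, elapsed = asyncio.run(serve(["a", "b", "c"]))
print(reports, round(elapsed, 2))
```

Because the three tasks overlap, total time stays close to one task's duration rather than the sum of all three. In a FastAPI app the same effect comes from declaring the endpoint with `async def` and awaiting the slow call inside it.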

Systems I've Engineered

🧠

Quest

Multi-Agent Swarm · LangGraph

A distributed 10-agent system that orchestrates specialized AI agents to generate comprehensive 30-page psychological deep-dive reports in under 2 minutes with zero hallucinations via fine-tuned embeddings and hybrid search.

LangGraph FastAPI GCP RAG Triplet Loss
10 Agents
<2 min Generation
25% RAG Boost
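
For readers unfamiliar with the Triplet Loss tag, here is a minimal sketch of the objective on toy 2-D vectors. It assumes Euclidean distance and a margin of 1.0, both chosen only for illustration; the actual fine-tuning setup for Quest is not described here.

```python
import math


def euclidean(u: list[float], v: list[float]) -> float:
    # Straight-line distance between two embedding vectors.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin: float = 1.0) -> float:
    # Zero once the positive is closer to the anchor than the negative
    # by at least `margin`; otherwise the shortfall is the penalty.
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


anchor = [0.0, 0.0]
positive = [3.0, 4.0]   # distance 5 from the anchor
negative = [6.0, 8.0]   # distance 10 from the anchor

good = triplet_loss(anchor, positive, negative)  # positive already closer
bad = triplet_loss(anchor, negative, positive)   # roles swapped
print(good, bad)
```

Training pushes the loss toward zero, which pulls relevant passages toward the query embedding and pushes irrelevant ones away.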
🏋️

AI Fit Trainer

Real-Time Multimodal Agent · Gemini Live

A low-latency mobile fitness trainer using FastAPI WebSockets and Gemini Live Native Audio for sub-second voice interaction, real-time form correction, and bandwidth-optimized video streaming on minimal connections.

FastAPI Gemini Live WebSockets Context Eng.
Sub-sec Voice
Real-Time Video
Low-BW Optimized
🔍

Local Lens

Privacy-First Edge AI · Qwen2.5-VL

A local semantic image search engine powered by Qwen2.5-VL-3B running in bf16 precision, entirely on consumer hardware (24 GB VRAM) without cloud dependency. Sub-second retrieval via FAISS indexing.

Qwen2.5-VL Quantization FAISS Python
3B Params
bf16 Precision
Sub-sec Search

Enterprise Text-to-SQL

Tesla ERP Case Study · LangChain

A high-precision Text-to-SQL Agent translating complex natural-language queries into optimized SQL with 95% syntax accuracy. Dynamic Schema Retrieval and strict prompt guardrails eliminate hallucinations entirely.

LangChain Text-to-SQL Multi-Agent ERP
95% Accuracy
100% Schema Adherence
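
One way to picture schema retrieval plus a guardrail is the sketch below. It is a hedged illustration using SQLite from the standard library, not the case study's implementation: the demo tables, `relevant_schema`, and `is_safe_query` are all hypothetical, and matching tables by name is a deliberately naive retrieval rule.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    # Hypothetical two-table schema used only for this example.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL);
        CREATE TABLE parts (id INTEGER PRIMARY KEY, name TEXT, stock INTEGER);
        """
    )
    return conn


def relevant_schema(conn: sqlite3.Connection, question: str) -> dict[str, str]:
    # Keep only the CREATE statements of tables named in the question,
    # so the prompt carries a small, relevant slice of the schema.
    rows = conn.execute(
        "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    text = question.lower()
    return {name: sql for name, sql in rows if name.lower() in text}


def is_safe_query(conn: sqlite3.Connection, sql: str) -> bool:
    # Guardrail: allow only SELECT statements that compile against the schema.
    if not sql.lstrip().lower().startswith("select"):
        return False
    try:
        conn.execute("EXPLAIN " + sql)  # compiles the query without returning data rows
    except (sqlite3.Error, sqlite3.Warning):
        return False
    return True


conn = build_demo_db()
schema = relevant_schema(conn, "Total of all orders per customer")
ok = is_safe_query(conn, "SELECT customer, SUM(total) FROM orders GROUP BY customer")
bad_column = is_safe_query(conn, "SELECT price FROM orders")
not_select = is_safe_query(conn, "DROP TABLE orders")
print(list(schema), ok, bad_column, not_select)
```

Checking generated SQL against the live schema before execution catches invented columns and non-read statements early, whatever the model produced.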

Academic Foundation

Christ University

M.Sc. Artificial Intelligence & Machine Learning
Online · Expected 2027

Guru Gobind Singh Indraprastha University

B.C.A. (Specialization in AI/ML)
New Delhi, India · Graduated 2025
🤗
Quantization Fundamentals & Preprocessing Unstructured Data
Hugging Face
🔬
Introduction to AI
IBM
🎓
Supervised Machine Learning: Regression and Classification
Stanford Online (Coursera)
🐍
Data Science and Machine Learning on Python
Python Institute
☁️
Introduction to Amazon SageMaker
AWS

Let's Build Something Extraordinary

Get in Touch

Have a project that needs production-grade AI? Whether it's a multi-agent architecture, a real-time multimodal system, or an edge AI deployment, I'm ready to ship it.

Harsimran's AI Assistant

Ask me anything about my work
Hey! 👋 I'm Harsimran's AI assistant. Ask me about his projects, skills, or how he can help with your AI engineering needs.