Available for new opportunities

Hargurjeet Singh Ganger

Senior Data Scientist

Building production-grade LLM systems, RAG pipelines, and agentic workflows at British Telecom. 15+ years turning complex data into measurable business outcomes.

LangChain · CrewAI · AWS Bedrock · RAG · LangGraph · FastAPI · Python

Open To
Lead / Principal GenAI Scientist
AI Platform Architecture
Enterprise RAG & Agentic Systems
Senior Data Science & ML

15+

Years Experience

TCS → Shell → BT

$10M+

Business Value Generated

Across AI & ML initiatives

100K+

Documents Processed

Multimodal, production scale

7+

AI Systems Shipped

In telecom, energy & IT

About Me

From IT Analyst to AI Systems Builder

Hargurjeet Singh Ganger

At British Telecom I'm leading the Generative AI charge — deploying RAG-powered chatbots on AWS Bedrock, building multi-agent workflows with CrewAI, and designing LLM evaluation frameworks that catch hallucinations before they reach production. My focus is always the same: AI that works in the real world, not just the notebook.

Before that, at Royal Dutch Shell, I moved into data science — building predictive maintenance models that cut equipment downtime by 25% across oil refineries, and forecasting dashboards that saved 10% of budget allocations across five geographies. That's where I fell in love with the gap between a working model and a working solution.

I started my career at TCS testing point-of-sale systems, spending a year in the UK guiding offshore teams through complex software rollouts. Those early years taught me how enterprise systems break under real conditions — a foundation that still shapes how I build today.

LLMs · RAG · Agentic AI · MLOps · AWS · Python · GenAI

Value I Bring

What I Can Solve

Scalable RAG pipelines

Design and ship hybrid retrieval systems (BM25 + dense vectors + reranker) that process 100K+ multimodal documents with 90%+ accuracy in production.

Agentic AI workflows

Architect multi-agent systems with CrewAI and LangGraph — tool-augmented, guardrailed, and evaluated with Ragas before they touch production.

LLM evaluation frameworks

Build frameworks that measure faithfulness, relevancy, hallucination, toxicity, and bias — so model upgrades don't silently degrade your product.

Research → production AI

Take proof-of-concept LLM experiments and harden them: containerise, add CI/CD gates, instrument with latency and drift monitoring on AWS.

AWS Bedrock & GenAI infra

Deploy secure, cost-efficient GenAI systems using Bedrock, Textract, OpenSearch, and Step Functions — with CloudWatch observability built in.

ML-driven business outcomes

Translate messy enterprise data into XGBoost, Random Forest, or deep-learning models that move real metrics: 30% lower maintenance costs, 25% less unplanned downtime, 10% of budget saved.

Expertise

Technical Skills

Generative AI

RAG Patterns · Agents · Fine-tuning · Guardrails · PII Filtering · Vector DB · Observability · MCP

LLM Frameworks & APIs

LangChain · LangGraph · CrewAI · OpenAI API · Anthropic API · Gemini API · Langfuse · RAGAS

MLOps

Docker · FastAPI · MLflow · CI/CD · GitHub Actions · GitLab · AWS SageMaker · Feature Store

Cloud (AWS)

Bedrock · Textract · OpenSearch · Lambda · Step Functions · CloudWatch · SageMaker

Core ML & Data

XGBoost · Random Forests · PyTorch · TensorFlow · scikit-learn · BERT · PySpark · SQL

Career

Work Experience

Senior Data Scientist

May 2022 – Present

British Telecom (BT) · London, UK

  • Deployed RAG-powered chatbot using LLMs + AWS Bedrock, reducing manual data extraction time by 70%
  • Built multi-agent workflows with CrewAI for batch processing, with hallucination guardrails
  • Designed LLM evaluation framework using Ragas — faithfulness, relevancy, bias, and hallucination detection
  • Processed 100K+ multimodal documents with 90%+ accuracy using AWS Textract + OpenSearch
  • Engineered recommendation models (XGBoost + market basket analysis) increasing VAS sales by 30%

Data Scientist

Sep 2016 – May 2022

Royal Dutch Shell · Netherlands / Remote

  • Built predictive maintenance models (XGBoost, Random Forests) cutting maintenance costs by 30% and unplanned downtime by 25%
  • Developed Power BI forecasting dashboard for materials on-time delivery across 5 geographies, saving 10% budget
  • Applied NLP, clustering, and statistical inference across large datasets using PySpark and SparkSQL

IT Analyst

Dec 2010 – Aug 2016

Tata Consultancy Services (TCS) · India / UK

  • Led System Integration Testing & UAT to validate client PoS systems at enterprise scale
  • Spent one year in the UK guiding offshore teams through new PoS software implementation
  • Worked with card and payment systems, PCI standards, and ISO 8583 protocols

Work

Featured Projects

Live App

Production-Grade RAG Evaluation Pipeline

  • Hybrid retrieval — BM25 sparse + contextual dense search — fed through a Cohere reranker
  • Citation enforcement grounds every answer in source documents; no hallucinated references
  • Prompts version-controlled in a config file — every change is tracked and reproducible
  • Offline RAGAS script measures faithfulness, answer relevancy, and context precision
  • GitHub Actions gate runs eval on every PR; merge blocked if any metric drops below threshold

Outcome: Quality regressions caught at PR stage, not in production. Stack: LangChain/LangGraph, Chroma vector store, Cohere reranker — every retrieval step traceable, every prompt change auditable.
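The merge gate described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual script: the threshold values, the `THRESHOLDS` dictionary, and the `failing_metrics` and `gate` function names are all assumptions, and the scores are expected to come from the offline evaluation step as a plain dictionary.

```python
import sys

# Hypothetical minimum scores; the real thresholds are not stated here.
THRESHOLDS = {
    "faithfulness": 0.85,
    "answer_relevancy": 0.80,
    "context_precision": 0.75,
}


def failing_metrics(scores, thresholds=THRESHOLDS):
    """Return {metric: (score, minimum)} for every metric below its minimum.

    A metric missing from `scores` counts as a failure.
    """
    failures = {}
    for name, minimum in thresholds.items():
        score = scores.get(name)
        if score is None or score < minimum:
            failures[name] = (score, minimum)
    return failures


def gate(scores, thresholds=THRESHOLDS):
    """Return a process exit code: 0 lets the PR merge, 1 blocks it."""
    failures = failing_metrics(scores, thresholds)
    for name, (score, minimum) in failures.items():
        print(f"FAIL {name}: {score} < {minimum}", file=sys.stderr)
    return 1 if failures else 0
```

In a CI job, the script would end with `sys.exit(gate(scores))`; a non-zero exit code fails the workflow step, which is what lets a branch-protection rule block the merge.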

RAG · RAGAS · BM25 · LangChain · LangGraph · Cohere Reranker · Chroma · GitHub Actions · Python

Live App

AI-Powered Resume Parser

  • PDF → structured JSON pipeline: pdfplumber extracts text → Llama 3.3 70B parses via Fireworks AI
  • JSON schema enforcement: instructor library constrains LLM output to an exact Pydantic v2 model
  • Retry mechanism: catches invalid outputs, re-prompts the LLM once, then fails gracefully — no silent errors
  • Split-view UI: original PDF alongside experience timeline, color-coded skill tags, and education cards
  • One-click JSON export, dark/light mode, drag-and-drop upload with animated progress steps

Outcome: Live on Fly.io. Structured output guaranteed at the schema level — retry logic and graceful failure handle the edge cases that plain prompting misses.
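The retry-once pattern above can be illustrated with a standard-library sketch. It is an assumption-laden stand-in, not the deployed code: plain `json` checks replace the instructor and Pydantic validation, and `REQUIRED_FIELDS`, `ParseFailure`, `validate`, `parse_resume`, and the `call_llm` callable are hypothetical names introduced for illustration.

```python
import json

# Hypothetical schema: the real Pydantic model's fields are not listed here.
REQUIRED_FIELDS = ("name", "experience", "skills", "education")


class ParseFailure(Exception):
    """Raised when the model output is still invalid after the retry."""


def validate(raw):
    """Parse raw model output and check that the required fields exist."""
    data = json.loads(raw)  # malformed JSON raises JSONDecodeError, a ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [f for f in REQUIRED_FIELDS if f not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return data


def parse_resume(text, call_llm, max_retries=1):
    """Call the model, validate, and re-prompt on failure up to max_retries times.

    `call_llm` is any callable that takes a prompt string and returns a string.
    """
    prompt = f"Extract the resume below as JSON.\n\n{text}"
    last_error = None
    for _ in range(max_retries + 1):
        raw = call_llm(prompt)
        try:
            return validate(raw)
        except ValueError as err:
            last_error = err
            # Feed the validation error back so the model can correct itself.
            prompt = f"{prompt}\n\nYour previous output was invalid ({err}). Return valid JSON only."
    raise ParseFailure(f"output still invalid after {max_retries} retry: {last_error}")
```

Raising a dedicated exception after the final attempt is what keeps the failure visible: the caller can map it to an explicit error response instead of passing partial data downstream.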

FastAPI · Next.js · Llama 3.3 70B · Fireworks AI · pdfplumber · Pydantic · Fly.io

Open Source

Local AI Assistant using Ollama

  • Runs Phi-3 Mini, Mistral 7B, and LLaMA 3.2 fully on-device via Ollama — no cloud calls
  • FastAPI backend for offline operation with structured JSON outputs
  • Pydantic validation and retry logic ensure consistent, well-formed responses
  • Benchmarked all three models across speed, accuracy, and resource trade-offs

Outcome: Rigorous benchmark published on Dev.to — documented which model wins for which use case on consumer hardware.

Ollama · FastAPI · LLaMA 3.2 · Mistral 7B · Phi-3 · Pydantic · Python

Academic

Education

2023 – 2025

Liverpool John Moores University

M.S. Machine Learning & Artificial Intelligence

Liverpool, UK

2022 – 2023

IIIT Bangalore

Executive PG in Data Science & AI

Bangalore, India

Statistics & Probability · ML · NLP · Neural Networks · MLOps

2006 – 2010

New Horizon College of Engineering

B.E. Electronics & Communication

Bangalore, India

Visvesvaraya Technological University

Get in Touch

Let's Connect

Open to senior data science and AI engineering roles. If you're building something ambitious with LLMs or agentic systems, I'd love to talk.