
Ruby Jha

Engineering Manager · Applied AI · Cloud

The next decade belongs to engineering leaders who build with AI.

I spent the last two decades leading engineering teams at State Street, Centene, and EY, building products that handle real money, real patient data, and real regulatory scrutiny. This is the kind of work where downtime means someone's claim doesn't get processed and a bad deployment means financial loss.

Now I am applying that same engineering discipline to AI. I am building nine AI systems covering RAG pipelines, embedding fine-tuning, and multi-agent orchestration. Each one has evaluation frameworks, architecture decision records, and metrics I would trust in a code review: the same standards I would hold any production system to.

The Full Stack

Leadership

People Management · Hiring & Team Building · Performance & Promotions · Executive Communication · Technical Strategy

Technical

Python · Java · TypeScript · OpenAI API · LangChain · CrewAI · FastAPI · ChromaDB · Azure · Docker · Kubernetes · React · Spring Boot · Astro

Featured Projects

What I'm building

Project 01
Demo: Streamlit · Completed

Synthetic Data Generation Pipeline

I built a pipeline that generates synthetic training data, validates it with an LLM judge, and self-corrects until every record passes. Started with a 20% failure rate, ended at zero.

Python · Pydantic · OpenAI API · GPT-4o-mini · GPT-4o · +1

Project 02
Demo: Streamlit · Completed

RAG Evaluation Pipeline

I tested 16 RAG configurations and found that semantic chunking + OpenAI embeddings + Cohere reranking reaches a Recall@5 of 0.747 on structured Markdown docs. This is how I got there.

Python · LangChain · RAGAS · Sentence-Transformers · Braintrust · +4

Project 03
Completed

Contrastive Embedding Fine-Tuning

I fine-tuned all-MiniLM-L6-v2 on 1,475 dating profile pairs and flipped the Spearman correlation from -0.22 to +0.85. LoRA got 96.9% of that using 0.32% of the parameters.

Python · Sentence-Transformers · PEFT/LoRA · PyTorch · UMAP · +2

Project 05
Completed

ShopTalk Knowledge Management Agent

I built a RAG system from scratch, without LangChain, and tested 46 configurations across 5 chunking strategies, 4 embedding models, and 3 retrieval methods. I found that heading-aware chunking + OpenAI embeddings hits NDCG@5 = 0.896 and Recall@5 = 1.0.

Python · PyMuPDF · FAISS · SentenceTransformers · OpenAI · +6

Latest Blog Posts

structured-output Apr 7, 2026

The Decision Chain That Got Structured Output to 100%

How Instructor, flat schemas, and two-phase validation got me to 100% structured output success across 580 LLM-generated records.

10 min read

leadership Apr 1, 2026

When Leadership Promises Don't Survive the Reorg

What to do when your team was promised promotions or headcount that never materialized, and you're the new manager holding the bag.

6 min read

leadership Apr 1, 2026

Your Team Is Doing the Work. Someone Else Is Taking the Credit.

How to fix invisible attribution in distributed teams, and why credit theft is the most corrosive trust pattern a manager can inherit.

6 min read

More about my background