Hello! I am Saurabh Kumar Singh, a Senior Consultant at Deloitte, Bangalore. I am an AI/ML application developer with 5+ years of experience in end-to-end development and deployment, with strong hands-on experience across GenAI, Computer Vision, and ML infrastructure.
I specialize in developing end-to-end GenAI RAG engines and document processing pipelines for enterprise applications, incorporating Knowledge Graphs, Vector Databases, and NLP techniques. I am proficient in ML model optimization and production deployment, including LLM fine-tuning, Triton Inference Server optimization, model distillation/quantization, and scalable inference pipelines on cloud-native platforms.
What I Do
Generative AI & LLMs
Building enterprise RAG engines, AI agents, and chatbots using OpenAI, Gemini, and open-source LLMs with LangChain & LlamaIndex.
ML Optimization & Serving
Triton Inference Server, model distillation/quantization, ONNX/TensorRT optimization, and scalable inference pipelines on cloud-native platforms.
Computer Vision & Edge AI
Intelligent Video Analytics on NVIDIA DeepStream & Jetson. Object detection, tracking, and real-time inference at scale.
Knowledge Graphs & Vector DBs
Designing hybrid retrieval systems with Neo4j, ChromaDB, Pinecone, and graph-augmented RAG for enterprise search.