Senior ML Engineer
Role Overview
You will lead the creation and productionization of our AI-driven search pipeline, from building vector indexes and deploying RAG-based systems to designing scalable APIs. You'll work closely with our engineering team to ingest structured legal data, vectorize it, and ensure seamless integration with our user-facing web application. This role requires deep technical expertise, a product-focused mindset, and enthusiasm for learning new techniques in the fast-changing AI landscape.
Key Responsibilities
1. AI-Based Search Development & Optimization
- Design and build AI-powered search models that improve retrieval and ranking of legal documents.
- Implement retrieval-augmented generation (RAG) workflows using pre-trained LLMs (e.g., OpenAI GPT-4).
- Fine-tune LLMs for legal use cases where necessary (experience with custom LLM training is a strong plus).
- Improve search quality through relevance testing, feedback loops, and query understanding.
- Research and implement new techniques for improving search result relevance.
2. Data Processing & Vector Indexing
- Build pipelines to ingest, chunk, and vectorize legal texts (case law, statutes, etc.).
- Create and maintain indexes in vector databases to support fast, relevant retrieval.
- Maintain an evolving legal search index by ingesting new documents on a weekly basis.
3. Model Deployment & API Development
- Deploy ML models into production using Azure cloud infrastructure.
- Develop REST APIs (with FastAPI or Flask) to expose model functionality to the application layer.
- Monitor and optimize latency, scalability, and reliability of deployed solutions.
4. Collaboration & Product Integration
- Work closely with product managers and full-stack engineers to ship ML-backed features.
- Participate in design reviews and own technical decisions around AI architecture.
- Track and improve system performance using user feedback, telemetry, and experimentation.
Tech Stack & Tools
- ML/NLP: Python, PyTorch/TensorFlow, Hugging Face, Azure OpenAI APIs
- Vector Search: Azure AI Search (primary); experience with FAISS, Pinecone, or Elasticsearch is a plus
- Deployment: Azure (App Services, Azure Functions, Blob Storage, Key Vault)
- Data Processing: Pandas, NumPy, spaCy, NLTK
- APIs: REST APIs built with FastAPI or Flask
Required Skills & Experience
- 5+ years of experience in machine learning, NLP, or AI-based search systems.
- Strong knowledge of vector search, document embeddings, and retrieval techniques.
- Experience building and scaling RAG pipelines with LLMs.
- Proficiency with Azure AI Search for document indexing and search optimization.
- Demonstrated ability to deploy models to production and build robust APIs.
- Familiarity with search ranking algorithms (BM25, hybrid search, learning-to-rank).
- Experience working with document-heavy datasets in legal, academic, or enterprise domains.
- Experience fine-tuning models and creating the datasets used for fine-tuning.
- This is an on-site position in Lucknow, India.
Nice to Have
- Background in legal tech, contract analysis, or legal document retrieval.
- Exposure to open-source search frameworks like Elasticsearch or OpenSearch.
- Knowledge of observability, logging, and system performance profiling.