AI Engineer

Hyderabad, Telangana, India

Job Type

Full Time

Workspace

Hybrid

Salary Range

INR 30 - 45 Lakhs (mid) / 70 Lakhs - 1 Cr+ (senior) + Early Stage Equity

About the Role

You will build the AI layer that makes our platform work: agents that reason about enterprise data, pipelines that process it reliably, and evaluation systems that catch failures before they reach production. This is not prompt wrapping or demo engineering. It is production AI at enterprise scale, in compliance-heavy environments, on data that is messy by design.

The Problem You’ll Own:

LLMs are powerful. They are also brittle in ways that matter enormously when a model is making decisions about a multi-million-dollar purchase order, a patient record, or a financial transaction. Hallucinations, context failures, retrieval mismatches, and inconsistent outputs are not acceptable in our use case.

Your job is to build the systems that make AI reliable in exactly these environments: validation agents that reason about business rules, evaluation pipelines that catch regressions, and observability tooling that tells you when something has gone wrong before the customer does.

What You’ll Do:

+ Build and deploy LLM-powered agents for enterprise data validation — reading specs, reasoning about business rules, identifying failure modes, and generating structured outputs

+ Design and own evaluation frameworks: automated test suites, LLM-as-judge pipelines, regression detection, and benchmarks that track whether our agents are improving

+ Build RAG pipelines that work reliably on real enterprise data — messy schemas, inconsistent formats, mixed structured and unstructured content

+ Integrate AI systems with enterprise infrastructure (SAP, Snowflake, Databricks, Postgres, REST APIs) with attention to latency, data residency, and compliance

+ Design agentic workflows with tool use, multi-step reasoning, and deterministic guardrails

+ Build observability tooling: trace agent reasoning, track output reliability, and detect hallucinations or drift in production

+ Work directly with FDSEs to understand real deployment failures and translate them into system improvements

The Stack:

+ Languages: Python (primary), Go, Node.js

+ AI/ML: LLMs (Claude, GPT-4, Command R+), RAG, vector databases, embeddings, fine-tuning

+ Evaluation: LLM-as-judge, automated eval pipelines, custom benchmarks

+ Data: Snowflake, Databricks, Postgres

+ Infra: containers, Kubernetes / ECS / Cloud Run

+ Tools: LangChain, LlamaIndex, OpenAI / Anthropic APIs, LangSmith

Compensation & Logistics:

Salary: INR 30 - 45 Lakhs (mid) / 70 Lakhs - 1 Cr+ (senior) depending on experience
Equity: Early-stage equity grant

Requirements

  • Production builder: you’ve shipped LLM-powered features real users depend on and debugged them when they broke


  • LLM practitioner: you understand hallucinations, retrieval failures, context limits, and what it takes to make agents deterministic enough for enterprise use


  • Systems thinker: you design for latency, failure modes, retry logic, and observability before features


  • Enterprise-aware: data residency, compliance, audit trails, and deterministic guardrails are first-class design constraints for you


Background That Maps Well:


  • 3+ years in AI/ML or backend engineering with strong AI exposure


  • Hands-on production experience with LLM APIs (Anthropic, OpenAI, Cohere)


  • Experience designing evaluation frameworks: automated evals, regression tests, or LLM-as-judge pipelines


  • Strong Python; experience with LangChain, LlamaIndex, or similar agentic frameworks


  • Familiarity with RAG architectures: chunking, embedding models, vector DBs, retrieval quality

About the Company

We’re building systems that continuously validate data and business processes across large enterprise environments. Enterprises run on multiple systems: ERP (e.g., SAP), APIs, internal tools, and data platforms (Databricks, Snowflake, Postgres). Inconsistencies in data, whether from external vendors, internal processes, or data migrations, break workflows. When AI is layered on top, those failures scale.

We build the layer that:
+ Prevents inconsistent data entry
+ Detects inconsistencies across systems
+ Validates business logic in real time
+ Enables AI-driven workflows to run safely and reliably

We’re already live at a Fortune 100 AI company and launching at Fortune 500-scale companies in healthcare and financial services.
