
< session />
Tue, April 21 · DeepTech Ops · Tech Architecture
GenAI applications rarely fail because they do not work. They fail because costs become unpredictable. What begins as a simple prototype can quickly evolve into a complex system of chained agents, retrieval pipelines, re-rankers, and repeated inference calls. Without deliberate cost design, token usage grows, latency increases, and cloud spend exceeds expectations. At enterprise scale, this becomes a critical risk.
This session examines the unit economics of production GenAI systems. Drawing from real deployments, it breaks down where costs accumulate, including prompt inflation from poorly managed context windows, retrieval overhead from inefficient vector stores, redundant inference from uncontrolled agent loops, and escalation from multi-model routing. The session also presents practical approaches to manage these challenges through prompt budgeting, tiered model selection, caching strategies, and observability patterns that make GenAI spend predictable.
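The cost levers above can be combined in code. As a minimal sketch only (the model names, tier thresholds, and per-token prices below are hypothetical placeholders, not real provider pricing), a router might enforce a token budget, prefer the cheapest capable model tier, and reuse cached responses for identical prompts:

```python
import hashlib

# Hypothetical tiers and per-1K-token prices; real prices vary by provider.
MODEL_TIERS = [
    {"name": "small-model", "max_complexity": 3, "price_per_1k": 0.0005},
    {"name": "large-model", "max_complexity": 10, "price_per_1k": 0.0150},
]

_cache = {}  # response cache keyed by prompt hash


def estimate_tokens(text):
    # Rough heuristic: ~4 characters per token for English text.
    return max(1, len(text) // 4)


def route(prompt, complexity, token_budget=2000):
    """Enforce a prompt budget, pick the cheapest capable tier,
    and reuse cached results for identical prompts."""
    tokens = estimate_tokens(prompt)
    if tokens > token_budget:
        raise ValueError(f"~{tokens} tokens exceeds budget of {token_budget}")

    key = hashlib.sha256(prompt.encode()).hexdigest()
    if key in _cache:
        return _cache[key]  # cache hit: zero marginal inference cost

    tier = next(t for t in MODEL_TIERS if complexity <= t["max_complexity"])
    result = {"model": tier["name"],
              "est_cost": round(tokens / 1000 * tier["price_per_1k"], 6)}
    _cache[key] = result
    return result
```

Each control maps to a cost source the session covers: the budget check caps prompt inflation, the tier selection avoids sending cheap requests to expensive models, and the cache eliminates redundant inference from repeated calls.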
What You Will Learn
Who Should Attend
< speaker_info />
Akhil Jain is a Senior Solutions Architect at Amazon Web Services with 17+ years of experience across the USA and India, building and operationalizing AI systems at enterprise scale. He specializes in GenAI architecture, agentic workflows, LLM orchestration, IoT, and Big Data, working hands-on with engineering teams to move AI from promising prototype to production-grade reality within the real constraints of cost, latency, and data maturity.
Having worked directly with AWS customers across the USA, Canada, Mexico, El Salvador, the Netherlands, Australia, and India, Akhil brings a genuinely global perspective to enterprise technology — one shaped by firsthand experience of how AI adoption, cloud strategy and architectural decisions play out across different markets, industries and regulatory environments.
Before AWS, he led big data and cloud architecture programs at Informatica. A Carnegie Mellon alumnus, he believes GenAI's true potential lies not in benchmarks or demos, but in the lasting human impact it creates when built with purpose and deployed at scale.
His sessions are built on production reality — the tradeoffs, the failure modes and the hard numbers that practitioners actually encounter when scaling AI beyond the demo.