AI-Assisted Development in Practice: Chatbot + Agentic Testing from Scratch
Speaker
Alex Marmuzevich
I am a software professional and solution architect with 30+ years of experience in IT. I have been developing in Python since 2010. Prior to that, I worked extensively with ASM, C/C++, C#, etc. I have been actively using AI in software development since 2023. At present, up to 80% of my production code is written with the help of AI agents. My main focus is on applying AI-assisted development practices in a disciplined, engineering-driven way. I am an AI Ambassador at EPAM, where I promote practical adoption of AI tools and workflows in everyday software engineering.
Abstract
GenAI-based projects are easy to prototype, but hard to test. In this hands-on workshop, we will build a simple AI chatbot from scratch and, in parallel, create an AI agent that tests it during development. The testing agent will generate user scenarios, detect failures such as prompt issues, incorrect tool usage, etc., and then suggest concrete next steps for improvement. The core theme of the workshop is building AI systems with the help of AI itself, while keeping an engineering-driven approach to design
Description
Modern AI-powered features often fail not because of models, but because they are hard to reason about, test, and debug. This workshop focuses on a practical engineering approach to AI-assisted development using Python.
We will build two systems from scratch:
- a tool-using chatbot with a strict behavioral contract (system prompt + tools),
- an agentic test bot that evaluates the chatbot in real time.
The chatbot will operate only via prompts and tools (no RAG), making its behavior explicit and observable. The testing agent will act as a QA engineer and developer coach: it will generate realistic and adversarial user inputs, detect failures such as prompt leakage, incorrect tool usage, and broken assumptions, and then produce actionable feedback on what to fix next.
Participants will see how to:
- design tool-using agents with clear constraints,
- detect common LLM failure modes,
- combine deterministic checks with LLM-as-a-judge,
- turn exploratory testing into a repeatable regression suite.
The workshop emphasizes simplicity, engineering discipline, and production-minded thinking rather than frameworks or hype. All examples use Python and minimal dependencies.