XAI - Explainable AI Tools and Techniques
Speaker
Viraj Sharma
I am a passionate technologist with a strong interest in Python, artificial intelligence, and edge computing. I am currently studying in Class 9 at Presidium School, Delhi, India. I have worked in areas such as torch.nn visualization, Anthropic technologies (MCP, Skills), large concept models, TensorCore/CUDA benchmarks, and Edge AI on a Raspberry Pi running small models with sensors. Recently I have been working on XAI (AI explainability) and publishing that work as a project on my AI lab, modelrecon.com. As an active member of the Python and AI communities, I enjoy learning from experienced developers and sharing my insights with others. I attend major tech events including PyCons, Linux Fests, OS Summits, GDG events, p99conf, and various AI conferences, where I actively present my projects and ideas.
Abstract
In this talk, I propose to discuss the problem of building explainable AI through two approaches: causal vs. correlational. I will explain what mechanistic interpretability ("mech interp") in LLMs is, as a way to understand how models answer questions by looking inside them and checking which neurons activate and when. I will discuss circuit-tracer, the Python module that Anthropic open sourced, and the Neuronpedia portal, and I will also talk about my own work with an "activation cube" data structure (this is not a standard; it is something I came up with).
Description
I will start with the basic problem of causal vs. correlational techniques and the limitations of the correlational approach.
Why Explainability Matters 2-3 mins
We need to understand why AI models make certain choices, not just what answers they give. Without this, the model feels like a black box. In this section I will include an example from human behavior.
What Transformers Hide 2-3 mins
I will walk through a transformer's basic internal steps and the features that are hard to see, highlighting that most tools only show the final output, not the thinking process. In fact, I will highlight that it comes as a surprise to most people that we don't know how models "actually" arrive at specific answers.
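To make the hidden part concrete, here is a minimal sketch (plain PyTorch forward hooks on a Hugging Face model, not Circuit Tracer; "gpt2" is only an example model) of capturing the intermediate activations that normal generation never shows:

```python
# Minimal sketch: capture the hidden activations a transformer normally never shows.
# Assumes the torch and transformers packages; "gpt2" is only an example model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

captured = {}  # layer index -> hidden state tensor

def make_hook(layer_idx):
    def hook(module, inputs, output):
        # For GPT-2 blocks, output[0] is the hidden state: (batch, tokens, hidden_dim)
        captured[layer_idx] = output[0].detach()
    return hook

# Register a forward hook on every transformer block.
handles = [block.register_forward_hook(make_hook(i))
           for i, block in enumerate(model.transformer.h)]

inputs = tokenizer("The capital of France is", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

for handle in handles:
    handle.remove()

# Users only ever see the predicted next token...
print(tokenizer.decode(logits[0, -1].argmax().item()))
# ...but every layer produced activations that are normally thrown away.
for i, act in captured.items():
    print(f"layer {i}: hidden states of shape {tuple(act.shape)}")
```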
How Circuit Tracer Helps 3-5 mins
I will talk about how Anthropic's Circuit Tracer shows the inside connections of the model. It turns hidden activations into easy-to-understand features and shows how they link together. It is not that easy, but we can get used to the graphs (like Link, the operator in the Matrix movies, who could understand what was happening just by looking at the raw Matrix code). I will show some graphs and walk through the reasoning path on Colab.
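As a preview, a rough sketch of what driving circuit-tracer from Python looks like; the names below (ReplacementModel, attribute) and their arguments are written from memory of the public repo and are assumptions to check against its README:

```python
# Rough sketch of driving circuit-tracer from Python. The names ReplacementModel
# and attribute, and their arguments, are recalled from the public repo and are
# assumptions to verify against the README before running.
from circuit_tracer import ReplacementModel, attribute

# Load a supported model together with its pretrained transcoders (the learned
# feature dictionaries that make raw activations human-readable).
model = ReplacementModel.from_pretrained("google/gemma-2-2b", "gemma")

# Build an attribution graph for a prompt: nodes are features, edges record which
# feature influenced which, ending at the model's answer.
graph = attribute("The capital of France is", model)

# The graph can then be exported and explored visually, for example on Neuronpedia.
```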
Seeing the Reasoning Path 10 minutes or more
The tool draws a clear path from input → inner features → final output. This lets everyone see which parts of the model caused the answer. This part should be fun, because the paths a model takes are sometimes weird; a toy example of walking such a path is sketched below.
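To give a feel for what walking such a path means, here is a toy illustration (the mini-graph and its weights are invented for the example, not real circuit-tracer output):

```python
# Toy illustration of tracing a reasoning path: input token -> inner features -> output.
# The graph and its weights are invented for this example, not real circuit-tracer output.
toy_graph = {
    "input: 'Dallas'":         [("feature: Texas-related", 0.9), ("feature: city names", 0.4)],
    "feature: Texas-related":  [("feature: state capitals", 0.8)],
    "feature: city names":     [("feature: state capitals", 0.2)],
    "feature: state capitals": [("output: 'Austin'", 0.95)],
    "output: 'Austin'":        [],
}

def strongest_path(graph, start):
    """Greedily follow the highest-weight edge until a node has no outgoing edges."""
    path = [start]
    node = start
    while graph[node]:
        node, _ = max(graph[node], key=lambda edge: edge[1])
        path.append(node)
    return path

print(" -> ".join(strongest_path(toy_graph, "input: 'Dallas'")))
# input: 'Dallas' -> feature: Texas-related -> feature: state capitals -> output: 'Austin'
```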
Why This Is Important 2 minutes
With this method, we can:
- check if the model is behaving safely
- fix mistakes inside the model
- build trust by seeing how it thinks
I will finish the description with some of my own work, including the "activation cube" data structure mentioned in the abstract.
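As a small teaser for that part, a minimal sketch of one way an "activation cube" can be laid out, assuming a 3-D tensor indexed by (layer, token position, hidden unit); the actual layout in my project may differ in detail:

```python
# Minimal sketch of one way an "activation cube" could be assembled: a 3-D tensor
# indexed by (layer, token position, hidden unit). This is an illustrative layout,
# not necessarily the exact structure used in the project.
import torch

num_layers, num_tokens, hidden_dim = 12, 6, 768  # e.g. GPT-2 small sized

# Pretend these are per-layer hidden states captured with forward hooks (see earlier sketch).
per_layer_states = [torch.randn(num_tokens, hidden_dim) for _ in range(num_layers)]

# Stack them into one cube of shape (layers, tokens, hidden_dim).
activation_cube = torch.stack(per_layer_states)
print(activation_cube.shape)  # torch.Size([12, 6, 768])

# Slicing the cube answers simple "where did this fire?" questions:
layer_view = activation_cube[5]         # every token's activations at layer 5
token_view = activation_cube[:, -1]     # how the last token evolves across layers
unit_trace = activation_cube[:, :, 42]  # one hidden unit across all layers and tokens
```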