GitHub - ajayp/ai-stacktrace-inspector: Proof-of-concept exploring how semantic search and code structure analysis via Abstract Syntax Trees (ASTs) can be used to find relevant code and execution context for debugging errors.

Proof of Concept (PoC)

Introduction

This project is a PoC demonstrating how to retrieve relevant code snippets and similar historical stack traces based on a new error event. It utilizes sentence embeddings for semantic similarity and Abstract Syntax Tree (AST) analysis to identify structurally relevant code.

The PoC is implemented as an interactive Streamlit application, making it easy to visualize the retrieval process.

Why use AST?

While sentence embeddings capture the semantic meaning of code (what it does), an AST captures its structure (how it's organized). For error context, structural information is key: exception-related errors often originate near try/except blocks, input errors near validation conditionals (if), and processing errors near loops (for, while). AST analysis complements semantic search by identifying code that is structurally relevant to the error type.
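
As a rough illustration of the idea above, the mapping from an error message to the code structures worth looking for could be a simple keyword lookup. This is a minimal sketch: the `CONSTRUCT_HINTS` table, its keywords, and the `relevant_constructs` function are hypothetical names chosen for this example, not the rules used in app.py.

```python
from typing import Dict, Set, Tuple

# Hypothetical keyword table (an assumption for illustration only):
# each construct name maps to message keywords that hint at it.
CONSTRUCT_HINTS: Dict[str, Tuple[str, ...]] = {
    "try_except": ("exception", "raise", "unhandled"),
    "conditional": ("invalid", "missing", "none", "validation"),
    "loop": ("index", "iteration", "out of range", "batch"),
}


def relevant_constructs(error_message: str) -> Set[str]:
    """Return the construct names whose keywords occur in the message."""
    text = error_message.lower()
    return {
        construct
        for construct, keywords in CONSTRUCT_HINTS.items()
        if any(keyword in text for keyword in keywords)
    }


if __name__ == "__main__":
    print(relevant_constructs("IndexError: list index out of range"))
    print(relevant_constructs("ValueError: invalid literal for int()"))
```

Substring matching like this is deliberately crude; it only shows how an error message can narrow down which structural features to prioritize.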

How AST works:

Parsing: Source code is parsed into a tree structure.
Tree Generation: Nodes represent code constructs (functions, loops, conditionals, etc.).
Traversal: The tree is walked to find specific node types.
Feature Extraction: Relevant structural features (like the presence of try/except or function calls) are collected.
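
The four steps above can be sketched with Python's built-in `ast` module. The function name `extract_ast_features`, the feature keys, and the sample snippet are assumptions made for this example; the actual extraction logic in app.py may differ.

```python
import ast
from typing import Any, Dict

# Sample snippet used only for demonstration.
SNIPPET = '''
def load(path):
    try:
        with open(path) as fh:
            for line in fh:
                if line.strip():
                    print(line)
    except OSError:
        return None
'''


def extract_ast_features(source: str) -> Dict[str, Any]:
    """Parse source code and collect a few structural features."""
    has_try_except = False
    has_conditional = False
    has_loop = False
    called_names = set()

    tree = ast.parse(source)      # Parsing and tree generation
    for node in ast.walk(tree):   # Traversal over every node
        if isinstance(node, ast.Try):
            has_try_except = True
        elif isinstance(node, ast.If):
            has_conditional = True
        elif isinstance(node, (ast.For, ast.While)):
            has_loop = True
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            # Only plain-name calls such as open(...); method calls
            # like line.strip() are attribute accesses and are skipped.
            called_names.add(node.func.id)

    # Feature extraction: return the collected structural features.
    return {
        "has_try_except": has_try_except,
        "has_conditional": has_conditional,
        "has_loop": has_loop,
        "calls": sorted(called_names),
    }


if __name__ == "__main__":
    print(extract_ast_features(SNIPPET))
```

Note that `ast.parse` raises `SyntaxError` on incomplete code, so snippets that are not valid Python on their own would need to be handled separately.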

Features

Semantic Search: Uses a pre-trained SentenceTransformer model (all-MiniLM-L6-v2) to embed code snippets and stack trace frames.
AST Feature Extraction: Analyzes code snippets to identify structural elements like try/except blocks, conditionals (if), and loops (for, while).
Interactive UI: A Streamlit application allows users to input a simulated error message and stack trace, adjust retrieval parameters (similarity threshold, top-k results), and view the retrieved context.
Contextual Relevance: Highlights code snippets that are structurally relevant based on simple AST heuristics (e.g., showing try/except blocks if the error message suggests an exception).
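
To make the retrieval parameters concrete, here is a minimal sketch of threshold-and-top-k selection over embedding vectors, written in plain Python. The tiny two-dimensional vectors stand in for real embeddings (which in the app would come from the all-MiniLM-L6-v2 model), and the names `cosine_similarity` and `retrieve` are hypothetical helpers for this example rather than functions from app.py.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(
    query: Sequence[float],
    corpus: Sequence[Tuple[str, Sequence[float]]],
    threshold: float = 0.5,
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Keep items scoring at or above the threshold, best first, at most top_k."""
    scored = [(label, cosine_similarity(query, vector)) for label, vector in corpus]
    kept = [item for item in scored if item[1] >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:top_k]


if __name__ == "__main__":
    # Toy vectors standing in for embedded code snippets (assumption).
    toy_corpus = [
        ("snippet_a", [1.0, 0.0]),
        ("snippet_b", [0.0, 1.0]),
        ("snippet_c", [1.0, 1.0]),
    ]
    for label, score in retrieve([1.0, 0.0], toy_corpus, threshold=0.5, top_k=2):
        print(label, round(score, 3))
```

Raising the threshold trades recall for precision, while top-k caps how many results are shown regardless of how many pass the threshold.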

Setup

Clone the repository (if applicable):

# If this is part of a larger repo
# git clone <your-repo-url>
# cd <your-repo-directory>

Install dependencies: This project requires Python 3.7+. It uses streamlit, sentence-transformers, torch, and numpy.
```
pip install streamlit sentence-transformers torch numpy
```
Note: Installing torch might require specific commands depending on your operating system and CUDA requirements. Refer to the PyTorch installation guide for details.

How to Run

Navigate to the directory containing app.py in your terminal.
Run the Streamlit application:
```
streamlit run app.py
```
Your web browser should open automatically to the Streamlit application.

Using the Application

Input Error: In the main area, you can either manually type an error message and stack trace or select one of the provided examples from the dropdown.
Configure Retrieval: Use the sidebar sliders and number inputs to adjust the Similarity Threshold (how similar items must be to be considered relevant) and Top-K Results per Frame (how many results to show for each frame in the input stack trace).
Retrieve Context: Click the "🔍 Retrieve Context" button.
View Results: The application will display "Relevant Code Snippets" and "Similar Past Stack Frames" found in the simulated database based on your input and settings. Expand the sections to view the full code or stack frame content.
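
Because results are retrieved per frame, the input stack trace first has to be split into frames. The sketch below shows one way this might be done, assuming the input follows the standard CPython traceback layout; `FRAME_RE`, `parse_frames`, and the sample trace are hypothetical and not taken from app.py.

```python
import re
from typing import Any, Dict, List

# Matches frame header lines such as:
#   File "app.py", line 10, in main
FRAME_RE = re.compile(
    r'^\s*File "(?P<file>[^"]+)", line (?P<line>\d+), in (?P<function>.+)$'
)

# Sample input used only for demonstration.
SAMPLE_TRACE = """Traceback (most recent call last):
  File "app.py", line 10, in main
    process(data)
  File "utils.py", line 4, in process
    return data[5]
IndexError: list index out of range
"""


def parse_frames(stack_trace: str) -> List[Dict[str, Any]]:
    """Split a CPython-style traceback into one dict per frame."""
    frames: List[Dict[str, Any]] = []
    for raw_line in stack_trace.splitlines():
        match = FRAME_RE.match(raw_line)
        if match:
            frames.append({
                "file": match.group("file"),
                "line": int(match.group("line")),
                "function": match.group("function").strip(),
                "code": "",
            })
        elif frames and raw_line.startswith("    ") and not frames[-1]["code"]:
            # The indented line after a frame header is its source line.
            frames[-1]["code"] = raw_line.strip()
    return frames


if __name__ == "__main__":
    for frame in parse_frames(SAMPLE_TRACE):
        print(frame)
```

Each resulting frame could then be embedded and queried separately, which is what the Top-K Results per Frame setting applies to.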

Project Structure

app.py: Contains all the code for the Streamlit application, including data simulation, AST parsing, embedding, and retrieval logic.

Future Improvements (PoC Ideas)

Integrate with a real vector database (e.g., Chroma, Weaviate, Pinecone).
Load code snippets and stack traces from actual project files or logs.
Implement more sophisticated AST feature extraction and weighting.
Combine semantic similarity and AST features in a more advanced ranking algorithm.
Use a larger, domain-specific embedding model.
Add functionality to link retrieved code back to specific lines in the original files.
Incorporate Large Language Models (LLMs) to summarize the retrieved context or suggest potential fixes.
