QueryMind

Conversational AI: From Data Processing to Dashboards
[Figure: Venn diagram of the main ideas from the project]

QueryMind is a capstone project designed to make data analysis accessible to everyone, regardless of coding experience. Built by graduate students from the University of Colorado Boulder in collaboration with 99P Labs at Honda Research Institute USA, the platform combines conversational AI with a no-code interface to streamline the end-to-end data workflow. It enables users to upload datasets, clean and explore them, generate visualizations, and export reports—all through simple chat-based interactions.

At its core, QueryMind leverages large language models to automate the most time-consuming parts of data analysis. It uses GPT-4 to generate Python code for data cleaning, including handling missing values, normalizing data types, and identifying outliers. The system retains both original and cleaned datasets to ensure transparency and allow users to verify the results before proceeding further.
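
The generated code varies with each dataset, but a minimal pandas sketch of this kind of cleaning pass might look like the following. The file name, the median fill, and the IQR-based outlier rule are illustrative assumptions, not the project's fixed recipe:

```python
import pandas as pd

def clean_dataset(df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative cleaning pass; the raw frame is never modified in place."""
    cleaned = df.copy()

    # Handle missing values: numeric gaps get the column median,
    # text gaps get an explicit "unknown" marker.
    numeric_cols = cleaned.select_dtypes(include="number").columns
    text_cols = cleaned.select_dtypes(include="object").columns
    cleaned[numeric_cols] = cleaned[numeric_cols].fillna(cleaned[numeric_cols].median())
    cleaned[text_cols] = cleaned[text_cols].fillna("unknown")

    # Flag (rather than drop) outliers beyond 1.5 * IQR so users can review them.
    q1 = cleaned[numeric_cols].quantile(0.25)
    q3 = cleaned[numeric_cols].quantile(0.75)
    iqr = q3 - q1
    outliers = (cleaned[numeric_cols] < q1 - 1.5 * iqr) | (cleaned[numeric_cols] > q3 + 1.5 * iqr)
    cleaned["has_outlier"] = outliers.any(axis=1)
    return cleaned

raw = pd.read_csv("uploaded_dataset.csv")   # hypothetical uploaded file
clean = clean_dataset(raw)                  # both raw and clean stay available for review
```

Keeping the original frame untouched is what lets users diff the two versions and confirm the automated cleaning before moving on.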

The platform simplifies exploratory data analysis (EDA) by providing key insights with no setup required. Users can access summary statistics, distribution plots, and correlation matrices, or issue custom queries through natural language. Visual outputs such as histograms, box plots, and heatmaps are generated in real time, letting users explore their data dynamically without technical overhead.
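
These outputs map onto standard pandas and plotting calls; a simplified sketch of the kind of EDA code the system might produce is shown below, with the file name and figure choices assumed for illustration:

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv("uploaded_dataset.csv")      # hypothetical uploaded file
numeric = df.select_dtypes(include="number")

# Summary statistics surfaced to the user with no setup required
print(numeric.describe())

# Distribution plots: one histogram per numeric column, plus box plots
numeric.hist(bins=30, figsize=(10, 6))
plt.suptitle("Distributions")

fig, ax = plt.subplots(figsize=(10, 4))
numeric.boxplot(ax=ax)
ax.set_title("Box plots")

# Correlation matrix rendered as a heatmap
fig, ax = plt.subplots(figsize=(6, 5))
sns.heatmap(numeric.corr(), annot=True, cmap="coolwarm", ax=ax)
ax.set_title("Correlation matrix")
plt.show()
```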

A central innovation of QueryMind is its multi-agent architecture. The backend intelligently interprets user intent—determining when to respond with natural language, when to generate code, and when to return visual outputs. This orchestration is built with LangChain and LangGraph, using Streamlit for the UI and sandboxed Python environments to safely execute generated code.
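
The article does not include the orchestration code itself, but a stripped-down LangGraph sketch of this kind of intent routing could look like the following. The keyword-based router and the placeholder handlers are stand-ins for the LLM-driven logic the real system uses:

```python
from typing import TypedDict, Literal
from langgraph.graph import StateGraph, END

class ChatState(TypedDict):
    user_message: str
    response: str

def classify_intent(state: ChatState) -> ChatState:
    # In QueryMind this decision is made by a language model; nothing to do here.
    return state

def route(state: ChatState) -> Literal["chat", "code", "visual"]:
    # Toy router: real routing would come from the classifier above.
    msg = state["user_message"].lower()
    if any(word in msg for word in ("plot", "chart", "histogram", "heatmap")):
        return "visual"
    if any(word in msg for word in ("clean", "filter", "transform")):
        return "code"
    return "chat"

def answer_in_text(state: ChatState) -> ChatState:
    return {**state, "response": "natural-language answer"}

def generate_code(state: ChatState) -> ChatState:
    return {**state, "response": "generated Python, run in a sandbox"}

def render_visualization(state: ChatState) -> ChatState:
    return {**state, "response": "figure returned to the UI"}

graph = StateGraph(ChatState)
graph.add_node("classify", classify_intent)
graph.add_node("chat", answer_in_text)
graph.add_node("code", generate_code)
graph.add_node("visual", render_visualization)
graph.set_entry_point("classify")
graph.add_conditional_edges("classify", route, {"chat": "chat", "code": "code", "visual": "visual"})
for node in ("chat", "code", "visual"):
    graph.add_edge(node, END)

app = graph.compile()
print(app.invoke({"user_message": "plot a histogram of trip duration", "response": ""}))
```

In the actual platform, the "code" and "visual" branches would call the model, execute the result in a sandboxed Python environment, and return the output to the Streamlit front end.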

Users can also download automated reports at any point in the session. The platform compiles metadata, visualizations, chat logs, and timestamps into a polished PDF report, making it easy to document findings or share them with stakeholders. This is especially useful for non-technical users or teams that need clear, presentable summaries of their analysis.
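
The write-up does not name the PDF library behind this feature; the sketch below uses reportlab to show how metadata, saved figures, a chat transcript, and a timestamp could be bundled into one document. The function name, layout, and inputs are assumptions:

```python
from datetime import datetime
from reportlab.lib.pagesizes import letter
from reportlab.pdfgen import canvas

def build_report(metadata: dict, figure_paths: list[str], chat_log: list[str],
                 path: str = "querymind_report.pdf") -> None:
    """Bundle session metadata, saved figures, and the chat transcript into one PDF."""
    pdf = canvas.Canvas(path, pagesize=letter)
    width, height = letter

    # Header with a timestamp for the session
    pdf.setFont("Helvetica-Bold", 16)
    pdf.drawString(72, height - 72, "QueryMind Analysis Report")
    pdf.setFont("Helvetica", 10)
    pdf.drawString(72, height - 90, f"Generated: {datetime.now():%Y-%m-%d %H:%M}")

    # Dataset metadata collected during the session
    y = height - 120
    for key, value in metadata.items():
        pdf.drawString(72, y, f"{key}: {value}")
        y -= 14

    # One page per saved visualization (figures are assumed to be on disk already)
    for fig in figure_paths:
        pdf.showPage()
        pdf.drawImage(fig, 72, height / 2 - 150, width=width - 144, height=300,
                      preserveAspectRatio=True)

    # Chat log appended at the end
    pdf.showPage()
    y = height - 72
    pdf.setFont("Helvetica", 9)
    for line in chat_log:
        pdf.drawString(72, y, line[:110])
        y -= 12
    pdf.save()
```

A call with the session's metadata, the paths of figures saved earlier, and the chat history would then produce the downloadable file.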

Throughout the project, the team encountered and addressed several real-world challenges. These included model reliability, dataset size limits, and user trust in auto-generated code. Solutions such as sample-based rendering, code preview options, and verification steps helped improve performance and usability. Key findings showed that GPT-4 excelled at cleaning tasks, while smaller models like Gemma2-9b-it performed well on visualizations.

QueryMind demonstrates how AI can bridge the gap between complex data tasks and everyday decision-making. By reimagining the analytics pipeline through natural language and automation, the project highlights a path forward for more inclusive, intelligent data tools. It serves as a proof of concept for how large language models can transform the way people interact with data—whether in academia, business, or beyond.

Stay Connected

Follow our journey on Medium and LinkedIn.