ragprop: RAG Agent for Proposal Extraction from Meeting Transcripts

Tags: rag, llm, python, information-extraction, langchain

LLM-powered RAG agent that retrieves and structures proposals, action items, and decisions from meeting minutes and transcript documents.

Meeting minutes and transcripts are dense, unstructured, and full of buried commitments. ragprop is a RAG-based LLM agent that reads a corpus of meeting documents and answers queries about what was proposed, decided, or assigned - surfacing the actionable signal without requiring manual re-reads.

What It Does

Given a directory of meeting transcripts or minutes (plain text, Markdown, or PDF), the system:

Chunks and embeds documents into a local vector store
Accepts natural-language queries: "What proposals were made about the authentication system?" or "What action items were assigned to the ML team?"
Retrieves the most relevant passages and runs them through an LLM to extract structured proposals with context
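
The chunking, embedding, and retrieval steps above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not ragprop's actual code: the bag-of-words `embed` function stands in for a real embedding model, the in-memory list stands in for the local vector store, and every name here (`chunk_text`, `build_index`, `retrieve`) is hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_words: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping windows of at most max_words words."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be >= 0 and smaller than max_words")
    words = text.split()
    chunks = []
    start = 0
    while start < len(words):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reaches the end of the text
        start += max_words - overlap
    return chunks


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase token counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(docs: dict[str, str]) -> list[tuple[str, str, Counter]]:
    """Chunk every document and store (source name, chunk, vector) triples."""
    index = []
    for name, text in docs.items():
        for chunk in chunk_text(text):
            index.append((name, chunk, embed(chunk)))
    return index


def retrieve(index, query: str, k: int = 3):
    """Return the k highest-scoring (score, source name, chunk) triples."""
    q = embed(query)
    scored = [(cosine(q, vec), name, chunk) for name, chunk, vec in index]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


# Illustrative documents; the file names and contents are made up.
docs = {
    "a.txt": "Alice proposed migrating the authentication system to OAuth2.",
    "b.txt": "Bob will order lunch for the offsite.",
}
index = build_index(docs)
query = "What proposals were made about the authentication system?"
for score, name, chunk in retrieve(index, query, k=1):
    print(f"{score:.2f} {name}: {chunk}")
```

Keeping the source name next to each chunk is what lets later stages cite where a proposal came from.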

The output is a ranked list of proposals, each with its source document, date, and participant context, so every result is traceable to where it was said rather than being a bare text dump.
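
One way to picture a single entry in that ranked list is as a small record that carries its provenance with it. The sketch below is an assumption about shape only: the class name `ProposalHit`, its fields, and the output format are hypothetical, not ragprop's real output schema.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ProposalHit:
    """One retrieved proposal plus the context needed to trace it."""
    text: str
    source: str                      # document the proposal was found in
    meeting_date: date               # date of the meeting
    participants: list[str] = field(default_factory=list)
    score: float = 0.0               # retrieval relevance, higher is better


def rank_hits(hits: list[ProposalHit]) -> list[ProposalHit]:
    """Order hits from most to least relevant."""
    return sorted(hits, key=lambda h: h.score, reverse=True)


def format_hit(rank: int, hit: ProposalHit) -> str:
    """Render one hit as a single traceable line."""
    who = ", ".join(hit.participants) or "unknown"
    return (
        f"{rank}. {hit.text} "
        f"[{hit.source}, {hit.meeting_date.isoformat()}; participants: {who}]"
    )
```

Calling `format_hit(i, hit)` for each `(i, hit)` in `enumerate(rank_hits(hits), start=1)` yields the numbered, source-annotated listing.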

Setup

cd ragprop
pip install virtualenv
virtualenv --python=python3.11 myenv
source myenv/bin/activate
pip install -r requirements.txt

Design Decisions

Standard RAG retrieves documents; ragprop structures them. The post-retrieval step uses a prompt template that asks the LLM to extract proposals in a consistent format (proposer, description, status, referenced document) rather than just summarizing the retrieved context.
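
A post-retrieval step of this kind might look like the sketch below. The four fields come from the description above; everything else is an assumption made for illustration: the prompt wording, the choice of JSON as the output format, and the names `PROMPT_TEMPLATE`, `Proposal`, `build_prompt`, and `parse_proposals`. The LLM call itself is left out, since the project's actual client code is not shown here.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical prompt: asks for structured records instead of a summary.
PROMPT_TEMPLATE = """You are extracting proposals from meeting excerpts.
Return a JSON array. Each element must have exactly these keys:
"proposer", "description", "status", "referenced_document".
Use null for any value the excerpts do not state.

Question: {question}

Excerpts:
{context}
"""


@dataclass
class Proposal:
    proposer: Optional[str]
    description: str
    status: Optional[str]
    referenced_document: Optional[str]


def build_prompt(question: str, passages: list[tuple[str, str]]) -> str:
    """Fill the template with the query and (source, text) passages."""
    context = "\n\n".join(f"[{source}] {text}" for source, text in passages)
    return PROMPT_TEMPLATE.format(question=question, context=context)


def parse_proposals(raw: str) -> list[Proposal]:
    """Parse the model's JSON reply into Proposal records.

    Raises json.JSONDecodeError or KeyError if the reply is malformed,
    so the caller can decide whether to retry.
    """
    items = json.loads(raw)
    return [
        Proposal(
            proposer=item.get("proposer"),
            description=item["description"],
            status=item.get("status"),
            referenced_document=item.get("referenced_document"),
        )
        for item in items
    ]
```

Asking for a fixed set of keys, with null for anything missing, keeps the records comparable across meetings and makes it obvious when the model had no evidence for a field.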

The name reflects this: RAG + propagation of structured proposal data through the pipeline.

Use Case

Built for teams that run recurring planning meetings and lose track of proposals across sessions. It is particularly useful for distributed teams whose stakeholders raise overlapping proposals that need deduplication before a decision meeting.
