Pedram Agand
← Writing
Programming

Cashflow tracker with AI

Step 0: Get your monthly bank statement In this step, you simply need to access your online banking and request a monthly or period statement.

2023-11-17·5 min read·ai, finance, llm, machine-learning, python
Use with AI
Cashflow tracker with AI

I was tired of spending 30 minutes every month manually categorizing credit card statements into a spreadsheet — only to get a number I half-trusted. Mint was sunset. Quicken costs money. And I didn't want to hand my bank credentials to another third-party app.

So I built a free alternative using GPT-4 and Python. The whole pipeline takes about 10 minutes once you've set it up: paste your bank statement, get a JSON breakdown by category, and plot it. No subscriptions, no OAuth, no cloud sync.

Here's exactly how it works.


Step 0: Get your bank statement

Log into your online banking and download your monthly statement. A CSV file is ideal — the dates, amounts, and descriptions parse cleanly. If your bank only exports PDF, copy the transaction rows (dates, amounts, descriptions) into a plain text file manually.

Use your checking account if you can. It captures both income and outgoing transfers in one place, which gives you a complete picture. Credit-card-only statements work too, but you'll miss income flows.


Step 1: Access free GPT-4 via Microsoft Bing

At the time I wrote this (late 2023), Microsoft Bing offered free access to GPT-4 through Microsoft Edge. Download Edge, go to bing.com, and click the chat icon at the top. Choose "More Precise" mode — it keeps the model focused on following instructions rather than being creative with your financial data.

This is no longer the only option. ChatGPT free tier and other tools have expanded since then, but the prompt below works with any GPT-4-class model.


Step 2: Prompt engineering — turn the statement into JSON

This is the core of the approach. Paste the following prompt, then append your statement rows after the ### delimiter.

Instruction: You are an expert in personal financial management and processing csv file for finance data and create a monthly cashflow report for it including the budget categorization, bills, save similar as the one in Mint by intuit or Simplify by Quicken. The transactions delimited by ###. When writing the list, put them in date order so I can double check. If there are multiple entries with the same date as key, add -1, -2, ... for more entries as the key. Do not double calculate any entry.

Do not calculate anything, just return a json format for the categories. here is a sample format
{"Income": 
{"Payroll Deposit":{date: amount}, "Accounts Payable": {date: amount}, ... },
{"Savings":
{Customer Transfer Dr:{date: amount} ,...},
{"Expenses": 
{"Miscellaneous Express":{date: amount}, "WITHDRAWAL" :{date: amount}}
}

Use this for breaking down the list:

•	Income: any positive cash flow except from Customer Transfer Cr. MB-TRANSFER. This includes my full-time job income (Payroll Deposit), other deposits (DEPOSIT and Accounts Payable), and positive transfers from my savings account (Customer Transfer Cr. MB-TRANSFER).
•	Savings: any transaction with Customer Transfer Cr. MB-TRANSFER. consider the sign
•	Expenses: any negative cash flow that is not saving This includes my credit card payments (Miscellaneous Payment), withdrawals (WITHDRAWAL), and bill payments (Bill Payment: MB-QUESTRADE). 

Create each month separately: January, then February, etc.

###
2/27/2023,80.00,-,DEPOSIT ,FREE INTERAC E-TRANSFER
1/3/2023,-12.45,-,Miscellaneous Payment ,Credit card

###

A few things worth noting about this prompt:

  • The date-key collision rule (-1, -2 suffixes) prevents the model from silently merging two transactions on the same day.
  • Separating Savings from Expenses is deliberate — investment transfers out of checking are not the same as spending, and conflating them understates your real cashflow.
  • "Do not calculate anything" keeps the model from hallucinating subtotals. You'll do the math in Python.

The output is a JSON object organized by month, then category, then sub-category. Save it as a .json file in your working directory.


Step 3: Plot it with Python

Load the JSON and build stacked area charts for Income, Expenses, and Savings across months. The code handles sparse categories — if a category only appears in some months, it fills zeros for the others.

cashflow = []
incomes_plot = {}
expenses_plot = {}
savings_plot = {}

for month in data:
    # Income
    incomes = data[month]["Income"]
    income_values = list(incomes.values())
    income_values = [sum(i.values()) for i in income_values]
    income_category = list(incomes.keys())
    total_incomes = sum(income_values)
    print(income_category)

    for i in income_category:
        if i not in incomes_plot:
            incomes_plot[i] = [0] * len(data)
        incomes_plot[i][list(data).index(month)] = income_values[income_category.index(i)]

    # Expenses
    expenses = data[month]["Expenses"]
    expenses_values = list(expenses.values())
    expenses_values = [sum(i.values()) for i in expenses_values]
    expense_category = list(expenses.keys())
    total_expenses = sum(expenses_values)

    for i in expense_category:
        if i not in expenses_plot:
            expenses_plot[i] = [0] * len(data)
        expenses_plot[i][list(data).index(month)] = expenses_values[expense_category.index(i)]

    # Savings
    savings = data[month]["Savings"]
    savings_values = list(savings.values())
    savings_values = [sum(i.values()) for i in savings_values]
    savings_category = list(savings.keys())
    total_savings = sum(savings_values)

    for i in savings_category:
        if i not in savings_plot:
            savings_plot[i] = [0] * len(data)
        savings_plot[i][list(data).index(month)] = savings_values[savings_category.index(i)]

    cashflow.append(total_incomes + total_expenses + total_savings)

# Cashflow over time
print(cashflow)
plt.figure(figsize=(10, 5))
plt.plot(list(data), cashflow, label='Cashflow')
plt.plot(list(data), [sum(cashflow[:i]) for i in range(1, len(cashflow)+1)], label='Cumulative Cashflow')
plt.xlabel('Month')
plt.legend(loc='best')

# Move investment transfers out of Expenses into a separate series
invest = expenses_plot.pop('Bill Payment')

# Income breakdown
plt.figure(figsize=(10, 5))
plt.stackplot(list(data), incomes_plot.values(), labels=incomes_plot.keys())
plt.legend(loc='upper left')
plt.title('Income')

# Expense breakdown
plt.figure(figsize=(10, 5))
plt.stackplot(list(data), expenses_plot.values(), labels=expenses_plot.keys())
plt.legend(loc='best')
plt.title('Expenses')

# Savings + investments
plt.figure(figsize=(10, 5))
plt.stackplot(list(data), savings_plot.values(), labels=savings_plot.keys())
plt.plot(list(data), [-i for i in invest], label='Invested')
plt.title('Savings')
plt.legend(loc='best')

plt.show()

The investment transfer pop is intentional. Brokerage contributions show up as bill payments in most Canadian bank exports — you don't want them stacking with coffee and groceries in the Expenses chart.


Full code and a sample JSON file are on GitHub: github.com/pagand/finance

Want this implemented in your workflow?

I work with SaaS companies, real-estate, finance, and regulated-industry teams on AI adoption. Book a 20-minute strategy call — no pitch, just a focused conversation about your situation.

I publish one post like this per month. Join AI Command Room and I'll send it directly to you.