AI-Powered Stripe Subscription Analytics — No SQL, No Data Warehouse
Most Stripe analytics setups fall into one of two traps. The first is living inside the Stripe Dashboard — useful for high-level numbers, but it has no cross-table queries, no custom logic, and no way to ask the subscription analytics questions that actually drive decisions. The second is building a data pipeline into a warehouse — powerful in theory, but it means infrastructure, engineering time, and a BI tool on top before you get any answers.
There’s a third path. Today we’re releasing the Stripe AI Analytics Starter: an open-source project that loads your Stripe data locally with dlt and gives your AI agent direct SQL access via PlyDB — so you can ask questions about MRR, churn, at-risk subscribers, and expansion opportunities in plain English, with no warehouse and nothing leaving your machine.
Demo with Claude
How AI-Powered Stripe Analytics Works
The Stripe Analytics Starter is an open-source project that connects your AI agent to your Stripe account so you can ask subscription analytics questions in plain English and get answers instantly — no SQL, no data warehouse, no engineering overhead.
Unlike the Stripe Dashboard, which is limited to pre-built reports, this lets you ask cross-table queries, apply custom business logic, and get answers the moment you ask — without writing SQL or standing up any infrastructure.
Here’s how it works:
- dlt — an open-source Python data loading library — pulls your Stripe data from the API and loads it into a local DuckDB file, covering subscriptions, customers, invoices, products, prices, balance transactions, and events.
- PlyDB — the open-source universal database gateway for AI agents — sits between your agent and that file, translating plain-English questions into SQL and executing them.
- A semantic overlay bundled with the repo teaches the agent Stripe’s data model: which fields are in cents, which timestamps are Unix integers, how nested objects get flattened, and how tables relate to each other.
- You ask questions in plain English. The agent returns answers — no SQL required.
The result is an agent that can answer complex billing analytics questions correctly from the first session. Your data stays on your machine — zero ETL, zero data movement, no cloud.
Who is this for? SaaS founders, RevOps teams, and billing leads who want to explore their Stripe data without setting up infrastructure — and developers who want a working example of AI-powered subscription analytics to adapt for their own stack.
What You Can Ask Your Stripe Data
The real value isn’t in asking “what is our MRR” — the Stripe Dashboard can tell you that. It’s in the questions that require joining tables, applying business logic, or crossing multiple signals at once.
Revenue at risk, right now. Ask the agent to find every active subscription
where cancel_at_period_end is set, or where there’s been a recent failed
payment attempt, and sum their MRR. That’s your confirmed revenue walkoff for
the month unless someone acts. One question; an actionable number.
Who should you pitch an upgrade to? Ask the agent to find customers on your entry-level plan ranked by tenure and payment reliability. These are the accounts that have self-selected into your product, pay on time, and haven’t churned after a year — the natural expansion cohort. Identifying them historically required a spreadsheet, a SQL query, and someone to run it. Here it’s a prompt.
Why are people canceling, and when? Cancellation reasons and timing both
live in the subscription table. Ask the agent to break down canceled_at
relative to start_date and cross-reference cancellation_details__reason. Are
customers churning in the first month (onboarding problem) or after a year
(pricing problem)? The answer shapes very different responses.
How healthy are your cohorts? Group subscribers by the month they started and track how many are still active at 3, 6, and 12 months. Which cohorts retained best — and what was different about those months? The agent can run this in seconds; building the same query in a BI tool takes an afternoon.
Payout reconciliation. Net revenue after Stripe fees, broken down month by month, alongside refunds and the gross charge volume — in a single question. Useful for board prep, investor updates, or just keeping the books honest.
Extending Your Stripe Analytics Setup
This repo is intentionally minimal. It gets you to working AI-powered Stripe subscription analytics in under ten minutes. What you do with it from there depends on your setup. Three directions worth considering:
Cross-Domain RevOps
The Stripe data becomes a lot more powerful when it’s not isolated. PlyDB can JOIN across multiple data sources in a single query — add your HubSpot CRM, Google Analytics traffic, or ad spend exports and suddenly you can ask questions that no single-source tool can answer: true CAC-to-LTV ratios, pipeline-to-revenue conversion, or margin by customer segment. The Stripe data is the revenue layer; everything else plugs in around it.
SQL
conversion
segment
Proactive Monitoring
Most revenue dashboards go stale the moment they’re built — they answer the questions you thought to ask when you built them, not the ones that matter today. The alternative is to flip the model: schedule your agent to run nightly against fresh data and brief you on what changed. A churn spike, a cluster of failed payments, a cohort underperforming its peers — anomalies already flagged, with a recommended action for each. You get a morning briefing instead of a chart that requires interpretation.
From Insight to Action
This is where an AI agent with live data access stops being a chatbot and becomes a strategic partner. Once the agent has identified at-risk subscribers or upgrade candidates, it can draft the outreach too — ranked by revenue impact, ready for your review. The analysis and the response live in the same conversation. You don’t switch tools, copy data into a spreadsheet, or re-explain context. Nothing sends until you approve it.
Get Started with the Stripe Analytics Starter
The repo has everything you need: a dlt pipeline to load your Stripe data into a local DuckDB file, a pre-configured PlyDB setup, and a semantic overlay so your agent understands Stripe’s data model out of the box.
You’ll need a Stripe API key, Python 3.8+, and PlyDB installed. The README walks through the rest in four steps.
Frequently Asked Questions
Do I need to know SQL to use this? No. Your AI agent handles all the SQL. You ask questions in plain English; the agent translates them into queries, runs them via PlyDB, and returns the answer.
What’s the difference between this and Stripe’s built-in analytics? The Stripe Dashboard shows pre-built reports across its own data. This lets you ask cross-table questions, apply your own business logic, and combine Stripe data with other sources — in plain English, with no SQL. It also covers data the Dashboard doesn’t surface directly, such as full invoice line history, balance transaction detail, and event history.
How do I analyze Stripe MRR with AI? Load your Stripe data with the dlt pipeline (one command), then open Claude Code or any PlyDB-compatible agent in the project directory and ask: “What is our current MRR, broken down by plan?” The agent queries the local DuckDB file via PlyDB and returns the answer.
Does my Stripe data leave my machine? No. The dlt pipeline loads data into a local DuckDB file. PlyDB is read-only by default — your AI can analyze and find patterns, but it cannot alter your data. Nothing is sent to any external service beyond the initial Stripe API call to download your data.
Does it work with live data? The pipeline loads a snapshot of your Stripe account each time you run it. For up-to-date analysis, run the pipeline before your session, or schedule it to run automatically.
What Stripe data does it cover? Subscriptions, customers, products, prices, invoices, balance transactions, and events — seven endpoints in total. The full field-level detail is in the README.
What AI agent do I use with it? Any agent that supports MCP or CLI tools. The repo is set up for Claude Code, but it works with Claude Desktop (via MCP), ChatGPT, Gemini, and any other agent that can call external tools. See the PlyDB documentation for the full compatibility list.
Can I add other data sources alongside Stripe? Yes — that’s one of the most natural extensions. PlyDB supports cross-source SQL queries, so you can add a HubSpot export, Google Analytics data, or any other dlt-supported source and query across all of them at once.
Does this work with Stripe test mode? Yes. Use a sk_test_ API key in
.dlt/secrets.toml to load test-mode data. Useful for exploring the setup
before running against your production account.
Is this production-ready? It’s a starter project. It works, and the semantic overlay is thorough enough for serious analysis. For production use you’d likely want to schedule the data loads, possibly move storage to S3 or a cloud warehouse, and extend the semantic overlay with your own business context. The repo README covers the main adaptation paths.
The Stripe Dashboard is great at showing you what happened. This is for when you need to know what to do about it.
Try the Stripe Analytics Starter →
Want to pair Stripe revenue with web analytics? The Google Analytics AI Starter connects your GA4 traffic and acquisition data to the same PlyDB setup — so you can query revenue and web analytics together in a single SQL statement.
Already building something similar with PlyDB? We’d love to hear about it. PlyDB is open source — join us on GitHub.