I Asked an AI to Build Me a Bank Data Platform. Here's What Happened.
Claude wrote this post for me and we reviewed and revised it together. I know this won't be popular with at least a few of my regular readers, but it felt like the right thing to do given the project it's describing.
Lambdas don't play well with RDS databases. Connection pooling, VPC cold starts, idle timeouts — it's a well-documented headache. But a Lambda is the obvious choice for a cloud-native Open Banking API implementation. And RDS is usually the best choice when you want to query and analyse data. I'd hit this tension before and never resolved it cleanly.
So I wanted to see if Apache Iceberg could be the answer. It works with Lambdas (a write is just Parquet files plus a metadata commit on S3, with no database connection to hold open), and it can be queried like a relational database via Spark or Athena. I was also keen to see what the Lambda integration actually looked like in practice.
I decided to pair with Claude Code on the build — treating it as a junior engineer who never sleeps and never complains about writing Terraform tests.
Why a Lakehouse?
Bank data arrives as events. A customer grants consent, accounts are discovered, transactions are fetched. You need two things at once — operational tracking (where are we in the import?) and analytical storage (let me query all my transactions over time).
I settled on a two-store pattern. DynamoDB handles the ephemeral state: connection lifecycle, import progress, and token storage. It's fast, cheap, and purpose-built for key-value access patterns. Apache Iceberg on S3 holds the settled data: accounts, balances, and transactions. It supports schema evolution, queries cleanly via Spark or Athena, and keeps an immutable append-only history.
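As a rough illustration of the DynamoDB side, here is a minimal sketch of connection lifecycle tracking in a single-table layout. The status names, the `pk`/`sk` key format, and the `advance` helper are all assumptions made for this example, not the repo's actual schema.

```python
from enum import Enum


class ConnectionStatus(str, Enum):
    # Hypothetical lifecycle states; the real table may use different names.
    PENDING_CONSENT = "PENDING_CONSENT"
    AUTHORISED = "AUTHORISED"
    IMPORTING = "IMPORTING"
    COMPLETE = "COMPLETE"
    FAILED = "FAILED"


# Which status changes are allowed; anything else is rejected.
ALLOWED = {
    ConnectionStatus.PENDING_CONSENT: {ConnectionStatus.AUTHORISED, ConnectionStatus.FAILED},
    ConnectionStatus.AUTHORISED: {ConnectionStatus.IMPORTING, ConnectionStatus.FAILED},
    ConnectionStatus.IMPORTING: {ConnectionStatus.COMPLETE, ConnectionStatus.FAILED},
    ConnectionStatus.COMPLETE: set(),
    ConnectionStatus.FAILED: set(),
}


def connection_key(connection_id: str) -> dict:
    """Primary key for the connection item (assumed single-table key format)."""
    return {"pk": f"CONNECTION#{connection_id}", "sk": "META"}


def advance(item: dict, new_status: ConnectionStatus) -> dict:
    """Return a copy of the item with the new status, or raise if the move is invalid."""
    current = ConnectionStatus(item["status"])
    if new_status not in ALLOWED[current]:
        raise ValueError(f"cannot go from {current.value} to {new_status.value}")
    return {**item, "status": new_status.value}
```

Keeping the allowed transitions in one place means a retried or out-of-order message cannot silently move a finished import back to an earlier state.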
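The architecture, in outline: the API-facing Lambdas handle the consent flow and put a message on SQS; a processing Lambda picks the message up, pulls the data from the provider, writes it to Iceberg on S3, and tracks progress in DynamoDB.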
SQS sits in the middle on purpose — it decouples the consent flow from the heavy lifting, so a slow data fetch never blocks a customer's browser.
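The producer side of that decoupling might look something like the sketch below. It assumes the consent redirect carries the standard OAuth `code` and `state` query parameters, and that `state` holds the connection ID. The message fields, the redirect target, and the injected `send_message` function (standing in for an SQS send) are hypothetical.

```python
import json
from typing import Callable


def handle_callback(query: dict, send_message: Callable[[str], None]) -> dict:
    """Handle the post-consent redirect: enqueue the import and return at once."""
    # Field names in the message are assumptions made for this sketch.
    message = {"connection_id": query["state"], "auth_code": query["code"]}
    send_message(json.dumps(message))  # in AWS this would be an SQS send
    # Redirect the browser straight away; the data fetch happens elsewhere.
    return {"statusCode": 302, "headers": {"Location": "/connected.html"}}
```

Passing a plain list's `append` method as `send_message` is enough to exercise the handler locally, with no AWS account involved.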
The Open Banking Detour
I originally designed the whole system around GoCardless Bank Account Data. They offered free Open Banking API access to UK bank accounts, a webhook-driven model, and good documentation — exactly what I needed. The architecture doc, the implementation plan, and the first few modules were all built around their API.
Then I discovered, mid-build, that GoCardless had quietly stopped accepting new signups. No announcement. No deprecation notice. The login page had no link to register — just a message saying new signups were closed. It appears to be a quiet wind-down of the free self-service tier while existing users retain access.
So I evaluated alternatives:
| Provider | Verdict |
|---|---|
| Tink (Visa) | Free sandbox, but states "business use only" |
| Plaid | Sandbox free, but more US-centric |
| Yapily | Enterprise-only, no self-service free tier |
| Enable Banking | UK coverage uncertain, requires sales quote |
| TrueLayer | UK-native, free sandbox, no limits, Mock Bank with test credentials |
TrueLayer won. It's UK-native, the sandbox is genuinely free with no expiry, and you don't need a commercial agreement or FCA registration to use it. The Mock Bank provides a full end-to-end flow with test credentials.
However, it's a fundamentally different integration model. GoCardless uses webhooks — they push events to you when accounts are discovered and transactions are ready. TrueLayer's Data API is synchronous — after consent, you pull the data yourself.
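In code terms, the pull model comes down to a loop like the one below. This is a sketch only: `BankClient` and its method names are a hypothetical interface invented for the example, not TrueLayer's SDK, and the `account_id` field name is an assumption about the response shape.

```python
from typing import Protocol


class BankClient(Protocol):
    # Hypothetical interface; the method names are not TrueLayer's SDK.
    def list_accounts(self) -> list[dict]: ...
    def list_transactions(self, account_id: str) -> list[dict]: ...
    def get_balance(self, account_id: str) -> dict: ...


def pull_everything(client: BankClient) -> dict:
    """After consent, walk the accounts and fetch the data for each one."""
    accounts = client.list_accounts()
    transactions, balances = [], []
    for account in accounts:
        account_id = account["account_id"]
        transactions.extend(client.list_transactions(account_id))
        balances.append(client.get_balance(account_id))
    return {"accounts": accounts, "transactions": transactions, "balances": balances}
```

With webhooks, each of those three fetches would instead be triggered by an event arriving from the provider.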
The architecture survived the pivot because the queue-based decoupling meant only "what triggers the message" changed, not "what processes it." The webhook Lambda became unnecessary, but the SQS-to-processor pattern stayed. The lesson: loose coupling paid off on day one.
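A consumer that doesn't care what produced the message could be sketched as follows. The `Records`, `body`, and `messageId` fields are the standard shape of an SQS event delivered to Lambda. The `make_handler` wrapper and the injected `process` function are assumptions for illustration, and the `batchItemFailures` response only takes effect if partial batch responses are enabled on the event source mapping.

```python
import json
from typing import Callable


def make_handler(process: Callable[[dict], None]):
    """Build a Lambda handler for SQS events around a processing function."""

    def handler(event: dict, context: object = None) -> dict:
        failures = []
        for record in event.get("Records", []):
            try:
                process(json.loads(record["body"]))
            except Exception:
                # Report only this message as failed so SQS retries it alone.
                failures.append({"itemIdentifier": record["messageId"]})
        return {"batchItemFailures": failures}

    return handler
```

Nothing in the handler knows whether a webhook or a consent callback put the message on the queue, which is why swapping providers left it alone.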
How Claude Code Built It
The build happened in roughly five stages across two days. I'll go deep on the most interesting moment — the provider pivot — because that's where AI-assisted development felt most different from working alone.
Design phase. I described the requirements in plain English. Claude produced the architecture document and a phased implementation plan, both of which are still in the repo. This was the most valuable output — not code, but structure. Ten phases, each with test-first development, clear module boundaries, and explicit verification steps.
Scaffold and modules. I built the Terraform modules one at a time, following the plan. Each module came with tests — terraform test for infrastructure and vitest for the Node.js Lambdas. The data-processing Lambda turned out to need Python (more on that shortly), so pytest joined the toolchain. TDD meant fewer "it compiles but doesn't work" surprises.
The pivot. When I discovered GoCardless was unavailable, I described the constraint to Claude. It produced the ADR (architecture decision record), rewrote the implementation plan, and migrated all the Lambda code. The connection-based model — replacing GoCardless's requisition-based approach — was its suggestion. A fundamental architecture change, handled calmly and systematically. No panic, no throwaway prototypes. Just: here's the decision, here's the rationale, here's the new code.
This is what felt different. A solo developer hitting a dead end mid-build has to context-switch between disappointment, research, decision-making, and re-implementation. Claude compressed that into a single focused session: evaluate alternatives, pick one, document why, rewrite the code.
Integration and debugging. The first deploy had issues. Iceberg write failures (schema mismatches between TrueLayer's response format and my table definitions), CORS problems, and callback URL wiring all needed sorting out. Claude diagnosed and fixed from error logs — I'd paste the CloudWatch error, and it would explain what was wrong and produce a fix.
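One way to guard against that kind of schema mismatch is to coerce every API record into exactly the table's columns and types before writing. The sketch below is illustrative: the column list, the field names in the raw record, and the `normalise` helper are assumptions, not the actual table definition.

```python
from datetime import datetime
from decimal import Decimal

# Hypothetical target columns and the Python type each one must hold.
TRANSACTION_COLUMNS = {
    "transaction_id": str,
    "account_id": str,
    "timestamp": datetime,
    "amount": Decimal,
    "currency": str,
    "description": str,
}


def normalise(raw: dict, account_id: str) -> dict:
    """Coerce one API transaction into exactly the table's columns and types."""
    row = {
        "transaction_id": str(raw["transaction_id"]),
        "account_id": account_id,
        # Accept a trailing "Z" on older Python versions too.
        "timestamp": datetime.fromisoformat(raw["timestamp"].replace("Z", "+00:00")),
        # Go through str so a float does not drag binary noise into the Decimal.
        "amount": Decimal(str(raw["amount"])),
        "currency": raw["currency"],
        "description": raw.get("description") or "",
    }
    for name, expected in TRANSACTION_COLUMNS.items():
        if not isinstance(row[name], expected):
            raise TypeError(f"{name} should be {expected.__name__}")
    return row
```

Fields the provider sends that the table doesn't know about are dropped here rather than failing at write time.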
The accounts page. The next morning, I added an end-to-end feature in a single session: a query Lambda that reads from Iceberg, an API Gateway route, and a frontend page that displays the data. Planned, built, and deployed in about an hour.
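The query Lambda behind that route might look roughly like this. The event and response shapes follow API Gateway's HTTP API payload v2. The `email` query parameter, the `make_accounts_handler` wrapper, and the injected `fetch_accounts` function (which would read from Iceberg) are assumptions for this sketch.

```python
import json
from typing import Callable


def make_accounts_handler(fetch_accounts: Callable[[str], list]):
    """Build an HTTP API handler around a function that looks up accounts."""

    def handler(event: dict, context: object = None) -> dict:
        # queryStringParameters is absent when the request has no query string.
        params = event.get("queryStringParameters") or {}
        email = params.get("email")
        if not email:
            return {"statusCode": 400, "body": json.dumps({"error": "email is required"})}
        return {
            "statusCode": 200,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps({"accounts": fetch_accounts(email)}),
        }

    return handler
```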
What Worked Well
- The original hypothesis held — Lambdas can write to Iceberg cleanly, and the resulting banking data is queryable from Spark or Athena exactly as if it lived in an RDS instance. No connection pools, no VPC cold starts, no idle timeouts.
- Structured planning before coding — the implementation plan prevented scope creep and gave both of us a shared reference point.
- TDD approach — caught integration issues early, particularly around DynamoDB access patterns and Iceberg schema definitions.
- Handling the pivot — the ADR plus the rewritten plan meant nothing was lost in the transition; the reasoning is documented.
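The first point in the list above, a Lambda writing to Iceberg with no connection to manage, could look roughly like the sketch below. It assumes a recent PyIceberg release with the Glue catalog and PyArrow available in the Lambda layer. The table name `bank.transactions` and the `append_rows` helper are hypothetical.

```python
def append_rows(rows: list[dict], table_name: str = "bank.transactions") -> int:
    """Append rows to an Iceberg table through the Glue catalog."""
    if not rows:
        return 0
    # Imported here so the module still loads where the PyIceberg layer is absent.
    import pyarrow as pa
    from pyiceberg.catalog import load_catalog

    catalog = load_catalog("glue", **{"type": "glue"})
    table = catalog.load_table(table_name)
    # Build the Arrow table against the Iceberg schema so column types line up.
    arrow_table = pa.Table.from_pylist(rows, schema=table.schema().as_arrow())
    table.append(arrow_table)
    return len(rows)
```

Each call is a self-contained commit: data files and metadata go to S3, the catalog pointer is updated, and nothing stays open between invocations.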
What Needed Human Judgement
- Choosing TrueLayer over the alternatives — that needed domain knowledge about the UK Open Banking landscape and what "free sandbox" actually means in practice.
- Deciding sandbox-only was fine — a product scope decision, not a technical one.
- Validating the end-to-end flow in a real browser — clicking through the consent redirect, checking the email arrived, and viewing the accounts page.
- AWS account setup, SES verification, and SSM secrets — the "real world" bits that aren't in any codebase.
The Technical Stack
| Layer | Technology | Why |
|---|---|---|
| Infrastructure | Terraform | Reproducible, modular, testable |
| Compute | AWS Lambda (Node.js + Python) | Node for API-facing (fast cold start), Python for data processing (PyIceberg) |
| Queue | SQS + DLQ | Decouples consent from processing, handles retries |
| State | DynamoDB | Single-table design, pay-per-request, point-in-time recovery |
| Storage | S3 + Iceberg + Glue | Analytics-ready, schema evolution, standard format |
| API | API Gateway (HTTP API) | Cheap, auto-deploy, payload v2 |
| Frontend | Static HTML/JS on S3/CloudFront | No framework, no build step |
| Email | SES | Transactional notifications |
| Open Banking | TrueLayer (sandbox) | Free, UK-native, good developer experience |
| AI | Claude Code | Design, implementation, debugging |
Note the deliberate absence of: React, Docker (except for the Python layer build), Kubernetes, databases-as-a-service, API frameworks. The simplest thing that works.
Closing
The system works. It connects to a real (sandbox) bank via TrueLayer, pulls accounts, transactions, and balances, writes them into Apache Iceberg tables on S3, emails me when it's done, and lets me view the data in a browser. All infrastructure-as-code, all test-driven, all from a monorepo. The core was built in a single day, with the accounts page added the next morning.
Claude Code didn't replace engineering judgement — it amplified it. I still made the decisions: what to build, which provider to use, sandbox-only scope, plain HTML over a framework. But the implementation velocity was genuinely different. A system that would have taken two or three weekends of scattered effort took one focused day of pairing.
Bear in mind what this isn't. There's no auth beyond "type your email." No token refresh for expired connections. No monitoring or alerting. No CI/CD pipeline. It's sandbox-only — the transactions are synthetic test data from TrueLayer's Mock Bank. The goal was to prove the pattern end-to-end, not to build something production-ready.
The repo is open if you want to poke around. The architecture doc, implementation plan, and ADR are all in docs/ — they're as much a record of the process as the code itself.

