Levelbrook Labs

Building AI Fraud Alert Management: Notes on a Financial Crime Prevention Platform

Financial Crime Prevention Platforms (FCPPs) operate at a brutal scale. They ingest a torrent of financial events—payments, logins, profile changes—and must distinguish, in milliseconds, legitimate activity from a vast and evolving landscape of threats. The core engineering challenge isn't just about running a predictive model; it's about managing the inevitable fallout: the alerts.

Any effective detection system, whether rule-based or machine-learned, will generate false positives. Because an unmitigated false negative (missed fraud) is costly, models are often tuned for sensitivity, which produces a firehose of alerts that human analysts must triage, investigate, and disposition. An organization's ability to fight financial crime is therefore often gated less by the sophistication of its models than by the throughput of its human investigators. This makes the alert management dashboard one of the most critical, and technically interesting, components of the entire platform.
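
To see why sensitive tuning produces a firehose, it helps to run the base-rate arithmetic. The sketch below uses purely hypothetical rates (0.1% fraud prevalence, 90% recall, a 1% false-positive rate); they are illustrative assumptions, not figures from any real deployment.

```python
def alert_precision(prevalence: float, recall: float, false_positive_rate: float) -> float:
    """Fraction of alerts that are real fraud, from Bayes' rule on event rates."""
    true_alerts = prevalence * recall                      # fraud that gets flagged
    false_alerts = (1.0 - prevalence) * false_positive_rate  # legitimate activity that gets flagged
    return true_alerts / (true_alerts + false_alerts)


# Hypothetical operating point (assumed values, for illustration only).
PREVALENCE = 0.001          # 0.1% of events are fraudulent
RECALL = 0.90               # the model catches 90% of fraud
FALSE_POSITIVE_RATE = 0.01  # it wrongly flags 1% of legitimate events

precision = alert_precision(PREVALENCE, RECALL, FALSE_POSITIVE_RATE)
alerts_per_million = 1_000_000 * (
    PREVALENCE * RECALL + (1.0 - PREVALENCE) * FALSE_POSITIVE_RATE
)
print(f"precision={precision:.3f}, alerts per million events={alerts_per_million:.0f}")
```

Under those assumed rates, only about 8% of alerts are real fraud, so roughly eleven of every twelve investigations end as false positives. That is the workload the dashboard has to make bearable.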

It's an intersection of high-throughput data engineering, low-latency machine learning, and thoughtful human-computer interaction. The goal is to build a system that acts as a force multiplier for human expertise.

System Architecture: A Polyglot Approach

No single tool is right for this job. A pragmatic architecture embraces a polyglot stack, choosing technologies for their specific strengths in a distributed system. Here's a sketch of a robust platform.

The Data Model: Separating Concerns

The data has different access patterns and consistency requirements. A hybrid persistence strategy is essential.

  • Postgres for Cases & Audits: The "case" is the central unit of work for an investigator. It has a defined state (Open, In-Progress, Closed), an assignee, and associated alerts. This is relational data that demands strong ACID guarantees. Postgres is the canonical choice for this transactional core, managing case states, user permissions, and regulatory-mandated audit trails with absolute consistency.
  • Bigtable for Events: The raw material of an investigation is the firehose of user events: transactions, logins, password resets. This is time-series data, written constantly and read in chronological slices ("show me everything this user did in the last 72 hours"). A wide-column store like Google Bigtable is built for this. A well-designed row key, such as user_id#inverted_timestamp, keeps each user's events contiguous and sorted newest-first, so a time-range read is a single bounded scan. Leading with user_id rather than the timestamp also spreads writes across the key space and avoids the hot-spotting that timestamp-first keys cause.
  • BigQuery for Analytics & Training: To find new fraud rings or retrain models, you need to run large-scale analytical queries across months or years of data. BigQuery serves as the data warehouse. Data from Postgres and Bigtable is streamed or batch-loaded into BigQuery, where data scientists can join it with other sources and perform the heavy lifting of feature engineering and model training without impacting the production operational databases.
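
The user_id#inverted_timestamp row key mentioned above can be sketched as follows. This only builds key bytes and scan bounds; it does not call any Bigtable client. The helper names, the microsecond resolution, and the assumption that user IDs never contain "#" are all illustrative choices, not details from a real schema.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
# Subtracting from the largest signed 64-bit value reverses the sort order.
MAX_TS_MICROS = 2**63 - 1
KEY_WIDTH = 19  # digits in MAX_TS_MICROS; zero-padding keeps byte order == numeric order


def event_row_key(user_id: str, event_time: datetime) -> bytes:
    """Build 'user_id#inverted_timestamp' so newer events sort first.

    `event_time` must be timezone-aware. Assumes user IDs never contain '#'.
    """
    micros = (event_time - EPOCH) // timedelta(microseconds=1)  # exact integer math
    inverted = MAX_TS_MICROS - micros
    return f"{user_id}#{inverted:0{KEY_WIDTH}d}".encode("utf-8")


def scan_range(user_id: str, start: datetime, end: datetime) -> tuple[bytes, bytes]:
    """Inclusive row-key bounds covering one user's events in [start, end].

    Newer events have smaller inverted values, so the scan begins at `end`.
    """
    return event_row_key(user_id, end), event_row_key(user_id, start)


now = datetime(2024, 1, 4, tzinfo=timezone.utc)
first_key, last_key = scan_range("u42", now - timedelta(hours=72), now)
```

A "last 72 hours" query for one user then becomes a single contiguous range read between first_key and last_key, with the most recent events returned first.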

Service Decomposition & Technology Choices

A microservices approach allows for independent scaling and technology selection.

Ingestion & Real-time Processing (Go): A set of lightweight, high-concurrency services written in Go can consume from a message queue like Kafka. Their job is to parse, validate, and write events to Bigtable. Go's performance, simple concurrency model (goroutines), and static binaries make it well suited to these high-throughput, I/O-bound network services.

ML/AI Services (Python): Python is the lingua franca of machine learning. Separate Python services subscribe to the event stream, pass transaction data to loaded models (e.g., XGBoost, graph-based models), and publish scores back to the stream. Critically, this layer also includes services for AI-powered augmentation. For example, an LLM integration service can take a cluster of related, cryptic events and generate a human-readable summary:

"This alert was triggered by a high-velocity sequence of small card-not-present transactions from multiple IPs, followed by a large purchase of a digital gift card—a pattern highly indicative of card testing."

Backend-for-Frontend (BFF) & API (Node.js/TypeScript): The React frontend needs a tailored API that aggregates data from multiple downstream services. A Node.js server using TypeScript is a natural fit. It can query Postgres for case details, fetch event history from Bigtable, and call the Python LLM service for a summary, composing it all into a single payload for the client. Using TypeScript end-to-end (React frontend, Node.js BFF) lets both sides share type definitions for that payload, which reduces cognitive overhead and turns a whole class of contract mismatches into compile-time errors.
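
The aggregation pattern itself is language-agnostic. Here is a minimal sketch in Python (chosen only for illustration; the BFF in this design is Node.js) that fans out to three downstream calls concurrently and treats the LLM summary as optional. The fetcher callables, their signatures, and the timeout value are hypothetical stand-ins for real service clients.

```python
import asyncio
from typing import Any, Awaitable, Callable

Fetcher = Callable[[str], Awaitable[Any]]


async def with_timeout(call: Awaitable[Any], seconds: float, fallback: Any) -> Any:
    """Await `call`, returning `fallback` if it exceeds the deadline."""
    try:
        return await asyncio.wait_for(call, timeout=seconds)
    except asyncio.TimeoutError:
        return fallback


async def build_case_payload(
    case_id: str,
    fetch_case: Fetcher,      # hypothetical: case row from Postgres
    fetch_events: Fetcher,    # hypothetical: event history from Bigtable
    fetch_summary: Fetcher,   # hypothetical: LLM summary service
    summary_timeout: float = 2.0,
) -> dict:
    # All three calls run concurrently. Case and events are required, so their
    # errors propagate; the summary is optional and degrades to None when slow.
    case, events, summary = await asyncio.gather(
        fetch_case(case_id),
        fetch_events(case_id),
        with_timeout(fetch_summary(case_id), summary_timeout, None),
    )
    return {"case": case, "events": events, "summary": summary}
```

The design choice worth noting is the asymmetry: the page is useless without the case record, but an analyst can still work if the summary is late, so only the optional call gets a fallback.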

The Investigator's Cockpit (React/TypeScript): The frontend is where it all comes together. It's a dense, real-time application. My experience building live UIs with WebSockets and SSE (Server-Sent Events) informs the choice of transport. For this system, SSE is often a good fit: updates flow one way, from server to client, and SSE runs over plain HTTP with automatic reconnection built into the browser's EventSource API. When an analyst is viewing a case, the Node.js BFF can hold open an SSE connection, pushing updates as new events arrive or as another analyst leaves a comment. Virtualized lists, which render only the rows currently in view, are non-negotiable for displaying potentially thousands of transactions without crashing the browser.
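
Part of SSE's appeal is how simple the wire format is: a text/event-stream response made of "field: value" lines, with a blank line ending each message. The sketch below formats one such message. It is written in Python purely for illustration (the BFF here is Node.js), and the event name and payload fields are made up.

```python
import json
from typing import Optional


def sse_frame(event: str, payload: dict, event_id: Optional[str] = None) -> str:
    """Serialize one Server-Sent Events message.

    json.dumps never emits raw newlines, so the payload fits on one data line.
    """
    lines = []
    if event_id is not None:
        # The browser echoes the last id back on reconnect (Last-Event-ID header).
        lines.append(f"id: {event_id}")
    lines.append(f"event: {event}")
    lines.append(f"data: {json.dumps(payload, separators=(',', ':'))}")
    return "\n".join(lines) + "\n\n"  # the blank line terminates the message


# Hypothetical update pushed when another analyst comments on the open case.
frame = sse_frame("case.comment", {"caseId": "c1", "author": "analyst_b"}, event_id="17")
```

Sending an id with each message lets a reconnecting client tell the server where it left off, so the server can replay anything the analyst missed.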

Infrastructure as Code (Terraform): The entire cloud environment—from the Bigtable instance to the Kubernetes cluster running the services—should be defined declaratively using Terraform. This ensures reproducibility, simplifies environment management, and provides a clear audit trail for infrastructure changes.

Tradeoffs, Correctness, and the Human-in-the-Loop

Building such a system is an exercise in managing tradeoffs. While the detection pipeline needs to be low-latency, the investigation UI must prioritize correctness. An analyst cannot act on stale or inconsistent data. This is why the case state lives in Postgres: when an analyst claims a case, that `UPDATE` must commit atomically, and every later read of the case must see the new assignee, so two analysts can never both believe they own it.
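
One common way to make a claim safe is a conditional update: the statement succeeds only if the case is still open and unassigned, and the affected-row count tells the caller who won. The sketch below uses Python's built-in sqlite3 as a runnable stand-in for Postgres; the table and column names are hypothetical, and a Postgres driver would use its own placeholder syntax for the same statement.

```python
import sqlite3


def claim_case(conn: sqlite3.Connection, case_id: int, analyst: str) -> bool:
    """Assign the case to `analyst` only if it is still open and unassigned."""
    with conn:  # one transaction: commits on success, rolls back on error
        cursor = conn.execute(
            "UPDATE cases SET assignee = ?, state = 'In-Progress' "
            "WHERE id = ? AND state = 'Open' AND assignee IS NULL",
            (analyst, case_id),
        )
    # Exactly one row changes for the winner; a late claimer matches zero rows.
    return cursor.rowcount == 1


# In-memory stand-in for the real cases table (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cases (id INTEGER PRIMARY KEY, state TEXT NOT NULL, assignee TEXT)"
)
conn.execute("INSERT INTO cases (id, state, assignee) VALUES (1, 'Open', NULL)")
conn.commit()

first = claim_case(conn, 1, "analyst_a")   # finds the case open and takes it
second = claim_case(conn, 1, "analyst_b")  # matches no rows, so the claim is refused
```

Because the check and the write happen in a single statement, there is no window between "is it free?" and "take it" for a second analyst to slip through.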

Where Things Break at Scale

The architecture looks clean on a whiteboard, but production load reveals the weak points. A Black Friday traffic spike can overwhelm the event ingestion pipeline. A poorly designed Bigtable row key can lead to catastrophic write latency. A single complex case with tens of thousands of associated events can cause the BFF to time out while trying to aggregate data. Defensive engineering is key: strict timeouts, circuit breakers between services, and intelligent caching (e.g., Redis for hot entity data) are not optional.
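
As one example of that defensive posture, here is a minimal circuit-breaker sketch. It is a simplified illustration under assumed thresholds, not a production implementation; in practice a maintained library or service-mesh feature usually fills this role. The class and parameter names are hypothetical.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when the breaker refuses a call without trying the dependency."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        reset_after: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._threshold = failure_threshold
        self._reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_after:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: fall through and allow one trial call.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self._threshold:
                self._opened_at = self._clock()  # open, or re-open after a failed trial
            raise
        # Any success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Wrapping each downstream client in its own breaker means a struggling dependency gets rejected quickly instead of tying up BFF requests until they time out.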

The AI Feedback Loop

The most critical feature of the entire platform is the feedback mechanism. When an analyst closes a case, they don't just click a button. They provide structured feedback: "Confirmed Fraud - Account Takeover," "False Positive - User Traveling," "Suspicious - Refer to AML."

This isn't just for record-keeping. It is the most valuable data the system can generate. This feedback flows directly back to BigQuery, where it becomes the ground truth for the next generation of models. This tight, human-in-the-loop feedback cycle is what allows the platform to adapt and evolve, turning every decision an analyst makes into a lesson for the machine.
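
To make that concrete, here is a small sketch of turning a structured disposition into a training label. The "Outcome - Reason" format comes from the examples above; the binary label mapping, and the choice to leave "Suspicious" referrals unlabeled, are assumptions for illustration rather than a prescribed policy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Disposition:
    outcome: str  # e.g. "Confirmed Fraud"
    reason: str   # e.g. "Account Takeover"


def parse_disposition(text: str) -> Disposition:
    """Split an 'Outcome - Reason' string into its two parts."""
    outcome, _, reason = text.partition(" - ")
    return Disposition(outcome.strip(), reason.strip())


# Assumed mapping: only decisive outcomes become ground truth.
LABELS = {"Confirmed Fraud": 1, "False Positive": 0}


def training_label(disposition: Disposition) -> Optional[int]:
    """Return 1 (fraud), 0 (legitimate), or None when the outcome is undecided."""
    return LABELS.get(disposition.outcome)


closed_cases = [
    "Confirmed Fraud - Account Takeover",
    "False Positive - User Traveling",
    "Suspicious - Refer to AML",
]
labels = [training_label(parse_disposition(text)) for text in closed_cases]
```

Keeping the reason alongside the outcome matters as much as the label: "User Traveling" tells a data scientist which feature misled the model, not just that it was wrong.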

The AI's role is not to replace the analyst, but to be the world's best assistant. It sifts through the noise, summarizes the complex, and highlights connections a human might miss, freeing up the analyst to do what they do best: apply domain expertise and make high-stakes judgment calls.

A Closing Reflection

The challenge of building a financial crime prevention platform is compelling because it's a microcosm of modern software engineering. It's a distributed system where correctness and performance are in constant tension. It's a data problem at a massive scale. And, most importantly, it's a sociotechnical system where the ultimate goal is not simply to achieve a high F1 score, but to build a seamless, low-friction interface between a human expert and a machine's pattern-matching capabilities. The engineering is in service of that collaboration.