Levelbrook Labs

Building an AI-Powered Fraud Investigation Dashboard: Notes on Financial Crime Prevention

Financial crime prevention is one of those domains where the engineering challenges are as complex and dynamic as the problem itself. It's a high-stakes, adversarial game played at millisecond latencies and petabyte scale. The goal isn't just to build a system that works today, but one that can evolve faster than the opposition. This makes it a fascinating problem to decompose from a full-stack perspective.

I recently spent some time architecting a proof-of-concept for a real-time fraud alert and investigation dashboard. This write-up captures some notes on the domain, the architecture, and the pragmatic tradeoffs required to build something robust and useful.

The Domain: A High-Stakes Game of Cat and Mouse

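At its core, the problem is about identifying suspicious patterns in a massive stream of transactions. The technical interest comes from three constraints: decisions have to be made in real time, the data volume is very large, and the adversary adapts to whatever defenses are in place.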

This intersection of real-time processing, big data, and an adaptive threat model is what makes it a compelling engineering challenge. You can't just throw a model at it; you need a resilient, observable, and human-centric system.

System Architecture: A Polyglot, Purpose-Built Stack

No single language or database is the right tool for every part of this problem. A pragmatic architecture embraces a polyglot approach, choosing technologies for their specific strengths.


// Conceptual Data Flow

[Transaction Event] -> Kafka Topic
       |
       v
[Go/Java Scoring Service] --reads--> [Bigtable: User History]
       |                                --reads--> [Postgres: User Profile]
       |
       +--> [Python ML Model Endpoint] for inference
       |
       v
[Risk Score + Features] -> Kafka Topic
       |
       +----------------------------------> [BigQuery: Analytics & Model Training]
       |
       v
[Node.js WebSocket Service] --pushes--> [React Frontend Dashboard]
       |
       v
[Postgres: Cases/Alerts Table]
            

The Investigation Hub: React & TypeScript

The dashboard is the human interface to the machine's decisions. It needs to be fast, dense with information, and, above all, real-time: a new high-risk alert should appear on an investigator's screen as soon as it is scored, without a page refresh. For this, a stack of React and TypeScript is a solid choice. The component model suits a complex UI of transaction lists, detail panes, user history timelines, and network graphs. Real-time updates would be pushed from the backend via WebSockets or Server-Sent Events (SSE). My experience with Turbo Streams in the Rails world reinforces the value of server-pushed updates. SSE only carries messages from server to client, so WebSockets are the better fit when actions initiated from the UI should travel over the same bidirectional channel.

The Glue and Real-Time Layer: Node.js

A Node.js service acting as a Backend-for-Frontend (BFF) is ideal. Its non-blocking I/O model excels at managing thousands of persistent WebSocket connections and proxying requests to downstream services. It can listen to a Kafka topic for new alerts and immediately push them to the relevant investigator clients. It's the switchboard of the system.
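The routing step described above can be sketched in a few lines. This is a minimal illustration written in Python rather than Node.js, purely to show the idea. The `AlertRouter` class, the `assigned_to` field, and the `Send` callable (a stand-in for a WebSocket send function) are all hypothetical names, and the Kafka consumer that would call `route` is left out.

```python
from collections import defaultdict
from typing import Callable

# Stand-in for "send this payload down one WebSocket connection" (assumption).
Send = Callable[[dict], None]


class AlertRouter:
    """Keeps track of open connections per investigator and routes alerts to them."""

    def __init__(self) -> None:
        self._clients: dict[str, list[Send]] = defaultdict(list)

    def connect(self, investigator_id: str, send: Send) -> None:
        self._clients[investigator_id].append(send)

    def disconnect(self, investigator_id: str, send: Send) -> None:
        connections = self._clients.get(investigator_id, [])
        if send in connections:
            connections.remove(send)

    def route(self, alert: dict) -> int:
        """Push one alert to every connection of its assignee; return the delivery count."""
        targets = self._clients.get(alert["assigned_to"], [])
        for send in targets:
            send(alert)
        return len(targets)
```

In a real service the return value of `route` would matter: an alert delivered to zero connections still has to be persisted so the investigator sees it on their next login.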

The Core Logic: Golang & High-Performance Runtimes

The real-time transaction scoring engine sits on the critical path, so it has to be fast and highly concurrent. This is where a language like Golang (or Java/Rust) shines. Upon receiving a transaction, this service would perform a series of rapid, parallel lookups: fetch the user's last 10 transactions, check their account tenure, look up the device fingerprint's reputation, etc. After enriching the transaction with these features, it calls the ML model for a score and applies any hard-coded business rules. Go's concurrency primitives (goroutines, channels) are a natural fit for this fan-out/fan-in data retrieval pattern.
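To make the fan-out/fan-in pattern concrete, here is a minimal sketch using Python's `asyncio` in place of goroutines and channels. The three lookup functions, their canned return values, the feature names, and the thresholds in `decide` are all assumptions for illustration; in a real service each lookup would be a network call and the model score would come from the inference endpoint.

```python
import asyncio

# Hypothetical lookups returning canned values; each would be a remote call in practice.
async def recent_amounts(user_id: str) -> list[float]:
    await asyncio.sleep(0)
    return [20.0, 35.0, 25.0, 40.0]

async def account_tenure_days(user_id: str) -> int:
    await asyncio.sleep(0)
    return 400

async def device_reputation(device_id: str) -> float:
    await asyncio.sleep(0)
    return 0.9  # 0.0 = known bad, 1.0 = known good (assumed scale)


async def enrich(txn: dict, timeout_s: float = 0.05) -> dict:
    """Fan out the lookups concurrently, then fan in the results as features."""
    amounts, tenure, reputation = await asyncio.wait_for(
        asyncio.gather(
            recent_amounts(txn["user_id"]),
            account_tenure_days(txn["user_id"]),
            device_reputation(txn["device_id"]),
        ),
        timeout=timeout_s,  # one budget for the whole fan-out
    )
    average = sum(amounts) / len(amounts) if amounts else 0.0
    return {
        "amount_vs_avg": txn["amount"] / average if average else 0.0,
        "tenure_days": tenure,
        "device_reputation": reputation,
    }


def decide(features: dict, model_score: float) -> str:
    """Apply a hard-coded rule on top of the model score (thresholds are illustrative)."""
    if features["device_reputation"] < 0.1:
        return "block"
    return "review" if model_score >= 0.8 else "allow"
```

Wrapping the whole `gather` in a single timeout keeps the latency budget explicit: if any one lookup stalls, the caller gets a `TimeoutError` and can fall back to a default decision instead of holding up the transaction.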

Data Storage: A Three-Tiered Approach

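A single database can't efficiently serve all needs. The data flow above splits storage across three systems: Bigtable holds the user history that the scoring service reads on the hot path, Postgres holds user profiles and the cases/alerts table, and BigQuery holds the scored events used for analytics and model training.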

ML & Infrastructure

The model training and experimentation would live in the Python ecosystem (PyTorch, scikit-learn). Trained models can be served from the Python endpoint shown in the data flow, or exported to a format like ONNX for in-process inference in the Go/Java service when the extra network hop is too costly. All of this infrastructure (databases, services, networking rules) should be defined declaratively using Terraform so that environments are reproducible and easy to replicate as the system grows.

Pragmatism, Tradeoffs, and the Human in the Loop

Building a system like this is an exercise in managing tradeoffs. A senior engineer's role is not just to pick the "best" tech but to make the right compromises.

The central challenge is not just detecting fraud, but making the detection explainable and actionable for a human investigator. A 99.8% risk score is useless without the "why."

Explainability is a Feature: The system must not return a simple score. It must return the score *and* the top contributing features. The React dashboard shouldn't just say "High Risk." It should say "High Risk because: transaction amount is 50x user average, shipping address is new, and device IP is from a high-risk region." This empowers the investigator to make a faster, more accurate decision.
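As a rough sketch of what "score plus reasons" could look like, the example below uses a simple linear scorer, where each feature's contribution is just its weight times its value. The weights, feature names, and the `Explanation` type are invented for illustration. A non-linear model would need a proper attribution method (SHAP values are one common choice) in place of the weight-times-value step.

```python
from dataclasses import dataclass

# Hypothetical weights for a linear scorer; a trained model would supply these.
WEIGHTS = {
    "amount_vs_avg": 0.01,
    "new_shipping_address": 0.25,
    "high_risk_ip": 0.20,
    "tenure_years": -0.02,
}


@dataclass(frozen=True)
class Explanation:
    score: float                           # clamped to [0, 1]
    top_reasons: list[tuple[str, float]]   # (feature name, contribution), largest first


def explain(features: dict[str, float], top_k: int = 3) -> Explanation:
    """Return the risk score together with the features that pushed it up the most."""
    contributions = {
        name: WEIGHTS[name] * value
        for name, value in features.items()
        if name in WEIGHTS
    }
    score = min(1.0, max(0.0, sum(contributions.values())))
    # Only features that increased the risk are useful as "reasons".
    risk_raising = [(n, c) for n, c in contributions.items() if c > 0]
    risk_raising.sort(key=lambda item: item[1], reverse=True)
    return Explanation(score=score, top_reasons=risk_raising[:top_k])
```

The dashboard can then render `top_reasons` next to the score, so the investigator sees which signals drove the alert rather than a bare number.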

The Human Feedback Loop: This is the most critical part of the architecture. When an investigator clicks "Confirm Fraud" or "Not Fraud" in the dashboard, that action must trigger an event. This event, a piece of high-quality, human-labeled data, is the most valuable asset the system can generate. It gets fed back into BigQuery, where it's used to retrain and validate the models. A system without this feedback loop is static and will inevitably be defeated.
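One possible shape for that feedback event is sketched below. The field names, the two label values, and the `build_label_event` helper are assumptions, not a fixed schema. The function only builds the JSON payload, and publishing it to a topic or loading it into BigQuery is left out.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional

VALID_LABELS = {"confirmed_fraud", "not_fraud"}  # mirrors the two dashboard buttons


@dataclass(frozen=True)
class InvestigatorLabel:
    alert_id: str
    transaction_id: str
    investigator_id: str
    label: str
    model_version: str  # which model produced the alert, for later evaluation
    labeled_at: str     # ISO 8601 timestamp in UTC


def build_label_event(
    alert_id: str,
    transaction_id: str,
    investigator_id: str,
    label: str,
    model_version: str,
    now: Optional[datetime] = None,
) -> str:
    """Validate an investigator decision and serialize it as a JSON event."""
    if label not in VALID_LABELS:
        raise ValueError(f"unknown label: {label!r}")
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    event = InvestigatorLabel(
        alert_id, transaction_id, investigator_id, label, model_version, timestamp
    )
    return json.dumps(asdict(event), sort_keys=True)
```

Recording the model version alongside the label is a deliberate choice in this sketch: it lets later analysis separate "the model was wrong" from "an older model was wrong and has since been replaced."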

Where It Breaks at Scale:

Closing Reflection

Engineering for fraud prevention is a systems problem that extends beyond pure code. It's about creating a tight, symbiotic loop between automated analysis and human expertise. The goal isn't to build a perfect, all-seeing AI that obviates the need for people. Instead, the goal is to build a system that acts as a force multiplier for human investigators, allowing them to focus their expertise on the most ambiguous and complex cases. The architecture must serve this primary goal: to augment, not replace, human intelligence in a domain where the context and consequences are profoundly human.