The Challenge
A Series B fintech processing over 2 million transactions daily was hemorrhaging money to fraud. Their legacy rule-based system caught only 58% of fraudulent transactions and generated so many false positives that their operations team spent more time reviewing legitimate transactions than catching real fraud.
The company needed a system that could detect fraud in real time (sub-200ms latency), dramatically reduce false positives, and adapt to new fraud patterns without manual rule updates. With only 6 months of runway, they couldn't afford a year-long ML initiative.
Our Approach
Weeks 1-2: Data Audit & Architecture
We started by auditing 18 months of transaction data (400M+ records) to understand fraud patterns, seasonal variations, and the specific failure modes of the existing rules engine. We identified 47 features that correlated with fraud but weren't being used, including device fingerprinting signals, velocity patterns, and behavioral anomalies.
We designed a two-stage architecture: a fast gradient-boosted model for real-time scoring, backed by a deep learning model for complex pattern detection on flagged transactions.
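To make the two-stage idea concrete, here is a minimal Python sketch of the routing logic. It is an illustration under stated assumptions, not the production code: the function name, the `Decision` type, and both thresholds (`escalate_at`, `block_at`) are hypothetical, and the two models are passed in as plain callables that return a fraud score between 0 and 1.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Transaction = Dict[str, float]


@dataclass
class Decision:
    score: float   # score from whichever stage made the final call
    stage: str     # "fast" or "deep"
    flagged: bool  # True if the transaction is treated as fraud


def two_stage_score(
    txn: Transaction,
    fast_model: Callable[[Transaction], float],
    deep_model: Callable[[Transaction], float],
    escalate_at: float = 0.30,  # hypothetical: fast score that triggers stage two
    block_at: float = 0.80,     # hypothetical: deep score treated as fraud
) -> Decision:
    """Score every transaction on the fast path; escalate only suspicious ones."""
    fast = fast_model(txn)
    if fast < escalate_at:
        # Most traffic should exit here, so the slow model never runs for it.
        return Decision(score=fast, stage="fast", flagged=False)
    deep = deep_model(txn)
    return Decision(score=deep, stage="deep", flagged=deep >= block_at)


# Stand-in models for demonstration only.
fast = lambda t: 0.1 if t["amount"] < 100 else 0.5
deep = lambda t: 0.9 if t["new_device"] else 0.2
print(two_stage_score({"amount": 500.0, "new_device": 1.0}, fast, deep))
```

The point of the split is that the expensive model only sees the small share of traffic the cheap model cannot clear, which keeps the latency budget for typical transactions close to the fast path alone.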
Weeks 3-5: Model Development & Training
We trained the primary model on 12 months of labeled data, using the remaining 6 months for validation. Key engineering decisions:
- Feature engineering pipeline built on Apache Flink for real-time streaming features
- XGBoost ensemble for the fast-path scorer (p99 latency: 12ms)
- Transformer-based model for second-stage analysis on flagged transactions
- Custom loss function weighted toward recall to minimize missed fraud
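One common way to weight a loss toward recall is to scale the log-loss contribution of fraud examples so that a missed fraud costs more than a false alarm. The sketch below shows that idea in plain Python. It is an assumption about what such a loss could look like, not the loss used in this project, and `fraud_weight` is a hypothetical value. It returns per-example gradients and Hessians with respect to the raw margin, which is the form gradient-boosting libraries typically expect from a custom objective.

```python
import math
from typing import List, Tuple


def weighted_logloss_grad_hess(
    raw_preds: List[float],
    labels: List[int],
    fraud_weight: float = 5.0,  # hypothetical: cost of a missed fraud vs. a false alarm
) -> Tuple[List[float], List[float]]:
    """Gradient and Hessian of class-weighted log loss w.r.t. the raw margin.

    Loss per example: -w * [y * log(p) + (1 - y) * log(1 - p)],
    where p = sigmoid(margin) and w = fraud_weight for fraud, 1 otherwise.
    """
    grads: List[float] = []
    hess: List[float] = []
    for margin, y in zip(raw_preds, labels):
        p = 1.0 / (1.0 + math.exp(-margin))
        w = fraud_weight if y == 1 else 1.0
        grads.append(w * (p - y))        # first derivative
        hess.append(w * p * (1.0 - p))   # second derivative
    return grads, hess


# At margin 0 the model is undecided (p = 0.5); the fraud example pulls 5x harder.
print(weighted_logloss_grad_hess([0.0, 0.0], [1, 0]))
```

Raising `fraud_weight` pushes the model toward higher recall at the cost of more false positives, so it would normally be tuned together with the decision threshold.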
Weeks 6-7: Shadow Deployment & Tuning
We ran the new system in shadow mode alongside the existing rules engine for two weeks: it scored live traffic, but only the rules engine's decisions took effect. This let us compare the two head-to-head without exposing customers to risk. During this phase, we tuned the decision threshold to balance precision against recall according to the client's risk appetite.
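Threshold tuning of this kind can be sketched as a simple search over logged shadow-mode scores: keep only the thresholds that meet a recall floor, then pick the one with the best precision. The example below is illustrative; `pick_threshold` and the `min_recall` default are hypothetical, and the quadratic scan is written for clarity rather than speed.

```python
from typing import List, Optional, Tuple


def pick_threshold(
    scores: List[float],
    labels: List[int],
    min_recall: float = 0.95,  # hypothetical recall floor set by risk appetite
) -> Optional[Tuple[float, float, float]]:
    """Return (threshold, precision, recall) with the highest precision among
    thresholds whose recall is at least min_recall, or None if none qualifies."""
    total_fraud = sum(labels)
    if total_fraud == 0:
        return None
    best: Optional[Tuple[float, float, float]] = None
    for t in sorted(set(scores)):
        # A transaction is flagged when its score is at or above the threshold.
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        recall = tp / total_fraud
        precision = tp / (tp + fp)  # tp + fp >= 1 because t is an observed score
        if recall >= min_recall and (best is None or precision > best[1]):
            best = (t, precision, recall)
    return best


scores = [0.10, 0.40, 0.35, 0.80, 0.90]
labels = [0, 0, 1, 1, 1]
print(pick_threshold(scores, labels, min_recall=1.0))
```

Lowering `min_recall` lets the search trade some missed fraud for fewer false positives, which is the dial a client's risk appetite effectively sets.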
Week 8: Production Cutover
We moved to full production with a gradual traffic ramp: 10% on day one, 50% by day three, and 100% by the end of the week. Automated rollback triggers were tied to the false positive rate and latency SLAs.
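A percentage ramp with rollback guards can be sketched in a few lines. The version below is an assumption-laden illustration rather than the deployed mechanism: `in_rollout` and `should_roll_back` are hypothetical names, hashing the transaction ID is one possible way to split traffic, `max_fpr` is an invented limit, and applying the 200 ms limit to p99 latency is a guess at how the SLA is measured.

```python
import hashlib


def in_rollout(txn_id: str, percent: int) -> bool:
    """Deterministically send a stable `percent` share of traffic to the new system.

    Hashing the ID means the same transaction always lands in the same bucket,
    and raising `percent` only ever adds traffic to the new system.
    """
    digest = hashlib.sha256(txn_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # bucket in 0..99
    return bucket < percent


def should_roll_back(
    false_positive_rate: float,
    p99_latency_ms: float,
    max_fpr: float = 0.02,          # hypothetical false positive limit
    max_latency_ms: float = 200.0,  # latency SLA from the case study (assumed p99)
) -> bool:
    """Trip the rollback if either guardrail metric breaches its limit."""
    return false_positive_rate > max_fpr or p99_latency_ms > max_latency_ms


ramp = {1: 10, 3: 50, 5: 100}  # day -> percent; the day for 100% is assumed
print(in_rollout("txn-12345", ramp[1]), should_roll_back(0.01, 48.0))
```

In practice the metrics fed to `should_roll_back` would come from a monitoring window over recent traffic, and a tripped guard would set the rollout percentage back to zero.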
Results
Within the first quarter of production deployment:
- 95.7% fraud detection rate — up from 58% with the rule-based system
- $2.3M saved in prevented fraudulent transactions
- 60% reduction in false positives — freeing the ops team to focus on genuine edge cases
- Sub-50ms average latency — well within the 200ms SLA
- Model drift monitoring catches distribution shifts within hours, triggering automated retraining
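The case study does not say which drift metric is used. One widely used option is the Population Stability Index (PSI), which compares how a feature or score is distributed across bins in a reference window versus a recent window. The sketch below assumes PSI, hypothetical bin edges, and a retraining trigger of 0.25, a commonly cited rule of thumb rather than a value from this project.

```python
import math
from typing import List, Sequence


def psi(
    expected: Sequence[float],
    actual: Sequence[float],
    edges: Sequence[float],
    eps: float = 1e-6,  # floor for empty bins so the logarithm stays defined
) -> float:
    """Population Stability Index between a reference and a recent sample."""
    if not expected or not actual:
        raise ValueError("both samples must be non-empty")

    def shares(values: Sequence[float]) -> List[float]:
        counts = [0] * (len(edges) + 1)
        for v in values:
            idx = sum(1 for e in edges if v >= e)  # number of edges at or below v
            counts[idx] += 1
        return [max(c / len(values), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


reference = [0.1] * 50 + [0.9] * 50  # last month's scores (made-up data)
recent = [0.1] * 10 + [0.9] * 90     # today's scores (made-up data)
drift = psi(reference, recent, edges=[0.5])
print(round(drift, 3), "retrain" if drift > 0.25 else "ok")
```

Running a check like this on a schedule over each input feature and the model's output score is one way distribution shifts could be surfaced within hours.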
"Arkyon took our messy data landscape and turned it into a production-grade AI system in 7 weeks. Their engineering depth is unmatched in the boutique space."
R.S. — CTO, Series B FinTech
What Made This Work
- Data-first approach — we spent 30% of the project on data quality and feature engineering, which had more impact than model architecture
- Shadow deployment — running both systems in parallel eliminated risk and built stakeholder confidence
- Automated retraining — the system continuously adapts to new fraud patterns without manual intervention
- Clear success criteria — detection rate, false positive rate, and latency SLAs were defined before we wrote a single line of code