— Case study
Reporting Platform — from on-prem NiFi to BigQuery
Migrated 150+ legacy reports off on-prem NiFi to a BigQuery-backed Java service. Eliminated recurring monthly outages and gave SysOps a self-service admin.
- Role: Product Engineer (RFC author + implementer)
- Period: Nov 2023 – Present
- Stack: Java 21 · Spring Boot · BigQuery · GCP · Cloud SQL · Datastream · Angular
Context
AstraPay’s legacy reporting ran on an on-prem NiFi cluster that had been quietly accumulating debt for years. Two pain points landed at the same time: monthly outages during the peak generation window that paged Operations at 2am, and a Finance team that couldn’t trace who changed which report when. The platform served 200+ critical merchant reports (settlement, reconciliation, regulatory), so “rewrite slowly while keeping the lights on” was the only viable path.
I was the sole engineer on the project and also its product owner. I authored the Business and Technical RFCs (system flow diagrams, ERDs, data mapping specs), prioritized the backlog with WSJF (Weighted Shortest Job First), and shipped the code.
Decisions
The first call was where to put the source of truth. NiFi’s flow-based model made every report a snowflake, so I picked BigQuery, with a Java service orchestrating templates against it. That gave us version control, audit trails, and a clean exit path off the on-prem cluster.
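To make the version-control and audit-trail point concrete, here is a minimal sketch of a versioned template registry. It is not the production code: `TemplateVersion`, `TemplateRegistry`, and every field name are hypothetical, and an in-memory map stands in for whatever store the real service uses. The idea it illustrates is that each template change is appended as a new version stamped with an author and a time, so "who changed which report when" becomes a lookup.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical types for illustration; names are not from the real service.
record TemplateVersion(String reportId, int version, String sql,
                       String changedBy, Instant changedAt) {}

final class TemplateRegistry {
    // Append-only history per report; an in-memory stand-in for a real store.
    private final Map<String, List<TemplateVersion>> history = new HashMap<>();

    // Publishing never overwrites: it appends the next version number.
    synchronized TemplateVersion publish(String reportId, String sql,
                                         String changedBy, Instant now) {
        List<TemplateVersion> versions =
                history.computeIfAbsent(reportId, id -> new ArrayList<>());
        TemplateVersion next = new TemplateVersion(
                reportId, versions.size() + 1, sql, changedBy, now);
        versions.add(next);
        return next;
    }

    // The latest version is the one report generation would use.
    synchronized TemplateVersion current(String reportId) {
        List<TemplateVersion> versions = history.get(reportId);
        if (versions == null) {
            throw new IllegalArgumentException("Unknown report: " + reportId);
        }
        return versions.get(versions.size() - 1);
    }

    // Full change history, oldest first, as an immutable copy.
    synchronized List<TemplateVersion> audit(String reportId) {
        return List.copyOf(history.getOrDefault(reportId, List.of()));
    }
}
```

Because versions are only ever appended, regenerating an old report can pin the template version that was current at the time instead of silently picking up later edits.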
For the data pipeline, Datastream replaced the bespoke ETL paths that kept breaking whenever an upstream schema changed. UUPDP-compliant handling was a hard requirement, so Compliance and Legal reviewed the design before any code went in.
The third decision was the one that paid back fastest: a self-service Report Admin Web for SysOps. Until then, every schedule change and recipient list update had been bottlenecked on an engineering ticket. Once the admin tool shipped, that load left engineering entirely. SysOps now manages schedules, recipients, and ad-hoc regenerations on their own.
Results
- 150+ reports migrated; on-prem NiFi cluster decommissioned.
- Monthly outages eliminated. The new service has not paged Operations since cutover.
- 431+ merchants now settle through the new pipeline.
- SysOps no longer files engineering tickets for routine recipient or schedule changes.
Stack notes
The interesting tradeoff was around backfills. Switching the source of truth to BigQuery meant historical reports needed deterministic regeneration. Finance had to be able to ask “what did this look like as of February” and get the same numbers regardless of when we ran the query. We solved it with point-in-time partitions on the BigQuery side and idempotent generation on the Java side. Both pieces had to land together for the migration to be trustworthy.
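A minimal sketch of how the two pieces can fit together, under stated assumptions: the table name, column names, and the `ReportRun` and `IdempotentGenerator` types are all hypothetical, and an in-memory map stands in for durable storage. The query reads only the partition for the requested as-of date (BigQuery named parameters use the `@name` syntax), and the generator keys each run on a hash of report, as-of date, and template version, so asking for the same report twice returns the stored output instead of rendering again.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.time.LocalDate;
import java.util.HexFormat;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical names throughout; a sketch, not the production service.
record ReportRun(String reportId, LocalDate asOf, int templateVersion) {

    // Illustrative point-in-time query: one snapshot partition per date,
    // with ORDER BY so row order is stable across runs.
    static final String QUERY = """
            SELECT merchant_id, SUM(amount) AS total
            FROM `my_project.reporting.settlement_snapshots`
            WHERE snapshot_date = @as_of
            GROUP BY merchant_id
            ORDER BY merchant_id
            """;

    // Deterministic key: identical inputs always hash to the same value.
    String key() {
        String raw = reportId + "|" + asOf + "|v" + templateVersion;
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(raw.getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }
}

final class IdempotentGenerator {
    // In-memory stand-in for wherever generated reports are persisted.
    private final Map<String, String> outputs = new ConcurrentHashMap<>();

    // Renders at most once per key; repeat calls return the stored output.
    String generate(ReportRun run, Supplier<String> render) {
        return outputs.computeIfAbsent(run.key(), k -> render.get());
    }
}
```

Including the template version in the key is a design choice in this sketch: a later template edit produces a new key rather than overwriting what Finance already received for that date.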