Financial Services | Kubernetes, AWS, DevOps
The Challenge

A fast-growing fintech needed to scale their microservices architecture reliably while meeting strict regulatory requirements.

The situation

This was a fintech running around 30 microservices on a handful of EC2 instances, deployed via a collection of bash scripts that had grown organically over two years. No two services deployed the same way. Rollbacks meant SSH-ing into a box and hoping the previous artifact was still cached somewhere.

The real urgency, though, was regulatory. They had an FCA audit scheduled in three months, and the auditors would expect full deployment traceability: who deployed what, when, and through what process. At that point, the answer to all three questions was essentially “it depends who was on call.” Speaking of on-call, their engineers were averaging four pages a night. Morale was low. Two senior developers had already handed in their notice.

We needed to move quickly, but we also needed to get it right. There would be no second chance with the auditors.

Our approach

After a thorough assessment of their architecture and compliance requirements, we chose Amazon EKS as the foundation. The team already had strong AWS knowledge, and EKS with managed node groups meant we could reduce operational overhead without sacrificing the control that a regulated environment demands. Migrating to an entirely new cloud provider would have burned time they simply did not have.

For the service mesh, Istio was a straightforward choice. The FCA requirements called for encryption of all inter-service communication. Istio’s mTLS gave us that out of the box, with the added benefit of fine-grained traffic policies and observability across every service boundary.

We implemented a full GitOps workflow using Argo CD. Every deployment now flows through a pull request, is reviewed, and leaves a complete audit trail in Git. No more undocumented changes. No more “I’ll just push this fix quickly.”
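To make the "who deployed what, when" idea concrete, here is a minimal Python sketch of how an audit trail could be read back out of a Git history. It is an illustration under assumptions, not the tooling used on this engagement: the names `ChangeRecord`, `parse_git_log`, and `read_audit_trail` are hypothetical, and it assumes the deployment repository is available as a local clone.

```python
import subprocess
from dataclasses import dataclass
from typing import List

# Unit separator: unlikely to appear in author names or commit subjects.
FIELD_SEP = "\x1f"
# Commit hash, author name, ISO 8601 author date, subject line.
LOG_FORMAT = "%H%x1f%an%x1f%aI%x1f%s"


@dataclass(frozen=True)
class ChangeRecord:
    """One entry in the audit trail (hypothetical structure)."""
    commit: str
    author: str
    timestamp: str
    summary: str


def parse_git_log(raw: str) -> List[ChangeRecord]:
    """Turn `git log` output in LOG_FORMAT into structured records."""
    records = []
    for line in raw.split("\n"):
        if not line:
            continue  # skip blank lines, e.g. a trailing newline
        commit, author, timestamp, summary = line.split(FIELD_SEP, 3)
        records.append(ChangeRecord(commit, author, timestamp, summary))
    return records


def read_audit_trail(repo_path: str, limit: int = 20) -> List[ChangeRecord]:
    """Read the most recent changes from a local clone (path is an assumption)."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "-n", str(limit),
         f"--pretty=format:{LOG_FORMAT}"],
        check=True, capture_output=True, text=True,
    )
    return parse_git_log(result.stdout)
```

Because every change lands through a reviewed pull request, a report like this answers the auditors' three questions from the repository alone, without relying on anyone's memory of who was on call.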

About three weeks in, we hit an unexpected problem. During our security review, we discovered that database credentials and API keys for two payment providers were stored in plaintext in a private GitHub repository. They had been there for over a year. We immediately rotated every compromised secret and introduced Sealed Secrets so that encrypted credentials could live safely in version control. It was a sobering find, but catching it before the audit saved the client from what could have been a serious regulatory issue.
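A check of the kind that surfaces plaintext credentials can be sketched in a few lines of Python. This is an illustrative example only: the rule set below is a small, assumed sample rather than the scanner used in the review, and names such as `RULES`, `Finding`, and `scan_text` are hypothetical. Dedicated secret-scanning tools cover far more patterns.

```python
import re
from typing import List, NamedTuple


class Finding(NamedTuple):
    line_number: int
    rule: str


# Illustrative rules only; a real scanner would carry many more.
RULES = {
    # AWS access key IDs start with AKIA (long-term) or ASIA (temporary).
    "aws_access_key_id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    # PEM-encoded private key headers.
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    # A credential-like name assigned a literal value of 8+ characters.
    "hardcoded_credential": re.compile(
        r"(?i)(?:password|passwd|secret|api[_-]?key|token)\b\s*[:=]\s*"
        r"['\"]?[^\s'\"]{8,}"
    ),
}


def scan_text(text: str) -> List[Finding]:
    """Return one finding per (line, rule) match in the given text."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in RULES.items():
            if pattern.search(line):
                findings.append(Finding(number, rule))
    return findings
```

Running a check like this against every commit, for example in a pre-commit hook or a CI step, helps catch new plaintext secrets before they reach the repository, which complements storing the encrypted form with Sealed Secrets.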

We also ran into a performance snag. After enabling Istio across all services, p99 latency on their payment processing endpoint jumped by around 40ms. For a fintech handling real-time transactions, that was not acceptable. We spent a few days tuning the Envoy sidecar resource limits and adjusting the connection pooling configuration, which brought the overhead down to under 5ms. A good reminder that service meshes are not simply plug-and-play.

The outcome

The numbers improved across the board, and the impact went beyond them.

  • Less than 3 minutes of total downtime across all services in the first year of production
  • Deployment time dropped from roughly 3 hours to 12 minutes, with full rollback capability
  • FCA audit passed on the first attempt, with auditors specifically praising the deployment traceability
  • Auto-scaling now handles 10x traffic spikes without manual intervention
  • On-call pages dropped by over 80%, and the two engineers who had resigned agreed to stay

The client went from dreading their audit to using their infrastructure as a selling point with institutional partners. Six months on, they have doubled their engineering team and are deploying to production multiple times a day with confidence.
