DevOps Engineer · Platform Reliability

DevOps practices

I build reliable CI/CD pipelines and scalable cloud infrastructure as code: fast delivery, zero-downtime deployments, and observability by default.

  • Terraform evangelist
  • Kubernetes SRE
  • Cloud cost detective

Live delivery score

99.95% uptime across 120+ services

35% faster releases after pipeline redesign
120+ services observed with unified telemetry
15 min average rollback time after infrastructure incidents

What partnering with me feels like

Delivery with confidence

Layered CI/CD guardrails, ephemeral environments, and release strategies that keep features flowing without surprises.

  • Progressive delivery & feature flags
  • Optimized runners and caching
  • Self-service deployment portals

Resilient platforms

Kubernetes foundations, service meshes, and policy-as-code so teams build quickly while staying compliant.

  • GitOps & drift detection
  • Chaos & load rehearsal
  • Automated disaster recovery

Insights that matter

Full visibility from tracing to cost analytics. I surface actionable signals, not dashboards that gather dust.

  • SLI/SLO design and governance
  • Real-time incident command
  • Cloud spend detective work

Experienced with

Docker, Kubernetes, Terraform, AWS, Azure, GCP, Hybrid clouds, Prometheus, Grafana, ELK, GitLab CI, Jenkins, GitHub Actions

Delivery journey highlights

2024

Multi-cloud Terraform accelerator

Built reusable infrastructure modules powering AWS & GCP foundations for 8 product squads.

2023

GitHub Actions migration

Moved legacy GitLab CI pipelines to GitHub Actions, cutting build minutes by 35% and adding deployment previews.

2022

Always-on observability

Centralized logging, tracing, and SLOs across 120+ workloads with automated playbooks and on-call readiness.

Recent projects

Elastic deploy runway

Migrated from GitLab CI to GitHub Actions with dynamic runners, driving 35% faster builds and fully automated rollbacks.

Multi-cloud IaC toolkit

Developed Terraform blueprints for AWS and GCP that ship with policy-as-code, tagging standards, and drift detection.

24/7 reliability ops

Implemented SLO-driven alerting, incident retrospectives, and shared dashboards to align engineering and product.