Multi-Cluster Kubernetes Platform
Principal-level DevOps and AI Infrastructure Engineer designing, scaling, and securing mission-critical distributed systems across cloud and on-prem environments. I build resilient platforms on Kubernetes and AWS EKS, with deep expertise in observability, CI/CD automation, and DevSecOps.
A deep technical stack assembled across 16+ years — spanning cloud architecture, container orchestration, observability, automation, security, and machine learning infrastructure.
Designing scalable, resilient, cost-efficient cloud architectures across hyperscalers and on-prem.
Automation-first pipelines that make deployments rapid, reliable, and repeatable.
Real-time visibility into distributed systems — metrics, logs, and databases at scale.
Building the infrastructure that powers next-generation AI/ML workloads.
Secure-by-design principles applied across the full platform lifecycle.
Leading complex incident response, mentoring teams, and communicating across cultures.
16+ years across cloud architecture, DevOps, SRE and AI infrastructure — layered on top of graduate study in Computer Science, Artificial Intelligence and Cyber Law.
From greenfield platform design to hardening what you already run in production — here are the engagements I take on.
Multi-cluster Kubernetes on AWS EKS, Azure AKS, OCI or on-prem. Designed for availability, resilience and cost efficiency.
End-to-end telemetry stacks that surface the right signal — alerting that respects SLIs/SLOs instead of paging on noise.
Jenkins pipelines, Ansible playbooks and infrastructure-as-code that make shipping and recovery boring in the best way.
Security as a first-class citizen: SSL/TLS lifecycle, container scanning, secrets management and continuous monitoring of vulnerability posture.
Scalable, secure infrastructure for AI/ML workloads — from training pipelines to model serving and data versioning.
High-throughput MySQL with Group Replication — tuning, replication health and recovery from transaction storms.
A selection of the most impactful platforms and systems I've built, scaled and hardened.
Architected and ran containerized infrastructure across AWS EKS and on-prem clusters for mission-critical connectivity systems in aviation and maritime — the platform inflight Wi-Fi and IFE ride on.
Designed and operationalised a full observability platform — Prometheus for metrics, Grafana for dashboards, cAdvisor for container telemetry, ELK for logs. Alerting that mapped to real user impact, not noise.
Administered a production MySQL 8.2 cluster with Group Replication. Resolved replication conflicts, rolled-back transactions and storage bottlenecks while keeping write throughput high and consistency intact.
End-to-end Jenkins pipelines with Artifactory, SonarQube and Docker integrations across Python, C/C++, Node.js and Vue codebases. Scheduled automation for health checks, log rotation and disk management. Ansible at the config layer.
Rolled out secure-by-design principles across the platform: SSL/TLS lifecycle via AWS Certificate Manager, container vulnerability scanning with Aqua Security, and Ansible-automated certificate delivery for NGINX.
Built a recommendation engine that drove product visibility through on-site suggestions and email campaigns, using purchase history, cart activity, brand preference and browsing behaviour. Boosted data mining and automation by 45%.
Open to Senior SRE, Platform Engineering, DevOps/DevSecOps leadership and AI Infrastructure roles — remote, hybrid, or on-site in the UAE. Also available for consulting engagements. I typically reply within 24 hours.