Platform Engineer / Kubernetes / AWS / Developer Platforms

Platform Engineer building reliable Kubernetes platforms and developer tooling.

I work on platform engineering at Just Eat Takeaway.com, where I build tooling, automation, and guardrails around Kubernetes, AWS, progressive delivery, observability, and developer self-service. My work sits at the intersection of reliability, migration, operational guardrails, and developer experience.

/usr/local/bin/lian
LZY

operator

Lian Zhen Yang

$ kubectl get impact

network diagnostics | quota analysis | DR mapping

$ helm status delivery

rollout observability, rollback fixes, KEDA validation

$ terraform plan platform

AWS, EKS, DNS, IAM, certificates, developer guardrails

case studies

Platform engineering case studies.

Selected stories from developer tooling, Kubernetes reliability, disaster recovery, and progressive delivery work.


Developer tooling

Network Path Diagnostics

A Go-based self-service route-path visualization tool that made cross-account AWS networking issues faster to diagnose.

Problem
Cloud routing checks were manual, slow, and difficult for non-platform engineers to reason about. This created repeated support loops whenever teams needed to understand connectivity paths.
Outcome
Reduced network connectivity diagnosis time by roughly 90% and enabled engineers and product owners to answer routing questions with less direct platform-team support.
Go, AWS Networking, VPC, Transit Gateway

Architecture and resilience

Disaster Recovery Dependency Mapping

AI-assisted dependency discovery and diagram-as-code workflows for disaster recovery planning across hundreds of components.

Problem
Disaster recovery planning needed accurate service dependency maps, but manual discovery across hundreds of repositories and components would have been slow and error-prone.
Outcome
Mapped 200+ components and compressed discovery work into a shorter validation workflow, helping the DR initiative move from unknowns to actionable architecture review.
Architecture, C4, Structurizr, AI Workflows

Reliability engineering

Kubernetes Capacity Governance

Python tooling that scanned Kubernetes namespace quotas, HPA limits, and Karpenter capacity before scale risks turned into incidents.

Problem
Namespace quotas could silently block workloads from scaling to HPA maximums during peak traffic. Existing checks did not give teams an actionable view before the risk mattered.
Outcome
Warned 45+ teams before peak-traffic risks became incidents and validated quota suggestions against Karpenter capacity limits.
Python, Kubernetes, HPA, ResourceQuota
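The quota risk described in this case study comes down to simple arithmetic: if every workload in a namespace scaled to its HPA maximum, would the total CPU requests still fit inside the namespace quota? The following is a minimal sketch of that check, not the actual tooling. The `Workload` class, the `quota_headroom` function, and the sample numbers are all hypothetical and exist only to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    # Hypothetical record; a real tool would read these values from the cluster.
    name: str
    cpu_request_millicores: int  # CPU request per pod
    hpa_max_replicas: int        # HPA maximum replica count


def quota_headroom(workloads, quota_cpu_millicores):
    """Return the quota left over if every workload ran at its HPA maximum.

    A negative result means the namespace quota would block scaling
    before the HPA maximums are reached.
    """
    demand = sum(w.cpu_request_millicores * w.hpa_max_replicas for w in workloads)
    return quota_cpu_millicores - demand


# Illustrative numbers only.
workloads = [
    Workload("checkout", cpu_request_millicores=500, hpa_max_replicas=20),
    Workload("search", cpu_request_millicores=250, hpa_max_replicas=40),
]

headroom = quota_headroom(workloads, quota_cpu_millicores=16000)
if headroom < 0:
    print(f"quota short by {-headroom}m CPU at HPA max")
```

With these sample values, the two workloads would request 20000m CPU at full scale against a 16000m quota, so the check reports a 4000m shortfall. The same comparison extends to memory, and a suggested quota increase could in turn be compared against node capacity limits.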

Deployment reliability

Progressive Delivery Hardening

Improved rollout observability, rollback behavior, and autoscaling validation around Helm, Argo Rollouts, and KEDA.

Problem
Canary releases, rollback workflows, and autoscaling configurations had edge cases that could confuse developers or make operational signals harder to trust.
Outcome
Made progressive delivery workflows safer, reduced developer context switching, and improved confidence in rollout and autoscaling behavior.
Helm, Helmfile, Argo Rollouts, KEDA

capabilities

Platform skills grouped by outcomes, not logo grids.

The stack is Kubernetes-heavy, but the real value is in reliability, migration, developer experience, and operational guardrails.

Kubernetes Reliability

EKS production readiness, HPA and ResourceQuota analysis, Karpenter validation, and cluster operability.

EKS, HPA, ResourceQuota, Karpenter, Prometheus, Datadog

Progressive Delivery

Safer rollout and rollback workflows with clearer developer-facing signals.

Helm, Helmfile, Argo Rollouts, KEDA, Datadog

Cloud Infrastructure

Infrastructure-as-code, AWS networking, DNS, certificates, and platform migration work.

AWS, Terraform, Terragrunt, IAM, Route53, ACM, Cloudflare

Developer Platforms

Self-service tooling, Backstage scaffolding, documentation, and platform UX.

Backstage, TypeScript, Go, Python, React, Next.js

Security & Governance

Access-model proposals, guardrails, certificate validation, and risk assessment.

RBAC, IRSA, Vault, DNS, EKS, Python

about

From process engineering to platform engineering.

I started in chemical engineering and data automation, where the work was always about understanding systems, finding bottlenecks, and making processes measurable. Platform engineering became the natural next step: the systems are cloud infrastructure and Kubernetes, the bottlenecks are developer friction and operational risk, and the best solutions are often small tools, clear guardrails, and documentation that lets teams move without waiting.