Platform-Engineering-Lab

Built a full hybrid environment spanning AWS, GCP, and on-prem ProxMox, complete with multi-tenant Kubernetes, Terraform IaC, and observability pipelines.

Troubleshooting Callout

More than nine distinct troubleshooting events were logged, demonstrating skill in resolving Docker image mismatches, pod crash loops, network policy errors, JSON errors, and Terraform configuration failures across GCP and ProxMox.

Deployments

Summary: Creating, updating, rolling out, scaling, and troubleshooting Kubernetes deployments. Pod lifecycle management, replication, rollout strategies, annotations, and error handling.

Docker

Summary: Installing Docker, launching containers, creating and mounting volumes, and validating persistent storage at the container level.

GCP

Summary: Provisioning compute resources and storage buckets, configuring IAM and service accounts, managing billing, creating snapshots, and enabling authentication from a local workstation. Emphasis on cloud operations and observability.

Istio

Summary: Installation and validation of Istio service mesh, deploying sample applications, creating gateways, exposing services externally, and validating ingress traffic.

Minikube

Summary: Local Kubernetes cluster setup, pod deployment, external load balancer provisioning, filesystem creation, and troubleshooting Minikube startup issues.

Namespaces

Summary: Creating and managing Kubernetes namespaces, validating taints, and applying proper namespace scoping for resources.

Networking

Summary: Applying network policies, identifying manifest errors, and validating connectivity to pods.

OnPremIaC

Summary: Hypervisor and VM management and troubleshooting. RBAC administration, enforcement via API authentication, and troubleshooting.

Pods

Summary: Creation, management, troubleshooting, and validation of pods. Includes manifest writing, exec into containers, volume mounts, and pod lifecycle events.

Prometheus

Summary: Monitoring and observability of pods, providing visibility into pod up status and health checks based on defined metrics.

ReplicaSets

Summary: ReplicaSet manifest creation, scaling, validating replica counts, and ensuring high availability.

Security

Summary: Certificate creation and signing for Kubernetes, managing service account keys, and creating Kubernetes secrets to secure sensitive credentials.
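
As a rough illustration of the secret-creation step, the sketch below builds an Opaque Secret manifest in Python. Kubernetes expects values under `data` to be base64-encoded, and `kubectl apply -f` accepts JSON as well as YAML. The function name `build_secret_manifest` and the names `demo-credentials`, `demo`, and `admin` are hypothetical placeholders, not values taken from the project.

```python
import base64
import json


def build_secret_manifest(name: str, namespace: str, data: dict) -> dict:
    """Return a Kubernetes Secret manifest with base64-encoded values.

    Hypothetical helper for illustration; not part of the project repo.
    """
    # Values under `data` must be base64-encoded strings.
    encoded = {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in data.items()
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        "data": encoded,
    }


# Placeholder values, assumed for the example only.
manifest = build_secret_manifest("demo-credentials", "demo", {"username": "admin"})
print(json.dumps(manifest, indent=2))
```

Note that base64 is an encoding, not encryption, so a manifest like this should be treated as sensitive as the plaintext it carries.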

ServiceAccounts

Summary: Service account creation, attaching to pods, decoding JWT tokens, validating token TTL, and troubleshooting service account deployment issues.
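
A minimal sketch of the JWT decoding and TTL check, assuming the token carries the standard `exp` claim. It only inspects the payload and does not verify the signature, so it is suitable for troubleshooting rather than authentication. The helper names and the demo token (built locally with made-up claims) are hypothetical and not taken from the project.

```python
import base64
import json
import time
from typing import Optional


def decode_jwt_payload(token: str) -> dict:
    """Decode the claims segment of a JWT WITHOUT verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a token of the form header.payload.signature")
    payload_b64 = parts[1]
    # JWTs use unpadded base64url; restore the padding before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def token_ttl_seconds(claims: dict, now: Optional[float] = None) -> float:
    """Seconds until the `exp` claim; negative if the token has expired."""
    if now is None:
        now = time.time()
    return claims["exp"] - now


def _b64url(obj: dict) -> str:
    """Encode a dict as unpadded base64url JSON (used only to build the demo token)."""
    raw = json.dumps(obj).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# Hypothetical token assembled locally for illustration; a real one would
# come from the pod's mounted service account token file.
demo_claims = {
    "sub": "system:serviceaccount:demo:demo-sa",
    "iat": 1700000000,
    "exp": 1700003600,
}
demo_token = ".".join([_b64url({"alg": "none"}), _b64url(demo_claims), "sig"])

claims = decode_jwt_payload(demo_token)
ttl = token_ttl_seconds(claims, now=1700000000)
print(claims["sub"], ttl)
```

Passing `now` explicitly keeps the TTL calculation deterministic; omitting it compares `exp` against the current system time.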

Storage

Summary: PersistentVolume (PV) and PersistentVolumeClaim (PVC) creation, ephemeral storage validation, mounting across pods, multi-namespace validation, and StorageClass creation.

Terraform

Summary: Declarative infrastructure management in GCP and on-prem, including installing Terraform, initializing, planning, applying configurations, and destroying resources.