Kubernetes Deployment
Step-by-Step Guide for AWS EKS & GCP GKE for Enterprises
Introduction: Why Managed Kubernetes for Enterprise?
Kubernetes has emerged as the orchestrator of choice for containerized applications, offering scalability, resilience, and portability that few alternatives match. Yet for enterprises, the operational overhead of managing a self-hosted Kubernetes cluster can be prohibitive. This is where managed Kubernetes services from major cloud providers shine: Amazon Web Services (AWS) with Elastic Kubernetes Service (EKS) and Google Cloud Platform (GCP) with Google Kubernetes Engine (GKE). These services abstract away the complexities of control plane management, allowing cloud architects and development teams to focus on application deployment and innovation. This guide provides a detailed, step-by-step approach to deploying Kubernetes on both EKS and GKE, along with critical enterprise-grade considerations for a robust, secure, and scalable production environment.
Deploying Kubernetes on AWS EKS
Amazon Elastic Kubernetes Service (EKS) provides a managed Kubernetes control plane that integrates deeply with AWS services for networking, security, and scalability. We'll use eksctl, a simple CLI tool for creating and managing EKS clusters.
Before you begin, ensure you have the necessary tools installed and configured on your local machine or a suitable CI/CD environment.
Tools:
- AWS CLI: Command Line Interface for interacting with AWS services.
- kubectl: Kubernetes command-line tool for interacting with clusters.
- eksctl: A simple CLI for Amazon EKS that automates cluster creation and management.
Configuration:
Configure your AWS CLI with appropriate credentials that have permissions to create EKS clusters, EC2 instances, and other related resources.
aws configure
Ensure your IAM user/role has sufficient permissions (e.g., AdministratorAccess for a quick start, or more granular permissions for production).
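Before provisioning anything, it's worth confirming which identity and region the CLI will actually use; both checks below are standard AWS CLI commands:
# Verify the IAM identity the CLI is authenticated as
aws sts get-caller-identity
# Confirm the default region matches your intended cluster region
aws configure get region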
For enterprise deployments, it’s best practice to define your EKS cluster configuration in a YAML file. This allows for version control and repeatability.
# cluster.yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: my-enterprise-cluster
  region: us-east-1
  version: "1.28" # Specify your desired Kubernetes version

vpc:
  id: "vpc-xxxxxxxxxxxxxxxxx" # Optional: use an existing VPC
  subnets:
    public:
      us-east-1a: { id: "subnet-xxxxxxxxxxxxxxxxx" }
      us-east-1b: { id: "subnet-yyyyyyyyyyyyyyyyy" }
    private:
      us-east-1a: { id: "subnet-zzzzzzzzzzzzzzzzz" }
      us-east-1b: { id: "subnet-wwwwwwwwwwwwwwwww" }
  # If you don't specify VPC/subnet IDs, eksctl will create new ones.
  # For enterprise, using existing, well-planned VPCs is highly recommended.

managedNodeGroups:
  - name: ng-general-purpose
    instanceType: m5.large
    minSize: 3
    maxSize: 10
    desiredCapacity: 3
    volumeSize: 20 # GiB
    amiFamily: AmazonLinux2 # or Bottlerocket
    # Dedicated IAM role for worker nodes (eksctl creates one if not specified)
    # iam:
    #   instanceRoleARN: "arn:aws:iam::123456789012:role/my-custom-nodegroup-role"
    labels: { role: general }
    tags:
      Environment: Production
      Project: MyEnterpriseApp
    # Enable EBS encryption for node volumes
    # volumeEncrypted: true
    # ssh:
    #   allow: true # SSH access to nodes
    #   publicKeyPath: ~/.ssh/id_rsa.pub

cloudWatch:
  clusterLogging:
    # Enable all control plane log types for comprehensive monitoring
    enableTypes: ["api", "audit", "authenticator", "controllerManager", "scheduler"]

secretsEncryption:
  keyARN: "arn:aws:kms:us-east-1:123456789012:key/your-kms-key-id" # Optional: envelope encryption of Kubernetes secrets with KMS

# IAM Roles for Service Accounts (IRSA)
iam:
  withOIDC: true # Enables the OIDC provider required for IRSA (recommended)
Execute the eksctl create cluster command with your YAML configuration file. This process typically takes 15-30 minutes.
eksctl create cluster -f cluster.yaml
eksctl will provision the VPC (if one isn't specified), the EKS control plane, and the defined managed node groups. It also configures your local kubectl context to connect to the new cluster.
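If you need to regenerate that kubeconfig entry later (for example, on a different machine), the AWS CLI can write it directly:
aws eks update-kubeconfig --region us-east-1 --name my-enterprise-cluster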
Verify that your nodes are ready and deploy a simple Nginx application.
kubectl get nodes
kubectl get svc
kubectl get deployments
kubectl get pods -o wide
Sample Nginx Deployment:
# nginx-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:latest # Pin a specific tag in production
          ports:
            - containerPort: 80
          resources: # Important for cost optimization and stability
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "200m"
              memory: "256Mi"
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb" # NLB (Layer 4); for Layer 7, use an Ingress with the AWS Load Balancer Controller (ALB) instead
    # service.beta.kubernetes.io/aws-load-balancer-internal: "true" # For an internal load balancer
spec:
  selector:
    app: nginx
  type: LoadBalancer
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
Apply the deployment and service:
kubectl apply -f nginx-deployment.yaml
It may take a few minutes for the AWS load balancer to provision. Check its external endpoint:
kubectl get service nginx-service
You can then access your Nginx application via the `EXTERNAL-IP` shown (for an NLB this is a DNS hostname rather than an IP).
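To script that check, the hostname can be extracted with a JSONPath query and probed with curl (the service name matches the manifest above):
LB_HOST=$(kubectl get service nginx-service -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
curl -I "http://$LB_HOST"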
Deploying Kubernetes on GCP GKE
Google Kubernetes Engine (GKE) offers a highly integrated and opinionated Kubernetes experience within the Google Cloud ecosystem, known for its strong automation and performance. We'll use the gcloud CLI for deployment.
Ensure you have the Google Cloud SDK (which includes the gcloud CLI) and kubectl installed and configured. You'll also need an active GCP project.
Tools:
- Google Cloud SDK (gcloud CLI): Command Line Interface for Google Cloud.
- kubectl: Kubernetes command-line tool.
GCP Project Setup:
Authenticate the gcloud CLI and set your target project:
gcloud auth login
gcloud config set project [YOUR_PROJECT_ID]
Enable the necessary APIs:
gcloud services enable container.googleapis.com compute.googleapis.com
For enterprise use, consider creating a “Standard” cluster (not Autopilot initially, unless your use case aligns perfectly with Autopilot’s constraints). We’ll define a multi-zone regional cluster for high availability.
gcloud container clusters create my-enterprise-gke-cluster \
  --region us-central1 \
  --node-locations us-central1-a,us-central1-b,us-central1-c \
  --machine-type e2-standard-4 \
  --num-nodes 1 \
  --enable-autoscaling \
  --min-nodes 1 \
  --max-nodes 3 \
  --disk-size 50GB \
  --image-type COS_CONTAINERD \
  --release-channel regular \
  --enable-ip-alias \
  --enable-autoupgrade \
  --enable-autorepair \
  --logging=SYSTEM,WORKLOAD \
  --monitoring=SYSTEM \
  --workload-pool=[YOUR_PROJECT_ID].svc.id.goog # Enable Workload Identity (recommended)
  # --create-subnetwork name=my-gke-subnet # Optional: create a dedicated subnet
  # --network projects/[YOUR_PROJECT_ID]/global/networks/my-existing-vpc # Optional: use an existing VPC
This command creates a GKE cluster with one node per zone (three nodes initially) and autoscaling from 1 to 3 nodes per zone (up to 9 nodes total); note that --min-nodes and --max-nodes only take effect when --enable-autoscaling is set. --enable-ip-alias makes the cluster VPC-native for better networking, and --workload-pool enables Workload Identity, which is crucial for secure IAM integration. The deprecated --enable-stackdriver-kubernetes flag has been superseded by the --logging and --monitoring flags used above.
Note: GKE Autopilot clusters are an alternative for highly hands-off management, but they come with more constraints. For full enterprise control, Standard clusters are often preferred initially.
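If you do want to evaluate Autopilot, cluster creation is a one-liner (the cluster name here is illustrative):
gcloud container clusters create-auto my-autopilot-cluster --region us-central1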
Get credentials for your new GKE cluster:
gcloud container clusters get-credentials my-enterprise-gke-cluster --region us-central1
Verify cluster connectivity:
kubectl get nodes
Sample Nginx Deployment:
# nginx-deployment-gke.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-gke-deployment
  labels:
    app: nginx-gke
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx-gke
  template:
    metadata:
      labels:
        app: nginx-gke
    spec:
      containers:
        - name: nginx
          image: nginx:latest # Pin a specific tag in production
          ports:
            - containerPort: 80
          resources: # Important for cost optimization and stability
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "200m"
              memory: "256Mi"
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-gke-service
  # annotations:
  #   cloud.google.com/load-balancer-type: "Internal" # Uncomment for an internal load balancer; an external one is the default
spec:
  selector:
    app: nginx-gke
  type: LoadBalancer
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
Apply the deployment and service:
kubectl apply -f nginx-deployment-gke.yaml
It may take a few minutes for the GCP Load Balancer to provision. Check its external IP:
kubectl get service nginx-gke-service
You can then access your Nginx application via the `EXTERNAL-IP` displayed.
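As with the EKS example, the check can be scripted; GKE's external load balancer reports an IP address rather than a hostname:
LB_IP=$(kubectl get service nginx-gke-service -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
curl -I "http://$LB_IP"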
Enterprise-Grade Considerations for Kubernetes Deployment
Deploying a basic Kubernetes cluster is just the first step. For production environments, enterprises must consider a robust set of best practices across multiple domains.
Networking:
- VPC Design: Plan your VPCs and subnets carefully. Use private subnets for your nodes and control plane, and public subnets only for load balancers and bastion hosts. Ensure adequate IP address CIDR ranges for future scaling.
- VPC-Native Clusters (IP Alias/Secondary Ranges): Always enable VPC-native networking (e.g., EKS with the AWS VPC CNI, GKE with IP aliases) for better scalability, security, and integration with cloud networking features like Network Endpoint Groups (NEGs).
- Network Policies: Implement Kubernetes Network Policies to control ingress and egress traffic between Pods. This provides micro-segmentation within your cluster, limiting lateral movement in case of compromise (a minimal policy sketch follows this list).
- Ingress/Egress: Choose appropriate ingress controllers (e.g., AWS Load Balancer Controller for ALB/NLB, GKE Ingress for GCLB) and manage egress traffic securely (e.g., via NAT Gateways, PrivateLink/Private Service Connect).
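As a starting point, the following NetworkPolicy admits only traffic from Pods labeled app: frontend to Pods labeled app: backend on port 80; the namespace and labels are illustrative:
# network-policy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 80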
Identity and Access Management (IAM):
- Least Privilege: Apply the principle of least privilege for both human users and service accounts.
- Kubernetes RBAC: Define granular Role-Based Access Control (RBAC) policies within Kubernetes to control who can do what to cluster resources such as Pods, Deployments, and Services (see the sketch after this list).
- Cloud IAM Integration:
  - AWS EKS (IRSA): Use IAM Roles for Service Accounts (IRSA) to grant AWS IAM roles to Kubernetes service accounts. This allows Pods to assume AWS IAM roles directly, securely accessing AWS services without storing AWS credentials in Pods.
  - GCP GKE (Workload Identity): Enable Workload Identity to allow Kubernetes service accounts to act as Google Cloud service accounts. This provides a secure and manageable way for Pods to access GCP services.
- Audit Logging: Enable comprehensive audit logging for both cloud provider APIs (CloudTrail, Cloud Audit Logs) and Kubernetes API server to track all actions and detect suspicious activity.
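A minimal RBAC sketch: the Role grants read-only access to Pods in one namespace, and the RoleBinding attaches it to a hypothetical developer group (map the group name via your IdP or cloud IAM integration):
# rbac-pod-reader.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: production
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods-dev-team
  namespace: production
subjects:
  - kind: Group
    name: dev-team # Hypothetical group name
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io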
Autoscaling and Resource Management:
- Horizontal Pod Autoscaler (HPA): Scale the number of Pod replicas based on CPU utilization or custom metrics to handle application load variations (an example manifest follows this list).
- Cluster Autoscaler: Automatically adjust the number of nodes in your cluster based on pending Pods and node utilization. This is crucial for cost optimization and handling fluctuating infrastructure demands.
- Vertical Pod Autoscaler (VPA) / Resource Requests & Limits: Set appropriate resource requests and limits for containers to ensure efficient resource allocation and prevent over-provisioning or resource starvation. VPA can recommend optimal values.
- Spot Instances/Preemptible VMs: Leverage discounted spot instances for fault-tolerant or non-critical workloads to significantly reduce compute costs.
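For instance, an HPA targeting the Nginx Deployment from the EKS example could scale between 3 and 10 replicas at 70% average CPU; note that CPU-based scaling relies on the resource requests set earlier:
# nginx-hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70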
Logging, Monitoring, and Observability:
- Centralized Logging: Implement a robust logging solution (e.g., AWS CloudWatch Logs, GCP Cloud Logging, or the ELK Stack) to collect, store, and analyze logs from all cluster components and applications.
- Comprehensive Monitoring: Use a combination of tools (e.g., Prometheus/Grafana, Datadog, New Relic) to collect metrics across all layers of your Kubernetes stack: node-level, Pod-level, container-level, and application-specific metrics.
- Distributed Tracing: Integrate distributed tracing (e.g., Jaeger, OpenTelemetry) for complex microservices to visualize request flows and pinpoint latency bottlenecks.
- Alerting: Configure meaningful alerts based on critical metrics and logs, integrating with your incident management systems.
Security:
- Image Security: Use trusted container images, scan images for vulnerabilities in your CI/CD pipeline, and implement image signing/attestation.
- Secrets Management: Use dedicated secrets management solutions (e.g., AWS Secrets Manager, GCP Secret Manager, HashiCorp Vault, Kubernetes Secrets with KMS encryption) to securely store and inject sensitive data.
- Pod Security Standards (PSS): Enforce PSS to define the security capabilities a Pod must adhere to, limiting risky configurations (a namespace-label example follows this list).
- Regular Updates: Keep Kubernetes versions, node AMIs/OS, and add-ons regularly patched and updated to leverage the latest security fixes.
- Runtime Security: Implement runtime security solutions (e.g., Falco, Sysdig) to detect and respond to suspicious activity within running containers.
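Pod Security Standards are enforced per namespace through labels read by the built-in Pod Security admission controller; for example, to require the restricted profile:
# namespace-restricted.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/warn: restricted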
High Availability and Disaster Recovery:
- Regional Clusters: Deploy regional clusters across multiple availability zones for high availability. Managed services like EKS and GKE handle control plane HA automatically.
- Multi-Region Strategy: For disaster recovery, consider active/passive or active/active deployments across multiple geographic regions, using tools like multi-cluster ingress or global load balancers.
- Backup & Restore: Implement strategies for backing up Kubernetes configurations (etcd) and persistent data (Persistent Volumes) using tools like Velero.
- Application Resilience: Design your applications for resilience with appropriate replica counts, anti-affinity rules, and readiness/liveness probes (a probe sketch follows this list).
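Probes are declared per container; this sketch for the Nginx container above uses illustrative paths and timings that should match your application's real health endpoints:
# Add under the nginx container spec
livenessProbe:
  httpGet:
    path: /
    port: 80
  initialDelaySeconds: 10
  periodSeconds: 15
readinessProbe:
  httpGet:
    path: /
    port: 80
  initialDelaySeconds: 5
  periodSeconds: 10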
Cost Optimization:
- Right-Sizing: Continuously right-size Pod resource requests and limits based on actual usage to avoid over-provisioning.
- Automated Scaling: Fully leverage HPA and Cluster Autoscaler to match resources to demand.
- Reserved Instances/Savings Plans: For predictable base loads, utilize cloud provider commitment plans for discounted rates.
- Spot/Preemptible Instances: Use these for stateless, fault-tolerant workloads to significantly reduce costs.
- Cost Visibility: Implement tools like Kubecost or cloud provider cost management dashboards to gain granular visibility into Kubernetes spend.
Conclusion: Architecting for Success
Deploying Kubernetes on AWS EKS or GCP GKE is a strategic move for enterprises embracing cloud-native transformations. While the initial steps involve setting up the core cluster, true enterprise readiness comes from meticulously planning and implementing robust strategies across networking, IAM, autoscaling, logging, monitoring, security, and high availability. By adopting these best practices, cloud architects can build highly efficient, secure, and scalable Kubernetes platforms that drive business value and accelerate innovation. The continuous evolution of cloud provider offerings and Kubernetes itself means staying informed and adapting your strategy will be key to long-term success.