DevOps Essentials - Cloud

Cloud Source Repositories: Qwik Start


  • Theory
    • Google Cloud Source Repositories provides Git version control to support collaborative development of any application or service
  • Create a new repository
    • gcloud source repos create REPO_DEMO -=> Creates a new Cloud Source Repository named REPO_DEMO
  • Clone the new repository into your Cloud Shell session
    • gcloud source repos clone REPO_DEMO
  • Push to the Cloud Source Repository
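    • git add myfile.txt -=> Stages the new file
    • git commit -m "First file using Cloud Source Repositories" myfile.txt -=> Commits the file to the local repository
    • git push origin master -=> Pushes the commit to the Cloud Source Repository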
  • Browse files in the Google Cloud Source repository
    • Navigation menu > Source Repositories
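
The create, clone, and push steps above can be sketched end to end. This is only a sketch: a local bare repository stands in for the Cloud Source Repository so the script runs without a Google Cloud project. The temporary paths, the git identity, and the file contents are assumptions, not part of the lab.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumption: a local bare repo plays the role of the Cloud Source Repository.
# In Cloud Shell the first two steps would instead be:
#   gcloud source repos create REPO_DEMO
#   gcloud source repos clone REPO_DEMO
workdir="$(mktemp -d)"
remote="$workdir/REPO_DEMO.git"
clone="$workdir/REPO_DEMO"

git init --quiet --bare "$remote"
git clone --quiet "$remote" "$clone" 2>/dev/null   # hides the "empty repository" warning

cd "$clone"
# Pin the branch name so the sketch does not depend on local git defaults.
git symbolic-ref HEAD refs/heads/master
# Placeholder identity (hypothetical values) so the commit succeeds anywhere.
git config user.email "you@example.com"
git config user.name "Your Name"

# Create a file, stage it, commit it, and push it to the remote.
echo 'Hello World!' > myfile.txt
git add myfile.txt
git commit --quiet -m "First file using Cloud Source Repositories" myfile.txt
git push --quiet origin master

# List the files the remote now holds on master.
git --git-dir="$remote" ls-tree --name-only master
```

Against a real Cloud Source Repository, only the add, commit, and push steps are needed after the gcloud clone; the pushed file then appears under Navigation menu > Source Repositories.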

Deploy Kubernetes Load Balancer Service with Terraform


  • Terraform
    • In Terraform, a Provider is the logical abstraction of an upstream API
    • Drift detection -=> terraform plan shows the difference between the real infrastructure at that moment and the configuration you intend to apply
    • Full lifecycle management -=> Terraform not only creates resources initially but also offers a single command to create, update, and delete tracked resources without needing to inspect the API to identify those resources
    • Synchronous feedback -=> Terraform waits for each operation to finish and reports errors directly instead of hiding them behind asynchronous calls
    • Graph of relationships -=> Terraform understands dependencies between resources, which lets it order operations correctly
  • Kubernetes Services
    • A service is a grouping of pods that are running on the cluster
    • Kubernetes services can efficiently power a microservice architecture
    • Services provide important features that are standardized across the cluster: load-balancing, service discovery between applications, and features to support zero-downtime application deployments
    • Each service has a pod label query (selector) that defines which pods process data for the service
  • Initialize and install dependencies
    • terraform init -=> Prepares the working directory for use, including downloading the required provider plugins
    • terraform apply -=> Applies the changes required to reach the desired state of the configuration
    • Navigation menu > Kubernetes Engine > Services & Ingress > Endpoints IP
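
The Terraform workflow above can be sketched as a small script. This is a hedged sketch, not the lab's own script: the `run` helper and the `DRY_RUN` switch are hypothetical additions that print each command by default, so nothing is created by accident. The `kubectl get services` step is an assumed command-line alternative to the console navigation and requires kubectl to be configured for the cluster.

```shell
#!/usr/bin/env bash
set -euo pipefail

# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

terraform_workflow() {
  run terraform init        # prepare the working directory and download providers
  run terraform plan        # show the difference between real state and configuration
  run terraform apply       # create or update resources to match the configuration
  run kubectl get services  # assumption: list services to find the external IP
}

terraform_workflow
```

Running `terraform plan` before `terraform apply` is optional, since apply shows the same plan and asks for confirmation, but it makes the drift-detection step explicit.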

Troubleshooting Workloads on GKE for Site Reliability Engineers


  • Theory
    • Site Reliability Engineers (SREs) have a broad set of responsibilities, and managing incidents is a critical part of their role
    • The troubleshooting process is an “iterative” approach where SREs form a hypothesis about the potential root cause of an incident, then filter, search, and navigate through large volumes of telemetry data collected from their systems to validate or invalidate their hypothesis
    • If a hypothesis is invalid, SREs will form another hypothesis and perform another iteration until they can isolate a root cause
    • You will explore the various services deployed to determine the root cause of the issue and set up a Service Level Objective (SLO) to prevent similar incidents from occurring in the future
  • Navigating Google Kubernetes Engine (GKE) Resource Pages
    • Navigation menu > Kubernetes Engine > Clusters
    • Click Name > Details > Nodes
    • Click Name > Resource summary > Summary
    • View in Metrics Explorer
  • Accessing Operational Data Through GKE Dashboards
    • Navigation menu > Kubernetes Engine > Services & Ingress
    • Click on the IP Address
    • Navigation menu > Monitoring > Dashboards > GKE > Add Filter > Workloads > recommendationservice > Apply
  • Proactive Monitoring with Logs-Based Metrics
    • Navigation menu > Logging > Logs Explorer > Query results > Create metric
  • Creating a Service Level Objective (SLO)
    • Navigation menu > Monitoring > Services > Service details > Create SLO
  • Define an Alert on the Service Level Objective (SLO)
    • Navigation menu > Monitoring > Services
    • Current status of 1 SLO > Create SLO Alert
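
The logs-based metric step can also be done from the command line with `gcloud logging metrics create`. The sketch below is only an illustration: the metric name, description, and log filter are hypothetical values chosen to match the recommendationservice workload, and the `run` helper with its `DRY_RUN` switch is an added safeguard that prints the command instead of executing it by default.

```shell
#!/usr/bin/env bash
set -euo pipefail

# DRY_RUN=1 (the default here) only prints the command; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Assumed values; adjust to the workload under investigation.
METRIC_NAME="recommendation_error_count"
LOG_FILTER='resource.type="k8s_container" AND labels."k8s-pod/app"="recommendationservice" AND severity>=ERROR'

create_error_metric() {
  # Counts log entries that match the filter (error logs from one workload).
  run gcloud logging metrics create "$METRIC_NAME" \
    --description="Error log entries from recommendationservice" \
    --log-filter="$LOG_FILTER"
}

create_error_metric
```

The dry-run output drops shell quoting, so it is meant for inspection rather than copy and paste. Once created, such a metric can be charted in Monitoring like any other metric.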