Debugging Helm charts in Terraform
Recently, I ran into an intriguing issue while deploying Helm charts to a Kubernetes cluster with Terraform. Defining and configuring a chart with the helm_release resource is relatively straightforward, but debugging is hard when a Kubernetes resource inside the chart fails to start. By default, and even with the environment variable TF_LOG="DEBUG" set, Terraform provides little information about what went wrong, and the terraform apply process often simply runs into a timeout.
In this post, I show how to get more useful information during a Helm chart deployment. Terraform itself suggests using the helm CLI for such issues, and that is the approach taken here. However, there's a catch. When deploying infrastructure to GCP with Terraform, it's common practice to use a separate service account with limited privileges rather than your local GCP user account. This keeps things consistent across the team that administers the cluster and reduces configuration discrepancies. As a result, the helm CLI needs some additional configuration before it can act with the same identity as Terraform.
Google Cloud Platform (GCP)
When deploying infrastructure on-prem, an authentication proxy is typically used for administration and privilege delegation. In cloud environments like Google Cloud Platform (GCP), permission management is more intricate and is handled by GCP's IAM component. For more information, refer to the Google Cloud documentation.
IAM distinguishes three types of service accounts, which are used to access other resources across the cloud infrastructure:
- User-managed service accounts: Service accounts that you create and manage yourself, often used as identities for workloads.
- Default service accounts: User-managed service accounts that are created automatically when you enable certain Google Cloud services. You are still responsible for managing them.
- Google-managed service accounts: Service accounts created and managed by Google, granting services access to resources on your behalf.
While all three types have their uses, for Terraform and GCP I found it best to create and manage a single service account. Granting this account the necessary permissions lets Terraform deploy the resources defined in the module to GCP.
💡 Terraform parses the configuration and calls the appropriate provider API, which requires proper authentication and authorization.
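The single service account described above could be set up with the gcloud CLI roughly as follows. This is a minimal sketch, not a recommended permission set: the function name create_terraform_sa, the display name, and the default role roles/container.admin are assumptions chosen for illustration, so substitute the narrowest roles your Terraform modules actually need.

```shell
# Hypothetical helper: create a dedicated Terraform service account and
# grant it one project-level role. The default role is an assumption.
create_terraform_sa() {
  project="$1"                          # e.g. gcp-project
  role="${2:-roles/container.admin}"    # assumed example role
  sa_name="terraform"
  sa_email="${sa_name}@${project}.iam.gserviceaccount.com"

  if [ -z "$project" ]; then
    echo "usage: create_terraform_sa <project> [role]" >&2
    return 2
  fi

  # Create the service account inside the project.
  gcloud iam service-accounts create "$sa_name" \
    --project "$project" \
    --display-name "Terraform deployer" || return 1

  # Delegate permissions by binding the role to the new account.
  gcloud projects add-iam-policy-binding "$project" \
    --member "serviceAccount:${sa_email}" \
    --role "$role"
}
```

Calling create_terraform_sa gcp-project would produce the terraform@gcp-project.iam.gserviceaccount.com account used in the rest of this post.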
Terraform
Terraform is an open-source infrastructure as code (IaC) tool developed by HashiCorp. It enables users to define and manage their infrastructure in a declarative and version-controlled manner. With Terraform, you can describe your desired infrastructure configuration using a simple, human-readable language. This configuration can be used to provision and manage a wide range of resources across various cloud providers and on-premises environments.
Consider an example with a terraform@gcp-project.iam.gserviceaccount.com service account specified in the Terraform GCP provider configuration:
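# Aliased provider that runs with your own gcloud credentials. It is used
# only to request a token for the service account.
provider "google" {
  alias = "impersonation"
  scopes = [
    "https://www.googleapis.com/auth/cloud-platform",
    "https://www.googleapis.com/auth/userinfo.email",
  ]
}

# Default provider: authenticates with the impersonated account's token.
provider "google" {
  access_token    = data.google_service_account_access_token.default.access_token
  request_timeout = "60s"
}

# Short-lived (20 minute) access token for the Terraform service account.
data "google_service_account_access_token" "default" {
  provider               = google.impersonation
  target_service_account = "terraform@gcp-project.iam.gserviceaccount.com"
  scopes                 = ["userinfo-email", "cloud-platform"]
  lifetime               = "1200s"
}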
In this configuration, the data source retrieves a short-lived access token for the service account we're impersonating. Retrieving the token works because we authenticated with the gcloud CLI, as shown in the Debugging section below.
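Before running Terraform, it can help to confirm that your gcloud login is actually able to mint a token for the service account, since the data source depends on exactly that. The sketch below assumes the account name from the example above; check_impersonation is a hypothetical helper name.

```shell
# Hypothetical helper: verify that the current gcloud login can obtain an
# access token on behalf of the Terraform service account.
check_impersonation() {
  sa_email="${1:-terraform@gcp-project.iam.gserviceaccount.com}"

  # gcloud prints the token on stdout; errors and warnings stay on stderr
  # so you can still read why a request was denied.
  if token="$(gcloud auth print-access-token \
      --impersonate-service-account="$sa_email")" && [ -n "$token" ]; then
    echo "impersonation OK for ${sa_email}"
  else
    echo "cannot impersonate ${sa_email}" >&2
    return 1
  fi
}
```

If this check fails, the Terraform data source is likely to fail for the same reason, which usually points at a missing impersonation permission on your user account.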
Among the other Terraform resources, there is a helm_release resource that deploys a Helm chart onto Google Kubernetes Engine (GKE):
resource "helm_release" "example" {
  name       = "my-redis-release"
  repository = "https://charts.bitnami.com/bitnami"
  chart      = "redis"
  version    = "6.0.1"

  # Chart values are read from a local values.yaml file.
  values = [
    file("values.yaml")
  ]
}
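One way to catch chart misconfigurations before Terraform applies anything is to render the same chart locally with the helm CLI. The sketch below reuses the release name, repository, chart, version, and values file from the resource above; render_release is a hypothetical helper name, and whether that chart version is still published in the repository is not something this post verifies.

```shell
# Hypothetical helper: render the chart to plain Kubernetes manifests
# without installing anything, using the same inputs as helm_release.
render_release() {
  values_file="${1:-values.yaml}"

  helm template my-redis-release redis \
    --repo https://charts.bitnami.com/bitnami \
    --version 6.0.1 \
    --values "$values_file"
}
```

Rendering fails early on template and values errors, which are otherwise hidden behind a generic Terraform failure.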
Occasionally, a misconfiguration in the Helm chart leads to deployment issues, and these are easier to debug with the helm CLI than with Terraform. As noted at the beginning of the post, Terraform offers limited debugging information even with the environment variable TF_LOG="DEBUG" set, and the terraform apply process can end in a timeout.
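For reference, this is roughly how the Terraform-only attempt looks. TF_LOG and TF_LOG_PATH are standard Terraform environment variables; the function name tf_apply_debug and the default log file name are assumptions made for this sketch.

```shell
# Hypothetical helper: run terraform apply with debug logging written to
# a file instead of flooding the terminal.
tf_apply_debug() {
  log_file="${1:-terraform-debug.log}"

  # TF_LOG sets the verbosity, TF_LOG_PATH redirects the log to a file.
  TF_LOG=DEBUG TF_LOG_PATH="$log_file" terraform apply
}
```

You can then search the log file, for example with grep -i helm terraform-debug.log, but in my experience it rarely explains why a pod inside the chart did not start.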
Debugging
This is where impersonation comes into play. To debug a failing helm_release resource with the helm CLI, you want the CLI to act as the Terraform service account rather than as your user account. To impersonate the Terraform service account on your local CLI, follow these steps:
- Ensure you are logged in using your user account:
gcloud auth login --update-adc
- Grant your user account permission to impersonate the Terraform service account as per the Google documentation.
- Impersonate the service account on your CLI using the following command:
gcloud container clusters get-credentials <cluster-name> --region <region> --project <project> --impersonate-service-account=terraform@gcp-project.iam.gserviceaccount.com
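The second step above has no command attached. A minimal sketch of it follows, assuming the Service Account Token Creator role (roles/iam.serviceAccountTokenCreator) is the role your organization uses for impersonation; grant_impersonation is a hypothetical helper name, and the command must be run by someone who is allowed to change the service account's IAM policy.

```shell
# Hypothetical helper: allow a user to impersonate the Terraform service
# account by granting the Token Creator role on that account.
grant_impersonation() {
  user_email="$1"
  sa_email="${2:-terraform@gcp-project.iam.gserviceaccount.com}"

  if [ -z "$user_email" ]; then
    echo "usage: grant_impersonation <user-email> [service-account]" >&2
    return 2
  fi

  gcloud iam service-accounts add-iam-policy-binding "$sa_email" \
    --member "user:${user_email}" \
    --role "roles/iam.serviceAccountTokenCreator"
}
```

The binding is set on the service account itself, so the user can impersonate only this account and not every service account in the project.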
With this setup, both kubectl and the helm CLI will use the impersonated account to call the GKE API. Now you are authorized to execute:
helm install my-release oci://registry-1.docker.io/bitnamicharts/redis
This gives you much better debugging output. On top of that, you can use kubectl logs to further investigate issues with the pods or whatever else the Helm chart deploys.
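The follow-up investigation could look like the sketch below. It assumes the chart labels its pods with app.kubernetes.io/instance=<release-name>, which is a common Helm convention but not guaranteed for every chart; inspect_release is a hypothetical helper name, and the release name and namespace defaults are placeholders.

```shell
# Hypothetical helper: collect the usual first-look information for a
# release that failed to start.
inspect_release() {
  release="${1:-my-release}"
  namespace="${2:-default}"
  # Assumed label convention; adjust if your chart labels pods differently.
  selector="app.kubernetes.io/instance=${release}"

  # Release state as Helm sees it.
  helm status "$release" --namespace "$namespace"

  # Pod phase, scheduling problems, and image pull or probe failures.
  kubectl get pods --namespace "$namespace" --selector "$selector"
  kubectl describe pods --namespace "$namespace" --selector "$selector"

  # Recent container output from every pod of the release.
  kubectl logs --namespace "$namespace" --selector "$selector" \
    --all-containers=true --tail=50

  # Namespace events, oldest first.
  kubectl get events --namespace "$namespace" --sort-by=.lastTimestamp
}
```

Running helm install with the --debug flag is another option when the install itself fails before any pod is created.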
Conclusion
In this post, we explored the challenges and limitations of deploying Helm charts on a Kubernetes cluster with Terraform. We discussed how Terraform's default behavior offers limited debugging information and can end in timeouts during the terraform apply process. To address this, we used the helm CLI for better debugging. Because infrastructure deployed with Terraform commonly uses a separate service account with delegated privileges instead of the local user account, we impersonated that account so that helm and kubectl run with the same identity. This approach ensures consistency and minimizes configuration discrepancies within the team responsible for cluster administration.
Check out the rest of my content on Teodor J. Podobnik, @dorkamotorka, and follow me for more. Cheers!