<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/"><channel><title>Aws-Cloudformation on Gruion</title><link>https://www.gruion.com/blog/tags/aws-cloudformation/</link><description>Recent content in Aws-Cloudformation on Gruion</description><generator>Hugo</generator><language>en</language><lastBuildDate>Sun, 17 May 2026 06:01:36 +0000</lastBuildDate><atom:link href="https://www.gruion.com/blog/tags/aws-cloudformation/index.xml" rel="self" type="application/rss+xml"/><item><title>IaC Reliability in 2026: Trust, Identity, and the Hidden Failure Modes Nobody Plans For</title><link>https://www.gruion.com/blog/post/2026-05-17-infrastructure-as-code-deployment-reliability/</link><pubDate>Sun, 17 May 2026 06:01:36 +0000</pubDate><guid>https://www.gruion.com/blog/post/2026-05-17-infrastructure-as-code-deployment-reliability/</guid><description>Key Takeaways Expired machine identities in CI/CD pipelines — not bad code — are causing real production outages; audit your deployment tokens with tools like HashiCorp Vault or AWS IAM Access Analyzer. OpenTofu (the Linux Foundation fork of Terraform) is now a production-ready alternative if …</description><content:encoded><![CDATA[<h2 id="key-takeaways">Key Takeaways</h2>
<ul>
<li>Expired machine identities in CI/CD pipelines — not bad code — are causing real production outages; audit your deployment tokens with tools like HashiCorp Vault or AWS IAM Access Analyzer.</li>
<li>OpenTofu (the Linux Foundation fork of Terraform) is now a production-ready alternative if licensing is a constraint on your IaC adoption.</li>
<li>AWS CloudFormation&rsquo;s new <code>Fn::GetStackOutput</code> eliminates manual cross-account/cross-region output wiring — a significant quality-of-life improvement for multi-account CDK users.</li>
<li>Kubernetes v1.36&rsquo;s Mixed Version Proxy (now Beta) makes rolling upgrades safer by preventing 404s during control plane version skew.</li>
<li>Progressive delivery with ArgoCD + Flagger, backed by OpenTelemetry metrics, catches functional regressions that canary traffic metrics alone miss.</li>
</ul>
<h2 id="tools--setup">Tools &amp; Setup</h2>
<p>IaC reliability isn&rsquo;t just about correct Terraform plans; it&rsquo;s about the full delivery chain. Start by auditing non-human identities across your pipelines: build runners, OIDC tokens, Kubernetes service accounts, and artifact-signing credentials. Tools like <code>trufflesecurity/driftwood</code>, AWS IAM Access Analyzer, or Teleport Machine ID can help surface exposed, unused, or long-lived credentials before one expires on a Friday night.</p>
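<p>As a rough illustration of that audit, the sketch below flags credentials that have expired or will expire within a review window. It assumes you already export an inventory of machine identities from your secrets manager; the <code>MachineCredential</code> type, the <code>expiring_within</code> helper, and the sample entries are hypothetical and not part of any tool named above.</p>
<pre><code class="language-python">from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class MachineCredential:
    name: str             # hypothetical identifier, e.g. "ci-deploy-token"
    owner: str            # team accountable for rotation
    expires_at: datetime  # timezone-aware expiry timestamp


def expiring_within(
    credentials: list[MachineCredential], window: timedelta, now: datetime
) -> list[MachineCredential]:
    """Return credentials that expire at or before now + window, soonest first."""
    cutoff = now + window
    due = [c for c in credentials if cutoff >= c.expires_at]
    return sorted(due, key=lambda c: c.expires_at)


# Illustrative inventory; in practice this would come from your own export.
now = datetime(2026, 5, 15, 9, 0, tzinfo=timezone.utc)
inventory = [
    MachineCredential("ci-deploy-token", "platform", now + timedelta(days=2)),
    MachineCredential("artifact-signing-key", "release", now + timedelta(days=90)),
    MachineCredential("runner-oidc-trust", "platform", now - timedelta(days=1)),
]

for cred in expiring_within(inventory, timedelta(days=7), now):
    status = "EXPIRED" if now >= cred.expires_at else "expiring soon"
    print(f"{cred.name} (owner: {cred.owner}): {status}")
</code></pre>
<p>Running a check like this on a schedule, and routing each finding to the listed owner, is one way to make an expiry a weekday ticket instead of a weekend outage.</p>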
<p>For multi-account AWS shops, adopt <code>Fn::GetStackOutput</code> in CloudFormation or the CDK to replace brittle SSM Parameter Store hand-offs between stacks. For Kubernetes clusters undergoing rolling upgrades, enable the <code>UnknownVersionInteroperabilityProxy</code> feature gate in 1.36. It proxies each request to an API server that can serve the requested version, which avoids 404s and garbage-collection side effects while the control plane is skewed. On the delivery side, pair ArgoCD with Flagger for canary rollouts, and wire OpenTelemetry spans into your pipeline so that a failed integration test can be correlated with the downstream service it actually broke.</p>
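<p>To make the canary idea concrete, here is a simplified sketch of a metric-gated rollout loop. It is not Flagger&rsquo;s implementation: the <code>run_canary</code> function, its parameter names, and the default thresholds are assumptions chosen for illustration, and the success rates would in practice come from your metrics backend.</p>
<pre><code class="language-python">def run_canary(
    success_rates,
    min_success_rate=0.99,
    step_weight=10,
    max_weight=50,
    max_failed_checks=2,
):
    """Shift traffic one step per healthy check; roll back after repeated failures.

    Returns a (decision, canary_weight_percent) tuple.
    """
    weight = 0
    failed = 0
    for rate in success_rates:
        if rate >= min_success_rate:
            # Healthy check: give the canary more traffic, capped at max_weight.
            weight = min(weight + step_weight, max_weight)
            if weight == max_weight:
                return "promote", weight
        else:
            # Unhealthy check: hold the weight and count the failure.
            failed += 1
            if failed >= max_failed_checks:
                return "rollback", 0
    # Ran out of samples before reaching a decision.
    return "in_progress", weight


print(run_canary([0.999, 0.998, 0.999, 0.997, 0.999]))
print(run_canary([0.999, 0.90, 0.95]))
</code></pre>
<p>The point of the sketch is the shape of the decision: traffic only advances on evidence, and a bounded number of bad checks triggers rollback without a human in the loop.</p>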
<h2 id="analysis">Analysis</h2>
<p>The through-line in recent production incidents — Discord&rsquo;s voice outage from a hidden circular dependency, Pinterest&rsquo;s CPU zombie problem on PinCompute, late-night deployment token expiries — is that the failure wasn&rsquo;t in the IaC itself. The infrastructure was declared correctly. What failed was the operational layer surrounding it: dependency maps nobody kept current, system defaults nobody audited, machine identities nobody remembered to rotate.</p>
<p>This is where IaC maturity actually lives in 2026. Writing a Terraform module is table stakes. The harder work is building the observability and governance scaffolding around it: the <code>route_controller_route_sync_total</code> counter in the Kubernetes cloud controller manager (CCM), which lets you validate reconciliation behavior and A/B test watch-based vs. interval-based route reconciliation, and supply-chain attestations that remain trustworthy even when OIDC tokens are abused (as in the Mini Shai-Hulud CI/CD pipeline attacks).</p>
<p>The teams shipping reliably aren&rsquo;t the ones with the most sophisticated IaC; they&rsquo;re the ones treating deployment as an observability problem. Every rollout emits telemetry. Every credential has an owner and a TTL. Every cross-stack dependency is explicit, not implicit. OpenTofu, CloudFormation and the AWS CDK, ArgoCD, and Kubernetes v1.36 all move in this direction. The gap is in adopting them as a system, not as isolated tools.</p>
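<p>One small way to keep dependencies explicit is to declare them as data and check them in CI. The sketch below uses Python&rsquo;s standard-library <code>graphlib</code> to derive a deploy order and to fail loudly on a cycle. The <code>deployment_order</code> helper and the stack names are hypothetical, and the sketch assumes node names are strings.</p>
<pre><code class="language-python">from graphlib import CycleError, TopologicalSorter


def deployment_order(dependencies):
    """Return a deploy order with prerequisites first, or raise ValueError on a cycle.

    `dependencies` maps each stack name to the set of stacks it depends on.
    """
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as err:
        # The second element of err.args is the list of nodes forming the cycle.
        cycle = " -> ".join(err.args[1])
        raise ValueError(f"circular dependency: {cycle}") from err


# Illustrative dependency map; replace with your own declared stack graph.
stacks = {
    "app": {"network", "database"},
    "database": {"network"},
    "network": set(),
}
print(deployment_order(stacks))
</code></pre>
<p>A check like this only sees the dependencies you have declared, so it is a complement to runtime tracing and not a substitute for it.</p>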
<h2 id="sources">Sources</h2>
<ul>
<li><a href="https://devops.com/why-devops-is-critical-for-modern-business-resilience/">https://devops.com/why-devops-is-critical-for-modern-business-resilience/</a></li>
<li><a href="https://devops.com/widespread-mini-shai-hulud-campaign-is-a-matter-of-trust/">https://devops.com/widespread-mini-shai-hulud-campaign-is-a-matter-of-trust/</a></li>
<li><a href="https://devops.com/observability-driven-continuous-testing-in-cloud-native-devops/">https://devops.com/observability-driven-continuous-testing-in-cloud-native-devops/</a></li>
<li><a href="https://devops.com/your-ci-cd-pipeline-has-non-human-identities-you-forgot-about/">https://devops.com/your-ci-cd-pipeline-has-non-human-identities-you-forgot-about/</a></li>
<li><a href="https://www.infoq.com/news/2026/05/discord-circular-dependency/">https://www.infoq.com/news/2026/05/discord-circular-dependency/</a></li>
<li><a href="https://www.infoq.com/news/2026/05/pinterest-cpu-zombies-bottleneck/">https://www.infoq.com/news/2026/05/pinterest-cpu-zombies-bottleneck/</a></li>
<li><a href="https://www.infoq.com/news/2026/05/kubernetes-1-36-released/">https://www.infoq.com/news/2026/05/kubernetes-1-36-released/</a></li>
<li><a href="https://kubernetes.io/blog/2026/05/15/ccm-new-metric-route-sync-total/">https://kubernetes.io/blog/2026/05/15/ccm-new-metric-route-sync-total/</a></li>
<li><a href="https://kubernetes.io/blog/2026/05/15/kubernetes-1-36-feature-mixed-version-proxy-beta/">https://kubernetes.io/blog/2026/05/15/kubernetes-1-36-feature-mixed-version-proxy-beta/</a></li>
<li><a href="https://kubernetes.io/blog/2026/05/14/kubernetes-v1-36-deprecation-and-removal-of-service-externalips/">https://kubernetes.io/blog/2026/05/14/kubernetes-v1-36-deprecation-and-removal-of-service-externalips/</a></li>
<li><a href="https://www.env0.com/blog/opentofu-the-open-source-terraform-alternative">https://www.env0.com/blog/opentofu-the-open-source-terraform-alternative</a></li>
<li><a href="https://aws.amazon.com/blogs/devops/simplify-cross-account-and-cross-region-stack-output-references-with-aws-cloudformation-and-cdks-new-fngetstackoutput/">https://aws.amazon.com/blogs/devops/simplify-cross-account-and-cross-region-stack-output-references-with-aws-cloudformation-and-cdks-new-fngetstackoutput/</a></li>
</ul>
]]></content:encoded><category>IaC</category></item></channel></rss>