<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Kubernetes Blog</title>
    <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/</link>
    <description>The Kubernetes blog is used by the project to communicate new features, community reports, and any news that might be relevant to the Kubernetes community.</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en</language>
    <image>
      <url>https://raw.githubusercontent.com/kubernetes/kubernetes/master/logo/logo.png</url>
      <title>The Kubernetes project logo</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/</link>
    </image>
    
    <atom:link href="https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/feed.xml" rel="self" type="application/rss+xml" />
    
    
    <item>
      <title>Ingress NGINX Retirement: What You Need to Know</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/11/11/ingress-nginx-retirement/</link>
      <pubDate>Tue, 11 Nov 2025 10:30:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/11/11/ingress-nginx-retirement/</guid>
      <description>
        
        
        &lt;p&gt;To prioritize the safety and security of the ecosystem, Kubernetes SIG Network and the Security Response Committee are announcing the upcoming retirement of &lt;a href=&#34;https://github.com/kubernetes/ingress-nginx/&#34;&gt;Ingress NGINX&lt;/a&gt;. Best-effort maintenance will continue until March 2026. Afterward, there will be no further releases, no bugfixes, and no updates to resolve any security vulnerabilities that may be discovered. &lt;strong&gt;Existing deployments of Ingress NGINX will continue to function and installation artifacts will remain available.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;We recommend migrating to one of the many alternatives. Consider &lt;a href=&#34;https://gateway-api.sigs.k8s.io/guides/&#34;&gt;migrating to Gateway API&lt;/a&gt;, the modern replacement for Ingress. If you must continue using Ingress, many alternative Ingress controllers are &lt;a href=&#34;https://kubernetes.io/docs/concepts/services-networking/ingress-controllers/&#34;&gt;listed in the Kubernetes documentation&lt;/a&gt;. Continue reading for further information about the history and current state of Ingress NGINX, as well as next steps.&lt;/p&gt;
&lt;h2 id=&#34;about-ingress-nginx&#34;&gt;About Ingress NGINX&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;https://kubernetes.io/docs/concepts/services-networking/ingress/&#34;&gt;Ingress&lt;/a&gt; is the original user-friendly way to direct network traffic to workloads running on Kubernetes. (&lt;a href=&#34;https://kubernetes.io/docs/concepts/services-networking/gateway/&#34;&gt;Gateway API&lt;/a&gt; is a newer way to achieve many of the same goals.) In order for an Ingress to work in your cluster, there must be an &lt;a href=&#34;https://kubernetes.io/docs/concepts/services-networking/ingress-controllers/&#34;&gt;Ingress controller&lt;/a&gt; running. There are many Ingress controller choices available, which serve the needs of different users and use cases. Some are cloud-provider specific, while others have more general applicability.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://www.github.com/kubernetes/ingress-nginx&#34;&gt;Ingress NGINX&lt;/a&gt; was an Ingress controller, developed early in the history of the Kubernetes project as an example implementation of the API. It became very popular due to its tremendous flexibility, breadth of features, and independence from any particular cloud or infrastructure provider. Since those days, many other Ingress controllers have been created within the Kubernetes project by community groups, and by cloud native vendors. Ingress NGINX has continued to be one of the most popular, deployed as part of many hosted Kubernetes platforms and within innumerable independent users’ clusters.&lt;/p&gt;
&lt;h2 id=&#34;history-and-challenges&#34;&gt;History and Challenges&lt;/h2&gt;
&lt;p&gt;The breadth and flexibility of Ingress NGINX has caused maintenance challenges. Changing expectations about cloud native software have also added complications. What were once considered helpful options have sometimes come to be considered serious security flaws, such as the ability to add arbitrary NGINX configuration directives via the &amp;quot;snippets&amp;quot; annotations. Yesterday’s flexibility has become today’s insurmountable technical debt.&lt;/p&gt;
&lt;p&gt;Despite the project’s popularity among users, Ingress NGINX has always struggled with insufficient or barely-sufficient maintainership. For years, the project has had only one or two people doing development work, on their own time, after work hours and on weekends. Last year, the Ingress NGINX maintainers &lt;a href=&#34;https://kccncna2024.sched.com/event/1hoxW/securing-the-future-of-ingress-nginx-james-strong-isovalent-marco-ebert-giant-swarm&#34;&gt;announced&lt;/a&gt; their plans to wind down Ingress NGINX and develop a replacement controller together with the Gateway API community. Unfortunately, even that announcement failed to generate additional interest in helping maintain Ingress NGINX or develop InGate to replace it. (InGate development never progressed far enough to create a mature replacement; it will also be retired.)&lt;/p&gt;
&lt;h2 id=&#34;current-state-and-next-steps&#34;&gt;Current State and Next Steps&lt;/h2&gt;
&lt;p&gt;Currently, Ingress NGINX is receiving best-effort maintenance. SIG Network and the Security Response Committee have exhausted our efforts to find additional support to make Ingress NGINX sustainable. To prioritize user safety, we must retire the project.&lt;/p&gt;
&lt;p&gt;In March 2026, Ingress NGINX maintenance will be halted, and the project will be &lt;a href=&#34;https://github.com/kubernetes-retired/&#34;&gt;retired&lt;/a&gt;. After that time, there will be no further releases, no bugfixes, and no updates to resolve any security vulnerabilities that may be discovered. The GitHub repositories will be made read-only and left available for reference.&lt;/p&gt;
&lt;p&gt;Existing deployments of Ingress NGINX will not be broken. Existing project artifacts such as Helm charts and container images will remain available.&lt;/p&gt;
&lt;p&gt;In most cases, you can check whether you use Ingress NGINX by running &lt;code&gt;kubectl get pods --all-namespaces --selector app.kubernetes.io/name=ingress-nginx&lt;/code&gt; with cluster administrator permissions.&lt;/p&gt;
&lt;p&gt;We would like to thank the Ingress NGINX maintainers for their work in creating and maintaining this project; their dedication remains impressive. This Ingress controller has powered billions of requests in datacenters and homelabs all around the world. In a lot of ways, Kubernetes wouldn’t be where it is without Ingress NGINX, and we are grateful for so many years of incredible effort.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;SIG Network and the Security Response Committee recommend that all Ingress NGINX users begin migration to Gateway API or another Ingress controller immediately.&lt;/strong&gt; Many options are listed in the Kubernetes documentation: &lt;a href=&#34;https://gateway-api.sigs.k8s.io/guides/&#34;&gt;Gateway API&lt;/a&gt;, &lt;a href=&#34;https://kubernetes.io/docs/concepts/services-networking/ingress-controllers/&#34;&gt;Ingress&lt;/a&gt;. Additional options may be available from vendors you work with.&lt;/p&gt;
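&lt;p&gt;As a purely illustrative sketch of what a Gateway API setup can look like (all names here are placeholders, and the &lt;code&gt;gatewayClassName&lt;/code&gt; depends on whichever Gateway API implementation you install), HTTP traffic routing is expressed with a Gateway plus an HTTPRoute:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: example-gateway
spec:
  gatewayClassName: example-class   # set according to the controller you choose
  listeners:
  - name: http
    protocol: HTTP
    port: 80
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: example-route
spec:
  parentRefs:
  - name: example-gateway
  rules:
  - backendRefs:
    - name: example-service
      port: 8080
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The Gateway API migration guides linked above cover the details of translating existing Ingress rules.&lt;/p&gt;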

      </description>
    </item>
    
    <item>
      <title>Kubernetes Configuration Good Practices</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/11/11/kubernetes-configuration-good-practices/</link>
      <pubDate>Tue, 11 Nov 2025 00:00:00 +0000</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/11/11/kubernetes-configuration-good-practices/</guid>
      <description>
        
        
&lt;p&gt;Configuration is one of those things in Kubernetes that seems small until it&#39;s not. It sits at the heart of every Kubernetes workload.
A missing quote, a wrong API version or a misplaced YAML indent can ruin your entire deployment.&lt;/p&gt;
&lt;p&gt;This blog brings together tried-and-tested configuration best practices: the small habits that make your Kubernetes setup clean, consistent and easier to manage.
Whether you are just starting out or already deploying apps daily, these are the little things that keep your cluster stable and your future self sane.&lt;/p&gt;
&lt;h2 id=&#34;general-configuration-practices&#34;&gt;General Configuration Practices&lt;/h2&gt;
&lt;h3 id=&#34;use-the-latest-stable-api-version&#34;&gt;Use the latest stable API version&lt;/h3&gt;
&lt;p&gt;Kubernetes evolves fast. Older APIs eventually get deprecated and stop working, so whenever you define resources, make sure you are using the latest stable API version.
You can always check which API versions and resources your cluster serves with:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl api-resources
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This simple step saves you from future compatibility issues.&lt;/p&gt;
&lt;h3 id=&#34;store-configuration-in-version-control&#34;&gt;Store configuration in version control&lt;/h3&gt;
&lt;p&gt;Never apply manifest files directly from your desktop. Always keep them in a version control system like Git; it&#39;s your safety net.
If something breaks, you can instantly roll back to a previous commit, compare changes or recreate your cluster setup without panic.&lt;/p&gt;
&lt;h3 id=&#34;write-configs-in-yaml-not-json&#34;&gt;Write configs in YAML, not JSON&lt;/h3&gt;
&lt;p&gt;Write your configuration files using YAML rather than JSON. Both work technically, but YAML is just easier for humans: it&#39;s cleaner to read, less noisy, and widely used in the community.&lt;/p&gt;
&lt;p&gt;YAML has some sneaky gotchas with boolean values:
use only &lt;code&gt;true&lt;/code&gt; or &lt;code&gt;false&lt;/code&gt;.
Don&#39;t write &lt;code&gt;yes&lt;/code&gt;, &lt;code&gt;no&lt;/code&gt;, &lt;code&gt;on&lt;/code&gt; or &lt;code&gt;off&lt;/code&gt;;
they might work with one YAML parser but break with another. To be safe, quote anything that looks like a boolean (for example &lt;code&gt;&amp;quot;yes&amp;quot;&lt;/code&gt;).&lt;/p&gt;
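&lt;p&gt;As a quick illustration (the keys here are invented for the example), quoting keeps a value a string no matter which YAML parser reads it:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;featureEnabled: true      # a real boolean
legacyFlag: &amp;quot;yes&amp;quot;         # quoted, so it always stays the string yes
&lt;/code&gt;&lt;/pre&gt;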
&lt;h3 id=&#34;keep-configuration-simple-and-minimal&#34;&gt;Keep configuration simple and minimal&lt;/h3&gt;
&lt;p&gt;Avoid setting default values that are already handled by Kubernetes. Minimal manifests are easier to debug, cleaner to review and less likely to break things later.&lt;/p&gt;
&lt;h3 id=&#34;group-related-objects-together&#34;&gt;Group related objects together&lt;/h3&gt;
&lt;p&gt;If your Deployment, Service and ConfigMap all belong to one app, put them in a single manifest file.&lt;br&gt;
It&#39;s easier to track changes and apply them as a unit.
See the &lt;a href=&#34;https://github.com/kubernetes/examples/blob/master/web/guestbook/all-in-one/guestbook-all-in-one.yaml&#34;&gt;Guestbook all-in-one.yaml&lt;/a&gt; file for an example of this syntax.&lt;/p&gt;
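&lt;p&gt;A minimal sketch of what that looks like (the names are illustrative), with the related objects in one file separated by &lt;code&gt;---&lt;/code&gt;:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;apiVersion: v1
kind: ConfigMap
metadata:
  name: myapp-config
data:
  LOG_LEVEL: info
---
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app.kubernetes.io/name: myapp
  ports:
  - port: 80
&lt;/code&gt;&lt;/pre&gt;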
&lt;p&gt;You can even apply entire directories with:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl apply -f configs/
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;One command, and everything in that folder gets deployed.&lt;/p&gt;
&lt;h3 id=&#34;add-helpful-annotations&#34;&gt;Add helpful annotations&lt;/h3&gt;
&lt;p&gt;Manifest files are not just for machines, they are for humans too. Use annotations to describe why something exists or what it does. A quick one-liner can save hours when debugging later and also allows better collaboration.&lt;/p&gt;
&lt;p&gt;The most helpful annotation to set is &lt;code&gt;kubernetes.io/description&lt;/code&gt;. It&#39;s like a code comment, except that it gets copied into the API so that everyone else can see it even after you deploy.&lt;/p&gt;
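&lt;p&gt;For example (the object name and description text here are made up):&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;metadata:
  name: checkout-queue
  annotations:
    kubernetes.io/description: Buffers pending checkout events; owned by the payments team
&lt;/code&gt;&lt;/pre&gt;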
&lt;h2 id=&#34;managing-workloads-pods-deployments-and-jobs&#34;&gt;Managing Workloads: Pods, Deployments, and Jobs&lt;/h2&gt;
&lt;p&gt;A common early mistake in Kubernetes is creating Pods directly. Pods work, but they don&#39;t reschedule themselves if something goes wrong.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Naked Pods&lt;/em&gt; (Pods not managed by a controller, such as &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/workloads/controllers/deployment/&#34;&gt;Deployment&lt;/a&gt; or a &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/workloads/controllers/statefulset/&#34;&gt;StatefulSet&lt;/a&gt;) are fine for testing, but in real setups, they are risky.&lt;/p&gt;
&lt;p&gt;Why?
Because if the node hosting that Pod dies, the Pod dies with it and Kubernetes won&#39;t bring it back automatically.&lt;/p&gt;
&lt;h3 id=&#34;use-deployments-for-apps-that-should-always-be-running&#34;&gt;Use Deployments for apps that should always be running&lt;/h3&gt;
&lt;p&gt;A Deployment, which both creates a ReplicaSet to ensure that the desired number of Pods is always available, and specifies a strategy to replace Pods (such as &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/workloads/controllers/deployment/#rolling-update-deployment&#34;&gt;RollingUpdate&lt;/a&gt;), is almost always preferable to creating Pods directly.
You can roll out a new version, and if something breaks, roll back instantly.&lt;/p&gt;
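&lt;p&gt;A minimal Deployment sketch (the app name and image are illustrative):&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app.kubernetes.io/name: web
  template:
    metadata:
      labels:
        app.kubernetes.io/name: web
    spec:
      containers:
      - name: web
        image: nginx   # placeholder image
&lt;/code&gt;&lt;/pre&gt;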
&lt;h3 id=&#34;use-jobs-for-tasks-that-should-finish&#34;&gt;Use Jobs for tasks that should finish&lt;/h3&gt;
&lt;p&gt;A &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/workloads/controllers/job/&#34;&gt;Job&lt;/a&gt; is perfect when you need something to run once and then stop, like a database migration or a batch processing task.
It will retry if the Pod fails and report success when it&#39;s done.&lt;/p&gt;
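&lt;p&gt;A minimal Job sketch (the image and command are placeholders for your one-off task):&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrate
spec:
  backoffLimit: 3              # retry failed Pods up to 3 times
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: migrate
        image: example.com/migrator:1.0   # placeholder image
        command: [&amp;quot;./migrate&amp;quot;, &amp;quot;--apply&amp;quot;]
&lt;/code&gt;&lt;/pre&gt;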
&lt;h2 id=&#34;service-configuration-and-networking&#34;&gt;Service Configuration and Networking&lt;/h2&gt;
&lt;p&gt;Services are how your workloads talk to each other inside (and sometimes outside) your cluster. Without them, your pods exist but can&#39;t reach anyone. Let&#39;s make sure that doesn&#39;t happen.&lt;/p&gt;
&lt;h3 id=&#34;create-services-before-workloads-that-use-them&#34;&gt;Create Services before workloads that use them&lt;/h3&gt;
&lt;p&gt;When Kubernetes starts a Pod, it automatically injects environment variables for existing Services.
So, if a Pod depends on a Service, create a &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/services-networking/service/&#34;&gt;Service&lt;/a&gt; &lt;strong&gt;before&lt;/strong&gt; its corresponding backend workloads (Deployments or StatefulSets), and before any workloads that need to access it.&lt;/p&gt;
&lt;p&gt;For example, if a Service named foo exists, all containers will get the following variables in their initial environment:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;FOO_SERVICE_HOST=&amp;lt;the host the Service runs on&amp;gt;
FOO_SERVICE_PORT=&amp;lt;the port the Service runs on&amp;gt;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;DNS-based discovery doesn&#39;t have this problem, but it&#39;s a good habit to follow anyway.&lt;/p&gt;
&lt;h3 id=&#34;use-dns-for-service-discovery&#34;&gt;Use DNS for Service discovery&lt;/h3&gt;
&lt;p&gt;If your cluster has the DNS &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/cluster-administration/addons/&#34;&gt;add-on&lt;/a&gt; (most do), every Service automatically gets a DNS entry. That means you can access it by name instead of IP:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;curl http://my-service.default.svc.cluster.local
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;It&#39;s one of those features that makes Kubernetes networking feel magical.&lt;/p&gt;
&lt;h3 id=&#34;avoid-hostport-and-hostnetwork-unless-absolutely-necessary&#34;&gt;Avoid &lt;code&gt;hostPort&lt;/code&gt; and &lt;code&gt;hostNetwork&lt;/code&gt; unless absolutely necessary&lt;/h3&gt;
&lt;p&gt;You&#39;ll sometimes see these options in manifests:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;hostPort&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;8080&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;hostNetwork&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;true&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;But here&#39;s the thing:
they tie your Pods to specific nodes, making them harder to schedule and scale, because each &amp;lt;&lt;code&gt;hostIP&lt;/code&gt;, &lt;code&gt;hostPort&lt;/code&gt;, &lt;code&gt;protocol&lt;/code&gt;&amp;gt; combination must be unique. If you don&#39;t specify the &lt;code&gt;hostIP&lt;/code&gt; and &lt;code&gt;protocol&lt;/code&gt; explicitly, Kubernetes uses &lt;code&gt;0.0.0.0&lt;/code&gt; as the default &lt;code&gt;hostIP&lt;/code&gt; and &lt;code&gt;TCP&lt;/code&gt; as the default &lt;code&gt;protocol&lt;/code&gt;.
Unless you&#39;re debugging or building something like a network plugin, avoid them.&lt;/p&gt;
&lt;p&gt;If you just need local access for testing, try &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/reference/kubectl/generated/kubectl_port-forward/&#34;&gt;&lt;code&gt;kubectl port-forward&lt;/code&gt;&lt;/a&gt;:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl port-forward deployment/web 8080:80
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;See &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/tasks/access-application-cluster/port-forward-access-application-cluster/&#34;&gt;Use Port Forwarding to access applications in a cluster&lt;/a&gt; to learn more.
Or if you really need external access, use a &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/services-networking/service/#type-nodeport&#34;&gt;&lt;code&gt;type: NodePort&lt;/code&gt; Service&lt;/a&gt;. That&#39;s the safer, Kubernetes-native way.&lt;/p&gt;
&lt;h3 id=&#34;use-headless-services-for-internal-discovery&#34;&gt;Use headless Services for internal discovery&lt;/h3&gt;
&lt;p&gt;Sometimes, you don&#39;t want Kubernetes to load balance traffic. You want to talk directly to each Pod. That&#39;s where &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/services-networking/service/#headless-services&#34;&gt;headless Services&lt;/a&gt; come in.&lt;/p&gt;
&lt;p&gt;You create one by setting &lt;code&gt;clusterIP: None&lt;/code&gt;.
Instead of a single IP, DNS gives you the IPs of all the backing Pods, perfect for apps that manage connections themselves.&lt;/p&gt;
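&lt;p&gt;A headless Service sketch (the selector and port are illustrative):&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;apiVersion: v1
kind: Service
metadata:
  name: myapp-headless
spec:
  clusterIP: None              # headless: DNS returns the individual Pod IPs
  selector:
    app.kubernetes.io/name: myapp
  ports:
  - port: 5432
&lt;/code&gt;&lt;/pre&gt;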
&lt;h2 id=&#34;working-with-labels-effectively&#34;&gt;Working with labels effectively&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/overview/working-with-objects/labels/&#34;&gt;Labels&lt;/a&gt; are key/value pairs that are attached to objects such as Pods.
Labels help you organize, query and group your resources.
They don&#39;t do anything by themselves, but they make everything else from Services to Deployments work together smoothly.&lt;/p&gt;
&lt;h3 id=&#34;use-semantics-labels&#34;&gt;Use semantic labels&lt;/h3&gt;
&lt;p&gt;Good labels help you understand what&#39;s what, even months later.
Define and use &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/overview/working-with-objects/labels/&#34;&gt;labels&lt;/a&gt; that identify semantic attributes of your application or Deployment.
For example:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;labels&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;app.kubernetes.io/name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;myapp&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;app.kubernetes.io/component&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;web&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;tier&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;frontend&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;phase&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;test&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;ul&gt;
&lt;li&gt;&lt;code&gt;app.kubernetes.io/name&lt;/code&gt; : what the app is&lt;/li&gt;
&lt;li&gt;&lt;code&gt;tier&lt;/code&gt; : which layer it belongs to (frontend/backend)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;phase&lt;/code&gt; : which stage it&#39;s in (test/prod)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You can then use these labels to make powerful selectors.
For example:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl get pods -l &lt;span style=&#34;color:#b8860b&#34;&gt;tier&lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;=&lt;/span&gt;frontend
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This will list all frontend Pods across your cluster, no matter which Deployment they came from.
Basically, you are not manually listing Pod names; you are just describing what you want.
See the &lt;a href=&#34;https://github.com/kubernetes/examples/tree/master/web/guestbook/&#34;&gt;guestbook&lt;/a&gt; app for examples of this approach.&lt;/p&gt;
&lt;h3 id=&#34;use-common-kubernetes-labels&#34;&gt;Use common Kubernetes labels&lt;/h3&gt;
&lt;p&gt;Kubernetes actually recommends a set of &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/overview/working-with-objects/common-labels/&#34;&gt;common labels&lt;/a&gt;. It&#39;s a standardized way to name things across your different workloads or projects.
Following this convention makes your manifests cleaner, and it means that tools such as &lt;a href=&#34;https://headlamp.dev/&#34;&gt;Headlamp&lt;/a&gt;, &lt;a href=&#34;https://github.com/kubernetes/dashboard#introduction&#34;&gt;dashboard&lt;/a&gt;, or third-party monitoring systems can all
automatically understand what&#39;s running.&lt;/p&gt;
&lt;h3 id=&#34;manipulate-labels-for-debugging&#34;&gt;Manipulate labels for debugging&lt;/h3&gt;
&lt;p&gt;Since controllers (like ReplicaSets or Deployments) use labels to manage Pods, you can remove a label to “detach” a Pod temporarily.&lt;/p&gt;
&lt;p&gt;Example:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl label pod mypod app-
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The &lt;code&gt;app-&lt;/code&gt; part removes the label key &lt;code&gt;app&lt;/code&gt;.
Once that happens, the controller won’t manage that Pod anymore.
It’s like isolating it for inspection, a “quarantine mode” for debugging. To interactively remove or add labels, use &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/reference/kubectl/generated/kubectl_label/&#34;&gt;&lt;code&gt;kubectl label&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;You can then check its logs, exec into it, and once you are done, delete it manually.
It’s a super underrated trick that every Kubernetes engineer should know.&lt;/p&gt;
&lt;h2 id=&#34;handy-kubectl-tips&#34;&gt;Handy kubectl tips&lt;/h2&gt;
&lt;p&gt;These small tips make life much easier when you are working with multiple manifest files or clusters.&lt;/p&gt;
&lt;h3 id=&#34;apply-entire-directories&#34;&gt;Apply entire directories&lt;/h3&gt;
&lt;p&gt;Instead of applying one file at a time, apply the whole folder:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# Using server-side apply is also a good practice&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl apply -f configs/ --server-side
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This command looks for &lt;code&gt;.yaml&lt;/code&gt;, &lt;code&gt;.yml&lt;/code&gt; and &lt;code&gt;.json&lt;/code&gt; files in that folder and applies them all together.
It&#39;s faster, cleaner and helps keep things grouped by app.&lt;/p&gt;
&lt;h3 id=&#34;use-label-selectors-to-get-or-delete-resources&#34;&gt;Use label selectors to get or delete resources&lt;/h3&gt;
&lt;p&gt;You don&#39;t always need to type out resource names one by one.
Instead, use &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/overview/working-with-objects/labels/#label-selectors&#34;&gt;selectors&lt;/a&gt; to act on entire groups at once:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl get pods -l &lt;span style=&#34;color:#b8860b&#34;&gt;app&lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;=&lt;/span&gt;myapp
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl delete pod -l &lt;span style=&#34;color:#b8860b&#34;&gt;phase&lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#a2f&#34;&gt;test&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;It&#39;s especially useful in CI/CD pipelines, where you want to clean up test resources dynamically.&lt;/p&gt;
&lt;h3 id=&#34;quickly-create-deployments-and-services&#34;&gt;Quickly create Deployments and Services&lt;/h3&gt;
&lt;p&gt;For quick experiments, you don&#39;t always need to write a manifest. You can spin up a Deployment right from the CLI:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl create deployment webapp --image&lt;span style=&#34;color:#666&#34;&gt;=&lt;/span&gt;nginx
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Then expose it as a Service:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl expose deployment webapp --port&lt;span style=&#34;color:#666&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;80&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This is great when you just want to test something before writing full manifests.
Also, see &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/tasks/access-application-cluster/service-access-application-cluster/&#34;&gt;Use a Service to Access an Application in a cluster&lt;/a&gt; for an example.&lt;/p&gt;
&lt;h2 id=&#34;conclusion&#34;&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Cleaner configuration leads to calmer cluster administrators.
Stick to a few simple habits: keep configuration simple and minimal, version-control everything,
use consistent labels, and avoid relying on naked Pods. You&#39;ll save yourself hours of debugging down the road.&lt;/p&gt;
&lt;p&gt;The best part?
Clean configurations stay readable. Even after months, you or anyone on your team can glance at them and know exactly what’s happening.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Announcing the 2025 Steering Committee Election Results</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/11/09/steering-committee-results-2025/</link>
      <pubDate>Sun, 09 Nov 2025 15:10:00 -0500</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/11/09/steering-committee-results-2025/</guid>
      <description>
        
        
        &lt;p&gt;The &lt;a href=&#34;https://github.com/kubernetes/community/tree/master/elections/steering/2025&#34;&gt;2025 Steering Committee Election&lt;/a&gt; is now complete. The Kubernetes Steering Committee consists of 7 seats, 4 of which were up for election in 2025. Incoming committee members serve a term of 2 years, and all members are elected by the Kubernetes Community.&lt;/p&gt;
&lt;p&gt;The Steering Committee oversees the governance of the entire Kubernetes project. With that great power comes great responsibility. You can learn more about the steering committee’s role in their &lt;a href=&#34;https://github.com/kubernetes/steering/blob/master/charter.md&#34;&gt;charter&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Thank you to everyone who voted in the election; your participation helps support the community’s continued health and success.&lt;/p&gt;
&lt;h2 id=&#34;results&#34;&gt;Results&lt;/h2&gt;
&lt;p&gt;Congratulations to the elected committee members, whose two-year terms begin immediately (listed in alphabetical order by GitHub handle):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Kat Cosgrove (&lt;a href=&#34;https://github.com/katcosgrove&#34;&gt;@katcosgrove&lt;/a&gt;), Minimus&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Paco Xu (&lt;a href=&#34;https://github.com/pacoxu&#34;&gt;@pacoxu&lt;/a&gt;), DaoCloud&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Rita Zhang (&lt;a href=&#34;https://github.com/ritazh&#34;&gt;@ritazh&lt;/a&gt;), Microsoft&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Maciej Szulik (&lt;a href=&#34;https://github.com/soltysh&#34;&gt;@soltysh&lt;/a&gt;), Defense Unicorns&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;They join continuing members:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Antonio Ojea (&lt;a href=&#34;https://github.com/aojea&#34;&gt;@aojea&lt;/a&gt;), Google&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Benjamin Elder (&lt;a href=&#34;https://github.com/BenTheElder&#34;&gt;@BenTheElder&lt;/a&gt;), Google&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Sascha Grunert (&lt;a href=&#34;https://github.com/saschagrunert&#34;&gt;@saschagrunert&lt;/a&gt;), Red Hat&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Maciej Szulik and Paco Xu are returning Steering Committee Members.&lt;/p&gt;
&lt;h2 id=&#34;big-thanks&#34;&gt;Big thanks!&lt;/h2&gt;
&lt;p&gt;Thank you and congratulations on a successful election to this round’s election officers:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Christoph Blecker (&lt;a href=&#34;https://github.com/cblecker&#34;&gt;@cblecker&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Nina Polshakova (&lt;a href=&#34;https://github.com/npolshakova&#34;&gt;@npolshakova&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Sreeram Venkitesh (&lt;a href=&#34;https://github.com/sreeram-venkitesh&#34;&gt;@sreeram-venkitesh&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Thanks to the Emeritus Steering Committee Members. Your service is appreciated by the community:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Stephen Augustus (&lt;a href=&#34;https://github.com/justaugustus&#34;&gt;@justaugustus&lt;/a&gt;), Bloomberg&lt;/li&gt;
&lt;li&gt;Patrick Ohly (&lt;a href=&#34;https://github.com/pohly&#34;&gt;@pohly&lt;/a&gt;), Intel&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;And thank you to all the candidates who came forward to run for election.&lt;/p&gt;
&lt;h2 id=&#34;get-involved-with-the-steering-committee&#34;&gt;Get involved with the Steering Committee&lt;/h2&gt;
&lt;p&gt;This governing body, like all of Kubernetes, is open to all. You can follow along with Steering Committee &lt;a href=&#34;https://bit.ly/k8s-steering-wd&#34;&gt;meeting notes&lt;/a&gt; and weigh in by filing an issue or creating a PR against their &lt;a href=&#34;https://github.com/kubernetes/steering&#34;&gt;repo&lt;/a&gt;. They hold an &lt;a href=&#34;https://github.com/kubernetes/steering&#34;&gt;open meeting on the first Wednesday of every month at 8am PT&lt;/a&gt;. They can also be contacted at their public mailing list, &lt;a href=&#34;mailto:steering@kubernetes.io&#34;&gt;steering@kubernetes.io&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;You can see what the Steering Committee meetings are all about by watching past meetings on the &lt;a href=&#34;https://www.youtube.com/playlist?list=PL69nYSiGNLP1yP1B_nd9-drjoxp0Q14qM&#34;&gt;YouTube Playlist&lt;/a&gt;.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;&lt;em&gt;This post was adapted from one written by the &lt;a href=&#34;https://github.com/kubernetes/community/tree/master/communication/contributor-comms&#34;&gt;Contributor Comms Subproject&lt;/a&gt;. If you want to write stories about the Kubernetes community, learn more about us.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;This article was revised in November 2025 to update the information about when the steering committee meets.&lt;/em&gt;&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Gateway API 1.4: New Features</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/11/06/gateway-api-v1-4/</link>
      <pubDate>Thu, 06 Nov 2025 09:00:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/11/06/gateway-api-v1-4/</guid>
      <description>
        
        
        &lt;p&gt;&lt;img alt=&#34;Gateway API logo&#34; src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/11/06/gateway-api-v1-4/gateway-api-logo.svg&#34;&gt;&lt;/p&gt;
&lt;p&gt;Ready to rock your Kubernetes networking? The Kubernetes SIG Network community has announced the General Availability (GA) release of Gateway API v1.4.0! Released on October 6, 2025, version 1.4.0 strengthens the path toward modern, expressive, and extensible service networking in Kubernetes.&lt;/p&gt;
&lt;p&gt;Gateway API v1.4.0 brings three new features to the &lt;em&gt;Standard channel&lt;/em&gt;
(Gateway API&#39;s GA release channel):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;BackendTLSPolicy for TLS between gateways and backends&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;supportedFeatures&lt;/code&gt; in GatewayClass status&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Named rules for Routes&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;and introduces three new experimental features:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Mesh resource for service mesh configuration&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Default gateways&lt;/strong&gt; to ease configuration burden&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;externalAuth&lt;/code&gt; filter for HTTPRoute&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;graduations-to-standard-channel&#34;&gt;Graduations to Standard Channel&lt;/h2&gt;
&lt;h3 id=&#34;backend-tls-policy&#34;&gt;Backend TLS policy&lt;/h3&gt;
&lt;p&gt;Leads: &lt;a href=&#34;https://github.com/candita&#34;&gt;Candace Holman&lt;/a&gt;, &lt;a href=&#34;https://github.com/snorwin&#34;&gt;Norwin Schnyder&lt;/a&gt;, &lt;a href=&#34;https://github.com/kl52752&#34;&gt;Katarzyna Łach&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;GEP-1897: &lt;a href=&#34;https://github.com/kubernetes-sigs/gateway-api/issues/1897&#34;&gt;BackendTLSPolicy&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://gateway-api.sigs.k8s.io/api-types/backendtlspolicy&#34;&gt;BackendTLSPolicy&lt;/a&gt; is a new Gateway API type for specifying the TLS configuration
of the connection from the Gateway to backend pod(s).
Prior to the introduction of BackendTLSPolicy, there was no API specification
that allowed encrypted traffic on the hop from Gateway to backend.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;BackendTLSPolicy&lt;/code&gt; &lt;code&gt;validation&lt;/code&gt; configuration requires a hostname. This &lt;code&gt;hostname&lt;/code&gt;
serves two purposes: it is used as the SNI header when connecting to the backend, and,
for authentication, the certificate presented by the backend must match this hostname,
&lt;em&gt;unless&lt;/em&gt; &lt;code&gt;subjectAltNames&lt;/code&gt; is explicitly specified.&lt;/p&gt;
&lt;p&gt;If &lt;code&gt;subjectAltNames&lt;/code&gt; (SANs) are specified, the &lt;code&gt;hostname&lt;/code&gt; is only used for SNI, and authentication is performed against the SANs instead. If you still need to authenticate against the hostname value in this case, you MUST add it to the &lt;code&gt;subjectAltNames&lt;/code&gt; list.&lt;/p&gt;
&lt;p&gt;BackendTLSPolicy &lt;code&gt;validation&lt;/code&gt; configuration also requires either &lt;code&gt;caCertificateRefs&lt;/code&gt; or &lt;code&gt;wellKnownCACertificates&lt;/code&gt;.
&lt;code&gt;caCertificateRefs&lt;/code&gt; refers to one or more (up to 8) PEM-encoded TLS certificate bundles. If there are no specific certificates to use,
then, depending on your implementation, you may set &lt;code&gt;wellKnownCACertificates&lt;/code&gt;
to &amp;quot;System&amp;quot; to tell the Gateway to use an implementation-specific set of trusted CA certificates.&lt;/p&gt;
&lt;p&gt;In this example, the BackendTLSPolicy is configured to use certificates defined in the &lt;code&gt;auth-cert&lt;/code&gt; ConfigMap
for a TLS-encrypted upstream connection, where the Pods backing the &lt;code&gt;auth&lt;/code&gt; Service are expected to serve a
valid certificate for &lt;code&gt;auth.example.com&lt;/code&gt;. It uses &lt;code&gt;subjectAltNames&lt;/code&gt; with a Hostname type, but you may also use a URI type.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;gateway.networking.k8s.io/v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;BackendTLSPolicy&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;tls-upstream-auth&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;targetRefs&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Service&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;auth&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;group&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;sectionName&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;https&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;validation&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;caCertificateRefs&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;group&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# core API group&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;ConfigMap&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;auth-cert&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;subjectAltNames&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;type&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;Hostname&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;hostname&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;auth.example.com&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;In the next example, the BackendTLSPolicy is configured to use system certificates for a TLS-encrypted backend connection, where the Pods backing the &lt;code&gt;dev&lt;/code&gt; Service are expected to serve a valid certificate for &lt;code&gt;dev.example.com&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;gateway.networking.k8s.io/v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;BackendTLSPolicy&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;tls-upstream-dev&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;targetRefs&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Service&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;dev&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;group&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;sectionName&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;btls&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;validation&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;wellKnownCACertificates&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;System&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;hostname&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;dev.example.com&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;More information on the configuration of TLS in Gateway API can be found in &lt;a href=&#34;https://gateway-api.sigs.k8s.io/guides/tls/&#34;&gt;Gateway API - TLS Configuration&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&#34;status-information-about-the-features-that-an-implementation-supports&#34;&gt;Status information about the features that an implementation supports&lt;/h3&gt;
&lt;p&gt;Leads: &lt;a href=&#34;https://github.com/liorlieberman&#34;&gt;Lior Lieberman&lt;/a&gt;, &lt;a href=&#34;https://github.com/bexxmodd&#34;&gt;Beka Modebadze&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;GEP-2162: &lt;a href=&#34;https://github.com/kubernetes-sigs/gateway-api/blob/main/geps/gep-2162/index.md&#34;&gt;Supported features in GatewayClass Status&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;GatewayClass status has a new field, &lt;code&gt;supportedFeatures&lt;/code&gt;.
This addition allows implementations to declare the set of features they support. This provides a clear way for users and tools to understand the capabilities of a given GatewayClass.&lt;/p&gt;
&lt;p&gt;This feature&#39;s name for conformance tests (and GatewayClass status reporting) is &lt;strong&gt;SupportedFeatures&lt;/strong&gt;.
Implementations must populate the &lt;code&gt;supportedFeatures&lt;/code&gt; field in the &lt;code&gt;.status&lt;/code&gt; of the GatewayClass &lt;strong&gt;before&lt;/strong&gt; the GatewayClass
is accepted, or in the same operation.&lt;/p&gt;
&lt;p&gt;Here’s an example of a &lt;code&gt;supportedFeatures&lt;/code&gt; published under GatewayClass&#39; &lt;code&gt;.status&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;gateway.networking.k8s.io/v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;GatewayClass&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#00f;font-weight:bold&#34;&gt;...&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;status&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;conditions&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;lastTransitionTime&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;2022-11-16T10:33:06Z&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;message&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Handled by Foo controller&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;observedGeneration&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;reason&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Accepted&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;status&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;True&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;type&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Accepted&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;supportedFeatures&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- HTTPRoute&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- HTTPRouteHostRewrite&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- HTTPRoutePortRedirect&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- HTTPRouteQueryParamMatching&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The graduation of SupportedFeatures to Standard helped improve the conformance testing process for Gateway API.
The conformance test suite will now automatically run tests based on the features populated in the GatewayClass&#39; status.
This creates a strong, verifiable link between an implementation&#39;s declared capabilities and the test results,
making it easier for implementers to run the correct conformance tests and for users to trust the conformance reports.&lt;/p&gt;
&lt;p&gt;This means that when the &lt;code&gt;supportedFeatures&lt;/code&gt; field is populated in the GatewayClass status, there is no need for additional
conformance test flags such as &lt;code&gt;--supported-features&lt;/code&gt;, &lt;code&gt;--exempt&lt;/code&gt;, or &lt;code&gt;--all-features&lt;/code&gt;.
It&#39;s important to note that Mesh features are an exception: they can be tested for conformance by using
&lt;em&gt;Conformance Profiles&lt;/em&gt;, or by manually providing any combination of feature-related flags, until the dedicated resource
graduates from the Experimental channel.&lt;/p&gt;
&lt;h3 id=&#34;named-rules-for-routes&#34;&gt;Named rules for Routes&lt;/h3&gt;
&lt;p&gt;Leads: &lt;a href=&#34;https://github.com/guicassolato&#34;&gt;Guilherme Cassolato&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;GEP-995: &lt;a href=&#34;https://gateway-api.sigs.k8s.io/geps/gep-995&#34;&gt;Adding a new name field to all xRouteRule types (HTTPRouteRule, GRPCRouteRule, etc.)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;This enhancement enables route rules to be explicitly identified and referenced across the Gateway API ecosystem.
Some of the key use cases include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Status:&lt;/strong&gt; Allowing status conditions to reference specific rules directly by name.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Observability:&lt;/strong&gt; Making it easier to identify individual rules in logs, traces, and metrics.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Policies:&lt;/strong&gt; Enabling policies (&lt;a href=&#34;https://gateway-api.sigs.k8s.io/geps/gep-713&#34;&gt;GEP-713&lt;/a&gt;) to target specific route rules via the &lt;code&gt;sectionName&lt;/code&gt; field in their &lt;code&gt;targetRef[s]&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Tooling:&lt;/strong&gt; Simplifying filtering and referencing of route rules in tools such as &lt;code&gt;gwctl&lt;/code&gt;, &lt;code&gt;kubectl&lt;/code&gt;, and general-purpose utilities like &lt;code&gt;jq&lt;/code&gt; and &lt;code&gt;yq&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Internal configuration mapping:&lt;/strong&gt; Facilitating the generation of internal configurations that reference route rules by name within gateway and mesh implementations.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This follows the same well-established pattern already adopted for Gateway listeners, Service ports, Pods (and containers),
and many other Kubernetes resources.&lt;/p&gt;
&lt;p&gt;While the new name field is &lt;strong&gt;optional&lt;/strong&gt; (so existing resources remain valid), its use is &lt;strong&gt;strongly encouraged&lt;/strong&gt;.
Implementations are not expected to assign a default value, but they may enforce constraints such as immutability.&lt;/p&gt;
&lt;p&gt;Finally, keep in mind that the &lt;a href=&#34;https://gateway-api.sigs.k8s.io/geps/gep-995/?h=995#format&#34;&gt;name format&lt;/a&gt; is validated,
and other fields (such as &lt;a href=&#34;https://gateway-api.sigs.k8s.io/reference/spec/?h=sectionname#sectionname&#34;&gt;&lt;code&gt;sectionName&lt;/code&gt;&lt;/a&gt;)
may impose additional, indirect constraints.&lt;/p&gt;
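&lt;p&gt;As a brief sketch (the route, gateway, and backend names here are illustrative), a named rule in an HTTPRoute looks like this:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: store
spec:
  parentRefs:
  - name: example-gateway
  rules:
  - name: catalog        # the optional rule name
    matches:
    - path:
        type: PathPrefix
        value: /catalog
    backendRefs:
    - name: catalog-svc
      port: 8080
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;A policy could then target just that rule by setting &lt;code&gt;sectionName: catalog&lt;/code&gt; in its &lt;code&gt;targetRef&lt;/code&gt;, and the name can likewise show up in status conditions, logs, and metrics.&lt;/p&gt;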
&lt;h2 id=&#34;experimental-channel-changes&#34;&gt;Experimental channel changes&lt;/h2&gt;
&lt;h3 id=&#34;enabling-external-auth-for-httproute&#34;&gt;Enabling external Auth for HTTPRoute&lt;/h3&gt;
&lt;p&gt;Giving Gateway API the ability to enforce authentication (and possibly authorization) at the Gateway or HTTPRoute level has been a highly requested feature for a long time. (See the &lt;a href=&#34;https://github.com/kubernetes-sigs/gateway-api/issues/1494&#34;&gt;GEP-1494 issue&lt;/a&gt; for some background.)&lt;/p&gt;
&lt;p&gt;This Gateway API release adds an Experimental filter in HTTPRoute that tells the Gateway API implementation to call out to an external service to authenticate (and, optionally, authorize) requests.&lt;/p&gt;
&lt;p&gt;This filter is based on the &lt;a href=&#34;https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/ext_authz_filter#config-http-filters-ext-authz&#34;&gt;Envoy ext_authz API&lt;/a&gt;, and allows talking to an Auth service that uses either gRPC or HTTP for its protocol.&lt;/p&gt;
&lt;p&gt;Both methods allow the configuration of what headers to forward to the Auth service, with the HTTP protocol allowing some extra information like a prefix path.&lt;/p&gt;
&lt;p&gt;An HTTP example might look like this (note that this example requires the Experimental channel to be installed, and an implementation that supports external auth to actually understand the config):&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;gateway.networking.k8s.io/v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;HTTPRoute&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;require-auth&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;namespace&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;default&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;parentRefs&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;your-gateway-here&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;rules&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;matches&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;path&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;type&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Prefix&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;value&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;/admin&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;filters&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;type&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;ExternalAuth&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;externalAuth&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;            &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;protocol&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;HTTP&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;            &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;backendRef&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;              &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;auth-service&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;            &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;http&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;              &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# These headers are always sent for the HTTP protocol,&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;              &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# but are included here for illustrative purposes&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;              &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;allowedHeaders&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;                &lt;/span&gt;- Host&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;                &lt;/span&gt;- Method&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;                &lt;/span&gt;- Path&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;                &lt;/span&gt;- Content-Length&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;                &lt;/span&gt;- Authorization&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;backendRefs&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;admin-backend&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;port&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;8080&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This allows the backend Auth service to use the supplied headers to make a determination about the authentication for the request.&lt;/p&gt;
&lt;p&gt;When a request is allowed, the external Auth service responds with a 200 HTTP status code and, optionally, extra headers to be included in the request that is forwarded to the backend. When the request is denied, the Auth service responds with a 403 HTTP status code.&lt;/p&gt;
&lt;p&gt;Since the Authorization header is used by many authentication schemes, this filter can be used to implement Basic, OAuth, JWT, and other common authentication and authorization methods.&lt;/p&gt;
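&lt;p&gt;For comparison, a gRPC variant of the same filter is a small change: only the &lt;code&gt;protocol&lt;/code&gt; and &lt;code&gt;backendRef&lt;/code&gt; need to differ, since the ext_authz gRPC protocol carries the request metadata in its own messages. This is a sketch only; the &lt;code&gt;auth-service-grpc&lt;/code&gt; Service name and port are hypothetical, and as above it requires the Experimental channel and an implementation that supports External Auth:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-yaml&#34;&gt;      filters:
        - type: ExternalAuth
          externalAuth:
            # GRPC uses the Envoy ext_authz protobuf protocol; request
            # metadata travels in the CheckRequest rather than HTTP headers.
            protocol: GRPC
            backendRef:
              name: auth-service-grpc  # hypothetical auth Service
              port: 9090               # assumed gRPC port
&lt;/code&gt;&lt;/pre&gt;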
&lt;h3 id=&#34;mesh-resource&#34;&gt;Mesh resource&lt;/h3&gt;
&lt;p&gt;Lead(s): &lt;a href=&#34;https://github.com/kflynn&#34;&gt;Flynn&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;GEP-3949: &lt;a href=&#34;https://github.com/kubernetes-sigs/gateway-api/issues/3949&#34;&gt;Mesh-wide configuration and supported features&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Gateway API v1.4.0 introduces a new experimental Mesh resource, which provides a way to configure mesh-wide settings and discover the features supported by a given mesh implementation. This resource is analogous to the Gateway resource and will initially be mainly used for conformance testing, with plans to extend its use to off-cluster Gateways in the future.&lt;/p&gt;
&lt;p&gt;The Mesh resource is cluster-scoped and, as an experimental feature, is named &lt;code&gt;XMesh&lt;/code&gt; and resides in the &lt;code&gt;gateway.networking.x-k8s.io&lt;/code&gt; API group. A key field is &lt;code&gt;controllerName&lt;/code&gt;, which specifies the mesh implementation responsible for the resource. The resource&#39;s &lt;code&gt;status&lt;/code&gt; stanza indicates whether the mesh implementation has accepted it and lists the features the mesh supports.&lt;/p&gt;
&lt;p&gt;One of the goals of this GEP is to avoid making it more difficult for users to adopt a mesh. To simplify adoption, mesh implementations are expected to create a default Mesh resource upon startup if one with a matching &lt;code&gt;controllerName&lt;/code&gt; doesn&#39;t already exist. This avoids the need for manual creation of the resource to begin using a mesh.&lt;/p&gt;
&lt;p&gt;The new &lt;code&gt;XMesh&lt;/code&gt; API kind, within the &lt;code&gt;gateway.networking.x-k8s.io/v1alpha1&lt;/code&gt; API group,
provides a central point for mesh configuration and feature discovery.&lt;/p&gt;
&lt;p&gt;A minimal XMesh object specifies the &lt;code&gt;controllerName&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;gateway.networking.x-k8s.io/v1alpha1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;XMesh&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;one-mesh-to-mesh-them-all&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;controllerName&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;one-mesh.example.com/one-mesh&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The mesh implementation populates the &lt;code&gt;status&lt;/code&gt; field to confirm it has accepted the resource and to list its supported features:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;status&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;conditions&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;type&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Accepted&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;status&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;True&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;reason&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Accepted&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;supportedFeatures&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;MeshHTTPRoute&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;OffClusterGateway&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&#34;introducing-default-gateways&#34;&gt;Introducing default Gateways&lt;/h3&gt;
&lt;p&gt;Lead(s): &lt;a href=&#34;https://github.com/kflynn&#34;&gt;Flynn&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;GEP-3793: &lt;a href=&#34;https://github.com/kubernetes-sigs/gateway-api/issues/3793&#34;&gt;Allowing Gateways to program some routes by default&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;For application developers, one common piece of feedback has been the need to explicitly name a parent Gateway for every single north-south Route. While this explicitness prevents ambiguity, it adds friction, especially for developers who just want to expose their application to the outside world without worrying about the underlying infrastructure&#39;s naming scheme. To address this, we have introduced the concept of &lt;strong&gt;Default Gateways&lt;/strong&gt;.&lt;/p&gt;
&lt;h4 id=&#34;for-application-developers-just-use-the-default&#34;&gt;For application developers: Just &amp;quot;use the default&amp;quot;&lt;/h4&gt;
&lt;p&gt;As an application developer, you often don&#39;t care about the specific Gateway your traffic flows through; you just want it to work. With this enhancement, you can now create a Route and simply ask it to use a default Gateway.&lt;/p&gt;
&lt;p&gt;This is done by setting the new &lt;code&gt;useDefaultGateways&lt;/code&gt; field in your Route&#39;s &lt;code&gt;spec&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Here’s a simple &lt;code&gt;HTTPRoute&lt;/code&gt; that uses a default Gateway:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;gateway.networking.k8s.io/v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;HTTPRoute&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;my-route&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;useDefaultGateways&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;All&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;rules&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;backendRefs&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;my-service&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;port&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;80&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;That&#39;s it! No more need to hunt down the correct Gateway name for your environment. Your Route is now a &amp;quot;defaulted Route.&amp;quot;&lt;/p&gt;
&lt;h4 id=&#34;for-cluster-operators-you-re-still-in-control&#34;&gt;For cluster operators: You&#39;re still in control&lt;/h4&gt;
&lt;p&gt;This feature doesn&#39;t take control away from cluster operators (&amp;quot;Chihiro&amp;quot;).
In fact, they have explicit control over which Gateways can act as a default. A Gateway will only accept these &lt;em&gt;defaulted Routes&lt;/em&gt; if it is configured to do so.&lt;/p&gt;
&lt;p&gt;You can also use a ValidatingAdmissionPolicy to either require, or even forbid, Routes relying on a default Gateway.&lt;/p&gt;
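&lt;p&gt;As a sketch of the &amp;quot;forbid&amp;quot; case, a ValidatingAdmissionPolicy can use a CEL expression to reject any HTTPRoute that sets the new field (the policy name and message here are made up for illustration, and a ValidatingAdmissionPolicyBinding is also needed to put the policy into effect):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-yaml&#34;&gt;apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: forbid-default-gateways  # illustrative name
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups: [&#34;gateway.networking.k8s.io&#34;]
        apiVersions: [&#34;*&#34;]
        operations: [&#34;CREATE&#34;, &#34;UPDATE&#34;]
        resources: [&#34;httproutes&#34;]
  validations:
    # Reject Routes that ask for a default Gateway.
    - expression: &#34;!has(object.spec.useDefaultGateways)&#34;
      message: &#34;Routes in this cluster must name their parent Gateways explicitly.&#34;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Inverting the expression would instead require Routes to rely on a default Gateway.&lt;/p&gt;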
&lt;p&gt;As a cluster operator, you can designate a Gateway as a default
by setting the (new) &lt;code&gt;.spec.defaultScope&lt;/code&gt; field:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;gateway.networking.k8s.io/v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Gateway&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;my-default-gateway&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;namespace&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;default&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;defaultScope&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;All&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# ... other gateway configuration&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Operators can choose to have no default Gateways, or even multiple.&lt;/p&gt;
&lt;h4 id=&#34;how-it-works-and-key-details&#34;&gt;How it works and key details&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;To maintain a clean, GitOps-friendly workflow, a default Gateway does &lt;em&gt;not&lt;/em&gt; modify the &lt;code&gt;spec.parentRefs&lt;/code&gt; of your Route. Instead, the binding is reflected in the Route&#39;s &lt;code&gt;status&lt;/code&gt; field. You can always inspect the &lt;code&gt;status.parents&lt;/code&gt; stanza of your Route to see exactly which Gateway or Gateways have accepted it. This preserves your original intent and avoids conflicts with CD tools.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The design explicitly supports having multiple Gateways designated as defaults within a cluster. When this happens, a defaulted Route will bind to &lt;em&gt;all&lt;/em&gt; of them. This enables cluster operators to perform zero-downtime migrations and testing of new default Gateways.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;You can create a single Route that handles both north-south traffic (traffic entering or leaving the cluster, via a default Gateway) and east-west/mesh traffic (traffic between services within the cluster), by explicitly referencing a Service in &lt;code&gt;parentRefs&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
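&lt;p&gt;The last point in the list above can be sketched as a single HTTPRoute: &lt;code&gt;useDefaultGateways&lt;/code&gt; covers north-south traffic while a Service reference in &lt;code&gt;parentRefs&lt;/code&gt; covers mesh traffic (the Service name is illustrative, and a mesh implementation that supports Service parentRefs is assumed):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-yaml&#34;&gt;apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: my-route
spec:
  useDefaultGateways: All   # north-south: bind to any default Gateway
  parentRefs:
    - group: &#34;&#34;             # core API group
      kind: Service
      name: my-service      # east-west: mesh traffic addressed to this Service
  rules:
    - backendRefs:
        - name: my-service
          port: 80
&lt;/code&gt;&lt;/pre&gt;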
&lt;p&gt;Default Gateways represent a significant step forward in making the Gateway API simpler and more intuitive for everyday use cases, bridging the gap between the flexibility needed by operators and the simplicity desired by developers.&lt;/p&gt;
&lt;h3 id=&#34;configuring-client-certificate-validation&#34;&gt;Configuring client certificate validation&lt;/h3&gt;
&lt;p&gt;Lead(s): &lt;a href=&#34;https://github.com/arkodg&#34;&gt;Arko Dasgupta&lt;/a&gt;, &lt;a href=&#34;https://github.com/kl52752&#34;&gt;Katarzyna Łach&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;GEP-91: &lt;a href=&#34;https://github.com/kubernetes-sigs/gateway-api/pull/3942&#34;&gt;Address connection coalescing security issue&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;This release brings updates for configuring client certificate validation, addressing a critical security vulnerability related to connection reuse.
HTTP connection coalescing is a web performance optimization that allows a client to reuse an existing TLS connection
for requests to different domains. While this reduces the overhead of establishing new connections, it introduces a security risk
in the context of API gateways.
Because a single TLS connection can be reused across multiple Listeners, client certificate
configuration must be shared consistently across those Listeners to avoid unauthorized access.&lt;/p&gt;
&lt;h4 id=&#34;why-sni-based-mtls-is-not-the-answer&#34;&gt;Why SNI-based mTLS is not the answer&lt;/h4&gt;
&lt;p&gt;One might think that using Server Name Indication (SNI) to differentiate between Listeners would solve this problem.
However, TLS SNI is not a reliable mechanism for enforcing security policies in a connection coalescing scenario.
A client could use a single TLS connection for requests to multiple hostnames, as long as they are all covered by the same certificate.
This means that a client could establish a connection by indicating one peer identity (using SNI), and then reuse that connection
to access a different virtual host that is listening on the same IP address and port. That reuse, which is controlled by client-side
heuristics, could bypass mutual TLS policies that were specific to the second Listener&#39;s configuration.&lt;/p&gt;
&lt;p&gt;Here&#39;s an example to help explain it:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;gateway.networking.k8s.io/v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Gateway&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;wildcard-tls-gateway&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;gatewayClassName&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;example&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;listeners&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;foo-https&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;protocol&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;HTTPS&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;port&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;443&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;hostname&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;foo.example.com&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;tls&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;certificateRefs&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;group&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# core API group&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Secret&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;foo-example-com-cert&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# SAN: foo.example.com&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;wildcard-https&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;protocol&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;HTTPS&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;port&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;443&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;hostname&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;*.example.com&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;tls&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;certificateRefs&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;group&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# core API group&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Secret&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;wildcard-example-com-cert&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# SAN: *.example.com&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;I have configured a Gateway with two listeners, both having overlapping hostnames.
My intention is for the &lt;code&gt;foo-https&lt;/code&gt; listener to be accessible only by clients presenting the &lt;code&gt;foo-example-com-cert&lt;/code&gt; certificate.
In contrast, the &lt;code&gt;wildcard-https&lt;/code&gt; listener should allow access to a broader audience using any certificate valid for the &lt;code&gt;*.example.com&lt;/code&gt; domain.&lt;/p&gt;
&lt;p&gt;Consider a scenario where a client initially connects to &lt;code&gt;foo.example.com&lt;/code&gt;. The server requests and successfully validates the
&lt;code&gt;foo-example-com-cert&lt;/code&gt; certificate, establishing the connection. Subsequently, the same client wishes to access other sites within this domain,
such as &lt;code&gt;bar.example.com&lt;/code&gt;, which is handled by the &lt;code&gt;wildcard-https&lt;/code&gt; listener. Due to connection reuse,
clients can access &lt;code&gt;wildcard-https&lt;/code&gt; backends without an additional TLS handshake on the existing connection.
This process functions as expected.&lt;/p&gt;
&lt;p&gt;However, a critical security vulnerability arises when the order of access is reversed.
If a client first connects to &lt;code&gt;bar.example.com&lt;/code&gt; and presents a valid &lt;code&gt;bar.example.com&lt;/code&gt; certificate, the connection is successfully established.
If this client then attempts to access &lt;code&gt;foo.example.com&lt;/code&gt;, the existing connection&#39;s client certificate will not be re-validated.
This allows the client to bypass the specific certificate requirement for the &lt;code&gt;foo&lt;/code&gt; backend, leading to a serious security breach.&lt;/p&gt;
&lt;h4 id=&#34;the-solution-per-port-tls-configuration&#34;&gt;The solution: per-port TLS configuration&lt;/h4&gt;
&lt;p&gt;The updated Gateway API adds a &lt;code&gt;tls&lt;/code&gt; field to the &lt;code&gt;.spec&lt;/code&gt; of a Gateway, which allows you to define a default client certificate
validation configuration for all Listeners and then, if needed, override it on a per-port basis. This provides a flexible and
powerful way to manage your TLS policies.&lt;/p&gt;
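&lt;p&gt;As an illustrative sketch only, a Gateway using the new field might look like the following. The contents of the &lt;code&gt;TLSConfig&lt;/code&gt; blocks (here, a &lt;code&gt;frontendValidation&lt;/code&gt; section with CA certificate references) are assumptions based on the existing Listener-level API, so treat the field names inside &lt;code&gt;default&lt;/code&gt; and &lt;code&gt;perPort&lt;/code&gt; as hypothetical rather than a definitive reference:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: example-gateway        # hypothetical name
spec:
  gatewayClassName: example
  tls:
    default:                   # applies to every Listener unless overridden
      frontendValidation:      # assumed client-certificate validation layout
        caCertificateRefs:
        - kind: ConfigMap
          name: default-ca
    perPort:                   # override for all Listeners on port 8443
    - port: 8443
      tls:
        frontendValidation:
          caCertificateRefs:
          - kind: ConfigMap
            name: stricter-ca
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;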
&lt;p&gt;Here’s a look at the updated API definitions (shown as Go source code):&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-go&#34; data-lang=&#34;go&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;// GatewaySpec defines the desired state of Gateway.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;type&lt;/span&gt; GatewaySpec &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;struct&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#666&#34;&gt;...&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#080;font-style:italic&#34;&gt;// GatewayTLSConfig specifies frontend tls configuration for gateway.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;&lt;/span&gt;    TLS &lt;span style=&#34;color:#666&#34;&gt;*&lt;/span&gt;GatewayTLSConfig &lt;span style=&#34;color:#b44&#34;&gt;`json:&amp;#34;tls,omitempty&amp;#34;`&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;// GatewayTLSConfig specifies frontend tls configuration for gateway.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;type&lt;/span&gt; GatewayTLSConfig &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;struct&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#080;font-style:italic&#34;&gt;// Default specifies the default client certificate validation configuration
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;&lt;/span&gt;    Default TLSConfig &lt;span style=&#34;color:#b44&#34;&gt;`json:&amp;#34;default&amp;#34;`&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#080;font-style:italic&#34;&gt;// PerPort specifies tls configuration assigned per port.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;&lt;/span&gt;    PerPort []TLSPortConfig &lt;span style=&#34;color:#b44&#34;&gt;`json:&amp;#34;perPort,omitempty&amp;#34;`&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;// TLSPortConfig describes a TLS configuration for a specific port.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;type&lt;/span&gt; TLSPortConfig &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;struct&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#080;font-style:italic&#34;&gt;// The Port indicates the Port Number to which the TLS configuration will be applied.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;&lt;/span&gt;    Port PortNumber &lt;span style=&#34;color:#b44&#34;&gt;`json:&amp;#34;port&amp;#34;`&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#080;font-style:italic&#34;&gt;// TLS store the configuration that will be applied to all Listeners handling
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;&lt;/span&gt;    &lt;span style=&#34;color:#080;font-style:italic&#34;&gt;// HTTPS traffic and matching given port.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;&lt;/span&gt;    TLS TLSConfig &lt;span style=&#34;color:#b44&#34;&gt;`json:&amp;#34;tls&amp;#34;`&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;breaking-changes&#34;&gt;Breaking changes&lt;/h2&gt;
&lt;h3 id=&#34;breaking-grpcroute&#34;&gt;Standard GRPCRoute - &lt;code&gt;.spec&lt;/code&gt; field required (technicality)&lt;/h3&gt;
&lt;p&gt;The promotion of GRPCRoute to Standard introduces a minor but technically breaking change: the top-level &lt;code&gt;.spec&lt;/code&gt; field is now required.
As part of achieving Standard status, the Gateway API has tightened the OpenAPI schema validation in the GRPCRoute
CustomResourceDefinition (CRD) to explicitly require the &lt;code&gt;spec&lt;/code&gt; field on all GRPCRoute resources.
This change enforces stricter conformance to Kubernetes object conventions and makes the resource more stable and predictable.
While it is highly unlikely that anyone was defining a GRPCRoute without any specification, any existing automation
or manifests that relied on the previous relaxed validation, which allowed a completely absent &lt;code&gt;spec&lt;/code&gt; field, will now fail validation
and &lt;strong&gt;must&lt;/strong&gt; be updated to include the &lt;code&gt;.spec&lt;/code&gt; field, even if empty.&lt;/p&gt;
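&lt;p&gt;For example, a minimal GRPCRoute that satisfies the stricter validation only needs an explicit (possibly empty) &lt;code&gt;spec&lt;/code&gt;; the resource name here is purely illustrative:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;apiVersion: gateway.networking.k8s.io/v1
kind: GRPCRoute
metadata:
  name: minimal-grpc-route   # hypothetical name
spec: {}                     # required, but may be empty
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;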
&lt;h3 id=&#34;breaking-httproute&#34;&gt;Experimental CORS support in HTTPRoute - breaking change for &lt;code&gt;allowCredentials&lt;/code&gt; field&lt;/h3&gt;
&lt;p&gt;The Gateway API subproject has introduced a breaking change to the Experimental CORS support in HTTPRoute, affecting the &lt;code&gt;allowCredentials&lt;/code&gt; field
within the CORS policy.
The field&#39;s definition is now strictly aligned with the upstream CORS specification, which requires the corresponding
&lt;code&gt;Access-Control-Allow-Credentials&lt;/code&gt; header to represent a Boolean value.
Previously, relaxed schema validation made the implementation overly permissive, potentially accepting non-standard values such as
the string &lt;code&gt;&amp;#34;true&amp;#34;&lt;/code&gt;.
If you configure CORS rules, review your manifests and ensure the value of &lt;code&gt;allowCredentials&lt;/code&gt;
strictly conforms to the new, more restrictive schema.
Any existing HTTPRoute definitions that do not adhere to this stricter validation will now be rejected by the API server
and must be updated to maintain functionality.&lt;/p&gt;
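&lt;p&gt;As a hedged sketch of a conforming manifest (the filter and field names follow the Experimental channel and the route name is hypothetical), &lt;code&gt;allowCredentials&lt;/code&gt; should be a plain YAML Boolean:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: cors-example            # hypothetical name
spec:
  rules:
  - filters:
    - type: CORS                # Experimental channel filter
      cors:
        allowOrigins:
        - https://example.com
        allowCredentials: true  # Boolean true, not the string &amp;#34;true&amp;#34;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;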
&lt;h2 id=&#34;improving-the-development-and-usage-experience&#34;&gt;Improving the development and usage experience&lt;/h2&gt;
&lt;p&gt;As part of this release, we have improved several parts of the developer experience workflow:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Added &lt;a href=&#34;https://github.com/kubernetes-sigs/kube-api-linter&#34;&gt;Kube API Linter&lt;/a&gt; to the CI/CD pipelines, reducing the burden on API reviewers and cutting down on common mistakes.&lt;/li&gt;
&lt;li&gt;Improved the execution time of CRD tests by using &lt;a href=&#34;https://pkg.go.dev/sigs.k8s.io/controller-runtime/pkg/envtest&#34;&gt;&lt;code&gt;envtest&lt;/code&gt;&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Additionally, as part of the effort to improve the Gateway API usage experience, we removed several ambiguities and long-standing tech debt from our documentation website:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The API reference is now explicit when a field is &lt;code&gt;experimental&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The GEP (Gateway API Enhancement Proposal) navigation bar is automatically generated, reflecting the real status of the enhancements.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;try-it-out&#34;&gt;Try it out&lt;/h2&gt;
&lt;p&gt;Unlike other Kubernetes APIs, you don&#39;t need to upgrade to the latest version of
Kubernetes to get the latest version of Gateway API. As long as you&#39;re running
Kubernetes 1.26 or later, you&#39;ll be able to get up and running with this version
of Gateway API.&lt;/p&gt;
&lt;p&gt;To try out the API, follow the &lt;a href=&#34;https://gateway-api.sigs.k8s.io/guides/&#34;&gt;Getting Started Guide&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;As of this writing, seven implementations are already conformant with Gateway API v1.4.0. In alphabetical order:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/kgateway-dev/kgateway/releases/tag/v2.2.0-alpha.1&#34;&gt;Agent Gateway (with kgateway)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/airlock/microgateway/releases/tag/4.8.0-alpha1&#34;&gt;Airlock Microgateway&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/envoyproxy/gateway/releases/tag/v1.6.0-rc.1&#34;&gt;Envoy Gateway&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://docs.cloud.google.com/kubernetes-engine/docs/concepts/gateway-api&#34;&gt;GKE Gateway&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/istio/istio/releases/tag/1.28.0-rc.1&#34;&gt;Istio&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/kgateway-dev/kgateway/releases/tag/v2.1.0&#34;&gt;kgateway&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/traefik/traefik/releases/tag/v3.6.0-rc1&#34;&gt;Traefik Proxy&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;get-involved&#34;&gt;Get involved&lt;/h2&gt;
&lt;p&gt;Wondering when a feature will be added?  There are lots of opportunities to get
involved and help define the future of Kubernetes routing APIs for both ingress
and service mesh.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Check out the &lt;a href=&#34;https://gateway-api.sigs.k8s.io/guides&#34;&gt;user guides&lt;/a&gt; to see what use-cases can be addressed.&lt;/li&gt;
&lt;li&gt;Try out one of the &lt;a href=&#34;https://gateway-api.sigs.k8s.io/implementations/&#34;&gt;existing Gateway controllers&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Or &lt;a href=&#34;https://gateway-api.sigs.k8s.io/contributing/&#34;&gt;join us in the community&lt;/a&gt;
and help us build the future of Gateway API together!&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The maintainers would like to thank &lt;em&gt;everyone&lt;/em&gt; who&#39;s contributed to Gateway
API, whether in the form of commits to the repo, discussion, ideas, or general
support. We could never have made this kind of progress without the support of
this dedicated and active community.&lt;/p&gt;
&lt;h2 id=&#34;related-kubernetes-blog-articles&#34;&gt;Related Kubernetes blog articles&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/06/02/gateway-api-v1-3/&#34;&gt;Gateway API v1.3.0: Advancements in Request Mirroring, CORS, Gateway Merging, and Retry Budgets&lt;/a&gt;
(June 2025)&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2024/11/21/gateway-api-v1-2/&#34;&gt;Gateway API v1.2: WebSockets, Timeouts, Retries, and More&lt;/a&gt;
(November 2024)&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2024/05/09/gateway-api-v1-1/&#34;&gt;Gateway API v1.1: Service mesh, GRPCRoute, and a whole lot more&lt;/a&gt;
(May 2024)&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2023/11/28/gateway-api-ga/&#34;&gt;New Experimental Features in Gateway API v1.0&lt;/a&gt;
(November 2023)&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2023/10/31/gateway-api-ga/&#34;&gt;Gateway API v1.0: GA Release&lt;/a&gt;
(October 2023)&lt;/li&gt;
&lt;/ul&gt;

      </description>
    </item>
    
    <item>
      <title>Kubernetes v1.35: Kubelet Configuration Drop-in Directory Graduates to GA</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/10/23/kubelet-config-drop-in-directory-ga/</link>
      <pubDate>Thu, 23 Oct 2025 00:00:00 +0000</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/10/23/kubelet-config-drop-in-directory-ga/</guid>
      <description>
        
        
        &lt;p&gt;With the recent v1.35 release of Kubernetes, support for a kubelet configuration drop-in directory is generally available.
The newly stable feature simplifies the management of kubelet configuration across large, heterogeneous clusters.&lt;/p&gt;
&lt;p&gt;With v1.35, the kubelet command line argument &lt;code&gt;--config-dir&lt;/code&gt; is production-ready and fully supported,
allowing you to specify a directory containing kubelet configuration drop-in files.
All files in that directory will be automatically merged with your main kubelet configuration.
This allows cluster administrators to maintain a cohesive &lt;em&gt;base configuration&lt;/em&gt; for kubelets while enabling targeted customizations for different node groups or use cases, without complex tooling or manual configuration management.&lt;/p&gt;
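&lt;p&gt;For instance (the paths here are examples, not defaults), a kubelet might be started with a main configuration file plus a drop-in directory:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;# Files in the --config-dir directory are merged over the main config
kubelet --config=/etc/kubernetes/kubelet-config.yaml \
  --config-dir=/etc/kubernetes/kubelet.conf.d
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;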
&lt;h2 id=&#34;the-problem-managing-kubelet-configuration-at-scale&#34;&gt;The problem: managing kubelet configuration at scale&lt;/h2&gt;
&lt;p&gt;As Kubernetes clusters grow larger and more complex, they often include heterogeneous node pools with different hardware capabilities, workload requirements, and operational constraints. This diversity necessitates different kubelet configurations across node groups—yet managing these varied configurations at scale becomes increasingly challenging. Several pain points emerge:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Configuration drift&lt;/strong&gt;: Different nodes may have slightly different configurations, leading to inconsistent behavior&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Node group customization&lt;/strong&gt;: GPU nodes, edge nodes, and standard compute nodes often require different kubelet settings&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Operational overhead&lt;/strong&gt;: Maintaining separate, complete configuration files for each node type is error-prone and difficult to audit&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Change management&lt;/strong&gt;: Rolling out configuration changes across heterogeneous node pools requires careful coordination&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Before this support was added to Kubernetes, cluster administrators had to choose between using a single monolithic configuration file for all nodes,
manually maintaining multiple complete configuration files, or relying on separate tooling. Each approach had its own drawbacks.
This graduation to stable gives cluster administrators a fully supported fourth way to solve that challenge.&lt;/p&gt;
&lt;h2 id=&#34;example-use-cases&#34;&gt;Example use cases&lt;/h2&gt;
&lt;h3 id=&#34;managing-heterogeneous-node-pools&#34;&gt;Managing heterogeneous node pools&lt;/h3&gt;
&lt;p&gt;Consider a cluster with multiple node types: standard compute nodes, high-capacity nodes (such as those with GPUs or large amounts of memory), and edge nodes with specialized requirements.&lt;/p&gt;
&lt;h4 id=&#34;base-configuration&#34;&gt;Base configuration&lt;/h4&gt;
&lt;p&gt;File: &lt;code&gt;00-base.conf&lt;/code&gt;&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;kubelet.config.k8s.io/v1beta1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;KubeletConfiguration&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;clusterDNS&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;10.96.0.10&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;clusterDomain&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;cluster.local&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h4 id=&#34;high-capacity-node-override&#34;&gt;High-capacity node override&lt;/h4&gt;
&lt;p&gt;File: &lt;code&gt;50-high-capacity-nodes.conf&lt;/code&gt;&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;kubelet.config.k8s.io/v1beta1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;KubeletConfiguration&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;maxPods&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;50&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;systemReserved&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;memory&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;4Gi&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;cpu&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;1000m&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h4 id=&#34;edge-node-override&#34;&gt;Edge node override&lt;/h4&gt;
&lt;p&gt;File: &lt;code&gt;50-edge-nodes.conf&lt;/code&gt; (edge compute typically has lower capacity)&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;kubelet.config.k8s.io/v1beta1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;KubeletConfiguration&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;evictionHard&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;memory.available&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;500Mi&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;nodefs.available&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;5%&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;With this structure, high-capacity nodes apply both the base configuration and the capacity-specific overrides, while edge nodes apply the base configuration with edge-specific settings.&lt;/p&gt;
&lt;h3 id=&#34;gradual-configuration-rollouts&#34;&gt;Gradual configuration rollouts&lt;/h3&gt;
&lt;p&gt;When rolling out configuration changes, you can:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Add a new drop-in file with a high numeric prefix (e.g., &lt;code&gt;99-new-feature.conf&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Test the changes on a subset of nodes&lt;/li&gt;
&lt;li&gt;Gradually roll out to more nodes&lt;/li&gt;
&lt;li&gt;Once stable, merge changes into the base configuration&lt;/li&gt;
&lt;/ol&gt;
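&lt;p&gt;Step 1 might look like the following drop-in (the setting and its value are illustrative; because of its &lt;code&gt;99-&lt;/code&gt; prefix it merges last and overrides earlier files):&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;# File: 99-new-feature.conf
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
serializeImagePulls: false   # example setting being trialed
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;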
&lt;h2 id=&#34;viewing-the-merged-configuration&#34;&gt;Viewing the merged configuration&lt;/h2&gt;
&lt;p&gt;Since configuration is now spread across multiple files, you can inspect the final merged configuration using the kubelet&#39;s &lt;code&gt;/configz&lt;/code&gt; endpoint:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# Start kubectl proxy&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl proxy
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# In another terminal, fetch the merged configuration&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# Change the &amp;#39;&amp;lt;node-name&amp;gt;&amp;#39; placeholder before running the curl command&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;curl -X GET http://127.0.0.1:8001/api/v1/nodes/&amp;lt;node-name&amp;gt;/proxy/configz | jq .
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This shows the actual configuration the kubelet is using after all merging has been applied.
The merged configuration also includes any configuration settings that were specified via kubelet command-line arguments.&lt;/p&gt;
&lt;p&gt;For detailed setup instructions, configuration examples, and merging behavior, see the official documentation:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/tasks/administer-cluster/kubelet-config-file/#kubelet-conf-d&#34;&gt;Set Kubelet Parameters Via A Configuration File&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/reference/node/kubelet-config-directory-merging/&#34;&gt;Kubelet Configuration Directory Merging&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;good-practices&#34;&gt;Good practices&lt;/h2&gt;
&lt;p&gt;When using the kubelet configuration drop-in directory:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Test configurations incrementally&lt;/strong&gt;: Always test new drop-in configurations on a subset of nodes before rolling out cluster-wide to minimize risk&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Version control your drop-ins&lt;/strong&gt;: Store your drop-in configuration files in version control (or the configuration source from which these are generated) alongside your infrastructure as code to track changes and enable easy rollbacks&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Use numeric prefixes for predictable ordering&lt;/strong&gt;: Name files with numeric prefixes (e.g., &lt;code&gt;00-&lt;/code&gt;, &lt;code&gt;50-&lt;/code&gt;, &lt;code&gt;90-&lt;/code&gt;) to explicitly control merge order and make the configuration layering obvious to other administrators&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Be mindful of temporary files&lt;/strong&gt;: Some text editors automatically create backup files (such as &lt;code&gt;.bak&lt;/code&gt;, &lt;code&gt;.swp&lt;/code&gt;, or files with &lt;code&gt;~&lt;/code&gt; suffix) in the same directory when editing. Ensure these temporary or backup files are not left in the configuration directory, as they may be processed by the kubelet&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;acknowledgments&#34;&gt;Acknowledgments&lt;/h2&gt;
&lt;p&gt;This feature was developed through the collaborative efforts of &lt;a href=&#34;https://github.com/kubernetes/community/tree/master/sig-node&#34;&gt;SIG Node&lt;/a&gt;. Special thanks to all contributors who helped design, implement, test, and document this feature across its journey from alpha in v1.28, through beta in v1.30, to GA in v1.35.&lt;/p&gt;
&lt;p&gt;To provide feedback on this feature, join the &lt;a href=&#34;https://github.com/kubernetes/community/tree/master/sig-node&#34;&gt;Kubernetes Node Special Interest Group&lt;/a&gt;, participate in discussions on the &lt;a href=&#34;http://slack.k8s.io/&#34;&gt;public Slack channel&lt;/a&gt; (#sig-node), or file an issue on &lt;a href=&#34;https://github.com/kubernetes/kubernetes/issues&#34;&gt;GitHub&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;get-involved&#34;&gt;Get involved&lt;/h2&gt;
&lt;p&gt;If you have feedback or questions about kubelet configuration management, or want to share your experience using this feature, join the discussion:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/kubernetes/community/tree/master/sig-node&#34;&gt;SIG Node community page&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;http://slack.k8s.io/&#34;&gt;Kubernetes Slack&lt;/a&gt; in the #sig-node channel&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://groups.google.com/forum/#!forum/kubernetes-sig-node&#34;&gt;SIG Node mailing list&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;SIG Node would love to hear about your experiences using this feature in production!&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>7 Common Kubernetes Pitfalls (and How I Learned to Avoid Them)</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/10/20/seven-kubernetes-pitfalls-and-how-to-avoid/</link>
      <pubDate>Mon, 20 Oct 2025 08:30:00 -0700</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/10/20/seven-kubernetes-pitfalls-and-how-to-avoid/</guid>
      <description>
        
        
&lt;p&gt;It’s no secret that Kubernetes can be both powerful and frustrating at times. When I first started dabbling with container orchestration, I made more than my fair share of mistakes, enough to compile a whole list of pitfalls. In this post, I want to walk through seven big gotchas I’ve encountered (or seen others run into) and share some tips on how to avoid them. Whether you’re just kicking the tires on Kubernetes or already managing production clusters, I hope these insights help you steer clear of these traps and spare yourself a little extra stress.&lt;/p&gt;
&lt;h2 id=&#34;1-skipping-resource-requests-and-limits&#34;&gt;1. Skipping resource requests and limits&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;The pitfall&lt;/strong&gt;: Not specifying CPU and memory requirements in Pod specifications. This typically happens because Kubernetes does not require these fields, and workloads can often start and run without them—making the omission easy to overlook in early configurations or during rapid deployment cycles.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Context&lt;/strong&gt;:
In Kubernetes, resource requests and limits are critical for efficient cluster management. Resource requests ensure that the scheduler reserves the appropriate amount of CPU and memory for each pod, guaranteeing that it has the necessary resources to operate. Resource limits cap the amount of CPU and memory a pod can use, preventing any single pod from consuming excessive resources and potentially starving other pods.
When resource requests and limits are not set:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Resource Starvation: Pods may get insufficient resources, leading to degraded performance or failures. This is because Kubernetes schedules pods based on these requests. Without them, the scheduler might place too many pods on a single node, leading to resource contention and performance bottlenecks.&lt;/li&gt;
&lt;li&gt;Resource Hoarding: Conversely, without limits, a pod might consume more than its fair share of resources, impacting the performance and stability of other pods on the same node. This can lead to issues such as other pods getting evicted or killed by the Out-Of-Memory (OOM) killer due to lack of available memory.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id=&#34;how-to-avoid-it&#34;&gt;How to avoid it:&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Start with modest &lt;code&gt;requests&lt;/code&gt; (for example &lt;code&gt;100m&lt;/code&gt; CPU, &lt;code&gt;128Mi&lt;/code&gt; memory) and see how your app behaves.&lt;/li&gt;
&lt;li&gt;Monitor real-world usage and refine your values; the &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/tasks/run-application/horizontal-pod-autoscale/&#34;&gt;HorizontalPodAutoscaler&lt;/a&gt; can help automate scaling based on metrics.&lt;/li&gt;
&lt;li&gt;Keep an eye on &lt;code&gt;kubectl top pods&lt;/code&gt; or your logging/monitoring tool to confirm you’re not over- or under-provisioning.&lt;/li&gt;
&lt;/ul&gt;
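&lt;p&gt;As a concrete sketch, the modest starting values above might look like this in a Pod manifest (the name and image are placeholders, and the right numbers depend on your workload):&lt;/p&gt;

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo-app            # placeholder name
spec:
  containers:
  - name: web
    image: nginx:1.27       # pin a specific tag rather than :latest
    resources:
      requests:             # what the scheduler reserves for this Pod
        cpu: 100m
        memory: 128Mi
      limits:               # hard cap; exceeding the memory limit means OOMKill
        memory: 256Mi
```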
&lt;p&gt;&lt;strong&gt;My reality check&lt;/strong&gt;: Early on, I never thought about memory limits. Things seemed fine on my local cluster. Then, on a larger environment, Pods got &lt;em&gt;OOMKilled&lt;/em&gt; left and right. Lesson learned.
For detailed instructions on configuring resource requests and limits for your containers, please refer to &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/tasks/configure-pod-container/assign-memory-resource/&#34;&gt;Assign Memory Resources to Containers and Pods&lt;/a&gt;
(part of the official Kubernetes documentation).&lt;/p&gt;
&lt;h2 id=&#34;2-underestimating-liveness-and-readiness-probes&#34;&gt;2. Underestimating liveness and readiness probes&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;The pitfall&lt;/strong&gt;: Deploying containers without explicitly defining how Kubernetes should check their health or readiness. This tends to happen because Kubernetes will consider a container “running” as long as the process inside hasn’t exited. Without additional signals, Kubernetes assumes the workload is functioning—even if the application inside is unresponsive, initializing, or stuck.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Context&lt;/strong&gt;:&lt;br&gt;
Liveness, readiness, and startup probes are mechanisms Kubernetes uses to monitor container health and availability.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Liveness probes&lt;/strong&gt; determine if the application is still alive. If a liveness check fails, the container is restarted.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Readiness probes&lt;/strong&gt; control whether a container is ready to serve traffic. Until the readiness probe passes, the container is removed from Service endpoints.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Startup probes&lt;/strong&gt; help distinguish between long startup times and actual failures.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&#34;how-to-avoid-it-1&#34;&gt;How to avoid it:&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Add a simple HTTP &lt;code&gt;livenessProbe&lt;/code&gt; to check a health endpoint (for example &lt;code&gt;/healthz&lt;/code&gt;) so Kubernetes can restart a hung container.&lt;/li&gt;
&lt;li&gt;Use a &lt;code&gt;readinessProbe&lt;/code&gt; to ensure traffic doesn’t reach your app until it’s warmed up.&lt;/li&gt;
&lt;li&gt;Keep probes simple. Overly complex checks can create false alarms and unnecessary restarts.&lt;/li&gt;
&lt;/ul&gt;
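&lt;p&gt;A minimal sketch of both probes, assuming the app listens on port 8080 and exposes &lt;code&gt;/healthz&lt;/code&gt; and &lt;code&gt;/ready&lt;/code&gt; endpoints (adjust paths, ports, and timings for your application):&lt;/p&gt;

```yaml
# Fragment of a Pod spec; the image and endpoints are hypothetical
containers:
- name: web
  image: registry.example/web:1.0
  ports:
  - containerPort: 8080
  livenessProbe:            # restart the container if this starts failing
    httpGet:
      path: /healthz
      port: 8080
    initialDelaySeconds: 10
    periodSeconds: 10
  readinessProbe:           # keep the Pod out of Service endpoints until ready
    httpGet:
      path: /ready
      port: 8080
    periodSeconds: 5
```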
&lt;p&gt;&lt;strong&gt;My reality check&lt;/strong&gt;: I once forgot a readiness probe for a web service that took a while to load. Users hit it prematurely, got weird timeouts, and I spent hours scratching my head. A 3-line readiness probe would have saved the day.&lt;/p&gt;
&lt;p&gt;For comprehensive instructions on configuring liveness, readiness, and startup probes for containers, please refer to &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/&#34;&gt;Configure Liveness, Readiness and Startup Probes&lt;/a&gt;
in the official Kubernetes documentation.&lt;/p&gt;
&lt;h2 id=&#34;3-we-ll-just-look-at-container-logs-famous-last-words&#34;&gt;3. “We’ll just look at container logs” (famous last words)&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;The pitfall&lt;/strong&gt;: Relying solely on container logs retrieved via &lt;code&gt;kubectl logs&lt;/code&gt;. This often happens because the command is quick and convenient, and in many setups, logs appear accessible during development or early troubleshooting. However, &lt;code&gt;kubectl logs&lt;/code&gt; only retrieves logs from currently running or recently terminated containers, and those logs are stored on the node’s local disk. As soon as the container is deleted, evicted, or the node is restarted, the log files may be rotated out or permanently lost.&lt;/p&gt;
&lt;h3 id=&#34;how-to-avoid-it-2&#34;&gt;How to avoid it:&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Centralize logs&lt;/strong&gt; using CNCF tools like &lt;a href=&#34;https://kubernetes.io/docs/concepts/cluster-administration/logging/#sidecar-container-with-a-logging-agent&#34;&gt;Fluentd&lt;/a&gt; or &lt;a href=&#34;https://fluentbit.io/&#34;&gt;Fluent Bit&lt;/a&gt; to aggregate output from all Pods.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Adopt OpenTelemetry&lt;/strong&gt; for a unified view of logs, metrics, and (if needed) traces. This lets you spot correlations between infrastructure events and app-level behavior.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Pair logs with Prometheus metrics&lt;/strong&gt; to track cluster-level data alongside application logs. If you need distributed tracing, consider CNCF projects like &lt;a href=&#34;https://www.jaegertracing.io/&#34;&gt;Jaeger&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;My reality check&lt;/strong&gt;: The first time I lost Pod logs to a quick restart, I realized how flimsy “kubectl logs” can be on its own. Since then, I’ve set up a proper pipeline for every cluster to avoid missing vital clues.&lt;/p&gt;
&lt;h2 id=&#34;4-treating-dev-and-prod-exactly-the-same&#34;&gt;4. Treating dev and prod exactly the same&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;The pitfall&lt;/strong&gt;: Deploying the same Kubernetes manifests with identical settings across development, staging, and production environments. This often occurs when teams aim for consistency and reuse, but overlook that environment-specific factors—such as traffic patterns, resource availability, scaling needs, or access control—can differ significantly. Without customization, configurations optimized for one environment may cause instability, poor performance, or security gaps in another.&lt;/p&gt;
&lt;h3 id=&#34;how-to-avoid-it-3&#34;&gt;How to avoid it:&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Use environment overlays or &lt;a href=&#34;https://kustomize.io/&#34;&gt;kustomize&lt;/a&gt; to maintain a shared base while customizing resource requests, replicas, or config for each environment.&lt;/li&gt;
&lt;li&gt;Extract environment-specific configuration into ConfigMaps and/or Secrets. You can use a specialized tool such as &lt;a href=&#34;https://github.com/bitnami-labs/sealed-secrets&#34;&gt;Sealed Secrets&lt;/a&gt; to manage confidential data.&lt;/li&gt;
&lt;li&gt;Plan for scale in production. Your dev cluster can probably get away with minimal CPU/memory, but prod might need significantly more.&lt;/li&gt;
&lt;/ul&gt;
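&lt;p&gt;For instance, a kustomize production overlay can reuse a shared base while overriding only what differs. The directory layout and values here are hypothetical:&lt;/p&gt;

```yaml
# overlays/prod/kustomization.yaml (hypothetical layout)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../../base                 # shared manifests used by every environment
patches:
- path: replica-count.yaml   # e.g. raise replicas for production traffic
configMapGenerator:
- name: app-config
  literals:
  - LOG_LEVEL=warn           # prod-specific setting
```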
&lt;p&gt;&lt;strong&gt;My reality check&lt;/strong&gt;: One time, I scaled up &lt;code&gt;replicaCount&lt;/code&gt; from 2 to 10 in a tiny dev environment just to “test.” I promptly ran out of resources and spent half a day cleaning up the aftermath. Oops.&lt;/p&gt;
&lt;h2 id=&#34;5-leaving-old-stuff-floating-around&#34;&gt;5. Leaving old stuff floating around&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;The pitfall&lt;/strong&gt;: Leaving unused or outdated resources—such as Deployments, Services, ConfigMaps, or PersistentVolumeClaims—running in the cluster. This often happens because Kubernetes does not automatically remove resources unless explicitly instructed, and there is no built-in mechanism to track ownership or expiration. Over time, these forgotten objects can accumulate, consuming cluster resources, increasing cloud costs, and creating operational confusion, especially when stale Services or LoadBalancers continue to route traffic.&lt;/p&gt;
&lt;h3 id=&#34;how-to-avoid-it-4&#34;&gt;How to avoid it:&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Label everything&lt;/strong&gt; with a purpose or owner label. That way, you can easily query resources you no longer need.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Regularly audit&lt;/strong&gt; your cluster: run &lt;code&gt;kubectl get all -n &amp;lt;namespace&amp;gt;&lt;/code&gt; to see what’s actually running, and confirm it’s all legit.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Adopt Kubernetes’ Garbage Collection&lt;/strong&gt;: &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/workloads/controllers/garbage-collection/&#34;&gt;K8s docs&lt;/a&gt; show how to remove dependent objects automatically.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Leverage policy automation&lt;/strong&gt;: Tools like &lt;a href=&#34;https://kyverno.io/&#34;&gt;Kyverno&lt;/a&gt; can automatically delete or block stale resources after a certain period, or enforce lifecycle policies so you don’t have to remember every single cleanup step.&lt;/li&gt;
&lt;/ul&gt;
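&lt;p&gt;As a sketch of the labeling idea, the keys and values below are just one possible convention, not a Kubernetes standard:&lt;/p&gt;

```yaml
apiVersion: v1
kind: Service
metadata:
  name: test-svc
  labels:
    owner: team-payments      # who to ask before deleting it
    purpose: hackathon-demo   # why it exists
    expires: "2025-11-01"     # label values must be strings
```

&lt;p&gt;With labels like these in place, a selector query such as &lt;code&gt;kubectl get all -l purpose=hackathon-demo --all-namespaces&lt;/code&gt; makes the leftovers easy to find and delete.&lt;/p&gt;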
&lt;p&gt;&lt;strong&gt;My reality check&lt;/strong&gt;: After a hackathon, I forgot to tear down a “test-svc” pinned to an external load balancer. Three weeks later, I realized I’d been paying for that load balancer the entire time. Facepalm.&lt;/p&gt;
&lt;h2 id=&#34;6-diving-too-deep-into-networking-too-soon&#34;&gt;6. Diving too deep into networking too soon&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;The pitfall&lt;/strong&gt;: Introducing advanced networking solutions—such as service meshes, custom CNI plugins, or multi-cluster communication—before fully understanding Kubernetes&#39; native networking primitives. This commonly occurs when teams implement features like traffic routing, observability, or mTLS using external tools without first mastering how core Kubernetes networking works: including Pod-to-Pod communication, ClusterIP Services, DNS resolution, and basic ingress traffic handling. As a result, network-related issues become harder to troubleshoot, especially when overlays introduce additional abstractions and failure points.&lt;/p&gt;
&lt;h3 id=&#34;how-to-avoid-it-5&#34;&gt;How to avoid it:&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Start small: a Deployment, a Service, and a basic ingress controller such as one based on NGINX (e.g., Ingress-NGINX).&lt;/li&gt;
&lt;li&gt;Make sure you understand how traffic flows within the cluster, how service discovery works, and how DNS is configured.&lt;/li&gt;
&lt;li&gt;Only move to a full-blown mesh or advanced CNI features when you actually need them; complex networking adds overhead.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;My reality check&lt;/strong&gt;: I tried Istio on a small internal app once, then spent more time debugging Istio itself than the actual app. Eventually, I stepped back, removed Istio, and everything worked fine.&lt;/p&gt;
&lt;h2 id=&#34;7-going-too-light-on-security-and-rbac&#34;&gt;7. Going too light on security and RBAC&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;The pitfall&lt;/strong&gt;: Deploying workloads with insecure configurations, such as running containers as the root user, using the &lt;code&gt;latest&lt;/code&gt; image tag, disabling security contexts, or assigning overly broad RBAC roles like &lt;code&gt;cluster-admin&lt;/code&gt;. These practices persist because Kubernetes does not enforce strict security defaults out of the box, and the platform is designed to be flexible rather than opinionated. Without explicit security policies in place, clusters can remain exposed to risks like container escape, unauthorized privilege escalation, or accidental production changes due to unpinned images.&lt;/p&gt;
&lt;h3 id=&#34;how-to-avoid-it-6&#34;&gt;How to avoid it:&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Use &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/reference/access-authn-authz/rbac/&#34;&gt;RBAC&lt;/a&gt; to define roles and permissions within Kubernetes. While RBAC is the default and most widely supported authorization mechanism, Kubernetes also allows the use of alternative authorizers. For more advanced or external policy needs, consider solutions like &lt;a href=&#34;https://open-policy-agent.github.io/gatekeeper/&#34;&gt;OPA Gatekeeper&lt;/a&gt; (based on Rego), &lt;a href=&#34;https://kyverno.io/&#34;&gt;Kyverno&lt;/a&gt;, or custom webhooks using policy languages such as CEL or &lt;a href=&#34;https://cedarpolicy.com/&#34;&gt;Cedar&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Pin images to specific versions (no more &lt;code&gt;:latest&lt;/code&gt;!). This helps you know what’s actually deployed.&lt;/li&gt;
&lt;li&gt;Look into &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/security/pod-security-admission/&#34;&gt;Pod Security Admission&lt;/a&gt; (or other solutions like Kyverno) to enforce non-root containers, read-only filesystems, etc.&lt;/li&gt;
&lt;/ul&gt;
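&lt;p&gt;As a sketch of least-privilege RBAC, the following grants a hypothetical service account read-only access to Pods in a single namespace, instead of handing out &lt;code&gt;cluster-admin&lt;/code&gt;:&lt;/p&gt;

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: demo            # placeholder namespace
  name: pod-reader
rules:
- apiGroups: [""]            # "" means the core API group
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: demo
  name: read-pods
subjects:
- kind: ServiceAccount
  name: app-sa               # placeholder service account
  namespace: demo
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```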
&lt;p&gt;&lt;strong&gt;My reality check&lt;/strong&gt;: I never had a huge security breach, but I’ve heard plenty of cautionary tales. If you don’t tighten things up, it’s only a matter of time before something goes wrong.&lt;/p&gt;
&lt;h2 id=&#34;final-thoughts&#34;&gt;Final thoughts&lt;/h2&gt;
&lt;p&gt;Kubernetes is amazing, but it’s not psychic; it won’t magically do the right thing if you don’t tell it what you need. By keeping these pitfalls in mind, you’ll avoid a lot of headaches and wasted time. Mistakes happen (trust me, I’ve made my share), but each one is a chance to learn more about how Kubernetes truly works under the hood.
If you’re curious to dive deeper, the &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/home/&#34;&gt;official docs&lt;/a&gt; and the &lt;a href=&#34;http://slack.kubernetes.io/&#34;&gt;community Slack&lt;/a&gt; are excellent next steps. And of course, feel free to share your own horror stories or success tips, because at the end of the day, we’re all in this cloud native adventure together.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Happy Shipping!&lt;/strong&gt;&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Spotlight on Policy Working Group</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/10/18/wg-policy-spotlight-2025/</link>
      <pubDate>Sat, 18 Oct 2025 00:00:00 +0000</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/10/18/wg-policy-spotlight-2025/</guid>
      <description>
        
        
        &lt;p&gt;&lt;em&gt;(Note: The Policy Working Group has completed its mission and is no longer active. This article reflects its work, accomplishments, and insights into how a working group operates.)&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;In the complex world of Kubernetes, policies play a crucial role in managing and securing clusters. But have you ever wondered how these policies are developed, implemented, and standardized across the Kubernetes ecosystem? To answer that, let&#39;s take a look back at the work of the Policy Working Group.&lt;/p&gt;
&lt;p&gt;The Policy Working Group was dedicated to a critical mission: providing an overall architecture that encompasses both current policy-related implementations and future policy proposals in Kubernetes. Their goal was both ambitious and essential: to develop a universal policy architecture that benefits developers and end-users alike.&lt;/p&gt;
&lt;p&gt;Through collaborative methods, this working group strove to bring clarity and consistency to the often complex world of Kubernetes policies. By focusing on both existing implementations and future proposals, they ensured that the policy landscape in Kubernetes remains coherent and accessible as the technology evolves.&lt;/p&gt;
&lt;p&gt;This blog post dives deeper into the work of the Policy Working Group, guided by insights from its former co-chairs:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://twitter.com/JimBugwadia&#34;&gt;Jim Bugwadia&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://twitter.com/poonam_lamba&#34;&gt;Poonam Lamba&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://twitter.com/sudermanjr&#34;&gt;Andy Suderman&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;em&gt;Interviewed by &lt;a href=&#34;https://twitter.com/arujjval&#34;&gt;Arujjwal Negi&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;These co-chairs explained what the Policy Working Group was all about.&lt;/p&gt;
&lt;h2 id=&#34;introduction&#34;&gt;Introduction&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Hello, thank you for the time! Let’s start with some introductions, could you tell us a bit about yourself, your role, and how you got involved in Kubernetes?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Jim Bugwadia&lt;/strong&gt;: My name is Jim Bugwadia, and I am a co-founder and the CEO at Nirmata, which provides solutions that automate security and compliance for cloud-native workloads. At Nirmata, we have been working with Kubernetes since it started in 2014. We initially built a Kubernetes policy engine in our commercial platform and later donated it to CNCF as the Kyverno project. I joined the CNCF Kubernetes Policy Working Group to help build and standardize various aspects of policy management for Kubernetes and later became a co-chair.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Andy Suderman&lt;/strong&gt;: My name is Andy Suderman and I am the CTO of Fairwinds, a managed Kubernetes-as-a-Service provider. I began working with Kubernetes in 2016, building a web conferencing platform. I am an author and/or maintainer of several Kubernetes-related open-source projects such as Goldilocks, Pluto, and Polaris. Polaris is a JSON-schema-based policy engine, which started Fairwinds&#39; journey into the policy space and my involvement in the Policy Working Group.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Poonam Lamba&lt;/strong&gt;: My name is Poonam Lamba, and I currently work as a Product Manager for Google Kubernetes Engine (GKE) at Google. My journey with Kubernetes began back in 2017 when I was building an SRE platform for a large enterprise, using a private cloud built on Kubernetes. Intrigued by its potential to revolutionize the way we deployed and managed applications at the time, I dove headfirst into learning everything I could about it. Since then, I&#39;ve had the opportunity to build the policy and compliance products for GKE. I lead and contribute to GKE CIS benchmarks. I am involved with the Gatekeeper project, and I contributed to the Policy WG for over two years, serving as a co-chair for the group.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Responses to the following questions represent an amalgamation of insights from the former co-chairs.&lt;/em&gt;&lt;/p&gt;
&lt;h2 id=&#34;about-working-groups&#34;&gt;About Working Groups&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;One thing even I am not aware of is the difference between a working group and a SIG. Can you help us understand what a working group is and how it is different from a SIG?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Unlike SIGs, working groups are temporary and focused on tackling specific, cross-cutting issues or projects that may involve multiple SIGs. Their lifespan is defined, and they disband once they&#39;ve achieved their objective. Generally, working groups don&#39;t own code or have long-term responsibility for managing a particular area of the Kubernetes project.&lt;/p&gt;
&lt;p&gt;(To know more about SIGs, visit the &lt;a href=&#34;https://github.com/kubernetes/community/blob/master/sig-list.md&#34;&gt;list of Special Interest Groups&lt;/a&gt;)&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;You mentioned that Working Groups involve multiple SIGs. What SIGs was the Policy WG closely involved with, and how did you coordinate with them?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The group collaborated closely with Kubernetes SIG Auth throughout its existence and, more recently, also worked with SIG Security following that SIG&#39;s formation. Our collaboration occurred in a few ways. We provided periodic updates during the SIG meetings to keep them informed of our progress and activities. Additionally, we used other community forums to maintain open lines of communication and ensure our work aligned with the broader Kubernetes ecosystem. This collaborative approach helped the group stay coordinated with related efforts across the Kubernetes community.&lt;/p&gt;
&lt;h2 id=&#34;policy-wg&#34;&gt;Policy WG&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Why was the Policy Working Group created?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;We recognized that, to enable a broad set of use cases, Kubernetes is powered by a highly declarative, fine-grained, and extensible configuration management system. We&#39;ve observed that a Kubernetes configuration manifest may have different portions that are important to various stakeholders. For example, some parts may be crucial for developers, while others might be of particular interest to security teams or address operational concerns. Given this complexity, we believe that policies governing the usage of these intricate configurations are essential for success with Kubernetes.&lt;/p&gt;
&lt;p&gt;Our Policy Working Group was created specifically to research the standardization of policy definitions and related artifacts. We saw a need to bring consistency and clarity to how policies are defined and implemented across the Kubernetes ecosystem, given the diverse requirements and stakeholders involved in Kubernetes deployments.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Can you give me an idea of the work you did in the group?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;We worked on several Kubernetes policy-related projects. Our initiatives included:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;We worked on a Kubernetes Enhancement Proposal (KEP) for the Kubernetes Policy Reports API. This aims to standardize how policy reports are generated and consumed within the Kubernetes ecosystem.&lt;/li&gt;
&lt;li&gt;We conducted a CNCF survey to better understand policy usage in the Kubernetes space. This helped gauge the practices and needs across the community at the time.&lt;/li&gt;
&lt;li&gt;We wrote a paper that will guide users in achieving PCI-DSS compliance for containers. This is intended to help organizations meet important security standards in their Kubernetes environments.&lt;/li&gt;
&lt;li&gt;We also worked on a paper highlighting how shifting security down can benefit organizations. This focuses on the advantages of implementing security measures earlier in the development and deployment process.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Can you tell us what were the main objectives of the Policy Working Group and some of your key accomplishments?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The charter of the Policy WG was to help standardize policy management for Kubernetes and educate the community on best practices.&lt;/p&gt;
&lt;p&gt;To accomplish this we updated the Kubernetes documentation (&lt;a href=&#34;https://kubernetes.io/docs/concepts/policy&#34;&gt;Policies | Kubernetes&lt;/a&gt;), produced several whitepapers (&lt;a href=&#34;https://github.com/kubernetes/sig-security/blob/main/sig-security-docs/papers/policy/CNCF_Kubernetes_Policy_Management_WhitePaper_v1.pdf&#34;&gt;Kubernetes Policy Management&lt;/a&gt;, &lt;a href=&#34;https://github.com/kubernetes/sig-security/blob/main/sig-security-docs/papers/policy_grc/Kubernetes_Policy_WG_Paper_v1_101123.pdf&#34;&gt;Kubernetes GRC&lt;/a&gt;), and created the Policy Reports API (&lt;a href=&#34;https://github.com/kubernetes-retired/wg-policy-prototypes/blob/master/policy-report/docs/api-docs.md&#34;&gt;API reference&lt;/a&gt;) which standardizes reporting across various tools. Several popular tools such as Falco, Trivy, Kyverno, kube-bench, and others support the Policy Report API. A major milestone for the Policy WG was promoting the Policy Reports API to a SIG-level API or finding it a stable home.&lt;/p&gt;
&lt;p&gt;Beyond that, as &lt;a href=&#34;https://kubernetes.io/docs/reference/access-authn-authz/validating-admission-policy/&#34;&gt;ValidatingAdmissionPolicy&lt;/a&gt; and &lt;a href=&#34;https://kubernetes.io/docs/reference/access-authn-authz/mutating-admission-policy/&#34;&gt;MutatingAdmissionPolicy&lt;/a&gt; approached GA in Kubernetes, a key goal of the WG was to guide and educate the community on the tradeoffs and appropriate usage patterns for these built-in API objects and other CNCF policy management solutions like OPA/Gatekeeper and Kyverno.&lt;/p&gt;
&lt;h2 id=&#34;challenges&#34;&gt;Challenges&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;What were some of the major challenges that the Policy Working Group worked on?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;During our work in the Policy Working Group, we encountered several challenges:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;One of the main issues we faced was finding time to consistently contribute. Given that many of us have other professional commitments, it can be difficult to dedicate regular time to the working group&#39;s initiatives.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Another challenge we experienced was related to our consensus-driven model. While this approach ensures that all voices are heard, it can sometimes lead to slower decision-making processes. We valued thorough discussion and agreement, but this can occasionally delay progress on our projects.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;We&#39;ve also encountered occasional differences of opinion among group members. These situations require careful navigation to ensure that we maintain a collaborative and productive environment while addressing diverse viewpoints.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Lastly, we&#39;ve noticed that newcomers to the group may find it difficult to contribute effectively without consistent attendance at our meetings. The complex nature of our work often requires ongoing context, which can be challenging for those who aren&#39;t able to participate regularly.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Can you tell me more about those challenges? How did you discover each one? What has the impact been? What were some strategies you used to address them?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;There are no easy answers, but having more contributors and maintainers greatly helps! Overall the CNCF community is great to work with and is very welcoming to beginners. So, if folks out there are hesitating to get involved, I highly encourage them to attend a WG or SIG meeting and just listen in.&lt;/p&gt;
&lt;p&gt;It often takes a few meetings to fully understand the discussions, so don&#39;t feel discouraged if you don&#39;t grasp everything right away. We made a point to emphasize this and encouraged new members to review documentation as a starting point for getting involved.&lt;/p&gt;
&lt;p&gt;Additionally, differences of opinion were valued and encouraged within the Policy-WG. We adhered to the CNCF core values and resolved disagreements by maintaining respect for one another. We also strove to timebox our decisions and assign clear responsibilities to keep things moving forward.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;This is where our discussion about the Policy Working Group ends. The working group, and especially the people who took part in this article, hope this gave you some insights into the group&#39;s aims and workings. You can get more info about Working Groups &lt;a href=&#34;https://github.com/kubernetes/community/blob/master/committee-steering/governance/wg-governance.md&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Introducing Headlamp Plugin for Karpenter - Scaling and Visibility</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/10/06/introducing-headlamp-plugin-for-karpenter/</link>
      <pubDate>Mon, 06 Oct 2025 00:00:00 +0000</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/10/06/introducing-headlamp-plugin-for-karpenter/</guid>
      <description>
        
        
&lt;p&gt;Headlamp is an open-source, extensible Kubernetes UI from the Kubernetes SIG UI project, designed to let you explore, manage, and debug cluster resources.&lt;/p&gt;
&lt;p&gt;Karpenter is a node provisioning project from the Kubernetes Autoscaling SIG that helps clusters scale quickly and efficiently. It launches new nodes in seconds, selects appropriate instance types for workloads, and manages the full node lifecycle, including scale-down.&lt;/p&gt;
&lt;p&gt;The new Headlamp Karpenter Plugin adds real-time visibility into Karpenter’s activity directly from the Headlamp UI. It shows how Karpenter resources relate to Kubernetes objects, displays live metrics, and surfaces scaling events as they happen. You can inspect pending pods during provisioning, review scaling decisions, and edit Karpenter-managed resources with built-in validation. The plugin was created as part of an LFX mentorship project.&lt;/p&gt;
&lt;p&gt;The Karpenter plugin for Headlamp aims to make it easier for Kubernetes users and operators to understand, debug, and fine-tune autoscaling behavior in their clusters. The rest of this post is a brief tour of the plugin.&lt;/p&gt;
&lt;h2 id=&#34;map-view-of-karpenter-resources-and-how-they-relate-to-kubernetes-resources&#34;&gt;Map view of Karpenter Resources and how they relate to Kubernetes resources&lt;/h2&gt;
&lt;p&gt;Easily see how Karpenter resources like NodeClasses, NodePools, and NodeClaims connect with core Kubernetes resources such as Pods and Nodes.&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;Map view showing relationships between resources&#34; src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/10/06/introducing-headlamp-plugin-for-karpenter/mini-map-view.png&#34;&gt;&lt;/p&gt;
&lt;h2 id=&#34;visualization-of-karpenter-metrics&#34;&gt;Visualization of Karpenter Metrics&lt;/h2&gt;
&lt;p&gt;Get instant insights into resource usage versus limits, allowed disruptions, pending Pods, provisioning latency, and more.&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;NodePool default metrics shown with controls to see different frequencies&#34; src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/10/06/introducing-headlamp-plugin-for-karpenter/chart-1.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;NodeClaim default metrics shown with controls to see different frequencies&#34; src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/10/06/introducing-headlamp-plugin-for-karpenter/chart-2.png&#34;&gt;&lt;/p&gt;
&lt;h2 id=&#34;scaling-decisions&#34;&gt;Scaling decisions&lt;/h2&gt;
&lt;p&gt;See which instances are being provisioned for your workloads and understand why Karpenter made those choices. This is helpful when debugging.&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;Pod Placement Decisions data including reason, from, pod, message, and age&#34; src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/10/06/introducing-headlamp-plugin-for-karpenter/pod-decisions.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;Node decision data including Type, Reason, Node, From, Message&#34; src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/10/06/introducing-headlamp-plugin-for-karpenter/node-decisions.png&#34;&gt;&lt;/p&gt;
&lt;h2 id=&#34;config-editor-with-validation-support&#34;&gt;Config editor with validation support&lt;/h2&gt;
&lt;p&gt;Make live edits to Karpenter configurations. The editor includes diff previews and resource validation for safer adjustments.&lt;br&gt;
&lt;img alt=&#34;Config editor with validation support&#34; src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/10/06/introducing-headlamp-plugin-for-karpenter/config-editor.png&#34;&gt;&lt;/p&gt;
&lt;h2 id=&#34;real-time-view-of-karpenter-resources&#34;&gt;Real time view of Karpenter resources&lt;/h2&gt;
&lt;p&gt;View and track Karpenter-specific resources, such as NodeClaims, in real time as your cluster scales up and down.&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;Node claims data including Name, Status, Instance Type, CPU, Zone, Age, and Actions&#34; src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/10/06/introducing-headlamp-plugin-for-karpenter/node-claims.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;Node Pools data including Name, NodeClass, CPU, Memory, Nodes, Status, Age, Actions&#34; src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/10/06/introducing-headlamp-plugin-for-karpenter/nodepools.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;EC2 Node Classes data including Name, Cluster, Instance Profile, Status, IAM Role, Age, and Actions&#34; src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/10/06/introducing-headlamp-plugin-for-karpenter/nodeclass.png&#34;&gt;&lt;/p&gt;
&lt;h2 id=&#34;dashboard-for-pending-pods&#34;&gt;Dashboard for Pending Pods&lt;/h2&gt;
&lt;p&gt;View all pending Pods whose scheduling requirements cannot be met, with FailedScheduling details highlighting why they couldn&#39;t be scheduled.&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;Pending Pods data including Name, Namespace, Type, Reason, From, and Message&#34; src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/10/06/introducing-headlamp-plugin-for-karpenter/pending-pods.png&#34;&gt;&lt;/p&gt;
&lt;h2 id=&#34;karpenter-providers&#34;&gt;Karpenter Providers&lt;/h2&gt;
&lt;p&gt;This plugin should work with most Karpenter providers, but it has so far only been tested with the providers marked in the table below. Some providers also expose extra provider-specific information; the table shows which of these the plugin displays.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Provider Name&lt;/th&gt;
&lt;th&gt;Tested&lt;/th&gt;
&lt;th&gt;Extra provider specific info supported&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&#34;https://github.com/aws/karpenter-provider-aws&#34;&gt;AWS&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&#34;https://github.com/Azure/karpenter-provider-azure&#34;&gt;Azure&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&#34;https://github.com/cloudpilot-ai/karpenter-provider-alibabacloud&#34;&gt;AlibabaCloud&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&#34;https://github.com/bizflycloud/karpenter-provider-bizflycloud&#34;&gt;Bizfly Cloud&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&#34;https://github.com/kubernetes-sigs/karpenter-provider-cluster-api&#34;&gt;Cluster API&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&#34;https://github.com/cloudpilot-ai/karpenter-provider-gcp&#34;&gt;GCP&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&#34;https://github.com/sergelogvinov/karpenter-provider-proxmox&#34;&gt;Proxmox&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&#34;https://github.com/zoom/karpenter-oci&#34;&gt;Oracle Cloud Infrastructure (OCI)&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Please &lt;a href=&#34;https://github.com/headlamp-k8s/plugins/issues&#34;&gt;submit an issue&lt;/a&gt; if you test one of the untested providers or if you want support for a particular provider (PRs also gladly accepted).&lt;/p&gt;
&lt;h2 id=&#34;how-to-use&#34;&gt;How to use&lt;/h2&gt;
&lt;p&gt;Please see &lt;a href=&#34;https://github.com/headlamp-k8s/plugins/tree/main/karpenter&#34;&gt;plugins/karpenter/README.md&lt;/a&gt; for usage instructions.&lt;/p&gt;
&lt;h2 id=&#34;feedback-and-questions&#34;&gt;Feedback and Questions&lt;/h2&gt;
&lt;p&gt;Please &lt;a href=&#34;https://github.com/headlamp-k8s/plugins/issues&#34;&gt;submit an issue&lt;/a&gt; if you use Karpenter and have any other ideas or feedback, or come to the &lt;a href=&#34;https://kubernetes.slack.com/?redir=%2Fmessages%2Fheadlamp&#34;&gt;#headlamp channel on Kubernetes Slack&lt;/a&gt; for a chat.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Announcing Changed Block Tracking API support (alpha)</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/25/csi-changed-block-tracking/</link>
      <pubDate>Thu, 25 Sep 2025 05:00:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/25/csi-changed-block-tracking/</guid>
      <description>
        
        
&lt;p&gt;We&#39;re excited to announce alpha support for a &lt;em&gt;changed block tracking&lt;/em&gt; mechanism. This enhances
the Kubernetes storage ecosystem by providing an efficient way for
&lt;a href=&#34;https://kubernetes.io/docs/concepts/storage/volumes/#csi&#34;&gt;CSI&lt;/a&gt; storage drivers to identify changed
blocks in PersistentVolume snapshots. With a driver that can use the feature, you could benefit
from faster and more resource-efficient backup operations.&lt;/p&gt;
&lt;p&gt;If you&#39;re eager to try this feature, you can &lt;a href=&#34;#getting-started&#34;&gt;skip to the Getting Started section&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;what-is-changed-block-tracking&#34;&gt;What is changed block tracking?&lt;/h2&gt;
&lt;p&gt;Changed block tracking enables storage systems to identify and track modifications at the block level
between snapshots, eliminating the need to scan entire volumes during backup operations. The
improvement is a change to the Container Storage Interface (CSI), and also to the storage support
in Kubernetes itself.
With the alpha feature enabled, your cluster can:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Identify allocated blocks within a CSI volume snapshot&lt;/li&gt;
&lt;li&gt;Determine changed blocks between two snapshots of the same volume&lt;/li&gt;
&lt;li&gt;Streamline backup operations by focusing only on changed data blocks&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For Kubernetes users managing large datasets, this API enables significantly more efficient
backup processes. Backup applications can now focus only on the blocks that have changed,
rather than processing entire volumes.&lt;/p&gt;

&lt;div class=&#34;alert alert-info&#34; role=&#34;alert&#34;&gt;&lt;h4 class=&#34;alert-heading&#34;&gt;Note:&lt;/h4&gt;As of now, the Changed Block Tracking API is supported only for block volumes and not for
file volumes. CSI drivers that manage file-based storage systems will not be able to
implement this capability.&lt;/div&gt;

&lt;h2 id=&#34;benefits-of-changed-block-tracking-support-in-kubernetes&#34;&gt;Benefits of changed block tracking support in Kubernetes&lt;/h2&gt;
&lt;p&gt;As Kubernetes adoption grows for stateful workloads managing critical data, the need for efficient
backup solutions becomes increasingly important. Traditional full backup approaches face challenges with:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Long backup windows&lt;/em&gt;: Full volume backups can take hours for large datasets, making it difficult
to complete within maintenance windows.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;High resource utilization&lt;/em&gt;: Backup operations consume substantial network bandwidth and I/O
resources, especially for large data volumes and data-intensive applications.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Increased storage costs&lt;/em&gt;: Repetitive full backups store redundant data, causing storage
requirements to grow linearly even when only a small percentage of data actually changes between
backups.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The Changed Block Tracking API addresses these challenges by providing native Kubernetes support for
incremental backup capabilities through the CSI interface.&lt;/p&gt;
&lt;h2 id=&#34;key-components&#34;&gt;Key components&lt;/h2&gt;
&lt;p&gt;The implementation consists of three primary components:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;em&gt;CSI SnapshotMetadata Service API&lt;/em&gt;: A gRPC API that provides volume
snapshot and changed block data.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;SnapshotMetadataService API&lt;/em&gt;: A Kubernetes CustomResourceDefinition (CRD) that
advertises CSI driver metadata service availability and connection details to
cluster clients.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;External Snapshot Metadata Sidecar&lt;/em&gt;: An intermediary component that connects CSI
drivers to backup applications via a standardized gRPC interface.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;implementation-requirements&#34;&gt;Implementation requirements&lt;/h2&gt;
&lt;h3 id=&#34;storage-provider-responsibilities&#34;&gt;Storage provider responsibilities&lt;/h3&gt;
&lt;p&gt;If you&#39;re an author of a storage integration with Kubernetes and want to support the changed block tracking feature, you must implement specific requirements:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Implement CSI RPCs&lt;/em&gt;: Storage providers need to implement the &lt;code&gt;SnapshotMetadata&lt;/code&gt; service as defined in the &lt;a href=&#34;https://github.com/container-storage-interface/spec/blob/master/csi.proto&#34;&gt;CSI specifications protobuf&lt;/a&gt;. This service requires server-side streaming implementations for the following RPCs:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;GetMetadataAllocated&lt;/code&gt;: For identifying allocated blocks in a snapshot&lt;/li&gt;
&lt;li&gt;&lt;code&gt;GetMetadataDelta&lt;/code&gt;: For determining changed blocks between two snapshots&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Storage backend capabilities&lt;/em&gt;: Ensure the storage backend has the capability to track and report block-level changes.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Deploy external components&lt;/em&gt;: Integrate with the &lt;code&gt;external-snapshot-metadata&lt;/code&gt; sidecar to expose the snapshot metadata service.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Register custom resource&lt;/em&gt;: Register the &lt;code&gt;SnapshotMetadataService&lt;/code&gt; resource using a CustomResourceDefinition and create a &lt;code&gt;SnapshotMetadataService&lt;/code&gt; custom resource that advertises the availability of the metadata service and provides connection details.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Support error handling&lt;/em&gt;: Implement proper error handling for these RPCs according to the CSI specification requirements.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
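&lt;p&gt;As an illustrative sketch (the driver name, address, and audience are placeholders; the field names follow the enhancement proposal), the advertising custom resource might look like this:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-yaml&#34;&gt;apiVersion: cbt.storage.k8s.io/v1alpha1
kind: SnapshotMetadataService
metadata:
  # By convention, named after the CSI driver it advertises
  name: hostpath.csi.k8s.io
spec:
  # gRPC endpoint of the external-snapshot-metadata sidecar
  address: snapshot-metadata.csi-driver.svc:6443
  # Audience expected in the ServiceAccount tokens presented by clients
  audience: csi-snapshot-metadata
  # Base64-encoded CA bundle used to validate the service TLS certificate
  caCert: LS0tLS1CRUdJTi...   # placeholder
&lt;/code&gt;&lt;/pre&gt;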
&lt;h3 id=&#34;backup-solution-responsibilities&#34;&gt;Backup solution responsibilities&lt;/h3&gt;
&lt;p&gt;A backup solution looking to leverage this feature must:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Set up authentication&lt;/em&gt;: The backup application must provide a Kubernetes ServiceAccount token when using the
&lt;a href=&#34;https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/3314-csi-changed-block-tracking#the-kubernetes-snapshotmetadata-service-api&#34;&gt;Kubernetes SnapshotMetadataService API&lt;/a&gt;.
Appropriate access grants, such as RBAC RoleBindings, must be established to authorize the backup application
ServiceAccount to obtain such tokens.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Implement streaming client-side code&lt;/em&gt;: Develop clients that implement the streaming gRPC APIs defined in the
&lt;a href=&#34;https://github.com/kubernetes-csi/external-snapshot-metadata/blob/main/proto/schema.proto&#34;&gt;schema.proto&lt;/a&gt; file.
Specifically:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Implement streaming client code for &lt;code&gt;GetMetadataAllocated&lt;/code&gt; and &lt;code&gt;GetMetadataDelta&lt;/code&gt; methods&lt;/li&gt;
&lt;li&gt;Handle server-side streaming responses efficiently as the metadata comes in chunks&lt;/li&gt;
&lt;li&gt;Process the &lt;code&gt;SnapshotMetadataResponse&lt;/code&gt; message format with proper error handling&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The &lt;code&gt;external-snapshot-metadata&lt;/code&gt; GitHub repository provides a convenient
&lt;a href=&#34;https://github.com/kubernetes-csi/external-snapshot-metadata/tree/master/pkg/iterator&#34;&gt;iterator&lt;/a&gt;
support package to simplify client implementation.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Handle large dataset streaming&lt;/em&gt;: Design clients to efficiently handle large streams of block metadata that
could be returned for volumes with significant changes.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Optimize backup processes&lt;/em&gt;: Modify backup workflows to use the changed block metadata to identify and only
transfer changed blocks to make backups more efficient, reducing both backup duration and resource consumption.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
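&lt;p&gt;As a sketch of the access grants mentioned above (the names here are hypothetical, and the exact rules depend on your backup application), an RBAC role for a backup client could look like this:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-yaml&#34;&gt;apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: backup-app-snapshot-metadata   # hypothetical name
rules:
  # Discover the metadata service advertised for the CSI driver
  - apiGroups: [&#34;cbt.storage.k8s.io&#34;]
    resources: [&#34;snapshotmetadataservices&#34;]
    verbs: [&#34;get&#34;, &#34;list&#34;]
  # Read the VolumeSnapshots whose metadata will be queried
  - apiGroups: [&#34;snapshot.storage.k8s.io&#34;]
    resources: [&#34;volumesnapshots&#34;]
    verbs: [&#34;get&#34;, &#34;list&#34;]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The backup application then requests an audience-scoped ServiceAccount token (for example via the TokenRequest API) and presents it on each gRPC call.&lt;/p&gt;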
&lt;h2 id=&#34;getting-started&#34;&gt;Getting started&lt;/h2&gt;
&lt;p&gt;To use changed block tracking in your cluster:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Ensure your CSI driver supports volume snapshots and implements the snapshot metadata capabilities with the required &lt;code&gt;external-snapshot-metadata&lt;/code&gt; sidecar&lt;/li&gt;
&lt;li&gt;Make sure the SnapshotMetadataService API is registered via its CustomResourceDefinition (CRD)&lt;/li&gt;
&lt;li&gt;Verify the presence of a SnapshotMetadataService custom resource for your CSI driver&lt;/li&gt;
&lt;li&gt;Create clients that can access the API using appropriate authentication (via Kubernetes ServiceAccount tokens)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The API provides two main functions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;GetMetadataAllocated&lt;/code&gt;: Lists blocks allocated in a single snapshot&lt;/li&gt;
&lt;li&gt;&lt;code&gt;GetMetadataDelta&lt;/code&gt;: Lists blocks changed between two snapshots&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;what-s-next&#34;&gt;What’s next?&lt;/h2&gt;
&lt;p&gt;Depending on feedback and adoption, the Kubernetes developers hope to promote the CSI Snapshot Metadata implementation to beta in a future release.&lt;/p&gt;
&lt;h2 id=&#34;where-can-i-learn-more&#34;&gt;Where can I learn more?&lt;/h2&gt;
&lt;p&gt;For those interested in trying out this new feature:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Official Kubernetes CSI Developer &lt;a href=&#34;https://kubernetes-csi.github.io/docs/external-snapshot-metadata.html&#34;&gt;Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;The &lt;a href=&#34;https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/3314-csi-changed-block-tracking&#34;&gt;enhancement proposal&lt;/a&gt; for the snapshot metadata feature.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/kubernetes-csi/external-snapshot-metadata&#34;&gt;GitHub repository&lt;/a&gt; for implementation and release status of &lt;code&gt;external-snapshot-metadata&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Complete gRPC protocol definitions for snapshot metadata API: &lt;a href=&#34;https://github.com/kubernetes-csi/external-snapshot-metadata/blob/main/proto/schema.proto&#34;&gt;schema.proto&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Example snapshot metadata client implementation: &lt;a href=&#34;https://github.com/kubernetes-csi/external-snapshot-metadata/tree/main/examples/snapshot-metadata-lister&#34;&gt;snapshot-metadata-lister&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;End-to-end example with csi-hostpath-driver: &lt;a href=&#34;https://github.com/kubernetes-csi/csi-driver-host-path/blob/master/docs/example-ephemeral.md&#34;&gt;example documentation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;how-do-i-get-involved&#34;&gt;How do I get involved?&lt;/h2&gt;
&lt;p&gt;This project, like all of Kubernetes, is the result of hard work by many contributors from diverse backgrounds working together.
On behalf of SIG Storage, I would like to offer a huge thank you to the contributors who helped review the design and implementation of the project, including but not limited to the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Ben Swartzlander (&lt;a href=&#34;https://github.com/bswartz&#34;&gt;bswartz&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Carl Braganza (&lt;a href=&#34;https://github.com/carlbraganza&#34;&gt;carlbraganza&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Daniil Fedotov (&lt;a href=&#34;https://github.com/hairyhum&#34;&gt;hairyhum&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Ivan Sim (&lt;a href=&#34;https://github.com/ihcsim&#34;&gt;ihcsim&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Nikhil Ladha (&lt;a href=&#34;https://github.com/Nikhil-Ladha&#34;&gt;Nikhil-Ladha&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Prasad Ghangal (&lt;a href=&#34;https://github.com/PrasadG193&#34;&gt;PrasadG193&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Praveen M (&lt;a href=&#34;https://github.com/iPraveenParihar&#34;&gt;iPraveenParihar&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Rakshith R (&lt;a href=&#34;https://github.com/Rakshith-R&#34;&gt;Rakshith-R&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Xing Yang (&lt;a href=&#34;https://github.com/xing-yang&#34;&gt;xing-yang&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Thanks also to everyone who has contributed to the project, including others who helped review the
&lt;a href=&#34;https://github.com/kubernetes/enhancements/pull/4082&#34;&gt;KEP&lt;/a&gt; and the
&lt;a href=&#34;https://github.com/container-storage-interface/spec/pull/551&#34;&gt;CSI spec PR&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;For those interested in getting involved with the design and development of CSI or any part of the Kubernetes Storage system,
join the &lt;a href=&#34;https://github.com/kubernetes/community/tree/master/sig-storage&#34;&gt;Kubernetes Storage Special Interest Group&lt;/a&gt; (SIG).
We always welcome new contributors.&lt;/p&gt;
&lt;p&gt;The SIG also holds regular &lt;a href=&#34;https://docs.google.com/document/d/15tLCV3csvjHbKb16DVk-mfUmFry_Rlwo-2uG6KNGsfw/edit&#34;&gt;Data Protection Working Group meetings&lt;/a&gt;.
New attendees are welcome to join our discussions.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Kubernetes v1.34: Pod Level Resources Graduated to Beta</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/22/kubernetes-v1-34-pod-level-resources/</link>
      <pubDate>Mon, 22 Sep 2025 10:30:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/22/kubernetes-v1-34-pod-level-resources/</guid>
      <description>
        
        
        &lt;p&gt;On behalf of the Kubernetes community, I am thrilled to announce that the Pod Level Resources feature has graduated to Beta in the Kubernetes v1.34 release and is enabled by default! This significant milestone introduces a new layer of flexibility for defining and managing resource allocation for your Pods. This flexibility stems from the ability to specify CPU and memory resources for the Pod as a whole. Pod level resources can be combined with the container-level specifications to express the exact resource requirements and limits your application needs.&lt;/p&gt;
&lt;h2 id=&#34;pod-level-specification-for-resources&#34;&gt;Pod-level specification for resources&lt;/h2&gt;
&lt;p&gt;Until recently, resource specifications that applied to Pods were primarily defined
at the individual container level. While effective, this approach sometimes required
duplicating or meticulously calculating resource needs across multiple containers
within a single Pod. As a beta feature, Kubernetes allows you to specify the CPU,
memory and hugepages resources at the Pod-level. This means you can now define
resource requests and limits for an entire Pod, enabling easier resource sharing
without requiring granular, per-container management of these resources where
it&#39;s not needed.&lt;/p&gt;
&lt;h2 id=&#34;why-does-pod-level-specification-matter&#34;&gt;Why does Pod-level specification matter?&lt;/h2&gt;
&lt;p&gt;This feature enhances resource management in Kubernetes by offering &lt;em&gt;flexible resource management&lt;/em&gt; at both the Pod and container levels.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;It provides a consolidated approach to resource declaration, reducing the need for
meticulous, per-container management, especially for Pods with multiple
containers.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Pod-level resources enable containers within a pod to share unused resources
amongst themselves, promoting efficient utilization within the pod. For example,
this prevents sidecar containers from becoming performance bottlenecks. Previously,
a sidecar (e.g., a logging agent or service mesh proxy) hitting its individual CPU
limit could be throttled and slow down the entire Pod, even if the main
application container had plenty of spare CPU. With pod-level resources, the
sidecar and the main container can share the Pod&#39;s resource budget, ensuring
smoother operation during traffic spikes: either the whole Pod is throttled, or
all containers keep working.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;When both pod-level and container-level resources are specified, pod-level
requests and limits take precedence. This gives you, and cluster administrators,
a powerful way to enforce overall resource boundaries for your Pods.&lt;/p&gt;
&lt;p&gt;For scheduling, if a pod-level request is explicitly defined, the scheduler uses
that specific value to find a suitable node, instead of the aggregated requests of
the individual containers. At runtime, the pod-level limit acts as a hard ceiling
for the combined resource usage of all containers. Crucially, this pod-level limit
is the absolute enforcer; even if the sum of the individual container limits is
higher, the total resource consumption can never exceed the pod-level limit.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Pod-level resources take &lt;strong&gt;precedence&lt;/strong&gt; in determining the Quality of Service (QoS) class of the Pod.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;For Pods running on Linux nodes, the Out-Of-Memory (OOM) score adjustment
calculation considers both pod-level and container-level resource requests.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Pod-level resources are &lt;strong&gt;designed to be compatible with existing Kubernetes functionalities&lt;/strong&gt;, ensuring a smooth integration into your workflows.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;how-to-specify-resources-for-an-entire-pod&#34;&gt;How to specify resources for an entire Pod&lt;/h2&gt;
&lt;p&gt;Using the &lt;code&gt;PodLevelResources&lt;/code&gt; &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/reference/command-line-tools-reference/feature-gates/&#34;&gt;feature
gate&lt;/a&gt; requires
Kubernetes v1.34 or newer for all cluster components, including the control plane
and every node. This feature gate is in beta and enabled by default in v1.34.&lt;/p&gt;
&lt;h3 id=&#34;example-manifest&#34;&gt;Example manifest&lt;/h3&gt;
&lt;p&gt;You can specify CPU, memory and hugepages resources directly in the Pod spec manifest at the &lt;code&gt;resources&lt;/code&gt; field for the entire Pod.&lt;/p&gt;
&lt;p&gt;Here’s an example demonstrating a Pod with both CPU and memory requests and limits
defined at the Pod level:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Pod&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;pod-resources-demo&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;namespace&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;pod-resources-example&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# The &amp;#39;resources&amp;#39; field at the Pod specification level defines the overall&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# resource budget for all containers within this Pod combined.&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;resources&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# Pod-level resources&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# &amp;#39;limits&amp;#39; specifies the maximum amount of resources the Pod is allowed to use.&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# The sum of the limits of all containers in the Pod cannot exceed these values.&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;limits&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;cpu&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;1&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# The entire Pod cannot use more than 1 CPU core.&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;memory&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;200Mi&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# The entire Pod cannot use more than 200 MiB of memory.&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# &amp;#39;requests&amp;#39; specifies the minimum amount of resources guaranteed to the Pod.&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# This value is used by the Kubernetes scheduler to find a node with enough capacity.&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;requests&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;cpu&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;1&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# The Pod is guaranteed 1 CPU core when scheduled.&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;memory&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;100Mi&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# The Pod is guaranteed 100 MiB of memory when scheduled.&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;containers&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;main-app-container&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;image&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;nginx&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;...&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# This container has no resource requests or limits specified.&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;auxiliary-container&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;image&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;fedora&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;command&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;[&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;sleep&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;inf&amp;#34;&lt;/span&gt;]&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;...&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# This container has no resource requests or limits specified.&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;In this example, the &lt;code&gt;pod-resources-demo&lt;/code&gt; Pod as a whole requests 1 CPU and 100 MiB of memory, and is limited to 1 CPU and 200 MiB of memory. The containers within will operate under these overall Pod-level constraints, as explained in the next section.&lt;/p&gt;
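&lt;p&gt;Once a Pod like this is running, you can confirm what the API server accepted by reading the pod-level &lt;code&gt;resources&lt;/code&gt; field back; for example, using the names from the manifest above:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;kubectl get pod pod-resources-demo -n pod-resources-example \
  -o jsonpath=&#39;{.spec.resources}&#39;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;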
&lt;h3 id=&#34;interaction-with-container-level-resource-requests-or-limits&#34;&gt;Interaction with container-level resource requests or limits&lt;/h3&gt;
&lt;p&gt;When both pod-level and container-level resources are specified, &lt;strong&gt;pod-level requests and limits take precedence&lt;/strong&gt;. This means the node allocates resources based on the pod-level specifications.&lt;/p&gt;
&lt;p&gt;Consider a Pod with two containers where pod-level CPU and memory requests and
limits are defined, and only one container has its own explicit resource
definitions:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Pod&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;pod-resources-demo&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;namespace&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;pod-resources-example&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;resources&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;limits&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;cpu&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;1&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;memory&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;200Mi&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;requests&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;cpu&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;1&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;memory&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;100Mi&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;containers&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;main-app-container&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;image&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;nginx&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;resources&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;requests&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;cpu&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;0.5&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;memory&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;50Mi&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;auxiliary-container&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;image&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;fedora&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;command&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;[&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;sleep&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;inf&amp;#34;&lt;/span&gt;]&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# This container has no resource requests or limits specified.&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Pod-Level Limits: The pod-level limits (cpu: &amp;quot;1&amp;quot;, memory: &amp;quot;200Mi&amp;quot;) establish an absolute boundary for the entire Pod. The sum of resources consumed by all its containers is enforced at this ceiling and cannot be surpassed.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Resource Sharing and Bursting: Containers can dynamically borrow any unused capacity, allowing them to burst as needed, so long as the Pod&#39;s aggregate usage stays within the overall limit.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Pod-Level Requests: The pod-level requests (cpu: &amp;quot;1&amp;quot;, memory: &amp;quot;100Mi&amp;quot;) serve as the foundational resource guarantee for the entire Pod. This value informs the scheduler&#39;s placement decision and represents the minimum resources the Pod can rely on during node-level contention.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Container-Level Requests: Container-level requests create a priority system within
the Pod&#39;s guaranteed budget. Because main-app-container has an explicit request
(cpu: &amp;quot;0.5&amp;quot;, memory: &amp;quot;50Mi&amp;quot;), it takes precedence under resource pressure
over the auxiliary-container, which has no such explicit claim.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
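&lt;p&gt;The two levels can also be combined: a container-level limit still acts as a hard cap on that single container, inside the overall Pod budget. Here is a minimal, illustrative sketch of that combination (values chosen for demonstration only):&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;spec:
  resources:
    limits:
      cpu: &#34;1&#34;
      memory: &#34;200Mi&#34;    # aggregate ceiling for all containers combined
  containers:
  - name: main-app-container
    image: nginx
    resources:
      limits:
        memory: &#34;100Mi&#34; # this one container can never exceed 100 MiB,
                        # even when the Pod-level budget has room to spare
  - name: auxiliary-container
    image: fedora
    command: [&#34;sleep&#34;, &#34;inf&#34;]
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;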
&lt;h2 id=&#34;limitations&#34;&gt;Limitations&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;First of all, &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/workloads/pods/#pod-update-and-replacement&#34;&gt;in-place
resize&lt;/a&gt; of pod-level
resources is &lt;strong&gt;not supported&lt;/strong&gt; in Kubernetes v1.34 (or earlier). Attempting to
modify the &lt;em&gt;pod-level&lt;/em&gt; resource limits or requests on a running Pod results in an
error: the resize is rejected. The v1.34 implementation of pod-level resources
focuses on the initial declaration of an overall resource envelope that
applies to the &lt;strong&gt;entire Pod&lt;/strong&gt;. That is distinct from in-place pod resize, which
(despite what the name might suggest) lets you
make dynamic adjustments to &lt;em&gt;container&lt;/em&gt; resource
requests and limits within a &lt;em&gt;running&lt;/em&gt; Pod,
potentially without a container restart. In-place resizing is also not yet a
stable feature; it graduated to beta in the v1.33 release.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Only CPU, memory, and hugepages resources can be specified at the pod level.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Pod-level resources are not supported for Windows pods. If the Pod specification
explicitly targets Windows (for example, by setting &lt;code&gt;spec.os.name: &#34;windows&#34;&lt;/code&gt;), the API
server rejects the Pod during validation. If the Pod is not explicitly
marked for Windows but is scheduled to a Windows node (for example, via a &lt;code&gt;nodeSelector&lt;/code&gt;),
the kubelet on that node rejects the Pod during its admission process.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The Topology Manager, Memory Manager, and CPU Manager do not
align pods and containers based on pod-level resources, because these resource managers
do not currently support them.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;getting-started-and-providing-feedback&#34;&gt;Getting started and providing feedback&lt;/h2&gt;
&lt;p&gt;Ready to explore the &lt;em&gt;Pod Level Resources&lt;/em&gt; feature? You&#39;ll need a Kubernetes cluster running version 1.34 or later. Remember to enable the &lt;code&gt;PodLevelResources&lt;/code&gt; &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/reference/command-line-tools-reference/feature-gates/&#34;&gt;feature gate&lt;/a&gt; across your control plane and all nodes.&lt;/p&gt;
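&lt;p&gt;How you enable the gate depends on how your cluster is deployed; on a self-managed cluster, for instance, you would typically pass it to the control plane components and each kubelet. A kubelet configuration fragment might look like this:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;# KubeletConfiguration fragment
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  PodLevelResources: true
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;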
&lt;p&gt;As this feature moves through Beta, your feedback is invaluable. Please report any issues or share your experiences via the standard Kubernetes communication channels:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Slack: &lt;a href=&#34;https://kubernetes.slack.com/messages/sig-node&#34;&gt;#sig-node&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://groups.google.com/forum/#!forum/kubernetes-sig-node&#34;&gt;Mailing list&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/kubernetes/community/labels/sig%2Fnode&#34;&gt;Open Community Issues/PRs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

      </description>
    </item>
    
    <item>
      <title>Kubernetes v1.34: Recovery From Volume Expansion Failure (GA)</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/19/kubernetes-v1-34-recover-expansion-failure/</link>
      <pubDate>Fri, 19 Sep 2025 10:30:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/19/kubernetes-v1-34-recover-expansion-failure/</guid>
      <description>
        
        
&lt;p&gt;Have you ever made a typo when expanding your persistent volumes in Kubernetes? Meant to specify &lt;code&gt;2TB&lt;/code&gt;
but specified &lt;code&gt;20TiB&lt;/code&gt;? This seemingly innocuous problem was surprisingly hard to fix - and took the project almost 5 years to address.
&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/storage/persistent-volumes/#recovering-from-failure-when-expanding-volumes&#34;&gt;Automated recovery from storage expansion&lt;/a&gt; has been around for a while in beta; however, with the v1.34 release, we have graduated this to
&lt;strong&gt;general availability&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;While it was always possible to recover from failed volume expansions manually, doing so usually required cluster-admin access and was tedious (see the aforementioned link for more information).&lt;/p&gt;
&lt;p&gt;What if you make a mistake and realize it immediately?
With Kubernetes v1.34, as long as the expansion to the previously requested size
hasn&#39;t finished, you can reduce the requested size of the PersistentVolumeClaim (PVC),
and Kubernetes will automatically work to correct it. Any quota consumed by the failed
expansion is returned to the user, and the associated PersistentVolume should be resized to the
latest size you specified.&lt;/p&gt;
&lt;p&gt;I&#39;ll walk through an example of how all of this works.&lt;/p&gt;
&lt;h2 id=&#34;reducing-pvc-size-to-recover-from-failed-expansion&#34;&gt;Reducing PVC size to recover from failed expansion&lt;/h2&gt;
&lt;p&gt;Imagine that you are running out of disk space for one of your database servers, and you want to expand the PVC from the previously
specified &lt;code&gt;10TB&lt;/code&gt; to &lt;code&gt;100TB&lt;/code&gt; - but you make a typo and specify &lt;code&gt;1000TB&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;PersistentVolumeClaim&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;myclaim&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;accessModes&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- ReadWriteOnce&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;resources&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;requests&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;storage&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;1000TB&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# newly specified size - but incorrect!&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Now, you may have run out of disk space on your disk array, or of allocated quota with your cloud provider. Either way, assume that the expansion to &lt;code&gt;1000TB&lt;/code&gt; is never going to succeed.&lt;/p&gt;
&lt;p&gt;In Kubernetes v1.34, you can simply correct your mistake and request a new PVC size
that is smaller than the erroneous one, provided it is still larger than the original size
of the actual PersistentVolume.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;PersistentVolumeClaim&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;myclaim&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;accessModes&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- ReadWriteOnce&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;resources&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;requests&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;storage&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;100TB&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# Corrected size; has to be greater than 10TB.&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;                     &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# You cannot shrink the volume below its actual size.&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This requires no admin intervention. Even better, any surplus Kubernetes quota that you temporarily consumed will be automatically returned.&lt;/p&gt;
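&lt;p&gt;If you prefer to correct the mistake from the command line instead of editing a manifest, a merge patch against the example claim achieves the same result:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;kubectl patch pvc myclaim --type=merge \
  -p &#39;{&#34;spec&#34;:{&#34;resources&#34;:{&#34;requests&#34;:{&#34;storage&#34;:&#34;100TB&#34;}}}}&#39;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;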
&lt;p&gt;This fault recovery mechanism does have a caveat: whatever new size you specify for the PVC, it &lt;strong&gt;must&lt;/strong&gt; be still higher than the original size in &lt;code&gt;.status.capacity&lt;/code&gt;.
Since Kubernetes doesn&#39;t support shrinking your PV objects, you can never go below the size that was originally allocated for your PVC request.&lt;/p&gt;
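The recovery rule can be sketched as a simple check. This is a hypothetical helper with sizes in GiB, not the real Kubernetes API types or validation code:

```go
package main

import "fmt"

// canLowerRequest illustrates the recovery rule: when correcting an
// oversized PVC request, the new size must not fall below the size
// already allocated in .status.capacity, because Kubernetes does not
// support shrinking PersistentVolumes. Hypothetical helper, GiB units.
func canLowerRequest(statusCapacityGiB, newRequestGiB int64) bool {
	return newRequestGiB >= statusCapacityGiB
}

func main() {
	// 10 GiB already allocated; lowering the request to 100 GiB is fine:
	fmt.Println(canLowerRequest(10, 100)) // true
	// ...but 5 GiB would go below the allocated size:
	fmt.Println(canLowerRequest(10, 5)) // false
}
```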
&lt;h2 id=&#34;improved-error-handling-and-observability-of-volume-expansion&#34;&gt;Improved error handling and observability of volume expansion&lt;/h2&gt;
&lt;p&gt;Implementing what might look like a relatively minor change also required us to almost
fully redo how volume expansion works under the hood in Kubernetes.
There are new API fields available on PVC objects that you can monitor to observe the progress of volume expansion.&lt;/p&gt;
&lt;h3 id=&#34;improved-observability-of-in-progress-expansion&#34;&gt;Improved observability of in-progress expansion&lt;/h3&gt;
&lt;p&gt;You can query &lt;code&gt;.status.allocatedResourceStatus[&#39;storage&#39;]&lt;/code&gt; of a PVC to monitor the progress of a volume expansion operation.
For a typical block volume, this should transition through &lt;code&gt;ControllerResizeInProgress&lt;/code&gt;, &lt;code&gt;NodeResizePending&lt;/code&gt;, and &lt;code&gt;NodeResizeInProgress&lt;/code&gt;, and become nil/empty when volume expansion has finished.&lt;/p&gt;
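The expected progression of `pvc.status.allocatedResourceStatus['storage']` during a successful expansion can be sketched in Go. The `nextPhase` helper is purely illustrative; the real transitions are driven by the CSI driver and kubelet:

```go
package main

import "fmt"

// nextPhase returns the expected successor of each
// pvc.status.allocatedResourceStatus["storage"] value during a successful
// block-volume expansion; "" means the field is cleared on completion.
// Illustrative only, not actual Kubernetes code.
func nextPhase(current string) string {
	order := map[string]string{
		"ControllerResizeInProgress": "NodeResizePending",
		"NodeResizePending":          "NodeResizeInProgress",
		"NodeResizeInProgress":       "", // expansion finished, status cleared
	}
	return order[current]
}

func main() {
	phase := "ControllerResizeInProgress"
	for phase != "" {
		fmt.Println(phase)
		phase = nextPhase(phase)
	}
}
```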
&lt;p&gt;If, for some reason, expansion to the requested size is not feasible, the status should instead report &lt;code&gt;ControllerResizeInfeasible&lt;/code&gt; or &lt;code&gt;NodeResizeInfeasible&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;You can also observe the size towards which Kubernetes is working by watching &lt;code&gt;pvc.status.allocatedResources&lt;/code&gt;.&lt;/p&gt;
&lt;h3 id=&#34;improved-error-handling-and-reporting&#34;&gt;Improved error handling and reporting&lt;/h3&gt;
&lt;p&gt;Kubernetes now retries failed volume expansions at a slower rate, making fewer requests to both the storage system and the Kubernetes API server.&lt;/p&gt;
&lt;p&gt;Errors observed during volume expansion are now reported as conditions on PVC objects and, unlike events, they persist. Kubernetes populates &lt;code&gt;pvc.status.conditions&lt;/code&gt; with the error keys &lt;code&gt;ControllerResizeError&lt;/code&gt; or &lt;code&gt;NodeResizeError&lt;/code&gt; when volume expansion fails.&lt;/p&gt;
&lt;h3 id=&#34;fixes-long-standing-bugs-in-resizing-workflows&#34;&gt;Fixes long-standing bugs in resizing workflows&lt;/h3&gt;
&lt;p&gt;This feature has also allowed us to fix long-standing bugs in the resizing workflow, such as &lt;a href=&#34;https://github.com/kubernetes/kubernetes/issues/115294&#34;&gt;Kubernetes issue #115294&lt;/a&gt;.
If you observe anything broken, please report your bugs to &lt;a href=&#34;https://github.com/kubernetes/kubernetes/issues/new/choose&#34;&gt;https://github.com/kubernetes/kubernetes/issues&lt;/a&gt;, along with details about how to reproduce the problem.&lt;/p&gt;
&lt;p&gt;Working on this feature through its lifecycle was challenging and it wouldn&#39;t have been possible to reach GA
without feedback from &lt;a href=&#34;https://github.com/msau42&#34;&gt;@msau42&lt;/a&gt;, &lt;a href=&#34;https://github.com/jsafrane&#34;&gt;@jsafrane&lt;/a&gt; and &lt;a href=&#34;https://github.com/xing-yang&#34;&gt;@xing-yang&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;All of the contributors who worked on this also appreciate the input provided by &lt;a href=&#34;https://github.com/thockin&#34;&gt;@thockin&lt;/a&gt; and &lt;a href=&#34;https://github.com/liggitt&#34;&gt;@liggitt&lt;/a&gt; at various Kubernetes contributor summits.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Kubernetes v1.34: DRA Consumable Capacity</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/18/kubernetes-v1-34-dra-consumable-capacity/</link>
      <pubDate>Thu, 18 Sep 2025 10:30:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/18/kubernetes-v1-34-dra-consumable-capacity/</guid>
      <description>
        
        
        &lt;p&gt;Dynamic Resource Allocation (DRA) is a Kubernetes API for managing scarce resources across Pods and containers.
It enables flexible resource requests, going beyond simply allocating &lt;em&gt;N&lt;/em&gt; number of devices to support more granular usage scenarios.
With DRA, users can request specific types of devices based on their attributes, define custom configurations tailored to their workloads, and even share the same resource among multiple containers or Pods.&lt;/p&gt;
&lt;p&gt;In this blog, we focus on the device sharing feature and dive into a new capability introduced in Kubernetes 1.34: &lt;em&gt;DRA consumable capacity&lt;/em&gt;,
which extends DRA to support finer-grained device sharing.&lt;/p&gt;
&lt;h2 id=&#34;background-device-sharing-via-resourceclaims&#34;&gt;Background: device sharing via ResourceClaims&lt;/h2&gt;
&lt;p&gt;From the beginning, DRA introduced the ability for multiple Pods to share a device by referencing the same ResourceClaim.
This design decouples resource allocation from specific hardware, allowing for more dynamic and reusable provisioning of devices.&lt;/p&gt;
&lt;p&gt;In Kubernetes 1.33, the new support for &lt;em&gt;partitionable devices&lt;/em&gt; allowed resource drivers to advertise slices of a device that are available, rather than exposing the entire device as an all-or-nothing resource.
This enabled Kubernetes to model shareable hardware more accurately.&lt;/p&gt;
&lt;p&gt;But there was still a missing piece: it didn&#39;t yet support scenarios
where the device driver manages fine-grained, dynamic portions of a device resource, such as network bandwidth, based on user demand,
or shares those resources independently of ResourceClaims, which are restricted by their spec and namespace.&lt;/p&gt;
&lt;p&gt;That’s where &lt;em&gt;consumable capacity&lt;/em&gt; for DRA comes in.&lt;/p&gt;
&lt;h2 id=&#34;benefits-of-dra-consumable-capacity-support&#34;&gt;Benefits of DRA consumable capacity support&lt;/h2&gt;
&lt;p&gt;Here&#39;s a taste of what you get in a cluster with the &lt;code&gt;DRAConsumableCapacity&lt;/code&gt;
&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/reference/command-line-tools-reference/feature-gates/&#34;&gt;feature gate&lt;/a&gt; enabled.&lt;/p&gt;
&lt;h3 id=&#34;device-sharing-across-multiple-resourceclaims-or-devicerequests&#34;&gt;Device sharing across multiple ResourceClaims or DeviceRequests&lt;/h3&gt;
&lt;p&gt;Resource drivers can now support sharing the same device — or even a slice of a device — across multiple ResourceClaims or across multiple DeviceRequests.&lt;/p&gt;
&lt;p&gt;This means that Pods from different namespaces can simultaneously share the same device,
if permitted and supported by the specific DRA driver.&lt;/p&gt;
&lt;h3 id=&#34;device-resource-allocation&#34;&gt;Device resource allocation&lt;/h3&gt;
&lt;p&gt;Kubernetes extends the allocation algorithm in the scheduler to support allocating a portion of a device&#39;s resources, as defined in the &lt;code&gt;capacity&lt;/code&gt; field.
The scheduler ensures that the total allocated capacity across all consumers never exceeds the device’s total capacity, even when shared across multiple ResourceClaims or DeviceRequests.
This is very similar to the way the scheduler allows Pods and containers to share allocatable resources on Nodes;
in this case, it allows them to share allocatable (consumable) resources on Devices.&lt;/p&gt;
&lt;p&gt;This feature expands support for scenarios where the device driver is able to manage resources &lt;strong&gt;within&lt;/strong&gt; a device and on a per-process basis — for example,
allocating a specific amount of memory (e.g., 8 GiB) from a virtual GPU,
or setting bandwidth limits on virtual network interfaces allocated to specific Pods. This aims to provide safe and efficient resource sharing.&lt;/p&gt;
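The scheduler's capacity accounting described above can be sketched as simple bookkeeping: a new share fits only if the already-consumed capacity plus the request stays within the device's total. The `device` type below is hypothetical, not the actual scheduler code:

```go
package main

import "fmt"

// device sketches per-device consumable-capacity accounting for a
// multi-allocation device. Illustrative only.
type device struct {
	totalGiB    int64
	consumedGiB int64
}

// tryAllocate admits a request only if it does not oversubscribe the
// device, mirroring the scheduler's invariant that total allocated
// capacity never exceeds the device's total capacity.
func (d *device) tryAllocate(reqGiB int64) bool {
	if d.consumedGiB+reqGiB > d.totalGiB {
		return false // would oversubscribe the device
	}
	d.consumedGiB += reqGiB
	return true
}

func main() {
	gpu := device{totalGiB: 40}
	fmt.Println(gpu.tryAllocate(25)) // true: 25 of 40 GiB consumed
	fmt.Println(gpu.tryAllocate(20)) // false: only 15 GiB remain
	fmt.Println(gpu.tryAllocate(15)) // true: exactly fills the device
}
```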
&lt;h3 id=&#34;distinctattribute-constraint&#34;&gt;DistinctAttribute constraint&lt;/h3&gt;
&lt;p&gt;This feature also introduces a new constraint, &lt;code&gt;DistinctAttribute&lt;/code&gt;, which is the complement of the existing &lt;code&gt;MatchAttribute&lt;/code&gt; constraint.&lt;/p&gt;
&lt;p&gt;The primary goal of &lt;code&gt;DistinctAttribute&lt;/code&gt; is to prevent the same underlying device from being allocated multiple times within a single ResourceClaim, which could happen since we are allocating shares (or subsets) of devices.
This constraint ensures that each allocation refers to a distinct resource, even if they belong to the same device class.&lt;/p&gt;
&lt;p&gt;It is useful for use cases such as allocating network devices connecting to different subnets to expand coverage or provide redundancy across failure domains.&lt;/p&gt;
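Conceptually, the `DistinctAttribute` check reduces to verifying that no value of the named attribute repeats across the allocations in a claim. A minimal sketch (hypothetical helper, not the real scheduler logic):

```go
package main

import "fmt"

// distinctAttribute sketches the DistinctAttribute constraint: every
// allocated device share in a claim must carry a different value for the
// named attribute (for example, a subnet or a parent-device ID).
// Illustrative only.
func distinctAttribute(values []string) bool {
	seen := map[string]bool{}
	for _, v := range values {
		if seen[v] {
			return false // two allocations share the same attribute value
		}
		seen[v] = true
	}
	return true
}

func main() {
	fmt.Println(distinctAttribute([]string{"subnet-a", "subnet-b"})) // true
	fmt.Println(distinctAttribute([]string{"subnet-a", "subnet-a"})) // false
}
```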
&lt;h2 id=&#34;how-to-use-consumable-capacity&#34;&gt;How to use consumable capacity?&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;DRAConsumableCapacity&lt;/code&gt; is introduced as an alpha feature in Kubernetes 1.34. The &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/reference/command-line-tools-reference/feature-gates/&#34;&gt;feature gate&lt;/a&gt; &lt;code&gt;DRAConsumableCapacity&lt;/code&gt; must be enabled in the kubelet, kube-apiserver, kube-scheduler, and kube-controller-manager.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;--feature-gates&lt;span style=&#34;color:#666&#34;&gt;=&lt;/span&gt;...,DRAConsumableCapacity&lt;span style=&#34;color:#666&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#a2f&#34;&gt;true&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&#34;as-a-dra-driver-developer&#34;&gt;As a DRA driver developer&lt;/h3&gt;
&lt;p&gt;As a DRA driver developer writing in Golang, you can make a device within a ResourceSlice allocatable to multiple ResourceClaims (or &lt;code&gt;devices.requests&lt;/code&gt;) by setting &lt;code&gt;AllowMultipleAllocations&lt;/code&gt; to &lt;code&gt;true&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-go&#34; data-lang=&#34;go&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;Device {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#666&#34;&gt;...&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  AllowMultipleAllocations: ptr.&lt;span style=&#34;color:#00a000&#34;&gt;To&lt;/span&gt;(&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;true&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#666&#34;&gt;...&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Additionally, you can define a policy to restrict how each device&#39;s &lt;code&gt;Capacity&lt;/code&gt; should be consumed by each &lt;code&gt;DeviceRequest&lt;/code&gt; by defining the &lt;code&gt;RequestPolicy&lt;/code&gt; field in &lt;code&gt;DeviceCapacity&lt;/code&gt;.
The example below shows how to define a policy that requires a GPU with 40 GiB of memory to allocate at least 5 GiB per request, with each allocation in multiples of 5 GiB.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-go&#34; data-lang=&#34;go&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;DeviceCapacity{
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  Value: resource.&lt;span style=&#34;color:#00a000&#34;&gt;MustParse&lt;/span&gt;(&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;40Gi&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  RequestPolicy: &lt;span style=&#34;color:#666&#34;&gt;&amp;amp;&lt;/span&gt;CapacityRequestPolicy{
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    Default: ptr.&lt;span style=&#34;color:#00a000&#34;&gt;To&lt;/span&gt;(resource.&lt;span style=&#34;color:#00a000&#34;&gt;MustParse&lt;/span&gt;(&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;5Gi&amp;#34;&lt;/span&gt;)),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    ValidRange: &lt;span style=&#34;color:#666&#34;&gt;&amp;amp;&lt;/span&gt;CapacityRequestPolicyRange {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      Min: ptr.&lt;span style=&#34;color:#00a000&#34;&gt;To&lt;/span&gt;(resource.&lt;span style=&#34;color:#00a000&#34;&gt;MustParse&lt;/span&gt;(&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;5Gi&amp;#34;&lt;/span&gt;)),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      Step: ptr.&lt;span style=&#34;color:#00a000&#34;&gt;To&lt;/span&gt;(resource.&lt;span style=&#34;color:#00a000&#34;&gt;MustParse&lt;/span&gt;(&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;5Gi&amp;#34;&lt;/span&gt;)),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This will be published to the ResourceSlice, as partially shown below:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;resource.k8s.io/v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;ResourceSlice&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#00f;font-weight:bold&#34;&gt;...&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;devices&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;gpu0&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;allowMultipleAllocations&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;true&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;capacity&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;memory&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;value&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;40Gi&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;requestPolicy&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;default&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;5Gi&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;validRange&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;            &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;min&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;5Gi&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;            &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;step&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;5Gi&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;An allocated device with a specified portion of consumed capacity will have a &lt;code&gt;ShareID&lt;/code&gt; field set in the allocation status.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-go&#34; data-lang=&#34;go&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;claim.Status.Allocation.Devices.Results[i].ShareID
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This &lt;code&gt;ShareID&lt;/code&gt; allows the driver to distinguish between different allocations that refer to the &lt;strong&gt;same device or same statically-partitioned slice&lt;/strong&gt; but come from &lt;strong&gt;different &lt;code&gt;ResourceClaim&lt;/code&gt; requests&lt;/strong&gt;.&lt;br&gt;
It acts as a unique identifier for each shared slice, enabling the driver to manage and enforce resource limits independently across multiple consumers.&lt;/p&gt;
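The `requestPolicy` from the earlier example (default 5Gi, minimum 5Gi, step 5Gi on a 40Gi device) can be read as a resolution rule: apply the default when no amount is requested, enforce the minimum, then round up to the next step. The `resolveRequest` function below is a hypothetical illustration of that rule, not the scheduler's implementation:

```go
package main

import "fmt"

// resolveRequest sketches how a CapacityRequestPolicy with a default, a
// minimum, and a step could resolve a consumer's capacity request against
// a device's total capacity. Values are in GiB. Illustrative only.
func resolveRequest(requestedGiB, defaultGiB, minGiB, stepGiB, totalGiB int64) (int64, bool) {
	if requestedGiB == 0 {
		requestedGiB = defaultGiB // no explicit request: use the policy default
	}
	if minGiB > requestedGiB {
		requestedGiB = minGiB // enforce the minimum
	}
	// Round up to the next multiple of the step.
	rounded := ((requestedGiB + stepGiB - 1) / stepGiB) * stepGiB
	if rounded > totalGiB {
		return 0, false // device cannot satisfy the request
	}
	return rounded, true
}

func main() {
	got, ok := resolveRequest(8, 5, 5, 5, 40)
	fmt.Println(got, ok) // 10 true: an 8 GiB request rounds up to 10 GiB
}
```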
&lt;h3 id=&#34;as-a-consumer&#34;&gt;As a consumer&lt;/h3&gt;
&lt;p&gt;As a consumer (or user), the device resource can be requested with a ResourceClaim like this:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;resource.k8s.io/v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;ResourceClaim&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#00f;font-weight:bold&#34;&gt;...&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;devices&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;requests&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# for devices&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;req0&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;exactly&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;deviceClassName&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;resource.example.com&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;capacity&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;requests&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# for resources which must be provided by those devices&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;            &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;memory&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;10Gi&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This configuration ensures that the requested device can provide at least 10 GiB of &lt;code&gt;memory&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Note that &lt;strong&gt;any&lt;/strong&gt; &lt;code&gt;resource.example.com&lt;/code&gt; device that has at least 10 GiB of memory can be allocated.
If a device that does not support multiple allocations is chosen, the allocation would consume the entire device.
To filter only devices that support multiple allocations, you can define a selector like this:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;selectors&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;cel&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;expression&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;|-&lt;span style=&#34;color:#b44;font-style:italic&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#b44;font-style:italic&#34;&gt;        device.allowMultipleAllocations == true&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;integration-with-dra-device-status&#34;&gt;Integration with DRA device status&lt;/h2&gt;
&lt;p&gt;In device sharing, general device information is provided through the resource slice.
However, some details are set dynamically after allocation.
These can be conveyed using the &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/scheduling-eviction/dynamic-resource-allocation/#resourceclaim-device-status&#34;&gt;&lt;code&gt;.status.devices&lt;/code&gt;&lt;/a&gt; field of a ResourceClaim.
That field is only published in clusters where the &lt;code&gt;DRAResourceClaimDeviceStatus&lt;/code&gt;
feature gate is enabled.&lt;/p&gt;
&lt;p&gt;If you do have &lt;em&gt;device status&lt;/em&gt; support available, a driver can expose additional device-specific information beyond the &lt;code&gt;ShareID&lt;/code&gt;.
One particularly useful use case is for virtual networks, where a driver can include the assigned IP address(es) in the status.
This is valuable for both network service operations and troubleshooting.&lt;/p&gt;
&lt;p&gt;You can find more information by watching our recording at: &lt;a href=&#34;https://sched.co/1x71v&#34;&gt;KubeCon Japan 2025 - Reimagining Cloud Native Networks: The Critical Role of DRA&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;what-can-you-do-next&#34;&gt;What can you do next?&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Check out the &lt;a href=&#34;https://github.com/kubernetes-sigs/cni-dra-driver&#34;&gt;CNI DRA Driver project&lt;/a&gt;&lt;/strong&gt; for an example of DRA integration in Kubernetes networking. Try integrating with network resources like &lt;code&gt;macvlan&lt;/code&gt;, &lt;code&gt;ipvlan&lt;/code&gt;, or smart NICs.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Start enabling the &lt;code&gt;DRAConsumableCapacity&lt;/code&gt; feature gate and experimenting with virtualized or partitionable devices. Specify your workloads with &lt;em&gt;consumable capacity&lt;/em&gt; (for example: fractional bandwidth or memory).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Let us know your feedback:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;✅ What worked well?&lt;/li&gt;
&lt;li&gt;⚠️ What didn’t?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If you encounter issues or see opportunities for enhancement,
please &lt;a href=&#34;https://github.com/kubernetes/enhancements/issues&#34;&gt;file a new issue&lt;/a&gt;
and reference &lt;a href=&#34;https://github.com/kubernetes/enhancements/issues/5075&#34;&gt;KEP-5075&lt;/a&gt; there,
or reach out via &lt;a href=&#34;https://kubernetes.slack.com/archives/C0409NGC1TK&#34;&gt;Slack (#wg-device-management)&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&#34;conclusion&#34;&gt;Conclusion&lt;/h3&gt;
&lt;p&gt;Consumable capacity support enhances the device sharing capability of DRA by allowing effective device sharing across namespaces, across claims, and tailored to each Pod’s actual needs.
It also empowers drivers to enforce capacity limits, improves scheduling accuracy, and unlocks new use cases like bandwidth-aware networking and multi-tenant device sharing.&lt;/p&gt;
&lt;p&gt;Try it out, experiment with consumable resources, and help shape the future of dynamic resource allocation in Kubernetes!&lt;/p&gt;
&lt;h3 id=&#34;further-reading&#34;&gt;Further Reading&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/scheduling-eviction/dynamic-resource-allocation/&#34;&gt;DRA in the Kubernetes documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/kubernetes/enhancements/tree/master/keps/sig-scheduling/4815-dra-partitionable-devices&#34;&gt;KEP for DRA Partitionable Devices&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/4817-resource-claim-device-status&#34;&gt;KEP for DRA Device Status&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/kubernetes/enhancements/tree/master/keps/sig-scheduling/5075-dra-consumable-capacity&#34;&gt;KEP for DRA Consumable Capacity&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://www.kubernetes.dev/resources/release/#kubernetes-v134&#34;&gt;Kubernetes 1.34 Release Notes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

      </description>
    </item>
    
    <item>
      <title>Kubernetes v1.34: Pods Report DRA Resource Health</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/17/kubernetes-v1-34-pods-report-dra-resource-health/</link>
      <pubDate>Wed, 17 Sep 2025 10:30:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/17/kubernetes-v1-34-pods-report-dra-resource-health/</guid>
      <description>
        
        
        &lt;p&gt;The rise of AI/ML and other high-performance workloads has made specialized hardware like GPUs, TPUs, and FPGAs a critical component of many Kubernetes clusters. However, as discussed in a &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/07/03/navigating-failures-in-pods-with-devices/&#34;&gt;previous blog post about navigating failures in Pods with devices&lt;/a&gt;, when this hardware fails, it can be difficult to diagnose, leading to significant downtime. With the release of Kubernetes v1.34, we are excited to announce a new alpha feature that brings much-needed visibility into the health of these devices.&lt;/p&gt;
&lt;p&gt;This work extends the functionality of &lt;a href=&#34;https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/4680-add-resource-health-to-pod-status&#34;&gt;KEP-4680&lt;/a&gt;, which first introduced a mechanism for reporting the health of devices managed by Device Plugins. Now, this capability is being extended to &lt;em&gt;Dynamic Resource Allocation (DRA)&lt;/em&gt;. Controlled by the &lt;code&gt;ResourceHealthStatus&lt;/code&gt; feature gate, this enhancement allows DRA drivers to report device health directly into a Pod&#39;s &lt;code&gt;.status&lt;/code&gt; field, providing crucial insights for operators and developers.&lt;/p&gt;
&lt;h2 id=&#34;why-expose-device-health-in-pod-status&#34;&gt;Why expose device health in Pod status?&lt;/h2&gt;
&lt;p&gt;For stateful applications or long-running jobs, a device failure can be disruptive and costly. By exposing device health in the &lt;code&gt;.status&lt;/code&gt; field for a Pod, Kubernetes provides a standardized way for users and automation tools to quickly diagnose issues. If a Pod is failing, you can now check its status to see if an unhealthy device is the root cause, saving valuable time that might otherwise be spent debugging application code.&lt;/p&gt;
&lt;h2 id=&#34;how-it-works&#34;&gt;How it works&lt;/h2&gt;
&lt;p&gt;This feature introduces a new, optional communication channel between the Kubelet and DRA drivers, built on three core components.&lt;/p&gt;
&lt;h3 id=&#34;a-new-grpc-health-service&#34;&gt;A new gRPC health service&lt;/h3&gt;
&lt;p&gt;A new gRPC service, &lt;code&gt;DRAResourceHealth&lt;/code&gt;, is defined in the &lt;code&gt;dra-health/v1alpha1&lt;/code&gt; API group. DRA drivers can implement this service to stream device health updates to the Kubelet. The service includes a &lt;code&gt;NodeWatchResources&lt;/code&gt; server-streaming RPC that sends the health status (&lt;code&gt;Healthy&lt;/code&gt;, &lt;code&gt;Unhealthy&lt;/code&gt;, or &lt;code&gt;Unknown&lt;/code&gt;) for the devices it manages.&lt;/p&gt;
&lt;h3 id=&#34;kubelet-integration&#34;&gt;Kubelet integration&lt;/h3&gt;
&lt;p&gt;The Kubelet’s &lt;code&gt;DRAPluginManager&lt;/code&gt; discovers which drivers implement the health service. For each compatible driver, it starts a long-lived &lt;code&gt;NodeWatchResources&lt;/code&gt; stream to receive health updates. The DRA Manager then consumes these updates and stores them in a persistent &lt;code&gt;healthInfoCache&lt;/code&gt; that can survive Kubelet restarts.&lt;/p&gt;
&lt;h3 id=&#34;populating-the-pod-status&#34;&gt;Populating the Pod status&lt;/h3&gt;
&lt;p&gt;When a device&#39;s health changes, the DRA manager identifies all Pods affected by the change and triggers a Pod status update. A new field, &lt;code&gt;allocatedResourcesStatus&lt;/code&gt;, is now part of the &lt;code&gt;v1.ContainerStatus&lt;/code&gt; API object. The Kubelet populates this field with the current health of each device allocated to the container.&lt;/p&gt;
&lt;h2 id=&#34;a-practical-example&#34;&gt;A practical example&lt;/h2&gt;
&lt;p&gt;If a Pod is in a &lt;code&gt;CrashLoopBackOff&lt;/code&gt; state, you can use &lt;code&gt;kubectl describe pod &amp;lt;pod-name&amp;gt;&lt;/code&gt; to inspect its status. If an allocated device has failed, the output will now include the &lt;code&gt;allocatedResourcesStatus&lt;/code&gt; field, clearly indicating the problem:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;status&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;containerStatuses&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;my-gpu-intensive-container&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# ... other container statuses&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;allocatedResourcesStatus&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;claim:my-gpu-claim&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;resources&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;resourceID&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;example.com/gpu-a1b2-c3d4&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;health&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;Unhealthy&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This explicit status makes it clear that the issue is with the underlying hardware, not the application.&lt;/p&gt;
&lt;p&gt;This also enables better failure detection logic: automation can now react to unhealthy devices associated with a Pod, for example by deleting the Pod so that it is rescheduled.&lt;/p&gt;
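&lt;p&gt;For example (an illustrative query, not part of the feature itself), you could extract the reported device health for a Pod with a JSONPath expression:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;kubectl get pod &amp;lt;pod-name&amp;gt; \
  -o jsonpath=&#39;{.status.containerStatuses[*].allocatedResourcesStatus}&#39;
&lt;/code&gt;&lt;/pre&gt;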
&lt;h2 id=&#34;how-to-use-this-feature&#34;&gt;How to use this feature&lt;/h2&gt;
&lt;p&gt;As this is an alpha feature in Kubernetes v1.34, you must take the following steps to use it:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Enable the &lt;code&gt;ResourceHealthStatus&lt;/code&gt; feature gate on your kube-apiserver and kubelets.&lt;/li&gt;
&lt;li&gt;Ensure you are using a DRA driver that implements the &lt;code&gt;DRAResourceHealth&lt;/code&gt; gRPC service (&lt;code&gt;dra-health/v1alpha1&lt;/code&gt;).&lt;/li&gt;
&lt;/ol&gt;
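&lt;p&gt;For the first step, a minimal sketch of what enabling the gate could look like (the exact mechanism depends on how you run your control plane; the kubelet gate is typically set in its &lt;code&gt;KubeletConfiguration&lt;/code&gt; file):&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;# kube-apiserver flag (for example, in its static Pod manifest)
--feature-gates=ResourceHealthStatus=true

# KubeletConfiguration fragment
featureGates:
  ResourceHealthStatus: true
&lt;/code&gt;&lt;/pre&gt;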
&lt;h2 id=&#34;dra-drivers&#34;&gt;DRA drivers&lt;/h2&gt;
&lt;p&gt;If you are developing a DRA driver, think about your device failure detection strategy and consider integrating your driver with this feature. Doing so will improve the user experience and simplify debugging of hardware issues.&lt;/p&gt;
&lt;h2 id=&#34;what-s-next&#34;&gt;What&#39;s next?&lt;/h2&gt;
&lt;p&gt;This is the first step in a broader effort to improve how Kubernetes handles device failures. As we gather feedback on this alpha feature, the community is planning several key enhancements before graduating to Beta:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Detailed health messages:&lt;/em&gt; To improve the troubleshooting experience, we plan to add a human-readable message field to the gRPC API. This will allow DRA drivers to provide specific context for a health status, such as &amp;quot;GPU temperature exceeds threshold&amp;quot; or &amp;quot;NVLink connection lost&amp;quot;.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Configurable health timeouts:&lt;/em&gt; The timeout for marking a device&#39;s health as &amp;quot;Unknown&amp;quot; is currently hardcoded. We plan to make this configurable, likely on a per-driver basis, to better accommodate the different health-reporting characteristics of various hardware.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Improved post-mortem troubleshooting:&lt;/em&gt; We will address a known limitation where health updates may not be applied to pods that have already terminated. This fix will ensure that the health status of a device at the time of failure is preserved, which is crucial for troubleshooting batch jobs and other &amp;quot;run-to-completion&amp;quot; workloads.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This feature was developed as part of &lt;a href=&#34;https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/4680-add-resource-health-to-pod-status&#34;&gt;KEP-4680&lt;/a&gt;, and community feedback is crucial as we work toward graduating it to Beta. More improvements to device failure handling in Kubernetes are planned; we encourage you to try this feature out and share your experiences with the SIG Node community!&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Kubernetes v1.34: Moving Volume Group Snapshots to v1beta2</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/16/kubernetes-v1-34-volume-group-snapshot-beta-2/</link>
      <pubDate>Tue, 16 Sep 2025 10:30:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/16/kubernetes-v1-34-volume-group-snapshot-beta-2/</guid>
      <description>
        
        
        &lt;p&gt;Volume group snapshots were &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2023/05/08/kubernetes-1-27-volume-group-snapshot-alpha/&#34;&gt;introduced&lt;/a&gt;
as an Alpha feature with the Kubernetes 1.27 release and moved to &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2024/12/18/kubernetes-1-32-volume-group-snapshot-beta/&#34;&gt;Beta&lt;/a&gt; in the Kubernetes 1.32 release.
The recent release of Kubernetes v1.34 moved that support to a second beta.
The support for volume group snapshots relies on a set of
&lt;a href=&#34;https://kubernetes-csi.github.io/docs/group-snapshot-restore-feature.html#volume-group-snapshot-apis&#34;&gt;extension APIs for group snapshots&lt;/a&gt;.
These APIs allow users to take crash-consistent snapshots of a set of volumes.
Behind the scenes, Kubernetes uses a label selector to group multiple PersistentVolumeClaims
for snapshotting.
A key aim is to allow you to restore that set of snapshots to new volumes and
recover your workload from a crash-consistent recovery point.&lt;/p&gt;
&lt;p&gt;This new feature is only supported for &lt;a href=&#34;https://kubernetes-csi.github.io/docs/&#34;&gt;CSI&lt;/a&gt; volume drivers.&lt;/p&gt;
&lt;h2 id=&#34;what-s-new-in-beta-2&#34;&gt;What&#39;s new in Beta 2?&lt;/h2&gt;
&lt;p&gt;While testing the beta version, we encountered an &lt;a href=&#34;https://github.com/kubernetes-csi/external-snapshotter/issues/1271&#34;&gt;issue&lt;/a&gt; where the &lt;code&gt;restoreSize&lt;/code&gt; field is not set for individual VolumeSnapshotContents and VolumeSnapshots if the CSI driver does not implement the ListSnapshots RPC call.
We evaluated various options &lt;a href=&#34;https://docs.google.com/document/d/1LLBSHcnlLTaP6ZKjugtSGQHH2LGZPndyfnNqR1YvzS4/edit?tab=t.0&#34;&gt;here&lt;/a&gt; and decided to make this change by releasing a new beta of the API.&lt;/p&gt;
&lt;p&gt;Specifically, v1beta2 adds a VolumeSnapshotInfo struct, which contains information about an individual volume snapshot that is a member of a volume group snapshot.
VolumeSnapshotInfoList, a list of VolumeSnapshotInfo, is added to VolumeGroupSnapshotContentStatus, replacing VolumeSnapshotHandlePairList.
It holds the snapshot information returned by the CSI driver to identify snapshots on the storage system,
and is populated by the csi-snapshotter sidecar based on the CSI CreateVolumeGroupSnapshotResponse returned by the driver&#39;s CreateVolumeGroupSnapshot call.&lt;/p&gt;
&lt;p&gt;The existing v1beta1 API objects will be converted to the new v1beta2 API objects by a conversion webhook.&lt;/p&gt;
&lt;h2 id=&#34;what-s-next&#34;&gt;What’s next?&lt;/h2&gt;
&lt;p&gt;Depending on feedback and adoption, the Kubernetes project plans to push the volume
group snapshot implementation to general availability (GA) in a future release.&lt;/p&gt;
&lt;h2 id=&#34;how-can-i-learn-more&#34;&gt;How can I learn more?&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;a href=&#34;https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/3476-volume-group-snapshot&#34;&gt;design spec&lt;/a&gt;
for the volume group snapshot feature.&lt;/li&gt;
&lt;li&gt;The &lt;a href=&#34;https://github.com/kubernetes-csi/external-snapshotter&#34;&gt;code repository&lt;/a&gt; for volume group
snapshot APIs and controller.&lt;/li&gt;
&lt;li&gt;CSI &lt;a href=&#34;https://kubernetes-csi.github.io/docs/&#34;&gt;documentation&lt;/a&gt; on the group snapshot feature.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;how-do-i-get-involved&#34;&gt;How do I get involved?&lt;/h2&gt;
&lt;p&gt;This project, like all of Kubernetes, is the result of hard work by many contributors
from diverse backgrounds working together. On behalf of SIG Storage, I would like to
offer a huge thank you to the contributors who stepped up these last few quarters
to help the project reach beta:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Ben Swartzlander (&lt;a href=&#34;https://github.com/bswartz&#34;&gt;bswartz&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Hemant Kumar (&lt;a href=&#34;https://github.com/gnufied&#34;&gt;gnufied&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Jan Šafránek (&lt;a href=&#34;https://github.com/jsafrane&#34;&gt;jsafrane&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Madhu Rajanna (&lt;a href=&#34;https://github.com/Madhu-1&#34;&gt;Madhu-1&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Michelle Au (&lt;a href=&#34;https://github.com/msau42&#34;&gt;msau42&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Niels de Vos (&lt;a href=&#34;https://github.com/nixpanic&#34;&gt;nixpanic&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Leonardo Cecchi (&lt;a href=&#34;https://github.com/leonardoce&#34;&gt;leonardoce&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Saad Ali (&lt;a href=&#34;https://github.com/saad-ali&#34;&gt;saad-ali&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Xing Yang (&lt;a href=&#34;https://github.com/xing-yang&#34;&gt;xing-yang&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Yati Padia (&lt;a href=&#34;https://github.com/yati1998&#34;&gt;yati1998&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For those interested in getting involved with the design and development of CSI or
any part of the Kubernetes Storage system, join the
&lt;a href=&#34;https://github.com/kubernetes/community/tree/master/sig-storage&#34;&gt;Kubernetes Storage Special Interest Group&lt;/a&gt; (SIG).
We always welcome new contributors.&lt;/p&gt;
&lt;p&gt;We also hold regular &lt;a href=&#34;https://github.com/kubernetes/community/tree/master/wg-data-protection&#34;&gt;Data Protection Working Group meetings&lt;/a&gt;.
New attendees are welcome to join our discussions.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Kubernetes v1.34: Decoupled Taint Manager Is Now Stable</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/15/kubernetes-v1-34-decoupled-taint-manager-is-now-stable/</link>
      <pubDate>Mon, 15 Sep 2025 10:30:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/15/kubernetes-v1-34-decoupled-taint-manager-is-now-stable/</guid>
      <description>
        
        
        &lt;p&gt;This enhancement separates the responsibility of managing node lifecycle and pod eviction into two distinct components.
Previously, the node lifecycle controller handled both marking nodes as unhealthy with NoExecute taints and evicting pods from them.
Now, a dedicated taint eviction controller manages the eviction process, while the node lifecycle controller focuses solely on applying taints.
This separation not only improves code organization but also makes it easier to improve the taint eviction controller or to build custom implementations of taint-based eviction.&lt;/p&gt;
&lt;h2 id=&#34;what-s-new&#34;&gt;What&#39;s new?&lt;/h2&gt;
&lt;p&gt;The feature gate &lt;code&gt;SeparateTaintEvictionController&lt;/code&gt; has been promoted to GA in this release.
Users can optionally disable taint-based eviction by setting &lt;code&gt;--controllers=-taint-eviction-controller&lt;/code&gt;
in kube-controller-manager.&lt;/p&gt;
&lt;h2 id=&#34;how-can-i-learn-more&#34;&gt;How can I learn more?&lt;/h2&gt;
&lt;p&gt;For more details, refer to the &lt;a href=&#34;http://kep.k8s.io/3902&#34;&gt;KEP&lt;/a&gt; and to the beta announcement article: &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2023/12/19/kubernetes-1-29-taint-eviction-controller/&#34;&gt;Kubernetes 1.29: Decoupling taint manager from node lifecycle controller&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;how-to-get-involved&#34;&gt;How to get involved?&lt;/h2&gt;
&lt;p&gt;We offer a huge thank you to all the contributors who helped with design,
implementation, and review of this feature and helped move it from beta to stable:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Ed Bartosh (@bart0sh)&lt;/li&gt;
&lt;li&gt;Yuan Chen (@yuanchen8911)&lt;/li&gt;
&lt;li&gt;Aldo Culquicondor (@alculquicondor)&lt;/li&gt;
&lt;li&gt;Baofa Fan (@carlory)&lt;/li&gt;
&lt;li&gt;Sergey Kanzhelev (@SergeyKanzhelev)&lt;/li&gt;
&lt;li&gt;Tim Bannister (@lmktfy)&lt;/li&gt;
&lt;li&gt;Maciej Skoczeń (@macsko)&lt;/li&gt;
&lt;li&gt;Maciej Szulik (@soltysh)&lt;/li&gt;
&lt;li&gt;Wojciech Tyczynski (@wojtek-t)&lt;/li&gt;
&lt;/ul&gt;

      </description>
    </item>
    
    <item>
      <title>Kubernetes v1.34: Autoconfiguration for Node Cgroup Driver Goes GA</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/12/kubernetes-v1-34-cri-cgroup-driver-lookup-now-ga/</link>
      <pubDate>Fri, 12 Sep 2025 10:30:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/12/kubernetes-v1-34-cri-cgroup-driver-lookup-now-ga/</guid>
      <description>
        
        
        &lt;p&gt;Historically, configuring the correct cgroup driver has been a pain point for users running new
Kubernetes clusters. On Linux systems, there are two different cgroup drivers:
&lt;code&gt;cgroupfs&lt;/code&gt; and &lt;code&gt;systemd&lt;/code&gt;. In the past, both the &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/reference/command-line-tools-reference/kubelet/&#34;&gt;kubelet&lt;/a&gt;
and CRI implementation (like CRI-O or containerd) needed to be configured to use
the same cgroup driver, or else the kubelet would misbehave without any explicit
error message. This was a source of headaches for many cluster admins. Now, we&#39;ve
(almost) arrived at the end of that headache.&lt;/p&gt;
&lt;h2 id=&#34;automated-cgroup-driver-detection&#34;&gt;Automated cgroup driver detection&lt;/h2&gt;
&lt;p&gt;In v1.28.0, the SIG Node community introduced the feature gate
&lt;code&gt;KubeletCgroupDriverFromCRI&lt;/code&gt;, which instructs the kubelet to ask the CRI
implementation which cgroup driver to use. You can read more &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2024/08/21/cri-cgroup-driver-lookup-now-beta/&#34;&gt;here&lt;/a&gt;.
After many releases of waiting for each CRI implementation to have major versions released
and packaged in major operating systems, this feature has gone GA as of Kubernetes 1.34.0.&lt;/p&gt;
&lt;p&gt;Since the feature gate is now enabled by default, a cluster admin mainly needs to ensure their
CRI implementation is new enough:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;containerd: Support was added in v2.0.0&lt;/li&gt;
&lt;li&gt;CRI-O: Support was added in v1.28.0&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;announcement-kubernetes-is-deprecating-containerd-v1-y-support&#34;&gt;Announcement: Kubernetes is deprecating containerd v1.y support&lt;/h2&gt;
&lt;p&gt;While CRI-O releases versions that match Kubernetes versions, and thus CRI-O
versions without this behavior are no longer supported, containerd maintains its
own release cycle. containerd support for this feature is only in v2.0 and
later, but Kubernetes 1.34 still supports containerd 1.7 and other LTS releases
of containerd.&lt;/p&gt;
&lt;p&gt;The Kubernetes SIG Node community has formally agreed upon a final support
timeline for containerd v1.y. The last Kubernetes minor version to offer this support
will be v1.35, and support will be dropped in
v1.36.0. To assist administrators in managing this future transition,
a new detection mechanism is available. You are able to monitor
the &lt;code&gt;kubelet_cri_losing_support&lt;/code&gt; metric to determine if any nodes in your cluster
are using a containerd version that will soon be outdated. The presence of
this metric with a version label of &lt;code&gt;1.36.0&lt;/code&gt; will indicate that the node&#39;s containerd
runtime is not new enough for the upcoming requirements. Consequently, an
administrator will need to upgrade containerd to v2.0 or a later version before,
or at the same time as, upgrading the kubelet to v1.36.0.&lt;/p&gt;
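&lt;p&gt;One illustrative way to check a node today (assuming you can reach the kubelet&#39;s metrics endpoint through the API server proxy) is:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;kubectl get --raw /api/v1/nodes/&amp;lt;node-name&amp;gt;/proxy/metrics \
  | grep kubelet_cri_losing_support
&lt;/code&gt;&lt;/pre&gt;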

      </description>
    </item>
    
    <item>
      <title>Kubernetes v1.34: Mutable CSI Node Allocatable Graduates to Beta</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/11/kubernetes-v1-34-mutable-csi-node-allocatable-count/</link>
      <pubDate>Thu, 11 Sep 2025 10:30:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/11/kubernetes-v1-34-mutable-csi-node-allocatable-count/</guid>
      <description>
        
        
        &lt;p&gt;The &lt;a href=&#34;https://kep.k8s.io/4876&#34;&gt;functionality for CSI drivers to update information about attachable volume count on the nodes&lt;/a&gt;, first introduced as Alpha in Kubernetes v1.33, has graduated to &lt;strong&gt;Beta&lt;/strong&gt; in the Kubernetes v1.34 release! This marks a significant milestone in enhancing the accuracy of stateful pod scheduling by reducing failures due to outdated attachable volume capacity information.&lt;/p&gt;
&lt;h2 id=&#34;background&#34;&gt;Background&lt;/h2&gt;
&lt;p&gt;Traditionally, Kubernetes &lt;a href=&#34;https://kubernetes-csi.github.io/docs/introduction.html&#34;&gt;CSI drivers&lt;/a&gt; report a static maximum volume attachment limit when initializing. However, actual attachment capacities can change during a node&#39;s lifecycle for various reasons, such as:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Manual or external operations attaching/detaching volumes outside of Kubernetes control.&lt;/li&gt;
&lt;li&gt;Dynamically attached network interfaces or specialized hardware (GPUs, NICs, etc.) consuming available slots.&lt;/li&gt;
&lt;li&gt;Multi-driver scenarios, where one CSI driver’s operations affect available capacity reported by another.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Static reporting can cause Kubernetes to schedule pods onto nodes that appear to have capacity but don&#39;t, leading to pods stuck in a &lt;code&gt;ContainerCreating&lt;/code&gt; state.&lt;/p&gt;
&lt;h2 id=&#34;dynamically-adapting-csi-volume-limits&#34;&gt;Dynamically adapting CSI volume limits&lt;/h2&gt;
&lt;p&gt;With this new feature, Kubernetes enables CSI drivers to dynamically adjust and report node attachment capacities at runtime. This ensures that the scheduler, as well as other components relying on this information, have the most accurate, up-to-date view of node capacity.&lt;/p&gt;
&lt;h3 id=&#34;how-it-works&#34;&gt;How it works&lt;/h3&gt;
&lt;p&gt;Kubernetes supports two mechanisms for updating the reported node volume limits:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Periodic Updates:&lt;/strong&gt; CSI drivers specify an interval to periodically refresh the node&#39;s allocatable capacity.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Reactive Updates:&lt;/strong&gt; An immediate update triggered when a volume attachment fails due to exhausted resources (&lt;code&gt;ResourceExhausted&lt;/code&gt; error).&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&#34;enabling-the-feature&#34;&gt;Enabling the feature&lt;/h3&gt;
&lt;p&gt;To use this beta feature, the &lt;code&gt;MutableCSINodeAllocatableCount&lt;/code&gt; feature gate must be enabled in these components:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;kube-apiserver&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;kubelet&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&#34;example-csi-driver-configuration&#34;&gt;Example CSI driver configuration&lt;/h3&gt;
&lt;p&gt;Below is an example of configuring a CSI driver to enable periodic updates every 60 seconds:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: example.csi.k8s.io
spec:
  nodeAllocatableUpdatePeriodSeconds: 60
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This configuration directs kubelet to periodically call the CSI driver&#39;s &lt;code&gt;NodeGetInfo&lt;/code&gt; method every 60 seconds, updating the node’s allocatable volume count. Kubernetes enforces a minimum update interval of 10 seconds to balance accuracy and resource usage.&lt;/p&gt;
&lt;h3 id=&#34;immediate-updates-on-attachment-failures&#34;&gt;Immediate updates on attachment failures&lt;/h3&gt;
&lt;p&gt;When a volume attachment operation fails due to a &lt;code&gt;ResourceExhausted&lt;/code&gt; error (gRPC code &lt;code&gt;8&lt;/code&gt;), Kubernetes immediately updates the allocatable count instead of waiting for the next periodic update. The Kubelet then marks the affected pods as Failed, enabling their controllers to recreate them. This prevents pods from getting permanently stuck in the &lt;code&gt;ContainerCreating&lt;/code&gt; state.&lt;/p&gt;
&lt;h2 id=&#34;getting-started&#34;&gt;Getting started&lt;/h2&gt;
&lt;p&gt;To enable this feature in your Kubernetes v1.34 cluster:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Enable the feature gate &lt;code&gt;MutableCSINodeAllocatableCount&lt;/code&gt; on the &lt;code&gt;kube-apiserver&lt;/code&gt; and &lt;code&gt;kubelet&lt;/code&gt; components.&lt;/li&gt;
&lt;li&gt;Update your CSI driver configuration by setting &lt;code&gt;nodeAllocatableUpdatePeriodSeconds&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Monitor and observe improvements in scheduling accuracy and pod placement reliability.&lt;/li&gt;
&lt;/ol&gt;
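&lt;p&gt;For the monitoring step, one illustrative way to watch the currently reported limit is to read the &lt;code&gt;allocatable.count&lt;/code&gt; for your driver from the node&#39;s CSINode object (the driver name here is an example):&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;kubectl get csinode &amp;lt;node-name&amp;gt; \
  -o jsonpath=&#39;{.spec.drivers[?(@.name==&amp;#34;example.csi.k8s.io&amp;#34;)].allocatable.count}&#39;
&lt;/code&gt;&lt;/pre&gt;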
&lt;h2 id=&#34;next-steps&#34;&gt;Next steps&lt;/h2&gt;
&lt;p&gt;This feature is currently in beta and the Kubernetes community welcomes your feedback. Test it, share your experiences, and help guide its evolution to GA stability.&lt;/p&gt;
&lt;p&gt;Join discussions in the &lt;a href=&#34;https://github.com/kubernetes/community/tree/master/sig-storage&#34;&gt;Kubernetes Storage Special Interest Group (SIG-Storage)&lt;/a&gt; to shape the future of Kubernetes storage capabilities.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Kubernetes v1.34: Use An Init Container To Define App Environment Variables</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/10/kubernetes-v1-34-env-files/</link>
      <pubDate>Wed, 10 Sep 2025 10:30:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/10/kubernetes-v1-34-env-files/</guid>
      <description>
        
        
&lt;p&gt;Kubernetes typically uses ConfigMaps and Secrets to set environment variables,
which introduces additional API calls and complexity.
For example, you need to manage the Pods of your workloads
and their configuration separately, while ensuring orderly
updates for both the configuration and the workload Pods.&lt;/p&gt;
&lt;p&gt;Alternatively, you might be using a vendor-supplied container
that requires environment variables (such as a license key or a one-time token),
but you don’t want to hard-code them or mount volumes just to get the job done.&lt;/p&gt;
&lt;p&gt;If that&#39;s the situation you are in, you now have a new (alpha) way to
achieve that. Provided you have the &lt;code&gt;EnvFiles&lt;/code&gt;
&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/reference/command-line-tools-reference/feature-gates/&#34;&gt;feature gate&lt;/a&gt;
enabled across your cluster, you can tell the kubelet to load a container&#39;s
environment variables from a file in a volume (the volume must be part of the Pod that
the container belongs to).
This lets you load environment variables directly from a file in an &lt;code&gt;emptyDir&lt;/code&gt; volume
without actually mounting that file into the container.
It’s a simple yet elegant solution to some surprisingly common problems.&lt;/p&gt;
&lt;h2 id=&#34;what-s-this-all-about&#34;&gt;What’s this all about?&lt;/h2&gt;
&lt;p&gt;At its core, this feature allows you to point your container to a file,
one generated by an &lt;code&gt;initContainer&lt;/code&gt;,
and have Kubernetes parse that file to set your environment variables.
The file lives in an &lt;code&gt;emptyDir&lt;/code&gt; volume (a temporary storage space that lasts as long as the Pod does).
Your main container doesn’t need to mount the volume.
The kubelet will read the file and inject these variables when the container starts.&lt;/p&gt;
&lt;h2 id=&#34;how-it-works&#34;&gt;How It Works&lt;/h2&gt;
&lt;p&gt;Here&#39;s a simple example:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Pod&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;initContainers&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;generate-config&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;image&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;busybox&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;command&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;[&lt;span style=&#34;color:#b44&#34;&gt;&amp;#39;sh&amp;#39;&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#39;-c&amp;#39;&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#39;echo &amp;#34;CONFIG_VAR=HELLO&amp;#34; &amp;gt; /config/config.env&amp;#39;&lt;/span&gt;]&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;volumeMounts&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;config-volume&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;mountPath&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;/config&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;containers&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;app-container&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;image&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;gcr.io/distroless/static&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;env&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;CONFIG_VAR&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;valueFrom&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;fileKeyRef&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;path&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;config.env&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;volumeName&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;config-volume&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;key&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;CONFIG_VAR&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;volumes&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;config-volume&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;emptyDir&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;{}&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Using this approach is straightforward.
You declare your environment variables in the Pod spec using the &lt;code&gt;fileKeyRef&lt;/code&gt; field,
which tells Kubernetes which file to read and which key to pull from it.
The file itself follows the familiar .env syntax (&lt;code&gt;KEY=VALUE&lt;/code&gt; lines),
and (for this alpha stage at least) it must be written into
an &lt;code&gt;emptyDir&lt;/code&gt; volume; other volume types aren&#39;t supported for this feature.
At least one init container must mount that &lt;code&gt;emptyDir&lt;/code&gt; volume to write the file,
but the main container doesn&#39;t need to mount it: the variables are handed to it at startup.&lt;/p&gt;
&lt;h2 id=&#34;a-word-on-security&#34;&gt;A word on security&lt;/h2&gt;
&lt;p&gt;While this feature supports handling sensitive data such as keys or tokens,
note that its implementation relies on &lt;code&gt;emptyDir&lt;/code&gt; volumes mounted into the Pod.
Anyone with access to the node&#39;s filesystem could therefore
retrieve this sensitive data through the Pod&#39;s directory paths.&lt;/p&gt;
&lt;p&gt;If you store sensitive data such as keys or tokens using this feature,
make sure your cluster security policies protect nodes
against unauthorized access; otherwise this confidential information can be exposed.&lt;/p&gt;
&lt;h2 id=&#34;summary&#34;&gt;Summary&lt;/h2&gt;
&lt;p&gt;This feature eliminates a number of complex workarounds in use today, simplifying
application authoring and opening the door to more use cases. Kubernetes stays flexible and
open to feedback: tell us how you use this feature or what is missing.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Kubernetes v1.34: Snapshottable API server cache</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/09/kubernetes-v1-34-snapshottable-api-server-cache/</link>
      <pubDate>Tue, 09 Sep 2025 10:30:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/09/kubernetes-v1-34-snapshottable-api-server-cache/</guid>
      <description>
        
        
        &lt;p&gt;For years, the Kubernetes community has been on a mission to improve the stability and performance predictability of the API server.
A major focus of this effort has been taming &lt;strong&gt;list&lt;/strong&gt; requests, which have historically been a primary source of high memory usage and heavy load on the &lt;code&gt;etcd&lt;/code&gt; datastore.
With each release, we&#39;ve chipped away at the problem, and today, we&#39;re thrilled to announce the final major piece of this puzzle.&lt;/p&gt;
&lt;p&gt;The &lt;em&gt;snapshottable API server cache&lt;/em&gt; feature has graduated to &lt;strong&gt;Beta&lt;/strong&gt; in Kubernetes v1.34,
culminating a multi-release effort to allow virtually all read requests to be served directly from the API server&#39;s cache.&lt;/p&gt;
&lt;h2 id=&#34;evolving-the-cache-for-performance-and-stability&#34;&gt;Evolving the cache for performance and stability&lt;/h2&gt;
&lt;p&gt;The path to the current state involved several key enhancements over recent releases that paved the way for today&#39;s announcement.&lt;/p&gt;
&lt;h3 id=&#34;consistent-reads-from-cache-beta-in-v1-31&#34;&gt;Consistent reads from cache (Beta in v1.31)&lt;/h3&gt;
&lt;p&gt;While the API server has long used a cache for performance, a key milestone was guaranteeing &lt;em&gt;consistent reads of the latest data&lt;/em&gt; from it. This v1.31 enhancement allowed the watch cache to be used for strongly-consistent read requests for the first time, a huge win as it enabled &lt;em&gt;filtered collections&lt;/em&gt; (e.g. &amp;quot;a list of pods bound to this node&amp;quot;) to be safely served from the cache instead of etcd, dramatically reducing its load for common workloads.&lt;/p&gt;
&lt;h3 id=&#34;taming-large-responses-with-streaming-beta-in-v1-33&#34;&gt;Taming large responses with streaming (Beta in v1.33)&lt;/h3&gt;
&lt;p&gt;Another key improvement was tackling the problem of memory spikes when transmitting large responses. The streaming encoder, introduced in v1.33, allowed the API server to send list items one by one, rather than buffering the entire multi-gigabyte response in memory. This made the memory cost of sending a response predictable and minimal, regardless of its size.&lt;/p&gt;
&lt;h3 id=&#34;the-missing-piece&#34;&gt;The missing piece&lt;/h3&gt;
&lt;p&gt;Despite these huge improvements, a critical gap remained. Any request for a historical &lt;code&gt;LIST&lt;/code&gt;—most commonly used for paginating through large result sets—still had to bypass the cache and query &lt;code&gt;etcd&lt;/code&gt; directly. This meant that the cost of &lt;em&gt;retrieving&lt;/em&gt; the data was still unpredictable and could put significant memory pressure on the API server.&lt;/p&gt;
&lt;h2 id=&#34;kubernetes-1-34-snapshots-complete-the-picture&#34;&gt;Kubernetes 1.34: snapshots complete the picture&lt;/h2&gt;
&lt;p&gt;The &lt;em&gt;snapshottable API server cache&lt;/em&gt; solves this final piece of the puzzle.
This feature enhances the watch cache, enabling it to generate efficient, point-in-time snapshots of its state.&lt;/p&gt;
&lt;p&gt;Here’s how it works: for each update, the cache creates a lightweight snapshot.
These snapshots are &amp;quot;lazy copies,&amp;quot; meaning they don&#39;t duplicate objects but simply store pointers, making them incredibly memory-efficient.&lt;/p&gt;
&lt;p&gt;When a &lt;strong&gt;list&lt;/strong&gt; request for a historical &lt;code&gt;resourceVersion&lt;/code&gt; arrives, the API server now finds the corresponding snapshot and serves the response directly from its memory.
This closes the final major gap, allowing paginated requests to be served entirely from the cache.&lt;/p&gt;
&lt;h2 id=&#34;a-new-era-of-api-server-performance&#34;&gt;A new era of API server performance 🚀&lt;/h2&gt;
&lt;p&gt;With this final piece in place, the synergy of these three features ushers in a new era of API server predictability and performance:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Get Data from Cache&lt;/strong&gt;: &lt;em&gt;Consistent reads&lt;/em&gt; and &lt;em&gt;snapshottable cache&lt;/em&gt; work together to ensure nearly all read requests—whether for the latest data or a historical snapshot—are served from the API server&#39;s memory.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Send data via stream&lt;/strong&gt;: &lt;em&gt;Streaming list responses&lt;/em&gt; ensure that sending this data to the client has a minimal and constant memory footprint.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The result is a system where the resource cost of read operations is almost fully predictable and much more resilient to spikes in request load.
This means dramatically reduced memory pressure, a lighter load on &lt;code&gt;etcd&lt;/code&gt;, and a more stable, scalable, and reliable control plane for all Kubernetes clusters.&lt;/p&gt;
&lt;h2 id=&#34;how-to-get-started&#34;&gt;How to get started&lt;/h2&gt;
&lt;p&gt;With its graduation to Beta, the &lt;code&gt;SnapshottableCache&lt;/code&gt; feature gate is &lt;strong&gt;enabled by default&lt;/strong&gt; in Kubernetes v1.34. No action is required to start benefiting from these performance and stability improvements.&lt;/p&gt;
&lt;h2 id=&#34;acknowledgements&#34;&gt;Acknowledgements&lt;/h2&gt;
&lt;p&gt;Special thanks for designing, implementing, and reviewing these critical features go to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Ahmad Zolfaghari&lt;/strong&gt; (&lt;a href=&#34;https://github.com/ah8ad3&#34;&gt;@ah8ad3&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Ben Luddy&lt;/strong&gt; (&lt;a href=&#34;https://github.com/benluddy&#34;&gt;@benluddy&lt;/a&gt;) – &lt;em&gt;Red Hat&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Chen Chen&lt;/strong&gt; (&lt;a href=&#34;https://github.com/z1cheng&#34;&gt;@z1cheng&lt;/a&gt;) – &lt;em&gt;Microsoft&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Davanum Srinivas&lt;/strong&gt; (&lt;a href=&#34;https://github.com/dims&#34;&gt;@dims&lt;/a&gt;) – &lt;em&gt;Nvidia&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;David Eads&lt;/strong&gt; (&lt;a href=&#34;https://github.com/deads2k&#34;&gt;@deads2k&lt;/a&gt;) – &lt;em&gt;Red Hat&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Han Kang&lt;/strong&gt; (&lt;a href=&#34;https://github.com/logicalhan&#34;&gt;@logicalhan&lt;/a&gt;) – &lt;em&gt;CoreWeave&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;haosdent&lt;/strong&gt; (&lt;a href=&#34;https://github.com/haosdent&#34;&gt;@haosdent&lt;/a&gt;) – &lt;em&gt;Shopee&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Joe Betz&lt;/strong&gt; (&lt;a href=&#34;https://github.com/jpbetz&#34;&gt;@jpbetz&lt;/a&gt;) – &lt;em&gt;Google&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Jordan Liggitt&lt;/strong&gt; (&lt;a href=&#34;https://github.com/liggitt&#34;&gt;@liggitt&lt;/a&gt;) – &lt;em&gt;Google&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Łukasz Szaszkiewicz&lt;/strong&gt; (&lt;a href=&#34;https://github.com/p0lyn0mial&#34;&gt;@p0lyn0mial&lt;/a&gt;) – &lt;em&gt;Red Hat&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Maciej Borsz&lt;/strong&gt; (&lt;a href=&#34;https://github.com/mborsz&#34;&gt;@mborsz&lt;/a&gt;) – &lt;em&gt;Google&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Madhav Jivrajani&lt;/strong&gt; (&lt;a href=&#34;https://github.com/MadhavJivrajani&#34;&gt;@MadhavJivrajani&lt;/a&gt;) – &lt;em&gt;UIUC&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Marek Siarkowicz&lt;/strong&gt; (&lt;a href=&#34;https://github.com/serathius&#34;&gt;@serathius&lt;/a&gt;) – &lt;em&gt;Google&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;NKeert&lt;/strong&gt; (&lt;a href=&#34;https://github.com/NKeert&#34;&gt;@NKeert&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Tim Bannister&lt;/strong&gt; (&lt;a href=&#34;https://github.com/lmktfy&#34;&gt;@lmktfy&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Wei Fu&lt;/strong&gt; (&lt;a href=&#34;https://github.com/fuweid&#34;&gt;@fuweid&lt;/a&gt;) - &lt;em&gt;Microsoft&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Wojtek Tyczyński&lt;/strong&gt; (&lt;a href=&#34;https://github.com/wojtek-t&#34;&gt;@wojtek-t&lt;/a&gt;) – &lt;em&gt;Google&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;...and many others in SIG API Machinery. This milestone is a testament to the community&#39;s dedication to building a more scalable and robust Kubernetes.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Kubernetes v1.34: VolumeAttributesClass for Volume Modification GA</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/08/kubernetes-v1-34-volume-attributes-class/</link>
      <pubDate>Mon, 08 Sep 2025 10:30:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/08/kubernetes-v1-34-volume-attributes-class/</guid>
      <description>
        
        
        &lt;p&gt;The VolumeAttributesClass API, which empowers users to dynamically modify volume attributes, has officially graduated to General Availability (GA) in Kubernetes v1.34. This marks a significant milestone, providing a robust and stable way to tune your persistent storage directly within Kubernetes.&lt;/p&gt;
&lt;h2 id=&#34;what-is-volumeattributesclass&#34;&gt;What is VolumeAttributesClass?&lt;/h2&gt;
&lt;p&gt;At its core, VolumeAttributesClass is a cluster-scoped resource that defines a set of mutable parameters for a volume. Think of it as a &amp;quot;profile&amp;quot; for your storage, allowing cluster administrators to expose different quality-of-service (QoS) levels or performance tiers.&lt;/p&gt;
&lt;p&gt;Users can then specify a &lt;code&gt;volumeAttributesClassName&lt;/code&gt; in their PersistentVolumeClaim (PVC) to indicate which class of attributes they desire. The magic happens through the Container Storage Interface (CSI): when a PVC referencing a VolumeAttributesClass is updated, the associated CSI driver interacts with the underlying storage system to apply the specified changes to the volume.&lt;/p&gt;
&lt;p&gt;This means you can now:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Dynamically scale performance: Increase IOPS or throughput for a busy database, or reduce it for a less critical application.&lt;/li&gt;
&lt;li&gt;Optimize costs: Adjust attributes on the fly to match your current needs, avoiding over-provisioning.&lt;/li&gt;
&lt;li&gt;Simplify operations: Manage volume modifications directly within the Kubernetes API, rather than relying on external tools or manual processes.&lt;/li&gt;
&lt;/ul&gt;
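&lt;p&gt;As a sketch of how the pieces fit together, a cluster administrator could define a class and a user could then reference it from a PVC. The class name, driver name, and parameter keys below are placeholders; the parameters your CSI driver accepts will differ:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;apiVersion: storage.k8s.io/v1
kind: VolumeAttributesClass
metadata:
  name: fast-tier                    # placeholder class name
driverName: example.csi.vendor.com   # placeholder CSI driver
parameters:
  iops: &#34;8000&#34;
  throughput: &#34;500&#34;
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-data
spec:
  volumeAttributesClassName: fast-tier   # request the class above
  accessModes: [&#34;ReadWriteOnce&#34;]
  resources:
    requests:
      storage: 100Gi
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;To move the volume to a different performance tier later, you edit the PVC&#39;s &lt;code&gt;volumeAttributesClassName&lt;/code&gt;, and the CSI driver applies the change to the underlying volume.&lt;/p&gt;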
&lt;h2 id=&#34;what-is-new-from-beta-to-ga&#34;&gt;What is new from Beta to GA&lt;/h2&gt;
&lt;p&gt;There are two major enhancements from beta.&lt;/p&gt;
&lt;h3 id=&#34;cancellation-support-when-errors-occur&#34;&gt;Cancellation support when errors occur&lt;/h3&gt;
&lt;p&gt;To improve resilience and user experience, the GA release introduces explicit cancel support when a requested volume modification encounters an error. If the underlying storage system or CSI driver indicates that the requested changes cannot be applied (e.g., due to invalid arguments), users can cancel the operation and revert the volume to its previous stable configuration, preventing the volume from being left in an inconsistent state.&lt;/p&gt;
&lt;h3 id=&#34;quota-support-based-on-scope&#34;&gt;Quota support based on scope&lt;/h3&gt;
&lt;p&gt;While VolumeAttributesClass doesn&#39;t add a new quota type, the Kubernetes control plane can be configured to enforce quotas on PersistentVolumeClaims that reference a specific VolumeAttributesClass.&lt;/p&gt;
&lt;p&gt;This is achieved by using the &lt;code&gt;scopeSelector&lt;/code&gt; field in a ResourceQuota to target PVCs that have &lt;code&gt;.spec.volumeAttributesClassName&lt;/code&gt; set to a particular VolumeAttributesClass name. See the &lt;a href=&#34;https://kubernetes.io/docs/concepts/policy/resource-quotas/#resource-quota-per-volumeattributesclass&#34;&gt;resource quota documentation&lt;/a&gt; for more details.&lt;/p&gt;
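&lt;p&gt;A minimal sketch of such a quota, based on the Kubernetes resource quota documentation (the class name &lt;code&gt;gold&lt;/code&gt; and the limits are placeholders):&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;apiVersion: v1
kind: ResourceQuota
metadata:
  name: gold-storage-quota
spec:
  hard:
    requests.storage: 500Gi
    persistentvolumeclaims: &#34;10&#34;
  scopeSelector:
    matchExpressions:
    - scopeName: VolumeAttributesClass
      operator: In
      values: [&#34;gold&#34;]
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;PVCs that set &lt;code&gt;.spec.volumeAttributesClassName&lt;/code&gt; to &lt;code&gt;gold&lt;/code&gt; then count against this quota.&lt;/p&gt;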
&lt;h2 id=&#34;drivers-support-volumeattributesclass&#34;&gt;Drivers that support VolumeAttributesClass&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Amazon EBS CSI Driver: The AWS EBS CSI driver has robust support for VolumeAttributesClass and allows you to modify parameters like volume type (e.g., gp2 to gp3, io1 to io2), IOPS, and throughput of EBS volumes dynamically.&lt;/li&gt;
&lt;li&gt;Google Compute Engine (GCE) Persistent Disk CSI Driver (pd.csi.storage.gke.io): This driver also supports dynamic modification of persistent disk attributes, including IOPS and throughput, via VolumeAttributesClass.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;contact&#34;&gt;Contact&lt;/h2&gt;
&lt;p&gt;For any inquiries or specific questions related to VolumeAttributesClass, please reach out to the &lt;a href=&#34;https://github.com/kubernetes/community/tree/master/sig-storage&#34;&gt;SIG Storage community&lt;/a&gt;.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Kubernetes v1.34: Pod Replacement Policy for Jobs Goes GA</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/05/kubernetes-v1-34-pod-replacement-policy-for-jobs-goes-ga/</link>
      <pubDate>Fri, 05 Sep 2025 10:30:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/05/kubernetes-v1-34-pod-replacement-policy-for-jobs-goes-ga/</guid>
      <description>
        
        
        &lt;p&gt;In Kubernetes v1.34, the &lt;em&gt;Pod replacement policy&lt;/em&gt; feature has reached general availability (GA).
This blog post describes the Pod replacement policy feature and how to use it in your Jobs.&lt;/p&gt;
&lt;h2 id=&#34;about-pod-replacement-policy&#34;&gt;About Pod Replacement Policy&lt;/h2&gt;
&lt;p&gt;By default, the Job controller immediately recreates Pods as soon as they fail or begin terminating (when they have a deletion timestamp).&lt;/p&gt;
&lt;p&gt;As a result, while some Pods are terminating, the total number of running Pods for a Job can temporarily exceed the specified parallelism.
For Indexed Jobs, this can even mean multiple Pods running for the same index at the same time.&lt;/p&gt;
&lt;p&gt;This behavior works fine for many workloads, but it can cause problems in certain cases.&lt;/p&gt;
&lt;p&gt;For example, popular machine learning frameworks like TensorFlow and
&lt;a href=&#34;https://jax.readthedocs.io/en/latest/&#34;&gt;JAX&lt;/a&gt; expect exactly one Pod per worker index.
If two Pods run at the same time, you might encounter errors such as:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;/job:worker/task:4: Duplicate task registration with task_name=/job:worker/replica:0/task:4
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Additionally, starting replacement Pods before the old ones fully terminate can lead to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Scheduling delays by kube-scheduler as the nodes remain occupied.&lt;/li&gt;
&lt;li&gt;Unnecessary cluster scale-ups to accommodate the replacement Pods.&lt;/li&gt;
&lt;li&gt;Temporary bypassing of quota checks by workload orchestrators like &lt;a href=&#34;https://kueue.sigs.k8s.io/&#34;&gt;Kueue&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;With Pod replacement policy, Kubernetes gives you control over when the control plane
replaces terminating Pods, helping you avoid these issues.&lt;/p&gt;
&lt;h2 id=&#34;how-pod-replacement-policy-works&#34;&gt;How Pod Replacement Policy works&lt;/h2&gt;
&lt;p&gt;This enhancement means that Jobs in Kubernetes have an optional field &lt;code&gt;.spec.podReplacementPolicy&lt;/code&gt;.&lt;br&gt;
You can choose one of two policies:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;TerminatingOrFailed&lt;/code&gt; (default): Replaces Pods as soon as they start terminating.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Failed&lt;/code&gt;: Replaces Pods only after they fully terminate and transition to the &lt;code&gt;Failed&lt;/code&gt; phase.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Setting the policy to &lt;code&gt;Failed&lt;/code&gt; ensures that a new Pod is only created after the previous one has completely terminated.&lt;/p&gt;
&lt;p&gt;For Jobs with a Pod Failure Policy, the default &lt;code&gt;podReplacementPolicy&lt;/code&gt; is &lt;code&gt;Failed&lt;/code&gt;, and no other value is allowed.
See &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/workloads/controllers/job/#pod-failure-policy&#34;&gt;Pod Failure Policy&lt;/a&gt; to learn more about Pod Failure Policies for Jobs.&lt;/p&gt;
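&lt;p&gt;As an illustration, a Job that defines a Pod failure policy can only use the &lt;code&gt;Failed&lt;/code&gt; replacement policy. The sketch below assumes a placeholder image name and exit code:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;apiVersion: batch/v1
kind: Job
metadata:
  name: job-with-failure-policy
spec:
  podReplacementPolicy: Failed   # the only value allowed alongside podFailurePolicy
  podFailurePolicy:
    rules:
    - action: FailJob            # fail the whole Job on this exit code
      onExitCodes:
        operator: In
        values: [42]
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: worker
        image: your-image
&lt;/code&gt;&lt;/pre&gt;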
&lt;p&gt;You can check how many Pods are currently terminating by inspecting the Job’s &lt;code&gt;.status.terminating&lt;/code&gt; field:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl get job myjob -o&lt;span style=&#34;color:#666&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#b8860b&#34;&gt;jsonpath&lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#39;{.status.terminating}&amp;#39;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;example&#34;&gt;Example&lt;/h2&gt;
&lt;p&gt;Here’s a Job example that executes a task two times (&lt;code&gt;spec.completions: 2&lt;/code&gt;) in parallel (&lt;code&gt;spec.parallelism: 2&lt;/code&gt;) and
replaces Pods only after they fully terminate (&lt;code&gt;spec.podReplacementPolicy: Failed&lt;/code&gt;):&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;batch/v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Job&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;example-job&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;completions&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;2&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;parallelism&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;2&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;podReplacementPolicy&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Failed&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;template&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;restartPolicy&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Never&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;containers&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;worker&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;image&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;your-image&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;If a Pod receives a SIGTERM signal (for example due to deletion, eviction, or preemption), it begins terminating.
If the container handles termination gracefully, cleanup may take some time.&lt;/p&gt;
&lt;p&gt;When the Job starts, we will see two Pods running:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl get pods
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;NAME                READY   STATUS    RESTARTS   AGE
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;example-job-qr8kf   1/1     Running   &lt;span style=&#34;color:#666&#34;&gt;0&lt;/span&gt;          2s
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;example-job-stvb4   1/1     Running   &lt;span style=&#34;color:#666&#34;&gt;0&lt;/span&gt;          2s
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Let&#39;s delete one of the Pods (&lt;code&gt;example-job-qr8kf&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;With the &lt;code&gt;TerminatingOrFailed&lt;/code&gt; policy, as soon as one Pod (&lt;code&gt;example-job-qr8kf&lt;/code&gt;) starts terminating, the Job controller immediately creates a new Pod (&lt;code&gt;example-job-b59zk&lt;/code&gt;) to replace it.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl get pods
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;NAME                READY   STATUS        RESTARTS   AGE
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;example-job-b59zk   1/1     Running       &lt;span style=&#34;color:#666&#34;&gt;0&lt;/span&gt;          1s
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;example-job-qr8kf   1/1     Terminating   &lt;span style=&#34;color:#666&#34;&gt;0&lt;/span&gt;          17s
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;example-job-stvb4   1/1     Running       &lt;span style=&#34;color:#666&#34;&gt;0&lt;/span&gt;          17s
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;With the &lt;code&gt;Failed&lt;/code&gt; policy, the new Pod (&lt;code&gt;example-job-b59zk&lt;/code&gt;) is not created while the old Pod (&lt;code&gt;example-job-qr8kf&lt;/code&gt;) is terminating.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl get pods
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;NAME                READY   STATUS        RESTARTS   AGE
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;example-job-qr8kf   1/1     Terminating   &lt;span style=&#34;color:#666&#34;&gt;0&lt;/span&gt;          17s
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;example-job-stvb4   1/1     Running       &lt;span style=&#34;color:#666&#34;&gt;0&lt;/span&gt;          17s
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;When the terminating Pod has fully transitioned to the &lt;code&gt;Failed&lt;/code&gt; phase, a new Pod is created:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl get pods
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;NAME                READY   STATUS        RESTARTS   AGE
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;example-job-b59zk   1/1     Running       &lt;span style=&#34;color:#666&#34;&gt;0&lt;/span&gt;          1s
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;example-job-stvb4   1/1     Running       &lt;span style=&#34;color:#666&#34;&gt;0&lt;/span&gt;          25s
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;how-can-you-learn-more&#34;&gt;How can you learn more?&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Read the user-facing documentation for &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/workloads/controllers/job/#pod-replacement-policy&#34;&gt;Pod Replacement Policy&lt;/a&gt;,
&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/workloads/controllers/job/#backoff-limit-per-index&#34;&gt;Backoff Limit per Index&lt;/a&gt;, and
&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/workloads/controllers/job/#pod-failure-policy&#34;&gt;Pod Failure Policy&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Read the KEPs for &lt;a href=&#34;https://github.com/kubernetes/enhancements/tree/master/keps/sig-apps/3939-allow-replacement-when-fully-terminated&#34;&gt;Pod Replacement Policy&lt;/a&gt;,
&lt;a href=&#34;https://github.com/kubernetes/enhancements/tree/master/keps/sig-apps/3850-backoff-limits-per-index-for-indexed-jobs&#34;&gt;Backoff Limit per Index&lt;/a&gt;, and
&lt;a href=&#34;https://github.com/kubernetes/enhancements/tree/master/keps/sig-apps/3329-retriable-and-non-retriable-failures&#34;&gt;Pod Failure Policy&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;acknowledgments&#34;&gt;Acknowledgments&lt;/h2&gt;
&lt;p&gt;As with any Kubernetes feature, multiple people contributed to getting this
done, from testing and filing bugs to reviewing code.&lt;/p&gt;
&lt;p&gt;As this feature moves to stable after 2 years, we would like to thank the following people:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/kannon92&#34;&gt;Kevin Hannon&lt;/a&gt; - for writing the KEP and the initial implementation.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/mimowo&#34;&gt;Michał Woźniak&lt;/a&gt; - for guidance, mentorship, and reviews.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/alculquicondor&#34;&gt;Aldo Culquicondor&lt;/a&gt; - for guidance, mentorship, and reviews.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/soltysh&#34;&gt;Maciej Szulik&lt;/a&gt; - for guidance, mentorship, and reviews.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/dejanzele&#34;&gt;Dejan Zele Pejchev&lt;/a&gt; - for taking over the feature and promoting it from Alpha through Beta to GA.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;get-involved&#34;&gt;Get involved&lt;/h2&gt;
&lt;p&gt;This work was sponsored by the Kubernetes
&lt;a href=&#34;https://github.com/kubernetes/community/tree/master/wg-batch&#34;&gt;batch working group&lt;/a&gt;
in close collaboration with the
&lt;a href=&#34;https://github.com/kubernetes/community/tree/master/sig-apps&#34;&gt;SIG Apps&lt;/a&gt; community.&lt;/p&gt;
&lt;p&gt;If you are interested in working on new features in this space, we recommend
subscribing to our &lt;a href=&#34;https://kubernetes.slack.com/messages/wg-batch&#34;&gt;Slack&lt;/a&gt;
channel and attending the regular community meetings.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Kubernetes v1.34: PSI Metrics for Kubernetes Graduates to Beta</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/04/kubernetes-v1-34-introducing-psi-metrics-beta/</link>
      <pubDate>Thu, 04 Sep 2025 10:30:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/04/kubernetes-v1-34-introducing-psi-metrics-beta/</guid>
      <description>
        
        
        &lt;p&gt;As Kubernetes clusters grow in size and complexity, understanding the health and performance of individual nodes becomes increasingly critical. We are excited to announce that as of Kubernetes v1.34, &lt;strong&gt;Pressure Stall Information (PSI) Metrics&lt;/strong&gt; has graduated to Beta.&lt;/p&gt;
&lt;h2 id=&#34;what-is-pressure-stall-information-psi&#34;&gt;What is Pressure Stall Information (PSI)?&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;https://docs.kernel.org/accounting/psi.html&#34;&gt;Pressure Stall Information (PSI)&lt;/a&gt; is a feature of the Linux kernel (version 4.20 and later)
that provides a canonical way to quantify pressure on infrastructure resources,
in terms of whether demand for a resource exceeds current supply.
It moves beyond simple resource utilization metrics and instead
measures the amount of time that tasks are stalled due to resource contention.
This is a powerful way to identify and diagnose resource bottlenecks that can impact application performance.&lt;/p&gt;
&lt;p&gt;PSI exposes metrics for CPU, memory, and I/O, categorized as either &lt;code&gt;some&lt;/code&gt; or &lt;code&gt;full&lt;/code&gt; pressure:&lt;/p&gt;
&lt;dl&gt;
&lt;dt&gt;&lt;code&gt;some&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;The percentage of time that &lt;strong&gt;at least one&lt;/strong&gt; task is stalled on a resource. This indicates some level of resource contention.&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;full&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;The percentage of time that &lt;strong&gt;all&lt;/strong&gt; non-idle tasks are stalled on a resource simultaneously. This indicates a more severe resource bottleneck.&lt;/dd&gt;
&lt;/dl&gt;
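On Linux nodes these figures come from the kernel's PSI interface files (for example, <code>/proc/pressure/memory</code>). Each resource file has one <code>some</code> line and one <code>full</code> line carrying the 10-second, 1-minute, and 5-minute rolling averages (as percentages) plus a cumulative <code>total</code> stall time in microseconds. A sketch of the shape, with illustrative values rather than output from a real node:

```shell
# Shape of a Linux PSI interface file such as /proc/pressure/memory.
# avg10/avg60/avg300 are rolling-average pressure percentages; "total"
# is cumulative stall time in microseconds. Values are illustrative.
psi_sample='some avg10=0.00 avg60=1.25 avg300=0.80 total=1234567
full avg10=0.00 avg60=0.15 avg300=0.10 total=98765'
echo "$psi_sample"
```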


&lt;figure&gt;
    &lt;img src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/04/kubernetes-v1-34-introducing-psi-metrics-beta/psi-metrics-some-vs-full.svg&#34;
         alt=&#34;Diagram illustrating the difference between &amp;#39;some&amp;#39; and &amp;#39;full&amp;#39; PSI pressure.&#34;/&gt; &lt;figcaption&gt;
            &lt;h4&gt;PSI: &amp;#39;Some&amp;#39; vs. &amp;#39;Full&amp;#39; Pressure&lt;/h4&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;These metrics are aggregated over 10-second, 1-minute, and 5-minute rolling windows, providing a comprehensive view of resource pressure over time.&lt;/p&gt;
&lt;h2 id=&#34;psi-metrics-in-kubernetes&#34;&gt;PSI metrics in Kubernetes&lt;/h2&gt;
&lt;p&gt;With the &lt;code&gt;KubeletPSI&lt;/code&gt; feature gate enabled, the kubelet can now collect PSI metrics from the Linux kernel and expose them through two channels: the &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/reference/instrumentation/node-metrics/#summary-api-source&#34;&gt;Summary API&lt;/a&gt; and the &lt;code&gt;/metrics/cadvisor&lt;/code&gt; Prometheus endpoint. This allows you to monitor and alert on resource pressure at the node, pod, and container level.&lt;/p&gt;
&lt;p&gt;The following new metrics are available in Prometheus exposition format via &lt;code&gt;/metrics/cadvisor&lt;/code&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;container_pressure_cpu_stalled_seconds_total&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;container_pressure_cpu_waiting_seconds_total&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;container_pressure_memory_stalled_seconds_total&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;container_pressure_memory_waiting_seconds_total&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;container_pressure_io_stalled_seconds_total&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;container_pressure_io_waiting_seconds_total&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These metrics, along with the data from the Summary API, provide a granular view of resource pressure, enabling you to pinpoint the source of performance issues and take corrective action. For example, you can use these metrics to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Identify memory leaks:&lt;/strong&gt; A steadily increasing &lt;code&gt;some&lt;/code&gt; pressure for memory can indicate a memory leak in an application.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Optimize resource requests and limits:&lt;/strong&gt; By understanding the resource pressure of your workloads, you can more accurately tune their resource requests and limits.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Autoscale workloads:&lt;/strong&gt; You can use PSI metrics to trigger autoscaling events, ensuring that your workloads have the resources they need to perform optimally.&lt;/li&gt;
&lt;/ul&gt;
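Because the <code>/metrics/cadvisor</code> series above are cumulative stall counters (in seconds), a pressure percentage over an interval falls out of two samples; in practice a PromQL <code>rate()</code> over these counters gives the same result. A sketch with made-up sample values:

```shell
# Pressure over an interval, from two samples of a cumulative
# *_seconds_total stall counter. Values are illustrative, not real scrapes.
start=120.0    # counter value at t0, in seconds stalled
end=121.5      # counter value at t0 + 10s
interval=10
pct=$(awk -v s="$start" -v e="$end" -v i="$interval" \
  'BEGIN { printf "%.1f", 100 * (e - s) / i }')
echo "pressure: ${pct}%"   # 1.5s stalled over a 10s window = 15.0%
```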
&lt;h2 id=&#34;how-to-enable-psi-metrics&#34;&gt;How to enable PSI metrics&lt;/h2&gt;
&lt;p&gt;To enable PSI metrics in your Kubernetes cluster, you need to:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Ensure your nodes are running a Linux kernel version 4.20 or later and are using cgroup v2.&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Enable the &lt;code&gt;KubeletPSI&lt;/code&gt; feature gate on the kubelet.&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;
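For step 2, one way to enable the gate is through the kubelet configuration file. A minimal, hypothetical fragment (managed distributions may expose feature gates differently):

```yaml
# Hypothetical KubeletConfiguration fragment enabling PSI metrics collection
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  KubeletPSI: true
```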
&lt;p&gt;Once enabled, you can start scraping the &lt;code&gt;/metrics/cadvisor&lt;/code&gt; endpoint with your Prometheus-compatible monitoring solution or query the Summary API to collect and visualize the new PSI metrics. Note that PSI is a Linux-kernel feature, so these metrics are not available on Windows nodes; in a cluster that mixes Linux and Windows nodes, the kubelet on the Windows nodes simply does not expose them.&lt;/p&gt;
&lt;h2 id=&#34;what-s-next&#34;&gt;What&#39;s next?&lt;/h2&gt;
&lt;p&gt;We are excited to bring PSI metrics to the Kubernetes community and look forward to your feedback. As a beta feature, we are actively working on improving and extending this functionality towards a stable GA release. We encourage you to try it out and share your experiences with us.&lt;/p&gt;
&lt;p&gt;To learn more about PSI metrics, check out the official &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/reference/instrumentation/understand-psi-metrics/&#34;&gt;Kubernetes documentation&lt;/a&gt;. You can also get involved in the conversation on the &lt;a href=&#34;https://kubernetes.slack.com/messages/sig-node&#34;&gt;#sig-node&lt;/a&gt; Slack channel.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Kubernetes v1.34: Service Account Token Integration for Image Pulls Graduates to Beta</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/03/kubernetes-v1-34-sa-tokens-image-pulls-beta/</link>
      <pubDate>Wed, 03 Sep 2025 10:30:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/03/kubernetes-v1-34-sa-tokens-image-pulls-beta/</guid>
      <description>
        
        
        &lt;p&gt;The Kubernetes community continues to advance security best practices
by reducing reliance on long-lived credentials.
Following the successful &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/05/07/kubernetes-v1-33-wi-for-image-pulls/&#34;&gt;alpha release in Kubernetes v1.33&lt;/a&gt;,
&lt;em&gt;Service Account Token Integration for Kubelet Credential Providers&lt;/em&gt;
has now graduated to &lt;strong&gt;beta&lt;/strong&gt; in Kubernetes v1.34,
bringing us closer to eliminating long-lived image pull secrets from Kubernetes clusters.&lt;/p&gt;
&lt;p&gt;This enhancement allows credential providers
to use workload-specific service account tokens to obtain registry credentials,
providing a secure, ephemeral alternative to traditional image pull secrets.&lt;/p&gt;
&lt;h2 id=&#34;what-s-new-in-beta&#34;&gt;What&#39;s new in beta?&lt;/h2&gt;
&lt;p&gt;The beta graduation brings several important changes
that make the feature more robust and production-ready:&lt;/p&gt;
&lt;h3 id=&#34;required-cachetype-field&#34;&gt;Required &lt;code&gt;cacheType&lt;/code&gt; field&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Breaking change from alpha&lt;/strong&gt;: The &lt;code&gt;cacheType&lt;/code&gt; field is &lt;strong&gt;required&lt;/strong&gt;
in the credential provider configuration when using service account tokens.
This field is new in beta and must be specified to ensure proper caching behavior.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# CAUTION: this is not a complete configuration example, just a reference for the &amp;#39;tokenAttributes.cacheType&amp;#39; field.&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;tokenAttributes&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;serviceAccountTokenAudience&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;my-registry-audience&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;cacheType&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;ServiceAccount&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# Required field in beta&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;requireServiceAccount&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;true&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Choose between two caching strategies:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;Token&lt;/code&gt;&lt;/strong&gt;: Cache credentials per service account token
(use when credential lifetime is tied to the token).
This is useful when the credential provider transforms the service account token into registry credentials
with the same lifetime as the token, or when registries support Kubernetes service account tokens directly.
Note: The kubelet cannot send service account tokens directly to registries;
credential provider plugins are needed to transform tokens into the username/password format expected by registries.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;ServiceAccount&lt;/code&gt;&lt;/strong&gt;: Cache credentials per service account identity
(use when credentials are valid for all pods using the same service account).&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&#34;isolated-image-pull-credentials&#34;&gt;Isolated image pull credentials&lt;/h3&gt;
&lt;p&gt;The beta release provides stronger security isolation for container images
when using service account tokens for image pulls.
It ensures that pods can only access images that were pulled using ServiceAccounts they&#39;re authorized to use.
This prevents unauthorized access to sensitive container images
and enables granular access control where different workloads can have different registry permissions
based on their ServiceAccount.&lt;/p&gt;
&lt;p&gt;When credential providers use service account tokens,
the system tracks ServiceAccount identity (namespace, name, and &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/overview/working-with-objects/names/#uids&#34;&gt;UID&lt;/a&gt;) for each pulled image.
When a pod attempts to use a cached image,
the system verifies that the pod&#39;s ServiceAccount matches exactly with the ServiceAccount
that was used to originally pull the image.&lt;/p&gt;
&lt;p&gt;Administrators can revoke access to previously pulled images
by deleting and recreating the ServiceAccount,
which changes the UID and invalidates cached image access.&lt;/p&gt;
&lt;p&gt;For more details about this capability,
see the &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/containers/images/#ensureimagepullcredentialverification&#34;&gt;image pull credential verification&lt;/a&gt; documentation.&lt;/p&gt;
&lt;h2 id=&#34;how-it-works&#34;&gt;How it works&lt;/h2&gt;
&lt;h3 id=&#34;configuration&#34;&gt;Configuration&lt;/h3&gt;
&lt;p&gt;Credential providers opt into using ServiceAccount tokens
by configuring the &lt;code&gt;tokenAttributes&lt;/code&gt; field:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;#&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# CAUTION: this is an example configuration.&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;#          Do not use this for your own cluster!&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;#&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;kubelet.config.k8s.io/v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;CredentialProviderConfig&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;providers&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;my-credential-provider&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;matchImages&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;*.myregistry.io/*&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;defaultCacheDuration&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;10m&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;credentialprovider.kubelet.k8s.io/v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;tokenAttributes&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;serviceAccountTokenAudience&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;my-registry-audience&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;cacheType&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;ServiceAccount&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# New in beta&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;requireServiceAccount&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;true&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;requiredServiceAccountAnnotationKeys&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;myregistry.io/identity-id&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;optionalServiceAccountAnnotationKeys&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;myregistry.io/optional-annotation&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&#34;image-pull-flow&#34;&gt;Image pull flow&lt;/h3&gt;
&lt;p&gt;At a high level, &lt;code&gt;kubelet&lt;/code&gt; coordinates with your credential provider
and the container runtime as follows:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;When the image is not present locally:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;kubelet&lt;/code&gt; checks its credential cache using the configured &lt;code&gt;cacheType&lt;/code&gt;
(&lt;code&gt;Token&lt;/code&gt; or &lt;code&gt;ServiceAccount&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;If needed, &lt;code&gt;kubelet&lt;/code&gt; requests a ServiceAccount token for the pod&#39;s ServiceAccount
and passes it, plus any required annotations, to the credential provider&lt;/li&gt;
&lt;li&gt;The provider exchanges that token for registry credentials
and returns them to &lt;code&gt;kubelet&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;kubelet&lt;/code&gt; caches credentials per the &lt;code&gt;cacheType&lt;/code&gt; strategy
and pulls the image with those credentials&lt;/li&gt;
&lt;li&gt;&lt;code&gt;kubelet&lt;/code&gt; records the ServiceAccount coordinates (namespace, name, UID)
associated with the pulled image for later authorization checks&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;When the image is already present locally:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;kubelet&lt;/code&gt; verifies the pod&#39;s ServiceAccount coordinates
match the coordinates recorded for the cached image&lt;/li&gt;
&lt;li&gt;If they match exactly, the cached image can be used
without pulling from the registry&lt;/li&gt;
&lt;li&gt;If they differ, &lt;code&gt;kubelet&lt;/code&gt; performs a fresh pull
using credentials for the new ServiceAccount&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;With image pull credential verification enabled:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Authorization is enforced using the recorded ServiceAccount coordinates,
ensuring pods only use images pulled by a ServiceAccount
they are authorized to use&lt;/li&gt;
&lt;li&gt;Administrators can revoke access by deleting and recreating a ServiceAccount;
the UID changes and previously recorded authorization no longer matches&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
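The exact-match check that underpins this flow compares the recorded ServiceAccount coordinates against those of the requesting pod. A toy sketch with hypothetical values, showing why recreating a ServiceAccount (new UID) forces a fresh pull:

```shell
# The kubelet records ServiceAccount coordinates (namespace/name/UID) for each
# pulled image and requires an exact match before reusing it. Hypothetical values:
recorded="default/registry-access-sa/1111-aaaa"   # coordinates at pull time
current="default/registry-access-sa/2222-bbbb"    # same SA recreated: new UID
if [ "$recorded" = "$current" ]; then
  verdict="cached image allowed"
else
  verdict="fresh pull required"
fi
echo "$verdict"
```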
&lt;h3 id=&#34;audience-restriction&#34;&gt;Audience restriction&lt;/h3&gt;
&lt;p&gt;The beta release builds on service account node audience restriction
(beta since v1.33) to ensure the &lt;code&gt;kubelet&lt;/code&gt; can only request tokens for authorized audiences.
Administrators use RBAC to define the audiences for which the &lt;code&gt;kubelet&lt;/code&gt; may request service account tokens when pulling images:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;#&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# CAUTION: this is an example configuration.&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;#          Do not use this for your own cluster!&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;#&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;rbac.authorization.k8s.io/v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;ClusterRole&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;kubelet-credential-provider-audiences&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;rules&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;verbs&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;[&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;request-serviceaccounts-token-audience&amp;#34;&lt;/span&gt;]&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiGroups&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;[&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;&amp;#34;&lt;/span&gt;]&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;resources&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;[&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;my-registry-audience&amp;#34;&lt;/span&gt;]&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;resourceNames&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;[&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;registry-access-sa&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;]  # Optional&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;specific SA&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;getting-started-with-beta&#34;&gt;Getting started with beta&lt;/h2&gt;
&lt;h3 id=&#34;prerequisites&#34;&gt;Prerequisites&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Kubernetes v1.34 or later&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Feature gate enabled&lt;/strong&gt;:
&lt;code&gt;KubeletServiceAccountTokenForCredentialProviders=true&lt;/code&gt; (beta, enabled by default)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Credential provider support&lt;/strong&gt;:
Update your credential provider to handle ServiceAccount tokens&lt;/li&gt;
&lt;/ol&gt;
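&lt;p&gt;On nodes where you manage the kubelet configuration yourself, the feature gate lives in the kubelet&#39;s configuration file. It is enabled by default in beta, so the following sketch is only needed if you want to set it explicitly:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-yaml&#34;&gt;apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  # Beta in v1.34; enabled by default, shown here only for explicitness.
  KubeletServiceAccountTokenForCredentialProviders: true
&lt;/code&gt;&lt;/pre&gt;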
&lt;h3 id=&#34;migration-from-alpha&#34;&gt;Migration from alpha&lt;/h3&gt;
&lt;p&gt;If you&#39;re already using the alpha version,
the migration to beta requires minimal changes:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Add &lt;code&gt;cacheType&lt;/code&gt; field&lt;/strong&gt;:
Update your credential provider configuration to include the required &lt;code&gt;cacheType&lt;/code&gt; field&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Review caching strategy&lt;/strong&gt;:
Choose between &lt;code&gt;Token&lt;/code&gt; and &lt;code&gt;ServiceAccount&lt;/code&gt; cache types based on your provider&#39;s behavior&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Test audience restrictions&lt;/strong&gt;:
Ensure your RBAC configuration, or other cluster authorization rules, will properly restrict token audiences&lt;/li&gt;
&lt;/ol&gt;
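&lt;p&gt;As a sketch of the first step, a beta-style credential provider configuration with the &lt;code&gt;cacheType&lt;/code&gt; field might look like the following. The provider name, image match pattern, and audience below are placeholders; adapt them to your provider:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-yaml&#34;&gt;apiVersion: kubelet.config.k8s.io/v1
kind: CredentialProviderConfig
providers:
- name: my-credential-provider   # placeholder provider binary name
  apiVersion: credentialprovider.kubelet.k8s.io/v1
  matchImages:
  - &amp;#34;*.myregistry.example&amp;#34;
  defaultCacheDuration: 10m
  tokenAttributes:
    serviceAccountTokenAudience: my-registry-audience
    requireServiceAccount: true
    # New required field in beta: Token or ServiceAccount.
    cacheType: ServiceAccount
&lt;/code&gt;&lt;/pre&gt;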
&lt;h3 id=&#34;example-setup&#34;&gt;Example setup&lt;/h3&gt;
&lt;p&gt;Here&#39;s a complete example
of setting up a credential provider with service account tokens
(this example assumes your cluster uses RBAC authorization):&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;#&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# CAUTION: this is an example configuration.&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;#          Do not use this for your own cluster!&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;#&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# Service Account with registry annotations&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;ServiceAccount&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;registry-access-sa&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;namespace&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;default&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;annotations&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;myregistry.io/identity-id&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;user123&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#00f;font-weight:bold&#34;&gt;---&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# RBAC for audience restriction&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;rbac.authorization.k8s.io/v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;ClusterRole&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;registry-audience-access&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;rules&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;verbs&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;[&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;request-serviceaccounts-token-audience&amp;#34;&lt;/span&gt;]&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiGroups&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;[&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;&amp;#34;&lt;/span&gt;]&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;resources&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;[&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;my-registry-audience&amp;#34;&lt;/span&gt;]&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;resourceNames&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;[&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;registry-access-sa&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;]  # Optional&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;specific ServiceAccount&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#00f;font-weight:bold&#34;&gt;---&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;rbac.authorization.k8s.io/v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;ClusterRoleBinding&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;kubelet-registry-audience&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;roleRef&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiGroup&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;rbac.authorization.k8s.io&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;ClusterRole&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;registry-audience-access&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;subjects&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Group&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;system:nodes&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiGroup&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;rbac.authorization.k8s.io&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#00f;font-weight:bold&#34;&gt;---&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# Pod using the ServiceAccount&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Pod&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;my-pod&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;serviceAccountName&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;registry-access-sa&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;containers&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;my-app&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;image&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;myregistry.example/my-app:latest&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;what-s-next&#34;&gt;What&#39;s next?&lt;/h2&gt;
&lt;p&gt;For Kubernetes v1.35, we (Kubernetes SIG Auth) expect the feature to remain in beta,
and we will continue to solicit feedback.&lt;/p&gt;
&lt;p&gt;You can learn more about this feature
on the &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/tasks/administer-cluster/kubelet-credential-provider/#service-account-token-for-image-pulls&#34;&gt;service account token for image pulls&lt;/a&gt;
page in the Kubernetes documentation.&lt;/p&gt;
&lt;p&gt;You can also follow along with
&lt;a href=&#34;https://kep.k8s.io/4412&#34;&gt;KEP-4412&lt;/a&gt;
to track progress across the coming Kubernetes releases.&lt;/p&gt;
&lt;h2 id=&#34;call-to-action&#34;&gt;Call to action&lt;/h2&gt;
&lt;p&gt;In this blog post,
I covered the beta graduation of ServiceAccount token integration
for Kubelet Credential Providers in Kubernetes v1.34,
including the key improvements:
the required &lt;code&gt;cacheType&lt;/code&gt; field
and enhanced integration with Ensure Secret Pulled Images.&lt;/p&gt;
&lt;p&gt;We received positive feedback from the community during the alpha phase
and would love to hear more as we stabilize this feature for GA.
In particular, we would like feedback from credential provider implementors
as they integrate with the new beta API and caching mechanisms.
Please reach out to us on the &lt;a href=&#34;https://kubernetes.slack.com/archives/C04UMAUC4UA&#34;&gt;#sig-auth-authenticators-dev&lt;/a&gt; channel on Kubernetes Slack.&lt;/p&gt;
&lt;h2 id=&#34;how-to-get-involved&#34;&gt;How to get involved&lt;/h2&gt;
&lt;p&gt;If you are interested in getting involved in the development of this feature,
sharing feedback, or participating in any other ongoing SIG Auth projects,
please reach out on the &lt;a href=&#34;https://kubernetes.slack.com/archives/C0EN96KUY&#34;&gt;#sig-auth&lt;/a&gt; channel on Kubernetes Slack.&lt;/p&gt;
&lt;p&gt;You are also welcome to join the &lt;a href=&#34;https://github.com/kubernetes/community/blob/master/sig-auth/README.md#meetings&#34;&gt;SIG Auth meetings&lt;/a&gt;,
held every other Wednesday.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Kubernetes v1.34: Introducing CPU Manager Static Policy Option for Uncore Cache Alignment</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/02/kubernetes-v1-34-prefer-align-by-uncore-cache-cpumanager-static-policy-optimization/</link>
      <pubDate>Tue, 02 Sep 2025 10:30:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/02/kubernetes-v1-34-prefer-align-by-uncore-cache-cpumanager-static-policy-optimization/</guid>
      <description>
        
        
        &lt;p&gt;A new CPU Manager Static Policy Option called &lt;code&gt;prefer-align-cpus-by-uncorecache&lt;/code&gt; was introduced in Kubernetes v1.32 as an alpha feature, and has graduated to &lt;strong&gt;beta&lt;/strong&gt; in Kubernetes v1.34.
This CPU Manager Policy Option is designed to optimize performance for specific workloads running on processors with a &lt;em&gt;split uncore cache&lt;/em&gt; architecture.
In this article, I&#39;ll explain what that means and why it&#39;s useful.&lt;/p&gt;
&lt;h2 id=&#34;understanding-the-feature&#34;&gt;Understanding the feature&lt;/h2&gt;
&lt;h3 id=&#34;what-is-uncore-cache&#34;&gt;What is uncore cache?&lt;/h3&gt;
&lt;p&gt;Until relatively recently, nearly all mainstream computer processors had a
monolithic last-level cache that was shared across every core in a multi-core
CPU package.
This monolithic cache is also referred to as &lt;em&gt;uncore cache&lt;/em&gt;
(because it is not linked to a specific core), or as Level 3 cache.
As well as the Level 3 cache, there is other cache, commonly called Level 1 and Level 2 cache,
that &lt;strong&gt;is&lt;/strong&gt; associated with a specific CPU core.&lt;/p&gt;
&lt;p&gt;In order to reduce access latency between the CPU cores and their cache, recent AMD64 and ARM
architecture based processors have introduced a &lt;em&gt;split uncore cache&lt;/em&gt; architecture,
where the last-level cache is divided into multiple physical caches
that are aligned to specific CPU groupings within the physical package.
The shorter distances within the CPU package help to reduce latency.
&lt;img alt=&#34;Diagram showing monolithic cache on the left and split cache on the right&#34; src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/02/kubernetes-v1-34-prefer-align-by-uncore-cache-cpumanager-static-policy-optimization/mono_vs_split_uncore.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;Kubernetes is able to place workloads in a way that accounts for the cache
topology within the CPU package(s).&lt;/p&gt;
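&lt;p&gt;If you want to check whether a node&#39;s processors use a split uncore cache, one option is to inspect the last-level cache topology that Linux exposes through sysfs. On most systems &lt;code&gt;index3&lt;/code&gt; corresponds to the L3 cache, though paths can vary by kernel and environment:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-shell&#34;&gt;# Each unique line is one L3 (uncore) cache domain and the CPUs it serves;
# more than one line indicates a split uncore cache.
sort -u /sys/devices/system/cpu/cpu*/cache/index3/shared_cpu_list
&lt;/code&gt;&lt;/pre&gt;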
&lt;h3 id=&#34;cache-aware-workload-placement&#34;&gt;Cache-aware workload placement&lt;/h3&gt;
&lt;p&gt;The matrix below shows the &lt;a href=&#34;https://github.com/nviennot/core-to-core-latency&#34;&gt;CPU-to-CPU latency&lt;/a&gt;, measured in nanoseconds (lower is better), when
passing a packet between CPUs via the cache coherence protocol, on a processor that
uses a split uncore cache.
In this example, the processor package consists of 2 uncore caches.
Each uncore cache serves 8 CPU cores.
&lt;img alt=&#34;Table showing CPU-to-CPU latency figures&#34; src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/02/kubernetes-v1-34-prefer-align-by-uncore-cache-cpumanager-static-policy-optimization/c2c_latency.png&#34;&gt;
Blue entries in the matrix represent latency between CPUs sharing the same uncore cache, while grey entries indicate latency between CPUs on different uncore caches. Latency between CPUs that correspond to different caches is higher than latency between CPUs that belong to the same cache.&lt;/p&gt;
&lt;p&gt;With &lt;code&gt;prefer-align-cpus-by-uncorecache&lt;/code&gt; enabled, the
&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/policy/node-resource-managers/#static-policy&#34;&gt;static CPU Manager&lt;/a&gt; attempts to allocate CPU resources for a container so that all CPUs assigned to the container share the same uncore cache.
This policy operates on a best-effort basis, aiming to minimize the distribution of a container&#39;s CPU resources across uncore caches, based on the
container&#39;s requirements, and accounting for allocatable resources on the node.&lt;/p&gt;
&lt;p&gt;By running a workload, where it can, on a set of CPUs that use the smallest feasible number of uncore caches, applications benefit from reduced cache latency (as seen in the matrix above)
and from reduced contention with other workloads, which can result in higher overall throughput.
The benefit only shows up if your nodes&#39; processors use a split uncore cache topology.&lt;/p&gt;
&lt;p&gt;The diagram below illustrates uncore cache alignment when the feature is enabled.&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;Diagram showing an example workload CPU assignment, default static policy, and with prefer-align-cpus-by-uncorecache&#34; src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/02/kubernetes-v1-34-prefer-align-by-uncore-cache-cpumanager-static-policy-optimization/cache-align-diagram.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;By default, Kubernetes does not account for uncore cache topology; containers are assigned CPU resources using a packed methodology.
As a result, Container 1 and Container 2 can experience a noisy neighbor impact due to
cache access contention on Uncore Cache 0. Additionally, Container 2 will have CPUs distributed across both caches, which can introduce cross-cache latency.&lt;/p&gt;
&lt;p&gt;With &lt;code&gt;prefer-align-cpus-by-uncorecache&lt;/code&gt; enabled, each container is isolated on an individual cache. This resolves the cache contention between the containers and minimizes the cache latency for the CPUs being utilized.&lt;/p&gt;
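&lt;p&gt;Note that only containers that receive exclusive CPUs from the static policy are candidates for this alignment: the pod must be in the Guaranteed QoS class and request a whole number of CPUs. A minimal sketch (the image reference is a placeholder):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-yaml&#34;&gt;apiVersion: v1
kind: Pod
metadata:
  name: cache-aligned-app
spec:
  containers:
  - name: app
    image: registry.example/app:latest   # placeholder image
    resources:
      # Equal, integer CPU requests and limits give the pod Guaranteed QoS,
      # so the static CPU Manager assigns it exclusive CPUs.
      requests:
        cpu: 8
        memory: 4Gi
      limits:
        cpu: 8
        memory: 4Gi
&lt;/code&gt;&lt;/pre&gt;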
&lt;h2 id=&#34;use-cases&#34;&gt;Use cases&lt;/h2&gt;
&lt;p&gt;Common use cases can include telco applications like vRAN, Mobile Packet Core, and Firewalls. It&#39;s important to note that the optimization provided by &lt;code&gt;prefer-align-cpus-by-uncorecache&lt;/code&gt; can be dependent on the workload. For example, applications that are memory bandwidth bound may not benefit from uncore cache alignment, as utilizing more uncore caches can increase memory bandwidth access.&lt;/p&gt;
&lt;h2 id=&#34;enabling-the-feature&#34;&gt;Enabling the feature&lt;/h2&gt;
&lt;p&gt;To enable this feature, set the CPU Manager Policy to &lt;code&gt;static&lt;/code&gt; and enable the CPU Manager Policy Options with &lt;code&gt;prefer-align-cpus-by-uncorecache&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;For Kubernetes 1.34, the feature is in the beta stage and requires the &lt;code&gt;CPUManagerPolicyBetaOptions&lt;/code&gt;
&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/reference/command-line-tools-reference/feature-gates/&#34;&gt;feature gate&lt;/a&gt; to also be enabled.&lt;/p&gt;
&lt;p&gt;Append the following to the kubelet configuration file:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;KubeletConfiguration&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;kubelet.config.k8s.io/v1beta1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;featureGates&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;...&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;CPUManagerPolicyBetaOptions&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;true&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;cpuManagerPolicy&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;static&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;cpuManagerPolicyOptions&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;prefer-align-cpus-by-uncorecache&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;true&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;reservedSystemCPUs&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;0&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#00f;font-weight:bold&#34;&gt;...&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;If you&#39;re making this change to an existing node, remove the &lt;code&gt;cpu_manager_state&lt;/code&gt; file and then restart kubelet.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;prefer-align-cpus-by-uncorecache&lt;/code&gt; can also be enabled on nodes whose processors have a monolithic uncore cache. In that case, the feature mimics a best-effort socket alignment, packing CPU resources onto the socket much like the default static CPU Manager policy.&lt;/p&gt;
&lt;h2 id=&#34;further-reading&#34;&gt;Further reading&lt;/h2&gt;
&lt;p&gt;See &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/policy/node-resource-managers/&#34;&gt;Node Resource Managers&lt;/a&gt; to learn more about the CPU Manager and the available policies.&lt;/p&gt;
&lt;p&gt;For details about the option itself, see the &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/policy/node-resource-managers/#prefer-align-cpus-by-uncorecache&#34;&gt;documentation for &lt;code&gt;prefer-align-cpus-by-uncorecache&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Please see the &lt;a href=&#34;https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/4800-cpumanager-split-uncorecache&#34;&gt;Kubernetes Enhancement Proposal&lt;/a&gt; for more information on how &lt;code&gt;prefer-align-cpus-by-uncorecache&lt;/code&gt; is implemented.&lt;/p&gt;
&lt;h2 id=&#34;getting-involved&#34;&gt;Getting involved&lt;/h2&gt;
&lt;p&gt;This feature is driven by &lt;a href=&#34;https://github.com/kubernetes/community/blob/master/sig-node/README.md&#34;&gt;SIG Node&lt;/a&gt;. If you are interested in helping develop this feature, sharing feedback, or participating in any other ongoing SIG Node projects, please attend the SIG Node meeting for more details.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Kubernetes v1.34: DRA has graduated to GA</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/01/kubernetes-v1-34-dra-updates/</link>
      <pubDate>Mon, 01 Sep 2025 10:30:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/01/kubernetes-v1-34-dra-updates/</guid>
      <description>
        
        
        &lt;p&gt;Kubernetes 1.34 is here, and it has brought a huge wave of enhancements for Dynamic Resource Allocation (DRA)! This
release marks a major milestone with many APIs in the &lt;code&gt;resource.k8s.io&lt;/code&gt; group graduating to General Availability (GA),
unlocking the full potential of how you manage devices on Kubernetes. On top of that, several key features have
moved to beta, and a fresh batch of new alpha features promise even more expressiveness and flexibility.&lt;/p&gt;
&lt;p&gt;Let&#39;s dive into what&#39;s new for DRA in Kubernetes 1.34!&lt;/p&gt;
&lt;h2 id=&#34;the-core-of-dra-is-now-ga&#34;&gt;The core of DRA is now GA&lt;/h2&gt;
&lt;p&gt;The headline feature of the v1.34 release is that the core of DRA has graduated to General Availability.&lt;/p&gt;
&lt;p&gt;Kubernetes &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/scheduling-eviction/dynamic-resource-allocation/&#34;&gt;Dynamic Resource Allocation (DRA)&lt;/a&gt; provides
a flexible framework for managing specialized hardware and infrastructure resources, such as GPUs or FPGAs. DRA
provides APIs that let each workload specify the properties of the devices it needs, while leaving it to the
scheduler to allocate actual devices; this increases reliability and improves utilization of expensive hardware.&lt;/p&gt;
&lt;p&gt;With the graduation to GA, DRA is stable and will be part of Kubernetes for the long run. The community can still
expect a steady stream of new DRA features over the next several Kubernetes releases, but none of them will
introduce breaking changes. Users and developers of DRA drivers can therefore adopt DRA with confidence.&lt;/p&gt;
&lt;p&gt;Starting with Kubernetes 1.34, DRA is enabled by default; the DRA features that have reached beta are &lt;strong&gt;also&lt;/strong&gt; enabled by default.
That&#39;s because the default API version for DRA is now the stable &lt;code&gt;v1&lt;/code&gt; version, and not the earlier versions
(e.g., &lt;code&gt;v1beta1&lt;/code&gt; or &lt;code&gt;v1beta2&lt;/code&gt;) that needed explicit opt-in.&lt;/p&gt;
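&lt;p&gt;For illustration, a minimal ResourceClaim using the stable API could look like the following; the device class name is hypothetical, and the field layout is as of v1.34:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-yaml&#34;&gt;apiVersion: resource.k8s.io/v1
kind: ResourceClaim
metadata:
  name: single-gpu
spec:
  devices:
    requests:
    - name: gpu
      exactly:
        deviceClassName: gpu.example.com
&lt;/code&gt;&lt;/pre&gt;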
&lt;h2 id=&#34;features-promoted-to-beta&#34;&gt;Features promoted to beta&lt;/h2&gt;
&lt;p&gt;Several powerful features have been promoted to beta, adding more control, flexibility, and observability to resource
management with DRA.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/scheduling-eviction/dynamic-resource-allocation/#admin-access&#34;&gt;Admin access labelling&lt;/a&gt; has been updated.
In v1.34, requesting admin access to devices can be restricted to people (or software) authorized to use it. This
prevents privilege escalation in case a DRA driver grants additional privileges when admin access is requested,
and it prevents access to devices that are in use by normal applications, potentially in another namespace.
The restriction works by ensuring that only users with access to a namespace carrying the
&lt;code&gt;resource.k8s.io/admin-access: &amp;quot;true&amp;quot;&lt;/code&gt; label are authorized to create
ResourceClaim or ResourceClaimTemplate objects with the &lt;code&gt;adminAccess&lt;/code&gt; field set to true, so non-admin users cannot misuse the feature.&lt;/p&gt;
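&lt;p&gt;As a sketch, granting a namespace the admin-access label looks like this (the namespace name is illustrative):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-yaml&#34;&gt;apiVersion: v1
kind: Namespace
metadata:
  name: dra-admins
  labels:
    resource.k8s.io/admin-access: &amp;quot;true&amp;quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Only users who can create objects in such a namespace may set &lt;code&gt;adminAccess: true&lt;/code&gt; in their claims.&lt;/p&gt;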
&lt;p&gt;&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/scheduling-eviction/dynamic-resource-allocation/#prioritized-list&#34;&gt;Prioritized list&lt;/a&gt; lets users specify
a list of acceptable devices for their workloads, rather than just a single type of device. So while the workload
might run best on a single high-performance GPU, it might also be able to run on two mid-level GPUs. The scheduler will
attempt to satisfy the alternatives in the list in order, so the workload will be allocated the best set of devices
available on the node.&lt;/p&gt;
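&lt;p&gt;As a sketch of the API shape (the device class names are hypothetical), the GPU fallback described above could be expressed as a prioritized list of sub-requests, which the scheduler tries in order:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-yaml&#34;&gt;apiVersion: resource.k8s.io/v1
kind: ResourceClaim
metadata:
  name: gpu-with-fallback
spec:
  devices:
    requests:
    - name: gpu
      firstAvailable:
      - name: high-perf
        deviceClassName: high-perf-gpu.example.com
      - name: mid-level
        deviceClassName: mid-level-gpu.example.com
        count: 2
&lt;/code&gt;&lt;/pre&gt;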
&lt;p&gt;The kubelet&#39;s API has been updated to report on Pod resources allocated through DRA. This allows node monitoring agents
to know the allocated DRA resources for Pods on a node and makes it possible to use the DRA information in the PodResources API
to develop new features and integrations.&lt;/p&gt;
&lt;h2 id=&#34;new-alpha-features&#34;&gt;New alpha features&lt;/h2&gt;
&lt;p&gt;Kubernetes 1.34 also introduces several new alpha features that give us a glimpse into the future of resource management with DRA.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/scheduling-eviction/dynamic-resource-allocation/#extended-resource&#34;&gt;Extended resource mapping&lt;/a&gt; support in DRA allows
cluster administrators to advertise DRA-managed resources as &lt;em&gt;extended resources&lt;/em&gt;, letting developers consume them using
the familiar, simpler request syntax while still benefiting from dynamic allocation. This makes it possible for existing
workloads to start using DRA without modifications, simplifying the transition to DRA for both application developers and
cluster administrators.&lt;/p&gt;
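&lt;p&gt;As an illustrative sketch under this alpha feature (the names are hypothetical, and the exact field layout may change while the feature is alpha), a DeviceClass could advertise an extended resource name, letting a Pod request the device with the familiar syntax:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-yaml&#34;&gt;apiVersion: resource.k8s.io/v1
kind: DeviceClass
metadata:
  name: gpu.example.com
spec:
  extendedResourceName: example.com/gpu
---
apiVersion: v1
kind: Pod
metadata:
  name: legacy-style-workload
spec:
  containers:
  - name: app
    image: registry.example/app:1.0
    resources:
      limits:
        example.com/gpu: 1
&lt;/code&gt;&lt;/pre&gt;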
&lt;p&gt;&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/scheduling-eviction/dynamic-resource-allocation/#consumable-capacity&#34;&gt;Consumable capacity&lt;/a&gt; introduces a flexible
device sharing model where multiple, independent resource claims from unrelated
pods can each be allocated a share of the same underlying physical device. This new capability is managed through optional,
administrator-defined sharing policies that govern how a device&#39;s total capacity is divided and enforced by the platform for
each request. This allows for sharing of devices in scenarios where pre-defined partitions are not viable.&lt;/p&gt;
&lt;p&gt;For more information, see &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/18/kubernetes-v1-34-dra-consumable-capacity/&#34;&gt;Kubernetes v1.34: DRA Consumable Capacity&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/scheduling-eviction/dynamic-resource-allocation/#binding-conditions&#34;&gt;Binding conditions&lt;/a&gt; improve scheduling
reliability for certain classes of devices by allowing the Kubernetes scheduler to delay binding a pod to a node until its
required external resources, such as attachable devices or FPGAs, are confirmed to be fully prepared. This prevents premature
pod assignments that could lead to failures and ensures more robust, predictable scheduling by explicitly modeling resource
readiness before the pod is committed to a node.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Resource health status&lt;/em&gt; for DRA improves observability by exposing the health status of devices allocated to a Pod via Pod Status.
This works whether the device is allocated through DRA or Device Plugin. This makes it easier to understand the cause of an
unhealthy device and respond properly.&lt;/p&gt;
&lt;p&gt;For more information, see &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/09/17/kubernetes-v1-34-pods-report-dra-resource-health/&#34;&gt;Kubernetes v1.34: Pods Report DRA Resource Health&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;what-s-next&#34;&gt;What’s next?&lt;/h2&gt;
&lt;p&gt;While DRA was promoted to GA this cycle, the hard work doesn&#39;t stop. There are several features in alpha and beta that
we plan to bring to GA over the next couple of releases, and we will continue to improve the performance, scalability,
and reliability of DRA. So expect an equally ambitious set of DRA features in the 1.35 release.&lt;/p&gt;
&lt;h2 id=&#34;getting-involved&#34;&gt;Getting involved&lt;/h2&gt;
&lt;p&gt;A good starting point is joining the WG Device Management &lt;a href=&#34;https://kubernetes.slack.com/archives/C0409NGC1TK&#34;&gt;Slack channel&lt;/a&gt; and &lt;a href=&#34;https://docs.google.com/document/d/1qxI87VqGtgN7EAJlqVfxx86HGKEAc2A3SKru8nJHNkQ/edit?tab=t.0#heading=h.tgg8gganowxq&#34;&gt;meetings&lt;/a&gt;, which happen at US/EU and EU/APAC friendly time slots.&lt;/p&gt;
&lt;p&gt;Not all enhancement ideas are tracked as issues yet, so come talk to us if you want to help or have some ideas yourself! We have work to do at all levels, from difficult core changes to usability enhancements in kubectl, which could be picked up by newcomers.&lt;/p&gt;
&lt;h2 id=&#34;acknowledgments&#34;&gt;Acknowledgments&lt;/h2&gt;
&lt;p&gt;A huge thanks to the new contributors to DRA this cycle:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Alay Patel (&lt;a href=&#34;https://github.com/alaypatel07&#34;&gt;alaypatel07&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Gaurav Kumar Ghildiyal (&lt;a href=&#34;https://github.com/gauravkghildiyal&#34;&gt;gauravkghildiyal&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;JP (&lt;a href=&#34;https://github.com/Jpsassine&#34;&gt;Jpsassine&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Kobayashi Daisuke (&lt;a href=&#34;https://github.com/KobayashiD27&#34;&gt;KobayashiD27&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Laura Lorenz (&lt;a href=&#34;https://github.com/lauralorenz&#34;&gt;lauralorenz&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Sunyanan Choochotkaew (&lt;a href=&#34;https://github.com/sunya-ch&#34;&gt;sunya-ch&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Swati Gupta (&lt;a href=&#34;https://github.com/guptaNswati&#34;&gt;guptaNswati&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Yu Liao (&lt;a href=&#34;https://github.com/yliaog&#34;&gt;yliaog&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

      </description>
    </item>
    
    <item>
      <title>Kubernetes v1.34: Finer-Grained Control Over Container Restarts</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/08/29/kubernetes-v1-34-per-container-restart-policy/</link>
      <pubDate>Fri, 29 Aug 2025 10:30:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/08/29/kubernetes-v1-34-per-container-restart-policy/</guid>
      <description>
        
        
        &lt;p&gt;With the release of Kubernetes 1.34, a new alpha feature is introduced
that gives you more granular control over container restarts within a Pod. This
feature, named &lt;strong&gt;Container Restart Policy and Rules&lt;/strong&gt;, allows you to specify a
restart policy for each container individually, overriding the Pod&#39;s global
restart policy. It also allows you to conditionally restart individual
containers based on their exit codes. This feature is available
behind the alpha feature gate &lt;code&gt;ContainerRestartRules&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;This has been a long-requested feature. Let&#39;s dive into how it works and how you
can use it.&lt;/p&gt;
&lt;h2 id=&#34;the-problem-with-a-single-restart-policy&#34;&gt;The problem with a single restart policy&lt;/h2&gt;
&lt;p&gt;Before this feature, the &lt;code&gt;restartPolicy&lt;/code&gt; was set at the Pod level. This meant
that all containers in a Pod shared the same restart policy (&lt;code&gt;Always&lt;/code&gt;,
&lt;code&gt;OnFailure&lt;/code&gt;, or &lt;code&gt;Never&lt;/code&gt;). While this works for many use cases, it can be
limiting in others.&lt;/p&gt;
&lt;p&gt;For example, consider a Pod with a main application container and an init
container that performs some initial setup. You might want the main container
to always restart on failure, but the init container should only run once and
never restart. With a single Pod-level restart policy, this wasn&#39;t possible.&lt;/p&gt;
&lt;h2 id=&#34;introducing-per-container-restart-policies&#34;&gt;Introducing per-container restart policies&lt;/h2&gt;
&lt;p&gt;With the new &lt;code&gt;ContainerRestartRules&lt;/code&gt; feature gate, you can now specify a
&lt;code&gt;restartPolicy&lt;/code&gt; for each container in your Pod&#39;s spec. You can also define
&lt;code&gt;restartPolicyRules&lt;/code&gt; to control restarts based on exit codes. This gives you
the fine-grained control you need to handle complex scenarios.&lt;/p&gt;
&lt;h2 id=&#34;use-cases&#34;&gt;Use cases&lt;/h2&gt;
&lt;p&gt;Let&#39;s look at some real-life use cases where per-container restart policies can
be beneficial.&lt;/p&gt;
&lt;h3 id=&#34;in-place-restarts-for-training-jobs&#34;&gt;In-place restarts for training jobs&lt;/h3&gt;
&lt;p&gt;In ML research, it&#39;s common to orchestrate a large number of long-running AI/ML
training workloads. In these scenarios, workload failures are unavoidable. When
a workload fails with a retriable exit code, you want the container to restart
quickly without rescheduling the entire Pod, which consumes a significant amount
of time and resources. Restarting the failed container &amp;quot;in-place&amp;quot; is critical
for better utilization of compute resources. The container should only restart
&amp;quot;in-place&amp;quot; if it failed due to a retriable error; otherwise, the container and
Pod should terminate and possibly be rescheduled.&lt;/p&gt;
&lt;p&gt;This can now be achieved with container-level &lt;code&gt;restartPolicyRules&lt;/code&gt;. The workload
can exit with different codes to represent retriable and non-retriable errors.
With &lt;code&gt;restartPolicyRules&lt;/code&gt;, the workload can be restarted in-place quickly, but
only when the error is retriable.&lt;/p&gt;
&lt;h3 id=&#34;try-once-init-containers&#34;&gt;Try-once init containers&lt;/h3&gt;
&lt;p&gt;Init containers are often used to perform initialization work for the main
container, such as setting up environments and credentials. Sometimes, you want
the main container to always be restarted, but you don&#39;t want to retry
initialization if it fails.&lt;/p&gt;
&lt;p&gt;With a container-level &lt;code&gt;restartPolicy&lt;/code&gt;, this is now possible. The init container
can be executed only once, and its failure would be considered a Pod failure. If
the initialization succeeds, the main container can be always restarted.&lt;/p&gt;
&lt;h3 id=&#34;pods-with-multiple-containers&#34;&gt;Pods with multiple containers&lt;/h3&gt;
&lt;p&gt;For Pods that run multiple containers, you might have different restart
requirements for each container. Some containers might have a clear definition
of success and should only be restarted on failure. Others might need to be
always restarted.&lt;/p&gt;
&lt;p&gt;This is now possible with a container-level &lt;code&gt;restartPolicy&lt;/code&gt;, allowing individual
containers to have different restart policies.&lt;/p&gt;
&lt;h2 id=&#34;how-to-use-it&#34;&gt;How to use it&lt;/h2&gt;
&lt;p&gt;To use this new feature, you need to enable the &lt;code&gt;ContainerRestartRules&lt;/code&gt; feature
gate on your Kubernetes cluster control-plane and worker nodes running
Kubernetes 1.34+. Once enabled, you can specify the &lt;code&gt;restartPolicy&lt;/code&gt; and
&lt;code&gt;restartPolicyRules&lt;/code&gt; fields in your container definitions.&lt;/p&gt;
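&lt;p&gt;For example, on the kubelet the gate can be enabled through a KubeletConfiguration fragment (how you pass configuration to the API server and kubelets depends on your deployment tooling):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-yaml&#34;&gt;kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
featureGates:
  ContainerRestartRules: true
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The same gate must also be set on the control plane components, for example via &lt;code&gt;--feature-gates=ContainerRestartRules=true&lt;/code&gt;.&lt;/p&gt;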
&lt;p&gt;Here are some examples:&lt;/p&gt;
&lt;h3 id=&#34;example-1-restarting-on-specific-exit-codes&#34;&gt;Example 1: Restarting on specific exit codes&lt;/h3&gt;
&lt;p&gt;In this example, the container should restart if and only if it fails with a
retriable error, represented by exit code 42.&lt;/p&gt;
&lt;p&gt;To achieve this, the container has &lt;code&gt;restartPolicy: Never&lt;/code&gt;, and a restart
policy rule that tells Kubernetes to restart the container in-place if it exits
with code 42.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Pod&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;restart-on-exit-codes&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;annotations&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kubernetes.io/description&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;This Pod restarts the container only when it exits with code 42.&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;restartPolicy&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Never&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;containers&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;restart-on-exit-codes&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;image&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;docker.io/library/busybox:1.28&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;command&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;[&lt;span style=&#34;color:#b44&#34;&gt;&amp;#39;sh&amp;#39;&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#39;-c&amp;#39;&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#39;sleep 60 &amp;amp;&amp;amp; exit 0&amp;#39;&lt;/span&gt;]&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;restartPolicy&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Never    &lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# Container restart policy must be specified if rules are specified&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;restartPolicyRules&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# Only restart the container if it exits with code 42&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;action&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Restart&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;exitCodes&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;operator&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;In&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;values&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;[&lt;span style=&#34;color:#666&#34;&gt;42&lt;/span&gt;]&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&#34;example-2-a-try-once-init-container&#34;&gt;Example 2: A try-once init container&lt;/h3&gt;
&lt;p&gt;In this example, a Pod should always be restarted once the initialization succeeds.
However, the initialization should only be tried once.&lt;/p&gt;
&lt;p&gt;To achieve this, the Pod has an &lt;code&gt;Always&lt;/code&gt; restart policy. The &lt;code&gt;init-once&lt;/code&gt;
init container runs only once: if it fails, the whole Pod fails, and once the
initialization succeeds, the main container is restarted whenever it exits.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Pod&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;fail-pod-if-init-fails&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;annotations&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kubernetes.io/description&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;This Pod has an init container that runs only once. After initialization succeeds, the main container will always be restarted.&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;restartPolicy&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Always&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;initContainers&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;init-once     &lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# This init container will only try once. If it fails, the Pod will fail.&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;image&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;docker.io/library/busybox:1.28&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;command&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;[&lt;span style=&#34;color:#b44&#34;&gt;&amp;#39;sh&amp;#39;&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#39;-c&amp;#39;&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#39;echo &amp;#34;Failing initialization&amp;#34; &amp;amp;&amp;amp; sleep 10 &amp;amp;&amp;amp; exit 1&amp;#39;&lt;/span&gt;]&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;restartPolicy&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Never&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;containers&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;main-container&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# This container will always be restarted once initialization succeeds.&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;image&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;docker.io/library/busybox:1.28&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;command&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;[&lt;span style=&#34;color:#b44&#34;&gt;&amp;#39;sh&amp;#39;&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#39;-c&amp;#39;&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#39;sleep 1800 &amp;amp;&amp;amp; exit 0&amp;#39;&lt;/span&gt;]&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&#34;example-3-containers-with-different-restart-policies&#34;&gt;Example 3: Containers with different restart policies&lt;/h3&gt;
&lt;p&gt;In this example, there are two containers with different restart requirements. One
should always be restarted, while the other should only be restarted on failure.&lt;/p&gt;
&lt;p&gt;This is achieved by using a different container-level &lt;code&gt;restartPolicy&lt;/code&gt; on each of
the two containers.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Pod&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;on&lt;/span&gt;-failure-pod&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;annotations&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kubernetes.io/description&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;This Pod has two containers with different restart policies.&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;containers&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;restart-on-failure&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;image&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;docker.io/library/busybox:1.28&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;command&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;[&lt;span style=&#34;color:#b44&#34;&gt;&amp;#39;sh&amp;#39;&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#39;-c&amp;#39;&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#39;echo &amp;#34;Not restarting after success&amp;#34; &amp;amp;&amp;amp; sleep 10 &amp;amp;&amp;amp; exit 0&amp;#39;&lt;/span&gt;]&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;restartPolicy&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;OnFailure&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;restart-always&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;image&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;docker.io/library/busybox:1.28&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;command&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;[&lt;span style=&#34;color:#b44&#34;&gt;&amp;#39;sh&amp;#39;&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#39;-c&amp;#39;&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#39;echo &amp;#34;Always restarting&amp;#34; &amp;amp;&amp;amp; sleep 1800 &amp;amp;&amp;amp; exit 0&amp;#39;&lt;/span&gt;]&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;restartPolicy&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Always&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;learn-more&#34;&gt;Learn more&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Read the documentation for
&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/workloads/pods/pod-lifecycle/#container-restart-rules&#34;&gt;container restart policy&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Read the KEP for the
&lt;a href=&#34;https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/5307-container-restart-policy&#34;&gt;Container Restart Rules&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;roadmap&#34;&gt;Roadmap&lt;/h2&gt;
&lt;p&gt;More actions and signals to restart Pods and containers are coming! Notably,
there are plans to add support for restarting the entire Pod. Planning and
discussions on these features are in progress. Feel free to share feedback or
requests with the SIG Node community!&lt;/p&gt;
&lt;h2 id=&#34;your-feedback-is-welcome&#34;&gt;Your feedback is welcome!&lt;/h2&gt;
&lt;p&gt;This is an alpha feature, and the Kubernetes project would love to hear your feedback.
Please try it out. This feature is driven by
&lt;a href=&#34;https://github.com/kubernetes/community/blob/master/sig-node/README.md&#34;&gt;SIG Node&lt;/a&gt;.
If you are interested in helping develop this feature, sharing feedback, or
participating in any other ongoing SIG Node projects, please reach out to the
SIG Node community!&lt;/p&gt;
&lt;p&gt;You can reach SIG Node by several means:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Slack: &lt;a href=&#34;https://kubernetes.slack.com/messages/sig-node&#34;&gt;#sig-node&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://groups.google.com/forum/#!forum/kubernetes-sig-node&#34;&gt;Mailing list&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/kubernetes/community/labels/sig%2Fnode&#34;&gt;Open Community Issues/PRs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

      </description>
    </item>
    
    <item>
      <title>Kubernetes v1.34: User preferences (kuberc) are available for testing in kubectl 1.34</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/08/28/kubernetes-v1-34-kubectl-kuberc-beta/</link>
      <pubDate>Thu, 28 Aug 2025 10:30:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/08/28/kubernetes-v1-34-kubectl-kuberc-beta/</guid>
      <description>
        
        
        &lt;p&gt;Have you ever wished you could enable &lt;a href=&#34;https://kep.k8s.io/3895&#34;&gt;interactive delete&lt;/a&gt;,
by default, in &lt;code&gt;kubectl&lt;/code&gt;? Or maybe, you&#39;d like to have custom aliases defined,
but not necessarily &lt;a href=&#34;https://github.com/ahmetb/kubectl-aliases&#34;&gt;generate hundreds of them manually&lt;/a&gt;?
Look no further. &lt;a href=&#34;https://git.k8s.io/community/sig-cli/&#34;&gt;SIG-CLI&lt;/a&gt;
has been working hard to add &lt;a href=&#34;https://kep.k8s.io/3104&#34;&gt;user preferences to kubectl&lt;/a&gt;,
and we are happy to announce that this functionality is reaching beta as part
of the Kubernetes v1.34 release.&lt;/p&gt;
&lt;h2 id=&#34;how-it-works&#34;&gt;How it works&lt;/h2&gt;
&lt;p&gt;A full description of this functionality is available &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/reference/kubectl/kuberc/&#34;&gt;in our official documentation&lt;/a&gt;,
but this blog post will answer both of the questions from the beginning of this
article.&lt;/p&gt;
&lt;p&gt;Before we dive into details, let&#39;s quickly cover what the user preferences file
looks like and where to place it. By default, &lt;code&gt;kubectl&lt;/code&gt; will look for a &lt;code&gt;kuberc&lt;/code&gt;
file in your default &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/configuration/organize-cluster-access-kubeconfig/&#34;&gt;kubeconfig&lt;/a&gt;
directory, which is &lt;code&gt;$HOME/.kube&lt;/code&gt;. Alternatively, you can specify this location
using the &lt;code&gt;--kuberc&lt;/code&gt; option or the &lt;code&gt;KUBERC&lt;/code&gt; environment variable.&lt;/p&gt;
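&lt;p&gt;For example, assuming a preferences file saved at a custom location (the path below is purely illustrative), either of the following invocations would point &lt;code&gt;kubectl&lt;/code&gt; at it:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;kubectl get pods --kuberc=/home/user/.config/kuberc
KUBERC=/home/user/.config/kuberc kubectl get pods
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Per the official documentation, setting &lt;code&gt;KUBERC=off&lt;/code&gt; disables user preferences entirely.&lt;/p&gt;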
&lt;p&gt;Just like every Kubernetes manifest, the &lt;code&gt;kuberc&lt;/code&gt; file starts with an &lt;code&gt;apiVersion&lt;/code&gt;
and &lt;code&gt;kind&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;kubectl.config.k8s.io/v1beta1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Preference&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# the user preferences will follow here&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&#34;defaults&#34;&gt;Defaults&lt;/h3&gt;
&lt;p&gt;Let&#39;s start by setting default values for &lt;code&gt;kubectl&lt;/code&gt; command options. Our goal
is to always use interactive delete, which means we want the &lt;code&gt;--interactive&lt;/code&gt;
option for &lt;code&gt;kubectl delete&lt;/code&gt; to always be set to &lt;code&gt;true&lt;/code&gt;. This can be achieved
with the following addition to our &lt;code&gt;kuberc&lt;/code&gt; file:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;defaults&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;command&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;delete&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;options&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;interactive&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;default&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;true&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;In the above example, I&#39;m introducing the &lt;code&gt;defaults&lt;/code&gt; section, which allows users to
define default values for &lt;code&gt;kubectl&lt;/code&gt; options. In this case, we&#39;re setting the
interactive option for &lt;code&gt;kubectl delete&lt;/code&gt; to be &lt;code&gt;true&lt;/code&gt; by default. This default
can be overridden if a user explicitly provides a different value such as
&lt;code&gt;kubectl delete --interactive=false&lt;/code&gt;, in which case the explicit option takes
precedence.&lt;/p&gt;
&lt;p&gt;Another highly encouraged default from SIG-CLI is using &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/reference/using-api/server-side-apply/&#34;&gt;Server-Side Apply&lt;/a&gt;.
To do so, you can add the following snippet to your preferences:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# continuing defaults section&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;command&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;apply&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;options&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;server-side&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;default&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;true&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&#34;aliases&#34;&gt;Aliases&lt;/h3&gt;
&lt;p&gt;The ability to define aliases allows us to save precious seconds when typing
commands. I bet you already have one defined for &lt;code&gt;kubectl&lt;/code&gt;, because
typing seven letters is definitely slower than just pressing &lt;code&gt;k&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;For this reason, the ability to define aliases was a must-have when we decided
to implement user preferences, alongside defaulting. To define an alias for any
of the built-in commands, expand your &lt;code&gt;kuberc&lt;/code&gt; file with the following addition:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;aliases&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;gns&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;command&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;get&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;prependArgs&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;   &lt;/span&gt;- namespace&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;options&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;   &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;output&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;     &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;default&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;json&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;There&#39;s a lot going on above, so let me break this down. First, we&#39;re introducing
a new section: &lt;code&gt;aliases&lt;/code&gt;. Here, we&#39;re defining a new alias &lt;code&gt;gns&lt;/code&gt;, which is mapped
to the &lt;code&gt;get&lt;/code&gt; command. Next, we&#39;re defining arguments (the &lt;code&gt;namespace&lt;/code&gt; resource)
that will be inserted right after the command name. Additionally, we&#39;re setting
the &lt;code&gt;--output=json&lt;/code&gt; option for this alias. The structure of the &lt;code&gt;options&lt;/code&gt; block is identical
to the one in the &lt;code&gt;defaults&lt;/code&gt; section.&lt;/p&gt;
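&lt;p&gt;With that alias in place, the two invocations below should be equivalent (shown only to illustrate how the alias expands):&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;kubectl gns
# behaves like:
kubectl get namespace --output=json
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;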
&lt;p&gt;You probably noticed that we&#39;ve introduced a mechanism for prepending arguments,
and you might wonder if there is a complementary setting for appending them (in
other words, adding to the end of the command, after user-provided arguments).
This can be achieved with the &lt;code&gt;appendArgs&lt;/code&gt; block, shown below:
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# continuing aliases section&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;runx&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;command&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;run&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;options&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;image&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;default&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;busybox&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;namespace&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;default&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;test-ns&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;appendArgs&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- --&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- custom-arg&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Here, we&#39;re introducing another alias: &lt;code&gt;runx&lt;/code&gt;, which invokes the &lt;code&gt;kubectl run&lt;/code&gt; command,
passing &lt;code&gt;--image&lt;/code&gt; and &lt;code&gt;--namespace&lt;/code&gt; options with predefined values, and also
appending &lt;code&gt;--&lt;/code&gt; and &lt;code&gt;custom-arg&lt;/code&gt; at the end of the invocation.&lt;/p&gt;
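&lt;p&gt;Sketching the expansion (the pod name here is illustrative), an invocation of this alias behaves roughly as follows:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;kubectl runx mypod
# expands to roughly:
kubectl run mypod --image=busybox --namespace=test-ns -- custom-arg
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;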
&lt;h2 id=&#34;debugging&#34;&gt;Debugging&lt;/h2&gt;
&lt;p&gt;We hope that &lt;code&gt;kubectl&lt;/code&gt; user preferences will open up new possibilities for our users.
Whenever you&#39;re in doubt, feel free to run &lt;code&gt;kubectl&lt;/code&gt; with increased verbosity.
At &lt;code&gt;-v=5&lt;/code&gt;, you should get all the possible debugging information from this feature,
which will be crucial when reporting issues.&lt;/p&gt;
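&lt;p&gt;For instance (the subcommand here is arbitrary):&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;kubectl get pods -v=5
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;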
&lt;p&gt;To learn more, I encourage you to read through &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/reference/kubectl/kuberc/&#34;&gt;our official documentation&lt;/a&gt;
and the &lt;a href=&#34;https://git.k8s.io/enhancements/keps/sig-cli/3104-introduce-kuberc/README.md&#34;&gt;actual proposal&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;get-involved&#34;&gt;Get involved&lt;/h2&gt;
&lt;p&gt;The kubectl user preferences feature has reached beta, and we are very interested
in your feedback. We&#39;d love to hear what you like about it and what problems
you&#39;d like to see it solve. Feel free to join the &lt;a href=&#34;https://kubernetes.slack.com/archives/C2GL57FJ4&#34;&gt;SIG-CLI Slack channel&lt;/a&gt;,
or open an issue against the &lt;a href=&#34;https://git.k8s.io/kubectl/&#34;&gt;kubectl repository&lt;/a&gt;.
You can also join us at our &lt;a href=&#34;https://git.k8s.io/community/sig-cli/#meetings&#34;&gt;community meetings&lt;/a&gt;,
which happen every other Wednesday, and share your stories with us.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Kubernetes v1.34: Of Wind &amp; Will (O&#39; WaW)</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/08/27/kubernetes-v1-34-release/</link>
      <pubDate>Wed, 27 Aug 2025 10:30:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/08/27/kubernetes-v1-34-release/</guid>
      <description>
        
        
        &lt;p&gt;&lt;strong&gt;Editors:&lt;/strong&gt; Agustina Barbetta, Alejandro Josue Leon Bellido, Graziano Casto, Melony Qin, Dipesh Rawat&lt;/p&gt;
&lt;p&gt;Similar to previous releases, the release of Kubernetes v1.34 introduces new stable, beta, and alpha features. The consistent delivery of high-quality releases underscores the strength of our development cycle and the vibrant support from our community.&lt;/p&gt;
&lt;p&gt;This release consists of 58 enhancements. Of those enhancements, 23 have graduated to Stable, 22 have entered Beta, and 13 have entered Alpha.&lt;/p&gt;
&lt;p&gt;There are also some &lt;a href=&#34;#deprecations-and-removals&#34;&gt;deprecations and removals&lt;/a&gt; in this release; make sure to read about those.&lt;/p&gt;
&lt;h2 id=&#34;release-theme-and-logo&#34;&gt;Release theme and logo&lt;/h2&gt;


&lt;figure class=&#34;release-logo &#34;&gt;
    &lt;img src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/08/27/kubernetes-v1-34-release/k8s-v1.34.png&#34;
         alt=&#34;Kubernetes v1.34 logo: Three bears sail a wooden ship with a flag featuring a paw and a helm symbol on the sail, as wind blows across the ocean&#34;/&gt; 
&lt;/figure&gt;
&lt;p&gt;A release powered by the wind around us — and the will within us.&lt;/p&gt;
&lt;p&gt;Every release cycle, we inherit winds that we don&#39;t really control — the state
of our tooling, documentation, and the historical quirks of our project.
Sometimes these winds fill our sails, sometimes they push us sideways or die
down.&lt;/p&gt;
&lt;p&gt;What keeps Kubernetes moving isn&#39;t the perfect winds, but the will of our
sailors who adjust the sails, man the helm, chart the courses and keep the ship
steady. The release happens not because conditions are always ideal, but because
of the people who build it, the people who release it, and the bears&lt;sup&gt;
^&lt;/sup&gt;, cats, dogs, wizards, and curious minds who keep Kubernetes sailing
strong — no matter which way the wind blows.&lt;/p&gt;
&lt;p&gt;This release, &lt;strong&gt;Of Wind &amp;amp; Will (O&#39; WaW)&lt;/strong&gt;, honors the winds that have shaped us,
and the will that propels us forward.&lt;/p&gt;
&lt;p&gt;&lt;sub&gt;^ Oh, and you wonder why bears? Keep wondering!&lt;/sub&gt;&lt;/p&gt;
&lt;h2 id=&#34;spotlight-on-key-updates&#34;&gt;Spotlight on key updates&lt;/h2&gt;
&lt;p&gt;Kubernetes v1.34 is packed with new features and improvements. Here are a few select updates the Release Team would like to highlight!&lt;/p&gt;
&lt;h3 id=&#34;stable-the-core-of-dra-is-ga&#34;&gt;Stable: The core of DRA is GA&lt;/h3&gt;
&lt;p&gt;&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/scheduling-eviction/dynamic-resource-allocation/&#34;&gt;Dynamic Resource Allocation&lt;/a&gt; (DRA)
enables more powerful ways to select, allocate, share, and configure
GPUs, TPUs, NICs and other devices.&lt;/p&gt;
&lt;p&gt;Since the v1.30 release, DRA has been based around claiming devices using
&lt;em&gt;structured parameters&lt;/em&gt; that are opaque to the core of Kubernetes.
This enhancement took inspiration from dynamic provisioning for storage volumes.
DRA with structured parameters relies on a set of supporting API kinds:
ResourceClaim, DeviceClass, ResourceClaimTemplate, and ResourceSlice API types
under &lt;code&gt;resource.k8s.io&lt;/code&gt;, while extending the &lt;code&gt;.spec&lt;/code&gt; for Pods with a new &lt;code&gt;resourceClaims&lt;/code&gt; field.&lt;br&gt;
The &lt;code&gt;resource.k8s.io/v1&lt;/code&gt; APIs have graduated to stable and are now available by default.&lt;/p&gt;
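&lt;p&gt;As a rough sketch only (the device class name, claim names, and image are placeholders, and the exact device selectors depend on your DRA driver), a minimal claim and a Pod referencing it could look like this:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;apiVersion: resource.k8s.io/v1
kind: ResourceClaim
metadata:
  name: gpu-claim
spec:
  devices:
    requests:
    - name: gpu
      exactly:
        deviceClassName: example.com-gpu
---
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  resourceClaims:
  - name: gpu
    resourceClaimName: gpu-claim
  containers:
  - name: app
    image: registry.example/app:latest
    resources:
      claims:
      - name: gpu
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;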
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/4381&#34;&gt;KEP #4381&lt;/a&gt; led by WG Device Management.&lt;/p&gt;
&lt;h3 id=&#34;beta-projected-serviceaccount-tokens-for-kubelet-image-credential-providers&#34;&gt;Beta: Projected ServiceAccount tokens for &lt;code&gt;kubelet&lt;/code&gt; image credential providers&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;kubelet&lt;/code&gt; credential providers, used for pulling private container images, traditionally relied on long-lived Secrets stored on the node or in the cluster. This approach increased security risks and management overhead, as these credentials were not tied to the specific workload and did not rotate automatically.&lt;br&gt;
To solve this, the &lt;code&gt;kubelet&lt;/code&gt; can now request short-lived, audience-bound ServiceAccount tokens for authenticating to container registries. This allows image pulls to be authorized based on the Pod&#39;s own identity rather than a node-level credential.&lt;br&gt;
The primary benefit is a significant security improvement. It eliminates the need for long-lived Secrets for image pulls, reducing the attack surface and simplifying credential management for both administrators and developers.&lt;/p&gt;
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/4412&#34;&gt;KEP #4412&lt;/a&gt; led by SIG Auth and SIG Node.&lt;/p&gt;
&lt;h3 id=&#34;alpha-support-for-kyaml-a-kubernetes-dialect-of-yaml&#34;&gt;Alpha: Support for KYAML, a Kubernetes dialect of YAML&lt;/h3&gt;
&lt;p&gt;KYAML aims to be a safer and less ambiguous YAML subset, designed specifically for Kubernetes. Whichever version of Kubernetes your cluster runs, starting with kubectl v1.34 you can use KYAML as a new output format for kubectl.&lt;/p&gt;
&lt;p&gt;KYAML addresses specific challenges with both YAML and JSON. YAML&#39;s significant whitespace requires careful attention to indentation and nesting, while its optional string-quoting can lead to unexpected type coercion (for example: &lt;a href=&#34;https://hitchdev.com/strictyaml/why/implicit-typing-removed/&#34;&gt;&amp;quot;The Norway Bug&amp;quot;&lt;/a&gt;). Meanwhile, JSON lacks comment support and has strict requirements for trailing commas and quoted keys.&lt;/p&gt;
&lt;p&gt;You can write KYAML and pass it as an input to any version of &lt;code&gt;kubectl&lt;/code&gt;, because all KYAML files are also valid as YAML. With &lt;code&gt;kubectl&lt;/code&gt; v1.34, you are also able to &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/reference/kubectl/#syntax-1&#34;&gt;request KYAML output&lt;/a&gt; (as in kubectl get -o kyaml …) by setting environment variable &lt;code&gt;KUBECTL_KYAML=true&lt;/code&gt;. If you prefer, you can still request the output in JSON or YAML format.&lt;/p&gt;
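&lt;p&gt;To give a feel for the dialect (this fragment is illustrative rather than canonical &lt;code&gt;kubectl&lt;/code&gt; output), KYAML favors flow-style mappings with always-quoted string values and permits trailing commas:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;{
  apiVersion: "v1",
  kind: "ConfigMap",
  metadata: {
    name: "example",
    namespace: "default",
  },
  data: {
    key: "value",
  },
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;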
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/5295&#34;&gt;KEP #5295&lt;/a&gt; led by SIG CLI.&lt;/p&gt;
&lt;h2 id=&#34;features-graduating-to-stable&#34;&gt;Features graduating to Stable&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;This is a selection of some of the improvements that are now stable following the v1.34 release.&lt;/em&gt;&lt;/p&gt;
&lt;h3 id=&#34;delayed-creation-of-job-s-replacement-pods&#34;&gt;Delayed creation of Job’s replacement Pods&lt;/h3&gt;
&lt;p&gt;By default, the Job controller creates a replacement Pod as soon as a Pod starts terminating, so both Pods run simultaneously for a time. This can cause resource contention in constrained clusters, where the replacement Pod may struggle to find an available node until the original Pod fully terminates. The situation can also trigger unwanted cluster autoscaler scale-ups.
Additionally, some machine learning frameworks like TensorFlow and &lt;a href=&#34;https://jax.readthedocs.io/en/latest/&#34;&gt;JAX&lt;/a&gt; require only one Pod per index to run at a time, making simultaneous Pod execution problematic.
This feature introduces &lt;code&gt;.spec.podReplacementPolicy&lt;/code&gt; in Jobs. You may choose to create replacement Pods only when the Pod is fully terminated (has &lt;code&gt;.status.phase: Failed&lt;/code&gt;). To do this, set &lt;code&gt;.spec.podReplacementPolicy: Failed&lt;/code&gt;.&lt;br&gt;
Introduced as alpha in v1.28, this feature has graduated to stable in v1.34.&lt;/p&gt;
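&lt;p&gt;A minimal sketch of a Job that opts in to delayed replacement (the image and command are illustrative):&lt;/p&gt;

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: example-job
spec:
  # Only create a replacement once the old Pod reaches .status.phase: Failed
  podReplacementPolicy: Failed
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: worker
        image: busybox
        command: ["sh", "-c", "sleep 10"]
```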
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/3939&#34;&gt;KEP #3939&lt;/a&gt; led by SIG Apps.&lt;/p&gt;
&lt;h3 id=&#34;recovery-from-volume-expansion-failure&#34;&gt;Recovery from volume expansion failure&lt;/h3&gt;
&lt;p&gt;This feature allows users to cancel volume expansions that are unsupported by the underlying storage provider, and retry volume expansion with smaller values that may succeed.&lt;br&gt;
Introduced as alpha in v1.23, this feature has graduated to stable in v1.34.&lt;/p&gt;
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/1790&#34;&gt;KEP #1790&lt;/a&gt; led by SIG Storage.&lt;/p&gt;
&lt;h3 id=&#34;volumeattributesclass-for-volume-modification&#34;&gt;VolumeAttributesClass for volume modification&lt;/h3&gt;
&lt;p&gt;&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/storage/volume-attributes-classes/&#34;&gt;VolumeAttributesClass&lt;/a&gt; has graduated to stable in v1.34. VolumeAttributesClass is a generic, Kubernetes-native API for modifying volume parameters like provisioned IO. It allows workloads to vertically scale their volumes online to balance cost and performance, if supported by their provider.&lt;br&gt;
Like all new volume features in Kubernetes, this API is implemented via the &lt;a href=&#34;https://kubernetes-csi.github.io/docs/&#34;&gt;container storage interface (CSI)&lt;/a&gt;. Your provisioner-specific CSI driver must support the new ModifyVolume API, which is the CSI side of this feature.&lt;/p&gt;
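&lt;p&gt;A minimal sketch, assuming a hypothetical CSI driver (&lt;code&gt;pd.csi.example.com&lt;/code&gt;); real parameter names are driver-specific:&lt;/p&gt;

```yaml
apiVersion: storage.k8s.io/v1
kind: VolumeAttributesClass
metadata:
  name: gold-tier
driverName: pd.csi.example.com   # hypothetical CSI driver
parameters:
  provisioned-iops: "10000"      # parameter keys are defined by the driver
---
# A PersistentVolumeClaim opts in by name; changing this field later
# triggers an online modification via the CSI ModifyVolume call.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 100Gi
  volumeAttributesClassName: gold-tier
```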
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/3751&#34;&gt;KEP #3751&lt;/a&gt; led by SIG Storage.&lt;/p&gt;
&lt;h3 id=&#34;structured-authentication-configuration&#34;&gt;Structured authentication configuration&lt;/h3&gt;
&lt;p&gt;Kubernetes v1.29 introduced a configuration file format to manage API server client authentication, moving away from the previous reliance on a large set of command-line options.
The &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/reference/access-authn-authz/authentication/#using-authentication-configuration&#34;&gt;AuthenticationConfiguration&lt;/a&gt; kind allows administrators to support multiple JWT authenticators, CEL expression validation, and dynamic reloading.
This change significantly improves the manageability and auditability of the cluster&#39;s authentication settings - and has graduated to stable in v1.34.&lt;/p&gt;
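&lt;p&gt;A minimal sketch of such a configuration file, with a hypothetical OIDC issuer, passed to the API server via its &lt;code&gt;--authentication-config&lt;/code&gt; flag:&lt;/p&gt;

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: AuthenticationConfiguration
jwt:
- issuer:
    url: https://issuer.example.com   # hypothetical OIDC issuer
    audiences:
    - my-cluster
  claimMappings:
    username:
      claim: sub
      prefix: "oidc:"   # usernames become e.g. oidc:<subject>
```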
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/3331&#34;&gt;KEP #3331&lt;/a&gt; led by SIG Auth.&lt;/p&gt;
&lt;h3 id=&#34;finer-grained-authorization-based-on-selectors&#34;&gt;Finer-grained authorization based on selectors&lt;/h3&gt;
&lt;p&gt;Kubernetes authorizers, including webhook authorizers and the built-in node authorizer, can now make authorization decisions based on field and label selectors in incoming requests. When you send &lt;strong&gt;list&lt;/strong&gt;, &lt;strong&gt;watch&lt;/strong&gt; or &lt;strong&gt;deletecollection&lt;/strong&gt; requests with selectors, the authorization layer can now evaluate access with that additional context.&lt;/p&gt;
&lt;p&gt;For example, you can write an authorization policy that only allows listing Pods bound to a specific &lt;code&gt;.spec.nodeName&lt;/code&gt;.
The client (perhaps the kubelet on a particular node) must specify
the field selector that the policy requires, otherwise the request is forbidden.
This change makes it feasible to set up least privilege rules, provided that the client knows how to conform to the restrictions you set.
With Kubernetes v1.34, this enables more granular control in environments such as per-node isolation or custom multi-tenant setups.&lt;/p&gt;
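&lt;p&gt;For illustration, this is roughly the shape of the SubjectAccessReview an authorization webhook receives when a client lists Pods with a field selector (the user and node name are hypothetical):&lt;/p&gt;

```yaml
# Excerpt of the SubjectAccessReview generated for:
#   kubectl get pods --field-selector spec.nodeName=node-a
apiVersion: authorization.k8s.io/v1
kind: SubjectAccessReview
spec:
  user: system:node:node-a
  resourceAttributes:
    verb: list
    resource: pods
    fieldSelector:          # new selector context available to authorizers
      requirements:
      - key: spec.nodeName
        operator: In
        values:
        - node-a
```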
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/4601&#34;&gt;KEP #4601&lt;/a&gt; led by SIG Auth.&lt;/p&gt;
&lt;h3 id=&#34;restrict-anonymous-requests-with-fine-grained-controls&#34;&gt;Restrict anonymous requests with fine-grained controls&lt;/h3&gt;
&lt;p&gt;Instead of fully enabling or disabling anonymous access, you can now configure a strict list of endpoints where unauthenticated requests are allowed. This provides a safer alternative for clusters that rely on anonymous access to health or bootstrap endpoints like &lt;code&gt;/healthz&lt;/code&gt;, &lt;code&gt;/readyz&lt;/code&gt;, or &lt;code&gt;/livez&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;With this feature, accidental RBAC misconfigurations that grant broad access to anonymous users can be avoided without requiring changes to external probes or bootstrapping tools.&lt;/p&gt;
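&lt;p&gt;A sketch of an &lt;code&gt;AuthenticationConfiguration&lt;/code&gt; that keeps anonymous access for the health endpoints only:&lt;/p&gt;

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: AuthenticationConfiguration
anonymous:
  enabled: true
  conditions:          # anonymous requests are allowed only for these paths
  - path: /healthz
  - path: /livez
  - path: /readyz
```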
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/4633&#34;&gt;KEP #4633&lt;/a&gt; led by SIG Auth.&lt;/p&gt;
&lt;h3 id=&#34;more-efficient-requeueing-through-plugin-specific-callbacks&#34;&gt;More efficient requeueing through plugin-specific callbacks&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;kube-scheduler&lt;/code&gt; can now make more accurate decisions about when to retry scheduling Pods that were previously unschedulable. Each scheduling plugin can now register callback functions that tell the scheduler whether an incoming cluster event is likely to make a rejected Pod schedulable again.&lt;/p&gt;
&lt;p&gt;This reduces unnecessary retries and improves overall scheduling throughput - especially in clusters using dynamic resource allocation. The feature also lets certain plugins skip the usual backoff delay when it is safe to do so, making scheduling faster in specific cases.&lt;/p&gt;
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/4247&#34;&gt;KEP #4247&lt;/a&gt; led by SIG Scheduling.&lt;/p&gt;
&lt;h3 id=&#34;ordered-namespace-deletion&#34;&gt;Ordered Namespace deletion&lt;/h3&gt;
&lt;p&gt;Semi-random resource deletion order can create security gaps or unintended behavior, such as Pods persisting after their associated NetworkPolicies are deleted.&lt;br&gt;
This improvement introduces a more structured deletion process for Kubernetes &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/overview/working-with-objects/namespaces/&#34;&gt;namespaces&lt;/a&gt; to ensure secure and deterministic resource removal. By enforcing a structured deletion sequence that respects logical and security dependencies, this approach ensures Pods are removed before other resources.&lt;br&gt;
This feature was introduced in Kubernetes v1.33 and graduated to stable in v1.34. The graduation improves security and reliability by mitigating risks from non-deterministic deletions, including the vulnerability described in &lt;a href=&#34;https://github.com/advisories/GHSA-r56h-j38w-hrqq&#34;&gt;CVE-2024-7598&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/5080&#34;&gt;KEP #5080&lt;/a&gt; led by SIG API Machinery.&lt;/p&gt;
&lt;h3 id=&#34;streaming-list-responses&#34;&gt;Streaming &lt;strong&gt;list&lt;/strong&gt; responses&lt;/h3&gt;
&lt;p&gt;Handling large &lt;strong&gt;list&lt;/strong&gt; responses in Kubernetes previously posed a significant scalability challenge. When clients requested extensive resource lists, such as thousands of Pods or Custom Resources, the API server was required to serialize the entire collection of objects into a single, large memory buffer before sending it. This process created substantial memory pressure and could lead to performance degradation, impacting the overall stability of the cluster.&lt;br&gt;
To address this limitation, a streaming encoding mechanism for collections (list responses)
has been introduced. For the JSON and Kubernetes Protobuf response formats, that streaming mechanism
is automatically active and the associated feature gate is stable.
The primary benefit of this approach is the avoidance of large memory allocations on the API server, resulting in a much smaller and more predictable memory footprint.
Consequently, the cluster becomes more resilient and performant, especially in large-scale environments where frequent requests for extensive resource lists are common.&lt;/p&gt;
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/5116&#34;&gt;KEP #5116&lt;/a&gt; led by SIG API Machinery.&lt;/p&gt;
&lt;h3 id=&#34;resilient-watch-cache-initialization&#34;&gt;Resilient watch cache initialization&lt;/h3&gt;
&lt;p&gt;Watch cache is a caching layer inside &lt;code&gt;kube-apiserver&lt;/code&gt; that maintains an eventually consistent cache of cluster state stored in etcd. In the past, issues could occur when the watch cache was not yet initialized during &lt;code&gt;kube-apiserver&lt;/code&gt; startup or when it required re-initialization.&lt;/p&gt;
&lt;p&gt;To address these issues, the watch cache initialization process has been made more resilient to failures, improving control plane robustness and ensuring controllers and clients can reliably establish watches. This improvement was introduced as beta in v1.31 and is now stable.&lt;/p&gt;
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/4568&#34;&gt;KEP #4568&lt;/a&gt; led by SIG API Machinery and SIG Scalability.&lt;/p&gt;
&lt;h3 id=&#34;relaxing-dns-search-path-validation&#34;&gt;Relaxing DNS search path validation&lt;/h3&gt;
&lt;p&gt;Previously, the strict validation of a Pod&#39;s DNS &lt;code&gt;search&lt;/code&gt; path in Kubernetes often created integration challenges in complex or legacy network environments. This restrictiveness could block configurations that were necessary for an organization&#39;s infrastructure, forcing administrators to implement difficult workarounds.&lt;br&gt;
To address this, relaxed DNS validation was introduced as alpha in v1.32 and has now graduated to stable in v1.34. A common use case involves Pods that need to communicate with both internal Kubernetes services and external domains. By setting a single dot (&lt;code&gt;.&lt;/code&gt;) as the first entry in the &lt;code&gt;searches&lt;/code&gt; list of the Pod&#39;s &lt;code&gt;.spec.dnsConfig&lt;/code&gt;, administrators can prevent the system&#39;s resolver from appending the cluster&#39;s internal search domains to external queries. This avoids generating unnecessary DNS requests to the internal DNS server for external hostnames, improving efficiency and preventing potential resolution errors.&lt;/p&gt;
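&lt;p&gt;A minimal Pod sketch using this technique (the image and command are illustrative):&lt;/p&gt;

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: dns-example
spec:
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
  dnsConfig:
    searches:
    - "."   # stop the resolver from appending cluster search domains
```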
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/4427&#34;&gt;KEP #4427&lt;/a&gt; led by SIG Network.&lt;/p&gt;
&lt;h3 id=&#34;support-for-direct-service-return-dsr-in-windows-kube-proxy&#34;&gt;Support for Direct Service Return (DSR) in Windows &lt;code&gt;kube-proxy&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;DSR provides performance optimizations by allowing return traffic routed through load balancers to bypass the load balancer and respond directly to the client, reducing load on the load balancer and improving overall latency. For information on DSR on Windows, read &lt;a href=&#34;https://techcommunity.microsoft.com/blog/networkingblog/direct-server-return-dsr-in-a-nutshell/693710&#34;&gt;Direct Server Return (DSR) in a nutshell&lt;/a&gt;.&lt;br&gt;
Initially introduced in v1.14, this feature has graduated to stable in v1.34.&lt;/p&gt;
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/5100&#34;&gt;KEP #5100&lt;/a&gt; led by SIG Windows.&lt;/p&gt;
&lt;h3 id=&#34;sleep-action-for-container-lifecycle-hooks&#34;&gt;Sleep action for Container lifecycle hooks&lt;/h3&gt;
&lt;p&gt;A Sleep action for containers’ PreStop and PostStart lifecycle hooks was introduced to provide a straightforward way to manage graceful shutdowns and improve overall container lifecycle management.&lt;br&gt;
The Sleep action allows containers to pause for a specified duration after starting or before termination. Using a negative or zero sleep duration returns immediately, resulting in a no-op.&lt;br&gt;
The Sleep action was introduced in Kubernetes v1.29, with zero value support added in v1.32. Both features graduated to stable in v1.34.&lt;/p&gt;
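&lt;p&gt;A minimal sketch of a container that pauses briefly before termination (the image and duration are illustrative):&lt;/p&gt;

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: lifecycle-sleep
spec:
  containers:
  - name: app
    image: nginx
    lifecycle:
      preStop:
        sleep:
          seconds: 5   # give endpoint updates time to propagate before SIGTERM
```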
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/3960&#34;&gt;KEP #3960&lt;/a&gt; and &lt;a href=&#34;https://kep.k8s.io/4818&#34;&gt;KEP #4818&lt;/a&gt; led by SIG Node.&lt;/p&gt;
&lt;h3 id=&#34;linux-node-swap-support&#34;&gt;Linux node swap support&lt;/h3&gt;
&lt;p&gt;Historically, the lack of swap support in Kubernetes could lead to workload instability, as nodes under memory pressure often had to terminate processes abruptly. This particularly affected applications with large but infrequently accessed memory footprints and prevented more graceful resource management.&lt;/p&gt;
&lt;p&gt;To address this, configurable per-node swap support was introduced in v1.22. It has progressed through alpha and beta stages and has graduated to stable in v1.34. The primary mode, &lt;code&gt;LimitedSwap&lt;/code&gt;, allows Pods to use swap within their existing memory limits, providing a direct solution to the problem. By default, the &lt;code&gt;kubelet&lt;/code&gt; is configured with &lt;code&gt;NoSwap&lt;/code&gt; mode, which means Kubernetes workloads cannot use swap.&lt;/p&gt;
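&lt;p&gt;A sketch of a kubelet configuration fragment that opts a node in to &lt;code&gt;LimitedSwap&lt;/code&gt; (applied per node as part of the kubelet&#39;s configuration):&lt;/p&gt;

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
failSwapOn: false          # allow the kubelet to start on a node with swap enabled
memorySwap:
  swapBehavior: LimitedSwap   # Pods may swap within their memory limits
```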
&lt;p&gt;This feature improves workload stability and allows for more efficient resource utilization. It enables clusters to support a wider variety of applications, especially in resource-constrained environments, though administrators must consider the potential performance impact of swapping.&lt;/p&gt;
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/2400&#34;&gt;KEP #2400&lt;/a&gt; led by SIG Node.&lt;/p&gt;
&lt;h3 id=&#34;allow-special-characters-in-environment-variables&#34;&gt;Allow special characters in environment variables&lt;/h3&gt;
&lt;p&gt;The environment variable validation rules in Kubernetes have been relaxed
to allow nearly all printable ASCII characters in variable names, excluding &lt;code&gt;=&lt;/code&gt;.
This change supports scenarios where workloads require nonstandard characters in variable names - for example, frameworks like .NET Core that use &lt;code&gt;:&lt;/code&gt; to represent nested configuration keys.&lt;/p&gt;
&lt;p&gt;The relaxed validation applies to environment variables defined directly in Pod spec,
as well as those injected using &lt;code&gt;envFrom&lt;/code&gt; references to ConfigMaps and Secrets.&lt;/p&gt;
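&lt;p&gt;For example, a .NET-style nested configuration key that would previously have been rejected (the image name is hypothetical):&lt;/p&gt;

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: dotnet-app
spec:
  containers:
  - name: app
    image: example.com/dotnet-app:1.0
    env:
    - name: "Logging:LogLevel:Default"   # ':' in the name is now accepted
      value: "Information"
```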
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/4369&#34;&gt;KEP #4369&lt;/a&gt; led by SIG Node.&lt;/p&gt;
&lt;h3 id=&#34;taint-management-is-separated-from-node-lifecycle&#34;&gt;Taint management is separated from Node lifecycle&lt;/h3&gt;
&lt;p&gt;Historically, the &lt;code&gt;TaintManager&lt;/code&gt;&#39;s logic for applying NoSchedule and NoExecute taints to nodes based on their condition (NotReady, Unreachable, etc.) was tightly coupled with the node lifecycle controller. This tight coupling made the code harder to maintain and test, and it also limited the flexibility of the taint-based eviction mechanism.
This KEP refactors the &lt;code&gt;TaintManager&lt;/code&gt; into its own separate controller within the Kubernetes controller manager. It is an internal architectural improvement designed to increase code modularity and maintainability. This change allows the logic for taint-based evictions to be tested and evolved independently, but it has no direct user-facing impact on how taints are used.&lt;/p&gt;
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/3902&#34;&gt;KEP #3902&lt;/a&gt; led by SIG Scheduling and SIG Node.&lt;/p&gt;
&lt;h2 id=&#34;new-features-in-beta&#34;&gt;New features in Beta&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;This is a selection of some of the improvements that are now beta following the v1.34 release.&lt;/em&gt;&lt;/p&gt;
&lt;h3 id=&#34;pod-level-resource-requests-and-limits&#34;&gt;Pod-level resource requests and limits&lt;/h3&gt;
&lt;p&gt;Defining resource needs for Pods with multiple containers has been challenging,
as requests and limits could only be set on a per-container basis.
This forced developers to either over-provision resources for each container or meticulously
divide the total desired resources, making configuration complex and often leading to
inefficient resource allocation.
To simplify this, the ability to specify resource requests and limits at the Pod level was introduced.
This allows developers to define an overall resource budget for a Pod,
which is then shared among its constituent containers.
This feature was introduced as alpha in v1.32 and has graduated to beta in v1.34,
with HPA now supporting pod-level resource specifications.&lt;/p&gt;
&lt;p&gt;The primary benefit is a more intuitive and straightforward way to manage resources for multi-container Pods.
It ensures that the total resources used by all containers do not exceed the Pod&#39;s defined limits,
leading to better resource planning, more accurate scheduling,
and more efficient utilization of cluster resources.&lt;/p&gt;
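&lt;p&gt;A sketch of a Pod-level resource budget shared by two containers (names and images are illustrative):&lt;/p&gt;

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-level-budget
spec:
  resources:          # budget shared by all containers in the Pod
    requests:
      cpu: "1"
      memory: 1Gi
    limits:
      memory: 2Gi
  containers:
  - name: app
    image: nginx
  - name: sidecar-logger
    image: busybox
    command: ["sh", "-c", "tail -f /dev/null"]
```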
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/2837&#34;&gt;KEP #2837&lt;/a&gt; led by SIG Scheduling and SIG Autoscaling.&lt;/p&gt;
&lt;h3 id=&#34;kuberc-file-for-kubectl-user-preferences&#34;&gt;&lt;code&gt;.kuberc&lt;/code&gt; file for &lt;code&gt;kubectl&lt;/code&gt; user preferences&lt;/h3&gt;
&lt;p&gt;A &lt;code&gt;.kuberc&lt;/code&gt; configuration file allows you to define preferences for &lt;code&gt;kubectl&lt;/code&gt;, such as default options and command aliases. Unlike the kubeconfig file, the &lt;code&gt;.kuberc&lt;/code&gt; configuration file does not contain cluster details, usernames or passwords.&lt;br&gt;
This feature was introduced as alpha in v1.33, gated behind the environment variable &lt;code&gt;KUBECTL_KUBERC&lt;/code&gt;. It has graduated to beta in v1.34 and is enabled by default.&lt;/p&gt;
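&lt;p&gt;A small, illustrative &lt;code&gt;kuberc&lt;/code&gt; file (check the kubectl documentation for the exact schema your version expects):&lt;/p&gt;

```yaml
# ~/.kube/kuberc
apiVersion: kubectl.config.k8s.io/v1beta1
kind: Preference
defaults:
- command: delete
  options:
  - name: interactive     # make 'kubectl delete' prompt by default
    default: "true"
aliases:
- name: getn              # 'kubectl getn' expands to 'kubectl get namespaces'
  command: get
  appendArgs:
  - namespaces
```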
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/3104&#34;&gt;KEP #3104&lt;/a&gt; led by SIG CLI.&lt;/p&gt;
&lt;h3 id=&#34;external-serviceaccount-token-signing&#34;&gt;External ServiceAccount token signing&lt;/h3&gt;
&lt;p&gt;Traditionally, Kubernetes manages ServiceAccount tokens using static signing keys that are loaded from disk at &lt;code&gt;kube-apiserver&lt;/code&gt; startup. This feature introduces an &lt;code&gt;ExternalJWTSigner&lt;/code&gt; gRPC service for out-of-process signing, enabling Kubernetes distributions to integrate with external key management solutions (for example, HSMs, cloud KMSes) for ServiceAccount token signing instead of static disk-based keys.&lt;/p&gt;
&lt;p&gt;Introduced as alpha in v1.32, this external JWT signing capability advances to beta and is enabled by default in v1.34.&lt;/p&gt;
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/740&#34;&gt;KEP #740&lt;/a&gt; led by SIG Auth.&lt;/p&gt;
&lt;h3 id=&#34;dra-features-in-beta&#34;&gt;DRA features in beta&lt;/h3&gt;
&lt;h4 id=&#34;admin-access-for-secure-resource-monitoring&#34;&gt;Admin access for secure resource monitoring&lt;/h4&gt;
&lt;p&gt;DRA supports controlled administrative access via the &lt;code&gt;adminAccess&lt;/code&gt; field in ResourceClaims or ResourceClaimTemplates, allowing cluster operators to access devices already in use by others for monitoring or diagnostics. This privileged mode is limited to users authorized to create such objects in namespaces labeled &lt;code&gt;resource.k8s.io/admin-access: &amp;quot;true&amp;quot;&lt;/code&gt;, ensuring regular workloads remain unaffected. Graduating to beta in v1.34, this feature provides secure introspection capabilities while preserving workload isolation through namespace-based authorization checks.&lt;/p&gt;
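&lt;p&gt;A sketch of a privileged monitoring claim; the device class and namespace are hypothetical, and the namespace must carry the label described above:&lt;/p&gt;

```yaml
apiVersion: resource.k8s.io/v1
kind: ResourceClaim
metadata:
  name: gpu-diagnostics
  namespace: monitoring   # namespace labeled resource.k8s.io/admin-access: "true"
spec:
  devices:
    requests:
    - name: inspect-gpu
      exactly:
        deviceClassName: gpu.example.com   # hypothetical device class
        adminAccess: true                  # grants access to in-use devices
```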
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/5018&#34;&gt;KEP #5018&lt;/a&gt; led by WG Device Management and SIG Auth.&lt;/p&gt;
&lt;h4 id=&#34;prioritized-alternatives-in-resourceclaims-and-resourceclaimtemplates&#34;&gt;Prioritized alternatives in ResourceClaims and ResourceClaimTemplates&lt;/h4&gt;
&lt;p&gt;While a workload might run best on a single high-performance GPU, it might also be able to run on two mid-level GPUs.&lt;br&gt;
With the feature gate &lt;code&gt;DRAPrioritizedList&lt;/code&gt; (now enabled by default), ResourceClaims and ResourceClaimTemplates get a new field named &lt;code&gt;firstAvailable&lt;/code&gt;. This field is an ordered list that allows users to specify that a request may be satisfied in different ways, including allocating nothing at all if specific hardware is not available. The scheduler will attempt to satisfy the alternatives in the list in order, so the workload will be allocated the best set of devices available in the cluster.&lt;/p&gt;
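&lt;p&gt;A sketch of a claim expressing the GPU example above (the device class names are hypothetical):&lt;/p&gt;

```yaml
apiVersion: resource.k8s.io/v1
kind: ResourceClaim
metadata:
  name: gpu-alternatives
spec:
  devices:
    requests:
    - name: gpus
      firstAvailable:      # alternatives tried in order; first match wins
      - name: one-large
        deviceClassName: large-gpu.example.com
        count: 1
      - name: two-medium
        deviceClassName: medium-gpu.example.com
        count: 2
```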
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/4816&#34;&gt;KEP #4816&lt;/a&gt; led by WG Device Management.&lt;/p&gt;
&lt;h4 id=&#34;the-kubelet-reports-allocated-dra-resources&#34;&gt;The &lt;code&gt;kubelet&lt;/code&gt; reports allocated DRA resources&lt;/h4&gt;
&lt;p&gt;The &lt;code&gt;kubelet&lt;/code&gt;&#39;s API has been updated to report on Pod resources allocated through DRA. This allows node monitoring agents to discover the allocated DRA resources for Pods on a node. Additionally, it enables node components to use the PodResourcesAPI and leverage this DRA information when developing new features and integrations.&lt;br&gt;
Starting from Kubernetes v1.34, this feature is enabled by default.&lt;/p&gt;
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/3695&#34;&gt;KEP #3695&lt;/a&gt; led by WG Device Management.&lt;/p&gt;
&lt;h3 id=&#34;kube-scheduler-non-blocking-api-calls&#34;&gt;&lt;code&gt;kube-scheduler&lt;/code&gt; non-blocking API calls&lt;/h3&gt;
&lt;p&gt;Previously, the &lt;code&gt;kube-scheduler&lt;/code&gt; made blocking API calls during scheduling cycles, creating performance bottlenecks. This feature introduces asynchronous API handling through a prioritized queue system with request deduplication, allowing the scheduler to continue processing Pods while API operations complete in the background. Key benefits include reduced scheduling latency, prevention of scheduler thread starvation during API delays, and immediate retry capability for unschedulable Pods. The implementation maintains backward compatibility and adds metrics for monitoring pending API operations.&lt;/p&gt;
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/5229&#34;&gt;KEP #5229&lt;/a&gt; led by SIG Scheduling.&lt;/p&gt;
&lt;h3 id=&#34;mutating-admission-policies&#34;&gt;Mutating admission policies&lt;/h3&gt;
&lt;p&gt;&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/reference/access-authn-authz/mutating-admission-policy/&#34;&gt;MutatingAdmissionPolicies&lt;/a&gt; offer a declarative, in-process alternative to mutating admission webhooks. This feature leverages CEL&#39;s object instantiation and JSON Patch strategies, combined with Server Side Apply’s merge algorithms.&lt;br&gt;
This significantly simplifies admission control by allowing administrators to define mutation rules directly in the API server.&lt;br&gt;
Introduced as alpha in v1.32, mutating admission policies have graduated to beta in v1.34.&lt;/p&gt;
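&lt;p&gt;A sketch of a policy that applies a label to new Pods using Server Side Apply semantics (the label and values are illustrative):&lt;/p&gt;

```yaml
apiVersion: admissionregistration.k8s.io/v1beta1
kind: MutatingAdmissionPolicy
metadata:
  name: add-environment-label
spec:
  matchConstraints:
    resourceRules:
    - apiGroups: [""]
      apiVersions: ["v1"]
      operations: ["CREATE"]
      resources: ["pods"]
  failurePolicy: Fail
  reinvocationPolicy: IfNeeded
  mutations:
  - patchType: ApplyConfiguration
    applyConfiguration:
      expression: >
        Object{
          metadata: Object.metadata{
            labels: {"environment": "staging"}
          }
        }
```

&lt;p&gt;A corresponding MutatingAdmissionPolicyBinding is required for the policy to take effect.&lt;/p&gt;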
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/3962&#34;&gt;KEP #3962&lt;/a&gt; led by SIG API Machinery.&lt;/p&gt;
&lt;h3 id=&#34;snapshottable-api-server-cache&#34;&gt;Snapshottable API server cache&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;kube-apiserver&lt;/code&gt;&#39;s caching mechanism (watch cache) efficiently serves requests for the latest observed state. However, &lt;strong&gt;list&lt;/strong&gt; requests for previous states (for example, via pagination or by specifying a &lt;code&gt;resourceVersion&lt;/code&gt;) often bypass this cache and are served directly from etcd. This direct etcd access significantly increases performance costs and can lead to stability issues, particularly with large resources, due to memory pressure from transferring large data blobs.&lt;br&gt;
With the &lt;code&gt;ListFromCacheSnapshot&lt;/code&gt; feature gate enabled by default, &lt;code&gt;kube-apiserver&lt;/code&gt; will attempt to serve the response from a snapshot if one is available at a &lt;code&gt;resourceVersion&lt;/code&gt; no newer than the one requested. The &lt;code&gt;kube-apiserver&lt;/code&gt; starts with no snapshots, creates a new snapshot on every watch event, and keeps them until it detects that etcd has been compacted or until the cache fills with events older than 75 seconds. If the requested &lt;code&gt;resourceVersion&lt;/code&gt; is unavailable, the server falls back to etcd.&lt;/p&gt;
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/4988&#34;&gt;KEP #4988&lt;/a&gt; led by SIG API Machinery.&lt;/p&gt;
&lt;h3 id=&#34;tooling-for-declarative-validation-of-kubernetes-native-types&#34;&gt;Tooling for declarative validation of Kubernetes-native types&lt;/h3&gt;
&lt;p&gt;Prior to this release, validation rules for the
APIs built into Kubernetes were written entirely by hand, which made them difficult for maintainers to discover, understand, improve, or test.
There was no single way to find all the validation rules that might apply to an API.
&lt;em&gt;Declarative validation&lt;/em&gt; benefits Kubernetes maintainers by making API development, maintenance, and review easier while enabling programmatic inspection for better tooling and documentation.
For people using Kubernetes libraries to write their own code
(for example: a controller), the new approach streamlines adding new fields through IDL tags, rather than complex validation functions.
This change helps speed up API creation by automating validation boilerplate,
and provides more relevant error messages by performing validation on versioned types.&lt;br&gt;
This enhancement (which graduated to beta in v1.33 and continues as beta in v1.34) brings CEL-based validation rules to native Kubernetes types. It allows for more granular and declarative validation to be defined directly in the type definitions, improving API consistency and developer experience.&lt;/p&gt;
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/5073&#34;&gt;KEP #5073&lt;/a&gt; led by SIG API Machinery.&lt;/p&gt;
&lt;h3 id=&#34;streaming-informers-for-list-requests&#34;&gt;Streaming informers for &lt;strong&gt;list&lt;/strong&gt; requests&lt;/h3&gt;
&lt;p&gt;The streaming informers feature, which has been in beta since v1.32, gains further beta refinements in v1.34. This capability allows &lt;strong&gt;list&lt;/strong&gt; requests to return data as a continuous stream of objects from the API server’s watch cache, rather than assembling paged results directly from etcd. By reusing the same mechanics used for &lt;strong&gt;watch&lt;/strong&gt; operations, the API server can serve large datasets while keeping memory usage steady and avoiding allocation spikes that can affect stability.&lt;/p&gt;
&lt;p&gt;In this release, the &lt;code&gt;kube-apiserver&lt;/code&gt; and &lt;code&gt;kube-controller-manager&lt;/code&gt; both take advantage of the new &lt;code&gt;WatchList&lt;/code&gt; mechanism by default. For the &lt;code&gt;kube-apiserver&lt;/code&gt;, this means list requests are streamed more efficiently, while the &lt;code&gt;kube-controller-manager&lt;/code&gt; benefits from a more memory-efficient and predictable way to work with informers. Together, these improvements reduce memory pressure during large list operations, and improve reliability under sustained load, making list streaming more predictable and efficient.&lt;/p&gt;
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/3157&#34;&gt;KEP #3157&lt;/a&gt; led by SIG API Machinery and SIG Scalability.&lt;/p&gt;
&lt;h3 id=&#34;graceful-node-shutdown-handling-for-windows-nodes&#34;&gt;Graceful node shutdown handling for Windows nodes&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;kubelet&lt;/code&gt; on Windows nodes can now detect system shutdown events and begin graceful termination of running Pods. This mirrors existing behavior on Linux and helps ensure workloads exit cleanly during planned shutdowns or restarts.&lt;br&gt;
When the system begins shutting down, the &lt;code&gt;kubelet&lt;/code&gt; reacts by using standard termination logic. It respects the configured lifecycle hooks and grace periods, giving Pods time to stop before the node powers off. The feature relies on Windows pre-shutdown notifications to coordinate this process. This enhancement improves workload reliability during maintenance, restarts, or system updates. It is now in beta and enabled by default.&lt;/p&gt;
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/4802&#34;&gt;KEP #4802&lt;/a&gt; led by SIG Windows.&lt;/p&gt;
&lt;h3 id=&#34;in-place-pod-resize-improvements&#34;&gt;In-place Pod resize improvements&lt;/h3&gt;
&lt;p&gt;Graduated to beta and enabled by default in v1.33, in-place Pod resizing receives further improvements in v1.34. These include support for decreasing memory usage and integration with Pod-level resources.&lt;/p&gt;
&lt;p&gt;This feature remains in beta in v1.34. For detailed usage instructions and examples, refer to the documentation: &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/tasks/configure-pod-container/resize-container-resources/&#34;&gt;Resize CPU and Memory Resources assigned to Containers&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/1287&#34;&gt;KEP #1287&lt;/a&gt; led by SIG Node and SIG Autoscaling.&lt;/p&gt;
&lt;h2 id=&#34;new-features-in-alpha&#34;&gt;New features in Alpha&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;This is a selection of some of the improvements that are now alpha following the v1.34 release.&lt;/em&gt;&lt;/p&gt;
&lt;h3 id=&#34;pod-certificates-for-mtls-authentication&#34;&gt;Pod certificates for mTLS authentication&lt;/h3&gt;
&lt;p&gt;Authenticating workloads within a cluster, especially for communication with the API server, has primarily relied on ServiceAccount tokens. While effective, these tokens aren&#39;t always ideal for establishing a strong, verifiable identity for mutual TLS (mTLS) and can present challenges when integrating with external systems that expect certificate-based authentication.&lt;br&gt;
Kubernetes v1.34 introduces a built-in mechanism for Pods to obtain X.509 certificates via &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/reference/access-authn-authz/certificate-signing-requests/#pod-certificate-requests&#34;&gt;PodCertificateRequests&lt;/a&gt;. The &lt;code&gt;kubelet&lt;/code&gt; can request and manage certificates for Pods, which can then be used to authenticate to the Kubernetes API server and other services using mTLS.
The primary benefit is a more robust and flexible identity mechanism for Pods. It provides a native way to implement strong mTLS authentication without relying solely on bearer tokens, aligning Kubernetes with standard security practices and simplifying integrations with certificate-aware observability and security tooling.&lt;/p&gt;
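&lt;p&gt;As an illustrative sketch (the signer name, image, and paths below are placeholders, and field names may still change while this feature is alpha), a Pod could mount such a certificate through a &lt;code&gt;podCertificate&lt;/code&gt; projected volume source:&lt;/p&gt;

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mtls-client
spec:
  containers:
  - name: app
    image: registry.example/app:latest   # placeholder image
    volumeMounts:
    - name: pod-cert
      mountPath: /var/run/pod-certificate
      readOnly: true
  volumes:
  - name: pod-cert
    projected:
      sources:
      - podCertificate:
          signerName: example.com/my-signer       # placeholder signer
          keyType: ED25519
          credentialBundlePath: credentialbundle.pem
```

&lt;p&gt;Per the KEP, the &lt;code&gt;kubelet&lt;/code&gt; creates the PodCertificateRequest, waits for the signer to issue the certificate, and writes the resulting credentials into the projected volume for the container to use.&lt;/p&gt;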
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/4317&#34;&gt;KEP #4317&lt;/a&gt; led by SIG Auth.&lt;/p&gt;
&lt;h3 id=&#34;restricted-pod-security-standard-now-forbids-remote-probes&#34;&gt;&amp;quot;Restricted&amp;quot; Pod security standard now forbids remote probes&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;host&lt;/code&gt; field within probes and lifecycle handlers allows users to specify an entity other than the &lt;code&gt;podIP&lt;/code&gt; for the &lt;code&gt;kubelet&lt;/code&gt; to probe.
However, this opens up a route for misuse and for attacks that bypass security controls, since the &lt;code&gt;host&lt;/code&gt; field could be set to &lt;strong&gt;any&lt;/strong&gt; value, including security sensitive external hosts, or localhost on the node.
In Kubernetes v1.34, Pods only meet the
&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/security/pod-security-standards/#restricted&#34;&gt;Restricted&lt;/a&gt;
Pod security standard if they either leave the &lt;code&gt;host&lt;/code&gt; field unset or do not use this
kind of probe at all.
You can use &lt;em&gt;Pod security admission&lt;/em&gt;, or a third party solution, to enforce that Pods meet this standard. Because these are security controls, check
the documentation to understand the limitations and behavior of the enforcement mechanism you choose.&lt;/p&gt;
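&lt;p&gt;As a hedged illustration, the first fragment below now fails the Restricted standard, while the second meets it (paths and ports are arbitrary):&lt;/p&gt;

```yaml
# Fails the Restricted standard: probe host explicitly set
livenessProbe:
  httpGet:
    host: 203.0.113.10   # any value here, even 127.0.0.1, is disallowed
    path: /healthz
    port: 8080
---
# Meets the Restricted standard: host left unset, so the kubelet probes the podIP
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
```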
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/4940&#34;&gt;KEP #4940&lt;/a&gt; led by SIG Auth.&lt;/p&gt;
&lt;h3 id=&#34;use-status-nominatednodename-to-express-pod-placement&#34;&gt;Use &lt;code&gt;.status.nominatedNodeName&lt;/code&gt; to express Pod placement&lt;/h3&gt;
&lt;p&gt;When the &lt;code&gt;kube-scheduler&lt;/code&gt; takes time to bind Pods to Nodes, cluster autoscalers may not understand that a Pod will be bound to a specific Node. Consequently, they may mistakenly consider the Node as underutilized and delete it.&lt;br&gt;
To address this issue, the &lt;code&gt;kube-scheduler&lt;/code&gt; can use &lt;code&gt;.status.nominatedNodeName&lt;/code&gt; not only to indicate ongoing preemption but also to express Pod placement intentions. By enabling the &lt;code&gt;NominatedNodeNameForExpectation&lt;/code&gt; feature gate, the scheduler uses this field to indicate where a Pod will be bound. This exposes internal reservations to help external components make informed decisions.&lt;/p&gt;
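&lt;p&gt;Conceptually, an external component such as a cluster autoscaler could then observe a status like the following (the node name is hypothetical) and avoid scaling that Node down:&lt;/p&gt;

```yaml
status:
  phase: Pending
  nominatedNodeName: worker-pool-1-abc   # placement intent, set before binding completes
```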
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/5278&#34;&gt;KEP #5278&lt;/a&gt; led by SIG Scheduling.&lt;/p&gt;
&lt;h3 id=&#34;dra-features-in-alpha&#34;&gt;DRA features in alpha&lt;/h3&gt;
&lt;h4 id=&#34;resource-health-status-for-dra&#34;&gt;Resource health status for DRA&lt;/h4&gt;
&lt;p&gt;It can be difficult to know when a Pod is using a device that has failed or is temporarily unhealthy, which makes troubleshooting Pod crashes challenging or impossible.&lt;br&gt;
Resource Health Status for DRA improves observability by exposing the health status of devices allocated to a Pod in the Pod’s status. This makes it easier to identify the cause of Pod issues related to unhealthy devices and respond appropriately.&lt;br&gt;
To enable this functionality, the &lt;code&gt;ResourceHealthStatus&lt;/code&gt; feature gate must be enabled, and the DRA driver must implement the &lt;code&gt;DRAResourceHealth&lt;/code&gt; gRPC service.&lt;/p&gt;
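&lt;p&gt;Sketching the reported shape (the container, claim, and device names here are hypothetical; the field layout follows KEP-4680 and may change while alpha), an unhealthy allocated device would surface in the Pod status roughly like this:&lt;/p&gt;

```yaml
status:
  containerStatuses:
  - name: gpu-worker
    allocatedResourcesStatus:
    - name: claim:gpu-claim         # the ResourceClaim backing the device
      resources:
      - resourceID: gpu-0
        health: Unhealthy           # reported via the driver's DRAResourceHealth service
```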
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/4680&#34;&gt;KEP #4680&lt;/a&gt; led by WG Device Management.&lt;/p&gt;
&lt;h4 id=&#34;extended-resource-mapping&#34;&gt;Extended resource mapping&lt;/h4&gt;
&lt;p&gt;Extended resource mapping provides a simpler alternative to DRA&#39;s expressive and flexible approach by offering a straightforward way to describe resource capacity and consumption. This feature enables cluster administrators to advertise DRA-managed resources as &lt;em&gt;extended resources&lt;/em&gt;, allowing application developers and operators to continue using the familiar container’s &lt;code&gt;.spec.resources&lt;/code&gt; syntax to consume them.&lt;br&gt;
This enables existing workloads to adopt DRA without modifications, simplifying the transition to DRA for both application developers and cluster administrators.&lt;/p&gt;
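&lt;p&gt;As a rough sketch (the resource name is a placeholder), a workload keeps requesting the device with the familiar extended-resource syntax, while a DRA driver satisfies the request behind the scenes:&lt;/p&gt;

```yaml
# The container requests the device exactly as it would a classic extended resource
resources:
  limits:
    example.com/gpu: 1
```

&lt;p&gt;On the cluster side, the administrator configures the mapping so that requests for &lt;code&gt;example.com/gpu&lt;/code&gt; are fulfilled by DRA-managed devices; the workload manifest itself needs no changes.&lt;/p&gt;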
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/5004&#34;&gt;KEP #5004&lt;/a&gt; led by WG Device Management.&lt;/p&gt;
&lt;h4 id=&#34;dra-consumable-capacity&#34;&gt;DRA consumable capacity&lt;/h4&gt;
&lt;p&gt;Kubernetes v1.33 added support for resource drivers to advertise slices of a device that are available, rather than exposing the entire device as an all-or-nothing resource. However, this approach couldn&#39;t handle scenarios where device drivers manage fine-grained, dynamic portions of a device resource based on user demand, or share those resources independently of ResourceClaims, which are restricted by their spec and namespace.&lt;br&gt;
Enabling the &lt;code&gt;DRAConsumableCapacity&lt;/code&gt; feature gate
(introduced as alpha in v1.34)
allows resource drivers to share the same device, or even a slice of a device, across multiple ResourceClaims or across multiple DeviceRequests.
The feature also extends the scheduler to support allocating portions of device resources,
as defined in the &lt;code&gt;capacity&lt;/code&gt; field.
This DRA feature improves device sharing across namespaces and claims, tailoring it to Pod needs. It enables drivers to enforce capacity limits, enhances scheduling, and supports new use cases like bandwidth-aware networking and multi-tenant sharing.&lt;/p&gt;
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/5075&#34;&gt;KEP #5075&lt;/a&gt; led by WG Device Management.&lt;/p&gt;
&lt;h4 id=&#34;device-binding-conditions&#34;&gt;Device binding conditions&lt;/h4&gt;
&lt;p&gt;The Kubernetes scheduler gets more reliable by delaying binding a Pod to a Node until its required external resources, such as attachable devices or FPGAs, are confirmed to be ready.&lt;br&gt;
This delay mechanism is implemented in the &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/scheduling-eviction/scheduling-framework/#pre-bind&#34;&gt;PreBind phase&lt;/a&gt; of the scheduling framework. During this phase, the scheduler checks whether all required device conditions are satisfied before proceeding with binding. This enables coordination with external device controllers, ensuring more robust, predictable scheduling.&lt;/p&gt;
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/5007&#34;&gt;KEP #5007&lt;/a&gt; led by WG Device Management.&lt;/p&gt;
&lt;h3 id=&#34;container-restart-rules&#34;&gt;Container restart rules&lt;/h3&gt;
&lt;p&gt;Currently, all containers within a Pod follow the same &lt;code&gt;.spec.restartPolicy&lt;/code&gt; when they exit or crash. However, Pods that run multiple containers might have different restart requirements for each container. For example, you may not want to retry an init container that performs one-time initialization if it fails. Similarly, in ML research environments with long-running training workloads, containers that fail with retriable exit codes should restart quickly in place, rather than triggering Pod recreation and losing progress.&lt;br&gt;
Kubernetes v1.34 introduces the &lt;code&gt;ContainerRestartRules&lt;/code&gt; feature gate. When enabled, a &lt;code&gt;restartPolicy&lt;/code&gt; can be specified for each container within a Pod. A &lt;code&gt;restartPolicyRules&lt;/code&gt; list can also be defined to override &lt;code&gt;restartPolicy&lt;/code&gt; based on the container&#39;s last exit code. This provides the fine-grained control needed to handle complex scenarios and to make better use of compute resources.&lt;/p&gt;
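&lt;p&gt;As a hedged sketch of the alpha API (field names follow KEP-5307 and may change; the image and exit code are arbitrary), a training container could restart in place only on a retriable exit code:&lt;/p&gt;

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: trainer
spec:
  restartPolicy: Never          # Pod-level default: do not restart
  containers:
  - name: train
    image: registry.example/train:latest   # placeholder image
    restartPolicy: Never        # container-level default
    restartPolicyRules:
    - action: Restart           # override: restart in place...
      exitCodes:
        operator: In            # ...when the exit code matches
        values: [42]
```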
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/5307&#34;&gt;KEP #5307&lt;/a&gt; led by SIG Node.&lt;/p&gt;
&lt;h3 id=&#34;load-environment-variables-from-files-created-in-runtime&#34;&gt;Load environment variables from files created in runtime&lt;/h3&gt;
&lt;p&gt;Application developers have long requested greater flexibility in declaring environment variables.
Traditionally, environment variables are declared on the API server side via static values, ConfigMaps, or Secrets.&lt;/p&gt;
&lt;p&gt;Behind the &lt;code&gt;EnvFiles&lt;/code&gt; feature gate, Kubernetes v1.34 introduces the ability to declare environment variables at runtime.
One container (typically an init container) can generate the variable and store it in a file,
and a subsequent container can start with the environment variable loaded from that file.
This approach eliminates the need to &amp;quot;wrap&amp;quot; the target container&#39;s entry point,
enabling more flexible in-Pod container orchestration.&lt;/p&gt;
&lt;p&gt;This feature particularly benefits AI/ML training workloads,
where each Pod in a training Job requires initialization with runtime-defined values.&lt;/p&gt;
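&lt;p&gt;A minimal sketch, assuming the alpha &lt;code&gt;fileKeyRef&lt;/code&gt; API from KEP-3721 (field names may change): an init container writes a key-value file into a shared volume, and the main container consumes one key as an environment variable:&lt;/p&gt;

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: envfile-demo
spec:
  restartPolicy: Never
  volumes:
  - name: shared
    emptyDir: {}
  initContainers:
  - name: generate
    image: busybox
    command: [sh, -c, printf TOKEN=abc123 | tee /shared/env.txt]
    volumeMounts:
    - name: shared
      mountPath: /shared
  containers:
  - name: app
    image: busybox
    command: [sh, -c, env]
    env:
    - name: TOKEN
      valueFrom:
        fileKeyRef:             # alpha: value is read from the file at startup
          volumeName: shared
          path: env.txt
          key: TOKEN
```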
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/3721&#34;&gt;KEP #3721&lt;/a&gt; led by SIG Node.&lt;/p&gt;
&lt;h2 id=&#34;graduations-deprecations-and-removals-in-v1-34&#34;&gt;Graduations, deprecations, and removals in v1.34&lt;/h2&gt;
&lt;h3 id=&#34;graduations-to-stable&#34;&gt;Graduations to stable&lt;/h3&gt;
&lt;p&gt;This lists all the features that graduated to stable (also known as &lt;em&gt;general availability&lt;/em&gt;). For a full list of updates including new features and graduations from alpha to beta, see the release notes.&lt;/p&gt;
&lt;p&gt;This release includes a total of 23 enhancements promoted to stable:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://kep.k8s.io/4369&#34;&gt;Allow almost all printable ASCII characters in environment variables&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://kep.k8s.io/3939&#34;&gt;Allow for recreation of pods once fully terminated in the job controller&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://kep.k8s.io/4818&#34;&gt;Allow zero value for Sleep Action of PreStop Hook&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://kep.k8s.io/647&#34;&gt;API Server tracing&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://kep.k8s.io/24&#34;&gt;AppArmor support&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://kep.k8s.io/4601&#34;&gt;Authorize with Field and Label Selectors&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://kep.k8s.io/2340&#34;&gt;Consistent Reads from Cache&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://kep.k8s.io/3902&#34;&gt;Decouple TaintManager from NodeLifecycleController&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://kep.k8s.io/4033&#34;&gt;Discover cgroup driver from CRI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://kep.k8s.io/4381&#34;&gt;DRA: structured parameters&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://kep.k8s.io/3960&#34;&gt;Introducing Sleep Action for PreStop Hook&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://kep.k8s.io/2831&#34;&gt;Kubelet OpenTelemetry Tracing&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://kep.k8s.io/3751&#34;&gt;Kubernetes VolumeAttributesClass ModifyVolume&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://kep.k8s.io/2400&#34;&gt;Node memory swap support&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://kep.k8s.io/4633&#34;&gt;Only allow anonymous auth for configured endpoints&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://kep.k8s.io/5080&#34;&gt;Ordered namespace deletion&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://kep.k8s.io/4247&#34;&gt;Per-plugin callback functions for accurate requeueing in kube-scheduler&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://kep.k8s.io/4427&#34;&gt;Relaxed DNS search string validation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://kep.k8s.io/4568&#34;&gt;Resilient Watchcache Initialization&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://kep.k8s.io/5116&#34;&gt;Streaming Encoding for LIST Responses&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://kep.k8s.io/3331&#34;&gt;Structured Authentication Config&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://kep.k8s.io/5100&#34;&gt;Support for Direct Service Return (DSR) and overlay networking in Windows kube-proxy&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://kep.k8s.io/1790&#34;&gt;Support recovery from volume expansion failure&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&#34;deprecations-and-removals&#34;&gt;Deprecations and removals&lt;/h3&gt;
&lt;p&gt;As Kubernetes develops and matures, features may be deprecated, removed, or replaced with better
ones to improve the project&#39;s overall health. See the Kubernetes
&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/reference/using-api/deprecation-policy/&#34;&gt;deprecation and removal policy&lt;/a&gt; for more details on
this process. Kubernetes v1.34 includes several deprecations.&lt;/p&gt;
&lt;h4 id=&#34;manual-cgroup-driver-configuration-is-deprecated&#34;&gt;Manual cgroup driver configuration is deprecated&lt;/h4&gt;
&lt;p&gt;Historically, configuring the correct cgroup driver has been a pain point for users running Kubernetes clusters.
Kubernetes v1.28 added a way for the &lt;code&gt;kubelet&lt;/code&gt;
to query the CRI implementation and find which cgroup driver to use. That automated detection is now
&lt;strong&gt;strongly recommended&lt;/strong&gt; and support for it has graduated to stable in v1.34.
If your CRI container runtime does not support the
ability to report the cgroup driver it needs, you
should upgrade or change your container runtime.
The &lt;code&gt;cgroupDriver&lt;/code&gt; configuration setting in the &lt;code&gt;kubelet&lt;/code&gt; configuration file is now deprecated.
The corresponding command-line option &lt;code&gt;--cgroup-driver&lt;/code&gt; was previously deprecated,
as Kubernetes recommends using the configuration file instead.
Both the configuration setting and command-line option will be removed in a future release;
that removal will not happen before the v1.36 minor release.&lt;/p&gt;
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/4033&#34;&gt;KEP #4033&lt;/a&gt; led by SIG Node.&lt;/p&gt;
&lt;h4 id=&#34;kubernetes-to-end-containerd-1-x-support-in-v1-36&#34;&gt;Kubernetes to end containerd 1.x support in v1.36&lt;/h4&gt;
&lt;p&gt;While Kubernetes v1.34 still supports containerd 1.7 and other LTS releases of containerd,
as a consequence of automated cgroup driver detection, the Kubernetes SIG Node community
has formally agreed upon a final support timeline for containerd v1.X.
The last Kubernetes release to offer this support will be v1.35 (aligned with containerd 1.7 EOL).
This is an early warning: if you are using containerd 1.x, consider switching to containerd 2.0 or later soon.
You can monitor the &lt;code&gt;kubelet_cri_losing_support&lt;/code&gt; metric to determine whether any nodes in your
cluster are running a containerd version that will soon be out of support.&lt;/p&gt;
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/4033&#34;&gt;KEP #4033&lt;/a&gt; led by SIG Node.&lt;/p&gt;
&lt;h4 id=&#34;preferclose-traffic-distribution-is-deprecated&#34;&gt;&lt;code&gt;PreferClose&lt;/code&gt; traffic distribution is deprecated&lt;/h4&gt;
&lt;p&gt;The &lt;code&gt;spec.trafficDistribution&lt;/code&gt; field within a Kubernetes &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/services-networking/service/&#34;&gt;Service&lt;/a&gt; allows users to express preferences for how traffic should be routed to Service endpoints.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://kep.k8s.io/3015&#34;&gt;KEP-3015&lt;/a&gt; deprecates &lt;code&gt;PreferClose&lt;/code&gt; and introduces two additional values: &lt;code&gt;PreferSameZone&lt;/code&gt; and &lt;code&gt;PreferSameNode&lt;/code&gt;. &lt;code&gt;PreferSameZone&lt;/code&gt; is an alias for the existing &lt;code&gt;PreferClose&lt;/code&gt; to clarify its semantics. &lt;code&gt;PreferSameNode&lt;/code&gt; allows connections to be delivered to a local endpoint when possible, falling back to a remote endpoint when not possible.&lt;/p&gt;
&lt;p&gt;This feature was introduced in v1.33 behind the &lt;code&gt;PreferSameTrafficDistribution&lt;/code&gt; feature gate. It has graduated to beta in v1.34 and is enabled by default.&lt;/p&gt;
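&lt;p&gt;Opting in is a one-field change on the Service; for example (the Service name and selector are illustrative):&lt;/p&gt;

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: my-app
  ports:
  - port: 80
  trafficDistribution: PreferSameNode   # falls back to other endpoints when no local one exists
```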
&lt;p&gt;This work was done as part of &lt;a href=&#34;https://kep.k8s.io/3015&#34;&gt;KEP #3015&lt;/a&gt; led by SIG Network.&lt;/p&gt;
&lt;h2 id=&#34;release-notes&#34;&gt;Release notes&lt;/h2&gt;
&lt;p&gt;Check out the full details of the Kubernetes v1.34 release in our &lt;a href=&#34;https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.34.md&#34;&gt;release notes&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;availability&#34;&gt;Availability&lt;/h2&gt;
&lt;p&gt;Kubernetes v1.34 is available for download on &lt;a href=&#34;https://github.com/kubernetes/kubernetes/releases/tag/v1.34.0&#34;&gt;GitHub&lt;/a&gt; or on the &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/releases/download/&#34;&gt;Kubernetes download page&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;To get started with Kubernetes, check out these &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/tutorials/&#34;&gt;interactive tutorials&lt;/a&gt; or run local Kubernetes clusters using &lt;a href=&#34;https://minikube.sigs.k8s.io/&#34;&gt;minikube&lt;/a&gt;. You can also easily install v1.34 using &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/&#34;&gt;kubeadm&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;release-team&#34;&gt;Release Team&lt;/h2&gt;
&lt;p&gt;Kubernetes is only possible with the support, commitment, and hard work of its community. Each release team is made up of dedicated community volunteers who work together to build the many pieces that make up the Kubernetes releases you rely on. This requires the specialized skills of people from all corners of our community, from the code itself to its documentation and project management.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/cncf/memorials/blob/main/rodolfo-martinez.md&#34;&gt;We honor the memory of Rodolfo &amp;quot;Rodo&amp;quot; Martínez Vega&lt;/a&gt;, a dedicated contributor whose passion for technology and community building left a mark on the Kubernetes community. Rodo served as a member of the Kubernetes Release Team across multiple releases, including v1.22-v1.23 and v1.25-v1.30, demonstrating unwavering commitment to the project&#39;s success and stability.&lt;br&gt;
Beyond his Release Team contributions, Rodo was deeply involved in fostering the Cloud Native LATAM community, helping to bridge language and cultural barriers in the space. His work on the Spanish version of Kubernetes documentation and the CNCF Glossary exemplified his dedication to making knowledge accessible to Spanish-speaking developers worldwide. Rodo&#39;s legacy lives on through the countless community members he mentored, the releases he helped deliver, and the vibrant LATAM Kubernetes community he helped cultivate.&lt;/p&gt;
&lt;p&gt;We would like to thank the entire &lt;a href=&#34;https://github.com/kubernetes/sig-release/blob/master/releases/release-1.34/release-team.md&#34;&gt;Release Team&lt;/a&gt; for the hours spent hard at work to deliver the Kubernetes v1.34 release to our community. The Release Team&#39;s membership ranges from first-time shadows to returning team leads with experience forged over several release cycles. A very special thanks goes out to our release lead, Vyom Yadav, for guiding us through a successful release cycle, for his hands-on approach to solving challenges, and for bringing the energy and care that drives our community forward.&lt;/p&gt;
&lt;h2 id=&#34;project-velocity&#34;&gt;Project Velocity&lt;/h2&gt;
&lt;p&gt;The CNCF K8s &lt;a href=&#34;https://k8s.devstats.cncf.io/d/11/companies-contributing-in-repository-groups?orgId=1&amp;var-period=m&amp;var-repogroup_name=All&#34;&gt;DevStats&lt;/a&gt; project aggregates a number of interesting data points related to the velocity of Kubernetes and various sub-projects. This includes everything from individual contributions to the number of companies that are contributing and is an illustration of the depth and breadth of effort that goes into evolving this ecosystem.&lt;/p&gt;
&lt;p&gt;During the v1.34 release cycle, which spanned 15 weeks from 19th May 2025 to 27th August 2025, Kubernetes received contributions from as many as 106 different companies and 491 individuals. In the wider cloud native ecosystem, the figure goes up to 370 companies, counting 2235 total contributors.&lt;/p&gt;
&lt;p&gt;Note that a &amp;quot;contribution&amp;quot; is counted when someone makes a commit, creates an issue or PR, reviews a PR, or comments on issues and PRs (this includes blogs and documentation).&lt;br&gt;
If you are interested in contributing, visit &lt;a href=&#34;https://www.kubernetes.dev/docs/guide/#getting-started&#34;&gt;Getting Started&lt;/a&gt; on our contributor website.&lt;/p&gt;
&lt;p&gt;Source for this data:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://k8s.devstats.cncf.io/d/11/companies-contributing-in-repository-groups?orgId=1&amp;from=1747609200000&amp;to=1756335599000&amp;var-period=d28&amp;var-repogroup_name=Kubernetes&amp;var-repo_name=kubernetes%2Fkubernetes&#34;&gt;Companies contributing to Kubernetes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://k8s.devstats.cncf.io/d/11/companies-contributing-in-repository-groups?orgId=1&amp;from=1747609200000&amp;to=1756335599000&amp;var-period=d28&amp;var-repogroup_name=All&amp;var-repo_name=kubernetes%2Fkubernetes&#34;&gt;Overall ecosystem contributions&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;event-update&#34;&gt;Event Update&lt;/h2&gt;
&lt;p&gt;Explore upcoming Kubernetes and cloud native events, including KubeCon + CloudNativeCon, KCD, and other notable conferences worldwide. Stay informed and get involved with the Kubernetes community!&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;August 2025&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://community.cncf.io/events/details/cncf-kcd-colombia-presents-kcd-colombia-2025/&#34;&gt;&lt;strong&gt;KCD - Kubernetes Community Days:  Colombia&lt;/strong&gt;&lt;/a&gt;: Aug 28, 2025 | Bogotá, Colombia&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;September 2025&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://community.cncf.io/events/details/cncf-cloud-native-sydney-presents-cloudcon-sydney-sydney-international-convention-centre-910-september/&#34;&gt;&lt;strong&gt;CloudCon Sydney&lt;/strong&gt;&lt;/a&gt;: Sep 9–10, 2025 | Sydney, Australia.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://community.cncf.io/events/details/cncf-kcd-sf-bay-area-presents-kcd-san-francisco-bay-area/&#34;&gt;&lt;strong&gt;KCD - Kubernetes Community Days: San Francisco Bay Area&lt;/strong&gt;&lt;/a&gt;: Sep 9, 2025 | San Francisco, USA&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://community.cncf.io/events/details/cncf-kcd-washington-dc-presents-kcd-washington-dc-2025/&#34;&gt;&lt;strong&gt;KCD - Kubernetes Community Days: Washington DC&lt;/strong&gt;&lt;/a&gt;: Sep 16, 2025 | Washington, D.C., USA&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://community.cncf.io/events/details/cncf-kcd-sofia-presents-kubernetes-community-days-sofia/&#34;&gt;&lt;strong&gt;KCD - Kubernetes Community Days: Sofia&lt;/strong&gt;&lt;/a&gt;: Sep 18, 2025 | Sofia, Bulgaria&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://community.cncf.io/events/details/cncf-kcd-el-salvador-presents-kcd-el-salvador/&#34;&gt;&lt;strong&gt;KCD - Kubernetes Community Days: El Salvador&lt;/strong&gt;&lt;/a&gt;: Sep 20, 2025 | San Salvador, El Salvador&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;October 2025&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://community.cncf.io/events/details/cncf-kcd-warsaw-presents-kcd-warsaw-2025/&#34;&gt;&lt;strong&gt;KCD - Kubernetes Community Days: Warsaw&lt;/strong&gt;&lt;/a&gt;: Oct 9, 2025 | Warsaw, Poland&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://community.cncf.io/events/details/cncf-kcd-uk-presents-kubernetes-community-days-uk-edinburgh-2025/&#34;&gt;&lt;strong&gt;KCD - Kubernetes Community Days: Edinburgh&lt;/strong&gt;&lt;/a&gt;: Oct 21, 2025 | Edinburgh, United Kingdom&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://community.cncf.io/events/details/cncf-kcd-sri-lanka-presents-kcd-sri-lanka-2025/&#34;&gt;&lt;strong&gt;KCD - Kubernetes Community Days: Sri Lanka&lt;/strong&gt;&lt;/a&gt;: Oct 26, 2025 | Colombo, Sri Lanka&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;November 2025&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://community.cncf.io/events/details/cncf-kcd-porto-presents-kcd-porto-2025/&#34;&gt;&lt;strong&gt;KCD - Kubernetes Community Days: Porto&lt;/strong&gt;&lt;/a&gt;: Nov 3, 2025 | Porto, Portugal&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://events.linuxfoundation.org/kubecon-cloudnativecon-north-america/&#34;&gt;&lt;strong&gt;KubeCon + CloudNativeCon North America 2025&lt;/strong&gt;&lt;/a&gt;: Nov 10-13, 2025 | Atlanta, USA&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://sessionize.com/kcd-hangzhou-and-oicd-2025/&#34;&gt;&lt;strong&gt;KCD - Kubernetes Community Days: Hangzhou&lt;/strong&gt;&lt;/a&gt;: Nov 15, 2025 | Hangzhou, China&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;December 2025&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://community.cncf.io/events/details/cncf-kcd-suisse-romande-presents-kcd-suisse-romande/&#34;&gt;&lt;strong&gt;KCD - Kubernetes Community Days: Suisse Romande&lt;/strong&gt;&lt;/a&gt;: Dec 4, 2025 | Geneva, Switzerland&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You can find the latest event details &lt;a href=&#34;https://community.cncf.io/events/#/list&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;upcoming-release-webinar&#34;&gt;Upcoming Release Webinar&lt;/h2&gt;
&lt;p&gt;Join members of the Kubernetes v1.34 Release Team on &lt;strong&gt;Wednesday, September 24th 2025 at 4:00 PM (UTC)&lt;/strong&gt;, to learn about the release highlights of this release. For more information and registration, visit the &lt;a href=&#34;https://community.cncf.io/events/details/cncf-cncf-online-programs-presents-cloud-native-live-kubernetes-v134-release/&#34;&gt;event page&lt;/a&gt; on the CNCF Online Programs site.&lt;/p&gt;
&lt;h2 id=&#34;get-involved&#34;&gt;Get Involved&lt;/h2&gt;
&lt;p&gt;The simplest way to get involved with Kubernetes is by joining one of the many &lt;a href=&#34;https://github.com/kubernetes/community/blob/master/sig-list.md&#34;&gt;Special Interest Groups&lt;/a&gt; (SIGs) that align with your interests. Have something you’d like to broadcast to the Kubernetes community? Share your voice at our weekly &lt;a href=&#34;https://github.com/kubernetes/community/tree/master/communication&#34;&gt;community meeting&lt;/a&gt;, and through the channels below. Thank you for your continued feedback and support.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Follow us on Bluesky &lt;a href=&#34;https://bsky.app/profile/kubernetes.io&#34;&gt;@Kubernetesio&lt;/a&gt; for the latest updates&lt;/li&gt;
&lt;li&gt;Join the community discussion on &lt;a href=&#34;https://discuss.kubernetes.io/&#34;&gt;Discuss&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Join the community on &lt;a href=&#34;http://slack.k8s.io/&#34;&gt;Slack&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Post questions (or answer questions) on &lt;a href=&#34;http://stackoverflow.com/questions/tagged/kubernetes&#34;&gt;Stack Overflow&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Share your Kubernetes &lt;a href=&#34;https://docs.google.com/a/linuxfoundation.org/forms/d/e/1FAIpQLScuI7Ye3VQHQTwBASrgkjQDSS5TP0g3AXfFhwSM9YpHgxRKFA/viewform&#34;&gt;story&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Read more about what’s happening with Kubernetes on the &lt;a href=&#34;https://kubernetes.io/blog/&#34;&gt;blog&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Learn more about the &lt;a href=&#34;https://github.com/kubernetes/sig-release/tree/master/release-team&#34;&gt;Kubernetes Release Team&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

      </description>
    </item>
    
    <item>
      <title>Tuning Linux Swap for Kubernetes: A Deep Dive</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/08/19/tuning-linux-swap-for-kubernetes-a-deep-dive/</link>
      <pubDate>Tue, 19 Aug 2025 10:30:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/08/19/tuning-linux-swap-for-kubernetes-a-deep-dive/</guid>
      <description>
        
        
        &lt;p&gt;The Kubernetes &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/cluster-administration/swap-memory-management/&#34;&gt;NodeSwap feature&lt;/a&gt;, likely to graduate to &lt;em&gt;stable&lt;/em&gt; in the upcoming Kubernetes v1.34 release,
allows swap usage:
a significant shift from the conventional practice of disabling swap for performance predictability.
This article focuses exclusively on tuning swap on Linux nodes, where this feature is available. By allowing Linux nodes to use secondary storage for additional virtual memory when physical RAM is exhausted, node swap support aims to improve resource utilization and reduce out-of-memory (OOM) kills.&lt;/p&gt;
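&lt;p&gt;To allow swap on a Linux node, the &lt;code&gt;kubelet&lt;/code&gt; must be permitted to start with swap enabled and configured to let workloads use it; a minimal kubelet configuration sketch looks like this:&lt;/p&gt;

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
failSwapOn: false             # do not refuse to start when swap is present
memorySwap:
  swapBehavior: LimitedSwap   # only Burstable QoS Pods may use swap
```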
&lt;p&gt;However, enabling swap is not a &amp;quot;turn-key&amp;quot; solution. The performance and stability of your nodes under memory pressure are critically dependent on a set of Linux kernel parameters. Misconfiguration can lead to performance degradation and interfere with Kubelet&#39;s eviction logic.&lt;/p&gt;
&lt;p&gt;In this blogpost, I&#39;ll dive into critical Linux kernel parameters that govern swap behavior. I will explore how these parameters influence Kubernetes workload performance, swap utilization, and crucial eviction mechanisms.
I will present various test results showcasing the impact of different configurations, and share my findings on achieving optimal settings for stable and high-performing Kubernetes clusters.&lt;/p&gt;
&lt;h2 id=&#34;introduction-to-linux-swap&#34;&gt;Introduction to Linux swap&lt;/h2&gt;
&lt;p&gt;At a high level, the Linux kernel manages memory through pages, typically 4KiB in size. When physical memory becomes constrained, the kernel&#39;s page replacement algorithm decides which pages to move to swap space. While the exact logic is a sophisticated optimization, this decision-making process is influenced by certain key factors:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Page access patterns (how recently pages are accessed)&lt;/li&gt;
&lt;li&gt;Page dirtiness (whether pages have been modified)&lt;/li&gt;
&lt;li&gt;Memory pressure (how urgently the system needs free memory)&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id=&#34;anonymous-vs-file-backed-memory&#34;&gt;Anonymous vs File-backed memory&lt;/h3&gt;
&lt;p&gt;It is important to understand that not all memory pages are the same. The kernel distinguishes between anonymous and file-backed memory.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Anonymous memory&lt;/strong&gt;: This is memory that is not backed by a specific file on the disk, such as a program&#39;s heap and stack. From the application&#39;s perspective this is private memory, and when the kernel needs to reclaim these pages, it must write them to a dedicated swap device.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;File-backed memory&lt;/strong&gt;: This memory is backed by a file on a filesystem. This includes a program&#39;s executable code, shared libraries, and filesystem caches. When the kernel needs to reclaim these pages, it can simply discard them if they have not been modified (&amp;quot;clean&amp;quot;). If a page has been modified (&amp;quot;dirty&amp;quot;), the kernel must first write the changes back to the file before it can be discarded.&lt;/p&gt;
&lt;p&gt;While a system without swap can still reclaim clean file-backed pages under memory pressure by dropping them, it has no way to offload anonymous memory. Enabling swap provides this capability, allowing the kernel to move less-frequently accessed memory pages to disk, conserving memory and avoiding system-wide OOM kills.&lt;/p&gt;
&lt;h3 id=&#34;key-kernel-parameters-for-swap-tuning&#34;&gt;Key kernel parameters for swap tuning&lt;/h3&gt;
&lt;p&gt;To effectively tune swap behavior, Linux provides several kernel parameters that can be managed via &lt;code&gt;sysctl&lt;/code&gt;.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;vm.swappiness&lt;/code&gt;: This is the most well-known parameter. It is a value from 0 to 200 (100 in older kernels) that controls the kernel&#39;s preference for swapping anonymous memory pages versus reclaiming file-backed memory pages (page cache).
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;High value (e.g., 90+)&lt;/strong&gt;: The kernel will aggressively swap out less-used anonymous memory to make room for the file cache.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Low value (e.g., &amp;lt; 10)&lt;/strong&gt;: The kernel will strongly prefer dropping file-cache pages over swapping anonymous memory.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;code&gt;vm.min_free_kbytes&lt;/code&gt;: This parameter tells the kernel to keep a minimum amount of memory free as a buffer. When the amount of free memory drops below this safety buffer, the kernel starts reclaiming pages more aggressively (swapping, and eventually resorting to OOM kills).
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Function:&lt;/strong&gt; It acts as a safety lever to ensure the kernel has enough memory for critical allocation requests that cannot be deferred.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Impact on swap&lt;/strong&gt;: Setting a higher &lt;code&gt;min_free_kbytes&lt;/code&gt; effectively raises the floor for free memory, causing the kernel to initiate swapping earlier under memory pressure.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;code&gt;vm.watermark_scale_factor&lt;/code&gt;: This setting controls the gap between different watermarks: &lt;code&gt;min&lt;/code&gt;, &lt;code&gt;low&lt;/code&gt; and &lt;code&gt;high&lt;/code&gt;, which are calculated based on &lt;code&gt;min_free_kbytes&lt;/code&gt;.
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Watermarks explained&lt;/strong&gt;:
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;low&lt;/code&gt;: When free memory is below this mark, the &lt;code&gt;kswapd&lt;/code&gt; kernel process wakes up to reclaim pages in the background. This is when a swapping cycle begins.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;min&lt;/code&gt;: When free memory hits this minimum level, allocations stall while the kernel performs aggressive direct reclamation. If reclamation fails, OOM kills follow.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;high&lt;/code&gt;: Memory reclamation stops once the free memory reaches this level.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Impact&lt;/strong&gt;: A higher &lt;code&gt;watermark_scale_factor&lt;/code&gt; creates a larger buffer between the &lt;code&gt;low&lt;/code&gt; and &lt;code&gt;min&lt;/code&gt; watermarks. This gives &lt;code&gt;kswapd&lt;/code&gt; more time to reclaim memory gradually before the system hits a critical state.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In a typical server workload, you might have a long-running process with some memory that becomes &#39;cold&#39;. A higher &lt;code&gt;swappiness&lt;/code&gt; value can free up RAM by swapping out that cold memory, leaving more room for other active processes that benefit from keeping their file cache.&lt;/p&gt;
&lt;p&gt;Tuning the &lt;code&gt;min_free_kbytes&lt;/code&gt; and &lt;code&gt;watermark_scale_factor&lt;/code&gt; parameters to move the swapping window earlier gives &lt;code&gt;kswapd&lt;/code&gt; more room to offload memory to disk and helps prevent OOM kills during sudden memory spikes.&lt;/p&gt;
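&lt;p&gt;Note that &lt;code&gt;/proc/zoneinfo&lt;/code&gt; reports watermarks in pages rather than bytes. As a quick sanity check while tuning, a small helper can convert page counts to MiB; this sketch assumes the common 4KiB page size (verify with &lt;code&gt;getconf PAGESIZE&lt;/code&gt;):&lt;/p&gt;

```shell
# Convert a page count from /proc/zoneinfo to MiB,
# assuming 4 KiB pages (verify with `getconf PAGESIZE`).
pages_to_mib() {
  awk -v pages="$1" 'BEGIN { printf "%d\n", pages * 4 / 1024 }'
}

# Example: a "low" watermark of 13130 pages is roughly 51 MiB
pages_to_mib 13130
```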
&lt;h2 id=&#34;swap-tests-and-results&#34;&gt;Swap tests and results&lt;/h2&gt;
&lt;p&gt;To understand the real-world impact of these parameters, I designed a series of stress tests.&lt;/p&gt;
&lt;h3 id=&#34;test-setup&#34;&gt;Test setup&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Environment&lt;/strong&gt;: GKE on Google Cloud&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Kubernetes version&lt;/strong&gt;: 1.33.2&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Node configuration&lt;/strong&gt;: &lt;code&gt;n2-standard-2&lt;/code&gt; (8GiB RAM, 50GB swap on a &lt;code&gt;pd-balanced&lt;/code&gt; disk, without encryption), Ubuntu 22.04&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Workload&lt;/strong&gt;: A custom Go application designed to allocate memory at a configurable rate, generate file-cache pressure, and simulate different memory access patterns (random vs sequential).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Monitoring&lt;/strong&gt;: A sidecar container capturing system metrics every second.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Protection&lt;/strong&gt;: Critical system components (kubelet, container runtime, sshd) were prevented from swapping by setting &lt;code&gt;memory.swap.max=0&lt;/code&gt; in their respective cgroups.&lt;/li&gt;
&lt;/ul&gt;
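&lt;p&gt;As an illustration of the protection step above: on a cgroup v2 node managed by systemd, the &lt;code&gt;memory.swap.max=0&lt;/code&gt; limit can be applied through a unit drop-in. The unit and file names below are hypothetical and depend on how your node is provisioned:&lt;/p&gt;

```ini
# Hypothetical drop-in, e.g. /etc/systemd/system/kubelet.service.d/99-no-swap.conf
[Service]
# On cgroup v2, systemd writes memory.swap.max=0 for this unit's cgroup
MemorySwapMax=0
```

&lt;p&gt;After adding the drop-in, reload systemd and restart the unit for the limit to take effect.&lt;/p&gt;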
&lt;h3 id=&#34;test-methodology&#34;&gt;Test methodology&lt;/h3&gt;
&lt;p&gt;I ran a stress-test pod on nodes with different swappiness settings (0, 60, and 90) and varied the &lt;code&gt;min_free_kbytes&lt;/code&gt; and &lt;code&gt;watermark_scale_factor&lt;/code&gt; parameters to observe the outcomes under heavy memory allocation and I/O pressure.&lt;/p&gt;
&lt;h4 id=&#34;visualizing-swap-in-action&#34;&gt;Visualizing swap in action&lt;/h4&gt;
&lt;p&gt;The graph below, from a 100MBps stress test, shows swap in action. As free memory (in the &amp;quot;Memory Usage&amp;quot; plot) decreases, swap usage (&lt;code&gt;Swap Used (GiB)&lt;/code&gt;) and swap-out activity (&lt;code&gt;Swap Out (MiB/s)&lt;/code&gt;) increase. Critically, as the system relies more on swap, the I/O activity and corresponding wait time (&lt;code&gt;IO Wait %&lt;/code&gt; in the &amp;quot;CPU Usage&amp;quot; plot) also rise, indicating CPU stress.&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;Graph showing CPU, Memory, Swap utilization and I/O activity on a Kubernetes node&#34; src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/08/19/tuning-linux-swap-for-kubernetes-a-deep-dive/swap_visualization.png&#34; title=&#34;swap visualization&#34;&gt;&lt;/p&gt;
&lt;h3 id=&#34;findings&#34;&gt;Findings&lt;/h3&gt;
&lt;p&gt;My initial tests with default kernel parameters (&lt;code&gt;swappiness=60&lt;/code&gt;, &lt;code&gt;min_free_kbytes=68MB&lt;/code&gt;, &lt;code&gt;watermark_scale_factor=10&lt;/code&gt;) quickly led to OOM kills and even unexpected node restarts under high memory pressure. Selecting appropriate kernel parameters achieves a good balance between node stability and performance.&lt;/p&gt;
&lt;h4 id=&#34;the-impact-of-swappiness&#34;&gt;The impact of &lt;code&gt;swappiness&lt;/code&gt;&lt;/h4&gt;
&lt;p&gt;The swappiness parameter directly influences the kernel&#39;s choice between reclaiming anonymous memory (swapping) and dropping page cache. To observe this preference, I ran a test where one pod generated and held file-cache pressure, followed by a second pod allocating anonymous memory at 100MB/s:&lt;/p&gt;
&lt;p&gt;My findings reveal a clear trade-off:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;swappiness=90&lt;/code&gt;: The kernel proactively swapped out the inactive anonymous memory to keep the file cache. This resulted in high and sustained swap usage and significant I/O activity (&amp;quot;Blocks Out&amp;quot;), which in turn caused spikes in I/O wait on the CPU.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;swappiness=0&lt;/code&gt;: The kernel favored dropping file-cache pages, delaying swap consumption. However, it&#39;s critical to understand that this &lt;strong&gt;does not disable swapping&lt;/strong&gt;. When memory pressure was high, the kernel still swapped anonymous memory to disk.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The choice is workload-dependent. For workloads sensitive to I/O latency, a lower swappiness is preferable. For workloads that rely on a large and frequently accessed file cache, a higher swappiness may be beneficial, provided the underlying disk is fast enough to handle the load.&lt;/p&gt;
&lt;h4 id=&#34;tuning-watermarks-to-prevent-eviction-and-oom-kills&#34;&gt;Tuning watermarks to prevent eviction and OOM kills&lt;/h4&gt;
&lt;p&gt;The most critical challenge I encountered was the interaction between rapid memory allocation and Kubelet&#39;s eviction mechanism. When my test pod, which was deliberately configured to overcommit memory, allocated it at a high rate (e.g., 300-500 MBps), the system quickly ran out of free memory.&lt;/p&gt;
&lt;p&gt;With default watermarks, the buffer for reclamation was too small. Before &lt;code&gt;kswapd&lt;/code&gt; could free up enough memory by swapping, the node would hit a critical state, leading to two potential outcomes:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Kubelet eviction&lt;/strong&gt;: If the kubelet&#39;s eviction manager detected that &lt;code&gt;memory.available&lt;/code&gt; was below its threshold, it would evict the pod.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;OOM killer&lt;/strong&gt;: In some high-rate scenarios, the OOM killer would activate before eviction could complete, sometimes killing higher-priority pods that were not the source of the pressure.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;To mitigate this, I tuned the watermarks:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Increased &lt;code&gt;min_free_kbytes&lt;/code&gt; to 512MiB: This forces the kernel to start reclaiming memory much earlier, providing a larger safety buffer.&lt;/li&gt;
&lt;li&gt;Increased &lt;code&gt;watermark_scale_factor&lt;/code&gt; to 2000: This widened the gap between the &lt;code&gt;low&lt;/code&gt; and &lt;code&gt;high&lt;/code&gt; watermarks (from ≈337MB to ≈591MB in my test node&#39;s &lt;code&gt;/proc/zoneinfo&lt;/code&gt;), effectively increasing the swapping window.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This combination gave &lt;code&gt;kswapd&lt;/code&gt; a larger operational zone and more time to swap pages to disk during memory spikes, successfully preventing both premature evictions and OOM kills in my test runs.&lt;/p&gt;
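&lt;p&gt;For reference, these two changes can be persisted as a &lt;code&gt;sysctl&lt;/code&gt; configuration fragment (the file path below is illustrative):&lt;/p&gt;

```ini
# Illustrative /etc/sysctl.d/90-swap-tuning.conf
# 512 MiB reclamation floor (value is in KiB)
vm.min_free_kbytes = 524288
# Widen the window between the low and high watermarks
vm.watermark_scale_factor = 2000
```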
&lt;p&gt;The table below compares watermark levels from &lt;code&gt;/proc/zoneinfo&lt;/code&gt; (non-NUMA node):&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;code&gt;min_free_kbytes=67584KiB&lt;/code&gt; and &lt;code&gt;watermark_scale_factor=10&lt;/code&gt;&lt;/th&gt;
&lt;th&gt;&lt;code&gt;min_free_kbytes=524288KiB&lt;/code&gt; and &lt;code&gt;watermark_scale_factor=2000&lt;/code&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Node 0, zone Normal &lt;br&gt;   pages free 583273 &lt;br&gt;   boost 0 &lt;br&gt;   min 10504 &lt;br&gt;   low 13130 &lt;br&gt;   high 15756 &lt;br&gt;   spanned 1310720 &lt;br&gt;   present 1310720 &lt;br&gt;   managed 1265603&lt;/td&gt;
&lt;td&gt;Node 0, zone Normal &lt;br&gt;   pages free 470539 &lt;br&gt;   min 82109 &lt;br&gt;   low 337017 &lt;br&gt;   high 591925&lt;br&gt;   spanned 1310720&lt;br&gt;   present 1310720 &lt;br&gt;   managed 1274542&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The graph below reveals that the kernel buffer size and scaling factor play a crucial role in determining how the system responds to memory load. With the right combination of these parameters, the system can effectively use swap space to avoid eviction and maintain stability.&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;A side-by-side comparison of different min_free_kbytes settings, showing differences in Swap, Memory Usage and Eviction impact&#34; src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/08/19/tuning-linux-swap-for-kubernetes-a-deep-dive/memory-and-swap-growth.png&#34; title=&#34;Memory and Swap Utilization with min_free_kbytes&#34;&gt;&lt;/p&gt;
&lt;h3 id=&#34;risks-and-recommendations&#34;&gt;Risks and recommendations&lt;/h3&gt;
&lt;p&gt;Enabling swap in Kubernetes is a powerful tool, but it comes with risks that must be managed through careful tuning.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Risk of performance degradation&lt;/strong&gt; Swapping is orders of magnitude slower than accessing RAM. If an application&#39;s active working set is swapped out, its performance will suffer dramatically due to high I/O wait times (thrashing). Swap should preferably be provisioned on SSD-backed storage to improve performance.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Risk of masking memory leaks&lt;/strong&gt; Swap can hide memory leaks in applications, which might otherwise lead to a quick OOM kill. With swap, a leaky application might slowly degrade node performance over time, making the root cause harder to diagnose.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Risk of disabling evictions&lt;/strong&gt; The kubelet proactively monitors the node for memory pressure and terminates pods to reclaim resources. Improper tuning can lead to OOM kills before the kubelet has a chance to evict pods gracefully. A properly configured &lt;code&gt;min_free_kbytes&lt;/code&gt; is essential to ensure the kubelet&#39;s eviction mechanism remains effective.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&#34;kubernetes-context&#34;&gt;Kubernetes context&lt;/h3&gt;
&lt;p&gt;Together, the kernel watermarks and the kubelet eviction threshold create a series of memory-pressure zones on a node. The eviction-threshold parameters need to be adjusted so that Kubernetes-managed evictions occur before OOM kills.&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;Preferred thresholds for effective swap utilization&#34; src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/08/19/tuning-linux-swap-for-kubernetes-a-deep-dive/swap-thresholds.png&#34; title=&#34;Recommended Thresholds&#34;&gt;&lt;/p&gt;
&lt;p&gt;As the diagram shows, an ideal configuration creates a large enough &#39;swapping zone&#39; (between the &lt;code&gt;high&lt;/code&gt; and &lt;code&gt;min&lt;/code&gt; watermarks) so that the kernel can handle memory pressure by swapping before available memory drops into the eviction/direct-reclaim zone.&lt;/p&gt;
&lt;h3 id=&#34;recommended-starting-point&#34;&gt;Recommended starting point&lt;/h3&gt;
&lt;p&gt;Based on these findings, I recommend the following as a starting point for Linux nodes with swap enabled. You should benchmark this with your own workloads.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;vm.swappiness=60&lt;/code&gt;: Linux default is a good starting point for general-purpose workloads. However, the ideal value is workload-dependent, and swap-sensitive applications may need more careful tuning.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;vm.min_free_kbytes=500000&lt;/code&gt; (500MB): Set this to a reasonably high value (e.g., 2-3% of total node memory) to give the node an adequate safety buffer.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;vm.watermark_scale_factor=2000&lt;/code&gt;: Creates a larger window for &lt;code&gt;kswapd&lt;/code&gt; to work with, preventing OOM kills during sudden memory allocation spikes.&lt;/li&gt;
&lt;/ul&gt;
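&lt;p&gt;To turn the 2-3% guideline into a concrete number for a given node, the value can be derived from &lt;code&gt;MemTotal&lt;/code&gt;; a minimal sketch (the 2% figure comes from the guidance above, and the helper name is my own):&lt;/p&gt;

```shell
# Suggest a vm.min_free_kbytes value of ~2% of total node memory.
# /proc/meminfo reports MemTotal in KiB, matching min_free_kbytes units.
suggest_min_free_kbytes() {
  awk '/^MemTotal:/ { printf "%d\n", $2 * 0.02 }' /proc/meminfo
}

# The same arithmetic for a 16 GiB node (16777216 KiB):
awk -v total_kib=16777216 'BEGIN { printf "%d\n", total_kib * 0.02 }'
```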
&lt;p&gt;When setting up swap for the first time in your Kubernetes cluster, I encourage running benchmark tests with your own workloads in test environments. Swap performance can be sensitive to environmental differences such as CPU load, disk type (SSD vs HDD), and I/O patterns.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Introducing Headlamp AI Assistant</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/08/07/introducing-headlamp-ai-assistant/</link>
      <pubDate>Thu, 07 Aug 2025 20:00:00 +0100</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/08/07/introducing-headlamp-ai-assistant/</guid>
      <description>
        
        
        &lt;p&gt;&lt;em&gt;This announcement originally &lt;a href=&#34;https://headlamp.dev/blog/2025/08/07/introducing-the-headlamp-ai-assistant&#34;&gt;appeared&lt;/a&gt; on the Headlamp blog.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;To simplify Kubernetes management and troubleshooting, we&#39;re thrilled to
introduce &lt;a href=&#34;https://github.com/headlamp-k8s/plugins/tree/main/ai-assistant#readme&#34;&gt;Headlamp AI Assistant&lt;/a&gt;: a powerful new plugin for Headlamp that helps
you understand and operate your Kubernetes clusters and applications with
greater clarity and ease.&lt;/p&gt;
&lt;p&gt;Whether you&#39;re a seasoned engineer or just getting started, the AI Assistant offers:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Fast time to value:&lt;/strong&gt; Ask questions like &lt;em&gt;&amp;quot;Is my application healthy?&amp;quot;&lt;/em&gt; or
&lt;em&gt;&amp;quot;How can I fix this?&amp;quot;&lt;/em&gt; without needing deep Kubernetes knowledge.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Deep insights:&lt;/strong&gt; Start with high-level queries and dig deeper with prompts
like &lt;em&gt;&amp;quot;List all the problematic pods&amp;quot;&lt;/em&gt; or &lt;em&gt;&amp;quot;How can I fix this pod?&amp;quot;&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Focused &amp;amp; relevant:&lt;/strong&gt; Ask questions in the context of what you&#39;re viewing
in the UI, such as &lt;em&gt;&amp;quot;What&#39;s wrong here?&amp;quot;&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Action-oriented:&lt;/strong&gt; Let the AI take action for you, like &lt;em&gt;&amp;quot;Restart that
deployment&amp;quot;&lt;/em&gt;, with your permission.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Here is a demo of the AI Assistant in action as it helps troubleshoot an
application running with issues in a Kubernetes cluster:&lt;/p&gt;


    
    &lt;div class=&#34;youtube-quote-sm&#34;&gt;
      &lt;iframe allow=&#34;accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share&#34; allowfullscreen=&#34;allowfullscreen&#34; loading=&#34;eager&#34; referrerpolicy=&#34;strict-origin-when-cross-origin&#34; src=&#34;https://www.youtube.com/embed/GzXkUuCTcd4?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0&#34; title=&#34;Headlamp AI Assistant&#34;
      &gt;&lt;/iframe&gt;
    &lt;/div&gt;

&lt;h2 id=&#34;hopping-on-the-ai-train&#34;&gt;Hopping on the AI train&lt;/h2&gt;
&lt;p&gt;Large Language Models (LLMs) have transformed not just how we access data but
also how we interact with it. The rise of tools like ChatGPT opened a world of
possibilities, inspiring a wave of new applications. Asking questions or giving
commands in natural language is intuitive, especially for users who aren&#39;t deeply
technical. Now everyone can quickly ask how to do X or Y, without feeling awkward
or having to traverse pages and pages of documentation like before.&lt;/p&gt;
&lt;p&gt;Therefore, Headlamp AI Assistant brings a conversational UI to &lt;a href=&#34;https://headlamp.dev&#34;&gt;Headlamp&lt;/a&gt;,
powered by LLMs that Headlamp users can configure with their own API keys.
It is available as a Headlamp plugin, making it easy to integrate into your
existing setup. Users can enable it by installing the plugin and configuring
it with their own LLM API keys, giving them control over which model powers
the assistant. Once enabled, the assistant becomes part of the Headlamp UI,
ready to respond to contextual queries and perform actions directly from the
interface.&lt;/p&gt;
&lt;h2 id=&#34;context-is-everything&#34;&gt;Context is everything&lt;/h2&gt;
&lt;p&gt;As expected, the AI Assistant is focused on helping users with Kubernetes
concepts. Yet, while there is a lot of value in responding to Kubernetes
related questions from Headlamp&#39;s UI, we believe that the great benefit of such
an integration is when it can use the context of what the user is experiencing
in an application. So, the Headlamp AI Assistant knows what you&#39;re currently
viewing in Headlamp, and this makes the interaction feel more like working
with a human assistant.&lt;/p&gt;
&lt;p&gt;For example, if a pod is failing, users can simply ask &lt;em&gt;&amp;quot;What&#39;s wrong here?&amp;quot;&lt;/em&gt;
and the AI Assistant will respond with the root cause, like a missing
environment variable or a typo in the image name. Follow-up prompts like
&lt;em&gt;&amp;quot;How can I fix this?&amp;quot;&lt;/em&gt; allow the AI Assistant to suggest a fix, streamlining
what used to take multiple steps into a quick, conversational flow.&lt;/p&gt;
&lt;p&gt;Sharing the context from Headlamp is not a trivial task though, so it&#39;s
something we will keep working on perfecting.&lt;/p&gt;
&lt;h2 id=&#34;tools&#34;&gt;Tools&lt;/h2&gt;
&lt;p&gt;Context from the UI is helpful, but sometimes additional capabilities are
needed. If the user is viewing the pod list and wants to identify problematic
deployments, switching views should not be necessary. To address this, the AI
Assistant includes support for a Kubernetes tool. This allows asking questions
like &amp;quot;Get me all deployments with problems&amp;quot; prompting the assistant to fetch
and display relevant data from the current cluster. Likewise, if the user
requests an action like &amp;quot;Restart that deployment&amp;quot; after the AI points out what
deployment needs restarting, it can also do that. For &amp;quot;write&amp;quot;
operations, the AI Assistant checks with the user for permission before running them.&lt;/p&gt;
&lt;h2 id=&#34;ai-plugins&#34;&gt;AI Plugins&lt;/h2&gt;
&lt;p&gt;Although the initial version of the AI Assistant is already useful for
Kubernetes users, future iterations will expand its capabilities. Currently,
the assistant supports only the Kubernetes tool, but further integration with
Headlamp plugins is underway. For example, we could get richer insights for
GitOps via the Flux plugin, monitoring through Prometheus, package management
with Helm, and more.&lt;/p&gt;
&lt;p&gt;And of course, as the popularity of MCP grows, we are looking into how to
integrate it as well, for a more plug-and-play fashion.&lt;/p&gt;
&lt;h2 id=&#34;try-it-out&#34;&gt;Try it out!&lt;/h2&gt;
&lt;p&gt;We hope this first version of the AI Assistant helps users manage Kubernetes
clusters more effectively and assists newcomers in navigating the learning
curve. We invite you to try out this early version and give us your feedback.
The AI Assistant plugin can be installed from Headlamp&#39;s Plugin Catalog in the
desktop version, or by using the container image when deploying Headlamp.
Stay tuned for the future versions of the Headlamp AI Assistant!&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Kubernetes v1.34 Sneak Peek</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/07/28/kubernetes-v1-34-sneak-peek/</link>
      <pubDate>Mon, 28 Jul 2025 00:00:00 +0000</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/07/28/kubernetes-v1-34-sneak-peek/</guid>
      <description>
        
        
        &lt;p&gt;Kubernetes v1.34 is coming at the end of August 2025.
This release will not include any removals or deprecations, but it is packed with an impressive number of enhancements.
Here are some of the features we are most excited about in this cycle!&lt;/p&gt;
&lt;p&gt;Please note that this information reflects the current state of v1.34 development and may change before release.&lt;/p&gt;
&lt;h2 id=&#34;featured-enhancements-of-kubernetes-v1-34&#34;&gt;Featured enhancements of Kubernetes v1.34&lt;/h2&gt;
&lt;p&gt;The following list highlights some of the notable enhancements likely to be included in the v1.34 release,
but is not an exhaustive list of all planned changes.
This is not a commitment and the release content is subject to change.&lt;/p&gt;
&lt;h3 id=&#34;the-core-of-dra-targets-stable&#34;&gt;The core of DRA targets stable&lt;/h3&gt;
&lt;p&gt;&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/scheduling-eviction/dynamic-resource-allocation/&#34;&gt;Dynamic Resource Allocation&lt;/a&gt; (DRA) provides a flexible way to categorize,
request, and use devices like GPUs or custom hardware in your Kubernetes cluster.&lt;/p&gt;
&lt;p&gt;Since the v1.30 release, DRA has been based around claiming devices using &lt;em&gt;structured parameters&lt;/em&gt; that are opaque to the core of Kubernetes.
The relevant enhancement proposal, &lt;a href=&#34;https://kep.k8s.io/4381&#34;&gt;KEP-4381&lt;/a&gt;, took inspiration from dynamic provisioning for storage volumes.
DRA with structured parameters relies on a set of supporting API kinds: ResourceClaim, DeviceClass, ResourceClaimTemplate,
and ResourceSlice API types under &lt;code&gt;resource.k8s.io&lt;/code&gt;, while extending the &lt;code&gt;.spec&lt;/code&gt; for Pods with a new &lt;code&gt;resourceClaims&lt;/code&gt; field.
The core of DRA is targeting graduation to stable in Kubernetes v1.34.&lt;/p&gt;
&lt;p&gt;With DRA, device drivers and cluster admins define device classes that are available for use.
Workloads can claim devices from a device class within device requests.
Kubernetes allocates matching devices to specific claims and places the corresponding Pods on nodes that can access the allocated devices.
This framework provides flexible device filtering using CEL, centralized device categorization, and simplified Pod requests, among other benefits.&lt;/p&gt;
&lt;p&gt;Once this feature has graduated, the &lt;code&gt;resource.k8s.io/v1&lt;/code&gt; APIs will be available by default.&lt;/p&gt;
&lt;h3 id=&#34;serviceaccount-tokens-for-image-pull-authentication&#34;&gt;ServiceAccount tokens for image pull authentication&lt;/h3&gt;
&lt;p&gt;The &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/security/service-accounts/&#34;&gt;ServiceAccount&lt;/a&gt; token integration for &lt;code&gt;kubelet&lt;/code&gt; credential providers is likely to reach beta and be enabled by default in Kubernetes v1.34.
This allows the &lt;code&gt;kubelet&lt;/code&gt; to use these tokens when pulling container images from registries that require authentication.&lt;/p&gt;
&lt;p&gt;That support already exists as alpha, and is tracked as part of &lt;a href=&#34;https://kep.k8s.io/4412&#34;&gt;KEP-4412&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The existing alpha integration allows the &lt;code&gt;kubelet&lt;/code&gt; to use short-lived, automatically rotated ServiceAccount tokens (that follow OIDC-compliant semantics) to authenticate to a container image registry.
Each token is scoped to one associated Pod; the overall mechanism replaces the need for long-lived image pull Secrets.&lt;/p&gt;
&lt;p&gt;Adopting this new approach reduces security risks, supports workload-level identity, and helps cut operational overhead.
It brings image pull authentication closer to modern, identity-aware good practice.&lt;/p&gt;
&lt;h3 id=&#34;pod-replacement-policy-for-deployments&#34;&gt;Pod replacement policy for Deployments&lt;/h3&gt;
&lt;p&gt;After a change to a &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/workloads/controllers/deployment/&#34;&gt;Deployment&lt;/a&gt;, terminating pods may stay up for a considerable amount of time and may consume additional resources.
As part of &lt;a href=&#34;https://kep.k8s.io/3973&#34;&gt;KEP-3973&lt;/a&gt;, the &lt;code&gt;.spec.podReplacementPolicy&lt;/code&gt; field will be introduced (as alpha) for Deployments.&lt;/p&gt;
&lt;p&gt;If your cluster has the feature enabled, you&#39;ll be able to select one of two policies:&lt;/p&gt;
&lt;dl&gt;
&lt;dt&gt;&lt;code&gt;TerminationStarted&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;Creates new pods as soon as old ones start terminating, resulting in faster rollouts at the cost of potentially higher resource consumption.&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;TerminationComplete&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;Waits until old pods fully terminate before creating new ones, resulting in slower rollouts but ensuring controlled resource consumption.&lt;/dd&gt;
&lt;/dl&gt;
&lt;p&gt;This feature makes Deployment behavior more predictable by letting you choose when new pods should be created during updates or scaling.
It&#39;s beneficial when working in clusters with tight resource constraints or with workloads with long termination periods.&lt;/p&gt;
&lt;p&gt;It&#39;s expected to be available as an alpha feature and can be enabled using the &lt;code&gt;DeploymentPodReplacementPolicy&lt;/code&gt; and &lt;code&gt;DeploymentReplicaSetTerminatingReplicas&lt;/code&gt; feature gates in the API server and kube-controller-manager.&lt;/p&gt;
&lt;h3 id=&#34;production-ready-tracing-for-kubelet-and-api-server&#34;&gt;Production-ready tracing for &lt;code&gt;kubelet&lt;/code&gt; and API Server&lt;/h3&gt;
&lt;p&gt;To address the longstanding challenge of debugging node-level issues by correlating disconnected logs,
&lt;a href=&#34;https://kep.k8s.io/2831&#34;&gt;KEP-2831&lt;/a&gt; provides deep, contextual insights into the &lt;code&gt;kubelet&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;This feature instruments critical &lt;code&gt;kubelet&lt;/code&gt; operations, particularly its gRPC calls to the Container Runtime Interface (CRI), using the vendor-agnostic OpenTelemetry standard.
It allows operators to visualize the entire lifecycle of events (for example: a Pod startup) to pinpoint sources of latency and errors.
Its most powerful aspect is the propagation of trace context; the &lt;code&gt;kubelet&lt;/code&gt; passes a trace ID with its requests to the container runtime, enabling runtimes to link their own spans.&lt;/p&gt;
&lt;p&gt;This effort is complemented by a parallel enhancement, &lt;a href=&#34;https://kep.k8s.io/647&#34;&gt;KEP-647&lt;/a&gt;, which brings the same tracing capabilities to the Kubernetes API server.
Together, these enhancements provide a more unified, end-to-end view of events, simplifying the process of pinpointing latency and errors from the control plane down to the node.
These features have matured through the official Kubernetes release process.
&lt;a href=&#34;https://kep.k8s.io/2831&#34;&gt;KEP-2831&lt;/a&gt; was introduced as an alpha feature in v1.25, while &lt;a href=&#34;https://kep.k8s.io/647&#34;&gt;KEP-647&lt;/a&gt; debuted as alpha in v1.22.
Both enhancements were promoted to beta together in the v1.27 release.
Looking forward, Kubelet Tracing (&lt;a href=&#34;https://kep.k8s.io/2831&#34;&gt;KEP-2831&lt;/a&gt;) and API Server Tracing (&lt;a href=&#34;https://kep.k8s.io/647&#34;&gt;KEP-647&lt;/a&gt;) are now targeting graduation to stable in the upcoming v1.34 release.&lt;/p&gt;
&lt;h3 id=&#34;prefersamezone-and-prefersamenode-traffic-distribution-for-services&#34;&gt;&lt;code&gt;PreferSameZone&lt;/code&gt; and &lt;code&gt;PreferSameNode&lt;/code&gt; traffic distribution for Services&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;spec.trafficDistribution&lt;/code&gt; field within a Kubernetes &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/services-networking/service/&#34;&gt;Service&lt;/a&gt; allows users to express preferences for how traffic should be routed to Service endpoints.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://kep.k8s.io/3015&#34;&gt;KEP-3015&lt;/a&gt; deprecates &lt;code&gt;PreferClose&lt;/code&gt; and introduces two additional values: &lt;code&gt;PreferSameZone&lt;/code&gt; and &lt;code&gt;PreferSameNode&lt;/code&gt;.
&lt;code&gt;PreferSameZone&lt;/code&gt; is equivalent to the current &lt;code&gt;PreferClose&lt;/code&gt;.
&lt;code&gt;PreferSameNode&lt;/code&gt; prioritizes sending traffic to endpoints on the same node as the client.&lt;/p&gt;
&lt;p&gt;This feature was introduced in v1.33 behind the &lt;code&gt;PreferSameTrafficDistribution&lt;/code&gt; feature gate.
It is targeting graduation to beta in v1.34 with its feature gate enabled by default.&lt;/p&gt;
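&lt;p&gt;As a sketch, a Service that prefers node-local endpoints might look like the following; the name, selector, and port are illustrative, and the new field only takes effect where the feature gate is enabled:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;apiVersion: v1
kind: Service
metadata:
  name: my-service   # illustrative name
spec:
  selector:
    app: my-app
  ports:
    - port: 80
      protocol: TCP
  # Prefer endpoints on the same node as the client,
  # falling back to other endpoints when none are local.
  trafficDistribution: PreferSameNode
&lt;/code&gt;&lt;/pre&gt;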
&lt;h3 id=&#34;support-for-kyaml-a-kubernetes-dialect-of-yaml&#34;&gt;Support for KYAML: a Kubernetes dialect of YAML&lt;/h3&gt;
&lt;p&gt;KYAML aims to be a safer and less ambiguous YAML subset, and was designed specifically
for Kubernetes. Whatever version of Kubernetes you use, you&#39;ll be able to use KYAML for writing manifests
and/or Helm charts.
You can write KYAML and pass it as an input to &lt;strong&gt;any&lt;/strong&gt; version of &lt;code&gt;kubectl&lt;/code&gt;,
because all KYAML files are also valid as YAML.
With kubectl v1.34, we expect you&#39;ll also be able to request KYAML output from &lt;code&gt;kubectl&lt;/code&gt; (as in &lt;code&gt;kubectl get -o kyaml …&lt;/code&gt;).
If you prefer, you can still request the output in JSON or YAML format.&lt;/p&gt;
&lt;p&gt;KYAML addresses specific challenges with both YAML and JSON.
YAML&#39;s significant whitespace requires careful attention to indentation and nesting,
while its optional string-quoting can lead to unexpected type coercion (for example: &lt;a href=&#34;https://hitchdev.com/strictyaml/why/implicit-typing-removed/&#34;&gt;&amp;quot;The Norway Bug&amp;quot;&lt;/a&gt;).
Meanwhile, JSON lacks comment support, forbids trailing commas, and requires quoted keys.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://kep.k8s.io/5295&#34;&gt;KEP-5295&lt;/a&gt; introduces KYAML, which tries to address the most significant problems by:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Always double-quoting value strings&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Leaving keys unquoted unless they are potentially ambiguous&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Always using &lt;code&gt;{}&lt;/code&gt; for mappings (associative arrays)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Always using &lt;code&gt;[]&lt;/code&gt; for lists&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This might sound a lot like JSON, because it is! But unlike JSON, KYAML supports comments, allows trailing commas, and doesn&#39;t require quoted keys.&lt;/p&gt;
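&lt;p&gt;Putting those rules together, a KYAML manifest might look something like this sketch (the field names and values are illustrative):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;{
  apiVersion: &#34;v1&#34;,
  kind: &#34;ConfigMap&#34;,
  metadata: {
    name: &#34;example-config&#34;,  # comments are allowed, unlike in JSON
  },
  data: {
    country: &#34;NO&#34;,  # always-quoted values avoid the Norway Bug
    &#34;on&#34;: &#34;yes&#34;,    # keys are quoted only when potentially ambiguous
  },
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Because this is also valid YAML, you can feed it to any existing tool that accepts YAML manifests.&lt;/p&gt;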
&lt;p&gt;We&#39;re hoping to see KYAML introduced as a new output format for &lt;code&gt;kubectl&lt;/code&gt; v1.34.
As with all these features, none of these changes are 100% confirmed; watch this space!&lt;/p&gt;
&lt;p&gt;As a format, KYAML is and will remain a &lt;strong&gt;strict subset of YAML&lt;/strong&gt;, ensuring that any compliant YAML parser can parse KYAML documents.
Kubernetes does not require you to provide input specifically formatted as KYAML, and we have no plans to change that.&lt;/p&gt;
&lt;h3 id=&#34;fine-grained-autoscaling-control-with-hpa-configurable-tolerance&#34;&gt;Fine-grained autoscaling control with HPA configurable tolerance&lt;/h3&gt;
&lt;p&gt;&lt;a href=&#34;https://kep.k8s.io/4951&#34;&gt;KEP-4951&lt;/a&gt; introduces a new feature that allows users to configure autoscaling tolerance on a per-HPA basis,
overriding the default cluster-wide 10% tolerance setting that often proves too coarse-grained for diverse workloads.
The enhancement adds an optional &lt;code&gt;tolerance&lt;/code&gt; field to the HPA&#39;s &lt;code&gt;spec.behavior.scaleUp&lt;/code&gt; and &lt;code&gt;spec.behavior.scaleDown&lt;/code&gt; sections,
enabling different tolerance values for scale-up and scale-down operations,
which is particularly valuable since scale-up responsiveness is typically more critical than scale-down speed for handling traffic surges.&lt;/p&gt;
&lt;p&gt;Released as alpha in Kubernetes v1.33 behind the &lt;code&gt;HPAConfigurableTolerance&lt;/code&gt; feature gate, this feature is expected to graduate to beta in v1.34.
This improvement helps to address scaling challenges with large deployments, where for scaling in,
a 10% tolerance might mean leaving hundreds of unnecessary Pods running.
Using the new, more flexible approach would enable workload-specific optimization for both
responsive and conservative scaling behaviors.&lt;/p&gt;
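&lt;p&gt;As a sketch of how this could look (the target, metric, and tolerance values below are illustrative):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app   # illustrative name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 5
  maxReplicas: 100
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 75
  behavior:
    scaleUp:
      tolerance: 0.05  # react to metric changes above 5% when scaling up
    scaleDown:
      tolerance: 0.2   # require a 20% change before removing Pods
&lt;/code&gt;&lt;/pre&gt;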
&lt;h2 id=&#34;want-to-know-more&#34;&gt;Want to know more?&lt;/h2&gt;
&lt;p&gt;New features and deprecations are also announced in the Kubernetes release notes.
We will formally announce what&#39;s new in &lt;a href=&#34;https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.34.md&#34;&gt;Kubernetes v1.34&lt;/a&gt; as part of the CHANGELOG for that release.&lt;/p&gt;
&lt;p&gt;The Kubernetes v1.34 release is planned for &lt;strong&gt;Wednesday 27th August 2025&lt;/strong&gt;. Stay tuned for updates!&lt;/p&gt;
&lt;h2 id=&#34;get-involved&#34;&gt;Get involved&lt;/h2&gt;
&lt;p&gt;The simplest way to get involved with Kubernetes is to join one of the many &lt;a href=&#34;https://github.com/kubernetes/community/blob/master/sig-list.md&#34;&gt;Special Interest Groups&lt;/a&gt; (SIGs) that align with your interests.
Have something you&#39;d like to broadcast to the Kubernetes community? Share your voice at our weekly &lt;a href=&#34;https://github.com/kubernetes/community/tree/master/communication&#34;&gt;community meeting&lt;/a&gt;, and through the channels below.
Thank you for your continued feedback and support.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Follow us on Bluesky &lt;a href=&#34;https://bsky.app/profile/kubernetes.io&#34;&gt;@kubernetes.io&lt;/a&gt; for the latest updates&lt;/li&gt;
&lt;li&gt;Join the community discussion on &lt;a href=&#34;https://discuss.kubernetes.io/&#34;&gt;Discuss&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Join the community on &lt;a href=&#34;http://slack.k8s.io/&#34;&gt;Slack&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Post questions (or answer questions) on &lt;a href=&#34;https://serverfault.com/questions/tagged/kubernetes&#34;&gt;Server Fault&lt;/a&gt; or &lt;a href=&#34;http://stackoverflow.com/questions/tagged/kubernetes&#34;&gt;Stack Overflow&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Share your Kubernetes &lt;a href=&#34;https://docs.google.com/a/linuxfoundation.org/forms/d/e/1FAIpQLScuI7Ye3VQHQTwBASrgkjQDSS5TP0g3AXfFhwSM9YpHgxRKFA/viewform&#34;&gt;story&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Read more about what&#39;s happening with Kubernetes on the &lt;a href=&#34;https://kubernetes.io/blog/&#34;&gt;blog&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Learn more about the &lt;a href=&#34;https://github.com/kubernetes/sig-release/tree/master/release-team&#34;&gt;Kubernetes Release Team&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

      </description>
    </item>
    
    <item>
      <title>Post-Quantum Cryptography in Kubernetes</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/07/18/pqc-in-k8s/</link>
      <pubDate>Fri, 18 Jul 2025 00:00:00 +0000</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/07/18/pqc-in-k8s/</guid>
      <description>
        
        
        &lt;p&gt;The world of cryptography is on the cusp of a major shift with the advent of
quantum computing. While powerful quantum computers are still largely
theoretical for many applications, their potential to break current
cryptographic standards is a serious concern, especially for long-lived
systems. This is where &lt;em&gt;Post-Quantum Cryptography&lt;/em&gt; (PQC) comes in. In this
article, I&#39;ll dive into what PQC means for TLS and, more specifically, for the
Kubernetes ecosystem. I&#39;ll explain what the (surprising) state of PQC in
Kubernetes is and what the implications are for current and future clusters.&lt;/p&gt;
&lt;h2 id=&#34;what-is-post-quantum-cryptography&#34;&gt;What is Post-Quantum Cryptography&lt;/h2&gt;
&lt;p&gt;Post-Quantum Cryptography refers to cryptographic algorithms that are thought to
be secure against attacks by both classical and quantum computers. The primary
concern is that quantum computers, using algorithms like &lt;a href=&#34;https://en.wikipedia.org/wiki/Shor%27s_algorithm&#34;&gt;Shor&#39;s Algorithm&lt;/a&gt;,
could efficiently break widely used public-key cryptosystems such as RSA and
Elliptic Curve Cryptography (ECC), which underpin much of today&#39;s secure
communication, including TLS. The industry is actively working on standardizing
and adopting PQC algorithms. One of the first to be standardized by &lt;a href=&#34;https://www.nist.gov/&#34;&gt;NIST&lt;/a&gt; is
the Module-Lattice Key Encapsulation Mechanism (&lt;code&gt;ML-KEM&lt;/code&gt;), formerly known as
Kyber, and now standardized as &lt;a href=&#34;https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.203.pdf&#34;&gt;FIPS-203&lt;/a&gt; (PDF download).&lt;/p&gt;
&lt;p&gt;It is difficult to predict when quantum computers will be able to break
classical algorithms. However, it is clear that we need to start migrating to
PQC algorithms now, as the next section shows. To get a feeling for the
predicted timeline we can look at a &lt;a href=&#34;https://nvlpubs.nist.gov/nistpubs/ir/2024/NIST.IR.8547.ipd.pdf&#34;&gt;NIST report&lt;/a&gt; covering the transition to
post-quantum cryptography standards. It declares that systems with classical
cryptography should be deprecated after 2030 and disallowed after 2035.&lt;/p&gt;
&lt;h2 id=&#34;timelines&#34;&gt;Key exchange vs. digital signatures: different needs, different timelines&lt;/h2&gt;
&lt;p&gt;In TLS, there are two main cryptographic operations we need to secure:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Key Exchange&lt;/strong&gt;: This is how the client and server agree on a shared secret to
encrypt their communication. If an attacker records encrypted traffic today,
they could decrypt it in the future, if they gain access to a quantum computer
capable of breaking the key exchange. This makes migrating KEMs to PQC an
immediate priority.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Digital Signatures&lt;/strong&gt;: These are primarily used to authenticate the server (and
sometimes the client) via certificates. The authenticity of a server is
verified at the time of connection. While important, the risk of an attack
today is much lower, because a trust decision made at connection time cannot
be abused after the fact. Additionally, current PQC signature schemes often come with
significant computational overhead and larger key/signature sizes compared to
their classical counterparts.&lt;/p&gt;
&lt;p&gt;Another significant hurdle in the migration to PQ certificates is the upgrade
of root certificates. These certificates have long validity periods and are
installed in many devices and operating systems as trust anchors.&lt;/p&gt;
&lt;p&gt;Given these differences, the focus for immediate PQC adoption in TLS has been
on hybrid key exchange mechanisms. These combine a classical algorithm (such as
Elliptic Curve Diffie-Hellman Ephemeral (ECDHE)) with a PQC algorithm (such as
&lt;code&gt;ML-KEM&lt;/code&gt;). The resulting shared secret is secure as long as at least one of the
component algorithms remains unbroken. The &lt;code&gt;X25519MLKEM768&lt;/code&gt; hybrid scheme is the
most widely supported one.&lt;/p&gt;
&lt;h2 id=&#34;state-of-kems&#34;&gt;State of PQC key exchange mechanisms (KEMs) today&lt;/h2&gt;
&lt;p&gt;Support for PQC KEMs is rapidly improving across the ecosystem.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Go&lt;/strong&gt;: The Go standard library&#39;s &lt;code&gt;crypto/tls&lt;/code&gt; package introduced support for
&lt;code&gt;X25519MLKEM768&lt;/code&gt; in version 1.24 (released February 2025). Crucially, it&#39;s
enabled by default when there is no explicit configuration, i.e.,
&lt;code&gt;Config.CurvePreferences&lt;/code&gt; is &lt;code&gt;nil&lt;/code&gt;.&lt;/p&gt;
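&lt;p&gt;If you want deterministic behavior instead of the version-dependent default, you can set &lt;code&gt;Config.CurvePreferences&lt;/code&gt; explicitly. This sketch assumes Go 1.24 or newer, where the &lt;code&gt;tls.X25519MLKEM768&lt;/code&gt; constant exists:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-go&#34; data-lang=&#34;go&#34;&gt;package main

import (
	&#34;crypto/tls&#34;
	&#34;fmt&#34;
)

func main() {
	// Prefer the hybrid post-quantum group, keeping classical
	// X25519 as a fallback for peers that do not support it.
	cfg := &amp;amp;tls.Config{
		CurvePreferences: []tls.CurveID{
			tls.X25519MLKEM768, // hybrid PQC KEM, added in Go 1.24
			tls.X25519,         // classical fallback
		},
	}
	fmt.Println(cfg.CurvePreferences)
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Keep in mind that once &lt;code&gt;CurvePreferences&lt;/code&gt; is non-&lt;code&gt;nil&lt;/code&gt;, the Go defaults no longer apply, and you become responsible for keeping the list current.&lt;/p&gt;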
&lt;p&gt;&lt;strong&gt;Browsers &amp;amp; OpenSSL&lt;/strong&gt;: Major browsers like Chrome (version 131, November 2024)
and Firefox (version 135, February 2025), as well as OpenSSL (version 3.5.0,
April 2025), have also added support for the &lt;code&gt;ML-KEM&lt;/code&gt; based hybrid scheme.&lt;/p&gt;
&lt;p&gt;Apple is also &lt;a href=&#34;https://support.apple.com/en-lb/122756&#34;&gt;rolling out support&lt;/a&gt; for &lt;code&gt;X25519MLKEM768&lt;/code&gt; in version
26 of their operating systems. Given the proliferation of Apple devices, this
will have a significant impact on the global PQC adoption.&lt;/p&gt;
&lt;p&gt;For a more detailed overview of the state of PQC in the wider industry,
see &lt;a href=&#34;https://blog.cloudflare.com/pq-2024/&#34;&gt;this blog post by Cloudflare&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;post-quantum-kems-in-kubernetes-an-unexpected-arrival&#34;&gt;Post-quantum KEMs in Kubernetes: an unexpected arrival&lt;/h2&gt;
&lt;p&gt;So, what does this mean for Kubernetes? Kubernetes components, including the
API server and kubelet, are built with Go.&lt;/p&gt;
&lt;p&gt;As of Kubernetes v1.33, released in April 2025, the project uses Go 1.24. A
quick check of the Kubernetes codebase reveals that &lt;code&gt;Config.CurvePreferences&lt;/code&gt;
is not explicitly set. This leads to a fascinating conclusion: Kubernetes
v1.33, by virtue of using Go 1.24, supports hybrid post-quantum
&lt;code&gt;X25519MLKEM768&lt;/code&gt; for TLS connections by default!&lt;/p&gt;
&lt;p&gt;You can test this yourself. If you set up a Minikube cluster running Kubernetes
v1.33.0, you can connect to the API server using a recent OpenSSL client:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-console&#34; data-lang=&#34;console&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#000080;font-weight:bold&#34;&gt;$&lt;/span&gt; minikube start --kubernetes-version&lt;span style=&#34;color:#666&#34;&gt;=&lt;/span&gt;v1.33.0
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#000080;font-weight:bold&#34;&gt;$&lt;/span&gt; kubectl cluster-info
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#888&#34;&gt;Kubernetes control plane is running at https://127.0.0.1:&amp;lt;PORT&amp;gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#888&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#000080;font-weight:bold&#34;&gt;$&lt;/span&gt; kubectl config view --minify --raw -o &lt;span style=&#34;color:#b8860b&#34;&gt;jsonpath&lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#b62;font-weight:bold&#34;&gt;\&amp;#39;&lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;{&lt;/span&gt;.clusters&lt;span style=&#34;color:#666&#34;&gt;[&lt;/span&gt;0&lt;span style=&#34;color:#666&#34;&gt;]&lt;/span&gt;.cluster.certificate-authority-data&lt;span style=&#34;color:#666&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#b62;font-weight:bold&#34;&gt;\&amp;#39;&lt;/span&gt; | base64 -d &amp;gt; ca.crt
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#000080;font-weight:bold&#34;&gt;$&lt;/span&gt; openssl version
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#888&#34;&gt;OpenSSL 3.5.0 8 Apr 2025 (Library: OpenSSL 3.5.0 8 Apr 2025)
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#888&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#000080;font-weight:bold&#34;&gt;$&lt;/span&gt; &lt;span style=&#34;color:#a2f&#34;&gt;echo&lt;/span&gt; -n &lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;Q&amp;#34;&lt;/span&gt; | openssl s_client -connect 127.0.0.1:&amp;lt;PORT&amp;gt; -CAfile ca.crt
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#888&#34;&gt;[...]
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#888&#34;&gt;Negotiated TLS1.3 group: X25519MLKEM768
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#888&#34;&gt;[...]
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#888&#34;&gt;DONE
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Lo and behold, the negotiated group is &lt;code&gt;X25519MLKEM768&lt;/code&gt;! This is a significant
step towards making Kubernetes quantum-safe, seemingly without a major
announcement or dedicated KEP (Kubernetes Enhancement Proposal).&lt;/p&gt;
&lt;h2 id=&#34;the-go-version-mismatch-pitfall&#34;&gt;The Go version mismatch pitfall&lt;/h2&gt;
&lt;p&gt;An interesting wrinkle emerged with Go versions 1.23 and 1.24. Go 1.23
included experimental support for a draft version of &lt;code&gt;ML-KEM&lt;/code&gt;, identified as
&lt;code&gt;X25519Kyber768Draft00&lt;/code&gt;. This was also enabled by default if
&lt;code&gt;Config.CurvePreferences&lt;/code&gt; was &lt;code&gt;nil&lt;/code&gt;. Kubernetes v1.32 used Go 1.23. However,
Go 1.24 removed the draft support and replaced it with the standardized version
&lt;code&gt;X25519MLKEM768&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;What happens if a client and server are using mismatched Go versions (one on
1.23, the other on 1.24)? They won&#39;t have a common PQC KEM to negotiate, and
the handshake will fall back to classical ECC curves (e.g., &lt;code&gt;X25519&lt;/code&gt;). How
could this happen in practice?&lt;/p&gt;
&lt;p&gt;Consider a scenario:&lt;/p&gt;
&lt;p&gt;A Kubernetes cluster is running v1.32 (using Go 1.23 and thus
&lt;code&gt;X25519Kyber768Draft00&lt;/code&gt;). A developer upgrades their &lt;code&gt;kubectl&lt;/code&gt; to v1.33,
compiled with Go 1.24, only supporting &lt;code&gt;X25519MLKEM768&lt;/code&gt;. Now, when &lt;code&gt;kubectl&lt;/code&gt;
communicates with the v1.32 API server, they no longer share a common PQC
algorithm. The connection will downgrade to classical cryptography, silently
losing the PQC protection that has been in place. This highlights the
importance of understanding the implications of Go version upgrades, and the
details of the TLS stack.&lt;/p&gt;
&lt;h2 id=&#34;limitation-packet-size&#34;&gt;Limitations: packet size&lt;/h2&gt;
&lt;p&gt;One practical consideration with &lt;code&gt;ML-KEM&lt;/code&gt; is the size of its public keys:
around 1.2 kilobytes in encoded form for &lt;code&gt;ML-KEM-768&lt;/code&gt;.
This can cause the initial TLS &lt;code&gt;ClientHello&lt;/code&gt; message not to fit inside
a single TCP/IP packet, given the typical networking constraints
(most commonly, the standard Ethernet frame size limit of 1500
bytes). Some TLS libraries or network appliances might not handle this
gracefully, assuming the Client Hello always fits in one packet. This issue
has been observed in some Kubernetes-related projects and networking
components, potentially leading to connection failures when PQC KEMs are used.
More details can be found at &lt;a href=&#34;https://tldr.fail/&#34;&gt;tldr.fail&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;state-of-post-quantum-signatures&#34;&gt;State of Post-Quantum Signatures&lt;/h2&gt;
&lt;p&gt;While KEMs are seeing broader adoption, PQC digital signatures are further
behind in terms of widespread integration into standard toolchains. NIST has
published standards for PQC signatures, such as &lt;code&gt;ML-DSA&lt;/code&gt; (&lt;code&gt;FIPS-204&lt;/code&gt;) and
&lt;code&gt;SLH-DSA&lt;/code&gt; (&lt;code&gt;FIPS-205&lt;/code&gt;). However, implementing these in a way that&#39;s broadly
usable (e.g., for PQC Certificate Authorities) &lt;a href=&#34;https://blog.cloudflare.com/another-look-at-pq-signatures/#the-algorithms&#34;&gt;presents challenges&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Larger Keys and Signatures&lt;/strong&gt;: PQC signature schemes often have significantly
larger public keys and signature sizes compared to classical algorithms like
Ed25519 or RSA. For instance, Dilithium2 keys can be 30 times larger than
Ed25519 keys, and certificates can be 12 times larger.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Performance&lt;/strong&gt;: Signing and verification operations &lt;a href=&#34;https://pqshield.github.io/nist-sigs-zoo/&#34;&gt;can be substantially slower&lt;/a&gt;.
While some algorithms are on par with classical algorithms, others may have a
much higher overhead, sometimes on the order of 10x to 1000x worse performance.
To improve this situation, NIST is running a
&lt;a href=&#34;https://csrc.nist.gov/news/2024/pqc-digital-signature-second-round-announcement&#34;&gt;second round of standardization&lt;/a&gt; for PQC signatures.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Toolchain Support&lt;/strong&gt;: Mainstream TLS libraries and CA software do not yet have
mature, built-in support for these new signature algorithms. The Go team, for
example, has indicated that &lt;code&gt;ML-DSA&lt;/code&gt; support is a high priority, but the
soonest it might appear in the standard library is Go 1.26 &lt;a href=&#34;https://github.com/golang/go/issues/64537#issuecomment-2877714729&#34;&gt;(as of May 2025)&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/cloudflare/circl&#34;&gt;Cloudflare&#39;s CIRCL&lt;/a&gt; (Cloudflare Interoperable Reusable Cryptographic Library)
library implements some PQC signature schemes like variants of Dilithium, and
they maintain a &lt;a href=&#34;https://github.com/cloudflare/go&#34;&gt;fork of Go (cfgo)&lt;/a&gt; that integrates CIRCL. Using &lt;code&gt;cfgo&lt;/code&gt;, it&#39;s
possible to experiment with generating certificates signed with PQC algorithms
like Ed25519-Dilithium2. However, this requires using a custom Go toolchain and
is not yet part of the mainstream Kubernetes or Go distributions.&lt;/p&gt;
&lt;h2 id=&#34;conclusion&#34;&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The journey to a post-quantum secure Kubernetes is underway, and perhaps
further along than many realize, thanks to the proactive adoption of &lt;code&gt;ML-KEM&lt;/code&gt;
in Go. With Kubernetes v1.33, users are already benefiting from hybrid post-quantum key
exchange in many TLS connections by default.&lt;/p&gt;
&lt;p&gt;However, awareness of potential pitfalls, such as Go version mismatches leading
to downgrades and issues with Client Hello packet sizes, is crucial. While PQC
for KEMs is becoming a reality, PQC for digital signatures and certificate
hierarchies is still in earlier stages of development and adoption for
mainstream use. As Kubernetes maintainers and contributors, staying informed
about these developments will be key to ensuring the long-term security of the
platform.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Navigating Failures in Pods With Devices</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/07/03/navigating-failures-in-pods-with-devices/</link>
      <pubDate>Thu, 03 Jul 2025 00:00:00 +0000</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/07/03/navigating-failures-in-pods-with-devices/</guid>
      <description>
        
        
        &lt;p&gt;Kubernetes is the de facto standard for container orchestration, but when it
comes to handling specialized hardware like GPUs and other accelerators, things
get a bit complicated. This blog post dives into the challenges of managing
failure modes when operating pods with devices in Kubernetes, based on insights
from &lt;a href=&#34;https://sched.co/1i7pT&#34;&gt;Sergey Kanzhelev and Mrunal Patel&#39;s talk at KubeCon NA
2024&lt;/a&gt;. You can follow the links to
&lt;a href=&#34;https://static.sched.com/hosted_files/kccncna2024/b9/KubeCon%20NA%202024_%20Navigating%20Failures%20in%20Pods%20With%20Devices_%20Challenges%20and%20Solutions.pptx.pdf?_gl=1*191m4j5*_gcl_au*MTU1MDM0MTM1My4xNzMwOTE4ODY5LjIxNDI4Nzk1NDIuMTczMTY0ODgyMC4xNzMxNjQ4ODIy*FPAU*MTU1MDM0MTM1My4xNzMwOTE4ODY5&#34;&gt;slides&lt;/a&gt;
and
&lt;a href=&#34;https://www.youtube.com/watch?v=-YCnOYTtVO8&amp;list=PLj6h78yzYM2Pw4mRw4S-1p_xLARMqPkA7&amp;index=150&#34;&gt;recording&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;the-ai-ml-boom-and-its-impact-on-kubernetes&#34;&gt;The AI/ML boom and its impact on Kubernetes&lt;/h2&gt;
&lt;p&gt;The rise of AI/ML workloads has brought new challenges to Kubernetes. These
workloads often rely heavily on specialized hardware, and any device failure can
significantly impact performance and lead to frustrating interruptions. As
highlighted in the 2024 &lt;a href=&#34;https://ai.meta.com/research/publications/the-llama-3-herd-of-models/&#34;&gt;Llama
paper&lt;/a&gt;,
hardware issues, particularly GPU failures, are a major cause of disruption in
AI/ML training. You can also learn how much effort NVIDIA spends on handling
device failures and maintenance in the KubeCon talk by &lt;a href=&#34;https://kccncna2024.sched.com/event/1i7kJ/all-your-gpus-are-belong-to-us-an-inside-look-at-nvidias-self-healing-geforce-now-infrastructure-ryan-hallisey-piotr-prokop-pl-nvidia&#34;&gt;Ryan Hallisey and Piotr
Prokop All-Your-GPUs-Are-Belong-to-Us: An Inside Look at NVIDIA&#39;s Self-Healing
GeForce NOW
Infrastructure&lt;/a&gt;
(&lt;a href=&#34;https://www.youtube.com/watch?v=iLnHtKwmu2I&#34;&gt;recording&lt;/a&gt;) as they see 19
remediation requests per 1000 nodes a day!
We also see data centers offering spot consumption models and overcommit on
power, making device failures commonplace and a part of the business model.&lt;/p&gt;
&lt;p&gt;However, Kubernetes’s view on resources is still very static. The resource is
either there or not. And if it is there, the assumption is that it will stay
there fully functional - Kubernetes lacks good support for handling full or partial
hardware failures. These long-existing assumptions combined with the overall complexity of a setup lead
to a variety of failure modes, which we discuss here.&lt;/p&gt;
&lt;h3 id=&#34;understanding-ai-ml-workloads&#34;&gt;Understanding AI/ML workloads&lt;/h3&gt;
&lt;p&gt;Generally, all AI/ML workloads require specialized hardware, have challenging
scheduling requirements, and are expensive when idle. AI/ML workloads typically
fall into two categories - training and inference. Here is an oversimplified
view of those categories’ characteristics, which are different from traditional workloads
like web services:&lt;/p&gt;
&lt;dl&gt;
&lt;dt&gt;Training&lt;/dt&gt;
&lt;dd&gt;These workloads are resource-intensive, often consuming entire
machines and running as gangs of pods. Training jobs are usually &amp;quot;run to
completion&amp;quot; - but that could be days, weeks or even months. Any failure in a
single pod can necessitate restarting the entire step across all the pods.&lt;/dd&gt;
&lt;dt&gt;Inference&lt;/dt&gt;
&lt;dd&gt;These workloads are usually long-running or run indefinitely,
and can be small enough to consume a subset of a Node’s devices or large enough to span
multiple nodes. They often require downloading huge files with the model
weights.&lt;/dd&gt;
&lt;/dl&gt;
&lt;p&gt;These workload types specifically break many past assumptions:&lt;/p&gt;


 





&lt;table&gt;&lt;caption style=&#34;display: none;&#34;&gt;Workload assumptions before and now&lt;/caption&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style=&#34;text-align:left&#34;&gt;Before&lt;/th&gt;
&lt;th style=&#34;text-align:left&#34;&gt;Now&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left&#34;&gt;Can get a better CPU and the app will work faster.&lt;/td&gt;
&lt;td style=&#34;text-align:left&#34;&gt;Require a &lt;strong&gt;specific&lt;/strong&gt; device (or &lt;strong&gt;class of devices&lt;/strong&gt;) to run.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left&#34;&gt;When something doesn’t work, just recreate it.&lt;/td&gt;
&lt;td style=&#34;text-align:left&#34;&gt;Allocation or reallocation is expensive.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left&#34;&gt;Any node will work. No need to coordinate between Pods.&lt;/td&gt;
&lt;td style=&#34;text-align:left&#34;&gt;Scheduled in a special way - devices often connected in a cross-node topology.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left&#34;&gt;Each Pod can be plug-and-play replaced if failed.&lt;/td&gt;
&lt;td style=&#34;text-align:left&#34;&gt;Pods are a part of a larger task. Lifecycle of an entire task depends on each Pod.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left&#34;&gt;Container images are slim and easily available.&lt;/td&gt;
&lt;td style=&#34;text-align:left&#34;&gt;Container images may be so big that they require special handling.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left&#34;&gt;Long initialization can be offset by slow rollout.&lt;/td&gt;
&lt;td style=&#34;text-align:left&#34;&gt;Initialization may be long and should be optimized, sometimes across many Pods together.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left&#34;&gt;Compute nodes are commoditized and relatively inexpensive, so some idle time is acceptable.&lt;/td&gt;
&lt;td style=&#34;text-align:left&#34;&gt;Nodes with specialized hardware can be an order of magnitude more expensive than those without, so idle time is very wasteful.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;The existing failure model relies on these old assumptions. It may still work for
the new workload types, but it has limited knowledge about devices and is very
expensive for them, in some cases prohibitively so. You will see
more examples later in this article.&lt;/p&gt;
&lt;h3 id=&#34;why-kubernetes-still-reigns-supreme&#34;&gt;Why Kubernetes still reigns supreme&lt;/h3&gt;
&lt;p&gt;This article does not go deeper into the question of why we don&#39;t simply start
fresh for AI/ML workloads, given how different they are from traditional Kubernetes
workloads. Despite many challenges, Kubernetes remains the platform of choice
for AI/ML workloads. Its maturity, security, and rich ecosystem of tools make it
a compelling option. While alternatives exist, they often lack the years of
development and refinement that Kubernetes offers. And the Kubernetes developers
are actively addressing the gaps identified in this article and beyond.&lt;/p&gt;
&lt;h2 id=&#34;the-current-state-of-device-failure-handling&#34;&gt;The current state of device failure handling&lt;/h2&gt;
&lt;p&gt;This section outlines different failure modes and the best practices and DIY
(Do-It-Yourself) solutions used today. The next section describes a roadmap
for improving things for those failure modes.&lt;/p&gt;
&lt;h3 id=&#34;failure-modes-k8s-infrastructure&#34;&gt;Failure modes: K8s infrastructure&lt;/h3&gt;
&lt;p&gt;In order to understand the failures related to the Kubernetes infrastructure,
you need to understand how many moving parts are involved in scheduling a Pod on
the node. The sequence of events when a Pod is scheduled on a Node is as
follows:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;em&gt;Device plugin&lt;/em&gt; is scheduled on the Node&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Device plugin&lt;/em&gt; is registered with the &lt;em&gt;kubelet&lt;/em&gt; via local gRPC&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Kubelet&lt;/em&gt; uses &lt;em&gt;device plugin&lt;/em&gt; to watch for devices and updates capacity of
the node&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Scheduler&lt;/em&gt; places a &lt;em&gt;user Pod&lt;/em&gt; on a Node based on the updated capacity&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Kubelet&lt;/em&gt; asks &lt;em&gt;Device plugin&lt;/em&gt; to &lt;strong&gt;Allocate&lt;/strong&gt; devices for a &lt;em&gt;User Pod&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Kubelet&lt;/em&gt; creates a &lt;em&gt;User Pod&lt;/em&gt; with the allocated devices attached to it&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This diagram shows some of those actors involved:&lt;/p&gt;


&lt;figure&gt;
    &lt;img src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/07/03/navigating-failures-in-pods-with-devices/k8s-infra-devices.svg&#34;
         alt=&#34;The diagram shows relationships between the kubelet, Device plugin, and a user Pod. It shows that the kubelet connects to the Device plugin named my-device, the kubelet reports the node status with the my-device availability, and the user Pod requests 2 of my-device.&#34;/&gt; 
&lt;/figure&gt;
&lt;p&gt;Because so many actors are interconnected, every one of them, and every
connection between them, may experience interruptions. This leads to many
exceptional situations that are often considered failures, and may cause serious
workload interruptions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Pods failing admission at various stages of their lifecycle&lt;/li&gt;
&lt;li&gt;Pods unable to run on perfectly fine hardware&lt;/li&gt;
&lt;li&gt;Scheduling taking an unexpectedly long time&lt;/li&gt;
&lt;/ul&gt;


&lt;figure&gt;
    &lt;img src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/07/03/navigating-failures-in-pods-with-devices/k8s-infra-failures.svg&#34;
         alt=&#34;The same diagram as the one above it, however it has overlayed orange bang drawings over individual components with text indicating what can break in that component. Over the kubelet the text reads: &amp;#39;kubelet restart: loses all devices info before re-Watch&amp;#39;. Over the Device plugin the text reads: &amp;#39;device plugin update, eviction, restart: kubelet cannot Allocate devices or loses all devices state&amp;#39;. Over the user Pod the text reads: &amp;#39;slow pod termination: devices are unavailable&amp;#39;.&#34;/&gt; 
&lt;/figure&gt;
&lt;p&gt;The goal for Kubernetes is to make the interaction between these components as
reliable as possible. Kubelet already implements retries, grace periods, and
other techniques to improve reliability. The roadmap section goes into detail on
other edge cases that the Kubernetes project tracks. However, all these
improvements only work when these best practices are followed:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Configure and restart kubelet and the container runtime (such as containerd or CRI-O)
as early as possible to not interrupt the workload.&lt;/li&gt;
&lt;li&gt;Monitor device plugin health and carefully plan for upgrades.&lt;/li&gt;
&lt;li&gt;Do not overload the node with less-important workloads to prevent interruption
of device plugin and other components.&lt;/li&gt;
&lt;li&gt;Configure user pods tolerations to handle node readiness flakes.&lt;/li&gt;
&lt;li&gt;Configure and code graceful termination logic carefully to not block devices
for too long.&lt;/li&gt;
&lt;/ul&gt;
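&lt;p&gt;As one illustration of the tolerations practice above, a Pod can be configured to tolerate brief node readiness flakes instead of being evicted immediately. A sketch (the 600-second windows, Pod name, and image are illustrative values to tune per workload):&lt;/p&gt;

```yaml
# Sketch: tolerate short node readiness flakes before eviction.
# The keys are the standard node-condition taints; the 600-second
# windows are illustrative, tune them to your workload.
apiVersion: v1
kind: Pod
metadata:
  name: training-worker                    # hypothetical name
spec:
  tolerations:
  - key: "node.kubernetes.io/not-ready"
    operator: "Exists"
    effect: "NoExecute"
    tolerationSeconds: 600
  - key: "node.kubernetes.io/unreachable"
    operator: "Exists"
    effect: "NoExecute"
    tolerationSeconds: 600
  containers:
  - name: worker
    image: registry.example/worker:latest  # placeholder image
```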
&lt;p&gt;Another class of Kubernetes infra-related issues is driver-related. With
traditional resources like CPU and memory, no compatibility checks between the
application and hardware were needed. With special devices like hardware
accelerators, there are new failure modes. Device drivers installed on the node:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Must match the hardware&lt;/li&gt;
&lt;li&gt;Must be compatible with the application&lt;/li&gt;
&lt;li&gt;Must work with other drivers (like &lt;a href=&#34;https://developer.nvidia.com/nccl&#34;&gt;nccl&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Best practices for handling driver versions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Monitor driver installer health&lt;/li&gt;
&lt;li&gt;Plan upgrades of infrastructure and Pods to match the version&lt;/li&gt;
&lt;li&gt;Have canary deployments whenever possible&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Following the best practices in this section, and using device plugins and device
driver installers from trusted and reliable sources, generally eliminates this
class of failures. Kubernetes is tracking work to make this space even better.&lt;/p&gt;
&lt;h3 id=&#34;failure-modes-device-failed&#34;&gt;Failure modes: device failed&lt;/h3&gt;
&lt;p&gt;There is very little handling of device failure in Kubernetes today. Device
plugins report a device failure only by changing the count of allocatable
devices. Kubernetes relies on standard mechanisms like liveness probes or
container failures to allow Pods to communicate failure conditions to the
kubelet. However, Kubernetes does not correlate device failures with container
crashes, and does not offer any mitigation beyond restarting the container while
it stays attached to the same device.&lt;/p&gt;
&lt;p&gt;This is why many plugins and DIY solutions exist to handle device failures based
on various signals.&lt;/p&gt;
&lt;h4 id=&#34;health-controller&#34;&gt;Health controller&lt;/h4&gt;
&lt;p&gt;In many cases a failed device results in an unrecoverable, and very expensive,
node doing nothing. A simple DIY solution is a &lt;em&gt;node health controller&lt;/em&gt;. The
controller compares the device allocatable count with the capacity and, if the
capacity is greater, starts a timer. Once the timer reaches a threshold, the
health controller kills and recreates the node.&lt;/p&gt;
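&lt;p&gt;The core loop of such a controller can be sketched as follows. This is a simplified, standalone sketch: a real controller would read allocatable and capacity from the Node status via the Kubernetes API, and would delegate &amp;quot;recreate node&amp;quot; to the cluster or cloud provider. The threshold value and node names are illustrative:&lt;/p&gt;

```python
import time

UNHEALTHY_THRESHOLD_SECONDS = 600  # illustrative value


class NodeHealthController:
    """Tracks nodes whose device allocatable count dropped below capacity
    and decides to recreate them once they stay degraded past a threshold."""

    def __init__(self, threshold=UNHEALTHY_THRESHOLD_SECONDS, clock=time.monotonic):
        self.threshold = threshold
        self.clock = clock
        self.degraded_since = {}  # node name -> timestamp when degradation was first seen
        self.recreated = []       # nodes this controller decided to recreate

    def observe(self, node, capacity, allocatable):
        """Feed one observation of a node's device counts into the controller."""
        if allocatable >= capacity:
            self.degraded_since.pop(node, None)  # node recovered: reset the timer
            return
        first_seen = self.degraded_since.setdefault(node, self.clock())
        if self.clock() - first_seen >= self.threshold:
            self.recreated.append(node)  # real controller: drain and recreate the node
            del self.degraded_since[node]


# Simulated clock so the sketch runs instantly
now = [0.0]
ctrl = NodeHealthController(threshold=600, clock=lambda: now[0])
ctrl.observe("node-a", capacity=8, allocatable=7)  # one device failed: timer starts
now[0] += 700
ctrl.observe("node-a", capacity=8, allocatable=7)  # still degraded past threshold
print(ctrl.recreated)
```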
&lt;p&gt;There are problems with the &lt;em&gt;health controller&lt;/em&gt; approach:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The root cause of the device failure is typically not known&lt;/li&gt;
&lt;li&gt;The controller is not workload aware&lt;/li&gt;
&lt;li&gt;Failed device might not be in use and you want to keep other devices running&lt;/li&gt;
&lt;li&gt;The detection may be too slow as it is very generic&lt;/li&gt;
&lt;li&gt;The node may be part of a bigger set of nodes and simply cannot be deleted in
isolation without other nodes&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;There are variations of the health controller solving some of the problems
above. The overall theme here though is that to best handle failed devices, you
need customized handling for the specific workload. Kubernetes doesn’t yet offer
enough abstraction to express how critical the device is for a node, for the
cluster, and for the Pod it is assigned to.&lt;/p&gt;
&lt;h4 id=&#34;pod-failure-policy&#34;&gt;Pod failure policy&lt;/h4&gt;
&lt;p&gt;Another DIY approach for device failure handling is a per-pod reaction to a
failed device. This approach is applicable for &lt;em&gt;training&lt;/em&gt; workloads that are
implemented as Jobs.&lt;/p&gt;
&lt;p&gt;A Pod can define special error codes for device failures. For example, whenever
unexpected device behavior is encountered, the Pod exits with a special exit code.
Then the Pod failure policy can handle the device failure in a special way. Read
more in &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/workloads/controllers/job/#pod-failure-policy&#34;&gt;Handling retriable and non-retriable pod failures with Pod failure
policy&lt;/a&gt;.&lt;/p&gt;
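&lt;p&gt;For example, a training Job could treat a dedicated exit code as a device failure and retry it without burning through the regular backoff budget. A sketch (exit code 42 and the image are placeholders you would define for your own workload):&lt;/p&gt;

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: training-job                # hypothetical name
spec:
  backoffLimit: 6
  template:
    spec:
      restartPolicy: Never          # required by pod failure policy
      containers:
      - name: main
        image: registry.example/training:latest   # placeholder image
  podFailurePolicy:
    rules:
    - action: Ignore                # device failure: retry without counting against backoffLimit
      onExitCodes:
        containerName: main
        operator: In
        values: [42]                # the exit code the app uses to signal a failed device
```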
&lt;p&gt;There are some problems with the &lt;em&gt;Pod failure policy&lt;/em&gt; approach for Jobs:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;There is no well-known &lt;em&gt;device failed&lt;/em&gt; condition, so this approach does not work for the
generic Pod case&lt;/li&gt;
&lt;li&gt;Error codes must be coded carefully and in some cases are hard to guarantee.&lt;/li&gt;
&lt;li&gt;It only works for Jobs with &lt;code&gt;restartPolicy: Never&lt;/code&gt;, due to a limitation of the pod
failure policy feature.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So, this solution has limited applicability.&lt;/p&gt;
&lt;h4 id=&#34;custom-pod-watcher&#34;&gt;Custom pod watcher&lt;/h4&gt;
&lt;p&gt;A slightly more generic approach is to implement a Pod watcher as a DIY solution,
or to use a third-party tool offering this functionality. The pod watcher is
most often used to handle device failures for inference workloads.&lt;/p&gt;
&lt;p&gt;Since Kubernetes keeps a Pod assigned to a device even if the device is
reportedly unhealthy, the idea is to detect this situation with the pod watcher
and apply some remediation. This often involves obtaining the device health status
and its mapping to the Pod using the Pod Resources API on the node. If a device
fails, the watcher can then delete the attached Pod as a remediation. The ReplicaSet
will handle the Pod recreation on a healthy device.&lt;/p&gt;
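&lt;p&gt;The essence of such a watcher can be sketched as follows. This is standalone and simplified: in a real implementation, the device-to-pod mapping comes from the node&amp;#39;s Pod Resources API, and deletion is a Kubernetes API call; the pod and device names are placeholders:&lt;/p&gt;

```python
class PodWatcher:
    """Deletes pods that are assigned to devices reported as unhealthy,
    relying on an external controller (e.g. a ReplicaSet) to recreate them."""

    def __init__(self, delete_pod):
        self.delete_pod = delete_pod  # callback performing the actual deletion

    def reconcile(self, pod_to_devices, unhealthy_devices):
        """pod_to_devices maps pod name -> device IDs it uses
        (obtained from the Pod Resources API in a real watcher)."""
        for pod, devices in pod_to_devices.items():
            if any(d in unhealthy_devices for d in devices):
                self.delete_pod(pod)  # remediation: let the workload controller reschedule


deleted = []
watcher = PodWatcher(delete_pod=deleted.append)
watcher.reconcile(
    pod_to_devices={"inference-0": ["gpu-0"], "inference-1": ["gpu-1"]},
    unhealthy_devices={"gpu-1"},
)
print(deleted)
```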
&lt;p&gt;Other reasons to implement this watcher:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Without it, the Pod will keep being assigned to the failed device forever.&lt;/li&gt;
&lt;li&gt;There is no &lt;em&gt;descheduling&lt;/em&gt; for a pod with &lt;code&gt;restartPolicy=Always&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;There are no built-in controllers that delete Pods in CrashLoopBackoff.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Problems with the &lt;em&gt;custom pod watcher&lt;/em&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The signal for the pod watcher is expensive to obtain, and involves some
privileged actions.&lt;/li&gt;
&lt;li&gt;It is a custom solution, and it makes assumptions about how important a device is for a Pod.&lt;/li&gt;
&lt;li&gt;The pod watcher relies on external controllers to reschedule a Pod.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;There are more variations of DIY solutions for handling device failures or
upcoming maintenance. Overall, Kubernetes has enough extension points to
implement these solutions. However, some extension points require higher
privilege than users may be comfortable with or are too disruptive. The roadmap
section goes into more details on specific improvements in handling the device
failures.&lt;/p&gt;
&lt;h3 id=&#34;failure-modes-container-code-failed&#34;&gt;Failure modes: container code failed&lt;/h3&gt;
&lt;p&gt;When the container code fails, or something bad happens to it, like an
out-of-memory condition, Kubernetes knows how to handle those cases: either the
container is restarted, or, if the Pod has &lt;code&gt;restartPolicy: Never&lt;/code&gt;, the
Pod fails and is scheduled on another node. Kubernetes has limited expressiveness
on what counts as a failure (for example, a non-zero exit code or a liveness probe
failure) and how to react to such a failure (mostly either always restart or
immediately fail the Pod).&lt;/p&gt;
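&lt;p&gt;That limited vocabulary looks like this in practice: a liveness probe defines &amp;quot;failure&amp;quot; and the restart policy defines the reaction. A sketch (the probe endpoint, port, and image are placeholders):&lt;/p&gt;

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: worker                     # hypothetical name
spec:
  restartPolicy: Always            # the only reaction: restart the container in place
  containers:
  - name: main
    image: registry.example/worker:latest   # placeholder image
    livenessProbe:                 # one of the few ways to define "failure"
      httpGet:
        path: /healthz
        port: 8080
      failureThreshold: 3
      periodSeconds: 10
```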
&lt;p&gt;This level of expressiveness is often not enough for complicated AI/ML
workloads. AI/ML pods are better rescheduled locally, or even restarted in-place,
as that saves on image pulling time and device allocation. AI/ML pods are often
interconnected and need to be restarted together. This adds another level of
complexity, and optimizing it often brings major savings in running AI/ML
workloads.&lt;/p&gt;
&lt;p&gt;There are various DIY solutions to orchestrate the handling of Pod failures. The
most typical one is to wrap the main executable in the container with some
orchestrator, which can then restart the main executable whenever the job needs
to be restarted because some other Pod has failed.&lt;/p&gt;
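&lt;p&gt;A minimal version of such a wrapper can be sketched as follows. This is an illustration only: a real orchestrator would coordinate the restart signal across Pods; here that channel is stood in for by a local callback, and the restart budget is an arbitrary example value:&lt;/p&gt;

```python
import subprocess
import sys


def run_with_restarts(cmd, should_restart, max_restarts=3):
    """Run cmd, restarting it while the coordination signal asks for a restart.
    should_restart is a callback standing in for the cross-pod coordination
    channel a real orchestrator would use; it receives the last exit code."""
    attempts = 0
    while True:
        proc = subprocess.run(cmd)
        attempts += 1
        if not should_restart(proc.returncode) or attempts > max_restarts:
            return proc.returncode, attempts


# Simulated coordination: ask for one restart, then stop.
signals = iter([True, False])
code, attempts = run_with_restarts(
    [sys.executable, "-c", "print('training step')"],
    should_restart=lambda rc: next(signals),
)
print(code, attempts)
```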
&lt;p&gt;Solutions like this are very fragile and elaborate. They are often worth the
money saved, compared to a regular JobSet delete/recreate cycle, when used in
large training jobs. Making these solutions less fragile and more streamlined,
by developing new hooks and extension points in Kubernetes, will make them
easier to apply to smaller jobs, benefiting everybody.&lt;/p&gt;
&lt;h3 id=&#34;failure-modes-device-degradation&#34;&gt;Failure modes: device degradation&lt;/h3&gt;
&lt;p&gt;Not all device failures are terminal for the overall workload or batch job.
As the hardware stack gets more and more
complex, misconfiguration on one of the hardware stack layers, or driver
failures, may result in devices that are functional, but lagging on performance.
One device that is lagging behind can slow down the whole training job.&lt;/p&gt;
&lt;p&gt;We see reports of such cases more and more often. Kubernetes has no way to
express this type of failure today, and since it is the newest failure mode,
hardware vendors offer little best practice for detecting it, and there is
little third-party tooling for remediating these situations.&lt;/p&gt;
&lt;p&gt;Typically, these failures are detected based on observed workload
characteristics, for example, the expected speed of AI/ML training steps on
particular hardware. Remediation for those issues is highly dependent on the workload&#39;s needs.&lt;/p&gt;
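&lt;p&gt;A simple detector of this kind compares observed step times against a baseline. A standalone sketch (the tolerance factor and worker names are illustrative; a real baseline would come from your workload&amp;#39;s known performance on that hardware):&lt;/p&gt;

```python
from statistics import median


def find_lagging_workers(step_times, tolerance=1.5):
    """Flag workers whose training-step time exceeds the cluster median
    by more than the given factor, a common sign of a degraded device."""
    baseline = median(step_times.values())
    return sorted(
        worker for worker, t in step_times.items()
        if t > baseline * tolerance
    )


observed = {"worker-0": 1.02, "worker-1": 0.98, "worker-2": 1.00, "worker-3": 2.4}
print(find_lagging_workers(observed))
```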
&lt;h2 id=&#34;roadmap&#34;&gt;Roadmap&lt;/h2&gt;
&lt;p&gt;As outlined in the section above, Kubernetes offers a lot of extension points
which are used to implement various DIY solutions. The space of AI/ML is
developing very fast, with changing requirements and usage patterns. SIG Node is
taking a measured approach of enabling more extension points to implement
workload-specific scenarios, over introducing new semantics to support specific
scenarios. This means prioritizing making information about failures readily
available over implementing automatic remediations for those failures that might
only be suitable for a subset of workloads.&lt;/p&gt;
&lt;p&gt;This approach ensures there are no drastic changes to workload handling which
may break existing, well-oiled DIY solutions, or the experience with existing,
more traditional workloads.&lt;/p&gt;
&lt;p&gt;Many error handling techniques used today work for AI/ML, but are very
expensive. SIG Node will invest in extension points to make them cheaper, with
the understanding that cutting costs for AI/ML is critical.&lt;/p&gt;
&lt;p&gt;The following is the set of specific investments we envision for various failure
modes.&lt;/p&gt;
&lt;h3 id=&#34;roadmap-for-failure-modes-k8s-infrastructure&#34;&gt;Roadmap for failure modes: K8s infrastructure&lt;/h3&gt;
&lt;p&gt;The area of Kubernetes infrastructure is the easiest to understand and very
important to make right for the upcoming transition from Device Plugins to DRA.
SIG Node is tracking many work items in this area, most notably the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/kubernetes/kubernetes/issues/127460&#34;&gt;integrate kubelet with the systemd watchdog · Issue
#127460&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/kubernetes/kubernetes/issues/128696&#34;&gt;DRA: detect stale DRA plugin sockets · Issue
#128696&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/kubernetes/kubernetes/issues/127803&#34;&gt;Support takeover for devicemanager/device-plugin · Issue
#127803&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/kubernetes/kubernetes/issues/127457&#34;&gt;Kubelet plugin registration reliability · Issue
#127457&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/kubernetes/kubernetes/issues/128167&#34;&gt;Recreate the Device Manager gRPC server if failed · Issue
#128167&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/kubernetes/kubernetes/issues/128043&#34;&gt;Retry pod admission on device plugin grpc failures · Issue
#128043&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Basically, every interaction between Kubernetes components must be made
reliable, via either kubelet improvements or best practices in plugin
development and deployment.&lt;/p&gt;
&lt;h3 id=&#34;roadmap-for-failure-modes-device-failed&#34;&gt;Roadmap for failure modes: device failed&lt;/h3&gt;
&lt;p&gt;For device failures, some patterns are already emerging in common scenarios
that Kubernetes can support. However, the very first step is to make information
about failed devices more easily available. The initial work here is
&lt;a href=&#34;https://kep.k8s.io/4680&#34;&gt;KEP 4680&lt;/a&gt; (Add Resource Health Status to the Pod Status for
Device Plugin and DRA).&lt;/p&gt;
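&lt;p&gt;Once that work lands, device health is expected to surface in the Pod status roughly like this. This is an illustrative sketch based on the KEP, not a final API; the resource name, container name, and device ID are placeholders:&lt;/p&gt;

```yaml
# Sketch of a Pod status fragment with device health (per KEP 4680)
status:
  containerStatuses:
  - name: main
    allocatedResourcesStatus:     # added by KEP 4680
    - name: example.com/gpu       # placeholder resource name
      resources:
      - resourceID: gpu-0         # placeholder device ID
        health: Unhealthy         # surfaced device health
```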
&lt;p&gt;Longer-term ideas, yet to be tested, include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Integrate device failures into Pod Failure Policy.&lt;/li&gt;
&lt;li&gt;Node-local retry policies, enabling pod failure policies for Pods with
&lt;code&gt;restartPolicy: OnFailure&lt;/code&gt; and possibly beyond that.&lt;/li&gt;
&lt;li&gt;Ability to &lt;em&gt;deschedule&lt;/em&gt; a pod, including one with &lt;code&gt;restartPolicy: Always&lt;/code&gt;, so it can
get a new device allocated.&lt;/li&gt;
&lt;li&gt;Add device health to the ResourceSlice used to represent devices in DRA,
rather than simply withdrawing an unhealthy device from the ResourceSlice.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&#34;roadmap-for-failure-modes-container-code-failed&#34;&gt;Roadmap for failure modes: container code failed&lt;/h3&gt;
&lt;p&gt;The main improvements to handle container code failures for AI/ML workloads
all target cheaper error handling and recovery. The savings mostly come from
reusing pre-allocated resources as much as possible: reusing Pods by restarting
containers in-place, restarting containers node-locally instead of rescheduling
whenever possible, supporting snapshotting, and prioritizing the same node on
re-scheduling to save on image pulls.&lt;/p&gt;
&lt;p&gt;Consider this scenario: a big training job needs 512 Pods to run, and one of the
Pods fails. This means that all Pods need to be interrupted and synced up to
restart the failed step. The most efficient way to achieve this is generally to
reuse as many Pods as possible by restarting them in-place, while replacing the
failed Pod to clear its error, as demonstrated in this picture:&lt;/p&gt;


&lt;figure&gt;
    &lt;img src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/07/03/navigating-failures-in-pods-with-devices/inplace-pod-restarts.svg&#34;
         alt=&#34;The picture shows 512 Pods, most of them are green and have a recycle sign next to them indicating that they can be reused, and one Pod drawn in red, with a new green replacement Pod next to it indicating that it needs to be replaced.&#34;/&gt; 
&lt;/figure&gt;
&lt;p&gt;It is possible to implement this scenario, but all solutions implementing it are
fragile due to lack of certain extension points in Kubernetes. Adding these
extension points to implement this scenario is on the Kubernetes roadmap.&lt;/p&gt;
&lt;h3 id=&#34;roadmap-for-failure-modes-device-degradation&#34;&gt;Roadmap for failure modes: device degradation&lt;/h3&gt;
&lt;p&gt;There is very little done in this area: there is no clear detection signal,
very limited troubleshooting tooling, and no built-in semantics to express a
&amp;quot;degraded&amp;quot; device in Kubernetes. There has been discussion of adding data on
device performance or degradation in the ResourceSlice used by DRA to represent
devices, but it is not yet clearly defined. There are also projects like
&lt;a href=&#34;https://github.com/medik8s/node-healthcheck-operator&#34;&gt;node-healthcheck-operator&lt;/a&gt;
that can be used for some scenarios.&lt;/p&gt;
&lt;p&gt;We expect developments in this area from hardware vendors and cloud providers, and we expect to see mostly DIY
solutions in the near future. As more users are exposed to AI/ML workloads, this
is a space that needs feedback on the patterns used.&lt;/p&gt;
&lt;h2 id=&#34;join-the-conversation&#34;&gt;Join the conversation&lt;/h2&gt;
&lt;p&gt;The Kubernetes community encourages feedback and participation in shaping the
future of device failure handling. Join SIG Node and contribute to the ongoing
discussions!&lt;/p&gt;
&lt;p&gt;This blog post provides a high-level overview of the challenges and future
directions for device failure management in Kubernetes. By addressing these
issues, Kubernetes can solidify its position as the leading platform for AI/ML
workloads, ensuring resilience and reliability for applications that depend on
specialized hardware.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Image Compatibility In Cloud Native Environments</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/06/25/image-compatibility-in-cloud-native-environments/</link>
      <pubDate>Wed, 25 Jun 2025 00:00:00 +0000</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/06/25/image-compatibility-in-cloud-native-environments/</guid>
      <description>
        
        
        &lt;p&gt;In industries where systems must run very reliably and meet strict performance criteria such as telecommunication, high-performance or AI computing, containerized applications often need specific operating system configuration or hardware presence.
It is common practice to require the use of specific versions of the kernel, its configuration, device drivers, or system components.
Despite the existence of the &lt;a href=&#34;https://opencontainers.org/&#34;&gt;Open Container Initiative (OCI)&lt;/a&gt;, a governing community to define standards and specifications for container images, there has been a gap in expression of such compatibility requirements.
The need to address this issue has led to different proposals and, ultimately, an implementation in Kubernetes&#39; &lt;a href=&#34;https://kubernetes-sigs.github.io/node-feature-discovery/stable/get-started/index.html&#34;&gt;Node Feature Discovery (NFD)&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://kubernetes-sigs.github.io/node-feature-discovery/stable/get-started/index.html&#34;&gt;NFD&lt;/a&gt; is an open source Kubernetes project that automatically detects and reports &lt;a href=&#34;https://kubernetes-sigs.github.io/node-feature-discovery/v0.17/usage/customization-guide.html#available-features&#34;&gt;hardware and system features&lt;/a&gt; of cluster nodes. This information helps users to schedule workloads on nodes that meet specific system requirements, which is especially useful for applications with strict hardware or operating system dependencies.&lt;/p&gt;
&lt;h2 id=&#34;the-need-for-image-compatibility-specification&#34;&gt;The need for image compatibility specification&lt;/h2&gt;
&lt;h3 id=&#34;dependencies-between-containers-and-host-os&#34;&gt;Dependencies between containers and host OS&lt;/h3&gt;
&lt;p&gt;A container image is built on a base image, which provides a minimal runtime environment, often a stripped-down Linux userland, completely empty or distroless. When an application requires certain features from the host OS, compatibility issues arise. These dependencies can manifest in several ways:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Drivers&lt;/strong&gt;:
Host driver versions must match the supported range of a library version inside the container to avoid compatibility problems. Examples include GPUs and network drivers.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Libraries or Software&lt;/strong&gt;:
The container must come with a specific version or range of versions for a library or software to run optimally in the environment. Examples from high performance computing are MPI, EFA, or Infiniband.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Kernel Modules or Features&lt;/strong&gt;:
Specific kernel features or modules must be present. Examples include support for write-protected huge page faults, or the presence of VFIO.&lt;/li&gt;
&lt;li&gt;And more…&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;While containers in Kubernetes are the most likely unit of abstraction for these needs, the definition of compatibility can extend further to include other container technologies such as Singularity and other OCI artifacts such as binaries from a spack binary cache.&lt;/p&gt;
&lt;h3 id=&#34;multi-cloud-and-hybrid-cloud-challenges&#34;&gt;Multi-cloud and hybrid cloud challenges&lt;/h3&gt;
&lt;p&gt;Containerized applications are deployed across various Kubernetes distributions and cloud providers, where different host operating systems introduce compatibility challenges.
Often those have to be pre-configured before workload deployment or are immutable.
For instance, different cloud providers will include different operating systems like:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;RHCOS/RHEL&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Photon OS&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Amazon Linux 2&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Container-Optimized OS&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Azure Linux OS&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;And more...&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Each OS comes with unique kernel versions, configurations, and drivers, making compatibility a non-trivial issue for applications requiring specific features.
It must be possible to quickly assess a container for its suitability to run on any specific environment.&lt;/p&gt;
&lt;h3 id=&#34;image-compatibility-initiative&#34;&gt;Image compatibility initiative&lt;/h3&gt;
&lt;p&gt;An effort was made within the &lt;a href=&#34;https://github.com/opencontainers/wg-image-compatibility&#34;&gt;Open Containers Initiative Image Compatibility&lt;/a&gt; working group to introduce a standard for image compatibility metadata.
A specification for compatibility would allow container authors to declare required host OS features, making compatibility requirements discoverable and programmable.
The specification implemented in Kubernetes Node Feature Discovery is one of the discussed proposals.
It aims to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Define a structured way to express compatibility in OCI image manifests.&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Support a compatibility specification alongside container images in image registries.&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Allow automated validation of compatibility before scheduling containers.&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The concept has since been implemented in the Kubernetes Node Feature Discovery project.&lt;/p&gt;
&lt;h3 id=&#34;implementation-in-node-feature-discovery&#34;&gt;Implementation in Node Feature Discovery&lt;/h3&gt;
&lt;p&gt;The solution integrates compatibility metadata into Kubernetes via NFD features and the &lt;a href=&#34;https://kubernetes-sigs.github.io/node-feature-discovery/v0.17/usage/custom-resources.html#nodefeaturegroup&#34;&gt;NodeFeatureGroup&lt;/a&gt; API.
This interface enables users to match containers to nodes based on exposed hardware and software features, allowing for intelligent scheduling and workload optimization.&lt;/p&gt;
&lt;h3 id=&#34;compatibility-specification&#34;&gt;Compatibility specification&lt;/h3&gt;
&lt;p&gt;The compatibility specification is a structured list of compatibility objects containing &lt;em&gt;&lt;a href=&#34;https://kubernetes-sigs.github.io/node-feature-discovery/v0.17/usage/custom-resources.html#nodefeaturegroup&#34;&gt;Node Feature Groups&lt;/a&gt;&lt;/em&gt;.
These objects define image requirements and facilitate validation against host nodes.
The feature requirements are described by using &lt;a href=&#34;https://kubernetes-sigs.github.io/node-feature-discovery/v0.17/usage/customization-guide.html#available-features&#34;&gt;the list of available features&lt;/a&gt; from the NFD project.
The schema has the following structure:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;version&lt;/strong&gt; (string) - Specifies the API version.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;compatibilities&lt;/strong&gt; (array of objects) - List of compatibility sets.
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;rules&lt;/strong&gt; (object) - Specifies &lt;a href=&#34;https://kubernetes-sigs.github.io/node-feature-discovery/v0.17/usage/custom-resources.html#nodefeaturegroup&#34;&gt;NodeFeatureGroup&lt;/a&gt; to define image requirements.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;weight&lt;/strong&gt; (int, optional) - Node affinity weight.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;tag&lt;/strong&gt; (string, optional) - Categorization tag.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;description&lt;/strong&gt; (string, optional) - Short description.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;An example might look like the following:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;version&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;v1alpha1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;compatibilities&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;description&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;My image requirements&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;rules&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;kernel and cpu&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;matchFeatures&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;feature&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;kernel.loadedmodule&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;matchExpressions&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;vfio-pci&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;{&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;op&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Exists}&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;feature&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;cpu.model&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;matchExpressions&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;vendor_id&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;{&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;op: In, value&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;[&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;Intel&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;AMD&amp;#34;&lt;/span&gt;]}&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;one of available nics&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;matchAny&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;matchFeatures&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;feature&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;pci.device&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;matchExpressions&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;vendor&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;{&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;op: In, value&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;[&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;0eee&amp;#34;&lt;/span&gt;]}&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;class&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;{&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;op: In, value&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;[&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;0200&amp;#34;&lt;/span&gt;]}&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;matchFeatures&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;feature&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;pci.device&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;matchExpressions&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;vendor&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;{&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;op: In, value&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;[&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;0fff&amp;#34;&lt;/span&gt;]}&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;class&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;{&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;op: In, value&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;[&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;0200&amp;#34;&lt;/span&gt;]}&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&#34;client-implementation-for-node-validation&#34;&gt;Client implementation for node validation&lt;/h3&gt;
&lt;p&gt;To streamline compatibility validation, we implemented a &lt;a href=&#34;https://kubernetes-sigs.github.io/node-feature-discovery/v0.17/reference/node-feature-client-reference.html&#34;&gt;client tool&lt;/a&gt; that validates nodes against an image&#39;s compatibility artifact.
In this workflow, the image author generates a compatibility artifact and links it, via the referrers API, to the image it describes in a registry.
When you need to assess whether an image fits a host, the tool discovers the artifact and verifies the image&#39;s compatibility with a node before deployment.
The client can validate nodes both inside and outside a Kubernetes cluster, extending the utility of the tool beyond the single Kubernetes use case.
In the future, image compatibility could play a crucial role in creating specific workload profiles based on image compatibility requirements, aiding in more efficient scheduling.
Additionally, it could potentially enable automatic node configuration to some extent, further optimizing resource allocation and ensuring seamless deployment of specialized workloads.&lt;/p&gt;
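&lt;p&gt;To illustrate the idea, evaluating compatibility rules against discovered node features can be sketched in a few lines of Go. This is a simplified model, not the actual nfd client implementation: the &lt;code&gt;MatchExpression&lt;/code&gt; type and the reduced operator set (&lt;code&gt;Exists&lt;/code&gt; and &lt;code&gt;In&lt;/code&gt;) are invented here for illustration.&lt;/p&gt;

```go
package main

import "fmt"

// MatchExpression is a simplified stand-in for an NFD match expression;
// the real specification supports more operators than shown here.
type MatchExpression struct {
	Op    string   // "Exists" or "In"
	Value []string // candidate values for the "In" operator
}

// matches reports whether a discovered feature value satisfies one expression.
func matches(expr MatchExpression, value string, present bool) bool {
	switch expr.Op {
	case "Exists":
		return present
	case "In":
		if !present {
			return false
		}
		for _, v := range expr.Value {
			if v == value {
				return true
			}
		}
	}
	return false
}

// nodeCompatible mirrors the matchFeatures semantics from the spec above:
// every expression must match the node's discovered features.
func nodeCompatible(rules map[string]MatchExpression, features map[string]string) bool {
	for name, expr := range rules {
		value, present := features[name]
		if !matches(expr, value, present) {
			return false
		}
	}
	return true
}

func main() {
	rules := map[string]MatchExpression{
		"vendor_id": {Op: "In", Value: []string{"Intel", "AMD"}},
		"vfio-pci":  {Op: "Exists"},
	}
	features := map[string]string{"vendor_id": "Intel", "vfio-pci": "true"}
	fmt.Println(nodeCompatible(rules, features)) // prints true
}
```

&lt;p&gt;The real client resolves the feature values from Node Feature Discovery rather than a hand-built map, but the matching logic follows the same all-expressions-must-match shape.&lt;/p&gt;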
&lt;h3 id=&#34;examples-of-usage&#34;&gt;Examples of usage&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Define image compatibility metadata&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;A &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/containers/images/&#34;&gt;container image&lt;/a&gt; can have metadata that describes
its requirements based on features discovered from nodes, like kernel modules or CPU models.
The previous compatibility specification example in this article exemplified this use case.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Attach the artifact to the image&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The image compatibility specification is stored as an OCI artifact.
You can attach this metadata to your container image using the &lt;a href=&#34;https://oras.land/&#34;&gt;oras&lt;/a&gt; tool.
The registry only needs to support OCI artifacts; support for arbitrary types is not required.
Keep in mind that the container image and the artifact must be stored in the same registry.
Use the following command to attach the artifact to the image:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;oras attach &lt;span style=&#34;color:#b62;font-weight:bold&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#b62;font-weight:bold&#34;&gt;&lt;/span&gt;--artifact-type application/vnd.nfd.image-compatibility.v1alpha1 &amp;lt;image-url&amp;gt; &lt;span style=&#34;color:#b62;font-weight:bold&#34;&gt;\ &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&amp;lt;path-to-spec&amp;gt;.yaml:application/vnd.nfd.image-compatibility.spec.v1alpha1+yaml
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Validate image compatibility&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;After attaching the compatibility specification, you can validate whether a node meets the
image&#39;s requirements. This validation can be done using the
&lt;a href=&#34;https://kubernetes-sigs.github.io/node-feature-discovery/v0.17/reference/node-feature-client-reference.html&#34;&gt;nfd client&lt;/a&gt;:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;nfd compat validate-node --image &amp;lt;image-url&amp;gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Read the output from the client&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Finally, you can read the report generated by the tool, or use your own tooling to act on the generated JSON report.&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;validate-node command output&#34; src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/06/25/image-compatibility-in-cloud-native-environments/validate-node-output.png&#34;&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;conclusion&#34;&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The addition of image compatibility to Kubernetes through Node Feature Discovery underscores the growing importance of addressing compatibility in cloud native environments.
It is only a start, as further work is needed to integrate compatibility into scheduling of workloads within and outside of Kubernetes.
However, with this capability in place, mission-critical workloads can now define and validate host OS requirements more efficiently.
Moving forward, the adoption of compatibility metadata within Kubernetes ecosystems will significantly enhance the reliability and performance of specialized containerized applications, ensuring they meet the stringent requirements of industries like telecommunications and high-performance computing, or of any environment that requires special hardware or host OS configuration.&lt;/p&gt;
&lt;h2 id=&#34;get-involved&#34;&gt;Get involved&lt;/h2&gt;
&lt;p&gt;Join the &lt;a href=&#34;https://kubernetes-sigs.github.io/node-feature-discovery/v0.17/contributing/&#34;&gt;Kubernetes Node Feature Discovery&lt;/a&gt; project if you&#39;re interested in getting involved with the design and development of the Image Compatibility API and tools.
We always welcome new contributors.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Changes to Kubernetes Slack</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/06/16/changes-to-kubernetes-slack/</link>
      <pubDate>Mon, 16 Jun 2025 00:00:00 +0000</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/06/16/changes-to-kubernetes-slack/</guid>
      <description>
        
        
        &lt;p&gt;&lt;strong&gt;UPDATE&lt;/strong&gt;: We’ve received notice from Salesforce that our Slack workspace &lt;strong&gt;WILL NOT BE DOWNGRADED&lt;/strong&gt; on June 20th. Stand by for more details, but for now, there is no urgency to back up private channels or direct messages.&lt;/p&gt;
&lt;p&gt;&lt;del&gt;Kubernetes Slack will lose its special status and will be changing into a standard free Slack on June 20, 2025&lt;/del&gt;. Sometime later this year, our community may move to a new platform. If you are responsible for a channel or private channel, or a member of a User Group, you will need to take some actions as soon as you can.&lt;/p&gt;
&lt;p&gt;For the last decade, Slack has supported our project with a free customized enterprise account. They have let us know that they can no longer do so, particularly since our Slack is one of the largest and most active ones on the platform. As such, they will be downgrading it to a standard free Slack while we decide on, and implement, other options.&lt;/p&gt;
&lt;p&gt;On Friday, June 20, we will be subject to the &lt;a href=&#34;https://slack.com/help/articles/27204752526611-Feature-limitations-on-the-free-version-of-Slack&#34;&gt;feature limitations of free Slack&lt;/a&gt;. The ones that will affect us most are retaining only 90 days of history, and having to disable several apps and workflows that we currently use. The Slack Admin team will do their best to manage these limitations.&lt;/p&gt;
&lt;p&gt;Responsible channel owners, members of private channels, and members of User Groups should &lt;a href=&#34;https://github.com/kubernetes/community/blob/master/communication/slack-migration-faq.md#what-actions-do-channel-owners-and-user-group-members-need-to-take-soon&#34;&gt;take some actions&lt;/a&gt; to prepare for the downgrade and preserve information as soon as possible.&lt;/p&gt;
&lt;p&gt;The CNCF Projects Staff have proposed that our community look at migrating to Discord. Because of existing issues where we have been pushing the limits of Slack, they have already explored what a Kubernetes Discord would look like. Discord would allow us to implement new tools and integrations which would help the community, such as GitHub group membership synchronization. The Steering Committee will discuss and decide on our future platform.&lt;/p&gt;
&lt;p&gt;Please see our &lt;a href=&#34;https://github.com/kubernetes/community/blob/master/communication/slack-migration-faq.md&#34;&gt;FAQ&lt;/a&gt;, and check the &lt;a href=&#34;https://groups.google.com/a/kubernetes.io/g/dev/&#34;&gt;kubernetes-dev mailing list&lt;/a&gt; and the &lt;a href=&#34;https://kubernetes.slack.com/archives/C9T0QMNG4&#34;&gt;#announcements channel&lt;/a&gt; for further news. If you have specific feedback on our Slack status join the &lt;a href=&#34;https://github.com/kubernetes/community/issues/8490&#34;&gt;discussion on GitHub&lt;/a&gt;.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Enhancing Kubernetes Event Management with Custom Aggregation</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/06/10/enhancing-kubernetes-event-management-custom-aggregation/</link>
      <pubDate>Tue, 10 Jun 2025 00:00:00 +0000</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/06/10/enhancing-kubernetes-event-management-custom-aggregation/</guid>
      <description>
        
        
        &lt;p&gt;Kubernetes &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/reference/kubernetes-api/cluster-resources/event-v1/&#34;&gt;Events&lt;/a&gt; provide crucial insights into cluster operations, but as clusters grow, managing and analyzing these events becomes increasingly challenging. This blog post explores how to build custom event aggregation systems that help engineering teams better understand cluster behavior and troubleshoot issues more effectively.&lt;/p&gt;
&lt;h2 id=&#34;the-challenge-with-kubernetes-events&#34;&gt;The challenge with Kubernetes events&lt;/h2&gt;
&lt;p&gt;In a Kubernetes cluster, events are generated for various operations - from pod scheduling and container starts to volume mounts and network configurations. While these events are invaluable for debugging and monitoring, several challenges emerge in production environments:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Volume&lt;/strong&gt;: Large clusters can generate thousands of events per minute&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Retention&lt;/strong&gt;: Default event retention is limited to one hour&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Correlation&lt;/strong&gt;: Related events from different components are not automatically linked&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Classification&lt;/strong&gt;: Events lack standardized severity or category classifications&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Aggregation&lt;/strong&gt;: Similar events are not automatically grouped&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;To learn more about Events in Kubernetes, read the &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/reference/kubernetes-api/cluster-resources/event-v1/&#34;&gt;Event&lt;/a&gt; API reference.&lt;/p&gt;
&lt;h2 id=&#34;real-world-value&#34;&gt;Real-world value&lt;/h2&gt;
&lt;p&gt;Consider a production environment with tens of microservices, where users report intermittent transaction failures:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Without event aggregation:&lt;/strong&gt; Engineers waste hours sifting through thousands of standalone events spread across namespaces. By the time they investigate, the older events have long since been purged, and correlating pod restarts with node-level issues is practically impossible.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;With custom event aggregation:&lt;/strong&gt; The system groups related events across resources, instantly surfacing correlation patterns such as volume mount timeouts preceding pod restarts. Retained history shows that the same pattern occurred during past traffic spikes, pinpointing a storage scalability issue in minutes rather than hours.&lt;/p&gt;
&lt;p&gt;Organizations that implement this approach commonly cut their troubleshooting time significantly and improve system reliability by detecting patterns early.&lt;/p&gt;
&lt;h2 id=&#34;building-an-event-aggregation-system&#34;&gt;Building an Event aggregation system&lt;/h2&gt;
&lt;p&gt;This post explores how to build a custom event aggregation system that addresses these challenges, aligned with Kubernetes best practices. I&#39;ve picked the Go programming language for my example.&lt;/p&gt;
&lt;h3 id=&#34;architecture-overview&#34;&gt;Architecture overview&lt;/h3&gt;
&lt;p&gt;This event aggregation system consists of three main components:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Event Watcher&lt;/strong&gt;: Monitors the Kubernetes API for new events&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Event Processor&lt;/strong&gt;: Processes, categorizes, and correlates events&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Storage Backend&lt;/strong&gt;: Stores processed events for longer retention&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Here&#39;s a sketch for how to implement the event watcher:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-go&#34; data-lang=&#34;go&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;package&lt;/span&gt; main
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;import&lt;/span&gt; (
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;context&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    metav1 &lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;k8s.io/apimachinery/pkg/apis/meta/v1&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;k8s.io/client-go/kubernetes&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;k8s.io/client-go/rest&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    eventsv1 &lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;k8s.io/api/events/v1&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;type&lt;/span&gt; EventWatcher &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;struct&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    clientset &lt;span style=&#34;color:#666&#34;&gt;*&lt;/span&gt;kubernetes.Clientset
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;func&lt;/span&gt; &lt;span style=&#34;color:#00a000&#34;&gt;NewEventWatcher&lt;/span&gt;(config &lt;span style=&#34;color:#666&#34;&gt;*&lt;/span&gt;rest.Config) (&lt;span style=&#34;color:#666&#34;&gt;*&lt;/span&gt;EventWatcher, &lt;span style=&#34;color:#0b0;font-weight:bold&#34;&gt;error&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    clientset, err &lt;span style=&#34;color:#666&#34;&gt;:=&lt;/span&gt; kubernetes.&lt;span style=&#34;color:#00a000&#34;&gt;NewForConfig&lt;/span&gt;(config)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;if&lt;/span&gt; err &lt;span style=&#34;color:#666&#34;&gt;!=&lt;/span&gt; &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;nil&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;return&lt;/span&gt; &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;nil&lt;/span&gt;, err
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;return&lt;/span&gt; &lt;span style=&#34;color:#666&#34;&gt;&amp;amp;&lt;/span&gt;EventWatcher{clientset: clientset}, &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;nil&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;func&lt;/span&gt; (w &lt;span style=&#34;color:#666&#34;&gt;*&lt;/span&gt;EventWatcher) &lt;span style=&#34;color:#00a000&#34;&gt;Watch&lt;/span&gt;(ctx context.Context) (&lt;span style=&#34;color:#666&#34;&gt;&amp;lt;-&lt;/span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;chan&lt;/span&gt; &lt;span style=&#34;color:#666&#34;&gt;*&lt;/span&gt;eventsv1.Event, &lt;span style=&#34;color:#0b0;font-weight:bold&#34;&gt;error&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    events &lt;span style=&#34;color:#666&#34;&gt;:=&lt;/span&gt; &lt;span style=&#34;color:#a2f&#34;&gt;make&lt;/span&gt;(&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;chan&lt;/span&gt; &lt;span style=&#34;color:#666&#34;&gt;*&lt;/span&gt;eventsv1.Event)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    watcher, err &lt;span style=&#34;color:#666&#34;&gt;:=&lt;/span&gt; w.clientset.&lt;span style=&#34;color:#00a000&#34;&gt;EventsV1&lt;/span&gt;().&lt;span style=&#34;color:#00a000&#34;&gt;Events&lt;/span&gt;(&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;&amp;#34;&lt;/span&gt;).&lt;span style=&#34;color:#00a000&#34;&gt;Watch&lt;/span&gt;(ctx, metav1.ListOptions{})
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;if&lt;/span&gt; err &lt;span style=&#34;color:#666&#34;&gt;!=&lt;/span&gt; &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;nil&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;return&lt;/span&gt; &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;nil&lt;/span&gt;, err
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;go&lt;/span&gt; &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;func&lt;/span&gt;() {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;defer&lt;/span&gt; &lt;span style=&#34;color:#a2f&#34;&gt;close&lt;/span&gt;(events)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;for&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;select&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;case&lt;/span&gt; event, ok &lt;span style=&#34;color:#666&#34;&gt;:=&lt;/span&gt; &lt;span style=&#34;color:#666&#34;&gt;&amp;lt;-&lt;/span&gt;watcher.&lt;span style=&#34;color:#00a000&#34;&gt;ResultChan&lt;/span&gt;():
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                &lt;span style=&#34;color:#080;font-style:italic&#34;&gt;// Stop when the watch channel is closed, e.g. on a watch timeout&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;if&lt;/span&gt; !ok {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                    &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;return&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;if&lt;/span&gt; e, ok &lt;span style=&#34;color:#666&#34;&gt;:=&lt;/span&gt; event.Object.(&lt;span style=&#34;color:#666&#34;&gt;*&lt;/span&gt;eventsv1.Event); ok {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                    events &lt;span style=&#34;color:#666&#34;&gt;&amp;lt;-&lt;/span&gt; e
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;case&lt;/span&gt; &lt;span style=&#34;color:#666&#34;&gt;&amp;lt;-&lt;/span&gt;ctx.&lt;span style=&#34;color:#00a000&#34;&gt;Done&lt;/span&gt;():
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                watcher.&lt;span style=&#34;color:#00a000&#34;&gt;Stop&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;return&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    }()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;return&lt;/span&gt; events, &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;nil&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&#34;event-processing-and-classification&#34;&gt;Event processing and classification&lt;/h3&gt;
&lt;p&gt;The event processor enriches events with additional context and classification:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-go&#34; data-lang=&#34;go&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;type&lt;/span&gt; EventProcessor &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;struct&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    categoryRules []CategoryRule
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    correlationRules []CorrelationRule
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;type&lt;/span&gt; ProcessedEvent &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;struct&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    Event     &lt;span style=&#34;color:#666&#34;&gt;*&lt;/span&gt;eventsv1.Event
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    Category  &lt;span style=&#34;color:#0b0;font-weight:bold&#34;&gt;string&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    Severity  &lt;span style=&#34;color:#0b0;font-weight:bold&#34;&gt;string&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    CorrelationID &lt;span style=&#34;color:#0b0;font-weight:bold&#34;&gt;string&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    Metadata  &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;map&lt;/span&gt;[&lt;span style=&#34;color:#0b0;font-weight:bold&#34;&gt;string&lt;/span&gt;]&lt;span style=&#34;color:#0b0;font-weight:bold&#34;&gt;string&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;func&lt;/span&gt; (p &lt;span style=&#34;color:#666&#34;&gt;*&lt;/span&gt;EventProcessor) &lt;span style=&#34;color:#00a000&#34;&gt;Process&lt;/span&gt;(event &lt;span style=&#34;color:#666&#34;&gt;*&lt;/span&gt;eventsv1.Event) &lt;span style=&#34;color:#666&#34;&gt;*&lt;/span&gt;ProcessedEvent {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    processed &lt;span style=&#34;color:#666&#34;&gt;:=&lt;/span&gt; &lt;span style=&#34;color:#666&#34;&gt;&amp;amp;&lt;/span&gt;ProcessedEvent{
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        Event:    event,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        Metadata: &lt;span style=&#34;color:#a2f&#34;&gt;make&lt;/span&gt;(&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;map&lt;/span&gt;[&lt;span style=&#34;color:#0b0;font-weight:bold&#34;&gt;string&lt;/span&gt;]&lt;span style=&#34;color:#0b0;font-weight:bold&#34;&gt;string&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#080;font-style:italic&#34;&gt;// Apply classification rules
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;&lt;/span&gt;    processed.Category = p.&lt;span style=&#34;color:#00a000&#34;&gt;classifyEvent&lt;/span&gt;(event)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    processed.Severity = p.&lt;span style=&#34;color:#00a000&#34;&gt;determineSeverity&lt;/span&gt;(event)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#080;font-style:italic&#34;&gt;// Generate correlation ID for related events
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;&lt;/span&gt;    processed.CorrelationID = p.&lt;span style=&#34;color:#00a000&#34;&gt;correlateEvent&lt;/span&gt;(event)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#080;font-style:italic&#34;&gt;// Add useful metadata
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;&lt;/span&gt;    processed.Metadata = p.&lt;span style=&#34;color:#00a000&#34;&gt;extractMetadata&lt;/span&gt;(event)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;return&lt;/span&gt; processed
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&#34;implementing-event-correlation&#34;&gt;Implementing Event correlation&lt;/h3&gt;
&lt;p&gt;One key capability you could implement is correlating related Events.
Here&#39;s an example correlation strategy:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-go&#34; data-lang=&#34;go&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;func&lt;/span&gt; (p &lt;span style=&#34;color:#666&#34;&gt;*&lt;/span&gt;EventProcessor) &lt;span style=&#34;color:#00a000&#34;&gt;correlateEvent&lt;/span&gt;(event &lt;span style=&#34;color:#666&#34;&gt;*&lt;/span&gt;eventsv1.Event) &lt;span style=&#34;color:#0b0;font-weight:bold&#34;&gt;string&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#080;font-style:italic&#34;&gt;// Correlation strategies:
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;&lt;/span&gt;    &lt;span style=&#34;color:#080;font-style:italic&#34;&gt;// 1. Time-based: Events within a time window
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;&lt;/span&gt;    &lt;span style=&#34;color:#080;font-style:italic&#34;&gt;// 2. Resource-based: Events affecting the same resource
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;&lt;/span&gt;    &lt;span style=&#34;color:#080;font-style:italic&#34;&gt;// 3. Causation-based: Events with cause-effect relationships
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    correlationKey &lt;span style=&#34;color:#666&#34;&gt;:=&lt;/span&gt; &lt;span style=&#34;color:#00a000&#34;&gt;generateCorrelationKey&lt;/span&gt;(event)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;return&lt;/span&gt; correlationKey
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;func&lt;/span&gt; &lt;span style=&#34;color:#00a000&#34;&gt;generateCorrelationKey&lt;/span&gt;(event &lt;span style=&#34;color:#666&#34;&gt;*&lt;/span&gt;eventsv1.Event) &lt;span style=&#34;color:#0b0;font-weight:bold&#34;&gt;string&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#080;font-style:italic&#34;&gt;// Example: Combine namespace, resource type, and name
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;&lt;/span&gt;    &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;return&lt;/span&gt; fmt.&lt;span style=&#34;color:#00a000&#34;&gt;Sprintf&lt;/span&gt;(&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;%s/%s/%s&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        event.Regarding.Namespace,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        event.Regarding.Kind,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        event.Regarding.Name,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    )
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;event-storage-and-retention&#34;&gt;Event storage and retention&lt;/h2&gt;
&lt;p&gt;For long-term storage and analysis, you&#39;ll probably want a backend that supports:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Efficient querying of large event volumes&lt;/li&gt;
&lt;li&gt;Flexible retention policies&lt;/li&gt;
&lt;li&gt;Support for aggregation queries&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Here&#39;s a sample storage interface:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-go&#34; data-lang=&#34;go&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;type&lt;/span&gt; EventStorage &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;interface&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#00a000&#34;&gt;Store&lt;/span&gt;(context.Context, &lt;span style=&#34;color:#666&#34;&gt;*&lt;/span&gt;ProcessedEvent) &lt;span style=&#34;color:#0b0;font-weight:bold&#34;&gt;error&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#00a000&#34;&gt;Query&lt;/span&gt;(context.Context, EventQuery) ([]ProcessedEvent, &lt;span style=&#34;color:#0b0;font-weight:bold&#34;&gt;error&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#00a000&#34;&gt;Aggregate&lt;/span&gt;(context.Context, AggregationParams) ([]EventAggregate, &lt;span style=&#34;color:#0b0;font-weight:bold&#34;&gt;error&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;type&lt;/span&gt; EventQuery &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;struct&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    TimeRange     TimeRange
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    Categories    []&lt;span style=&#34;color:#0b0;font-weight:bold&#34;&gt;string&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    Severity      []&lt;span style=&#34;color:#0b0;font-weight:bold&#34;&gt;string&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    CorrelationID &lt;span style=&#34;color:#0b0;font-weight:bold&#34;&gt;string&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    Limit         &lt;span style=&#34;color:#0b0;font-weight:bold&#34;&gt;int&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;type&lt;/span&gt; AggregationParams &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;struct&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    GroupBy    []&lt;span style=&#34;color:#0b0;font-weight:bold&#34;&gt;string&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    TimeWindow &lt;span style=&#34;color:#0b0;font-weight:bold&#34;&gt;string&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    Metrics    []&lt;span style=&#34;color:#0b0;font-weight:bold&#34;&gt;string&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;good-practices-for-event-management&#34;&gt;Good practices for Event management&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Resource Efficiency&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Implement rate limiting for event processing&lt;/li&gt;
&lt;li&gt;Use efficient filtering at the API server level&lt;/li&gt;
&lt;li&gt;Batch events for storage operations&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Scalability&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Distribute event processing across multiple workers&lt;/li&gt;
&lt;li&gt;Use leader election for coordination&lt;/li&gt;
&lt;li&gt;Implement backoff strategies for API rate limits&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Reliability&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Handle API server disconnections gracefully&lt;/li&gt;
&lt;li&gt;Buffer events during storage backend unavailability&lt;/li&gt;
&lt;li&gt;Implement retry mechanisms with exponential backoff&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;advanced-features&#34;&gt;Advanced features&lt;/h2&gt;
&lt;h3 id=&#34;pattern-detection&#34;&gt;Pattern detection&lt;/h3&gt;
&lt;p&gt;Implement pattern detection to identify recurring issues:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-go&#34; data-lang=&#34;go&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;type&lt;/span&gt; PatternDetector &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;struct&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    patterns  &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;map&lt;/span&gt;[&lt;span style=&#34;color:#0b0;font-weight:bold&#34;&gt;string&lt;/span&gt;]&lt;span style=&#34;color:#666&#34;&gt;*&lt;/span&gt;Pattern
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    threshold &lt;span style=&#34;color:#0b0;font-weight:bold&#34;&gt;int&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;func&lt;/span&gt; (d &lt;span style=&#34;color:#666&#34;&gt;*&lt;/span&gt;PatternDetector) &lt;span style=&#34;color:#00a000&#34;&gt;Detect&lt;/span&gt;(events []ProcessedEvent) []Pattern {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#080;font-style:italic&#34;&gt;// Group similar events
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;&lt;/span&gt;    groups &lt;span style=&#34;color:#666&#34;&gt;:=&lt;/span&gt; &lt;span style=&#34;color:#00a000&#34;&gt;groupSimilarEvents&lt;/span&gt;(events)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#080;font-style:italic&#34;&gt;// Analyze frequency and timing
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;&lt;/span&gt;    patterns &lt;span style=&#34;color:#666&#34;&gt;:=&lt;/span&gt; &lt;span style=&#34;color:#00a000&#34;&gt;identifyPatterns&lt;/span&gt;(groups)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;return&lt;/span&gt; patterns
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;func&lt;/span&gt; &lt;span style=&#34;color:#00a000&#34;&gt;groupSimilarEvents&lt;/span&gt;(events []ProcessedEvent) &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;map&lt;/span&gt;[&lt;span style=&#34;color:#0b0;font-weight:bold&#34;&gt;string&lt;/span&gt;][]ProcessedEvent {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    groups &lt;span style=&#34;color:#666&#34;&gt;:=&lt;/span&gt; &lt;span style=&#34;color:#a2f&#34;&gt;make&lt;/span&gt;(&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;map&lt;/span&gt;[&lt;span style=&#34;color:#0b0;font-weight:bold&#34;&gt;string&lt;/span&gt;][]ProcessedEvent)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;for&lt;/span&gt; _, event &lt;span style=&#34;color:#666&#34;&gt;:=&lt;/span&gt; &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;range&lt;/span&gt; events {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#080;font-style:italic&#34;&gt;// Create similarity key based on event characteristics
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;&lt;/span&gt;        similarityKey &lt;span style=&#34;color:#666&#34;&gt;:=&lt;/span&gt; fmt.&lt;span style=&#34;color:#00a000&#34;&gt;Sprintf&lt;/span&gt;(&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;%s:%s:%s&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            event.Event.Reason,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            event.Event.Regarding.Kind,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            event.Event.Regarding.Namespace,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        )
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#080;font-style:italic&#34;&gt;// Group events with the same key
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;&lt;/span&gt;        groups[similarityKey] = &lt;span style=&#34;color:#a2f&#34;&gt;append&lt;/span&gt;(groups[similarityKey], event)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;return&lt;/span&gt; groups
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;func&lt;/span&gt; &lt;span style=&#34;color:#00a000&#34;&gt;identifyPatterns&lt;/span&gt;(groups &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;map&lt;/span&gt;[&lt;span style=&#34;color:#0b0;font-weight:bold&#34;&gt;string&lt;/span&gt;][]ProcessedEvent) []Pattern {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;var&lt;/span&gt; patterns []Pattern
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;for&lt;/span&gt; key, events &lt;span style=&#34;color:#666&#34;&gt;:=&lt;/span&gt; &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;range&lt;/span&gt; groups {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#080;font-style:italic&#34;&gt;// Only consider groups with enough events to form a pattern
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;&lt;/span&gt;        &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;if&lt;/span&gt; &lt;span style=&#34;color:#a2f&#34;&gt;len&lt;/span&gt;(events) &amp;lt; &lt;span style=&#34;color:#666&#34;&gt;3&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;continue&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#080;font-style:italic&#34;&gt;// Sort events by time
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;&lt;/span&gt;        sort.&lt;span style=&#34;color:#00a000&#34;&gt;Slice&lt;/span&gt;(events, &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;func&lt;/span&gt;(i, j &lt;span style=&#34;color:#0b0;font-weight:bold&#34;&gt;int&lt;/span&gt;) &lt;span style=&#34;color:#0b0;font-weight:bold&#34;&gt;bool&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;return&lt;/span&gt; events[i].Event.EventTime.Time.&lt;span style=&#34;color:#00a000&#34;&gt;Before&lt;/span&gt;(events[j].Event.EventTime.Time)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        })
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#080;font-style:italic&#34;&gt;// Calculate time range and frequency
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;&lt;/span&gt;        firstSeen &lt;span style=&#34;color:#666&#34;&gt;:=&lt;/span&gt; events[&lt;span style=&#34;color:#666&#34;&gt;0&lt;/span&gt;].Event.EventTime.Time
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        lastSeen &lt;span style=&#34;color:#666&#34;&gt;:=&lt;/span&gt; events[&lt;span style=&#34;color:#a2f&#34;&gt;len&lt;/span&gt;(events)&lt;span style=&#34;color:#666&#34;&gt;-&lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;1&lt;/span&gt;].Event.EventTime.Time
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        duration &lt;span style=&#34;color:#666&#34;&gt;:=&lt;/span&gt; lastSeen.&lt;span style=&#34;color:#00a000&#34;&gt;Sub&lt;/span&gt;(firstSeen).&lt;span style=&#34;color:#00a000&#34;&gt;Minutes&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;var&lt;/span&gt; frequency &lt;span style=&#34;color:#0b0;font-weight:bold&#34;&gt;float64&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;if&lt;/span&gt; duration &amp;gt; &lt;span style=&#34;color:#666&#34;&gt;0&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            frequency = &lt;span style=&#34;color:#a2f&#34;&gt;float64&lt;/span&gt;(&lt;span style=&#34;color:#a2f&#34;&gt;len&lt;/span&gt;(events)) &lt;span style=&#34;color:#666&#34;&gt;/&lt;/span&gt; duration
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#080;font-style:italic&#34;&gt;// Create a pattern if it meets threshold criteria
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;&lt;/span&gt;        &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;if&lt;/span&gt; frequency &amp;gt; &lt;span style=&#34;color:#666&#34;&gt;0.5&lt;/span&gt; { &lt;span style=&#34;color:#080;font-style:italic&#34;&gt;// More than 1 event per 2 minutes
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;&lt;/span&gt;            pattern &lt;span style=&#34;color:#666&#34;&gt;:=&lt;/span&gt; Pattern{
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                Type:         key,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                Count:        &lt;span style=&#34;color:#a2f&#34;&gt;len&lt;/span&gt;(events),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                FirstSeen:    firstSeen,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                LastSeen:     lastSeen,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                Frequency:    frequency,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                EventSamples: events[:&lt;span style=&#34;color:#a2f&#34;&gt;min&lt;/span&gt;(&lt;span style=&#34;color:#666&#34;&gt;3&lt;/span&gt;, &lt;span style=&#34;color:#a2f&#34;&gt;len&lt;/span&gt;(events))], &lt;span style=&#34;color:#080;font-style:italic&#34;&gt;// Keep up to 3 samples
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;&lt;/span&gt;            }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            patterns = &lt;span style=&#34;color:#a2f&#34;&gt;append&lt;/span&gt;(patterns, pattern)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;return&lt;/span&gt; patterns
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;With this implementation, the system can identify recurring patterns such as node pressure events, pod scheduling failures, or networking issues that occur with a specific frequency.&lt;/p&gt;
&lt;h3 id=&#34;real-time-alerts&#34;&gt;Real-time alerts&lt;/h3&gt;
&lt;p&gt;The following example provides a starting point for building an alerting system based on event patterns. It is not a complete solution but a conceptual sketch to illustrate the approach.&lt;/p&gt;
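To make the rule interface concrete before the alerting sketch itself, here is one hypothetical rule implementation: a `ReasonCountRule` that fires when enough events share a reason. The types are simplified stand-ins (in the real system, `ProcessedEvent` wraps an `events.k8s.io/v1` Event), and the names are invented for illustration:

```go
package main

import "fmt"

// Simplified stand-ins for the post's types; the real ProcessedEvent
// wraps an events.k8s.io/v1 Event.
type ProcessedEvent struct {
	Reason string
}

type Alert struct {
	Summary string
}

// ReasonCountRule (hypothetical) matches when at least Threshold
// events share the same Reason.
type ReasonCountRule struct {
	Reason    string
	Threshold int
}

func (r ReasonCountRule) Matches(events []ProcessedEvent) bool {
	count := 0
	for _, e := range events {
		if e.Reason == r.Reason {
			count++
		}
	}
	return count >= r.Threshold
}

func (r ReasonCountRule) GenerateAlert(events []ProcessedEvent) Alert {
	return Alert{Summary: fmt.Sprintf("reason %q seen at least %d times", r.Reason, r.Threshold)}
}

func main() {
	rule := ReasonCountRule{Reason: "FailedScheduling", Threshold: 3}
	events := []ProcessedEvent{
		{Reason: "FailedScheduling"},
		{Reason: "FailedScheduling"},
		{Reason: "FailedScheduling"},
		{Reason: "Pulled"},
	}
	fmt.Println(rule.Matches(events)) // true
	fmt.Println(rule.GenerateAlert(events).Summary)
}
```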
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-go&#34; data-lang=&#34;go&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;type&lt;/span&gt; AlertManager &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;struct&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    rules     []AlertRule
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    notifiers []Notifier
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;func&lt;/span&gt; (a &lt;span style=&#34;color:#666&#34;&gt;*&lt;/span&gt;AlertManager) &lt;span style=&#34;color:#00a000&#34;&gt;EvaluateEvents&lt;/span&gt;(events []ProcessedEvent) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;for&lt;/span&gt; _, rule &lt;span style=&#34;color:#666&#34;&gt;:=&lt;/span&gt; &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;range&lt;/span&gt; a.rules {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;if&lt;/span&gt; rule.&lt;span style=&#34;color:#00a000&#34;&gt;Matches&lt;/span&gt;(events) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            alert &lt;span style=&#34;color:#666&#34;&gt;:=&lt;/span&gt; rule.&lt;span style=&#34;color:#00a000&#34;&gt;GenerateAlert&lt;/span&gt;(events)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            a.&lt;span style=&#34;color:#00a000&#34;&gt;notify&lt;/span&gt;(alert)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;conclusion&#34;&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;A well-designed event aggregation system can significantly improve cluster observability and troubleshooting capabilities. By implementing custom event processing, correlation, and storage, operators can better understand cluster behavior and respond to issues more effectively.&lt;/p&gt;
&lt;p&gt;The solutions presented here can be extended and customized based on specific requirements while maintaining compatibility with the Kubernetes API and following best practices for scalability and reliability.&lt;/p&gt;
&lt;h2 id=&#34;next-steps&#34;&gt;Next steps&lt;/h2&gt;
&lt;p&gt;Future enhancements could include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Machine learning for anomaly detection&lt;/li&gt;
&lt;li&gt;Integration with popular observability platforms&lt;/li&gt;
&lt;li&gt;Custom event APIs for application-specific events&lt;/li&gt;
&lt;li&gt;Enhanced visualization and reporting capabilities&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For more information on Kubernetes events and custom &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/architecture/controller/&#34;&gt;controllers&lt;/a&gt;,
refer to the official Kubernetes &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/&#34;&gt;documentation&lt;/a&gt;.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Introducing Gateway API Inference Extension</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/06/05/introducing-gateway-api-inference-extension/</link>
      <pubDate>Thu, 05 Jun 2025 00:00:00 +0000</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/06/05/introducing-gateway-api-inference-extension/</guid>
      <description>
        
        
        &lt;p&gt;Modern generative AI and large language model (LLM) services create unique traffic-routing challenges
on Kubernetes. Unlike typical short-lived, stateless web requests, LLM inference sessions are often
long-running, resource-intensive, and partially stateful. For example, a single GPU-backed model server
may keep multiple inference sessions active and maintain in-memory token caches.&lt;/p&gt;
&lt;p&gt;Traditional load balancers that route on HTTP path or use round-robin scheduling lack the specialized
capabilities these workloads need. They also don’t account for model identity or request criticality (e.g., interactive
chat vs. batch jobs). Organizations often patch together ad-hoc solutions, but a standardized approach
has been missing.&lt;/p&gt;
&lt;h2 id=&#34;gateway-api-inference-extension&#34;&gt;Gateway API Inference Extension&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;https://gateway-api-inference-extension.sigs.k8s.io/&#34;&gt;Gateway API Inference Extension&lt;/a&gt; was created to address
this gap by building on the existing &lt;a href=&#34;https://gateway-api.sigs.k8s.io/&#34;&gt;Gateway API&lt;/a&gt;, adding inference-specific
routing capabilities while retaining the familiar model of Gateways and HTTPRoutes. By adding an inference
extension to your existing gateway, you effectively transform it into an &lt;strong&gt;Inference Gateway&lt;/strong&gt;, enabling you to
self-host GenAI/LLMs with a “model-as-a-service” mindset.&lt;/p&gt;
&lt;p&gt;The project’s goal is to improve and standardize routing to inference workloads across the ecosystem. Key
objectives include enabling model-aware routing, supporting per-request criticalities, facilitating safe model
roll-outs, and optimizing load balancing based on real-time model metrics. By achieving these, the project aims
to reduce latency and improve accelerator (GPU) utilization for AI workloads.&lt;/p&gt;
&lt;h2 id=&#34;how-it-works&#34;&gt;How it works&lt;/h2&gt;
&lt;p&gt;The design introduces two new Custom Resource Definitions (CRDs) with distinct responsibilities, each aligning with a
specific user persona in the AI/ML serving workflow:&lt;/p&gt;


&lt;figure class=&#34;diagram-large clickable-zoom&#34;&gt;
    &lt;img src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/06/05/introducing-gateway-api-inference-extension/inference-extension-resource-model.png&#34;
         alt=&#34;Resource Model&#34;/&gt; 
&lt;/figure&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&#34;https://gateway-api-inference-extension.sigs.k8s.io/api-types/inferencepool/&#34;&gt;InferencePool&lt;/a&gt;
Defines a pool of pods (model servers) running on shared compute (e.g., GPU nodes). The platform admin can
configure how these pods are deployed, scaled, and balanced. An InferencePool ensures consistent resource
usage and enforces platform-wide policies. An InferencePool is similar to a Service but specialized for AI/ML
serving needs and aware of the model-serving protocol.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&#34;https://gateway-api-inference-extension.sigs.k8s.io/api-types/inferencemodel/&#34;&gt;InferenceModel&lt;/a&gt;
A user-facing model endpoint managed by AI/ML owners. It maps a public name (e.g., &amp;quot;gpt-4-chat&amp;quot;) to the actual
model within an InferencePool. This lets workload owners specify which models (and optional fine-tuning) they
want served, plus a traffic-splitting or prioritization policy.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;In summary, the InferenceModel API lets AI/ML owners manage what is served, while the InferencePool lets platform
operators manage where and how it’s served.&lt;/p&gt;
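&lt;p&gt;To make that split concrete, here is a minimal sketch of the two resources declared side by side. This is illustrative only: the API group, version, and field names (such as &lt;code&gt;targetPortNumber&lt;/code&gt; and &lt;code&gt;extensionRef&lt;/code&gt;) are assumptions based on the project&#39;s alpha API and may differ from the current spec.&lt;/p&gt;

```yaml
# Hypothetical sketch; field names and apiVersion are assumptions.
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferencePool
metadata:
  name: vllm-llama2-pool        # owned by the platform admin
spec:
  selector:
    app: vllm-llama2            # pods running the model servers
  targetPortNumber: 8000        # port the model server listens on
  extensionRef:
    name: vllm-llama2-epp       # Endpoint Selection Extension service
---
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferenceModel
metadata:
  name: gpt-4-chat              # owned by the AI/ML workload owner
spec:
  modelName: gpt-4-chat         # public model name clients request
  criticality: Critical         # prioritized over lower-criticality traffic
  poolRef:
    name: vllm-llama2-pool      # InferencePool that serves this model
```

&lt;p&gt;The platform admin owns the first object; the AI/ML owner only needs to reference the pool by name.&lt;/p&gt;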
&lt;h2 id=&#34;request-flow&#34;&gt;Request flow&lt;/h2&gt;
&lt;p&gt;The flow of a request builds on the Gateway API model (Gateways and HTTPRoutes) with one or more extra inference-aware
steps (extensions) in the middle. Here’s a high-level example of the request flow with the
&lt;a href=&#34;https://gateway-api-inference-extension.sigs.k8s.io/#endpoint-selection-extension&#34;&gt;Endpoint Selection Extension (ESE)&lt;/a&gt;:&lt;/p&gt;


&lt;figure class=&#34;diagram-large clickable-zoom&#34;&gt;
    &lt;img src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/06/05/introducing-gateway-api-inference-extension/inference-extension-request-flow.png&#34;
         alt=&#34;Request Flow&#34;/&gt; 
&lt;/figure&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Gateway Routing&lt;/strong&gt;&lt;br&gt;
A client sends a request (e.g., an HTTP POST to /completions). The Gateway (like Envoy) examines the HTTPRoute
and identifies the matching InferencePool backend.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Endpoint Selection&lt;/strong&gt;&lt;br&gt;
Instead of simply forwarding to any available pod, the Gateway consults an inference-specific routing extension—
the Endpoint Selection Extension—to pick the best of the available pods. This extension examines live pod metrics
(queue lengths, memory usage, loaded adapters) to choose the ideal pod for the request.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Inference-Aware Scheduling&lt;/strong&gt;&lt;br&gt;
The chosen pod is the one that can handle the request with the lowest latency or highest efficiency, given the
user’s criticality or resource needs. The Gateway then forwards traffic to that specific pod.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
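&lt;p&gt;The Gateway Routing step above uses an ordinary HTTPRoute whose backend is an InferencePool rather than a Service. Here is a hedged sketch; the &lt;code&gt;group&lt;/code&gt; value and resource names are assumptions for illustration:&lt;/p&gt;

```yaml
# Hypothetical sketch; group and names are assumptions.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: llm-route
spec:
  parentRefs:
    - name: inference-gateway     # the Gateway (e.g., Envoy-based)
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /completions   # inference requests
      backendRefs:
        - group: inference.networking.x-k8s.io
          kind: InferencePool     # pool backend instead of a Service
          name: vllm-llama2-pool
```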


&lt;figure class=&#34;diagram-large clickable-zoom&#34;&gt;
    &lt;img src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/06/05/introducing-gateway-api-inference-extension/inference-extension-epp-scheduling.png&#34;
         alt=&#34;Endpoint Extension Scheduling&#34;/&gt; 
&lt;/figure&gt;
&lt;p&gt;This extra step provides a smarter, model-aware routing mechanism that still feels like a normal single request to
the client. Additionally, the design is extensible—any Inference Gateway can be enhanced with additional inference-specific
extensions to handle new routing strategies, advanced scheduling logic, or specialized hardware needs. As the project
continues to grow, contributors are encouraged to develop new extensions that are fully compatible with the same underlying
Gateway API model, further expanding the possibilities for efficient and intelligent GenAI/LLM routing.&lt;/p&gt;
&lt;h2 id=&#34;benchmarks&#34;&gt;Benchmarks&lt;/h2&gt;
&lt;p&gt;We evaluated this extension against a standard Kubernetes Service for a &lt;a href=&#34;https://docs.vllm.ai/en/latest/&#34;&gt;vLLM&lt;/a&gt;‐based model
serving deployment. The test environment consisted of multiple H100 (80 GB) GPU pods running vLLM (&lt;a href=&#34;https://blog.vllm.ai/2025/01/27/v1-alpha-release.html&#34;&gt;version 1&lt;/a&gt;)
on a Kubernetes cluster, with 10 Llama2 model replicas. The &lt;a href=&#34;https://github.com/AI-Hypercomputer/inference-benchmark&#34;&gt;Latency Profile Generator (LPG)&lt;/a&gt;
tool was used to generate traffic and measure throughput, latency, and other metrics. The
&lt;a href=&#34;https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json&#34;&gt;ShareGPT&lt;/a&gt;
dataset served as the workload, and traffic was ramped from 100 Queries per Second (QPS) up to 1000 QPS.&lt;/p&gt;
&lt;h3 id=&#34;key-results&#34;&gt;Key results&lt;/h3&gt;


&lt;figure class=&#34;diagram-large clickable-zoom&#34;&gt;
    &lt;img src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/06/05/introducing-gateway-api-inference-extension/inference-extension-benchmark.png&#34;
         alt=&#34;Benchmark results&#34;/&gt; 
&lt;/figure&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Comparable Throughput&lt;/strong&gt;: Throughout the tested QPS range, the ESE delivered throughput roughly on par with a standard
Kubernetes Service.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Lower Latency&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Per‐Output‐Token Latency&lt;/strong&gt;: The ESE showed significantly lower p90 latency at higher QPS (500+), indicating that
its model-aware routing decisions reduce queueing and resource contention as GPU memory approaches saturation.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Overall p90 Latency&lt;/strong&gt;: Similar trends emerged, with the ESE reducing end‐to‐end tail latencies compared to the
baseline, particularly as traffic increased beyond 400–500 QPS.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These results suggest that this extension&#39;s model‐aware routing significantly reduced latency for GPU‐backed LLM
workloads. By dynamically selecting the least‐loaded or best‐performing model server, it avoids hotspots that can
appear when using traditional load balancing methods for large, long‐running inference requests.&lt;/p&gt;
&lt;h2 id=&#34;roadmap&#34;&gt;Roadmap&lt;/h2&gt;
&lt;p&gt;As the Gateway API Inference Extension heads toward GA, planned features include:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Prefix-cache aware load balancing&lt;/strong&gt; for remote caches&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;LoRA adapter pipelines&lt;/strong&gt; for automated rollout&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Fairness and priority&lt;/strong&gt; between workloads in the same criticality band&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;HPA support&lt;/strong&gt; for scaling based on aggregate, per-model metrics&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Support for large multi-modal inputs/outputs&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Additional model types&lt;/strong&gt; (e.g., diffusion models)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Heterogeneous accelerators&lt;/strong&gt; (serving on multiple accelerator types with latency- and cost-aware load balancing)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Disaggregated serving&lt;/strong&gt; for independently scaling pools&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;summary&#34;&gt;Summary&lt;/h2&gt;
&lt;p&gt;By aligning model serving with Kubernetes-native tooling, Gateway API Inference Extension aims to simplify
and standardize how AI/ML traffic is routed. With model-aware routing, criticality-based prioritization, and
more, it helps ops teams deliver the right LLM services to the right users—smoothly and efficiently.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Ready to learn more?&lt;/strong&gt; Visit the &lt;a href=&#34;https://gateway-api-inference-extension.sigs.k8s.io/&#34;&gt;project docs&lt;/a&gt; to dive deeper,
give an Inference Gateway extension a try with a few &lt;a href=&#34;https://gateway-api-inference-extension.sigs.k8s.io/guides/&#34;&gt;simple steps&lt;/a&gt;,
and &lt;a href=&#34;https://gateway-api-inference-extension.sigs.k8s.io/contributing/&#34;&gt;get involved&lt;/a&gt; if you’re interested in
contributing to the project!&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Start Sidecar First: How To Avoid Snags</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/06/03/start-sidecar-first/</link>
      <pubDate>Tue, 03 Jun 2025 00:00:00 +0000</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/06/03/start-sidecar-first/</guid>
      <description>
        
        
&lt;p&gt;From the &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/04/22/multi-container-pods-overview/&#34;&gt;Kubernetes Multicontainer Pods: An Overview blog post&lt;/a&gt; you know what their job is, what the main architectural patterns are, and how they are implemented in Kubernetes. The main thing I’ll cover in this article is how to ensure that your sidecar containers start before the main app. It’s more complicated than you might think!&lt;/p&gt;
&lt;h2 id=&#34;a-gentle-refresher&#34;&gt;A gentle refresher&lt;/h2&gt;
&lt;p&gt;I&#39;d just like to remind readers that the &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2023/12/13/kubernetes-v1-29-release/&#34;&gt;v1.29.0 release of Kubernetes&lt;/a&gt; added native support for
&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/workloads/pods/sidecar-containers/&#34;&gt;sidecar containers&lt;/a&gt;, which can now be defined within the &lt;code&gt;.spec.initContainers&lt;/code&gt; field,
but with &lt;code&gt;restartPolicy: Always&lt;/code&gt;. You can see that illustrated in the following example Pod manifest snippet:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;initContainers&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;logshipper&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;image&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;alpine:latest&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;restartPolicy&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Always&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# this is what makes it a sidecar container&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;command&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;[&lt;span style=&#34;color:#b44&#34;&gt;&amp;#39;sh&amp;#39;&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#39;-c&amp;#39;&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#39;tail -F /opt/logs.txt&amp;#39;&lt;/span&gt;]&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;volumeMounts&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;data&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;mountPath&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;/opt&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;What are the specifics of defining sidecars with a &lt;code&gt;.spec.initContainers&lt;/code&gt; block, rather than as a legacy multi-container pod with multiple &lt;code&gt;.spec.containers&lt;/code&gt;?
Well, all &lt;code&gt;.spec.initContainers&lt;/code&gt; are always launched &lt;strong&gt;before&lt;/strong&gt; the main application. If you define Kubernetes-native sidecars, those are terminated &lt;strong&gt;after&lt;/strong&gt; the main application. Furthermore, when used with &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/workloads/controllers/job/&#34;&gt;Jobs&lt;/a&gt;, a sidecar container should still be alive and could potentially even restart after the owning Job is complete; Kubernetes-native sidecar containers do not block pod completion.&lt;/p&gt;
&lt;p&gt;To learn more, you can also read the official &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/tutorials/configuration/pod-sidecar-containers/&#34;&gt;Pod sidecar containers tutorial&lt;/a&gt;.&lt;/p&gt;
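&lt;p&gt;As a quick illustration of the Job behavior mentioned above, here is a minimal sketch (names and images are illustrative): because the log shipper is a native sidecar with &lt;code&gt;restartPolicy: Always&lt;/code&gt;, the pod can complete once the &lt;code&gt;worker&lt;/code&gt; container exits, without waiting for the sidecar.&lt;/p&gt;

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: data-processing           # hypothetical name
spec:
  template:
    spec:
      restartPolicy: Never
      initContainers:
        - name: logshipper
          image: alpine:latest
          restartPolicy: Always   # native sidecar: does not block Job completion
          command: ['sh', '-c', 'tail -F /opt/logs.txt']
          volumeMounts:
            - name: data
              mountPath: /opt
      containers:
        - name: worker
          image: alpine:latest
          command: ['sh', '-c', 'echo done > /opt/logs.txt; sleep 10']
          volumeMounts:
            - name: data
              mountPath: /opt
      volumes:
        - name: data
          emptyDir: {}
```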
&lt;h2 id=&#34;the-problem&#34;&gt;The problem&lt;/h2&gt;
&lt;p&gt;Now you know that defining a sidecar with this native approach will always start it before the main application. From the &lt;a href=&#34;https://github.com/kubernetes/kubernetes/blob/537a602195efdc04cdf2cb0368792afad082d9fd/pkg/kubelet/kuberuntime/kuberuntime_manager.go#L827-L830&#34;&gt;kubelet source code&lt;/a&gt;, it&#39;s visible that in practice this often means the containers are started almost in parallel, and that is not always what an engineer wants to achieve. What I&#39;m really interested in is whether I can delay the start of the main application until the sidecar is not just started, but fully running and ready to serve.
It might be a bit tricky because sidecars have no obvious success signal, unlike init containers, which are designed to run to completion. With an init container, exit status 0 unambiguously means &amp;quot;I succeeded&amp;quot;. With a sidecar, there are many points at which you could say &amp;quot;it is running&amp;quot;.
Starting one container only after the previous one is ready is part of a graceful deployment strategy, ensuring proper sequencing and stability during startup. It’s also how I’d expect sidecar containers to work, to cover the scenario where the main application depends on the sidecar. For example, an app may error out if the sidecar isn’t available to serve requests (e.g., logging with DataDog). Sure, one could change the application code (and it would actually be the “best practice” solution), but sometimes one can’t - and this post focuses on that use case.&lt;/p&gt;
&lt;p&gt;I&#39;ll explain some ways that you might try, and show you what approaches will really work.&lt;/p&gt;
&lt;h2 id=&#34;readiness-probe&#34;&gt;Readiness probe&lt;/h2&gt;
&lt;p&gt;To check whether a Kubernetes native sidecar delays the start of the main application until the sidecar is ready, let’s run a short investigation. First, I’ll simulate a sidecar container that will never become ready by implementing a readiness probe that never succeeds. As a reminder, a &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/configuration/liveness-readiness-startup-probes/&#34;&gt;readiness probe&lt;/a&gt; checks whether the container is ready to start accepting traffic and, therefore, whether the pod can be used as a backend for Services.&lt;/p&gt;
&lt;p&gt;(Unlike standard init containers, sidecar containers can have &lt;a href=&#34;https://kubernetes.io/docs/concepts/configuration/liveness-readiness-startup-probes/&#34;&gt;probes&lt;/a&gt; so that the kubelet can supervise the sidecar and intervene if there are problems. For example, restarting a sidecar container if it fails a health check.)&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;apps/v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Deployment&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;myapp&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;labels&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;app&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;myapp&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;replicas&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;selector&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;matchLabels&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;app&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;myapp&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;template&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;labels&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;app&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;myapp&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;containers&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;myapp&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;image&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;alpine:latest&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;command&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;[&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;sh&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;-c&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;sleep 3600&amp;#34;&lt;/span&gt;]&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;initContainers&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;nginx&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;image&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;nginx:latest&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;restartPolicy&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Always&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;ports&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;            &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;containerPort&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;80&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;              &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;protocol&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;TCP&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;readinessProbe&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;            &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;exec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;              &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;command&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;              &lt;/span&gt;- /bin/sh&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;              &lt;/span&gt;- -c&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;              &lt;/span&gt;- exit 1&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# this command always fails, keeping the container &amp;#34;Not Ready&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;            &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;periodSeconds&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;5&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;volumes&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;data&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;emptyDir&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;{}&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The result is:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-console&#34; data-lang=&#34;console&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#888&#34;&gt;controlplane $ kubectl get pods -w
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#888&#34;&gt;NAME                    READY   STATUS    RESTARTS   AGE
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#888&#34;&gt;myapp-db5474f45-htgw5   1/2     Running   0          9m28s
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#888&#34;&gt;&lt;/span&gt;&lt;span style=&#34;&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#888&#34;&gt;controlplane $ kubectl describe pod myapp-db5474f45-htgw5 
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#888&#34;&gt;Name:             myapp-db5474f45-htgw5
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#888&#34;&gt;Namespace:        default
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#888&#34;&gt;(...)
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#888&#34;&gt;Events:
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#888&#34;&gt;  Type     Reason     Age               From               Message
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#888&#34;&gt;  ----     ------     ----              ----               -------
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#888&#34;&gt;  Normal   Scheduled  17s               default-scheduler  Successfully assigned default/myapp-db5474f45-htgw5 to node01
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#888&#34;&gt;  Normal   Pulling    16s               kubelet            Pulling image &amp;#34;nginx:latest&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#888&#34;&gt;  Normal   Pulled     16s               kubelet            Successfully pulled image &amp;#34;nginx:latest&amp;#34; in 163ms (163ms including waiting). Image size: 72080558 bytes.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#888&#34;&gt;  Normal   Created    16s               kubelet            Created container nginx
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#888&#34;&gt;  Normal   Started    16s               kubelet            Started container nginx
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#888&#34;&gt;  Normal   Pulling    15s               kubelet            Pulling image &amp;#34;alpine:latest&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#888&#34;&gt;  Normal   Pulled     15s               kubelet            Successfully pulled image &amp;#34;alpine:latest&amp;#34; in 159ms (160ms including waiting). Image size: 3652536 bytes.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#888&#34;&gt;  Normal   Created    15s               kubelet            Created container myapp
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#888&#34;&gt;  Normal   Started    15s               kubelet            Started container myapp
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#888&#34;&gt;  Warning  Unhealthy  1s (x6 over 15s)  kubelet            Readiness probe failed:
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;From this output it’s evident that only one container is ready - and it can’t be the sidecar, because I’ve defined its readiness probe so that it will never pass (you can also check container statuses in &lt;code&gt;kubectl get pod -o json&lt;/code&gt;). I can also see that myapp was started before the sidecar became ready. That was not the result I wanted: in this scenario, the main app container has a hard dependency on its sidecar.&lt;/p&gt;
&lt;h2 id=&#34;maybe-a-startup-probe&#34;&gt;Maybe a startup probe?&lt;/h2&gt;
&lt;p&gt;To ensure that the sidecar is ready before the main app container starts, I can define a &lt;code&gt;startupProbe&lt;/code&gt;. It delays the start of the main container until the probe succeeds (for an exec probe, until the command returns a &lt;code&gt;0&lt;/code&gt; exit status). If you’re wondering why I’ve added it to my &lt;code&gt;initContainer&lt;/code&gt; rather than to the myapp container, consider what would happen otherwise: a probe on myapp wouldn’t be guaranteed to run before the main application code - and that code can potentially error out while the sidecar isn’t up and running yet.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;apps/v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Deployment&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;myapp&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;labels&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;app&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;myapp&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;replicas&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;selector&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;matchLabels&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;app&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;myapp&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;template&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;labels&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;app&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;myapp&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;containers&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;myapp&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;image&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;alpine:latest&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;command&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;[&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;sh&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;-c&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;sleep 3600&amp;#34;&lt;/span&gt;]&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;initContainers&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;nginx&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;image&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;nginx:latest&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;ports&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;            &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;containerPort&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;80&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;              &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;protocol&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;TCP&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;restartPolicy&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Always&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;startupProbe&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;            &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;httpGet&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;              &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;path&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;/&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;              &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;port&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;80&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;            &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;initialDelaySeconds&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;5&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;            &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;periodSeconds&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;30&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;            &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;failureThreshold&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;10&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;            &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;timeoutSeconds&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;20&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;volumes&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;data&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;emptyDir&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;{}&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This results in 2/2 containers being ready and running, and from the events it can be inferred that the main application started only after nginx. But to confirm that it actually waited for the sidecar to become ready, let’s change the &lt;code&gt;startupProbe&lt;/code&gt; to an exec command:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;startupProbe&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;exec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;command&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- /bin/sh&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- -c&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- sleep 15&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;and run &lt;code&gt;kubectl get pods -w&lt;/code&gt; to watch in real time how the readiness of both containers changes only after a 15-second delay. Again, the events confirm that the main application starts after the sidecar.
That means that a &lt;code&gt;startupProbe&lt;/code&gt; with a correct &lt;code&gt;startupProbe.httpGet&lt;/code&gt; request delays the main application start until the sidecar is ready. It’s not optimal, but it works.&lt;/p&gt;
&lt;h2 id=&#34;what-about-the-poststart-lifecycle-hook&#34;&gt;What about the postStart lifecycle hook?&lt;/h2&gt;
&lt;p&gt;Fun fact: the &lt;code&gt;postStart&lt;/code&gt; lifecycle hook will also do the job, but I’d have to write my own mini shell script, which is even less efficient.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;initContainers&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;nginx&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;image&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;nginx:latest&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;restartPolicy&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Always&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;ports&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;containerPort&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;80&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;protocol&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;TCP&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;lifecycle&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;postStart&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;exec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;command&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;- /bin/sh&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;- -c&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;- |&lt;span style=&#34;color:#b44;font-style:italic&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#b44;font-style:italic&#34;&gt;            echo &amp;#34;Waiting for readiness at http://localhost:80&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#b44;font-style:italic&#34;&gt;            until curl -sf http://localhost:80; do
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#b44;font-style:italic&#34;&gt;              echo &amp;#34;Still waiting for http://localhost:80...&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#b44;font-style:italic&#34;&gt;              sleep 5
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#b44;font-style:italic&#34;&gt;            done
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#b44;font-style:italic&#34;&gt;            echo &amp;#34;Service is ready at http://localhost:80&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;            
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;liveness-probe&#34;&gt;Liveness probe&lt;/h2&gt;
&lt;p&gt;An interesting exercise would be to check the sidecar container behavior with a &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/configuration/liveness-readiness-startup-probes/&#34;&gt;liveness probe&lt;/a&gt;.
A liveness probe is configured similarly to a readiness probe, with one key difference: a failing liveness probe doesn’t affect the container’s readiness; instead, it causes the container to be restarted.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;livenessProbe&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;exec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;command&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- /bin/sh&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- -c&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- exit 1&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# this command always fails, so the container keeps getting restarted&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;periodSeconds&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;5&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;After adding a liveness probe configured just like the earlier readiness probe and checking the Pod’s events with &lt;code&gt;kubectl describe pod&lt;/code&gt;, it’s visible that the sidecar has a restart count above 0. Nevertheless, the main application is neither restarted nor influenced at all, even though (in our imaginary worst-case scenario) it can error out when the sidecar is not there serving requests.
What if I used a &lt;code&gt;livenessProbe&lt;/code&gt; without the &lt;code&gt;postStart&lt;/code&gt; lifecycle hook? Both containers would be ready immediately: at first, the behavior is no different from running without any additional probes, since the liveness probe doesn’t affect readiness at all. After a while, the sidecar starts being restarted, but that doesn’t influence the main container.&lt;/p&gt;
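&lt;p&gt;For completeness, the two probes can also be combined on the sidecar: a &lt;code&gt;startupProbe&lt;/code&gt; to gate the start of the main app container, plus a &lt;code&gt;livenessProbe&lt;/code&gt; to restart the sidecar if it stops responding later on. A minimal sketch (the probe timing values here are illustrative, not tuned recommendations):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-yaml&#34;&gt;initContainers:
  - name: nginx
    image: nginx:latest
    restartPolicy: Always   # restartPolicy: Always turns this init container into a sidecar
    ports:
      - containerPort: 80
        protocol: TCP
    startupProbe:           # gates the start of the main app container
      httpGet:
        path: /
        port: 80
    livenessProbe:          # restarts the sidecar if it stops serving
      httpGet:
        path: /
        port: 80
      periodSeconds: 5
&lt;/code&gt;&lt;/pre&gt;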
&lt;h2 id=&#34;findings-summary&#34;&gt;Findings summary&lt;/h2&gt;
&lt;p&gt;I’ll summarize the startup behavior in the table below:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Probe/Hook&lt;/th&gt;
&lt;th&gt;Sidecar starts before the main app?&lt;/th&gt;
&lt;th&gt;Main app waits for the sidecar to be ready?&lt;/th&gt;
&lt;th&gt;What if the check doesn’t pass?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;readinessProbe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;, but it’s almost in parallel (effectively &lt;strong&gt;no&lt;/strong&gt;)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;No&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Sidecar is not ready; main app continues running&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;livenessProbe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;, but it’s almost in parallel (effectively &lt;strong&gt;no&lt;/strong&gt;)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;No&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Sidecar is restarted, main app continues running&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;startupProbe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Main app is not started&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;postStart&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;, main app container starts after &lt;code&gt;postStart&lt;/code&gt; completes&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;, but you have to provide custom logic for that&lt;/td&gt;
&lt;td&gt;Main app is not started&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;To summarize: since sidecars are often a dependency of the main application, you may want to delay the start of the latter until the sidecar is healthy.
The ideal pattern is to start both containers simultaneously and have the app container logic tolerate a delayed or unavailable sidecar at all levels, but that’s not always possible. If delaying the start is what you need, you have to apply the right kind of customization to the Pod definition. Thankfully, it’s nice and quick, and you have the recipe ready above.&lt;/p&gt;
&lt;p&gt;Happy deploying!&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Gateway API v1.3.0: Advancements in Request Mirroring, CORS, Gateway Merging, and Retry Budgets</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/06/02/gateway-api-v1-3/</link>
      <pubDate>Mon, 02 Jun 2025 09:00:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/06/02/gateway-api-v1-3/</guid>
      <description>
        
        
        &lt;p&gt;&lt;img alt=&#34;Gateway API logo&#34; src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/06/02/gateway-api-v1-3/gateway-api-logo.svg&#34;&gt;&lt;/p&gt;
&lt;p&gt;Join us in the Kubernetes SIG Network community in celebrating the general
availability of &lt;a href=&#34;https://gateway-api.sigs.k8s.io/&#34;&gt;Gateway API&lt;/a&gt; v1.3.0! We are
also pleased to announce that there are already a number of conformant
implementations to try, made possible by postponing this blog
announcement. Version 1.3.0 of the API was released about a month ago on
April 24, 2025.&lt;/p&gt;
&lt;p&gt;Gateway API v1.3.0 brings a new feature to the &lt;em&gt;Standard&lt;/em&gt; channel
(Gateway API&#39;s GA release channel): &lt;em&gt;percentage-based request mirroring&lt;/em&gt;, and
introduces three new experimental features: cross-origin resource sharing (CORS)
filters, a standardized mechanism for listener and gateway merging, and retry
budgets.&lt;/p&gt;
&lt;p&gt;Also see the full
&lt;a href=&#34;https://github.com/kubernetes-sigs/gateway-api/blob/54df0a899c1c5c845dd3a80f05dcfdf65576f03c/CHANGELOG/1.3-CHANGELOG.md&#34;&gt;release notes&lt;/a&gt;
and applaud the
&lt;a href=&#34;https://github.com/kubernetes-sigs/gateway-api/blob/54df0a899c1c5c845dd3a80f05dcfdf65576f03c/CHANGELOG/1.3-TEAM.md&#34;&gt;v1.3.0 release team&lt;/a&gt;
next time you see them.&lt;/p&gt;
&lt;h2 id=&#34;graduation-to-standard-channel&#34;&gt;Graduation to Standard channel&lt;/h2&gt;
&lt;p&gt;Graduation to the Standard channel is a notable achievement for Gateway API
features, as inclusion in the Standard release channel denotes a high level of
confidence in the API surface and provides guarantees of backward compatibility.
Of course, as with any other Kubernetes API, Standard channel features can continue
to evolve with backward-compatible additions over time, and we (SIG Network)
certainly expect
further refinements and improvements in the future. For more information on how
all of this works, refer to the &lt;a href=&#34;https://gateway-api.sigs.k8s.io/concepts/versioning/&#34;&gt;Gateway API Versioning Policy&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&#34;percentage-based-request-mirroring&#34;&gt;Percentage-based request mirroring&lt;/h3&gt;
&lt;p&gt;Leads: &lt;a href=&#34;https://github.com/LiorLieberman&#34;&gt;Lior Lieberman&lt;/a&gt;, &lt;a href=&#34;https://github.com/jakebennert&#34;&gt;Jake Bennert&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;GEP-3171: &lt;a href=&#34;https://github.com/kubernetes-sigs/gateway-api/blob/main/geps/gep-3171/index.md&#34;&gt;Percentage-Based Request Mirroring&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Percentage-based request mirroring&lt;/em&gt; is an enhancement to the
existing support for &lt;a href=&#34;https://gateway-api.sigs.k8s.io/guides/http-request-mirroring/&#34;&gt;HTTP request mirroring&lt;/a&gt;, which allows HTTP requests to be duplicated to another backend using the
RequestMirror filter type.  Request mirroring is particularly useful in
blue-green deployments, where it can be used to assess the impact of request scaling on
application performance without affecting responses to clients.&lt;/p&gt;
&lt;p&gt;The previous mirroring capability worked on all the requests to a &lt;code&gt;backendRef&lt;/code&gt;.
Percentage-based request mirroring allows users to specify a subset of requests
they want to be mirrored, either by percentage or fraction. This can be
particularly useful when services are receiving a large volume of requests.
Instead of mirroring all of those requests, this new feature can be used to
mirror a smaller subset of them.&lt;/p&gt;
&lt;p&gt;Here&#39;s an example with 42% of the requests to &amp;quot;foo-v1&amp;quot; being mirrored to &amp;quot;foo-v2&amp;quot;:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;gateway.networking.k8s.io/v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;HTTPRoute&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;http-filter-mirror&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;labels&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;gateway&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;mirror-gateway&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;parentRefs&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;mirror-gateway&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;hostnames&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- mirror.example&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;rules&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;backendRefs&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;foo-v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;port&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;8080&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;filters&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;type&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;RequestMirror&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;requestMirror&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;backendRef&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;foo-v2&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;port&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;8080&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;percent&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;42&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# This value must be an integer.&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;You can also configure partial mirroring using a fraction. Here is an example
with 5 out of every 1000 requests to &amp;quot;foo-v1&amp;quot; being mirrored to &amp;quot;foo-v2&amp;quot;.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;rules&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;backendRefs&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;foo-v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;port&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;8080&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;filters&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;type&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;RequestMirror&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;requestMirror&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;backendRef&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;foo-v2&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;port&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;8080&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;fraction&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;numerator&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;5&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;          &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;denominator&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;1000&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;additions-to-experimental-channel&#34;&gt;Additions to Experimental channel&lt;/h2&gt;
&lt;p&gt;The Experimental channel is Gateway API&#39;s channel for experimenting with new
features and gaining confidence with them before allowing them to graduate to
Standard.  Please note: the Experimental channel may include features that are
changed or removed later.&lt;/p&gt;
&lt;p&gt;Starting in release v1.3.0, in an effort to distinguish Experimental channel
resources from Standard channel resources, any new experimental API kinds have the
prefix &amp;quot;&lt;strong&gt;X&lt;/strong&gt;&amp;quot;.  For the same reason, experimental resources are now added to the
API group &lt;code&gt;gateway.networking.x-k8s.io&lt;/code&gt; instead of &lt;code&gt;gateway.networking.k8s.io&lt;/code&gt;.
Bear in mind that new Experimental channel resources can coexist with
Standard channel resources, but migrating those resources to the Standard
channel will require recreating them with the Standard channel names and API
group (both of which lack the &amp;quot;x-k8s&amp;quot; designator or &amp;quot;X&amp;quot; prefix).&lt;/p&gt;
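&lt;p&gt;To illustrate the naming convention, an Experimental channel resource declares the &amp;quot;x-k8s&amp;quot; API group and an &amp;quot;X&amp;quot;-prefixed kind. (The &lt;code&gt;v1alpha1&lt;/code&gt; version shown below is an assumption for illustration; check the installed CRDs for the exact version served by your cluster.)&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;# Experimental channel: note the x-k8s.io group and the X prefix on the kind
apiVersion: gateway.networking.x-k8s.io/v1alpha1  # version is illustrative
kind: XListenerSet
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;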
&lt;p&gt;The v1.3 release introduces two new experimental API kinds: XBackendTrafficPolicy
and XListenerSet.  To be able to use experimental API kinds, you need to install
the Experimental channel Gateway API YAMLs from the locations listed below.&lt;/p&gt;
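&lt;p&gt;As a sketch, the full Experimental channel CRD set can be installed from the v1.3.0 release artifacts (this assumes the &lt;code&gt;experimental-install.yaml&lt;/code&gt; release asset name used by Gateway API releases):&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;# Installs the Experimental channel CRDs (a superset of the Standard channel)
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.3.0/experimental-install.yaml
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;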
&lt;h3 id=&#34;cors-filtering&#34;&gt;CORS filtering&lt;/h3&gt;
&lt;p&gt;Leads: &lt;a href=&#34;https://github.com/liangli&#34;&gt;Liang Li&lt;/a&gt;, &lt;a href=&#34;https://github.com/EyalPazz&#34;&gt;Eyal Pazz&lt;/a&gt;, &lt;a href=&#34;https://github.com/robscott&#34;&gt;Rob Scott&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;GEP-1767: &lt;a href=&#34;https://github.com/kubernetes-sigs/gateway-api/blob/main/geps/gep-1767/index.md&#34;&gt;CORS Filter&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Cross-origin resource sharing (CORS) is an HTTP-header based mechanism that allows
a web page to access restricted resources from a server on an origin (domain,
scheme, or port) different from the domain that served the web page. This feature
adds a new HTTPRoute &lt;code&gt;filter&lt;/code&gt; type, called &amp;quot;CORS&amp;quot;, to configure the handling of
cross-origin requests before the response is sent back to the client.&lt;/p&gt;
&lt;p&gt;To be able to use experimental CORS filtering, you need to install the
&lt;a href=&#34;https://github.com/kubernetes-sigs/gateway-api/blob/main/config/crd/experimental/gateway.networking.k8s.io_httproutes.yaml&#34;&gt;Experimental channel Gateway API HTTPRoute YAML&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Here&#39;s an example of a simple cross-origin configuration:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;gateway.networking.k8s.io/v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;HTTPRoute&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;http-route-cors&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;parentRefs&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;http-gateway&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;rules&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;matches&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;path&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;type&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;PathPrefix&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;value&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;/resource/foo&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;filters&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;type&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;CORS&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;cors&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;allowOrigins&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;- &#39;*&#39;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;allowMethods&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;- GET&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;- HEAD&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;- POST&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;allowHeaders&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;- Accept&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;- Accept-Language&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;- Content-Language&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;- Content-Type&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;- Range&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;backendRefs&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Service&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;http-route-cors&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;port&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;80&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;In this case, the Gateway returns an &lt;em&gt;origin header&lt;/em&gt; of &amp;quot;*&amp;quot;, which means that the
requested resource can be referenced from any origin, a &lt;em&gt;methods header&lt;/em&gt;
(&lt;code&gt;Access-Control-Allow-Methods&lt;/code&gt;) that permits the &lt;code&gt;GET&lt;/code&gt;, &lt;code&gt;HEAD&lt;/code&gt;, and &lt;code&gt;POST&lt;/code&gt;
verbs, and a &lt;em&gt;headers header&lt;/em&gt; allowing &lt;code&gt;Accept&lt;/code&gt;, &lt;code&gt;Accept-Language&lt;/code&gt;,
&lt;code&gt;Content-Language&lt;/code&gt;, &lt;code&gt;Content-Type&lt;/code&gt;, and &lt;code&gt;Range&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-text&#34; data-lang=&#34;text&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;HTTP/1.1 200 OK
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;Access-Control-Allow-Origin: *
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;Access-Control-Allow-Methods: GET, HEAD, POST
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;Access-Control-Allow-Headers: Accept,Accept-Language,Content-Language,Content-Type,Range
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The complete list of fields in the new CORS filter:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;allowOrigins&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;allowMethods&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;allowHeaders&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;allowCredentials&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;exposeHeaders&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;maxAge&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;See &lt;a href=&#34;https://fetch.spec.whatwg.org/#http-cors-protocol&#34;&gt;CORS protocol&lt;/a&gt; for details.&lt;/p&gt;
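&lt;p&gt;A sketch of a filter using all six fields follows; the values are illustrative rather than defaults, and &lt;code&gt;X-Request-Id&lt;/code&gt; is a hypothetical header:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;    filters:
    - type: CORS
      cors:
        allowOrigins:          # origins permitted to make cross-origin requests
        - https://app.example
        allowMethods:
        - GET
        - POST
        allowHeaders:
        - Content-Type
        allowCredentials: true # sets Access-Control-Allow-Credentials
        exposeHeaders:         # response headers exposed to browser scripts
        - X-Request-Id
        maxAge: 3600           # seconds a preflight response may be cached
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Note that a credentialed configuration should list explicit origins, since browsers reject &lt;code&gt;Access-Control-Allow-Origin: *&lt;/code&gt; for requests that include credentials.&lt;/p&gt;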
&lt;h3 id=&#34;XListenerSet&#34;&gt;XListenerSets (standardized mechanism for Listener and Gateway merging)&lt;/h3&gt;
&lt;p&gt;Lead: &lt;a href=&#34;https://github.com/dprotaso&#34;&gt;Dave Protasowski&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;GEP-1713: &lt;a href=&#34;https://github.com/kubernetes-sigs/gateway-api/pull/3213&#34;&gt;ListenerSets - Standard Mechanism to Merge Multiple Gateways&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;This release adds a new experimental API kind, XListenerSet, that allows a
shared list of &lt;em&gt;listeners&lt;/em&gt; to be attached to one or more parent Gateway(s).  In
addition, it expands upon the existing suggestion that Gateway API implementations
may merge configuration from multiple Gateway objects.  It also:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;adds a new field, &lt;code&gt;allowedListeners&lt;/code&gt;, to the &lt;code&gt;.spec&lt;/code&gt; of a Gateway. The
&lt;code&gt;allowedListeners&lt;/code&gt; field defines which Namespaces XListenerSets may be selected
from to attach to that Gateway: Same, All, None, or Selector-based.&lt;/li&gt;
&lt;li&gt;increases the previous maximum number of listeners (64), since XListenerSets
can attach additional listeners to a Gateway.&lt;/li&gt;
&lt;li&gt;allows the delegation of listener configuration, such as TLS, to applications in
other namespaces.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;To be able to use experimental XListenerSet, you need to install the
&lt;a href=&#34;https://github.com/kubernetes-sigs/gateway-api/blob/main/config/crd/experimental/gateway.networking.x-k8s.io_xlistenersets.yaml&#34;&gt;Experimental channel Gateway API XListenerSet YAML&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The following example shows a Gateway with an HTTP listener and two child HTTPS
XListenerSets with unique hostnames and certificates.  The combined set of
listeners attached to the Gateway thus includes the two additional HTTPS
listeners from the XListenerSets.  This example illustrates the
delegation of listener TLS config to application owners in different namespaces
(&amp;quot;store&amp;quot; and &amp;quot;app&amp;quot;).  The HTTPRoute has both the Gateway listener named &amp;quot;foo&amp;quot; and
one XListenerSet listener named &amp;quot;second&amp;quot; as &lt;code&gt;parentRefs&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;gateway.networking.k8s.io/v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Gateway&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;prod-external&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;namespace&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;infra&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;gatewayClassName&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;example&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;allowedListeners&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;from&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;All&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;listeners&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;foo&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;hostname&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;foo.com&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;protocol&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;HTTP&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;port&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;80&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#00f;font-weight:bold&#34;&gt;---&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;gateway.networking.x-k8s.io/v1alpha1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;XListenerSet&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;store&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;namespace&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;store&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;parentRef&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;prod-external&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;listeners&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;first&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;hostname&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;first.foo.com&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;protocol&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;HTTPS&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;port&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;443&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;tls&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;mode&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Terminate&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;certificateRefs&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Secret&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;group&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;first-workload-cert&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#00f;font-weight:bold&#34;&gt;---&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;gateway.networking.x-k8s.io/v1alpha1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;XListenerSet&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;app&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;namespace&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;app&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;parentRef&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;prod-external&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;listeners&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;second&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;hostname&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;second.foo.com&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;protocol&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;HTTPS&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;port&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;443&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;tls&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;mode&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Terminate&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;certificateRefs&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Secret&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;group&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;second-workload-cert&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#00f;font-weight:bold&#34;&gt;---&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;gateway.networking.k8s.io/v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;HTTPRoute&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;httproute-example&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;parentRefs&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;app&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;XListenerSet&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;sectionName&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;second&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;parent-gateway&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Gateway&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;sectionName&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;foo&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;...&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Each listener in a Gateway must have a unique combination of &lt;code&gt;port&lt;/code&gt; and &lt;code&gt;protocol&lt;/code&gt;
(and &lt;code&gt;hostname&lt;/code&gt;, if the protocol supports it) so that all listeners are
&lt;strong&gt;compatible&lt;/strong&gt; and do not conflict over which traffic they should receive.&lt;/p&gt;
&lt;p&gt;Furthermore, implementations can &lt;em&gt;merge&lt;/em&gt; separate Gateways into a single set of
listener addresses if all listeners across those Gateways are compatible.  The
management of merged listeners was under-specified in releases prior to v1.3.0.&lt;/p&gt;
&lt;p&gt;With the new feature, the specification for merging is expanded. Implementations
must treat each parent Gateway as having the merged list of all listeners from the
Gateway itself and from attached XListenerSets, and must validate that list exactly
as if it were defined in a single Gateway. Within a single Gateway, listeners are
ordered using the following precedence:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Single listeners (not part of an XListenerSet) first,&lt;/li&gt;
&lt;li&gt;Remaining listeners ordered by:
&lt;ul&gt;
&lt;li&gt;object creation time (oldest first), and if two listeners are defined in
objects that have the same timestamp, then&lt;/li&gt;
&lt;li&gt;alphabetically based on &amp;quot;{namespace}/{name of listener}&amp;quot;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id=&#34;XBackendTrafficPolicy&#34;&gt;Retry budgets (XBackendTrafficPolicy)&lt;/h3&gt;
&lt;p&gt;Leads: &lt;a href=&#34;https://github.com/ericdbishop&#34;&gt;Eric Bishop&lt;/a&gt;, &lt;a href=&#34;https://github.com/mikemorris&#34;&gt;Mike Morris&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;GEP-3388: &lt;a href=&#34;https://gateway-api.sigs.k8s.io/geps/gep-3388&#34;&gt;Retry Budgets&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;This feature allows you to configure a &lt;em&gt;retry budget&lt;/em&gt; across all endpoints
of a destination Service, limiting additional client-side retries once a configured
threshold is reached. When configuring the budget, you can specify the maximum
percentage of active requests that may be retries, as well as the interval over
which requests are counted when calculating that threshold. As part of developing
this specification, the existing experimental API kind BackendLBPolicy was replaced
by a new experimental API kind, XBackendTrafficPolicy, to reduce the proliferation
of policy resources with overlapping concerns.&lt;/p&gt;
&lt;p&gt;To be able to use experimental retry budgets, you need to install the
&lt;a href=&#34;https://github.com/kubernetes-sigs/gateway-api/blob/main/config/crd/experimental/gateway.networking.x-k8s.io_xbackendtrafficpolicies.yaml&#34;&gt;Experimental channel Gateway API XBackendTrafficPolicy yaml&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The following example shows an XBackendTrafficPolicy whose &lt;code&gt;retryConstraint&lt;/code&gt;
defines a budget that limits retries to at most 20% of requests over a 10-second
interval, while always permitting at least 3 retries per second.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;gateway.networking.x-k8s.io/v1alpha1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;XBackendTrafficPolicy&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;traffic-policy-example&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;retryConstraint&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;budget&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;percent&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;20&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;interval&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;10s&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;minRetryRate&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;count&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;3&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;interval&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;1s&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;...&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;try-it-out&#34;&gt;Try it out&lt;/h2&gt;
&lt;p&gt;Unlike other Kubernetes APIs, you don&#39;t need to upgrade to the latest version of
Kubernetes to get the latest version of Gateway API. As long as you&#39;re running
Kubernetes 1.26 or later, you&#39;ll be able to get up and running with this version
of Gateway API.&lt;/p&gt;
&lt;p&gt;To try out the API, follow the &lt;a href=&#34;https://gateway-api.sigs.k8s.io/guides/&#34;&gt;Getting Started Guide&lt;/a&gt;.
As of this writing, four implementations are already conformant with Gateway API
v1.3 experimental channel features. In alphabetical order:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/airlock/microgateway/releases/tag/4.6.0&#34;&gt;Airlock Microgateway 4.6&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/cilium/cilium&#34;&gt;Cilium main&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/envoyproxy/gateway/releases/tag/v1.4.0&#34;&gt;Envoy Gateway v1.4.0&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://istio.io&#34;&gt;Istio 1.27-dev&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
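&lt;p&gt;If the implementation you choose doesn&#39;t install the CRDs for you, the
experimental channel CRDs (which include XListenerSet and XBackendTrafficPolicy)
can typically be applied straight from the Gateway API release artifacts; check the
Getting Started Guide for the channel and version appropriate to your setup:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.3.0/experimental-install.yaml
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;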
&lt;h2 id=&#34;get-involved&#34;&gt;Get involved&lt;/h2&gt;
&lt;p&gt;Wondering when a feature will be added?  There are lots of opportunities to get
involved and help define the future of Kubernetes routing APIs for both ingress
and service mesh.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Check out the &lt;a href=&#34;https://gateway-api.sigs.k8s.io/guides&#34;&gt;user guides&lt;/a&gt; to see what use-cases can be addressed.&lt;/li&gt;
&lt;li&gt;Try out one of the &lt;a href=&#34;https://gateway-api.sigs.k8s.io/implementations/&#34;&gt;existing Gateway controllers&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Or &lt;a href=&#34;https://gateway-api.sigs.k8s.io/contributing/&#34;&gt;join us in the community&lt;/a&gt;
and help us build the future of Gateway API together!&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The maintainers would like to thank &lt;em&gt;everyone&lt;/em&gt; who&#39;s contributed to Gateway
API, whether in the form of commits to the repo, discussion, ideas, or general
support. We could never have made this kind of progress without the support of
this dedicated and active community.&lt;/p&gt;
&lt;h2 id=&#34;related-kubernetes-blog-articles&#34;&gt;Related Kubernetes blog articles&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2024/11/21/gateway-api-v1-2/&#34;&gt;Gateway API v1.2: WebSockets, Timeouts, Retries, and More&lt;/a&gt;
(November 2024)&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2024/05/09/gateway-api-v1-1/&#34;&gt;Gateway API v1.1: Service mesh, GRPCRoute, and a whole lot more&lt;/a&gt;
(May 2024)&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2023/11/28/gateway-api-ga/&#34;&gt;New Experimental Features in Gateway API v1.0&lt;/a&gt;
(November 2023)&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2023/10/31/gateway-api-ga/&#34;&gt;Gateway API v1.0: GA Release&lt;/a&gt;
(October 2023)&lt;/li&gt;
&lt;/ul&gt;

      </description>
    </item>
    
    <item>
      <title>Kubernetes v1.33: In-Place Pod Resize Graduated to Beta</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/05/16/kubernetes-v1-33-in-place-pod-resize-beta/</link>
      <pubDate>Fri, 16 May 2025 10:30:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/05/16/kubernetes-v1-33-in-place-pod-resize-beta/</guid>
      <description>
        
        
        &lt;p&gt;On behalf of the Kubernetes project, I am excited to announce that the &lt;strong&gt;in-place Pod resize&lt;/strong&gt; feature (also known as In-Place Pod Vertical Scaling), first introduced as alpha in Kubernetes v1.27, has graduated to &lt;strong&gt;Beta&lt;/strong&gt; and will be enabled by default in the Kubernetes v1.33 release! This marks a significant milestone in making resource management for Kubernetes workloads more flexible and less disruptive.&lt;/p&gt;
&lt;h2 id=&#34;what-is-in-place-pod-resize&#34;&gt;What is in-place Pod resize?&lt;/h2&gt;
&lt;p&gt;Traditionally, changing the CPU or memory resources allocated to a container required restarting the Pod. While acceptable for many stateless applications, this could be disruptive for stateful services, batch jobs, or any workloads sensitive to restarts.&lt;/p&gt;
&lt;p&gt;In-place Pod resizing allows you to change the CPU and memory requests and limits assigned to containers within a &lt;em&gt;running&lt;/em&gt; Pod, often without requiring a container restart.&lt;/p&gt;
&lt;p&gt;Here&#39;s the core idea:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;spec.containers[*].resources&lt;/code&gt; field in a Pod specification now represents the &lt;em&gt;desired&lt;/em&gt; resources and is mutable for CPU and memory.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;status.containerStatuses[*].resources&lt;/code&gt; field reflects the &lt;em&gt;actual&lt;/em&gt; resources currently configured on a running container.&lt;/li&gt;
&lt;li&gt;You can trigger a resize by updating the desired resources in the Pod spec via the new &lt;code&gt;resize&lt;/code&gt; subresource.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You can try it out on a v1.33 Kubernetes cluster by using kubectl to edit a Pod (requires &lt;code&gt;kubectl&lt;/code&gt; v1.32+):&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl edit pod &amp;lt;pod-name&amp;gt; --subresource resize
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;For detailed usage instructions and examples, please refer to the official Kubernetes documentation:
&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/tasks/configure-pod-container/resize-container-resources/&#34;&gt;Resize CPU and Memory Resources assigned to Containers&lt;/a&gt;.&lt;/p&gt;
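&lt;p&gt;A resize can also be applied non-interactively with &lt;code&gt;kubectl patch&lt;/code&gt; against the same subresource. The sketch below is illustrative; the Pod name &lt;code&gt;my-pod&lt;/code&gt;, the container name &lt;code&gt;app&lt;/code&gt;, and the resource values are placeholders:&lt;/p&gt;

```shell
# Raise the CPU request and limit of container "app" in place
# via the Pod's resize subresource (kubectl v1.32+)
kubectl patch pod my-pod --subresource resize --patch \
  '{"spec":{"containers":[{"name":"app","resources":{"requests":{"cpu":"800m"},"limits":{"cpu":"800m"}}}]}}'
```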
&lt;h2 id=&#34;why-does-in-place-pod-resize-matter&#34;&gt;Why does in-place Pod resize matter?&lt;/h2&gt;
&lt;p&gt;While Kubernetes excels at scaling workloads horizontally (adding or removing replicas), in-place Pod resizing unlocks several key benefits for vertical scaling:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Reduced Disruption:&lt;/strong&gt; Stateful applications, long-running batch jobs, and sensitive workloads can have their resources adjusted without suffering the downtime or state loss associated with a Pod restart.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Improved Resource Utilization:&lt;/strong&gt; Scale down over-provisioned Pods without disruption, freeing up resources in the cluster. Conversely, provide more resources to Pods under heavy load without needing a restart.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Faster Scaling:&lt;/strong&gt; Address transient resource needs more quickly. For example, Java applications often need more CPU during startup than during steady-state operation. Start with higher CPU and resize down later.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;what-s-changed-between-alpha-and-beta&#34;&gt;What&#39;s changed between Alpha and Beta?&lt;/h2&gt;
&lt;p&gt;Since the alpha release in v1.27, significant work has gone into maturing the feature, improving its stability, and refining the user experience based on feedback and further development. Here are the key changes:&lt;/p&gt;
&lt;h3 id=&#34;notable-user-facing-changes&#34;&gt;Notable user-facing changes&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;resize&lt;/code&gt; Subresource:&lt;/strong&gt; Modifying Pod resources must now be done via the Pod&#39;s &lt;code&gt;resize&lt;/code&gt; subresource (&lt;code&gt;kubectl patch pod &amp;lt;name&amp;gt; --subresource resize ...&lt;/code&gt;). &lt;code&gt;kubectl&lt;/code&gt; versions v1.32+ support this argument.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Resize Status via Conditions:&lt;/strong&gt; The old &lt;code&gt;status.resize&lt;/code&gt; field is deprecated. The status of a resize operation is now exposed via two Pod conditions:
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;PodResizePending&lt;/code&gt;: Indicates the Kubelet cannot grant the resize immediately (e.g., &lt;code&gt;reason: Deferred&lt;/code&gt; if temporarily unable, &lt;code&gt;reason: Infeasible&lt;/code&gt; if impossible on the node).&lt;/li&gt;
&lt;li&gt;&lt;code&gt;PodResizeInProgress&lt;/code&gt;: Indicates the resize is accepted and being applied. Errors encountered during this phase are now reported in this condition&#39;s message with &lt;code&gt;reason: Error&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Sidecar Support:&lt;/strong&gt; Resizing &lt;a class=&#39;glossary-tooltip&#39; title=&#39;An auxiliary container that stays running throughout the lifecycle of a Pod.&#39; data-toggle=&#39;tooltip&#39; data-placement=&#39;top&#39; href=&#39;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/workloads/pods/sidecar-containers/&#39; target=&#39;_blank&#39; aria-label=&#39;sidecar containers&#39;&gt;sidecar containers&lt;/a&gt; in-place is now supported.&lt;/li&gt;
&lt;/ul&gt;
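&lt;p&gt;To observe these conditions on a running Pod, you can filter them with a JSONPath query; the Pod name below is a placeholder:&lt;/p&gt;

```shell
# Print any resize-related conditions the Kubelet has reported
kubectl get pod my-pod -o jsonpath='{.status.conditions[?(@.type=="PodResizePending")]}'
kubectl get pod my-pod -o jsonpath='{.status.conditions[?(@.type=="PodResizeInProgress")]}'
```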
&lt;h3 id=&#34;stability-and-reliability-enhancements&#34;&gt;Stability and reliability enhancements&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Refined Allocated Resources Management:&lt;/strong&gt; The allocation management logic within the Kubelet was significantly reworked, making it more consistent and robust. The changes eliminated whole classes of bugs and greatly improved the reliability of in-place Pod resize.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Improved Checkpointing &amp;amp; State Tracking:&lt;/strong&gt; A more robust system for tracking &amp;quot;allocated&amp;quot; and &amp;quot;actuated&amp;quot; resources was implemented, using new checkpoint files (&lt;code&gt;allocated_pods_state&lt;/code&gt;, &lt;code&gt;actuated_pods_state&lt;/code&gt;) to reliably manage resize state across Kubelet restarts and handle edge cases where runtime-reported resources differ from requested ones. Several bugs related to checkpointing and state restoration were fixed. Checkpointing efficiency was also improved.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Faster Resize Detection:&lt;/strong&gt; Enhancements to the Kubelet&#39;s Pod Lifecycle Event Generator (PLEG) allow the Kubelet to respond to and complete resizes much more quickly.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Enhanced CRI Integration:&lt;/strong&gt; A new &lt;code&gt;UpdatePodSandboxResources&lt;/code&gt; CRI call was added to better inform runtimes and plugins (like NRI) about Pod-level resource changes.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Numerous Bug Fixes:&lt;/strong&gt; Addressed issues related to systemd cgroup drivers, handling of containers without limits, CPU minimum share calculations, container restart backoffs, error propagation, test stability, and more.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;what-s-next&#34;&gt;What&#39;s next?&lt;/h2&gt;
&lt;p&gt;Graduating to Beta means the feature is ready for broader adoption, but development doesn&#39;t stop here! Here&#39;s what the community is focusing on next:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Stability and Productionization:&lt;/strong&gt; Continued focus on hardening the feature, improving performance, and ensuring it is robust for production environments.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Addressing Limitations:&lt;/strong&gt; Working towards relaxing some of the current limitations noted in the documentation, such as allowing memory limit decreases.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/workloads/autoscaling/#scaling-workloads-vertically&#34;&gt;VerticalPodAutoscaler&lt;/a&gt; (VPA) Integration:&lt;/strong&gt; Work to enable VPA to leverage in-place Pod resize is already underway. A new &lt;code&gt;InPlaceOrRecreate&lt;/code&gt; update mode will allow it to attempt non-disruptive resizes first, or fall back to recreation if needed. This will allow users to benefit from VPA&#39;s recommendations with significantly less disruption.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;User Feedback:&lt;/strong&gt; Gathering feedback from users adopting the beta feature is crucial for prioritizing further enhancements and addressing any uncovered issues or bugs.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;getting-started-and-providing-feedback&#34;&gt;Getting started and providing feedback&lt;/h2&gt;
&lt;p&gt;With the &lt;code&gt;InPlacePodVerticalScaling&lt;/code&gt; feature gate enabled by default in v1.33, you can start experimenting with in-place Pod resizing right away!&lt;/p&gt;
&lt;p&gt;Refer to the &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/tasks/configure-pod-container/resize-container-resources/&#34;&gt;documentation&lt;/a&gt; for detailed guides and examples.&lt;/p&gt;
&lt;p&gt;As this feature moves through Beta, your feedback is invaluable. Please report any issues or share your experiences via the standard Kubernetes communication channels (GitHub issues, mailing lists, Slack). You can also review the &lt;a href=&#34;https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/1287-in-place-update-pod-resources&#34;&gt;KEP-1287: In-place Update of Pod Resources&lt;/a&gt; for the full in-depth design details.&lt;/p&gt;
&lt;p&gt;We look forward to seeing how the community leverages in-place Pod resize to build more efficient and resilient applications on Kubernetes!&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Announcing etcd v3.6.0</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/05/15/announcing-etcd-3.6/</link>
      <pubDate>Thu, 15 May 2025 16:00:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/05/15/announcing-etcd-3.6/</guid>
      <description>
        
        
        &lt;p&gt;&lt;em&gt;This announcement originally &lt;a href=&#34;https://etcd.io/blog/2025/announcing-etcd-3.6/&#34;&gt;appeared&lt;/a&gt; on the etcd blog.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Today, we are releasing &lt;a href=&#34;https://github.com/etcd-io/etcd/releases/tag/v3.6.0&#34;&gt;etcd v3.6.0&lt;/a&gt;, the first minor release since etcd v3.5.0 on June 15, 2021. This release
introduces several new features, makes significant progress on long-standing efforts like downgrade support and
migration to v3store, and addresses numerous critical &amp;amp; major issues. It also includes major optimizations in
memory usage, improving efficiency and performance.&lt;/p&gt;
&lt;p&gt;In addition to the features of v3.6.0, etcd has joined Kubernetes as a SIG (sig-etcd), enabling us to improve
project sustainability. We&#39;ve introduced systematic robustness testing to ensure correctness and reliability.
Through the etcd-operator Working Group, we plan to improve usability as well.&lt;/p&gt;
&lt;p&gt;What follows are the most significant changes introduced in etcd v3.6.0, along with a discussion of the
roadmap for future development. For a detailed list of changes, please refer to the &lt;a href=&#34;https://github.com/etcd-io/etcd/blob/main/CHANGELOG/CHANGELOG-3.6.md&#34;&gt;CHANGELOG-3.6&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;A heartfelt thank you to all the contributors who made this release possible!&lt;/p&gt;
&lt;h2 id=&#34;security&#34;&gt;Security&lt;/h2&gt;
&lt;p&gt;etcd takes security seriously. To enhance software security in v3.6.0, we have improved our workflow checks by
integrating &lt;code&gt;govulncheck&lt;/code&gt; to scan the source code and &lt;code&gt;trivy&lt;/code&gt; to scan container images. These improvements
have also been backported to supported stable releases.&lt;/p&gt;
&lt;p&gt;etcd continues to follow the &lt;a href=&#34;https://github.com/etcd-io/etcd/blob/main/security/security-release-process.md&#34;&gt;Security Release Process&lt;/a&gt; to ensure vulnerabilities are properly managed and addressed.&lt;/p&gt;
&lt;h2 id=&#34;features&#34;&gt;Features&lt;/h2&gt;
&lt;h3 id=&#34;migration-to-v3store&#34;&gt;Migration to v3store&lt;/h3&gt;
&lt;p&gt;The v2store has been deprecated since etcd v3.4 but could still be enabled via &lt;code&gt;--enable-v2&lt;/code&gt;. It remained the source of
truth for membership data. In etcd v3.6.0, v2store can no longer be enabled as the &lt;code&gt;--enable-v2&lt;/code&gt; flag has been removed,
and v3store has become the sole source of truth for membership data.&lt;/p&gt;
&lt;p&gt;While v2store still exists in v3.6.0, etcd will fail to start if it contains any data other than membership information.
To assist with migration, etcd v3.5.18+ provides the &lt;code&gt;etcdutl check v2store&lt;/code&gt; command, which verifies that v2store
contains only membership data (see &lt;a href=&#34;https://github.com/etcd-io/etcd/pull/19113&#34;&gt;PR 19113&lt;/a&gt;).&lt;/p&gt;
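&lt;p&gt;For example, before upgrading you might run the check against a stopped member&#39;s data directory; the path below is a placeholder, and exact flags may vary by patch version:&lt;/p&gt;

```shell
# Verify that v2store holds only membership data before moving to v3.6
etcdutl check v2store --data-dir /var/lib/etcd
```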
&lt;p&gt;Compared to v2store, v3store offers better performance and transactional support. It is also the actively maintained
storage engine moving forward.&lt;/p&gt;
&lt;p&gt;The removal of v2store is still ongoing and is tracked in &lt;a href=&#34;https://github.com/etcd-io/etcd/issues/12913&#34;&gt;issues/12913&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&#34;downgrade&#34;&gt;Downgrade&lt;/h3&gt;
&lt;p&gt;etcd v3.6.0 is the first version to fully support downgrade. The effort for this downgrade task spans
both versions 3.5 and 3.6, and all related work is tracked in &lt;a href=&#34;https://github.com/etcd-io/etcd/issues/11716&#34;&gt;issues/11716&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;At a high level, the process involves migrating the data schema to the target version (e.g., v3.5),
followed by a rolling downgrade.&lt;/p&gt;
&lt;p&gt;Ensure the cluster is healthy and take a snapshot backup. Then verify that the target version is a valid downgrade:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;$ etcdctl downgrade validate 3.5
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;Downgrade validate success, cluster version 3.6
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;If the downgrade is valid, enable downgrade mode:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;$ etcdctl downgrade &lt;span style=&#34;color:#a2f&#34;&gt;enable&lt;/span&gt; 3.5
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;Downgrade &lt;span style=&#34;color:#a2f&#34;&gt;enable&lt;/span&gt; success, cluster version 3.6
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;etcd will then migrate the data schema in the background. Once complete, proceed with the rolling downgrade.&lt;/p&gt;
&lt;p&gt;For details, refer to the &lt;a href=&#34;https://etcd.io/docs/v3.6/downgrades/downgrade_3_6/&#34;&gt;Downgrade-3.6&lt;/a&gt; guide.&lt;/p&gt;
&lt;h3 id=&#34;feature-gates&#34;&gt;Feature gates&lt;/h3&gt;
&lt;p&gt;In etcd v3.6.0, we introduced Kubernetes-style feature gates for managing new features. Previously, we
indicated unstable features through the &lt;code&gt;--experimental&lt;/code&gt; prefix in feature flag names. Dropping the
prefix once a feature stabilized renamed the flag, which was a breaking change for existing configurations.
Now, features start in Alpha, progress to Beta, then GA, or get deprecated. This ensures a much smoother
upgrade and downgrade experience for users.&lt;/p&gt;
&lt;p&gt;See &lt;a href=&#34;https://etcd.io/docs/v3.6/feature-gates/&#34;&gt;feature-gates&lt;/a&gt; for details.&lt;/p&gt;
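&lt;p&gt;Feature gates are set at startup through a single flag; the gate name below is just one example, so check the feature-gates documentation for the gates available in your version:&lt;/p&gt;

```shell
# Enable a non-default feature gate when starting etcd
etcd --feature-gates=StopGRPCServiceOnDefrag=true
```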
&lt;h3 id=&#34;livezreadyz-checks&#34;&gt;livez / readyz checks&lt;/h3&gt;
&lt;p&gt;etcd now supports &lt;code&gt;/livez&lt;/code&gt; and &lt;code&gt;/readyz&lt;/code&gt; endpoints, aligning with Kubernetes&#39; Liveness and Readiness probes.
&lt;code&gt;/livez&lt;/code&gt; indicates whether the etcd instance is alive, while &lt;code&gt;/readyz&lt;/code&gt; indicates when it is ready to serve requests.
This feature has also been backported to release-3.5 (starting from v3.5.11) and release-3.4 (starting from v3.4.29).
See &lt;a href=&#34;https://etcd.io/docs/v3.6/op-guide/monitoring/&#34;&gt;livez/readyz&lt;/a&gt; for details.&lt;/p&gt;
&lt;p&gt;The existing &lt;code&gt;/health&lt;/code&gt; endpoint remains functional. &lt;code&gt;/livez&lt;/code&gt; is similar to &lt;code&gt;/health?serializable=true&lt;/code&gt;, while
&lt;code&gt;/readyz&lt;/code&gt; is similar to &lt;code&gt;/health&lt;/code&gt; or &lt;code&gt;/health?serializable=false&lt;/code&gt;. However, the &lt;code&gt;/livez&lt;/code&gt; and &lt;code&gt;/readyz&lt;/code&gt;
endpoints provide clearer semantics and are easier to understand.&lt;/p&gt;
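&lt;p&gt;Assuming an etcd instance listening on the default client port, the new endpoints can be probed directly:&lt;/p&gt;

```shell
# Both endpoints return HTTP 200 when the corresponding check passes
curl -i http://localhost:2379/livez
curl -i http://localhost:2379/readyz
```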
&lt;h3 id=&#34;v3discovery&#34;&gt;v3discovery&lt;/h3&gt;
&lt;p&gt;In etcd v3.6.0, the new discovery protocol &lt;a href=&#34;https://etcd.io/docs/v3.6/dev-internal/discovery_protocol/&#34;&gt;v3discovery&lt;/a&gt; was introduced, based on clientv3.
It facilitates the discovery of all cluster members during the bootstrap phase.&lt;/p&gt;
&lt;p&gt;The previous &lt;a href=&#34;https://etcd.io/docs/v3.5/dev-internal/discovery_protocol/&#34;&gt;v2discovery&lt;/a&gt; protocol, based on clientv2, has been deprecated. Additionally,
the public discovery service at &lt;a href=&#34;https://discovery.etcd.io/&#34;&gt;https://discovery.etcd.io/&lt;/a&gt;, which relied on v2discovery, is no longer maintained.&lt;/p&gt;
&lt;h2 id=&#34;performance&#34;&gt;Performance&lt;/h2&gt;
&lt;h3 id=&#34;memory&#34;&gt;Memory&lt;/h3&gt;
&lt;p&gt;In this release, we reduced average memory consumption by at least 50% (see Figure 1). This improvement is primarily due to two changes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The default value of &lt;code&gt;--snapshot-count&lt;/code&gt; has been reduced from 100,000 in v3.5 to 10,000 in v3.6. As a result, etcd v3.6 now retains only about 10% of the history records compared to v3.5.&lt;/li&gt;
&lt;li&gt;Raft history is compacted more frequently, as introduced in &lt;a href=&#34;https://github.com/etcd-io/etcd/pull/18825&#34;&gt;PR/18825&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;


&lt;figure&gt;
    &lt;img src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/05/15/announcing-etcd-3.6/figure-1.png&#34;
         alt=&#34;Diagram of memory usage&#34;/&gt; 
&lt;/figure&gt;
&lt;p&gt;&lt;em&gt;&lt;strong&gt;Figure 1:&lt;/strong&gt; Memory usage comparison between etcd v3.5.20 and v3.6.0-rc.2 under different read/write ratios.
Each subplot shows the memory usage over time with a specific read/write ratio. The red line represents etcd
v3.5.20, while the teal line represents v3.6.0-rc.2. Across all tested ratios, v3.6.0-rc.2 exhibits lower and
more stable memory usage.&lt;/em&gt;&lt;/p&gt;
&lt;h3 id=&#34;throughput&#34;&gt;Throughput&lt;/h3&gt;
&lt;p&gt;Compared to v3.5, etcd v3.6 delivers an average performance improvement of approximately 10%
in both read and write throughput (see Figure 2, 3, 4 and 5). This improvement is not attributed to
any single major change, but rather the cumulative effect of multiple minor enhancements. One such
example is the optimization of the free page queries introduced in &lt;a href=&#34;https://github.com/etcd-io/bbolt/pull/419&#34;&gt;PR/419&lt;/a&gt;.&lt;/p&gt;


&lt;figure&gt;
    &lt;img src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/05/15/announcing-etcd-3.6/figure-2.png&#34;
         alt=&#34;etcd read transaction performance with a high write ratio&#34;/&gt; 
&lt;/figure&gt;
&lt;p&gt;&lt;em&gt;&lt;strong&gt;Figure 2:&lt;/strong&gt; Read throughput comparison between etcd v3.5.20 and v3.6.0-rc.2 under a high write ratio. The
read/write ratio is 0.0078, meaning 1 read per 128 writes. The right bar shows the percentage improvement
in read throughput of v3.6.0-rc.2 over v3.5.20, ranging from 3.21% to 25.59%.&lt;/em&gt;&lt;/p&gt;


&lt;figure&gt;
    &lt;img src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/05/15/announcing-etcd-3.6/figure-3.png&#34;
         alt=&#34;etcd read transaction performance with a high read ratio&#34;/&gt; 
&lt;/figure&gt;
&lt;p&gt;&lt;em&gt;&lt;strong&gt;Figure 3:&lt;/strong&gt; Read throughput comparison between etcd v3.5.20 and v3.6.0-rc.2 under a high read ratio.
The read/write ratio is 8, meaning 8 reads per write. The right bar shows the percentage improvement in
read throughput of v3.6.0-rc.2 over v3.5.20, ranging from 4.38% to 27.20%.&lt;/em&gt;&lt;/p&gt;


&lt;figure&gt;
    &lt;img src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/05/15/announcing-etcd-3.6/figure-4.png&#34;
         alt=&#34;etcd write transaction performance with a high write ratio&#34;/&gt; 
&lt;/figure&gt;
&lt;p&gt;&lt;em&gt;&lt;strong&gt;Figure 4:&lt;/strong&gt; Write throughput comparison between etcd v3.5.20 and v3.6.0-rc.2 under a high write ratio. The
read/write ratio is 0.0078, meaning 1 read per 128 writes. The right bar shows the percentage improvement
in write throughput of v3.6.0-rc.2 over v3.5.20, ranging from 2.95% to 24.24%.&lt;/em&gt;&lt;/p&gt;


&lt;figure&gt;
    &lt;img src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/05/15/announcing-etcd-3.6/figure-5.png&#34;
         alt=&#34;etcd write transaction performance with a high read ratio&#34;/&gt; 
&lt;/figure&gt;
&lt;p&gt;&lt;em&gt;&lt;strong&gt;Figure 5:&lt;/strong&gt; Write throughput comparison between etcd v3.5.20 and v3.6.0-rc.2 under a high read ratio.
The read/write ratio is 8, meaning 8 reads per write. The right bar shows the percentage improvement in
write throughput of v3.6.0-rc.2 over v3.5.20, ranging from 3.86% to 28.37%.&lt;/em&gt;&lt;/p&gt;
&lt;h2 id=&#34;breaking-changes&#34;&gt;Breaking changes&lt;/h2&gt;
&lt;p&gt;This section highlights a few notable breaking changes. For a complete list, please refer to
the &lt;a href=&#34;https://etcd.io/docs/v3.6/upgrades/upgrade_3_6/&#34;&gt;Upgrade etcd from v3.5 to v3.6&lt;/a&gt; and the &lt;a href=&#34;https://github.com/etcd-io/etcd/blob/main/CHANGELOG/CHANGELOG-3.6.md&#34;&gt;CHANGELOG-3.6&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&#34;old-binaries-are-incompatible-with-new-schema-versions&#34;&gt;Old binaries are incompatible with new schema versions&lt;/h3&gt;
&lt;p&gt;Old etcd binaries are not compatible with newer data schema versions. For example, etcd 3.5 cannot start with
data created by etcd 3.6, and etcd 3.4 cannot start with data created by either 3.5 or 3.6.&lt;/p&gt;
&lt;p&gt;When downgrading etcd, it&#39;s important to follow the documented downgrade procedure. Simply replacing
the binary or image will result in compatibility issues.&lt;/p&gt;
&lt;h3 id=&#34;peer-endpoints-no-longer-serve-client-requests&#34;&gt;Peer endpoints no longer serve client requests&lt;/h3&gt;
&lt;p&gt;Client endpoints (&lt;code&gt;--advertise-client-urls&lt;/code&gt;) are intended to serve client requests only, while peer
endpoints (&lt;code&gt;--initial-advertise-peer-urls&lt;/code&gt;) are intended solely for peer communication. However, due to an implementation
oversight, the peer endpoints were also able to handle client requests in etcd 3.4 and 3.5. This behavior was misleading and
encouraged incorrect usage patterns. In etcd 3.6, this misleading behavior was corrected via &lt;a href=&#34;https://github.com/etcd-io/etcd/pull/13565&#34;&gt;PR/13565&lt;/a&gt;; peer endpoints
no longer serve client requests.&lt;/p&gt;
&lt;h3 id=&#34;clear-boundary-between-etcdctl-and-etcdutl&#34;&gt;Clear boundary between etcdctl and etcdutl&lt;/h3&gt;
&lt;p&gt;Both &lt;code&gt;etcdctl&lt;/code&gt; and &lt;code&gt;etcdutl&lt;/code&gt; are command line tools. &lt;code&gt;etcdutl&lt;/code&gt; is an offline utility designed to operate directly on
etcd data files, while &lt;code&gt;etcdctl&lt;/code&gt; is an online tool that interacts with etcd over a network. Previously, there were some
overlapping functionalities between the two, but these overlaps were removed in 3.6.0.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Removed &lt;code&gt;etcdctl defrag --data-dir&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;etcdctl defrag&lt;/code&gt; command only support online defragmentation and no longer supports offline defragmentation.
To perform offline defragmentation, use the &lt;code&gt;etcdutl defrag --data-dir&lt;/code&gt; command instead.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Removed &lt;code&gt;etcdctl snapshot status&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;etcdctl&lt;/code&gt; no longer supports retrieving the status of a snapshot. Use the &lt;code&gt;etcdutl snapshot status&lt;/code&gt; command instead.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Removed &lt;code&gt;etcdctl snapshot restore&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;etcdctl&lt;/code&gt; no longer supports restoring from a snapshot. Use the &lt;code&gt;etcdutl snapshot restore&lt;/code&gt; command instead.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
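&lt;p&gt;In short, the offline equivalents of the removed commands now live in &lt;code&gt;etcdutl&lt;/code&gt;; the data directory and snapshot file names below are placeholders:&lt;/p&gt;

```shell
# Offline defragmentation of a (stopped) member's data directory
etcdutl defrag --data-dir /var/lib/etcd

# Inspect a snapshot file, then restore from it into a fresh data directory
etcdutl snapshot status snapshot.db
etcdutl snapshot restore snapshot.db --data-dir /var/lib/etcd-restored
```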
&lt;h2 id=&#34;critical-bug-fixes&#34;&gt;Critical bug fixes&lt;/h2&gt;
&lt;p&gt;Correctness has always been a top priority for the etcd project. In the process of developing 3.6.0, we found and
fixed a few notable bugs that could lead to data inconsistency in specific cases. These fixes have been backported
to previous releases, but we believe they deserve special mention here.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Data Inconsistency when Crashing Under Load&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Previously, when etcd was applying data, it would update the consistent-index first, followed by committing the
data. However, these operations were not atomic. If etcd crashed in between, it could lead to data inconsistency
(see &lt;a href=&#34;https://github.com/etcd-io/etcd/issues/13766&#34;&gt;issue/13766&lt;/a&gt;). The issue was introduced in v3.5.0, and fixed in v3.5.3 with &lt;a href=&#34;https://github.com/etcd-io/etcd/pull/13854&#34;&gt;PR/13854&lt;/a&gt;.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Durability API guarantee broken in single node cluster&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;When a client writes data and receives a success response, the data is expected to be persisted. However, the data might
be lost if etcd crashes immediately after sending the success response to the client. This was a legacy issue (see &lt;a href=&#34;https://github.com/etcd-io/etcd/issues/14370&#34;&gt;issue/14370&lt;/a&gt;)
affecting all previous releases. It was addressed in v3.4.21 and v3.5.5 with &lt;a href=&#34;https://github.com/etcd-io/etcd/pull/14400&#34;&gt;PR/14400&lt;/a&gt;, and fixed on the raft side in the
main branch (now release-3.6) with &lt;a href=&#34;https://github.com/etcd-io/etcd/pull/14413&#34;&gt;PR/14413&lt;/a&gt;.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Revision Inconsistency when Crashing During Defragmentation&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If etcd crashed during a defragmentation operation, upon restart it might reapply
some entries which had already been applied, leading to revision inconsistency
(see the discussions in &lt;a href=&#34;https://github.com/etcd-io/etcd/pull/14685&#34;&gt;PR/14685&lt;/a&gt;). The issue was introduced in v3.5.0, and fixed in v3.5.6 with &lt;a href=&#34;https://github.com/etcd-io/etcd/pull/14730&#34;&gt;PR/14730&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;upgrade-issue&#34;&gt;Upgrade issue&lt;/h2&gt;
&lt;p&gt;This section highlights a common issue (&lt;a href=&#34;https://github.com/etcd-io/etcd/issues/19557&#34;&gt;issues/19557&lt;/a&gt;) in the etcd v3.5 to v3.6 upgrade that may cause the upgrade
process to fail. For a complete upgrade guide, refer to &lt;a href=&#34;https://etcd.io/docs/v3.6/upgrades/upgrade_3_6/&#34;&gt;Upgrade etcd from v3.5 to v3.6&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The issue was introduced in etcd v3.5.1, and resolved in v3.5.20.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Key takeaway&lt;/strong&gt;: users are required to first upgrade to etcd v3.5.20 (or a higher patch version) before upgrading
to etcd v3.6.0; otherwise, the upgrade may fail.&lt;/p&gt;
&lt;p&gt;For more background and technical context, see &lt;a href=&#34;https://etcd.io/blog/2025/upgrade_from_3.5_to_3.6_issue/&#34;&gt;upgrade_from_3.5_to_3.6_issue&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;testing&#34;&gt;Testing&lt;/h2&gt;
&lt;p&gt;We introduced &lt;a href=&#34;https://github.com/etcd-io/etcd/tree/main/tests/robustness&#34;&gt;robustness testing&lt;/a&gt; to verify correctness, which has always been our top priority.
It plays traffic of various types and volumes against an etcd cluster, concurrently injects a random
failpoint, records all operations (including both requests and responses), and finally performs a
linearizability check. It also verifies that the &lt;a href=&#34;https://etcd.io/docs/v3.5/learning/api_guarantees/#watch-apis&#34;&gt;Watch API&lt;/a&gt; guarantees have not been violated.
The robustness tests increase our confidence in the quality of each etcd release.&lt;/p&gt;
&lt;p&gt;We have migrated most of the etcd workflow tests to Kubernetes&#39; Prow testing infrastructure to
take advantage of its benefits, such as dashboards for viewing test results and the ability
for contributors to rerun failed tests themselves.&lt;/p&gt;
&lt;h2 id=&#34;platforms&#34;&gt;Platforms&lt;/h2&gt;
&lt;p&gt;While retaining all existing supported platforms, we have promoted Linux/ARM64 to Tier 1 support.
For more details, please refer to &lt;a href=&#34;https://github.com/etcd-io/etcd/issues/15951&#34;&gt;issues/15951&lt;/a&gt;. For the complete list of supported platforms,
see &lt;a href=&#34;https://etcd.io/docs/v3.6/op-guide/supported-platform/&#34;&gt;supported-platform&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;dependencies&#34;&gt;Dependencies&lt;/h2&gt;
&lt;h3 id=&#34;dependency-bumping-guide&#34;&gt;Dependency bumping guide&lt;/h3&gt;
&lt;p&gt;We have published an official guide on how to bump dependencies for etcd’s main branch and stable releases.
It also covers how to update the Go version. For more details, please refer to &lt;a href=&#34;https://github.com/etcd-io/etcd/blob/main/Documentation/contributor-guide/dependency_management.md&#34;&gt;dependency_management&lt;/a&gt;.
With this guide available, any contributor can now help with dependency upgrades.&lt;/p&gt;
&lt;h3 id=&#34;core-dependency-updates&#34;&gt;Core Dependency Updates&lt;/h3&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/etcd-io/bbolt&#34;&gt;bbolt&lt;/a&gt; and &lt;a href=&#34;https://github.com/etcd-io/raft&#34;&gt;raft&lt;/a&gt; are two core dependencies of etcd.&lt;/p&gt;
&lt;p&gt;Both etcd v3.4 and v3.5 depend on bbolt v1.3, while etcd v3.6 depends on bbolt v1.4.&lt;/p&gt;
&lt;p&gt;For the release-3.4 and release-3.5 branches, raft is included in the etcd repository itself, so etcd v3.4 and v3.5
do not depend on an external raft module. Starting from etcd v3.6, raft was moved to a separate repository (&lt;a href=&#34;https://github.com/etcd-io/raft&#34;&gt;raft&lt;/a&gt;),
and the first standalone raft release is v3.6.0. As a result, etcd v3.6.0 depends on raft v3.6.0.&lt;/p&gt;
&lt;p&gt;Please see the table below for a summary:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;etcd versions&lt;/th&gt;
&lt;th&gt;bbolt versions&lt;/th&gt;
&lt;th&gt;raft versions&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;3.4.x&lt;/td&gt;
&lt;td&gt;v1.3.x&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3.5.x&lt;/td&gt;
&lt;td&gt;v1.3.x&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3.6.x&lt;/td&gt;
&lt;td&gt;v1.4.x&lt;/td&gt;
&lt;td&gt;v3.6.x&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 id=&#34;grpc-gateway-v2&#34;&gt;grpc-gateway@v2&lt;/h3&gt;
&lt;p&gt;We upgraded &lt;a href=&#34;https://github.com/grpc-ecosystem/grpc-gateway&#34;&gt;grpc-gateway&lt;/a&gt; from v1 to v2 via &lt;a href=&#34;https://github.com/etcd-io/etcd/pull/16595&#34;&gt;PR/16595&lt;/a&gt; in etcd v3.6.0. This is a major step toward
migrating to &lt;a href=&#34;https://github.com/protocolbuffers/protobuf-go&#34;&gt;protobuf-go&lt;/a&gt;, the second major version of the Go protocol buffer API implementation.&lt;/p&gt;
&lt;p&gt;grpc-gateway@v2 is designed to work with &lt;a href=&#34;https://github.com/protocolbuffers/protobuf-go&#34;&gt;protobuf-go&lt;/a&gt;. However, etcd v3.6 still depends on the deprecated
&lt;a href=&#34;https://github.com/gogo/protobuf&#34;&gt;gogo/protobuf&lt;/a&gt;, which is a protocol buffer v1 implementation. To resolve this incompatibility,
we applied a &lt;a href=&#34;https://github.com/etcd-io/etcd/blob/158b9e0d468d310c3edf4cf13f2458c51b0406fa/scripts/genproto.sh#L151-L184&#34;&gt;patch&lt;/a&gt; to the generated *.pb.gw.go files to convert v1 messages to v2 messages.&lt;/p&gt;
&lt;h3 id=&#34;grpc-ecosystem-go-grpc-middleware-providers-prometheus&#34;&gt;grpc-ecosystem/go-grpc-middleware/providers/prometheus&lt;/h3&gt;
&lt;p&gt;We switched from the deprecated (and archived) &lt;a href=&#34;https://github.com/grpc-ecosystem/go-grpc-prometheus&#34;&gt;grpc-ecosystem/go-grpc-prometheus&lt;/a&gt; to
&lt;a href=&#34;https://github.com/grpc-ecosystem/go-grpc-middleware/tree/main/providers/prometheus&#34;&gt;grpc-ecosystem/go-grpc-middleware/providers/prometheus&lt;/a&gt; via &lt;a href=&#34;https://github.com/etcd-io/etcd/pull/19195&#34;&gt;PR/19195&lt;/a&gt;. This change ensures continued
support and access to the latest features and improvements in the gRPC Prometheus integration.&lt;/p&gt;
&lt;h2 id=&#34;community&#34;&gt;Community&lt;/h2&gt;
&lt;p&gt;There are exciting developments in the etcd community that reflect our ongoing commitment
to strengthening collaboration, improving maintainability, and evolving the project’s governance.&lt;/p&gt;
&lt;h3 id=&#34;etcd-becomes-a-kubernetes-sig&#34;&gt;etcd Becomes a Kubernetes SIG&lt;/h3&gt;
&lt;p&gt;etcd has officially become a Kubernetes Special Interest Group: SIG-etcd. This change reflects
etcd’s critical role as the primary datastore for Kubernetes and establishes a more structured
and transparent home for long-term stewardship and cross-project collaboration. The new SIG
designation will help streamline decision-making, align roadmaps with Kubernetes needs,
and attract broader community involvement.&lt;/p&gt;
&lt;h3 id=&#34;new-contributors-maintainers-and-reviewers&#34;&gt;New contributors, maintainers, and reviewers&lt;/h3&gt;
&lt;p&gt;We’ve seen increasing engagement from contributors, which has resulted in the addition of three new maintainers:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/fuweid&#34;&gt;fuweid&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/jmhbnz&#34;&gt;jmhbnz&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/wenjiaswe&#34;&gt;wenjiaswe&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Their continued contributions have been instrumental in driving the project forward.&lt;/p&gt;
&lt;p&gt;We also welcome two new reviewers to the project:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/ivanvc&#34;&gt;ivanvc&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/siyuanfoundation&#34;&gt;siyuanfoundation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We appreciate their dedication to code quality and their willingness to take on broader review responsibilities
within the community.&lt;/p&gt;
&lt;h3 id=&#34;new-release-team&#34;&gt;New release team&lt;/h3&gt;
&lt;p&gt;We&#39;ve formed a new release team led by &lt;a href=&#34;https://github.com/ivanvc&#34;&gt;ivanvc&lt;/a&gt; and &lt;a href=&#34;https://github.com/jmhbnz&#34;&gt;jmhbnz&lt;/a&gt;, streamlining the release process by automating
many previously manual steps. Inspired by Kubernetes SIG Release, we&#39;ve adopted several best practices, including
clearly defined release team roles and the introduction of release shadows to support knowledge sharing and team
sustainability. These changes have made our releases smoother and more reliable, allowing us to approach each
release with greater confidence and consistency.&lt;/p&gt;
&lt;h3 id=&#34;introducing-the-etcd-operator-working-group&#34;&gt;Introducing the etcd Operator Working Group&lt;/h3&gt;
&lt;p&gt;To further advance etcd’s operational excellence, we have formed a new working group: &lt;a href=&#34;https://github.com/kubernetes/community/tree/master/wg-etcd-operator&#34;&gt;WG-etcd-operator&lt;/a&gt;.
The working group is dedicated to enabling the automatic and efficient operation of etcd clusters that run in
the Kubernetes environment using an etcd-operator.&lt;/p&gt;
&lt;h2 id=&#34;future-development&#34;&gt;Future Development&lt;/h2&gt;
&lt;p&gt;The legacy v2store has been deprecated since etcd v3.4, and the flag &lt;code&gt;--enable-v2&lt;/code&gt; was removed entirely in v3.6.
This means that starting from v3.6, there is no longer a way to enable or use the v2store. However, etcd still
bootstraps internally from the legacy v2 snapshots. To address this inconsistency, we plan to change etcd to
bootstrap from the v3store and replay the WAL entries based on the &lt;code&gt;consistent-index&lt;/code&gt;. The work is being tracked
in &lt;a href=&#34;https://github.com/etcd-io/etcd/issues/12913&#34;&gt;issues/12913&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;One of the most persistent challenges remains large range queries from the kube-apiserver, which can
lead to process crashes due to their unpredictable nature. The range stream feature, originally outlined in
the &lt;a href=&#34;https://etcd.io/blog/2021/announcing-etcd-3.5/#future-roadmaps&#34;&gt;v3.5 release blog/Future roadmaps&lt;/a&gt;, remains an idea worth revisiting to address the challenges of large
range queries.&lt;/p&gt;
&lt;p&gt;For more details and upcoming plans, please refer to the &lt;a href=&#34;https://github.com/etcd-io/etcd/blob/main/Documentation/contributor-guide/roadmap.md&#34;&gt;etcd roadmap&lt;/a&gt;.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Kubernetes 1.33: Job&#39;s SuccessPolicy Goes GA</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/05/15/kubernetes-1-33-jobs-success-policy-goes-ga/</link>
      <pubDate>Thu, 15 May 2025 10:30:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/05/15/kubernetes-1-33-jobs-success-policy-goes-ga/</guid>
      <description>
        
        
        &lt;p&gt;On behalf of the Kubernetes project, I&#39;m pleased to announce that Job &lt;em&gt;success policy&lt;/em&gt; has graduated to General Availability (GA) as part of the v1.33 release.&lt;/p&gt;
&lt;h2 id=&#34;about-job-s-success-policy&#34;&gt;About Job&#39;s Success Policy&lt;/h2&gt;
&lt;p&gt;In batch workloads, you might want to use leader-follower patterns like &lt;a href=&#34;https://en.wikipedia.org/wiki/Message_Passing_Interface&#34;&gt;MPI&lt;/a&gt;,
in which the leader controls the execution, including the followers&#39; lifecycle.&lt;/p&gt;
&lt;p&gt;In this case, you might want to mark the Job as succeeded
even if some of the indexes failed. Unfortunately, a leader-follower Kubernetes Job that didn&#39;t use a success policy, in most cases, would have to require &lt;strong&gt;all&lt;/strong&gt; Pods to finish successfully
for that Job to reach an overall succeeded state.&lt;/p&gt;
&lt;p&gt;For Kubernetes Jobs, the API allows you to specify the early exit criteria using the &lt;code&gt;.spec.successPolicy&lt;/code&gt;
field (you can only use the &lt;code&gt;.spec.successPolicy&lt;/code&gt; field for an &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concept/workloads/controllers/job/#completion-mode&#34;&gt;indexed Job&lt;/a&gt;),
which describes a set of rules, either as a list of succeeded indexes for a Job, or as a minimal required number of succeeded indexes.&lt;/p&gt;
&lt;p&gt;This newly stable field is especially valuable for scientific simulation, AI/ML and High-Performance Computing (HPC) batch workloads.
Users in these areas often run numerous experiments and may only need a specific number to complete successfully, rather than requiring all of them to succeed.
In this case, the leader index failure is the only relevant Job exit criterion, and the outcomes for individual follower Pods are handled
only indirectly via the status of the leader index.
Moreover, followers do not know when they can terminate themselves.&lt;/p&gt;
&lt;p&gt;After a Job meets any &lt;strong&gt;success policy&lt;/strong&gt; rule, the Job is marked as succeeded, and all Pods, including the running ones, are terminated.&lt;/p&gt;
&lt;h2 id=&#34;how-it-works&#34;&gt;How it works&lt;/h2&gt;
&lt;p&gt;The following excerpt from a Job manifest, using &lt;code&gt;.successPolicy.rules[0].succeededCount&lt;/code&gt;, shows an example of
using a custom success policy:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;parallelism&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;10&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;completions&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;10&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;completionMode&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Indexed&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;successPolicy&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;rules&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;succeededCount&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Here, the Job is marked as succeeded when one index succeeded regardless of its number.
Additionally, you can constrain which index numbers count toward &lt;code&gt;succeededCount&lt;/code&gt; by setting &lt;code&gt;.successPolicy.rules[0].succeededIndexes&lt;/code&gt;
as shown below:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;parallelism&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;10&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;completions&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;10&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;completionMode&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Indexed&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;successPolicy&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;rules&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;succeededIndexes&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;0&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#080;font-style:italic&#34;&gt;# index of the leader Pod&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;succeededCount&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This example shows that the Job will be marked as succeeded once a Pod with a specific index (Pod index 0) has succeeded.&lt;/p&gt;
&lt;p&gt;Once the Job either reaches one of the &lt;code&gt;successPolicy&lt;/code&gt; rules, or achieves its &lt;code&gt;Complete&lt;/code&gt; criteria based on &lt;code&gt;.spec.completions&lt;/code&gt;,
the Job controller within kube-controller-manager adds the &lt;code&gt;SuccessCriteriaMet&lt;/code&gt; condition to the Job status.
After that, the job-controller initiates cleanup and termination of Pods for Jobs with the &lt;code&gt;SuccessCriteriaMet&lt;/code&gt; condition.
Eventually, the Job obtains the &lt;code&gt;Complete&lt;/code&gt; condition once the job-controller has finished cleanup and termination.&lt;/p&gt;
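&lt;p&gt;Schematically (a sketch only; the exact fields and reason strings in your cluster may differ), a Job that met its success policy ends up reporting both conditions in its status:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-yaml&#34;&gt;status:
  conditions:
  # added by the Job controller when a successPolicy rule is met
  - type: SuccessCriteriaMet
    status: &#34;True&#34;
  # added after remaining Pods have been cleaned up and terminated
  - type: Complete
    status: &#34;True&#34;
&lt;/code&gt;&lt;/pre&gt;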
&lt;h2 id=&#34;learn-more&#34;&gt;Learn more&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Read the documentation for
&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/workloads/controllers/job/#success-policy&#34;&gt;success policy&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Read the KEP for the &lt;a href=&#34;https://github.com/kubernetes/enhancements/tree/master/keps/sig-apps/3998-job-success-completion-policy&#34;&gt;Job success/completion policy&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;get-involved&#34;&gt;Get involved&lt;/h2&gt;
&lt;p&gt;This work was led by the Kubernetes
&lt;a href=&#34;https://github.com/kubernetes/community/tree/master/wg-batch&#34;&gt;batch working group&lt;/a&gt;
in close collaboration with the
&lt;a href=&#34;https://github.com/kubernetes/community/tree/master/sig-apps&#34;&gt;SIG Apps&lt;/a&gt; community.&lt;/p&gt;
&lt;p&gt;If you are interested in working on new features in this space, I recommend
subscribing to our &lt;a href=&#34;https://kubernetes.slack.com/messages/wg-batch&#34;&gt;Slack&lt;/a&gt;
channel and attending the regular community meetings.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Kubernetes v1.33: Updates to Container Lifecycle</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/05/14/kubernetes-v1-33-updates-to-container-lifecycle/</link>
      <pubDate>Wed, 14 May 2025 10:30:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/05/14/kubernetes-v1-33-updates-to-container-lifecycle/</guid>
      <description>
        
        
        &lt;p&gt;Kubernetes v1.33 introduces a few updates to the lifecycle of containers. The Sleep action for container lifecycle hooks now supports a zero sleep duration (feature enabled by default).
There is also alpha support for customizing the stop signal sent to containers when they are being terminated.&lt;/p&gt;
&lt;p&gt;This blog post goes into the details of these new aspects of the container lifecycle, and how you can use them.&lt;/p&gt;
&lt;h2 id=&#34;zero-value-for-sleep-action&#34;&gt;Zero value for Sleep action&lt;/h2&gt;
&lt;p&gt;Kubernetes v1.29 introduced the &lt;code&gt;Sleep&lt;/code&gt; action for container PreStop and PostStart lifecycle hooks. The Sleep action lets your containers pause for a specified duration after the container is started or before it is terminated. This was needed to provide a straightforward way to manage graceful shutdowns. Before the Sleep action, folks used to run the &lt;code&gt;sleep&lt;/code&gt; command using the exec action in their container lifecycle hooks. If you wanted to do this, you&#39;d need to have the binary for the &lt;code&gt;sleep&lt;/code&gt; command in your container image. This is difficult if you&#39;re using third-party images.&lt;/p&gt;
&lt;p&gt;When the Sleep action was initially added, it didn&#39;t support a sleep duration of zero seconds. The &lt;code&gt;time.Sleep&lt;/code&gt; function, which the Sleep action uses under the hood, does support a duration of zero seconds: calling it with a negative or zero duration returns immediately, resulting in a no-op. We wanted the same behaviour for the Sleep action. Support for the zero duration was later added in v1.32, behind the &lt;code&gt;PodLifecycleSleepActionAllowZero&lt;/code&gt; feature gate.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;PodLifecycleSleepActionAllowZero&lt;/code&gt; feature gate has graduated to beta in v1.33, and is now enabled by default.
The original Sleep action for &lt;code&gt;preStop&lt;/code&gt; and &lt;code&gt;postStart&lt;/code&gt; hooks has been enabled by default since Kubernetes v1.30.
With a cluster running Kubernetes v1.33, you are able to set a
zero duration for sleep lifecycle hooks. For a cluster with default configuration, you don&#39;t need
to enable any feature gate to make that possible.&lt;/p&gt;
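&lt;p&gt;As an illustrative sketch (container name and image are placeholders), a container with a zero-duration &lt;code&gt;preStop&lt;/code&gt; sleep hook looks like this:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-yaml&#34;&gt;containers:
  - name: app
    image: nginx:latest
    lifecycle:
      preStop:
        sleep:
          # zero returns immediately (a no-op); allowed since v1.32 behind
          # PodLifecycleSleepActionAllowZero, enabled by default in v1.33
          seconds: 0
&lt;/code&gt;&lt;/pre&gt;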
&lt;h2 id=&#34;container-stop-signals&#34;&gt;Container stop signals&lt;/h2&gt;
&lt;p&gt;Container runtimes such as containerd and CRI-O honor a &lt;code&gt;StopSignal&lt;/code&gt; instruction in the container image definition. This can be used to specify a custom stop signal
that the runtime will use to terminate containers based on that image.
Stop signal configuration was not originally part of the Pod API in Kubernetes.
Until Kubernetes v1.33, the only way to override the stop signal for containers was by rebuilding your container image with the new custom stop signal
(for example, specifying &lt;code&gt;STOPSIGNAL&lt;/code&gt; in a &lt;code&gt;Containerfile&lt;/code&gt; or &lt;code&gt;Dockerfile&lt;/code&gt;).&lt;/p&gt;
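&lt;p&gt;For reference, this is the kind of image-level override that was previously required; a minimal &lt;code&gt;Dockerfile&lt;/code&gt; sketch (the base image and signal here are arbitrary examples):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-dockerfile&#34;&gt;FROM nginx:latest
# bake a custom stop signal into the image itself;
# every container built from this image inherits it
STOPSIGNAL SIGQUIT
&lt;/code&gt;&lt;/pre&gt;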
&lt;p&gt;The &lt;code&gt;ContainerStopSignals&lt;/code&gt; feature gate which is newly added in Kubernetes v1.33 adds stop signals to the Kubernetes API. This allows users to specify a custom stop signal in the container spec. Stop signals are added to the API as a new lifecycle along with the existing PreStop and PostStart lifecycle handlers. In order to use this feature, we expect the Pod to have the operating system specified with &lt;code&gt;spec.os.name&lt;/code&gt;. This is enforced so that we can cross-validate the stop signal against the operating system and make sure that the containers in the Pod are created with a valid stop signal for the operating system the Pod is being scheduled to. For Pods scheduled on Windows nodes, only &lt;code&gt;SIGTERM&lt;/code&gt; and &lt;code&gt;SIGKILL&lt;/code&gt; are allowed as valid stop signals. Find the full list of signals supported in Linux nodes &lt;a href=&#34;https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/api/core/v1/types.go#L2985-L3053&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&#34;default-behaviour&#34;&gt;Default behaviour&lt;/h3&gt;
&lt;p&gt;If a container has a custom stop signal defined in its lifecycle, the container runtime would use the signal defined in the lifecycle to kill the container, given that the container runtime also supports custom stop signals. If there is no custom stop signal defined in the container lifecycle, the runtime would fallback to the stop signal defined in the container image. If there is no stop signal defined in the container image, the default stop signal of the runtime would be used. The default signal is &lt;code&gt;SIGTERM&lt;/code&gt; for both containerd and CRI-O.&lt;/p&gt;
&lt;h3 id=&#34;version-skew&#34;&gt;Version skew&lt;/h3&gt;
&lt;p&gt;For the feature to work as intended, both the versions of Kubernetes and the container runtime should support container stop signals. The changes to the Kubernetes API and kubelet are available in alpha stage from v1.33, and can be enabled with the &lt;code&gt;ContainerStopSignals&lt;/code&gt; feature gate. The container runtime implementations for containerd and CRI-O are still a work in progress and will be rolled out soon.&lt;/p&gt;
&lt;h3 id=&#34;using-container-stop-signals&#34;&gt;Using container stop signals&lt;/h3&gt;
&lt;p&gt;To enable this feature, you need to turn on the &lt;code&gt;ContainerStopSignals&lt;/code&gt; feature gate in both the kube-apiserver and the kubelet. Once you have nodes where the feature gate is turned on, you can create Pods with a StopSignal lifecycle and a valid OS name like so:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Pod&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;nginx&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;os&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;linux&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;containers&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;nginx&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;image&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;nginx:latest&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;lifecycle&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;stopSignal&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;SIGUSR1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Do note that the &lt;code&gt;SIGUSR1&lt;/code&gt; signal in this example can only be used if the container&#39;s Pod is scheduled to a Linux node. Hence we need to specify &lt;code&gt;spec.os.name&lt;/code&gt; as &lt;code&gt;linux&lt;/code&gt; to be able to use the signal. If the Pod is being scheduled to a Windows node, you can only configure the &lt;code&gt;SIGTERM&lt;/code&gt; and &lt;code&gt;SIGKILL&lt;/code&gt; signals. You also cannot specify &lt;code&gt;containers[*].lifecycle.stopSignal&lt;/code&gt; if the &lt;code&gt;spec.os.name&lt;/code&gt; field is unset.&lt;/p&gt;
&lt;h2 id=&#34;how-do-i-get-involved&#34;&gt;How do I get involved?&lt;/h2&gt;
&lt;p&gt;This feature is driven by the &lt;a href=&#34;https://github.com/Kubernetes/community/blob/master/sig-node/README.md&#34;&gt;SIG Node&lt;/a&gt;. If you are interested in helping develop this feature, sharing feedback, or participating in any other ongoing SIG Node projects, please reach out to us!&lt;/p&gt;
&lt;p&gt;You can reach SIG Node by several means:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Slack: &lt;a href=&#34;https://kubernetes.slack.com/messages/sig-node&#34;&gt;#sig-node&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://groups.google.com/forum/#!forum/kubernetes-sig-node&#34;&gt;Mailing list&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/kubernetes/community/labels/sig%2Fnode&#34;&gt;Open Community Issues/PRs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You can also contact me directly:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;GitHub: @sreeram-venkitesh&lt;/li&gt;
&lt;li&gt;Slack: @sreeram.venkitesh&lt;/li&gt;
&lt;/ul&gt;

      </description>
    </item>
    
    <item>
      <title>Kubernetes v1.33: Job&#39;s Backoff Limit Per Index Goes GA</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/05/13/kubernetes-v1-33-jobs-backoff-limit-per-index-goes-ga/</link>
      <pubDate>Tue, 13 May 2025 10:30:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/05/13/kubernetes-v1-33-jobs-backoff-limit-per-index-goes-ga/</guid>
      <description>
        
        
        &lt;p&gt;In Kubernetes v1.33, the &lt;em&gt;Backoff Limit Per Index&lt;/em&gt; feature reaches general
availability (GA). This blog describes the Backoff Limit Per Index feature and
its benefits.&lt;/p&gt;
&lt;h2 id=&#34;about-backoff-limit-per-index&#34;&gt;About backoff limit per index&lt;/h2&gt;
&lt;p&gt;When you run workloads on Kubernetes, you must consider scenarios where Pod
failures can affect the completion of your workloads. Ideally, your workload
should tolerate transient failures and continue running.&lt;/p&gt;
&lt;p&gt;To achieve failure tolerance in a Kubernetes Job, you can set the
&lt;code&gt;spec.backoffLimit&lt;/code&gt; field. This field specifies the total number of tolerated
failures.&lt;/p&gt;
&lt;p&gt;However, for workloads where every index is considered independent, like
&lt;a href=&#34;https://en.wikipedia.org/wiki/Embarrassingly_parallel&#34;&gt;embarrassingly parallel&lt;/a&gt;
workloads - the &lt;code&gt;spec.backoffLimit&lt;/code&gt; field is often not flexible enough.
For example, you may choose to run multiple suites of integration tests by
representing each suite as an index within an &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/tasks/job/indexed-parallel-processing-static/&#34;&gt;Indexed Job&lt;/a&gt;.
In that setup, a fast-failing index (test suite) is likely to consume your
entire budget for tolerating Pod failures, and you might not be able to run the
other indexes.&lt;/p&gt;
&lt;p&gt;In order to address this limitation, Kubernetes introduced &lt;em&gt;backoff limit per index&lt;/em&gt;,
which allows you to control the number of retries per index.&lt;/p&gt;
&lt;h2 id=&#34;how-backoff-limit-per-index-works&#34;&gt;How backoff limit per index works&lt;/h2&gt;
&lt;p&gt;To use Backoff Limit Per Index for Indexed Jobs, specify the number of tolerated
Pod failures per index with the &lt;code&gt;spec.backoffLimitPerIndex&lt;/code&gt; field. When you set
this field, the Job executes all indexes by default.&lt;/p&gt;
&lt;p&gt;Additionally, to fine-tune the error handling:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Specify the cap on the total number of failed indexes by setting the
&lt;code&gt;spec.maxFailedIndexes&lt;/code&gt; field. When the limit is exceeded, the entire Job is
terminated.&lt;/li&gt;
&lt;li&gt;Define a short-circuit to detect a failed index by using the &lt;code&gt;FailIndex&lt;/code&gt; action in the
&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/workloads/controllers/job/#pod-failure-policy&#34;&gt;Pod Failure Policy&lt;/a&gt;
mechanism.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;When the number of tolerated failures is exceeded, the Job marks that index as
failed and lists it in the Job&#39;s &lt;code&gt;status.failedIndexes&lt;/code&gt; field.&lt;/p&gt;
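&lt;p&gt;As a schematic illustration (the index numbers here are made up), the failed indexes appear in the Job status as a string of intervals:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-yaml&#34;&gt;status:
  # indexes that exhausted their per-index backoff limit,
  # formatted as a comma-separated list of intervals
  failedIndexes: &#34;1,3-5&#34;
&lt;/code&gt;&lt;/pre&gt;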
&lt;h3 id=&#34;example&#34;&gt;Example&lt;/h3&gt;
&lt;p&gt;The following Job spec snippet is an example of how to combine backoff limit per
index with the &lt;em&gt;Pod Failure Policy&lt;/em&gt; feature:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;completions&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;10&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;parallelism&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;10&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;completionMode&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Indexed&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;backoffLimitPerIndex&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;maxFailedIndexes&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;5&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;podFailurePolicy&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;rules&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;action&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Ignore&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;onPodConditions&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;type&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;DisruptionTarget&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;action&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;FailIndex&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;onExitCodes&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;operator&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;In&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;values&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;[&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;42&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;]&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;In this example, the Job handles Pod failures as follows:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Ignores any failed Pods that have the built-in
&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/workloads/pods/disruptions/#pod-disruption-conditions&#34;&gt;disruption condition&lt;/a&gt;,
called &lt;code&gt;DisruptionTarget&lt;/code&gt;. These Pods don&#39;t count towards Job backoff limits.&lt;/li&gt;
&lt;li&gt;Fails the index corresponding to the failed Pod if any of the failed Pod&#39;s
containers finished with the exit code 42, based on the matching &lt;code&gt;FailIndex&lt;/code&gt;
rule.&lt;/li&gt;
&lt;li&gt;Retries the first failure of any index, unless the index failed due to the
matching &lt;code&gt;FailIndex&lt;/code&gt; rule.&lt;/li&gt;
&lt;li&gt;Fails the entire Job if the number of failed indexes exceeds 5 (set by the
&lt;code&gt;spec.maxFailedIndexes&lt;/code&gt; field).&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;learn-more&#34;&gt;Learn more&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Read the blog post on the closely related feature of Pod Failure Policy &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2024/08/19/kubernetes-1-31-pod-failure-policy-for-jobs-goes-ga/&#34;&gt;Kubernetes 1.31: Pod Failure Policy for Jobs Goes GA&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;For a hands-on guide to using Pod failure policy, including the use of FailIndex, see
&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/tasks/job/pod-failure-policy/&#34;&gt;Handling retriable and non-retriable pod failures with Pod failure policy&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Read the documentation for
&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/workloads/controllers/job/#backoff-limit-per-index&#34;&gt;Backoff limit per index&lt;/a&gt; and
&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/workloads/controllers/job/#pod-failure-policy&#34;&gt;Pod failure policy&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Read the KEP for the &lt;a href=&#34;https://github.com/kubernetes/enhancements/tree/master/keps/sig-apps/3850-backoff-limits-per-index-for-indexed-jobs&#34;&gt;Backoff Limits Per Index For Indexed Jobs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;get-involved&#34;&gt;Get involved&lt;/h2&gt;
&lt;p&gt;This work was sponsored by the Kubernetes
&lt;a href=&#34;https://github.com/kubernetes/community/tree/master/wg-batch&#34;&gt;batch working group&lt;/a&gt;
in close collaboration with the
&lt;a href=&#34;https://github.com/kubernetes/community/tree/master/sig-apps&#34;&gt;SIG Apps&lt;/a&gt; community.&lt;/p&gt;
&lt;p&gt;If you are interested in working on new features in this space, we recommend
subscribing to our &lt;a href=&#34;https://kubernetes.slack.com/messages/wg-batch&#34;&gt;Slack&lt;/a&gt;
channel and attending the regular community meetings.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Kubernetes v1.33: Image Pull Policy the way you always thought it worked!</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/05/12/kubernetes-v1-33-ensure-secret-pulled-images-alpha/</link>
      <pubDate>Mon, 12 May 2025 10:30:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/05/12/kubernetes-v1-33-ensure-secret-pulled-images-alpha/</guid>
      <description>
        
        
        &lt;h2 id=&#34;image-pull-policy-the-way-you-always-thought-it-worked&#34;&gt;Image Pull Policy the way you always thought it worked!&lt;/h2&gt;
&lt;p&gt;Some things in Kubernetes are surprising, and the way &lt;code&gt;imagePullPolicy&lt;/code&gt; behaves might
be one of them. Given Kubernetes is all about running pods, it may be peculiar
to learn that there has been a caveat to restricting pod access to authenticated images for
over 10 years in the form of &lt;a href=&#34;https://github.com/kubernetes/kubernetes/issues/18787&#34;&gt;issue 18787&lt;/a&gt;!
It is an exciting release when you can resolve a ten-year-old issue.&lt;/p&gt;

&lt;div class=&#34;alert alert-info&#34; role=&#34;alert&#34;&gt;&lt;h4 class=&#34;alert-heading&#34;&gt;Note:&lt;/h4&gt;Throughout this blog post, the term &amp;quot;pod credentials&amp;quot; will be used often. In this context,
the term generally encapsulates the authentication material that is available to a pod
to authenticate a container image pull.&lt;/div&gt;

&lt;h2 id=&#34;ifnotpresent-even-if-i-m-not-supposed-to-have-it&#34;&gt;IfNotPresent, even if I&#39;m not supposed to have it&lt;/h2&gt;
&lt;p&gt;The gist of the problem is that the &lt;code&gt;imagePullPolicy: IfNotPresent&lt;/code&gt; strategy has done
precisely what it says, and nothing more. Let&#39;s set up a scenario. To begin, &lt;em&gt;Pod A&lt;/em&gt; in &lt;em&gt;Namespace X&lt;/em&gt; is scheduled to &lt;em&gt;Node 1&lt;/em&gt; and requires &lt;em&gt;image Foo&lt;/em&gt; from a private repository.
For its image pull authentication material, the pod references &lt;em&gt;Secret 1&lt;/em&gt; in its &lt;code&gt;imagePullSecrets&lt;/code&gt;. &lt;em&gt;Secret 1&lt;/em&gt; contains the necessary credentials to pull from the private repository. The Kubelet will utilize the credentials from &lt;em&gt;Secret 1&lt;/em&gt; as supplied by &lt;em&gt;Pod A&lt;/em&gt;
and it will pull &lt;em&gt;container image Foo&lt;/em&gt; from the registry. This is the intended (and secure)
behavior.&lt;/p&gt;
&lt;p&gt;But now things get curious. If &lt;em&gt;Pod B&lt;/em&gt; in &lt;em&gt;Namespace Y&lt;/em&gt; happens to also be scheduled to &lt;em&gt;Node 1&lt;/em&gt;, unexpected (and potentially insecure) things happen. &lt;em&gt;Pod B&lt;/em&gt; may reference the same private image, specifying the &lt;code&gt;IfNotPresent&lt;/code&gt; image pull policy. &lt;em&gt;Pod B&lt;/em&gt; does not reference &lt;em&gt;Secret 1&lt;/em&gt;
(or in our case, any secret) in its &lt;code&gt;imagePullSecrets&lt;/code&gt;. When the Kubelet tries to run the pod, it honors the &lt;code&gt;IfNotPresent&lt;/code&gt; policy. The Kubelet sees that the &lt;em&gt;image Foo&lt;/em&gt; is already present locally, and will provide &lt;em&gt;image Foo&lt;/em&gt; to &lt;em&gt;Pod B&lt;/em&gt;. &lt;em&gt;Pod B&lt;/em&gt; gets to run the image even though it did not provide credentials authorizing it to pull the image in the first place.&lt;/p&gt;
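&lt;p&gt;The scenario above can be sketched with two minimal Pod manifests. The names, namespaces, and registry path below are illustrative placeholders, not values from a real cluster:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-yaml&#34;&gt;# Pod A in Namespace X: supplies credentials via imagePullSecrets
apiVersion: v1
kind: Pod
metadata:
  name: pod-a
  namespace: namespace-x
spec:
  containers:
  - name: app
    image: registry.example.com/private/foo:1.0
    imagePullPolicy: IfNotPresent
  imagePullSecrets:
  - name: secret-1
---
# Pod B in Namespace Y: same private image, same policy, but no
# imagePullSecrets. Historically, if Pod A had already pulled the image
# to this node, Pod B could run it without ever providing credentials.
apiVersion: v1
kind: Pod
metadata:
  name: pod-b
  namespace: namespace-y
spec:
  containers:
  - name: app
    image: registry.example.com/private/foo:1.0
    imagePullPolicy: IfNotPresent
&lt;/code&gt;&lt;/pre&gt;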


&lt;figure&gt;
    &lt;img src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/05/12/kubernetes-v1-33-ensure-secret-pulled-images-alpha/ensure_secret_image_pulls.svg&#34;
         alt=&#34;Illustration of the process of two pods trying to access a private image, the first one with a pull secret, the second one without it&#34;/&gt; &lt;figcaption&gt;
            &lt;p&gt;Using a private image pulled by a different pod&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;While &lt;code&gt;IfNotPresent&lt;/code&gt; should not pull &lt;em&gt;image Foo&lt;/em&gt; if it is already present
on the node, it is an incorrect security posture to allow all pods scheduled
to a node to have access to a previously pulled private image. These pods were never
authorized to pull the image in the first place.&lt;/p&gt;
&lt;h2 id=&#34;ifnotpresent-but-only-if-i-am-supposed-to-have-it&#34;&gt;IfNotPresent, but only if I am supposed to have it&lt;/h2&gt;
&lt;p&gt;In Kubernetes v1.33, we - SIG Auth and SIG Node - have finally started to address this (really old) problem and to get the verification right! The basic expected behavior is not changed. If
an image is not present, the Kubelet will attempt to pull the image. The credentials each pod supplies will be utilized for this task. This matches behavior prior to 1.33.&lt;/p&gt;
&lt;p&gt;If the image is present, then the behavior of the Kubelet changes. The Kubelet will now
verify the pod&#39;s credentials before allowing the pod to use the image.&lt;/p&gt;
&lt;p&gt;Performance and service stability have been key considerations while revising the feature.
Pods utilizing the same credential will not be required to re-authenticate. This is
also true when pods source credentials from the same Kubernetes Secret object, even
when the credentials are rotated.&lt;/p&gt;
&lt;h2 id=&#34;never-pull-but-use-if-authorized&#34;&gt;Never pull, but use if authorized&lt;/h2&gt;
&lt;p&gt;The &lt;code&gt;imagePullPolicy: Never&lt;/code&gt; option does not fetch images. However, if the
container image is already present on the node, any pod attempting to use the private
image will be required to provide credentials, and those credentials require verification.&lt;/p&gt;
&lt;p&gt;Pods utilizing the same credential will not be required to re-authenticate.
Pods that do not supply credentials previously used to successfully pull an
image will not be allowed to use the private image.&lt;/p&gt;
&lt;h2 id=&#34;always-pull-if-authorized&#34;&gt;Always pull, if authorized&lt;/h2&gt;
&lt;p&gt;The &lt;code&gt;imagePullPolicy: Always&lt;/code&gt; option has always worked as intended. Each time an image
is requested, the request goes to the registry and the registry will perform an authentication
check.&lt;/p&gt;
&lt;p&gt;In the past, forcing the &lt;code&gt;Always&lt;/code&gt; image pull policy via pod admission was the only way to ensure
that your private container images didn&#39;t get reused by other pods on nodes which already pulled the images.&lt;/p&gt;
&lt;p&gt;Fortunately, this was somewhat performant. Only the image manifest was pulled, not the image. However, there was still a cost and a risk. During a new rollout, scale up, or pod restart, the image registry that provided the image MUST be available for the auth check, putting the image registry in the critical path for stability of services running inside of the cluster.&lt;/p&gt;
&lt;h2 id=&#34;how-it-all-works&#34;&gt;How it all works&lt;/h2&gt;
&lt;p&gt;The feature is based on persistent, file-based caches that are present on each of
the nodes. The following is a simplified description of how the feature works.
For the complete version, please see &lt;a href=&#34;https://kep.k8s.io/2535&#34;&gt;KEP-2535&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The process of requesting an image for the first time goes like this:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;A pod requesting an image from a private registry is scheduled to a node.&lt;/li&gt;
&lt;li&gt;The image is not present on the node.&lt;/li&gt;
&lt;li&gt;The Kubelet makes a record of the intention to pull the image.&lt;/li&gt;
&lt;li&gt;The Kubelet extracts credentials from the Kubernetes Secret referenced by the pod
as an image pull secret, and uses them to pull the image from the private registry.&lt;/li&gt;
&lt;li&gt;After the image has been successfully pulled, the Kubelet makes a record of
the successful pull. This record includes details about credentials used
(in the form of a hash) as well as the Secret from which they originated.&lt;/li&gt;
&lt;li&gt;The Kubelet removes the original record of intent.&lt;/li&gt;
&lt;li&gt;The Kubelet retains the record of successful pull for later use.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;When future pods scheduled to the same node request the previously pulled private image:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The Kubelet checks the credentials that the new pod provides for the pull.&lt;/li&gt;
&lt;li&gt;If the hash of these credentials, or their source Secret, matches the hash or
source Secret recorded for a previous successful pull,
the pod is allowed to use the previously pulled image.&lt;/li&gt;
&lt;li&gt;If the credentials or their source Secret are not found in the records of
successful pulls for that image, the Kubelet will attempt to use
these new credentials to request a pull from the remote registry, triggering
the authorization flow.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;try-it-out&#34;&gt;Try it out&lt;/h2&gt;
&lt;p&gt;In Kubernetes v1.33 we shipped the alpha version of this feature. To give it a spin,
enable the &lt;code&gt;KubeletEnsureSecretPulledImages&lt;/code&gt; feature gate for your 1.33 Kubelets.&lt;/p&gt;
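&lt;p&gt;For example, if you manage Kubelet settings through a configuration file, the alpha behavior can be switched on with a &lt;code&gt;featureGates&lt;/code&gt; entry like the minimal sketch below (the rest of the Kubelet configuration is omitted):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-yaml&#34;&gt;apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  KubeletEnsureSecretPulledImages: true
&lt;/code&gt;&lt;/pre&gt;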
&lt;p&gt;You can learn more about the feature and additional optional configuration on the
&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/containers/images/#ensureimagepullcredentialverification&#34;&gt;concept page for Images&lt;/a&gt;
in the official Kubernetes documentation.&lt;/p&gt;
&lt;h2 id=&#34;what-s-next&#34;&gt;What&#39;s next?&lt;/h2&gt;
&lt;p&gt;In future releases we are going to:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Make this feature work together with &lt;a href=&#34;https://kep.k8s.io/4412&#34;&gt;Projected service account tokens for Kubelet image credential providers&lt;/a&gt; which adds a new, workload-specific source of image pull credentials.&lt;/li&gt;
&lt;li&gt;Write a benchmarking suite to measure the performance of this feature and assess the impact of
any future changes.&lt;/li&gt;
&lt;li&gt;Implement an in-memory caching layer so that we don&#39;t need to read files for each image
pull request.&lt;/li&gt;
&lt;li&gt;Add support for credential expirations, thus forcing previously validated credentials to
be re-authenticated.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;how-to-get-involved&#34;&gt;How to get involved&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;https://kep.k8s.io/2535&#34;&gt;Reading KEP-2535&lt;/a&gt; is a great way to understand these changes in depth.&lt;/p&gt;
&lt;p&gt;If you are interested in further involvement, reach out to us on the &lt;a href=&#34;https://kubernetes.slack.com/archives/C04UMAUC4UA&#34;&gt;#sig-auth-authenticators-dev&lt;/a&gt; channel
on Kubernetes Slack (for an invitation, visit &lt;a href=&#34;https://slack.k8s.io/&#34;&gt;https://slack.k8s.io/&lt;/a&gt;).
You are also welcome to join the bi-weekly &lt;a href=&#34;https://github.com/kubernetes/community/blob/master/sig-auth/README.md#meetings&#34;&gt;SIG Auth meetings&lt;/a&gt;,
held every other Wednesday.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Kubernetes v1.33: Streaming List responses</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/05/09/kubernetes-v1-33-streaming-list-responses/</link>
      <pubDate>Fri, 09 May 2025 10:30:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/05/09/kubernetes-v1-33-streaming-list-responses/</guid>
      <description>
        
        
        &lt;p&gt;Managing Kubernetes cluster stability becomes increasingly critical as your infrastructure grows. One of the most challenging aspects of operating large-scale clusters has been handling List requests that fetch substantial datasets - a common operation that could unexpectedly impact your cluster&#39;s stability.&lt;/p&gt;
&lt;p&gt;Today, the Kubernetes community is excited to announce a significant architectural improvement: streaming encoding for List responses.&lt;/p&gt;
&lt;h2 id=&#34;the-problem-unnecessary-memory-consumption-with-large-resources&#34;&gt;The problem: unnecessary memory consumption with large resources&lt;/h2&gt;
&lt;p&gt;Current API response encoders just serialize an entire response into a single contiguous block of memory and perform one &lt;a href=&#34;https://pkg.go.dev/net/http#ResponseWriter.Write&#34;&gt;ResponseWriter.Write&lt;/a&gt; call to transmit data to the client. Despite HTTP/2&#39;s capability to split responses into smaller frames for transmission, the underlying HTTP server continues to hold the complete response data as a single buffer. Even as individual frames are transmitted to the client, the memory associated with these frames cannot be freed incrementally.&lt;/p&gt;
&lt;p&gt;When cluster size grows, the single response body can be substantial - like hundreds of megabytes in size. At large scale, the current approach becomes particularly inefficient, as it prevents incremental memory release during transmission. Imagine that network congestion occurs: the large response body&#39;s memory block stays active for tens of seconds or even minutes. This limitation leads to unnecessarily high and prolonged memory consumption in the kube-apiserver process. If multiple large List requests occur simultaneously, the cumulative memory consumption can escalate rapidly, potentially leading to an Out-of-Memory (OOM) situation that compromises cluster stability.&lt;/p&gt;
&lt;p&gt;The encoding/json package uses sync.Pool to reuse memory buffers during serialization. While efficient for consistent workloads, this mechanism creates challenges with sporadic large List responses. When processing these large responses, memory pools expand significantly. But due to sync.Pool&#39;s design, these oversized buffers remain reserved after use. Subsequent small List requests continue utilizing these large memory allocations, preventing garbage collection and maintaining persistently high memory consumption in the kube-apiserver even after the initial large responses complete.&lt;/p&gt;
&lt;p&gt;Additionally, &lt;a href=&#34;https://github.com/protocolbuffers/protocolbuffers.github.io/blob/c14731f55296f8c6367faa4f2e55a3d3594544c6/content/programming-guides/techniques.md?plain=1#L39&#34;&gt;Protocol Buffers&lt;/a&gt; are not designed to handle large datasets, although they are great for handling &lt;strong&gt;individual&lt;/strong&gt; messages within a large data set. This highlights the need for streaming-based approaches that can process and transmit large collections incrementally rather than as monolithic blocks.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;As a general rule of thumb, if you are dealing in messages larger than a megabyte each, it may be time to consider an alternate strategy.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;From &lt;a href=&#34;https://protobuf.dev/programming-guides/techniques/&#34;&gt;https://protobuf.dev/programming-guides/techniques/&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id=&#34;streaming-encoder-for-list-responses&#34;&gt;Streaming encoder for List responses&lt;/h2&gt;
&lt;p&gt;The streaming encoding mechanism is specifically designed for List responses, leveraging their common well-defined collection structures. The core idea focuses exclusively on the &lt;strong&gt;Items&lt;/strong&gt; field within collection structures, which represents the bulk of memory consumption in large responses. Rather than encoding the entire &lt;strong&gt;Items&lt;/strong&gt; array as one contiguous memory block, the new streaming encoder processes and transmits each item individually, allowing memory to be freed progressively as each frame or chunk is transmitted. As a result, encoding items one by one significantly reduces the memory footprint required by the API server.&lt;/p&gt;
&lt;p&gt;With Kubernetes objects typically limited to 1.5 MiB (a limit inherited from etcd), streaming encoding keeps memory consumption predictable and manageable regardless of how many objects are in a List response. The result is significantly improved API server stability, reduced memory spikes, and better overall cluster performance - especially in environments where multiple large List operations might occur simultaneously.&lt;/p&gt;
&lt;p&gt;To ensure perfect backward compatibility, the streaming encoder validates Go struct tags rigorously before activation, guaranteeing byte-for-byte consistency with the original encoder. Standard encoding mechanisms process all fields except &lt;strong&gt;Items&lt;/strong&gt;, maintaining identical output formatting throughout. This approach seamlessly supports all Kubernetes List types—from built-in &lt;strong&gt;*List&lt;/strong&gt; objects to Custom Resource &lt;strong&gt;UnstructuredList&lt;/strong&gt; objects - requiring zero client-side modifications or awareness that the underlying encoding method has changed.&lt;/p&gt;
&lt;h2 id=&#34;performance-gains-you-ll-notice&#34;&gt;Performance gains you&#39;ll notice&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Reduced Memory Consumption:&lt;/strong&gt; Significantly lowers the memory footprint of the API server when handling large &lt;strong&gt;list&lt;/strong&gt; requests,
especially when dealing with &lt;strong&gt;large resources&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Improved Scalability:&lt;/strong&gt; Enables the API server to handle more concurrent requests and larger datasets without running out of memory.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Increased Stability:&lt;/strong&gt; Reduces the risk of OOM kills and service disruptions.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Efficient Resource Utilization:&lt;/strong&gt; Optimizes memory usage and improves overall resource efficiency.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;benchmark-results&#34;&gt;Benchmark results&lt;/h2&gt;
&lt;p&gt;To validate the results, Kubernetes has introduced a new &lt;strong&gt;list&lt;/strong&gt; benchmark which concurrently executes 10 &lt;strong&gt;list&lt;/strong&gt; requests, each returning 1 GB of data.&lt;/p&gt;
&lt;p&gt;The benchmark showed a 20x improvement, reducing memory usage from 70-80 GB to 3 GB.&lt;/p&gt;


&lt;figure&gt;
    &lt;img src=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/05/09/kubernetes-v1-33-streaming-list-responses/results.png&#34;
         alt=&#34;Screenshot of a K8s performance dashboard showing memory usage for benchmark list going down from 60GB to 3GB&#34;/&gt; &lt;figcaption&gt;
            &lt;p&gt;List benchmark memory usage&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;

      </description>
    </item>
    
    <item>
      <title>Kubernetes 1.33: Volume Populators Graduate to GA</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/05/08/kubernetes-v1-33-volume-populators-ga/</link>
      <pubDate>Thu, 08 May 2025 10:30:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/05/08/kubernetes-v1-33-volume-populators-ga/</guid>
      <description>
        
        
&lt;p&gt;Kubernetes &lt;em&gt;volume populators&lt;/em&gt; are now generally available (GA)! The &lt;code&gt;AnyVolumeDataSource&lt;/code&gt; feature
gate is treated as always enabled for Kubernetes v1.33, which means that users can specify any appropriate
&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/extend-kubernetes/api-extension/custom-resources/#custom-resources&#34;&gt;custom resource&lt;/a&gt;
as the data source of a PersistentVolumeClaim (PVC).&lt;/p&gt;
&lt;p&gt;Here is an example of how to use &lt;code&gt;dataSourceRef&lt;/code&gt; in a PVC:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;PersistentVolumeClaim&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;pvc1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;...&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;dataSourceRef&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiGroup&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;provider.example.com&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Provider&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;provider1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;what-is-new&#34;&gt;What is new&lt;/h2&gt;
&lt;p&gt;There are four major enhancements from beta.&lt;/p&gt;
&lt;h3 id=&#34;populator-pod-is-optional&#34;&gt;Populator Pod is optional&lt;/h3&gt;
&lt;p&gt;During the beta phase, contributors to Kubernetes identified potential resource leaks with PersistentVolumeClaim (PVC) deletion while volume population was in progress; these leaks happened due to limitations in finalizer handling.
Ahead of the graduation to general availability, the Kubernetes project added support to delete temporary resources (PVC prime, etc.) if the original PVC is deleted.&lt;/p&gt;
&lt;p&gt;To accommodate this, we&#39;ve introduced three new plugin-based functions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;PopulateFn()&lt;/code&gt;: Executes the provider-specific data population logic.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;PopulateCompleteFn()&lt;/code&gt;: Checks if the data population operation has finished successfully.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;PopulateCleanupFn()&lt;/code&gt;: Cleans up temporary resources created by the provider-specific functions after data population is completed.&lt;/li&gt;
&lt;/ul&gt;
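&lt;p&gt;As a rough, pseudocode-style sketch only - the signatures below are hypothetical, not the actual lib-volume-populator API, so consult the linked example for the real one - the division of responsibility between the three hooks looks roughly like this:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-go&#34;&gt;// Illustrative shapes only, not the real lib-volume-populator types.
// PopulateFn writes the source data into the volume bound to the
// intermediate &#34;PVC prime&#34;; PopulateCompleteFn reports whether that
// population has finished; PopulateCleanupFn removes any temporary
// resources afterwards.
type providerFunctions struct {
    PopulateFn         func(ctx context.Context, params ProviderParams) error
    PopulateCompleteFn func(ctx context.Context, params ProviderParams) (bool, error)
    PopulateCleanupFn  func(ctx context.Context, params ProviderParams) error
}
&lt;/code&gt;&lt;/pre&gt;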
&lt;p&gt;A provider example is added in &lt;a href=&#34;https://github.com/kubernetes-csi/lib-volume-populator/tree/master/example&#34;&gt;lib-volume-populator/example&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&#34;mutator-functions-to-modify-the-kubernetes-resources&#34;&gt;Mutator functions to modify the Kubernetes resources&lt;/h3&gt;
&lt;p&gt;For GA, the CSI volume populator controller code gained a &lt;code&gt;MutatorConfig&lt;/code&gt;, allowing the specification of mutator functions to modify Kubernetes resources.
For example, if the PVC prime is not an exact copy of the PVC and you need provider-specific information for the driver, you can include this information in the optional &lt;code&gt;MutatorConfig&lt;/code&gt;.
This allows you to customize the Kubernetes objects in the volume populator.&lt;/p&gt;
&lt;h3 id=&#34;flexible-metric-handling-for-providers&#34;&gt;Flexible metric handling for providers&lt;/h3&gt;
&lt;p&gt;Our beta phase highlighted a new requirement: the need to aggregate metrics not just from lib-volume-populator, but also from other components within the provider&#39;s codebase.&lt;/p&gt;
&lt;p&gt;To address this, SIG Storage introduced a &lt;a href=&#34;https://github.com/kubernetes-csi/lib-volume-populator/blob/8a922a5302fdba13a6c27328ee50e5396940214b/populator-machinery/controller.go#L122&#34;&gt;provider metric manager&lt;/a&gt;.
This enhancement delegates the implementation of metrics logic to the provider itself, rather than relying solely on lib-volume-populator.
This shift provides greater flexibility and control over metrics collection and aggregation, enabling a more comprehensive view of provider performance.&lt;/p&gt;
&lt;h3 id=&#34;clean-up-for-temporary-resources&#34;&gt;Clean up for temporary resources&lt;/h3&gt;
&lt;p&gt;During the beta phase, we identified potential resource leaks with PersistentVolumeClaim (PVC) deletion while volume population was in progress, due to limitations in finalizer handling. In this GA release, we have improved the populator to support the deletion of temporary resources (PVC prime, etc.) if the original PVC is deleted.&lt;/p&gt;
&lt;h2 id=&#34;how-to-use-it&#34;&gt;How to use it&lt;/h2&gt;
&lt;p&gt;To try it out, please follow the &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2022/05/16/volume-populators-beta/#trying-it-out&#34;&gt;steps&lt;/a&gt; in the previous beta blog.&lt;/p&gt;
&lt;h2 id=&#34;future-directions-and-potential-feature-requests&#34;&gt;Future directions and potential feature requests&lt;/h2&gt;
&lt;p&gt;As a next step, there are several potential feature requests for the volume populator:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Multi sync: the current implementation is a one-time unidirectional sync from source to destination. This can be extended to support multiple syncs, enabling periodic syncs or allowing users to sync on demand&lt;/li&gt;
&lt;li&gt;Bidirectional sync: an extension of multi sync above, but making it bidirectional between source and destination&lt;/li&gt;
&lt;li&gt;Populate data with priorities: with a list of different dataSourceRef, populate based on priorities&lt;/li&gt;
&lt;li&gt;Populate data from multiple sources of the same provider: populate multiple different sources to one destination&lt;/li&gt;
&lt;li&gt;Populate data from multiple sources of different providers: populate multiple different sources to one destination, pipelining different resources’ population&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;To ensure we&#39;re building something truly valuable, Kubernetes SIG Storage would love to hear about any specific use cases you have in mind for this feature.
For any inquiries or specific questions related to volume populator, please reach out to the &lt;a href=&#34;https://github.com/kubernetes/community/tree/master/sig-storage&#34;&gt;SIG Storage community&lt;/a&gt;.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Kubernetes v1.33: From Secrets to Service Accounts: Kubernetes Image Pulls Evolved</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/05/07/kubernetes-v1-33-wi-for-image-pulls/</link>
      <pubDate>Wed, 07 May 2025 10:30:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/05/07/kubernetes-v1-33-wi-for-image-pulls/</guid>
      <description>
        
        
        &lt;p&gt;Kubernetes has steadily evolved to reduce reliance on long-lived credentials
stored in the API.
A prime example of this shift is the transition of Kubernetes Service Account (KSA) tokens
from long-lived, static tokens to ephemeral, automatically rotated tokens
with OpenID Connect (OIDC)-compliant semantics.
This advancement enables workloads to securely authenticate with external services
without needing persistent secrets.&lt;/p&gt;
&lt;p&gt;However, one major gap remains: &lt;strong&gt;image pull authentication&lt;/strong&gt;.
Today, Kubernetes clusters rely on image pull secrets stored in the API,
which are long-lived and difficult to rotate,
or on node-level kubelet credential providers,
which allow any pod running on a node to access the same credentials.
This presents security and operational challenges.&lt;/p&gt;
&lt;p&gt;To address this, Kubernetes is introducing &lt;strong&gt;Service Account Token Integration
for Kubelet Credential Providers&lt;/strong&gt;, now available in &lt;strong&gt;alpha&lt;/strong&gt;.
This enhancement allows credential providers to use pod-specific service account tokens
to obtain registry credentials, which kubelet can then use for image pulls —
eliminating the need for long-lived image pull secrets.&lt;/p&gt;
&lt;h2 id=&#34;the-problem-with-image-pull-secrets&#34;&gt;The problem with image pull secrets&lt;/h2&gt;
&lt;p&gt;Currently, Kubernetes administrators have two primary options
for handling private container image pulls:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Image pull secrets stored in the Kubernetes API&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;These secrets are often long-lived because they are hard to rotate.&lt;/li&gt;
&lt;li&gt;They must be explicitly attached to a service account or pod.&lt;/li&gt;
&lt;li&gt;Compromise of a pull secret can lead to unauthorized image access.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Kubelet credential providers&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;These providers fetch credentials dynamically at the node level.&lt;/li&gt;
&lt;li&gt;Any pod running on the node can access the same credentials.&lt;/li&gt;
&lt;li&gt;There’s no per-workload isolation, increasing security risks.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Neither approach aligns with the principles of &lt;strong&gt;least privilege&lt;/strong&gt;
or &lt;strong&gt;ephemeral authentication&lt;/strong&gt;, leaving Kubernetes with a security gap.&lt;/p&gt;
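To make the first option concrete, here is a minimal sketch of the legacy pattern this enhancement aims to replace; the Secret name and registry host below are placeholders for illustration:

```yaml
# Legacy pattern: a Pod referencing a long-lived image pull secret.
# "regcred" and "registry.example.com" are placeholder names.
apiVersion: v1
kind: Pod
metadata:
  name: private-image-pod
spec:
  imagePullSecrets:
  - name: regcred              # Secret of type kubernetes.io/dockerconfigjson
  containers:
  - name: app
    image: registry.example.com/team/app:1.2.3
```

The secret itself must be created, distributed, and rotated out of band, which is exactly the operational burden described above.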
&lt;h2 id=&#34;the-solution-service-account-token-integration-for-kubelet-credential-providers&#34;&gt;The solution: Service Account token integration for Kubelet credential providers&lt;/h2&gt;
&lt;p&gt;This new enhancement enables kubelet credential providers
to use &lt;strong&gt;workload identity&lt;/strong&gt; when fetching image registry credentials.
Instead of relying on long-lived secrets, credential providers can use
service account tokens to request short-lived credentials
tied to a specific pod’s identity.&lt;/p&gt;
&lt;p&gt;This approach provides:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Workload-specific authentication&lt;/strong&gt;:
Image pull credentials are scoped to a particular workload.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Ephemeral credentials&lt;/strong&gt;:
Tokens are automatically rotated, eliminating the risks of long-lived secrets.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Seamless integration&lt;/strong&gt;:
Works with existing Kubernetes authentication mechanisms,
aligning with cloud-native security best practices.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;how-it-works&#34;&gt;How it works&lt;/h2&gt;
&lt;h3 id=&#34;1-service-account-tokens-for-credential-providers&#34;&gt;1. Service Account tokens for credential providers&lt;/h3&gt;
&lt;p&gt;Kubelet generates &lt;strong&gt;short-lived, automatically rotated&lt;/strong&gt; tokens for service accounts
if the credential provider it communicates with has opted into receiving
a service account token for image pulls.
These tokens conform to OIDC ID token semantics
and are provided to the credential provider
as part of the &lt;code&gt;CredentialProviderRequest&lt;/code&gt;.
The credential provider can then use this token
to authenticate with an external service.&lt;/p&gt;
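As a rough sketch of what this exchange carries (not the authoritative schema; the `serviceAccountToken` field name is an assumption based on KEP-4412 and should be verified against the kubelet credential provider API reference), the request could look like:

```yaml
# Hypothetical shape of a CredentialProviderRequest that includes a
# pod-scoped service account token; field names are assumptions to
# check against the credentialprovider.kubelet.k8s.io/v1 docs.
apiVersion: credentialprovider.kubelet.k8s.io/v1
kind: CredentialProviderRequest
image: registry.example.com/team/app:1.2.3
serviceAccountToken: "eyJhbGciOi..."   # short-lived OIDC token (truncated placeholder)
```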
&lt;h3 id=&#34;2-image-registry-authentication-flow&#34;&gt;2. Image registry authentication flow&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;When a pod starts, the kubelet requests credentials from a &lt;strong&gt;credential provider&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;If the credential provider has opted in,
the kubelet generates a &lt;strong&gt;service account token&lt;/strong&gt; for the pod.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;service account token is included in the &lt;code&gt;CredentialProviderRequest&lt;/code&gt;&lt;/strong&gt;,
allowing the credential provider to authenticate
and exchange it for &lt;strong&gt;temporary image pull credentials&lt;/strong&gt;
from a registry (e.g. AWS ECR, GCP Artifact Registry, Azure ACR).&lt;/li&gt;
&lt;li&gt;The kubelet then uses these credentials
to pull images on behalf of the pod.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;benefits-of-this-approach&#34;&gt;Benefits of this approach&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Security&lt;/strong&gt;:
Eliminates long-lived image pull secrets, reducing attack surfaces.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Granular Access Control&lt;/strong&gt;:
Credentials are tied to individual workloads rather than entire nodes or clusters.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Operational Simplicity&lt;/strong&gt;:
No need for administrators to manage and rotate image pull secrets manually.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Improved Compliance&lt;/strong&gt;:
Helps organizations meet security policies
that prohibit persistent credentials in the cluster.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;what-s-next&#34;&gt;What&#39;s next?&lt;/h2&gt;
&lt;p&gt;For Kubernetes &lt;strong&gt;v1.34&lt;/strong&gt;, we expect to ship this feature in &lt;strong&gt;beta&lt;/strong&gt;
while continuing to gather feedback from users.&lt;/p&gt;
&lt;p&gt;In the coming releases, we will focus on:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Implementing &lt;strong&gt;caching mechanisms&lt;/strong&gt;
to improve performance for token generation.&lt;/li&gt;
&lt;li&gt;Giving more &lt;strong&gt;flexibility to credential providers&lt;/strong&gt;
to decide how the registry credentials returned to the kubelet are cached.&lt;/li&gt;
&lt;li&gt;Making the feature work with
&lt;a href=&#34;https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2535-ensure-secret-pulled-images&#34;&gt;Ensure Secret Pulled Images&lt;/a&gt;
to ensure pods that use an image
are authorized to access that image
when service account tokens are used for authentication.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You can learn more about this feature
on the &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/tasks/administer-cluster/kubelet-credential-provider/#service-account-token-for-image-pulls&#34;&gt;service account token for image pulls&lt;/a&gt;
page in the Kubernetes documentation.&lt;/p&gt;
&lt;p&gt;You can also follow along on the
&lt;a href=&#34;https://kep.k8s.io/4412&#34;&gt;KEP-4412&lt;/a&gt;
to track progress across the coming Kubernetes releases.&lt;/p&gt;
&lt;h2 id=&#34;try-it-out&#34;&gt;Try it out&lt;/h2&gt;
&lt;p&gt;To try out this feature:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Ensure you are running Kubernetes v1.33 or later&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Enable the &lt;code&gt;ServiceAccountTokenForKubeletCredentialProviders&lt;/code&gt; feature gate&lt;/strong&gt;
on the kubelet.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Ensure credential provider support&lt;/strong&gt;:
Modify or update your credential provider
to use service account tokens for authentication.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Update the credential provider configuration&lt;/strong&gt;
to opt into receiving service account tokens
by configuring the &lt;code&gt;tokenAttributes&lt;/code&gt; field.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Deploy a pod&lt;/strong&gt;
that uses the credential provider to pull images from a private registry.&lt;/li&gt;
&lt;/ol&gt;
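Steps 2 and 4 might be wired together roughly as follows. This is a sketch only: the provider name and image pattern are placeholders, and the `tokenAttributes` sub-field names (e.g. `serviceAccountTokenAudience`, `requireServiceAccount`) are assumptions that should be verified against the kubelet credential provider documentation:

```yaml
# Sketch of a kubelet CredentialProviderConfig opting into
# service account tokens; the provider name, matchImages pattern,
# and tokenAttributes sub-fields are illustrative assumptions.
apiVersion: kubelet.config.k8s.io/v1
kind: CredentialProviderConfig
providers:
- name: example-credential-provider    # placeholder plugin binary name
  matchImages:
  - "*.example.com"
  defaultCacheDuration: "5m"
  apiVersion: credentialprovider.kubelet.k8s.io/v1
  tokenAttributes:
    serviceAccountTokenAudience: "registry.example.com"  # assumed field name
    requireServiceAccount: true                          # assumed field name
```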
&lt;p&gt;We would love to hear your feedback on this feature.
Please reach out to us on the
&lt;a href=&#34;https://kubernetes.slack.com/archives/C04UMAUC4UA&#34;&gt;#sig-auth-authenticators-dev&lt;/a&gt;
channel on Kubernetes Slack
(for an invitation, visit &lt;a href=&#34;https://slack.k8s.io/&#34;&gt;https://slack.k8s.io/&lt;/a&gt;).&lt;/p&gt;
&lt;h2 id=&#34;how-to-get-involved&#34;&gt;How to get involved&lt;/h2&gt;
&lt;p&gt;If you are interested in getting involved
in the development of this feature,
sharing feedback, or participating in any other ongoing &lt;strong&gt;SIG Auth&lt;/strong&gt; projects,
please reach out on the
&lt;a href=&#34;https://kubernetes.slack.com/archives/C0EN96KUY&#34;&gt;#sig-auth&lt;/a&gt;
channel on Kubernetes Slack.&lt;/p&gt;
&lt;p&gt;You are also welcome to join the bi-weekly
&lt;a href=&#34;https://github.com/kubernetes/community/blob/master/sig-auth/README.md#meetings&#34;&gt;SIG Auth meetings&lt;/a&gt;,
held every other Wednesday.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Kubernetes v1.33: Fine-grained SupplementalGroups Control Graduates to Beta</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/05/06/kubernetes-v1-33-fine-grained-supplementalgroups-control-beta/</link>
      <pubDate>Tue, 06 May 2025 10:30:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/05/06/kubernetes-v1-33-fine-grained-supplementalgroups-control-beta/</guid>
      <description>
        
        
&lt;p&gt;The new field, &lt;code&gt;supplementalGroupsPolicy&lt;/code&gt;, was introduced as an opt-in alpha feature for Kubernetes v1.31 and has graduated to beta in v1.33; the corresponding feature gate (&lt;code&gt;SupplementalGroupsPolicy&lt;/code&gt;) is now enabled by default. This feature enables more precise control over supplemental groups in containers, which can strengthen the security posture, particularly when accessing volumes. Moreover, it also enhances the transparency of UID/GID details in containers, offering improved security oversight.&lt;/p&gt;
&lt;p&gt;Please be aware that this beta release contains some breaking behavioral changes. See the &lt;a href=&#34;#the-behavioral-changes-introduced-in-beta&#34;&gt;The Behavioral Changes Introduced In Beta&lt;/a&gt; and &lt;a href=&#34;#upgrade-consideration&#34;&gt;Upgrade Considerations&lt;/a&gt; sections for details.&lt;/p&gt;
&lt;h2 id=&#34;motivation-implicit-group-memberships-defined-in-etc-group-in-the-container-image&#34;&gt;Motivation: Implicit group memberships defined in &lt;code&gt;/etc/group&lt;/code&gt; in the container image&lt;/h2&gt;
&lt;p&gt;Although the majority of Kubernetes cluster admins/users may not be aware of it, Kubernetes, by default, &lt;em&gt;merges&lt;/em&gt; group information from the Pod with information defined in &lt;code&gt;/etc/group&lt;/code&gt; in the container image.&lt;/p&gt;
&lt;p&gt;Let&#39;s look at an example. The Pod manifest below specifies &lt;code&gt;runAsUser=1000&lt;/code&gt;, &lt;code&gt;runAsGroup=3000&lt;/code&gt; and &lt;code&gt;supplementalGroups=4000&lt;/code&gt; in the Pod&#39;s security context.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Pod&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;implicit-groups&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;securityContext&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;runAsUser&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;1000&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;runAsGroup&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;3000&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;supplementalGroups&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;[&lt;span style=&#34;color:#666&#34;&gt;4000&lt;/span&gt;]&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;containers&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;ctr&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;image&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;registry.k8s.io/e2e-test-images/agnhost:2.45&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;command&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;[&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;sh&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;-c&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;sleep 1h&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;]&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;securityContext&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;allowPrivilegeEscalation&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;false&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;What is the result of the &lt;code&gt;id&lt;/code&gt; command in the &lt;code&gt;ctr&lt;/code&gt; container? The output should be similar to this:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code class=&#34;language-none&#34; data-lang=&#34;none&#34;&gt;uid=1000 gid=3000 groups=3000,4000,50000
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Where does group ID &lt;code&gt;50000&lt;/code&gt; in supplementary groups (&lt;code&gt;groups&lt;/code&gt; field) come from, even though &lt;code&gt;50000&lt;/code&gt; is not defined in the Pod&#39;s manifest at all? The answer is &lt;code&gt;/etc/group&lt;/code&gt; file in the container image.&lt;/p&gt;
&lt;p&gt;Checking the contents of &lt;code&gt;/etc/group&lt;/code&gt; in the container image reveals the following:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code class=&#34;language-none&#34; data-lang=&#34;none&#34;&gt;user-defined-in-image:x:1000:
group-defined-in-image:x:50000:user-defined-in-image
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The last entry shows that the container&#39;s primary user &lt;code&gt;1000&lt;/code&gt; belongs to the group &lt;code&gt;50000&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Thus, the group membership defined in &lt;code&gt;/etc/group&lt;/code&gt; in the container image for the container&#39;s primary user is &lt;em&gt;implicitly&lt;/em&gt; merged into the information from the Pod. Please note that this was a design decision the current CRI implementations inherited from Docker, and the community never really reconsidered it until now.&lt;/p&gt;
&lt;h3 id=&#34;what-s-wrong-with-it&#34;&gt;What&#39;s wrong with it?&lt;/h3&gt;
&lt;p&gt;The &lt;em&gt;implicitly&lt;/em&gt; merged group information from &lt;code&gt;/etc/group&lt;/code&gt; in the container image poses a security risk. These implicit GIDs can&#39;t be detected or validated by policy engines because there&#39;s no record of them in the Pod manifest. This can lead to unexpected access control issues, particularly when accessing volumes (see &lt;a href=&#34;https://issue.k8s.io/112879&#34;&gt;kubernetes/kubernetes#112879&lt;/a&gt; for details) because file permission is controlled by UID/GIDs in Linux.&lt;/p&gt;
&lt;h2 id=&#34;fine-grained-supplemental-groups-control-in-a-pod-supplementarygroupspolicy&#34;&gt;Fine-grained supplemental groups control in a Pod: &lt;code&gt;supplementalGroupsPolicy&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;To tackle the above problem, the Pod&#39;s &lt;code&gt;.spec.securityContext&lt;/code&gt; now includes a &lt;code&gt;supplementalGroupsPolicy&lt;/code&gt; field.&lt;/p&gt;
&lt;p&gt;This field lets you control how Kubernetes calculates the supplementary groups for container processes within a Pod. The available policies are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Merge&lt;/em&gt;: The group membership defined in &lt;code&gt;/etc/group&lt;/code&gt; for the container&#39;s primary user is merged. If not specified, this policy is applied (i.e. the existing behavior, kept for backward compatibility).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Strict&lt;/em&gt;: Only the group IDs specified in &lt;code&gt;fsGroup&lt;/code&gt;, &lt;code&gt;supplementalGroups&lt;/code&gt;, or &lt;code&gt;runAsGroup&lt;/code&gt; are attached as supplementary groups to the container processes. Group memberships defined in &lt;code&gt;/etc/group&lt;/code&gt; for the container&#39;s primary user are ignored.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Let&#39;s see how the &lt;code&gt;Strict&lt;/code&gt; policy works. The Pod manifest below specifies &lt;code&gt;supplementalGroupsPolicy: Strict&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Pod&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;strict-supplementalgroups-policy&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;securityContext&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;runAsUser&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;1000&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;runAsGroup&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;3000&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;supplementalGroups&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;[&lt;span style=&#34;color:#666&#34;&gt;4000&lt;/span&gt;]&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;supplementalGroupsPolicy&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Strict&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;containers&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;ctr&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;image&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;registry.k8s.io/e2e-test-images/agnhost:2.45&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;command&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;[&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;sh&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;-c&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;sleep 1h&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;]&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;securityContext&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;allowPrivilegeEscalation&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;false&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The result of &lt;code&gt;id&lt;/code&gt; command in the &lt;code&gt;ctr&lt;/code&gt; container should be similar to this:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code class=&#34;language-none&#34; data-lang=&#34;none&#34;&gt;uid=1000 gid=3000 groups=3000,4000
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;You can see that the &lt;code&gt;Strict&lt;/code&gt; policy excludes group &lt;code&gt;50000&lt;/code&gt; from &lt;code&gt;groups&lt;/code&gt;!&lt;/p&gt;
&lt;p&gt;Thus, enforcing &lt;code&gt;supplementalGroupsPolicy: Strict&lt;/code&gt; (via some policy mechanism) helps prevent implicit supplementary groups in a Pod.&lt;/p&gt;

&lt;div class=&#34;alert alert-info&#34; role=&#34;alert&#34;&gt;&lt;h4 class=&#34;alert-heading&#34;&gt;Note:&lt;/h4&gt;A container with sufficient privileges can change its process identity. The &lt;code&gt;supplementalGroupsPolicy&lt;/code&gt; only affects the initial process identity. See the following section for details.&lt;/div&gt;

&lt;h2 id=&#34;attached-process-identity-in-pod-status&#34;&gt;Attached process identity in Pod status&lt;/h2&gt;
&lt;p&gt;This feature also exposes the process identity attached to the first container process of each container
via the &lt;code&gt;.status.containerStatuses[].user.linux&lt;/code&gt; field. This is helpful for checking whether implicit group IDs are attached.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#00f;font-weight:bold&#34;&gt;...&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;status&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;containerStatuses&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;ctr&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;user&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;linux&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;gid&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;3000&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;supplementalGroups&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;- &lt;span style=&#34;color:#666&#34;&gt;3000&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;- &lt;span style=&#34;color:#666&#34;&gt;4000&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;uid&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;1000&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#00f;font-weight:bold&#34;&gt;...&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class=&#34;alert alert-info&#34; role=&#34;alert&#34;&gt;&lt;h4 class=&#34;alert-heading&#34;&gt;Note:&lt;/h4&gt;The values in the &lt;code&gt;status.containerStatuses[].user.linux&lt;/code&gt; field reflect the process identity &lt;em&gt;initially attached&lt;/em&gt; to the first process of the container. If the container has sufficient privilege
to call identity-related system calls (e.g. &lt;a href=&#34;https://man7.org/linux/man-pages/man2/setuid.2.html&#34;&gt;&lt;code&gt;setuid(2)&lt;/code&gt;&lt;/a&gt;, &lt;a href=&#34;https://man7.org/linux/man-pages/man2/setgid.2.html&#34;&gt;&lt;code&gt;setgid(2)&lt;/code&gt;&lt;/a&gt;, or &lt;a href=&#34;https://man7.org/linux/man-pages/man2/setgroups.2.html&#34;&gt;&lt;code&gt;setgroups(2)&lt;/code&gt;&lt;/a&gt;), the container process can change its identity at runtime. Thus, the &lt;em&gt;actual&lt;/em&gt; process identity can change dynamically.&lt;/div&gt;

&lt;h2 id=&#34;strict-policy-requires-newer-cri-versions&#34;&gt;&lt;code&gt;Strict&lt;/code&gt; Policy requires newer CRI versions&lt;/h2&gt;
&lt;p&gt;The CRI runtime (e.g. containerd, CRI-O) plays the core role in calculating the supplementary group IDs to be attached to containers. Thus, &lt;code&gt;supplementalGroupsPolicy: Strict&lt;/code&gt; requires a CRI runtime that supports this feature (&lt;code&gt;supplementalGroupsPolicy: Merge&lt;/code&gt; works even with CRI runtimes that do not support it, because that policy is fully backward compatible).&lt;/p&gt;
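&lt;p&gt;As a reminder, the policy is set in the Pod-level &lt;code&gt;securityContext&lt;/code&gt;. A minimal sketch of a pod requesting the &lt;code&gt;Strict&lt;/code&gt; policy (the pod name and image are illustrative):&lt;/p&gt;

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: strict-policy-demo   # illustrative name
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
    supplementalGroups: [4000]
    # Strict: only runAsGroup and the supplementalGroups above are attached;
    # group memberships defined in the image's /etc/group are NOT merged in.
    supplementalGroupsPolicy: Strict
  containers:
  - name: ctr
    image: registry.k8s.io/e2e-test-images/agnhost:2.45
    command: ["sh", "-c", "sleep infinity"]
```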
&lt;p&gt;Here are some CRI runtimes that support this feature, and the versions you need
to be running:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;containerd: v2.0 or later&lt;/li&gt;
&lt;li&gt;CRI-O: v1.31 or later&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You can check whether the feature is supported on a given node in that Node&#39;s &lt;code&gt;.status.features.supplementalGroupsPolicy&lt;/code&gt; field.&lt;/p&gt;
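&lt;p&gt;For example, to list the nodes where a &lt;code&gt;Strict&lt;/code&gt; pod could run, you could filter the output of &lt;code&gt;kubectl get nodes -o json&lt;/code&gt; on that field. A minimal sketch in Python (the node names in the sample are illustrative):&lt;/p&gt;

```python
import json

def nodes_supporting_strict(nodes_json: str) -> list:
    """Return names of nodes reporting support for the
    SupplementalGroupsPolicy feature, given `kubectl get nodes -o json` output."""
    items = json.loads(nodes_json).get("items", [])
    return [
        node["metadata"]["name"]
        for node in items
        if node.get("status", {}).get("features", {}).get("supplementalGroupsPolicy", False)
    ]

# Sample shaped like `kubectl get nodes -o json` output (node names illustrative).
sample = json.dumps({"items": [
    {"metadata": {"name": "node-a"},
     "status": {"features": {"supplementalGroupsPolicy": True}}},
    {"metadata": {"name": "node-b"},
     "status": {"features": {}}},
]})
print(nodes_supporting_strict(sample))  # → ['node-a']
```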
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Node&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#00f;font-weight:bold&#34;&gt;...&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;status&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;features&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;supplementalGroupsPolicy&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#a2f;font-weight:bold&#34;&gt;true&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;the-behavioral-changes-introduced-in-beta&#34;&gt;The behavioral changes introduced in beta&lt;/h2&gt;
&lt;p&gt;In the alpha release, when a Pod with &lt;code&gt;supplementalGroupsPolicy: Strict&lt;/code&gt; was scheduled to a node that did not support the feature (i.e., &lt;code&gt;.status.features.supplementalGroupsPolicy=false&lt;/code&gt;), the Pod&#39;s supplemental groups policy silently fell back to &lt;code&gt;Merge&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;In v1.33, the feature entered beta with stricter enforcement: the kubelet now rejects pods scheduled to nodes that cannot ensure the specified policy. If your pod is rejected, you will see warning events with &lt;code&gt;reason=SupplementalGroupsPolicyNotSupported&lt;/code&gt; like below:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Event&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#00f;font-weight:bold&#34;&gt;...&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;type&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Warning&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;reason&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;SupplementalGroupsPolicyNotSupported&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;message&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;SupplementalGroupsPolicy=Strict is not supported in this node&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;involvedObject&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Pod&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;...&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;upgrade-consideration&#34;&gt;Upgrade consideration&lt;/h2&gt;
&lt;p&gt;If you&#39;re already using this feature, especially the &lt;code&gt;supplementalGroupsPolicy: Strict&lt;/code&gt; policy, we assume that your cluster&#39;s CRI runtimes already support this feature. In that case, you don&#39;t need to worry about the pod rejections described above.&lt;/p&gt;
&lt;p&gt;However, if your cluster:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;uses the &lt;code&gt;supplementalGroupsPolicy: Strict&lt;/code&gt; policy, but&lt;/li&gt;
&lt;li&gt;its CRI runtimes do NOT yet support the feature (i.e., &lt;code&gt;.status.features.supplementalGroupsPolicy=false&lt;/code&gt;),&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;you need to prepare for this behavioral change (pod rejection) when upgrading your cluster.&lt;/p&gt;
&lt;p&gt;We recommend several ways to avoid unexpected pod rejections:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Upgrade your cluster&#39;s CRI runtimes before, or together with, the Kubernetes upgrade&lt;/li&gt;
&lt;li&gt;Label your nodes according to whether their CRI runtime supports this feature, and add a node selector to pods with the &lt;code&gt;Strict&lt;/code&gt; policy so that they are scheduled only onto supporting nodes (in this case, you will need to monitor the number of &lt;code&gt;Pending&lt;/code&gt; pods instead of pod rejections).&lt;/li&gt;
&lt;/ul&gt;
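&lt;p&gt;The second approach could be sketched as follows (the label key and value are illustrative, not a standard Kubernetes label):&lt;/p&gt;

```yaml
# First, label the nodes whose CRI runtime supports the feature, e.g.:
#   kubectl label node NODE_NAME example.com/supports-supplemental-groups-policy="true"
# Then constrain Strict pods to those nodes:
apiVersion: v1
kind: Pod
metadata:
  name: strict-policy-pod   # illustrative name
spec:
  nodeSelector:
    example.com/supports-supplemental-groups-policy: "true"
  securityContext:
    supplementalGroupsPolicy: Strict
  containers:
  - name: ctr
    image: registry.k8s.io/pause:3.10
```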
&lt;h2 id=&#34;getting-involved&#34;&gt;Getting involved&lt;/h2&gt;
&lt;p&gt;This feature is driven by the &lt;a href=&#34;https://github.com/kubernetes/community/tree/master/sig-node&#34;&gt;SIG Node&lt;/a&gt; community.
Please join us to connect with the community and share your ideas and feedback around the above feature and
beyond. We look forward to hearing from you!&lt;/p&gt;
&lt;h2 id=&#34;how-can-i-learn-more&#34;&gt;How can I learn more?&lt;/h2&gt;
&lt;!-- https://github.com/kubernetes/website/pull/46920 --&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/tasks/configure-pod-container/security-context/&#34;&gt;Configure a Security Context for a Pod or Container&lt;/a&gt;
for the further details of &lt;code&gt;supplementalGroupsPolicy&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/kubernetes/enhancements/issues/3619&#34;&gt;KEP-3619: Fine-grained SupplementalGroups control&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

      </description>
    </item>
    
    <item>
      <title>Kubernetes v1.33: Prevent PersistentVolume Leaks When Deleting out of Order graduates to GA</title>
      <link>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/05/05/kubernetes-v1-33-prevent-persistentvolume-leaks-when-deleting-out-of-order-graduate-to-ga/</link>
      <pubDate>Mon, 05 May 2025 10:30:00 -0800</pubDate>
      
      <guid>https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2025/05/05/kubernetes-v1-33-prevent-persistentvolume-leaks-when-deleting-out-of-order-graduate-to-ga/</guid>
      <description>
        
        
        &lt;p&gt;I am thrilled to announce that the feature to prevent
&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/storage/persistent-volumes/&#34;&gt;PersistentVolume&lt;/a&gt; (or PVs for short)
leaks when deleting out of order has graduated to General Availability (GA) in
Kubernetes v1.33! This improvement, initially introduced as a beta
feature in Kubernetes v1.31, ensures that your storage resources are properly
reclaimed, preventing unwanted leaks.&lt;/p&gt;
&lt;h2 id=&#34;how-did-reclaim-work-in-previous-kubernetes-releases&#34;&gt;How did reclaim work in previous Kubernetes releases?&lt;/h2&gt;
&lt;p&gt;A &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/storage/persistent-volumes/#Introduction&#34;&gt;PersistentVolumeClaim&lt;/a&gt; (or PVC for short) is
a user&#39;s request for storage. A PV and a PVC are considered &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/storage/persistent-volumes/#Binding&#34;&gt;Bound&lt;/a&gt;
when a matching PV is found for the PVC, or a new PV is created for it. The PVs themselves are
backed by volumes allocated by the storage backend.&lt;/p&gt;
&lt;p&gt;Normally, for a bound PV-PVC pair, the expectation is that the PVC is deleted when the
volume is no longer needed. However, nothing prevents a PV from being deleted
before its PVC.&lt;/p&gt;
&lt;p&gt;For a &lt;code&gt;Bound&lt;/code&gt; PV-PVC pair, the ordering of PV-PVC deletion determines whether
the PV reclaim policy is honored. The reclaim policy is honored if the PVC is
deleted first; however, if the PV is deleted prior to deleting the PVC, then the
reclaim policy is not exercised. As a result of this behavior, the associated
storage asset in the external infrastructure is not removed.&lt;/p&gt;
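&lt;p&gt;Concretely, for a bound pair, the two orderings behaved differently before this fix (the resource names are illustrative):&lt;/p&gt;

```
# PVC first: the PV's `Delete` reclaim policy runs, and the backend volume
# is cleaned up as expected.
kubectl delete pvc example-pvc

# PV first (before its bound PVC): in releases without this fix, the reclaim
# policy was skipped and the backend storage asset leaked.
kubectl delete pv example-pv
```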
&lt;h2 id=&#34;pv-reclaim-policy-with-kubernetes-v1-33&#34;&gt;PV reclaim policy with Kubernetes v1.33&lt;/h2&gt;
&lt;p&gt;With the graduation to GA in Kubernetes v1.33, this issue is now resolved. Kubernetes
now reliably honors the configured &lt;code&gt;Delete&lt;/code&gt; reclaim policy, even when PVs are deleted
before their bound PVCs. This is achieved through the use of finalizers,
ensuring that the storage backend releases the allocated storage resource as intended.&lt;/p&gt;
&lt;h3 id=&#34;how-does-it-work&#34;&gt;How does it work?&lt;/h3&gt;
&lt;p&gt;For CSI volumes, the new behavior is achieved by adding a &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/overview/working-with-objects/finalizers/&#34;&gt;finalizer&lt;/a&gt; &lt;code&gt;external-provisioner.volume.kubernetes.io/finalizer&lt;/code&gt;
on new and existing PVs. The finalizer is removed only after the backend storage is deleted. Addition and removal of the finalizer are handled by the &lt;code&gt;external-provisioner&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Here is an example of a PV with the finalizer; note the new finalizer in the &lt;code&gt;finalizers&lt;/code&gt; list:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;kubectl get pv pvc-a7b7e3ba-f837-45ba-b243-dec7d8aaed53 -o yaml
&lt;/code&gt;&lt;/pre&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;PersistentVolume&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;metadata&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;annotations&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;pv.kubernetes.io/provisioned-by&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;csi.example.driver.com&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;creationTimestamp&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;2021-11-17T19:28:56Z&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;finalizers&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- kubernetes.io/pv-protection&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- external-provisioner.volume.kubernetes.io/finalizer&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;pvc-a7b7e3ba-f837-45ba-b243-dec7d8aaed53&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;resourceVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;194711&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;uid&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;087f14f2-4157-4e95-8a70-8294b039d30e&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;spec&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;accessModes&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;- ReadWriteOnce&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;capacity&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;storage&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;1Gi&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;claimRef&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;apiVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;v1&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;kind&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;PersistentVolumeClaim&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;name&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;example-vanilla-block-pvc&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;namespace&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;default&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;resourceVersion&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#b44&#34;&gt;&amp;#34;194677&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;uid&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;a7b7e3ba-f837-45ba-b243-dec7d8aaed53&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;csi&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;driver&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;csi.example.driver.com&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;fsType&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;ext4&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;volumeAttributes&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;storage.kubernetes.io/csiProvisionerIdentity&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#666&#34;&gt;1637110610497-8081&lt;/span&gt;-csi.example.driver.com&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;type&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;CNS Block Volume&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;volumeHandle&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;2dacf297-803f-4ccc-afc7-3d3c3f02051e&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;persistentVolumeReclaimPolicy&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Delete&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;storageClassName&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;example-vanilla-block-sc&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;volumeMode&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Filesystem&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;status&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#008000;font-weight:bold&#34;&gt;phase&lt;/span&gt;:&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Bound&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/docs/concepts/overview/working-with-objects/finalizers/&#34;&gt;finalizer&lt;/a&gt; prevents this
PersistentVolume from being removed from the
cluster. As stated previously, the finalizer is only removed from the PV object
after it is successfully deleted from the storage backend. To learn more about
finalizers, please refer to &lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2021/05/14/using-finalizers-to-control-deletion/&#34;&gt;Using Finalizers to Control Deletion&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Similarly, the finalizer &lt;code&gt;kubernetes.io/pv-controller&lt;/code&gt; is added to dynamically provisioned in-tree plugin volumes.&lt;/p&gt;
&lt;h3 id=&#34;important-note&#34;&gt;Important note&lt;/h3&gt;
&lt;p&gt;The fix does not apply to statically provisioned in-tree plugin volumes.&lt;/p&gt;
&lt;h2 id=&#34;how-to-enable-new-behavior&#34;&gt;How to enable new behavior?&lt;/h2&gt;
&lt;p&gt;To take advantage of the new behavior, you must have upgraded your cluster to the v1.33 release of Kubernetes
and run the CSI &lt;a href=&#34;https://github.com/kubernetes-csi/external-provisioner&#34;&gt;&lt;code&gt;external-provisioner&lt;/code&gt;&lt;/a&gt; version &lt;code&gt;5.0.1&lt;/code&gt; or later.
The feature was released as beta in v1.31 release of Kubernetes, where it was enabled by default.&lt;/p&gt;
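&lt;p&gt;If you are unsure which &lt;code&gt;external-provisioner&lt;/code&gt; version your CSI drivers run, one rough way to check an image tag against the &lt;code&gt;5.0.1&lt;/code&gt; minimum is a sketch like this (the image references are illustrative):&lt;/p&gt;

```python
def meets_minimum(image: str, minimum: tuple = (5, 0, 1)) -> bool:
    """Return True if a csi-provisioner image tag is at or above the
    minimum version this feature requires (5.0.1)."""
    tag = image.rsplit(":", 1)[-1].lstrip("v")
    parts = tuple(int(p) for p in tag.split(".")[:3])
    return parts >= minimum

# Illustrative image references:
print(meets_minimum("registry.k8s.io/sig-storage/csi-provisioner:v5.1.0"))  # True
print(meets_minimum("registry.k8s.io/sig-storage/csi-provisioner:v4.0.1"))  # False
```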
&lt;h2 id=&#34;references&#34;&gt;References&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/2644-honor-pv-reclaim-policy&#34;&gt;KEP-2644&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/kubernetes-csi/external-provisioner/issues/546&#34;&gt;Volume leak issue&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://deploy-preview-54651--kubernetes-io-main-staging.netlify.app/blog/2024/08/16/kubernetes-1-31-prevent-persistentvolume-leaks-when-deleting-out-of-order/&#34;&gt;Beta Release Blog&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;how-do-i-get-involved&#34;&gt;How do I get involved?&lt;/h2&gt;
&lt;p&gt;The &lt;a href=&#34;https://github.com/kubernetes/community/blob/master/sig-storage/README.md#contact&#34;&gt;SIG Storage communication channels&lt;/a&gt;, including the Kubernetes Slack channel, are great ways to reach the SIG Storage and migration working group teams.&lt;/p&gt;
&lt;p&gt;Special thanks to the following people for the insightful reviews, thorough consideration and valuable contribution:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Fan Baofa (carlory)&lt;/li&gt;
&lt;li&gt;Jan Šafránek (jsafrane)&lt;/li&gt;
&lt;li&gt;Xing Yang (xing-yang)&lt;/li&gt;
&lt;li&gt;Matthew Wong (wongma7)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Join the &lt;a href=&#34;https://github.com/kubernetes/community/tree/master/sig-storage&#34;&gt;Kubernetes Storage Special Interest Group (SIG)&lt;/a&gt; if you&#39;re interested in getting involved with the design and development of CSI or any part of the Kubernetes Storage system. We’re rapidly growing and always welcome new contributors.&lt;/p&gt;

      </description>
    </item>
    
  </channel>
</rss>
