
HirazawaUi
Contributor

  • One-line PR description: Allow hostNetwork pods to use user namespaces
  • Other comments:

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Oct 3, 2025
@k8s-ci-robot k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/node Categorizes an issue or PR as relevant to SIG Node. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 3, 2025
@HirazawaUi HirazawaUi force-pushed the kep-5607 branch 2 times, most recently from 6252f70 to f4441d3 Compare October 3, 2025 16:41
When the `UserNamespacesHostNetworkSupport` feature gate is enabled, we will relax this validation check.
The kube-apiserver will accept such a Pod spec and pass it on to the kubelet.
At this point, the responsibility for successfully creating and running the Pod shifts to the container runtime.
If the low-level container runtime (e.g., containerd/runc) does not support this combination, the pod will remain stuck in the `ContainerCreating` state and report an error event; this is the expected behavior.
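
A minimal sketch of the relaxed check described above, not the actual kube-apiserver validation code. The `podSpec` type, the `validateHostNetworkUserNS` function, and the `gateEnabled` parameter are hypothetical names introduced for illustration; only the `hostNetwork` and `hostUsers` fields and the `UserNamespacesHostNetworkSupport` gate come from the proposal.

```go
package main

import (
	"errors"
	"fmt"
)

// podSpec is a hypothetical, trimmed-down stand-in for the real Pod spec.
// Only the two fields relevant to this check are modelled.
type podSpec struct {
	HostNetwork bool
	HostUsers   *bool // nil or true: the pod shares the host user namespace
}

// errHostNetworkWithUserNS is a hypothetical error value for this sketch.
var errHostNetworkWithUserNS = errors.New("hostNetwork cannot be combined with hostUsers=false")

// validateHostNetworkUserNS rejects a hostNetwork pod that requests a user
// namespace unless the gate is on. gateEnabled stands in for the
// UserNamespacesHostNetworkSupport feature gate (assumption: the real code
// would query the feature gate registry instead of taking a bool).
func validateHostNetworkUserNS(spec podSpec, gateEnabled bool) error {
	usesUserNS := spec.HostUsers != nil && !*spec.HostUsers
	if spec.HostNetwork && usesUserNS && !gateEnabled {
		return errHostNetworkWithUserNS
	}
	// With the gate enabled the spec is accepted here; whether the pod
	// actually starts is then up to the container runtime.
	return nil
}

func main() {
	hostUsers := false
	spec := podSpec{HostNetwork: true, HostUsers: &hostUsers}
	fmt.Println("gate off:", validateHostNetworkUserNS(spec, false))
	fmt.Println("gate on: ", validateHostNetworkUserNS(spec, true))
}
```

With the gate off the combination is rejected at validation time; with the gate on the same spec passes and any failure would surface later, at container creation.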
Contributor

If we go with this proposal, we should include making it work with containerd/CRI-O/runc.

Member

containerd needs changes for this, I think runc too. I'm unsure about crio and crun. @giuseppe ?

Member

crun supports it. I am not sure about CRI-O, but I don't see any explicit check to prevent that combination.

Contributor Author

This has been added as a graduation requirement for the beta phase.

### User Stories (Optional)

#### Story 1
As a cluster administrator, I want to enable user namespaces for my control plane static Pods (e.g., kube-apiserver, kube-controller-manager) to follow the principle of least privilege and reduce the attack surface. These Pods need to use hostNetwork to interact correctly with the cluster network. By enabling the new feature gate, I can add a critical layer of security isolation to these vital components without changing their networking model.
Contributor

We need to be clear about what is possible with such a combination. For example, this may work for just listening on the host network, but changes that the user namespace prevents will probably fail even if the pod has admin privileges.

cc: @rata @giuseppe

Member

Yep, this will need quite some documentation, to make sure users understand that you probably can't bind to privileged ports even if you have the relevant capability, and that even the sysctl to change the privileged port range may be ineffective too.

But LGTM

Contributor Author

I assume that capabilities such as CAP_NET_RAW, CAP_NET_ADMIN, and CAP_NET_BIND_SERVICE remain restricted. I have also added this as a graduation requirement for the beta phase in the KEP.

Please correct me if I'm wrong, as I am not deeply familiar with this area.
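
A rough model of why those capabilities would stay restricted, offered as a sketch under stated assumptions rather than a description of any runtime's behavior. It assumes the Linux rule that a network capability is checked against the user namespace that owns the network namespace; the types and functions below (`userNS`, `netNS`, `process`, `capableIn`, `canBindPrivilegedPort`) are hypothetical, and the model ignores the kernel's namespace-owner shortcut and the `ip_unprivileged_port_start` sysctl.

```go
package main

import "fmt"

// userNS is a hypothetical model of a user namespace.
// parent is nil for the initial (host) user namespace.
type userNS struct {
	name   string
	parent *userNS
}

// netNS models a network namespace and the user namespace that owns it.
type netNS struct {
	owner *userNS
}

// process models a task holding a capability set inside one user namespace.
type process struct {
	ns   *userNS
	caps map[string]bool
}

// capableIn is a simplified approximation of the kernel check: walk from the
// target namespace up through its ancestors. A capability only counts if the
// process's own namespace is the target or one of the target's ancestors.
func capableIn(p process, target *userNS, capName string) bool {
	for ns := target; ns != nil; ns = ns.parent {
		if ns == p.ns {
			return p.caps[capName]
		}
	}
	return false
}

// canBindPrivilegedPort checks CAP_NET_BIND_SERVICE against the user
// namespace that owns the network namespace.
func canBindPrivilegedPort(p process, n netNS) bool {
	return capableIn(p, n.owner, "CAP_NET_BIND_SERVICE")
}

func main() {
	host := &userNS{name: "host"}
	pod := &userNS{name: "pod", parent: host}

	// A container process that holds the capability inside the pod's user namespace.
	p := process{ns: pod, caps: map[string]bool{"CAP_NET_BIND_SERVICE": true}}

	hostNet := netNS{owner: host} // hostNetwork: the host's network namespace
	podNet := netNS{owner: pod}   // a regular pod network namespace

	fmt.Println("hostNetwork + user namespace:", canBindPrivilegedPort(p, hostNet))
	fmt.Println("pod network + user namespace:", canBindPrivilegedPort(p, podNet))
}
```

In this model the capability held inside the pod's user namespace has no effect on the host network namespace, which matches the assumption above that CAP_NET_RAW, CAP_NET_ADMIN, and CAP_NET_BIND_SERVICE remain restricted.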

Member

SGTM. I'd say let's document this in alpha, but I don't oppose doing it for beta. I don't see why we should postpone it :)

Contributor Author

During the alpha stage, this feature is not accessible to users, as we are still awaiting runtime support across the board. Additionally, we need to wait until runtime support is in place to finalize the scope of this feature :)

@HirazawaUi HirazawaUi force-pushed the kep-5607 branch 4 times, most recently from 6fcfeb6 to 19be9bb Compare October 8, 2025 09:42
@wojtek-t wojtek-t self-assigned this Oct 8, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: HirazawaUi
Once this PR has been reviewed and has the lgtm label, please ask for approval from wojtek-t and additionally assign mrunalp for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/node Categorizes an issue or PR as relevant to SIG Node. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
7 participants