-
Notifications
You must be signed in to change notification settings - Fork 317
feat: Allow Fileownership change through FSGroup and VOLUME_MOUNT_GROUP #1841
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Welcome @mytreya-rh! |
Hi @mytreya-rh. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
/ok-to-test |
898943d
to
b797f9d
Compare
/retest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The windows job failures are related to this PR.
E0630 16:56:06.282028 10108 atomic_writer.go:419] "unable to change file with owner" err="chown c:\\var\\lib\\kubelet\\pods\\ff425598-c3fa-480d-a6af-814831673629\\volumes\\kubernetes.io~csi\\secrets-store-inline\\mount\\..2025_06_30_16_56_06.1168464543\\secretalias: not supported by windows" logContext="secrets-store-csi-driver" fullPath="c:\\var\\lib\\kubelet\\pods\\ff425598-c3fa-480d-a6af-814831673629\\volumes\\kubernetes.io~csi\\secrets-store-inline\\mount\\..2025_06_30_16_56_06.1168464543\\secretalias" owner=-1
b797f9d
to
cbac857
Compare
Thanks @aramase ! |
/retest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have one comment on test/bats/e2e-provider.bats
, but otherwise LGTM. It's a useful fix, implementation looks correct, good test coverage, and passing CI tests. Netlify is warning about a line unrelated from your changes.
test/bats/e2e-provider.bats
Outdated
kubectl wait -n rotation --for=condition=Ready --timeout=60s pod ${curl_pod_name} | ||
local pod_ip=$(kubectl get pod -n kube-system -l app=csi-secrets-store-e2e-provider -o jsonpath="{.items[0].status.podIP}") | ||
run kubectl exec ${curl_pod_name} -n rotation -- curl http://${pod_ip}:8080/rotation?rotated=true | ||
sleep 35 # 30 is poll interval, 5 second grace should be enough |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I worry that 35 seconds may not be enough to prevent flakes. In @test "Test auto rotation of mount contents and K8s secrets"
(line 472) it used to sleep 60 seconds, but now it only sleeps 35 seconds? Is it possible for a reconcile loop to be delayed for some reason that would cause this to take longer than 35?
I would probably not reduce this below 60, we had one similar case in vault.bats waiting on secret rotation where we had to increase it to 120 to improve the pass rate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @dobsonj
Reverted the sleep back to 60s
/retest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First pass.
I haven't looked at the test changes yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revert changes to charts
directory. Chart changes should only be made in the manifest_staging
directory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revert changes to deploy
directory. Chart changes should only be made in the manifest_staging
directory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -0,0 +1,5 @@ | |||
package constants |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to add boilerplate to the top of the file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
pkg/rotation/reconciler.go
Outdated
} | ||
newObjectVersions, errorReason, err := secretsstore.MountContent(ctx, providerClient, string(paramsJSON), string(secretsJSON), spcps.Status.TargetPath, string(permissionJSON), oldObjectVersions) | ||
gid := constants.NoGID | ||
if spcps.Status.FSGroup != "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if spcps.Status.FSGroup != "" { | |
if len(spcps.Status.FSGroup) > 0 { |
this is the code style we follow in Kubernetes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
pkg/rotation/reconciler.go
Outdated
gid, err = strconv.ParseInt(spcps.Status.FSGroup, 10, 64) | ||
if err != nil { | ||
errorReason = internalerrors.FailedToParseFSGroup | ||
errStr := fmt.Sprintf("failed to rotate objects for pod %s/%s, err: %v, invalid FSGroup:%s", spcps.Namespace, spcps.Status.PodName, err, spcps.Status.FSGroup) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
errStr := fmt.Sprintf("failed to rotate objects for pod %s/%s, err: %v, invalid FSGroup:%s", spcps.Namespace, spcps.Status.PodName, err, spcps.Status.FSGroup) | |
errStr := fmt.Sprintf("failed to rotate objects for pod %s/%s, invalid FSGroup:%s, err: %w", spcps.Namespace, spcps.Status.PodName, spcps.Status.FSGroup, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks for the catch. This makes more sense to read as a log.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
however, %w (wrap) format seems to be specific only to fmt.Errorf. Thus, restructured the code to use %v in the Sprintf and %w in the Errorf
pkg/secrets-store/nodeserver.go
Outdated
gid, err = strconv.ParseInt(fsGroupStr, 10, 64) | ||
klog.V(5).Info("converted gid: %v\n", gid) | ||
if err != nil { | ||
klog.ErrorS(err, "failed to mount secrets store object content", "pod", klog.ObjectRef{Namespace: podNamespace, Name: podName}, "invalid FSGroup: ", fsGroupStr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the key in structured log should be a single value
klog.ErrorS(err, "failed to mount secrets store object content", "pod", klog.ObjectRef{Namespace: podNamespace, Name: podName}, "invalid FSGroup: ", fsGroupStr) | |
klog.ErrorS(err, "failed to mount secrets store object content", "pod", klog.ObjectRef{Namespace: podNamespace, Name: podName}, "fSGroup: ", fsGroupStr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
pkg/secrets-store/nodeserver.go
Outdated
return nil, status.Error(codes.InvalidArgument, "Error parsing FSGroup") | ||
} | ||
} else { | ||
klog.V(5).Info("mount group not set") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need the pod context here as well, otherwise it’s impossible to tell which pod the log is for
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
pkg/secrets-store/utils.go
Outdated
if gid != constants.NoGID { | ||
fsGroup = strconv.FormatInt(gid, 10) | ||
} | ||
klog.V(5).Infof("gid string: %v for pod: %v/%v", fsGroup, namespace, podname) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to use structured logging klog.InfoS
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
pkg/secrets-store/utils.go
Outdated
}, | ||
}) | ||
|
||
klog.V(5).InfoS("creating", "spcPodStatus", spcPodStatus) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not required
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed
} | ||
|
||
if fileProjection.FsUser == nil { | ||
if fileProjection.FsGroup == nil || runtimeutil.IsRuntimeWindows() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to read through the kubernetes code to understand how this works for windows. Do you know if it's supported or they skip for windows?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, as per documentation FSGroup is not supported for Windows.
And as per code as well, the Kubernetes FSGroup implementation of writable volumes for non linux is a no-op:
csi_mounter.go --> volume_unsupported.go
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: mytreya-rh The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
1. Undo changes to the charts and deploy directories 2. Add boilerplate to pkg/constants/constants.go 3. Changed logging to entire VolumeContext to just the FSGroup 4. Updated the way empty string is checked 5. changed Infof to InfoS 6. Removed un-necessary/redundant logs
New changes are detected. LGTM label has been removed. |
6ed5189
to
e8e905e
Compare
/retest |
@mytreya-rh: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
What type of PR is this?
/kind feature
What this PR does / why we need it:
(As of now pls consider it as a draft PR to discuss the solution further.)
Allows the secrets to be mounted with FSGroup as specified in the POD spec.
Thus, A pod with a non-root user should be able to read a secret, and that secret need not be world-readable.
Which issue(s) this PR fixes :
Fixes #858
Is this a chart or deployment yaml update?
There is a yaml update for secrets-store.csi.x-k8s.io_secretproviderclasspodstatuses.yaml (generated through make manifests).
It is added in the manifest_staging/deploy
But if this PR merges after: #1622, the change in SecretProviderClassPodStatusStatus won't be required anymore and we can revert the changes related to reconciler.
Special notes for your reviewer:
Problem:
Solution:
Notes:
tests added in e2e-provider:
unit tests:
TODOs: