| Age | Commit message (Collapse) | Author | Files | Lines |
|
Address gemini-code-assist review - use os.RemoveAll instead of os.Remove
to handle cases where the directory is not empty after an imperfect
unmount. This ensures complete cleanup of stale staging paths.
|
|
- Handle unexpected stat errors in cleanupStaleStagingPath (high priority)
- Extract staging logic into stageNewVolume helper method for reuse
- Extract isReadOnlyAccessMode helper to avoid duplicated read-only checks
- Remove redundant mountutil.Unmount call (CleanupMountPoint already handles it)
|
|
This addresses issue #203 - CSI Driver Self-Healing for Volume Mount Failures.
Problem:
When the CSI node driver restarts, the in-memory volume cache is lost.
Kubelet then directly calls NodePublishVolume (skipping NodeStageVolume),
which fails with 'volume hasn't been staged yet' error.
Solution:
1. Added isStagingPathHealthy() to detect healthy vs stale/corrupted mounts
2. Added cleanupStaleStagingPath() to clean up stale mount points
3. Enhanced NodeStageVolume to clean up stale mounts before staging
4. Implemented self-healing in NodePublishVolume:
- If staging path is healthy: rebuild volume cache from existing mount
- If staging path is stale: clean up and re-stage automatically
5. Updated Volume.Unstage to handle rebuilt volumes without unmounter
Benefits:
- Automatic recovery after CSI driver restarts
- No manual intervention required (no kubelet/pod restarts needed)
- Handles both live and dead FUSE mount scenarios
- Backward compatible with normal operations
Fixes #203
|
|
Context:
https://github.com/golang/go/wiki/CodeReviewComments#error-strings
https://stackoverflow.com/a/68793833
|
|
|
|
k8s.io/mount-utils
|
|
Use pid from cmd.Process instead of /proc lookup
Use mount specific mutex
Log fuse mount process stderr and stdout for problems investigation
|
|
|
|
|