| Age | Commit message (Collapse) | Author | Files | Lines |
|
Add ability to enable/disable users and access keys without deleting them.
## Changes
### Protocol Buffer Updates
- Add `disabled` field (bool) to Identity message for user status
- false (default) = enabled, true = disabled
- No backward compatibility hack needed since zero value is correct
- Add `status` field (string: Active/Inactive) to Credential message
### New IAM Actions
- SetUserStatus: Enable or disable a user (requires admin)
- UpdateAccessKey: Change access key status (self-service or admin)
### Behavior
- Disabled users: All API requests return AccessDenied
- Inactive access keys: Signature validation fails
- Status check happens early in auth flow for performance
- Backward compatible: existing configs default to enabled (disabled=false)
### Use Cases
1. Temporary suspension: Disable user access during investigation
2. Key rotation: Deactivate old key before deletion
3. Offboarding: Disable rather than delete for audit purposes
4. Emergency response: Quickly disable compromised credentials
Fixes #7745
|
|
|
|
This resolves GitHub issue #7747 by extracting duplicated IAM code into
a shared package that both the embedded S3 IAM and standalone IAM use.
New shared package (weed/iam/):
- constants.go: Common constants (charsets, action strings, error messages)
- helpers.go: Shared helper functions (Hash, GenerateRandomString,
GenerateAccessKeyId, GenerateSecretAccessKey, StringSlicesEqual,
MapToStatementAction, MapToIdentitiesAction, MaskAccessKey)
- responses.go: Common IAM response structs (CommonResponse, ListUsersResponse,
CreateUserResponse, etc.)
- helpers_test.go: Unit tests for shared helpers
Updated files:
- weed/s3api/s3api_embedded_iam.go: Use type aliases and function wrappers
to the shared package, removing ~200 lines of duplicated code
- weed/iamapi/iamapi_management_handlers.go: Use shared package for constants
and helper functions, removing ~100 lines of duplicated code
- weed/iamapi/iamapi_response.go: Re-export types from shared package for
backwards compatibility
Benefits:
- Single source of truth for IAM constants and helpers
- Easier maintenance - changes only need to be made in one place
- Reduced risk of inconsistencies between embedded and standalone IAM
- Better test coverage through shared test suite
|
|
* Embed IAM API into S3 server
This change simplifies the S3 and IAM deployment by embedding the IAM API
directly into the S3 server, following the patterns used by MinIO and Ceph RGW.
Changes:
- Add -iam flag to S3 server (enabled by default)
- Create embedded IAM API handler in s3api package
- Register IAM routes (POST to /) in S3 server when enabled
- Deprecate standalone 'weed iam' command with warning
Benefits:
- Single binary, single port for both S3 and IAM APIs
- Simpler deployment and configuration
- Shared credential manager between S3 and IAM
- Backward compatible: 'weed iam' still works with deprecation warning
Usage:
- weed s3 -port=8333 # S3 + IAM on same port (default)
- weed s3 -iam=false # S3 only, disable embedded IAM
- weed iam -port=8111 # Deprecated, shows warning
* Fix nil pointer panic: add s3.iam flag to weed server command
The enableIam field was not initialized when running S3 via 'weed server',
causing a nil pointer dereference when checking *s3opt.enableIam.
* Fix nil pointer panic: add s3.iam flag to weed filer command
The enableIam field was not initialized when running S3 via 'weed filer -s3',
causing a nil pointer dereference when checking *s3opt.enableIam.
* Add integration tests for embedded IAM API
Tests cover:
- CreateUser, ListUsers, GetUser, UpdateUser, DeleteUser
- CreateAccessKey, DeleteAccessKey, ListAccessKeys
- CreatePolicy, PutUserPolicy, GetUserPolicy
- Implicit username extraction from authorization header
- Full user lifecycle workflow test
These tests validate the embedded IAM API functionality that was
added in the S3 server, ensuring IAM operations work correctly
when served from the same port as S3.
* Security: Use crypto/rand for IAM credential generation
SECURITY FIX: Replace math/rand with crypto/rand for generating
access keys and secret keys.
Using math/rand is not cryptographically secure and can lead to
predictable credentials. This change:
1. Replaces math/rand with crypto/rand in both:
- weed/s3api/s3api_embedded_iam.go (embedded IAM)
- weed/iamapi/iamapi_management_handlers.go (standalone IAM)
2. Removes the seededRand variable that was initialized with
time-based seed (predictable)
3. Updates StringWithCharset/iamStringWithCharset to:
- Use crypto/rand.Int() for secure random index generation
- Return an error for proper error handling
4. Updates CreateAccessKey to handle the new error return
5. Updates DoActions handlers to propagate errors properly
* Fix critical bug: DeleteUserPolicy was deleting entire user instead of policy
BUG FIX: DeleteUserPolicy was incorrectly deleting the entire user
identity from s3cfg.Identities instead of just clearing the user's
inline policy (Actions).
Before (wrong):
s3cfg.Identities = append(s3cfg.Identities[:i], s3cfg.Identities[i+1:]...)
After (correct):
ident.Actions = nil
Also:
- Added proper iamDeleteUserPolicyResponse / DeleteUserPolicyResponse types
- Fixed return type from iamPutUserPolicyResponse to iamDeleteUserPolicyResponse
Affected files:
- weed/s3api/s3api_embedded_iam.go (embedded IAM)
- weed/iamapi/iamapi_management_handlers.go (standalone IAM)
- weed/iamapi/iamapi_response.go (response types)
* Add tests for DeleteUserPolicy to prevent regression
Added two tests:
1. TestEmbeddedIamDeleteUserPolicy - Verifies that:
- User is NOT deleted (identity still exists)
- Credentials are NOT deleted
- Only Actions (policy) are cleared to nil
2. TestEmbeddedIamDeleteUserPolicyUserNotFound - Verifies:
- Returns 404 when user doesn't exist
These tests ensure the bug fixed in the previous commit
(deleting user instead of policy) doesn't regress.
* Fix race condition: Add mutex lock to IAM DoActions
The DoActions function performs a read-modify-write operation on the
shared IAM configuration without any locking. This could lead to race
conditions and data loss if multiple requests modify the IAM config
concurrently.
Added mutex lock at the start of DoActions in both:
- weed/s3api/s3api_embedded_iam.go (embedded IAM)
- weed/iamapi/iamapi_management_handlers.go (standalone IAM)
The lock protects the entire read-modify-write cycle:
1. GetS3ApiConfiguration (read)
2. Modify s3cfg based on action
3. PutS3ApiConfiguration (write)
* Fix action comparison and document CreatePolicy limitation
1. Replace reflect.DeepEqual with order-independent string slice comparison
- Added iamStringSlicesEqual/stringSlicesEqual helper functions
- Prevents duplicate policy statements when actions are in different order
2. Document CreatePolicy limitation in embedded IAM
- Added TODO comment explaining that managed policies are not persisted
- Users should use PutUserPolicy for inline policies
3. Fix deadlock in standalone IAM's CreatePolicy
- Removed nested lock acquisition (DoActions already holds the lock)
Files changed:
- weed/s3api/s3api_embedded_iam.go
- weed/iamapi/iamapi_management_handlers.go
* Add rate limiting to embedded IAM endpoint
Apply circuit breaker rate limiting to the IAM endpoint to prevent abuse.
Also added request tracking for IAM operations.
The IAM endpoint now follows the same pattern as other S3 endpoints:
- track() for request metrics
- s3a.iam.Auth() for authentication
- s3a.cb.Limit() for rate limiting
* Fix handleImplicitUsername to properly look up username from AccessKeyId
According to AWS spec, when UserName is not specified in an IAM request,
IAM should determine the username implicitly based on the AccessKeyId
signing the request.
Previously, the code incorrectly extracted s[2] (region field) from the
SigV4 credential string as the username. This fix:
1. Extracts the AccessKeyId from s[0] of the credential string
2. Looks up the AccessKeyId in the credential store using LookupByAccessKey
3. Uses the identity's Name field as the username if found
Also:
- Added exported LookupByAccessKey wrapper method to IdentityAccessManagement
- Updated tests to verify correct access key lookup behavior
- Applied fix to both embedded IAM and standalone IAM implementations
* Fix CreatePolicy to not trigger unnecessary save
CreatePolicy validates the policy document and returns metadata but does not
actually store the policy (SeaweedFS uses inline policies attached via
PutUserPolicy). However, 'changed' was left as true, triggering an unnecessary
save operation.
Set changed = false after successful CreatePolicy validation in both embedded
IAM and standalone IAM implementations.
* Improve embedded IAM test quality
- Remove unused mock types (mockCredentialManager, mockEmbeddedIamApi)
- Use proto.Clone instead of proto.Merge for proper deep copy semantics
- Replace brittle regex-based XML error extraction with proper XML unmarshalling
- Remove unused regexp import
- Add state and field assertions to tests:
- CreateUser: verify username in response and user persisted in config
- ListUsers: verify response contains expected users
- GetUser: verify username in response
- CreatePolicy: verify policy metadata in response
- PutUserPolicy: verify actions were attached to user
- CreateAccessKey: verify credentials in response and persisted in config
* Remove shared test state and improve executeEmbeddedIamRequest
- Remove package-level embeddedIamApi variable to avoid shared test state
- Update executeEmbeddedIamRequest to accept API instance as parameter
- Only call xml.Unmarshal when v != nil, making nil-v cases explicit
- Return unmarshal error properly instead of always returning it
- Update all tests to create their own EmbeddedIamApiForTest instance
- Each test now has isolated state, preventing test interdependencies
* Add comprehensive test coverage for embedded IAM
Added tests for previously uncovered functions:
- iamStringSlicesEqual: 0% → 100%
- iamMapToStatementAction: 40% → 100%
- iamMapToIdentitiesAction: 30% → 70%
- iamHash: 100%
- iamStringWithCharset: 85.7%
- GetPolicyDocument: 75% → 100%
- CreatePolicy: 91.7% → 100%
- DeleteUser: 83.3% → 100%
- GetUser: 83.3% → 100%
- ListAccessKeys: 55.6% → 88.9%
New test cases for helper functions, error handling, and edge cases.
* Document IAM code duplication and reference GitHub issue #7747
Added comments to both IAM implementations noting the code duplication
and referencing the tracking issue for future refactoring:
- weed/s3api/s3api_embedded_iam.go (embedded IAM)
- weed/iamapi/iamapi_management_handlers.go (standalone IAM)
See: https://github.com/seaweedfs/seaweedfs/issues/7747
* Implement granular IAM authorization for self-service operations
Previously, all IAM actions required ACTION_ADMIN permission, which was
overly restrictive. This change implements AWS-like granular permissions:
Self-service operations (allowed without admin for own resources):
- CreateAccessKey (on own user)
- DeleteAccessKey (on own user)
- ListAccessKeys (on own user)
- GetUser (on own user)
- UpdateAccessKey (on own user)
Admin-only operations:
- CreateUser, DeleteUser, UpdateUser
- PutUserPolicy, GetUserPolicy, DeleteUserPolicy
- CreatePolicy
- ListUsers
- Operations on other users
The new AuthIam middleware:
1. Authenticates the request (signature verification)
2. Parses the IAM Action and target UserName
3. For self-service actions, allows if user is operating on own resources
4. For all other actions or operations on other users, requires admin
* Fix misleading comment in standalone IAM CreatePolicy
The comment incorrectly stated that CreatePolicy only validates the policy
document. In the standalone IAM server, CreatePolicy actually persists
the policy via iama.s3ApiConfig.PutPolicies(). The changed flag is false
because it doesn't modify s3cfg.Identities, not because nothing is stored.
* Simplify IAM auth and add RequestId to responses
- Remove redundant ACTION_ADMIN fallback in AuthIam: The action parameter
in authRequest is for permission checking, not signature verification.
If auth fails with ACTION_READ, it will fail with ACTION_ADMIN too.
- Add SetRequestId() call before writing IAM responses for AWS compatibility.
All IAM response structs embed iamCommonResponse which has SetRequestId().
* Address code review feedback for IAM implementation
1. auth_credentials.go: Add documentation warning that LookupByAccessKey
returns internal pointers that should not be mutated.
2. iamapi_management_handlers.go & s3api_embedded_iam.go: Add input guards
for StringWithCharset/iamStringWithCharset when length <= 0 or charset
is empty to avoid runtime errors from rand.Int.
3. s3api_embedded_iam_test.go: Don't ignore xml.Marshal errors in test
DoActions handler. Return proper error response if marshaling fails.
4. s3api_embedded_iam_test.go: Use obviously fake access key IDs
(AKIATESTFAKEKEY*) to avoid CI secret scanner false positives.
* Address code review feedback for IAM implementation (batch 2)
1. iamapi/iamapi_management_handlers.go:
- Redact Authorization header log (security: avoid exposing signature)
- Add nil-guard for iama.iam before LookupByAccessKey call
2. iamapi/iamapi_test.go:
- Replace real-looking access keys with obviously fake ones
(AKIATESTFAKEKEY*) to avoid CI secret scanner false positives
3. s3api/s3api_embedded_iam.go - CreateUser:
- Validate UserName is not empty (return ErrCodeInvalidInputException)
- Check for duplicate users (return ErrCodeEntityAlreadyExistsException)
4. s3api/s3api_embedded_iam.go - CreateAccessKey:
- Return ErrCodeNoSuchEntityException if user doesn't exist
- Removed implicit user creation behavior
5. s3api/s3api_embedded_iam.go - getActions:
- Fix S3 ARN parsing for bucket/path patterns
- Handle mybucket, mybucket/*, mybucket/path/* correctly
- Return error if no valid actions found in policy
6. s3api/s3api_embedded_iam.go - handleImplicitUsername:
- Redact Authorization header log
- Add nil-guard for e.iam
7. s3api/s3api_embedded_iam.go - DoActions:
- Reload in-memory IAM maps after credential mutations
- Call LoadS3ApiConfigurationFromCredentialManager after save
8. s3api/auth_credentials.go - AuthSignatureOnly:
- Add new signature-only authentication method
- Bypasses S3 authorization checks for IAM operations
- Used by AuthIam to properly separate signature verification
from IAM-specific permission checks
* Fix nil pointer dereference and error handling in IAM
1. AuthIam: Add nil check for identity after AuthSignatureOnly
- AuthSignatureOnly can return nil identity with ErrNone for
authTypePostPolicy or authTypeStreamingUnsigned
- Now returns ErrAccessDenied if identity is nil
2. writeIamErrorResponse: Add missing error code cases
- ErrCodeEntityAlreadyExistsException -> HTTP 409 Conflict
- ErrCodeInvalidInputException -> HTTP 400 Bad Request
3. UpdateUser: Use consistent error handling
- Changed from direct ErrInvalidRequest to writeIamErrorResponse
- Now returns correct HTTP status codes based on error type
* Add IAM config reload to standalone IAM server after mutations
Match the behavior of embedded IAM (s3api_embedded_iam.go) by reloading
the in-memory identity maps after persisting configuration changes.
This ensures newly created access keys are visible to LookupByAccessKey
immediately without requiring a server restart.
* Minor improvements to test helpers and log masking
1. iamapi_test.go: Update mustMarshalJSON to use t.Helper() and t.Fatal()
instead of panic() for better test diagnostics
2. s3api_embedded_iam.go: Mask access key in 'not found' log message
to avoid exposing full access key IDs in logs
* Mask access key in standalone IAM log message for consistency
Match the embedded IAM version by masking the access key ID in
the 'not found' log message (show only first 4 chars).
|
|
fix: use path instead of filepath for S3 object paths on Windows (#7733)
|
|
The WaitUntilConnected function was not properly respecting context
cancellation when sleeping between attempts. The time.Sleep call would
block for up to 200ms even after the context was cancelled.
This fix uses select with time.After to immediately return when the
context is cancelled, rather than waiting for the sleep to complete.
This fixes flaky test behavior where the function would take ~200ms
to return instead of respecting the ~100ms context timeout.
|
|
|
|
test: add HTTPS test cases for CORS wildcard subdomain matching
This adds comprehensive test coverage for HTTPS subdomain wildcard matching
in TestMatchesOrigin:
- https exact match
- https no match
- https wildcard subdomain match
- https wildcard subdomain no match (base domain)
- https wildcard subdomain no match (different domain)
- protocol mismatch tests (http pattern vs https origin and vice versa)
The matchWildcard function was already working correctly - this just adds
test coverage for the HTTPS cases that were previously untested.
Note: The cache invalidation is already handled synchronously by
setBucketMetadata() which is called via:
- UpdateBucketCORS -> UpdateBucketMetadata -> setBucketMetadata
- ClearBucketCORS -> UpdateBucketMetadata -> setBucketMetadata
Added clarifying comments to document this call chain.
|
|
Object Lock (#7734)
* fix: admin UI bucket delete now properly deletes collection and checks Object Lock
Fixes #7711
The admin UI's DeleteS3Bucket function was missing two critical behaviors:
1. It did not delete the collection from the master (unlike s3.bucket.delete
shell command), leaving orphaned volume data that caused fs.verify errors.
2. It did not check for Object Lock protections before deletion, potentially
allowing deletion of buckets with locked objects.
Changes:
- Add shared Object Lock checking utilities to object_lock_utils.go:
- EntryHasActiveLock: standalone function to check if an entry has active lock
- HasObjectsWithActiveLocks: shared function to scan bucket for locked objects
- Refactor S3 API entryHasActiveLock to use shared EntryHasActiveLock function
- Update admin UI DeleteS3Bucket to:
- Check Object Lock using shared HasObjectsWithActiveLocks utility
- Delete the collection before deleting filer entries (matching s3.bucket.delete)
* refactor: S3 API uses shared Object Lock utilities
Removes 114 lines of duplicated code from s3api_bucket_handlers.go by
having hasObjectsWithActiveLocks delegate to the shared HasObjectsWithActiveLocks
function in object_lock_utils.go.
Now both S3 API and Admin UI use the same shared utilities:
- EntryHasActiveLock
- HasObjectsWithActiveLocks
- recursivelyCheckLocksWithClient
- checkVersionsForLocksWithClient
* feat: s3.bucket.delete shell command now checks Object Lock
Add Object Lock protection to the s3.bucket.delete shell command.
If the bucket has Object Lock enabled and contains objects with active
retention or legal hold, deletion is prevented.
Also refactors Object Lock checking utilities into a new s3_objectlock
package to avoid import cycles between shell, s3api, and admin packages.
All three components now share the same logic:
- S3 API (DeleteBucketHandler)
- Admin UI (DeleteS3Bucket)
- Shell command (s3.bucket.delete)
* refactor: unified Object Lock checking and consistent deletion parameters
1. Add CheckBucketForLockedObjects() - a unified function that combines:
- Bucket entry lookup
- Object Lock enabled check
- Scan for locked objects
2. All three components now use this single function:
- S3 API (via s3api.CheckBucketForLockedObjects)
- Admin UI (via s3api.CheckBucketForLockedObjects)
- Shell command (via s3_objectlock.CheckBucketForLockedObjects)
3. Aligned deletion parameters across all components:
- isDeleteData: false (collection already deleted separately)
- isRecursive: true
- ignoreRecursiveError: true
* fix: properly handle non-EOF errors in Recv() loops
The Recv() loops in recursivelyCheckLocksWithClient and
checkVersionsForLocksWithClient were breaking on any error, which
could hide real stream errors and incorrectly report 'no locks found'.
Now:
- io.EOF: break loop (normal end of stream)
- any other error: return it so caller knows the stream failed
* fix: address PR review comments
1. Add path traversal protection - validate entry names before building
subdirectory paths. Skip entries with empty names, '.', '..', or
containing path separators.
2. Use exact match for .versions folder instead of HasSuffix() to avoid
mismatching unrelated directories like 'foo.versions'.
3. Replace path.Join with simple string concatenation since we now
validate entry names.
* refactor: extract paginateEntries helper to reduce duplication
The recursivelyCheckLocksWithClient and checkVersionsForLocksWithClient
functions shared significant structural similarity. Extracted a generic
paginateEntries helper that:
- Handles pagination logic (lastFileName tracking, Limit)
- Handles stream receiving with proper EOF vs error handling
- Validates entry names (path traversal protection)
- Calls a processEntry callback for business logic
This centralizes pagination logic and makes the code more maintainable.
* feat: add context propagation for timeout and cancellation support
All Object Lock checking functions now accept context.Context parameter:
- paginateEntries(ctx, client, dir, processEntry)
- recursivelyCheckLocksWithClient(ctx, client, dir, hasLocks, currentTime)
- checkVersionsForLocksWithClient(ctx, client, versionsDir, hasLocks, currentTime)
- HasObjectsWithActiveLocks(ctx, client, bucketPath)
- CheckBucketForLockedObjects(ctx, client, bucketsPath, bucketName)
This enables:
- Timeout support for large bucket scans
- Cancellation propagation from HTTP requests
- The S3 API handler now uses r.Context() for proper request lifecycle
* fix: address PR review comments
1. Add DefaultBucketsPath constant in admin_server.go instead of
hardcoding "/buckets" in multiple places.
2. Add defensive normalization in EntryHasActiveLock:
- TrimSpace to handle whitespace around values
- ToUpper for case-insensitive comparison of legal hold and
retention mode values
- TrimSpace on retention date before parsing
* fix: use ctx variable consistently instead of context.Background()
In both DeleteS3Bucket and command_s3_bucket_delete, use the ctx
variable defined at the start of the function for all gRPC calls
instead of creating new context.Background() instances.
|
|
Fixes #7721
When S3 server is configured with a filerGroup, it creates a MasterClient
to enable dynamic filer discovery. However, the KeepConnectedToMaster()
background goroutine was never started, causing GetMaster() to block
indefinitely in WaitUntilConnected().
This resulted in the log message:
WaitUntilConnected still waiting for master connection (attempt N)...
being logged repeatedly every ~20 seconds.
The fix adds the missing 'go masterClient.KeepConnectedToMaster(ctx)'
call to properly establish the connection to master servers.
Also adds unit tests to verify WaitUntilConnected respects context
cancellation.
|
|
* test: add integration test for versioned object listing path fix
Add integration test that validates the fix for GitHub discussion #7573.
The test verifies that:
- Entry names use path.Base() to get base filename only
- Path doubling bug is prevented when listing versioned objects
- Logical entries are created correctly with proper attributes
- .versions folder paths are handled correctly
This test documents the Velero/Kopia compatibility fix and prevents
regression of the path doubling bug.
* test: add Velero/Kopia integration test for versioned object listing
Add integration tests that simulate Velero/Kopia's exact access patterns
when using S3 versioning. These tests validate the fix for GitHub
discussion #7573 where versioned objects with nested paths would have
their paths doubled in ListObjects responses.
Tests added:
- TestVeleroKopiaVersionedObjectListing: Tests various Kopia path patterns
- TestVeleroKopiaGetAfterList: Verifies list-then-get workflow works
- TestVeleroMultipleVersionsWithNestedPaths: Tests multi-version objects
- TestVeleroListVersionsWithNestedPaths: Tests ListObjectVersions API
Each test verifies:
1. Listed keys match original keys without path doubling
2. Objects can be retrieved using the listed keys
3. Content integrity is maintained
Related: https://github.com/seaweedfs/seaweedfs/discussions/7573
* refactor: remove old unit test, keep only Velero integration test
Remove weed/s3api/s3api_versioning_list_test.go as it was a simpler
unit test that the comprehensive Velero integration test supersedes.
The integration test in test/s3/versioning/s3_velero_integration_test.go
provides better coverage by actually exercising the S3 API with real
AWS SDK calls.
* refactor: use defer for response body cleanup in test loop
Use anonymous function with defer for getResp.Body.Close() to be more
defensive against future code additions in the loop body.
* refactor: improve hasDoubledPath function clarity and efficiency
- Fix comment to accurately describe checking for repeated pairs
- Tighten outer loop bound from len(parts)-2 to len(parts)-3
- Remove redundant bounds checks in the condition
|
|
* volume.fsck: increase default cutoffTimeAgo from 5 minutes to 5 hours
This change makes the fsck check more conservative by only considering
chunks older than 5 hours as potential orphans. A 5 minute window was
too aggressive and could incorrectly flag recently written chunks,
especially in busy systems or during backup operations.
Addresses #7649
* Update command_volume_fsck.go
* volume.fsck: add help text explaining cutoffTimeAgo parameter
* Update command_volume_fsck.go
|
|
* fix: prevent path doubling in versioned object listing
Fix path doubling bug in getLatestVersionEntryForListOperation that caused
Velero/Kopia backups to fail when using S3 bucket versioning.
The issue was that when creating logical entries for versioned object listing,
the entry.Name was set to the full object path (e.g., 'kopia/logpaste/kopia.blobcfg')
instead of just the base filename ('kopia.blobcfg'). When this entry was used in
the list callback which combines dir + entry.Name, the paths got doubled:
'/buckets/velero/kopia/logpaste/kopia/logpaste/kopia.blobcfg'
This caused Kopia to fail loading pack indexes with the error:
'unable to load pack indexes despite 10 retries'
The fix uses path.Base(object) to extract only the filename portion, matching
how regular (non-versioned) entries work in the listing callback.
Fixes: GitHub discussion #7573
* refactor: use path.Base directly in test instead of reimplementing
Address code review feedback to simplify the test by using the standard
library path.Base function directly instead of reimplementing it.
* remove test: unit test for path.Base doesn't add much value
|
|
(#7704)
Detailed file information will be implemented in a follow-up MR. Note also
that masters are currently not reporting back EC shard sizes correctly, via
`master_pb.VolumeEcShardInformationMessage.shard_sizes`.
F.ex:
```
> cluster.status
cluster:
id: topo
status: LOCKED
nodes: 10
topology: 1 DC(s)s, 1 disk(s) on 1 rack(s)
volumes:
total: 3 volumes on 1 collections
max size: 31457280000 bytes
regular: 2/80 volumes on 6 replicas, 6 writable (100.00%), 0 read-only (0.00%)
EC: 1 EC volumes on 14 shards (14.00 shards/volume)
storage:
total: 186024424 bytes
regular volumes: 186024424 bytes
EC volumes: 0 bytes
raw: 558073152 bytes on volume replicas, 0 bytes on EC shard files
```
|
|
* shell: add -owner flag to s3.bucket.create command
This fixes an issue where buckets created via weed shell cannot be accessed
by non-admin S3 users because the bucket has no owner set.
When using S3 IAM authentication, non-admin users can only access buckets
they own. Buckets created via lazy S3 creation automatically have their
owner set from the request context, but buckets created via weed shell
had no owner, making them inaccessible to non-admin users.
The new -owner flag allows setting the bucket owner identity (s3-identity-id)
at creation time:
s3.bucket.create -name my-bucket -owner my-identity-name
Fixes: https://github.com/seaweedfs/seaweedfs/discussions/7599
* shell: add s3.bucket.owner command to view/change bucket ownership
This command allows viewing and changing the owner of an S3 bucket,
making it easier to manage bucket access for IAM users.
Usage:
# View the current owner of a bucket
s3.bucket.owner -name my-bucket
# Set or change the owner of a bucket
s3.bucket.owner -name my-bucket -set -owner new-identity
# Remove the owner (make bucket admin-only)
s3.bucket.owner -name my-bucket -set -owner ""
* shell: show bucket owner in s3.bucket.list output
Display the bucket owner (s3-identity-id) when listing buckets,
making it easier to see which identity owns each bucket.
Example output:
my-bucket size:1024 chunk:5 owner:my-identity
* admin: add bucket owner support to admin UI
- Add Owner field to S3Bucket struct for displaying bucket ownership
- Add Owner field to CreateBucketRequest for setting owner at creation
- Add UpdateBucketOwner API endpoint (PUT /api/s3/buckets/:bucket/owner)
- Add SetBucketOwner function for updating bucket ownership
- Update GetS3Buckets to populate owner from s3-identity-id extended attribute
- Update CreateS3BucketWithObjectLock to set owner when creating bucket
This allows the admin UI to display bucket owners and supports creating/
editing bucket ownership, which is essential for S3 IAM authentication
where non-admin users can only access buckets they own.
* admin: show bucket owner in buckets list and create form
- Add Owner column to buckets table to display bucket ownership
- Add Owner field to create bucket form for setting owner at creation
- Show owner in bucket details modal
- Update JavaScript to include owner when creating buckets
This makes bucket ownership visible and configurable from the admin UI,
which is essential for S3 IAM authentication where non-admin users can
only access buckets they own.
* admin: add bucket owner management with user dropdown
- Add 'Manage Owner' button to bucket actions
- Add modal with dropdown to select owner from existing users
- Fetch users from /api/users endpoint to populate dropdown
- Update create bucket form to use dropdown for owner selection
- Allow setting owner to empty (no owner = admin-only access)
This provides a user-friendly way to manage bucket ownership by selecting
from existing S3 identities rather than manually typing identity names.
* fix: use username instead of name for user dropdown
The /api/users endpoint returns 'username' field, not 'name'.
Fixed both the manage owner modal and create bucket form.
* Update s3_buckets_templ.go
* fix: address code review feedback for s3.bucket.create
- Check if entry.Extended is nil before making a new map to prevent
overwriting any previously set extended attributes
- Use fmt.Fprintln(writer, ...) instead of println() for consistent
output handling across the shell command framework
* fix: improve help text and validate owner input
- Add note that -owner value should match identity name in s3.json
- Trim whitespace from owner and treat whitespace-only as empty
* fix: address code review feedback for list and owner commands
- s3.bucket.list: Use %q to escape owner value and prevent malformed
tabular output from special characters (tabs/newlines/control chars)
- s3.bucket.owner: Use neutral error message for lookup failures since
they can occur for reasons other than missing bucket (e.g., permission)
* fix: improve s3.bucket.owner CLI UX
- Remove confusing -set flag that was required but not shown in examples
- Add explicit -delete flag to remove owner (safer than empty string)
- Presence of -owner now implies set operation (no extra flag needed)
- Validate that -owner and -delete cannot be used together
- Trim whitespace from owner value
- Update help text with correct examples and add note about identity name
- Clearer success messages for each operation
* fix: address code review feedback for admin UI
- GetBucketDetails: Extract and return owner from extended attributes
- CSV export: Fix column indices after adding Owner column, add Owner to header
- XSS prevention: Add escapeHtml() function to sanitize user data in innerHTML
(bucket.name, bucket.owner, bucket.object_lock_mode, obj.key, obj.storage_class)
* fix: address additional code review feedback
- types.go: Add omitempty to Owner JSON tag, update comment
- bucket_management.go: Trim and validate owner (max 256 chars) in CreateBucket
- bucket_management.go: Use neutral error message in SetBucketOwner lookup
* fix: improve owner field handling and error recovery
bucket_management.go:
- Use *string pointer for Owner to detect if field was explicitly provided
- Return HTTP 400 if owner field is missing (use empty string to clear)
- Trim and validate owner (max 256 chars) in UpdateBucketOwner
s3_buckets.templ:
- Re-enable owner select dropdown on fetch error
- Reset dropdown to default 'No owner' option on error
- Allow users to retry or continue without selecting an owner
* fix: move modal instance variables to global scope
Move deleteModalInstance, quotaModalInstance, ownerModalInstance,
detailsModalInstance, and cachedUsers to global scope so they are
accessible from both DOMContentLoaded handlers and global functions
like deleteBucket(). This fixes the undefined variable issue.
* refactor: improve modal handling and avoid global window properties
- Initialize modal instances once on DOMContentLoaded and reuse with show()
- Replace window.currentBucket* global properties with data attributes on forms
- Remove modal dispose/recreate pattern and unnecessary cleanup code
- Scope state to relevant DOM elements instead of global namespace
* Update s3_buckets_templ.go
* fix: define MaxOwnerNameLength constant and implement RFC 4180 CSV escaping
bucket_management.go:
- Add MaxOwnerNameLength constant (256) with documentation
- Replace magic number 256 with constant in both validation checks
s3_buckets.templ:
- Add escapeCsvField() helper for RFC 4180 compliant CSV escaping
- Properly handle commas, double quotes, and newlines in field values
- Escape internal quotes by doubling them (")→("")
* Update s3_buckets_templ.go
* refactor: use direct gRPC client methods for consistency
- command_s3_bucket_create.go: Use client.CreateEntry instead of filer_pb.CreateEntry
- command_s3_bucket_owner.go: Use client.LookupDirectoryEntry instead of filer_pb.LookupEntry
- command_s3_bucket_owner.go: Use client.UpdateEntry instead of filer_pb.UpdateEntry
This aligns with the pattern used in weed/admin/dash/bucket_management.go
|
|
When both -s3.config and -s3.iam.config are configured, traditional
credentials from -s3.config were failing with Access Denied because
the authorization code always used IAM authorization when IAM
integration was configured.
The fix checks if the identity has legacy Actions (from -s3.config).
If so, use the legacy canDo() authorization. Only use IAM authorization
for JWT/STS identities that don't have legacy Actions.
This allows both configuration options to coexist:
- Traditional credentials use legacy authorization
- JWT/STS credentials use IAM authorization
Fixes #7720
|
|
When only IAM integration is configured (via -s3.iam.config) without
traditional S3 identities, the isAuthEnabled flag was not being set,
causing the Auth middleware to bypass all authentication checks.
This fix ensures that when SetIAMIntegration is called with a non-nil
integration, isAuthEnabled is set to true, properly enforcing
authentication for all requests.
Added negative authentication tests:
- TestS3AuthenticationDenied: tests rejection of unauthenticated,
invalid, and expired JWT requests
- TestS3IAMOnlyModeRejectsAnonymous: tests that IAM-only mode
properly rejects anonymous requests
Fixes #7724
|
|
* filer: reduce allocations in MatchStorageRule
Optimize MatchStorageRule to avoid allocations in common cases:
- Return singleton emptyPathConf when no rules match (zero allocations)
- Return existing rule directly when only one rule matches (zero allocations)
- Only allocate and merge when multiple rules match (rare case)
Based on heap profile analysis showing 111MB allocated from 1.64M calls
to this function during 180 seconds of operation.
* filer: add fast path for getActualStore when no path-specific stores
Add hasPathSpecificStore flag to FilerStoreWrapper to skip
the MatchPrefix() call and []byte(path) conversion when no
path-specific stores are configured (the common case).
Based on heap profile analysis showing 1.39M calls to this
function during 180 seconds of operation, each requiring a
string-to-byte slice conversion for the MatchPrefix call.
* filer/foundationdb: use sync.Pool for tuple allocation in genKey
Use sync.Pool to reuse tuple.Tuple slices in genKey(), reducing
allocation overhead for every FoundationDB operation.
Based on heap profile analysis showing 102MB allocated from 1.79M
calls to genKey() during 180 seconds of operation. The Pack() call
still allocates internally, but this reduces the tuple slice
allocation overhead by ~50%.
* filer: use sync.Pool for protobuf Entry and FuseAttributes
Add pooling for filer_pb.Entry and filer_pb.FuseAttributes in
EncodeAttributesAndChunks and DecodeAttributesAndChunks to reduce
allocations during filer store operations.
Changes:
- Add pbEntryPool with pre-allocated FuseAttributes
- Add EntryAttributeToExistingPb for in-place attribute conversion
- Update ToExistingProtoEntry to reuse existing Attributes when available
Based on heap profile showing:
- EncodeAttributesAndChunks: 69.5MB cumulative
- DecodeAttributesAndChunks: 46.5MB cumulative
- EntryAttributeToPb: 47.5MB flat allocations
* log_buffer: use sync.Pool for LogEntry in readTs
Add logEntryPool to reuse filer_pb.LogEntry objects in readTs(),
which is called frequently during binary search in ReadFromBuffer.
This function only needs the TsNs field from the unmarshaled entry,
so pooling the LogEntry avoids repeated allocations.
Based on heap profile showing readTs with 188MB cumulative allocations
from timestamp lookups during log buffer reads.
* pb: reduce gRPC metadata allocations in interceptor
Optimize requestIDUnaryInterceptor and WithGrpcClient to reduce
metadata allocations on every gRPC request:
- Use AppendToOutgoingContext instead of NewOutgoingContext + New()
This avoids creating a new map[string]string for single key-value pairs
- Check FromIncomingContext return value before using metadata
Based on heap profile showing metadata operations contributing 0.45GB
(10.5%) of allocations, with requestIDUnaryInterceptor being the main
source at 0.44GB cumulative.
Expected reduction: ~0.2GB from avoiding map allocations per request.
* filer/log_buffer: address code review feedback
- Use proto.Reset() instead of manual field clearing in resetLogEntry
for more idiomatic and comprehensive state clearing
- Add resetPbEntry() call before pool return in error path for
consistency with success path in DecodeAttributesAndChunks
* log_buffer: reduce PreviousBufferCount from 32 to 4
Reduce the number of retained previous buffers from 32 to 4.
Each buffer is 8MB, so this reduces the maximum retained memory
from 256MB to 32MB for previous buffers.
Most subscribers catch up quickly, so 4 buffers (32MB) should
be sufficient while significantly reducing memory footprint.
* filer/foundationdb: use defer for tuple pool cleanup in genKey
Refactor genKey to use defer for returning the pooled tuple.
This ensures the pooled object is always returned even if
store.seaweedfsDir.Pack panics, making the code more robust.
Also simplifies the code by removing the temporary variable.
* filer: early-stop MatchStorageRule prescan after 2 matches
Stop the prescan callback after finding 2 matches since we only
need to know if there are 0, 1, or multiple matches. This avoids
unnecessarily scanning the rest of the trie when many rules exist.
* fix: address critical code review issues
filer_conf.go:
- Remove mutable singleton emptyPathConf that could corrupt shared state
- Return fresh copy for no-match case and cloned copy for single-match case
- Add clonePathConf helper to create shallow copies safely
grpc_client_server.go:
- Remove incorrect AppendToOutgoingContext call in server interceptor
(that API is for outbound client calls, not server-side handlers)
- Rely on request_id.Set and SetTrailer for request ID propagation
* fix: treat FilerConf_PathConf as immutable
Fix callers that were incorrectly mutating the returned PathConf:
- filer_server_handlers_write.go: Use local variable for MaxFileNameLength
instead of mutating the shared rule
- command_s3_bucket_quota_check.go: Create new PathConf explicitly when
modifying config instead of mutating the returned one
This allows MatchStorageRule to safely return the singleton or direct
references without copying, restoring the memory optimization.
Callers must NOT mutate the returned *FilerConf_PathConf.
* filer: add ClonePathConf helper for creating mutable copies
Add reusable ClonePathConf function that creates a mutable copy of
a PathConf. This is useful when callers need to modify config before
calling SetLocationConf.
Update command_s3_bucket_quota_check.go to use the new helper.
Also fix redundant return statement in DeleteLocationConf.
* fmt
* filer: fix protobuf pool reset to clear internal fields
Address code review feedback:
1. resetPbEntry/resetFuseAttributes: Use struct assignment (*e = T{})
instead of field-by-field reset to clear protobuf internal fields
(unknownFields, sizeCache) that would otherwise accumulate across
pool reuses, causing data corruption or memory bloat.
2. EntryAttributeToExistingPb: Add nil guard for attr parameter to
prevent panic if caller passes nil.
* log_buffer: reset logEntry before pool return in error path
For consistency with success path, reset the logEntry before putting
it back in the pool in the error path. This prevents the pooled object
from holding references to partially unmarshaled data.
* filer: optimize MatchStorageRule and document ClonePathConf
1. Avoid double []byte(path) conversion in multi-match case by
converting once and reusing pathBytes.
2. Add IMPORTANT comment to ClonePathConf documenting that it must
be kept in sync with filer_pb.FilerConf_PathConf fields when
the protobuf evolves.
* filer/log_buffer: fix data race and use defer for pool cleanup
1. entry_codec.go EncodeAttributesAndChunks: Fix critical data race -
proto.Marshal may return a slice sharing memory with the message.
Copy the data before returning message to pool to prevent corruption.
2. entry_codec.go DecodeAttributesAndChunks: Use defer for cleaner
pool management, ensuring message is always returned to pool.
3. log_buffer.go readTs: Use defer for pool cleanup, removing
duplicated resetLogEntry/Put calls in success and error paths.
* filer: fix ClonePathConf field order and add comprehensive test
1. Fix field order in ClonePathConf to match protobuf struct definition
(WormGracePeriodSeconds before WormRetentionTimeSeconds).
2. Add TestClonePathConf that constructs a fully-populated PathConf,
calls ClonePathConf, and asserts equality of all exported fields.
This will catch future schema drift when new fields are added.
3. Add TestClonePathConfNil to verify nil handling.
* filer: use reflection in ClonePathConf test to detect schema drift
Replace hardcoded field comparisons with reflection-based comparison.
This automatically catches:
1. New fields added to the protobuf but not copied in ClonePathConf
2. Missing non-zero test values for any exported field
The test iterates over all exported fields using reflect and compares
src vs clone values, failing if any field differs.
* filer: update EntryAttributeToExistingPb comment to reflect nil handling
The function safely handles nil attr by returning early, but the comment
incorrectly stated 'attr must not be nil'. Update comment to accurately
describe the defensive behavior.
* Fix review feedback: restore request ID propagation and remove redundant resets
1. grpc_client_server.go: Restore AppendToOutgoingContext for request ID
so handlers making downstream gRPC calls will automatically propagate
the request ID to downstream services.
2. entry_codec.go: Remove redundant resetPbEntry calls after Get.
The defer block ensures reset before Put, so next Get receives clean object.
3. log_buffer.go: Remove redundant resetLogEntry call after Get for
same reason - defer already handles reset before Put.
|
|
Bumps [github.com/quic-go/quic-go](https://github.com/quic-go/quic-go) from 0.54.1 to 0.57.0.
- [Release notes](https://github.com/quic-go/quic-go/releases)
- [Commits](https://github.com/quic-go/quic-go/compare/v0.54.1...v0.57.0)
---
updated-dependencies:
- dependency-name: github.com/quic-go/quic-go
dependency-version: 0.57.0
dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
|
|
* filer.sync: fix race condition on first checkpoint save
Initialize lastWriteTime to time.Now() instead of zero time to prevent
the first checkpoint save from being triggered immediately when the
first event arrives. This gives async jobs time to complete and update
the watermark before the checkpoint is saved.
Previously, the zero time caused lastWriteTime.Add(3s).Before(now) to
be true on the first event, triggering an immediate checkpoint save
attempt. But since jobs are processed asynchronously, the watermark
was still 0 (initial value), causing the save to be skipped due to
the 'if offsetTsNs == 0 { return nil }' check.
Fixes #7717
* filer.sync: save checkpoint on graceful shutdown
Add graceful shutdown handling to save the final checkpoint when
filer.sync is terminated. Previously, any sync progress within the
last 3-second checkpoint interval would be lost on shutdown.
Changes:
- Add syncState struct to track current processor and offset save info
- Add atomic pointers syncStateA2B and syncStateB2A for both directions
- Register grace.OnInterrupt hook to save checkpoints on shutdown
- Modify doSubscribeFilerMetaChanges to update sync state atomically
This ensures that when filer.sync is restarted, it resumes from the
correct position instead of potentially replaying old events.
Fixes #7717
|
|
* test: fix master client timeout causing test hangs
Use the main test context for KeepConnectedToMaster instead of creating
a separate 60s context. The tests have 180s outer timeouts but the master
client was disconnecting after 60s, causing subsequent commands to hang
waiting for reconnection.
* test: add -peers=none to all test masters and timeout for lock
- Add -peers=none flag to all master servers for faster startup
- Add tryLockWithTimeout helper to avoid tests hanging on lock acquisition
- Skip tests if lock cannot be acquired within 30 seconds
* test: extract connectToMasterAndSync helper to reduce duplication
* test: fix captureCommandOutput pipe deadlock
Close write end of pipe before calling io.ReadAll to signal EOF,
otherwise ReadAll blocks forever waiting for more data.
* test: fix tryLockWithTimeout to check lock command errors
Propagate lock command error through channel and only treat as
locked if command succeeded. Previously any completion (including
errors) was treated as successful lock acquisition.
|
|
* s3: fix presigned POST upload missing slash between bucket and key
When uploading a file using presigned POST (e.g., boto3.generate_presigned_post),
the file was saved with the bucket name and object key concatenated without a
slash (e.g., 'my-bucketfilename' instead of 'my-bucket/filename').
The issue was that PostPolicyBucketHandler retrieved the object key from form
values without ensuring it had a leading slash, unlike GetBucketAndObject()
which normalizes the key.
Fixes #7713
* s3: add tests for presigned POST key normalization
Add comprehensive tests for PostPolicyBucketHandler to ensure:
- Object keys without leading slashes are properly normalized
- ${filename} substitution works correctly with normalization
- Path construction correctly separates bucket and key
- Form value extraction works properly
These tests would have caught the bug fixed in the previous commit
where keys like 'test_image.png' were concatenated with bucket
without a separator, resulting in 'my-buckettest_image.png'.
* s3: create normalizeObjectKey function for robust key normalization
Address review feedback by creating a reusable normalizeObjectKey function
that both adds a leading slash and removes duplicate slashes, aligning with
how other handlers process paths (e.g., toFilerPath uses removeDuplicateSlashes).
The function handles edge cases like:
- Keys without leading slashes (the original bug)
- Keys with duplicate slashes (e.g., 'a//b' -> '/a/b')
- Keys with leading duplicate slashes (e.g., '///a' -> '/a')
Updated tests to use the new function and added TestNormalizeObjectKey
for comprehensive coverage of the new function.
* s3: move NormalizeObjectKey to s3_constants for shared use
Move the NormalizeObjectKey function to the s3_constants package so it can
be reused by:
- GetBucketAndObject() - now normalizes all object keys from URL paths
- GetPrefix() - now normalizes prefix query parameters
- PostPolicyBucketHandler - normalizes keys from form values
This ensures consistent object key normalization across all S3 API handlers,
handling both missing leading slashes and duplicate slashes.
Benefits:
- Single source of truth for key normalization
- GetBucketAndObject now removes duplicate slashes (previously only added leading slash)
- All handlers benefit from the improved normalization automatically
|
|
* ec: add diskType parameter to core EC functions
Add diskType parameter to:
- ecBalancer struct
- collectEcVolumeServersByDc()
- collectEcNodesForDC()
- collectEcNodes()
- EcBalance()
This allows EC operations to target specific disk types (hdd, ssd, etc.)
instead of being hardcoded to HardDriveType only.
For backward compatibility, all callers currently pass types.HardDriveType
as the default value. Subsequent commits will add -diskType flags to
the individual EC commands.
* ec: update helper functions to use configurable diskType
Update the following functions to accept/use diskType parameter:
- findEcVolumeShards()
- addEcVolumeShards()
- deleteEcVolumeShards()
- moveMountedShardToEcNode()
- countShardsByRack()
- pickNEcShardsToMoveFrom()
All ecBalancer methods now use ecb.diskType instead of hardcoded
types.HardDriveType. Non-ecBalancer callers (like volumeServer.evacuate
and ec.rebuild) use types.HardDriveType as the default.
Update all test files to pass diskType where needed.
* ec: add -diskType flag to ec.balance and ec.encode commands
Add -diskType flag to specify the target disk type for EC operations:
- ec.balance -diskType=ssd
- ec.encode -diskType=ssd
The disk type can be 'hdd', 'ssd', or empty for default (hdd).
This allows placing EC shards on SSD or other disk types instead of
only HDD.
Example usage:
ec.balance -collection=mybucket -diskType=ssd -apply
ec.encode -collection=mybucket -diskType=ssd -force
* test: add integration tests for EC disk type support
Add integration tests to verify the -diskType flag works correctly:
- TestECDiskTypeSupport: Tests EC encode and balance with SSD disk type
- TestECDiskTypeMixedCluster: Tests EC operations on a mixed HDD/SSD cluster
The tests verify:
- Volume servers can be configured with specific disk types
- ec.encode accepts -diskType flag and encodes to the correct disk type
- ec.balance accepts -diskType flag and balances on the correct disk type
- Mixed disk type clusters work correctly with separate collections
* ec: add -sourceDiskType to ec.encode and -diskType to ec.decode
ec.encode:
- Add -sourceDiskType flag to filter source volumes by disk type
- This enables tier migration scenarios (e.g., SSD volumes → HDD EC shards)
- -diskType specifies target disk type for EC shards
ec.decode:
- Add -diskType flag to specify source disk type where EC shards are stored
- Update collectEcShardIds() and collectEcNodeShardBits() to accept diskType
Examples:
# Encode SSD volumes to HDD EC shards (tier migration)
ec.encode -collection=mybucket -sourceDiskType=ssd -diskType=hdd
# Decode EC shards from SSD
ec.decode -collection=mybucket -diskType=ssd
Integration tests updated to cover new flags.
* ec: fix variable shadowing and add -diskType to ec.rebuild and volumeServer.evacuate
Address code review comments:
1. Fix variable shadowing in collectEcVolumeServersByDc():
- Rename loop variable 'diskType' to 'diskTypeKey' and 'diskTypeStr'
to avoid shadowing the function parameter
2. Fix hardcoded HardDriveType in ecBalancer methods:
- balanceEcRack(): use ecb.diskType instead of types.HardDriveType
- collectVolumeIdToEcNodes(): use ecb.diskType
3. Add -diskType flag to ec.rebuild command:
- Add diskType field to ecRebuilder struct
- Pass diskType to collectEcNodes() and addEcVolumeShards()
4. Add -diskType flag to volumeServer.evacuate command:
- Add diskType field to commandVolumeServerEvacuate struct
- Pass diskType to collectEcVolumeServersByDc() and moveMountedShardToEcNode()
* test: add diskType field to ecBalancer in TestPickEcNodeToBalanceShardsInto
Address nitpick comment: ensure test ecBalancer struct has diskType
field set for consistency with other tests.
* ec: filter disk selection by disk type in pickBestDiskOnNode
When evacuating or rebalancing EC shards, pickBestDiskOnNode now
filters disks by the target disk type. This ensures:
1. EC shards from SSD disks are moved to SSD disks on destination nodes
2. EC shards from HDD disks are moved to HDD disks on destination nodes
3. No cross-disk-type shard movement occurs
This maintains the storage tier isolation when moving EC shards
between nodes during evacuation or rebalancing operations.
* ec: allow disk type fallback during evacuation
Update pickBestDiskOnNode to accept a strictDiskType parameter:
- strictDiskType=true (balancing): Only use disks of matching type.
This maintains storage tier isolation during normal rebalancing.
- strictDiskType=false (evacuation): Prefer same disk type, but
fall back to other disk types if no matching disk is available.
This ensures evacuation can complete even when same-type capacity
is insufficient.
Priority order for evacuation:
1. Same disk type with lowest shard count (preferred)
2. Different disk type with lowest shard count (fallback)
* test: use defer for lock/unlock to prevent lock leaks
Use defer to ensure locks are always released, even on early returns
or test failures. This prevents lock leaks that could cause subsequent
tests to hang or fail.
Changes:
- Return early if lock acquisition fails
- Immediately defer unlock after successful lock
- Remove redundant explicit unlock calls at end of tests
- Fix unused variable warning (err -> encodeErr/locErr)
* ec: dynamically discover disk types from topology for evacuation
Disk types are free-form tags (e.g., 'ssd', 'nvme', 'archive') that come
from the topology, not a hardcoded set. Only 'hdd' (or empty) is the
default disk type.
Use collectVolumeDiskTypes() to discover all disk types present in the
cluster topology instead of hardcoding [HardDriveType, SsdType].
* test: add evacuation fallback and cross-rack EC placement tests
Add two new integration tests:
1. TestEvacuationFallbackBehavior:
- Tests that when same disk type has no capacity, shards fall back
to other disk types during evacuation
- Creates cluster with 1 SSD + 2 HDD servers (limited SSD capacity)
- Verifies pickBestDiskOnNode behavior with strictDiskType=false
2. TestCrossRackECPlacement:
- Tests EC shard distribution across different racks
- Creates cluster with 4 servers in 4 different racks
- Verifies shards are spread across multiple racks
- Tests that ec.balance respects rack placement
Helper functions added:
- startLimitedSsdCluster: 1 SSD + 2 HDD servers
- startMultiRackCluster: 4 servers in 4 racks
- countShardsPerRack: counts EC shards per rack from disk
* test: fix collection mismatch in TestCrossRackECPlacement
The EC commands were using collection 'rack_test' but uploaded test data
uses collection 'test' (default). This caused ec.encode/ec.balance to not
find the uploaded volume.
Fix: Change EC commands to use '-collection test' to match the uploaded data.
Addresses review comment from PR #7607.
* test: close log files in MultiDiskCluster.Stop() to prevent FD leaks
Track log files in MultiDiskCluster.logFiles and close them in Stop()
to prevent file descriptor accumulation in long-running or many-test
scenarios.
Addresses review comment about logging resources cleanup.
* test: improve EC integration tests with proper assertions
- Add assertNoFlagError helper to detect flag parsing regressions
- Update diskType subtests to fail on flag errors (ec.encode, ec.balance, ec.decode)
- Update verify_disktype_flag_parsing to check help output contains diskType
- Remove verify_fallback_disk_selection (was documentation-only, not executable)
- Add assertion to verify_cross_rack_distribution for minimum 2 racks
- Consolidate uploadTestDataWithDiskType to accept collection parameter
- Remove duplicate uploadTestDataWithDiskTypeMixed function
* test: extract captureCommandOutput helper and fix error handling
- Add captureCommandOutput helper to reduce code duplication in diskType tests
- Create commandRunner interface to match shell command Do method
- Update ec_encode_with_ssd_disktype, ec_balance_with_ssd_disktype,
ec_encode_with_source_disktype, ec_decode_with_disktype to use helper
- Fix filepath.Glob error handling in countShardsPerRack instead of ignoring it
* test: add flag validation to ec_balance_targets_correct_disk_type
Add assertNoFlagError calls after ec.balance commands to ensure
-diskType flag is properly recognized for both SSD and HDD disk types.
* test: add proper assertions for EC command results
- ec_encode_with_ssd_disktype: check for expected volume-related errors
- ec_balance_with_ssd_disktype: require success with require.NoError
- ec_encode_with_source_disktype: check for expected no-volume errors
- ec_decode_with_disktype: check for expected no-ec-volume errors
- upload_to_ssd_and_hdd: use require.NoError for setup validation
Tests now properly fail on unexpected errors rather than just logging.
* test: fix missing unlock in ec_encode_with_disk_awareness
Add defer unlock pattern to ensure lock is always released, matching
the pattern used in other subtests.
* test: improve helper robustness
- Make assertNoFlagError case-insensitive for pattern matching
- Use defer in captureCommandOutput to restore stdout/stderr and close
pipe ends to avoid FD leaks even if cmd.Do panics
|
|
|
|
* fix: filer do not support IP whitelist right now #7094
* Apply suggestion from @gemini-code-assist[bot]
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
---------
Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
|
Change all concurrentUploadLimitMB and concurrentDownloadLimitMB defaults
from fixed values (64, 128, 256 MB) to 0 (unlimited).
This removes artificial throttling that can limit throughput on high-performance
systems, especially on all-flash setups with many cores.
Files changed:
- volume.go: concurrentUploadLimitMB 256->0, concurrentDownloadLimitMB 256->0
- server.go: filer/volume/s3 concurrent limits 64/128->0
- s3.go: concurrentUploadLimitMB 128->0
- filer.go: concurrentUploadLimitMB 128->0, s3.concurrentUploadLimitMB 128->0
Users can still set explicit limits if needed for resource management.
|
|
fix: weed shell can't connect to master when no volume servers (#7701)
When there are no volume servers registered, the master's KeepConnected
handler would not send any initial message to clients. This caused the
shell's masterClient to block indefinitely on stream.Recv(), preventing
it from setting currentMaster and completing the connection handshake.
The fix ensures the master always sends at least one message with leader
information to newly connected clients, even when ToVolumeLocations()
returns an empty slice.
|
|
|
|
Alpine's busybox wget does not support --ca-cert, --certificate, and
--private-key options required for HTTPS healthchecks with client
certificate authentication.
Adding curl to Docker images enables proper HTTPS healthchecks.
Fixes #7707
|
|
|
|
mount: add periodic metadata flush to protect chunks from orphan cleanup
When a file is opened via FUSE mount and written for a long time without
being closed, chunks are uploaded to volume servers but the file metadata
(containing chunk references) is only saved to the filer on file close.
If volume.fsck runs during this window, it may identify these chunks as
orphans (not referenced in filer metadata) and purge them, causing data loss.
This commit adds a background task that periodically flushes file metadata
for open files to the filer, ensuring chunk references are visible to
volume.fsck even before files are closed.
New option:
-metadataFlushSeconds (default: 120)
Interval in seconds for flushing dirty file metadata to filer.
Set to 0 to disable.
Fixes: https://github.com/seaweedfs/seaweedfs/issues/7649
|
|
* fix: add pagination to list-object-versions for buckets with >1000 objects
The findVersionsRecursively() function used a fixed limit of 1000 entries
without pagination. This caused objects beyond the first 1000 entries
(sorted alphabetically) to never appear in list-object-versions responses.
Changes:
- Add pagination loop using filer.PaginationSize (1024)
- Use isLast flag from s3a.list() to detect end of pagination
- Track startFrom marker for each page
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix: prevent infinite loop in ListObjects when processing .versions directories
The doListFilerEntries() function processes .versions directories in a
secondary loop after the main entry loop, but failed to update nextMarker.
This caused infinite pagination loops when results were truncated, as the
same .versions directories would be reprocessed on each page.
Bug introduced by: c196d03951a75d3b8976f556cb0400e5b522edeb
("fix listing object versions (#7006)")
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
|
|
This addresses issue #7699 where FoundationDB filer store had low throughput
(~400-500 obj/s) due to each write operation creating a separate transaction.
Changes:
- Add writeBatcher that collects multiple writes into batched transactions
- New config options: batch_size (default: 100), batch_interval (default: 5ms)
- Batching provides ~5.7x throughput improvement (from ~456 to ~2600 obj/s)
Benchmark results with different batch sizes:
- batch_size=1: ~456 obj/s (baseline, no batching)
- batch_size=10: ~2621 obj/s (5.7x improvement)
- batch_size=16: ~2514 obj/s (5.5x improvement)
- batch_size=100: ~2617 obj/s (5.7x improvement)
- batch_size=1000: ~2593 obj/s (5.7x improvement)
The batch_interval timer (5ms) ensures writes are flushed promptly even
when batch is not full, providing good latency characteristics.
Addressed review feedback:
- Changed wait=false to wait=true in UpdateEntry/DeleteEntry to properly
propagate errors to callers
- Fixed timer reset race condition by stopping and draining before reset
Fixes #7699
|
|
The condition was inverted - it was caching lookups with errors
instead of successful lookups. This caused every replicated write
to make a gRPC call to master for volume location lookup, resulting
in ~1 second latency for writeToReplicas.
The bug particularly affected TTL volumes because:
- More unique volumes are created (separate pools per TTL)
- Volumes expire and get recreated frequently
- Each new volume requires a fresh lookup (cache miss)
- Higher volume churn = more cache misses = more master lookups
With this fix, successful lookups are cached for 10 minutes,
reducing replication latency from ~1s to ~10ms for cached volumes.
|
|
batching (#7697)
* mount: add singleflight to deduplicate concurrent EnsureVisited calls
When multiple goroutines access the same uncached directory simultaneously,
they would all make redundant network requests to the filer. This change
uses singleflight.Group to ensure only one goroutine fetches the directory
entries while others wait for the result.
This fixes a race condition where concurrent lookups or readdir operations
on the same uncached directory would:
1. Make duplicate network requests to the filer
2. Insert duplicate entries into LevelDB cache
3. Waste CPU and network bandwidth
* mount: fetch parent directories in parallel during EnsureVisited
Previously, when accessing a deep path like /a/b/c/d, the parent directories
were fetched serially from target to root. This change:
1. Collects all uncached directories from target to root first
2. Fetches them all in parallel using errgroup
3. Relies on singleflight (from previous commit) for deduplication
This reduces latency when accessing deep uncached paths, especially in
high-latency network environments where parallel requests can significantly
improve performance.
* mount: add batch inserts for LevelDB meta cache
When populating the meta cache from filer, entries were inserted one-by-one
into LevelDB. This change:
1. Adds BatchInsertEntries method to LevelDBStore that uses LevelDB's
native batch write API
2. Updates MetaCache to keep a direct reference to the LevelDB store
for batch operations
3. Modifies doEnsureVisited to collect entries and insert them in
batches of 100 entries
Batch writes are more efficient because:
- Reduces number of individual write operations
- Reduces disk syncs
- Improves throughput for large directories
* mount: fix potential nil dereference in MarkChildrenCached
Add missing check for inode existence in inode2path map before accessing
the InodeEntry. This prevents a potential nil pointer dereference if the
inode exists in path2inode but not in inode2path (which could happen due
to race conditions or bugs).
This follows the same pattern used in IsChildrenCached which properly
checks for existence before accessing the entry.
* mount: fix batch flush when last entry is hidden
The previous batch insert implementation relied on the isLast flag to flush
remaining entries. However, if the last entry is a hidden system entry
(like 'topics' or 'etc' in root), the callback returns early and the
remaining entries in the batch are never flushed.
Fix by:
1. Only flush when batch reaches threshold inside the callback
2. Flush any remaining entries after ReadDirAllEntries completes
3. Use error wrapping instead of logging+returning to avoid duplicate logs
4. Create new slice after flush to allow GC of flushed entries
5. Add documentation for batchInsertSize constant
This ensures all entries are properly inserted regardless of whether
the last entry is hidden, and prevents memory retention issues.
* mount: add context support for cancellation in EnsureVisited
Thread context.Context through the batch insert call chain to enable
proper cancellation and timeout support:
1. Use errgroup.WithContext() so if one fetch fails, others are cancelled
2. Add context parameter to BatchInsertEntries for consistency with InsertEntry
3. Pass context to ReadDirAllEntries for cancellation during network calls
4. Check context cancellation before starting work in doEnsureVisited
5. Use %w for error wrapping to preserve error types for inspection
This prevents unnecessary work when one directory fetch fails and makes
the batch operations consistent with the existing context-aware APIs.
|
|
mount: remove unused isEarlyTerminated variable
The variable was redundant because when processEachEntryFn returns false,
we immediately return fuse.OK, so the check was always false.
|
|
* fix nfs list with prefix batch scan
* remove else branch
|
|
* fix: prevent filer.backup stall in single-filer setups (#4977)
When MetaAggregator.MetaLogBuffer is empty (which happens in single-filer
setups with no peers), ReadFromBuffer was returning nil error, causing
LoopProcessLogData to enter an infinite wait loop on ListenersCond.
This fix returns ResumeFromDiskError instead, allowing SubscribeMetadata
to loop back and read from persisted logs on disk. This ensures filer.backup
continues processing events even when the in-memory aggregator buffer is empty.
Fixes #4977
* test: add integration tests for metadata subscription
Add integration tests for metadata subscription functionality:
- TestMetadataSubscribeBasic: Tests basic subscription and event receiving
- TestMetadataSubscribeSingleFilerNoStall: Regression test for #4977,
verifies subscription doesn't stall under high load in single-filer setups
- TestMetadataSubscribeResumeFromDisk: Tests resuming subscription from disk
Related to #4977
* ci: add GitHub Actions workflow for metadata subscribe tests
Add CI workflow that runs on:
- Push/PR to master affecting filer, log_buffer, or metadata subscribe code
- Runs the integration tests for metadata subscription
- Uploads logs on failure for debugging
Related to #4977
* fix: use multipart form-data for file uploads in integration tests
The filer expects multipart/form-data for file uploads, not raw POST body.
This fixes the 'Content-Type isn't multipart/form-data' error.
* test: use -peers=none for faster master startup
* test: add -peers=none to remaining master startup in ec tests
* fix: use filer HTTP port 8888, WithFilerClient adds 10000 for gRPC
WithFilerClient calls ToGrpcAddress() which adds 10000 to the port.
Passing 18888 resulted in connecting to 28888. Use 8888 instead.
* test: add concurrent writes and million updates tests
- TestMetadataSubscribeConcurrentWrites: 50 goroutines writing 20 files each
- TestMetadataSubscribeMillionUpdates: 1 million metadata entries via gRPC
(metadata only, no actual file content for speed)
* fix: address PR review comments
- Handle os.MkdirAll errors explicitly instead of ignoring
- Handle log file creation errors with proper error messages
- Replace silent event dropping with 100ms timeout and warning log
* Update metadata_subscribe_integration_test.go
|
|
fix: skip log files with deleted volumes in filer backup (#3720)
When filer.backup or filer.meta.backup resumes after being stopped, it may
encounter persisted log files stored on volumes that have since been deleted
(via volume.deleteEmpty -force). Previously, this caused the backup to get
stuck in an infinite retry loop with 'volume X not found' errors.
This fix catches 'volume not found' errors when reading log files and skips
the problematic file instead of failing. The backup will now:
- Log a warning about the missing volume
- Skip the problematic log file
- Continue with the next log file, allowing progress
The VolumeNotFoundPattern regex was already defined but never used - this
change puts it to use.
Fixes #3720
|
|
* add admin and worker to helm charts
* workers are stateless, admin is stateful
* removed the duplicate admin-deployment.yaml
* address comments
* address comments
* purge
* Update README.md
* Update k8s/charts/seaweedfs/templates/admin/admin-ingress.yaml
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* address comments
* address comments
* supports Kubernetes versions from v1.14 to v1.30+, ensuring broad compatibility
* add probe for workers
* address comments
* add a todo
* chore: trigger CI
* use port name for probes in admin statefulset
* add secrets to admin helm chart
* fix error .Values.admin.secret.existingSecret
* helm: fix admin secret template paths and remove duplicate
- Fix value paths to use .Values.admin.secret.existingSecret instead of .Values.existingSecret
- Use templated secret name {{ template "seaweedfs.name" . }}-admin-secret
- Add .Values.admin.enabled check to admin-secret.yaml
- Remove duplicate admin-secret.yaml from templates/ root
* helm: address PR review feedback
- Only pass adminUser/adminPassword args when auth is enabled (fixes regression)
- Use $adminSecretName variable to reduce duplication (DRY)
- Only create admin-secret when adminPassword is set
- Add documentation comments for existingSecret, userKey, pwKey fields
- Clarify that empty adminPassword disables authentication
* helm: quote admin credentials to handle spaces
* helm: fix yaml lint errors (comment spacing, trailing blank line)
* helm: add validation for existingSecret requiring userKey and pwKey
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Ubuntu <morez.martin@gmail.com>
|
|
* add admin and worker to helm charts
* workers are stateless, admin is stateful
* removed the duplicate admin-deployment.yaml
* address comments
* address comments
* purge
* Update README.md
* Update k8s/charts/seaweedfs/templates/admin/admin-ingress.yaml
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* address comments
* address comments
* supports Kubernetes versions from v1.14 to v1.30+, ensuring broad compatibility
* add probe for workers
* address comments
* add a todo
* chore: trigger CI
* use port name for probes in admin statefulset
* fix: remove trailing blank line in values.yaml
* address code review feedback
- Quote admin credentials in shell command to handle special characters
- Remove unimplemented capabilities (remote, replication) from worker defaults
- Add security note about admin password character restrictions
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
|
|
* fix: return error on size mismatch in ReadNeedleMeta for consistency
When ReadNeedleMeta encounters a size mismatch at offset >= MaxPossibleVolumeSize,
it previously just continued without returning an error, potentially using wrong data.
This fix makes ReadNeedleMeta consistent with ReadBytes (needle_read.go), which
properly returns an error in both cases:
- ErrorSizeMismatch when offset < MaxPossibleVolumeSize (to trigger retry at offset+32GB)
- A descriptive error when offset >= MaxPossibleVolumeSize (after retry failed)
Fixes #7673
* refactor: use more accurate error message for size mismatch
|
|
* fix: prevent empty .vif files from ec.decode causing parse errors
When ec.decode copies .vif files from EC shard nodes, if a source node
doesn't have the .vif file, an empty .vif file was created on the target
node. This caused volume.configure.replication to fail with 'proto: syntax
error' when trying to parse the empty file.
This fix:
1. In writeToFile: Remove empty files when no data was written (source
file was not found) to avoid leaving corrupted empty files
2. In MaybeLoadVolumeInfo: Handle empty .vif files gracefully by treating
them as non-existent, allowing the system to create a proper one
Fixes #7666
* refactor: remove redundant dst.Close() and add error logging
Address review feedback:
- Remove redundant dst.Close() call since defer already handles it
- Add error logging for os.Remove() failure
|
|
* mount: fix weed inode nlookup do not equel kernel inode nlookup
* mount: add underflow protection for nlookup decrement in Forget
* mount: use consistent == 0 check for uint64 nlookup
* Update weed/mount/inode_to_path.go
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
* mount: snapshot data before unlock in Forget to avoid using deleted InodeEntry
---------
Co-authored-by: chrislu <chris.lu@gmail.com>
Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
|
* s3api: remove redundant auth verification in getRequestDataReader
The handlers PutObjectHandler and PutObjectPartHandler are already wrapped
with s3a.iam.Auth() middleware which performs signature verification via
authRequest() before the handler is invoked.
The signature verification for authTypeSignedV2, authTypePresignedV2,
authTypePresigned, and authTypeSigned in getRequestDataReader was therefore
redundant.
The newChunkedReader() call for streaming auth types is kept as it's needed
to parse the chunked transfer encoding and extract the actual data.
Fixes #7683
* simplify switch to if statement for single condition
|
|
* s3: add s3:ExistingObjectTag condition support in policy engine
Add support for s3:ExistingObjectTag/<tag-key> condition keys in bucket
policies, allowing access control based on object tags.
Changes:
- Add ObjectEntry field to PolicyEvaluationArgs (entry.Extended metadata)
- Update EvaluateConditions to handle s3:ExistingObjectTag/<key> format
- Extract tag value from entry metadata using X-Amz-Tagging-<key> prefix
This enables policies like:
{
"Condition": {
"StringEquals": {
"s3:ExistingObjectTag/status": ["public"]
}
}
}
Fixes: https://github.com/seaweedfs/seaweedfs/issues/7447
* s3: update EvaluatePolicy to accept object entry for tag conditions
Update BucketPolicyEngine.EvaluatePolicy to accept objectEntry parameter
(entry.Extended metadata) for evaluating tag-based policy conditions.
Changes:
- Add objectEntry parameter to EvaluatePolicy method
- Update callers in auth_credentials.go and s3api_bucket_handlers.go
- Pass nil for objectEntry in auth layer (entry fetched later in handlers)
For tag-based conditions to work, handlers should call EvaluatePolicy
with the object's entry.Extended after fetching the entry from filer.
* s3: add tests for s3:ExistingObjectTag policy conditions
Add comprehensive tests for object tag-based policy conditions:
- TestExistingObjectTagCondition: Basic tag matching scenarios
- Matching/non-matching tag values
- Missing tags, no tags, empty tags
- Multiple tags with one matching
- TestExistingObjectTagConditionMultipleTags: Multiple tag conditions
- Both tags match
- Only one tag matches
- TestExistingObjectTagDenyPolicy: Deny policies with tag conditions
- Default allow without tag
- Deny when specific tag present
* s3: document s3:ExistingObjectTag support and feature status
Update policy engine documentation:
- Add s3:ExistingObjectTag/<tag-key> to supported condition keys
- Add 'Object Tag-Based Access Control' section with examples
- Add 'Feature Status' section with implemented and planned features
Planned features for future implementation:
- s3:RequestObjectTag/<key>
- s3:RequestObjectTagKeys
- s3:x-amz-server-side-encryption
- Cross-account access
* Implement tag-based policy re-check in handlers
- Add checkPolicyWithEntry helper to S3ApiServer for handlers to re-check
policy after fetching object entry (for s3:ExistingObjectTag conditions)
- Add HasPolicyForBucket method to policy engine for efficient check
- Integrate policy re-check in GetObjectHandler after entry is fetched
- Integrate policy re-check in HeadObjectHandler after entry is fetched
- Update auth_credentials.go comments to explain two-phase evaluation
- Update documentation with supported operations for tag-based conditions
This implements 'Approach 1' where handlers re-check the policy with
the object entry after fetching it, allowing tag-based conditions to
be properly evaluated.
* Add integration tests for s3:ExistingObjectTag conditions
- Add TestCheckPolicyWithEntry: tests checkPolicyWithEntry helper with various
tag scenarios (matching tags, non-matching tags, empty entry, nil entry)
- Add TestCheckPolicyWithEntryNoPolicyForBucket: tests early return when no policy
- Add TestCheckPolicyWithEntryNilPolicyEngine: tests nil engine handling
- Add TestCheckPolicyWithEntryDenyPolicy: tests deny policies with tag conditions
- Add TestHasPolicyForBucket: tests HasPolicyForBucket method
These tests cover the Phase 2 policy evaluation with object entry metadata,
ensuring tag-based conditions are properly evaluated.
* Address code review nitpicks
- Remove unused extractObjectTags placeholder function (engine.go)
- Add clarifying comment about s3:ExistingObjectTag/<key> evaluation
- Consolidate duplicate tag-based examples in README
- Factor out tagsToEntry helper to package level in tests
* Address code review feedback
- Fix unsafe type assertions in GetObjectHandler and HeadObjectHandler
when getting identity from context (properly handle type assertion failure)
- Extract getConditionContextValue helper to eliminate duplicated logic
between EvaluateConditions and EvaluateConditionsLegacy
- Ensure consistent handling of missing condition keys (always return
empty slice)
* Fix GetObjectHandler to match HeadObjectHandler pattern
Add safety check for nil objectEntryForSSE before tag-based policy
evaluation, ensuring tag-based conditions are always evaluated rather
than silently skipped if entry is unexpectedly nil.
Addresses review comment from Copilot.
* Fix HeadObject action name in docs for consistency
Change 'HeadObject' to 's3:HeadObject' to match other action names.
* Extract recheckPolicyWithObjectEntry helper to reduce duplication
Move the repeated identity extraction and policy re-check logic from
GetObjectHandler and HeadObjectHandler into a shared helper method.
* Add validation for empty tag key in s3:ExistingObjectTag condition
Prevent potential issues with malformed policies containing
s3:ExistingObjectTag/ (empty tag key after slash).
|
|
Fixes #7467
The -mserver argument line in volume-statefulset.yaml was missing a
trailing backslash, which prevented extraArgs from being passed to
the weed volume process.
Also:
- Extracted master server list generation logic into shared helper
templates in _helpers.tpl for better maintainability
- Updated all occurrences of deprecated -mserver flag to -master
across docker-compose files, test files, and documentation
|
|
* fix: prevent makeslice panic in ReadNeedleMeta with corrupted needle
When a needle's DataSize in the .dat file is corrupted to a very large
value, the calculation of metaSize can become negative, causing a panic
with 'makeslice: len out of range' when creating the metadata slice.
This fix adds validation to check if metaSize is negative before
creating the slice, returning a descriptive error instead of panicking.
Fixes #7475
* Update weed/storage/needle/needle_read_page.go
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
---------
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
|
* mount: add mutex to DirectoryHandle to fix race condition
When using Ganesha NFS on top of FUSE mount, ls operations would hang
forever on directories with hundreds of files. This was caused by a
race condition in DirectoryHandle where multiple concurrent readdir
operations could modify shared state (entryStream, entryStreamOffset,
isFinished) without synchronization.
The fix adds a mutex to DirectoryHandle and holds it for the entire
duration of doReadDirectory. This serializes concurrent readdir calls
on the same handle, which is the correct behavior for a directory
handle and fixes the race condition.
Key changes:
- Added sync.Mutex to DirectoryHandle struct
- Lock the mutex at the start of doReadDirectory
- This ensures thread-safe access to entryStream and other state
The lock is per-handle (not global), so different directories can
still be listed concurrently. Only concurrent operations on the
same directory handle are serialized.
Fixes: https://github.com/seaweedfs/seaweedfs/issues/7672
* mount: add mutex to DirectoryHandle to fix race condition
When using Ganesha NFS on top of FUSE mount, ls operations would hang
forever on directories with hundreds of files. This was caused by a
race condition in DirectoryHandle where multiple concurrent readdir
operations could modify shared state (entryStream, entryStreamOffset,
isFinished) without synchronization.
The fix adds a mutex to DirectoryHandle and holds it for the entire
duration of doReadDirectory. This serializes concurrent readdir calls
on the same handle, which is the correct behavior for a directory
handle and fixes the race condition.
Key changes:
- Added sync.Mutex to DirectoryHandle struct
- Lock the mutex at the start of doReadDirectory
- Optimized reset() to reuse slice capacity and allow GC of old entries
The lock is per-handle (not global), so different directories can
still be listed concurrently. Only concurrent operations on the
same directory handle are serialized.
Fixes: https://github.com/seaweedfs/seaweedfs/issues/7672
|
|
* sts: limit session duration to incoming token's exp claim
This fixes the issue where AssumeRoleWithWebIdentity would issue sessions
that outlive the source identity token's expiration.
For use cases like GitLab CI Jobs where the ID Token has an exp claim
limited to the CI job's timeout, the STS session should not exceed that
expiration.
Changes:
- Add TokenExpiration field to ExternalIdentity struct
- Extract exp/iat/nbf claims in OIDC provider's ValidateToken
- Pass token expiration from Authenticate to ExternalIdentity
- Modify calculateSessionDuration to cap at source token's exp
- Add comprehensive tests for the new behavior
Fixes: https://github.com/seaweedfs/seaweedfs/discussions/7653
* refactor: reduce duplication in time claim extraction
Use a loop over claim names instead of repeating the same
extraction logic three times for exp, iat, and nbf claims.
* address review: add defense-in-depth for expired tokens
- Handle already-expired tokens defensively with 1 minute minimum duration
- Enforce MaxSessionLength from config as additional cap
- Fix potential nil dereference in test mock
- Add test case for expired token scenario
* remove issue reference from test
* fix: remove early return to ensure MaxSessionLength is always checked
|