Age  Commit message  Author  Files, lines (-deleted/+added)
2025-12-04  Merge branch 'master' into feature/tus-protocol (origin/feature/tus-protocol)  Chris Lu  78 files, -468/+6684
2025-12-04  fmt  chrislu  6 files, -0/+6
2025-12-03  filer: async empty folder cleanup via metadata events (#7614)  Chris Lu  9 files, -52/+1685

* filer: async empty folder cleanup via metadata events

Implements asynchronous empty folder cleanup when files are deleted in S3.

Key changes:
1. EmptyFolderCleaner - new component that handles folder cleanup:
   - Uses consistent hashing (LockRing) to determine folder ownership
   - Each filer owns specific folders, avoiding duplicate cleanup work
   - Debounces delete events (10s delay) to batch multiple deletes
   - Caches rough folder counts to skip unnecessary checks
   - Cancels pending cleanup when new files are created
   - Handles both file and subdirectory deletions
2. Integration with metadata events:
   - Listens to both local and remote filer metadata events
   - Processes create/delete/rename events to track folder state
   - Only processes folders under /buckets/<bucket>/...
3. Removed synchronous empty folder cleanup from S3 handlers:
   - DeleteObjectHandler no longer calls DoDeleteEmptyParentDirectories
   - DeleteMultipleObjectsHandler no longer tracks/cleans directories
   - Cleanup now happens asynchronously via metadata events

Benefits:
- Non-blocking: S3 delete requests return immediately
- Coordinated: only one filer (the owner) cleans each folder
- Efficient: batching and caching reduce unnecessary checks
- Event-driven: folder deletion triggers the parent folder check automatically

* filer: add CleanupQueue data structure for deduplicated folder cleanup

CleanupQueue uses a linked list for FIFO ordering and a hashmap for O(1) deduplication.

Processing is triggered when:
- the queue size reaches maxSize (default 1000), OR
- the oldest item exceeds maxAge (default 10 minutes)

Key features:
- O(1) Add, Remove, Pop, Contains operations
- Duplicate folders are ignored (keeps the original position/time)
- Testable with an injectable time function
- Thread-safe with mutex protection

* filer: use CleanupQueue for empty folder cleanup

Replace the timer-per-folder approach with queue-based processing:
- Use CleanupQueue for deduplication and ordered processing
- Process the queue when full (1000 items) or when the oldest item exceeds 10 minutes
- A background processor checks the queue every 10 seconds
- Remove from the queue on create events to cancel pending cleanup

Benefits:
- Bounded memory: the queue has a max size, not unlimited timers
- Efficient: O(1) add/remove/contains operations
- Batch processing: handle many folders efficiently
- Better for high-volume delete scenarios

* filer: CleanupQueue.Add moves duplicate to back with updated time

When adding a folder that already exists in the queue:
- Remove it from its current position
- Add it to the back of the queue
- Update the queue time to the current time

This ensures that folders with recent delete activity are processed later, giving more time for additional deletes to occur.

* filer: CleanupQueue uses event time and inserts in sorted order

Changes:
- Add() now takes an eventTime parameter instead of using the current time
- Insert items in time-sorted order (oldest at front) to handle out-of-order events
- When updating a duplicate with a newer time, reposition it to maintain sort order
- Ignore updates with an older time (keep the existing later time)

This ensures proper ordering when processing events from distributed filers, where event arrival order may not match event occurrence order.

* filer: remove unused CleanupQueue functions (SetNowFunc, GetAll)

Removed test-only functions:
- SetNowFunc: tests now use real time with past event times
- GetAll: tests now use Pop() to verify order

Kept functions used in production:
- Peek: used in filer_notify_read.go
- OldestAge: used in empty_folder_cleaner.go logging

* filer: initialize cache entry on first delete/create event

Previously, roughCount was only updated if the cache entry already existed, but entries were only created during executeCleanup. This meant delete/create events before the first cleanup didn't track the count. Now the cache entry is created on the first event, so roughCount properly tracks all changes from the start.

* filer: skip adding to cleanup queue if roughCount > 0

If the cached roughCount indicates there are still items in the folder, don't bother adding it to the cleanup queue. This avoids unnecessary queue entries and reduces wasted cleanup checks.

* filer: don't create cache entry on create event

Only update roughCount if the folder is already being tracked. New folders don't need tracking until we see a delete event.

* filer: move empty folder cleanup to its own package

- Created the weed/filer/empty_folder_cleanup package
- Defined a FilerOperations interface to break a circular dependency
- Added a CountDirectoryEntries method to Filer
- Exported the IsUnderPath and IsUnderBucketPath helper functions

* filer: make isUnderPath and isUnderBucketPath private

These helpers are only used within the empty_folder_cleanup package.
2025-12-03  [helm] Fix liveness/readiness probe scheme path in templates (#7616)  Chris Lu  8 files, -13/+87

Fix the templates to read the scheme from httpGet.scheme instead of the probe level, matching the structure defined in values.yaml. This ensures that changing *.livenessProbe.httpGet.scheme or *.readinessProbe.httpGet.scheme in values.yaml now correctly affects the rendered manifests.

Affected components: master, filer, volume, s3, all-in-one

Fixes #7615
2025-12-03  fix: SFTP HomeDir path translation for user operations (#7611)  Chris Lu  14 files, -32/+1607

* fix: SFTP HomeDir path translation for user operations

When users have a non-root HomeDir (e.g., '/sftp/user'), their SFTP operations should be relative to that directory. Previously, when a user uploaded to '/' via SFTP, the path was not translated to their home directory, causing 'permission denied for / for permission write'.

This fix adds a toAbsolutePath() method that implements chroot-like behavior where the user's HomeDir becomes their root. All file and directory operations now translate paths through this method.

Example: a user with HomeDir='/sftp/user' uploading to '/' now correctly maps to '/sftp/user'.

Fixes: https://github.com/seaweedfs/seaweedfs/issues/7470

* test: add SFTP integration tests

Add comprehensive integration tests for the SFTP server, including:
- HomeDir path translation tests (verifies the fix for issue #7470)
- Basic file upload/download operations
- Directory operations (mkdir, rmdir, list)
- Large file handling (1MB test)
- File rename operations
- Stat/Lstat operations
- Path edge cases (trailing slashes, .., unicode filenames)
- Admin root access verification

The test framework starts a complete SeaweedFS cluster with a master server, a volume server, a filer server, and an SFTP server with test user credentials.

Test users are configured in testdata/userstore.json:
- admin: HomeDir=/ with full access
- testuser: HomeDir=/sftp/testuser with access to home
- readonly: HomeDir=/public with read-only access

* fix: correct SFTP HomeDir path translation and add CI

Fix a path.Join issue where paths starting with '/' weren't joined correctly. path.Join('/sftp/user', '/file') returns '/file' instead of '/sftp/user/file'. Now we strip the leading '/' before joining.

Test improvements:
- Update go.mod to Go 1.24
- Fix weed binary discovery to prefer the local build over PATH
- Add a stabilization delay after service startup
- All 8 SFTP integration tests pass locally

Add a GitHub Actions workflow for SFTP tests:
- Runs on pushes/PRs affecting sftpd code or tests
- Tests HomeDir path translation, file ops, directory ops
- Covers verification of the issue #7470 fix

* security: update golang.org/x/crypto to v0.45.0

Addresses a security vulnerability in golang.org/x/crypto < 0.45.0.

* security: use proper SSH host key verification in tests

Replace ssh.InsecureIgnoreHostKey() with ssh.FixedHostKey(), which verifies that the server's host key matches the known test key we generated. This addresses CodeQL warning go/insecure-hostkeycallback. Also updates go.mod to specify go 1.24.0 explicitly.

* security: fix path traversal vulnerability in SFTP toAbsolutePath

The previous implementation had a critical security vulnerability:
- Path traversal via '../..' could escape the HomeDir chroot jail
- Absolute paths were not correctly prefixed with HomeDir

The fix:
1. Concatenate HomeDir with userPath directly, then clean
2. Add a security check to ensure the final path stays within HomeDir
3. If traversal is detected, safely return HomeDir instead

Also adds path traversal prevention tests to verify the fix.

* fix: address PR review comments

1. Fix the SkipCleanup check to use the actual test config instead of the default:
   - Added a skipCleanup field to the SftpTestFramework struct
   - Store config.SkipCleanup during Setup()
   - Use f.skipCleanup in Cleanup() instead of DefaultTestConfig()
2. Fix a path prefix check false positive in mkdir:
   - Changed from strings.HasPrefix(absPath, fs.user.HomeDir)
   - To: absPath == fs.user.HomeDir || strings.HasPrefix(absPath, fs.user.HomeDir+"/")
   - Prevents matching partial directory names (e.g., /sftp/username when HomeDir is /sftp/user)

* fix: check write permission on parent dir for mkdir

Aligns makeDir's permission check with newFileWriter for consistency. To create a directory, a user needs write permission on the parent directory, not mkdir permission on the new directory path.

* fix: refine SFTP path traversal logic and tests

1. Refine toAbsolutePath:
   - Use path.Join with strings.TrimPrefix for idiomatic path construction
   - Return an explicit error on a path traversal attempt instead of clamping
   - Updated all call sites to handle the error
2. Add unit tests:
   - Added sftp_server_test.go to verify the toAbsolutePath logic
   - Covers normal paths, the root path, and various traversal attempts
3. Update integration tests:
   - Updated the PathTraversalPrevention test to reflect that standard SFTP clients sanitize paths before sending. The test now verifies successful containment within the jail rather than blocking (since the server receives a clean path). The server-side blocking is verified by the new unit tests.
4. Makefile:
   - Removed -v from the default test target

* fix: address PR comments on tests and makefile

1. Enhanced unit tests: added edge cases (empty path, multiple slashes, trailing slash) to sftp_server_test.go
2. Makefile improvements: added an 'all' target as the default entry point
3. Code clarity: added a comment to the mkdir permission check explaining the defensive nature of the HomeDir check

* fix: address PR review comments on permissions and tests

1. Security: added a write permission check on the target directory in renameEntry
2. Logging: changed dispatch log verbosity from V(0) to V(1)
3. Testing: updated the Makefile .PHONY targets; added unit test cases for empty/root HomeDir behavior in toAbsolutePath

* fix: set SFTP starting directory to virtual root

1. Critical fix:
   - Changed sftp.WithStartDirectory from fs.user.HomeDir to '/'
   - Prevents double-prefixing when toAbsolutePath translates paths
   - Users now correctly start at their virtual root, which maps to HomeDir
2. Test improvements:
   - Use a pointer for homeDir in tests for a clearer nil vs. empty distinction

* fix: clean HomeDir at config load time

Clean the HomeDir path when loading users from the JSON config. This handles trailing slashes and other path anomalies at the source, ensuring consistency throughout the codebase and avoiding repeated cleaning on every toAbsolutePath call.

* test: strengthen assertions and add error checking in SFTP tests

1. Add error checking for cleanup operations in TestWalk
2. Strengthen the cwd assertion to expect '/' explicitly in TestCurrentWorkingDirectory
3. Add error checking for cleanup in the PathTraversalPrevention test
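The chroot-style translation and traversal check described above can be sketched as follows. This is an illustrative reconstruction of the approach the commits describe (TrimPrefix before path.Join, then a boundary-aware containment check that returns an explicit error), not the exact SeaweedFS code.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// toAbsolutePath maps a user-supplied SFTP path into the user's HomeDir
// jail. An attempt to escape via ".." yields an explicit error rather
// than being clamped to HomeDir.
func toAbsolutePath(homeDir, userPath string) (string, error) {
	// Normalize HomeDir once; handles trailing slashes and empty values.
	home := path.Clean("/" + strings.Trim(homeDir, "/"))
	if home == "/" {
		// Root home: cleaning an absolute path can never escape "/".
		return path.Join("/", strings.TrimPrefix(userPath, "/")), nil
	}
	// Strip the leading "/" so the user path is joined under home, then
	// let path.Join clean away "." and "..".
	abs := path.Join(home, strings.TrimPrefix(userPath, "/"))
	// Containment check with a "/" boundary, so "/sftp/username" does
	// not pass when home is "/sftp/user".
	if abs != home && !strings.HasPrefix(abs, home+"/") {
		return "", fmt.Errorf("path traversal attempt: %q", userPath)
	}
	return abs, nil
}

func main() {
	p, _ := toAbsolutePath("/sftp/user", "/")
	fmt.Println(p) // /sftp/user
	p, _ = toAbsolutePath("/sftp/user", "/docs/a.txt")
	fmt.Println(p) // /sftp/user/docs/a.txt
	_, err := toAbsolutePath("/sftp/user", "/../../etc/passwd")
	fmt.Println(err != nil) // true
}
```

With this in place, the session can safely start at the virtual root "/" (as the last commit above does), since every path is re-rooted under HomeDir before it reaches the filer.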
2025-12-03  Fix handling of fixed read-only volumes for `volume.check.disk`. (#7612)  Lisandro Pin  1 file, -19/+36

There's unfortunately no way to tell whether a volume is flagged read-only because it got full or because it is faulty. To address this, the check logic is modified so that all read-only volumes are processed; if no changes are written (i.e. the volume is healthy), it is kept read-only. Volumes which are modified in this process are deemed fixed and switched to writable.
2025-12-03  fix: update getVersioningState to signal non-existent buckets with Er… (#7613)  Xiao Wei  1 file, -1/+3

* fix: update getVersioningState to signal non-existent buckets with ErrNotFound

This change modifies the getVersioningState function to return filer_pb.ErrNotFound when a requested bucket does not exist, allowing callers to handle the situation appropriately, such as auto-creating the bucket in PUT handlers. This improves error handling and clarity in the API's behavior regarding bucket existence.

* Update weed/s3api/s3api_bucket_config.go

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

---------

Co-authored-by: 洪晓威 <xiaoweihong@deepglint.com>
Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-12-02  fix: volume server healthz now checks local conditions only (#7610)  Chris Lu  2 files, -13/+22

This fixes issue #6823, where a single volume server shutdown would cause other healthy volume servers to fail their health checks and get restarted by Kubernetes, causing a cascading failure.

Previously, the healthz handler checked whether all replicated volumes could reach their remote replicas via GetWritableRemoteReplications(). When a volume server went down, the master would remove it from the volume location list. Other volume servers would then fail their healthz checks because they couldn't find all required replicas, causing Kubernetes to restart them.

The healthz endpoint now only checks local conditions:
1. Is the server shutting down?
2. Is the server heartbeating with the master?

This follows the principle that a health check should only verify the health of THIS server, not the overall cluster state.

Fixes #6823
2025-12-02  Support separate volume server ID independent of RPC bind address (#7609)  Chris Lu  14 files, -28/+240

* pb: add id field to Heartbeat message for stable volume server identification

This adds an 'id' field to the Heartbeat protobuf message that allows volume servers to identify themselves independently of their IP:port address.

Ref: https://github.com/seaweedfs/seaweedfs/issues/7487

* storage: add Id field to Store struct

Add an Id field to the Store struct and include it in CollectHeartbeat(). The Id field provides a stable volume server identity independent of IP:port.

* topology: support id-based DataNode identification

Update GetOrCreateDataNode to accept an id parameter for stable node identification. When an id is provided, the DataNode can maintain its identity even when its IP address changes (e.g., in Kubernetes pod reschedules).

For backward compatibility:
- If id is provided, use it as the node ID
- If id is empty, fall back to ip:port

* volume: add -id flag for stable volume server identity

Add an -id command line flag to the volume server that allows specifying a stable identifier independent of the IP address. This is useful for Kubernetes deployments with hostPath volumes, where pods can be rescheduled to different nodes while the persisted data remains on the original node.

Usage: weed volume -id=node-1 -ip=10.0.0.1 ...

If -id is not specified, it defaults to ip:port for backward compatibility.

Fixes https://github.com/seaweedfs/seaweedfs/issues/7487

* server: add -volume.id flag to weed server command

Support the -volume.id flag in the all-in-one 'weed server' command, consistent with the standalone 'weed volume' command.

Usage: weed server -volume.id=node-1 ...

* topology: add test for id-based DataNode identification

Test the key scenarios:
1. Create a DataNode with an explicit id
2. The same id with a different IP returns the same DataNode (K8s reschedule)
3. IP/PublicUrl are updated when a node reconnects with a new address
4. A different id creates a new DataNode
5. An empty id falls back to ip:port (backward compatibility)

* pb: add address field to DataNodeInfo for proper node addressing

Previously, DataNodeInfo.Id was used as the node address, which worked when Id was always ip:port. Now that Id can be an explicit string, we need a separate Address field for connection purposes.

Changes:
- Add an 'address' field to the DataNodeInfo protobuf message
- Update ToDataNodeInfo() to populate the address field
- Update NewServerAddressFromDataNode() to use Address (with Id fallback)
- Fix LookupEcVolume to use dn.Url() instead of dn.Id()

* fix: trim whitespace from volume server id and fix test

- Trim whitespace from the -id flag to treat ' ' as empty
- Fix store_load_balancing_test.go to include the id parameter in the NewStore call

* refactor: extract GetVolumeServerId to util package

Move the volume server ID determination logic to a shared utility function to avoid code duplication between volume.go and rack.go.

* fix: improve transition logic for legacy nodes

- Use an exact ip:port match instead of the net.SplitHostPort heuristic
- Update GrpcPort and PublicUrl during the transition for consistency
- Remove the unused net import

* fix: add id normalization and address change logging

- Normalize the id parameter at the function boundary (trim whitespace)
- Log when a DataNode's IP:Port changes (helps debug K8s pod rescheduling)
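The ID fallback rule described above is small enough to sketch directly: use the explicit -id value when set (whitespace-trimmed, so a blank flag counts as empty), otherwise default to ip:port. The function name mirrors the commit's GetVolumeServerId; the exact signature is an assumption.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
	"strings"
)

// getVolumeServerId returns the stable server identity: the trimmed
// -id flag if non-empty, else ip:port for backward compatibility.
func getVolumeServerId(id, ip string, port int) string {
	if trimmed := strings.TrimSpace(id); trimmed != "" {
		return trimmed
	}
	return net.JoinHostPort(ip, strconv.Itoa(port))
}

func main() {
	fmt.Println(getVolumeServerId("node-1", "10.0.0.1", 8080)) // node-1
	fmt.Println(getVolumeServerId("   ", "10.0.0.1", 8080))    // 10.0.0.1:8080
}
```

Because the master keys DataNodes by this value, a pod rescheduled to a new IP but started with the same -id reattaches to its old topology entry, which is exactly the Kubernetes scenario the change targets.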
2025-12-02  fix: skip cookie validation for EC volume deletion when SkipCookieCheck is set (#7608)  Chris Lu  1 file, -11/+6

fix: EC volume deletion issues

Fixes #7489

1. Skip the cookie check for EC volume deletion when SkipCookieCheck is set. When batch deleting files from EC volumes with SkipCookieCheck=true (e.g., orphan file cleanup), the cookie is not available. The deletion was failing with 'unexpected cookie 0' because DeleteEcShardNeedle always validated the cookie.
2. Optimize doDeleteNeedleFromAtLeastOneRemoteEcShards to return early. Return immediately when a deletion succeeds, instead of continuing to try all parity shards unnecessarily.
3. Remove a useless log message that always logged a nil error. The log at V(1) was logging err after checking it was nil.

Regression introduced in commit 7bdae5172 (Jan 3, 2023) when EC batch delete support was added.
2025-12-02  Add disk-aware EC rebalancing (#7597)  Chris Lu  7 files, -73/+1673

* Add placement package for EC shard placement logic

- Consolidate the EC shard placement algorithm for reuse across shell and worker tasks
- Support multi-pass selection: racks, then servers, then disks
- Include proper spread verification and scoring functions
- Comprehensive test coverage for various cluster topologies

* Make ec.balance disk-aware for multi-disk servers

- Add an EcDisk struct to track individual disks on volume servers
- Update EcNode to maintain per-disk shard distribution
- Parse disk_id from EC shard information during topology collection
- Implement pickBestDiskOnNode() for selecting the best disk per shard
- Add diskDistributionScore() for tie-breaking node selection
- Update all move operations to specify the target disk in RPC calls
- Improves shard balance within multi-disk servers, not just across servers

* Use placement package in EC detection for consistent disk-level placement

- Replace the custom EC disk selection logic with the shared placement package
- Convert topology DiskInfo to the placement.DiskCandidate format
- Use SelectDestinations() for multi-rack/server/disk spreading
- Convert placement results back to topology DiskInfo for task creation
- Ensures EC detection uses the same placement logic as shell commands

* Make volume server evacuation disk-aware

- Use pickBestDiskOnNode() when selecting the evacuation target disk
- Specify the target disk in evacuation RPC requests
- Maintains balanced disk distribution during server evacuations

* Rename PlacementConfig to PlacementRequest for clarity

PlacementRequest better reflects that this is a request for placement rather than a configuration object. This improves API semantics.

* Rename DefaultConfig to DefaultPlacementRequest

Aligns with the PlacementRequest type naming for consistency.

* Address review comments from Gemini and CodeRabbit

Fix HIGH issues:
- Fix empty disk discovery: all disks are now discovered from VolumeInfos, not just from EC shards. This ensures disks without EC shards are still considered for placement.
- Fix the EC shard count calculation in detection.go: it now correctly filters by DiskId and sums actual shard counts using ShardBits.ShardIdCount() instead of just counting EcShardInfo entries.

Fix MEDIUM issues:
- Add the disk ID to evacuation log messages for consistency with other logging
- Remove the unused serverToDisks variable in placement.go
- Fix a comment that incorrectly said 'ascending' when the sorting is 'descending'

* add ec tests

* Update ec-integration-tests.yml

* Update ec_integration_test.go

* Fix EC integration tests CI: build weed binary and update actions

- Add a 'Build weed binary' step before running tests
- Update actions/setup-go from v4 to v6 (Node20 compatibility)
- Update actions/checkout from v2 to v4 (Node20 compatibility)
- Move working-directory to the test step only

* Add disk-aware EC rebalancing integration tests

- Add a TestDiskAwareECRebalancing test with a multi-disk cluster setup
- Test EC encode with disk awareness (shows the disk ID in output)
- Test EC balance with disk-level shard distribution
- Add helper functions for disk-level verification:
  - startMultiDiskCluster: 3 servers x 4 disks each
  - countShardsPerDisk: track shards per disk per server
  - calculateDiskShardVariance: measure distribution balance
- Verify no single disk is overloaded with shards
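The per-disk balancing idea above can be sketched in two small functions: place each new EC shard on the disk that currently holds the fewest shards, and measure balance as the variance of per-disk shard counts (the quantity the test helper calculateDiskShardVariance tracks). Function names and the lowest-disk-id tie-break are illustrative assumptions, not the real pickBestDiskOnNode.

```go
package main

import "fmt"

// pickBestDisk returns the index of the disk with the fewest shards;
// ties resolve to the lowest disk id.
func pickBestDisk(shardsPerDisk []int) int {
	best := 0
	for i, n := range shardsPerDisk {
		if n < shardsPerDisk[best] {
			best = i
		}
	}
	return best
}

// shardVariance measures how unevenly shards are spread across disks:
// 0 means perfectly balanced.
func shardVariance(shardsPerDisk []int) float64 {
	mean := 0.0
	for _, n := range shardsPerDisk {
		mean += float64(n)
	}
	mean /= float64(len(shardsPerDisk))
	v := 0.0
	for _, n := range shardsPerDisk {
		d := float64(n) - mean
		v += d * d
	}
	return v / float64(len(shardsPerDisk))
}

func main() {
	disks := []int{3, 1, 2, 2} // current shard counts on a 4-disk server
	before := shardVariance(disks)
	disks[pickBestDisk(disks)]++ // place the next shard on the emptiest disk
	fmt.Println(pickBestDisk([]int{3, 1, 2, 2}), before > shardVariance(disks)) // 1 true
}
```

Greedy least-loaded placement monotonically reduces (or preserves) this variance, which is why the integration tests can assert that no single disk ends up overloaded.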
2025-12-02  Mutex command output writes for `volume.check.disk`. (#7605)  Lisandro Pin  2 files, -22/+28

Prevents potential screen garbling when operations are parallelized. Also simplifies logging by automatically adding newlines to output when necessary.

Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>
2025-12-02  Parallelize read-only volume check pass for `volume.check.disk`. (#7602)  Lisandro Pin  2 files, -23/+37
2025-12-02  Fix SSE-S3 copy: preserve encryption metadata and set chunk SSE type (#7598)  Chris Lu  4 files, -115/+244

* Fix SSE-S3 copy: preserve encryption metadata and set chunk SSE type

Fixes GitHub #7562: copying objects between encrypted buckets was failing.

Root causes:
1. processMetadataBytes was re-adding SSE headers from the source entry, undoing the encryption header filtering. It now uses dstEntry.Extended, which is already filtered.
2. SSE-S3 streaming copy returned nil metadata. It now properly generates and returns SSE-S3 destination metadata (SeaweedFSSSES3Key, AES256 header) via ExecuteStreamingCopyWithMetadata.
3. Chunks created during streaming copy didn't have SseType set. It now sets SseType and per-chunk SseMetadata with chunk-specific IVs for SSE-S3, enabling proper decryption on GetObject.

* Address review: make SSE-S3 metadata serialization failures fatal errors

- In executeEncryptCopy: return an error instead of just logging if SerializeSSES3Metadata fails
- In createChunkFromData: return an error if chunk SSE-S3 metadata serialization fails

This ensures objects/chunks are never created without proper encryption metadata, preventing unreadable/corrupted data.

* fmt

* Refactor: reuse function names instead of creating WithMetadata variants

- Change ExecuteStreamingCopy to return (*EncryptionSpec, error) directly
- Remove the ExecuteStreamingCopyWithMetadata wrapper
- Change executeStreamingReencryptCopy to return (*EncryptionSpec, error)
- Remove the executeStreamingReencryptCopyWithMetadata wrapper
- Update callers to ignore the encryption spec with _ where not needed

* Add TODO documenting large file SSE-S3 copy limitation

The streaming copy approach encrypts the entire stream with a single IV but stores data in chunks with per-chunk IVs. This causes decryption issues for large files. Small inline files work correctly. This is a known architectural issue that needs separate work to fix.

* Use chunk-by-chunk encryption for SSE-S3 copy (consistent with SSE-C/SSE-KMS)

Instead of streaming encryption (which had IV mismatch issues for multi-chunk files), SSE-S3 now uses the same chunk-by-chunk approach as SSE-C and SSE-KMS:
1. Extended copyMultipartCrossEncryption to handle SSE-S3:
   - Added SSE-S3 source decryption in copyCrossEncryptionChunk
   - Added SSE-S3 destination encryption with per-chunk IVs
   - Added object-level metadata generation for SSE-S3 destinations
2. Updated routing in executeEncryptCopy/executeDecryptCopy/executeReencryptCopy to use copyMultipartCrossEncryption for all SSE-S3 scenarios
3. Removed the streaming copy functions (shouldUseStreamingCopy, executeStreamingReencryptCopy) as they're no longer used
4. Added a large file (1MB) integration test to verify that chunk-by-chunk copy works

This ensures consistent behavior across all SSE types and fixes the data corruption that occurred with large files in the streaming copy approach.

* fmt

* fmt

* Address review: fail explicitly if SSE-S3 metadata is missing

Instead of silently ignoring missing SSE-S3 metadata (which could create unreadable objects), the copy operation now explicitly fails with a clear error message if:
- The first chunk is missing
- The first chunk doesn't have the SSE-S3 type
- The first chunk has empty SSE metadata
- Deserialization fails

* Address review: improve comment to reflect full scope of chunk creation

* Address review: fail explicitly if baseIV is empty for SSE-S3 chunk encryption

If DestinationIV is not set when encrypting SSE-S3 chunks, the chunk would be created without SseMetadata, causing GetObject decryption to fail later. This now fails explicitly with a clear error message.

Note: calculateIVWithOffset returns ([]byte, int), not ([]byte, error) - the int is a skip amount for intra-block alignment, not an error code.

* Address review: handle 0-byte files in SSE-S3 copy

For 0-byte files, there are no chunks to get metadata from. Generate an IV for the object-level metadata to ensure even empty files are properly marked as SSE-S3 encrypted. Also validate that we don't have a non-empty file with no chunks (which would indicate an internal error).
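The (iv, skip) return shape mentioned above makes sense for AES-CTR: to decrypt a chunk that starts at some byte offset, advance the base IV's counter by the number of whole 16-byte blocks before the chunk, and skip offset mod 16 keystream bytes for intra-block alignment. The arithmetic below is an assumption about that standard CTR construction (illustrative, not code copied from SeaweedFS), demonstrated by decrypting a mid-stream slice.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"fmt"
)

// calculateIVWithOffset treats the IV as a big-endian counter, adds the
// number of whole blocks before the offset, and returns how many bytes
// of the first keystream block to discard.
func calculateIVWithOffset(baseIV []byte, offset int64) ([]byte, int) {
	iv := make([]byte, len(baseIV))
	copy(iv, baseIV)
	add := offset / aes.BlockSize // whole blocks before this chunk
	for i := len(iv) - 1; i >= 0 && add > 0; i-- {
		sum := int64(iv[i]) + add&0xff
		iv[i] = byte(sum)
		add = add>>8 + sum>>8 // propagate the carry
	}
	return iv, int(offset % aes.BlockSize)
}

func main() {
	key := make([]byte, 32) // AES-256 key (all zeroes, illustration only)
	baseIV := make([]byte, aes.BlockSize)
	plaintext := []byte("0123456789abcdef0123456789ABCDEF tail")

	block, _ := aes.NewCipher(key)
	ciphertext := make([]byte, len(plaintext))
	cipher.NewCTR(block, baseIV).XORKeyStream(ciphertext, plaintext)

	// Decrypt only the bytes from offset 20 using the derived IV.
	off := int64(20)
	iv, skip := calculateIVWithOffset(baseIV, off)
	stream := cipher.NewCTR(block, iv)
	stream.XORKeyStream(make([]byte, skip), make([]byte, skip)) // discard alignment bytes
	out := make([]byte, len(ciphertext)-int(off))
	stream.XORKeyStream(out, ciphertext[off:])
	fmt.Println(string(out) == string(plaintext[off:])) // true
}
```

This also shows why the streaming-copy approach corrupted large files: a single-IV stream and per-chunk IVs only agree if every chunk IV is derived from the same base IV with the correct offset.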
2025-12-01  Fix issue #6847: S3 chunked encoding includes headers in stored content (#7595)  Chris Lu  1 file, -39/+38

* Fix issue #6847: S3 chunked encoding includes headers in stored content

- Add a hasTrailer flag to s3ChunkedReader to track trailer presence
- Update the state transition logic to properly handle trailers in unsigned streaming
- Enhance parseChunkChecksum to handle multiple trailer lines
- Skip checksum verification for unsigned streaming uploads
- Add a test case for mixed format handling (unsigned headers with signed chunks)
- Remove redundant CRLF reading in trailer processing

This fixes the issue where chunk-signature and x-amz headers were appearing in stored file content when using chunked encoding with newer AWS SDKs.

* Fix checksum validation for unsigned streaming uploads

- Always validate the checksum for data integrity, regardless of signing
- Correct the checksum value in the test case
- Addresses PR review feedback about checksum verification

* Add warning log when multiple checksum headers found in trailer

- Log a warning when multiple valid checksum headers appear in trailers
- Use the last checksum header, as suggested by the CodeRabbit reviewer
- Improves debugging for edge cases with multiple checksum algorithms

* Improve trailer parsing robustness in parseChunkChecksum

- Remove the redundant trimTrailingWhitespace call, since readChunkLine already trims
- Use bytes.TrimSpace for both key and value to handle whitespace around the colon separator
- Follows the HTTP header specification for optional whitespace around separators
- Addresses Gemini Code Assist review feedback
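The trailer-parsing rules above (split on the first colon, trim optional whitespace around key and value, keep the last checksum trailer when several appear) can be sketched as follows. Names are illustrative, not the actual parseChunkChecksum code.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
)

// parseTrailerLine splits one trailer line into key and value, trimming
// optional whitespace around the colon per HTTP header conventions.
func parseTrailerLine(line []byte) (key, value string, ok bool) {
	k, v, found := bytes.Cut(line, []byte(":"))
	if !found {
		return "", "", false
	}
	return string(bytes.TrimSpace(k)), string(bytes.TrimSpace(v)), true
}

// lastChecksum scans all trailer lines and keeps the last valid
// x-amz-checksum-* header, mirroring the behavior described above.
func lastChecksum(trailerLines [][]byte) (key, value string) {
	for _, l := range trailerLines {
		if k, v, ok := parseTrailerLine(l); ok && strings.HasPrefix(k, "x-amz-checksum-") {
			key, value = k, v
		}
	}
	return
}

func main() {
	lines := [][]byte{
		[]byte("x-amz-checksum-crc32 : sOO8/Q=="),  // whitespace around the colon
		[]byte("x-amz-checksum-crc32c: wdBDMA=="), // last one wins
	}
	k, v := lastChecksum(lines)
	fmt.Println(k, v) // x-amz-checksum-crc32c wdBDMA==
}
```

Consuming these lines as trailers, rather than letting them fall through into the body, is precisely what keeps chunk-signature and x-amz headers out of the stored file content.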
2025-12-01  Fix test stability: increase cluster stabilization delay to 5 seconds  chrislu  3 files, -14/+770
The tests were intermittently failing because the volume server needed more time to create volumes and register with the master. Increasing the delay from 2 to 5 seconds fixes the flaky test behavior.
2025-12-01  Address code review comments: fix variable shadowing, sniff size, and test stability  chrislu  2 files, -7/+27

- Rename the path variable to reqPath to avoid shadowing the path package
- Make the sniff buffer size respect contentLength (read at most contentLength bytes)
- Handle Content-Length < 0 in creation-with-upload (return an error for chunked encoding)
- Fix the test cluster: use a temp directory for the filer store, add a startup delay
2025-12-01  Set S3_ENDPOINT environment variable in CI workflow for tagging tests  Chris Lu  1 file, -0/+5
2025-12-01  Fix tagging test pattern to run our comprehensive tests instead of basic tests  Chris Lu  1 file, -2/+2
2025-12-01  Merge branch 'fix-s3-object-tagging-issue-7589' into copilot/fix-s3-object-tagging-issue-again  Chris Lu  0 files, -0/+0
2025-12-01  Update s3-tagging-tests to use Makefile server management like other S3 tests  Chris Lu  2 files, -88/+324
2025-12-01  Fix race condition to work across multiple filer instances  chrislu  1 file, -48/+90

- Store each chunk as a separate file entry instead of updating the session JSON
- Chunk file names encode offset, size, and fileId for atomic storage
- getTusSession loads chunks from a directory listing (atomic read)
- Eliminates the read-modify-write race condition across multiple filers
- Remove the in-memory mutex that only worked for a single filer instance
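Encoding offset, size, and fileId into each chunk's entry name is what makes the write atomic: creating one file needs no coordination, and any filer can rebuild the session from a directory listing. The "offset_size_fileId" layout below (zero-padded hex, so names also sort lexicographically by offset) is an illustrative assumption, not the exact on-disk format.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// encodeChunkName packs a chunk's offset, size, and fileId into a file
// name. Zero-padded hex keeps a directory listing sorted by offset.
func encodeChunkName(offset, size int64, fileId string) string {
	return fmt.Sprintf("%016x_%08x_%s", offset, size, fileId)
}

// decodeChunkName reverses encodeChunkName. SplitN with a limit of 3
// tolerates underscores inside the fileId itself.
func decodeChunkName(name string) (offset, size int64, fileId string, err error) {
	parts := strings.SplitN(name, "_", 3)
	if len(parts) != 3 {
		return 0, 0, "", fmt.Errorf("malformed chunk name %q", name)
	}
	if offset, err = strconv.ParseInt(parts[0], 16, 64); err != nil {
		return
	}
	if size, err = strconv.ParseInt(parts[1], 16, 64); err != nil {
		return
	}
	return offset, size, parts[2], nil
}

func main() {
	name := encodeChunkName(1048576, 65536, "3,01637037d6")
	off, size, fid, _ := decodeChunkName(name)
	fmt.Println(off, size, fid) // 1048576 65536 3,01637037d6
}
```

A later commit in this series sorts chunks by offset before assembling the final file; with offset-ordered names, the listing already arrives in that order.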
2025-12-01  Fix port conflict in s3-tagging-tests CI job by changing volume port from 8084 to 8085  Chris Lu  1 file, -2/+2
2025-12-01  Address critical and high-priority review comments  chrislu  2 files, -12/+66

- Add per-session locking to prevent race conditions in updateTusSessionOffset
- Stream data directly to the volume server instead of buffering the entire chunk
- Only buffer 512 bytes for MIME type detection, then stream the remaining data
- Clean up session locks when the session is deleted
2025-12-01  Fix port conflict in s3-tagging-tests CI job by changing volume port from 8084 to 8085  Chris Lu  1 file, -2/+2
2025-12-01  Merge branch 'fix-s3-object-tagging-issue-7589' into copilot/fix-s3-object-tagging-issue-again  Chris Lu  1 file, -1/+1
2025-12-01  Add comment to s3-tagging-tests job to trigger CI re-run  Chris Lu  1 file, -0/+1
2025-12-01  Fix CI workflow: remove 'cd weed' since the working directory is already set to weed  Chris Lu  1 file, -1/+0
2025-12-01  Merge branch 'fix-s3-object-tagging-issue-7589' into copilot/fix-s3-object-tagging-issue-again  Chris Lu  2 files, -3/+127
2025-12-01  Add S3 object tagging tests to CI workflow  Chris Lu  2 files, -3/+127

- Modified test/s3/tagging/s3_tagging_test.go to use environment variables for a configurable endpoint and credentials
- Added an s3-tagging-tests job to .github/workflows/s3-go-tests.yml to run tagging tests in CI
- Tests will now run automatically on pull requests
2025-12-01  Initial plan  copilot-swe-agent[bot]  0 files, -0/+0
2025-12-01  Fix S3 object tagging issue #7589  Chris Lu  4 files, -2/+534

- Add X-Amz-Tagging header parsing in the putToFiler function for PUT object operations
- Store tags with the X-Amz-Tagging- prefix in entry.Extended metadata
- Add a comprehensive test suite for S3 object tagging functionality
- Tests cover upload tagging, API operations, special characters, and edge cases
2025-12-01  Address remaining code review comments  chrislu  3 files, -10/+36

- Fix a potential open redirect vulnerability by sanitizing the uploadLocation path
- Add a language specifier to the README code block
- Handle os.Create errors in test setup
- Use waitForHTTPServer instead of time.Sleep for master/volume readiness
- Improve test reliability and debugging
2025-12-01  fmt  chrislu  2 files, -2/+0
2025-12-01  Address code review comments  chrislu  4 files, -11/+23

- Sort chunks by offset before assembling the final file
- Use chunk.Offset directly instead of recalculating
- Return an error on an invalid file ID instead of skipping
- Require the Content-Length header for PATCH requests
- Use fs.option.Cipher for the encryption setting
- Detect the MIME type from data using http.DetectContentType
- Fix the concurrency group for push events in the workflow
- Use os.Interrupt instead of Kill for graceful shutdown in tests
2025-12-01Rename -tus.path to -tusBasePath with default .tuschrislu6-13/+22
- Rename CLI flag from -tus.path to -tusBasePath - Default to .tus (TUS enabled by default) - Add -filer.tusBasePath option to weed server command - Properly handle path prefix (prepend / if missing)
2025-12-01filer: add username and keyPrefix support for Redis stores (#7591)Chris Lu9-35/+83
* filer: add username and keyPrefix support for Redis stores Addresses https://github.com/seaweedfs/seaweedfs/issues/7299 - Add username config option to redis2, redis_cluster2, redis_lua, and redis_lua_cluster stores (sentinel stores already had it) - Add keyPrefix config option to all Redis stores to prefix all keys, useful for Envoy Redis Proxy or multi-tenant Redis setups * refactor: reduce duplication in redis.NewClient creation Address code review feedback by defining redis.Options once and conditionally setting TLSConfig instead of duplicating the entire NewClient call. * filer.toml: add username and keyPrefix to redis2.tmp example
2025-12-01Make TUS base path configurable via CLIchrislu5-8/+43
- Add -tus.path CLI flag to filer command - TUS is disabled by default (empty path) - Example: -tus.path=/.tus to enable at /.tus endpoint - Update test Makefile to use -tus.path flag - Update README with TUS enabling instructions
2025-12-01Add TUS protocol tests to GitHub Actions CIchrislu1-0/+114
- Add tus-tests.yml workflow that runs on PRs and pushes - Runs when TUS-related files are modified - Automatic server management for integration testing - Upload logs on failure for debugging
2025-12-01Fix TUS integration tests and creation-with-uploadchrislu4-26/+42
- Fix test URLs to use full URLs instead of relative paths - Fix creation-with-upload to refresh session before completing - Fix Makefile to properly handle test cleanup - Add FullURL helper function to TestCluster
2025-12-01Improve TUS integration test setupchrislu2-40/+427
Add comprehensive Makefile for TUS tests with targets: - test-with-server: Run tests with automatic server management - test-basic/chunked/resume/errors: Specific test categories - manual-start/stop: For development testing - debug-logs/status: For debugging - ci-test: For CI/CD pipelines Update README.md with: - Detailed TUS protocol documentation - All endpoint descriptions with headers - Usage examples with curl commands - Architecture diagram - Comparison with S3 multipart uploads Follows the pattern established by other tests in test/ folder.
2025-12-01Wire up TUS protocol routes in filer serverchrislu1-0/+2
Add TUS handler route (/.tus/) to the filer HTTP server. The TUS route is registered before the catch-all route to ensure proper routing of TUS protocol requests. TUS protocol is now accessible at: - OPTIONS /.tus/ - Capability discovery - POST /.tus/{path} - Create upload - HEAD /.tus/.uploads/{id} - Get offset - PATCH /.tus/.uploads/{id} - Upload data - DELETE /.tus/.uploads/{id} - Cancel upload
2025-12-01Add TUS HTTP handlerschrislu1-0/+352
Implements TUS protocol HTTP handlers: - tusHandler: Main entry point routing requests - tusOptionsHandler: Capability discovery (OPTIONS) - tusCreateHandler: Create new upload (POST) - tusHeadHandler: Get upload offset (HEAD) - tusPatchHandler: Upload data at offset (PATCH) - tusDeleteHandler: Cancel upload (DELETE) - tusWriteData: Upload data to volume servers Features: - Supports creation-with-upload extension - Validates TUS protocol headers - Offset conflict detection - Automatic upload completion when size is reached - Metadata parsing from Upload-Metadata header
2025-12-01Add TUS session storage types and utilitieschrislu1-0/+255
Implements TUS upload session management: - TusSession struct for tracking upload state - Session creation with directory-based storage - Session persistence using filer entries - Session retrieval and offset updates - Session deletion with chunk cleanup - Upload completion with chunk assembly into final file Session data is stored in /.uploads.tus/{upload-id}/ directory, following the pattern used by S3 multipart uploads.
2025-12-01Add TUS protocol integration testschrislu2-0/+804
This commit adds integration tests for the TUS (resumable upload) protocol in preparation for implementing TUS support in the filer. Test coverage includes: - OPTIONS handler for capability discovery - Basic single-request upload - Chunked/resumable uploads - HEAD requests for offset tracking - DELETE for upload cancellation - Error handling (invalid offsets, missing uploads) - Creation-with-upload extension - Resume after interruption simulation Tests are skipped in short mode and require a running SeaweedFS cluster.
2025-12-01chore(deps): bump github.com/prometheus/procfs from 0.19.1 to 0.19.2 (#7577)dependabot[bot]4-6/+6
* chore(deps): bump github.com/prometheus/procfs from 0.19.1 to 0.19.2 Bumps [github.com/prometheus/procfs](https://github.com/prometheus/procfs) from 0.19.1 to 0.19.2. - [Release notes](https://github.com/prometheus/procfs/releases) - [Commits](https://github.com/prometheus/procfs/compare/v0.19.1...v0.19.2) --- updated-dependencies: - dependency-name: github.com/prometheus/procfs dependency-version: 0.19.2 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> * go mod --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com> Co-authored-by: Chris Lu <chris.lu@gmail.com>
2025-12-01chore(deps): bump github.com/klauspost/compress from 1.18.1 to 1.18.2 (#7576)dependabot[bot]4-6/+6
* chore(deps): bump github.com/klauspost/compress from 1.18.1 to 1.18.2 Bumps [github.com/klauspost/compress](https://github.com/klauspost/compress) from 1.18.1 to 1.18.2. - [Release notes](https://github.com/klauspost/compress/releases) - [Commits](https://github.com/klauspost/compress/compare/v1.18.1...v1.18.2) --- updated-dependencies: - dependency-name: github.com/klauspost/compress dependency-version: 1.18.2 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> * go mod --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com> Co-authored-by: Chris Lu <chris.lu@gmail.com>
2025-12-01Fix #7575: Correct interface check for filer address function in admin server (#7588)Chris Lu3-15/+122
* Fix #7575: Correct interface check for filer address function in admin server Problem: User creation in object store was failing with error: 'filer_etc: filer address function not configured' Root Cause: In admin_server.go, the code checked for incorrect interface method SetFilerClient(string, grpc.DialOption) instead of the actual SetFilerAddressFunc(func() pb.ServerAddress, grpc.DialOption) This interface mismatch prevented the filer address function from being configured, causing user creation operations to fail. Solution: - Fixed interface check to use SetFilerAddressFunc - Updated function call to properly configure filer address function - Function now dynamically returns current active filer address Tests Added: - Unit tests in weed/admin/dash/user_management_test.go - Integration tests in test/admin/user_creation_integration_test.go - Documentation in test/admin/README.md All tests pass successfully. * Fix #7575: Correct interface check for filer address function in admin UI Problem: User creation in Admin UI was failing with error: 'filer_etc: filer address function not configured' Root Cause: In admin_server.go, the code checked for incorrect interface method SetFilerClient(string, grpc.DialOption) instead of the actual SetFilerAddressFunc(func() pb.ServerAddress, grpc.DialOption) This interface mismatch prevented the filer address function from being configured, causing user creation operations to fail in the Admin UI. Note: This bug only affects the Admin UI. The S3 API and weed shell commands (s3.configure) were unaffected as they use the correct interface or bypass the credential manager entirely.
Solution: - Fixed interface check in admin_server.go to use SetFilerAddressFunc - Updated function call to properly configure filer address function - Function now dynamically returns current active filer (HA-aware) - Cleaned up redundant comments in the code Tests Added: - Unit tests in weed/admin/dash/user_management_test.go * TestFilerAddressFunctionInterface - verifies correct interface * TestGenerateAccessKey - tests key generation * TestGenerateSecretKey - tests secret generation * TestGenerateAccountId - tests account ID generation All tests pass and will run automatically in CI. * Fix #7575: Correct interface check for filer address function in admin UI Problem: User creation in Admin UI was failing with error: 'filer_etc: filer address function not configured' Root Cause: 1. In admin_server.go, the code checked for incorrect interface method SetFilerClient(string, grpc.DialOption) instead of the actual SetFilerAddressFunc(func() pb.ServerAddress, grpc.DialOption) 2. The admin command was missing the filer_etc import, so the store was never registered This interface mismatch prevented the filer address function from being configured, causing user creation operations to fail in the Admin UI. Note: This bug only affects the Admin UI. The S3 API and weed shell commands (s3.configure) were unaffected as they use the correct interface or bypass the credential manager entirely. 
Solution: - Added filer_etc import to weed/command/admin.go to register the store - Fixed interface check in admin_server.go to use SetFilerAddressFunc - Updated function call to properly configure filer address function - Function now dynamically returns current active filer (HA-aware) - Hoisted credentialManager assignment to reduce code duplication Tests Added: - Unit tests in weed/admin/dash/user_management_test.go * TestFilerAddressFunctionInterface - verifies correct interface * TestGenerateAccessKey - tests key generation * TestGenerateSecretKey - tests secret generation * TestGenerateAccountId - tests account ID generation All tests pass and will run automatically in CI.
2025-12-01Enable FIPS 140-3 compliant crypto by default (#7590)Chris Lu2-0/+7
* Enable FIPS 140-3 compliant crypto by default Addresses #6889 - Enable GOEXPERIMENT=systemcrypto by default in all Makefiles - Enable GOEXPERIMENT=systemcrypto by default in all Dockerfiles - Go 1.24+ has native FIPS 140-3 support via this setting - Users can disable by setting GOEXPERIMENT= (empty) Algorithms used (all FIPS approved): - AES-256-GCM for data encryption - AES-256-CTR for SSE-C - HMAC-SHA256 for S3 signatures - TLS 1.2/1.3 for transport encryption * Fix: Remove invalid GOEXPERIMENT=systemcrypto Go 1.24 uses GODEBUG=fips140=on at runtime, not GOEXPERIMENT at build time. - Remove GOEXPERIMENT=systemcrypto from all Makefiles - Remove GOEXPERIMENT=systemcrypto from all Dockerfiles FIPS 140-3 mode can be enabled at runtime: GODEBUG=fips140=on ./weed server ... * Add FIPS 140-3 support enabled by default Addresses #6889 - FIPS 140-3 mode is ON by default in Docker containers - Sets GODEBUG=fips140=on via entrypoint.sh - To disable: docker run -e GODEBUG=fips140=off ...
2025-12-01chore(deps): bump github.com/shirou/gopsutil/v4 from 4.25.10 to 4.25.11 (#7579)dependabot[bot]4-36/+36
* chore(deps): bump github.com/shirou/gopsutil/v4 from 4.25.10 to 4.25.11 Bumps [github.com/shirou/gopsutil/v4](https://github.com/shirou/gopsutil) from 4.25.10 to 4.25.11. - [Release notes](https://github.com/shirou/gopsutil/releases) - [Commits](https://github.com/shirou/gopsutil/compare/v4.25.10...v4.25.11) --- updated-dependencies: - dependency-name: github.com/shirou/gopsutil/v4 dependency-version: 4.25.11 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> * go mod --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Chris Lu <chris.lu@gmail.com>