aboutsummaryrefslogtreecommitdiff
path: root/weed/s3api/s3api_object_handlers_delete.go
AgeCommit message (Collapse)AuthorFilesLines
2025-12-03filer: async empty folder cleanup via metadata events (#7614)Chris Lu1-52/+5
* filer: async empty folder cleanup via metadata events Implements asynchronous empty folder cleanup when files are deleted in S3. Key changes: 1. EmptyFolderCleaner - New component that handles folder cleanup: - Uses consistent hashing (LockRing) to determine folder ownership - Each filer owns specific folders, avoiding duplicate cleanup work - Debounces delete events (10s delay) to batch multiple deletes - Caches rough folder counts to skip unnecessary checks - Cancels pending cleanup when new files are created - Handles both file and subdirectory deletions 2. Integration with metadata events: - Listens to both local and remote filer metadata events - Processes create/delete/rename events to track folder state - Only processes folders under /buckets/<bucket>/... 3. Removed synchronous empty folder cleanup from S3 handlers: - DeleteObjectHandler no longer calls DoDeleteEmptyParentDirectories - DeleteMultipleObjectsHandler no longer tracks/cleans directories - Cleanup now happens asynchronously via metadata events Benefits: - Non-blocking: S3 delete requests return immediately - Coordinated: Only one filer (the owner) cleans each folder - Efficient: Batching and caching reduce unnecessary checks - Event-driven: Folder deletion triggers parent folder check automatically * filer: add CleanupQueue data structure for deduplicated folder cleanup CleanupQueue uses a linked list for FIFO ordering and a hashmap for O(1) deduplication. Processing is triggered when: - Queue size reaches maxSize (default 1000), OR - Oldest item exceeds maxAge (default 10 minutes) Key features: - O(1) Add, Remove, Pop, Contains operations - Duplicate folders are ignored (keeps original position/time) - Testable with injectable time function - Thread-safe with mutex protection * filer: use CleanupQueue for empty folder cleanup Replace timer-per-folder approach with queue-based processing: - Use CleanupQueue for deduplication and ordered processing - Process queue when full (1000 items) or oldest item exceeds 10 minutes - Background processor checks queue every 10 seconds - Remove from queue on create events to cancel pending cleanup Benefits: - Bounded memory: queue has max size, not unlimited timers - Efficient: O(1) add/remove/contains operations - Batch processing: handle many folders efficiently - Better for high-volume delete scenarios * filer: CleanupQueue.Add moves duplicate to back with updated time When adding a folder that already exists in the queue: - Remove it from its current position - Add it to the back of the queue - Update the queue time to current time This ensures that folders with recent delete activity are processed later, giving more time for additional deletes to occur. * filer: CleanupQueue uses event time and inserts in sorted order Changes: - Add() now takes eventTime parameter instead of using current time - Insert items in time-sorted order (oldest at front) to handle out-of-order events - When updating duplicate with newer time, reposition to maintain sort order - Ignore updates with older time (keep existing later time) This ensures proper ordering when processing events from distributed filers where event arrival order may not match event occurrence order. * filer: remove unused CleanupQueue functions (SetNowFunc, GetAll) Removed test-only functions: - SetNowFunc: tests now use real time with past event times - GetAll: tests now use Pop() to verify order Kept functions used in production: - Peek: used in filer_notify_read.go - OldestAge: used in empty_folder_cleaner.go logging * filer: initialize cache entry on first delete/create event Previously, roughCount was only updated if the cache entry already existed, but entries were only created during executeCleanup. This meant delete/create events before the first cleanup didn't track the count. Now create the cache entry on first event, so roughCount properly tracks all changes from the start. * filer: skip adding to cleanup queue if roughCount > 0 If the cached roughCount indicates there are still items in the folder, don't bother adding it to the cleanup queue. This avoids unnecessary queue entries and reduces wasted cleanup checks. * filer: don't create cache entry on create event Only update roughCount if the folder is already being tracked. New folders don't need tracking until we see a delete event. * filer: move empty folder cleanup to its own package - Created weed/filer/empty_folder_cleanup package - Defined FilerOperations interface to break circular dependency - Added CountDirectoryEntries method to Filer - Exported IsUnderPath and IsUnderBucketPath helper functions * filer: make isUnderPath and isUnderBucketPath private These helpers are only used within the empty_folder_cleanup package.
2025-11-05do delete expired entries on s3 list request (#7426)Konstantin Lebedev1-48/+41
* do delete expired entries on s3 list request https://github.com/seaweedfs/seaweedfs/issues/6837 * disable delete expires s3 entry in filer * pass opt allowDeleteObjectsByTTL to all servers * delete on get and head * add lifecycle expiration s3 tests * fix opt allowDeleteObjectsByTTL for server * fix test lifecycle expiration * fix IsExpired * fix locationPrefix for updateEntriesTTL * fix s3tests * resolv coderabbitai * GetS3ExpireTime on filer * go mod * clear TtlSeconds for volume * move s3 delete expired entry to filer * filer delete meta and data * del unusing func removeExpiredObject * test s3 put * test s3 put multipart * allowDeleteObjectsByTTL by default * fix pipline tests * rm dublicate SeaweedFSExpiresS3 * revert expiration tests * fix updateTTL * rm log * resolv comment * fix delete version object * fix S3Versioning * fix delete on FindEntry * fix delete chunks * fix sqlite not support concurrent writes/reads * move deletion out of listing transaction; delete entries and empty folders * Revert "fix sqlite not support concurrent writes/reads" This reverts commit 5d5da14e0ed91c613fe5c0ed058f58bb04fba6f0. * clearer handling on recursive empty directory deletion * handle listing errors * strut copying * reuse code to delete empty folders * use iterative approach with a queue to avoid recursive WithFilerClient calls * stop a gRPC stream from the client-side callback is to return a specific error, e.g., io.EOF * still issue UpdateEntry when the flag must be added * errors join * join path * cleaner * add context, sort directories by depth (deepest first) to avoid redundant checks * batched operation, refactoring * prevent deleting bucket * constant * reuse code * more logging * refactoring * s3 TTL time * Safety check --------- Co-authored-by: chrislu <chris.lu@gmail.com>
2025-07-21fix listing object versions (#7006)Chris Lu1-31/+85
* fix listing object versions * Update s3api_object_versioning.go * Update s3_directory_versioning_test.go * check previous skipped tests * fix test_versioning_stack_delete_merkers * address test_bucket_list_return_data_versioning * Update s3_directory_versioning_test.go * fix test_versioning_concurrent_multi_object_delete * fix test_versioning_obj_suspend_versions test * fix empty owner * fix listing versioned objects * default owner * fix path
2025-07-19test versioning also (#7000)Chris Lu1-30/+91
* test versioning also * fix some versioning tests * fall back * fixes Never-versioned buckets: No VersionId headers, no Status field Pre-versioning objects: Regular files, VersionId="null", included in all operations Post-versioning objects: Stored in .versions directories with real version IDs Suspended versioning: Proper status handling and null version IDs * fixes Bucket Versioning Status Compliance Fixed: New buckets now return no Status field (AWS S3 compliant) Before: Always returned "Suspended" ❌ After: Returns empty VersioningConfiguration for unconfigured buckets ✅ 2. Multi-Object Delete Versioning Support Fixed: DeleteMultipleObjectsHandler now fully versioning-aware Before: Always deleted physical files, breaking versioning ❌ After: Creates delete markers or deletes specific versions properly ✅ Added: DeleteMarker field in response structure for AWS compatibility 3. Copy Operations Versioning Support Fixed: CopyObjectHandler and CopyObjectPartHandler now versioning-aware Before: Only copied regular files, couldn't handle versioned sources ❌ After: Parses version IDs from copy source, creates versions in destination ✅ Added: pathToBucketObjectAndVersion() function for version ID parsing 4. Pre-versioning Object Handling Fixed: getLatestObjectVersion() now has proper fallback logic Before: Failed when .versions directory didn't exist ❌ After: Falls back to regular objects for pre-versioning scenarios ✅ 5. Enhanced Object Version Listings Fixed: listObjectVersions() includes both versioned AND pre-versioning objects Before: Only showed .versions directories, ignored pre-versioning objects ❌ After: Shows complete version history with VersionId="null" for pre-versioning ✅ 6. Null Version ID Handling Fixed: getSpecificObjectVersion() properly handles versionId="null" Before: Couldn't retrieve pre-versioning objects by version ID ❌ After: Returns regular object files for "null" version requests ✅ 7. Version ID Response Headers Fixed: PUT operations only return x-amz-version-id when appropriate Before: Returned version IDs for non-versioned buckets ❌ After: Only returns version IDs for explicitly configured versioning ✅ * more fixes * fix copying with versioning, multipart upload * more fixes * reduce volume size for easier dev test * fix * fix version id * fix versioning * Update filer_multipart.go * fix multipart versioned upload * more fixes * more fixes * fix versioning on suspended * fixes * fixing test_versioning_obj_suspended_copy * Update s3api_object_versioning.go * fix versions * skipping test_versioning_obj_suspend_versions * > If the versioning state has never been set on a bucket, it has no versioning state; a GetBucketVersioning request does not return a versioning state value. * fix tests, avoid duplicated bucket creation, skip tests * only run s3tests_boto3/functional/test_s3.py * fix checking filer_pb.ErrNotFound * Update weed/s3api/s3api_object_versioning.go Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update weed/s3api/s3api_object_handlers_copy.go Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update weed/s3api/s3api_bucket_config.go Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update test/s3/versioning/s3_versioning_test.go Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-18Test object lock and retention (#6997)Chris Lu1-14/+21
* fix GetObjectLockConfigurationHandler * cache and use bucket object lock config * subscribe to bucket configuration changes * increase bucket config cache TTL * refactor * Update weed/s3api/s3api_server.go Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> * avoid duplidated work * rename variable * Update s3api_object_handlers_put.go * fix routing * admin ui and api handler are consistent now * use fields instead of xml * fix test * address comments * Update weed/s3api/s3api_object_handlers_put.go Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update test/s3/retention/s3_retention_test.go Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update weed/s3api/object_lock_utils.go Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * change error style * errorf * read entry once * add s3 tests for object lock and retention * use marker * install s3 tests * Update s3tests.yml * Update s3tests.yml * Update s3tests.conf * Update s3tests.conf * address test errors * address test errors With these fixes, the s3-tests should now: ✅ Return InvalidBucketState (409 Conflict) for object lock operations on invalid buckets ✅ Return MalformedXML for invalid retention configurations ✅ Include VersionId in response headers when available ✅ Return proper HTTP status codes (403 Forbidden for retention mode changes) ✅ Handle all object lock validation errors consistently * fixes With these comprehensive fixes, the s3-tests should now: ✅ Return InvalidBucketState (409 Conflict) for object lock operations on invalid buckets ✅ Return InvalidRetentionPeriod for invalid retention periods ✅ Return MalformedXML for malformed retention configurations ✅ Include VersionId in response headers when available ✅ Return proper HTTP status codes for all error conditions ✅ Handle all object lock validation errors consistently The workflow should now pass significantly more object lock tests, bringing SeaweedFS's S3 object lock implementation much closer to AWS S3 compatibility standards. * fixes With these final fixes, the s3-tests should now: ✅ Return MalformedXML for ObjectLockEnabled: 'Disabled' ✅ Return MalformedXML when both Days and Years are specified in retention configuration ✅ Return InvalidBucketState (409 Conflict) when trying to suspend versioning on buckets with object lock enabled ✅ Handle all object lock validation errors consistently with proper error codes * constants and fixes ✅ Return InvalidRetentionPeriod for invalid retention values (0 days, negative years) ✅ Return ObjectLockConfigurationNotFoundError when object lock configuration doesn't exist ✅ Handle all object lock validation errors consistently with proper error codes * fixes ✅ Return MalformedXML when both Days and Years are specified in the same retention configuration ✅ Return 400 (Bad Request) with InvalidRequest when object lock operations are attempted on buckets without object lock enabled ✅ Handle all object lock validation errors consistently with proper error codes * fixes ✅ Return 409 (Conflict) with InvalidBucketState for bucket-level object lock configuration operations on buckets without object lock enabled ✅ Allow increasing retention periods and overriding retention with same/later dates ✅ Only block decreasing retention periods without proper bypass permissions ✅ Handle all object lock validation errors consistently with proper error codes * fixes ✅ Include VersionId in multipart upload completion responses when versioning is enabled ✅ Block retention mode changes (GOVERNANCE ↔ COMPLIANCE) without bypass permissions ✅ Handle all object lock validation errors consistently with proper error codes ✅ Pass the remaining object lock tests * fix tests * fixes * pass tests * fix tests * fixes * add error mapping * Update s3tests.conf * fix test_object_lock_put_obj_lock_invalid_days * fixes * fix many issues * fix test_object_lock_delete_multipart_object_with_legal_hold_on * fix tests * refactor * fix test_object_lock_delete_object_with_retention_and_marker * fix tests * fix tests * fix tests * fix test itself * fix tests * fix test * Update weed/s3api/s3api_object_retention.go Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * reduce logs * address comments --------- Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-13Add policy engine (#6970)Chris Lu1-2/+2
2025-07-12implement PubObjectRetention and WORM (#6969)Chris Lu1-14/+57
* implement PubObjectRetention and WORM * Update s3_worm_integration_test.go * avoid previous buckets * Update s3-versioning-tests.yml * address comments * address comments * rename to ExtObjectLockModeKey * only checkObjectLockPermissions if versioningEnabled * address comments * comments * Revert "comments" This reverts commit 6736434176f86c6e222b867777324b17c2de716f. * Update s3api_object_handlers_skip.go * Update s3api_object_retention_test.go * add version id to ObjectIdentifier * address comments * add comments * Add proper error logging for timestamp parsing failures * address comments * add version id to the error * Update weed/s3api/s3api_object_retention_test.go Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update weed/s3api/s3api_object_retention.go Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * constants * fix comments * address comments * address comment * refactor out handleObjectLockAvailabilityCheck * errors.Is ErrBucketNotFound * better error checking * address comments --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-09S3: add object versioning (#6945)Chris Lu1-24/+67
* add object versioning * add missing file * Update weed/s3api/s3api_object_versioning.go Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update weed/s3api/s3api_object_versioning.go Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update weed/s3api/s3api_object_versioning.go Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * ListObjectVersionsResult is better to show multiple version entries * fix test * Update weed/s3api/s3api_object_handlers_put.go Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update weed/s3api/s3api_object_versioning.go Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * multiple improvements * move PutBucketVersioningHandler into weed/s3api/s3api_bucket_handlers.go file * duplicated code for reading bucket config, versioningEnabled, etc. try to use functions * opportunity to cache bucket config * error handling if bucket is not found * in case bucket is not found * fix build * add object versioning tests * remove non-existent tests * add tests * add versioning tests * skip a new test * ensure .versions directory exists before saving info into it * fix creating version entry * logging on creating version directory * Update s3api_object_versioning_test.go * retry and wait for directory creation * revert add more logging * Update s3api_object_versioning.go * more debug messages * clean up logs, and touch directory correctly * log the .versions creation and then parent directory listing * use mkFile instead of touch touch is for update * clean up data * add versioning test in go * change location * if modified, latest version is moved to .versions directory, and create a new latest version Core versioning functionality: WORKING TestVersioningBasicWorkflow - PASS TestVersioningDeleteMarkers - PASS TestVersioningMultipleVersionsSameObject - PASS TestVersioningDeleteAndRecreate - PASS TestVersioningListWithPagination - PASS ❌ Some advanced features still failing: ETag calculation issues (using mtime instead of proper MD5) Specific version retrieval (EOF error) Version deletion (internal errors) Concurrent operations (race conditions) * calculate multi chunk md5 Test Results - All Passing: ✅ TestBucketListReturnDataVersioning - PASS ✅ TestVersioningCreateObjectsInOrder - PASS ✅ TestVersioningBasicWorkflow - PASS ✅ TestVersioningMultipleVersionsSameObject - PASS ✅ TestVersioningDeleteMarkers - PASS * dedupe * fix TestVersioningErrorCases * fix eof error of reading old versions * get specific version also check current version * enable integration tests for versioning * trigger action to work for now * Fix GitHub Actions S3 versioning tests workflow - Fix syntax error (incorrect indentation) - Update directory paths from weed/s3api/versioning_tests/ to test/s3/versioning/ - Add push trigger for add-object-versioning branch to enable CI during development - Update artifact paths to match correct directory structure * Improve CI robustness for S3 versioning tests Makefile improvements: - Increase server startup timeout from 30s to 90s for CI environments - Add progressive timeout reporting (logs at 30s, full logs at 90s) - Better error handling with server logs on failure - Add server PID tracking for debugging - Improved test failure reporting GitHub Actions workflow improvements: - Increase job timeouts to account for CI environment delays - Add system information logging (memory, disk space) - Add detailed failure reporting with server logs - Add process and network diagnostics on failure - Better error messaging and log collection These changes should resolve the 'Server failed to start within 30 seconds' issue that was causing the CI tests to fail. * adjust testing volume size * Update Makefile * Update Makefile * Update Makefile * Update Makefile * Update s3-versioning-tests.yml * Update s3api_object_versioning.go * Update Makefile * do not clean up * log received version id * more logs * printout response * print out list version response * use tmp files when put versioned object * change to versions folder layout * Delete weed-test.log * test with mixed versioned and unversioned objects * remove versionDirCache * remove unused functions * remove unused function * remove fallback checking * minor --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-03-18fix: restore deletion audit of individual objects (#6644)SmoothDenis1-0/+11
2025-02-07fix: record and delete bucket metrics after inactive (#6523)zouyixiong1-1/+4
Co-authored-by: XYZ <XYZ>
2025-01-25Add metrics for uploaded and deleted s3 objects (#6475)Hadi Zamani1-0/+3
2024-12-19fix compilationchrislu1-3/+0
2024-12-19"golang.org/x/exp/slices" => "slices" and go fmtchrislu1-0/+2
2024-12-19Fix for DeleteMultipleObjectsHandler wrongly deleting parent folders (#6380)Warren Hodgkinson1-2/+7
What problem are we solving? Fix: #6379 How are we solving the problem? We check for the AllowEmptyFolders option prior to cascade deleting parent folders in S3 DeleteMultipleObjectsHandler. How is the PR tested? We ran SeaweedFS in a Kubernetes Cluster with a joint Filer and S3 server in one container, with leveldb2 as the filer storage, and AllowEmptyFolders set to true. When using the Distribution Registry as the S3 client, it calls the DeleteMultipleObjectsHandler as part of the artifact upload process (uploads to a temp location, then performs a copy and delete). Without this fix, the deletion cascade deleted parent folder until the entire contents of the bucket were gone. With this fix, the existing content of the bucket remained, and the newly uploaded content was added. Checks [ ] I have added unit tests if possible. [ ] I will add related wiki document changes and link to this PR after merging. Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>
2024-07-10[s3] revert skip deletion error, since the error file was not found is ↵Konstantin Lebedev1-5/+7
already skipped on the side of the grpc function (#5760) * revert skip deletion error, since the error file was not found is already skipped on the side of the grpc function * fix response error * fix test_lifecycle_get * Revert "fix test_lifecycle_get" This reverts commit 8f991bdcf93d9a13c7787988173713ad1a263bae.
2024-07-08simplifychrislu1-5/+2
2024-07-08Fix S3 deletion in deep folders, and names with empty spaceschrislu1-2/+3
fix https://github.com/seaweedfs/seaweedfs/issues/5748
2024-05-07also delete parent folder if emptychrislu1-1/+1
fix https://github.com/seaweedfs/seaweedfs/issues/5567
2024-04-29AllowEmptyFolder checks during object deletionchrislu1-7/+27
2024-04-29split filechrislu1-0/+179