diff options
| author | Chris Lu <chrislusf@users.noreply.github.com> | 2025-12-02 09:24:31 -0800 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2025-12-02 09:24:31 -0800 |
| commit | 733ca8e6df3de57e2dcc0923fb2f166d3222921d (patch) | |
| tree | 89fe56d8a72775ba40afb2c2066464f9995dbf61 /weed/s3api/s3api_object_handlers_copy_unified.go | |
| parent | 099a351f3b084454f0d437b2d967c0f4a3bb9e1f (diff) | |
| download | seaweedfs-733ca8e6df3de57e2dcc0923fb2f166d3222921d.tar.xz seaweedfs-733ca8e6df3de57e2dcc0923fb2f166d3222921d.zip | |
Fix SSE-S3 copy: preserve encryption metadata and set chunk SSE type (#7598)
* Fix SSE-S3 copy: preserve encryption metadata and set chunk SSE type
Fixes GitHub #7562: Copying objects between encrypted buckets was failing.
Root causes:
1. processMetadataBytes was re-adding SSE headers from source entry, undoing
the encryption header filtering. Now uses dstEntry.Extended which is
already filtered.
2. SSE-S3 streaming copy returned nil metadata. Now properly generates and
returns SSE-S3 destination metadata (SeaweedFSSSES3Key, AES256 header)
via ExecuteStreamingCopyWithMetadata.
3. Chunks created during streaming copy didn't have SseType set. Now sets
SseType and per-chunk SseMetadata with chunk-specific IVs for SSE-S3,
enabling proper decryption on GetObject.
* Address review: make SSE-S3 metadata serialization failures fatal errors
- In executeEncryptCopy: return error instead of just logging if
SerializeSSES3Metadata fails
- In createChunkFromData: return error if chunk SSE-S3 metadata
serialization fails
This ensures objects/chunks are never created without proper encryption
metadata, preventing unreadable/corrupted data.
* fmt
* Refactor: reuse function names instead of creating WithMetadata variants
- Change ExecuteStreamingCopy to return (*EncryptionSpec, error) directly
- Remove ExecuteStreamingCopyWithMetadata wrapper
- Change executeStreamingReencryptCopy to return (*EncryptionSpec, error)
- Remove executeStreamingReencryptCopyWithMetadata wrapper
- Update callers to ignore encryption spec with _ where not needed
* Add TODO documenting large file SSE-S3 copy limitation
The streaming copy approach encrypts the entire stream with a single IV
but stores data in chunks with per-chunk IVs. This causes decryption
issues for large files. Small inline files work correctly.
This is a known architectural issue that needs separate work to fix.
* Use chunk-by-chunk encryption for SSE-S3 copy (consistent with SSE-C/SSE-KMS)
Instead of streaming encryption (which had IV mismatch issues for multi-chunk
files), SSE-S3 now uses the same chunk-by-chunk approach as SSE-C and SSE-KMS:
1. Extended copyMultipartCrossEncryption to handle SSE-S3:
- Added SSE-S3 source decryption in copyCrossEncryptionChunk
- Added SSE-S3 destination encryption with per-chunk IVs
- Added object-level metadata generation for SSE-S3 destinations
2. Updated routing in executeEncryptCopy/executeDecryptCopy/executeReencryptCopy
to use copyMultipartCrossEncryption for all SSE-S3 scenarios
3. Removed streaming copy functions (shouldUseStreamingCopy,
executeStreamingReencryptCopy) as they're no longer used
4. Added large file (1MB) integration test to verify chunk-by-chunk copy works
This ensures consistent behavior across all SSE types and fixes data corruption
that occurred with large files in the streaming copy approach.
* fmt
* fmt
* Address review: fail explicitly if SSE-S3 metadata is missing
Instead of silently ignoring missing SSE-S3 metadata (which could create
unreadable objects), now explicitly fail the copy operation with a clear
error message if:
- First chunk is missing
- First chunk doesn't have SSE-S3 type
- First chunk has empty SSE metadata
- Deserialization fails
* Address review: improve comment to reflect full scope of chunk creation
* Address review: fail explicitly if baseIV is empty for SSE-S3 chunk encryption
If DestinationIV is not set when encrypting SSE-S3 chunks, the chunk would
be created without SseMetadata, causing GetObject decryption to fail later.
Now fails explicitly with a clear error message.
Note: calculateIVWithOffset returns ([]byte, int) not ([]byte, error) - the
int is a skip amount for intra-block alignment, not an error code.
* Address review: handle 0-byte files in SSE-S3 copy
For 0-byte files, there are no chunks to get metadata from. Generate an IV
for the object-level metadata to ensure even empty files are properly marked
as SSE-S3 encrypted.
Also validate that we don't have a non-empty file with no chunks (which
would indicate an internal error).
Diffstat (limited to 'weed/s3api/s3api_object_handlers_copy_unified.go')
| -rw-r--r-- | weed/s3api/s3api_object_handlers_copy_unified.go | 108 |
1 files changed, 10 insertions, 98 deletions
diff --git a/weed/s3api/s3api_object_handlers_copy_unified.go b/weed/s3api/s3api_object_handlers_copy_unified.go index 255c3eb2d..f1b4ff280 100644 --- a/weed/s3api/s3api_object_handlers_copy_unified.go +++ b/weed/s3api/s3api_object_handlers_copy_unified.go @@ -1,7 +1,6 @@ package s3api import ( - "context" "errors" "fmt" "net/http" @@ -133,9 +132,9 @@ func (s3a *S3ApiServer) executeEncryptCopy(entry *filer_pb.Entry, r *http.Reques } if state.DstSSES3 { - // Use streaming copy for SSE-S3 encryption - chunks, err := s3a.executeStreamingReencryptCopy(entry, r, state, dstPath) - return chunks, nil, err + // Use chunk-by-chunk copy for SSE-S3 encryption (consistent with SSE-C and SSE-KMS) + glog.V(2).Infof("Plain→SSE-S3 copy: using unified multipart encrypt copy") + return s3a.copyMultipartCrossEncryption(entry, r, state, dstBucket, dstPath) } return nil, nil, fmt.Errorf("unknown target encryption type") @@ -143,30 +142,18 @@ func (s3a *S3ApiServer) executeEncryptCopy(entry *filer_pb.Entry, r *http.Reques // executeDecryptCopy handles encrypted → plain copies func (s3a *S3ApiServer) executeDecryptCopy(entry *filer_pb.Entry, r *http.Request, state *EncryptionState, dstPath string) ([]*filer_pb.FileChunk, map[string][]byte, error) { - // Use unified multipart-aware decrypt copy for all encryption types - if state.SrcSSEC || state.SrcSSEKMS { + // Use unified multipart-aware decrypt copy for all encryption types (consistent chunk-by-chunk) + if state.SrcSSEC || state.SrcSSEKMS || state.SrcSSES3 { glog.V(2).Infof("Encrypted→Plain copy: using unified multipart decrypt copy") return s3a.copyMultipartCrossEncryption(entry, r, state, "", dstPath) } - if state.SrcSSES3 { - // Use streaming copy for SSE-S3 decryption - chunks, err := s3a.executeStreamingReencryptCopy(entry, r, state, dstPath) - return chunks, nil, err - } - return nil, nil, fmt.Errorf("unknown source encryption type") } // executeReencryptCopy handles encrypted → encrypted copies with different keys/methods func (s3a *S3ApiServer) executeReencryptCopy(entry *filer_pb.Entry, r *http.Request, state *EncryptionState, dstBucket, dstPath string) ([]*filer_pb.FileChunk, map[string][]byte, error) { - // Check if we should use streaming copy for better performance - if s3a.shouldUseStreamingCopy(entry, state) { - chunks, err := s3a.executeStreamingReencryptCopy(entry, r, state, dstPath) - return chunks, nil, err - } - - // Fallback to chunk-by-chunk approach for compatibility + // Use chunk-by-chunk approach for all cross-encryption scenarios (consistent behavior) if state.SrcSSEC && state.DstSSEC { return s3a.copyChunksWithSSEC(entry, r) } @@ -177,83 +164,8 @@ func (s3a *S3ApiServer) executeReencryptCopy(entry *filer_pb.Entry, r *http.Requ return chunks, dstMetadata, err } - if state.SrcSSEC && state.DstSSEKMS { - // SSE-C → SSE-KMS: use unified multipart-aware cross-encryption copy - glog.V(2).Infof("SSE-C→SSE-KMS cross-encryption copy: using unified multipart copy") - return s3a.copyMultipartCrossEncryption(entry, r, state, dstBucket, dstPath) - } - - if state.SrcSSEKMS && state.DstSSEC { - // SSE-KMS → SSE-C: use unified multipart-aware cross-encryption copy - glog.V(2).Infof("SSE-KMS→SSE-C cross-encryption copy: using unified multipart copy") - return s3a.copyMultipartCrossEncryption(entry, r, state, dstBucket, dstPath) - } - - // Handle SSE-S3 cross-encryption scenarios - if state.SrcSSES3 || state.DstSSES3 { - // Any scenario involving SSE-S3 uses streaming copy - chunks, err := s3a.executeStreamingReencryptCopy(entry, r, state, dstPath) - return chunks, nil, err - } - - return nil, nil, fmt.Errorf("unsupported cross-encryption scenario") -} - -// shouldUseStreamingCopy determines if streaming copy should be used -func (s3a *S3ApiServer) shouldUseStreamingCopy(entry *filer_pb.Entry, state *EncryptionState) bool { - // Use streaming copy for large files or when beneficial - fileSize := entry.Attributes.FileSize - - // Use streaming for files larger than 10MB - if fileSize > 10*1024*1024 { - return true - } - - // Check if this is a multipart encrypted object - isMultipartEncrypted := false - if state.IsSourceEncrypted() { - encryptedChunks := 0 - for _, chunk := range entry.GetChunks() { - if chunk.GetSseType() != filer_pb.SSEType_NONE { - encryptedChunks++ - } - } - isMultipartEncrypted = encryptedChunks > 1 - } - - // For multipart encrypted objects, avoid streaming copy to use per-chunk metadata approach - if isMultipartEncrypted { - glog.V(3).Infof("Multipart encrypted object detected, using chunk-by-chunk approach") - return false - } - - // Use streaming for cross-encryption scenarios (for single-part objects only) - if state.IsSourceEncrypted() && state.IsTargetEncrypted() { - srcType := s3a.getEncryptionTypeString(state.SrcSSEC, state.SrcSSEKMS, state.SrcSSES3) - dstType := s3a.getEncryptionTypeString(state.DstSSEC, state.DstSSEKMS, state.DstSSES3) - if srcType != dstType { - return true - } - } - - // Use streaming for compressed files - if isCompressedEntry(entry) { - return true - } - - // Use streaming for SSE-S3 scenarios (always) - if state.SrcSSES3 || state.DstSSES3 { - return true - } - - return false -} - -// executeStreamingReencryptCopy performs streaming re-encryption copy -func (s3a *S3ApiServer) executeStreamingReencryptCopy(entry *filer_pb.Entry, r *http.Request, state *EncryptionState, dstPath string) ([]*filer_pb.FileChunk, error) { - // Create streaming copy manager - streamingManager := NewStreamingCopyManager(s3a) - - // Execute streaming copy - return streamingManager.ExecuteStreamingCopy(context.Background(), entry, r, dstPath, state) + // All other cross-encryption scenarios use unified multipart copy + // This includes: SSE-C↔SSE-KMS, SSE-C↔SSE-S3, SSE-KMS↔SSE-S3, SSE-S3↔SSE-S3 + glog.V(2).Infof("Cross-encryption copy: using unified multipart copy") + return s3a.copyMultipartCrossEncryption(entry, r, state, dstBucket, dstPath) } |
