author: Chris Lu <chrislusf@users.noreply.github.com> 2025-12-09 20:15:35 -0800
committer: GitHub <noreply@github.com> 2025-12-09 20:15:35 -0800
commit: d970c15d71be190e994f22bda5da461c5327131d
tree: 9b814ca20674a6ad39e91e21046f99332defce8e /weed/util
parent: 1b13324fb79a8dcbc00044b90c357c41c2cc70c7
fix: prevent filer.backup stall in single-filer setups (#7695)
* fix: prevent filer.backup stall in single-filer setups (#4977)

  When MetaAggregator.MetaLogBuffer is empty (which happens in single-filer
  setups with no peers), ReadFromBuffer was returning nil error, causing
  LoopProcessLogData to enter an infinite wait loop on ListenersCond.

  This fix returns ResumeFromDiskError instead, allowing SubscribeMetadata to
  loop back and read from persisted logs on disk. This ensures filer.backup
  continues processing events even when the in-memory aggregator buffer is
  empty.

  Fixes #4977

* test: add integration tests for metadata subscription

  Add integration tests for metadata subscription functionality:
  - TestMetadataSubscribeBasic: Tests basic subscription and event receiving
  - TestMetadataSubscribeSingleFilerNoStall: Regression test for #4977,
    verifies subscription doesn't stall under high load in single-filer setups
  - TestMetadataSubscribeResumeFromDisk: Tests resuming subscription from disk

  Related to #4977

* ci: add GitHub Actions workflow for metadata subscribe tests

  Add CI workflow that runs on:
  - Push/PR to master affecting filer, log_buffer, or metadata subscribe code
  - Runs the integration tests for metadata subscription
  - Uploads logs on failure for debugging

  Related to #4977

* fix: use multipart form-data for file uploads in integration tests

  The filer expects multipart/form-data for file uploads, not raw POST body.
  This fixes the 'Content-Type isn't multipart/form-data' error.

* test: use -peers=none for faster master startup

* test: add -peers=none to remaining master startup in ec tests

* fix: use filer HTTP port 8888, WithFilerClient adds 10000 for gRPC

  WithFilerClient calls ToGrpcAddress() which adds 10000 to the port.
  Passing 18888 resulted in connecting to 28888. Use 8888 instead.

* test: add concurrent writes and million updates tests

  - TestMetadataSubscribeConcurrentWrites: 50 goroutines writing 20 files each
  - TestMetadataSubscribeMillionUpdates: 1 million metadata entries via gRPC
    (metadata only, no actual file content for speed)

* fix: address PR review comments

  - Handle os.MkdirAll errors explicitly instead of ignoring
  - Handle log file creation errors with proper error messages
  - Replace silent event dropping with 100ms timeout and warning log

* Update metadata_subscribe_integration_test.go
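
The fallback described in the first item of the commit message can be illustrated with a small, self-contained sketch. It is not the SeaweedFS implementation: `errResumeFromDisk`, `memBuffer`, and `subscribe` are hypothetical stand-ins, and the real ReadFromBuffer and SubscribeMetadata have different signatures. The sketch only shows the general pattern of returning a sentinel error from an empty in-memory buffer so that the caller switches to persisted data instead of waiting for new in-memory entries.

```go
package main

import (
	"errors"
	"fmt"
)

// errResumeFromDisk is a hypothetical stand-in for the ResumeFromDiskError
// sentinel named in the commit message.
var errResumeFromDisk = errors.New("resume from disk")

// memBuffer is a toy in-memory log buffer. An empty entries slice models the
// empty aggregator buffer of a single-filer setup.
type memBuffer struct {
	entries []string
}

// read returns the buffered entries, or errResumeFromDisk when the buffer is
// empty, so the caller knows to fall back to persisted logs.
func (b *memBuffer) read() ([]string, error) {
	if len(b.entries) == 0 {
		return nil, errResumeFromDisk
	}
	return b.entries, nil
}

// subscribe tries the in-memory buffer first and falls back to the disk
// entries when the sentinel error is returned. It reports which source
// was used alongside the events.
func subscribe(b *memBuffer, disk []string) ([]string, string) {
	entries, err := b.read()
	if errors.Is(err, errResumeFromDisk) {
		return disk, "disk"
	}
	return entries, "memory"
}

func main() {
	disk := []string{"create /a", "update /a"}

	// Empty buffer: the sentinel error routes the read to disk.
	events, source := subscribe(&memBuffer{}, disk)
	fmt.Println(source, len(events))

	// Non-empty buffer: entries are served from memory.
	events, source = subscribe(&memBuffer{entries: []string{"create /b"}}, disk)
	fmt.Println(source, len(events))
}
```

If `read` returned a nil error for the empty buffer, a caller in this toy model could not tell "nothing new yet" apart from "nothing will ever arrive here", which is the ambiguity the sentinel removes.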
Diffstat (limited to 'weed/util')
-rw-r--r-- weed/util/log_buffer/log_buffer.go 5
1 files changed, 4 insertions, 1 deletions
diff --git a/weed/util/log_buffer/log_buffer.go b/weed/util/log_buffer/log_buffer.go
index 22e69cc60..853cbe475 100644
--- a/weed/util/log_buffer/log_buffer.go
+++ b/weed/util/log_buffer/log_buffer.go
@@ -723,7 +723,10 @@ func (logBuffer *LogBuffer) ReadFromBuffer(lastReadPosition MessagePosition) (bu
 		}
 	}
 	if tsMemory.IsZero() { // case 2.2
-		return nil, -2, nil
+		// Buffer is empty - return ResumeFromDiskError so caller can read from disk
+		// This fixes issue #4977 where SubscribeMetadata stalls because
+		// MetaAggregator.MetaLogBuffer is empty in single-filer setups
+		return nil, -2, ResumeFromDiskError
 	} else if lastReadPosition.Time.Before(tsMemory) { // case 2.3
 		// For time-based reads, only check timestamp for disk reads
 		// Don't use offset comparisons as they're not meaningful for time-based subscriptions