| author | Chris Lu <chrislusf@users.noreply.github.com> | 2025-12-02 12:30:15 -0800 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2025-12-02 12:30:15 -0800 |
| commit | 4f038820dc480a7374b928a17cc9121474ba4172 (patch) | |
| tree | d15e1dade3537fbd462e819403b3347433ff37bd /.github | |
| parent | ebb06a3908990c31dcfb4995a69682d836228179 (diff) | |
Add disk-aware EC rebalancing (#7597)
* Add placement package for EC shard placement logic
- Consolidate EC shard placement algorithm for reuse across shell and worker tasks
- Support multi-pass selection: racks, then servers, then disks
- Include proper spread verification and scoring functions
- Comprehensive test coverage for various cluster topologies
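
The multi-pass selection described above (racks, then servers, then disks) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `candidate` type and the `selectSpread` function are hypothetical names, not the placement package's actual API, and the ordering rule (emptier disks first) is an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// candidate is a hypothetical stand-in for a disk that could receive a shard.
// Server names are assumed to be unique across racks.
type candidate struct {
	Rack, Server string
	DiskID       uint32
	Shards       int // EC shards already on this disk
}

// selectSpread picks up to n disks in three passes: first one disk per unused
// rack, then one disk per unused server, then any remaining disk.
func selectSpread(cands []candidate, n int) []candidate {
	// Work on a copy; prefer emptier disks and break ties deterministically.
	sorted := append([]candidate(nil), cands...)
	sort.SliceStable(sorted, func(i, j int) bool {
		a, b := sorted[i], sorted[j]
		if a.Shards != b.Shards {
			return a.Shards < b.Shards
		}
		if a.Server != b.Server {
			return a.Server < b.Server
		}
		return a.DiskID < b.DiskID
	})

	picked := make([]candidate, 0, n)
	taken := make([]bool, len(sorted))
	racks := map[string]bool{}
	servers := map[string]bool{}

	for pass := 0; pass < 3; pass++ {
		for i, c := range sorted {
			if len(picked) == n {
				return picked
			}
			if taken[i] {
				continue
			}
			if pass == 0 && racks[c.Rack] {
				continue // pass 0: only racks not used yet
			}
			if pass == 1 && servers[c.Server] {
				continue // pass 1: only servers not used yet
			}
			taken[i] = true
			racks[c.Rack] = true
			servers[c.Server] = true
			picked = append(picked, c)
		}
	}
	return picked
}

func main() {
	cands := []candidate{
		{Rack: "r1", Server: "s1", DiskID: 0},
		{Rack: "r1", Server: "s1", DiskID: 1},
		{Rack: "r1", Server: "s2", DiskID: 0},
		{Rack: "r2", Server: "s3", DiskID: 0},
	}
	for _, c := range selectSpread(cands, 3) {
		fmt.Printf("%s/%s disk %d\n", c.Rack, c.Server, c.DiskID)
	}
}
```

With three shards to place, the sketch first covers both racks, then falls back to a server it has not used, and only reuses a server when nothing else is left.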
* Make ec.balance disk-aware for multi-disk servers
- Add EcDisk struct to track individual disks on volume servers
- Update EcNode to maintain per-disk shard distribution
- Parse disk_id from EC shard information during topology collection
- Implement pickBestDiskOnNode() for selecting best disk per shard
- Add diskDistributionScore() for tie-breaking node selection
- Update all move operations to specify target disk in RPC calls
- Improves shard balance within multi-disk servers, not just across servers
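
As a rough illustration of per-disk selection on a multi-disk server, the sketch below picks the disk holding the fewest shards and reports how uneven a node's disks are. The `ecDisk` type, `pickLeastLoadedDisk`, and `shardSpread` are hypothetical names; the actual `pickBestDiskOnNode()` and `diskDistributionScore()` may weigh other factors.

```go
package main

import "fmt"

// ecDisk is a hypothetical per-disk record: a disk ID and its EC shard count.
type ecDisk struct {
	ID     uint32
	Shards int
}

// pickLeastLoadedDisk returns the ID of the disk with the fewest shards,
// preferring the lowest ID on ties. ok is false when the node has no disks.
func pickLeastLoadedDisk(disks []ecDisk) (id uint32, ok bool) {
	if len(disks) == 0 {
		return 0, false
	}
	best := disks[0]
	for _, d := range disks[1:] {
		if d.Shards < best.Shards || (d.Shards == best.Shards && d.ID < best.ID) {
			best = d
		}
	}
	return best.ID, true
}

// shardSpread returns the gap between the fullest and emptiest disk.
// A smaller value means the node's disks are more evenly loaded.
func shardSpread(disks []ecDisk) int {
	if len(disks) == 0 {
		return 0
	}
	lo, hi := disks[0].Shards, disks[0].Shards
	for _, d := range disks[1:] {
		if d.Shards < lo {
			lo = d.Shards
		}
		if d.Shards > hi {
			hi = d.Shards
		}
	}
	return hi - lo
}

func main() {
	disks := []ecDisk{{ID: 0, Shards: 3}, {ID: 1, Shards: 1}, {ID: 2, Shards: 1}, {ID: 3, Shards: 4}}
	id, ok := pickLeastLoadedDisk(disks)
	fmt.Println("target disk:", id, "found:", ok, "spread:", shardSpread(disks))
}
```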
* Use placement package in EC detection for consistent disk-level placement
- Replace custom EC disk selection logic with shared placement package
- Convert topology DiskInfo to placement.DiskCandidate format
- Use SelectDestinations() for multi-rack/server/disk spreading
- Convert placement results back to topology DiskInfo for task creation
- Ensures EC detection uses same placement logic as shell commands
* Make volume server evacuation disk-aware
- Use pickBestDiskOnNode() when selecting evacuation target disk
- Specify target disk in evacuation RPC requests
- Maintains balanced disk distribution during server evacuations
* Rename PlacementConfig to PlacementRequest for clarity
PlacementRequest better reflects that this is a request for placement
rather than a configuration object. This improves API semantics.
* Rename DefaultConfig to DefaultPlacementRequest
Aligns with the PlacementRequest type naming for consistency
* Address review comments from Gemini and CodeRabbit
Fix HIGH issues:
- Fix empty disk discovery: Now discovers all disks from VolumeInfos,
not just from EC shards. This ensures disks without EC shards are
still considered for placement.
- Fix EC shard count calculation in detection.go: Now correctly filters
by DiskId and sums actual shard counts using ShardBits.ShardIdCount()
instead of just counting EcShardInfo entries.
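
The difference between counting entries and counting shards can be shown with a small sketch. It assumes each entry carries a disk ID and a bitmask with one bit per shard; the `shardInfo` type and `countShardsOnDisk` are hypothetical, and `bits.OnesCount32` stands in for `ShardBits.ShardIdCount()`.

```go
package main

import (
	"fmt"
	"math/bits"
)

// shardInfo is a hypothetical stand-in for one EC shard entry:
// the disk that holds it and a bitmask with one bit set per shard ID.
type shardInfo struct {
	DiskID uint32
	Bits   uint32
}

// countShardsOnDisk sums the set bits of every entry on the given disk,
// rather than counting the entries themselves.
func countShardsOnDisk(infos []shardInfo, diskID uint32) int {
	total := 0
	for _, info := range infos {
		if info.DiskID != diskID {
			continue // entry belongs to another disk
		}
		total += bits.OnesCount32(info.Bits)
	}
	return total
}

func main() {
	infos := []shardInfo{
		{DiskID: 1, Bits: 0b0111}, // shards 0, 1, 2
		{DiskID: 1, Bits: 0b1000}, // shard 3
		{DiskID: 2, Bits: 0b0011}, // shards 0, 1 on another disk
	}
	fmt.Println("shards on disk 1:", countShardsOnDisk(infos, 1))
}
```

Here disk 1 has two entries but four shards, which is the kind of undercount the fix addresses.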
Fix MEDIUM issues:
- Add disk ID to evacuation log messages for consistency with other logging
- Remove unused serverToDisks variable in placement.go
- Fix comment that incorrectly said 'ascending' when sorting is 'descending'
* add ec tests
* Update ec-integration-tests.yml
* Update ec_integration_test.go
* Fix EC integration tests CI: build weed binary and update actions
- Add 'Build weed binary' step before running tests
- Update actions/setup-go from v4 to v6 (Node20 compatibility)
- Update actions/checkout from v2 to v4 (Node20 compatibility)
- Move working-directory to test step only
* Add disk-aware EC rebalancing integration tests
- Add TestDiskAwareECRebalancing test with multi-disk cluster setup
- Test EC encode with disk awareness (shows disk ID in output)
- Test EC balance with disk-level shard distribution
- Add helper functions for disk-level verification:
  - startMultiDiskCluster: 3 servers x 4 disks each
  - countShardsPerDisk: track shards per disk per server
  - calculateDiskShardVariance: measure distribution balance
- Verify no single disk is overloaded with shards
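
One way to measure distribution balance across disks is the population variance of the per-disk shard counts. The sketch below assumes that definition; `diskShardVariance` is a hypothetical name, and the test helper `calculateDiskShardVariance` may compute its metric differently.

```go
package main

import "fmt"

// diskShardVariance returns the population variance of per-disk shard counts.
// Zero means every disk holds the same number of shards.
func diskShardVariance(counts []int) float64 {
	if len(counts) == 0 {
		return 0
	}
	sum := 0
	for _, c := range counts {
		sum += c
	}
	mean := float64(sum) / float64(len(counts))

	sq := 0.0
	for _, c := range counts {
		d := float64(c) - mean
		sq += d * d
	}
	return sq / float64(len(counts))
}

func main() {
	balanced := []int{2, 2, 2, 2}
	skewed := []int{8, 0, 0, 0}
	fmt.Println("balanced:", diskShardVariance(balanced))
	fmt.Println("skewed:  ", diskShardVariance(skewed))
}
```

A test could then assert that the variance after `ec.balance` does not exceed the variance before it.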
Diffstat (limited to '.github')
| -rw-r--r-- | .github/workflows/ec-integration-tests.yml | 41 |
1 files changed, 41 insertions, 0 deletions
diff --git a/.github/workflows/ec-integration-tests.yml b/.github/workflows/ec-integration-tests.yml
new file mode 100644
index 000000000..ea476b77c
--- /dev/null
+++ b/.github/workflows/ec-integration-tests.yml
@@ -0,0 +1,41 @@
+name: "EC Integration Tests"
+
+on:
+  push:
+    branches: [ master ]
+  pull_request:
+    branches: [ master ]
+
+permissions:
+  contents: read
+
+jobs:
+  ec-integration-tests:
+    name: EC Integration Tests
+    runs-on: ubuntu-22.04
+    timeout-minutes: 30
+    steps:
+    - name: Set up Go 1.x
+      uses: actions/setup-go@v6
+      with:
+        go-version: ^1.24
+      id: go
+
+    - name: Check out code into the Go module directory
+      uses: actions/checkout@v4
+
+    - name: Build weed binary
+      run: |
+        cd weed && go build -o weed .
+
+    - name: Run EC Integration Tests
+      working-directory: test/erasure_coding
+      run: |
+        go test -v
+
+    - name: Archive logs
+      if: failure()
+      uses: actions/upload-artifact@v4
+      with:
+        name: ec-integration-test-logs
+        path: test/erasure_coding
\ No newline at end of file
