seaweedfs/.git

Age	Commit message	Author	Files	Lines
10 days	ec: add -diskType flag to EC commands for SSD support (#7607)	Chris Lu	1	-4/+28
	* ec: add diskType parameter to core EC functions
	  Add diskType parameter to:
	  - ecBalancer struct
	  - collectEcVolumeServersByDc()
	  - collectEcNodesForDC()
	  - collectEcNodes()
	  - EcBalance()
	  This allows EC operations to target specific disk types (hdd, ssd, etc.) instead of being hardcoded to HardDriveType only.
	  For backward compatibility, all callers currently pass types.HardDriveType as the default value. Subsequent commits will add -diskType flags to the individual EC commands.
	* ec: update helper functions to use configurable diskType
	  Update the following functions to accept/use diskType parameter:
	  - findEcVolumeShards()
	  - addEcVolumeShards()
	  - deleteEcVolumeShards()
	  - moveMountedShardToEcNode()
	  - countShardsByRack()
	  - pickNEcShardsToMoveFrom()
	  All ecBalancer methods now use ecb.diskType instead of hardcoded types.HardDriveType. Non-ecBalancer callers (like volumeServer.evacuate and ec.rebuild) use types.HardDriveType as the default.
	  Update all test files to pass diskType where needed.
	* ec: add -diskType flag to ec.balance and ec.encode commands
	  Add -diskType flag to specify the target disk type for EC operations:
	  - ec.balance -diskType=ssd
	  - ec.encode -diskType=ssd
	  The disk type can be 'hdd', 'ssd', or empty for default (hdd). This allows placing EC shards on SSD or other disk types instead of only HDD.
	  Example usage:
	    ec.balance -collection=mybucket -diskType=ssd -apply
	    ec.encode -collection=mybucket -diskType=ssd -force
	* test: add integration tests for EC disk type support
	  Add integration tests to verify the -diskType flag works correctly:
	  - TestECDiskTypeSupport: Tests EC encode and balance with SSD disk type
	  - TestECDiskTypeMixedCluster: Tests EC operations on a mixed HDD/SSD cluster
	  The tests verify:
	  - Volume servers can be configured with specific disk types
	  - ec.encode accepts -diskType flag and encodes to the correct disk type
	  - ec.balance accepts -diskType flag and balances on the correct disk type
	  - Mixed disk type clusters work correctly with separate collections
	* ec: add -sourceDiskType to ec.encode and -diskType to ec.decode
	  ec.encode:
	  - Add -sourceDiskType flag to filter source volumes by disk type
	  - This enables tier migration scenarios (e.g., SSD volumes → HDD EC shards)
	  - -diskType specifies target disk type for EC shards
	  ec.decode:
	  - Add -diskType flag to specify source disk type where EC shards are stored
	  - Update collectEcShardIds() and collectEcNodeShardBits() to accept diskType
	  Examples:
	    # Encode SSD volumes to HDD EC shards (tier migration)
	    ec.encode -collection=mybucket -sourceDiskType=ssd -diskType=hdd
	    # Decode EC shards from SSD
	    ec.decode -collection=mybucket -diskType=ssd
	  Integration tests updated to cover new flags.
	* ec: fix variable shadowing and add -diskType to ec.rebuild and volumeServer.evacuate
	  Address code review comments:
	  1. Fix variable shadowing in collectEcVolumeServersByDc():
	     - Rename loop variable 'diskType' to 'diskTypeKey' and 'diskTypeStr' to avoid shadowing the function parameter
	  2. Fix hardcoded HardDriveType in ecBalancer methods:
	     - balanceEcRack(): use ecb.diskType instead of types.HardDriveType
	     - collectVolumeIdToEcNodes(): use ecb.diskType
	  3. Add -diskType flag to ec.rebuild command:
	     - Add diskType field to ecRebuilder struct
	     - Pass diskType to collectEcNodes() and addEcVolumeShards()
	  4. Add -diskType flag to volumeServer.evacuate command:
	     - Add diskType field to commandVolumeServerEvacuate struct
	     - Pass diskType to collectEcVolumeServersByDc() and moveMountedShardToEcNode()
	* test: add diskType field to ecBalancer in TestPickEcNodeToBalanceShardsInto
	  Address nitpick comment: ensure test ecBalancer struct has diskType field set for consistency with other tests.
	* ec: filter disk selection by disk type in pickBestDiskOnNode
	  When evacuating or rebalancing EC shards, pickBestDiskOnNode now filters disks by the target disk type. This ensures:
	  1. EC shards from SSD disks are moved to SSD disks on destination nodes
	  2. EC shards from HDD disks are moved to HDD disks on destination nodes
	  3. No cross-disk-type shard movement occurs
	  This maintains the storage tier isolation when moving EC shards between nodes during evacuation or rebalancing operations.
	* ec: allow disk type fallback during evacuation
	  Update pickBestDiskOnNode to accept a strictDiskType parameter:
	  - strictDiskType=true (balancing): Only use disks of matching type. This maintains storage tier isolation during normal rebalancing.
	  - strictDiskType=false (evacuation): Prefer same disk type, but fall back to other disk types if no matching disk is available. This ensures evacuation can complete even when same-type capacity is insufficient.
	  Priority order for evacuation:
	  1. Same disk type with lowest shard count (preferred)
	  2. Different disk type with lowest shard count (fallback)
	* test: use defer for lock/unlock to prevent lock leaks
	  Use defer to ensure locks are always released, even on early returns or test failures. This prevents lock leaks that could cause subsequent tests to hang or fail.
	  Changes:
	  - Return early if lock acquisition fails
	  - Immediately defer unlock after successful lock
	  - Remove redundant explicit unlock calls at end of tests
	  - Fix unused variable warning (err -> encodeErr/locErr)
	* ec: dynamically discover disk types from topology for evacuation
	  Disk types are free-form tags (e.g., 'ssd', 'nvme', 'archive') that come from the topology, not a hardcoded set. Only 'hdd' (or empty) is the default disk type.
	  Use collectVolumeDiskTypes() to discover all disk types present in the cluster topology instead of hardcoding [HardDriveType, SsdType].
	* test: add evacuation fallback and cross-rack EC placement tests
	  Add two new integration tests:
	  1. TestEvacuationFallbackBehavior:
	     - Tests that when same disk type has no capacity, shards fall back to other disk types during evacuation
	     - Creates cluster with 1 SSD + 2 HDD servers (limited SSD capacity)
	     - Verifies pickBestDiskOnNode behavior with strictDiskType=false
	  2. TestCrossRackECPlacement:
	     - Tests EC shard distribution across different racks
	     - Creates cluster with 4 servers in 4 different racks
	     - Verifies shards are spread across multiple racks
	     - Tests that ec.balance respects rack placement
	  Helper functions added:
	  - startLimitedSsdCluster: 1 SSD + 2 HDD servers
	  - startMultiRackCluster: 4 servers in 4 racks
	  - countShardsPerRack: counts EC shards per rack from disk
	* test: fix collection mismatch in TestCrossRackECPlacement
	  The EC commands were using collection 'rack_test' but uploaded test data uses collection 'test' (default). This caused ec.encode/ec.balance to not find the uploaded volume.
	  Fix: Change EC commands to use '-collection test' to match the uploaded data.
	  Addresses review comment from PR #7607.
	* test: close log files in MultiDiskCluster.Stop() to prevent FD leaks
	  Track log files in MultiDiskCluster.logFiles and close them in Stop() to prevent file descriptor accumulation in long-running or many-test scenarios.
	  Addresses review comment about logging resources cleanup.
	* test: improve EC integration tests with proper assertions
	  - Add assertNoFlagError helper to detect flag parsing regressions
	  - Update diskType subtests to fail on flag errors (ec.encode, ec.balance, ec.decode)
	  - Update verify_disktype_flag_parsing to check help output contains diskType
	  - Remove verify_fallback_disk_selection (was documentation-only, not executable)
	  - Add assertion to verify_cross_rack_distribution for minimum 2 racks
	  - Consolidate uploadTestDataWithDiskType to accept collection parameter
	  - Remove duplicate uploadTestDataWithDiskTypeMixed function
	* test: extract captureCommandOutput helper and fix error handling
	  - Add captureCommandOutput helper to reduce code duplication in diskType tests
	  - Create commandRunner interface to match shell command Do method
	  - Update ec_encode_with_ssd_disktype, ec_balance_with_ssd_disktype, ec_encode_with_source_disktype, ec_decode_with_disktype to use helper
	  - Fix filepath.Glob error handling in countShardsPerRack instead of ignoring it
	* test: add flag validation to ec_balance_targets_correct_disk_type
	  Add assertNoFlagError calls after ec.balance commands to ensure -diskType flag is properly recognized for both SSD and HDD disk types.
	* test: add proper assertions for EC command results
	  - ec_encode_with_ssd_disktype: check for expected volume-related errors
	  - ec_balance_with_ssd_disktype: require success with require.NoError
	  - ec_encode_with_source_disktype: check for expected no-volume errors
	  - ec_decode_with_disktype: check for expected no-ec-volume errors
	  - upload_to_ssd_and_hdd: use require.NoError for setup validation
	  Tests now properly fail on unexpected errors rather than just logging.
	* test: fix missing unlock in ec_encode_with_disk_awareness
	  Add defer unlock pattern to ensure lock is always released, matching the pattern used in other subtests.
	* test: improve helper robustness
	  - Make assertNoFlagError case-insensitive for pattern matching
	  - Use defer in captureCommandOutput to restore stdout/stderr and close pipe ends to avoid FD leaks even if cmd.Do panics
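Note: the disk selection rule described for pickBestDiskOnNode in the commit above (strict matching for balancing, fallback for evacuation) can be illustrated with a minimal Go sketch. This is not the SeaweedFS implementation; the `disk` type and the `pickDisk` function are hypothetical names chosen for illustration, and only the priority order (same type with lowest shard count first, other types as fallback) comes from the commit message.

```go
package main

import "fmt"

// disk is a hypothetical, simplified stand-in for a disk on a volume server.
type disk struct {
	ID         int
	Type       string // free-form tag such as "hdd", "ssd", "nvme"
	ShardCount int
}

// pickDisk returns the least-loaded disk whose type equals want.
// If strict is false and no disk matches, it falls back to the least-loaded
// disk of any other type. The boolean reports whether a disk was found.
func pickDisk(disks []disk, want string, strict bool) (disk, bool) {
	var best, fallback disk
	haveBest, haveFallback := false, false
	for _, d := range disks {
		if d.Type == want {
			if !haveBest || d.ShardCount < best.ShardCount {
				best, haveBest = d, true
			}
		} else if !haveFallback || d.ShardCount < fallback.ShardCount {
			fallback, haveFallback = d, true
		}
	}
	if haveBest {
		return best, true
	}
	if !strict && haveFallback {
		return fallback, true
	}
	return disk{}, false
}

func main() {
	disks := []disk{{1, "hdd", 5}, {2, "hdd", 2}, {3, "nvme", 9}}

	// Evacuation-style call: no "ssd" disk exists, so another type is used.
	d, ok := pickDisk(disks, "ssd", false)
	fmt.Println(d.ID, ok)

	// Balancing-style call: strict matching finds nothing.
	_, ok = pickDisk(disks, "ssd", true)
	fmt.Println(ok)
}
```

With strict matching the caller has to handle the "no disk found" case itself, which is presumably why the commit reserves the fallback for evacuation, where completing the move matters more than tier isolation.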
12 days	Nit: have `ec.encode` exit immediately if no volumes are processed. (#7654)	Lisandro Pin	1	-0/+4
	* Nit: have `ec.encode` exit immediately if no volumes are processed.
	* Update weed/shell/command_ec_encode.go
	  Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
	---------
	Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>
	Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2025-08-23	Shell: support regular expression for collection selection (#7158)	Chris Lu	1	-10/+32
	* support regular expression for collection selection
	* refactor
	* ordering
	* fix exact match
	* Update command_volume_balance_test.go
	* simplify
	* Update command_volume_balance.go
	* comment
2025-08-07	Shell: add verbose ec encoding mode (#7105)	Chris Lu	1	-20/+127
	* add verbose ec encoding mode
	* address comments
2025-08-03	mount ec shards correctly (#7079)	Chris Lu	1	-2/+2

2025-07-16	convert error formatting to %w everywhere (#6995)	Chris Lu	1	-5/+5

2025-07-14	Collecting volume locations for volumes before EC encoding	chrislu	1	-11/+26
	fix https://github.com/seaweedfs/seaweedfs/issues/6963
2025-06-15	Support filtering source disk type in volume.tier.upload (#6868)	NyaMisty	1	-3/+5

2025-05-13	Nit: unify the default `--maxParallelization` value for `weed shell` commands supporting this option (#6788)	Lisandro Pin	1	-1/+1

2025-05-09	Improve safety for weed shell's `ec.encode`. (#6773)	Lisandro Pin	1	-35/+55
	Improve safety for weed shell's `ec.encode`.
	The current process for `ec.encode` is:
	1. EC shards for a volume are generated and added to a single server
	2. The original volume is deleted
	3. EC shards get re-balanced across the entire topology
	It is then possible to lose data between #2 and #3, if the underlying volume storage/server/rack/DC happens to fail, for whatever reason.
	As a fix, this MR reworks `ec.encode` so:
	* Newly created EC shards are spread across all locations for the source volume.
	* Source volumes are deleted only after EC shards are converted and balanced.
2025-05-08	Improve parallelization for `ec.encode` (#6769)	Lisandro Pin	1	-34/+61
	Improve parallelization for `ec.encode`. Instead of processing one volume at a time, perform all EC conversion steps (mark readonly -> generate EC shards -> delete volume -> remount) in parallel for all of them. This should substantially improve performance when EC encoding entire collections.
2025-02-28	`ec.encode`: Fix resolution of target collections. (#6585)	Lisandro Pin	1	-8/+4
	* Don't ignore empty (`""`) collection names when computing collections for a given volume ID.
	* `ec.encode`: Fix resolution of target collections.
	  When no `volumeId` parameter is provided, compute volumes based on the provided collection name, even if it's empty (`""`). This restores behavior to before recent EC rebalancing rework.
	  See also https://github.com/seaweedfs/seaweedfs/blob/ec30a504bae6cad75f859964e14c60d39cc43709/weed/shell/command_ec_encode.go#L99 .
2025-02-10	`ec.encode`: Explicitly mount EC shards after volume conversion. (#6528)	Lisandro Pin	1	-2/+12
	This guarantees EC shards are immediately available after encoding, even if not affected by subsequent re-balancing.
2025-01-17	`ec.encode`: Fix bug causing source volumes not being deleted after EC conversion. (#6447)	Lisandro Pin	1	-1/+18
	This logic was originally part of `spreadEcShards()`, which got removed during the unification effort with `ec.balance` (https://github.com/seaweedfs/seaweedfs/pull/6344), accidentally breaking functionality in the process. The commit restores the deletion code for EC'd volumes - with parallelization support.
2024-12-19	Fix volume replica parallelization within `ec.encode`. (#6377)	Lisandro Pin	1	-5/+3
	See 826edd5d.
2024-12-18	Allow configuring the maximum number of concurrent tasks for EC parallelization. (#6376)	Lisandro Pin	1	-2/+2
	Follow-up to b0210df0.
2024-12-18	Parallelize volume replica operations within `ec.encode`. (#6374)	Lisandro Pin	1	-6/+15

2024-12-12	Begin implementing EC balancing parallelization support. (#6342)	Lisandro Pin	1	-3/+2
	* Begin implementing EC balancing parallelization support. Impacts both `ec.encode` and `ec.balance`.
	* Nit: improve type naming.
	* Make the goroutine workgroup handler for `EcBalance()` a bit smarter/error-proof.
	* Nit: unify naming for `ecBalancer` wait group methods with the rest of the module.
	* Fix concurrency bug.
	* Fix whitespace after Gitlab automerge.
	* Delete stray TODO.
2024-12-12	Limit EC re-balancing for `ec.encode` to relevant collections when a volume ID argument is provided. (#6347)	Lisandro Pin	1	-5/+1
	Limit EC re-balancing for `ec.encode` to relevant collections when a volume ID is provided.
2024-12-12	Delete legacy balancing code for `ec.encode`. (#6344)	Lisandro Pin	1	-133/+0

2024-12-10	Unify the re-balancing logic for `ec.encode` with `ec.balance`. (#6339)	Lisandro Pin	1	-29/+44
	Among others, this enables recent changes related to topology aware re-balancing at EC encoding time.
2024-11-19	Unify usage of shell.EcNode.dc as DataCenterId. (#6258)	Lisandro Pin	1	-2/+2

2024-11-09	delete aborted ec shards from both source and target servers (#6221)	Chris Lu	1	-1/+4
	fix https://github.com/seaweedfs/seaweedfs/issues/6205#issuecomment-2465004586
2024-10-31	fix format (#6185)	wyang	1	-1/+1
	unit test weed/shell fail
2024-10-28	fix format	chrislu	1	-2/+2

2024-10-24	ensure 2 volume space since actual need 1.4x volume size empty space	chrislu	1	-2/+2

2024-10-24	correcting free volume count, factor it during ec encoding to ensure enough disk space available	chrislu	1	-3/+19
	fix https://github.com/seaweedfs/seaweedfs/issues/6163
2024-09-29	refactor	chrislu	1	-1/+1

2024-09-28	add IsResourceHeavy() to command interface	chrislu	1	-0/+4

2024-09-24	fix(volume): don't persist RO state in specific cases (#6058)	Max Denushev	1	-3/+4
	* fix(volume): don't persist RO state in specific cases
	* fix(volume): writable always persist
2024-06-02	Ignore remote volume when selecting volumes in operation (ec.encode/volume.tier.upload) (#5635)	NyaMisty	1	-0/+4

2023-10-22	log full percentage	chrislu	1	-1/+1

2023-10-05	default to skip if less than 4 nodes	chrislu	1	-0/+19

2023-07-06	clone volume locations in case they are changed	chrislu	1	-1/+1
	fix https://github.com/seaweedfs/seaweedfs/issues/4642
2023-06-12	Delete volume is empty (#4561)	Konstantin Lebedev	1	-1/+1
	* use onlyEmpty for deleteVolume https://github.com/seaweedfs/seaweedfs/issues/4559
	* fix IsEmpty
	* fix test
	---------
	Co-authored-by: Konstantin Lebedev <9497591+kmlebedev@users.noreply.github.co>
2023-02-09	fix bug when vid not found	chrislu	1	-1/+1
	fix https://github.com/seaweedfs/seaweedfs/issues/4193
2022-08-22	shell: stop long running jobs if lock is lost	chrislu	1	-0/+4

2022-07-29	move to https://github.com/seaweedfs/seaweedfs	chrislu	1	-7/+7

2022-04-05	erasure coding: tracking encoded/decoded volumes	chrislu	1	-1/+1
	If an EC shard is created but not spread to other servers, the masterclient would think this shard is not located here.
2022-02-08	volume.balance: add delay during tight loop	chrislu	1	-1/+1
	fix https://github.com/chrislusf/seaweedfs/issues/2637
2021-12-26	use streaming mode for long poll grpc calls	chrislu	1	-1/+1
	streaming mode would create separate grpc connections for each call. this is to ensure the long poll connections are properly closed.
2021-12-10	add lock messages	chrislu	1	-1/+1

2021-11-04	randomize a bit for ec shards distribution	Chris Lu	1	-1/+2

2021-11-01	adjust help message since both fullPercent and quietFor are needed.	Chris Lu	1	-1/+1

2021-09-13	shell: do not need to lock to see volume -h	Chris Lu	1	-4/+4

2021-09-13	erasure coding: add cleanup step if anything goes wrong	Chris Lu	1	-0/+14

2021-09-12	change server address from string to a type	Chris Lu	1	-6/+7

2021-08-13	shell: volume.tier.move makes up changes if volume move failed	Chris Lu	1	-23/+1

2021-05-06	optional parallel copy ec shards	Chris Lu	1	-20/+27
	fix https://github.com/chrislusf/seaweedfs/issues/2048
2021-02-28	adjust text	Chris Lu	1	-1/+1