author    Chris Lu <chris.lu@gmail.com>  2014-03-02 22:16:54 -0800
committer Chris Lu <chris.lu@gmail.com>  2014-03-02 22:16:54 -0800
commit    27c74a7e66558a4f9ce0d10621606dfed98a3abb (patch)
tree      f16eef19480fd51ccbef54c05d39c2eacf309e56 /note
parent    edae676913363bdd1e5a50bf0778fdcc3c6d6051 (diff)
Major:
change replication_type to ReplicaPlacement, hopefully cleaner code
works for 9 possible ReplicaPlacement xyz
  x: number of copies on other data centers
  y: number of copies on other racks
  z: number of copies on current rack
  x, y, z each can be 0, 1, 2
Minor:
weed server "-mdir" defaults to "-dir" if empty
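For illustration, the xyz encoding above maps naturally onto a small struct parsed from the three digits. The following Go sketch is illustrative only; the type and function names are assumptions, not necessarily the code this commit introduces.

package storage

import "fmt"

// ReplicaPlacement mirrors the xyz encoding from the commit message.
// Names are illustrative; the actual type in the commit may differ.
type ReplicaPlacement struct {
	DiffDataCenterCount int // x: copies on other data centers
	DiffRackCount       int // y: copies on other racks
	SameRackCount       int // z: copies on the current rack
}

// NewReplicaPlacementFromString parses a placement string such as "001" or "210".
func NewReplicaPlacementFromString(t string) (*ReplicaPlacement, error) {
	if len(t) != 3 {
		return nil, fmt.Errorf("replica placement %q should have exactly 3 digits", t)
	}
	rp := &ReplicaPlacement{}
	counts := []*int{&rp.DiffDataCenterCount, &rp.DiffRackCount, &rp.SameRackCount}
	for i, c := range t {
		if c < '0' || c > '2' {
			return nil, fmt.Errorf("digit %d of %q must be 0, 1, or 2", i, t)
		}
		*counts[i] = int(c - '0')
	}
	return rp, nil
}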
Diffstat (limited to 'note')
-rw-r--r--  note/replication.txt  |  37 ++++++++++++++++++++++++++++++-------
1 file changed, 30 insertions(+), 7 deletions(-)
diff --git a/note/replication.txt b/note/replication.txt
index c4bf46044..a151e80c3 100644
--- a/note/replication.txt
+++ b/note/replication.txt
@@ -59,11 +59,6 @@ If any "assign" request comes in
3. return a writable volume to the user
-Plan:
- Step 1. implement one copy(no replication), automatically assign volume ids
- Step 2. add replication
-
-For the above operations, here are the todo list:
for data node:
0. detect existing volumes DONE
1. onStartUp, and periodically, send existing volumes and maxVolumeCount store.Join(), DONE
@@ -77,10 +72,38 @@ For the above operations, here are the todo list:
1. accept data node's report of existing volumes and maxVolumeCount ALREADY EXISTS /dir/join
2. periodically refresh for active data nodes, and adjust writable volumes
3. send command to grow a volume(id + replication level) DONE
- 4. NOT_IMPLEMENTING: if dead/stale data nodes are found, for the affected volumes, send stale info
- to other data nodes. BECAUSE the master will stop sending writes to these data nodes
5. accept lookup for volume locations ALREADY EXISTS /dir/lookup
6. read topology/datacenter/rack layout
+An algorithm to allocate volumes evenly; it may be inefficient when free volume slots are plentiful, since it enumerates every candidate:
+input: replication=xyz
+algorithm:
+ret_dcs = []
+foreach dc with y+z+1 free volume slots {
+ ret_racks = []
+ foreach rack with z+1 free volume slots {
+  ret = select z+1 servers, each with 1 free volume slot
+  if ret.size() == z+1 {
+   ret_racks.append(ret)
+  }
+ }
+ ret = randomly pick one rack's selection from ret_racks
+ ret += select y other racks in this dc, each with 1 free volume slot
+ if ret.size() == y+z+1 {
+  ret_dcs.append(ret)
+ }
+}
+ret = randomly pick one dc's selection from ret_dcs
+ret += select x other data centers, each with 1 free volume slot
+
+A simpler replica placement algorithm; it may fail when free volume slots are scarce:
+ret = []
+dc = randomly pick 1 data center with y+z+1 free volume slots
+ rack = randomly pick 1 rack in dc with z+1 free volume slots
+  ret += randomly pick z+1 servers in rack, each with 1 free volume slot
+ ret += randomly pick y other racks in dc, each with 1 free volume slot
+ret += randomly pick x other data centers, each with 1 free volume slot
+
+
TODO:
1. replicate content to the other server if the replication type needs replicas
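
As a concrete sketch of the simpler placement strategy in the diff above: the topology types (Server, Rack, DataCenter), field names, and helpers below are assumptions invented for illustration; the repository's actual topology code is structured differently. Like the pseudocode, it simply errors out when free volume slots are scarce.

package placement

import (
	"errors"
	"math/rand"
)

// Illustrative topology types; the real topology package differs.
type Server struct{ FreeSlots int }
type Rack struct{ Servers []*Server }
type DataCenter struct{ Racks []*Rack }

// rackFree and dcFree count free volume slots below a node.
func rackFree(r *Rack) (n int) {
	for _, s := range r.Servers {
		n += s.FreeSlots
	}
	return
}

func dcFree(d *DataCenter) (n int) {
	for _, r := range d.Racks {
		n += rackFree(r)
	}
	return
}

// pickServers returns up to n servers that each have a free slot.
// For brevity servers are taken in order; a real implementation
// would randomize here too.
func pickServers(servers []*Server, n int) (out []*Server) {
	for _, s := range servers {
		if len(out) == n {
			break
		}
		if s.FreeSlots > 0 {
			out = append(out, s)
		}
	}
	return
}

// PlaceReplicas follows the simple strategy: one data center with
// y+z+1 free slots, one rack inside it with z+1 free slots, z+1
// servers on that rack, then y other racks and x other data centers
// contributing one replica each.
func PlaceReplicas(dcs []*DataCenter, x, y, z int) ([]*Server, error) {
	// Shuffled in place for brevity, so each linear scan below
	// amounts to a random pick.
	rand.Shuffle(len(dcs), func(i, j int) { dcs[i], dcs[j] = dcs[j], dcs[i] })
	var mainDC *DataCenter
	for _, d := range dcs {
		if dcFree(d) >= y+z+1 {
			mainDC = d
			break
		}
	}
	if mainDC == nil {
		return nil, errors.New("no data center has y+z+1 free slots")
	}

	racks := append([]*Rack(nil), mainDC.Racks...)
	rand.Shuffle(len(racks), func(i, j int) { racks[i], racks[j] = racks[j], racks[i] })
	var mainRack *Rack
	for _, r := range racks {
		if rackFree(r) >= z+1 {
			mainRack = r
			break
		}
	}
	if mainRack == nil {
		return nil, errors.New("no rack has z+1 free slots")
	}

	ret := pickServers(mainRack.Servers, z+1)
	if len(ret) < z+1 {
		return nil, errors.New("not enough servers with free slots on the rack")
	}

	// y other racks in the same data center, one replica each.
	need := y
	for _, r := range racks {
		if need == 0 {
			break
		}
		if r == mainRack {
			continue
		}
		if ss := pickServers(r.Servers, 1); len(ss) == 1 {
			ret = append(ret, ss[0])
			need--
		}
	}
	if need > 0 {
		return nil, errors.New("not enough racks with free slots")
	}

	// x other data centers, one replica each.
	need = x
	for _, d := range dcs {
		if need == 0 {
			break
		}
		if d == mainDC {
			continue
		}
		for _, r := range d.Racks {
			if ss := pickServers(r.Servers, 1); len(ss) == 1 {
				ret = append(ret, ss[0])
				need--
				break
			}
		}
	}
	if need > 0 {
		return nil, errors.New("not enough data centers with free slots")
	}
	return ret, nil
}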