author    Chris Lu <chris.lu@gmail.com>  2014-03-02 22:16:54 -0800
committer Chris Lu <chris.lu@gmail.com>  2014-03-02 22:16:54 -0800
commit    27c74a7e66558a4f9ce0d10621606dfed98a3abb (patch)
tree      f16eef19480fd51ccbef54c05d39c2eacf309e56 /note
parent    edae676913363bdd1e5a50bf0778fdcc3c6d6051 (diff)
Major:
change replication_type to ReplicaPlacement, hopefully cleaner code
works for 9 possible ReplicaPlacement xyz
  x: number of copies on other data centers
  y: number of copies on other racks
  z: number of copies on current rack
  x, y, z each can be 0, 1, 2
Minor:
weed server "-mdir" defaults to "-dir" if empty
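For illustration, the xyz encoding above maps naturally onto a small struct parsed from the three digits. The following Go sketch is illustrative only; the type and function names are assumptions, not necessarily the code this commit introduces.

package storage

import "fmt"

// ReplicaPlacement mirrors the xyz encoding from the commit message.
// Names are illustrative; the actual type in the commit may differ.
type ReplicaPlacement struct {
	DiffDataCenterCount int // x: copies on other data centers
	DiffRackCount       int // y: copies on other racks
	SameRackCount       int // z: copies on the current rack
}

// NewReplicaPlacementFromString parses a placement string such as "001" or "210".
func NewReplicaPlacementFromString(t string) (*ReplicaPlacement, error) {
	if len(t) != 3 {
		return nil, fmt.Errorf("replica placement %q should have exactly 3 digits", t)
	}
	rp := &ReplicaPlacement{}
	counts := []*int{&rp.DiffDataCenterCount, &rp.DiffRackCount, &rp.SameRackCount}
	for i, c := range t {
		if c < '0' || c > '2' {
			return nil, fmt.Errorf("digit %d of %q must be 0, 1, or 2", i, t)
		}
		*counts[i] = int(c - '0')
	}
	return rp, nil
}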
Diffstat (limited to 'note')
-rw-r--r--  note/replication.txt  |  37 ++++++++++++++++++++++++++++++-------
1 file changed, 30 insertions(+), 7 deletions(-)
diff --git a/note/replication.txt b/note/replication.txt
index c4bf46044..a151e80c3 100644
--- a/note/replication.txt
+++ b/note/replication.txt
@@ -59,11 +59,6 @@ If any "assign" request comes in
3. return a writable volume to the user
-Plan:
- Step 1. implement one copy(no replication), automatically assign volume ids
- Step 2. add replication
-
-For the above operations, here are the todo list:
for data node:
0. detect existing volumes DONE
1. onStartUp, and periodically, send existing volumes and maxVolumeCount store.Join(), DONE
@@ -77,10 +72,38 @@ For the above operations, here are the todo list:
1. accept data node's report of existing volumes and maxVolumeCount ALREADY EXISTS /dir/join
2. periodically refresh for active data nodes, and adjust writable volumes
3. send command to grow a volume(id + replication level) DONE
- 4. NOT_IMPLEMENTING: if dead/stale data nodes are found, for the affected volumes, send stale info
- to other data nodes. BECAUSE the master will stop sending writes to these data nodes
5. accept lookup for volume locations ALREADY EXISTS /dir/lookup
6. read topology/datacenter/rack layout
+An algorithm to allocate volumes evenly; it may be inefficient when free volume slots are plentiful, since it enumerates every candidate:
+input: replication=xyz
+algorithm:
+ret_dcs = []
+foreach dc with y+z+1 free volume slots {
+ ret_racks = []
+ foreach rack with z+1 free volume slots {
+  ret = select z+1 servers, each with 1 free volume slot
+  if ret.size() == z+1 {
+   ret_racks.append(ret)
+  }
+ }
+ ret = randomly pick one rack's selection from ret_racks
+ ret += select y other racks in this dc, each with 1 free volume slot
+ if ret.size() == y+z+1 {
+  ret_dcs.append(ret)
+ }
+}
+ret = randomly pick one dc's selection from ret_dcs
+ret += select x other data centers, each with 1 free volume slot
+
+A simpler replica placement algorithm; it may fail when free volume slots are scarce:
+ret = []
+dc = randomly pick 1 data center with y+z+1 free volume slots
+ rack = randomly pick 1 rack in dc with z+1 free volume slots
+  ret += randomly pick z+1 servers in rack, each with 1 free volume slot
+ ret += randomly pick y other racks in dc, each with 1 free volume slot
+ret += randomly pick x other data centers, each with 1 free volume slot
+
+
TODO:
1. replicate content to the other server if the replication type needs replicas
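
As a concrete sketch of the simpler placement strategy in the diff above: the topology types (Server, Rack, DataCenter), field names, and helpers below are assumptions invented for illustration; the repository's actual topology code is structured differently. Like the pseudocode, it simply errors out when free volume slots are scarce.

package placement

import (
	"errors"
	"math/rand"
)

// Illustrative topology types; the real topology package differs.
type Server struct{ FreeSlots int }
type Rack struct{ Servers []*Server }
type DataCenter struct{ Racks []*Rack }

// rackFree and dcFree count free volume slots below a node.
func rackFree(r *Rack) (n int) {
	for _, s := range r.Servers {
		n += s.FreeSlots
	}
	return
}

func dcFree(d *DataCenter) (n int) {
	for _, r := range d.Racks {
		n += rackFree(r)
	}
	return
}

// pickServers returns up to n servers that each have a free slot.
// For brevity servers are taken in order; a real implementation
// would randomize here too.
func pickServers(servers []*Server, n int) (out []*Server) {
	for _, s := range servers {
		if len(out) == n {
			break
		}
		if s.FreeSlots > 0 {
			out = append(out, s)
		}
	}
	return
}

// PlaceReplicas follows the simple strategy: one data center with
// y+z+1 free slots, one rack inside it with z+1 free slots, z+1
// servers on that rack, then y other racks and x other data centers
// contributing one replica each.
func PlaceReplicas(dcs []*DataCenter, x, y, z int) ([]*Server, error) {
	// Shuffled in place for brevity, so each linear scan below
	// amounts to a random pick.
	rand.Shuffle(len(dcs), func(i, j int) { dcs[i], dcs[j] = dcs[j], dcs[i] })
	var mainDC *DataCenter
	for _, d := range dcs {
		if dcFree(d) >= y+z+1 {
			mainDC = d
			break
		}
	}
	if mainDC == nil {
		return nil, errors.New("no data center has y+z+1 free slots")
	}

	racks := append([]*Rack(nil), mainDC.Racks...)
	rand.Shuffle(len(racks), func(i, j int) { racks[i], racks[j] = racks[j], racks[i] })
	var mainRack *Rack
	for _, r := range racks {
		if rackFree(r) >= z+1 {
			mainRack = r
			break
		}
	}
	if mainRack == nil {
		return nil, errors.New("no rack has z+1 free slots")
	}

	ret := pickServers(mainRack.Servers, z+1)
	if len(ret) < z+1 {
		return nil, errors.New("not enough servers with free slots on the rack")
	}

	// y other racks in the same data center, one replica each.
	need := y
	for _, r := range racks {
		if need == 0 {
			break
		}
		if r == mainRack {
			continue
		}
		if ss := pickServers(r.Servers, 1); len(ss) == 1 {
			ret = append(ret, ss[0])
			need--
		}
	}
	if need > 0 {
		return nil, errors.New("not enough racks with free slots")
	}

	// x other data centers, one replica each.
	need = x
	for _, d := range dcs {
		if need == 0 {
			break
		}
		if d == mainDC {
			continue
		}
		for _, r := range d.Racks {
			if ss := pickServers(r.Servers, 1); len(ss) == 1 {
				ret = append(ret, ss[0])
				need--
				break
			}
		}
	}
	if need > 0 {
		return nil, errors.New("not enough data centers with free slots")
	}
	return ret, nil
}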