| author | Stuart P. Bentley <stuart@testtrack4.com> | 2015-03-04 22:05:25 +0000 |
|---|---|---|
| committer | Stuart P. Bentley <stuart@testtrack4.com> | 2015-03-04 22:05:25 +0000 |
| commit | 79127b267bee77a54b4be0384c2086000b12eade (patch) | |
| tree | e7fc485ac6491b7145a629a80820de587a7291af /docs/distributed_filer.rst | |
| parent | 5f9efceee305e68e53d3f0b844278f3b599d71e9 (diff) | |
Use GitHub Wiki for docs
I've converted all the docs pages to pages on https://github.com/chrislusf/weed-fs/wiki/:
- docs/index.rst => https://github.com/chrislusf/weed-fs/wiki
- docs/gettingstarted.rst => https://github.com/chrislusf/weed-fs/wiki/Getting-Started
- docs/clients.rst => https://github.com/chrislusf/weed-fs/wiki/Client-Libraries
- docs/api.rst => https://github.com/chrislusf/weed-fs/wiki/API
- docs/replication.rst => https://github.com/chrislusf/weed-fs/wiki/Replication
- docs/ttl.rst => https://github.com/chrislusf/weed-fs/wiki/Store-file-with-a-Time-To-Live
- docs/failover.rst => https://github.com/chrislusf/weed-fs/wiki/Failover-Master-Server
- docs/directories.rst => https://github.com/chrislusf/weed-fs/wiki/Directories-and-Files
- docs/distributed_filer.rst => https://github.com/chrislusf/weed-fs/wiki/Distributed-Filer
- docs/usecases.rst => https://github.com/chrislusf/weed-fs/wiki/Use-Cases
- docs/optimization.rst => https://github.com/chrislusf/weed-fs/wiki/Optimization
- docs/benchmarks.rst => https://github.com/chrislusf/weed-fs/wiki/Benchmarks
- docs/changelist.rst => https://github.com/chrislusf/weed-fs/wiki/Change-List
Diffstat (limited to 'docs/distributed_filer.rst')
| -rw-r--r-- | docs/distributed_filer.rst | 118 |
1 files changed, 0 insertions, 118 deletions
diff --git a/docs/distributed_filer.rst b/docs/distributed_filer.rst
deleted file mode 100644
index b4bd9a43a..000000000
--- a/docs/distributed_filer.rst
+++ /dev/null
@@ -1,118 +0,0 @@
-Distributed Filer
-===========================
-
-The default weed filer is in standalone mode, storing file metadata on disk.
-It is quite efficient to go through deep directory path and can handle
-millions of files.
-
-However, no SPOF is a must-have requirement for many projects.
-
-Luckily, SeaweedFS is so flexible that we can use a completely different way
-to manage file metadata.
-
-This distributed filer uses Redis or Cassandra to store the metadata.
-
-Redis Setup
-#####################
-No setup required.
-
-Cassandra Setup
-#####################
-Here is the CQL to create the table.CassandraStore.
-Optionally you can adjust the keyspace name and replication settings.
-For production, you would want to set replication_factor to 3
-if there are at least 3 Cassandra servers.
-
-.. code-block:: bash
-
-  create keyspace seaweed WITH replication = {
-    'class':'SimpleStrategy',
-    'replication_factor':1
-  };
-
-  use seaweed;
-
-  CREATE TABLE seaweed_files (
-    path varchar,
-    fids list<varchar>,
-    PRIMARY KEY (path)
-  );
-
-
-Sample usage
-#####################
-
-To start a weed filer in distributed mode with Redis:
-
-.. code-block:: bash
-
-  # assuming you already started weed master and weed volume
-  weed filer -redis.server=localhost:6379
-
-To start a weed filer in distributed mode with Cassandra:
-
-.. code-block:: bash
-
-  # assuming you already started weed master and weed volume
-  weed filer -cassandra.server=localhost
-
-Now you can add/delete files
-
-.. code-block:: bash
-
-  # POST a file and read it back
-  curl -F "filename=@README.md" "http://localhost:8888/path/to/sources/"
-  curl "http://localhost:8888/path/to/sources/README.md"
-  # POST a file with a new name and read it back
-  curl -F "filename=@Makefile" "http://localhost:8888/path/to/sources/new_name"
-  curl "http://localhost:8888/path/to/sources/new_name"
-
-Limitation
-############
-List sub folders and files are not supported because Redis or Cassandra
-does not support prefix search.
-
-Flat Namespace Design
-############
-In stead of using both directory and file metadata, this implementation uses
-a flat namespace.
-
-If storing each directory metadata separatedly, there would be multiple
-network round trips to fetch directory information for deep directories,
-impeding system performance.
-
-A flat namespace would take more space because the parent directories are
-repeatedly stored. But disk space is a lesser concern especially for
-distributed systems.
-
-So either Redis or Cassandra is a simple file_full_path ~ file_id mapping.
-(Actually Cassandra is a file_full_path ~ list_of_file_ids mapping
-with the hope to support easy file appending for streaming files.)
-
-Complexity
-###################
-
-For one file retrieval, the full_filename=>file_id lookup will be O(logN)
-using Redis or Cassandra. But very likely the one additional network hop would
-take longer than the actual lookup.
-
-Use Cases
-#########################
-
-Clients can assess one "weed filer" via HTTP, create files via HTTP POST,
-read files via HTTP POST directly.
-
-Future
-###################
-
-SeaweedFS can support other distributed databases. It will be better
-if that database can support prefix search, in order to list files
-under a directory.
-
-Helps Wanted
-########################
-
-Please implement your preferred metadata store!
-
-Just follow the cassandra_store/cassandra_store.go file and send me a pull
-request. I will handle the rest.
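
For readers of this diff, the flat namespace described in the removed page (one `file_full_path` to list-of-file-ids entry per file, with no separate directory records) might be sketched as follows. This is only an illustration under stated assumptions: `FlatNamespaceStore` and its methods are hypothetical names, a Python dict stands in for Redis or Cassandra, and the file ids are made-up placeholders. It is not the code in `cassandra_store/cassandra_store.go`.

```python
from typing import Dict, List


class FlatNamespaceStore:
    """Hypothetical in-memory stand-in for the Redis/Cassandra metadata store."""

    def __init__(self) -> None:
        # One entry per file: full path -> list of file ids.
        # There are no directory entries at all.
        self._entries: Dict[str, List[str]] = {}

    def put(self, full_path: str, fid: str) -> None:
        # Appending to a list mirrors the `fids list<varchar>` column.
        self._entries.setdefault(full_path, []).append(fid)

    def get(self, full_path: str) -> List[str]:
        # A single exact-key lookup, regardless of directory depth.
        return list(self._entries.get(full_path, []))

    def delete(self, full_path: str) -> None:
        self._entries.pop(full_path, None)


store = FlatNamespaceStore()
# The file ids below are invented placeholders, not real volume file ids.
store.put("/path/to/sources/README.md", "3,01637037d6")
store.put("/path/to/sources/new_name", "3,02a1b2c3d4")

readme_fids = store.get("/path/to/sources/README.md")
# The directory itself has no entry, so an exact-key lookup finds nothing.
missing_fids = store.get("/path/to/sources/")
```

Because only full file paths are keys, a lookup for the directory path returns nothing, which is the listing limitation the removed page describes for stores without prefix search.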
