Commit 50c9333

upgrade to 0.0.3
1 parent 49f555a commit 50c9333

40 files changed

+189
-5267
lines changed

README.md

+41-28
@@ -1,85 +1,97 @@
11
BlobFs
22
=====
33
![blobfs demo](doc/blobfs-demo.gif)
4-
BlobFs is a distributed [FUSE](http://fuse.sourceforge.net) based file system backed by [Microsoft azure blob storage service](https://azure.microsoft.com/en-us/services/storage/blobs/). It allows you to mount the containers/blobs in the storage account as a Linux local folder. It support the cluster mode.
4+
BlobFs is a distributed [FUSE](http://fuse.sourceforge.net)-based file system backed by [Microsoft Azure Blob storage](https://azure.microsoft.com/en-us/services/storage/blobs/). It allows you to mount the containers/blobs in a storage account as a local folder or drive, on either Linux or Windows. It supports cluster mode: you can mount a blob container (or part of it) across multiple Linux and Windows nodes.
5+
6+
## Important Notes:
7+
* This is the Linux/Mac version of blobfs. The Windows version is available at [blobfs-win](https://github.com/wesley1975/blobfs-win).
8+
* The core library of blobfs is [bloblib](https://github.com/wesley1975/bloblib), which is responsible for handling all the underlying Azure Blob storage operations.
9+
* If you are interested in contributing, please contact me via [email protected]
510

611
## Project Goals
7-
The main goal of the project is to make azure storage service easy to use for Linux box.
12+
Object storage is one of the most fundamental topics you'll encounter when you start your cloud journey. The main goal of the project is to make the Azure storage service easy to use on Linux and Windows machines.
13+
14+
## Key Updates:
15+
Based on a lot of feedback, I made these major updates in version 0.0.3:
16+
* Ported blobfs to the Windows platform. It is now a universal solution.
17+
* Improved the performance of list/rename/delete operations by making them multi-threaded. It can now list/rename/delete thousands of items within a few seconds.
18+
* At the request of many users, I changed the queue from Service Bus to Azure Queue storage. This greatly simplifies the configuration.
19+
* Fixed various bugs. It is now more stable, and you can use it in a production environment, but at your own risk.
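The multi-threaded delete mentioned in the updates above could look roughly like the following. This is only a sketch in Python, not the blobfs implementation (which is written in Java): `delete_many`, the `delete_one` callable and the worker count of 16 are all hypothetical names and values chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def delete_many(paths: Iterable[str],
                delete_one: Callable[[str], None],
                workers: int = 16) -> List[str]:
    """Run delete_one for every path on a thread pool.

    delete_one is a hypothetical callable that deletes a single blob.
    Returns the paths whose deletion raised an exception.
    """
    failed: List[str] = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Submit all deletions up front so they can overlap.
        futures = {pool.submit(delete_one, path): path for path in paths}
        for future, path in futures.items():
            # exception() blocks until that task has finished.
            if future.exception() is not None:
                failed.append(path)
    return failed
```

Because each deletion is mostly waiting on the network, running many of them concurrently is where the speed-up would come from.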
820

921
## Features:
1022
* Implemented these fuse functions: getattr, readdir, open, release, read, flush, create, mkdir, rename, rmdir, unlink, truncate, write, symlink, readlink.
1123
* Allows mounting multiple containers (or parts of them) as local folders.
1224
* Cluster enabled: it supports mounting the same containers/blobs across multiple nodes. These files can be shared across the nodes. The caches of these nodes are synchronized via service bus.
1325
* Use blob leases as the distributed locking mechanism across multiple nodes. The blob will be locked exclusively when it is written.
14-
* File’s attribute is cached for better performance, the cache are synchronized via service bus.
26+
* File attributes are cached for better performance; the caches are synchronized via Azure Queue storage.
1527
* Content is pre-cached in chunks when a read operation occurs. This reduces the number of HTTP requests and greatly improves performance.
1628
* Multi-part uploads are used for write operations. Data is buffered first and then uploaded once the buffer size exceeds the threshold. This also reduces the number of HTTP requests and greatly improves performance.
1729
* You can edit file content on the fly, which is especially recommended for small files. There is no need to download, edit and then upload.
1830
* Append mode is supported: you can append a new line to an existing blob directly, which is friendlier for logging. It can also change a block blob to an append blob automatically.
1931
* Use server-side copy for move, rename operations, more efficient for big files and folders.
20-
* Support the link function
32+
* Supports symbolic links
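The buffered multi-part upload described in the feature list can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the blobfs code: `BufferedBlockWriter`, the `upload_block` callable and the 4 MiB default threshold are hypothetical, and the sketch flushes as soon as the buffer reaches the threshold.

```python
from typing import Callable


class BufferedBlockWriter:
    """Collect small writes and upload them as larger blocks."""

    def __init__(self, upload_block: Callable[[bytes], None],
                 threshold: int = 4 * 1024 * 1024):
        # upload_block is a hypothetical uploader: one HTTP request per call.
        self._upload_block = upload_block
        self._threshold = threshold
        self._buffer = bytearray()

    def write(self, data: bytes) -> int:
        self._buffer.extend(data)
        # Upload full blocks while enough data is buffered.
        while len(self._buffer) >= self._threshold:
            block = bytes(self._buffer[:self._threshold])
            del self._buffer[:self._threshold]
            self._upload_block(block)
        return len(data)

    def close(self) -> None:
        # Upload whatever is left as a final, possibly shorter block.
        if self._buffer:
            self._upload_block(bytes(self._buffer))
            self._buffer.clear()
```

With a threshold of 8 bytes, five 3-byte writes would result in two uploads (8 bytes and 7 bytes) instead of five requests.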
2133

2234
## Architecture and introduction
2335

2436
This is the logical architecture of blobfs:
2537

2638
![blobfs Logical Architecture](doc/blobfs-arch.jpg)
2739
* Blobfs uses blob leases to protect write operations in a distributed environment; a dedicated thread renews the lease automatically.
28-
* For each of the node, there is local cache in it’s memory, the cache will store the file attributes. Once the file is changed by the node, the node will send a message to the topic of the service bus. And then other nodes will receive the notification via the dedicated subscription of the topic.
40+
* Each node keeps a local cache in its memory that stores the file attributes. Once a file is changed by a node, the node sends a message to Azure Queue storage. The other nodes then receive the message and process it.
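The cache synchronization described above can be pictured with a small sketch. It is an illustration only: an in-process `queue.Queue` stands in for Azure Queue storage, the message format (a bare path) and the idea of one inbox per node are assumptions, and `AttrCache` and `drain_invalidations` are hypothetical names. The 180-second default mirrors the `cache_TTL` setting in blobfs.conf.

```python
import queue
import time
from typing import Any, Dict, Optional, Tuple


class AttrCache:
    """Per-node attribute cache with a time-to-live."""

    def __init__(self, ttl_seconds: float = 180.0):
        self._ttl = ttl_seconds
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def put(self, path: str, attrs: Any) -> None:
        self._entries[path] = (time.monotonic() + self._ttl, attrs)

    def get(self, path: str) -> Optional[Any]:
        entry = self._entries.get(path)
        if entry is None:
            return None
        expires_at, attrs = entry
        if time.monotonic() >= expires_at:
            # Entry is too old; drop it so the caller refetches.
            del self._entries[path]
            return None
        return attrs

    def invalidate(self, path: str) -> None:
        self._entries.pop(path, None)


def drain_invalidations(inbox: "queue.Queue[str]", cache: AttrCache) -> int:
    """Apply every pending change message to the local cache."""
    handled = 0
    while True:
        try:
            path = inbox.get_nowait()
        except queue.Empty:
            return handled
        cache.invalidate(path)
        handled += 1
```

A node that changes a file would put the path on the other nodes' inboxes; each receiver drops its cached attributes and fetches fresh ones on the next access.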
2941

3042
## Installation
31-
I strongly recommend to test and verify it in you environment before you use it.
43+
Installation is now very easy, but I strongly recommend testing and verifying it in your environment before you use it. Use it at your own risk.
3244
### 1.Install fuse
3345
yum install fuse fuse-devel
34-
### 2.Install blobfs without cluster mode enabled
46+
### 2.Install blobfs
3547
#### 2.1 Get the Azure storage account connection string; refer to this [link](https://docs.microsoft.com/en-us/azure/storage/storage-create-storage-account)
3648
#### 2.2 Edit configuration file:
3749
Open blobfs.conf
3850
and change these settings:
3951
Storage_Connection_String = your-storage-account-connection-string
4052
blob_prefix = / (e.g. /container1/folder1/)
4153
mount_point = /mnt/blobfs (make sure the path exists on your node)
42-
cluster_enabled = false
43-
You can also modify other settings if needed
44-
### 3.Install blobfs with cluster mode enabled
45-
Additionally, you should do these actions
46-
#### 3.1 Create a service bus topic. Refer this [link](https://docs.microsoft.com/en-us/azure/service-bus-messaging/service-bus-create-topics-subscriptions)
47-
#### 3.2 create a subscription for each of your node, refer this [link](https://docs.microsoft.com/en-us/azure/service-bus-messaging/service-bus-create-topics-subscriptions)
48-
#### 3.3 Edit configuration file:
49-
Open blobfs.conf,
50-
change the setting of :
51-
Storage_Connection_String = your-storage-account -connection-string
52-
blob_prefix = / (e.g. /container1/folder1/)
53-
mount_point = /mnt/blobfs (make sure the path exists in you node)
5454
cluster_enabled = true
55-
service_bus_connection_string = your-servicebus-connection-string
56-
service_bus_topic = your-blobfs-topic
57-
service_bus_subscription = subscription of the dedicated node
58-
You can modify other settings if needed
55+
Cluster mode is enabled by default. You can also modify other settings if needed.
56+
5957
### final.Start the blobfs service
60-
nohup java -jar uber-blobfs-0.0.1-SNAPSHOT.jar
58+
nohup java -jar blobfs-0.0.3.jar
59+
It is highly recommended that you use [supervisord](http://supervisord.org/) to manage the blobfs service.
6160

6261
## Tips
6362
* A block blob is read-only by default and is marked with the read-only flag, e.g. r--r--r--
6463
* An append blob is marked with the read and write flags, e.g. rw-rw-rw-
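The two permission strings in the tips correspond to the octal modes 0444 and 0666. The sketch below shows that mapping with Python's standard `stat` module; `mode_for_blob` and the string blob-type names are hypothetical, and how blobfs computes the mode internally is not shown in this README.

```python
import stat

BLOCK_BLOB_PERMS = 0o444   # r--r--r--
APPEND_BLOB_PERMS = 0o666  # rw-rw-rw-


def mode_for_blob(blob_type: str) -> int:
    """Return a regular-file mode for the given (hypothetical) blob type name."""
    if blob_type == "block":
        return stat.S_IFREG | BLOCK_BLOB_PERMS
    if blob_type == "append":
        return stat.S_IFREG | APPEND_BLOB_PERMS
    raise ValueError("unsupported blob type: " + blob_type)


print(stat.filemode(mode_for_blob("block")))   # -r--r--r--
print(stat.filemode(mode_for_blob("append")))  # -rw-rw-rw-
```

The leading `-` in the printed strings is the file-type character for a regular file, as shown by `ls -l`.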
6564

65+
## How to create an append blob
66+
* CLI way
67+
touch append.log // this creates an empty block blob.
68+
echo 'new line here' >> append.log
69+
// this changes the underlying block blob to an append blob automatically.
70+
// you can also run this command against an existing file; it works too, but the time taken depends on the size of the file.
71+
72+
* Programming way
73+
FileWriter fw = new FileWriter("/mnt/blobfs/container1/append.log", true); // Java 1.7+
74+
append_file = open("/mnt/blobfs/container1/append.log", "a+")  # Python
75+
...
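A slightly fuller version of the Python case, as a sketch: it relies only on the built-in `open` in append mode, so from the program's point of view the mounted blob behaves like any local file. The helper name `append_line` is hypothetical, and the path in the comment is the example path used above.

```python
def append_line(path: str, line: str) -> None:
    """Append one line to a file, creating the file if it does not exist."""
    # Mode "a" positions every write at the end of the file.
    with open(path, "a", encoding="utf-8") as log_file:
        log_file.write(line.rstrip("\n") + "\n")


# Example call against a mounted container (path taken from the text above):
# append_line("/mnt/blobfs/container1/append.log", "new line here")
```

Using a `with` block closes the file when the write is done, which is when buffered data is flushed to the file system.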
76+
6677
## Performance Test
67-
* simply do a dd testing (single thread mode).
78+
* A simple dd test (single-thread mode); the results depend on the machine and network.
6879
![blobfs performance test](doc/blobfs-perf.gif)
6980

7081
## Dependency
7182
* FUSE (Filesystem in Userspace) is an OS mechanism for unix-like OS that lets non-privileged users create their own file systems without editing kernel code.
7283
* [Java Native Runtime (JNR)](https://github.com/jnr/jnr-ffi) is a high-performance Java API for binding native libraries and native memory.
7384
* [jnr-fuse](https://github.com/SerCeMan/jnr-fuse) is a FUSE implementation in Java using Java Native Runtime (JNR).
7485

75-
## Limitation
86+
## Limitations and known issues:
7687
* Due to the overhead of FUSE, performance is expected to be slower than a native file system.
7788
* For the cp command, blobfs reads the source out and then writes it into a new blob. This takes more time for large files/folders.
78-
* For the page blob, currently, blobfs doest not support the copy operation. it may casue file interruption.
89+
* Page blobs should currently work, but they are not well tested yet; operations on them may cause file interruption.
7990

8091
## Supported platforms
8192
-Linux : x86, x64
8293
-MacOS (via osxfuse): x86, x64 (should be, but not tested yet)
94+
-Windows: [blobfs-win](https://github.com/wesley1975/blobfs-win)
8395

8496
## Command Line Usage
8597
blobfs -h
@@ -90,10 +102,11 @@ Additionally, you should do these actions
90102
-m,--mount-point <arg> Desired local mount point for BlobFs.
91103
-o <arg> FUSE mount options
92104

105+
For more FUSE mount options, see [here](http://manpages.ubuntu.com/manpages/xenial/man8/mount.fuse.8.html).
93106

94107
## License
95108
Copyright (C) 2017 Wesley Wu [email protected]
96-
This code is licensed under the The MIT License (MIT).
109+
This code is licensed under the GNU General Public License version 3.
97110

98111
## FeedBack
99112
Your feedback is highly appreciated! :)

README.md.old

+99
@@ -0,0 +1,99 @@
1+
BlobFs
2+
=====
3+
![blobfs demo](doc/blobfs-demo.gif)
4+
BlobFs is a distributed [FUSE](http://fuse.sourceforge.net) based file system backed by [Microsoft azure blob storage service](https://azure.microsoft.com/en-us/services/storage/blobs/). It allows you to mount the containers/blobs in the storage account as a the local folder/driver. It support the cluster mode.
5+
6+
## Project Goals
7+
The main goal of the project is to make azure storage service easy to use for Linux box.
8+
9+
## Features:
10+
* Implemented these fuse functions: getattr, readdir, open, release, read, flush, create, mkdir, rename, rmdir, unlink, truncate, write, symlink, readlink.
11+
* Allow mount multiple containers (or part of them) as the local folder.
12+
* Cluster enabled: It supports mount the same containers/blobs across multiple nodes. Theses files can be shared via these nodes. The caches of these nodes are synchronized via service bus.
13+
* Use blob leases as the distributed locking mechanism across multiple nodes. The blob will be locked exclusively when it is written.
14+
* File’s attribute is cached for better performance, the cache are synchronized via service bus.
15+
* The contents are pre-cached by chunks when there is read operation. This will eliminate the times of http request and increase the performance greatly.
16+
* Multi-part uploads are used for the write operation. Data is buffered firstly and then be uploaded if the buffer size exceed the threshold. This also can eliminate the times of http request and increase the performance greatly.
17+
* You can edit the file content on the fly, especially recommend for the small file, It does not need download, edit and then upload.
18+
* Append mode is supported, you can append the new line to the existing blob directly. this is more friendly for logging operation. And it can change the block blob to append blob automatically.
19+
* Use server-side copy for move, rename operations, more efficient for big files and folders.
20+
* Support the link function
21+
22+
## Architecture and introduction
23+
24+
This is the logical architecture of blobfs:
25+
26+
![blobfs Logical Architecture](doc/blobfs-arch.jpg)
27+
* Blobfs uses the blob leases to safe the write operation in the distributed environment, there is a dedicated thread to renew the lease automatically.
28+
* For each of the node, there is local cache in it’s memory, the cache will store the file attributes. Once the file is changed by the node, the node will send a message to the topic of the service bus. And then other nodes will receive the notification via the dedicated subscription of the topic.
29+
30+
## installation
31+
I strongly recommend to test and verify it in you environment before you use it.
32+
### 1.Install fuse
33+
yum install fuse fuse-devel
34+
### 2.Install blobfs without cluster mode enabled
35+
#### 2.1 get the azure account connection string, refer this [link](https://docs.microsoft.com/en-us/azure/storage/storage-create-storage-account)
36+
#### 2.2 Edit configuration file:
37+
Open blobfs.conf
38+
change the setting of :
39+
Storage_Connection_String = your-storage-account -connection-string
40+
blob_prefix = / (e.g. /container1/folder1/)
41+
mount_point = /mnt/blobfs (make sure the path exists in you node)
42+
cluster_enabled = false
43+
You can also modify other settings if needed
44+
### 3.Install blobfs with cluster mode enabled
45+
Additionally, you should do these actions
46+
#### 3.1 Create a service bus topic. Refer this [link](https://docs.microsoft.com/en-us/azure/service-bus-messaging/service-bus-create-topics-subscriptions)
47+
#### 3.2 create a subscription for each of your node, refer this [link](https://docs.microsoft.com/en-us/azure/service-bus-messaging/service-bus-create-topics-subscriptions)
48+
#### 3.3 Edit configuration file:
49+
Open blobfs.conf,
50+
change the setting of :
51+
Storage_Connection_String = your-storage-account -connection-string
52+
blob_prefix = / (e.g. /container1/folder1/)
53+
mount_point = /mnt/blobfs (make sure the path exists in you node)
54+
cluster_enabled = true
55+
service_bus_connection_string = your-servicebus-connection-string
56+
service_bus_topic = your-blobfs-topic
57+
service_bus_subscription = subscription of the dedicated node
58+
You can modify other settings if needed
59+
### final.Start the blobfs service
60+
nohup java -jar uber-blobfs-0.0.1-SNAPSHOT.jar
61+
62+
## Tips
63+
* the block blob is read only by default. marked with read only flag. e.g. r--r--r--
64+
* the append blob is marked with the read and write flag. e.g rw-rw-rw-
65+
66+
## Performance Test
67+
* simply do a dd testing (single thread mode).
68+
![blobfs performance test](doc/blobfs-perf.gif)
69+
70+
## Dependency
71+
* FUSE (Filesystem in Userspace) is an OS mechanism for unix-like OS that lets non-privileged users create their own file systems without editing kernel code.
72+
* [Java Native Runtime (JNR)](https://github.com/jnr/jnr-ffi) is high-performance Java API for binding native libraries and native memory.
73+
* [jnr-fuse](https://github.com/SerCeMan/jnr-fuse) is FUSE implementation in Java using Java Native Runtime (JNR).
74+
75+
## Limitation
76+
* Due to the overhead of fuse system, the performance will be expected slower than native file system.
77+
* For the cp command, the blobfs will use read out - then write in to new blob mode. this will spent more time for large files/folders.
78+
* For the page blob, currently, blobfs doest not support the copy operation. it may casue file interruption.
79+
80+
## Supported platforms
81+
-Linux : x86, x64
82+
-MacOS (via osxfuse): x86, x64 (should be, but not tested yet)
83+
84+
## Command Line Usage
85+
blobfs -h
86+
-b,--blob-prefix <arg> The prefix of the blobs that will be used as the
87+
mounted BlobFS root (e.g., /container1/blob1/;
88+
defaults to /)
89+
-h,--help Print this help
90+
-m,--mount-point <arg> Desired local mount point for BlobFs.
91+
-o <arg> FUSE mount options
92+
93+
94+
## License
95+
Copyright (C) 2017 Wesley Wu [email protected]
96+
This code is licensed under the The MIT License (MIT).
97+
98+
## FeedBack
99+
Your feedbacks are highly appreciated! :)

bin/blobfs-0.0.3.jar

4.13 MB
Binary file not shown.

bin/blobfs.conf

+49
@@ -0,0 +1,49 @@
1+
# ================================================================================
2+
# ++++++++++++++++++ blobfs configurations +++++++++++++++++++++++++++++++++++++++
3+
# ================================================================================
4+
# Azure storage account connection string
5+
Storage_Connection_String = DefaultEndpointsProtocol=https;AccountName=dev2107;AccountKey=wJNRBizfsyQ2JUBBBJjBFAsxjBmvVpkK+61ZqKV/n3HaI/jSEdS3oGoyBGdmvBhVVeUy5S2hlGSAuipeyXXSww==;EndpointSuffix=core.chinacloudapi.cn
6+
7+
# the name of the queue, which will be used for synchronizing the file cache across the nodes.
8+
queue_name = blobfs
9+
10+
# the prefix of the blobs that will be used as the mounted blobfs root,
11+
# e.g., /container1/blob1/; defaults to /
12+
blob_prefix = /
13+
14+
# Desired local mount point for BlobFs, linux and osx
15+
mount_point = /mnt/blobfs
16+
17+
# Desired local mount point for BlobFs, windows
18+
win_mount_point = Y:\
19+
20+
# the user id for the BlobFs, default value is the caller user id
21+
# default value is :-1.
22+
uid = -1
23+
24+
# the group id for the BlobFs, default value is the group id of the caller
25+
# default value is :-1.
26+
gid = -1
27+
28+
29+
# In the Debug Mode, the debug messages will be displayed in the console, currently this option is ignored.
30+
debug_enabled = true
31+
32+
# supports five logging levels: TRACE < DEBUG < INFO < WARNING < ERROR.
33+
log_level = ERROR
34+
35+
# cache
36+
cache_enabled = true
37+
38+
# in seconds
39+
cache_TTL = 180
40+
41+
# if the same blobs will be mounted by more than one host, you should enable the cluster mode
42+
# in the cluster mode, blobfs keeps the caches of all nodes in sync
43+
cluster_enabled = true
44+
45+
# change the block blob to append blob automatically,
46+
# this will be triggered when you open a read only file with the append mode.
47+
# caution for large blob, this will consume more time
48+
# e.g., echo "new line" >> readonlyfile
49+
auto_change_block_blob_to_append_blob = true
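
The file above uses plain `key = value` lines with `#` comments. As a rough sketch of how such a file can be read, the Python below splits each line on the first `=` only, which matters because the connection string itself contains `=` characters. `parse_blobfs_conf` is a hypothetical helper; how blobfs actually loads its configuration is not shown in this commit.

```python
from typing import Dict


def parse_blobfs_conf(text: str) -> Dict[str, str]:
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    settings: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only; values may contain '=' themselves.
        key, separator, value = line.partition("=")
        if not separator:
            continue  # not a key/value line
        settings[key.strip()] = value.strip()
    return settings


SAMPLE = """
# cache
cache_enabled = true
cache_TTL = 180
Storage_Connection_String = DefaultEndpointsProtocol=https;AccountName=<name>;AccountKey=<key>
"""

settings = parse_blobfs_conf(SAMPLE)
print(settings["cache_TTL"])  # 180
```

All values come back as strings, so a caller would still convert settings such as `cache_TTL` or `cluster_enabled` to numbers or booleans.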

doc/blobfs-arch.jpg

17.6 KB

src/main/java/com/wesley/blobfs/BfsBlobModel.java

-45
This file was deleted.

src/main/java/com/wesley/blobfs/BfsBlobType.java

-5
This file was deleted.
