Commit 96dc8bb

Changes to support RocksDB
This includes the following changes:
1. Honour O_CLOEXEC in open call
2. FCNTL honour O_CLOEXEC
3. Redirect FREAD_UNLOCKED to FREAD
4. Handle 'pread' calls in SplitFS
5. Intercept fallocate
6. Intercept sync_file_range
7. Add unit tests
8. Add implementation.md
1 parent cf787a1 commit 96dc8bb

18 files changed: 733 additions and 53 deletions

implementation.md

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+## Implementation details
+Some of the implementation details of intercepted calls in SplitFS:
+- `fallocate`, `posix_fallocate`
+  - We pass this call through to the kernel.
+  - Before passing it on, we fsync (relink) the file so that the kernel and SplitFS both see consistent file contents and metadata.
+  - We also clear the mmap table in SplitFS, because its entries might become stale after the system call.
+  - After the system call, we update the file size in SplitFS accordingly before returning to the application.
+- `sync_file_range`
+  - `sync_file_range` guarantees data durability only for overwrites on certain filesystems. It does not guarantee metadata durability on any filesystem.
+  - In the POSIX mode of SplitFS we likewise guarantee data durability but not metadata durability, i.e. we provide the same guarantees as POSIX.
+  - Data durability is guaranteed by performing non-temporal writes to the memory-mapped file, so nothing further is needed here. When the file is not memory mapped (e.g. file size < 16 MB), we pass the call on to the underlying filesystem.
+  - In the Sync and Strict modes of SplitFS, durability is guaranteed by the filesystem itself, and `sync_file_range` is not required.
+- `O_CLOEXEC`
+  - This is supported via `open` and `fcntl` in SplitFS. We store the flag value in SplitFS.
+  - In the supported `exec` calls, we first close these files before passing the `exec` call to the kernel.
+  - We do not currently handle the failure scenario for `exec`.
+- `fcntl`
+  - Currently SplitFS only handles the value of the close-on-exec flag before the call is passed through to the kernel.
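
The `fallocate` steps described in implementation.md can be sketched roughly as follows. This is a minimal illustration, not the SplitFS code: `struct sketch_file`, `sketch_clear_mmap_table`, and `sketch_fallocate` are hypothetical names, a plain `fsync` stands in for the relink step, and clearing the table before the system call (rather than after) is an assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define SKETCH_MAX_MAPPINGS 8

/* Hypothetical per-file state kept by a user-space layer. */
struct sketch_file {
    int   fd;
    off_t cached_size;                      /* file size as the library sees it */
    void *mmap_table[SKETCH_MAX_MAPPINGS];  /* cached mapping addresses */
};

/* Forget all cached mappings; they may be stale once the kernel changes
 * the file. This sketch only drops the addresses, it does not unmap. */
static void sketch_clear_mmap_table(struct sketch_file *f)
{
    for (size_t i = 0; i < SKETCH_MAX_MAPPINGS; i++)
        f->mmap_table[i] = NULL;
}

/* Returns 0 on success or an errno value, like posix_fallocate itself. */
static int sketch_fallocate(struct sketch_file *f, off_t offset, off_t len)
{
    struct stat st;
    int err;

    assert(f != NULL);

    /* 1. Flush so the kernel sees up-to-date contents and metadata.
     *    (Assumption: fsync stands in for the relink described above.) */
    if (fsync(f->fd) != 0)
        return errno;

    /* 2. Drop cached mappings that the system call may invalidate. */
    sketch_clear_mmap_table(f);

    /* 3. Let the kernel do the actual allocation. */
    err = posix_fallocate(f->fd, offset, len);
    if (err != 0)
        return err;

    /* 4. Refresh the cached size before returning to the caller. */
    if (fstat(f->fd, &st) != 0)
        return errno;
    f->cached_size = st.st_size;
    return 0;
}
```

A caller would fill in `fd`, call `sketch_fallocate(&f, 0, len)`, and then rely on `f.cached_size` instead of its previous value.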

splitfs/bg_clear_mmap.h

Lines changed: 2 additions & 2 deletions
@@ -44,9 +44,9 @@ static void clean_dr_mmap() {
         assert(0);
     }
     if (clean_overwrite)
-        ret = posix_fallocate(dr_fd, 0, DR_OVER_SIZE);
+        ret = _hub_find_fileop("posix")->POSIX_FALLOCATE(dr_fd, 0, DR_OVER_SIZE);
     else
-        ret = posix_fallocate(dr_fd, 0, DR_SIZE);
+        ret = _hub_find_fileop("posix")->POSIX_FALLOCATE(dr_fd, 0, DR_SIZE);
 
     if (ret < 0) {
         MSG("%s: posix_fallocate failed. Err = %s\n",
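
The commit does not say why this call now goes through `_hub_find_fileop("posix")`. A plausible reading is that, once `posix_fallocate` is intercepted, internal callers must reach the plain POSIX implementation directly instead of re-entering the interception layer. The sketch below illustrates that lookup-by-name pattern under this assumption; every `sketch_*` name is hypothetical and the two implementations only count calls.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table of operations, one per backend. */
struct sketch_fileops {
    const char *name;
    int (*fallocate_fn)(int fd, long offset, long len);
};

static const struct sketch_fileops *sketch_find_fileop(const char *name);

static int sketch_posix_calls;
static int sketch_managed_calls;

/* Stand-in for the real system call wrapper. */
static int sketch_posix_impl(int fd, long offset, long len)
{
    (void)fd; (void)offset; (void)len;
    sketch_posix_calls++;
    return 0;
}

/* Stand-in for the intercepting layer: does its own bookkeeping, then
 * forwards to the "posix" backend by name rather than calling itself. */
static int sketch_managed_impl(int fd, long offset, long len)
{
    sketch_managed_calls++;
    return sketch_find_fileop("posix")->fallocate_fn(fd, offset, len);
}

static const struct sketch_fileops sketch_registry[] = {
    { "posix",   sketch_posix_impl   },
    { "managed", sketch_managed_impl },
};

/* Linear lookup by backend name; returns NULL when nothing matches. */
static const struct sketch_fileops *sketch_find_fileop(const char *name)
{
    size_t n = sizeof(sketch_registry) / sizeof(sketch_registry[0]);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(sketch_registry[i].name, name) == 0)
            return &sketch_registry[i];
    }
    return NULL;
}
```

Internal code that must bypass the managed layer would call `sketch_find_fileop("posix")->fallocate_fn(...)`, mirroring the shape of the changed lines in this hunk.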

splitfs/fileops_hub.c

Lines changed: 18 additions & 0 deletions
@@ -276,6 +276,9 @@ RETT_FOPEN64 _hub_FOPEN64(INTF_FOPEN64);
 RETT_IOCTL ALIAS_IOCTL(INTF_IOCTL) WEAK_ALIAS("_hub_IOCTL");
 RETT_IOCTL _hub_IOCTL(INTF_IOCTL);
 
+RETT_FCNTL ALIAS_FCNTL(INTF_FCNTL) WEAK_ALIAS("_hub_FCNTL");
+RETT_FCNTL _hub_FCNTL(INTF_FCNTL);
+
 RETT_OPEN64 ALIAS_OPEN64(INTF_OPEN64) WEAK_ALIAS("_hub_OPEN64");
 RETT_OPEN64 _hub_OPEN64(INTF_OPEN64);
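
This hunk only adds the `fcntl` alias and prototype. According to implementation.md in this commit, the interception lets SplitFS record the close-on-exec flag set through `open` or `fcntl` and close those files before `exec`. A minimal sketch of that bookkeeping follows. All `sketch_*` names, the fixed-size table, and the callback-based close are assumptions made for illustration, not the SplitFS implementation.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>

#define SKETCH_MAX_FDS 1024

/* Hypothetical per-descriptor record of the close-on-exec flag. */
static bool sketch_cloexec[SKETCH_MAX_FDS];

/* Record the flag when a file is opened with O_CLOEXEC. */
static void sketch_note_open(int fd, int oflags)
{
    if (fd >= 0 && fd < SKETCH_MAX_FDS)
        sketch_cloexec[fd] = (oflags & O_CLOEXEC) != 0;
}

/* Record the flag when it is changed through fcntl(fd, F_SETFD, arg). */
static void sketch_note_fcntl(int fd, int cmd, int arg)
{
    if (fd < 0 || fd >= SKETCH_MAX_FDS)
        return;
    if (cmd == F_SETFD)
        sketch_cloexec[fd] = (arg & FD_CLOEXEC) != 0;
}

/* Close every flagged descriptor before exec; returns how many were closed.
 * close_fn is a parameter so the sketch can be tried without real files. */
static int sketch_close_before_exec(int (*close_fn)(int))
{
    int closed = 0;
    for (int fd = 0; fd < SKETCH_MAX_FDS; fd++) {
        if (sketch_cloexec[fd]) {
            close_fn(fd);
            sketch_cloexec[fd] = false;
            closed++;
        }
    }
    return closed;
}

/* Stand-in for close(2) that leaves real descriptors untouched. */
static int sketch_noop_close(int fd)
{
    (void)fd;
    return 0;
}
```

In a real interception layer `close_fn` would be the library's own close path. Like the commit, the sketch has no handling for the case where `exec` fails after the files have been closed.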

@@ -1399,6 +1402,21 @@ RETT_UNLINK _hub_UNLINK(INTF_UNLINK)
     return result;
 }
 
+RETT_FCNTL _hub_FCNTL(INTF_FCNTL)
+{
+    CHECK_RESOLVE_FILEOPS(_hub_);
+
+    DEBUG("CALL: _hub_FCNTL\n");
+
+    va_list ap;
+    void * arg;
+    va_start (ap, cmd);
+    arg = va_arg (ap, void*);
+    va_end (ap);
+
+    return _hub_managed_fileops->FCNTL(CALL_FCNTL, arg);
+}
+
 RETT_UNLINKAT _hub_UNLINKAT(INTF_UNLINKAT)
 {
     CHECK_RESOLVE_FILEOPS(_hub_);
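
`_hub_FCNTL` above reads the optional third argument as a `void *` regardless of the command and forwards it. For comparison, here is a hedged sketch of a stricter variant that picks the argument type from `cmd`. It is not what the commit does: `sketch_fcntl` is a hypothetical name and only a handful of commands are classified.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdarg.h>
#include <unistd.h>

/* Variadic wrapper that forwards to fcntl(2), reading the optional third
 * argument with a type chosen from the command. */
static int sketch_fcntl(int fd, int cmd, ...)
{
    va_list ap;
    int result;

    va_start(ap, cmd);
    switch (cmd) {
    case F_GETFD:
    case F_GETFL:
        /* These commands take no third argument. */
        result = fcntl(fd, cmd);
        break;
    case F_SETFD:
    case F_SETFL:
    case F_DUPFD:
        /* These commands take an int. */
        result = fcntl(fd, cmd, va_arg(ap, int));
        break;
    default:
        /* Assumption: remaining commands take a pointer (e.g. the struct
         * flock commands). Other argument-less commands would need their
         * own case before this default. */
        result = fcntl(fd, cmd, va_arg(ap, void *));
        break;
    }
    va_end(ap);
    return result;
}
```

For example, `sketch_fcntl(fd, F_SETFD, FD_CLOEXEC)` followed by `sketch_fcntl(fd, F_GETFD)` reports the close-on-exec flag as set. The unconditional `void *` read in the commit is shorter and common in interposition code, while the switch avoids reading an argument the caller never passed.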
