
Consider returning Future instead of Result from Filesystem methods #3

Open

LalitMaganti opened this issue Jan 16, 2017 · 6 comments
LalitMaganti commented Jan 16, 2017

Currently, all methods in the FilesystemMT trait return a Result indicating the response to the method call.

While it is true that all requests are currently processed concurrently using threads, it may be more ergonomic to run everything on a reactor (from the tokio-rs core crate) and change all methods to return Futures instead of Results.

In my case, I am trying to use async I/O to perform network requests (I'm writing a network filesystem), so threading at the library level is not useful to me. Using futures-rs means that implementors can choose whichever mechanism is most appropriate for them, whether that is disk I/O on a thread pool or network requests on a reactor.

I'd be happy to prototype this in a fork and send a PR if you are open to it!

wfraser (Owner) commented Jan 16, 2017

This is an interesting idea, and I think it's worth looking into.

I'm currently just beginning to learn the Futures / Tokio framework. I took a very brief stab at changing just one of the calls so that the client FS returns a Future instead of a Result, without dispatching it onto a threadpool, and I ran into some problems:

  1. At a minimum, the Tokio event loop needs to run on its own thread, because the incoming calls from rust-fuse come in on the main thread, and the event loop blocks its thread permanently. (Nearly all of the Tokio examples have them starting up the event loop at the end of the main function and blocking there, which just isn't how FUSE filesystems work.)
  2. The thread for the Tokio event loop needs to be created lazily. This is because a common thing to do in a filesystem's init call is to detach from the console, which involves fork, which cannot be done once threads have been spawned. In the current code I get around this by only creating the threadpool the first time one of the thread-dispatched calls is made. I haven't yet figured out how to do this with Tokio.

If you've got some time and interest, then by all means, please take a shot at implementing it. I'd like to see it.

I'm going to keep looking into it myself, because if nothing else it's an excuse to learn this new framework.

wfraser (Owner) commented Jan 16, 2017

Actually, I got motivated enough to take another stab at it myself and fixed the issues I was having. Experimental branch is here: https://github.com/wfraser/fuse-mt/tree/futures

Would this be an improvement for you? This experimental code only adds futures for the four syscalls that the current code does on background threads (read, write, flush, fsync); I know for a network filesystem you'd really want all of them to be futures, and this is the direction I would ultimately take this.

LalitMaganti (Author) commented

Certainly this is an improvement.

I began mocking up a trait which uses futures for everything and doesn't need boxed futures either; once I finish it, I'll let you know.

BTW, the way I imagine the Filesystem trait working in the long run is as a layered system. Working from the inside out, the layers are:

  1. fuse-rust
  2. Callbacks -> futures provided by this library
  3. Path translation provided by this library
  4. Application implementation (can itself be layered into cache layers etc.)

wfraser (Owner) commented Jan 19, 2017

Yep, fuse-mt began as just such a layer in a filesystem I made (BackFS), and I realized it was general purpose enough that it would probably be useful to others. :)

I'm still iterating on my experimental branch, and one thing I'm not entirely sure of is who should own the executor: FuseMT or the client filesystem? I currently have FuseMT owning it, but I wonder if that design potentially limits filesystems from doing something...

Another thing is FuseMT's original purpose of running I/O heavy tasks on a threadpool. This design gets rid of the threadpool and moves it out into the client filesystem (where the returned futures can be ones made using futures-cpupool). I wonder if this is the right decision, or if perhaps FuseMT should provide the ability to optionally do this itself.

Finally, there's the issue of synchronization, which I'm punting on for now by only making the big 4 I/O heavy operations asynchronous. Ideally, all operations would be asynchronous, but then there's potentially the problem of having the operations get run out of order.

For example: in a normal filesystem, if you issue an unlink and an open on the same path at approximately the same time, the order is non-deterministic; if the open completes first, it will succeed, and if it completes second, it will fail. With FUSE in the mix (which has its own inode cache), if the kernel module does the open first and then the unlink, then passes them to FuseMT, but FuseMT completes the unlink first, then when the open completes it could fail to find the inode in the table. (Currently that means it would return EINVAL, because FUSE is never supposed to issue a request for an inode that doesn't exist in the table.) I could potentially make FuseMT behave correctly in the face of this asynchrony (which gets tricky), but I wonder whether the FUSE kernel module will behave strangely...

LalitMaganti (Author) commented

I think leaving the threading decisions for I/O-heavy tasks to the implementer is the correct thing to do. I also think that, for now, the reactor should be owned by FuseMT.

As for synchronization, I'm not sure, but I think we will need more users of the library before deciding exactly what to do there.

amosonn (Contributor) commented Feb 27, 2017

Just noting: you could actually run the reactor on the main thread as well. The I/O operation in fuse-rust happens in channel.receive, and the operation there is a read from a raw file descriptor. The same can be achieved within the reactor via mio::unix::EventedFd. I still need to do some more reading to see exactly how one binds events to a read becoming available, but this should be possible.

Assuming this, we need to replace the implementation of Session::run with another one. Luckily, in the only other part of the code where Session is used, inside Request, only its fields are accessed, so we can have an AsyncSession with a Session field. This might be a good idea for fuse-rust as well, as Session currently has both public members and methods, which is usually bad.

EDIT: Ok, apparently this was already discussed here zargony/fuse-rs#74.
