Add experimental replicate.use() function #438

Open: 38 commits to merge into main

aron
Contributor

@aron aron commented Jun 2, 2025

This PR introduces a new experimental use() function that is intended to make running a model closer to calling a function rather than an API request.

Some key differences to replicate.run():

  1. You "import" the model using the use() syntax; after that you call the model like a function.
  2. The output type matches the model definition, i.e. if the model uses an iterator, the output will be an iterator.
  3. File outputs will be returned as Path objects*.

[!NOTE]

* We've replaced the FileOutput implementation with Path objects. However, to avoid downloading files before they are needed, we've implemented a PathProxy class that defers the download until the first time the object is used. If you need the underlying URL of the Path object, you can use the get_path_url(path: Path) -> str helper.

Examples

To use a model:

Important

For now use() MUST be called in the top level module scope. We may relax this in future.

from replicate import use

flux_dev = use("black-forest-labs/flux-dev")
outputs = flux_dev(prompt="a cat wearing an amusing hat")

for output in outputs:
    print(output) # Path(/tmp/output.webp)

Models that implement iterators will return an iterator that yields partial output as the model does:

claude = use("anthropic/claude-4-sonnet")

output = claude(prompt="Give me a recipe for tasty smashed avocado on sourdough toast that could feed all of California.")

for token in output:
    print(token) # "Here's a recipe"

You can call str() on the output of a language model to get the complete output as a string rather than iterating over tokens:

str(output) # "Here's a recipe to feed all of California (about 39 million people)! ..."
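The iterate-or-stringify behaviour described above can be sketched without touching the network. This is only an illustration, not the PR's `OutputIterator`: the `TextOutput` class and its token source are hypothetical stand-ins, and the real implementation would pull tokens from a running prediction.

```python
from typing import Iterable, Iterator, List


class TextOutput:
    """Hypothetical stand-in for an iterator-style model output."""

    def __init__(self, tokens: Iterable[str]) -> None:
        self._source = iter(tokens)
        self._seen: List[str] = []  # tokens already pulled from the source

    def __iter__(self) -> Iterator[str]:
        # Replay tokens that were already consumed, then keep reading
        # from the underlying source until it is exhausted.
        index = 0
        while True:
            if index < len(self._seen):
                yield self._seen[index]
            else:
                try:
                    token = next(self._source)
                except StopIteration:
                    return
                self._seen.append(token)
                yield token
            index += 1

    def __str__(self) -> str:
        # str() drains the iterator and joins every token.
        return "".join(self)


output = TextOutput(["Here's", " a", " recipe"])
for token in output:
    print(token)
print(str(output))
```

Caching the consumed tokens is one possible design choice: it lets str() return the full text even after the caller has already iterated over part of the output.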

You can pass the results of one model directly into another; we'll do our best to make this work efficiently:

from replicate import use

flux_dev = use("black-forest-labs/flux-dev")
claude = use("anthropic/claude-4-sonnet")

images = flux_dev(prompt="a cat wearing an amusing hat")

result = claude(prompt="describe this image for me", image=images[0])

print(str(result)) # "This shows an image of a cat wearing a hat ..."

To create a prediction, rather than just getting output, use the create() method:

claude = use("anthropic/claude-4-sonnet")

prediction = claude.create(prompt="Give me a recipe for tasty smashed avocado on sourdough toast that could feed all of California.")

prediction.logs() # get current logs (WIP)

prediction.output() # get the output

You can access the underlying URL for a Path object returned from a model call by using the get_path_url() helper.

from replicate import use
from replicate.use import get_url_path

flux_dev = use("black-forest-labs/flux-dev")
outputs = flux_dev(prompt="a cat wearing an amusing hat")

for output in outputs:
    print(get_url_path(output)) # "https://replicate.delivery/xyz"

TODO

There are several key things still outstanding:

  1. Support for asyncio.
  2. Support for typing the return value.
  3. Support for streaming text when available (rather than polling).
  4. Support for streaming files when available (rather than polling).
  5. Support for cleaning up downloaded files.
  6. Support for streaming logs using OutputIterator.
  7. Support for getting the prediction ID and status.

@aron aron force-pushed the experimental-use-fn branch from 756da8f to ae1589f Compare June 2, 2025 15:57
README.md Outdated
> For now `use()` MUST be called in the top level module scope. We may relax this in future.

```py
from replicate import use
Member

Subtle ergonomic thing, but my instinct would be to call use directly from replicate, so the intention behind my code would be a bit more explicit. Would this also work?

import replicate

flux_dev = replicate.use("black-forest-labs/flux-dev")

Contributor Author

Intention is that both would work. I went with use because that's how we styled include() early on, but I take your point that replicate.use() makes sense in this context.

Contributor Author

I've updated the README to use the import replicate style.

```
claude = use("anthropic/claude-4-sonnet")

prediction = claude.create(prompt="Give me a recipe for tasty smashed avocado on sourdough toast that could feed all of California.")
Member

prediction = claude.create(...)

☝🏼 I like it!


images = flux_dev(prompt="a cat wearing an amusing hat")

result = claude(prompt="describe this image for me", image=images[0])
Member

When I pass images[0] here, what's happening under the hood? Is it using the existing replicate.delivery HTTPS URL of the output file?

Contributor Author

Yes, file outputs are now a thin wrapper around Path (see PathProxy) that will defer the download of the remote file to disk until the first time they are used.

To get the underlying URL, there is a get_path_url() helper that will return the remote URL of the file.

This is then used internally so that the file is never downloaded in cases like this where an output is just passed as an input to another model.

Feedback on this would be greatly appreciated. This and the OutputIterator are the two most complex pieces of this PR.
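The pass-through described in this reply (handing the remote URL to the next model instead of downloading the file) could look roughly like the sketch below. It is not taken from the PR: `RemoteFile`, `encode_inputs` and the injected `download` callable are hypothetical names used only to show the idea.

```python
from typing import Any, Callable


class RemoteFile:
    """Hypothetical file output: knows its URL, downloads only on request."""

    def __init__(self, url: str, download: Callable[[str], bytes]) -> None:
        self.url = url
        self._download = download

    def read_bytes(self) -> bytes:
        # Only this call fetches the data (here via the injected callable).
        return self._download(self.url)


def encode_inputs(value: Any) -> Any:
    """Recursively replace RemoteFile values with their remote URL."""
    if isinstance(value, RemoteFile):
        return value.url
    if isinstance(value, dict):
        return {key: encode_inputs(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [encode_inputs(item) for item in value]
    return value
```

Because the encoder reads only the `url` attribute, an output that is passed straight into another model never triggers a download in this sketch.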

README.md Outdated

> [!NOTE]

\* We've replaced the `FileOutput` implementation with `Path` objects. However to avoid unnecessary downloading of files until they are needed we've implemented a `PathProxy` class that will defer the download until the first time the object is used. If you need the underlying URL of the `Path` object you can use the `get_path_url(path: Path) -> str` helper.
Member

avoid unnecessary downloading of files until they are needed

This feels similar to the FileOutput situation from before, mainly in that it requires explanation to understand, and it feels like a kind of non-standard behavior that docs-skimming users and LLMs won't be able to accurately guess at.

What would feel like the most intuitive thing to an end user?

Am I correct in thinking that this "defer until accessed" downloading behavior and the get_path_url feature are two entirely separate concerns? If so, maybe they should be documented separately to avoid confusion.

Here's a crack at that:

Downloading output files

Output files are returned as Python Path objects. However, to avoid downloading output files before they are needed, we wrap them in a PathProxy class that defers the download until the first time the object is used.

Here's an example of how to download outputs to disk:

from pathlib import Path

from replicate import use

flux_dev = use("black-forest-labs/flux-dev")

images = flux_dev(prompt="a cat wearing an amusing hat")

# Save each image to disk by copying the bytes of the downloaded output
for i, image in enumerate(images):
    Path(f"cat_hat_{i}.png").write_bytes(image.read_bytes())

Accessing outputs as HTTPS URLs

All output files are automatically uploaded to Replicate's CDN, where they are available as HTTPS URLs for up to one hour before expiring.

To access the HTTPS URLs of your output files, use the get_path_url helper method:

import replicate

flux_dev = replicate.use("black-forest-labs/flux-dev")
outputs = flux_dev(prompt="a cat wearing an amusing hat")

for output in outputs:
    print(replicate.use.get_path_url(output)) # "https://replicate.delivery/xyz"

A few other things come to mind:

  • In some places you're calling the helper get_url_path, and in other places get_path_url. If we're using that, I think get_path_url probably makes more sense, as you're "getting the URL for the given Path™"
  • Seems a little weird to put the helper on use... I guess it helps to make it explicit as a helper method that you have to import. Otherwise it would be just a non-standard and undiscoverable mysterious method on the Path instance...

Contributor Author

Great feedback, thank you. I agree with both of the latter bullets, and the docs suggestions are much better.

The PathProxy is definitely similar to FileOutput, but without the completely new interface. I could be convinced to drop it from this version, but I'd love to solve the issue.

Some other things I considered.

  • A subclass of Path with an explicit .ensure() method, which is weird if the file doesn't exist yet.
  • A context manager to skip downloads. So if you had a large file you'd do something like:
    with skip_download():
       images = flux_dev(prompt="a cat wearing an amusing hat")
       result = claude(prompt="describe this image for me", image=images[0])
  • Not converting into Path at all and just using URLs and having that part of the API diverge from cog.
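For reference, the context-manager alternative from the list above could be built on a context variable. This is a minimal sketch under assumptions: `skip_download` is the name used in the comment, while `should_download` and the module-level variable are hypothetical helpers that the file-handling code would have to consult.

```python
from contextlib import contextmanager
from contextvars import ContextVar
from typing import Iterator

# Hypothetical flag consulted before any output file is fetched.
_skip_download = ContextVar("skip_download", default=False)


@contextmanager
def skip_download() -> Iterator[None]:
    """Disable downloads for the duration of the with-block."""
    token = _skip_download.set(True)
    try:
        yield
    finally:
        # Restore the previous value, so nested blocks behave correctly.
        _skip_download.reset(token)


def should_download() -> bool:
    """Return True unless the caller is inside skip_download()."""
    return not _skip_download.get()
```

A context variable rather than a plain global keeps the flag isolated between threads and asyncio tasks, which matters if an asyncio variant of use() is added later.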

Member

@zeke zeke left a comment

Looking good. Thanks for spelling it out. I left some comments for consideration.

@andreasjansson
Member

This is lovely!

PathProxy is a great solution, I think it will work for all the pipelines I've built.

It makes sense to start with only top-level include, but I imagine we'd want to relax that pretty soon. The main use case I have is pipelines where the user can select arbitrary trained LoRA models.

This is probably a later PR on top, but it'd be nice to denormalize the open api schema like I do here, which allows us to automatically construct tool use schemas for agents.

@aron
Contributor Author

aron commented Jun 3, 2025

PathProxy is a great solution, I think it will work for all the pipelines I've built.

Great! An alternative I thought of would be to use PathLike. I think we can then get away with something that looks like:

class URLPath:
  def __init__(self, url: str) -> None:
    self.url = url
  def __fspath__(self):
    # write path to disk
    return local_path

Then you can use it with Path or open as normal:

from pathlib import Path

output = model()
with open(output, "rb") as file:
  file.read()
  
p = Path(output)
p.read_bytes() ...

It's not quite the same, but there's a lot less magic involved. We could theoretically update cog to do the same thing, which would solve some of the issues we're seeing where the model also wants the input URL of the file rather than the data.
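A runnable version of the PathLike idea above might look like the following. It is a sketch under assumptions, not the class that landed in the PR: the name `LazyURLPath` and the injected `fetch` callable are hypothetical, and the temporary file is never cleaned up here.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable, Optional


class LazyURLPath(os.PathLike):
    """Hypothetical os.PathLike that downloads its file on first use."""

    def __init__(self, url: str, fetch: Callable[[str], bytes]) -> None:
        self.url = url
        self._fetch = fetch
        self._local_path: Optional[str] = None

    def __fspath__(self) -> str:
        # open(), Path() and os.fspath() all end up here.
        if self._local_path is None:
            data = self._fetch(self.url)
            # Naive suffix detection; a URL with a query string needs more care.
            suffix = os.path.splitext(self.url)[1]
            fd, path = tempfile.mkstemp(suffix=suffix)
            with os.fdopen(fd, "wb") as handle:
                handle.write(data)
            self._local_path = path
        return self._local_path


def fake_fetch(url: str) -> bytes:
    # Stand-in for an HTTP request so the example runs offline.
    return b"fake image bytes"


output = LazyURLPath("https://example.invalid/cat.webp", fake_fetch)
with open(output, "rb") as file:
    print(file.read())
print(Path(output).suffix)
```

In real use the `fetch` argument could be something like `lambda url: urllib.request.urlopen(url).read()`. Injecting it keeps the download logic out of the path object and makes the deferral easy to test.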

@andreasjansson
Member

I think the most important thing with how we handle paths is that they should transparently behave like paths both in python, when using tools like PIL, and on the command line when using things like ffmpeg. So being fairly eager to download when someone requests str(path) is probably important. We have a decent test bed in pipeline-examples that we could run through this.

@aron aron force-pushed the experimental-use-fn branch from 09f1ce5 to e8acdb2 Compare June 3, 2025 13:57
@aron
Contributor Author

aron commented Jun 3, 2025

@andreasjansson I've taken another pass at this and updated the README. We now have rudimentary typing support and I've replaced PathProxy with URLPath. The latter is an implementation of os.PathLike that defers the downloading of the underlying file until it's requested. This will be done when passing to a standard library method like Path() or open().

From some loose research the PathLike interface is supported by most libraries including PIL and python ffmpeg. I prefer it as an initial step because it's based on a common Python interface, isn't doing any wrapping and has no magic. Let's give it a test on the pipeline-examples repo.

@8W9aG

8W9aG commented Jun 3, 2025

This PR looks good to me, PathProxy seems like a smart and reasonable use of the deferring of downloads which is definitely something that will help. My only comment here would be on the TODOs, which I see as the following in use.py:

# TODO
# - [ ] Support text streaming
# - [ ] Support file streaming
# - [ ] Support asyncio variant

I think asyncio is the only one that affects user facing code, and the streaming support can be done under the hood in this library after this change goes in, am I correct here?

@aron
Contributor Author

aron commented Jun 3, 2025

@8W9aG I'm sorry, I've switched out PathProxy. Would you mind taking another look?

I think asyncio is the only one that affects user facing code, and the streaming support can be done under the hood in this library after this change goes in, am I correct here?

Yes, I think I have a solution here, so I will get this implemented soon. The current include interface doesn't support async yet, so I'm not sure this is blocking.

aron added 4 commits June 4, 2025 11:33
This introduces a new `use_async` keyword to `use()` that will return an
`AsyncFunction` instead of a `Function` instance that provides an
asyncio compatible interface.

The `OutputIterator` has also been updated to implement the
`AsyncIterator` interface as well as be awaitable itself.
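An object that can be both iterated with `async for` and awaited for the complete result, as this commit message describes, can be sketched as follows. The names `AsyncTextOutput` and `fake_tokens` are hypothetical and the token stream is simulated; the PR's own `OutputIterator` may be structured differently.

```python
import asyncio
from typing import Any, AsyncIterator, Callable, Generator, List


class AsyncTextOutput:
    """Hypothetical output that is async-iterable and awaitable."""

    def __init__(self, make_stream: Callable[[], AsyncIterator[str]]) -> None:
        self._make_stream = make_stream

    def __aiter__(self) -> AsyncIterator[str]:
        # `async for token in output` yields tokens one at a time.
        return self._make_stream()

    async def _collect(self) -> str:
        parts: List[str] = []
        async for token in self._make_stream():
            parts.append(token)
        return "".join(parts)

    def __await__(self) -> Generator[Any, None, str]:
        # `await output` resolves to the full text.
        return self._collect().__await__()


async def fake_tokens() -> AsyncIterator[str]:
    # Simulated token stream; a real one would poll or stream a prediction.
    for token in ["Here's", " a", " recipe"]:
        await asyncio.sleep(0)
        yield token


async def main() -> None:
    output = AsyncTextOutput(fake_tokens)
    async for token in output:
        print(token)
    print(await output)


asyncio.run(main())
```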
@aron aron force-pushed the experimental-use-fn branch from b5fa486 to a00b5d2 Compare June 4, 2025 10:33
return output


class OutputIterator[T]:
Contributor

FYI this syntax is Python 3.12+ only https://docs.python.org/3/reference/compound_stmts.html#type-params

Unfortunately you'll need to use the more verbose TypeVar('T') syntax as this library supports Python 3.8.
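For illustration, the Python 3.8-compatible spelling suggested here would look something like this. Only the class header is the point; the body is a placeholder and not the PR's implementation.

```python
from typing import Generic, Iterable, Iterator, TypeVar

T = TypeVar("T")


# Equivalent to `class OutputIterator[T]:` on Python 3.12+.
class OutputIterator(Generic[T]):
    def __init__(self, items: Iterable[T]) -> None:
        self._items = items

    def __iter__(self) -> Iterator[T]:
        return iter(self._items)
```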

5 participants