
[BUG] FQE Loading .d3 errors #407

Closed
jdesman1 opened this issue Jul 30, 2024 · 1 comment
Labels
bug Something isn't working


jdesman1 commented Jul 30, 2024

Describe the bug
Perhaps related to #381, I also encounter an error when loading .d3 checkpoints for FQE. In particular, calling load_learnable raises NotImplementedError from:

d3rlpy/d3rlpy/ope/fqe.py, lines 74 to 77 at 3433de5

def create(self, device: DeviceArg = False) -> "_FQEBase":
raise NotImplementedError(
"Config object must be directly given to constructor"
)

I have also tried initializing fqe along the lines of:

fqe = FQE(algo=model, config=....)
fqe.build_with_dataset(dataset)
fqe.load_model("mymodel.d3")

which yields:

"..../serialization.py", line 1035, in _legacy_load
    raise RuntimeError("Invalid magic number; corrupt file?")
RuntimeError: Invalid magic number; corrupt file?

which is odd, since I save and load .d3 files for other d3rlpy models all the time without an issue. I'm happy to run a modified load_learnable if such a workaround can be made.

Is there an intended workaround for this?

jdesman1 added the bug label on Jul 30, 2024

takuseno (Owner) commented Aug 4, 2024

@jdesman1 Hi, thanks for the issue. Currently, FQE does not support .d3 format loading. The workaround would be:

# During training
fqe.save_model("fqe.pt")  # save as PyTorch model

# When loading
fqe.load_model("fqe.pt")

This .pt-format model is not saved during training by default. If you need to save checkpoints at every epoch during training, you can use the epoch_callback option:

def callback(algo, epoch, total_step):
    algo.save_model(f"{epoch}_model.pt")

fqe.fit(..., epoch_callback=callback)
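The callback above can be exercised in isolation without a trained model. Here is a minimal self-contained sketch, using a hypothetical StubAlgo in place of a real FQE instance (real d3rlpy algos expose save_model(path) for PyTorch-format checkpoints; the stub, directory, and filenames are illustrative only):

```python
import os
import tempfile

# Hypothetical stand-in for an FQE/algo object. A real d3rlpy algo's
# save_model() writes a PyTorch checkpoint; here we just write a marker file.
class StubAlgo:
    def save_model(self, path):
        with open(path, "w") as f:
            f.write("checkpoint")

tmpdir = tempfile.mkdtemp()

def callback(algo, epoch, total_step):
    # Save a .pt checkpoint at every epoch, as in the workaround above.
    algo.save_model(os.path.join(tmpdir, f"{epoch}_model.pt"))

# Simulate three training epochs invoking the callback,
# as fit(..., epoch_callback=callback) would.
algo = StubAlgo()
for epoch in range(1, 4):
    callback(algo, epoch, total_step=epoch * 100)

print(sorted(os.listdir(tmpdir)))  # → ['1_model.pt', '2_model.pt', '3_model.pt']
```

With a real FQE instance you would then restore any of these checkpoints via fqe.load_model("3_model.pt") after fqe.build_with_dataset(dataset), avoiding the .d3 path entirely.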

@jdesman1 jdesman1 closed this as completed Aug 5, 2024