I'm using a custom RLDS dataset that includes a language_instruction key (a string) alongside standard fields like action and observation.proprio. During finetuning, I run into this error:
tensorflow.python.framework.errors_impl.UnimplementedError: Cast string to float is not supported
This seems to happen during the dataset statistics computation step, where Octo appears to compute statistics for all fields, including string fields that cannot be cast to float.
Is there a recommended way to handle multimodal inputs like this (especially language), so that fields which don't need normalisation are skipped during this phase?
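To make the request concrete, here is roughly the behaviour I would expect: statistics are computed only for an explicit list of numeric fields, and string fields are left untouched. This is a plain-Python sketch and not Octo's code; `STATS_KEYS`, `compute_statistics`, and the flat step layout are hypothetical and only for illustration.

```python
import math

# Hypothetical allowlist: only these fields get normalised, so only
# these need statistics. String fields are never touched.
STATS_KEYS = ("action", "proprio")


def compute_statistics(steps, keys=STATS_KEYS):
    """Return per-dimension mean and population std for the listed fields."""
    stats = {}
    for key in keys:
        rows = [step[key] for step in steps]
        n = len(rows)
        dims = len(rows[0])
        mean = [sum(row[d] for row in rows) / n for d in range(dims)]
        std = [
            math.sqrt(sum((row[d] - mean[d]) ** 2 for row in rows) / n)
            for d in range(dims)
        ]
        stats[key] = {"mean": mean, "std": std}
    return stats


# Toy steps with a string field alongside numeric ones (made-up values).
steps = [
    {"action": [0.0, 1.0], "proprio": [2.0], "language_instruction": "pick up the cup"},
    {"action": [2.0, 3.0], "proprio": [4.0], "language_instruction": "pick up the cup"},
]

stats = compute_statistics(steps)
print(sorted(stats))  # language_instruction is not among the keys
```

With an allowlist like this, the string field never reaches any float cast, which is the step that fails in my run.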