CroissantBuilder does not work on Windows machines #5546

Open
@zwouter

Description

Short description
A minimal CroissantBuilder snippet that loads a dataset in the Croissant format only seems to work on Linux.
The code snippet below correctly downloads and prepares a dataset on Colab and on WSL, but raises an error on Windows. Each platform was tested in a clean virtual environment.

Environment information

  • Operating System: Windows 11

  • Python version: 3.11.1

  • tensorflow-datasets/tfds-nightly version: tfds-nightly 4.9.6.dev202408050044

  • tensorflow/tf-nightly version: tensorflow 2.17.0

  • Does the issue still exist with the latest tfds-nightly package (pip install --upgrade tfds-nightly)?
    Yes
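
For anyone comparing the Windows, WSL, and Colab setups, the environment fields above can be gathered with a small standard-library script. This is only a sketch: the helper name `collect_environment` and the default package list are assumptions for illustration, not part of tensorflow-datasets.

```python
import platform
from importlib import metadata


def collect_environment(
    packages=("tfds-nightly", "tensorflow-datasets", "tensorflow", "tf-nightly", "mlcroissant"),
):
    """Return the OS, Python version, and installed package versions as a dict.

    Hypothetical helper: the package names are the ones mentioned in this issue.
    """
    info = {
        "os": f"{platform.system()} {platform.release()}",
        "python": platform.python_version(),
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in the active environment.
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(f"{key}: {value}")
```

Running it once in each environment gives directly comparable output to paste into the report.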

Reproduction instructions

import mlcroissant as mlc
import tensorflow_datasets as tfds

url = "https://huggingface.co/api/datasets/fashion_mnist/croissant"
builder = tfds.core.dataset_builders.CroissantBuilder(jsonld=url, file_format='array_record')
builder.download_and_prepare()

Link to logs
https://pastebin.com/fRrfn8jj
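
To produce a comparable log on another machine, the failing call can be wrapped so that the full traceback is written to a file. This is a minimal sketch using only the standard library; `run_and_capture` and the default file name `repro_log.txt` are hypothetical names chosen for illustration.

```python
import platform
import traceback


def run_and_capture(step, log_path="repro_log.txt"):
    """Run step(); on failure, write the platform string and traceback to log_path.

    Returns True if step() completed, False if it raised an exception.
    """
    try:
        step()
    except Exception:
        with open(log_path, "w", encoding="utf-8") as log:
            log.write(f"platform: {platform.platform()}\n")
            log.write(traceback.format_exc())
        return False
    return True
```

With the reproduction snippet above, this would be called as `run_and_capture(builder.download_and_prepare)`, and the resulting file can be attached to the issue.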

Expected behavior
A dataset builder is prepared such that I can use .as_data_source() later.

Labels: bug (Something isn't working)
