Multi-label classification always returns empty results in run_classification.py example script #43116

### System Info

Dell workstation with an NVIDIA Titan Xp (12 GB of GPU memory), driver version 535.261.03, CUDA 12.2.
Ubuntu Linux 24.04, Python 3.12.

### Who can help?

I'm using the `run_classification.py` example script (in the `pytorch/text-classification` folder), but when I run it on multi-label data it always returns empty predictions.

The file `predict_results.txt` contains:

```
index	prediction
0	[]
1	[]
2	[]
3	[]
4	[]
5	[]
...
```

### Information

- [x] The official example scripts
- [ ] My own modified scripts

### Tasks

- [ ] An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)

### Reproduction

This is the command:

```
WANDB_DISABLED=true python examples/pytorch/text-classification/run_classification.py \
	--text_column_name text \
	--train_file train.json \
	--validation_file dev.json \
	--model_name_or_path google-bert/bert-base-uncased \
	--shuffle_train_dataset \
	--do_train \
	--output_dir out \
	--num_train_epochs 3 \
	--per_device_train_batch_size 96 \
	--per_device_eval_batch_size 96 \
	--do_predict \
	--test_file test.json \
	--overwrite_output_dir \
	--do_eval \
	--label_column_name labels
```

This is the format of the data:

```
[
  {
    "text": "case c-116/15: action brought on 6 march 2015 \u2014 european parliament ...",
    "labels": [
      "4359",
      "5181"
    ]
  },
  {
    "text": "case c-20/15 p: appeal brought on 19 january 2015 ...",
    "labels": [
      "1484",
      "5541",
      "889"
    ]
  },
...
]
```

Full data can be found here: https://dh-server.fbk.eu/test-lex/
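
For triage, it may help to know how many distinct labels the data contains and how many labels each example carries. The following is a minimal sketch, assuming the files are JSON arrays of `{"text", "labels"}` records as shown above; the helper names `load_records` and `summarize_labels` are hypothetical and not part of the example script.

```python
import json
from collections import Counter


def load_records(path):
    """Load a JSON array of {"text": ..., "labels": [...]} records."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def summarize_labels(records):
    """Return basic label statistics for a list of records."""
    counts = Counter(label for rec in records for label in rec["labels"])
    n = len(records)
    return {
        "examples": n,
        "distinct_labels": len(counts),
        "avg_labels_per_example": sum(counts.values()) / n if n else 0.0,
    }


# Two records copied from the sample above (text shortened).
sample = [
    {"text": "case c-116/15: ...", "labels": ["4359", "5181"]},
    {"text": "case c-20/15 p: ...", "labels": ["1484", "5541", "889"]},
]
stats = summarize_labels(sample)
print(stats)

# To run on the real training file named in the command above:
# print(summarize_labels(load_records("train.json")))
```

A large number of distinct labels combined with few labels per example would mean each label is rare, which is useful context when reading the empty predictions.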

### Expected behavior

The file `predict_results.txt` should contain the predicted labels for each test example, not an empty list on every row.
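
One possible explanation worth checking is the decoding step. This is an assumption and has not been verified against the script's source: if multi-label predictions are obtained by keeping every label whose sigmoid probability exceeds 0.5 (equivalently, whose logit is above 0), then a model whose logits are all negative produces an empty list. The sketch below illustrates that effect with hypothetical logits; `decode_multilabel` is an illustrative helper, not a function from the script.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_multilabel(logits, label_list, threshold=0.5):
    """Keep every label whose sigmoid probability exceeds the threshold."""
    return [
        label
        for logit, label in zip(logits, label_list)
        if sigmoid(logit) > threshold
    ]


label_list = ["1484", "4359", "5181"]  # labels taken from the sample data

# Hypothetical logits for one example, for illustration only.
# All are negative, so no probability exceeds 0.5.
logits = [-3.2, -2.7, -4.1]

empty = decode_multilabel(logits, label_list)
print(empty)

# With a much lower (assumed) threshold, the least negative logit passes.
relaxed = decode_multilabel(logits, label_list, threshold=0.05)
print(relaxed)
```

If this is what happens, inspecting the raw logits before thresholding would show whether the model is producing uniformly negative scores, for example because each label is rare and training was short.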
