logloss issue with multiclass task

Hi HunterMcGushion,

I am working on a multiclass classification task and want to use `sklearn.metrics.log_loss` as the experiment metric, but I have run into a problem:

```
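from hyperparameter_hunter import Environment, CVExperiment
from sklearn import metrics
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold

# `HPHPATH` (results directory) and `df` (training DataFrame) are defined elsewhere
env = Environment(
    results_path=HPHPATH,
    train_dataset=df,
    target_column='Quality_label',
    metrics=['log_loss'],
#     metrics=dict(logloss=lambda y_true, y_pred: metrics.log_loss(y_true, y_pred, labels=[0,1,2,3])),
    cv_type=StratifiedKFold,
    cv_params=dict(n_splits=6, shuffle=True),
#     global_random_seed=seed
)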

experiment = CVExperiment(
    model_initializer=RandomForestClassifier,
    model_init_params=dict(
        n_estimators=20
    )
)
```

The target has four labels, 0 through 3. Running the code above raises a `ValueError`:
```
ValueError: y_true and y_pred contain different number of classes 4, 2.
```

If I pass `labels` to the log loss metric with `metrics=dict(logloss=lambda y_true, y_pred: metrics.log_loss(y_true, y_pred, labels=[0,1,2,3]))`, a different error is raised:

`ValueError: The number of classes in labels is different from that in y_pred.`
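For reference, a `ValueError` of this kind can be reproduced with scikit-learn alone, which may help narrow things down. The sketch below is only an illustration under assumptions: it uses synthetic 4-class data from `make_classification` in place of my real `Quality_label` data, and it does not involve `Environment` or `CVExperiment` at all. It scores the same kind of model twice, once with the hard labels from `predict` and once with the class probabilities from `predict_proba`.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

# Synthetic 4-class data standing in for the real target (an assumption)
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=6, n_classes=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

model = RandomForestClassifier(n_estimators=20, random_state=0)
model.fit(X_train, y_train)

# Hard labels: a 1-D array of class ids, not per-class probabilities
hard_labels = model.predict(X_test)
try:
    log_loss(y_test, hard_labels)
    hard_label_error = None
except ValueError as exc:
    hard_label_error = exc
print("predict():", hard_label_error)

# Class probabilities: shape (n_samples, 4), one column per class
probabilities = model.predict_proba(X_test)
proba_loss = log_loss(y_test, probabilities, labels=[0, 1, 2, 3])
print("predict_proba():", proba_loss)
```

In this sketch only the `predict_proba` call produces a loss value, and the exact error message for the `predict` case may differ between scikit-learn versions. That suggests the metric in my experiment might be receiving hard class predictions rather than probabilities, although whether that is what actually happens inside the experiment is a guess on my part.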

I checked the examples and previous issues such as #90. Has log loss been tested for multiclass tasks?


logloss issue with multiclass task #197
