TypeError when using TimeSeriesKMeans from kmeans_plusplus requiring sample_weight argument #459

Open
inega opened this issue Jul 3, 2023 · 1 comment

inega commented Jul 3, 2023

I am currently having an issue with TimeSeriesKMeans (tslearn version 0.5.3.2, sklearn version 1.3.0, running on Windows, Python 3.11). I first hit the error when fitting the estimator on my own dataset, but managed to reproduce it with the first example on the documentation page:

from tslearn.clustering import TimeSeriesKMeans
from tslearn.generators import random_walks

X = random_walks(n_ts=50, sz=32, d=1)
km = TimeSeriesKMeans(n_clusters=3, metric="euclidean", max_iter=5, random_state=0).fit(X)

Traceback (most recent call last):

  Cell In[1], line 6
    km = TimeSeriesKMeans(n_clusters=3, metric="euclidean", max_iter=5, random_state=0).fit(X)

  File ~\.conda\envs\subsenv\Lib\site-packages\tslearn\clustering\kmeans.py:780 in fit
    self._fit_one_init(X_, x_squared_norms, rs)

  File ~\.conda\envs\subsenv\Lib\site-packages\tslearn\clustering\kmeans.py:629 in _fit_one_init
    self.cluster_centers_ = _kmeans_plusplus(

TypeError: _kmeans_plusplus() missing 1 required positional argument: 'sample_weight'

I think this might be a version issue. The kmeans_plusplus docs indicate that the sample_weight argument was introduced in version 1.3 of scikit-learn, which was just released. Downgrading to an earlier version will probably make the error go away, but I am reporting it anyway, since other people will start getting the same error at some point.
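One way to confirm the signature mismatch without reading the scikit-learn source is to inspect the callable directly. The sketch below uses only the standard library; the two `seeding_*` functions are hypothetical stand-ins that mimic the old and new call signatures implied by the error message, not the scikit-learn implementations.

```python
import inspect


def requires_argument(func, name):
    """Return True if `func` has a parameter `name` with no default value."""
    param = inspect.signature(func).parameters.get(name)
    return param is not None and param.default is inspect.Parameter.empty


# Hypothetical stand-ins: they only mimic the two signatures discussed in
# this issue and do no clustering at all.
def seeding_before_1_3(X, n_clusters, x_squared_norms, random_state):
    return None


def seeding_from_1_3(X, n_clusters, x_squared_norms, sample_weight, random_state):
    return None


print(requires_argument(seeding_before_1_3, "sample_weight"))  # False
print(requires_argument(seeding_from_1_3, "sample_weight"))    # True
```

In an environment with scikit-learn installed, the same check could be pointed at the private _kmeans_plusplus function that tslearn calls, assuming it is importable from sklearn.cluster._kmeans.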

inega added the bug label Jul 3, 2023

NimaSarajpoor commented Jul 4, 2023

if SKLEARN_VERSION_GREATER_THAN_OR_EQUAL_TO_1_3_0:
    sample_weight = _check_sample_weight(None, X, dtype=X.dtype)
    self.cluster_centers_ = _kmeans_plusplus(
        X.reshape((n_ts, -1)),
        self.n_clusters,
        x_squared_norms=x_squared_norms,
        sample_weight=sample_weight,
        random_state=rs,
    )[0].reshape((-1, sz, d))
else:
    self.cluster_centers_ = _kmeans_plusplus(
        X.reshape((n_ts, -1)),
        self.n_clusters,
        x_squared_norms=x_squared_norms,
        random_state=rs,
    )[0].reshape((-1, sz, d))

Hope it helps with the debugging process...

Apparently, this part was added recently.
See: a091483
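The thread does not show how the SKLEARN_VERSION_GREATER_THAN_OR_EQUAL_TO_1_3_0 flag is computed. As a rough sketch of one way such a gate could be derived from a version string (for example the value of sklearn.__version__), the helpers below compare numeric version tuples. The names `version_tuple` and `is_at_least` are hypothetical and are not taken from tslearn.

```python
import re


def version_tuple(version, length=3):
    """Turn a string such as '1.3.0' or '1.3.dev0' into a tuple of ints."""
    parts = []
    for piece in version.split(".")[:length]:
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(match.group()))
    # Pad with zeros so that '1.3' compares equal to '1.3.0'.
    parts += [0] * (length - len(parts))
    return tuple(parts)


def is_at_least(version, minimum):
    """Return True if `version` is greater than or equal to `minimum`."""
    return version_tuple(version) >= version_tuple(minimum)


print(is_at_least("1.3.0", "1.3.0"))   # True
print(is_at_least("1.2.2", "1.3.0"))   # False
print(is_at_least("1.10.0", "1.3.0"))  # True
```

Comparing tuples of integers rather than raw strings matters for versions such as 1.10.0, which sort before 1.3.0 as text. This simplified parser treats pre-release and development versions like the corresponding release, which may or may not be the desired behavior.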

I think the sample_weight parameter should be added to the .fit method (see KernelKMeans for reference).

inega changed the title from "TypeError when using TimeSeriesKMeans from kmeans_pluplus requiring sample_weight argument" to "TypeError when using TimeSeriesKMeans from kmeans_plusplus requiring sample_weight argument" Jul 4, 2023