Reviewing this page, most of the feature selectors offered by scikit-learn are covered; however, it would be great to see Dask support RFECV as well. I'd like to send an RFECV.fit() call to a Dask cluster:
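import joblib
from sklearn.feature_selection import RFECV
from sklearn.model_selection import TimeSeriesSplit

# model, SCORING_METRIC, X_train and y_train are defined elsewhere in the script

# Perform RFECV feature selection
selector = RFECV(
    model,
    step=0.05,  # Remove 5% of features at each iteration
    min_features_to_select=5,  # Keep at least 5 features
    cv=TimeSeriesSplit(n_splits=3),
    scoring=SCORING_METRIC,
    verbose=0,
    n_jobs=-1,
)

# Use Dask to parallelize the feature selection
with joblib.parallel_backend('dask'):
    selector.fit(X_train, y_train)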
This fails with the following error:
'dict' object has no attribute 'estimator'
joblib.externals.loky.process_executor._RemoteTraceback:
"""
Traceback (most recent call last):
File "C:\Users\chalu\AppData\Local\Programs\Python\Python311\Lib\site-packages\joblib\_utils.py", line 72, in __call__
return self.func(**kwargs)
^^^^^^^^^^^^^^^^^^^
File "C:\Users\chalu\AppData\Local\Programs\Python\Python311\Lib\site-packages\joblib\_dask.py", line 131, in __call__
results.append(func(*args, **kwargs))
^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\chalu\AppData\Local\Programs\Python\Python311\Lib\site-packages\sklearn\utils\parallel.py", line 139, in __call__
return self.function(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\chalu\AppData\Local\Programs\Python\Python311\Lib\site-packages\sklearn\feature_selection\_rfe.py", line 46, in _rfe_single_fit
X, params=routed_params.estimator.fit, indices=train
^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'dict' object has no attribute 'estimator'
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Users\chalu\AppData\Local\Programs\Python\Python311\Lib\site-packages\pygad\pygad.py", line 1688, in cal_pop_fitness
fitness = self.fitness_func(self, sol, sol_idx)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "I:\nasty\Python_Projects\Stock_Options_Trading\DailyStockClassifierAndRegressors2025\1_1_class_train_v2_Matts.py", line 337, in fitness_func
selector.fit(X_train, y_train)
File "C:\Users\chalu\AppData\Local\Programs\Python\Python311\Lib\site-packages\sklearn\utils\validation.py", line 63, in inner_f
return f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^
File "C:\Users\chalu\AppData\Local\Programs\Python\Python311\Lib\site-packages\sklearn\base.py", line 1389, in wrapper
return fit_method(estimator, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
scores_features = parallel(
^^^^^^^^^
File "C:\Users\chalu\AppData\Local\Programs\Python\Python311\Lib\site-packages\sklearn\utils\parallel.py", line 77, in __call__
return super().__call__(iterable_with_config)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\chalu\AppData\Local\Programs\Python\Python311\Lib\site-packages\joblib\parallel.py", line 2007, in __call__
return output if self.return_generator else list(output)
^^^^^^^^^^^^
File "C:\Users\chalu\AppData\Local\Programs\Python\Python311\Lib\site-packages\joblib\parallel.py", line 1650, in _get_outputs
yield from self._retrieve()
File "C:\Users\chalu\AppData\Local\Programs\Python\Python311\Lib\site-packages\joblib\parallel.py", line 1754, in _retrieve
self._raise_error_fast()
File "C:\Users\chalu\AppData\Local\Programs\Python\Python311\Lib\site-packages\joblib\parallel.py", line 1789, in _raise_error_fast
error_job.get_result(self.timeout)
File "C:\Users\chalu\AppData\Local\Programs\Python\Python311\Lib\site-packages\joblib\parallel.py", line 745, in get_result
return self._return_or_raise()
^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\chalu\AppData\Local\Programs\Python\Python311\Lib\site-packages\joblib\parallel.py", line 763, in _return_or_raise
raise self._result
AttributeError: 'dict' object has no attribute 'estimator'
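
The failing line reads `routed_params.estimator.fit`, which only works if `routed_params` supports attribute access on its keys. The message says it arrived in the worker as a plain `dict`. The sketch below reproduces that exact message with the standard library only. `AttrDict` and `read_fit_params` are hypothetical stand-ins, not scikit-learn or joblib code, and the idea that something on the way to the worker rebuilds the container as a plain `dict` is an assumption, not something the traceback confirms.

```python
class AttrDict(dict):
    """Hypothetical stand-in: a dict whose keys are also readable as attributes."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


def read_fit_params(routed_params):
    # Same style of access as the failing line in the traceback.
    return routed_params.estimator.fit


# With the attribute-style container, the lookup succeeds.
routed = AttrDict(estimator=AttrDict(fit={"sample_weight": None}))
ok = read_fit_params(routed)

# Assumption: some layer rebuilds the mapping as a plain dict,
# which keeps the keys but drops attribute access.
degraded = dict(routed)
try:
    read_fit_params(degraded)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(ok)
print(error_message)
```

If this is what happens, the keys are still present in the worker (`degraded["estimator"]` works), so only the container type is lost, not the data.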