Inside the for i in range(1,6): loop you have model = lm.fit(X_current, y), so the model is fit on all of the data, including the rows that would otherwise be held out as X_test. Shouldn't the model be trained only on the training data? And then we would call cross_val_score() with the model and the X values?
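
To make the question concrete, here is a minimal sketch of the workflow I have in mind. It makes several assumptions that are not in the original post: the data is synthetic, X_current is taken to be the first i columns of X, and names such as mean_cv_scores, best_k and test_r2 are made up for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic data, purely illustrative (not the data from the post).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.5, -2.0, 0.0, 0.5, 3.0]) + rng.normal(scale=0.1, size=200)

# Hold out a test set before any fitting or model selection.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

lm = LinearRegression()

mean_cv_scores = []
for i in range(1, 6):
    # Assumption: X_current means "the first i features".
    X_current = X_train[:, :i]
    # Cross-validate on the training rows only; the test rows are never seen.
    scores = cross_val_score(lm, X_current, y_train, cv=5)
    mean_cv_scores.append(scores.mean())

# Pick the feature count with the best mean CV score (R^2 by default here).
best_k = int(np.argmax(mean_cv_scores)) + 1

# Fit once on the training data, then score on the untouched test set.
model = lm.fit(X_train[:, :best_k], y_train)
test_r2 = model.score(X_test[:, :best_k], y_test)
```

As far as I understand, cross_val_score() clones the estimator and refits it on each training fold, so an earlier lm.fit() call should not affect the cross-validation scores themselves. The leakage I'm asking about would only matter if the model fit on all of the data is later evaluated on X_test.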