by ruborcalor 2292 days ago
Thanks, I really appreciate it.

I'll try to get back to you with the performance under k-fold cross-validation and shuffling.

I don't think it can be learning the order or the samples themselves, because the train and test sets are separated very early on. If it were learning the order or the samples of the training set, it would have to perform very poorly on the test set.
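To make the setup concrete, here is a rough sketch of what "separate the test set early, then do shuffled k-fold on the rest" could look like. It is not the notebook's code: `shuffled_kfold_indices`, the 20-sample size, the 4 folds, and the seeds are all made up for illustration, and it only uses the standard library.

```python
import random

def shuffled_kfold_indices(n_samples, n_splits, seed=0):
    """Yield (train_idx, val_idx) pairs over a shuffled index range (hypothetical helper)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, before cutting folds
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // n_splits + (1 if i < n_samples % n_splits else 0)
                  for i in range(n_splits)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size

# Hold out the test set first, then cross-validate on the remainder only.
all_idx = list(range(20))  # assumed toy dataset of 20 samples
random.Random(42).shuffle(all_idx)
test_idx, dev_idx = all_idx[:4], all_idx[4:]

folds = list(shuffled_kfold_indices(len(dev_idx), n_splits=4))
for train_pos, val_pos in folds:
    # Positions index into dev_idx, so test_idx is never touched here.
    train = [dev_idx[p] for p in train_pos]
    val = [dev_idx[p] for p in val_pos]
    assert not set(train) & set(val)
    assert not (set(train) | set(val)) & set(test_idx)
```

Because the folds are cut from `dev_idx` only, nothing the model sees during cross-validation can overlap with the held-out test samples.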

1 comment

This had been bugging me all day in the back of my head... turns out shuffle is enabled by default, both in sklearn and in tf.keras (and in the original Keras).
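A quick way to check the defaults yourself is to read them off the function signatures. The sklearn link in this comment is truncated, so which function it points to is unclear; the sketch below assumes `train_test_split` for illustration and also looks at `KFold`, since k-fold came up earlier. `shuffle_default` is just a throwaway helper name.

```python
import inspect

from sklearn.model_selection import KFold, train_test_split

def shuffle_default(obj):
    """Return the default value of the `shuffle` parameter of a callable."""
    return inspect.signature(obj).parameters["shuffle"].default

# Assumption: these two are the sklearn entry points relevant to the thread.
split_default = shuffle_default(train_test_split)
kfold_default = shuffle_default(KFold)

print("train_test_split shuffle default:", split_default)
print("KFold shuffle default:", kfold_default)
```

Worth noting: `train_test_split` shuffles by default, but `KFold` does not, so for k-fold you would need to pass `shuffle=True` explicitly. On the Keras side, `Model.fit` documents `shuffle=True` as its default, which shuffles the training data before each epoch.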

On a separate note, I think there may be a source file missing in your notebook. I kept getting an error when trying to load "GSE87571_series_matrix.csv". Might just be me.

[sklearn ref](https://scikit-learn.org/stable/modules/generated/sklearn.mo...)

[tf.keras](https://www.tensorflow.org/api_docs/python/tf/keras/Model#fi...)