by cdrake 2293 days ago
Interesting write-up. I'd be interested to see how it performs with k-fold cross-validation as well as shuffling. Kind of worried it's learning order or samples.

Thanks, I really appreciate it.

I'll try to get back to you with how it performs under k-fold cross-validation and shuffling.

I don't think it can be learning the order or samples because the train and test data sets are separated very early on. If it were learning order or samples of the training set it would have to perform very poorly on the test set.

This had been bugging me all day in the back of my head... turns out shuffle is enabled by default, both in sklearn and in tf.keras (and in the original Keras).
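
For anyone who wants to see what "k-fold with shuffling" amounts to, here is a rough, stdlib-only sketch. It is not the notebook's code: `shuffled_kfold_indices`, the fold count, and the seed are all made up for illustration. As far as I know, sklearn's `KFold` leaves `shuffle` off unless you ask for it (unlike `train_test_split`), so it may be worth passing `shuffle=True` explicitly when trying this with the real library.

```python
import random


def shuffled_kfold_indices(n_samples, n_splits=5, seed=0):
    """Yield (train_idx, test_idx) index lists for k-fold CV over shuffled indices.

    Hypothetical helper for illustration only; not part of sklearn or Keras.
    """
    if not 2 <= n_splits <= n_samples:
        raise ValueError("need 2 <= n_splits <= n_samples")

    indices = list(range(n_samples))
    # Shuffle once, before splitting, so folds do not follow the file order.
    random.Random(seed).shuffle(indices)

    # Spread the remainder over the first folds so sizes differ by at most 1.
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        stop = start + base + (1 if fold < extra else 0)
        test_idx = indices[start:stop]
        train_idx = indices[:start] + indices[stop:]
        yield train_idx, test_idx
        start = stop


folds = list(shuffled_kfold_indices(n_samples=10, n_splits=3, seed=42))
for train_idx, test_idx in folds:
    # No sample should ever land on both sides of a split.
    assert not set(train_idx) & set(test_idx)
    print(len(train_idx), len(test_idx))
```

Each sample ends up in exactly one test fold, so if the scores hold up across all folds it is much harder to argue the model is just picking up on sample order.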

On a separate note, I think there may be a source file missing in your notebook. I kept getting an error when trying to load "GSE87571_series_matrix.csv". Might just be me.

[sklearn ref](https://scikit-learn.org/stable/modules/generated/sklearn.mo...)

[tf.keras](https://www.tensorflow.org/api_docs/python/tf/keras/Model#fi...)