by lenzm
1731 days ago
I don't think this is very insightful. Using the first half of each book as training data and the second half as testing data still trains the model specifically for these texts and authors. That's not quite as bad as testing on the training data, but it's not great.
For this experiment, I was primarily interested in whether the approach would work at all. My assumption is that if we can optimise for these texts, we can optimise for more representative datasets, too. Perhaps you think this is a weak assumption?
Do you think testing on a sample of totally different texts from different authors would be more convincing?
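To make the two setups concrete, here is a rough sketch of both splits. The toy corpus, the function names `split_within_books` and `split_by_author`, and the `test_fraction` and `seed` parameters are all made up for illustration; none of this is the code or data from the original experiment.

```python
import random

# Hypothetical toy corpus of (author, title, text) records, for illustration only.
corpus = [
    ("Author A", "Book 1", "one two three four five six"),
    ("Author B", "Book 2", "red green blue cyan pink grey"),
    ("Author C", "Book 3", "cat dog bird fish frog newt"),
    ("Author D", "Book 4", "north south east west up down"),
]


def split_within_books(books):
    """First half of every book goes to training, second half to testing."""
    train, test = [], []
    for author, title, text in books:
        words = text.split()
        mid = len(words) // 2
        train.append((author, title, " ".join(words[:mid])))
        test.append((author, title, " ".join(words[mid:])))
    return train, test


def split_by_author(books, test_fraction=0.5, seed=0):
    """Hold out whole authors, so no test author is seen in training."""
    authors = sorted({author for author, _, _ in books})
    rng = random.Random(seed)
    rng.shuffle(authors)
    n_test = max(1, int(len(authors) * test_fraction))
    test_authors = set(authors[:n_test])
    train = [b for b in books if b[0] not in test_authors]
    test = [b for b in books if b[0] in test_authors]
    return train, test


def author_set(books):
    """Collect the distinct authors in a list of records."""
    return {author for author, _, _ in books}


within_train, within_test = split_within_books(corpus)
held_train, held_test = split_by_author(corpus)

# Within-book split: every test author also appears in training.
print(sorted(author_set(within_train) & author_set(within_test)))
# Held-out-author split: the overlap is empty.
print(sorted(author_set(held_train) & author_set(held_test)))
```

With the first split, the model has already seen every author and book it is tested on; with the second, the test set only contains authors it has never seen, which is the stricter check being asked about.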