HN Mirror

by blackkettle 4184 days ago

the stuff i work on is in the area of machine learning, so most published work involves one or more well-known data sets.

i would argue that the two are the same in this case.

the lxc containers provide all the source code i write (plus, of course, the compiled version), all third-party libraries, all scripts used to run and evaluate the experiments, and the data as well, where that is permitted.

it's still not perfect, but for my area, i honestly think it is the best and most accountable way of doing things that i have seen.


mcguire 4184 days ago

And hopefully one or more not-so-well-known, local data sets to check that the results are actually as claimed?


blackkettle 4184 days ago

well, the idea is that you should be able to run it on any data set you have and get good results relative to other solutions. but whether the results generalize is an open question with any research.

the point of the docker/lxc aspect is to provide a simple working environment to facilitate replication and validation.
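as a rough illustration of what validation could look like, here is a minimal sketch, assuming the experiment's code, scripts, and data sit in one directory tree. it is not the setup described above: the helper names (`sha256_of`, `build_manifest`, `changed_files`) are hypothetical, and it only checks that files are byte-identical, not that results match.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map every file under root (relative POSIX path) to its digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root: Path, manifest: dict[str, str]) -> list[str]:
    """List files that were added, removed, or modified since the manifest."""
    current = build_manifest(root)
    return sorted(
        name
        for name in set(manifest) | set(current)
        if manifest.get(name) != current.get(name)
    )
```

the author of an experiment could publish the output of `build_manifest` alongside the container, and anyone replicating the work could call `changed_files` on their copy to see whether they are running the same code and data.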

so in comparison to the status quo, which is basically 'write a paper, include some high-level equations and results', i think this is a step in a better direction.
