by davidy123 848 days ago
You don't seem willing to share how you did anything; you only draw attention to your work. In the Reddit thread, several people asked about your 'talk like a pirate' training, and you never responded. In this thread, you imply you'll talk about how you used this visualization in your training, yet you never do.

I’ve gone into a fair amount of detail on the visualization in the README of my repo. Its main utility is detecting when individual layers are being overfit.

There are some specifics about OpenPirate that I’m not at liberty to share at the moment, but those are unrelated to this visualization. I’ve published the model weights under a permissive license, and I hope to publish more of the training code in the future.

If you have any questions about how to use the code in my NeuralFlow repo, just ask.

OK, sorry if I missed that, but a direct link here or there would help, since a number of people asked the same thing. I followed a link on Reddit to your Hugging Face page, and the README there doesn't go into specifics[1].

1. https://huggingface.co/valine/OpenPirate/blob/main/README.md

Yeah, I apologize; a lot of the information is scattered across threads right now. I should have spent more time compiling everything in one place.

This comment chain in particular might have some of what you’re looking for:

https://www.reddit.com/r/LocalLLaMA/comments/1ap8mxh/comment...

Other relevant threads to put it all in one place:

https://www.reddit.com/r/LocalLLaMA/comments/198x01d/openpir...

https://www.reddit.com/r/LocalLLaMA/comments/19a5hdx/morehum...

https://www.reddit.com/r/LocalLLaMA/comments/1apz94o/neuralf...

https://github.com/valine/NeuralFlow/blob/master/README.md

The one thing I don’t talk about is the specifics of the instruction generalization, which unfortunately I’m not able to share, even though I very much want to.

I don't think you should be apologising; there is always room for improvement. Nice work!