Hacker News
by Smerity 1540 days ago
I just want to note that the replies to this thread are excessively dismissive and toxic. You may not agree with the wording of their advertising ("world's most powerful NLP toolkit" is marketing speak, sure), but going from that to implying the technical side is "only Min-GPT" is tremendously weird. As someone who works in machine learning, and specifically on language models, I'm keeping an eye on this team.

For anyone who wanted more technical discussion re: ML / LM, I've appreciated the recent technical write-ups from @kipperrii (ML ops @ Cohere). The author notes this work "[does] not reflect the architectures or latencies of my employer's models", i.e. it's an exploratory technical breakdown of general model characteristics:

- Transformer Inference Arithmetic: https://carolchen.me/blog/transformer-inference-arithmetic/

- Breakdown of H100s for Transformer Inferencing: https://carolchen.me/blog/h100-inferencing/

1 comment

They should put that content on their website. I also thought that the comments were a bit harsh but then visited the site and was immediately put off myself. They have a really great team and could do a great job of conveying that through content.