by 3-cheese-sundae 915 days ago
Ah yes, Conda, definitely something else I've heard of.
3 comments

It's extremely common to manage Python environments with conda (although it can do much more). If you are unaware of conda, it is unlikely you work with Python, and therefore unlikely to be doing much with ML (and LLMs) anyway - it's even part of the "getting started" documentation for PyTorch.
Conda has been around for a decade and it used to be the primary package manager for everything related to numpy/scipy. Most ML and data science people have heard of it even if they haven't used it.
Conda is the latest LLM cli frontend that's a MOE of Mistral 7B, LLama 17B, Falcon 32C, and the Yamaha YZ50 quad bike.
Mamba is a PoC of the latest SSM architecture for LLMs named S6 and is a dense counterpart to Transformers trained for 300B tokens of the Pile in sizes up to 2.7B. Mamba proves that S6 LLMs train faster, run faster, use less VRAM, result in lower perplexity and better benchmark scores with the same exact training data.

That is actually accurate but probably sounds just as outlandish.

The approachable version is: Mamba is a proof-of-concept language model that showcases a new LLM architecture called S6, a competitor to the Transformer architecture (the 'T' in ChatGPT), and it is better in every measurable way.

> and the Yamaha YZ50 quad bike.

Well played.