| HN Mirror

by Der_Einzige 846 days ago

Very annoying namespace conflict, since a package called "mamba" (a faster reimplementation of the Python conda package manager) had already existed for a while before this architecture was even dreamed up.

https://github.com/mamba-org/mamba

Beyond that, I'll care about an alternative to transformers when it shows superior performance with an open source 7B-34B model compared to transformer competitors. So far this has not happened.

jasonjmcghee 846 days ago

> Please don't complain about tangential annoyances—e.g. article or website formats, name collisions, or back-button breakage. They're too common to be interesting.

lpasselin 846 days ago

The Mamba paper shows significant improvements at all model sizes, up to 1B, the largest one tested.

Is there any reason why it wouldn't scale to 7B or more? Have they tried it?

samus 845 days ago

That's the issue - I keep hearing that it is beyond a small research group's budget to meaningfully train such a large model. You don't just need GPU time, you also need data. And just using the dregs of the internet doesn't cut it.

woadwarrior01 846 days ago

I use the former and have been experimenting with the latter. Fortunately, the contexts are separate enough that they never come up in the same sentence.

amelius 846 days ago

I was using mamba to install mamba the other day, when suddenly I had to run for a live mamba.

croes 846 days ago

While chewing a Mamba?

https://www.mamba.us/

scarmig 845 days ago

I had the exact same experience, and I was also using it for a web application powered by the Mamba web framework.
