| HN Mirror

by eurekin 929 days ago

I just played with the 7B version. It really feels different from anything I've tried before. It could explain a Docker Compose file, and it generated a simple Vue application component.

I asked a few follow-up questions about the example, and it stayed strangely coherent and focused across the whole conversation. It was really good at detecting whether I was starting a new thread (without clearing the context) or referring back to something earlier.

It caught me off guard as well with this:

> me: What does following mean [content of the docker compose]

> cybertron-7b: In the provided YAML configuration, "following" refers to specifying dependencies

I've never seen a model quote my exact wording back to me in a conversation like that.

2 comments

mark_l_watson 929 days ago

How did you run it? Are there model files in Ollama format? Are you running on NVIDIA or Apple Silicon?

EDIT: just saw this: “Megatron (1, 2, and 3) is a large, powerful transformer developed by the Applied Deep Learning Research team at NVIDIA.”

brucethemoose2 929 days ago

My recommendation is:

- Exui with exl2 files on good GPUs.

- Koboldcpp with gguf files for small GPUs and Apple silicon.

There are many reasons, but in a nutshell, they are the fastest and most VRAM-efficient options.

I can fit 34Bs with about 75K context on a single 24GB 3090 before the quality drop from quantization really starts to get dramatic.
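For a rough sense of how a budget like that adds up, here is a back-of-the-envelope sketch in Python. Everything specific in it is an assumption for illustration, not something stated in this thread: the model shape (60 layers, 8 KV heads, head dimension 128), the 3.0 bits per weight, and the 8-bit KV cache. It also ignores activations and framework overhead, so real usage would be somewhat higher.

```python
def weights_gb(n_params: float, bits_per_weight: float) -> float:
    """Memory taken by quantized weights, in GB (10**9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                context_len: int, bytes_per_value: float) -> float:
    """Memory taken by the KV cache, in GB (10**9 bytes)."""
    # Keys and values (the factor of 2) are stored for every layer
    # and for every cached token.
    values = 2 * n_layers * n_kv_heads * head_dim * context_len
    return values * bytes_per_value / 1e9


# Hypothetical 34B-class model shape and quantization settings (assumed).
weights = weights_gb(34e9, bits_per_weight=3.0)
cache = kv_cache_gb(n_layers=60, n_kv_heads=8, head_dim=128,
                    context_len=75_000, bytes_per_value=1)

print(f"weights:  {weights:.2f} GB")
print(f"KV cache: {cache:.2f} GB")
print(f"total:    {weights + cache:.2f} GB of a 24 GB card")
```

Under these assumed numbers the weights and cache together come in a little under 24 GB, which is why pushing context further means dropping to fewer bits per weight.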

mark_l_watson 929 days ago

Thanks! I will check out Koboldcpp.

eurekin 929 days ago

In text-generation-webui, on an NVIDIA GPU.

whimsicalism 928 days ago

Your edit is entirely unrelated to this topic.

brucethemoose2 929 days ago

Yeah, the Yi version is quite something too.
