Hacker News new | ask | show | jobs
by esha_manideep 845 days ago
These models will are compatible with llama.cpp out of the box, we (GigaML - https://gigaml.com) are planning to train a small model (3-4B, 1-bit, opensource) with the latest stack-v2 dataset released today. Let me know if anyone is interested in collaborating with us.
2 comments

I'm interested in collaborating. For example, from the comments it occurred to me that a 128-bit SIMD register can contain 64 2-bit values. It seems straightforward that SIMD bitwise logical operations could be used in training such models.
Highly interested in collaborating – got a bunch of proprietary legal data already pre-sorted and labeled for various scenarios. I've already benchmarked legal use-cases (i.e. legal speciality, a few logic-based questions, and specific document creation) with various LLMs – so would love to see what benchmarks this can produced compared to early Mistral or Llama.

Let me know what's the best way to reach out!