

by mchiang 508 days ago

I’m one of the maintainers of Ollama.

It’s amazing to see others build on top of open-source projects. Forks like RamaLama are exactly what open source is all about. Developers with different design philosophies can still collaborate in the open for everyone’s benefit.

Some folks on the Ollama team have contributed directly to the OCI spec, so naturally we started with tools we know best. But we made a conscious decision to deviate because AI models are massive in size - on the order of gigabytes - and we needed performance optimizations that the existing approaches didn’t offer.

We have not forked llama.cpp. We are a project written in Go, so naturally we’ve built our own server-side serving in server.go. Now we are beginning to hit performance, reliability, and model support problems. This is why we have begun the transition to Ollama’s new engine, which will utilize multiple engine designs. Ollama is now naturally responsible for portability between different engines.

I did see the complaint about Ollama not using Jinja templates. Ollama is written in Go. I’m listening but it seems to me that it makes perfect sense to support Go templates.

We are only a couple of people, and building in the open. If this sounds like vendor lock-in, I'm not sure what vendor lock-in is?

You can check the source code: https://github.com/ollama/ollama


justinmayer 508 days ago

These comments would carry more merit if they weren’t coming from the very person who closed this pull request: https://github.com/jmorganca/ollama/pull/395

Those rejected README changes only served to provide greater transparency to would-be users, and here we are a year and a half later with woefully inadequate movement on that front.

I am very glad folks are working on alternatives.


threecheese 508 days ago

As an outsider (not an OSS maintainer, but a contributor), the decision not to merge was understandable imo: the maintainer had a strategy and the PR didn’t fit. They gave reasons why, really nicely, and even made a call to action for PRs placing architecture docs elsewhere. The tone of your response was disparaging, and the subsequent pile-on was counterproductive. All due respect to your experience as a maintainer; in that role, can you imagine seeing a contribution that you are not interested in and declining, forgetting, or being too busy to engage, expecting that it might get dropped or improved while you are busy with your own priorities?

Putting myself in your shoes, I can see why you might be annoyed at being ignored. That suggests this change is really important to you, so my question would be: why didn’t you follow the maintainer’s advice and add architecture docs?


zozbot234 508 days ago

These comments seem reasonable to me. Could you clarify the Ollama maintainers' POV wrt. the recent discussion of Ollama Vulkan support at https://news.ycombinator.com/item?id=42886680 ? Many people seem to be upset that this PR seems to have gotten zero acknowledgment from the Ollama folks, even with so many users being quite interested in it for obvious reasons. (To be clear, I'm not sure that the PR is in a mergeable state as-is, so I would disagree with many of those comments. But this is just my personal POV - and with no statement on the matter from the Ollama maintainers, users will be confused.)

EDIT: I'm seeing a newly added comment in the Vulkan PR GitHub thread, at https://github.com/ollama/ollama/pull/5059#issuecomment-2628... . Quite overdue, but welcome nonetheless!


creesch 508 days ago

Since you are one of the maintainers of Ollama, maybe you can help me answer a related question. It is great that the software itself is open source, but hosting the models must cost a fortune. I know this is funded by VC money, yet nowhere on the Ollama website or in the repository is there any mention of this. Why is that?

There isn't an about section, a tiny snippet in a FAQ somewhere, nothing.


mchiang 507 days ago

We partner with Cloudflare R2 to minimize the cost of hosting. Check out their pricing.

The website is so minimal right now because we have been focused on the GitHub repo.
