| HN Mirror



	by ilaksh 797 days ago
	I think the meat of this is the text-to-image model. I hope you will upgrade to a leading-edge model like DALL-E 3, Imagen 2, or SD 3 (when available) if you have not already. If the project is using an older model, that upgrade will dramatically improve how well the output portrays the virtual artist's intended vision.

1 comments

jamez 797 days ago

The text-to-image model is an important component, but the current model in use is IMO good enough. My view for this project is that the internal monologue is more important than the output, so my wish is instead for a better open-weight LLM.

ilaksh 797 days ago

Which text-to-image model and which LLM are you using?

jamez 797 days ago

LLM: Mixtral-8x7B
Text-to-image: one of the leading commercial models, whose TOS I may or may not be violating.

ilaksh 797 days ago

Mixtral is great. I assume you saw DBRX and the new, larger Mixtral release, both of which came out in the last few days.

jamez 797 days ago

I did! I want to switch to Mixtral-8x22B, time permitting. During the development of Stream of Consciousness I already swapped LLMs twice. This space is moving incredibly fast.
