by fourside
548 days ago
Sorry to sound like a party pooper but this project gives off strong “fake it till you make it” vibes. Most AI projects I’ve seen share some type of information on how they work, yet this is completely devoid of it. Is this a new approach to mesh generation or is it using existing tooling? Then you’ve got the “we think it’s really good” line when it’s really just you. Like, why the hand-waviness, and why the use of “GPT” when it doesn’t apply? There’s just something a bit off about this. Maybe it’s all fine but the lack of information doesn’t help.
The new version of BlenderGPT (let's call this v2) doesn't use any autoregressive token prediction for the actual mesh generation part, so I understand why it sounds dishonest. I really just chose to stick with the name because artists really didn't seem to care about how the meshes are generated, and the term GPT became closely associated with AI.
As for the technical stuff, I've been working on BlenderGPT v2 for the past several months, and until a week ago, I had been using a custom pipeline I built by borrowing and re-implementing bits of Unique3D (https://wukailu.github.io/Unique3D/) and combining them with optimized models (flow matching diffusion models etc.) for intermediate steps such as text-to-image generation. My optimizations reduced inference time from >2 minutes to only about 20 seconds. This is the model used in this demo I shared: https://x.com/gd3kr/status/1853645054721606100
And then Microsoft released Trellis (https://github.com/microsoft/TRELLIS), and it seemed to leapfrog my model's capabilities on most things. Integrating it into the pipeline wasn't too hard, so I went forward with it.
All of this is just to say that there really was a lot of effort put into the core pipeline, and the landing page was mostly an afterthought. I'm actively working on a more comprehensive one that covers all the points I talked about.