by fourside 548 days ago
Sorry to sound like a party pooper but this project gives off strong “fake it till you make it” vibes. Most AI projects I’ve seen share some type of information on how they work, yet this is completely devoid of it. Is this a new approach to mesh generation or is it using existing tooling? Then you’ve got the “we think it’s really good” line when it’s really just you. Like, why the hand waviness, the use of “GPT” when it doesn’t apply. There’s just something a bit off about this. Maybe it’s all fine but the lack of information doesn’t help.

Understandable. For context, the GPT in the name comes from an earlier version of this project (https://github.com/gd3kr/blendergpt) which actually used GPT-4 to write Python scripts that Blender would then execute. This allowed GPT-4 to program operations like instantiating primitives with the Blender Python API given only a text prompt (e.g. "create 50 cubes").
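
To give a rough idea of what such a generated script could look like for a prompt like "create 50 cubes", here is a minimal sketch. It is an illustration under assumptions, not code taken from the blendergpt repository: the helper names `grid_positions` and `create_cubes` and the grid layout are hypothetical, while `bpy.ops.mesh.primitive_cube_add` is the Blender Python API operator for adding a cube.

```python
import math


def grid_positions(count, spacing=3.0):
    """Return (x, y, z) locations that lay out `count` objects on a square grid.

    Hypothetical helper: the grid layout is an assumption made for this sketch.
    """
    side = math.ceil(math.sqrt(count))  # smallest square grid that fits `count`
    return [
        ((i % side) * spacing, (i // side) * spacing, 0.0)
        for i in range(count)
    ]


def create_cubes(count, size=2.0):
    """Add `count` cubes to the current scene. Only works inside Blender."""
    import bpy  # only importable in Blender's bundled Python interpreter

    for location in grid_positions(count):
        # Blender operator that adds a cube primitive at the given location.
        bpy.ops.mesh.primitive_cube_add(size=size, location=location)
```

Inside Blender's scripting tab, calling `create_cubes(50)` would add the cubes; `grid_positions` has no Blender dependency, so the layout can be checked in plain Python.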

The new version of BlenderGPT (let's call this v2) doesn't use any autoregressive token prediction for the actual mesh generation part, so I understand why it sounds dishonest. I really just chose to stick with the name because artists really didn't seem to care about how the meshes are generated, and the term GPT became closely associated with AI.

As for the technical stuff, I've been working on BlenderGPT v2 for the past several months, and until a week ago, I had been using a custom pipeline I built by borrowing and re-implementing bits of Unique3D (https://wukailu.github.io/Unique3D/) and combining them with optimized models (flow-matching diffusion models, etc.) for intermediate steps (text-to-image generation). My optimizations reduced inference time from >2 minutes to only about 20 seconds. This is the model used in this demo I shared: https://x.com/gd3kr/status/1853645054721606100

And then Microsoft released Trellis (https://github.com/microsoft/TRELLIS), and it seemed to leapfrog my model's capabilities on most things. Integrating it into the pipeline wasn't too hard and so I went forward with it.

All of this is just to say that there really was a lot of effort put into the core pipeline, and the landing page was mostly an afterthought. Actively working on a more comprehensive one that covers all the points I talked about.

The problem with Trellis is that it insists on generating textures that are already illuminated. Is there a way to exclude lighting?
What did you use for the 2D loading images? This one is really nice: https://blendergptv2-jobs.s3.us-east-2.amazonaws.com/generat...
lol at the gearing on the front wheel and the whole frame being backwards. Also no pedals or crank arms, the artwork is quite nice though
The backwards drivetrain/steering is kind of fascinating to consider. I'd love to see someone like Colin Furze or Stuff Made Here actually make one to try it out. What would it be like to ride a bike that steered by pivoting the back wheel?
Would be interesting videos. Makes me think of how you have to maneuver a shopping cart when pushing it backwards, as the rear wheels are fixed and the front wheels rotate. At high speed it would be dicey, too easy to oversteer.
Have you never seen a front-wheel drive rear-steer mono-pedal bike before? /s
why does it matter how it works? Either it works and people pay for it or it doesn't. Does every company owe you, the end user, an explanation on how their product works? While you're at it, maybe you can get all the secret recipes.
Well, because we're curious and this is a place where curious critical technology enthusiasts gravitate. If it doesn't do anything novel _at_all_ or if there's no story to elaborate on, go to Reddit.

Plus, many are probably tired of seeing the same thing being made repeatedly that just proxies requests to ChatGPT and makes them look pretty.

I'm curious: don't you think the aggregate interest of the HN crowd is adequately measured via the voting mechanism? You seem not to find BlenderGPT, as presented in its current form, interesting, but if you accept that (voting up)=interest, many other people did. Why dismiss (with the "go to Reddit" comment) someone else's work that, evidently, many other HNers find interesting?
I didn't dismiss anyone's work, and I do find the upvote system to, at least in some cases, adequately represent the level of interest on HN.

The question was: > why does it matter how it works?

and that's all my comment was intended to answer. Many people here are interested both in the idea of doing something enough to upvote AND are curious how something works. We're not necessarily just consoomers, we're often interested in details, but if I was buying something and wanted to know why I should, the maker should probably be able to answer why their thing is special; in this case, I'm just saying that people on HN are generally interested in how things work.

Sure, HNers are interested in explanations of how things work (I am too!).

But you specifically said that without such an explanation, products should "go to Reddit" (which presumably means, they don't belong on HN). I'll leave whether that's a "dismissal of someone's work" or not up to you, but all I'm saying is: it's evident via voting that many HNers find BlenderGPT, a tech product, interesting, even with the lack of that explanation. And so BlenderGPT does not need to "go to Reddit".

> But you specifically said that without such an explanation, products should "go to Reddit" (which presumably means, they don't belong on HN)

I didn't imply anything about BlenderGPT at all, I just responded to a comment. Reddit is both an advertising platform for products of all kinds, and a conversation platform for broader categories of audiences, whereas SHOW HN is like a "here's my project/product, I hope you find it interesting, and here's a chance to ask me about it". If someone posts a Show HN, it's fair to assume that if people find it interesting, they'll ask how it works, because we're going to be curious, and if a person is hypothetically not prepared for that, Show HN might not be the best place to post it. I didn't say any of that was true or false regarding BlenderGPT, it was just a general remark.

> why does it matter how it works? Either it works and people pay for it or it doesn't.

It's Hacker News, not AliExpress

I think that's a fair point that not every company owes the end user a recipe for how to reproduce their product.

However, it's also a fair question on Hacker News. Again, fair if they chose not to answer it, but many people here are programmers.

Since they explained that they used an open source model and system (https://github.com/Microsoft/TRELLIS), it will be possible for other developers who want to start similar businesses to launch basic competitors within a week or so, if they are ambitious about it.

I spent about 10 minutes with my agent running Claude 3.5 Sonnet New and generated most of the core code already: https://github.com/runvnc/img2blender

Although I haven't tested that and don't actually know if it will work.

> why does it matter how it works?

So we don't get another Theranos grift if this eventually raises money from private investors?

Big difference since this product appears to demonstrate that it does work.
> Like, why the hand waviness, the use of “GPT” when it doesn’t apply.

While recognizing your earlier complaint of not having details of how it works, is there some reason to think it doesn't work using a generative pre-trained transformer? If we had to make an assumption about how it works, that would be my assumption. It is the go-to tool for these types of problems.