Hacker News
by LarsDu88 930 days ago
As a machine learning engineer who dabbles with Blender and hobby gamedev, this is pretty impressive, but not quite to the point of being useful in any practical manner (as far as the limited furniture examples are concerned).

A competent modeler can make these types of meshes in under 5 minutes, and you still need to seed the generation with polys.

I imagine the next step will be to have the seed generation controlled by an LLM, and to start adding image models to the autoregressive parts of the architecture.

Then we might see truly mobile game-ready assets!

9 comments

> A competent modeler can make these types of meshes in under 5 minutes.

I don't think this general complaint about AI workflows is that useful. Most people are not a competent <insert job here>. Most people don't know a competent <insert job here> or can't afford to hire one. Even something that takes longer than a professional would, at worse quality, is for many things better than _nothing_, which is the realistic alternative for most people who would use something like this.

> I don't think this general complaint about AI workflows is that useful

Maybe not to you, but it's useful if you're in these fields professionally, though. The difference between a neat hobbyist toolkit and a professional toolkit has gigantic financial implications, even if the difference is minimal to "most people."

Linux vs Unix. Wikipedia vs Britannica. GCC vs Intel compiler. Good enough free hobby toy beats expensive professional tools given enough hobbyists.
First, we're talking about the state of the technology and what it can produce, not the fundamental worthiness of the approach. Right now, it's not up to the task. In the earliest phases of those technologies, they also weren't good enough for professional use cases.

Secondly, the number of hobbyists only matters if you're talking about hobbyists that develop the technology-- not hobbyists that use the technology. Until those tools are good enough, you could have every hobbyist on the planet collectively attempting to make a Disney-quality character model with tools that aren't capable of doing so and it wouldn't get much closer to the requisite result than a single hobbyist doing the same.

1. they don't beat them outright. It's simply more accessible.

2. those "hobbyists" in all examples are in fact professionals now. That's why they could scale up.

Blender is another good example
Blender was a professional tool from the start. The company behind it went insolvent ... and with crowdfunding the source could be freed.
Right-- being open source doesn't automatically mean it's an amateur tool or has its roots in a collective hobbyist effort.
Is the target market really "most people," though? I would say not. The general goal of all of this economic investment is to improve the productivity of labor--that means first and foremost that things need to be useful and practical for those trained to make determinations such as "useful" and "practical."
Millions of people generating millions of images (some of them even useful!) using Dall-E and Stable Diffusion would say otherwise. A skilled digital artist could create most of these images in an hour or two, I’d guess… but ‘most people’ certainly could not, and it turns out that these people really want to.
Are those millions of people actually creating something of lasting value, or just playing around with a new toy?
Is there a problem with the latter?
A lot, but how many people will start with the latter but find themselves (capable of) doing the former?
They're creating lasting value for themselves. Or don't you think they should be allowed to?
>Most people don't know a competent <insert job here> or can't afford to hire one

May be relevant in the long run, but it'll probably be 5+ years before this is commercially available. And it won't be cheap either, so out of the range of said people who can't hire a competent <insert job here>

That's why a lot of this stuff is pitched to companies with competent people instead of offered as a general product to download.

Is there a reason to expect it'd be significantly more expensive than current-gen LLMs? Reading the "Implementation Details" section, this was done with GPT2-medium, and assuming running it is about as intensive as the original GPT2, it can be run (slowly) on a regular computer, without a graphics card. Seems reasonable to assume future versions will be around GPT-3/4's price.
Agreed! There's also no way this is 5 years away from being viable.

I just checked the timestamps on my Dall-E Mini generated images. They're dated June 2022.

This is what people were doing on commodity hardware back then:

https://cdn-uploads.huggingface.co/production/uploads/165537...

This is what people are doing on commodity hardware now:

https://civitai.com/images/3853761

I'm not even going to try to predict what we'll be able to do in 2 years time; even when accounting for the current GenAI hype/bubble!

Perhaps not, but it raises the question of whether GPT is affordable for a dev to begin with. I don't know how they would monetize this sort of work so it's hard to say. But making game models probably requires a lot more processing power than generating text or static images.
> but it'll probably be 5+ years before this is commercially available

I think you should look at the progress of image, text, and video generation over the past 12 months and reassess your timeline prediction.

I have no doubt that 3d modeling will become commodified in the same way that art has with the dawn of AI art generation over the past year.

I honestly think we'll get there within 18 months.

My skepticism is whether the technique described here will be the basis of what people will be using in ~2 years to replace their low level static 3d asset generation.

There are several techniques out there, leveraging different sources of data right now. This looks like a step in the right direction, but who knows.

People still do wood block printing - even though printing is commodified to the nines.

At the moment, making 3d models is a lot of skilled, monotonous work, especially for stuff like scene furniture. I guess I'd be pretty happy if some of that work could be automated away, and I'm pretty confident that there's no point automating away the remainder, for the same reason you don't want ChatGPT writing your screenplay.

Availability =/= viability. I'm sure as we speak some large studios are already leveraging this work or are close to leveraging it.

But this stuff trickles down to the public very slowly. Because indies aren't a good audience to sell what is likely an expensive tech that is focused on mid-large scale production.

Yes but no, none of that really describes current development.
perhaps, but I was responding to

>Most people are not a competent <insert job here>. Most people don't know a competent <insert job here> or can't afford to hire one.

emphasis mine. Affordability doesn't have much to do with capabilities, but it is a strong factor to consider for an indie dev. Devs in fields (games, VFX) that don't traditionally pay well to begin with.

The open source art ai community is far more mature than people think.
2d, maybe. 3D, I haven't seen anything close to a game ready asset.
> A competent modeler can make these types of meshes in under 5 minutes

Sweet. Can you point me to these modelers who work on-demand and bill for their time in 5 minute increments? I’d love to be able to just pay $1-2 per model and get custom <whatever> dropped into my game when I need it.

They said competent, though, not cheap.
There's no competent modeler that can produce 12 models per hour for 8 hours a day, let alone 24/7.

Sure, you can probably demo your skills on one such model, but to do it consistently non-stop is a fantasy.
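
For what it's worth, the arithmetic behind the figures in this subthread can be sketched in a few lines of Python. It only uses numbers quoted above (5 minutes per model, $1-2 per model, an 8-hour day); the function and variable names are made up for illustration, and it says nothing about whether such a pace is sustainable.

```python
# Back-of-envelope arithmetic using only figures quoted in this thread.
MINUTES_PER_MODEL = 5          # the "under 5 minutes" claim
PRICE_PER_MODEL_USD = (1, 2)   # the "$1-2 per model" figure
HOURS_PER_DAY = 8              # the working day mentioned above


def models_per_hour(minutes_per_model):
    """Whole models finished per hour at a constant pace."""
    return 60 // minutes_per_model


def implied_hourly_rate(price_per_model, minutes_per_model):
    """Hourly income if every model is sold at price_per_model."""
    return price_per_model * models_per_hour(minutes_per_model)


per_hour = models_per_hour(MINUTES_PER_MODEL)
per_day = per_hour * HOURS_PER_DAY
low, high = (implied_hourly_rate(p, MINUTES_PER_MODEL) for p in PRICE_PER_MODEL_USD)

print(f"{per_hour} models/hour, {per_day} models per {HOURS_PER_DAY}-hour day")
print(f"implied rate: ${low}-{high} per hour")
```

In other words, $1-2 per model at that pace works out to $12-24 an hour with zero idle time, which is the gap between "competent" and "cheap" being argued about here.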

but the AI will be cheap. $1 per model would be the OpenAI wrapper’s price. Let alone the wholesale price.
> A competent modeler can make these types of meshes in under 5 minutes

It's not about competent modellers, any more than SD is for expert artists.

It's about giving tools to the non-experts. And also about freeing up those competent modellers to work on more interesting things than the 10,000 chair variants needed for future AAA games. They can work on making unique and interesting characters instead, or novel futuristic models that aren't in the training set and require real imagination combined with their expertise.

Like most of the generative AI space, it'll eliminate something like the bottom half of modelers, and turn them into lower paid prompt wizards. The top half will become combo modelers / prompt wizards, using both skillsets as needed.

Prompt wizard hands work off to the finisher/detailer.

It'll boost productivity and lead to higher quality finished content. And you'll be able to spot when a production - whether video game or movie - lacks a finisher (relying just on generation by prompt). The objects won't have that higher tier level of realism or originality.

>freeing up those competent modellers to work on more interesting things than the 10,000 chair variants needed for future AAA games. They can work on making unique and interesting characters instead, or novel futuristic models that aren't in the training set and require real imagination combined with their expertise.

Or flipping burgers at McDonald's!

There are only so many games that the market can support, and in those, only so many unique characters[0] that are required. We're pretty much at saturation already.

[0]Not to mention that if AI can generate chairs, from what we have seen from Dall-E & SDXL, it can generate characters too. Less great than human generated ones? Sure, but it's clear that big boys like Bethesda and Activision do not care.

The mesh topology here would see these rejected as assets in basically any professional context. A competent modeler could make much higher quality models, more suited to texturing and deformation, in under five minutes. A speed modeler could make the same in under a minute. And a procedural system in something like Blender geonodes can already spit out an endless variety of such models. But the pace of progress is staggering.
I see it as a black triangle[0] more than anything else. Sounds like a really good first step that will scale to stuff that would take even a good modeler days to produce. That's where the real value will start to be seen.

[0]: https://rampantgames.com/blog/?p=7745

Just like a competent developer can use LLMs to bootstrap workflows, a competent modeler will soon have tools like this as part of their normal workflow. A casual user would be able to do things that they otherwise wouldn't have been able to. But an expert in the ML model's knowledge domain can really make it shine.

I really believe that the more experienced you are in a particular use case, the more use you can get out of an ML model.

Unfortunately, it's those very same people that seem to be the most resistant to adopting this without really giving it the practice required to get somewhere useful with it. I suppose part of the problem is we expect it to be a magic wand. But it's really just the new PhotoShop, or Blender, or Microsoft Word, or PowerPoint ...

Most people open those apps, click mindlessly for a bit, and promptly leave, never to return. And so it is with "AI".

I think eventually it may settle into what you describe. I don't think it's guaranteed, and I fear that there will be a pretty huge amount of damage done before that by the hype freaks whose real interest isn't in making artists more productive, but in rendering them (and other members of the actually-can-do-a-thing creative class) unemployed.

The pipeline problem also exists: if you need to still have the skillsets you build up through learning the craft, you still need to have avenues to learn the craft--and the people who already have will get old eventually.

There's a golden path towards a better future for everybody out of this, but a lot of swamps to drive into instead without careful forethought.

I can imagine one usecase, in a typical architecture design, where the architect creates a design and always faces this stumbling block, when wanting to make it look as lively as possible: sprinkling a lot of convincing assets everywhere.

As they are generated, variations are much easier to come by than buying a couple of asset packs.

A simple next step would be to simply scale the model, make it bigger, and train it on millions of images in the wild.
As I understand it their claim is more about efficiency and quality.

Being able to model something - is way different from being able to do it in the least amount of triangles and/or without losing details.

Until you create an AI to do those other parts too. (There is an AI being tested right now that tries to do that in the game dev community)
This is a very underrated comment... As with any tech demo, if they don't show it, it can't do it. It is very very easy to imagine a generalization of these things to other purposes, which, if it could do it, would be a different presentation.
It's research, not meant for commercialization. The main point is in the process, not necessarily the output.
What? If the research doesn't show it, it can't do it, is my point, or else they would've put it in their research.