Hacker News
by DavidFerris 127 days ago
We rendered the one-million-part ABC dataset from Deep Geometry and open-sourced the data. We also built a fun demo with the following pipeline: CAD > render > caption > embed.

Open-sourced dataset: https://huggingface.co/datasets/daveferbear/3d-model-images-...

Blog writeup: https://www.finalrev.com/blog/embedding-one-million-3d-model...
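
For readers who want to see how the last two stages of that pipeline (caption > embed) feed a search box, here is a minimal sketch. It is not the demo's actual code: the part IDs and captions are made up, and a toy bag-of-words vector stands in for whatever text-embedding model the demo uses. It only illustrates that a query is matched against the caption text.

```python
import math
import re
from collections import Counter

# Hypothetical captions standing in for the output of the caption stage.
# The IDs and wording are invented for illustration.
CAPTIONS = {
    "part-a": "a flat rectangular plate with five round holes",
    "part-b": "a fan with curved blades around a central hub",
    "part-c": "a chair with four legs and a tall backrest",
}


def embed(text):
    """Toy embedding: word counts (an assumed stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Embed every caption once, up front.
INDEX = {part_id: embed(caption) for part_id, caption in CAPTIONS.items()}


def search(query, k=3):
    """Return the k (part_id, score) pairs whose captions best match the query."""
    q = embed(query)
    scored = sorted(((cosine(q, v), pid) for pid, v in INDEX.items()), reverse=True)
    return [(pid, round(score, 3)) for score, pid in scored[:k]]


print(search("plate with holes"))
print(search("engine part cover"))
```

In this toy setup a query about visible shape ("plate with holes") ranks part-a first, while a query about function ("engine part cover") shares no words with any caption and scores zero everywhere. A real embedding model would return nonzero scores, but results can only be as good as the captions they are computed from.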

2 comments

The search function doesn’t seem to work at all; it returns nonsensical results.

For example, if I search “supercolumns” I get regular household furniture.

Yeah, I think the embeddings describe what can be seen in a picture of the model, not what it is or what it is used for. Some searches work, like "Fan", but others don't, so you can search for "plate with 5 holes" but not for a specific engine part cover.

When I search for "chair" I get 48 results, about 3 of which are actually chair-like.

- Some are clearly miscategorized - ABC-00131096 is a coffee table but has a very detailed description of its chair attributes.

- Many others are weird nonsense geometry, like ABC-00991744, ABC-00807798, ABC-00349255 or ABC-00822766.

- Some have a partially accurate description (if you pretend it's a chair): ABC-00685912 has a blocky geometric structure with a horizontal piece off a vertical piece, but then the description starts talking about an armrest on one side that doesn't exist at all.

- ABC-00388826 is a silhouette of a cat, which the description misses completely, and I don't see how you would sit on this "unique chair design characterized by its fluid and sculptural form."

Overall the descriptions are pretty useless and ascribe a lot of chairness to things that are not chairs.

Is a dataset with this much junk in it good for something?