by detaro 3421 days ago
I thought it was Lego at first, but 39000 shapes seemed a bit much. I thought there were only 8-10k shapes?

  mysql> select count(*) from parts;
  +----------+
  | count(*) |
  +----------+
  |    38516 |
  +----------+
  1 row in set (0.00 sec)
Though I'm sure there is some overlap, it's definitely not 70%.
Is that really just shapes, not counting colors/prints multiple times? (Not that it really matters to the scope of the problem ;))
No, if you multiply by colors it gets much worse. Hundreds of thousands of possibilities...

Of course not all parts exist in all colors so that helps (a bit), but it is quite an interesting problem to work on. Every assumption you make will be challenged.

Prints and stickers are counted separately but that's not a really huge number and they should be correctly identified (so the surface decoration matters as well in the classification).

Do you also plan to detect counterfeit pieces?
That's a really hard problem. In many cases there are subtle hints (color, the writing on the studs).

Spectrography might give a hint here due to the different formulation of the plastics but some of the knock-offs are now so good it can be very hard to tell them from the real thing.

I'm not really sure if 'counterfeit' is the right term; the companies selling these are not making pieces labeled 'lego', and in fact the Lego brand started out by copying an English product.

https://en.wikipedia.org/wiki/Kiddicraft

Damaged pieces and discolored pieces are also of interest and a very hard category to detect.

I didn't know about Kiddicraft, thanks, TIL.

It might not be true at the piece level, but some sets on some websites (ali, etc.) are exact replicas of Lego sets, bar the brand. Down to the manual. Hence the use of "counterfeit".