by powrtoch 4583 days ago
I don't think this problem is solvable in any elegant form, but it is solvable. You'll just end up with massively conjunctive questions that you can't even hold in your head at once, like "27: Are you a non-practicing Catholic with exactly three children, or an Asian owner of a minivan produced between 1998 and 2004 that isn't green, or a licensed boat mechanic with astigmatism, or..." and so on for the next 6 pages.

In short, you can draw categories to include or exclude as precise a number as you like, you just have to be willing to draw really, really complicated boundaries.
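One way to make this concrete, as a minimal sketch rather than anything the comment prescribes: number a population 0..N-1 and let question k be the OR of every narrow description matching a person whose number has bit k set. The five descriptions below are made up, and each is assumed to match exactly one person.

```python
from math import ceil, log2

# Hypothetical toy population: each description is assumed to match
# exactly one person.
people = [
    "non-practicing Catholic with exactly three children",
    "owner of a non-green 1998-2004 minivan",
    "licensed boat mechanic with astigmatism",
    "left-handed beekeeper",
    "retired tuba player",
]

# Number of yes/no questions needed to tell everyone apart.
n_questions = max(1, ceil(log2(len(people))))


def build_questions(descriptions, n_bits):
    """Question k is the OR of every description whose index has bit k set."""
    return [
        {d for i, d in enumerate(descriptions) if (i >> k) & 1}
        for k in range(n_bits)
    ]


def answers_for(description, questions):
    """Tuple of yes/no answers one person gives across all questions."""
    return tuple(description in q for q in questions)


questions = build_questions(people, n_questions)
codes = {p: answers_for(p, questions) for p in people}

for person, code in codes.items():
    print(code, person)
```

Each question is an ugly disjunction, but the answer patterns are all distinct, which is the whole point: the boundaries get complicated, the count of questions stays small.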

4 comments

Sounds like a premise for a dystopian sci-fi story: a future where every identity is exactly planned, where everyone's life is determined by ... 33 bits. Donald Sutherland can be the benevolent ruler that tells the protagonist how the unbridled greed of the 21st century brought us here (Hollywood adaptation can add an ironic anti-consumerist twist).

Perhaps this could be a retro sci-fi a la "Brazil", with each person carrying around a punch card with their 33 bits on it. A computer error means two people are issued the same bit pattern. In a defining shot, they hold their punch cards up against the sun and see the holes line up. Maybe an Egyptian tomb opens too!

Your mistake was posting this here, and not on Reddit.
Here's my admittedly naive Sunday afternoon spitball on an elegant solution:

I like the idea of a human UUID/GUID type identifier.

I would also like to think that this is solvable using strictly biological and physical properties, sampled at birth.

Otherwise, time and culture factors would seem to make it difficult to produce a static set of "apples to apples" questions and answers.

I wonder if the right maths applied to existing genetic and forensics big data sets could produce the 33 questions.

I'd imagine that genetic markers would be the best way to do it (Disclaimer: I'm no biologist and might have made completely wrong assumptions here). They're less likely to change than, say, someone's political or religious beliefs; one could get a nasty hit on their head and forget.

The thing with genetics is that they can change over time. Some genes turn on and off. Attributes like your face and fingerprints change over time. They're not constants.

If you could identify a set of 33 lifetime constants, you'd end up with a life-long UUID. If you expanded beyond 33 bits and included genetic markers which change over time, such as genes which flip on/off, you could end up with a point-in-time (PIT) UUID.

    UUID     = Constant throughout life.
    PIT+UUID = UUID plus markers identifying you at a particular point in time.
A constant would be something like, do you have a Y chromosome? (there is fault in this question: XYY syndrome)

Also, you'd probably need more than 33 bits. 33 bits will cover all living humans today in 2013 AD, but would have to be expanded as the living population grows, and to include the billions of deceased humans.
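A quick sketch of the arithmetic, with the caveat that both population figures are rough assumptions and not numbers from this thread: about 7.1 billion people alive in 2013, and a commonly quoted ballpark of about 108 billion humans ever born.

```python
def bits_needed(population: int) -> int:
    """Smallest number of yes/no bits giving at least `population` distinct IDs."""
    return max(1, (population - 1).bit_length())


living_2013 = 7_100_000_000    # assumed rough figure for 2013
ever_lived = 108_000_000_000   # assumed rough estimate, for illustration only

print(bits_needed(living_2013))  # 33, since 2**33 = 8,589,934,592
print(bits_needed(ever_lived))   # 37, since 2**36 < 108e9 <= 2**37
```

So under these assumptions 33 bits run out once the living population passes roughly 8.59 billion, and covering everyone who ever lived takes only a handful of extra bits.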

In the end, a "true" unique identifier, encompassing any human, would be their UUID plus a list of all PIT+UUIDs they generated during their lifetime. Or in English, an entire record of their genetics from start to end:

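    #include <stddef.h>
    #include <time.h>

    struct PIT_UUID {
        time_t timestamp;  /* when the markers were sampled */
        void  *markers;    /* markers identifying you at that time */
    };

    struct LIFETIME_UUID {
        void            *uuid;       /* constant throughout life */
        size_t           pit_count;  /* number of point-in-time samples */
        struct PIT_UUID *pits;       /* one entry per timestamp */
    };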
That should eliminate conflicts in edge cases like identical twins or cloning.
DNA is your UUID/GUID
Unless you're one of a set of twins/triplets/etc.
append datetime string of birth. Boom problem solved.
What if the twins are delivered via Caesarean, with the exact same datetime recorded? Including datetime (even as a string) also includes the aforementioned falsehoods about time.
unless you have a twin, or other tuplet siblings.
or a clone:)
Unless you're a twin
I don't think so. How do you come up with time invariant questions? If you used these big conjunctions, how do you uniquely identify every person in just 33 bits?
Those are bad examples because those questions don't split the population in two. Very few people are non-practicing Catholics with exactly three children. If you want to limit the number of questions to just 33 then you have to choose your questions very carefully.
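To put a number on "don't split the population in two", here is a small sketch using Shannon entropy as the measure of how much a yes/no question can tell you. The 0.1% share for the narrow question is a made-up figure for illustration.

```python
from math import log2


def bits_gained(p_yes: float) -> float:
    """Shannon entropy, in bits, of a yes/no question answered yes with probability p_yes."""
    if p_yes in (0.0, 1.0):
        return 0.0  # the answer is already known, so nothing is learned
    return -(p_yes * log2(p_yes) + (1 - p_yes) * log2(1 - p_yes))


print(bits_gained(0.5))    # an even split yields a full bit
print(bits_gained(0.001))  # a very narrow question yields about 0.011 bits
```

With only 33 questions to spend, each one has to yield close to a full bit, which is why a lone narrow question is a poor choice.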
I think you misunderstood. powrtoch proposed having a set of questions in which each individual question is itself very complicated. For example, being a non-practicing Catholic with exactly three children is only one small facet of a single question. By OR'ing a bunch of really specific questions together you can come very close to getting exactly 50% of the population to answer yes to a single question.
That's kinda cheating though isn't it? Like chaining a dozen statements on one line with semicolons and going, "look I can write that program in one line!"
That depends on what constraints you choose to define on the problem - if they need to be knowable, memorable, ... then yes probably. Anyway, sz4kerto has a good comment about using a Karnaugh map which might help see how this works - it's a lot cleverer than just chaining things randomly - but does break a lot of hypothetical arbitrary restrictions.
The example was one question with a bunch of OR operators that when combined, would equal exactly 50%.
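A rough sketch of how OR'ing narrow categories can land on 50%. The headcounts are invented, the categories are assumed to be mutually exclusive, and the greedy pass is just one simple heuristic. It is not guaranteed to find the best combination in general.

```python
def or_together(categories, target):
    """Greedily pick mutually exclusive categories whose sizes sum close to target.

    categories: dict mapping a description to its headcount.
    Returns (chosen descriptions, total headcount), never exceeding target.
    """
    chosen, total = [], 0
    # Largest categories first, skipping any that would overshoot the target.
    for name, size in sorted(categories.items(), key=lambda kv: -kv[1]):
        if total + size <= target:
            chosen.append(name)
            total += size
    return chosen, total


# Made-up headcounts in a population of 1000; categories are assumed disjoint.
population = 1000
categories = {"A": 310, "B": 180, "C": 120, "D": 95, "E": 40, "F": 7, "G": 3}

chosen, total = or_together(categories, population // 2)
print(" OR ".join(chosen), "->", total, "of", population)
```

Here the question "are you in A or B or F or G?" gets a yes from exactly half of the toy population, even though no single category comes close.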