I punched in my details and there are approximately 100 people like me. What has occurred to me is that people like me (20s, single) who live in rich, mobile areas of Sydney spend less than the national average across most things, and overall. But also, compared to people like me, I get by on about 1/8th of what everybody else spends, and do not feel poor. To the contrary, in fact.
What is going on here? I am inclined to believe the data.
What this has really reinforced for me is the extent to which other people must spend their income as it comes in. It has also reinforced the importance of my friends, connections, values & personal knowledge. It is my friends who enable me to live a life that feels rich whilst, apparently, spending about 1/8th of what everybody else does.
I really am surprised. I was not raised frugally and if I want something and have the money, I generally buy it & I don't need more than I have.
Given the history of people being identified from "anonymous" records (thanks a lot, information theory) ... well, I'm glad I'm not banking with the NAB.
You misunderstand. This is not an anonymous data-dump like the AOL fiasco, and there is no way to get any raw data.
Basically you just pick a few things like your age, salary and where you live, and then out of all of the people out there "like you", you're shown an average of where people like you tend to spend their money (travel, food, etc.).
Even if you spent a lot of time manipulating the data to try and get it to tell you one person's spending by category (assuming it were possible), the data would be relatively meaningless.
Agreed, I wonder what exactly they did to anonymise the data. Reminds me of this quote from Homer in The Simpsons:
"Well, let's just call them, uh, Mr. X and Mrs. Y. So anyway, Mr. X would say, 'Marge, if this doesn't get your motor running, my name isn't Homer J. Simpson.'"
For what it's worth, they have not "released" the data. I cannot copy any of the text from the FAQ section, but if you browse the FAQs and "about this site" it looks like the whole site is basically an ad/demo of Market Blueprint[1]. But I could be wrong. estromiund, do you know where they released the data?
I wonder how accurate this actually is... I punched in my details and apparently there are fewer than 10 people like me. Considering where I live, my age, etc., that seems unlikely.
I dunno, the results for entertainment were scarily accurate. Everything else seemed off, especially the cost of food in my area, considering it's a "cafe" district and you'd be hard-pressed to find a cheap meal there.
I am but one of the many. I did notice, however, that the average monthly spend for people 'like me' is more than the monthly income of people 'like me'. Curious.
The last thing they ask you to enter is your postcode, which if you did enter it, vastly reduces the number of people that you are being compared with. That, and most people here would be on a high income for their age compared to the rest of the population.
Robust De-anonymization of Large Sparse Datasets
http://www.cs.utexas.edu/~shmat/shmat_oak08netflix.pdf
Regarding Netflix's competition: "We demonstrate that an adversary who knows only a little bit about an individual subscriber can easily identify this subscriber’s record in the dataset."
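For anyone wondering what that kind of linkage looks like in practice, here is a toy sketch in Python. It is not the algorithm from the paper (which, as I understand it, weights rare items and tolerates fuzzy ratings and dates); it only counts exact matches between a few facts you already know about someone and each "anonymous" record. The record IDs, the merchant/day pairs and the `best_match` helper are all made up for illustration.

```python
# Toy linkage sketch: match a few known facts against "anonymous" records.
# All data and names here are hypothetical.

def best_match(records, known_facts, min_gap=1):
    """Return the record ID that best fits known_facts, or None if ambiguous.

    records:     dict mapping record ID -> set of (merchant, day) facts
    known_facts: set of (merchant, day) facts the adversary already knows
    min_gap:     how far the best score must lead the runner-up
    """
    # Score each record by how many known facts it contains.
    scores = {rec_id: len(known_facts & facts) for rec_id, facts in records.items()}
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)

    # Nothing to match, or nothing matched at all.
    if not ranked or ranked[0][1] == 0:
        return None
    if len(ranked) == 1:
        return ranked[0][0]
    # Only claim a match when the top record clearly stands out.
    if ranked[0][1] - ranked[1][1] >= min_gap:
        return ranked[0][0]
    return None


records = {
    "rec_a": {("cafe", "Mon"), ("gym", "Tue"), ("bookshop", "Sat")},
    "rec_b": {("cafe", "Mon"), ("cinema", "Fri"), ("petrol", "Sun")},
    "rec_c": {("supermarket", "Wed"), ("gym", "Tue"), ("petrol", "Sun")},
}

# Two facts about a friend are enough to single out one record here.
print(best_match(records, {("cafe", "Mon"), ("cinema", "Fri")}))
```

Note that this only works on per-person records. It says nothing about what can be recovered from category averages over a group.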
Interesting, as a customer of ubank I was concerned with the concept, especially after the data deanonymization techniques developed post AOL and Netflix.
However, after using the site and entering my demographics (all public properties) I could see that I was like 56 people in my area; their base spending patterns did not reflect me at all and I felt like a snowflake. Sometimes big data makes you feel special.
At no time was I shown transactions, merely aggregate figures in categories. No privacy issue here, keep being decent and ethical, national bank!
I agree. I am with this bank, and after seeing this was curious as to what information they actually have about me. Looking at the "My Details" section of the site shows only the following info:
- Name
- Phone
- Email
- Address
- Tax File Number (not unlike a US social security number)
I came to the same conclusion that it's probably based on census data.
Is anyone aware of what method they used to make the data anonymous? I'm currently doing research in data masking tools and would be interested to know about the techniques used and the performance of the tool.
Kind of cool, but I bank with the bank in question (Ubank / National Australia Bank) and I'm not sure I'm completely comfortable that they'd use my transaction history like this.
Hi xodem, I'm Jennie, the Digital Director at UBank. I just wanted to reassure you that privacy was taken extremely seriously -- no personal customer information is available even within the database, and we've ensured strict compliance with privacy laws.
We also took an extra step around setting a minimum uniqueness level of 'Less than 10' - so there won't ever be a scenario shown for 1/2/5 PeopleLikeU as queried in the comments below.
Hope you enjoy the experience - we really wanted to give the value of the data safely back to the people who created it to help you make more informed decisions on what, where and how much you spend.
Feel free to tweet me @jbewes or the team @Ubank, or even Facebook msg them facebook.com/ubank if you've any more queries.
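To make the "Less than 10" floor concrete, here is a minimal Python sketch of how a minimum cohort size could be applied to aggregate output. This is a guess at the general technique, not UBank's actual implementation: the field names (`age_band`, `postcode`, `monthly_spend`), the grouping key and the decision to also hide the average for small cohorts are all assumptions. Only the threshold of 10 comes from the comment above.

```python
from collections import defaultdict
from statistics import mean

MIN_COHORT = 10  # floor quoted in the thread; everything else is assumed


def display_count(n, floor=MIN_COHORT):
    """Show exact counts only at or above the floor."""
    return f"Less than {floor}" if n < floor else str(n)


def cohort_summary(customers, floor=MIN_COHORT):
    """Group customers by (age_band, postcode) and report masked aggregates."""
    groups = defaultdict(list)
    for c in customers:
        groups[(c["age_band"], c["postcode"])].append(c["monthly_spend"])

    summary = {}
    for key, spends in groups.items():
        too_small = len(spends) < floor
        summary[key] = {
            "people_like_you": display_count(len(spends), floor),
            # Assumption: averages for tiny cohorts are withheld as well.
            "avg_spend": None if too_small else round(mean(spends), 2),
        }
    return summary
```

With this shape, a cohort of 3 people would report "Less than 10" and no average, so you never see a figure that describes just 1, 2 or 5 customers.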
I work for an Australian bank. We had an internal competition where they didn't sanitise the data well enough. After a week of analysing the data, one of the people competing knew whose data we were looking at. It was a friend of theirs.
All identifying information had been removed/sanitised; they could tell just from looking at the spending habits.
... So they knew, because they already knew. So really they just confirmed.
Ask that person to go and find my details, and they will surely fail. Because they know nothing about me, and the transaction history doesn't tell them anything... thus it is not 'identifying'.
That still isn't your identity. You can identify yourself by it, but no one else can. Unless they already know that information about you... in which case, they already knew...
I put in my numbers to see what it would be like living on the other side of the world (I'm in Dublin, Ireland - it is currently 9C and raining) using the Sydney postcode 2000.
It says my expected house & home costs are $2000, which seems rather high for me. In London and here in Dublin, pretty much everyone my age (mid twenties) is in a houseshare, so we would pay that for the whole property, but split it three ways or so. Is housing just that much more expensive, or what?
You only picked one of the most expensive postcodes - that's the Sydney CBD! Yes, it is VERY expensive. In fact, it is very expensive to live in Sydney, full stop.
If we go by the history of such anonymization attempts (AOL, Netflix), it is only a matter of time before someone figures out how to get PII from this data.
I think the title is misleading for the purposes of catching people's attention. The data is not available to download, and is in aggregate form. The AOL data was not in aggregate form, so it was possible to get personalized details.
They probably constructed a simple decision tree aggregate which is probably less than 10kb of data. I doubt anyone could extrapolate 10kb data into personalized details of an individual.
Spooky. Tried it out and found its predictions accurate for brands and companies I might use. Which was disappointing in a way; I was hoping I might find something new to try. Instead I'm, well, just like me...
Nice idea but an absolutely terrible website. E.g. if you go into the about section and expand one of the questions you get this weirdo scroll bar. Seems like they're forcing the layout into a fixed dimension.