by wpietri 1568 days ago
I tried something like this once and it worked surprisingly well, even for a UGC site.

Years back we were doing something that included users documenting TV shows. We had a big meeting where people put every feature they wanted on index cards. We laid the cards out on a founder's dining room table. The host got their change jar and each person got a certain number of pennies to mark features they thought were vital for first launch.

After the first round of token-voting, the "user accounts" card had no votes. At first it seemed impossible. But after some discussion, we realized that viewing users didn't need accounts for launch. For people who wanted to edit, we let them type in a name to take credit for their contributions if they wanted, but with no verification. At worst, we figured we could add something more robust if the need were stronger.

It turned out fine. The launch got out earlier and we got to test a number of key product hypotheses without having to build any sort of user account system. Months later it did eventually become the highest priority. But not having accounts worked way longer than I expected.


What's been professionally frustrating me for years as a developer is how much of the engineering and operational budget for a project is tied up into identifying and tracking users. The first time this happened to me we had some idiot who insisted that we needed to display exactly how many logged-on users there were on every page load. There was no point in doing so, and we had proven that it was at least ten percent of the cost of each page load. In fact it was higher than that, but 10% is what we could prove. My current project is about our customers, not the users, and probably 80% of the operating budget is about making the customer feel like they're running the show. Often with demonstrable and even clichéd consequences for the users.

Without customization or user tracking, many, many workflows shift to read-mostly. Many are idempotent. Some can be fully cached. Some can be edge-cached.
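To make the caching point concrete, here is a minimal sketch, assuming a hypothetical render_page function and illustrative header values (none of this is from a real project). When nothing on the page depends on who is asking, the cache key is just the path, so every visitor shares one rendered copy and the response can be marked as publicly cacheable.

```python
from functools import lru_cache

# Counts how often the expensive render actually runs (illustration only).
render_calls = {"count": 0}

def render_page(path):
    """Hypothetical stand-in for an expensive page render."""
    render_calls["count"] += 1
    return "<html>content for " + path + "</html>"

@lru_cache(maxsize=1024)
def cached_anonymous_page(path):
    # No user in the cache key: all visitors share one cached copy per path.
    return render_page(path)

def cache_headers(personalized):
    """Pick Cache-Control values; the exact lifetimes are assumptions."""
    if personalized:
        # Per-user output must not be stored by shared or edge caches.
        return {"Cache-Control": "private, no-store"}
    # Identical output for everyone can be cached at the edge.
    return {"Cache-Control": "public, max-age=300"}

# Three requests for the same page trigger a single render.
for _ in range(3):
    cached_anonymous_page("/shows/example")
print(render_calls["count"])
```

Once a page greets the user by name or shows their notifications, the user has to become part of the cache key, and most of this sharing goes away.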

The dark secret of 'social' media that has been slowly coming out is that they aren't social. They aren't about 'Us', they're about me. Me, me, me. So of course the whole workflow is built around who I am and what I want. That's not just unhealthy, it's also really fucking expensive. And if it's really expensive we can't just eat the cost as a 'value add', we now have to monetize it. So things were already pretty dark and then compensation came into the picture and now it's positively dire.

It goes beyond social media.

Software always starts by appealing to discerning customers. The early adopters.

Once it is fairly widely adopted, often the early adopters have adopted a newer, better thing.

So now you are making features for a crowd of people who are there mostly because of platform inertia.

They don't even appreciate or use new features, because anyone who actually deeply cares about your product niche doesn't use your product.

> What's been professionally frustrating me for years as a developer is how much of the engineering and operational budget for a project is tied up into identifying and tracking users.

To add onto this, as a security-adjacent person, it's sad how much people think user behaviour data will be worth to their company. From the well-intentioned "we must pave the cowpaths" to the harmful "harvest the data and sell it", the attitude appears to have cropped up in the past 15 or so years as a mainstay of what apps should be doing and it's absolute insanity to me.

My only victories in convincing teams are where I could demonstrate their ROI was never actually going to materialize, especially when the investment part required enough development hours that other features that might sell more apps would have to be delayed. And even then, it's been about 40% of the time, with the other 60% being met with, essentially, "we have assurances it will be profitable" hand-waving.

The painful part of this is that unless certain privacy regulations start to get much more painful economically for companies, there's basically no incentive not to do it.

It's the entire "Data is the new Oil" run amok.

Absolutely. I think your last point is especially good. Facebook consumes a ton of cash for what many people feel are disappointing results. Are they vulnerable to a competitor who is less about what users want than what they need? A competitor who can do that for 1/10th or 1/100th as much money? That could be very hard for the me-me-me companies to keep up with.
The thing with fads, and adoption cycles in general, is that what people 'want' can be figured out pretty quickly, but as far as I'm concerned, The Trough of Disillusionment is what happens when people figure out that what they need is something else.

So what you're asking is can someone come into the ToD and introduce a new product that steals people away? It's plausible and if I were in a better headspace I could probably name you a bunch of examples. But does it always happen? I don't think so. There are plenty of incumbents who manage to coast through and come out the other side having demonstrated a dilute form of change of heart - just enough to convince the customers that 'something was done' even if they can't quite put a finger on what exactly is better and how much.

Sorry, I shouldn't have phrased that as a direct question. I meant it in a more rhetorical sense.

Oh, sure. It's a very tough field, and would be even if the incumbents didn't have billions to throw at the problem. I definitely don't believe that the better product wins; I only need Microsoft as a counter-example.

But it does strike me as a zone of opportunity. Maybe Substack is a good partial example here. Before the web, we had magazines. Then we basically had magazines on the web, preserving much of the old structure in the new medium. With lots of flailing as people tried to find sustainable business models.

And then Substack came along with an extremely bare-bones implementation mostly using 1980s technology and a lot of writers and readers are very happy with it.

So it's more that I'm asking myself. What are the products that cost 1/100th as much that might be as satisfying for my Facebook-ish needs?

Way back in the long dark ago I ran into some abandonware for incorporating third party data onto web pages via a shared server. Nobody I knew understood how it was meant to work, but I got the impression it was meant to be a tool where a group of people could host commentary about a website that was not their own.

I keep wondering why nobody has really tried that again. Slashdot sort of filled in that space, and then Digg and now Reddit. Or Facebook for the 'all-in' solution. I keep thinking there was something I was missing about why that would be difficult to pull off.

Today I have a different answer for that - that ship has sailed. We are multi-device and it would be much more difficult for me to have a consistent experience across phone and personal (and sometimes work) machines.

But at the time perhaps it was an adoption thing. Just visiting a website is a cheap interaction that can lead to a habit. Having to do something special doesn't work the same way.

What about abuse/vandalism? If the whole web has edit privileges, what's to stop someone from scripting changing all of the titles to random strings every hour? Do you do a captcha on every edit or something?

I think the main idea around user accounts is that they centralize a point of applying captchas as well as a tiny bit of data collection (some form of contact information) that can be used for antispam (e.g. banning certain email address domains from creating accounts, or banning certain email addresses, etc).
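For what it's worth, a minimal sketch of that kind of signup check, assuming hypothetical blocklists (the names and the example domains are made up for illustration, not taken from any real site):

```python
# Hypothetical blocklists; a real site would load these from storage.
BLOCKED_DOMAINS = {"spam.example"}
BLOCKED_ADDRESSES = {"troll@mail.example"}

def signup_allowed(email):
    """Return True if the address passes the (assumed) antispam blocklists."""
    address = email.strip().lower()
    # Split on the last "@" so the domain is everything after it.
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        return False  # not shaped like an email address at all
    if address in BLOCKED_ADDRESSES:
        return False  # this specific address is banned
    return domain not in BLOCKED_DOMAINS

print(signup_allowed("someone@mail.example"))  # True
print(signup_allowed("bot@spam.example"))      # False
```

This only filters on what the signup form collects; it does nothing about someone who simply picks a different address.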

I'm familiar with the theory. But accounts just aren't a big barrier to determined bad actors.

Note that the world's biggest content site, Wikipedia, allows anonymous edits and always has. And note also that some of the big tech companies, despite having all the money in the world, still have problems with fake accounts. So at best, requiring user accounts is one possible anti-abuse step, but it's neither necessary nor sufficient to prevent abuse.

> Note that the world's biggest content site, Wikipedia, allows anonymous edits and always has.

Not really. You can't edit Wikipedia from a VPN (even with a user account!), and I think they ban most datacenters. The edits aren't really anonymous if they publicly associate with a piece of PII that, for most people, directly maps to their name and home address.

> The edits aren't really anonymous if they publicly [show your IP]

Counter-example: stackoverflow is also reasonably big and allows anonymous questions, answers, and even edits, without publishing an IP address or anything. The edits end up in a review queue, the rest I think is actually published immediately.

But doesn't this content need to be reviewed (read permitted) by other non-anon user accounts?
> The edits end up in a review queue, the rest I think is actually published immediately.
Wikipedia also locks most interesting pages so only established accounts can edit them.
This is a good and sad point. I was on the wiki page for derivatives and found it was locked due to vandalism. On one hand, we don’t want pages locked because that defeats the point. On the other, how do we stop every troll high schooler who just learned derivatives and messes up the wiki page for lulz? We either need active watchers (surprisingly and fortunately pretty easy, wiki editors are a passionate and eagle-eyed group, but I wonder how long and how much of this is just the initial hard core fans from the early days) or to have some deterrent to vandalism in the first place. For some, maybe this is IP address logging (although as someone else noted in the thread, at what point does this sink anonymity?). For others, maybe creating an account. In practice, neither of these work 100% of the time. I have seen vandals from both IP accounts and registered accounts in about equal frequency.
I don’t think it really matters. Wikipedia has surprisingly strict standards and traditions that aren’t very intuitive. If you as a brand new user attempted to edit the page for Donald Trump or Apple, there is a close to 0% chance your edit would not be reverted anyway. These pages are highly curated and there is minimal value you can add to them as a new user. So the semi lock almost just stops people wasting their time.

Much better to start off editing your local country town which has no power users patrolling and tends to be significantly out of date.

Oh? My current IP is 2601:646:4300:758:f676:3f1b:8b5:42a. Please show me how to turn that into my name and home address. Thanks!
GP's "directly" is a pretty large overstatement, but at the same time I've noticed something of an uptick over the past couple of years of people saying that IP addresses aren't PII or that people shouldn't be concerned with them getting leaked, and I just don't think that stands up to much scrutiny.

If IP addresses didn't matter for privacy, Tor routing wouldn't exist. If IP addresses weren't useful for blocking specific users, IP bans wouldn't exist. If IP addresses weren't useful for tracking, operators wouldn't have gotten up in arms about Apple's private relay service. Obviously this stuff matters.

Remember that not everyone lives in or around San Francisco. For someone in a suburban/rural area, an IP address combined with things like timestamps, user ids, and the text of the edits can go a really long way towards unmasking them. Even for people who live in more urban areas, it is still obviously easier to find someone who lives in San Francisco than it is to find someone who could be living anywhere on the West Coast. If they could also have been using a VPN, or time-shifting their posts... that makes it even harder.

In contrast, how hard do you really think it would actually be to get some address data from a voter roll or via a warrant or even just through one of the scummy person lookup services online and to iterate through everyone who shares that IP address and check to see how many of them are named Pietri? Or who have shared the username wpietri across another account, or posted somewhere else at roughly the same time? Your IP address is drastically reducing the search-space for other attacks, many of which (timing, text-analysis, etc) are impossible to get rid of when making a Wikipedia edit.

I agree IPs are PII, and that they can lead to unmasking. I also agree the person I replied to was wildly overstating things.

But for the current context, where we are talking about whether or not user account registration is helpful in preventing abuse, I think the kinds of low-probability, long-timeline consequences you describe are not really going to deter most would-be vandals. Especially since Wikipedia is going to know the vandal's IP address whether or not it gets shown publicly. So I think Wikipedia is still a good example of how "no user accounts" is workable at scale.

Comcast has a portal for law enforcement to request subscriber information at https://lea.comcast.com . That IPv6 address, plus the current date and time, uniquely identifies you by name and service address. Any edits you make to Wikipedia from that address are not anonymous.
This is a use of "anonymous" that is unfamiliar to me. Do you mean something like "untraceable"? For example, when non-profits credit an anonymous donor, they know who the person is. In that more common sense of the word, Wikipedia's anonymous edits are indeed anonymous: they are published without a name attached.

Anyhow, that seems beside the point. All HTTP requests come with IP addresses. That the police might be able to trace them back to a house eventually does not say much about either Wikipedia (who would give up an IP address with a warrant whether the edit was for a named account or an anonymous one) or no-user-account systems in general.

3270 23rd street, 94110?
You definitely didn't get that from my IP address.
The person y'all are downvoting is not technically incorrect if they're in the EEA, as this is exactly how GDPR treats it. Because there exists a party that can map it (your ISP), it's PII under that law. Of course this may be different in other jurisdictions.
Accounts alone won't do it. Accounts and invites might? But then someone who doesn't know anyone on the site needs to figure out how to contact someone who's a member.

It's not good for growth, but some websites are fine with that.

Over time the quality of the invites goes down as well.

If I'm in the picky group, and we send out 5 invites total, but the unpicky group sends out 10, then 2/3 of the invites are unpicky - if the groups are the same size, which they probably won't be for a while (I'm probably inviting people who are almost as picky as I am)
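The arithmetic above can be checked with a tiny sketch. The function name is made up for illustration, and it takes each group's total invite count, which matches the equal-sized-groups case described:

```python
from fractions import Fraction

def unpicky_share(picky_invites, unpicky_invites):
    """Fraction of all invites that come from the unpicky group."""
    total = picky_invites + unpicky_invites
    return Fraction(unpicky_invites, total)

# 5 invites from the picky group, 10 from the unpicky group.
print(unpicky_share(5, 10))  # 2/3
```

If the groups are not the same size, the totals shift accordingly, so a larger picky group dilutes the effect and a larger unpicky group amplifies it.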

There's also someone on the team who thinks we'd grow faster if we simplified the onboarding process, which is true but also means when we piss off some user they can create a bunch of accounts while they're still spun up and cause a bunch of overhead for the support team and the developers. That gets expensive too.