Hacker News
by smcmurtry 3429 days ago
In 2012 I wrote my first Wikipedia article on the 50-person startup I was working for at the time. I didn't include anything overly self-promoting, just the basic facts and referenced some news articles. My article was immediately nominated for deletion and a number of community members accused me of being a "single purpose account", i.e. not interested in contributing, just advertising. Needless to say I did not go on to create/edit more articles after a welcome like that.

A couple of editors did come to my defence. I got the impression there was a lot of internal conflict about this sort of thing.

Edit to add the following:
The article: https://en.wikipedia.org/wiki/Ecobee
Deletion discussion: https://en.wikipedia.org/wiki/Wikipedia:Articles_for_deletio...

5 comments

I think they have a point in that case. You didn't have any interest in contributing until you had a startup you needed to promote. Also, unless your startup is notable it's not supposed to have an article.

"If a topic has received significant coverage in reliable sources that are independent of the subject, it is presumed to be suitable for a stand-alone article or list... If a topic does not meet these criteria but still has some verifiable facts, it might be useful to discuss it within another article."

https://en.wikipedia.org/wiki/Wikipedia:Notability#Whether_t...

I agree with the above.

However, from my experience in CAD, Wikipedia's notability and importance ratings are strongly skewed towards open-source and against commercial systems.

High-importance:

https://en.wikipedia.org/wiki/Talk:FreeCAD

Low-importance:

https://en.wikipedia.org/wiki/Talk:SolidWorks

No disrespect meant to the FreeCAD folks, but that is definitely back-to-front! The article on SolidWorks lists 165,000 companies using the product as of 2013. How is that low-importance?

The skew tends to be even worse against enterprise class systems.

Those importance ratings are utterly unimportant. In the vast majority of cases, they are just the opinion of a single editor who looked at the article for 10 seconds, and they only affect how the article is listed in some automated report that nobody ever looks at.
> You didn't have any interest in contributing until you had a startup you needed to promote.

That's one explanation of the facts but it's not the only one. Any new user is likely to start out by contributing on topics close to them - places they've worked, technologies they've worked with, etc.

If employees aren't allowed to make articles on their employers then that's that. But treating a new account differently than an old one is just assuming bad faith.

I actually had been contributing small edits for grammar anonymously for a while, and was interested in getting more involved in Wikipedia. My intention wasn't to promote my employer, I just thought it met the notability requirements.
Additionally, "referenced some news articles" is not sufficient for notability because most startup coverage is PR [1]. Unfortunately, when you get genuine independent journalism, it is also not positive coverage (e.g. Theranos, Magic Leap).

[1] http://paulgraham.com/submarine.html

The community of Wikimedia and Wikipedia is pretty bad with this. I've seen articles locked, then marked for deletion, so you can't add more information to make them 'article-worthy'. They then say that it's not 'against the guidelines', as the ultimate excuse.

Mediawiki is supposed to also be a communicative platform on these problems, but it really fails its goal there with its talk pages, when these problems really stem from a lack of centralized community and of being able to easily talk about and resolve these issues. Typically, you'll get referred to IRC or another talk page, where your issue will not be resolved and will probably take forever to be responded to.

Overall, I've ditched working with Mediawiki or anything wikimedia. They don't show caring to actually invest in open platforms and software that others can use, they're just interested in making their own projects popular. Some of the core devs are actually really good at what they do, the problem is that the framework now needs a big revamp for it to be usable outside of the wikipedia environment properly, something wikimedia will not invest in.

If they want to show good faith in freedom of information, they would make their software into components and allow other projects to use them, especially the actual wiki markup processor. This would allow people to integrate wiki functionality into fundamentally better frameworks to maintain, like Drupal 8, or design their own frameworks that internally use the packages maintained in mediawiki.

You say they're "bad" at this, but another word you could use is "consistent", and you could follow that up by suggesting that there's a coherent set of criteria used for what's deleted, but that nerds have a really hard time understanding it.

To wit: Wikipedia is not at all concerned about storage space, but they are concerned very much about the amount of time they'll need to spend policing articles to make sure that things that wind up preferentially at the top of Google search result pages aren't full of advertising spam, lies, and cruft. Every article they add increases that burden. A reasonable line to draw in the sand is "we're only going to allow articles that make a clear statement of why their topic is notable, and for which an ordinary, disinterested editor could verify all the facts by following cited sources."

> things that wind up preferentially at the top of Google search result pages

I feel that this really should not be the worry of Wikipedia.

Well, it very much is.
> They don't show caring to actually invest in open platforms and software that others can use

You mean apart from Wikipedia and Mediawiki?

Mediawiki is made for Wikipedia; it is not software that can be used well outside of that environment. I know, because I've invested a lot of time into trying to make it so educators and communities could start their own wiki's, and what I've found is that it's way too complex to expect anyone other than a sysadmin to maintain, and way too hard for users to learn wiki markup and extensions to use it in any decent way in templates. Add on top the amount of work you need to do to just get Lua (An actual .php extension), Visual editor (Giant node.js project), or the math extension (another big node.js project) in, and you're looking at a massive amount of work that could break at any time. This is because its direction is not for individual installs, it is meant to be in an environment of sysadmins, consistent maintenance, and those who develop the framework.

Wikipedia is also not that 'open'. What you edit has to fit specific guidelines, and of course get past the moderators to be approved. It also misses the point of freedom of information, because there's tons of info out there that people want to put out, but doesn't fit the scheme of Wikipedia.

The Wikimedia vision is: "Imagine a world in which every single human being can freely share in the sum of all knowledge. That's our commitment.", and that's what I'm criticizing Wikimedia and its projects for not upholding.

> What you edit has to fit specific guidelines

If you want truly open communication, go to 4chan. It's a cesspool, which is fine for people that like that, but it's much further away from being 'information for everyone'. Claiming that WP isn't open because they want a basic level of quality is just grinding an axe - WP has informed a far greater number of humans to a far higher level of quality than all the ^chans put together.

> I know, because I've invested a lot of time into trying to make it so educators and communities could start their own wiki's

And yet MediaWiki is sprayed in wikis all over the internet. I've set it up as well. Yes, learning the full markup isn't trivial, but the basic stuff is. And I'm not sure how it's "not open" simply because it has a learning curve. Does this mean Vim is not open? Emacs? Apache? OpenBSD? The Vim GUI sucks, because it's outside its expected environment of a terminal - does that also mean it's "not open"?

If you want an example of open software that is designed specifically 'for the punters', look at Gnome 3... where you're pretty locked down and can't do much (cue complaints then about 'freedom'), but there's no learning curve. LibreOffice is made for general consumption and still gets complaints about being difficult to use from the punters - and even then, if you want to use the more advanced features, there's a learning curve.

Complex software has a learning curve, and hiding that learning curve is really difficult. Apple 'solved' this by simply removing functionality and configurability (again, cue complaints about losing freedom). If Apple made a wiki, you wouldn't even have the choice of adding that maths plugin. Hell, you probably wouldn't even be able to skin it.

> it is meant to be in an environment of sysadmins

It's a heavyweight engine that you're wanting to put lots of heavyweight stuff on. That's what they're designing for, and it has some warts, but it works. It's daft to complain that the engine primarily written by a non-profit for one of the top 5 websites isn't written as a one-click install feather-light application.

Basically you're holding WP to an impossible standard and complaining that they don't measure up.

>If you want truly open communication, go to 4chan.

It's not about open communications, it's the fact that information has to fit the criteria Wikipedia deems necessary, and that does not fit a lot of information out there. For example, a group of professionals in building design want to create an educational site on how to get started digitally: what software to use, the theories and factors in play when creating a building, all fueled by their real-life experience and education over time. That is not something you can put on Wikipedia. You can put some of the theory, but ultimately experience is lost in translation or deleted due to no sources.

>Claiming that WP isn't open because they want a basic level of quality is just grinding an axe

It's not open in anywhere near the way their vision states. It's open for sourced information and whatever various mods will allow. Which is fine if that's how they want it to be, but to claim it's an open platform for information is false.

>WP has informed a far greater number of humans to a far higher level of quality than all the ^chans put together.

I am not advocating that Wikipedia just be an open book to write whatever you want, but that its platform does not support much outside of sourced info, which is a category of information, not the sum of all information, and leaves out a lot of other information that doesn't fit its guidelines.

>Yes, learning the full markup isn't trivial, but the basic stuff is. And I'm not sure how it's "not open" simply because it has a learning curve.

Making a wiki has very little to do with the basic markup, and a lot more to do with designing templates and organizing how your data is formatted and presented, and that is what mediawiki fails to do in a manner that is accessible. Difficulty does reduce accessibility, which in fact does reduce its openness. If it were simpler, well documented, and searchable, then there would be a lot more writers. A lot of the problems can be solved by having a markup language that also acts like a programming language, being able to work with variables and inputs and do transforms on them, much like an actual templating language.

The learning curve of Libreoffice or other programs of that nature is a false equivalence. Adding a graph in Libreoffice takes a few clicks of the dropdowns, maybe a few tries of adding in info. Adding in a graph into mediawiki requires you to find an extension, install that, learn its syntax, and god forbid you add it into a template dynamically, learn how to get data variables from wiki markup. It is significantly more work and understanding of tech.

>Complex software has a learning curve, and hiding that learning curve is really difficult.

Yes, it is, but it is possible. If the software were designed more for use outside of the wikipedia environment, similar to how frameworks like Drupal 8 or Wordpress are, it would be much more maintainable and learnable. Understand that wikitext is just a small, small part of a mediawiki environment, and even that's enough to bar entry for many people.

>It's a heavyweight engine that you're wanting to put lots of heavyweight stuff on. That's what they're designing for, and it has some warts, but it works.

It's an old engine that is very integrated into itself with a lot of tech debt that hasn't been paid back. They are designing for that, not for creating a framework that best suits accessible, editable, presentable information.

>It's daft to complain that the engine primarily written by a non-profit for one of the top 5 websites isn't written as a one-click install feather-light application.

Wikimedia does not claim that Mediawiki is made for WP and shouldn't be used outside of it. I claim that, but that's not how it should be.

An application being 'heavy' has nothing to do with its maintainability or usability to the end user. Arguably, Wordpress is much heavier, yet has a built-in auto updater, plugins and theme installer, and is quite easy to setup.

>Basically you're holding WP to an impossible standard and complaining that they don't measure up.

I'm holding /Wikimedia/ to the standard they've set for themselves, with expectations much lower than that, and still it doesn't hold up, because they are not actually doing what their vision is, they're just making their own product where information has to fit their guidelines. You can argue that Wikimedia is a non-profit, or the software is complex, or whatever you'd like, but the reality is that there is a significant amount of information that will never be passed into the internet space because good platforms for it don't exist yet.

How many 50-person companies do you think have existed, globally, since 2001?

Let's just make some very rough estimates. Sweden, with a population of 10 million, creates an average of 36,500 new companies per year. Let's say that 5% reach 50 employees at some point, which would result in about 180 Wikipedia articles per million population per year. There are 7.4 billion people in the world, so that is 180 * 1000 * 7.4 * 16, which would be a bit over 21 million Wikipedia articles that only cover new 50-person and bigger companies (not including companies created before 2001).
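For anyone who wants to check the arithmetic, here is a minimal sketch of the estimate above. Every input is the comment's own rough figure, not verified data, and the function names are just illustrative.

```python
# Back-of-envelope reproduction of the estimate above.
# All inputs are the comment's own rough figures, not verified data.

def rate_per_million(new_companies_per_year, percent_reaching_50, population_millions):
    """Companies per million people per year that eventually reach 50 employees."""
    return new_companies_per_year * percent_reaching_50 / 100 / population_millions

def articles_estimate(per_million_per_year, world_population_millions, years):
    """Total number of such companies worldwide over the given span of years."""
    return per_million_per_year * world_population_millions * years

# Sweden: 36,500 new companies per year, 5% assumed to reach 50 people, 10 million people.
sweden_rate = rate_per_million(36_500, 5, 10)
print(sweden_rate)  # 182.5, which the comment rounds down to 180

# World: 7.4 billion people (7,400 million), 16 years since 2001.
total = articles_estimate(180, 7_400, 16)
print(total)  # 21312000
```

The result depends almost entirely on the 5% guess and on extrapolating Sweden's rate to the whole world, so it is only as good as those two assumptions.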

The people who accused you of "single purpose account" were of course at fault since they should have assumed good faith, but I can't generally disagree with the notion that a 50-person startup might need more than employees to be notable enough for a encyclopedia.

The assumption that the world creates 50 person companies at the same rate as Sweden is not likely.

Company listings are actually often really useful on wikipedia, there can be some outside info that is far better than what the company itself has and if there are suspect things about the company wikipedia can link to them as well.

Wikipedia at one point deleted the article on Atlassian because 'it wasn't notable'.

Unlike so many minor characters in Star Wars...

I said "very rough estimates" since indeed the world average is unlikely to be exactly the same as Sweden. But even if we reduce the global rate to 10% of Sweden (fair?), it is still 2 million articles and would cover half the current size of English Wikipedia.

Usefulness of non-notable articles is often discussed in Wikipedia. One side generally argues that any article that is useful should be included. The idea that notability is the criterion and not usefulness is an interesting discussion, and part of the deletionism versus inclusionism controversy.

Considering approximately 2 billion people in the world are self-sustaining farmers, even 10% of Sweden's rate is an over-estimate.
>enough for a encyclopedia.

I hate this argument. Wikipedia is not some storage bound book shipped out to people. There should be no limit on articles as long as the content is verifiable by contributors.

You might say, "but the disambiguation page will get big." well, that's a technical problem that can be fixed. I can search for something on Google and find it easily even though it contains far more than Wikipedia.

And I hate that argument too. There are long-term costs for maintaining any page on wikipedia, because there's always people out there looking to insert troll content or provide biased information for a specifically-targeted search term.

I'd estimate the typical page on Wikipedia has 0-1 people actively looking after it. And some of these articles are extremely popular (but noncontroversial) people/places/things. Wikipedia is full of articles which are "done" but still suck.

So you cannot look at a volunteer project and determine the storage costs are negligible, no problem, because that's very obviously not the main challenge.

> There are long-term costs for maintaining any page on wikipedia, because there's always people out there looking to insert troll content or provide biased information for a specifically-targeted search term.

But these costs do not scale on a per-page basis; rather, they scale based on the number of trolls. I don't think the number of pages meaningfully changes the amount of effort "trolls" put into "trolling"; meanwhile, automated tools like watchlists allow you to keep an eye on an unlimited number of pages.

It should be much easier to automate anti-"trolling" tools on fringe pages which get very few edits - e.g. automatically adding newly-created or rarely-edited pages to a watchlist.

Finally, it doesn't look like wikipedia has a great editor retention policy if the problem was really combating trolls; there seems to have been an assumption of bad faith on the part of GP - if he is really a "PR shill", then no skin off their back - if they're paid to do it, they'll keep trying, becoming a "troll". However, if he was to be a legitimate editor, blaming them for starting in their own topic of interest (even if it was self-promotion) doesn't seem like a good way to retain them as a long-term editor.

> they scale based on the number of trolls

Low-level trolling has almost zero cost on wikipedia, you don't even need an account. Especially for articles that are largely politically uncontroversial and "done". So they probably catch most of the vandals, and use their process to stop maybe the top 20% of political kooks. But when some random adds dubious information to a long-tail article, it can hang around for years.

Moderators are voting to remove pages that have validated data. That's idiotic because it's more work than leaving the article be in a locked state.

I would rather have a bunch of articles that are locked rather than deleted by some moderator that thinks they are defending the glory of worthy human knowledge.

There are no moderators, you could comment too.
In Sweden, the current proportion of companies greater than 50 people is closer to 0.5%. Since many companies close before reaching this size, the rate attaining that size is much lower by another factor of 10. So, your numbers exaggerate the amount by at least 100x on this basis, not to mention the huge portion of the world population which is not living in advanced economies with high rates of corporate formation.
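As a rough sketch, applying the two factor-of-10 corrections above to the parent's figure of 180 * 1000 * 7.4 * 16 looks like this. It still carries the parent's assumption that Sweden's rate applies worldwide, which this comment also disputes, so treat the result as an upper bound at best.

```python
# Parent comment's estimate: 180 per million per year, 7,400 million people, 16 years.
original_estimate = 180 * 7_400 * 16
print(original_estimate)  # 21312000

# Corrections suggested above (both are this comment's rough factors):
share_correction = 10      # 5% assumed vs. closer to 0.5% observed
attrition_correction = 10  # many companies close before reaching 50 people

corrected = original_estimate // (share_correction * attrition_correction)
print(corrected)  # 213120
```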
It would be fun to estimate storage space for the articles.
Can you point us to the actual article you wrote? What was its title?
Tip: It would help to disclose in the user page too.