| The 1st source is a blog post on a consulting company website. The 2nd mentions Arrow only in passing, after several pages of coverage of Spark; Arrow is covered only in relation to Spark. It's a reliable source but doesn't clearly establish notability. The 3rd mentions Arrow hardly at all; it's an implementation detail, mentioned just once, in a paper about something else. I can't fetch the 4th. The 5th, a story in The Register, is reliable and probably does go towards notability, though it seems to sort of argue against it (the gist of the article is that it's surprising that Arrow has been made a top-level project at all). The 6th, in CIO, is a recap of a press release. Trade press PR recaps shouldn't be WP:RS, but WP will often accept them, or would when I was patrolling AfD; it's luck-of-the-draw. The admins who shot down Arrow's page were smart enough not to accept it. The 7th, in InfoWorld, is promotional as well, but it's at least written in some depth. It's a straightforward notability claim. The Arrow article should draw more clearly from it, in the opening paragraph. The 8th, in SDTimes, is written by someone affiliated with the project itself; it's citable, but WP probably won't accept it independently as grounds for notability. Same, in effect, for the 9th, which is just a recap of an interview with the project author. The 10th and 11th are just blog posts. They're citable if they're not contentious, but they usually won't be acceptable as WP:RS for notability. |
"Recognizing that Value Vectors meet the needs of other data processing engines, in February 2016, the Apache Software Foundation announced Apache Arrow as a top-level project, bypassing the standard Incubator process. Committers to the project include developers from other Apache projects such as Calcite, Cassandra, Drill, Hadoop, HBase, Ibis, Impala, Kudu, Pandas, Parquet, Phoenix, Spark and Storm.
Apache Arrow enables execution engines like Spark to take advantage of the latest operations included in modern processors, for fast analytical data processing. Columnar layout of data allows for better use of CPU caches by placing all data relevant to a column operation in as compact of a format as possible. ...
Apache Arrow software is available under the Apache License v2.0.
Dremio, a startup led by Jacques Nadeau, chair of the Apache Drill and Apache Arrow Project Management Committees, leads the development."
In the past, this and the other sources would have been more than enough to establish notability. I know that because I have created Wikipedia articles on subjects much less notable than that. The problem for Apache Arrow isn't that it isn't notable enough, it is that people have already tried four times to get it included in Wikipedia so the Wikipedians voting on new page inclusions are getting suspicious about it.