by dredmorbius 1063 days ago
For what it's worth, I've been looking at HN front-page activity since 2007. I've recently classified the most frequently appearing sites to give a breakdown of what type of content appears on the front page (as typified by site). Taking two arbitrary years, 2009 (after HN got reasonably established) and 2022 (the most recent full year of data), what's notable is that general news sites are less prevalent, and programming content (mostly links to specific languages and/or source repos) is more prevalent, in the more recent period.

2009

   Posts:  10950  Sites:   1129   Submitters:   1440

  Class                 Stories    Votes   (mean) Comments  (mean)
          UNCLASSIFIED:     3741     3741    1.00     3741    1.00
                  blog:     1930     1930    1.00     1930    1.00
                   n/a:     1734    43045   24.82    45495   26.24
             tech news:     1118     1118    1.00     1118    1.00
          general news:      878      878    1.00      878    1.00
       corporate comm.:      450      450    1.00      450    1.00
         business news:      407      407    1.00      407    1.00
    academic / science:      337      337    1.00      337    1.00
           programming:      316      316    1.00      316    1.00
      general interest:      124      124    1.00      124    1.00
              software:       90       90    1.00       90    1.00

2022

   Posts:  10950  Sites:   1158   Submitters:   1398

  Class                 Stories    Votes   (mean) Comments  (mean)
          UNCLASSIFIED:     4844     4844    1.00     4844    1.00
           programming:     1146     1146    1.00     1146    1.00
                  blog:     1123     1123    1.00     1123    1.00
                   n/a:      864   167707  194.11   125736  145.53
    academic / science:      567      567    1.00      567    1.00
          general news:      444      444    1.00      444    1.00
       corporate comm.:      406      406    1.00      406    1.00
             tech news:      400      400    1.00      400    1.00
          social media:      252      252    1.00      252    1.00
      general interest:      222      222    1.00      222    1.00

Notes:

- The "(mean)" columns have bad data; I need to fix my code. The others should be reasonable.

- "UNCLASSIFIED" are sites I've not manually classified. They tend to follow roughly the same overall distribution, though with more blogs and fewer news sources.

- "n/a" are posts without a site, typically an "Ask", "Tell", "Who's Hiring", or similar post.

Keep in mind that even "general news" is often about science, technology, or tech-adjacent business, legislation, court decisions, etc.

1 comments

Update on the votes/comments/mean values: it turns out that the votes and comments counters were also bad because of my accumulator code (I'd thought this was only a reporting issue). The story counts are legit, however.

(I was counting "stories" as both "votes" and "comments", which is obvious on eyeballing the values. The difference between "++" and "+= votes / += comments". Sigh.)
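For anyone curious what that kind of slip looks like, here is a minimal sketch in Python. It is not the actual script: the `tally` function, the `buggy` flag, and the sample rows are all hypothetical, and Python has no "++", so the buggy branch uses "+= 1" to stand in for it.

```python
from collections import defaultdict

# Hypothetical sample rows: (site class, votes, comments). Not real HN data.
posts = [
    ("blog", 120, 45),
    ("blog", 30, 10),
    ("programming", 200, 80),
]

def tally(rows, buggy=False):
    """Accumulate per-class story, vote, and comment totals."""
    totals = defaultdict(lambda: {"stories": 0, "votes": 0, "comments": 0})
    for site_class, votes, comments in rows:
        t = totals[site_class]
        t["stories"] += 1
        if buggy:
            # The mistake: adding one per post (the "++" case), so both
            # columns simply repeat the story count.
            t["votes"] += 1
            t["comments"] += 1
        else:
            # The fix: add the post's own vote and comment counts.
            t["votes"] += votes
            t["comments"] += comments
    return dict(totals)

for label, result in (("buggy", tally(posts, buggy=True)), ("fixed", tally(posts))):
    for site_class, t in result.items():
        mean_votes = t["votes"] / t["stories"]
        print(f"{label:5} {site_class:>12}: {t['stories']:3d} {t['votes']:5d} {mean_votes:7.2f}")
```

With `buggy=True`, every class reports votes and comments equal to its story count and a mean of 1.00, which is the pattern most rows show in the first pair of tables.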

Actual / corrected data:

2009

   Posts:  10,950  Sites:   1,129   Submitters:   1,440

  Class                 Stories    Votes   (mean) Comments  (mean)
          UNCLASSIFIED:     3741   208521   55.74    90426   24.17
                  blog:     1930   121329   62.86    53561   27.75
                   n/a:     1734    84356   48.65    89256   51.47
             tech news:     1118    58870   52.66    29115   26.04
          general news:      878    41661   47.45    25054   28.54
       corporate comm.:      450    30740   68.31    13541   30.09
         business news:      407    20064   49.30    11748   28.86
    academic / science:      337    16806   49.87     6951   20.63
           programming:      316    19170   60.66     7820   24.75
      general interest:      124     7081   57.10     3370   27.18

2022

   Posts:  10,950  Sites:   1,158   Submitters:   1,398

  Class                 Stories    Votes   (mean) Comments  (mean)
          UNCLASSIFIED:     4844  1440894  297.46   767619  158.47
           programming:     1146   308222  268.95   117139  102.22
                  blog:     1123   322989  287.61   202251  180.10
                   n/a:      864   334550  387.21   250608  290.06
    academic / science:      567   130608  230.35    73908  130.35
          general news:      444   158965  358.03   154772  348.59
       corporate comm.:      406   144991  357.12    85870  211.50
             tech news:      400   122049  305.12    92895  232.24
          social media:      252   124105  492.48    87830  348.53
      general interest:      222    53305  240.11    47007  211.74
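
As a rough check on the general news vs. programming shift, the story counts (which were correct in both sets of tables) can be turned into shares of the 10950 front-page posts per year. This is only a sketch: the `share` helper and the dictionary layout are illustrative, and the shares are of all posts, so the large UNCLASSIFIED group is not redistributed.

```python
POSTS_PER_YEAR = 10950  # "Posts" figure reported for both 2009 and 2022

# Story counts copied from the tables for the two classes discussed.
stories = {
    "general news": {2009: 878, 2022: 444},
    "programming": {2009: 316, 2022: 1146},
}

def share(count, total=POSTS_PER_YEAR):
    """Return count as a percentage of total."""
    return 100.0 * count / total

for site_class, by_year in stories.items():
    s2009 = share(by_year[2009])
    s2022 = share(by_year[2022])
    print(f"{site_class:>14}: {s2009:5.2f}% (2009) -> {s2022:5.2f}% (2022)")
```

On these counts, general news falls from roughly 8.0% to 4.1% of front-page stories, while programming rises from roughly 2.9% to 10.5%.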