by codelani 1632 days ago
Wow, I was alerted by a few pull requests and then pleasantly surprised to see this here.

As tyingq pointed out, this is not a list of "open source" languages, though I don't think the OP was far off in adding that, since most of them are. It's also a bit broader than "programming languages": the list is termed "computer languages". Programming languages are the main category, at ~75% of the entries, but formats and other things are counted as well (see table below). Even musical notations make an appearance, as I find those relevant to people interested in designing computer languages for music, or visual languages in general. While the focus is on computer languages, I think it's helpful to have a light touch of some of the earlier developments in language in general. So I didn't draw explicit lines; rather, the strategy is to keep the focus on programming languages with a peripheral view of the bigger picture.

    type                        count
    pl                          3096 
    application                 111  
    queryLanguage               82   
    textMarkup                  67   
    grammarLanguage             65   
    xmlFormat                   61   
    editor                      57   
    packageManager              56   
    binaryDataFormat            51   
    metaLanguage                50   
    template                    49   
    library                     40   
    textData                    39   
    protocol                    37   
    esolang                     37   
    notation                    36   
    assembly                    34   
    ir                          20   
    compiler                    20   
    isa                         18   
    standard                    18   
    idl                         17   
    schema                      14   
    visual                      14   
    computingMachine            13   
    plzoo                       12   
    filesystem                  11   
    framework                   11   
    jsonFormat                  11   
    hashFunction                10   
    os                          10   
    ...

As for accuracy: there are ~420,000 cells in my "spreadsheet", and my initial target accuracy was ~98% or so. The cells were gathered through a mixture of manual curation, crawlers, simple NLP models, and contributions from the community.
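
To put the ~98% target in perspective, here is a rough sketch of the arithmetic. It assumes "accuracy" means the fraction of individual cells that are correct, which the comment does not spell out; the function name is made up for illustration.

```python
# Back-of-the-envelope error budget implied by the figures quoted above.
# Assumption: accuracy is measured per cell (fraction of correct cells).
TOTAL_CELLS = 420_000       # approximate cell count quoted above
TARGET_ACCURACY_PCT = 98    # approximate target accuracy quoted above


def expected_bad_cells(total_cells: int, accuracy_pct: int) -> int:
    """Return how many cells may be wrong at the given accuracy percentage."""
    # Integer arithmetic avoids floating-point noise in the result.
    return total_cells * (100 - accuracy_pct) // 100


print(expected_bad_cells(TOTAL_CELLS, TARGET_ACCURACY_PCT))  # 8400
```

Under that assumption, even at the target accuracy there could be on the order of 8,400 incorrect cells, which is consistent with corrections arriving as pull requests.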

This project has sadly fallen by the wayside. I need to decide whether to 1) abandon it and instead just contribute facts to the relevant Wikipedia pages as I find them, or 2) determine whether there's a good reason to build a fact site like this outside Wikipedia and, if so, get it back into gear.

Sorry about any inaccuracies and thank you for the feedback (and especially the pull requests!).