Hacker News
by m_mueller 1298 days ago
I can agree with this statement: "HTML provides a pretty face, but a lousy system-to-system transaction environment."

Imagine if we had something like JSON, but with a more standardized schema to describe, e.g., something as simple as a table of data. There is so much glue code being written that does nothing other than reformat data into something an organization can actually use. Think what it would mean if every table on the web were fully addressable with a URI, such that you could load it directly into whatever you want: pandas dataframes, Tableau, a C++ vector, Excel, your next fancy executive board PowerPoint chart, whatever. Make it n-dimensional from the start and give the shape plus each column a type definition. I guess one can dream...
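To make the idea a bit more concrete, here is a minimal sketch of what a self-describing table could look like on top of plain JSON. The envelope fields (`shape`, `columns`, `data`), the type names, and the `load_table` helper are all made up for illustration and are not part of any existing standard. It only handles the 2-D case, not the n-dimensional one.

```python
import json

# Hypothetical type vocabulary; a real standard would need far more than this.
TYPES = {"int": int, "float": float, "str": str}

def load_table(text):
    """Parse a typed 2-D table and check it against its declared shape and column types."""
    doc = json.loads(text)
    rows, cols = doc["shape"]
    columns = doc["columns"]
    if len(columns) != cols or len(doc["data"]) != rows:
        raise ValueError("declared shape does not match payload")
    table = {col["name"]: [] for col in columns}
    for row in doc["data"]:
        if len(row) != cols:
            raise ValueError("ragged row")
        for col, cell in zip(columns, row):
            # Coerce each cell to the declared column type; bad cells raise.
            table[col["name"]].append(TYPES[col["type"]](cell))
    return table

# Made-up sample data, only here to exercise the loader.
example = json.dumps({
    "shape": [2, 2],
    "columns": [{"name": "sensor", "type": "str"},
                {"name": "reading", "type": "float"}],
    "data": [["a", 1.5], ["b", 2]],
})
table = load_table(example)
```

The result is a plain dict of columns, which is already close to what a dataframe constructor or a charting tool would want as input.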

3 comments

There is a lot of glue code written because we are only transporting data through HTTP; that is the problem.

A concept that has been buzzing around my head a lot lately: what if we could model objects/actors? So that I do not just go to google.com to display some HTML, but we have a standardised RPC language in which I can tell it to "search this query" and it returns a structured object? The same RPC I could use to talk to my Hue lamp and tell it "turn red."

In fact, our current HTML model can nicely map to a "render a thing to HTML" method call.

We spend too much time building complex systems by either scraping HTML, or gluing together incompatible APIs from vendors.

I want the Internet to be like the Erlang virtual machine. Each server is an independent actor that holds state, can send and receive messages, but they are not very trusted.
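A rough in-process sketch of what I mean, assuming nothing about a real wire protocol: every actor owns its state and is reached through one uniform `receive(verb, **args)` call. The class names, the verbs, and the reply shape are all hypothetical. Because peers are not trusted, an unknown verb gets an error reply instead of an exception.

```python
import html

class Actor:
    """An independent unit that owns its state and reacts only to messages."""
    def __init__(self):
        self.handlers = {}

    def receive(self, verb, **args):
        handler = self.handlers.get(verb)
        if handler is None:
            # Untrusted callers may send anything; answer with an error reply.
            return {"ok": False, "error": "unknown verb: " + verb}
        return {"ok": True, "result": handler(**args)}

class Lamp(Actor):
    def __init__(self):
        super().__init__()
        self.color = "white"
        self.handlers["turn"] = self.turn

    def turn(self, color):
        self.color = color
        return self.color

class SearchEngine(Actor):
    def __init__(self, pages):
        super().__init__()
        self.pages = pages
        self.handlers["search"] = self.search
        # Today's HTML model is just one more method on the same actor.
        self.handlers["render-html"] = self.render_html

    def search(self, query):
        return [page for page in self.pages if query in page]

    def render_html(self, query):
        items = "".join("<li>%s</li>" % html.escape(p) for p in self.search(query))
        return "<ul>" + items + "</ul>"

lamp = Lamp()
engine = SearchEngine(["erlang actors", "json tables"])
lamp_reply = lamp.receive("turn", color="red")
search_reply = engine.receive("search", query="erlang")
html_reply = engine.receive("render-html", query="erlang")
```

The lamp and the search engine are addressed the same way; only the verbs differ.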

> I want the Internet to be like the Erlang virtual machine. Each server is an independent actor that holds state, can send and receive messages, but they are not very trusted.

you sound quite a bit like Alan Kay there (and that's not a bad thing IMO).

I know, my extended idea is borne out of Alan Kay and his vision. It is very nebulous at this stage to expand further, but I do strongly agree with him now that "the computer revolution hasn't happened yet."
Are you trusted? The reason g-search and others don’t have an API is not a technical one. They won’t give you either JSON or HTML without a captcha module injected into your browser, because this is the way they earn money.
So? That can still be modelled in a RPC manner.

Instead of sending a "query <string>" command, I have to do:

    -> get-captcha
    <- returns a captcha object
    -> solve-captcha <id> <solution>
    <- returns a token string
    -> query <token> <string>
    <- returns a list of results
Very simplified. If HTTP is the transport, the existing Authorization header is very good at encapsulating these details without having to repeat them in every command.
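The exchange above could be sketched like this, as a toy in-memory model rather than anything a real search provider exposes. `SearchService`, `Session`, the arithmetic "captcha", and the result strings are all hypothetical. The `Session` object plays the role the header would play over HTTP: it holds the token once so callers only pass the query.

```python
import secrets

class SearchService:
    """Toy in-memory service; every name here is made up for illustration."""
    def __init__(self):
        self.challenges = {}  # captcha id -> expected solution
        self.tokens = set()   # tokens issued after a solved captcha

    def get_captcha(self):
        cid = secrets.token_hex(4)
        self.challenges[cid] = "7"
        return {"id": cid, "question": "3 + 4"}

    def solve_captcha(self, cid, solution):
        # Each challenge can be used once; unknown ids are rejected too.
        expected = self.challenges.pop(cid, None)
        if expected is None or expected != solution:
            raise PermissionError("wrong captcha solution")
        token = secrets.token_hex(16)
        self.tokens.add(token)
        return token

    def query(self, token, text):
        if token not in self.tokens:
            raise PermissionError("missing or invalid token")
        return ["result for " + text]

class Session:
    """Runs the captcha handshake once and attaches the token to every later call."""
    def __init__(self, service, solver):
        captcha = service.get_captcha()
        self.service = service
        self.token = service.solve_captcha(captcha["id"], solver(captcha["question"]))

    def query(self, text):
        return self.service.query(self.token, text)

service = SearchService()
session = Session(service, solver=lambda question: "7")
results = session.query("erlang")
```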

Point is, HTTP is still too low level and we're paying people $150k a year to write glue code and reimplement API clients until they quit and do the same thing at another company.

Could work for the traditional image check, but not at all for the user behavior analysis magic they do, which seems to be taking over.
I agree with you.

The problems that computers are trying to solve are partly the expression problem.

Given this state, do this calculation.

Object orientation for me means creating relationships between arbitrary groups of objects and sending messages between objects to do things.

CSV, for all its faults, is probably the closest thing we have to a "universal 2d table format you can import anywhere."
What would you want that's significantly better than an n-dimensional json array? A sparse table format?

You'll still need to handle the core cell-type parsing. You'll still need to deal with what level of normalization the table used, i.e., are cells primitives, or objects, or different objects with conflicting structures.

an n-dimensional json array still doesn't come with strong type information (other than str/null/decimal/int) for the fields and it doesn't give you a guaranteed shape either. also, there is no standard for how to declare the field header labels. also, it's not binary, thus very inefficient for numerical data. there are various data formats that help in those regards, but none of them is standardized, hence all our fingers being sticky from the glue.
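a small sketch of the "not binary" point, using only the python standard library. the sample values are arbitrary square roots picked so that most of them need many digits as text; the packed layout (little-endian float64) is just one possible choice, not a proposed standard.

```python
import json
import math
import struct

# Arbitrary sample column of 1000 floats.
values = [math.sqrt(i) for i in range(1, 1001)]

# Text encoding: every number is spelled out digit by digit.
as_json = json.dumps(values).encode("utf-8")

# Binary encoding: "<" = little-endian, "d" = IEEE 754 float64, 8 bytes per value.
fmt = "<%dd" % len(values)
as_binary = struct.pack(fmt, *values)

# Unpacking gives back exactly the same floats.
decoded = list(struct.unpack(fmt, as_binary))

json_size = len(as_json)
binary_size = len(as_binary)  # always 8 * len(values)
```

for this column the packed form is a fixed 8 bytes per value, while the json text is noticeably larger and still has to be parsed number by number on the other end.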