by motrm 719 days ago
If the pieces of state are all well known at build time - and trusted in terms of their content - it may be feasible to print out JSON 'manually' as it were, instead of needing to use a JSON library:

  print "{";
  print "\"some_state\": \"";
  print GlobalState.Something.to_text();
  print "\", ";
  print "\"count_of_frobs\": ";
  print GlobalState.FrobsCounter;
  print "}";
Whether it's worth doing this just to rid yourself of a dependency... who knows.
4 comments

This looks like the exact kind of thing that results in unexpected exploits.
Hand rolled JSON input processing, yes. Hand rolled JSON output, no.

You're gonna have a hard time exploiting a text file output that happens to be JSON.

> You're gonna have a hard time exploiting a text file output that happens to be JSON.

If you’re not escaping double quotes in strings in your hand-rolled JSON output, and some string you’re outputting happens to be something an attacker can control, then the attacker can inject arbitrary JSON. Which probably won’t compromise the program doing the outputting, but it could cause whatever reads the JSON to do something unexpected, which might be a vulnerability, depending on the design of the system.
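To make the injection concrete, here is a minimal sketch in Python. The function `naive_state_json`, the `admin` field, and the payload are all hypothetical, made up for illustration; the reader is Python's standard `json` module, which keeps the last value when a key is repeated.

```python
import json

def naive_state_json(name: str) -> str:
    # Hand-rolled output with no escaping (hypothetical example).
    return '{"admin": false, "name": "' + name + '"}'

# Hypothetical attacker-controlled string: it closes the "name" value
# early and smuggles in an extra key.
payload = 'x", "admin": true, "note": "y'

doc = naive_state_json(payload)
# doc is now: {"admin": false, "name": "x", "admin": true, "note": "y"}

parsed = json.loads(doc)  # duplicate key: the last "admin" wins here
print(parsed["admin"])
```

The writer never crashed and the output is valid JSON, but the consumer now sees a value the writer never intended to emit. Other parsers may resolve the duplicate key differently, which is part of why the result depends on the design of the system.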

If you are escaping double quotes, then you avoid most problems, but you also need to escape control characters to ensure the JSON isn’t invalid. And also check for invalid UTF-8, if you’re using a language where strings aren’t guaranteed to be valid UTF-8. If an attacker can make the output invalid JSON, then they can cause a denial of service, which is typically not considered a severe vulnerability but is still a problem. Realistically, this is more likely to happen by accident than because of an attacker, but then it’s still an annoying bug.
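As a rough sketch of what that escaping involves, assuming Python and a hypothetical helper named `escape_json_string`: quotes and backslashes get a backslash escape, and every other character below U+0020 gets a `\uXXXX` escape. Python strings are Unicode rather than raw bytes, so the invalid UTF-8 concern shows up mainly as lone surrogates, which this sketch does not handle.

```python
import json

def escape_json_string(s: str) -> str:
    """Return s as a quoted JSON string literal (illustrative sketch)."""
    short = {'"': '\\"', '\\': '\\\\', '\n': '\\n', '\r': '\\r', '\t': '\\t'}
    out = []
    for ch in s:
        if ch in short:
            out.append(short[ch])
        elif ord(ch) < 0x20:
            # Remaining control characters must not appear raw in JSON.
            out.append('\\u%04x' % ord(ch))
        else:
            out.append(ch)
    return '"' + ''.join(out) + '"'

# Round-trip check against a real parser.
tricky = 'say "hi"\\ \n\t\x00\x1f é'
assert json.loads(escape_json_string(tricky)) == tricky
```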

Oh, and if you happen to be using C and writing the JSON to a fixed-size buffer with snprintf (I’ve seen this specific pattern more than once), then the output can be silently truncated, which could also potentially allow JSON injection.

Handling all that correctly doesn’t require that much code, but it’s not completely trivial either. In practice, when I see code hand-roll JSON output, it usually doesn’t even bother escaping anything. Which is usually fine, because the data being written is usually not attacker-controlled at all. For now. But code has a tendency to get adapted and reused in unexpected ways.

Even better to just use TSV. Hand-rolling XML or JSON is always a smell to me, even if it's visibly safe.
Hand-rolling TSV is no better. The average TSV generator does not pay any mind to data cleaning, and quoting / escaping is non-standard, so guessing what the other side will do with it is basically playing Russian roulette.

Using C0 codes is likely safer at least in the sense that you will probably think to check for those and there is no reason whatsoever for them to be found in user data.

Do you mean TLV (tag-length-value)? I can't figure out what TSV is.
Tab Separated Values, like CSV but tabs instead of commas.
> If the pieces of state are all well known at build time - and trusted in terms of their content

... then use a library, because you should not rely on the assumption that the next developer adding one more piece to this code will magically remember to validate it against the JSON spec.

No magic necessary. Factor your hand-rolling into a function that returns a string (instead of printing as in the example), and write a test that parses its return value with a proper JSON library. Assert that the parsing was successful and that the extracted values are correct. Ideally you'd use a property test.
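A minimal sketch of that approach in Python, with assumptions stated up front: `render_state` is a hypothetical stand-in for the hand-rolled printer from the original example, and the randomized loop is a poor man's property test using only the standard library (a dedicated property-testing library would generate and shrink inputs more thoroughly).

```python
import json
import random

# Escape table: control characters, double quote, backslash.
_ESCAPES = {i: '\\u%04x' % i for i in range(0x20)}
_ESCAPES.update({ord('"'): '\\"', ord('\\'): '\\\\'})

def render_state(some_state: str, count_of_frobs: int) -> str:
    # Hand-rolled JSON output, returned as a string instead of printed.
    return '{"some_state": "%s", "count_of_frobs": %d}' % (
        some_state.translate(_ESCAPES), count_of_frobs)

def check_round_trip(trials: int = 500, seed: int = 0) -> int:
    # Feed awkward random inputs through the printer and a real parser.
    rng = random.Random(seed)
    alphabet = ['"', '\\', '\n', '\t', '\x00', '\x1f', 'a', 'é', '}', ',', ':']
    for _ in range(trials):
        text = ''.join(rng.choice(alphabet) for _ in range(rng.randrange(20)))
        count = rng.randrange(-10**6, 10**6)
        parsed = json.loads(render_state(text, count))
        assert parsed == {"some_state": text, "count_of_frobs": count}
    return trials

check_round_trip()
```

If someone later adds a field and forgets the escaping, the test fails as soon as a quote or control character lands in that field.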
That’s somewhat better than assembling, say, HTML or SQL out of text fragments, but it’s still not fantastic. A JSON output DSL would be better still—it wouldn’t have to be particularly complicated. (Shame those usually only come paired with parsers, libxo excepted.)
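For a sense of scale, here is a rough sketch of what such an output-only DSL could look like in Python. The class `JsonWriter` and its method names are invented for illustration and are not modeled on libxo or any existing library; it only covers objects, strings, integers, and booleans.

```python
import io
import json

class JsonWriter:
    """Hypothetical output-only JSON writer (illustrative sketch)."""

    _ESC = {i: '\\u%04x' % i for i in range(0x20)}
    _ESC.update({ord('"'): '\\"', ord('\\'): '\\\\'})

    def __init__(self, out):
        self.out = out
        self.first = []  # one flag per open object: True while it is empty

    def _sep(self):
        # Emit a comma before every member except the first in an object.
        if self.first:
            if not self.first[-1]:
                self.out.write(', ')
            self.first[-1] = False

    def _string(self, s):
        self.out.write('"' + s.translate(self._ESC) + '"')

    def begin_object(self, key=None):
        self._sep()
        if key is not None:
            self._string(key)
            self.out.write(': ')
        self.out.write('{')
        self.first.append(True)

    def end_object(self):
        self.first.pop()
        self.out.write('}')

    def field(self, key, value):
        self._sep()
        self._string(key)
        self.out.write(': ')
        if isinstance(value, bool):  # check bool before int
            self.out.write('true' if value else 'false')
        elif isinstance(value, int):
            self.out.write(str(value))
        elif isinstance(value, str):
            self._string(value)
        else:
            raise TypeError('unsupported value type: %r' % type(value))

buf = io.StringIO()
w = JsonWriter(buf)
w.begin_object()
w.field('some_state', 'say "hi"\n')
w.field('count_of_frobs', 3)
w.begin_object('nested')
w.field('ok', True)
w.end_object()
w.end_object()
result = buf.getvalue()
```

The caller never touches quotes, commas, or escapes, so adding a field cannot break the syntax; the writer still trusts the caller to balance `begin_object` and `end_object`.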