by leef 3716 days ago
Finally! I've had to live the JSON nightmare since I left Amazon.

Some of the benefits over JSON:

* Real date type

* Real binary type - no need to base64 encode

* Real decimal type - invaluable when working with currency

* Annotations - You can tag an Ion field in a map with an annotation that says, e.g. its compression ("csv", "snappy") or its serialized type ('com.example.Foo').

* Text and binary format

* Symbol tables - this is like automated jsonpack.

* It's self-describing - meaning, unlike Avro, you don't need the schema ahead of time to read or write the data.
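
For comparison, here is a minimal Python sketch of the workarounds plain JSON needs for three of the types listed above (binary, date, decimal). It uses only the standard library and does not touch Ion itself; the field names (`blob`, `created`, `price`) and the values are made up for illustration.

```python
import base64
import json
from datetime import datetime, timezone
from decimal import Decimal

# Hypothetical record; the field names and values are illustrative only.
record = {
    "blob": b"\x00\xff\x10",
    "created": datetime(2020, 1, 2, 12, 0, tzinfo=timezone.utc),
    "price": Decimal("15.00"),
}

# JSON has no binary, date, or decimal type, so each value is written as a
# string and the reader has to know out of band how to decode it.
encoded = json.dumps({
    "blob": base64.b64encode(record["blob"]).decode("ascii"),
    "created": record["created"].isoformat(),
    "price": str(record["price"]),
})

raw = json.loads(encoded)
decoded = {
    "blob": base64.b64decode(raw["blob"]),
    "created": datetime.fromisoformat(raw["created"]),
    "price": Decimal(raw["price"]),
}
```

After `json.loads`, all three fields are plain strings; nothing in the JSON itself says which decoder to apply to which field.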

6 comments

You could have used CBOR for many of those things (http://cbor.io/).
Thanks for the link!
Sounds a lot like Apple's property list format, which has almost everything you listed, except for annotations and symbol tables.

Its binary format was introduced in 2002!

Edit: Property lists only support integers up to 128 bits in size and double-precision floating point numbers. On top of those, Ion also supports arbitrary-precision decimals.

Plists are nifty, but the text format's XML-based, which makes it too complex and too verbose to be a general-purpose alternative to something like JSON.

(plutil "supports" a json format, but it's not capable of expressing the complete feature set of the XML or binary formats.)

I don't get this gripe with XML, it is meant to be used by tools not to be written by hand.

Where is the XPath and XQuery for JSON?

Do people really think that manually iterating over the whole JSON document to find the data or writing yet another parser, is better?

Any solid Java, C#, C++ libraries?
The XML serialization Apple defined for plists is awful and not easy to query with XPath.
Like property lists, the binary format is TLV-encoded. Ion has a more compact binary representation for the same data, plus additional types and metadata. Also, IIRC, plist types are limited to 32-bit lengths for all data types. The binary Ion representation has no such restriction (though in practice sizes are often limited by the language implementation).
Okay, but they did a really poor job marketing it in this release. Plus, if it's used within Amazon, why is it Java-only so far?
Amazon's mainly a Java shop, not sure if that helps you.
Not really the case - a lot of major projects (like boto and AWS CLI) are in Python.
These are only client-side interfaces. The server-side is usually much larger.
True, but my point was that there's enough talent at Amazon, working on SDKs, and others, and there are precedents where even more complex projects such as JMESPath have wide support [0].

[0]: http://jmespath.org/libraries.html

I'm sure there are ion bindings for every language in common use at Amazon. But a huge percentage of Amazon code is Java, so presumably this one was the best maintained and documented.
I doubt it. When I was there, Ion was used by only a handful of Java teams doing backend work. It was also horribly documented and supported at the time (3.5 years ago).
I am still in Amazon and Ion is definitely the most widely used library around. It has among the best documented code and some of the extensions that have been built on top of Ion are simply amazing.
I left < a year ago and never heard of ion.
What did you work on?
> Real decimal type - invaluable when working with currency

What does JavaScript do with this though, just cast it to a float?

The real way is:

  "price": {
    "amount": "1500",
    "scale": 2,
    "symbol": "GBP"
  }
Currency has 3 properties, the amount, scale, and symbol.

Amount is a string, it holds a bigint. Yes, it's a string.

The value of Scale can be up to 5 but is usually 2 or 3.

Symbol is the ISO code.

Whenever I see a financial system that uses "amount": 15.00 I know that the system is ill-conceived.
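
A minimal Python sketch of reading such a record back into an exact decimal value, assuming the three-field layout described above. The helper name `to_decimal` is hypothetical, not part of any library.

```python
from decimal import Decimal

def to_decimal(price: dict) -> Decimal:
    """Turn {"amount": "1500", "scale": 2, ...} into Decimal("15.00")."""
    # int() parses the string exactly; Python ints have arbitrary length.
    amount = int(price["amount"])
    sign = 1 if amount < 0 else 0
    digits = tuple(int(ch) for ch in str(abs(amount)))
    # Building the Decimal from (sign, digits, exponent) is exact and does
    # not go through a binary float at any point.
    return Decimal((sign, digits, -price["scale"]))

price = {"amount": "1500", "scale": 2, "symbol": "GBP"}
value = to_decimal(price)
```

Here `value` is `Decimal("15.00")`; the `symbol` field is carried alongside and plays no part in the arithmetic.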

What's the scale in this context? Your explanation doesn't really clarify.
From two different comments on here: http://stackoverflow.com/questions/5689369/what-is-the-diffe...

   Precision is the number of significant digits. Oracle guarantees the
   portability of numbers with precision ranging from 1 to 38.

   Scale is the number of digits to the right (positive) or left (negative)
   of the decimal point. The scale can range from -84 to 127.
Worth noting this isn't specifically an Oracle thing: most financial systems need to be sure that they can store currency amounts accurately, and this convention is widely used to ensure that.

And:

   Precision 4, scale 2: 99.99

   Precision 10, scale 0: 9999999999

   Precision 8, scale 3: 99999.999

   Precision 5, scale -3: 99999000
Typically, when dealing with currencies, scale is used only to represent units smaller than one whole unit of the currency, e.g. cents and pence. But nothing restricts it from being used to accommodate larger numbers via negative scales.
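
The four precision/scale examples quoted above can be checked with Python's `decimal` module. This is only a sketch: `precision_and_scale` is a hypothetical helper, and it assumes finite values (NaN and infinities have no numeric exponent).

```python
from decimal import Decimal
from typing import Tuple

def precision_and_scale(d: Decimal) -> Tuple[int, int]:
    """Return (number of significant digits, digits right of the point)."""
    _sign, digits, exponent = d.as_tuple()
    # A positive exponent means trailing zeros left of the decimal point,
    # which corresponds to a negative scale.
    return len(digits), -exponent

examples = {
    "99.99": precision_and_scale(Decimal("99.99")),
    "9999999999": precision_and_scale(Decimal("9999999999")),
    "99999.999": precision_and_scale(Decimal("99999.999")),
    "99999E3": precision_and_scale(Decimal("99999E3")),
}
```

The last entry is the negative-scale case: `Decimal("99999E3")` stores five digits with exponent 3, and `int(Decimal("99999E3"))` is 99999000.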

A current list of all ISO 4217 codes and the currency properties can be found here http://www.currency-iso.org/en/home/tables/table-a1.html

e.g.

   <CcyNtry>
      <CtryNm>UNITED STATES OF AMERICA (THE)</CtryNm>
      <CcyNm>US Dollar</CcyNm>
      <Ccy>USD</Ccy>
      <CcyNbr>840</CcyNbr>
      <CcyMnrUnts>2</CcyMnrUnts>
   </CcyNtry>
The CcyMnrUnts property denotes 2 decimal places for the sub-unit of the dollar.

So for the above example of 99999.999 you would store an amount of 99999999 and a scale of 3.
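
Going the other way, from a decimal value to the stored integer amount, might look like the sketch below. `to_minor_units` is a hypothetical helper; it refuses values that do not fit the given scale rather than silently rounding, and it relies on the default `decimal` context, so it is only meant for values of ordinary size.

```python
from decimal import Decimal

def to_minor_units(value: Decimal, scale: int) -> int:
    """Return the integer amount to store for `value` at the given scale."""
    # Shift the decimal point right by `scale` digits.
    shifted = value.scaleb(scale)
    # If anything is left after the point, the value has more decimal
    # places than the scale allows.
    if shifted != shifted.to_integral_value():
        raise ValueError(f"{value} does not fit in scale {scale}")
    return int(shifted)

amount = to_minor_units(Decimal("99999.999"), 3)
```

Here `amount` is 99999999, matching the example above, while `to_minor_units(Decimal("15.005"), 2)` raises `ValueError`.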

Cool, I didn't know about this technique. Thanks for the detailed explanation!
I guess "scale" here is log10 of the multiplier used in storing "amount".

So to convert the value back into the currency range, you're supposed to compute "amount / 10^scale".

What is the point of sending the integer "amount" as a string?
So that it can be an integer of arbitrary length, or a float/double without precision problems. You can also let your own integer classes do the parsing (which might even be able to handle complex types).

After all, everything in JSON is a string since it doesn't have a binary format and it shouldn't cause a huge overhead to do the parsing yourself (that might depend on the library, though).

I find that many financial technology companies opt to store currency as strings. The small overhead is typically well worth freedom from floating-point errors.
Don't they have to convert it back to a float to do math operations on it (if they're using JavaScript)?
Correct. We store as strings then derive a number from the string for sorting purposes.
Any text format is technically a string. Just because some numeric token has no double quotes around it doesn't mean it isn't a string (in that representation).

It's just that we can have some lexical rules that if that piece of text has the right kind of squiggly tail or whatever, it is always treated as a decimal float instead of every programmer working with it individually having to deal with an ad hoc string to decimal type conversion.

Move the decimal and use an int (cents in the US). It still blows me away that JavaScript has become so popular on the server without 64-bit int types.
Do you really need those extra 11 bits? Javascript numbers accurately represent integers up to 2^53 - 1. See https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...
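
The 2^53 limit is a property of IEEE 754 doubles in general, so it can be illustrated with Python floats, which are doubles too. A quick sketch:

```python
# Same value as JavaScript's Number.MAX_SAFE_INTEGER.
MAX_SAFE = 2**53 - 1

# Up to the limit, converting an integer to a double loses nothing.
exact_at_limit = float(MAX_SAFE) == MAX_SAFE

# One step past 2**53, two different integers map to the same double.
collides_past_limit = float(2**53) == float(2**53 + 1)
```

Both flags come out `True`. Counted in cents, `MAX_SAFE` is roughly 90 trillion dollars, which is the point being made above.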
Until you need to deal with decimals instead of floats. Then you are going to hate yourself, because you have to pull in some third-party library: the language treats every single number as a float (and floating-point errors are a lot more common than most people think, even when adding together simple numbers).
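
The classic case of an error from adding simple numbers, shown here in Python because its floats are the same IEEE 754 doubles that JavaScript uses for every number. A sketch only:

```python
from decimal import Decimal

# Binary floats cannot represent 0.1, 0.2, or 0.3 exactly, so the sum
# lands on a different double than the literal 0.3.
float_sum = 0.1 + 0.2
float_matches = float_sum == 0.3

# A decimal type stores the digits as written, so the sum is exact.
decimal_sum = Decimal("0.1") + Decimal("0.2")
decimal_matches = decimal_sum == Decimal("0.3")
```

`float_matches` is `False` and `decimal_matches` is `True`.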
Integers will do no better at pretending to be a decimal type without a library.
Sure, but most other languages have built-in support for decimal types: Java and Ruby have BigDecimal, Python has the decimal module, C# has System.Decimal, the list goes on.

JavaScript doesn't even have proper integers to guarantee that this works correctly. It's a really sad state.

I just did some ownership percentage stuff where it's not uncommon to go 16 decimal places out...working with JavaScript on this was a pain. Never thought I'd care about that .00000000000001 difference hah...
Floating point works best near 0: most of the numbers it can represent lie near 0 (a negative exponent means a number less than 1).
JavaScript doesn't have any int types at all. Your solution works until you need to do division...
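
One common way to handle division when amounts are stored as integer cents is to divide with a remainder and hand the leftover cents out one at a time. A minimal sketch, assuming `parts` is positive; `split_cents` is a hypothetical helper and this is just one of several possible allocation policies:

```python
from typing import List

def split_cents(total_cents: int, parts: int) -> List[int]:
    """Split an integer amount into `parts` shares that sum to the total."""
    base, remainder = divmod(total_cents, parts)
    # The first `remainder` shares get one extra cent each, so no cent is
    # lost or invented by rounding.
    return [base + 1 if i < remainder else base for i in range(parts)]

shares = split_cents(1000, 3)
```

Splitting 1000 cents three ways gives `[334, 333, 333]`, which still adds up to 1000.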
Ion has functions to turn Ion into JSON which will, of course, lose information. Annotations are dropped, decimals turn into a JSON number type which may lose precision, etc.
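
The precision loss on the JSON side can be sketched with Python's standard `json` module (this is not Ion's own converter, and the `amount` field and its value are made up). By default a JSON number with a fraction is parsed into a double; `parse_float=Decimal` keeps every digit.

```python
import json
from decimal import Decimal

# Hypothetical document with more digits than a double can hold.
text = '{"amount": 1234567890.12345678901234567890}'

# Default parsing goes through a binary float and drops digits.
as_float = json.loads(text)["amount"]

# Asking the parser for Decimal preserves the digits as written.
as_decimal = json.loads(text, parse_float=Decimal)["amount"]
```

So whether the digits survive depends on the consumer's parser, not on the JSON text itself.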
Does it support comments?
> * Real decimal type - invaluable when working with currency

I believe that the proper way to handle money is to use Integer values plus a pre-defined precision.