Ask HN: How do you view large JSON files?

	55 points by chuchuva 3630 days ago
	I use JSON Viewer [1] to view data in JSON format but it freezes on files larger than 1 MB. I want to be able to easily collapse and expand elements to understand the structure of the data. [1] https://jsonviewer.codeplex.com/

32 comments

niccaluim 3630 days ago

I view them unfavorably.

rdtsc 3630 days ago

Ha! It is funny because it is true. I remember listening to one of Joe Armstrong's talks (React 2014 conf), where he talks at some point ( https://youtu.be/rQIE22e0cW8?t=2009 ) about how parsing is rather expensive CPU-wise and also bandwidth-expensive, especially on mobile networks. The company he works for controls data paths for smartphones to the internet, and they sweat wasting every little bit because it eats into the precious bandwidth available to consumers -- and what do developers do? -- they shove JSON through that channel at the application level!

It was a silly observation but it is also true at some level. JSON might be easy to read, but reading a 100M json still needs a special editor.

Another funny observation Joe made at some point, in response to someone objecting to his calling JSON out because "after all, you can see JSON", was that your eyes cannot see JSON; they see photons bouncing from the screen. You still use an editor or some other translation program to display it and read it. So at some point you might as well use binary (thrift, protobufs, sqlite, ...).

gravypod 3630 days ago

I personally think that's one of the more sane replies.

It is really important to chop up your JSON files into smaller sub-files. This will not only make them easier to back up and read manually but will usually give you a speed boost (you can read from and write to more than one part of the "db" at a time).

TazeTSchnitzel 3630 days ago

This is probably sensible. JSON is a text-based format that requires you to parse the entire thing just to get an outline, unlike well-designed binary formats.

misframer 3630 days ago

Well-designed text formats too?

mockery 3630 days ago

Not necessarily. (I'd actually argue "not at all.") Presumably you have a text format in the first place because you want your representation to be human-readable [with common tools like a text editor], and very likely human-writable as well. Those are the real constraints of a (useful) text format, and they tend to be in direct conflict with high-performance parsing, or partial parsing.

For an arbitrary example, a binary format could have an index table of objects at the start of the file, and then you could perform partial reads to access only the subset of objects you care about. That's something you could do in a text format too, but if the file is edited in a text editor you can't guarantee that the user remembered or bothered to update the index when they added a new object. The parser would effectively not be able to trust the index, and have to parse the entire file. (I suppose you could use CRCs or something to enforce this, but then you'd end up with a very brittle format that people get frustrated when trying to edit.)

Really, the true advantage of a binary format is you generally assume that nobody messed with the data behind your back, so you can have duplicate data (like an index) if you want without worrying that it's out of sync. This pretty much goes hand in hand with the fact that you can't just open it in a text editor and fiddle with stuff.

TLDR: Human-writability and high-performance are arguably mutually exclusive features.
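
To make the index-table idea concrete, here is a minimal Python sketch. The layout (a 4-byte count, then one offset/length pair per object, then the payloads) and the names `write_indexed` and `read_one` are made up for illustration; this is not any real format.

```python
import io
import json
import struct


def write_indexed(objects, out):
    """Write a count, an (offset, length) index table, then the payloads."""
    payloads = [json.dumps(obj).encode("utf-8") for obj in objects]
    # Header: 4-byte count plus two 4-byte integers per index entry.
    header_size = 4 + 8 * len(payloads)
    out.write(struct.pack("<I", len(payloads)))
    offset = header_size
    for payload in payloads:
        out.write(struct.pack("<II", offset, len(payload)))
        offset += len(payload)
    for payload in payloads:
        out.write(payload)


def read_one(stream, i):
    """Read only object i, seeking past everything else via the index."""
    stream.seek(0)
    (count,) = struct.unpack("<I", stream.read(4))
    if not 0 <= i < count:
        raise IndexError(i)
    stream.seek(4 + 8 * i)
    offset, length = struct.unpack("<II", stream.read(8))
    stream.seek(offset)
    return json.loads(stream.read(length).decode("utf-8"))


buf = io.BytesIO()
write_indexed([{"name": "joe"}, {"name": "fred"}, [1, 2, 3]], buf)
second = read_one(buf, 1)  # touches the header and one payload only
```

Note that `read_one` trusts the index completely, which is exactly the assumption that breaks once someone edits the payloads by hand.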

besselheim 3629 days ago

> Really, the true advantage of a binary format is you generally assume that nobody messed with the data behind your back

I would rephrase that a bit and say the true advantage is flexibility, as you're not subject to the constraints of textual data.

The integrity of the data is a separate matter, and should be carefully verified rather than trusted implicitly. A huge number of security vulnerabilities, and program crashes in general, come from erroneously assuming that user-supplied data is correct.

TazeTSchnitzel 3630 days ago

Indeed. Another example is that you could have a length field in a text format that precedes a string, so you can skip over it without parsing it. But humans will forget to update it, or update it incorrectly.
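
A small Python sketch of that, assuming a made-up `length:text,` encoding (similar in spirit to netstrings); `skip_to_field` is a hypothetical helper, not part of any library:

```python
def skip_to_field(data, n):
    """Return the n-th length-prefixed field, skipping earlier ones unparsed."""
    pos = 0
    for i in range(n + 1):
        colon = data.index(":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        end = start + length
        # A stale length (e.g. after a hand edit) shows up as a missing comma.
        if data[end:end + 1] != ",":
            raise ValueError("length prefix does not match field %d" % i)
        if i == n:
            return data[start:end]
        pos = end + 1  # jump over the field without looking inside it


encoded = "5:hello,11:hello world,3:abc,"
third = skip_to_field(encoded, 2)
```

If someone edits `hello` to `hello!` in a text editor and leaves the `5` alone, the reader fails on the very first field, which is the brittleness being described.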

mtmail 3630 days ago

There are add-ons/extensions for Chrome and Firefox, both also called JSON Viewer (different authors). For really large files, though, I use a command-line tool plus grep and awk: https://stedolan.github.io/jq/

chuchuva 3630 days ago

jq looks promising but I still need to view the JSON file first. I want to be able to somehow get the general understanding of the structure very quickly and interactively. The tree control works very well for that. JSON Viewer is almost perfect but it doesn't work with local files.

slapresta 3630 days ago

I usually use jq to understand the structure of the file. `keys` will get you all the key names in an object; `with_entries(.value |= type)` will replace each value in an object with its type. You can quickly delve into a JSON file and gain some understanding of it with jq.
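
If jq isn't at hand, roughly the same two views can be sketched in Python. The helper names `json_type` and `value_types` and the sample document are invented for this example; the type names mirror what jq's `type` reports.

```python
import json


def json_type(value):
    """Name a parsed value's JSON type, using the names jq's `type` reports."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"


def value_types(obj):
    """Rough equivalent of jq's with_entries(.value |= type)."""
    return {key: json_type(value) for key, value in obj.items()}


doc = json.loads(
    '{"name": "joe", "age": 42, "tags": ["a"], "address": {"city": "X"},'
    ' "active": true, "spouse": null}'
)
key_names = sorted(doc)   # like jq's `keys`, which also sorts
types = value_types(doc)
```

Calling `value_types(doc["address"])` repeats the same view one level down.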

dwaltrip 3630 days ago

JSON Formatter for Chrome works with local files (you have to check the box in the settings to enable it).

It's super handy. Drag and drop json files into a new tab.

Link: https://chrome.google.com/webstore/detail/json-formatter/bcj...

bsg75 3630 days ago

jq . file.json | less

bsg75 3630 days ago

jq . file.json | less

curl ... | jq . | less

rickyc091 3630 days ago

Yep, all that above + `sed`.

JesseAldridge 3630 days ago

I think Sublime Text should be able to handle fairly large json files without a problem.

If not, I think I would just split data into multiple smaller files. Something like the python code below should work, assuming the file can fit in memory. If not, I assume you can find some json lib that can work in streaming mode and then do the same thing.

    import itertools
    import json
    import os

    json_input = '''{
    "foo": 1,
    "bar": 2,
    "baz": 3,
    "bug": 4,
    "thing": 5
    }'''

    entries_per_group = 2

    # Create the output directory if it does not exist yet.
    os.makedirs('sub_files', exist_ok=True)

    main_d = json.loads(json_input)
    items = iter(main_d.items())
    for group_count in itertools.count():
      # Take the next entries_per_group key/value pairs.
      sub_d = dict(itertools.islice(items, entries_per_group))
      if not sub_d:
        break  # input exhausted; don't write an empty file
      with open(os.path.join('sub_files', '{}.json'.format(group_count)), 'w') as f:
        json.dump(sub_d, f, indent=2)

mani04 3630 days ago

I used to be a fan of JSONView or one of those chrome plugins / extensions. Then I realized that I can simply use the network tab to view the JSON data as it gets loaded.

Here is how it works - keep the network tab ready. When you see the JSON data request, click on the request and hit the "Preview" tab. It gives you data in a collapse / expand format.

Advantages: 1. There is one less plugin that scans all your browsing activity, 2. Slightly extra battery life when just browsing and not developing stuff.

Disadvantage: You need to keep the network tab ready, otherwise you will have to reload the entire page with the network tab open.

EDIT:

My apologies if it wasn't clear. I was talking about the "Developer Tools" option in Chrome, in which there is a "Network" tab. It is available in "Chrome Menu" > "More Tools" > "Developer Tools". Alternatively you can hit Command + Option + I in mac, or some equivalent in Linux / Windows to get there.

cyphax 3630 days ago

Usually F12 opens these tools in most browsers (IE, Chrome, Firefox at least) on Linux and Windows, but I do not have a Mac to test OS X. Your tip also applies to Firefox, but I don't know about IE or Edge. :)

I've never really had to handle really large JSON files, as I'm not a big fan of those but for smaller files I tend to be lazy and paste it in jsonlint.com. It's usually just for reference or debugging purposes (like finding the name of a property or some strange value).

i0exception 3630 days ago

If you're familiar with vim, you can do this

  cat unformatted.json | python -mjson.tool | vim -

And then use vim's folding methods to navigate the file (http://vim.wikia.com/wiki/Folding)

burkemw3 3630 days ago

I use json.tool a lot too, but usually pipe it to `less`, using the paging and searching.

ferrari8608 3630 days ago

It's not nice to abuse innocent cats.

    python -mjson.tool < unformatted.json | vim -

If you're not using cat to concatenate things, the shell can probably handle the job.

HappyTypist 3630 days ago

People like cat because it allows notating the command closer to how they think. How can I pipe while putting the file first?

yoha 3630 days ago

Well, just do it:

    < unformatted.json python -mjson.tool | vim -

thiht 3630 days ago

What's the immediate advantage for me in using a less intuitive syntax? I understand cat is meant to concatenate files, so what?

beeper87 3630 days ago

Not Atom.

;)

[1] https://github.com/atom/atom/issues/8864 [2] https://github.com/atom/atom/issues/979

satori99 3630 days ago

On Windows I use Notepad++, which handles large local JSON files just fine, and has a collapsible tree structure.

alexatkeplar 3630 days ago

A human can't easily understand the structure of a large JSON file - JSON is internally self-describing, so there is no guarantee that the sample of records currently visible in your editor's viewport is representative of the whole.

Instead, use a tool like Schema Guru (https://github.com/snowplow/schema-guru ; disclaimer: we wrote this at Snowplow) to programmatically extract the JSON Schema (http://json-schema.org/) which represents all JSON instances in the file.
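
To show why scanning every record matters, here is a much cruder stand-in written in Python. It is not how Schema Guru works and it does not produce JSON Schema; `summarize_records` and the sample records are invented for this sketch. It only records, per key, how many records carry it and which Python types its values took.

```python
from collections import defaultdict


def summarize_records(records):
    """Map each key to its occurrence count and the set of value types seen."""
    seen = defaultdict(lambda: {"count": 0, "types": set()})
    for record in records:
        for key, value in record.items():
            seen[key]["count"] += 1
            seen[key]["types"].add(type(value).__name__)
    return dict(seen)


records = [
    {"name": "joe", "age": 42},
    {"name": "fred", "age": None},
    {"name": "ann", "email": "ann@example.com"},
]
summary = summarize_records(records)
```

Here `age` turns out to be sometimes null and `email` to be present in only one of three records, neither of which is visible from the first record alone.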

ilyash 3630 days ago

I wrote a tool some time ago. It's not a viewer. It just shows the structure. https://github.com/ilyash/show-struct

jsonninja 3630 days ago

For very large files of JSON arrays, like 1M rows and hundreds of MB, try http://www.jsondata.ninja

uglycoyote 3630 days ago

I have a question related to this which nobody has really touched on yet.

Imagine you have a json structure like this:

{"records" : [ {"name":"joe", ...lots of other fields in each record}, {"name":"fred", ...lots of other fields}, ... thousands of records ]}

Now a lot of people in this thread have mentioned tools for collapsing the json, but the trouble I have with this is that you can't browse the record names without opening each one and looking at all the fields. It would be nice if there was a tool that took a few key identifiers (name, id) and "bubbled them up" so that in a collapsed view you would see something like:

{"records":[ {...click to expand...}, // name=joe {...click to expand...}, // name=fred ]}

I have the same problem when working with XML as well (though in XML sometimes the ID's are attributes of the parent which mitigates the problem, but other times the ID would be nested in a child element of the structure more like you would have in json). I even found it to be enough of a problem that I wrote my own XML editor to solve this issue.

Of course one issue with this is that there's no clear standard as to which fields of an object represent "ID" information which would be important/useful to "bubble up" to the next level when collapsing. It would have to be something user-configurable (though having some sensible defaults like looking for "name" and "id" keys would work in a lot of cases). In the XML world, there's probably something to do with Schemas that would help with this problem, and fancy editors which understand your data using a schema, though some of the editors I looked at which went into that level of detail seemed like way overkill for what I wanted to do.

So essentially my question here is whether this concept of "collapse child, but keep important identifying information of the collapsed child visible" exists in any JSON tools. Is this a thing that has a name/buzzword associated with it that I don't know about? Or is it purely an issue that's my own personal quirk which nobody else cares about?
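
For what it's worth, the behaviour I have in mind can be sketched in a few lines of Python. The names `ID_KEYS`, `collapsed` and `outline` are made up here, and the `name`/`id` defaults are just the assumption described above.

```python
import json

ID_KEYS = ("name", "id")  # assumed defaults; a real tool would make this configurable


def collapsed(value, id_keys=ID_KEYS):
    """One-line summary: containers are collapsed, identifying keys stay visible."""
    if isinstance(value, dict):
        ids = ", ".join(
            "%s=%s" % (key, json.dumps(value[key]))
            for key in id_keys
            if key in value
        )
        label = "{... %d keys ...}" % len(value)
        return label + ("  // " + ids if ids else "")
    if isinstance(value, list):
        return "[... %d items ...]" % len(value)
    return json.dumps(value)


def outline(container):
    """List the direct children of a dict or list in collapsed form."""
    if isinstance(container, dict):
        items = container.items()
    else:
        items = enumerate(container)
    return ["%s: %s" % (key, collapsed(child)) for key, child in items]


doc = {"records": [
    {"name": "joe", "age": 42, "city": "Perth"},
    {"name": "fred", "id": 7},
]}
top_level = outline(doc)
record_lines = outline(doc["records"])
```

`record_lines` then holds one line per record, each ending in a comment such as `// name="joe"`, without expanding any of them.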

tonyle 3630 days ago

That data structure is basically a CSV file. If I were you, I would convert the JSON to a CSV file, load it into Excel, use a pivot table to dive into your data, and convert it back if needed.

A pivot table will do exactly what you want and more. If you hate Excel, you could load it into a database and do a GROUP BY query. I suspect most JSON in that data structure was auto-generated from CSV files or a database anyway.

Another option is to just convert that JSON to another JSON keyed by name with Lodash then inspect it.

var anotherJSON=_.keyBy(oldJSON, 'name');
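
For the CSV route, a minimal Python sketch, assuming the records are flat objects (nested values would just be written as their string form); `records_to_csv` is a name invented for this example:

```python
import csv
import io


def records_to_csv(records):
    """Render a list of flat dicts as CSV text, one column per key seen."""
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)  # keep first-seen column order
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


records = [
    {"name": "joe", "age": 42},
    {"name": "fred", "age": 31, "city": "Oslo"},
]
csv_text = records_to_csv(records)
```

Records that lack a column get an empty cell, so the result opens cleanly in a spreadsheet.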

tonyle 3630 days ago

Chrome console allows you to inspect an object and expand on the properties visually.

For the most common cases, I just copy and paste the string into the chrome console if I need to inspect some random JSON file really quick.

If it is too big to copy and paste, pipe it into an HTML file as a global variable and view the object in the console.

You can also just open a Node REPL and load the JSON into memory. Node lets you autocomplete property names by pressing Tab, which is great for inspecting some unknown object.

77897555 3629 days ago

Whether you are looking at JSON with a bit of expected structure or otherwise, parsing something that big is not humanly possible in a timely fashion. Throw it into a db to analyse it.
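
A minimal sketch of what that could look like with Python's built-in `sqlite3` module. The table layout, column names and sample records are assumptions for illustration; real data would decide which fields are worth promoting to columns.

```python
import json
import sqlite3

records = [
    {"name": "joe", "city": "Perth"},
    {"name": "fred", "city": "Oslo"},
    {"name": "ann", "city": "Perth"},
]

conn = sqlite3.connect(":memory:")
# Keep a couple of fields as columns and the full record as JSON text.
conn.execute("CREATE TABLE records (name TEXT, city TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [(r.get("name"), r.get("city"), json.dumps(r)) for r in records],
)

by_city = conn.execute(
    "SELECT city, COUNT(*) FROM records GROUP BY city ORDER BY city"
).fetchall()
```

From there, ordinary SQL answers questions like "how many records per city" without ever rendering the whole file.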

TMSZ 3630 days ago

There are a few viewers/editors listed at http://softwarerecs.stackexchange.com/questions/18839/json-v...

JSONedit should work well with files up to a few MB; with larger files, loading time may not be acceptable. Side note: for editors that generate a structured view, file size may not be a precise metric; the number of elements might work better.

YuriNiyazov 3630 days ago

EN1 3630 days ago

This is something I use quite frequently when dealing with JSON. https://www.getpostman.com

poooogles 3630 days ago

The Python REPL and the JSON lib. Probably not the most efficient if you're starting out but if you're familiar with Python it can be quite effective.

ezstar 3630 days ago

This. I use the Ruby REPL, but same principle.

You have to be a little careful about not accidentally printing out the whole thing to stdout, but after it's loaded into memory you can check keys at a given level, drill into the substructure, etc.
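
In Python terms, one way to avoid the accidental wall of output is the standard `reprlib` module, which abbreviates large containers. A rough sketch, with the limits and the generated sample data chosen arbitrarily for illustration:

```python
import json
import reprlib

# Stand-in for a large file; in practice this would be json.load(open(path)).
text = json.dumps(
    {"records": [{"name": "user%d" % i, "id": i} for i in range(1000)]}
)
data = json.loads(text)

short = reprlib.Repr()
short.maxlevel = 2    # how deep to descend before abbreviating
short.maxdict = 3     # max dict entries shown
short.maxlist = 3     # max list items shown
short.maxstring = 40  # max characters shown per string

preview = short.repr(data)        # abbreviated, safe to print
top_keys = list(data)             # keys at the top level
first_record = data["records"][0] # drill into the substructure
```

Printing `preview` instead of `data` keeps the session readable while you work out where to drill next.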

r24y 3628 days ago

Generally I'll copy it to the clipboard (usually `pbcopy < file.json`), fire up Chrome, Cmd+Alt+J, then paste into the console. Instant, attractive, collapsible rendering of the content. Usually when I do this I'm interested in performing some sort of transformation on the data, and this way I can prototype it without any further steps.

twblalock 3630 days ago

I use IntelliJ for that. It parses the JSON and adds code folding automatically. I've used it with fairly large files.

rhubarbquid 3630 days ago

I'd suggest using either your browser with a good JSON viewing extension or a code editor.

> I want to be able to easily collapse and expand elements

That feature is often called "code folding"; many programming-oriented text editors do it.

jijji 3630 days ago

print_r(json_decode(file_get_contents("http://url.com/"), TRUE));

chuchuva 3630 days ago

What about collapsing/expanding elements interactively?

baseh 3630 days ago

For extremely large CSVs I use EmEditor (Windows / paid). It also works with JSON, but I've never had to handle large JSON files.

mtsmithhn 3630 days ago

You could try the JSONView chrome extension.

patrickmn 3630 days ago

Link: https://chrome.google.com/webstore/detail/jsonview/chklaanhf...

chuchuva 3630 days ago

I'm already using JSONView Chrome extension and it's great but it doesn't work with local files. Nor can I copy and paste JSON.

ZoF 3630 days ago

Why not just throw the file(s) up on a box somewhere so you can use it, or host it locally? Can you not use locally hosted links with it or something?

jamesmalvi 3628 days ago

I use http://jsonformatter.org.

nielsbot 3630 days ago

On macOS I've had some success with "Cocoa JSON Editor.app". It takes a while to load files, but they are presented in an outline view you can search and browse.

radoslawc 3630 days ago

http://kmkeen.com/jshon/ or python -mjson.tool piped to $editor

yagop 3630 days ago

vim

rileytg 3630 days ago

IntelliJ scratch files

hendzen 3630 days ago

jq with less/more as pagers...

canada_dry 3630 days ago

Notepad++ has a great JSON viewer plugin and handles huge files

ferrari8608 3630 days ago

>but it freezes on files larger than 1 MB

That's a pretty large bug. Considering what JSON is supposed to be used for, passing data between two programs/processes, files larger than 1 MB should have been accounted for.

Does it freeze then crash, freeze until the OS takes care of it, or freeze temporarily then resume?

chuchuva 3630 days ago

It freezes temporarily then resumes after a few minutes.

meeper16 3630 days ago

vi (5 gigs)
