Hacker News
by
nyamhap
3332 days ago
My bottleneck is still the speed of reading data from JSON. I wonder whether I should wait for features to be built out here or go down the path of writing a custom data reader in C++.
3 comments
jacksnipe
3332 days ago
If your data has a little extra structure that isn't shared by JSON in general, you could probably get serious performance gains by rolling your own.
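To make that concrete, here is a rough sketch in Python (the same idea would carry over to a C++ reader). It assumes, purely for illustration, that every line of the file is a flat JSON array of numbers; the names `parse_numeric_row` and `read_rows` are made up for this example. Whether it actually beats a general JSON parser on your data is something you would have to measure.

```python
def parse_numeric_row(line):
    """Parse one line holding a flat JSON array of numbers, e.g. "[1.0, 2.5, 3]".

    Assumption: no nesting, strings, booleans, or nulls. Anything else
    raises ValueError instead of being handled like a full JSON parser would.
    """
    text = line.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("expected a flat JSON array: %r" % text)
    inner = text[1:-1].strip()
    if not inner:
        return []
    # float() tolerates surrounding whitespace and rejects non-numeric fields.
    return [float(field) for field in inner.split(",")]


def read_rows(path):
    """Yield one list of floats per non-empty line of the file at `path`."""
    with open(path) as f:
        for line in f:
            if line.strip():
                yield parse_numeric_row(line)
```

The gain, if any, comes from skipping everything a general parser has to handle (escapes, nesting, type dispatch) because the format is known up front.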
sirfz
3331 days ago
Using multiprocessing to read and process JSON files (or any type of data), then feeding the output into a shuffle_batch* op, works great for me.
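A minimal sketch of that pattern, not my exact setup: it assumes newline-delimited JSON (one record per line), and `shuffled_batches` is a pure-Python stand-in for a shuffle_batch-style op rather than the TensorFlow op itself. All function names here are hypothetical.

```python
import json
import random
from multiprocessing import Pool


def parse_chunk(lines):
    """Parse a list of JSON-lines strings into objects (runs in a worker)."""
    return [json.loads(line) for line in lines if line.strip()]


def chunked(iterable, size):
    """Yield successive lists of at most `size` items."""
    chunk = []
    for item in iterable:
        chunk.append(item)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


def parallel_records(path, workers=4, chunk_size=1000):
    """Yield parsed records, with JSON decoding spread over worker processes."""
    with open(path) as f, Pool(workers) as pool:
        # imap preserves chunk order; shuffling happens downstream.
        for parsed in pool.imap(parse_chunk, chunked(f, chunk_size)):
            yield from parsed


def shuffled_batches(records, batch_size, buffer_size, seed=None):
    """Yield batches sampled at random from a bounded buffer of records."""
    rng = random.Random(seed)
    buffer, batch = [], []
    for record in records:
        buffer.append(record)
        if len(buffer) >= buffer_size:
            # Swap a random element to the end and pop it: O(1) removal.
            idx = rng.randrange(len(buffer))
            buffer[idx], buffer[-1] = buffer[-1], buffer[idx]
            batch.append(buffer.pop())
            if len(batch) == batch_size:
                yield batch
                batch = []
    # Drain what is left in the buffer once the input is exhausted.
    rng.shuffle(buffer)
    for record in buffer:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch
```

Typical use would be `for batch in shuffled_batches(parallel_records("data.jsonl"), 128, 10000): ...`, called from under an `if __name__ == "__main__":` guard so worker processes can start cleanly on every platform. Sending chunks of lines instead of single lines keeps the inter-process overhead down.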
jamesblonde
3332 days ago
You could use Tensorflow-on-Spark to read your JSON into an RDD in Spark. Then the Tf-RDD-Reader will be in-memory and can feed your training.