by rev_bird 3929 days ago
I can't remember seeing them looking for anything other than Postgres, which seems less like "big data" than... regular data. Maybe they don't put their heavy duty stuff in job listings?
3 comments

Farming is often in the raster world as opposed to the line/row/text/vector world. Some of the things in Postgres might be huge. A farm data set could easily add only a few thousand rows a day, but the objects associated with each row could be several gigabytes. That means the total size of the data can be bigger than so-called "big data," while the row analysis tool set looks more like your "regular data." However, there's a lot that goes into the raster analysis that's a whole different beast.
Huh. Thanks for taking the time to outline this, I don't know why it never occurred to me. I have to admit, I'm unfamiliar with "rasters" in the way you seem to be referencing them. It sounds, though, like the relational bits of the DB are really being used more as a file system than a database, if there are even really distinctions in the first place. If "a couple thousand rows" are basically being used as a metadata store for the rasters, is that an unusual use of the database, or is everybody doing this and I just never had enough data to care?
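For what it's worth, here is a rough sketch of that "metadata store" pattern: a few small rows, each pointing at a multi-gigabyte raster that lives outside the table. This is only an illustration under stated assumptions, not anything the company in question is known to do. It uses Python's built-in sqlite3 as a stand-in for Postgres so it runs anywhere, and the table name, column names, field names, and file paths are all made up.

```python
import sqlite3

GIB = 1024 ** 3  # bytes in one gibibyte


def build_catalog(conn):
    """Create a hypothetical catalog table: one small row per large raster."""
    conn.execute(
        """
        CREATE TABLE raster_catalog (
            id          INTEGER PRIMARY KEY,
            field_name  TEXT NOT NULL,     -- which field the image covers
            captured_on TEXT NOT NULL,     -- ISO date of capture
            file_path   TEXT NOT NULL,     -- where the raster itself lives
            size_bytes  INTEGER NOT NULL   -- size of the raster on disk
        )
        """
    )


def add_raster(conn, field_name, captured_on, file_path, size_bytes):
    """Record one raster; the pixels stay on disk, only metadata goes in."""
    cur = conn.execute(
        "INSERT INTO raster_catalog "
        "(field_name, captured_on, file_path, size_bytes) "
        "VALUES (?, ?, ?, ?)",
        (field_name, captured_on, file_path, size_bytes),
    )
    return cur.lastrowid


def total_bytes_by_field(conn):
    """Ordinary relational query over the metadata rows."""
    rows = conn.execute(
        "SELECT field_name, SUM(size_bytes) "
        "FROM raster_catalog GROUP BY field_name ORDER BY field_name"
    ).fetchall()
    return dict(rows)


# Three rows of metadata describing roughly 9 GiB of (hypothetical) imagery.
conn = sqlite3.connect(":memory:")
build_catalog(conn)
add_raster(conn, "north_40", "2015-06-01", "/rasters/north_40/2015-06-01.tif", 3 * GIB)
add_raster(conn, "north_40", "2015-06-02", "/rasters/north_40/2015-06-02.tif", 2 * GIB)
add_raster(conn, "creek_bottom", "2015-06-01", "/rasters/creek_bottom/2015-06-01.tif", 4 * GIB)
totals = total_bytes_by_field(conn)
```

The row count stays tiny and the queries look like "regular data," while the heavy lifting (reading and analyzing the rasters themselves) happens in separate tooling that is not shown here.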
You'd be surprised at what you can do with PostgreSQL. There are also some really exciting new ways to handle data pipelines using Docker and tiny applications (see Pachyderm) as opposed to classic approaches like HDFS and Hadoop.

Not everything needs to be in an enterprise grade multi-node C* cluster to be big data!

Thanks.

I did see a job listing that noted "familiarity with stuff like Hadoop" (paraphrasing). That wording seemed to cast doubt on them using Hadoop, but it neither confirmed nor denied it.