Hacker News new | ask | show | jobs
by triangleman 2067 days ago
Years ago I put together a Rails app to store iPhone/iPad apps for curation. There was a backend that connected to the iTunes API and then I parsed the data elements I needed and entered them into the DB.

The most infuriating thing was realizing that I needed to update the model with a new data element, or that some API responses didn't include the needed data. There was all this controller and model code and then the migrations every time I needed to make a change.

Is this normal when coding any web application at all? I think if I had to do it again I'd just store the entire API response in a single "document" column and query it dynamically as needed. Did I stumble upon why many developers have moved to "noSQL" models of document storage?

9 comments

You're just describing any software problem where things aren't static or you haven't yet mapped the domain. Storing the whole response in the database so that you aren't losing data is a good solution while you make your logic robust against the domain of values.

You run into this talking to an undocumented API for example. But notice that to "query it later", you still have to write/polish the rest of your logic to handle the actual data which is how you end up with a system that doesn't need to store every API response to disk. This isn't a web thing, it's any time you don't have a 100% handle on the interface between two systems.

Nothing to do with Rails, though. You were trying to create harder to change solutions based on a premature understanding of the values you were working with which caused churn in solutions (like DB columns) that are annoying to change frequently.

But in that situation, you as an engineer have to be able to step back and go "okay, what can I do to handle this at a more dynamic layer until I understand what I'm working with?"

> I think if I had to do it again I'd just store the entire API response in a single "document" column and query it dynamically as needed.

You'd quickly find that it is a lot of code to parse that column and read and write data to it, compared to a normalized database wrapped in ActiveRecord. I've done just that, and it's just a lot more work in the long run. A relational database really is a good thing when you want to slice and dice that data and keep it consistent.

You could have cached the entire API response in a field for a given record (or their own table) from the beginning. Then you could reprocess it if you had to change the schema, assuming Apple didn't change the response on their end in the interim.

Schema changes are normal in web applications, yes.

Kinda, but there's a subsequent phase of maturity where you realise that you are knee-deep in technical debt as a result of spurning the framework and infrastructure and find yourself reinventing relational behaviours such as joins and foreign keys, but badly and expensively and in application memory, because your attributes are a freeform uncontrolled grab-bag of poorly typed whatever, and you'll wish you'd just had the discipline to trust that the framework folks really did know their stuff after all.
What other languages/platforms do you have experience with, if any? Yeah, it sounds pretty normal to me.

Yes, storing the entire API response in a single column is an option, and yes I think you stumbled on some of what drew some people to "noSQL", which has it's own downsides, and the pendulum swings to people talking about how terrible "noSQL" is, but it can work.

Not sure exactly what you’re describing. If the iTunes api was constantly changing that’s sort of an issue in and of itself, beyond whatever database you’re using.
NoSQL's initial purpose was scale/speed. But yes, the flexibility it provides for things like trivially adding fields is another factor that leads to NoSQL adoption.
Try jsonb fields (postgres only maybe?) and the jsonb accessor gem: https://github.com/madeintandem/jsonb_accessor

I use this for prototyping new apps a lot, where I'm not sure on the best design. Then I migrate to proper database columns when I'm sure of something. Feels like the best of both worlds to me.

Yeah I think I learned this the hard way: even making DB columns can be a premature optimization.
No, the problem was more you were dancing to the tune of someone else's fiddle. It's always boring having to slice, dice and store someone else's JSON data from an API.

As others have noted, maybe you should have just stored that data in a JSON column, but it depends on the app. Making it relational can be useful if you're going to be running reports, etc. on it.