Smarter Rails Seeding with Sprig

	Smarter Rails Seeding with Sprig (viget.com)
	30 points by dce 4531 days ago

5 comments

rpwilcox 4531 days ago

It looks like this uses Rails test fixtures (reborn) to create seed data, which is an OK win I guess, and having it separated out by environment (implicitly mentioned in the article) is nice. But I don't see how much of a win this is over instantiating fixtures or calling FactoryGirl.create in your seeds.rb file.

Yes, there are tons of problems with using db/seeds.rb for anything serious. I'm not sure that Sprig will handle my issues much better than I currently do myself.

For the record, my desired behaviors for a seed data solution are:

1. Environmental separation (which Sprig has). Developers, QA, the staging server, and production all have different needs for seed data. Developers want an easy default dev@myco.com user with a simple password, but you don't want that in production.

2. If I rerun my seed solution (perhaps because I added some more seed data) it shouldn't duplicate records (or throw errors because it's trying to create the second user with the same email address)

3. Handle bootstrap data I need in my app (example: I want a list of US states, and every environment should get this. To reiterate my second point, I should be able to add to this bootstrap data without getting two copies of "California" in my US state list).
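The three behaviors above can be sketched in plain Ruby. This is only an illustration, not how Sprig or any other gem works: `SeedStore`, `SHARED_SEEDS`, `ENV_SEEDS`, and `seed!` are hypothetical names, and the in-memory hash stands in for a database table. In a Rails app, ActiveRecord's `find_or_create_by` on a unique key would play roughly the same role as `find_or_create` here.

```ruby
# Hypothetical in-memory stand-in for a database table, keyed on a natural key.
class SeedStore
  attr_reader :rows

  def initialize
    @rows = {}
  end

  # Insert the row only if no row with this key exists yet (idempotent).
  def find_or_create(key, attributes)
    @rows[key] ||= attributes.merge(key: key)
  end
end

# Bootstrap data that every environment gets (point 3).
SHARED_SEEDS = {
  states: [
    { key: "CA", name: "California" },
    { key: "NY", name: "New York" }
  ]
}.freeze

# Extra data that only some environments get (point 1).
ENV_SEEDS = {
  "development" => { users: [{ key: "dev@myco.com", password: "password" }] },
  "production"  => { users: [] }
}.freeze

# Load shared seeds first, then the seeds for the given environment.
def seed!(stores, environment)
  SHARED_SEEDS.each do |table, rows|
    rows.each { |row| stores[table].find_or_create(row[:key], row) }
  end
  ENV_SEEDS.fetch(environment, {}).each do |table, rows|
    rows.each { |row| stores[table].find_or_create(row[:key], row) }
  end
  stores
end

stores = Hash.new { |hash, table| hash[table] = SeedStore.new }
2.times { seed!(stores, "development") } # rerunning adds no duplicates (point 2)
```

Keying every record on something stable (a state abbreviation, an email address) is what makes the rerun safe: adding "Texas" to the shared list later creates one new row and leaves "California" alone.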

It's sad that no real solution exists to handle all three of these needs. Some projects I've been on have gotten this close, but that was years ago and things have changed.

(If Sprig does have these things, then that's the selling point, not seed data as fixtures which the article emphasized)

I'm also not sure about using fixture-like things in Sprig. I give it 6 months before most users remember why many people in the Rails community moved to a Factory pattern for (test) data construction long ago.

However, I am happy that a relatively well known Rails consultancy has released 1.0 of a seed gem. Hopefully the name recognition / noise will lead developers to the gem and it'll be a better solution with many more eyes.

bigtunacan 4530 days ago

Not sure about Sprig, but Seedbank hits all 3 of those and has for the past two years.

https://github.com/james2m/seedbank

danso 4531 days ago

I completely expected the OP to have had a typo in the headline and for it to actually be about "Spring", which is an amazing and essential gem (and part of 4.1beta)...but this is pretty cool too :).

I'm biased because the OP shows off a custom parser for Google Spreadsheets, which is neat to me because Google Spreadsheets is my go-to interface for new prototyped apps...a much better, live-collaborative admin than anything I could easily build myself or with Rails tools.

But I wonder if this gem is more work than it's worth? I mean, seeds don't seem like the best place to persist intricate production-ready data in the repo. And if you continue to use Google Spreadsheets, or whatever, as your main admin input interface, then it seems worth it to build a more elaborate abstraction to handle that use case.

Also, I wonder if some of the self-referencing could be done via YAML's standard syntax? That would mean no JSON as a format, but YAML seems like it was built for this kind of lightweight relational data storage?
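For anyone unfamiliar with the YAML syntax in question, here is a minimal sketch using Ruby's standard `yaml` library. The seed data and field names are made up for illustration, and this is not Sprig's format. It assumes Ruby 2.6 or later, where `YAML.safe_load` accepts the `aliases:` keyword.

```ruby
require "yaml"

# Hypothetical seed file: &alice defines an anchor, *alice refers back to it.
seed_yaml = <<~SEEDS
  users:
    - &alice
      name: Alice
      email: alice@example.com
  posts:
    - title: Hello
      author: *alice
SEEDS

# aliases: true is needed because safe_load rejects aliases by default.
data = YAML.safe_load(seed_yaml, aliases: true)

# The post's author resolves to the same data as the first user.
author = data["posts"].first["author"]
puts author["email"]
```

One caveat: anchors are scoped to a single YAML document, so a post in one file cannot alias a user defined in another file without some extra layer on top.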

lkurtz 4531 days ago

The Google Spreadsheet example is admittedly a reach. It's mostly just a demonstration of how flexible Sprig can be with data formats.

That's a very interesting idea to use YAML's self-reference syntax. Although a lot of the value of this gem comes from the seed organization across different record-type-specific files, and I think YAML's self-referential syntax might make cross-file referencing a bit more difficult for the user. Good thing to keep in mind for the future though, especially once everyone is on board the YAML train.

Do you have another place/system for persisting that you've had success with in the past? I've always felt that seeds are designed to be the place to persist record data in the repo.

danso 4530 days ago

For data that I intend to share/reuse, I tend to have it in a SQL dump, but here are my (perhaps non-informed) assumptions and needs:

1. The values in this data never change. That is, if I'm maintaining a store of the U.S. Congress vote database, all votes (in the past) are inextricably and forever tied to a legislator's ID.

2. The seed data is so large that loading by ActiveRecord is too slow and needs to be done either by plain SQL import or activerecord-import.

3. The dataset should, when possible, imply its own opinion on conventions...so the Congressmember table will always be "congressmembers" and Vote will always be in "votes" and the relations and their keys will have the same convention in any app that uses this data. Of course, a particular app may choose to rename things, but they can do that after the seeding process.

4. For a situation like the above, it's likely that a Rails Engine has been made, i.e. with all the domain-specific logic.

For smaller datasets, say, a list of the U.S. states and their abbreviations...storing them as plain seed files should suffice.

So anyway, those are my past practices. I'm not saying they're best, though, and am always looking for better ways to organize data between apps.

bigtunacan 4530 days ago

This feels like just re-inventing the wheel again. The seedbank gem https://github.com/james2m/seedbank has been around for a couple of years and does a great job of managing seeds on a natural, more granular level, if that is what you need.

tbruffy 4530 days ago

Seems like an oversight not to include XML support

hmans 4531 days ago

No. Just no.
