Hacker News
by e12e 881 days ago
You might want to maintain a synthetic dataset for testing and staging that has the same "shape" as your production data, so you avoid exposing sensitive data.

We're currently trying to have each Rails model implement a #new_example method that builds a valid subgraph filled in by Faker, ready to save. I.e., a

    user = User.new_example
will come with a Company.new_example if every user needs a company relationship.
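
A minimal sketch of that pattern, for illustration only. Assumptions: plain Ruby classes stand in for the ActiveRecord models, SecureRandom stands in for Faker, and the email and name attributes and the valid? checks are hypothetical. Only the new_example name and the User/Company relationship come from the comment above.

```ruby
require "securerandom"

# Plain-Ruby stand-in for an ActiveRecord model (assumption: the real
# class would inherit from ApplicationRecord and use Faker for values).
class Company
  attr_reader :name

  def initialize(name:)
    @name = name
  end

  # Builds an unsaved Company filled with placeholder data.
  def self.new_example(name: "Company #{SecureRandom.hex(4)}")
    new(name: name)
  end

  def valid?
    !name.to_s.empty?
  end
end

class User
  attr_reader :email, :company

  def initialize(email:, company:)
    @email = email
    @company = company
  end

  # Builds the user together with the company it requires, so the whole
  # subgraph is valid before anything is saved. Keyword defaults are only
  # evaluated when the caller does not pass a value.
  def self.new_example(email: "user-#{SecureRandom.hex(4)}@example.com",
                       company: Company.new_example)
    new(email: email, company: company)
  end

  def valid?
    !email.to_s.empty? && !company.nil? && company.valid?
  end
end

user = User.new_example
puts user.company.name
```

Passing a keyword, e.g. User.new_example(company: some_company), overrides just that part of the graph and leaves the rest generated.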

Still early, we'll see how it goes.

2 comments

We're doing the same, but for the TypeScript world with "Snaplet Seed." We use generative AI to generate deterministic values + the required relational data: https://www.snaplet.dev/seed

We generate data based on your database schema and your production data (if you give us access).

Since you've kinda already built something like this I would be curious to hear what you think!

Wouldn't it be easier to do the same with FactoryBot? It'll similarly cascade creation of associated records.

I don't enjoy documenting the graph in two places, first in models, then in factories.

But yes, the pattern is essentially the same, just with our example methods and Faker, without FactoryBot.

I can understand that. I prefer keeping my models clean without environment-specific implementation details, which is why I've settled on the FactoryBot approach for testing, seeding, etc.