by n4atki
849 days ago
Curious what a good end-to-end solution looks like for you? Is it more about ease-of-use (import/export with minimal effort) or is there a privacy layer that's missing? I see it in 4 steps:
1. Connect to a source db to import your data
2. Train a Gen AI using the source data
3. Use it to create synthetic data
4. Export synthetic data into a new db

The SDV team is working on business solutions to cover the full use case. You can use the public SDV to validate steps 2 and 3.
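
To make the split between "train" and "create" concrete, here is a toy sketch in plain Python. It is not SDV and not how SDV models data: it just learns each column's value frequencies and samples columns independently, which throws away correlations a real synthesizer would try to keep. The names `fit_marginals` and `sample_rows` and the example rows are made up for illustration.

```python
import random
from collections import Counter


def fit_marginals(rows):
    """'Train' step: learn each column's empirical value distribution."""
    columns = rows[0].keys()
    return {col: Counter(row[col] for row in rows) for col in columns}


def sample_rows(model, num_rows, seed=None):
    """'Create' step: draw every column independently from its learned distribution."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(num_rows):
        row = {}
        for col, counts in model.items():
            values = list(counts.keys())
            weights = list(counts.values())
            row[col] = rng.choices(values, weights=weights, k=1)[0]
        synthetic.append(row)
    return synthetic


# Hypothetical source data standing in for an imported table.
source_rows = [
    {"plan": "free", "country": "US"},
    {"plan": "pro", "country": "DE"},
    {"plan": "free", "country": "IN"},
]

model = fit_marginals(source_rows)                        # step 2
synthetic_rows = sample_rows(model, num_rows=5, seed=0)   # step 3
```

The useful property is that the fitted model can be kept around and sampled repeatedly, so the expensive part (fitting) does not have to be redone every time fresh data is needed.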
|
1. spin up a production-equivalent database (e.g. a mysql container instead of prod RDS)
2. point a process/binary (maybe a simple container) at:
-- source db (RDS)
-- sink db (mysql container)
-- transformation function (that may use gen AI, etc.) to seed the sink db with synthetic/anonymized data [there may be some parallel process to enable testing of this transformation function]
3. profit (use this for dev etc.)
The key here would be speed in step 2 if the entire pipeline were to run end-to-end on demand. Do you have any examples of using SDV to achieve this? It's quite possible there's already something in the docs that I missed.
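
For what it's worth, a rough sketch of the source/sink/transformation shape described above, using in-memory SQLite connections as stand-ins for RDS and the mysql container so it runs anywhere. Everything here is an assumption for illustration: the `users` table, the `anonymize` function (a simple hash-based pseudonym, which is weaker than real anonymization or synthesis), and the `seed_sink` helper. A synthesizer's sampling step could be swapped in for `anonymize`.

```python
import hashlib
import sqlite3


def anonymize(row):
    """Hypothetical transformation: replace the email with a stable pseudonym."""
    user_id, email, plan = row
    digest = hashlib.sha256(email.encode("utf-8")).hexdigest()[:12]
    return (user_id, f"user_{digest}@example.com", plan)


def seed_sink(source, sink, transform, batch_size=1000):
    """Stream rows from source to sink in batches, applying transform to each row."""
    sink.execute(
        "CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, email TEXT, plan TEXT)"
    )
    cursor = source.execute("SELECT id, email, plan FROM users ORDER BY id")
    copied = 0
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        # One executemany per batch avoids a round trip per row.
        sink.executemany(
            "INSERT INTO users VALUES (?, ?, ?)", [transform(r) for r in batch]
        )
        copied += len(batch)
    sink.commit()
    return copied


# In-memory stand-ins for the source db (RDS) and the sink db (mysql container).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, plan TEXT)")
source.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "alice@corp.test", "pro"), (2, "bob@corp.test", "free")],
)
source.commit()

sink = sqlite3.connect(":memory:")
copied = seed_sink(source, sink, anonymize, batch_size=1)
sink_rows = sink.execute("SELECT id, email, plan FROM users ORDER BY id").fetchall()
```

Because the transformation is just a function of a row, it can be unit-tested separately from the copy loop, which is roughly the "parallel process" for testing it. On speed, batching the reads and writes is the cheap win; how much the transformation itself costs depends on what is plugged in.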