Hey! I worked on this here at Databricks. The blog post goes into the dataset collection design a bit (https://www.databricks.com/blog/2023/04/12/dolly-first-open-...). In short, you're right: brainstorming and GeneralQA will overlap, because the categories in the taxonomy naturally overlap to some degree.