by dpifke
698 days ago
From https://huggingface.co/datasets/mlfoundations/MINT-1T-HTML#l...:

"We release MINT-1T under a CC-BY-4.0 license, designating it primarily as a research artifact. While the dataset is freely available, users are responsible for ensuring its legal use in commercial settings. Users must independently verify compliance with applicable laws before employing MINT-1T for commercial purposes."

Same page includes this caveat:

"Potential Legal and Ethical Concerns: While efforts were made to respect robots.txt files and remove sensitive information, there may still be content that individuals did not explicitly consent to include."