by ben_w
546 days ago
> To your point, I have wondered whatever became of that massive initiative from Google to scan books, and whether that might be looked at as a potential training source, given that Google has run into legal limitations on other forms of usage.

Still around, doing fine: https://en.wikipedia.org/wiki/Google_Books and https://books.google.com/intl/en/googlebooks/about/index.htm...

Given the timing, I suspect it was started as simple indexing, in keeping with the mission statement "Organize the world's information and make it universally accessible and useful".

There was also reCAPTCHA v1 (books) and v2 (street view), each of which improved OCR AI until state-of-the-art AI was able to defeat them in their role as CAPTCHA systems.

Maybe I wasn't clear, but I was interested in the consequences of the legal stuff. It's not clear from the wiki article what any of this means with respect to the suitability of scans for AI training.