Hacker News
by giovannibajo1 715 days ago
Google Docs runs a lot of algorithms over the data you put in. For instance, it paginates your documents and shows a page count. This is an algorithm processing your data exactly like Gemini does. There is no option in Google Docs to stop the pagination algorithm from reading your data and processing it.

Another example: Google Docs indexes the contents of your document. That is, it stores all the words in a big database that you don't see and don't have access to, so that you can search for "tax" in the Google Docs search bar and bring up all documents that contain the word "tax". There is no option in Google Docs to avoid indexing the contents of a document for the purpose of searching for it.

When you decide to put your data into Google Docs, you are OK with Google processing your data in several ways (that should hopefully be documented). The fact that you seem so upset that a specific algorithm is processing your data, just because it has the "AI" buzzword attached to it, seems like an overreaction prompted by the general panic we're living in.

I agree Google should be clear (and it is clear) about whether Gemini is being trained on your data or not, because that is something that can have side effects that you have the right to be informed about. But Gemini just processing your data to provide feature N+1 among the other 2 billion available is really not something noteworthy.

2 comments

> For instance, it paginates your documents and shows a page count.

Do you think the information Google is gathering here can then be used in the future to paginate some other document? Do you think paginating my doc will help their algorithm better paginate documents in the future? I see what you're trying to say, but putting everything in the "algorithm" bucket doesn't help move the whole conversation around AI forward.

> The fact that you seem so upset

Your upset detector is clearly wrong. I don't use Google Docs. I don't care about Google Docs. I'm just adding my 2c to a conversation about the kind of practices Google and co. are using.

Isn't this why we're here on HN? To exchange ideas?

Google is pretty good at separating inference from training. If they wish to train on your data, they do that by just training on your data; running the model on that data to give you info is totally separate.
https://support.google.com/gemini/answer/13594961

“Google collects your Gemini Apps conversations, related product usage information, info about your location, and your feedback. Google uses this data, consistent with our Privacy Policy, to provide, improve, and develop Google products and services and machine-learning technologies, including Google’s enterprise products such as Google Cloud.”

“To help with quality and improve our products (such as generative machine-learning models that power Gemini Apps), human reviewers read, annotate, and process your Gemini Apps conversations. We take steps to protect your privacy as part of this process. This includes disconnecting your conversations with Gemini Apps from your Google Account before reviewers see or annotate them. Please don’t enter confidential information in your conversations or any data you wouldn’t want a reviewer to see or Google to use to improve our products, services, and machine-learning technologies.” [italics was bold in the original]

Seems pretty clear to me.

You can opt out of that. It's explained right after what you quoted.

> To stop future conversations from being reviewed or used to improve Google machine-learning technologies, turn off Gemini Apps activity. You can review your prompts or delete your conversations from your Gemini Apps activity at myactivity.google.com/product/gemini.