You're basically dumping a database down to the web browser, including all of the internal metadata that's likely irrelevant to rendering the HTML. For example, user role memberships:

  {
    "id": "c80b68c5-09ae-4a50-a447-df7c5a4a6d01",
    "type": "user",
    "attributes": {
      "username": "kinshiki",
      "roles": [
        "ROLE_MEMBER",
        "ROLE_GROUP_MEMBER",
        "ROLE_POWER_UPLOADER"
      ],
      "version": 1
    }
  }
The same goes for record timestamps such as created/updated, along with contact details that may reveal sensitive info:

  "attributes": {
    "name": "SENPAI TEAM",
    "locked": true,
    "website": "https:\/\/discord.gg\/84e3j9b",
    "ircServer": null,
    "ircChannel": null,
    "discord": "84e3j9b",
    "contactEmail": "senpai.info@gmail.com",
    "description": null,
    "official": false,
    "verified": false,
    "createdAt": "2021-04-19T21:45:59+00:00",
    "updatedAt": "2021-04-19T21:45:59+00:00",
    "version": 1
  }
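
One way to avoid shipping those fields is an explicit allowlist applied just before serialization. This is only a minimal sketch, not how the site's backend works: the `PUBLIC_ATTRIBUTES` table, the `strip_internal_fields` helper, and the `"group"` type name are all made up for illustration.

```python
# Hypothetical allowlist of attributes the page actually needs, per resource type.
# The "group" type name and the chosen fields are assumptions for illustration.
PUBLIC_ATTRIBUTES = {
    "user": {"username"},
    "group": {"name", "website", "description"},
}


def strip_internal_fields(resource: dict) -> dict:
    """Return a copy of `resource` that keeps only allowlisted attributes."""
    allowed = PUBLIC_ATTRIBUTES.get(resource["type"], set())
    return {
        "id": resource["id"],
        "type": resource["type"],
        "attributes": {
            key: value
            for key, value in resource["attributes"].items()
            if key in allowed
        },
    }


user = {
    "id": "c80b68c5-09ae-4a50-a447-df7c5a4a6d01",
    "type": "user",
    "attributes": {
        "username": "kinshiki",
        "roles": ["ROLE_MEMBER", "ROLE_GROUP_MEMBER", "ROLE_POWER_UPLOADER"],
        "version": 1,
    },
}

public_user = strip_internal_fields(user)
print(public_user)
```

An allowlist fails closed: a field added to the database later stays private until someone deliberately exposes it, which a denylist would not guarantee.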
But let's just go back to your response:

> Most of it is page filenames which indeed could be made optional

Do that! If you strip them out, the 529 kB document shrinks to 280 kB, which hardly seems worth the hassle, but once gzipped it is a minuscule 13 kB! That's because those filenames are hashes, which are effectively random and so compress far worse than typical JSON, which usually compresses very well. It's basic stuff like this that can make a website absolutely fly. Avoid giving computers unnecessary, mandatory work: https://blog.jooq.org/many-sql-performance-problems-stem-fro...
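
The compressibility point is easy to check on synthetic data. The sketch below is an assumption-laden stand-in, not the real API response: `make_chapters`, the field names, and the SHA-256 filenames are invented, and the sizes it prints will not match the 529 kB / 13 kB figures above.

```python
import gzip
import hashlib
import json


def gzipped_size(document: dict) -> int:
    """Size in bytes of the JSON-encoded document after gzip compression."""
    return len(gzip.compress(json.dumps(document).encode("utf-8")))


def make_chapters(count: int, pages_per_chapter: int, with_filenames: bool) -> dict:
    """Build a synthetic chapter list; all field names here are made up."""
    chapters = []
    for chapter in range(count):
        attributes = {"chapter": str(chapter), "translatedLanguage": "en", "version": 1}
        if with_filenames:
            # Hash-like page filenames: high-entropy strings that gzip cannot shrink much.
            attributes["data"] = [
                hashlib.sha256(f"{chapter}-{page}".encode()).hexdigest() + ".jpg"
                for page in range(pages_per_chapter)
            ]
        chapters.append({"type": "chapter", "attributes": attributes})
    return {"results": chapters}


with_hashes = gzipped_size(make_chapters(100, 20, with_filenames=True))
without_hashes = gzipped_size(make_chapters(100, 20, with_filenames=False))
print(f"gzipped with hash filenames:    {with_hashes} bytes")
print(f"gzipped without hash filenames: {without_hashes} bytes")
```

The repetitive keys and values all but vanish under gzip, so nearly everything left in the compressed payload is the hash strings themselves.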
Because of this model, we also make sure that Elasticsearch works merely as a search cache, not as an authoritative content database (hence everything we add to it is considered public, on purpose, and what isn't meant to be public is simply not indexed in ES).
However, the gzip efficiency improvements would be really neat for sure.
FWIW, I also don't work on the backend, and there might be good reasons not to expressly filter out data (yet, anyway; perhaps it will end up as a separate entity behind an include parameter).
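
For what an opt-in include parameter could look like, here is a rough sketch. Everything in it is hypothetical: the `serialize_chapter` function, the `pageFilenames` attribute name, and the shape of `includes` are assumptions, not the actual API.

```python
# Hypothetical name for the bulky attribute that clients must opt in to.
OPTIONAL_ATTRIBUTE = "pageFilenames"


def serialize_chapter(chapter: dict, includes=frozenset()) -> dict:
    """Serialize a chapter, adding page filenames only when explicitly requested."""
    attributes = {
        key: value
        for key, value in chapter["attributes"].items()
        if key != OPTIONAL_ATTRIBUTE
    }
    if OPTIONAL_ATTRIBUTE in includes and OPTIONAL_ATTRIBUTE in chapter["attributes"]:
        attributes[OPTIONAL_ATTRIBUTE] = chapter["attributes"][OPTIONAL_ATTRIBUTE]
    return {"id": chapter["id"], "type": chapter["type"], "attributes": attributes}


chapter = {
    "id": "example-id",
    "type": "chapter",
    "attributes": {"chapter": "1", "pageFilenames": ["a1b2c3.jpg", "d4e5f6.jpg"]},
}

listing_view = serialize_chapter(chapter)                    # e.g. for a chapter list
reader_view = serialize_chapter(chapter, {"pageFilenames"})  # e.g. for the reader page
print(listing_view)
print(reader_view)
```

List pages would then get the small payload by default, and only the reader view would pay for the filenames.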