by trainyperson
664 days ago
Are there any tools that employ LLMs to fill out the Semantic Web data? I can see that being a high-impact use case: people don’t generally like manually filling out all the fields in a schema (it is indeed “a bother”), but an LLM could fill it out for you, and then you could tweak for correctness / editorializing. Voila, bother reduced!

This would also address the two reasons why the author thinks AI is not suited to this task:

1. A human stays in the loop by (ideally) checking the JSON-LD before publishing, so there are fewer hallucination errors.
2. LLM compute is limited to one run per piece of published content, and it’s done by the publisher. The bots can continue to be low-GPU crawlers just as they are now, since they can traverse the neat and tidy JSON-LD.

The author makes a good case for The Semantic Web and I’ll be keeping it in mind for the next time I publish something, and in general this will add some nice color to how I think about the web.
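To make the "low-GPU crawler" point concrete, here is a rough sketch of how a bot could pull JSON-LD out of a page using only the Python standard library. The names `JsonLdExtractor` and `extract_jsonld` are made up for illustration; real crawlers are surely more careful about malformed markup.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Hypothetical helper: collects the parsed contents of JSON-LD script tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        # JSON-LD is embedded as <script type="application/ld+json">.
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                # Skip blocks that are not valid JSON rather than failing the crawl.
                pass


def extract_jsonld(html):
    """Return a list of parsed JSON-LD objects found in an HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.items
```

No model inference is needed on the reading side: the crawler only parses HTML and JSON, which is the cheap part of the workflow described above.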
|
The author (and much of HN?) seems to be unaware that it's not just thousands of websites using JSON-LD, it's millions.
For example: install WordPress, install an SEO plugin like Yoast, and boom, you're done. Basic JSON-LD will be generated expressing semantic information about all your blog posts, videos etc. It only takes a few lines of code to extend what shows up by default, and other CMSes support this too.
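For anyone who has not looked at what such generated markup contains, here is a minimal sketch in Python of building a schema.org `BlogPosting` object and wrapping it in the script tag that ends up in the page. This is not what Yoast or WordPress actually run; the function names and the choice of fields are assumptions for illustration only.

```python
import json


def blog_posting_jsonld(headline, author_name, date_published, url):
    """Hypothetical helper: build a minimal schema.org BlogPosting as a dict."""
    return {
        "@context": "https://schema.org",
        "@type": "BlogPosting",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": date_published,  # ISO 8601 date string
        "url": url,
    }


def as_script_tag(data):
    """Serialize the dict into the script tag that carries JSON-LD in a page."""
    # Escape "</" so text inside the JSON cannot close the script tag early;
    # "<\/" is still valid JSON and decodes back to "</".
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"
```

Calling `as_script_tag(blog_posting_jsonld(...))` with your post's details gives a block you could paste into a page's head. Extending it is a matter of adding more keys to the dict.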
SEOs know all about this topic because Google looks for JSON-LD in your document, and it makes a significant difference to how your site is presented in search results and in all those other fancy UI modules that show up on Google.
Anyone who wants to understand how this is working massively, at scale, across millions of websites today, implemented consciously by thousands of businesses, should start here:
https://developers.google.com/search/docs/appearance/structu...
https://search.google.com/test/rich-results
Is this the "Semantic Web" that was dreamed of in yesteryear? Well it hasn't gone as far and as fast as the academics hoped, but does anything?
The rudimentary semantic expression is already out there on the Web, deployed at scale today. Someone creative with market pull could easily expand on this: for example, a competitor to Google or another Big Tech company might someday expand the set of semantic information a bit if it's relevant to their business scenarios.
It's all happening, it's just happening in the way that commercial markets make things happen.