| > But (as you rightly pointed out) this is a lot harder than it should be (and harder than it used to be). It's hard because doing this is moving in the wrong direction concerning the intended abstractions. As I said earlier, a document combines unstructured information with presentation, and you have to re-do the document every time the presentational logic changes. This is necessary because the information in the document is unstructured; it's not like an order form. Because you cannot predict what form unstructured information will take, the presentational logic is necessarily strongly coupled with the information. That's why it's hard to do what you want. You can pop open Dev Tools and manually do it, but you can't write a program that will take _all blog posts_ and restructure them the way you want to, because _all blog posts_ is impossible to reason about. No amount of evolution to the HTML or CSS standards will work out this particular bit of complexity. If you could standardize a blog post, then you could write a program to do it. But it would only work on posts that meet the standard. Say you made it so every blog CMS out there stored the text in the DB in Markdown format and provided an API so you could get at the Markdown. Then you could do what you say, provide your own styling. What this would be doing is introducing a separation of concerns. You push most of the structure out of the data, and divide up styling duties between a base level (Markdown) and an upper level (whatever you're using to display it). You may not even need the API if the CMS doesn't screw around too much with the presentation by putting, say, ads in the middle of the content. Then you could screen scrape and convert the generated HTML back into Markdown, but again, this is the wrong way to go (depending on concretion rather than abstraction) and prone to breakage. You really need the API layer to do this properly.
But you can't hope that one day HTML and CSS will make sense again like the old days and that user-styling will work again. It only worked before in the very early days of the web because everything was super simple and people could live with the edge cases that cropped up, not because the underlying domain changed. That solution was always brittle, and it broke the second people wanted greater flexibility in presentation. |
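For concreteness, the "screen scrape and convert the generated HTML back into Markdown" step from the quote could look roughly like the sketch below. It uses only Python's standard `html.parser`. The names `MarkdownScraper` and `html_to_markdown` are made up for illustration, and it only understands headings, paragraphs, emphasis, and links. Any markup outside that subset is silently dropped, which is exactly the brittleness the quote warns about.

```python
import re
from html.parser import HTMLParser


class MarkdownScraper(HTMLParser):
    """Hypothetical converter for a tiny HTML subset; not a general solution."""

    HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}
    INLINE = {"em": "*", "i": "*", "strong": "**", "b": "**"}
    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []      # Markdown fragments in document order
        self.hrefs = []      # stack of link targets for open <a> tags
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1
        elif tag in self.HEADINGS:
            self.parts.append(self.HEADINGS[tag])
        elif tag in self.INLINE:
            self.parts.append(self.INLINE[tag])
        elif tag == "a":
            self.hrefs.append(dict(attrs).get("href") or "")
            self.parts.append("[")

    def handle_endtag(self, tag):
        if tag in self.SKIPPED:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in self.HEADINGS or tag == "p":
            self.parts.append("\n\n")  # block boundary
        elif tag in self.INLINE:
            self.parts.append(self.INLINE[tag])
        elif tag == "a" and self.hrefs:
            self.parts.append("](" + self.hrefs.pop() + ")")

    def handle_data(self, data):
        if self.skip_depth == 0:
            # Collapse runs of whitespace the way a browser would.
            self.parts.append(re.sub(r"\s+", " ", data))


def html_to_markdown(html):
    """Return Markdown for the subset of HTML that MarkdownScraper understands."""
    parser = MarkdownScraper()
    parser.feed(html)
    parser.close()
    blocks = (block.strip() for block in "".join(parser.parts).split("\n\n"))
    return "\n\n".join(block for block in blocks if block)
```

Feeding it `<h1>My Post</h1><p>Hello <em>brave</em> world.</p>` gives a `# My Post` heading followed by the paragraph `Hello *brave* world.`. A CMS that wraps content in different tags, or injects ads mid-post, would need its own special cases here.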
This is exactly what I'm saying.
> Then you could screen scrape and convert the generated HTML back into Markdown,
Why do you want the Markdown at all?