by bct 4587 days ago
> You may not even need the API if the CMS doesn't screw around too much with the presentation by putting, say, ads, in the middle of the content.

This is exactly what I'm saying.

> Then you could screen scrape and convert the generated HTML back into Markdown,

Why do you want the Markdown at all?


> Why do you want the Markdown at all?

Because you always want to depend on abstraction, not concretion. If you have a whole bunch of data, and you know that all the data has 7 fields and none of the entries in field 3 are null, then that's much easier to work with than if you don't know how many fields are in your data at all, or if some of the data has 9 fields and some of it has 3. In that situation, you have to take an extra step to clean your data before you can reason about it properly.
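
To make the "extra step" concrete, here is a minimal Python sketch of that cleaning pass. The 7-field shape and the non-null field 3 come from the example above; the names `is_clean`, `clean`, and the sample rows are made up for illustration.

```python
from typing import List, Optional, Sequence

EXPECTED_FIELDS = 7   # taken from the example above
NON_NULL_INDEX = 2    # "field 3", counted from zero

def is_clean(record: Sequence[Optional[str]]) -> bool:
    """True when the record has exactly 7 fields and field 3 is not null."""
    # The length check runs first, so short records never hit the index lookup.
    return len(record) == EXPECTED_FIELDS and record[NON_NULL_INDEX] is not None

def clean(records: Sequence[Sequence[Optional[str]]]) -> List[Sequence[Optional[str]]]:
    """Keep only the records that satisfy both guarantees."""
    return [r for r in records if is_clean(r)]

# Hypothetical sample data: one well-formed row and three malformed ones.
rows = [
    ["a", "b", "c", "d", "e", "f", "g"],     # 7 fields, field 3 present
    ["a", "b", None, "d", "e", "f", "g"],    # 7 fields, field 3 is null
    ["a", "b", "c"],                         # only 3 fields
    ["a"] * 9,                               # 9 fields
]
cleaned = clean(rows)
```

If the guarantees already hold for every record, this whole pass is unnecessary, which is the point being argued.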

The HTML you get when you go to a web page is anything but an abstraction of a data type. If you want it to get that way, now we're back to telling the whole web how to make web pages.

> The HTML you get when you go to a web page is anything but an abstraction of a data type.

You don't think that "this is an article", "this is a paragraph", "this is a link", "this text should be emphasized" are abstractions? And how is Markdown different, when it describes exactly the same elements of a document?

> now we're back to telling the whole web how to make web pages.

Telling people how to make web pages isn't a problem - that's what the HTML standard is. Depending on the whole web to make their pages suit your individual needs is a problem.

> You don't think that "this is an article", "this is a paragraph", "this is a link", "this text should be emphasized" is an abstraction?

It's not. There are many different ways to do this with HTML/CSS. You can use bold tags or spans. If you use spans, then you have to understand the class and the CSS before you can tell that this text is supposed to be bold-faced rather than colored differently.

Links can be specified in the HTML or added in with jQuery. An article can be described with a semantic HTML tag or a div with a class. If it's the latter, you've got to parse the class and figure out what it means. If you're lucky it will be 'article'. But you probably won't be.

HTML cannot be looked at as a data type. Markdown specifies one and only one way to do all of the above. That's a proper abstraction.
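
A rough sketch of the bold-tag versus span problem, using Python's standard `html.parser`. The set of tags treated as emphasis and the class name `hl` are assumptions for illustration; a real page could use anything.

```python
from html.parser import HTMLParser

# Tags this sketch treats as emphasis. This is an assumption, not a complete list.
EMPHASIS_TAGS = {"b", "strong", "em", "i"}

class EmphasisFinder(HTMLParser):
    """Collects text that appears inside one of the known emphasis tags."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # how many emphasis tags are currently open
        self.found = []   # emphasized text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in EMPHASIS_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in EMPHASIS_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.found.append(data)

def emphasized(html):
    finder = EmphasisFinder()
    finder.feed(html)
    finder.close()
    return finder.found

# Same visual intent, two different encodings ("hl" is a hypothetical class).
tag_version = emphasized("<p>This is <b>important</b> text.</p>")
span_version = emphasized('<p>This is <span class="hl">important</span> text.</p>')
```

The tag-based parser finds the emphasis in the first snippet and nothing in the second, because the meaning of the span lives in a stylesheet the parser never sees.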

> Telling people how to make web pages isn't a problem - that's what the HTML standard is.

You can announce a set of 'best practices', but that's not a standard. A standard is an abstraction that you can rely on other people using because otherwise the vast majority of software won't work with it. Best practices cannot be relied on; you follow them for your own benefit, not for the benefit of others.

The HTML standard is insufficient for this kind of use. And it will remain this way because HTML isn't intended the way you seem to think it is.

> If you're lucky it will be 'article'. But you probably won't be.

This situation can be improved (and is being improved). If we treat HTML as nothing but a display language, then it will become one - and if that's what you want, then you should just be using PDF, PNG or SWF.

> And it will remain this way because HTML isn't intended the way you seem to think it is.

I guess we've reached the root of our disagreement. It's exactly how it was intended to be used historically, it's how it is still used for the most part (with webapps being a notable exception), and I think it's the best way to use it going forward.