You can read my blog posts using curl | HN Mirror

	You can read my blog posts using curl (mahdi.blog)
	101 points by mdibaiee 1419 days ago

17 comments

dxuh 1419 days ago

In these comments people have suggested to check for the User-Agent, but wouldn't it make more sense to check if the Accept header mentions text/html? I realize that curl sends "Accept: /", but if you wanted to see a page in plain text, you would have to pass -H "Accept: text/plain". I think that uses HTTP much more like it was intended.

mort96 1419 days ago

Browsers send "Accept: text/html" explicitly. For my pastebin[1], I send an HTML document when "text/html" is in the list of accepted formats, and I just send the plain data otherwise. This seems to work in practice, doesn't hard-code knowledge about curl, and doesn't require curl users to add custom headers.
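A minimal sketch of the check described above, not the pastebin's actual code. The function names are made up for illustration, and q-values are ignored for simplicity (a client sending "text/html;q=0" would still get HTML here):

```python
def wants_html(accept_header: str) -> bool:
    """Return True when the Accept header explicitly lists text/html."""
    for item in accept_header.split(","):
        # Drop parameters such as ";q=0.9" and normalise case and whitespace.
        media_type = item.split(";", 1)[0].strip().lower()
        if media_type == "text/html":
            return True
    return False


def choose_content_type(accept_header: str) -> str:
    """Pick HTML when it is asked for by name, plain text otherwise."""
    return "text/html" if wants_html(accept_header) else "text/plain"
```

With this, a bare "Accept: */*" (what curl sends by default) falls through to plain text, while a browser-style header that names text/html gets the HTML page.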

[1]: https://sr.ht/~mort/coffeepaste/

IanCal 1419 days ago

I feel that neatly meets the kind of thing these params are for - if you ask a server for content and say "Accept: */*" the server should be free to return that content in whatever format it likes.

remram 1419 days ago

That format should probably be the original file then, the one that was uploaded/pasted.

dfbsdfbwe2ef2e 1419 days ago

Why do you feel that way?

bawolff 1419 days ago

That's literally what the header means according to the rfc.

masukomi 1419 days ago

plus regardless of the official meaning, how _else_ would you interpret a header that's saying "I'll accept anything". Like the only reasonable response is "cool, here's what i felt like giving you since you gave me no guidance whatsoever and said you could handle anything"

knoebber 1419 days ago

Yup, I do the same with dotfilehub.com

E.g.,

curl https://dotfilehub.com/knoebber/emacs

results in plain text

eddieroger 1419 days ago

That definitely seems closer to the intent of the header - User Agent is (kind of) who you are, Content Type is what you're sending, and Accept is what you'd like back. If you want plain text, you should be able to ask for it. It's super cool to see a plain text blog option show up, but I would have hoped it would be at the same route, not namespaced `/raw`, and using HTTP conventions. This sounds much more like a dig than it is - it's awesome to have a plain-text version of the author's content.

chrismorgan 1419 days ago

(Editorial correction: “Accept: /” → “Accept: */*” by backslash-escaping the asterisks.)

bronikowski 1419 days ago

When I was hacking that for my blog I went with user-agent detection because of the UX. I don't consider this feature to be anything more than "look at this cool stuff I did", and telling a random friend to copy and paste a URL prefixed by curl is a much easier proposition.

bronikowski 1419 days ago

Two months ago I hacked something like that for my blog:

    curl https://fuse.pl/beton/10print.html # with code highlighting

    curl https://fuse.pl/beton/cegla.html # just prose

ggorlen 1419 days ago

Looks fantastic! I can't read the post, but I think you can simplify the 10 print Python code to

  while True: print(choice("\\/"), end="")

(the `format` call doesn't seem needed and `choice` works on any iterable, including strings of characters).
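For anyone who wants to run the one-liner standalone, here is a self-contained sketch. It adds the import the snippet relies on and bounds the output so it terminates; the function name, the default sizes, and the seed parameter are illustrative additions, not part of the original post:

```python
import random


def ten_print(width: int = 40, rows: int = 5, seed=None) -> str:
    """Build a bounded block of 10 PRINT style output."""
    rng = random.Random(seed)
    # choice() accepts any sequence, so a two-character string is enough.
    lines = ["".join(rng.choice("\\/") for _ in range(width)) for _ in range(rows)]
    return "\n".join(lines)


if __name__ == "__main__":
    print(ten_print())
```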

Al Sweigart has a repository of "scroll art" similar to 10 print that might interest you: https://github.com/asweigart/scrollart

bronikowski 1419 days ago

I wanted to be explicit with my iterables for people who might not know that string is a list of chars. The format() call, I can't tell what I was thinking, probably nothing good.

Thanks!

mdibaiee 1419 days ago

the code highlighting is really cool, well done! pretty sure it's possible to do something like that with jekyll by updating the `highlight` liquid tag to be able to render ascii highlighting in case of raw pages.

tomrod 1419 days ago

Neat!

dfbsdfbwe2ef2e 1419 days ago

Damn that looks really polished.

How does the server know what characters to send so that my specific command line interprets it in a nice way? Sorry if I'm not being very articulate.

EDIT: doesn't work when piping to less.

bronikowski 1419 days ago

I'll be honest, my approach is "everything is utf-8, right? And every terminal has at least 16 colors at the ready?". It does not degrade gracefully, I checked. It's more of a party trick than something reasonable, but nothing stops you from doing it correctly, it would just be a hassle to serve in a static way (every variation would require a different txt file to be generated and sent).

> EDIT: doesn't work when piping to less.

try -R
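For context, terminal colors of this kind are ordinary bytes embedded in the text: ANSI SGR escape sequences. The sketch below shows the general mechanism, not the code behind fuse.pl; the helper names are made up. By default less shows the escape bytes as visible control characters, and -R tells it to pass color sequences through to the terminal.

```python
import re

# Matches SGR ("Select Graphic Rendition") sequences such as "\x1b[31m".
SGR_PATTERN = re.compile(r"\x1b\[[0-9;]*m")


def colorize(text: str, sgr_code: int) -> str:
    """Wrap text in an SGR color code, then reset attributes afterwards."""
    return f"\x1b[{sgr_code}m{text}\x1b[0m"


def strip_sgr(text: str) -> str:
    """Remove SGR sequences, e.g. for clients that cannot render them."""
    return SGR_PATTERN.sub("", text)
```

A static site would have to pre-generate both variants (colored and stripped), which is the hassle mentioned above.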

dfbsdfbwe2ef2e 1419 days ago

> try -R

works :-)

bronikowski 1419 days ago

I'm guessing you're on BSD (or maybe Apple)? Their version requires you to tell less that there's raw data on the input.

dfbsdfbwe2ef2e 1419 days ago

No, Debian 11. Fresh install too.

paulgb 1419 days ago

> EDIT: doesn't work when piping to less.

What if you pass '-r' to less? (I can't verify myself because it worked without -r for me)

gotlou 1418 days ago

Wow, this is a nice feature! I'd much rather implement it by having a .txt or .raw file just in the same folder as the HTML page though, rather than having to go in the middle of the URL. I feel like it is more convenient to do so.

Example, rather than https://mahdi.blog/raw/raw-permalinks-for-accessibility/, it would be https://mahdi.blog/raw-permalinks-for-accessibility.raw

It's a minor nitpick really, but I quite like this idea! I think I'll try to implement this for my website too.

As for the other people here wondering why User Agents weren't used for this:

- Using static website hosting goes out the window, which is quite a shame because it makes everything so much easier

- User agents are pretty terrible for determining capabilities and intent (what if someone was using curl to get an actual webpage?)

- It will never cover all types of HTTP clients (a whitelist is pretty terrible as we have seen from various online services restricting Firefox users or Linux users from certain features for no other reason than their user agents weren't in the list the developers used to check for the presence of certain features).

kevincox 1419 days ago

> To make this easily readable on small screens and terminals, I used vim’s text-width setting to make sure my lines do not exceed 80 characters:

I never understand comments like these. Now if my terminal is 78 characters it is a mess or if it is 100 characters it is wasting space. If you just don't wrap the lines my terminal does it at the right width.

Hard wrapping doesn't work well. You need to know the target width to wrap and you don't know that until someone actually opens the file. Every viewer I have ever tried is excellent at soft-wrapping. Let it do its thing.

mdibaiee 1419 days ago

For me the reasoning is being able to distinguish between what needs to be wrapped, and what I would prefer not to be wrapped. In this case, code samples would preferably not be wrapped. If I can wrap normal text manually, and keep code samples unwrapped, the client can disable wrapping entirely and see the code lines untouched.

This is possible with HTML and CSS, but not in plaintext. I think wasted space is something I can handle, but a badly wrapped code is something I dislike.
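The same distinction can be automated. This is a rough sketch, not how the blog is built (the post mentions vim's text-width setting). It assumes blocks are separated by blank lines and that a block counts as code when every non-empty line is indented by four spaces or a tab; the function name is made up:

```python
import textwrap


def wrap_prose_only(text: str, width: int = 80) -> str:
    """Hard-wrap prose paragraphs; leave indented (code) blocks untouched."""
    out = []
    for block in text.split("\n\n"):
        lines = block.split("\n")
        # Assumption: indentation by four spaces or a tab marks a code block.
        is_code = all(
            line.startswith(("    ", "\t")) for line in lines if line.strip()
        )
        if is_code:
            out.append(block)
        else:
            paragraph = " ".join(line.strip() for line in lines)
            out.append(
                textwrap.fill(
                    paragraph,
                    width=width,
                    break_long_words=False,  # keep long URLs in one piece
                    break_on_hyphens=False,
                )
            )
    return "\n\n".join(out)
```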

dundarious 1419 days ago

This is the exact reason the Linux kernel developers require plain text emails to be pre-wrapped. You cannot leave it to the reader, because readers will always/never wrap both text+code — they cannot distinguish, and text and code have very different wrapping requirements.

reaperducer 1419 days ago

> Now if my terminal is 78 characters it is a mess or if it is 100 characters it is wasting space. If you just don't wrap the lines my terminal does it at the right width.

Just tried it on a vintage early-80's 40-column terminal, and works better than I expected. I thought that a lot of words would be cut off on the right side, but the wrapping was about 90% correct. Perhaps it's just a coincidence, but this is what happened just now.

phailhaus 1419 days ago

I will never understand this obsession with 80 characters as if we still use MS-DOS.

zelphirkalt 1419 days ago

It is still useful though, when you open, say, 3 files next to each other on a screen. And if screens become bigger, then 4 or 5 or ... Screens are wider than they are high, so putting things next to each other is often better than stacking them on top of one another.

david2ndaccount 1419 days ago

My editor (Vim) can soft-wrap lines at word boundaries. I routinely have 3 or 4 files open next to each other and on my monitor that does not give 80 columns per file. Inserting hard-breaks at 80 columns would make it look horrible, while long lines get wrapped nicely.

zelphirkalt 1418 days ago

I get that you can softwrap. Usually that means though, that when some code is indented, the content of a line that gets wrapped starts at the beginning of the next line (not indented). That just looks bad. I guess one could get used to it. If you can change that and make it continue at same indentation, then you need some additional marker, to distinguish softwrapping from an actual line break. Probably all possible in editors like VIM and Emacs. Just a question of configuration. I would claim though, that softwrapped lines are a bit more difficult to read, when it comes to code, than having short lines, all properly indented.

bern4444 1419 days ago

I also use vim. I prefer to keep wrapping off and line length go to 80 or so chars.

I find it easier to read this way rather than have wrapping turned on. Wrapping isn't nearly as good for my comprehension as a line break after 80 chars.

Different people have different preferences

Schroedingersat 1419 days ago

Makes coding on a phone way easier.

eterps 1419 days ago

Or use:

lynx -dump <regular URL>

elinks -dump <regular URL>

(not the same thing of course, but it doesn't require anything from the server other than reasonable HTML)

nottorp 1419 days ago

Yeah, why not just have a version that renders right in lynx/links?

adolph 1419 days ago

Or even better integrate a site-search engine with Gopher

https://en.wikipedia.org/wiki/Gopher_(protocol)

nottorp 1419 days ago

Actually I'm forgetting things. This is like John Carmack’s .plan :)

jpoesen 1419 days ago

Fun idea.

If the site is not a static one, you could check the request's User Agent server-side, and return the raw version directly (or redirect to /foo/raw) if the UA contains 'curl' or 'wget'.

If the site is static and you are able & willing to change your vhost config, you could detect the UA too, and redirect to /foo/raw.

Just a few ideas. This is a fun little project you've got here. Well done.
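A small sketch of the server-side check, under assumptions: the /raw/ prefix follows the blog's existing URL layout, the function name is hypothetical, and the substring match is deliberately naive (other comments in this thread explain why User-Agent sniffing is fragile):

```python
# Substrings that identify the command-line clients named above.
CLI_CLIENT_TOKENS = ("curl", "wget")


def raw_redirect_target(path: str, user_agent: str):
    """Return the /raw/ path to redirect to, or None for other clients."""
    ua = user_agent.lower()
    if any(token in ua for token in CLI_CLIENT_TOKENS):
        return "/raw/" + path.lstrip("/")
    return None
```

Note that curl does not follow redirects unless it is run with -L, so returning the raw body directly may be friendlier than redirecting.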

anderspitman 1419 days ago

You can read mine using netcat :P

https://apitman.com/19/#netcatable

pbronez 1419 days ago

Pretty cool. Short step from here to publish the blog via Gemini. That protocol uses Gemtext as the core hypertext, and it’s basically markdown.

https://gemini.circumlunar.space/software/

woodrowbarlow 1419 days ago

then use kineto[1] or similar to cross-host your gemtext content as html over https, with your own css.

1: https://sr.ht/~sircmpwn/kineto/

(i do this for my blog, anachronauts.club/~voidstar. i kind of hate gemini-the-protocol, but love gemtext-as-default and love having a space where text-forward content reigns.)

Terretta 1419 days ago

For similar but not gemini-the-protocol you can do similar with markdown and caddy2.

Here’s how to make it pretty-ish: https://github.com/dbohdan/caddy-markdown-site

Serving just the markdown as plaintext to e.g. Lynx is straightforward.

Discussion here:

https://caddy.community/t/markdown-support-in-v2/6984/12

zzo38computer 1419 days ago

But is there "Gemini over HTTP"? You can also serve text/gemini files over HTTP(S), and from local files too. (My computer is set up so that it can load files of this format, both locally and remotely, although most other programs cannot understand this format.)

woodrowbarlow 1419 days ago

but www browsers don't render gemtext and gemini browsers don't fetch http(s) resources.

ideally we'll tweak our gemini-to-http proxy to look at the client's "Accept:" header (as suggested in other comments on the op) to decide whether to reply with raw gemtext or converted html-with-css. then a www browser receives light formatting, curl receives raw gemtext, and gemini browsers format however they like.
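The conversion half of such a proxy can be quite small, because gemtext is line-oriented. The sketch below is not kineto's implementation; it handles only headings, "=>" link lines, preformatted blocks, and plain paragraphs, and it treats list and quote lines as ordinary paragraphs:

```python
import html

FENCE = "`" * 3  # gemtext toggles preformatted mode with a three-backtick line


def gemtext_to_html(source: str) -> str:
    """Convert a subset of gemtext to HTML, one line at a time."""
    out = []
    preformatted = False
    for line in source.split("\n"):
        if line.startswith(FENCE):
            out.append("</pre>" if preformatted else "<pre>")
            preformatted = not preformatted
        elif preformatted:
            out.append(html.escape(line))
        elif line.startswith("=>"):
            # Link line: "=> URL optional label"
            parts = line[2:].strip().split(maxsplit=1)
            if not parts:
                continue
            url = html.escape(parts[0])
            label = html.escape(parts[1] if len(parts) > 1 else parts[0])
            out.append(f'<p><a href="{url}">{label}</a></p>')
        elif line.startswith("#"):
            title = line.lstrip("#")
            level = min(len(line) - len(title), 3)  # gemtext has three levels
            out.append(f"<h{level}>{html.escape(title.strip())}</h{level}>")
        elif line.strip():
            out.append(f"<p>{html.escape(line)}</p>")
    if preformatted:
        out.append("</pre>")  # close a block left open at end of input
    return "\n".join(out)
```

A proxy could then return the raw gemtext unchanged when the client does not list text/html in its Accept header, and the converted HTML when it does.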

zzo38computer 1419 days ago

Some browsers might accept both. My computer accepts both (although not the Gemini protocol, but it does have Gemini file format, and can read this over HTTP(S) and local files (including inside of ZIP archives, so gempub files can be displayed)). (Also, curl does not have Gemini (although I think that it should).)

There are some problems with using the Accept header (although you can use it as a start; if you can implement it, then you might do so). Even if a format is mentioned in the Accept header, servers will not necessarily understand it; if a lot of file formats are implemented, the Accept header can become very long; and there are further problems, such as knowing the client's preference properly, or explicitly downloading the raw file or a specific other format regardless of the Accept header.

(My idea is an Interpreter response header, although this requires that web browsers will implement it.)

tgv 1419 days ago

Cute idea, but the first one I picked didn't work very well: https://mahdi.blog/raw/mathematical-induction-proving-tiling...

mdibaiee 1419 days ago

ah! that one includes a JavaScript I used to draw a canvas in the post body, hence the script showing up in the curl result. Might try to think of a solution for that.

Edit: fixed it by moving the script to its own file.

nstart 1419 days ago

If you know basic vim movements (j/k for down/up kind of basic stuff), you can pipe the output into less to read it in a more convenient way. Nothing major, just found it nicer to not have to scroll back up when the article loads

Example:

curl https://mahdi.blog/raw/self-hosted/ | less

teddyh 1419 days ago

The “less” program understands the standard cursor keys just fine; there is no need to use vi-style keys.

mgdlbp 1419 days ago

You can even use the scroll wheel (so can vim, accursedly)

teddyh 1419 days ago

No, I'm pretty sure that’s your terminal emulator capturing the scroll wheel events and translating them to cursor key events.

mgdlbp 1419 days ago

  LESS(1)

[...]

       --mouse
              Enables mouse input: scrolling the mouse wheel down moves
              forward in the file, scrolling the mouse wheel up moves
              backwards in the file, and clicking the mouse sets the "#"
              mark to the line where the mouse is clicked.  The number
              of lines to scroll when the wheel is moved can be set by
              the --wheel-lines option.  Mouse input works only on
              terminals which support X11 mouse reporting, and on the
              Windows version of less.

----

  5. Using the mouse                                      *mouse-using*

[...]

  Don't forget to enable the mouse with this command: >
          :set mouse=a
  Otherwise Vim won't recognize the mouse in all modes (See 'mouse').

  Currently the mouse is supported for Unix in an xterm window, in a *BSD
  console with |sysmouse|, in a Linux console (with GPM |gpm-mouse|), for
  MS-DOS and in a Windows console.

[...]

                                                          *xterm-mouse-wheel*

  To use the mouse wheel in a new xterm you only have to make the scroll wheel
  work in your Xserver, as mentioned above.

teddyh 1418 days ago

OK, but those are not enabled by default. BTW, Emacs also has it as an available feature: M-x xterm-mouse-mode

Sheeny96 1419 days ago

A cool add-on to this would be a plugin that automatically redirected /latest to the latest blog post. That way, if the blogger were to publish say once a week every Tuesday, the user could alias the curl for maximum ease of use.

mxuribe 1419 days ago

I once suggested this kind of strategy at a previous job...and internally, everyone liked it. But what ended up happening was: our fans of the content kept complaining that they were sharing the /latest url but "the website kept breaking" (in their mind) because last week it pointed to the blog post they wanted to share, and this week, it's pointing to a different blog post, and please fix the website, etc. Clearly, our fans misunderstood about redirects...and our fans were vastly, overwhelmingly non-techies...so try as we did to educate them, it became a lost cause...and we stopped employing that tactic. Our fans as i call them were real estate agents who affiliated with the real estate company that i worked for...and these guys i called fans because they were the most fervent, avid followers of the company's brand, and they sold million-dollar plus homes...so they had plenty of pull with the senior leaders at the company, but to be fair they did follow the brand religiously...so like i've heard people say: support your most ardent fans. In any case, since that time, i'm careful to set expectations about expected behaviors as it pertains to links, web resources, redirects, and such. (Of course, such a tactic might work for other audiences too.)

EDIT: typo fixings.

pwdisswordfish0 1419 days ago

This is not a problem if you use proper redirects.
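One reading of "proper redirects" is a temporary HTTP redirect from /latest to the post's permalink, so that the address bar (and anything copied from it) holds the stable URL rather than /latest. A rough sketch under that assumption; the function name and the (date, path) data shape are made up for illustration:

```python
from datetime import date


def latest_redirect(posts):
    """Return (status, headers) sending /latest to the newest permalink.

    `posts` is a hypothetical list of (publication_date, permalink_path) pairs.
    """
    newest_date, newest_path = max(posts, key=lambda post: post[0])
    # 302 is a temporary redirect: clients should keep asking for /latest
    # instead of remembering the target, which changes with every new post.
    return 302, {"Location": newest_path, "Cache-Control": "no-store"}
```

With curl, the alias would need -L so that the redirect is followed.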

Sheeny96 1419 days ago

Interesting. It's easy to fall into the trap of the tech mindset - my big ideas to change the lives of the average person are often humbled whenever I have to explain the difference between a web browser and the internet to my averagely tech-literate mid-20s friends.

mxuribe 1419 days ago

You and me both! :-)

preisschild 1418 days ago

If you use glow[1] you can even format/color the markdown nicely: `curl https://mahdi.blog/raw/raw-permalinks-for-accessibility/ | glow -`

[1]: https://github.com/charmbracelet/glow

webscout 1419 days ago

Also works with https://terminal.news

mxuribe 1419 days ago

I love these kinds of websites! I'm not into following crypto news, but the concept (and i'd say design) of this website is great! I look forward to the days when gemini is more popular!

Eriks 1419 days ago

Pipe the output to mdless (or similar) and it will be even more readable

netsharc 1419 days ago

Why not use user agent detection (and maybe have a header/footer that says "This page has been formatted for readability based on your User Agent: curl").

I'm guessing the blog is made by a static site generator, so the above is harder than it seems. I suppose one could add a reverse proxy that redirects to /raw/$PAGE when it sees "curl".

jenny91 1419 days ago

User-agent specific stuff is pretty hacky in general. I at least would avoid it unless I've tried everything else first.

sylware 1419 days ago

this is brain unwashing: noscript/basic (x)html browser compatibility.

theandrewbailey 1419 days ago

This site already works well without JS and CSS. Tested with Firefox + uMatrix.

cjvirtucio 1419 days ago

would've liked something like this on hugo

Zhyl 1419 days ago

You can!

https://gohugo.io/templates/output-formats/

cjvirtucio 1419 days ago

huh. TIL, that's pretty cool. guess I know what I'm doing next weekend..