In these comments people have suggested checking the User-Agent, but wouldn't it make more sense to check if the Accept header mentions text/html? I realize that curl sends "Accept: */*", but if you wanted to see a page in plain text, you would have to pass -H "Accept: text/plain". I think that uses HTTP much more like it was intended.
Browsers send "Accept: text/html" explicitly. For my pastebin[1], I send an HTML document when "text/html" is in the list of accepted formats, and I just send the plain data otherwise. This seems to work in practice, doesn't hard-code knowledge about curl, and doesn't require curl users to add custom headers.
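A rough sketch of that check, in Python. This is not the pastebin's actual code: the function names are made up for illustration, and it only looks for an explicit "text/html" entry (with a non-zero q value), so "*/*" alone falls through to plain text.

```python
def wants_html(accept_header: str) -> bool:
    """Return True only if text/html is explicitly listed in the Accept header."""
    for item in accept_header.split(","):
        media_type, _, params = item.strip().partition(";")
        if media_type.strip().lower() != "text/html":
            continue
        # A quality value of 0 means "not acceptable", so honor it.
        quality = 1.0
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    pass  # malformed q value: keep the default
        return quality > 0
    return False


def choose_content_type(accept_header: str) -> str:
    """Hypothetical helper: HTML for browsers, plain data for everything else."""
    return "text/html" if wants_html(accept_header) else "text/plain"
```

With this, a typical browser header such as "text/html,application/xhtml+xml,*/*;q=0.8" selects HTML, while curl's default "*/*" gets the plain data without any extra flags.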
I feel that neatly meets the kind of thing these params are for - if you ask a server for content and say "Accept: */*" the server should be free to return that content in whatever format it likes.
plus regardless of the official meaning, how _else_ would you interpret a header that's saying "I'll accept anything". Like the only reasonable response is "cool, here's what i felt like giving you since you gave me no guidance whatsoever and said you could handle anything"
That definitely seems closer to the intent of the header - User Agent is (kind of) who you are, Content Type is what you're sending, and Accept is what you'd like back. If you want plain text, you should be able to ask for it. It's super cool to see a plain text blog option show up, but I would have hoped it would be at the same route, not namespaced `/raw`, and using HTTP conventions. This sounds much more like a dig than it is - it's awesome to have a plain-text version of the author's content.
When I was hacking that for my blog I went with user-agent detection because of the UX. I don't consider this feature to be anything more than "look at this cool stuff I did", and telling a random friend to copy and paste a URL prefixed by curl is a much easier proposition.
I wanted to be explicit with my iterables for people who might not know that string is a list of chars. The format() call, I can't tell what I was thinking, probably nothing good.
the code highlighting is really cool, well done!
pretty sure it's possible to do something like that with jekyll by updating the `highlight` liquid tag to be able to render ascii highlighting in case of raw pages.
I'll be honest, my approach is "everything is utf-8, right? And every terminal has at least 16 colors at the ready?". It does not degrade gracefully, I checked. It's more of a party trick than something reasonable, but nothing stops you from doing it correctly, it would just be a hassle to serve in a static way (every variation would require a different txt file to be generated and sent).
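For anyone curious what the "16 colors" party trick can look like, here is a minimal sketch in Python. It is not the code behind that blog: `highlight_keywords` is a hypothetical helper, and it naively colors Python keywords everywhere, including inside strings and comments.

```python
import keyword
import re

BLUE = "\033[34m"   # one of the 16 standard ANSI terminal colors
RESET = "\033[0m"   # back to the terminal's default rendering


def highlight_keywords(source: str) -> str:
    """Wrap every Python keyword in an ANSI color escape sequence."""
    def paint(match: re.Match) -> str:
        word = match.group(0)
        if keyword.iskeyword(word):
            return f"{BLUE}{word}{RESET}"
        return word

    # Visit each identifier-like token and repaint only the keywords.
    return re.sub(r"[A-Za-z_]\w*", paint, source)
```

A terminal that does not understand the escape sequences will show them as literal garbage, which is the "does not degrade gracefully" part.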
Wow, this is a nice feature! I'd much rather implement it by having a .txt or .raw file just in the same folder as the HTML page though, rather than having to go in the middle of the URL. I feel like it is more convenient to do so.
It's a minor nitpick really, but I quite like this idea! I think I'll try to implement this for my website too.
As for the other people here wondering why User Agents weren't used for this:
- Using static website hosting goes out the window, which is quite a shame because it makes everything so much easier
- User agents are pretty terrible for determining capabilities and intent (what if someone was using curl to get an actual webpage?)
- It will never cover all types of HTTP clients (a whitelist is pretty terrible as we have seen from various online services restricting Firefox users or Linux users from certain features for no other reason than their user agents weren't in the list the developers used to check for the presence of certain features).
> To make this easily readable on small screens and terminals, I used vim’s text-width setting to make sure my lines do not exceed 80 characters:
I never understand comments like these. Now if my terminal is 78 characters it is a mess or if it is 100 characters it is wasting space. If you just don't wrap the lines my terminal does it at the right width.
Hard wrapping doesn't work well. You need to know the target width to wrap and you don't know that until someone actually opens the file. Every viewer I have ever tried is excellent at soft-wrapping. Let it do its thing.
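To make the 78-column complaint concrete, here is a small Python sketch. It assumes the viewer soft-wraps at word boundaries, and uses `textwrap` as a stand-in for that viewer; `show_in_terminal` is a made-up helper.

```python
import textwrap


def show_in_terminal(lines, terminal_width):
    """Soft-wrap each stored line the way a word-aware viewer would."""
    rows = []
    for line in lines:
        rows.extend(textwrap.wrap(line, width=terminal_width) or [""])
    return rows


if __name__ == "__main__":
    paragraph = ("abcdefghi " * 24).strip()              # 24 nine-letter words
    hard_wrapped = textwrap.wrap(paragraph, width=80)    # what the author saved

    # Hard-wrapped at 80, viewed at 78: every line spills one orphan word.
    print(len(show_in_terminal(hard_wrapped, 78)))       # 6 rows
    # Left unwrapped, viewed at 78: the viewer fills each row properly.
    print(len(show_in_terminal([paragraph], 78)))        # 4 rows
```

The hard-wrapped version alternates full rows with single-word rows, which is the "mess" described above.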
For me the reasoning is being able to distinguish between what needs to be wrapped, and what I would prefer not to be wrapped. In this case, code samples would preferably not be wrapped. If I can wrap normal text manually, and keep code samples unwrapped, the client can disable wrapping entirely and see the code lines untouched.
This is possible with HTML and CSS, but not in plaintext. I think wasted space is something I can handle, but a badly wrapped code is something I dislike.
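A sketch of that "wrap the prose, leave the code alone" idea in Python. The convention used to spot code (blocks separated by blank lines, with every line indented by four spaces) is an assumption for the example, not something the post specifies, and `wrap_prose_only` is a hypothetical name.

```python
import textwrap


def wrap_prose_only(document: str, width: int = 80) -> str:
    """Hard-wrap prose paragraphs; pass indented code blocks through untouched."""
    wrapped_blocks = []
    for block in document.split("\n\n"):
        lines = block.splitlines()
        # Assumption: a block is code when all of its lines start with 4 spaces.
        is_code = bool(lines) and all(line.startswith("    ") for line in lines)
        if is_code:
            wrapped_blocks.append(block)
        else:
            wrapped_blocks.append(textwrap.fill(" ".join(lines), width=width))
    return "\n\n".join(wrapped_blocks)
```

A reader can then turn off soft-wrapping in their viewer: the prose already fits, and long code lines stay on one line.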
This is the exact reason the Linux kernel developers require plain text emails to be pre-wrapped. You cannot leave it to the reader, because readers will always/never wrap both text+code — they cannot distinguish, and text and code have very different wrapping requirements.
> Now if my terminal is 78 characters it is a mess or if it is 100 characters it is wasting space. If you just don't wrap the lines my terminal does it at the right width.
Just tried it on a vintage early-80's 40-column terminal, and works better than I expected. I thought that a lot of words would be cut off on the right side, but the wrapping was about 90% correct. Perhaps it's just a coincidence, but this is what happened just now.
It is still useful though, when you open like 3 files next to each other on a screen. And if screens become bigger, then 4 or 5 or ... Screens are wider than they are high, so putting things next to each other is often better than on top of one another.
My editor (Vim) can soft-wrap lines at word boundaries. I routinely have 3 or 4 files open next to each other and on my monitor that does not give 80 columns per file. Inserting hard-breaks at 80 columns would make it look horrible, while long lines get wrapped nicely.
I get that you can softwrap. Usually that means though, that when some code is indented, the content of a line that gets wrapped starts at the beginning of the next line (not indented). That just looks bad. I guess one could get used to it. If you can change that and make it continue at same indentation, then you need some additional marker, to distinguish softwrapping from an actual line break. Probably all possible in editors like VIM and Emacs. Just a question of configuration. I would claim though, that softwrapped lines are a bit more difficult to read, when it comes to code, than having short lines, all properly indented.
I also use vim. I prefer to keep wrapping off and line length go to 80 or so chars.
I find it easier to read this way rather than have wrapping turned on. Wrapping isn't nearly as good for my comprehension as a line break after 80 chars.
If the site is not a static one, you could check the request's User Agent server-side, and return the raw version directly (or redirect to /foo/raw) if the UA contains 'curl' or 'wget'.
If the site is static and you are able & willing to change your vhost config, you could detect the UA too, and redirect to /foo/raw.
Just a few ideas. This is a fun little project you've got here. Well done.
(i do this for my blog, anachronauts.club/~voidstar. i kind of hate gemini-the-protocol, but love gemtext-as-default and love having a space where text-forward content reigns.)
But is there "Gemini over HTTP"? You can also serve text/gemini files over HTTP(S), and as local files too. (On my computer I have it set up so that it is capable of loading files of this format (locally and remotely), although most other programs are not capable of understanding this format.)
but www browsers don't render gemtext and gemini browsers don't fetch http(s) resources.
ideally we'll tweak our gemini-to-http proxy to look at the client's "Accept:" header (as suggested in other comments on the op) to decide whether to reply with raw gemtext or converted html-with-css. then a www browser receives light formatting, curl receives raw gemtext, and gemini browsers format however they like.
Some browsers might accept both. My computer accepts both (although not the Gemini protocol, but it does have Gemini file format, and can read this over HTTP(S) and local files (including inside of ZIP archives, so gempub files can be displayed)). (Also, curl does not have Gemini (although I think that it should).)
There are some problems with using the Accept header (although you can use it as a start; if you can implement it, then you might do so). Even if a format is mentioned in the Accept header, servers will not necessarily understand it, and if a lot of file formats are implemented, the Accept header can become very long. There are also problems with properly knowing the preference, with trying to explicitly download the raw file or a specific other format (regardless of the Accept header), and so on.
(My idea is an Interpreter response header, although this requires that web browsers implement it.)
ah! that one includes a JavaScript I used to draw a canvas in the post body, hence the script showing up in the curl result. Might try to think of a solution for that.
Edit: fixed it by moving the script to its own file.
If you know basic vim movements (j/k for down/up kind of basic stuff), you can pipe the output into less to read it in a more convenient way. Nothing major, just found it nicer to not have to scroll back up when the article loads
--mouse
Enables mouse input: scrolling the mouse wheel down moves
forward in the file, scrolling the mouse wheel up moves
backwards in the file, and clicking the mouse sets the "#"
mark to the line where the mouse is clicked. The number
of lines to scroll when the wheel is moved can be set by
the --wheel-lines option. Mouse input works only on
terminals which support X11 mouse reporting, and on the
Windows version of less.
----
5. Using the mouse *mouse-using*
[...]
Don't forget to enable the mouse with this command: >
:set mouse=a
Otherwise Vim won't recognize the mouse in all modes (See 'mouse').
Currently the mouse is supported for Unix in an xterm window, in a *BSD
console with |sysmouse|, in a Linux console (with GPM |gpm-mouse|), for
MS-DOS and in a Windows console.
[...]
*xterm-mouse-wheel*
To use the mouse wheel in a new xterm you only have to make the scroll wheel
work in your Xserver, as mentioned above.
A cool add-on to this would be a plugin that automatically redirected /latest to the latest blog post. That way, if the blogger were to publish say once a week every Tuesday, the user could alias the curl for maximum ease of use.
I once suggested this kind of strategy at a previous job...and internally, everyone liked it. But what ended up happening was: our fans of the content kept complaining that they were sharing the /latest url but "the website kept breaking" (in their mind) because last week it pointed to the blog post they wanted to share, and this week, it's pointing to a different blog post, and please fix the website, etc. Clearly, our fans misunderstood redirects...and our fans were vastly, overwhelmingly non-techies...so try as we did to educate them, it became a lost cause...and we stopped employing that tactic. Our fans, as I call them, were real estate agents affiliated with the real estate company that I worked for...and I called these guys fans because they were the most fervent, avid followers of the company's brand, and they sold million-dollar plus homes...so they had plenty of pull with the senior leaders at the company, but to be fair they did follow the brand religiously...so like I've heard people say: support your most ardent fans. In any case, since that time, I'm careful to set expectations about expected behaviors as it pertains to links, web resources, redirects, and such. (Of course, such a tactic might work for other audiences too.)
Interesting. It's easy to fall into the trap of the tech mindset - my big ideas to change the lives of your average person are often humbled whenever I have to explain the difference between a web browser and the internet to my averagely tech literate mid 20s friends.
I love these kinds of websites! I'm not into following crypto news, but the concept (and I'd say design) of this website is great! I look forward to the days when gemini is more popular!
Why not use user agent detection (and maybe have a header/footer that says "This page has been formatted for readability based on your User Agent: curl")?
I'm guessing the blog is made by a static site generator, so the above is harder than it seems. I suppose one could add a reverse proxy that redirects to /raw/$PAGE when it sees "curl".
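The decision such a proxy would make is small. A minimal Python sketch, assuming the /raw/$PAGE layout mentioned above; `raw_redirect_target` and the `CLI_AGENTS` list are made up for the example (wget is included only because it was suggested elsewhere in the thread).

```python
from typing import Optional

# Assumption: substrings that identify command-line clients in the User-Agent.
CLI_AGENTS = ("curl", "wget")


def raw_redirect_target(path: str, user_agent: str) -> Optional[str]:
    """Return the /raw/ URL to redirect to, or None to serve the page as usual."""
    if path.startswith("/raw/"):
        return None  # already on the plain-text route, avoid a redirect loop
    agent = user_agent.lower()
    if any(name in agent for name in CLI_AGENTS):
        return "/raw/" + path.lstrip("/")
    return None
```

For example, a request for "/my-post" from "curl/8.5.0" would be sent to "/raw/my-post", while a browser User-Agent gets None and the normal HTML page. It inherits the usual User-Agent caveats: someone using curl to fetch the real HTML would need a way to opt out.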