Hacker News
by stefco_ 2307 days ago
It's not just a browser thing. Apple Books does this with their e-books, which is infuriating if you're working with a coding book and just want to copy-paste stuff into your editor/terminal. You get something like:

    “ghci> putStrLn (pretty 10 value)”

    Excerpt From: Bryan O’Sullivan, John Goerzen, and Donald Bruce Stewart. “Real World Haskell.” Apple Books. 
When you only copied:

    ghci> putStrLn (pretty 10 value)
Note that the quotes around your actual selection aren't even the ASCII quote character; you get horrid unicode quotes that are easy to miss if you're just trying to run a bit of code in your REPL. This isn't even a DRM ebook, so it's not like Apple is being compelled by contract to insert a citation. It's awful, user-hostile behavior that removes one of the main advantages of digital-vs-hardcopy coding books (copy/paste), and AFAIK there's no config that lets you disable it.
13 comments

Wonder if an author will rename themself sudo rm -rf / with the proper escape codes.
You, my friend, are an evil, sadistic megalomaniac. I love it. Remind me not to make you mad.
Bobby wants his tables back!
Legend has it that all data related to little Bobby Tables has been lost.
:D
`rm -rf ~` is disastrous enough on macOS, with the bonus of not having to authenticate with sudo.
How do people know the result of running this command without running a VM? Is it possible to run OS X in a VM these days?

The command's intention is so obvious that I wonder why there is no warning from the OS, the shell, or the terminal for such easily blacklistable commands. Or is there?

I learned that personally one night while fighting messed-up rebase conflicts: for some still-unknown reason I copied and pasted a path for `rm` and ended up issuing `rm -rf ~ /repo-path /unneeeded-dir`. It took me a couple of seconds to realize that the command was taking an unexpectedly long time to execute, and then my eye caught a Finder window with a shrinking list of folders in my home directory :) Time Machine helped me quite a lot, but unfortunately with a nasty surprise: it doesn't back up dot files :-/ Lessons learned.
PS: it turns out you can restore dot files from the Time Machine backups but it's not enabled by default:

https://apple.stackexchange.com/questions/141321/how-to-rest...
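
To make the effect of that stray space concrete, here is a small Python sketch of how a POSIX-style shell would split the command into arguments. It only inspects the string and deletes nothing. `shlex.split` approximates shell word splitting, and `os.path.expanduser` stands in for the shell's tilde expansion; both are stand-ins for what the real shell does, not its actual implementation.

```python
import os
import shlex

# The command from the comment above, with the accidental space after "~".
command = "rm -rf ~ /repo-path /unneeeded-dir"

# shlex.split approximates POSIX shell word splitting (no globbing or expansion).
args = shlex.split(command)

# Every argument that is not an option flag is a separate target for rm.
targets = [a for a in args[1:] if not a.startswith("-")]

# A bare "~" word expands to the home directory, as the shell would do.
expanded = [os.path.expanduser(t) for t in targets]

for original, path in zip(targets, expanded):
    print(f"{original!r} -> {path}")
```

Because `~` ends up as its own word, the home directory becomes the first thing `rm` is asked to remove, ahead of the paths that were actually intended.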

> Time Machine helped me quite a lot but unfortunately with a nasty surprise that it doesn't backup dot files.

Wow, that's good to know!

When I learned about this command 10 years ago, I tried it on a Debian system that I had no use for anymore.

There's a big warning before it lets you execute the command.

That's true only of distributions that alias rm with rm -i.

Almost 20 years ago, only Red Hat did it. And it felt wrong to me :-/

Modern versions of rm require you to pass --no-preserve-root. According to Wikipedia [1] this has been the default (in upstream) since 2006. Of course it took distros some time to actually update to GNU coreutils 6.4 (especially long-term support systems like CentOS), but it's been over a decade, so the change should have been implemented everywhere by now.

[1]: https://en.wikipedia.org/wiki/Rm_%28Unix%29#Protection_of_th...

Isn't that only a GNU and FreeBSD thing? I think other Unixes will still let you rm -rf /.

On FreeBSD you can do

    sudo dd if=/dev/random of=/dev/mem
and it will do exactly that, write random crap into your memory without any sort of safeguard, causing a spectacular crash and a console that looks like it's having a seizure. Linux won't let you do that unless it's been compiled with a flag to enable full access to /dev/mem and /dev/kmem.
I want to try that dd command. For fun. Just to see what happens on OpenBSD.

There are no lasting effects, are there? Besides a crash, I shouldn't have any memory issues afterwards, right?

Worth noting that it can be trivially worked around by adding an asterisk at the end: `rm -rf /*` still works.
Fair enough, but on the other hand there's nothing preventing you from just putting --no-preserve-root in the command either. I see the feature as something to prevent accidents, not as a way to secure the rm command.
That version will helpfully leave behind any .files lingering in /
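
The dot-file point can be checked safely in a throwaway directory. This is a minimal Python sketch, assuming default shell settings (for example, bash without `dotglob`); Python's `glob` module follows the same convention that `*` does not match names starting with a dot. The file names are made up for the demonstration.

```python
import glob
import os
import tempfile

# Build a throwaway directory containing one visible entry and one dot file.
with tempfile.TemporaryDirectory() as root:
    for name in ("visible.txt", ".hidden"):
        open(os.path.join(root, name), "w").close()

    # Like an unquoted shell "*", glob's "*" skips names starting with a dot.
    matched = sorted(
        os.path.basename(p) for p in glob.glob(os.path.join(root, "*"))
    )

    # os.listdir shows everything that is really in the directory.
    everything = sorted(os.listdir(root))

print("matched by *:", matched)
print("actually present:", everything)
```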
Maybe run `sudo chmod -R 000 /` instead. Can't get charged with destroying any data, but it's a huge pain to get a system working again from that. Only done it twice; hope never to do so again.
Curious to know how you recover from that? How do you know what permissions to assign back to files and directories?
Depending on how you define "fix," the common way (if the machine has already been rebooted and you're locked out of the session) is to boot via a live distro, mount the filesystem, and change permissions to get a usable system back. There are various other methods depending on the machine state and requirements, too, so it's definitely recoverable from a working machine standpoint. Changing permissions back to what they used to be is a rabbit hole that varies depending on your distro, machine-specific requirements, and any special permissions set up by you and your admins.
It's a pain-and-a-half. I'm not sure any system ever recovers fully, but you basically run chmod -R's on most of the important directories and then fix the many services that will surely fail once you can get a shell.

More specifically, I used a bunch of these links until I could get a shell, then manually-fixed the rest.

https://askubuntu.com/questions/308939/how-to-reset-default-...

https://askubuntu.com/questions/508359/restore-default-syste...

Can confirm. I kept getting permission errors on something I was testing, and in desperation ran `chmod -R 777` thinking I was in my project dir. I was actually at root. After a bunch of attempts to recover, I ended up saving the data I cared about and doing a fresh install. 0/10 would not recommend.
Curious to know what circumstances led up to someone doing that and if they were charged for other violations.
A modern shell should prevent you from directly executing pasted code.
Apple Books has so many of these ridiculous little incursions against good taste. Just to vent, I made a list of all the ones I found: http://macos-design-review.com/books.html

It really does seem like software at Apple is being designed by people who just aren't familiar with the platform.

It also messes with dictionary lookup! It adds this nonsense if you select more than two Chinese characters, and so it makes it really hard to look up four-character Chinese idioms in my dictionary (which are, inexplicably, frequently missing from the built-in “lookup” dictionary).

Apple Books is so close to being a nice reading interface, but there are so many stupid little bugs. Highlighting is another horrid little bug that can easily wipe out a full chapter’s worth of highlights with one tap...

The entire Apple dictionary system is a PoS AFAICT. It only works with perfect spelling and it only works with single words. How about finding closest match and how about letting me select lots of text and go through all the words.

I can only guess no one on the team actually uses the feature.

As a language learner I use Rikaikun for Japanese on my desktop browser. While I'd really like it if I could also run a similar extension in my mobile browser, I'd be okay if Apple's dictionary actually worked with a fuzzy search and multiple terms... but no....

I solved the Apple Books problem with an Automator script, using the "Copy to Clipboard" action. Then it can be assigned a shortcut in Keyboard preferences. https://imgur.com/a/sG2isap
Thanks for the method. I’ve written up detailed instructions for it at https://apple.stackexchange.com/a/382603/21473.

As you can see in that answer, I found a better method to assign the keyboard shortcut. Assigning the shortcut within App Shortcuts instead of Services lets you use the normal ⌘C shortcut in Books while not affecting copying in other apps.

Limiting the action to "Books" seems to not affect other apps as well. The main reason I did not use ⌘C is because it doesn't work when the context menu is displayed (and it gets displayed immediately after selecting text).
This was really helpful, thanks for sharing!
On Windows, clipboard items have hidden tags added when they are copied, so the program you are pasting into can make these decisions, instead of the program you are copying from needing to manipulate the copied text. For example, if I copied that code from a web page and pasted it into the console, it would paste just the code, but if I pasted it into OneNote, it would check the tags and add a line underneath with the URL the code was copied from.

https://devblogs.microsoft.com/oldnewthing/20140721-00/?p=45...

I hate this behavior! Switched to a different eBook reader on my iPad because of it. I like copying interesting snippets to OneNote.
Out of interest what did you choose? I'd love something different that syncs between MacOS and iPad for ePub files.
> aren't even the ASCII quote character; you get horrid unicode quotes

Why should they be ASCII, and what’s wrong with Unicode?

Because the next step when you face this problem is doing a search-replace of the quotes to remove them. ASCII quotes are on your keyboard, so you can actually type the command to remove them.

Unicode quotes probably aren't, so it's extra annoying to remove them, and they're not even the same character for start and end so you have to do it twice.

The time spent for removal is then vastly higher than with ASCII quotes, which in the context of unwanted characters may well qualify as "horrid" imo.

Yes, this is pretty much what I meant. The "horrid" part is that I would like to do a quick

    /"<CR>xnx
(or similar) in vim to delete the quotes on a multi-line block. Even that is annoying. But with the unicode quotes I have to do a bit more conscious searching, which is a distraction.

Of course, the fact that the text is modified in any way is user-hostile. I don't mind re-typing a code example (since I'm trying to learn it), but I like to copy-paste sometimes (e.g. data literals).

I paid for a DRM-free ebook that's meant to be copy-pasted. We have a universal paradigm for how plaintext copy works. Apple charges a premium for good design. Ebook readers are generally supposed to get out of your way and let you focus on content. There is no reason for Apple Books to have this behavior.

PowerShell acknowledged smart quotes and dashes in its language design and allows proper quotes interchangeably with ASCII straight quotes. I always found that an interesting design choice, although I'm not sure how useful it actually is to allow people to copy/paste code from dubious web pages. On the other hand, they'll do that anyway, so why bother making it harder?
This is going down a rabbit hole. “This” is an English quote, whereas „this“ is a German quote; note that the open quote character in the English quote is equal to the close quote character in the German quote.

Then there is a less-often used style of quoting, similar to the French style, but: «this» is the French quote, whereas »this« is the German version. Yes, the open and close quotes are swapped.

I applaud the idea, but just want to point out that there are dragons lurking in the shadows.

All of “”" parse the same as ", so bracketing doesn't really enter into it. This is just to prevent smart quotes from destroying code, not to have more levels of nesting.
Additionally note that «this» is mostly used in French online and in Switzerland, while « this » is more common in professionally typeset works in France. Those are U+202F NARROW NO-BREAK SPACEs between the guillemets.
Don't forget that the French quotes should have (non-breaking) spaces inside « like this » - only when used with the French language though, other languages usually don't use such spaces.
I've never liked this sort of thing in code, because as things stand they're a bit of a faff to type, and some tools don't support them very well. But it's a shame, because there's loads of types of quote mark and bracket in Unicode, not just the ``...''-type pair commonly used in English. So you could use one type for string delimiters and then use the other types, unescaped, in the string itself.

(The Unicode quotes also mostly come in matching pairs, so, for good or for ill, they'd in principle be nestable.)

isn't that an input problem not an output problem? Is it giving you the quotes directly from the ebook or is it converting them on copy-paste? I'd assume there was a bug on the input side, that some tool wrongly converted `printf("Hello World")` to `printf(“Hello World”)` so what's in the ebook is wrong already. If so the issue is further up the chain.
In this context, the selection within the ebook contains no quotes, whether ASCII or Unicode. The problem is that Apple’s Books app adds quotes and attribution text around the selected text. Books knows that the text it is putting in the clipboard isn’t what you selected, and it doesn’t care.
It's not changing quotes in the code. It is adding quotes. You have to delete them, not search-and-replace them, and it doesn't matter what kind of quotes they are, you have to do the exact same thing.
Search and replace is a pretty reasonable way to delete a quote; lots of editor UIs make that much more convenient than finding the end of some blob of text you just pasted into some pre-existing context.
Depending on the regex engine, it may not be that much higher.

    sed -E -i 's/<open quote>|<close quote>//g' <file>

I think they are talking about styled quotes, the kind that point inward, which a REPL won't interpret as an ASCII quote and which will throw some type of syntax error.
Ok but those are standard, valid characters like any others. Why are they ‘horrid’?
Some people hate unicode quotes because lots of software will automatically convert ASCII quotes to them when you copy/paste your code to Slack or email to share with others.
Because they result in a syntax error.
Even though the OP seems offended by the presence of the Unicode quote marks for some reason, they have nothing to do with the issue of Apple Books surrounding the copied text with those quote marks and adding an attribution line. If they'd surrounded the text with ASCII quote marks and added an attribution line, it would have still been a syntax error, wouldn't it.

Personally, I don't see any reason to hate typographically correct quote marks when used correctly, which (obviously) doesn't include code samples.

But they wouldn’t do what you wanted anyway if they were ASCII. What’s the problem specifically with them being Unicode?
The problem specifically is that they are not the ASCII quote character. There is only one ASCII quote character, and that's the one used by programming languages. Any other quote or quote-like character is outside the ASCII range, and must therefore be Unicode (or another non-ASCII code page).
They're horrid because you can't easily type them on a keyboard.
Because for some reason terminals are stuck in the '70s and don't accept those characters as quotes. Anything but ASCII trips them up.

It seems such an obvious interface to innovate on, but it seems to run into terminal wizards' sense of purity.

The problem, such as it is, is with languages. Terminals (mine at least) handle most of Unicode just fine; admittedly I've seen it choke on emoji, but punctuation, nah.

The vast majority of programming languages are defined in terms of ASCII and only ASCII. I don't care for this, personally.

I've given some thought to how to do quoting right in a programming language, and implemented «guillemets» as an experiment. But it's challenging, you need to decide what to do with all of “”‟„"″ and there aren't obvious pairings, like „this is a sentence” and “this is a sentence” and »this is a sentence» and «this is a sentence» and »this is a sentence«, it ends up feeling like rather a lot of effort for what you get in return.

Oh, one of those characters I typed isn't a quotation mark, did you catch which one? Hacker News won't even let me type two of them!
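
For anyone who wants to check rather than squint, Python's standard `unicodedata` module reports each character's official name and general category. This is a small sketch; the code points below are an assumption about which six characters appear in the list above (“ ” ‟ „ " ″), written as escapes so the snippet does not depend on how the page rendered them.

```python
import unicodedata

# Code points assumed to match the six characters listed in the comment above.
candidates = ["\u201c", "\u201d", "\u201f", "\u201e", "\u0022", "\u2033"]

for ch in candidates:
    name = unicodedata.name(ch)
    category = unicodedata.category(ch)  # e.g. Pi = initial quote, Pf = final quote
    print(f"U+{ord(ch):04X} {category} {name}")

# Characters whose official name does not mention a quotation mark.
not_quotes = [
    ch for ch in candidates if "QUOTATION MARK" not in unicodedata.name(ch)
]
print("not a quotation mark:", [f"U+{ord(ch):04X}" for ch in not_quotes])
```

Under that assumption, the odd one out is U+2033, whose Unicode name is DOUBLE PRIME. The categories also illustrate the pairing problem: the opening and closing marks do not fall into one tidy open/close scheme across languages.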

Having multiple almost identical ways of achieving things is a bug magnet in programming languages. Differing programmers will presumably use different styles (why else even support differences?), and if you mix code like that, bugs ensue. This is a hassle not just because autoformatters are liable to make churny changes (which distract from real changes, which makes history harder to understand: bug magnet), but also because people will make mistakes e.g. when find-replacing (another minor bug magnet!). Then there's the fact that some of those quotes aren't symmetric, so you need to think of something to have happen when they're unpaired or used incorrectly, and it wouldn't surprise me if no matter what you did, you surprise somebody (bug magnet!).

Sure: these are all quibbles, and a language wouldn't die from all these minor cuts. But they're definitely downsides, not upsides. So: where is that upside? Why would you ever support something like this? "It looks a little nicer" sounds like a pretty weak argument compared to "it's inconsistent, hard to machine process, and may cause a few bugs"...

> I've given some thought to how to do quoting right in a programming language

Already fully designed and implemented in Raku: https://docs.raku.org/language/unicode_entry#Smart_quotes

Test online: https://tio.run/##K0gtyjH7//9Rw7ySjMxiBSBKVChOzStJzUtOfdQw9/...

A language or interface is defined by the set of symbols mutually agreed upon. If you allow Unicode that number simply explodes, thus bloating and complicating the language/interface and its implementation as well. In effect it is no longer the same language but becomes a different, more complex one.

The tradeoff is not worth it for programming languages the same way as learning all the scripts of the world is not worth it for one human being, just to accomplish tasks that don't need all these symbols in the first place.

It is a solved problem, however. We chose not to use it (in terminals).
The terminal can usually handle it, provided it's UTF-8. Most of the editors can handle it. It's very specifically the programming language specifications which don't include them. Even the ones that allow Unicode identifiers.

Also, they're difficult to type - my keyboard has a " key but not a “ or ” character. I had to copy and paste those from the unicode website since I don't have a suitable input method set up.

The idea that "terminal wizards" reject this is rather undermined by the fact that "terminal wizards" made open and close double quotation marks accessible as simply [Group2]+[B04] and [Group2]+[B05] on an ISO 9995-3 conformant keyboard.

Or [Shift]+[Option] [B04] and [Shift]+[Option] [B05] if there's no explicit [Group2] shift.

Open and close single quotation marks use the same keys.

"terminal wizards" have done quite the opposite of rejecting this.

That seems like an obviously bad solution. That is like writing [a(b{c]d}e) except that the quote characters look even more similar.
It's easy (solved) to map Unicode to control characters such as quotes (and when you think about it, quotes are often there to deal with ambiguities stemming from the limited number of allowed characters in ASCII). So you could have a terminal accepting such input, and a few helper functions that normalize it into ASCII and so on.

After all, users of non-ASCII languages (which is nearly everyone) already know how to deal with it without ambiguity. It's only ambiguous if you don't use encodings, and that should never happen anyway.
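
A helper of the kind described here could look roughly like the following Python sketch. The mapping table and the name `normalize_quotes` are illustrative assumptions, not taken from any terminal or shell, and the table is deliberately not exhaustive.

```python
# Map common typographic quotes to their ASCII counterparts.
# The table is illustrative, not exhaustive.
QUOTE_MAP = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u201e": '"',  # double low-9 quotation mark
    "\u00ab": '"',  # left-pointing guillemet
    "\u00bb": '"',  # right-pointing guillemet
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
})


def normalize_quotes(text: str) -> str:
    """Return text with typographic quotes replaced by ASCII quotes."""
    return text.translate(QUOTE_MAP)


print(normalize_quotes("printf(\u201cHello World\u201d)"))
```

The catch, as other replies point out, is that a blind mapping also rewrites typographic quotes that were meant literally, such as an apostrophe inside a string, so a normalizer like this changes meaning in some inputs.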

Often in code environments, it's hard to tell the difference between " and “.
Don’t despair, there are workarounds for this. See the many answers to https://apple.stackexchange.com/q/137047/21473 ‘Don't want iBooks to always paste the “Excerpt From” of what I have copied’.

I recommend the solution in https://apple.stackexchange.com/a/382603/21473 (which is an improved variant of stassats’s solution in https://news.ycombinator.com/item?id=22355322). The method is to use Automator to set up a Quick Action that copies the text while bypassing the clipboard-modifying behavior of Books. Then you configure a keyboard shortcut of ⌘C for the Quick Action, only in the Books app, so you don’t have to change the way you copy.

If your shell has a function for copy and paste integration (like fish), just override it with a sed script to remove that crap. Same for vim’s clipboard integration.
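
As a rough idea of what such a filter would do, here is a sketch in Python rather than sed. It assumes the clipboard text has exactly the shape shown at the top of this thread: the selection wrapped in curly quotes, followed by an "Excerpt From:" trailer. The function name `strip_books_wrapper` is made up, and other versions or locales of Books may format the trailer differently.

```python
import re

# Matches the attribution trailer, from "Excerpt From:" to the end of the text.
# The wording is taken from the example in this thread and may vary elsewhere.
TRAILER = re.compile(r"\s*Excerpt From:.*\Z", re.DOTALL)


def strip_books_wrapper(clip: str) -> str:
    """Remove the attribution trailer and the surrounding curly quotes."""
    text = TRAILER.sub("", clip).strip()
    # Only unwrap when both the opening and closing curly quotes are present.
    if text.startswith("\u201c") and text.endswith("\u201d"):
        text = text[1:-1]
    return text


sample = (
    "\u201cghci> putStrLn (pretty 10 value)\u201d\n\n"
    "Excerpt From: Bryan O\u2019Sullivan, John Goerzen, and Donald Bruce Stewart. "
    "\u201cReal World Haskell.\u201d Apple Books. "
)
print(strip_books_wrapper(sample))
```

Text that lacks the trailer and the wrapping quotes passes through unchanged apart from trimmed whitespace, so the filter should be harmless for copies made in other apps. A selection that itself contains the words "Excerpt From:" would be truncated, which is one reason to treat this as a sketch.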
Is this not a setting you can turn off? I would say it is a reasonable function if it can be toggled on/off (I would prefer default off, YMMV).
I remember that kind of thing as far back as Encarta(?)

Back then, I was writing papers and it was kind of handy.

OneNote does this on the paste end, which is really nice for collecting references and quotes, as you get the source automatically.
“ghci> putStrLn (pretty 10 value)”

    Excerpt From: Bryan O’Sullivan,
My instinct is that this is for legal attribution purposes, i.e. a legal moat.

It sounds ridiculous, but in my experience, there are 1000 things that lawyers will identify as possible sources for legal trolling, and this looks like that. Apple has a zillion dollars in the bank and they are sued daily. All it takes is a judge who may or may not be knowledgeable, or a sufficiently grey area ... and you have a major problem.

So it could be risk mitigation: they are including the attribution in the copy so as to not be perceived as partisan to some kind of IP dissolution paradigm.

On the other hand, this could just be one of those wayward culty Apple kinds of things they think is 'good UI' when it's not.

Unlikely - see my comment elsewhere in this post.
So yes, I see in your comment that given the context of 'Apple Books' it's probably a contractual issue, I agree there.

Though I can see that a reasonable legal opinion might not support my more cynical view, when the risk dynamics are high, a different kind of logic creeps in.

I worked for a software platform that refused to provide usable snippets of code anywhere in the documentation for fear of liability.

We also 'perpetually sued' an organization that was infringing on our brand, even though they were really helpful to us (a user-managed fan-site which used our name in theirs) and we otherwise had a positive relationship with them. Our 'perpetual legal action' was merely cover to give the appearance that our brand was being defended, without which action we could feasibly lose rights to it. So, literally suing people, while dragging it out and 'nudge-nudge-winking' them to not worry about it, literally inviting the people we were suing to events, dinners etc..

I don't think most people understand the risk dynamic in many large organizations with respect to these issues, the calculation seems bizarre even to most regular product types, it really takes a legal view to understand this. And also the personal fears and biases of the executives.

> Our 'perpetual legal action' was merely cover give the appearance that our brand was being defended, without which action, we could feasibly lose rights to it.

Was that really easier than just licensing your trademark with a strong contract that preserved your rights while letting the website use your name for one specific purpose? Make the licensing costs $1 per decade or something if it is a question of money.

Misusing the legal system in this way seems like it could backfire if the third party did something you genuinely wanted to stop and they could prove your ongoing action was a sham.

I'm not a lawyer, and I was not involved, other than I knew there was a many-years-long legal action regarding branding against another company with whom we had otherwise a really good relationship.

My point is not about branding or lawyers, it's about risk.

Said company gave up a huge amount of money to patent trolls, and their lawyers were empowered to mitigate risk, with the backing of the CEO, their rationale being: "We make a huge amount of money over here, why on earth would we allow that to be risked by speculative activity over there?" which is not entirely irrational, it just depends on implementation.

Everything is so gray, it's so hard to tell. Consider that we have no idea how open-source software licensing will work out because it hasn't been really pushed through the court system, and how limiting that ambiguity is for the entire industry.