Hacker News
by ggambetta 2277 days ago
Does it support a 400k* word text file without slowing down to a crawl (i.e. a 5-10 second lag between a keypress and a change)? I've tried many markdown editors for my novel, and sadly most editors fail this very basic test :(

* Edit: I misread the output of wc. The novel is ~70k words, ~400k characters. I'm leaving the 400k figure above because of the discussion that follows, because it should still be a reasonable use case, and because it makes it even more surprising that most editors fail this even more basic test.
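
For anyone else tripping over the same thing: words and characters are different columns in the output of wc. Here is a minimal Python sketch of roughly what `wc -l -w -m` counts. The function name `wc_counts` and the sample string are made up for illustration, and the real tool has locale-dependent details this ignores.

```python
def wc_counts(text: str) -> tuple[int, int, int]:
    """Return (lines, words, characters), roughly like `wc -l -w -m`."""
    lines = text.count("\n")   # wc counts newline characters, not "visual" lines
    words = len(text.split())  # whitespace-separated tokens
    chars = len(text)          # characters (code points), not bytes
    return lines, words, chars


if __name__ == "__main__":
    sample = "one two\nthree\n"  # hypothetical sample text
    lines, words, chars = wc_counts(sample)
    print(f"{lines} lines, {words} words, {chars} characters")
```

Reading the characters column as the words column is enough to turn ~70k into ~400k.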

7 comments

As a little experiment prompted by your question (using the original figure actually) I repeated a dozen or so ~lines~ paragraphs of Lipsum... a lot, until I ended up with a text file with a little over 17K lines... 532K words... and either 3.1M or 3.6M characters, depending on whether you count spaces.

MacDown (a native MarkDown editor for macOS) opens it in a snap, has no lag editing, and even the rendered preview (which scrolls with the editor side) has no lag. Memory usage was about 150MB.

For shits and giggles I then pasted that content into the OP's linked tool. The Safari process for that tab is still sitting at 100% CPU a minute or so later, and is unresponsive.

Is this example a bit ridiculous? Sure. But it demonstrates the difference between a native app - oh, and the process for that tab just disappeared because Safari killed it - and trying to make an app in a web browser.

I really like MacDown but it has this weird bug of sometimes hanging for up to 30 seconds with a spinning beach ball and then continuing as if nothing happened.
I was curious about how my text editor (http://github.com/alefore/edge) would fare here. I decided to try it with a file with 400k lines, each with ~5 or so words. It worked reasonably well: https://asciinema.org/a/cTMmQMhqLcIQrDbkqx3JFpFUn
Notepad++ can edit such files. At least that is what I routinely use for much larger files in Windows. Even 500k of text is small fry.
At 20 words per line, that is 20k lines... why are you editing everything in a single file?
Edited with the corrected math. It's ~3.5k lines, ~70k words, ~400k characters.

Still, why wouldn't I want to edit it as a single file?

> why wouldn't I want to edit it as a single file?

> ...without slowing down to a crawl (i.e. a 5-10 second lag between a keypress and a change)?

I think you answered your own question. :P

Also, you can edit chapter 53 without having to scroll down to line 130,465. Basically, the same reasons bills are split up

> I think you answered your own question. :P

I disagree! We're having the following conversation:

Me: "I want to edit this as a single file, but I can't, because it's slow due to inefficient data structures"

You: "Why would you want to edit it as a single file?"

Me: "Why wouldn't I?"

You: "Because it's slow due to inefficient data structures!"

My point is, I want to edit it as a single text file for creative / process reasons, and there are algorithms and data structures that make this possible, as evidenced by the editors that do let me do it in real time, so there's no real reason not to.
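
To make "data structures that make this possible" concrete, here is a minimal sketch of one classic option, a gap buffer. This is an illustration under my own assumptions, not a claim about what any of the editors mentioned in this thread actually use; the class and method names are hypothetical.

```python
class GapBuffer:
    """Text buffer that keeps a gap of free slots at the cursor.

    Typing at the cursor fills the gap in O(1) amortized time; only moving
    the cursor costs time proportional to the distance moved.
    """

    def __init__(self, text: str = "", gap_size: int = 64) -> None:
        self.buf = list(text) + [None] * gap_size
        self.gap_start = len(text)    # first free slot
        self.gap_end = len(self.buf)  # one past the last free slot

    def __len__(self) -> int:
        return len(self.buf) - (self.gap_end - self.gap_start)

    def move_cursor(self, pos: int) -> None:
        """Slide the gap so that it starts at text position `pos`."""
        pos = max(0, min(pos, len(self)))
        while self.gap_start > pos:  # move chars from before the gap to after it
            self.gap_start -= 1
            self.gap_end -= 1
            self.buf[self.gap_end] = self.buf[self.gap_start]
        while self.gap_start < pos:  # move chars from after the gap to before it
            self.buf[self.gap_start] = self.buf[self.gap_end]
            self.gap_start += 1
            self.gap_end += 1

    def insert(self, s: str) -> None:
        """Insert `s` at the cursor, growing the gap when it runs out."""
        for ch in s:
            if self.gap_start == self.gap_end:
                extra = max(64, len(self.buf))
                self.buf[self.gap_end:self.gap_end] = [None] * extra
                self.gap_end += extra
            self.buf[self.gap_start] = ch
            self.gap_start += 1

    def text(self) -> str:
        return "".join(self.buf[:self.gap_start] + self.buf[self.gap_end:])


if __name__ == "__main__":
    gb = GapBuffer("hello world")
    gb.move_cursor(5)
    gb.insert(",")
    print(gb.text())  # hello, world
```

Ropes and piece tables are other well-known choices with different trade-offs; the point is only that a keypress does not need to touch the whole file.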

When doing large-scale editing changes, I much prefer the single file approach.

Things like:

+ Changing the age of a character

+ Changing the argument order for a particular function

Those aren't simple find/replace solutions that you can run across files.

You can do the same thing with multiple files, but it's slower and can be harder to hold a context in your head.

Why can't you edit it with some fraction at a time?
I could, and I did for some time, but it's a PITA. The question should be "why should I?". I've since found a Markdown editor that works well for a file this size (Typora - I also had good experiences with Texts).
I've done this in various formats, including single editor, Scrivener, and so forth.

I've found the process of breaking a book up into "chunks" to be a useful one for me. If I can't chunk it, there's probably something wrong with the structure.

But hell, if what you have works for you, go for it. I'm currently using the LeanPub build system which seems to be just enough structure without getting in the way.

Sure, I could edit chapters, or even "acts", separately, and I did for some time; but the novel is a single work, and searching text, replacing text, etc, is much simpler with a single file.
I can see both sides of this. In some ways, this is the old "seat of your pants" vs. "plot the book out" discussion. The important thing is whatever works for you.
Oh, I'm extremely on the plotting end of the spectrum :) I could have written the chapters individually and in a random order. More details: https://gabrielgambetta.com/tgl_swiss_trains.html
Why should you? Perhaps if there are other features of a particular editor that make it worth it. I'm otherwise not one to defend anything that's potentially unnecessarily slow :)
You could help future readers by mentioning the name of your discovered-editor.

(Me? I'd use Emacs..)

Good point! Edited.
Is a 400 thousand word text file "very basic"?
No, the "very basic test" is not having a 10 second latency in a text editor. Come on, this was a solved problem in the 90s.
If it was a solved problem in the 90s, can't you just use a text editor from the 90s? No need to bash on this project just because it doesn't fulfill your use case.
Because there are contemporary Markdown editors that do work as expected. I didn't bash this one, I asked.
What editor did you end up with?

Modern editors are fairly complex. Splitting a byte buffer and showing ASCII line by line is simple. But modern editors have to deal with Unicode, where some characters (those stored as surrogate pairs in UTF-16) take two code units, and can be followed by modifiers such as the skin tone of emojis. Then some characters are wider than others. So a modern editor must first parse the string encoding, then parse the language (Markdown) for further formatting and coloring or for building a WYSIWYG view, usually on every keystroke. The expensive part is showing the text on the screen and rendering the fonts, whereas old editors didn't have fonts.
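
A small Python sketch of the Unicode point, using only the standard library. The helper name `measure` is made up; the example string is a thumbs-up emoji followed by a skin tone modifier, which a user sees as one character.

```python
import unicodedata


def measure(s: str) -> dict[str, int]:
    """Count the same string in three different units."""
    return {
        "code_points": len(s),
        "utf16_units": len(s.encode("utf-16-le")) // 2,
        "utf8_bytes": len(s.encode("utf-8")),
    }


if __name__ == "__main__":
    thumbs_up = "\U0001F44D\U0001F3FD"  # THUMBS UP SIGN + medium skin tone modifier
    print(measure(thumbs_up))
    # {'code_points': 2, 'utf16_units': 4, 'utf8_bytes': 8}

    # Display width also varies: East Asian "wide" characters take two cells.
    print(unicodedata.east_asian_width("A"))       # Na (narrow)
    print(unicodedata.east_asian_width("\u4e2d"))  # W (wide)
```

So "one character" on screen can be two code points, four UTF-16 code units, or eight UTF-8 bytes, and cursor movement has to account for that.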

I'm using Typora, which works pretty well for this use case. Open source, cross-platform, very nice. I'm in no way affiliated with it, I just like it :)

I get the parsing difficulties. Generally, at least for markdown, pressing a character should affect only the line (paragraph) that contains it, so worst case scenario, you need to parse the entire paragraph. Worst case scenario, you need to re-render whatever fits on the screen. I refuse to believe it's not possible to do this efficiently with 2020 hardware, and I have proof, in the form of editors that work.
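
As a rough sketch of what I mean by "only the paragraph that contains it": given the cursor offset, find the blank-line-delimited block around it and re-parse just that slice. The function name `paragraph_bounds` is hypothetical, and this assumes paragraphs are separated by exactly `\n\n`; real Markdown has constructs (fenced code blocks, reference-style links) that can reach beyond one paragraph, so treat it as the common case only.

```python
def paragraph_bounds(text: str, pos: int) -> tuple[int, int]:
    """Return (start, end) of the blank-line-delimited paragraph containing pos."""
    start = text.rfind("\n\n", 0, pos)        # nearest blank line before the cursor
    start = 0 if start == -1 else start + 2   # skip past the two newlines
    end = text.find("\n\n", pos)              # nearest blank line after the cursor
    end = len(text) if end == -1 else end
    return start, end


if __name__ == "__main__":
    doc = "First para.\n\nSecond *para* here.\n\nThird."
    cursor = doc.index("*para*")
    start, end = paragraph_bounds(doc, cursor)
    print(repr(doc[start:end]))  # 'Second *para* here.'
```

The slice that needs re-parsing is a few hundred characters regardless of whether the file is 400k characters or 4M.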

If it's 2020, I try to append a character to a 400k file I have open, and this takes multiple seconds, it's because someone implemented Schlemiel the Painter's Algorithm, not because the hardware is slow or the file is too big.
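
For anyone who hasn't met the term, here is a hedged illustration of the pattern in an editor-like setting: working out which line an offset is on. The slow version walks from the start of the file every time, the way Schlemiel walks back to the paint can; the other builds an index once and binary-searches it. Names are made up, and this is not taken from any real editor.

```python
import bisect


def line_of_slow(text: str, offset: int) -> int:
    """Rescan from the start of the file on every call: O(offset) each time."""
    return text.count("\n", 0, offset)


class LineIndex:
    """Record where each line starts once, then answer lookups in O(log n)."""

    def __init__(self, text: str) -> None:
        self.starts = [0] + [i + 1 for i, c in enumerate(text) if c == "\n"]

    def line_of(self, offset: int) -> int:
        # Index of the last line start that is <= offset.
        return bisect.bisect_right(self.starts, offset) - 1


if __name__ == "__main__":
    text = "ab\ncd\nef"
    index = LineIndex(text)
    for offset in range(len(text)):
        assert line_of_slow(text, offset) == index.line_of(offset)
    print(index.line_of(4))  # 1 (zero-based: the line containing "cd")
```

Calling the slow version once per line of a file costs time quadratic in the file size, which is how a 400k file ends up feeling like a 400M one. A real editor would also have to keep the index up to date after edits, which this sketch leaves out.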

Yes, it is. This is a quite typical size of a book. Typically novels have about 100 thousand words, but four times that size is not at all uncommon.
In the context of a book, yes. But I mean in the context of an online Markdown editor. (Or even a Markdown editor at all. Can you actually write books in Markdown?)
The Rust book is written in Markdown https://github.com/rust-lang/book as well as plenty of other books that I can't remember off the top of my head.
> Can you actually write books in Markdown?

Yes, you can. The "Programming Perl" book by Larry Wall was famously written entirely in "POD" format, which is just like markdown. I don't know if it was split into several files.

I wrote the novel I'm referring to, and a Computer Graphics textbook (with images and equations and stuff).
That's untrue. Novels are usually in the 90-110k range, depending on genre. If you're an established author, especially in fantasy or sci fi, insane doorstoppers like that do occur, but 400k is an outlier even for sff.
Here you have a list of very famous books with their word counts. Quite a few of them have more than half a million words. Thus, it is reasonable to expect that a text editor is able to handle that size. After all, this is a tiny amount of data compared to what a program that does image or video processing has to deal with, so there's 0 reason for a text editor to be "laggy" when dealing with just a few megabytes.

https://blog.fostergrant.co.uk/2017/08/03/word-counts-popula...

Seriously, that’s stupid. You’re collecting outliers. Unpublished writers trying to hawk 400k word novels are for the most part delusional. Every single piece in your list that goes into that territory is something written by an established author.

Though, to be fair, an author organizing a work of that size in a single markdown document has bigger problems than key lag.

That doesn't exactly make your point.
Books should not be written in a single text file, though. That is asking for trouble.
A good editor can handle them without any trouble. I just downloaded Moby Dick from Project Gutenberg and opened it in Emacs. It opened instantly, I could immediately go to the end (line 22333), scroll around, count the words (there are around 222617), count the occurrences of the word 'whether' (there are 91, the last of which is on line 21769), make a change somewhere in the middle. All of this is instant.

You can do all that instantly in Vim too (... and I just checked to make sure). A megabyte of text is just not a lot of text for a good editor.

Why not? Because of bad text editors? Or is there some fundamental reason for that?
Because it is extremely hard to work with it?

Same reason why we do not put programs in a single source file.

Why is it extremely hard to work with the whole novel in a single file? Because most software for writing prose is not up to the task?
Break the novel into chapters? Each chapter in a separate md file?
Use and learn vim. It's been around for ages.
I did for a long time. It doesn't do WYSIWYG markdown, so I've switched to something that does (Typora).
Did you spend ten seconds googling whether there was a vim plugin that could do that for you? I did and I found this: https://github.com/suan/vim-instant-markdown
> Did you spend ten seconds googling

Please don't be a jerk in comments here.

https://news.ycombinator.com/newsguidelines.html

Thanks. That's nice, but not exactly what I meant. By WYSIWYG editing I mean visual editing like Word or Google Docs, but saving Markdown; not typing Markdown and having a real-time preview separately.

Typora does this nicely; I type an asterisk, it looks like an asterisk, but when I close the asterisk, the text in between goes italic. The asterisks are hidden unless the cursor is within the italic part. I can double-click to select a word, press cmd-I, and the word goes italic (by adding the asterisks). Etc.

I realize this is a matter of personal preference, but for me, using editors that are changing the text I write as I write it between code and markup is really annoying. I can buy using pure WYSIWYG (i.e. Word) or typing plaintext, but something is just off for me when the editor tries to do both. I prefer knowing whether what I'm looking at is the actual contents of the file or just a representation of them, but whatever floats your boat obviously. Since a terminal-based editor will never be able to do what you require it to, unfortunately this means your selection for "optimized" editors is limited.