	by junkerm 1581 days ago
	I was reading up on ropes a couple of months ago while researching Emacs internals. Apparently Emacs and VS Code use gap buffers. It would be interesting to know the reasons for those decisions.

3 comments

gumby 1581 days ago

EMACS was just a package on top of TECO (the way LaTeX sits on top of TeX). Later reimplementations (e.g. GNU Emacs) just continued to use the same design.

So why did TECO use a gap buffer? The gap buffer was an easy way to manage an edit buffer back when machine clock speeds were measured in kilohertz and RAM in kilobytes. There were no fonts (no rendering at all), six or seven bit characters, oh, and the machines were often timeshared.

Likewise, vi is just a visual addition to ed, itself a clone of the Multics qed, itself a clone of qed on earlier machines. Those machines were, by modern standards, equally resource starved.

rstat1 1581 days ago

I believe VS Code's editor is actually built on a variation of a piece table (they call it a piece tree).

https://code.visualstudio.com/blogs/2018/03/23/text-buffer-r...

Not sure if this is still the case, but it was as of 2018.

It's really kind of fascinating to me, all the different ways we've come up with over the years just to manipulate text.
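
For anyone who hasn't met the piece table mentioned above, here is a minimal Python sketch of the basic idea. It is an illustration only, not VS Code's piece tree: the names `Piece` and `PieceTable` are made up for this example, the pieces live in a plain list rather than a balanced tree, and deletion is left out. The original text is never modified; inserted text is appended to a separate "add" buffer, and the document is described by a sequence of pieces pointing into one of the two buffers.

```python
from dataclasses import dataclass


@dataclass
class Piece:
    source: str  # which buffer the piece points into: "original" or "add"
    start: int   # offset of the piece inside that buffer
    length: int  # number of characters the piece covers


class PieceTable:
    """Hypothetical minimal piece table (insert only)."""

    def __init__(self, text=""):
        # A real implementation would use an append-friendly buffer for "add";
        # a plain string keeps the sketch short.
        self.buffers = {"original": text, "add": ""}
        self.pieces = [Piece("original", 0, len(text))] if text else []

    def insert(self, pos, text):
        if not text:
            return
        pos = max(0, pos)
        new = Piece("add", len(self.buffers["add"]), len(text))
        self.buffers["add"] += text
        offset = 0
        for i, p in enumerate(self.pieces):
            if pos <= offset + p.length:
                # Split the piece containing `pos` and put the new piece between the halves.
                cut = pos - offset
                left = Piece(p.source, p.start, cut)
                right = Piece(p.source, p.start + cut, p.length - cut)
                self.pieces[i:i + 1] = [q for q in (left, new, right) if q.length]
                return
            offset += p.length
        self.pieces.append(new)  # empty table, or pos past the end

    def text(self):
        return "".join(
            self.buffers[p.source][p.start:p.start + p.length] for p in self.pieces
        )
```

Usage would look like `pt = PieceTable("hello world"); pt.insert(5, ",")`, after which `pt.text()` rebuilds the document from three pieces. Finding the piece is a linear scan here, which is the part a tree-based variant is meant to speed up.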

vidarh 1581 days ago

> It's really kind of fascinating to me, all the different ways we've come up with over the years just to manipulate text.

One of the things that strikes me is how much effort goes into making these editors work well with absurdly large files, rather than more editors punting on that and having people fall back on specific tools for huge files.

For my own editor I basically decided to ignore large files entirely and fall back on using emacs for the very rare case where I need to open an absurdly large file.

I know that's a luxury JetBrains doesn't have, because we've come to expect all editors to handle ridiculously large files well.

But the point is that for reasonably sized files (up to tens of thousands of lines) just an array of strings is more than fast enough, even in an editor like my personal one (not really usable for anyone else, though I've started packaging up parts of the code), which is written in Ruby and so pays a substantial overhead per string.

I think if I ever decide to make my editor handle really huge files, I'll "just" split them in suitably large chunks and lazily do the necessary processing as needed.
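
To make the "array of strings" approach above concrete, here is a rough Python sketch (the editor described in the comment is written in Ruby and its code is not shown, so the `LineBuffer` name and its methods are purely assumptions for illustration). Each line is one string, and an edit only rebuilds the line or lines it touches.

```python
class LineBuffer:
    """Hypothetical text buffer that stores one string per line."""

    def __init__(self, text=""):
        self.lines = text.split("\n")  # an empty buffer is a single empty line

    def insert(self, row, col, text):
        """Insert text at (row, col); embedded newlines split the line."""
        line = self.lines[row]
        parts = (line[:col] + text + line[col:]).split("\n")
        self.lines[row:row + 1] = parts

    def delete(self, row, col, count=1):
        """Delete characters within a line; at end of line, join with the next line."""
        line = self.lines[row]
        if col >= len(line) and row + 1 < len(self.lines):
            self.lines[row:row + 2] = [line + self.lines[row + 1]]
        else:
            self.lines[row] = line[:col] + line[col + count:]

    def text(self):
        return "\n".join(self.lines)
```

The cost of an edit is proportional to the length of the edited line plus, when lines are added or removed, the number of lines after it, which stays small for files in the size range discussed here.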

Ygg2 1581 days ago

> I think if I ever decide to make my editor handle really huge files, I'll "just" split them in suitably large chunks and lazily do the necessary processing as needed

That falls apart the moment you need some non-basic feature like matching open/closed brackets/elements, etc.

Or references. Or refactoring ...

vidarh 1581 days ago

I think that missed my point. If a file is large enough that this becomes an issue, it is tens of thousands of lines or more, which means it's rarely human written code. I'm perfectly happy to turn off anything fancy in that scenario, as on the rare (every few months at most) occasions where I open such monstrously large files it's usually a log file or similar, not code. Your mileage may well vary, but I'm not interested in writing a general purpose editor (some components of my editor are general purpose, and I'm packaging up some of them, but the editor itself is written entirely with my own usage patterns in mind - my editor is smaller than my .emacs file used to be). I think more people ought to focus on writing more opinionated editors rather than try to make everyone happy.

That said, it's not true that it needs to affect features like the ones you mentioned - all you need is to add a facade that gives your tools whatever interface to the buffer they need. As it is, my editor stores its buffer in a separate server process, because it was trivial to do so and gave me a bunch of benefits like multiple clients connecting to the same buffer, which also means I can trivially have out-of-process services augmenting the buffers with additional state lazily without needing to know anything about how the buffers are represented. The server process + a facade for the current buffer implementation + most of the basic editing operations the rest is built on is ~500 lines of code.

ben-schaaf 1581 days ago

fwiw Sublime Text uses a rope (using a rbtree).

ilrwbwrkhv 1581 days ago

That already makes me look forward to Fleet. If it can be as fast as Sublime, that will be life-changing.

Although WebStorm can't handle large TypeScript projects with a lot of computed types as well as VS Code.

ben-schaaf 1581 days ago

The data structures used for text editing are important but only a very small part of what makes ST fast. It's the native, gc-less code, the custom UI toolkit and constant attention to performance that pull that weight.

searealist 1581 days ago

You can't beat the efficiency of a gap buffer for just typing. The downside of gap buffers is that moving the cursor is a linear operation.
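
A minimal Python sketch of a gap buffer shows where both properties come from. The class and method names are illustrative assumptions, not taken from Emacs or any other editor. Typing writes into the gap at the cursor, while moving the cursor copies every character between the old and new positions across the gap.

```python
class GapBuffer:
    """Hypothetical gap buffer: one array with a movable gap at the cursor."""

    def __init__(self, text="", capacity=16):
        size = max(capacity, 2 * len(text))
        self.buf = list(text) + [None] * (size - len(text))
        self.gap_start = len(text)  # cursor position; first free slot
        self.gap_end = size         # first used slot after the gap

    def __len__(self):
        return len(self.buf) - (self.gap_end - self.gap_start)

    def move_cursor(self, pos):
        """Move the gap to `pos`; cost is linear in the distance moved."""
        pos = max(0, min(pos, len(self)))
        while self.gap_start > pos:  # shift characters from left of the gap to its right
            self.gap_start -= 1
            self.gap_end -= 1
            self.buf[self.gap_end] = self.buf[self.gap_start]
        while self.gap_start < pos:  # shift characters from right of the gap to its left
            self.buf[self.gap_start] = self.buf[self.gap_end]
            self.gap_start += 1
            self.gap_end += 1

    def insert(self, ch):
        """Insert one character at the cursor; O(1) unless the gap is full."""
        if self.gap_start == self.gap_end:
            self._grow()
        self.buf[self.gap_start] = ch
        self.gap_start += 1

    def _grow(self):
        """Reopen the gap by (at least) doubling the array."""
        tail = self.buf[self.gap_end:]
        extra = max(16, len(self.buf))
        self.buf = self.buf[:self.gap_start] + [None] * extra + tail
        self.gap_end = self.gap_start + extra

    def text(self):
        return "".join(self.buf[:self.gap_start] + self.buf[self.gap_end:])
```

Real implementations typically store bytes in a flat array and move the gap with a single block copy rather than a Python loop, but the cost model is the same.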

moonchild 1581 days ago

I see no reason why a rope would be slower than a gap buffer for 'just typing'. And the gap buffer will choke when you fill it up and want to continue typing.

SamReidHughes 1581 days ago

A rope could frequently call malloc, or cause more time spent garbage collecting, and it takes up more memory, and the code is slower, especially in the 80's.

Before modern JavaScript VMs, I made an in-browser text editor, and I tried using a finger-tree data structure with string segments. It was extremely slow. I replaced it with two strings. Then it ran at human speeds. Memcpying the whole string upon a keypress was faster than some fancy data structure.
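
The comment does not say how the two strings were used. One plausible reading, sketched below in Python under that assumption, is to keep the text before the cursor in one string and the text after it in another. The `TwoStringBuffer` name and its methods are hypothetical.

```python
class TwoStringBuffer:
    """Hypothetical buffer: text before the cursor plus text after the cursor."""

    def __init__(self, text="", cursor=0):
        self.before = text[:cursor]
        self.after = text[cursor:]

    def type(self, s):
        # Strings are immutable, so this copies `before` on every keypress.
        self.before += s

    def backspace(self):
        self.before = self.before[:-1]

    def move_cursor(self, pos):
        # Rebuilds both halves; linear in the size of the document.
        whole = self.before + self.after
        pos = max(0, min(pos, len(whole)))
        self.before, self.after = whole[:pos], whole[pos:]

    def text(self):
        return self.before + self.after
```

Every operation copies a large part of the document, which is the "memcpy on every keypress" trade-off described above: wasteful on paper, but simple and fast enough at typical file sizes.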

vidarh 1581 days ago

As I noted in another comment, I think a lot of complexity in many editors is because they worry about handling absurdly large files without thinking about whether they need to.

The moment you're willing to set an arbitrary limit above which you're ok with dropping performance, you can go very simple the way you did.

And that limit where things start to slow down can be far over the size of files most people need most of the time.

For my own part I'm fine with falling back to e.g. emacs the one time every few months I need to do something with an unusually large file.

searealist 1581 days ago

I'm just saying that "gap buffer" is not a reason any editor is perceived to be slow unless it's slow when moving the cursor in a large buffer (or, as you point out, because of the amortized growth of a large buffer, which will happen max log n times in a buffer of size n).

moonchild 1581 days ago

> amortized growth of a large buffer, which will happen max log n times in a buffer of size n

You can do better than log if you care to; just grow the buffer more quickly. It's just that exponential growth tends to work pretty well (and, notably, amortizes the overhead of copying such that it is O(1)). Also: people tend to take breaks when typing, so you can asynchronously try to grow the buffer when it is close to full.
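
The amortization claim can be checked with a small counting experiment. The Python sketch below does not model a real editor; `copies_when_growing` is a hypothetical helper that appends `n` items one at a time and counts how often the buffer is reallocated and how many elements get copied, for a given growth policy.

```python
def copies_when_growing(n, grow):
    """Append n items one at a time to a buffer that starts with capacity 1.

    `grow(capacity)` returns the new capacity when the buffer is full.
    Returns (number of reallocations, total elements copied).
    """
    capacity, size = 1, 0
    reallocations, copied = 0, 0
    for _ in range(n):
        if size == capacity:
            capacity = grow(capacity)
            copied += size  # every existing element moves to the new buffer
            reallocations += 1
        size += 1
    return reallocations, copied


n = 1000
doubling = copies_when_growing(n, lambda c: c * 2)
fixed_step = copies_when_growing(n, lambda c: c + 16)
print("doubling:  ", doubling)
print("fixed step:", fixed_step)
```

With doubling, the number of reallocations grows logarithmically in `n` and the total number of copied elements stays below `2 * n`, so the copying cost per insertion is O(1) amortized. With a fixed step, the total copied grows roughly quadratically.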
