The data structures used for text editing are important but only a very small part of what makes ST fast. It's the native, gc-less code, the custom UI toolkit and constant attention to performance that pull that weight.
I see no reason why a rope would be slower than a gap buffer for 'just typing'. And the gap buffer will choke when you fill it up and want to continue typing.
A rope may call malloc frequently or cause more time to be spent garbage collecting. It also takes up more memory, and the code is slower, especially in the '80s.
Before modern JavaScript VMs, I made an in-browser text editor, and I tried using a finger-tree data structure with string segments. It was extremely slow. I replaced it with two strings. Then it ran at human speeds. Memcpying the whole string upon a keypress was faster than some fancy data structure.
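For anyone curious what "two strings" can look like, here is a minimal sketch. It is my own guess at the shape, not the original code: the class name `TwoStringBuffer` and its methods are made up for illustration, and it assumes one string holds the text before the cursor and the other holds the text after it.

```python
class TwoStringBuffer:
    """Editor buffer kept as two plain strings split at the cursor (illustrative)."""

    def __init__(self, text=""):
        self.before = text  # everything left of the cursor
        self.after = ""     # everything right of the cursor

    def insert(self, s):
        # Builds a new left-hand string on every keypress: a full copy,
        # but a contiguous one, which is cheap at typical file sizes.
        self.before += s

    def backspace(self):
        # Drop the character just left of the cursor, if there is one.
        self.before = self.before[:-1]

    def move_to(self, pos):
        # Re-split the whole text at the new cursor position.
        text = self.before + self.after
        pos = max(0, min(pos, len(text)))
        self.before, self.after = text[:pos], text[pos:]

    def text(self):
        return self.before + self.after


buf = TwoStringBuffer("hello world")
buf.move_to(5)
buf.insert(",")
print(buf.text())  # hello, world
```

Every operation is O(n) in the size of the text, which is exactly the trade being described: no tree, no per-node allocation, just copies of one or two flat strings.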
As I noted in another comment, I think a lot of complexity in many editors is because they worry about handling absurdly large files without thinking about whether they need to.
The moment you're willing to set an arbitrary limit above which you're ok with dropping performance, you can go very simple the way you did.
And that limit where things start to slow down can be far over the size of files most people need most of the time.
For my own part I'm fine with falling back to e.g. emacs the one time every few months I need to do something with an unusually large file.
Realistically, unless you're browsing log files or some globbed-together generated code, I would lay money that 95% of the usage of programmers' editors is on files under 2000 lines.
If you're optimizing for a 2 GB Apache log, you're probably focusing on the wrong thing.
Exactly. I'd rather keep my editor simple than worry about use cases like that. Of course I'm happy some editors do the work to handle large files too, in part because that makes it ok for me to ignore it since I have fallbacks.
You still can't avoid the performance penalty of a tree allocation for something like bracket pairing. Or AST analysis, or any one of a billion things people want from a code editor.
Depends on the analysis. For bracket pairing and quite a bit of analysis you can avoid it quite easily by storing the state of a parser at intervals. E.g. for syntax highlighting I use Rouge augmented with serialization of the lexer state, which also provides enough state for bracket matching. The internal state I need to store is typically one symbol every few lines.
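To make the "state at intervals" idea concrete, here is a rough sketch. It is not the Rouge-based setup described above: it is a simplified stand-in that tracks only the stack of open brackets, ignores strings and comments, and uses made-up names (`CHECKPOINT_EVERY`, `build_checkpoints`, `state_at`).

```python
CHECKPOINT_EVERY = 4  # hypothetical interval, in lines

OPEN = "([{"
CLOSE = {")": "(", "]": "[", "}": "{"}


def scan_line(line, stack):
    """Advance the open-bracket stack across one line; returns a new list."""
    stack = list(stack)
    for ch in line:
        if ch in OPEN:
            stack.append(ch)
        elif ch in CLOSE and stack and stack[-1] == CLOSE[ch]:
            stack.pop()
    return stack


def build_checkpoints(lines):
    """Record the open-bracket stack at the start of every Nth line."""
    checkpoints = {}
    stack = []
    for i, line in enumerate(lines):
        if i % CHECKPOINT_EVERY == 0:
            checkpoints[i] = tuple(stack)
        stack = scan_line(line, stack)
    return checkpoints


def state_at(lines, checkpoints, lineno):
    """Stack at the start of `lineno`, rescanning at most N-1 lines."""
    start = lineno - lineno % CHECKPOINT_EVERY
    stack = list(checkpoints[start])
    for line in lines[start:lineno]:
        stack = scan_line(line, stack)
    return stack


lines = ["f(", "  [1,", "   2],", "  {", "    'a': (", "      1)", "  }", ")"]
cps = build_checkpoints(lines)
print(state_at(lines, cps, 5))  # ['(', '{', '(']
```

No tree is built. The stored state is a flat tuple per checkpoint, and after an edit you would only need to rescan forward from the nearest checkpoint before the change.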
For complex analysis of the code-base, sure, you may want to build an AST. For my part I have no interest in having that functionality in-process in the editor - I'd rather have that provided by an external service.
I'm just saying that "gap buffer" is not a reason any editor is perceived to be slow, unless it's slow when moving the cursor in a large buffer (or, as you point out, during the amortized growth of a large buffer, which will happen max log n times in a buffer of size n).
> amortized growth of a large buffer, which will happen max log n times in a buffer of size n
You can do better than log if you care to; just grow the buffer more quickly. It's just that exponential growth tends to work pretty well (and, notably, amortizes the overhead of copying such that it is O(1) per insertion). Also: people tend to take breaks when typing, so you can asynchronously try to grow the buffer when it is close to full.
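A quick way to see the log n regrows and the O(1) amortized copying is to count them. The sketch below is a toy gap buffer, not any particular editor's implementation; the `regrows` and `copied` counters and the default capacity of 16 are assumptions added purely for the measurement.

```python
class GapBuffer:
    """Toy gap buffer with geometric growth and instrumentation counters."""

    def __init__(self, capacity=16, growth=2):
        self.buf = [""] * capacity
        self.gap_start = 0        # first free slot
        self.gap_end = capacity   # first used slot after the gap
        self.growth = growth
        self.regrows = 0          # how many times the buffer was reallocated
        self.copied = 0           # total characters copied by reallocations

    def __len__(self):
        return len(self.buf) - (self.gap_end - self.gap_start)

    def _grow(self):
        head = self.buf[:self.gap_start]
        tail = self.buf[self.gap_end:]
        new_cap = len(self.buf) * self.growth
        # Rebuild with a fresh gap between the two halves.
        self.buf = head + [""] * (new_cap - len(head) - len(tail)) + tail
        self.gap_end = new_cap - len(tail)
        self.regrows += 1
        self.copied += len(head) + len(tail)

    def insert(self, ch):
        if self.gap_start == self.gap_end:  # gap is full
            self._grow()
        self.buf[self.gap_start] = ch
        self.gap_start += 1

    def move_to(self, pos):
        # Slide the gap by moving characters across it one at a time.
        pos = max(0, min(pos, len(self)))
        while self.gap_start > pos:
            self.gap_start -= 1
            self.gap_end -= 1
            self.buf[self.gap_end] = self.buf[self.gap_start]
        while self.gap_start < pos:
            self.buf[self.gap_start] = self.buf[self.gap_end]
            self.gap_start += 1
            self.gap_end += 1

    def text(self):
        return "".join(self.buf[:self.gap_start] + self.buf[self.gap_end:])


g = GapBuffer()
n = 100_000
for _ in range(n):
    g.insert("x")
print(g.regrows, g.copied / n)  # 13 regrows, about 1.31 copies per character
```

With doubling, typing 100,000 characters triggers 13 reallocations and copies fewer than two characters per keystroke on average. Passing a larger `growth` cuts the regrow count further at the cost of more slack memory.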