by O__________O 1437 days ago
Here’s what appears to be an independent review of Spritz:

https://www.tsw.it/wp-content/uploads/Rapid-serial-visual-pr...

One of the issues with tests like these is that the companies sponsor the research, and that they are one-off rather than long-exposure studies; as with most high-throughput interfaces, it takes time for the mind to adjust.

My guess is that, given that human speech appears to have a universal transmission rate of around 39 bits per second, that is on average going to be the actual performance target:

https://www.science.org/doi/epdf/10.1126/sciadv.aaw2594


Blind people who use screen readers usually turn up the speed to something unintelligible to the rest of us.

Maybe reading a computer screen is a simpler task than talking person to person, but it's an interesting datapoint!

I am definitely not blind, but I listen to audiobooks and podcasts at high speed all day at work. I average around 2.5x to 2.8x (lower for certain narrators, and up to 3x or more if I am not doing something that requires much of the language-processing part of my brain or anything mentally taxing). I could drive or play a video game that isn't text- or plot-heavy, but not talk or problem-solve.

While I agree that both the blind and others, including myself, listen to screen readers and audio at a higher rate, that’s not actually research that’s reviewable and shareable. Speaking for myself, I'm 100% sure the noise-to-signal ratio increases when I do, but if needed, I just go back X seconds to relisten to prior audio. I certainly never test my comprehension systematically when doing so relative to when I am not. The speakers, vocabulary, topic familiarity, etc. also make a huge difference; in other words, prior familiarity with the input in general.

On the flip side, I provided research on transmission rates, which to me seems reasonable, but another user shared research on reception rates, which to me is unreasonable:

https://news.ycombinator.com/item?id=32160095

To me, I am interested in notable, reviewable progress in understanding the topic — not chatting about it around an internet campfire.

Not blind, but I often watch YouTube videos at 2x to 4x speed, only going below 2x to review complicated information (pausing a tutorial to read the text/code/image on the screen) or if the timing of information is part of the information (music, comedic timing, etc.).

> human speech appears to have a universal transmission rate of around 39-bits per second

This is unacceptable, we need a modern language that packs information tighter!

I think this is in jest, but to answer seriously: No, I think that's actually backwards, trying to make a language more "packed" will harm information flow.

Evidence suggests the real limit is how quickly human brains can take ideas/qualia, convert them into abstractions, and encode the abstractions into language. This is because (A) very different languages still exhibit similar limits and (B) those limits appear to be governed by the sending-side. People can comprehend spoken words at a higher rate than they can spontaneously emit them.

So trying to make the language more "compact" would likely just waste precious brain-cycles on the compression step, which isn't actually necessary when your mouth already supports talking faster.

Technology analogy: Two computers are collaboratively solving a problem with back-and-forth messages. The network connection is actually very good; however, the bottleneck is the CPU in each computer. Will the problem be solved faster if you change the transmission style from plaintext to gzipped?
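To put rough numbers on that analogy, here is a minimal sketch of a toy timing model. Everything in it is an assumption chosen for illustration: the 0.5 s of "thinking" per message, the 1 Gbit/s link, the per-byte codec cost, and the `message_time` helper itself. It uses zlib's DEFLATE (the same algorithm gzip uses) only to get a realistic compressed size.

```python
import zlib

THINK_S = 0.5            # assumed CPU time to produce one message (the bottleneck)
BANDWIDTH_BPS = 1e9      # assumed "very good" link: 1 Gbit/s
CODEC_S_PER_BYTE = 1e-8  # assumed cost to compress plus decompress one input byte

def message_time(payload: bytes, compress: bool) -> float:
    """Seconds to think up, optionally compress, and send one message."""
    wire = zlib.compress(payload) if compress else payload
    codec = CODEC_S_PER_BYTE * len(payload) if compress else 0.0
    transmit = len(wire) * 8 / BANDWIDTH_BPS
    return THINK_S + codec + transmit

payload = b"the quick brown fox jumps over the lazy dog. " * 40
plain = message_time(payload, compress=False)
gzipped = message_time(payload, compress=True)
saving = (plain - gzipped) / plain  # fraction of time saved by compressing
print(f"plain={plain:.6f}s gzipped={gzipped:.6f}s saving={saving:.4%}")
```

Under these assumptions the two totals differ by a few microseconds on a half-second message, so compression changes almost nothing either way; the "thinking" term dominates.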

The other problem with this idea is that natural languages contain quite a bit of redundancy, which is why speech and writing can be decoded even over very noisy channels. It’s hard to make language more compact without removing some redundancy.

That being said, this has been attempted before: see https://en.wikipedia.org/wiki/Speedtalk and https://www.zompist.com/kitlong.html#howmany, and especially https://web.archive.org/web/20000503004430/http://fatmac.ee.... for a detailed attempt.
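The redundancy point can be illustrated with about the simplest error-correcting scheme there is, a repetition code. This is a toy sketch and not a model of natural language: the 10% flip probability, the 3x repetition, and the function names are all assumptions made for the example.

```python
import random

def encode(bits, n=3):
    """Repeat every bit n times (the added redundancy)."""
    return [b for b in bits for _ in range(n)]

def add_noise(bits, flip_prob, rng):
    """Flip each bit independently with probability flip_prob."""
    return [b ^ int(rng.random() < flip_prob) for b in bits]

def decode(bits, n=3):
    """Majority vote over each group of n received bits."""
    return [int(sum(bits[i:i + n]) > n // 2) for i in range(0, len(bits), n)]

rng = random.Random(0)
message = [rng.randint(0, 1) for _ in range(10_000)]
FLIP_PROB = 0.1  # assumed channel noise, chosen only for illustration

# Send the message once with no redundancy, once with 3x repetition.
raw_errors = sum(a != b for a, b in zip(message, add_noise(message, FLIP_PROB, rng)))
received = decode(add_noise(encode(message), FLIP_PROB, rng))
coded_errors = sum(a != b for a, b in zip(message, received))
print(f"errors without redundancy: {raw_errors}, with 3x repetition: {coded_errors}")
```

The redundant version makes fewer errors at the same noise level, at the cost of sending three times as many bits. That is the tradeoff a more "compact" language would be giving up.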

A related rant I've had over the years: the redundancy present in human language is not a flaw to be optimized away. Evolution went there because it acts as a form of forward error correction.

Except when the complex houses married and single soldiers and their families. :P

Wow, I've never seen such a good one. I spent a solid minute frozen in this garden!

That's not redundancy, that's multiplexing!
> People can comprehend spoken words at a higher rate than they can spontaneously emit them.

Like so many things with people, it depends on what they practice. Most people practice listening far more than rapid speech. Some people can speak faster than most people can comprehend. Part of learning to do that may involve not listening (to themselves) in the same way.

> So trying to make the language more "compact" would likely just waste precious brain-cycles on the compression step

If you learned this new more compact language as your native language or to native fluency, I don't imagine you would need to go through a compression step, since all of your thoughts would already have happened in the compacted form.

That raises the question of why it seemingly hasn't happened already.

One option is that what you're describing just isn't possible, that humans are already butting up against some kind of limit which is not avoidable simply by being raised with a different language.

Another option is that one can be raised to think in "pre-compressed" structures, but nobody does because it's a bad tradeoff, dropping "general thinking" performance with a worse impact than any "faster speaking" benefits. (Such as being simply slower, or more error-prone, or more demanding on attentional resources, etc.)

>> “People can comprehend spoken words at a higher rate than they can spontaneously emit them.”

If you have any links to research supporting this, I would be interested.

A Large Inclusive Study of Human Listening Rates [1] focused on finding optimal generated-speech rates for screen-reader software:

> The mean Listening Rate was 56.8, which corresponds to 309 WPM. Given that people typically speak at a rate of 120-180 WPM, these results suggest that many people, if not most, can understand speech signifcantly [sic] faster than today’s conversational agents with typical human speaking rates.

While there was some difference between the visually-impaired and normal-vision respondents, I don't think it's enough to matter to the thesis of my post:

> [...] The mean Listening Rate for visually impaired participants was 60.6 (334 WPM) while for sighted participants it was 55.1 (297 WPM).

1: https://dl.acm.org/doi/abs/10.1145/3173574.3174018
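As a quick worked check of what those quoted figures imply, the ratio of the mean listening rate to the typical speaking range can be computed directly. This is only arithmetic on the numbers quoted above; the variable names are my own.

```python
LISTENING_WPM = 309        # mean listening rate quoted from the study
SPEAKING_WPM = (120, 180)  # typical speaking range quoted from the study

# Ratio of listening rate to each end of the speaking range.
speedups = [LISTENING_WPM / s for s in SPEAKING_WPM]
low, high = min(speedups), max(speedups)
print(f"listening is roughly {low:.1f}x to {high:.1f}x typical speaking rate")
```

That works out to roughly 1.7x to 2.6x, depending on which end of the speaking range you compare against.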

Thanks, I reviewed the research. It is unclear, though, how a synthetic voice reading off a single word per unit of measure is an accurate gauge of average listening rates.

From page 4 of 12 of the PDF you linked to, “Rhyme test: measures word recognition by playing a single recorded word, and asking the participant to identify it from a list of six rhyming options (e.g., went, sent, bent, dent, tent, rent). We used 50 sets of rhyming words (300 words total), taken from the Modified Rhyme Test [27], a standard test used to evaluate auditory comprehension.” The word list the research used is on pages 30-31 of 55 of this PDF:

https://www.researchgate.net/profile/Michael-Hecker-3/public...

If true, the test appears not even to be measuring word recognition; more accurately, it is measuring recognition of a single phoneme. If the listener correctly picks the phoneme from a multiple-choice list, the researchers assume the person would hear and understand 100% of any expressions received at that rate of speed, which in my opinion is clearly flawed.

I would be the first to agree that testing listening comprehension rates is hard to do, which is why I asked to review research, but unless I am misunderstanding something, it is unclear to me how this research actually provides any meaningful observations.

I know you're probably joking, but for those out of the loop, the result your parent was referring to is that more verbose languages have faster speakers and the two effects compensate one another pretty well.

https://www.science.org/doi/10.1126/sciadv.aaw2594
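The compensation effect is easy to see as arithmetic: information rate is speech rate times information per syllable. The sketch below uses round, hypothetical numbers chosen so the products match; they are not figures taken from the paper, and the language labels are made up.

```python
# Illustrative round numbers only (hypothetical, not figures from the paper).
languages = {
    "dense_slow": {"syllables_per_s": 5.0, "bits_per_syllable": 8.0},
    "sparse_fast": {"syllables_per_s": 8.0, "bits_per_syllable": 5.0},
}

def info_rate(lang: dict) -> float:
    """Information rate in bits per second = speech rate x information density."""
    return lang["syllables_per_s"] * lang["bits_per_syllable"]

rates = {name: info_rate(lang) for name, lang in languages.items()}
for name, r in rates.items():
    print(f"{name}: {r:.0f} bits/s")
```

Both hypothetical languages come out at 40 bits/s: the one that packs less into each syllable makes up for it by producing syllables faster.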

Not a language, but a font: https://dotsies.org/

Discussed on HN multiple times: https://news.ycombinator.com/item?id=18703805

No, Latin is the only language we'll ever need! It is much closer to the fundamental truth, unlike these modern languages that make unnecessary abstractions of what is really going on at the low level.

We need to start in schools, teaching our children to speak in zstd. It could more than triple their throughput!

I wonder if the bottleneck is in speaking or listening.