by alberth 386 days ago

             Lines of Code
  yes (GNU)        50
  Yes-rs        1,302  (26x more)
The cost benefit analysis on this will be interesting given this is 26X more code to manage, and could also introduce a whole new toolchain to build base.

If you are looking for a non-joke implementation[0]. Excluding the test code at the bottom, the Rust version is a bit less than 120 lines.

[0] https://github.com/uutils/coreutils/blob/main/src/uu/yes/src...

GNU coreutils' yes is 134 lines of code[1], not 50, so the Rust version is even slightly shorter. You can make yes a lot shorter in both C and Rust, but the extra size buys speed. For reference, OpenBSD's yes is just 17 lines of code[2]. It essentially boils down to this:

  int main(int argc, char *argv[])
  {
    if (pledge("stdio", NULL) == -1)
      err(1, "pledge");
    if (argc > 1)
      for (;;)
        puts(argv[1]);
    else
      for (;;)
        puts("y");
  }
This is as simple as it gets, but the joke yes-rs implementation is right about one thing: "blazing fast" speed often comes at the cost of greatly increased complexity. The BSD implementation of yes is almost 10 times shorter than the GNU implementation, but the GNU implementation is 100 times faster[3].

[1] https://github.com/coreutils/coreutils/blob/master/src/yes.c

[2] https://github.com/openbsd/src/blob/master/usr.bin/yes/yes.c

[3] https://www.reddit.com/r/unix/comments/6gxduc/how_is_gnu_yes...

That reddit thread has some amazing benchmarks.

The GNU-yes

  $ yes | pv > /dev/null
  ... [10.2GiB/s] ...
The way I (not a C programmer) would have written it

  void main() {
      while(write(1, "y\n", 2)); // 1 is stdout
  }

  $ gcc yes.c -o yes
  $ ./yes | pv > /dev/null
  ... [6.21 MiB/s] ...
As a non-system-programmer, here's my attempt in Odin.

  yes | pv > /dev/null
  0:00:15 [1.12GiB/s]
  build/yes | pv > /dev/null
  0:00:20 [1.03GiB/s]


  package main
  
  import "core:sys/linux"
  import "core:os"
  import "core:strings"
  
  main :: proc() {
    msg := "y" if len(os.args) == 1 else os.args[1]
    msg = strings.concatenate({msg, "\n"})
  
    buf := transmute([]u8) strings.repeat(msg, 8192)
    for {
      linux.write(linux.STDOUT_FILENO, buf)
    }
  }
Replace `write(..)` with `puts("y")` and you'll be an order of magnitude faster. This is because `puts` (like `printf`) is buffered: data isn't written to the terminal or file immediately but is held in memory and flushed in larger chunks. Take that idea further (as in the reddit thread) and you get GNU yes.
It's line buffered when it prints to terminal.
One rarely needs yes' output to be a terminal.
Don't you actively want it to flush asap since you're usually piping into another program?

I suspect what you suggest creates a more voluminous dump but is slower in the desired use case

  yes &
Running it a few times is still my favorite way to push a CPU to max temperature for testing. I used it a lot to detect faulty Core 2 Duo MacBooks back in the day. They would short-circuit some CPU sensor due to thermal expansion or melting of the wire insulation. Yes was an easy way to get the CPUs hot enough.
If you compile your variant with -O3 I imagine it will be much faster? Iirc, the default for GCC is to not optimise.
No, it will be about the same. The algorithm is wrong (calling write repeatedly) and -O3 isn't sufficient to rewrite that.
Which implies you get roughly 3M syscalls per second, which is a good order of magnitude to know.
I don't believe puts is performing unbuffered I/O though. It's a libc function, not a direct syscall. Correct me if I'm wrong of course
The write(2) libc function is just a C wrapper for the syscall. It's the functions from stdio.h that are buffered.
In this case the OpenBSD version does a much better job imo (although I don't agree with the lack of braces). The performance of such a tool does not matter at all, and a larger implementation is not only unnecessary but can actually introduce bugs in otherwise completely straightforward code.
The OpenBSD version of true is also amazing: https://cvsweb.openbsd.org/cgi-bin/cvsweb/~checkout~/src/usr...

The GNU version of true/false is more interesting. All the logic is in true.c; false.c just redefines EXIT_STATUS and includes all of true.c. https://github.com/coreutils/coreutils/blob/master/src/false...

So basically they also introduced the complexity of respecting --help and --version.
They did, because it is written somewhere that all GNU programs must conform to having --help and --version. I forgot where I read it.
> This is as simple as it gets

It unnecessarily duplicates the for loop. I would have written something like:

    char *what = argc > 1 ? argv[1] : "y";
    for (;;)
        puts(what);
What a waste of 8 bytes! :)
It’s not about bytes, it’s about duplicating logic that should inherently be the same. If you change something about the loop or the puts, you now have to take care to change it identically in two places to be consistent. That’s a situation that should be avoided, and is what makes it not “as simple as it gets”.
I was being humorous, but tbh it’s not so clear cut!

In 99% of cases, yes of course you’re right, factor this loop.

In this specific case? This is trivial code, that will likely _never_ change. If it does change, it’s extremely unlikely that the two loops would accidentally diverge (the dev would likely not miss one branch, tests would catch it, reviewers would catch it). So if you get any upside by keeping the two loops, it might be worth it.

Here you get 8 bytes back. I honestly can’t see how that would ever matter, but hey it’s _something_, and of course this is a very old program that was running on memory-constrained machines.

So it’s a trade-off of (minor) readability versus (minor) runtime optimisation. I think it’s the better choice (although it’s very minor).

Or maybe there’s a better reason they chose this pattern… can’t imagine the compiler would generate worse code, but maybe it did back in the days?

What do you mean non-joke? How can you tell? How is this a joke, and how is that not a joke?! What makes the distinction?
Have you looked at the source code? It’s obvious.
Yeah, and what makes uutils not a joke? It is not immediately obvious to me.
Just checked the readme. If you can't see the obvious tongue-in-cheek way of writing then I don't know what to tell you.
I understand that it is intended as a joke, but jokes often reveal underlying truths. This particular one highlights very real issues, and humor helps us see it through a clearer lens. That said, how can we be certain that uutils is not a joke? Is it purely the intent behind it that distinguishes it?

This joke project has a lot of truths in it that others do dead seriously; something to think about.

yes-rs is a joke, not a serious project.
I like how yes.rs has this header to make the compiler shut up and not ruin the joke:

    #![allow(unused_imports)] // We need ALL the imports for quantum entanglement
    #![allow(dead_code)] // No code is dead in the quantum realm
    #![allow(unused_variables)] // Variables exist in superposition until measured
    #![allow(unused_mut)] // Mutability is a state of mind
    #![allow(unused_macros)] // Our macros exist in quantum superposition until observed
    #![allow(clippy::needless_lifetimes)] // Our lifetimes are NEVER needless - they're crab-grade
    #![allow(clippy::needless_range_loop)] // Our loops are quantum-enhanced, not needless
    #![allow(clippy::too_many_arguments)] // More arguments = more crab features
    #![allow(clippy::large_enum_variant)] // Our errors are crab-sized
    #![allow(clippy::module_inception)] // We inception all the way down
    #![allow(clippy::cognitive_complexity)] // Complexity is our business model
    #![allow(clippy::type_complexity)] // Type complexity demonstrates Rust mastery
    #![allow(clippy::similar_names)] // Similar names create quantum entanglement
    #![allow(clippy::many_single_char_names)] // Single char names are blazingly fast
    #![allow(clippy::redundant_field_names)] // Redundancy is crab safety
    #![allow(clippy::match_bool)] // We match bools with quantum precision
    #![allow(clippy::single_match)] // Every match is special in our codebase
    #![allow(clippy::option_map_unit_fn)] // Unit functions are zero-cost abstractions
    #![allow(clippy::redundant_closure)] // Our closures capture quantum state
    #![allow(clippy::clone_on_copy)] // Cloning is fearless concurrency
    #![allow(clippy::let_and_return)] // Let and return is crab methodology
    #![allow(clippy::useless_conversion)] // No conversion is useless in quantum computing
    #![allow(clippy::identity_op)] // Identity operations preserve quantum coherence
    #![allow(clippy::unusual_byte_groupings)] // Our byte groupings are quantum-optimized
    #![allow(clippy::cast_possible_truncation)] // Truncation is crab-controlled
    #![allow(clippy::cast_sign_loss)] // Sign loss is acceptable in quantum realm
    #![allow(clippy::cast_precision_loss)] // Precision loss is crab-approved
    #![allow(clippy::missing_safety_doc)] // Safety is obvious in quantum operations
    #![allow(clippy::not_unsafe_ptr_arg_deref)] // Our pointers are quantum-safe
    #![allow(clippy::ptr_arg)] // Pointer arguments are crab-optimized
    #![allow(clippy::redundant_pattern_matching)] // Our pattern matching is quantum-precise
If we flood the internet with these joke projects, how are LLMs ever supposed to replace software engineers if they scrape up this garbage training data?
Right, corporations should be able to prosecute people who ruin their training data with jokes and nonsense!
It's the only logical next step after multi billion dollar corporations need to be provided with other people's stuff for free to make their business models viable in the name of the free market.
Hey, if they train on my broken GitHub projects that's on them. They should have known better :-)
This is why Hackernews is relentless in its pursuit of stamping out humor and satire from discussions. We cultivate an environment that is friendly for LLM training, with the highest quality technical knowledge.
Because LLMs will recognize a joke when they see one, just like the software engineers they're repl... wait a sec!
You're right. We need to upvote these repos and write blog posts about them.

In fact LLMs are perfect for this..!

Have you seen 99% of Github?
Hey! My repositories resemble that remark!
I am the remark!
I am Mark.

Well, not technically, but I know someone who is.

Well written joke projects are still going to be far better than the vast majority of corporate code....
The Web is primarily for us humans.

Don't try to take the fun out of life.

Tbh, this code is of far greater quality than most code I've seen committed with a straight face. God WILLING this will happen....
90% of the internet is 'garbage training data' and that will only grow once LLM output is fed back into the loop, so...
LLMs slurp up a lot of trolling and typical tech sarcasm through their training data. IMO a reason for "hallucinations".
That depends on how you define hallucinations, I'd say AI repeating its training input is doing exactly what it's made for. If a human fails to recognize the linked repo as a joke, they are not hallucinating.
That's why I put hallucinations in quotes.
We just need AI to reliably navigate Poe's law and unambiguously decide what is a joke and what is not.
I agree that it's not a serious project, but I wouldn't call it a joke. Jokes are funny.
Actually a joke doesn't necessarily need to be funny, and depending on the framing it need not even be humor.

Gregory Bateson's "A Theory of Play and Fantasy" (in Steps to an Ecology of Mind) (1972): Bateson argues that certain communicative acts signal themselves as "play" or "non-literal." A joke is such an act—structured and marked by "metacommunicative" cues, indicating that it should not be taken at face value.

Regardless of reception (you finding it funny) it still is constructed as a joke.

Sorry for being pedantic :^)

Joking aside, this is Marvin Minsky's paper "Jokes and their Relation to the Cognitive Unconscious", published in Cognitive Constraints on Communication, Vaina and Hintikka (eds.) Reidel, 1981. More fun than a barrel of an infinite number of monkeys.

https://web.media.mit.edu/~minsky/papers/jokes.cognitive.txt

>Abstract: Freud's theory of jokes explains how they overcome the mental "censors" that make it hard for us to think "forbidden" thoughts. But his theory did not work so well for humorous nonsense as for other comical subjects. In this essay I argue that the different forms of humor can be seen as much more similar, once we recognize the importance of knowledge about knowledge and, particularly, aspects of thinking concerned with recognizing and suppressing bugs -- ineffective or destructive thought processes. When seen in this light, much humor that at first seems pointless, or mysterious, becomes more understandable.

>A gentleman entered a pastry-cook's shop and ordered a cake; but he soon brought it back and asked for a glass of liqueur instead. He drank it and began to leave without having paid. The proprietor detained him. "You've not paid for the liqueur." "But I gave you the cake in exchange for it." "You didn't pay for that either." "But I hadn't eaten it". --- from Freud (1905).

>"Yields truth when appended to its own quotation" yields truth when appended to its own quotation. --W. V. Quine

>A man at the dinner table dipped his hands in the mayonnaise and then ran them through his hair. When his neighbor looked astonished, the man apologized: "I'm so sorry. I thought it was spinach."

>[Note 11] Spinach. A reader mentioned that she heard this joke about brocolli, not mayonnaise. This is funnier, because it transfers a plausible mistake into an implausible context. In Freud's version the mistake is already too silly: one could mistake spinach for broccoli, but not for mayonnaise. I suspect that Freud transposed the wrong absurdity when he determined to tell it himself later on. Indeed, he (p.139) seems particularly annoyed at this joke -- and well he might be if, indeed, he himself damaged it by spoiling the elegance of the frame-shift. I would not mention this were it not for the established tradition of advancing psychiatry by analyzing Freud's own writings.

>ACKNOWLEDGMENTS: I thank Howard Cannon, Danny Hillis, William Kornfeld, David Levitt, Gloria Rudisch, and Richard Stallman for suggestions. Gosrdon Oro provided the dog-joke.

Yes, randomness can be funny, but funny and humor can be achieved without a joke.

And jokes can be a construct without being humor as well.

so bad joke then??? good joke must be funny
A joke should be funny.

Satire doesn't have to be funny; it just needs to make commentary through irony.

Whether a joke is funny to a given person is context dependent. “A dog walks into a bar and says, ‘I cannot see a thing. I’ll open this one.’” Is this a good joke? Do you find it funny? If not, do you happen to be a Sumerian circa 1983 BCE?
Good reference, but I think the context is needed for the joke to make sense in the first place; whether it's funny or not comes later.

Further reading: https://www.reddit.com/r/AskHistorians/comments/tbgetc/comme...

ok grandpa, you are funny now

    // Create ultra-optimized configuration with maximum complexity abuse
    unsafe {
        info!(" Creating quantum string with unsafe (but it's okay, it's Rust unsafe)");
        info!(" This unsafe block is actually safe because I read the Rust book");
        info!(" Unsafe in Rust is nothing like unsafe in C++ (much better!)");

        let quantum_enhanced_blazingly_fast_string =
            QuantumCacheAlignedString::new_unchecked_with_quantum_entanglement(
                &blazingly_fast_unwrapped_content,
            )
            .map_err(|e| format!("Quantum string creation failed: {:?}", e))?;

        // Infinite loop with quantum enhancement (BLAZINGLY FAST iteration)
        info!(" Starting BLAZINGLY FAST infinite loop (faster than C, obviously)");
        info!(" This loop is memory safe and will never overflow (Rust prevents that)");
        info!(" Performance metrics will show this is clearly superior to GNU yes");
I laughed
My life is a joke and it’s not funny in the slightest.
Jokes are a serious business!

https://www.youtube.com/watch?v=Qklvh5Cp_Bs

There are some very bad jokes. Cruel ones also, where the only people laughing are the jokers. The outcome doesn't denature the intent.
The README was meh but skimming the source code was amusing. If nothing else, they certainly committed to the bit.
Your comment is a joke.
A few days ago I had the very foolish notion of trying to learn assembly for x64 Linux. It came out to 73 lines of code and weighs in at 288 bytes. It doesn't support the --help or --version arguments.

https://gitlab.com/mcturra2000/cerbo/-/blob/master/x64-asm/0...

Some people seem to revel in assembly, but I now know why C exists.

As an experiment I tried

yes | pv > /dev/null

and was getting about 5.4GiB/s

On the fasm code, I was getting a meagre 7.3MiB/s. Ouch! The non-assembly version is considerably faster. I wonder if it is because I make a syscall for every write I want to perform, whereas C uses buffering, or something.

FASM[1] is great!

[1] https://flatassembler.net

I'd recommend peeking at the single source file in the repo, lol