by coder543 2590 days ago
Segfaults are the best-case outcome of a memory error: they are the most easily caught and fixed, so they're the ones you're least likely to see in released software. More often, a memory error results in silent corruption that can be exploited, and that's much harder to detect, especially if it relies on very specific corner cases. `curl` has had a number of these vulnerabilities over the last several years, for example.

Something as simple as `ls` is probably so small and battle tested that it's not an issue, but if you're writing an all-new, not-battle-tested tool, why wouldn't you want stronger guarantees? The new tool is being written for the features, but it's not fun to write vulnerabilities into something that should be simple and "just work."

Languages like Go and Ruby are also mostly memory safe, so those are generally fine picks here too, but every language has trade-offs. In this case, the author clearly cares about performance, which is not a priority for Ruby.

Rust also has a built-in testing harness, which is a lot more convenient IMHO than using whatever C testing framework you might have a predisposition towards.


Regarding segfaults, when I was interviewing devs for C++ roles, I'd ask questions about a simple function like this:

  std::string foo(bool flag)
  {
    if (flag)
      return "true";
  }

Questions I'd ask:

* Is the function well formed? (Yes: functions need not return a value on all paths due to C ancestry, even if the return type has a non-trivial ctor. Not sure if this is still considered well formed, but I think it was at least up to C++11.)
* What happens if 'foo' is called with true? (Returns "true", as one would expect.)
* What happens if 'foo' is called with false? (Undefined behavior, and generally nothing nice: segfault, access violation, etc.)
* If it crashes, where, when and why does it crash? (Technically, since it's undefined, nearly any answer here suffices if it can be backed up. In practice, most optimizing compilers assume UB can never happen, so when you return nothing from a non-void function, the compiled code will attempt to invoke the destructor of a nonexistent object instance (assuming non-POD), and boom.)

I asked this because it was a distilled example of a rare real-world crash that was an extremely difficult bug to track down, because the crash location is often nowhere near the offending function.

I remember getting into a heated argument with a coworker when I claimed it should have been a compilation error. IIRC, he claimed it to be a halting problem and that the compiler couldn't determine that all paths didn't return a value. I called BS, citing that the new compiler on the block at the time (circa 2004), the one for C#, could reliably emit errors when not all return paths returned a value.

I also like that, in C++, this is a rare example of a very terse snippet that touches on a number of topics, such as undefined/implementation-defined behavior, debugging, and compilation settings (warning levels, etc.), all in a mere 4 lines of code. With 4 LOC, which is straightforward and simple for the candidate to mentally parse, I can glean a lot about their understanding of the language (and its potential pitfalls).
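For anyone who wants to try this at home, here is a minimal sketch of a repaired version. The name `foo_fixed` and the "false" result are assumptions for illustration; the original snippet never says what `false` should produce. As far as I know, GCC and Clang flag the original through `-Wreturn-type`, `-Werror=return-type` turns that into a hard error, and MSVC reports it as warning C4715.

```cpp
#include <string>

// Repaired variant of the interview snippet. The "false" result is an
// assumption for illustration; the original never specifies it.
std::string foo_fixed(bool flag)
{
  if (flag)
    return "true";
  return "false";  // every path now ends in a return statement
}
```

With this version, `foo_fixed(false)` yields a properly constructed `std::string`, so the caller has a real object to destroy.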

Sorry if this got a little long-winded and ranty.

> he claimed it to be a halting problem and that the compiler couldn't determine that all paths didn't return a value.

Theoretically we can't determine whether a function will return a value or not. In practice, heuristics get 99% of the way and the last 1% you can make the programmer put in a possibly redundant return statement.

There are two different questions: "do all paths lead to returning a value?" and "will the function return a value?"

Answering the second in the general case is equivalent to solving the halting problem. But answering the first question is much simpler. Static analyzers aren't using a heuristic approach to the second question, they're solving a completely different problem.
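To make the first question concrete, here is a rough sketch of a purely structural check over a toy statement tree. `Stmt`, `make`, and `all_paths_return` are hypothetical names invented for this example, and real compilers work on far richer representations (loops, `throw`, `switch`, and so on are ignored here). The point is only that the check never evaluates a condition, so it always terminates.

```cpp
#include <utility>
#include <vector>

// Hypothetical miniature statement tree; not how any real compiler models code.
struct Stmt {
  enum class Kind { Return, Other, If, Block };
  Kind kind = Kind::Other;
  std::vector<Stmt> children;  // If: {then} or {then, else}; Block: statements in order
};

// Convenience constructor for the sketch.
Stmt make(Stmt::Kind kind, std::vector<Stmt> children = {}) {
  Stmt s;
  s.kind = kind;
  s.children = std::move(children);
  return s;
}

// Conservative answer to "does every path end in a return statement?".
// Conditions are never evaluated, so the check always terminates.
bool all_paths_return(const Stmt& s) {
  switch (s.kind) {
    case Stmt::Kind::Return:
      return true;
    case Stmt::Kind::Other:
      return false;
    case Stmt::Kind::If:
      // An if without an else can fall through, so both branches must exist and return.
      return s.children.size() == 2 && all_paths_return(s.children[0]) &&
             all_paths_return(s.children[1]);
    case Stmt::Kind::Block:
      // A block returns on all paths if any statement in it does;
      // anything after that statement is unreachable.
      for (const Stmt& child : s.children) {
        if (all_paths_return(child)) return true;
      }
      return false;
  }
  return false;  // not reached for valid Kind values; keeps every path returning
}
```

Modeling the interview function as a block containing a lone `if` with a `return` inside makes `all_paths_return` answer false; appending a trailing `return` to the block makes it answer true. The price of being conservative is the occasional redundant `return` the programmer has to add.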

Don't get me wrong, but if whole classes of errors can be avoided by not choosing C++, why should I choose to use it?

Because you already know C++ and don't want to learn, or can't justify learning, another language. That's a very valid reason.

Of course you can decide the type a function returns. To overkill this problem, just run any type inference algorithm powered by unification (e.g. Hindley-Milner). Give undeclared variables the type NULL and you're all set, since you can then decide whether the function returns type NULL.

Your friend was confused because you cannot decide whether a certain line in your program will be executed, since this is equivalent to the halting problem. Let's prove this! Assume we have a black box B(P,x,N) that decides whether line N of program P will be executed given input x. Then we can solve the halting problem:

```pseudocode
def Helper(sourcecode P, input x):
    [0] P(x)      # never gets past this line if P does not halt on x
    [1] print "What Do We Say to the God of Death?"

def Halts(sourcecode P, input x):
    # line [1] of Helper is executed exactly when P halts on x
    return B(Helper, (P, x), 1)
```

This means that, given programs like your interview question, a compiler cannot decide, in general, whether the program will crash or work. Of course, it's not too hard to find "good enough" heuristics that catch some cases, and/or to restrict your language so that such programs are harder to encode.
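For a concrete flavor of why reachability is the hard part, consider the sketch below (the name `collatz_steps` is made up for this example). Structurally, every path ends in a `return`, so a path-based check is happy. Whether the final `return` is actually reached for every n >= 1 is exactly the Collatz conjecture, if we set aside fixed-width integer overflow, and nobody expects a compiler to settle that.

```cpp
// Counts Collatz steps until n reaches 1. Every path ends in a return
// statement, but whether the last one is reached for all n >= 1 is an open
// problem (ignoring overflow of the fixed-width integer).
unsigned collatz_steps(unsigned long long n) {
  if (n == 0) return 0;  // guard: 0 would otherwise loop forever
  unsigned steps = 0;
  while (n != 1) {
    n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
    ++steps;
  }
  return steps;
}
```

For small inputs it does terminate: `collatz_steps(6)` walks 6, 3, 10, 5, 16, 8, 4, 2, 1 and returns 8.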

> Is the function well formed (Yes - functions need not return a value on all paths due to C ancestry, even if the return type has a non-trivial ctor - not sure if this is still considered well formed, but I think it was at least up to C++11)

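In C, falling off the end of a non-void function without returning a value is legal, as long as the caller doesn't use the return value. In C++ it is undefined behavior either way (`main` excepted), even though the compiler is not required to reject the code.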