I was referring to IO (), which is a type that can have only one value, () — hence my thinking that a compiler should simply optimize the call to (), since there is no other logical value that it can return.
I'm not sure I find that explanation satisfactory. This only implies that there is information not used by the compiler: The only conceivable value that the type IO () can have is still the value IO (). If the compiler knew this, then it could, and would, optimize it away. Is there a formal aspect of parameterized types that prevents this from — formally — happening?
There's no such thing as "the value IO ()". The type IO () has tons of possible values, which all have different internal representations:
1) The IO action that does nothing.
2) The IO action that prints the string "Hello World" and then returns nothing.
3) The IO action that reads a string from the console, reverses it, prints it back, and then returns nothing.
4) ...
It's a common misconception to think that IO X is a wrapper around X with some "type system magic" to prevent it from being used outside IO. It's nothing like that. For example, IO String is not a wrapper around a String, doesn't contain any String inside, and cannot be converted to String. Meditate on the fact that System.IO.getLine is not a function, but a value of type IO String. Then you will understand why IO () has tons of possible values.
Thanks. I get that part. After digging a bit, it seems the implementation of IO is in fact a bit magical [1] (and not possible to implement in pure Haskell). GHC's IO is implemented using a GHC-internal pseudo-state monad that uses a magical "world" value to enforce linear ordering — all of which, I'm guessing, would prevent the compiler from accidentally optimizing away or memoizing side-effectful functions?
Edit: I find it interesting that Haskell tutorials and books generally don't try to peel away some of Haskell's abstractions. Many texts will tell you that "IO a" represents an "action" and expect you to take this at face value, intead of explaining how this is distinguishable from a mere function, and how it's implemented internally. For example, I would say one of the big eureka moments for any Haskell learner is to realize how lazy-evaluated graphs of functions can turn into linearly-ordered programs thanks to the transparent "baton passing" of monad chaining (IO in particular). But I've yet to find a text which articulates this idea very well.
GHC's implementation of IO isn't the only one possible. For example, if Haskell had only two available IO operations (read a character or print a character), the following would be a workable definition of the IO type:
data IO a = PutChar (Char, IO a)
| GetChar (Char -> IO a)
| Return a
Programs can work by constructing a value of type IO () and defining it as main, just like in regular Haskell. Note that IO is no longer a wrapped impure function, and there are no magical tokens in sight. In fact, the above implementation doesn't even have to be opaque, it can be completely exposed to programmers. Nevertheless, it's just as easy for the runtime to interpret in a sequential way, and there won't be any problems with accidentally optimizing away or memoizing anything.
You are asking the right questions, it just means you're ready to go one step down the rabbit hole. You need to carefully, and slowly, read SPJ's Tackling the Awkward Squad. You need to allow yourself to skim over the parts that are perplexing on first-reading and take them on faith. With more understanding you'll find even the answer given there not particularly satisfactory (in fact the paper points it out explicitly, trust me you'll know when you're in the precise section).
At this point you'll either accept the state of affairs, or want to go deeper down the rabbit hole, at which point you become more an academic than a programmer per se.
In a very precise sense what groovy2shoes is saying is exactly right, and you need to understand that truly strange weirdos inhabit IO () whereas, as you point out, only one thing, modulo non-termination, inhabits ().
"The only conceivable value that the type IO () can have is still the value IO ()."
There is some magic around the IO type, but it's not needed to answer your question.
If you look in GHC.Types in the package ghc-prim, you can find the definition of IO:
newtype IO a = IO (State# RealWorld -> (# State# RealWorld, a #))
An IO action is a function from the state of the world, to the state of the world plus a value. So the compiler may know it's giving you (), but it doesn't know what the new state-of-the-world will be.