Hacker News
by Zancarius 2038 days ago
> Terrible, terrible, behaviour but there are lots of similar examples in PHP

So true.

One I ran into recently was comparing hashes that start with '0e' and contain only digits after that prefix. Failure to use the identity operator means that you can have two different hashes comparing as... equal. PHP apparently thought it a good idea to read such a string as a float in scientific notation, i.e. 0 times 10 raised to an exponent, which of course always yields zero. e.g.:

    php > var_dump('0e41235843934' == '0e61193475532');
    bool(true)

Yeah, I know: The correct (and only) way to write this is to use the identity operator (===), but it's not at all intuitive. For those of us who are used to this it's fine since it's force of habit, but for those who aren't it'll lead to unexpected behavior. The only saving grace is that the quality of PHP code out there has been slowly (slowly!) improving over time, thanks in part to Composer and ecosystem/cultural changes.
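
To make the '0e' case concrete, here is a minimal sketch that compares the same two strings three ways. The variable names are only for illustration. hash_equals() is a built-in (PHP 5.6+) meant for comparing hashes; it is shown as one further option, not as what the comment above used.

```php
<?php
// Two different strings that both read as "zero in scientific notation".
$a = '0e41235843934';
$b = '0e61193475532';

// Loose comparison: both are numeric strings, so they are compared as numbers (0.0 == 0.0).
$loose = ($a == $b);

// Identity comparison: same type and same bytes required.
$strict = ($a === $b);

// hash_equals() compares the two strings byte for byte, in constant time.
$safe = hash_equals($a, $b);

var_dump($loose);  // bool(true)
var_dump($strict); // bool(false)
var_dump($safe);   // bool(false)
```

Only the loose comparison treats the two values as equal, which is exactly the trap when the strings are hash digests.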

But it's also not outside the realm of possibility that someone will eventually forget to add an extra '=' and introduce a bug that's very difficult to isolate...

Is this fixed in PHP 8? I did not think this was possible, as there should be no type coercion between two strings, and it is not well documented. [1][2]

I noticed "0.2" == "0.20" as another example.

[1] https://www.php.net/manual/en/language.operators.comparison....

[2] https://www.php.net/manual/en/language.types.type-juggling.p...

> Is this fixed in PHP 8?

I don't know. It's one of those cases where fixing it to do the "right" thing would potentially break a lot of software depending on this sort of erroneous behavior. Part of me wants to say "I hope so" and part of me is kinda terrified at what might happen if they did. Shouldn't be difficult to fix, but it would definitely take some time.

Someone should know if there's a decent linter that would pick this up to make it easier to fix, I would imagine!

> I did not think this was possible, as there should be no type coercion between two strings, and it is not well documented.

I think when applying the equality operator (==) in lieu of the identity operator (===) between strings, PHP uses heuristics to decide what to do. If a string looks like a number, PHP coerces it into an int or a float. As an example:

    php > var_dump('1e12' == 1e12);
    bool(true)
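
A rough way to see which strings "look like a number" is is_numeric(). Treating it as a stand-in for the heuristic that == applies is an assumption of this sketch, and the sample strings are arbitrary.

```php
<?php
// Check which sample strings PHP considers numeric.
$samples = ['1e12', '0.20', '0e41235843934', 'abc', '1eabc'];

$numeric = [];
foreach ($samples as $s) {
    $numeric[$s] = is_numeric($s);
}
var_dump($numeric); // true for the first three, false for 'abc' and '1eabc'

// Both sides are numeric strings, so == compares them as numbers.
$bothNumeric = ('1e12' == '1000000000000');
var_dump($bothNumeric); // bool(true)

// Neither side is numeric, so == falls back to a plain string comparison.
$neitherNumeric = ('abc' == 'ABC');
var_dump($neitherNumeric); // bool(false)
```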

As ridiculous as I think it is, I also take the approach that it's just what PHP does. Weird, maybe a bit eccentric, probably a contributor to difficult-to-find bugs, but it's just something we have to keep in mind. Perhaps I'm feeling charitable because it's Thanksgiving!

It is well documented that strings are converted to integers when comparing with integers.

It is well documented that when converted to integers, strings containing a number at the start will be converted to that number, and drop any string after, e.g. '1eabc' becomes 1.
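
The "keep the leading number, drop the rest" rule is easiest to see with explicit casts. This is a small sketch with made-up input strings; it deliberately sticks to casts and does not show comparisons.

```php
<?php
// "eabc" is not a valid exponent, so parsing stops after the leading "1".
$a = (int) '1eabc';
var_dump($a); // int(1)

// The leading digits are kept and the trailing text is dropped.
$b = (int) '12abc';
var_dump($b); // int(12)

// With no leading digits at all, the result is zero.
$c = (int) 'abc';
var_dump($c); // int(0)

// A valid exponent is honoured before the trailing text is dropped.
$d = (float) '1e3abc';
var_dump($d); // float(1000)
```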

It's also been a best practice for a looong time (since something like PHP 5.4) to use === to test for equality.

> It is well documented [...]

Yes, it's documented, but that doesn't make it any less obnoxious. IMO the thought of implicit conversion is absolutely asinine.

But, in fairness, PHP isn't the only language that does this. Python, at least, is more sane.

> and drop any string after, e.g. '1eabc' becomes 1.

In defense of PHP, it does yield a NOTICE if you attempt to apply certain operations to implicitly cast strings (e.g. addition), so there's that.

Suffice it to say that while I've written PHP for years and am quite comfortable with its idiosyncrasies, that doesn't mean I don't find them appallingly brain damaged. :)

FWIW the identity operator (===) is also a best practice in JS for nearly identical reasons, since JS exhibits similar (and in some cases nearly identical) implicit casting.

Yes, implicit conversion in both PHP and JS is brain damaged by today's standards, but that's life with legacy cruft.

> but that's life with legacy cruft.

So true.

I can't really complain. Well, I can, and that's what I've been doing in much of this thread.

It's mostly just yelling at the sky for no good reason other than to make myself feel better at this point, if I were to be completely honest with you.

Still, in JS "0.2" == "0.20" would always be false. My instinct would be that comparing two different strings would be false in PHP too.

The problem for me is working with a big legacy codebase. I'd rather have PHP break backwards compatibility and make == behave like ===.

> Still, in JS "0.2" == "0.20" would always be false. My instinct would be that comparing two different strings would be false in PHP too.

Nope.

I wish I were kidding. This is where PHP diverges from JS in a way that really will surprise people coming from JS (PHP 7.something):

    php > var_dump("0.2" == "0.20");
    bool(true)

> I'd rather have PHP break backwards compatibility and make == behave like ===.

I have mixed feelings on this.

On the one hand, I desperately want to agree with you, and probably for all the same reasons. On the other hand, languages like PHP, with their dynamic typing, make implicit conversion almost a necessity, and they keep the loose operator (==) alongside the identity operator (===) because they strive not to break "too much" (for some value of "too much").

Although, I'd imagine your argument is akin to "well, just cast it to what you expect" (e.g. `(int)$_GET['value']`), and you're right. That's how it should be done, of course.
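
For what it's worth, here is a hedged sketch of the "cast it to what you expect" approach. The $query array is a hypothetical stand-in for $_GET so the snippet runs from the command line, and filter_var() with FILTER_VALIDATE_INT is shown as a stricter alternative to a bare cast, not as something the thread prescribes.

```php
<?php
// Hypothetical stand-in for $_GET.
$query = ['value' => '42', 'bad' => '42abc'];

// A plain cast never fails; it silently keeps the leading digits.
$cast = (int) $query['bad'];

// filter_var() returns false unless the whole string is a valid integer.
$valid   = filter_var($query['value'], FILTER_VALIDATE_INT);
$invalid = filter_var($query['bad'], FILTER_VALIDATE_INT);

// With real ints (or an explicit false) in hand, === behaves as expected.
var_dump($cast === 42);       // bool(true)
var_dump($valid === 42);      // bool(true)
var_dump($invalid === false); // bool(true)
```

The cast is convenient in legacy code, but it accepts '42abc' without complaint. Validating first makes the bad input visible before any comparison happens.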

But, as you mentioned about legacy code bases... sometimes it's not that easy.