by danaris 2197 days ago
I have heard—and I do not recall the source for this, so it may be incorrect—that when PHP was originally being written, the hashing algorithm for function calls used the length of the function name as its primary key. Thus, making each function name a slightly different length made the code run faster.

Of course it would've been nice if, at any time in the past 20 years, they'd gone back and standardized those function names (possibly also adding some kind of backwards-compatibility shim for people who absolutely cannot update their ancient codebase with global search-and-replace), regardless of whether the story is true.


That reasoning is one third of the truth.

Length as hash was indeed a thing in PHP/FI 2.
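To make the idea concrete, here is a minimal sketch of what hashing by name length implies. It is not PHP/FI 2's actual code; the function names `length_hash` and `build_table` and the bucket count of 32 are assumptions for illustration only. It shows that names of equal length pile up in one bucket, so distinct lengths mean shorter chains to scan.

```python
from collections import defaultdict

def length_hash(name: str, buckets: int = 32) -> int:
    # Hypothetical stand-in for the scheme described above:
    # only the length of the name decides the bucket.
    return len(name) % buckets

def build_table(names, buckets: int = 32):
    # Chain every name into the bucket chosen by its length.
    table = defaultdict(list)
    for name in names:
        table[length_hash(name, buckets)].append(name)
    return table

names = ["strlen", "strcmp", "substr", "str_replace", "nl2br", "htmlspecialchars"]
table = build_table(names)

# Longest chain a lookup might have to walk through.
worst = max(len(chain) for chain in table.values())
print(worst)  # 3: strlen, strcmp and substr all have length 6
```

With a hash like this, a function named `string_length` would land in a different bucket than `strlen`, which is the alleged incentive for uneven names.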

The second third of the reason is that PHP often takes names from the underlying C libraries: strlen is strlen because that's the C name, and so on.

The third third is that early PHP made it easy to contribute. You had a need and a patch, and a few minutes later it was in. Nowadays there is more of a debate and a vote before things are added (in my personal opinion too much emphasis on voting, but, well, that's how things go; some day the process will be relaxed again ... and strengthened ... and relaxed).

Cleaning this up isn't trivial. There is sooooo much code written already. There are sooo many tutorials, books, articles, magazines and videos, and so much muscle memory, that any change is hard, even ignoring that for many things there is no single truth about what is best.

From time to time some modules might see replacements with more streamlined APIs (e.g. maybe someday someone figures out what a good and practical Unicode-aware string library might be, which might replace the classic string mess), but those are multi-year things.

A good place to start would seem to be creating naive aliases to these functions to at least fix the consistency problem. There's a possibility of breaking some code out there if someone made their own, say, "str_cmp", but we're used to breaking changes between major revs, and these could be fixed with a grep -l | xargs sed.
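As a rough sketch of what such a search-and-replace would have to do, here is the same idea written in Python rather than grep and sed. The `RENAMES` mapping is entirely hypothetical (no such new names have been agreed on), and the pattern is naive: it skips method calls, variables and longer identifiers, but it would still rewrite matches inside strings, comments and user-defined function declarations.

```python
import re

# Hypothetical mapping from old names to consistently styled new names.
RENAMES = {"strcmp": "str_compare", "strlen": "str_length"}

# Match an old name only when it is a whole identifier followed by "(",
# and not preceded by an identifier character, "$", "->" or "::".
PATTERN = re.compile(
    r"(?<![\w$>:])(" + "|".join(map(re.escape, RENAMES)) + r")(?=\s*\()"
)

def rename_calls(source: str) -> str:
    # Replace each matched call with its new name; everything else is untouched.
    return PATTERN.sub(lambda m: RENAMES[m.group(1)], source)

print(rename_calls("if (strcmp($a, $b) == 0) { echo strlen($a); }"))
```

A real migration would more likely work on the parsed syntax tree than on raw text, precisely because of the string and comment cases.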

After that's done, we can talk about deprecation warnings, and then finally outright removal.
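The alias-then-deprecate sequence can be sketched in a language-neutral way. The example below uses Python's `warnings` module; `str_compare` and `deprecated_alias` are hypothetical names invented for illustration, not anything PHP provides or has proposed. The old name keeps working but warns on every call.

```python
import functools
import warnings

def str_compare(a: str, b: str) -> int:
    """Hypothetical new, consistently named function: returns -1, 0 or 1."""
    return (a > b) - (a < b)

def deprecated_alias(new_func, old_name: str):
    # Build a wrapper that forwards to new_func and warns on every call.
    @functools.wraps(new_func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{old_name}() is deprecated; use {new_func.__name__}()",
            DeprecationWarning,
            stacklevel=2,
        )
        return new_func(*args, **kwargs)
    wrapper.__name__ = old_name
    return wrapper

# Step 1: the old name becomes a thin alias. Final step: delete this line.
strcmp = deprecated_alias(str_compare, "strcmp")
```

Old code calling `strcmp("a", "b")` still gets its result, and the warning tells the author which name to switch to before the alias is removed.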

Doing that means you have to first agree on a name. strcmp has vertical consistency with C; str_compare might be nice, string_compare even nicer.

Once that debate is over you have to migrate all old code and all developers and teach them to use the new form, throwing away all preexisting documentation and explaining to them why typing more is better.

Only then can the old form be removed.

Is it really worth it? And yes, there is the argument that "in the next 25 years there will be more code written than in the last 25", but still, is it worth it?

To give a feeling for the time scale: PHP had this register_globals feature where URL (GET) and POST parameters became global variables. Getting rid of that, from introducing a replacement, through changing the default, to the final removal of the option, took 10 years. On that time frame, it's 10 years of confusion.

Problem 1 is basic standardization and naming things. Hard, but possible.

Problem 2 is solved by adding the deprecation warnings. The code works, but it generates warning messages.

Yes, it'll take some time, but I'd argue it's necessary. Unless you've memorized the entire standard library, remembering which functions use underscores and which don't (and which ones use "to" vs "2") is an unnecessary mental burden.

Deprecation warnings which many people will ignore, while new learners still learn with old material.

It is really tough and costs lots of energy.

If you have the energy: create a complete proposal and push for it. If people feel the way you do, it will go through.