Hacker News
by mfonda 32 days ago
After over two decades of working in PHP, I'm now working in Java. PHP is basically Java-lite. I am absolutely loving the compile-time safety of Java, but I dearly miss PHP's maps and arrays. In Java, the amount of verbosity for defining a map/list and operating on it is overwhelming.

Modern PHP is great. Many powerful language features, excellent performance, great community and package ecosystem, and decent enough safety with modern static analysis tools.

I'm not too sure I agree with the author's complaints here. When using something like array_filter, you're typically mapping from collection to collection (i.e. you don't care about the first element--you care about the whole thing) and so this problem is really a non-issue. The next follow up step would usually be foreach, or another operation like array_map, in which case it's a non-issue.

If you really do need the first element, you can use array_first. And if you really do need a fixed-sized collection, you can use SplFixedArray.

The point on properties is valid to an extent, but IMO not really an issue you commonly run into in the real world (regardless of language, your constructors should generally return an object in a usable state).

12 comments

Maps should be a first class type in every language.

I have this thing I want to do in C. Now my C is very weak so my plan was to do a prototype in python, keeping in mind the C ecosystem then sit down with my old copy of "The C Programming Language" and struggle through it. Doing it with no dicts was rough.

I am normally the sort to avoid adding any libraries I don't have to but if anyone has any hints to simple hash maps in C I am all ears.

My first recommendation is don't use C if you can avoid it. Rust, go, zig, nim, odin, c++ etc. have hashmaps built in to the standard library or the language itself, and will have other advantages over c as well.

If for some reason you absolutely need to use c, consider if you really need a hashmap. If your collection is relatively small, just use an array or linked list with linear search. That's a pattern I've seen used in several c codebases, I think because of the difficulty of using map types.

If linear search would be too slow, then a binary search tree is relatively easy to implement, and gives you log(n) lookup time (as long as it's balanced). Or if you build it up once and don't modify it very much afterwards, you can use a sorted array, with a binary search.

If you really need a hashmap, there are some implementations, but I've also seen a few c projects that just implement their own hashmaps.

My suggestions, none of which are particularly simple but it's C, you get what you get with a language that doesn't even know what strings and arrays are:

0) use Lua with LuaJIT. It has very good C interop and native hashmaps. Downside - Lua has its own rough edges and LuaJIT is frozen at Lua 5.1 with some extensions from 5.2

1) SDL3 has a hashmap implementation with its properties API, and a general purpose hashmap that they're probably going to make public at some point maybe? Downside - overkill if you don't actually want to use the rest of the library.

2) https://github.com/tidwall/hashmap.c this seems to work fine and is the simplest hashmap implementation for C that I've seen. Downside - you still have to write a lot of boilerplate.

You may try https://github.com/nothings/stb/blob/master/stb_ds.h, a single-header implementation of both a dynamic array and a string-based hash table.
If I remember it correctly, the "Java-lite" part came rather late. PHP was closer to Perl and other scripting languages of that era: it let you quickly put up a web page. Just

    <p><?php echo(htmlspecialchars($_GET['user'])); ?></p>
and you get a hello page with a parameter specifiable via `?user=` query.

But once people started to actually use it to build big sites, `echo($_GET['user'])` alone was not enough; it had to be:

    <?php
    $user = "guest";
    if (!empty($_GET['user'])) { // Have to remember to do this check every time when handling $_GET/$_POST etc
        ... safety check etc etc
        $user = htmlspecialchars($_GET['user'], ...more parameters...);
    }
    ...
So people started to add module/components as well as ways to load and use those components, to enable them to write code like:

   <?php

   use My\Beautiful\InputFilter;
   use My\Beautiful\InputFilters\Integer;

   function get_page_num_from_query(InputFilter $f, array $source, string $name): bool {
     return $f->is_valid(new Integer(0, 100), $source, $name);
   } // Or something like that, I haven't written a single line since PHP5.
That's when it got its Java look.
>That's when it got its Java look.

Nice Java burn! But nowadays all you have to say to burn Java is "Lawnmower".

Namespaces were added long after that step.
If I have an “array” and can do array[0] to get first item, but when I filter this array and array[0] throws an error, that’s super weird. What is the meaning of [] or what is an array even? The language forces me to understand how it is implemented under the hood. That’s exactly what the author says: leaky abstraction.
That also often bites you with json_encode: the result only becomes a JSON array when the keys are ordered "correctly" (numeric, 0-based, without gaps); otherwise it becomes an object. So to be safe you generally need to call array_values after filtering. If your test data only removes elements from the end, you won't catch that before production data hits.
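
A minimal sketch of that pitfall, assuming a plain list of integers (the variable names are made up for illustration):

```php
<?php
// Illustrative data; the variable names are made up for this sketch.
$nums = [1, 2, 3, 4];

// array_filter keeps the original keys: dropping elements from the end
// leaves keys 0 and 1 intact, while dropping from the middle leaves gaps.
$small = array_filter($nums, fn (int $n): bool => $n < 3);       // [0 => 1, 1 => 2]
$even  = array_filter($nums, fn (int $n): bool => $n % 2 === 0); // [1 => 2, 3 => 4]

$smallJson   = json_encode($small);              // [1,2]
$evenJson    = json_encode($even);               // {"1":2,"3":4}
$reindexJson = json_encode(array_values($even)); // [2,4]

echo $smallJson, "\n", $evenJson, "\n", $reindexJson, "\n";
```

The first case is why test data that only trims the tail of the list hides the problem.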

To get the first element there is also reset().

I love PHP though.

It's especially problematic when encoding an empty object to JSON. By default an empty array is serialized as []; to get {} you need to either pass a flag to force object serialization (which can mess up serializing actual arrays) or cast the array to an object. Neither is great when the object is deeply nested in the serialized structure.
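
A rough sketch of both workarounds; the payload shape ('ids', 'meta') is invented for illustration:

```php
<?php
// Default: an empty PHP array is encoded as a JSON array.
$plain = json_encode([]);                     // []

// Workaround 1: force objects everywhere. This also turns real lists into objects.
$forced = json_encode(['ids' => [1, 2], 'meta' => []], JSON_FORCE_OBJECT);
// {"ids":{"0":1,"1":2},"meta":{}}

// Workaround 2: cast only the value that must be an object.
$cast = json_encode(['ids' => [1, 2], 'meta' => (object) []]);
// {"ids":[1,2],"meta":{}}

echo $plain, "\n", $forced, "\n", $cast, "\n";
```

The cast keeps lists intact, but it has to be applied at every place an empty map can occur, which is the awkward part for deeply nested data.
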
An “array” in PHP is an ordered map.
Isn't that exactly their complaint? It's called an array, referred to consistently everywhere as an array, but it just ... isn't.
Apparently array is short for associative array :-)
In PHP the array index supports different key types, and there are various optimizations that need to happen depending on whether the indexes are all numeric, mixed, all strings, or anything else. Technically it's called associative as soon as even one of the indexes isn't numeric. Internally, though, they are always numeric: anything non-integer is hashed with the DJB2 (Daniel J. Bernstein) hash algorithm and then stored. Using a non-numeric index is slightly slower for that reason.
But even if you only have numeric keys, those keys don't need to be consecutive, or start at 0.
Better than calling it a hash.
I don't think it is, tbh.

Perl's hashes are a complete mystery to me still, but at least it lets me know that it's not just a linear, uh, well, array.

> Perl's hashes are a complete mystery to me still

They're unordered mappings from strings to arbitrary values ("scalars" in Perl jargon). In this sense they're just like an object in JavaScript.

Where this gets a little weird is that Perl arrays and hashes are fundamental types distinct from scalars - you can't put a hash into a $variable without taking a reference to it first, for instance. But that's more a matter of Perl being pickier about the value/reference distinction than a hash-specific thing.

PHP arrays are vectors, hash maps and doubly linked lists in one
My issue is a ton of people call all maps a hash regardless of how it's implemented.
> If you really do need the first element, you can use array_first

That probably needs an array_second too, doesn't it? Maybe array_second_from_last as well?

That's foolish. It would clearly be named `second_array_element`.
Just use array_values() and suddenly you can use int indices again.
PHP has compile time safety too but compile time occurs at roughly the same time as runtime lol
I'm not familiar with PHP, can you elaborate on what you mean here? What is it compiling? Or are you referring to type safety?
When a php file is loaded at runtime, it runs through a very basic JIT compiler that does statically check a few things before continuing with execution. Syntax, for example, is checked for the entire file during this step.

Most type checking happens at runtime (this might not be true for interfaces at some level, but I can’t say for 100% certain - I just know I tend to see interface related errors earlier during code execution…). It’s perfectly valid syntax to declare a private method as returning an integer and then for the body of the method to return a string (explicitly cast as a string even). As long as you never call that method at runtime, no exceptions will be thrown.
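
A small sketch of that behavior. The class and method names are hypothetical, and a non-numeric string is used so the result is the same with or without strict_types:

```php
<?php
final class Counter
{
    // Declared to return int, but the body returns a string.
    // PHP compiles this file without complaint.
    private function broken(): int
    {
        $value = 'five';
        return (string) $value;
    }

    public function callBroken(): int
    {
        return $this->broken();
    }
}

$counter = new Counter(); // fine: broken() has not run yet

$caught = null;
try {
    $counter->callBroken(); // the return type is only verified here
} catch (TypeError $e) {
    $caught = $e;
}

echo $caught === null ? 'no error' : get_class($caught), "\n";
```

If callBroken() is never invoked, the mismatch goes unnoticed unless a static analyzer flags it.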

With a half decent IDE or LSP, these sorts of runtime exceptions can be easily avoided but technically they still exist and if you don’t know about that, it can be argued to be confusing. PHP has made a lot of trade-offs to largely maintain backwards compatibility and many of them live in decisions that happen at runtime.

Modern PHP tooling can provide type safety in a very similar way to TypeScript if you’re willing to put in the effort while also still technically offering you an escape hatch to do whatever the heck you want and duck type to your heart's content.

NIT linter warning: That’s not really a JIT compiler.

PHP parses the whole file and compiles it to Zend opcodes before executing it, so syntax errors are caught up front. But "JIT" means compiling an intermediate representation/opcodes into native machine code at runtime, when the functions are called, not at load time when the source file is parsed. If you just load a file and never call any of its code, the JIT compiler should never compile it.

PHP 8's OPcache JIT can do that optionally, but the normal load/parse/compile-to-opcodes step isn't JIT.

One form of type checking does happen at compile time (which is really load time in PHP, but close enough), namely when a class extends another class or implements an interface: the types of every method and property are checked to ensure that they are substitutable per the standard variance rules (return types are covariant, parameters are contravariant, props are invariant). Everything else is checked at runtime though, and statically analyzing any of those is left to external tools like phpstan, psalm, or mago.
Is there a way to run type checking ahead of time similar to TypeScript or Python's mypy or pyright?
PHPStan is one of the more popular tools to evaluate that stuff. It works by examining each file and resolving all of the imports to verify that all of the types are compatible.

JetBrains PhpStorm has this sort of type resolution built in (one of their value adds) but you can also run PHPStan instead of their proprietary version.

Have you tried Kotlin? It's a less clunky Java. The syntax is IMHO Ruby-level charming (for an OO-first lang), but with types that are quite a bit stronger than Java. Java interop is quite smooth.
The comparison with Ruby is spot on.

I always thought that DSLs were the one thing Ruby did better than the competition, but Kotlin's combination of receiver lambdas plus syntactic sugar for calling higher-order functions make it an even better language to write DSLs in.

That's exactly how I feel about it.

And the code I'm looking at now with Kotlin is so similar to code I liked reading when I was in a committed relationship with Ruby.

All PHP needs is Python's flexible and convenient manipulation of lists/dicts/objects. Plus dropping the end semicolon and it would riiiiiiiiiiiiiiiip the fabric of spacetime.
The semicolon allows you to have lots of flexibility in how you present your code, that is if you still read code; I seem to be an oddity now.
I'd really like it if PHP switched to Perl's grammar rule where semicolon is a statement separator rather than a terminator. It already does this in a fashion within <?php ?> tags in that statements in those don't have to end with a semicolon. I think it just blindly inserts a semicolon, but that does work -- if PHP did that whenever it saw a `}` token, it should have the same effect.
Just had a bug the other day where we were calling array_filter and then feeding the results into json_encode. If you feed a sparse array into that, you get an object, not an array, which can then cause JavaScript Problems.

I wasn't aware of either of those behaviors going into that debugging session.

Have you tried Lua or one of its variants?
> Modern PHP is great. Many powerful language features, excellent performance, great community and package ecosystem

I heard this a long time ago about perl. CPAN is great.

Well ... perl entered the fossilized era. I think people do not really observe things correctly. I am noticing the same with ruby right now - everyone sees that ruby is in decline, very strongly so, in the last 3 years. Yet you have blog posts such as "ruby is not dying - it is aging like fine wine". And these are all NOT BASED ON FACTUAL ANALYSIS. I still think ruby is a great language, but if people are not realistic in their assessment of a situation, what does this tell us about people's evaluation in general? People seem to shy away from criticism. You can see this on reddit too, where moderators ban and censor willy-nilly, or even on github, where you can also quickly get eliminated for not conforming to xyz. It's as if some people are very afraid of strong opinions. I don't understand why - an opinion that is objectively false, can be shown to be false.

People absolutely have VERY strong opinions and voice them constantly. True of every language but especially php. Almost feel like it’s more acceptable to rant about php than to praise it
I attended a talk by Rasmus Lerdorf at a FOSS conference in 2006. It has been a long time, so I remember only a few things from the talk, but one thing I remember him talking about is how people love to complain about PHP, often on forums that are themselves written in PHP.

    #include <stdstroustrupquote>
Rasmus also admits that he didn't really know what he was doing when he created PHP, and that it's a bit surprising that PHP has stayed as compatible as it has. I kind of respect him more for that than I once used to.
No, in 2006 it was still considered poor form to reply with a low-effort meme phrase instead of meaningfully criticizing his position with your own.
Depends entirely on the forum.

I remember it being somewhat common for people to make forum posts consisting entirely of a joke image. However, they weren’t called memes at the time as the word had yet to be popularized.

I know I used to have the impression of PHP as a messy language because I last worked with PHP4. It's come a long way since then, though I don't use it.
> objectively

> Ruby is dying

How exactly do you define these “objective” criteria for such sensationalism?

Mate, not to be rude but your entire comment isn't based on factual analysis; it's a rant about unrelated languages.
You can say the same thing about lisp (and C in some regards). Sometimes a language is done and anything you add to it is breaking things for no sizable improvement. And if your primary target is Unix, it’s often so easy to write a shim for C/C++ libraries that you don’t bother implementing your own version of stuff.
Which lisp? :P
> People seem to shy away from criticism.

What actual criticism of PHP is anyone shying away from?

PHP got bashed for such a long time, while simply nothing steps up to do what it does better. Something that, for example, is available on every webhost you can just throw files at, where all (meaningful) config and state can be in those files.

I used to really love the dead-simple ease PHP brought to server-side dynamic web stuff too. But when shared cpanel type hosting was orders of magnitude cheaper than anything else, that was a way bigger deal. Today you can deploy a node.js app (all the same “just a script” advantages of PHP) to a half dozen places for free, and for the next step up, a smallish instance at Hetzner, DigitalOcean or whatever, where you can just run any arbitrary container, costs less than shared hosting once did.

Why do I bring up containers? Because part of why PHP was so dope in this way was the way you can just define 1 file per endpoint and drop it in public_html, and have no server setup to do. Running say, Rails or ASP.NET or a Java site back then meant doing… a lot more, to your server.

But with Docker, you can just steal a good Dockerfile template from someone else, and it’s just like 3-4 simple files for you to manage for a simple Sinatra (Ruby) or node.js version of the “one-off PHP file” things.

But I don't want to manage 3-4 files, I want to manage zero files. I don't want half a dozen hosts, I want hundreds of thousands. It's not about costs, I really mean the simplicity and pervasiveness. PHP apps that are simple (in that they don't require any "rare" modules to be enabled) can easily be written to not run in relative folder structures, you can move them around like .exe files if you will. Not "like moving an exe file and then just updating a few lines in this file over there", that is a completely different thing for me.

edit: Granted, I agree that if you want to do all sorts of things on the internet, maybe PHP is not the right choice. But for simple, dynamic web things that I want to just make and then run like this forever, that I can work on but don't have to? PHP and vanilla HTML and Javascript are where it's at for me, hands down. Everything else I know is either too new or seems to have constant churn or issues. That you hear nothing about PHP other than complaining it's "outdated" or whatever from the outside -- always "why are you using this?" never "why oh why am I using this?" -- is because it just hums along, IMO. I like it better than Python, and I kinda view it as in that class.

> Today you can deploy a node.js app (all the same “just a script” advantages of PHP)

You still need to build a router and a web server in said nodejs script. And manage everything, including taking care that requests don't accidentally mutate global state.

PHP in contrast is stateless. Way less bs to take care about.

I've always thought that the core idea of PHP, the intermixing of code and HTML is an incredibly elegant solution to a very difficult problem. But at the same time, the language itself does suck (although I won't discount the improvements it has made). I would really love for there to be an entirely reimagined PHP from the ground up, and to hell with backwards compatibility or availability.
> I would really love for there to be an entirely reimagined PHP from the ground up, and to hell with backwards compatibility or availability.

I'm halfway onboard :) That is, personally I do like that PHP is kinda "boring", it's just a programming language that mostly looks like other curly brace languages that can basically do anything I ever need to do with strings out of the box. And I like that PHP, while also adopting more advanced programming concepts, stays nice with all the simple stuff. I like the stewardship, you might say, although I don't pay any attention to it. New PHP versions always either have things I don't understand, don't care about, or really like. The performance goes steadily into the correct direction, too.

I'm sure people who know more about programming and the version history will disagree about something, but for my "low-tech" usage, it's actually one of the few happy places of computing. Not happy as in exciting, more like working in a modest garden, without stuff blowing up constantly, and salesmen posing as flowers.

But I still would love to see and try out "takes" on that, or on what other people like about it. At least to start with, I guess it would just need to be somewhat painless to use with Apache, NGINX, or come with a webserver built-in. Then people can use that locally and on servers they fully control, and if it's really good, and really good with resources, basically a painless additional thing to add, web hosts might adopt it etc.

There was. It was called Hack[0]. Among other things it had XHP built in so you could write HTML natively (as opposed to concatenating strings) and define your own templates easily[1]. It even handled escaping. It really improved on a ton of PHP's flaws.

Unfortunately newer versions of PHP killed it and it's dead now, and even more unfortunately while PHP absorbed a lot of features from Hack, native XML was not one of them. There was even going to be a Hack version of the Composer package manager but that never got finished AFAIK. Distros stopped supporting it. I think I still have my half-finished attempt at a Hacker News fork in Hack sitting around on a hard drive somewhere. I can't even find an environment to run it in anymore.

[0]https://docs.hhvm.com/hack-overview/

[1]https://docs.hhvm.com/hack/XHP/introduction/

XHP was a modified PHP before Hack came about, but only Hack supports it now. XHP did have a longer reach though: JSX is its direct descendant.
Making it an extension rather than default was a mistake on PHP's part. 99% of PHP is being run on shared servers and most people can't recompile it, and most of the rest just won't.

Having a language whose entire purpose is to be a templating language for HTML have no concept of what HTML is, is just ridiculous. You have to use a templating framework that rolls its own ad-hoc DSL and parser to manage context just to make PHP do what it should be able to do safely and sanely out of the box.

It doesn't matter now, since "web dev" is whatever JS vibe-coded nonsense Claude shits out and no one cares anymore, but ye gods it could have been so much better.

yes I say this now very often, that PHP has morphed into runtime Java. Quite nice in some ways.

In PHP though the STDlib is not very well thought out, it may be fun(needle,haystack) or fun(haystack,needle) and you just have to remember

empty() will do weird stuff: a string with the value "0" also counts as empty, so it's not great for parsing things.
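
A quick sketch of that footgun; is_blank() is a hypothetical helper, not a PHP built-in:

```php
<?php
// Hypothetical helper: treats only null and the empty string as missing.
function is_blank(?string $value): bool
{
    return $value === null || $value === '';
}

$input = "0"; // e.g. a quantity field submitted as the string "0"

$emptySaysMissing = empty($input);    // true: "0" is falsy in PHP
$blankSaysMissing = is_blank($input); // false: "0" is a real value
$emptyOnDecimal   = empty("0.0");     // false: only "" and "0" are falsy strings

var_dump($emptySaysMissing, $blankSaysMissing, $emptyOnDecimal);
```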

A lot of footguns and the best way to avoid that is with a decent linter that lets you be picky

another thing I mention is avoiding the array_ stuff because of the aforementioned reasons, it's easier to remember/reason about boring loops unfortunately.

Why not just use a proper code editor with inlay hints, inline documentation, and autocompletion instead? These things are a non-issue, unless you’re working with Notepad++ or something.
Yea VSCode or Eclipse does this.
I have worked with countless programming languages and they all have little oddities.
Isn't Kotlin a more modern choice coming from Java, one that's designed properly instead of being a series of bolted-on decisions?