> Nobody with the actual skillset to design and implement a programming language would consider, for example, merging vectors and dictionaries into some kind of mutated frankenarray.
What's the order of complexity of adding an element in the middle of "an array" in PHP?
So it's bad because it obscures what the actual implementation is. If you can't tell how much slower an algorithm that works on a 'frankenarray' gets as the array grows, you have a problem.
That's one problem. The other problem is with language itself. When we call something an array, which isn't actually an array (an array being a very specific implementation of a vector) people get confused.
Imagine how you would feel if PHP used the word 'variable' to mean function and the word 'function' to mean variable.
So, a hash and an array are not just semantically different (they refer to completely different ADTs), but they are also completely distinct implementations of those ADTs.
Now, this is not a problem when you are using PHP as it was originally intended: to add a small bit of logic to a template that mostly uses the fast, well-written C functions in its built-in library.
But people are abusing PHP to write large scale libraries and frameworks. Many riddled with security issues, and unpredictable performance bottlenecks, because the language wasn't designed to actually allow you to write well defined algorithms.
Now, we've seen workarounds for these problems. Facebook wrote a PHP-to-C++ compiler, and moved all crucial algorithms to C. You can use unit testing to supplement the lack of any usable type information, and to combat the automatic casting that PHP does (which creates all kinds of bugs in edge cases that are hard to track or test against). You can use profiling to track performance issues, and fix code that interacts with frankenarrays, in a sort of trial-and-error kind of way.
People once wrote all their code in ASM. Discipline can make all the difference. But highly skilled, disciplined, well-educated professionals aren't the target audience of PHP.
All this required discipline can be justified on practical grounds. PHP programmers are generally paid the worst (or, as an employer would put it, are the cheapest). Hosting is cheap. And for many use cases you can just drop in ready-to-go code (WordPress, Drupal, Joomla, etc.) and just skin it.
So, I'm not saying picking PHP is a bad financial choice for a company. If you don't take on difficult projects, and don't intend to hire highly skilled coders, it can be cheap and productive.
It's interesting that I asked a simple technical question, and rather than just answering it, you threw in a slew of insults and opinions about PHP and PHP programmers. Why do you have such strong feelings about PHP and PHP programmers? How long did you actually use PHP, and which companies did you work for that used it?
It's not your assertion that PHP is a crappy language that I disagree with. I fully agree there. Where I disagree with you is in your assessment that a) there aren't many skilled programmers who use PHP, and c) language features of PHP make it very difficult or impossible to write well-designed software.
As to your assertion that PHP is only good for simple projects employing low-skilled developers, tell that to the various PHP startups paying top dollar for skilled PHP developers.
PHP arrays are actually ordered hash tables. This information is readily available. If you really need a real array, there is SplFixedArray.
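To make the "ordered hash table" point concrete, here is a minimal sketch (the variable names are just for illustration). It shows that iteration order follows insertion order rather than key order, and that appending uses the largest integer key seen so far plus one.

```php
<?php
// PHP arrays keep insertion order, regardless of key type or key value.
$a = [];
$a[5] = 'five';
$a['name'] = 'php';
$a[2] = 'two';

// Keys come back in the order they were inserted, not sorted.
$keys = array_keys($a);

// Appending without a key uses (largest integer key so far) + 1.
$a[] = 'next';
$lastKey = array_key_last($a); // array_key_last() needs PHP 7.3 or later

print_r($keys);
var_dump($lastKey);
```

So the structure behaves like a map that remembers insertion order, not like a contiguous vector indexed from zero.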
> Now, we've seen workarounds for these problems. Facebook wrote a PHP-to-C++ compiler, and moved all crucial algorithms to C.
At the scale of Facebook, nobody uses scripting languages, so I doubt this is purely because of PHP. This would be like me telling you not to use RoR because Twitter switched to Scala (people said this, but those people are stupid and I'm not among them). Twitter and Facebook have problems you and I are unlikely to have.
> You can use unit testing to supplement the lack of any usable type information, and to combat the automatic casting that PHP does (which creates all kinds of bugs in edge cases that are hard to track or test against). You can use profiling to track performance issues, and fix code that interacts with frankenarrays, in a sort of trial-and-error kind of way.
PHP's type system is crazy, that is absolutely true. But it's not as hard to work around or as frequently a problem as you're suggesting, at least in my experience. JavaScript is another language that has a crazy-ish type system, but it doesn't get the hate that PHP does.
I have never had a case where PHP's arrays caused performance problems, but that's probably because I've also never written an application that didn't interact with some kind of back-end store like a database that was inevitably slower. I've done a fair amount of work to make apps faster, and the big wins were always in fixing bad data access or caching patterns and algorithms. I imagine that there would be a point when I need to worry about not using PHP arrays, but I guess I haven't hit it. And again, if/when I do, there's always SplFixedArray.
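For reference, a small sketch of what that fallback looks like, assuming the class meant here is SplFixedArray from PHP's standard SPL extension. It has a fixed number of integer-indexed slots and rejects indexes outside that range.

```php
<?php
// A fixed-size, integer-indexed array: exactly three slots, indexed 0..2.
$fixed = new SplFixedArray(3);
$fixed[0] = 'a';
$fixed[1] = 'b';
$fixed[2] = 'c';

$size = $fixed->getSize();

// Unlike a plain PHP array, writing past the end does not grow the structure.
$rejected = false;
try {
    $fixed[3] = 'd';
} catch (RuntimeException $e) {
    $rejected = true;
}

var_dump($size, $rejected);
print_r($fixed->toArray());
```

Whether it is actually faster for a given workload is something to measure, not assume.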
> But people are abusing PHP to write large scale libraries and frameworks. Many riddled with security issues, and unpredictable performance bottlenecks, because the language wasn't designed to actually allow you to write well defined algorithms.
There are some really shitty PHP projects. I suspect a big part of that is that they took off before there was the widespread knowledge of best practices we have now. (I'm assuming you're thinking of WordPress, Drupal and Joomla.) I've worked with WordPress a bit, unfortunately, and the main reason it's so damned slow is that the queries are badly written, and the data access and caching patterns are awful. PHP's arrays never crossed my mind as a bottleneck, because they don't even show up in my profiling tools.
For what it's worth, I used to see a lot of people's code because the company I worked for required code samples when we were interviewing people. I saw a lot of really shitty RoR code, and I do not blame that on the language or the framework.
>Where I disagree with you is in your assessment that a) there aren't many skilled programmers who use PHP, and c) language features of PHP make it very difficult or impossible to write well-designed software.
a) I didn't make that assessment
b) ??
c) difficult, yes
>As to your assertion that PHP is only good for simple projects employing low-skilled developers, tell that to the various PHP startups paying top dollar for skilled PHP developers.
No, my assertion was that PHP is only a good economic choice, when you choose to hire low-skilled, cheap employees. That's its niche.
>Twitter and Facebook have problems you and I are unlikely to have.
Agreed. But that does not make what Facebook did with PHP anything less than working around the issues caused by the initial commitment to PHP.
>JavaScript is another language that has a crazy-ish type system, but it doesn't get the hate that PHP does.
But it did. Even now, claiming that JavaScript's language design isn't optimal isn't something people will argue against. Say the same thing about PHP, and you get threads like these...
>I have never had a case where PHP's arrays caused performance problems, but that's probably because I've also never written an application that didn't interact with some kind of back-end store like a database that was inevitably slower.
Now, the type of problems I'm pointing at are the ones that rear their head when the data size gets larger. For example, take an algorithm that interacts with an array in PHP and inserts or removes an element on every iteration of a loop. Depending on where that data is found in the array, whether its key is a number or a string, and whether the number keys are close together in range or not, the algorithmic complexity is anything from O(n) to O(2^n).
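A minimal sketch of the operations under discussion, using only built-in functions (the variable names are illustrative). It shows what inserting into the middle and removing from the middle actually do to the keys. It does not measure cost: array_splice() has to renumber the integer keys that follow the insertion point, so its work grows with the array's length, but the exact cost on a given PHP version is best confirmed by profiling.

```php
<?php
$list = ['a', 'b', 'd', 'e'];

// Insert 'c' at position 2. The elements after it are shifted and their
// integer keys are renumbered.
array_splice($list, 2, 0, ['c']);

// unset() removes an element but leaves a hole in the key sequence,
// so the result is no longer a 0..n-1 list.
$gappy = $list;
unset($gappy[1]);
$gapKeys = array_keys($gappy);

// array_values() builds a new array with keys 0..n-1 again.
$packed = array_values($gappy);

print_r($list);
print_r($gapKeys);
print_r($packed);
```

The hole left by unset() is the kind of detail that makes textbook array algorithms easy to get subtly wrong when ported directly.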
In other words, the normal fauna of sorting algorithms are very hard, if not impossible to correctly implement in PHP.
This is not very relevant if you just need a good sorting function for your array, since that is built-in. This is not relevant if your data sizes are always small.
So, this is not relevant in the small-scale world. But if there is a chance that you will need to implement a complicated algorithm in PHP, one that perhaps needs to interact with large data structures, then PHP is a dangerous choice.
Beyond the limited scope of a database-powered template with a bit of logic, it is simply not suited. It is not designed well enough to be a 'general purpose' programming language, and it is the civic duty of all to warn those who think it is (like the authors of this Node.php project).
While this is merely anecdotal, I feel compelled to note that I'm a PHP programmer -- as well as a Python and a Ruby programmer (in the sense of "somebody has paid me going market rates for working in these languages") -- and even for PHP work, I am not cheap. And, while I am of course a biased observer, I'd say I'm not low-skilled, either.
I generally agree with your technical criticisms of the language (I generated a bit of heat and light a few months ago with a blog post called "PHP is not an acceptable COBOL"), but I don't think your assertions about the economics are entirely correct. I don't think most companies choose PHP because they expect it will let them get cheap help -- I don't think most companies actually choose PHP. They end up with PHP for any number of reasons that don't have anything to do with long-term planning. (Also, if you were putting together a database-backed web site in the early 2000s, there's a very good chance you'd go with either ASP or PHP unless you had a lot of experience in web programming in another language. After you built up sufficient inertia, switching would be difficult.)
On the flip side, there are a lot of companies out here in Silicon Valley using Ruby on Rails because it fits very, very well into the niche you describe: take a bunch of Rails gems and learn just enough Ruby to hold it all together with spit and duct tape. If I were starting a company and wanting to hit the "low-skilled, cheap niche," the only downside to choosing Rails is that low-skilled Rails programmers frequently don't recognize that they're low-skilled, because the mere fact that they've chosen Rails makes them think they're definitionally awesome.
(N.B.: I like Rails a lot, but gosh, it's attitudinal.)
>It's interesting that I asked a simple technical question, and rather than just answering it,
Your question was:
Why is this bad? (referring to frankenarrays of PHP)
The summary of my answer was:
1. you can't easily tell the algorithmic complexity of most algorithms in PHP involving these "arrays"
2. naming something an array, that doesn't have the associated algorithmic complexity of an array, is misleading and hurts education, research and knowledge sharing.
I explained both these answers, with two examples:
1. What's the order of complexity of adding an element in the middle of "an array" in PHP? (I'll answer: it's undefined)
2. What would you think if PHP named a function a variable, and a variable a function?
So, I did answer your question in detail.
> you threw in a slew of insults and opinions about PHP and PHP programmers.
It wasn't meant to be insulting. And the majority of my reply was a direct answer to your question. I think it was just a bit too broadly stated for you to see the forest for the trees in my answer. And then I went off topic for your particular reply.
> But people are abusing PHP to write large scale libraries and frameworks. Many riddled with security issues, and unpredictable performance bottlenecks, because the language wasn't designed to actually allow you to write well defined algorithms.
I mostly agree with your post but you've gone way over the top on this statement. PHP is pretty analogous to the majority of languages in terms of syntax, structure, and capability. Frameworks written in PHP are no more likely to have security issues or unpredictable performance bottlenecks than any other language. Certainly nothing prevents you from writing well defined algorithms.
PHP's "arrays" are both vectors and hash tables at the same time. I agree that there is no reason to combine those concepts (although I love ordered hash tables). But ultimately there isn't much harm in it; it's just an extreme example of typelessness.
>> But people are abusing PHP to write large scale libraries and frameworks. Many riddled with security issues, and unpredictable performance bottlenecks, because the language wasn't designed to actually allow you to write well defined algorithms.
>I mostly agree with your post but you've gone way over the top on this statement.
You are right, I went overboard.
>Frameworks written in PHP are no more likely to have security issues or unpredictable performance bottlenecks than any other language
I suspect there is a measurable relationship between the average quality of an ecosystem and how easy or difficult a language is to use. But that would not be a fair argument: a language being easy should be a good thing.
>Certainly nothing prevents you from writing well defined algorithms.
Well, not being able to guess the order of complexity of many of the core operations on data structures does make it a lot harder. It's generally undefined, at least by the language spec. And when the internal implementations of core data structures change between versions, you can't even reliably implement the typical collection of sorting algorithms.
The reason this isn't a problem at the small scale is that the most common operations are actually all built-in with good defaults.
>But ultimately there isn't much harm in it; it's just an extreme example of typelessness.
PHP is not typeless, just messy with types. Depending on the type, you can use different syntactic operators, and you are unable to implement those same constructs for other types. The difference between value and object semantics also depends on the type. However, this logic is not consistent with the actual algorithmic complexity. Strings are copy-on-write, for example. So you are technically passing a reference (from a performance POV), but acting like it's not.
PHP is a dynamically typed language, with lots of run-time type errors when things don't match. We even have all kinds of functions to determine types. It's definitely not typeless.
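A short sketch of the value-versus-object distinction described above (variable names are illustrative). Arrays and strings behave as if assignment copies them, while objects are shared through a handle. The copy-on-write part is an engine detail that this code cannot show directly; it only demonstrates the observable semantics.

```php
<?php
// Value semantics: assignment behaves like a copy, even though the engine
// defers the real copy until one side is written to.
$original = [1, 2, 3];
$copy = $original;
$copy[] = 4;            // only $copy changes

// Object semantics: both variables hold a handle to the same object.
$first = new stdClass();
$first->count = 1;
$second = $first;
$second->count = 2;     // the change is visible through $first as well

print_r($original);
print_r($copy);
var_dump($first->count);
```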
> Strings are copy-on-write, for example. So you are technically passing a reference (from a performance POV), but acting like it's not.
That's just an optimization; it's not a very significant detail. All languages have these little details. Java, for example, caches and reuses boxed Integer instances for small values (-128 to 127 by default).
> PHP is a dynamically typed language
It's also weakly typed. You can, for the most part, assume that the type isn't very important. If you've got the value 12, it doesn't matter much if it's stored as an int, float, or string. But obviously if you attempt to do an operation not supported by the value then you'll get an error. There's nothing surprising about that.
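As a rough illustration of that claim, here is a sketch using the value 12 stored three ways (the variable names are made up for the example). Loose comparison and arithmetic treat them alike, strict comparison does not, and an operation the type does not support raises an error.

```php
<?php
$asInt = 12;
$asFloat = 12.0;
$asString = '12';

// Loose comparison converts operands before comparing.
$looseEqual = ($asInt == $asFloat) && ($asInt == $asString);

// Strict comparison also checks the type.
$strictEqual = ($asInt === $asString);

// Arithmetic on a numeric string converts it to a number first.
$sum = $asString + 1;

// An operation the type does not support throws (array + int here).
$values = [];
$failed = false;
try {
    $result = $values + 1;
} catch (Error $e) {
    $failed = true;
}

var_dump($looseEqual, $strictEqual, $sum, $failed);
```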
PHP arrays just take the idea of an index and allow it to be either a string or an integer. In both cases, you're providing a value for the key and getting out another value. It's pretty easy to explain. The issue, I'd say, is that programmers rarely ever index by both integers and strings in the same structure. There are also performance differences, but hash tables are so fast that you need to be doing something unusual for it to matter.
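One wrinkle worth knowing when mixing key types, sketched below with illustrative names: a string key that looks like a canonical decimal integer is converted to an integer key, so '1' and 1 address the same slot, while '01' stays a string.

```php
<?php
$a = [];
$a[1] = 'int key';
$a['1'] = 'numeric string key';    // same slot as 1: the key is cast to int
$a['01'] = 'non-canonical string'; // not a canonical integer, stays a string
$a['x'] = 'string key';

$keys = array_keys($a);

var_dump($keys);
var_dump($a[1]);
```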
But that doesn't make the language any better.