Deployment is not just a matter of updating the code; many times you have to update the data too, and "appcfg.py update" really does not take care of that.
On one of my apps the only thing I have to do is literally "git push production release" and that takes care of everything.
I don't understand the scaling comment either.
How is it not scaling for you?
Scaling is tedious because things we take for granted are suddenly hard. Some examples, and this is not an exhaustive list:
---
The transactional support in GAE is really awful. Here is something I take for granted when working in Django or Rails with an RDBMS: I want the writes I did to be reverted if my request fails for some stupid reason and triggers an exception. It's nothing much, but I want this basic functionality that should be a no-brainer.
Transactions in GAE work on trees. Basically GAE puts a lock on a parent entity (every entity can have a parent entity), and every operation done on an entity with that parent will be inside the transaction (and reverted in case of rollback). But THE PROBLEM is that you cannot abuse this functionality, because once a lock is taken on the parent, the whole tree is completely blocked. That means concurrent requests touching the same entities will wait until the transaction holding the lock finishes, with a freaking timeout and an exception if it fails.
Even SQLite is smarter than that. Programming like it's freaking 1970.
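To make the "whole tree is blocked" point concrete, here is a toy model in plain Python. It is only a sketch under my own assumptions: `GroupStore`, `Conflict` and `transact` are made-up names, not GAE APIs, and it models the contention as a per-tree version check at commit time rather than a literal lock. Either way the effect on the app is the same: two writers touching different entities under the same parent collide, while writers on different parents do not.

```python
class Conflict(Exception):
    """Raised when another writer committed to the same group first."""


class GroupStore:
    """Hypothetical in-memory store with one commit counter per tree (group)."""

    def __init__(self):
        self.data = {}      # (group, key) -> value
        self.versions = {}  # group -> number of commits so far

    def begin(self, group):
        # Remember which version of the whole group this transaction saw.
        return self.versions.get(group, 0)

    def commit(self, group, seen_version, writes):
        # Any commit to the group since begin() invalidates this transaction,
        # even if it touched a completely different entity in the tree.
        if self.versions.get(group, 0) != seen_version:
            raise Conflict(group)
        for key, value in writes.items():
            self.data[(group, key)] = value
        self.versions[group] = seen_version + 1


def transact(store, group, fn, retries=3):
    """Run fn(read) and commit its writes, retrying on conflicts."""
    for _ in range(retries):
        seen = store.begin(group)
        writes = fn(lambda key: store.data.get((group, key)))
        try:
            store.commit(group, seen, writes)
            return True
        except Conflict:
            continue  # somebody else won; re-read and try again
    return False
```

With one hot parent (say, an article with all its comments as children), every comment write competes with every other one, which is exactly where the retries and timeouts come from.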
---
Even on GAE you do have to keep a normalized copy of your data, unless your data is really dumb. That's because you do not know in advance what kind of queries you will run, so it is better to have a single normalized source of truth and then build views asynchronously around it for whatever queries you want ... but in doing so, adding a single dumb view to your app is a huge PITA.
And even denormalization is a problem. Sure, you can store the comments on an article for a stupid blog in a single entity as a list, or tags, or whatever, but then GAE places hard limits on the amount of data stored in a single entity (about 1 MB). Every step of the way you'll have to think about the query you want and about how big your dataset is. You just can't throw the data in there and optimize it later.
With MySQL, a missing index just makes your app slower. On GAE, the query won't run at all.
---
Many simple queries become really complex. Say you want to do a radius search: with MySQL you can just do a stupid range query like lat BETWEEN LatX AND LatY AND lon BETWEEN LonX AND LonY and refine the results with a haversine distance check. That is not possible to optimize on GAE, where a query cannot have inequality filters on two different properties, so instead you have to go and experiment with bounding box searches.
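For reference, this is roughly what the MySQL-style approach boils down to, sketched in plain Python over an in-memory list instead of a table. The function names and the `(name, lat, lon)` tuple layout are my own for illustration, and the box calculation assumes you are not near the poles or the antimeridian.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def bounding_box(lat, lon, radius_km):
    """Return (lat_min, lat_max, lon_min, lon_max) enclosing the circle.

    Assumes the circle does not reach a pole or cross the antimeridian.
    """
    delta = radius_km / EARTH_RADIUS_KM  # angular radius in radians
    dlat = math.degrees(delta)
    dlon = math.degrees(math.asin(math.sin(delta) / math.cos(math.radians(lat))))
    return lat - dlat, lat + dlat, lon - dlon, lon + dlon


def radius_search(points, lat, lon, radius_km):
    """points is a list of (name, lat, lon); returns names within radius_km."""
    lat_min, lat_max, lon_min, lon_max = bounding_box(lat, lon, radius_km)
    # Step 1: the cheap range filter (the two BETWEEN clauses).
    candidates = [p for p in points
                  if lat_min <= p[1] <= lat_max and lon_min <= p[2] <= lon_max]
    # Step 2: exact distance check to drop the corners of the box.
    return [p[0] for p in candidates
            if haversine_km(lat, lon, p[1], p[2]) <= radius_km]
```

Step 1 is the part an RDBMS handles with ordinary indexes, and it is the part that needs two range conditions at once.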
And even if you have a perfectly optimizable query, GAE places hard limits on what you can retrieve. For example, you can retrieve only the first 5000 items of a dataset (that number is up from 1000 last time I tried it). This is again problematic -- with MySQL sometimes I want the first 20000 items and it isn't a problem. What if you want to update all the entities of a kind somehow? What if you want not the first 5000 items in an index, but the last 5000, something which isn't a problem with a classical RDBMS? Yeah, you have to do special gymnastics.
Go ahead and try building a classic admin, like the one Django provides, in which you have a paginated view of all the data in a single table, with filters and ordering on any column you want. Something that is trivial in a classical environment will suck your soul out until your eyes bleed.
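The "special gymnastics" usually end up looking like the following: walk the index one capped page at a time, remembering the last key you saw. This is a generic sketch in plain Python, not datastore code; `fetch_page` is a hypothetical stand-in for a capped query, and it assumes the keys are unique and sorted.

```python
import bisect


def fetch_page(sorted_keys, after_key, limit):
    """Hypothetical capped query: up to `limit` keys greater than after_key."""
    start = 0 if after_key is None else bisect.bisect_right(sorted_keys, after_key)
    return sorted_keys[start:start + limit]


def iterate_all(sorted_keys, page_size):
    """Yield every key by chaining capped pages together."""
    after = None
    while True:
        page = fetch_page(sorted_keys, after, page_size)
        if not page:
            return  # ran off the end of the index
        yield from page
        after = page[-1]  # resume strictly after the last key seen
```

It works, but notice that it only goes forward along one ordering, which is why arbitrary sorting and "give me the last N" in an admin view get painful.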
---
GAE puts hard limits on the amount of time allocated for each request or background task. Sure, they raised those limits, but it is not enough.
The fact of the matter is that you have absolutely no guarantee that processing from 1 to X items from the datastore will succeed in the time allocated. When I had to solve this, a background task sometimes processed 500 entities in a row, sometimes it managed to process only 10, sometimes it managed to process only 5 -- I shit you not. That's because the time GAE spends on other tasks like waking up an instance, initializing your framework and other things is factored in. Not only that but the datastore can have a really high latency, which is a problem GAE users have been bitching about since GAE was launched.
And so you have to do your own book-keeping of where you left off and whether the processing of an item succeeded, and in case it didn't you have to revert and start again (developing your own transaction mechanism that can roll back changes after the fact, like with the Memento pattern or shit like that -- again, like it is freaking 1970).
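The book-keeping ends up looking something like this. It is a minimal sketch in plain Python, with names I made up (`process_batch`, the `checkpoint` dict, the `handle` callback); in a real app the checkpoint would have to be persisted somewhere and the task re-enqueued, and that part is left out here.

```python
import time


def process_batch(items, checkpoint, handle, budget_seconds, clock=time.monotonic):
    """Process items from checkpoint['next'] until the time budget runs out.

    Returns True when every item has been handled, False when the caller
    should schedule another run. If handle() raises, the checkpoint still
    points at the failed item, so the next run retries it.
    """
    deadline = clock() + budget_seconds
    i = checkpoint.get("next", 0)
    while i < len(items):
        if clock() >= deadline:
            return False  # out of time; resume from the checkpoint later
        handle(items[i])
        i += 1
        checkpoint["next"] = i  # record progress only after success
    return True
```

Usage is a loop around it: keep calling `process_batch` with the same checkpoint until it returns True. Note that this only tracks where you left off; undoing the partial effects of a failed item is still on you.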
---
GAE puts hard limits on what you can actually do with it. The APIs are severely restricted.
If your app does anything interesting other than (1) serving content from the datastore, (2) fetching content from an external URL (but good luck doing crawling, hope you have enough money), (3) making thumbnails of images, (4) sending emails, but only through the special API as you can't do SMTP, or (5) sending XMPP messages using their special API, then you're shit out of luck.
And it's also a problem to offload whatever you can't do on GAE to your own Linux instance. That's because the datastore has severe limits on the number of requests you make and on the amount of data transferred, and if you're going through hundreds of thousands of entities, which is really not that many, updating them with fresh content from an external instance, then you need to prepare yourself for a really fat monthly bill.
I think bad_user covers most of it (although, to be fair, he misses the mark on some things too: there is no 5000-entity limit per query if you use the query as an iterator rather than calling fetch(), which is a much better idea generally), but it's not that deployment itself is difficult--that's the only easy part, frankly.
What's tedious is development, due to all the restrictions and extra concerns necessary. They may make total sense for Google but make zero sense for the rest of the world's applications.
As one simple example, everything you need to be consistent must be handled in a manual transaction but transactions are slow and prone to collisions with insane time-outs (e.g. a request takes 45 seconds because a transaction couldn't commit the first time). Don't even get me started on cross-entity transactions or parent/child relationships, both of which you can use to completely destroy all semblance of performance in an application. Something you take for granted every day on any other platform (transparent, fast consistency guarantees) is of constant consideration and concern on AppEngine--and the APIs presented for it are pedestrian.
These are not intractable problems; none of AppEngine's shortcomings are, really. But the huge number of man-hours spent dealing with its terrible APIs, restrictions, and astounding pre-optimization requirements could have been spent self-managing a much more user-friendly platform.