by tclancy 3323 days ago
I've got 10 years' experience on projects small and large and I have to agree. The title talks about building at scale but the article doesn't stress that which makes some of the advice downright weird.

>If you don't really understand the point of apps, ignore them and stick with a single app for your backend. You can still organize a growing codebase without using separate apps.

This is where the article lost me. If this is for building at scale, maybe, I don't know. I never hit a point where designing the project in apps became a problem. Regardless, if you don't know why Django wants to use apps, that suggests you are new to Django and probably not building at scale, so this feels like poor advice. Much of the article is telling readers to do things exactly contrary to Django's philosophy; the problem with that is there are lots of articles and StackOverflow answers out there based on Django's philosophy. There isn't a similar body of reference based on the authors' approach.

I don't know why explicitly naming your database tables is imperative for running at scale. Now we're breaking from Django's convention because some day we might want to stop using Django and we will be annoyed by its table-naming convention? Avoiding "fat models" is another place where it feels more like opinion than anything to do with performance or good design.

It would be good to know which database engine the authors are hitting such serious migration issues with. MySQL?


> Avoiding "fat models" is another place where it feels more like opinion than anything to do with performance or good design

So in the Java world, the general pattern is that:

Views:

  - Accept and sanitize query parameters

  - Call one or more service methods.

  - Catch errors and return an appropriate error response

  - Render a JSON response based on the results of the service methods if nothing goes wrong.

Service methods:

  - Perform business logic

  - Manage persistence

  - Bubble up errors

The nice thing about this architecture is that each piece of the codebase tells a complete story about what it's doing. That is, from looking at the view, you can see what parameters it accepts, how they are sanitized, what service method it calls, each of the errors that can be returned, and what the 200 response looks like.

And looking at the service method we can see what business logic it performs, and what the database queries look like.

In each case there isn't any reason to look at other methods to understand the 'story' of what's happening in your app. This makes it very easy to read the codebase and audit it for correctness.
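
To make the split concrete, here is a minimal, framework-free sketch of one view and one service method. Every name in it (`ServiceError`, `get_order`, `order_view`, the in-memory `_ORDERS` dict) is made up for illustration; a real project would use its framework's request and response objects and a real database.

```python
import json


class ServiceError(Exception):
    """Hypothetical error type raised by service methods, with an HTTP-style status."""

    def __init__(self, message, status=400):
        super().__init__(message)
        self.status = status


# Stand-in for a database table (assumption for this sketch).
_ORDERS = {1: {"id": 1, "item": "taco", "quantity": 2}}


def get_order(order_id):
    """Service method: business logic and persistence; errors bubble up."""
    order = _ORDERS.get(order_id)
    if order is None:
        raise ServiceError("order not found", status=404)
    return order


def order_view(query_params):
    """View: sanitize input, call the service, map errors, render JSON."""
    try:
        order_id = int(query_params.get("id", ""))
    except ValueError:
        return 400, json.dumps({"error": "id must be an integer"})
    try:
        order = get_order(order_id)
    except ServiceError as exc:
        return exc.status, json.dumps({"error": str(exc)})
    return 200, json.dumps(order)
```

Reading `order_view` alone shows the accepted parameter, the sanitization, the service call, each error response, and the 200 response, which is the "complete story" property described above.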

The problem with fat models is that they're not telling a story about what's actually happening in the app, e.g. looking at them doesn't tell you anything about the business logic the endpoints are performing. And what's worse, you also can't look at the views or services and know what they're doing either.

As someone who strongly prefers Python and Django over the Java ecosystem, I'll say hands down that in terms of how web apps are architected they got it right and the Django people got it wrong. As far as I can tell the whole Domain Model Architecture thing seems like a bunch of bullshit that was invented to sell consulting. If the advocates of this approach can't even write a coherent Wikipedia article, it should give you a clue as to what the code ends up looking like. [1]

[1] https://en.wikipedia.org/wiki/Domain-driven_design

Yeah, I don't disagree with that at all. I came to Django from C# after playing with Ruby on Rails a little bit and the lack of an explicit Controller in Django confused me and I think it is part of the driver behind the "fat model" approach. I like the idea of the logic for the business object being inside it and all testable on its own but I think it has its limits: thinking about my own Django codebases, the number of class/static methods I have on models is a code smell from me learning OOP on C# where I had to stick those methods some place.

Is there a good place or pattern for service methods in Django? I've got some very fat models right now, and it's a DRY improvement over having the fat in the views, but like you say it takes a lot of effort to trace what's going on.

Let's say you're following the approach of breaking down your project into separate apps, so you have an app called user_accounts. This app would be a folder containing files like:

  views.py, services.py, models.py, test_views.py

So in views.py you'd have a User class, with:

  A POST method that calls services.create_user(username, email_address, password)

  A GET method that calls services.get_user_profile(request.user)

  A PUT method that calls services.update_user_profile(x, y, z)

  A DELETE method that calls services.inactivate_user(request.user)

The return value of each of these views can just be whatever services.get_user_profile(request.user) returns, rendered into JSON.

Then each of the services performs whatever business logic it needs to, preferably directly in the method. But if it would be more readable split into multiple methods, then you can create some private helper methods in the services file prefixed with an underscore. You can also have a separate folder somewhere for utility functions meant to be reused across the app, e.g. get_user_emails(request.user, is_active=True, is_verified=True)

Basically though each view sanitizes the data, e.g. strips XSS out of strings, makes sure booleans are actually booleans, etc.

Then each service first does field-level validation with serializers, e.g. ensuring that usernames meet the appropriate requirements for usernames. Next if there is other business logic validation that needs to happen, it happens, e.g. making sure that only users with verified email addresses can perform certain actions.

After that you perform the actual business logic, e.g. transforming any data. Then you perform your CRUD operation, e.g. creating a user model. And lastly you return something, e.g. returning the user model.

Each endpoint and service method can be written pretty much following this pattern, which makes the codebase super readable because once you understand one endpoint you understand all of them. And the service methods are the reusable component of the architecture, so e.g. if you want the ability for admins to create users, then they are created with the exact same service method. (But called from your admin endpoints/services.)

That's almost exactly what I do, except for using a big monolithic app for the entire backend (called "core"), and making "services" a package with several modules inside.

It also resembles very much what I see in Java projects which use DDD (Domain Driven Design).

What's your take on the article's point that you should have fewer rather than more Django apps (citing the problem of inter-app-FKs)?

So my startup is built the way you describe, in terms of just having one main app, and I personally prefer this style.

The basic argument in favor of breaking down the Django project into multiple apps is that it makes the components more decoupled and reusable. But personally I think this is bullshit. If you want your apps to be reusable and decoupled then you need to put a ton of time into architecting them this way, the idea that you're going to get these benefits just from putting stuff into different folders is magical thinking. It seems like pretty much the textbook example of cargo cult programming.

That said for the client I'm currently working for, the decision was made to do it the 'standard Django way' in terms of breaking it into multiple apps. So far I haven't run into any issues here. I like it slightly less because I think having all the views in the views folder, and all the services in the services folder makes folks more likely to reuse code just by making it easier to find. But yeah, so far no real problems, but I'm also not expecting to see any magical benefits either.

I'm not aware of architectural patterns for service methods in Django (would like to find some as well), but what I did was to somewhat mimic a Java structure.

All the project is in one single app, which I unimaginatively called "core", and inside this app there's a "services" Python package (i.e. a folder with a __init__.py file inside). These have roughly one Python module (.py file) for each "category" of services. For example, there's thin layers like "user_service.py" (basically passes through to the relevant models), to more complex services like "dependency_x_integration_service.py", which connects to external service "X" and pulls some relevant data (say, user interaction datapoints), and bridges them to the models in the system.

We do roughly this where I work, so I broadly agree.

That said, it's fairly common for the unit of reuse to be below service methods. Also, depending upon how exactly you manage transactions, another thing to look out for is making multiple non-idempotent service calls from the view - this will be an area ripe for race conditions you likely aren't testing.
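
One way to read the transaction point: if a view makes two service calls that each commit on their own, a failure in the second leaves the first one applied. The sketch below uses the standard-library sqlite3 module and a made-up `accounts` table to show the alternative, a single service method that wraps both writes in one transaction. It only illustrates avoiding partial writes; it does not by itself address concurrent requests.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical accounts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 0)])
    conn.commit()
    return conn


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


def transfer(conn, src, dst, amount):
    """One service method, one transaction: both updates commit or neither does."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        # Violates the CHECK constraint if src cannot cover the amount.
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
        )
```

If the view had called a separate credit service and debit service, each committing independently, the failed debit would have left the credit in place.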

> it's fairly common for the unit of reuse to be below service methods.

In terms of utility functions or serializers? What does that look like exactly?

E.g. in our codebase service methods can call helper methods (non-reusable), utility methods (reusable), and serializers (non-reusable).

I agree the DoorDash article gets some stuff right and most stuff wrong, almost to the point where it's difficult to read. But (somewhat tangentially) I admit I have struggled in the past with separating out Django apps for reasons not mentioned in the article.

Specifically, say I have two apps, with a second more specific app heavily dependent on a first more general app. What I find in this scenario is that I sometimes need hooks into the general app from the specific app, which means that I wind up importing modules from the specific app into the general app. This hasn't generally been a showstopper in my experience, but it creates some friction because:

a) I would prefer for the general app to have no dependencies on the specific app

b) This results in circular imports (which can themselves be addressed, but this is an implementation detail I would prefer not to have to worry about)

I realize these issues can be mitigated with signals, but I try to use signals sparingly for various reasons (https://code.djangoproject.com/ticket/16547#comment:2). It also helps that foreign keys can be expressed using a string literal rather than the actual model, but in the end, I still occasionally run into situations I don't feel great about.

Please note that I'm not advising against separating out functionality into apps. Instead, I'm merely citing an issue about having multiple apps that bothers me.

Agreed. I've built and maintained a moderate size Django app for 5 years now, and had similar issues.

GOOD app division: my "members" app which has classes for UserProfile, MemberType, and communicates with an upstream 3rd party membership API. It doesn't have any dependencies, but a bunch of other apps depend on it. Another one I've just started work on is a generic Questionnaire/Survey app. This one is a definite candidate for spinning out as an open-source third-party app later, it lets you attach Questions and Answers to any of your own model objects via generic relations.

BAD app division: I have separate "Entry" and "EntryHandling" models across two apps, the latter is a OneToOne with an Entry. Originally this was a separation of concerns, but it's become a mess. Like the parent, the generic app ends up depending on the specific app, and migrations have to be handled gently and sometimes manually edited.

If you treat your Django apps as points that would be logical for splitting as micro-services, you'd probably be just fine.

> I never hit a point where designing the project in apps became a problem.

The article literally describes why this is a problem in the first place. Cross-app model relations are a PITA, and splitting "sections" of your site into separate apps often has you end up with cross-app relations.

The more general point here is that the functional separation between Django apps and the logical separation between "parts" of your site often do not match up, and thus you should be careful about splitting up your site into multiple apps.

In my experience, this is absolutely true and happens often. The article recommends separating the parts of your site into modules and packages within a single app, which is a great idea and something the Django docs don't make obvious as a choice.

>Cross-app model relations are a PITA

A PITA how? They say they ran into migration issues due to the apps approach but it sounds like they ran into issues due to the sheer number of migrations happening across a bunch of developers. That sounds like a likely problem on big teams, but I don't think it's one best solved by not using the app approach and I don't think it's one whose underlying problem is ForeignKeys to models in other apps. Again, it would be nice to know if this was on MySQL or a different database as what finally caused me to move to Postgres 8 years or so ago was the heartburn of migration on MySQL. I think I've run into one or two migration knots since then and they were both due to me moving a little too fast.

And at scale, I would assume you aren't actually running the migrations but generating SQL from them and running that. You could still run into the same problems, but you could sort them out by hand when you do. Not the best answer, but from the article it feels like a more formalized/strict approach to who gets to modify the database and when would be good.

>splitting "sections" of your site into separate apps often has you end up with cross-app relations.

It also encourages you to do some up-front data modeling which is a skill that gets rarer as ORMs get more common.

/old man yells at cloud

Don't see how up-front modelling would have predicted the high planes of Unicode & MySQL's utf8 encoding being supplanted by utf8mb4 so users of your product could message taco emoji to each other.
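
For context on why the emoji matters: MySQL's legacy utf8 character set stores at most 3 bytes per character, which covers only the Basic Multilingual Plane, while the taco emoji (U+1F32E) sits above it and needs 4 bytes in UTF-8. A quick check in Python, where `fits_mysql_utf8` is a made-up helper name:

```python
taco = "\U0001F32E"  # TACO emoji, a code point above U+FFFF
encoded = taco.encode("utf-8")


def fits_mysql_utf8(text):
    """True if every code point is in the BMP, i.e. needs at most 3 UTF-8 bytes."""
    return all(ord(ch) <= 0xFFFF for ch in text)


print(len(encoded))            # number of UTF-8 bytes for the emoji
print(fits_mysql_utf8("taco"))
print(fits_mysql_utf8(taco))
```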

I have an agenda against Getting It Right First Time, Every Time, since it encourages brittle code that's a struggle to adapt to new, seemingly-similar use cases.

I don't understand how that first sentence is relevant. How would using apps organization vs the approach described in the article protect you from making the decision to change your database's underlying encoding?

One of the projects I worked on in the past had a legacy app which contained most of the core logic... and as the site grew, new apps were created because it got impossible to manage that one massive app. Splitting it into multiple pieces was incredibly painful because the codebase wasn't designed for that... so while I think this may help you get up and running faster, it will probably cause problems down the line.