by pistolpeteDK 2811 days ago
Not trying to split hairs, but we have no problem maintaining our 50K LOC python/django codebase.

What I like about Python is the ease with which you can lay out a project to fit your needs. Maintaining a large codebase is more about layout/architecture than the language itself.

Maybe other languages are simply better suited to programmers who don't feel the need to keep their codebase clean and tidy :)

4 comments

This is totally true. We are trying to maintain a codebase that is hundreds of thousands of lines of code, based on Drupal 5. It's a fucking nightmare.
I have worked on my fair share of Python monstrosities. To be fair I think I have seen worse in PHP, but I don't think the language itself is going to save you from shit architecture.
Aye but it can be a factor in how easy it is to revise a shit architecture ...
> Aye but it can be a *refactor in how easy it is to revise a shit architecture ...

:)

[flagged]
I used to use Drupal and it was a nightmare. I switched to Django just because of Python and it was the best decision I’ve ever made with regard to web development.
Similarly, we have a ~125k loc java service and a 47k loc one and they're both real, honest burdens to maintain.
Yes, but with Java, a good IDE can answer questions like:

- Where is this method being used?
- If I delete or change this method, will anything break?
....

Sure, and that's true and if I were to set out to do a large service now I'd use something with typing. My last few net-new 10k+ loc projects were all typescript and I'll probably stick with that approach for a while.

The main issues with large codebases are often not feature-based but rather maintenance and operational overhead related: undiscoverability, config hell, dependency hell, and scaling issues - and typing helps a little bit with one of those.

With python, a good IDE should be able to find usages.

If you delete a method that has usages without searching for them, you're asking for trouble. Sure it's not a compilation error in python, but compilation can only show that the method is missing, not that changes are guaranteed to work.
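A minimal sketch of that failure mode, with made-up class and function names: the stale call site loads fine and only blows up when that code path actually runs.

```python
class Invoice:
    def __init__(self, amount):
        self.amount = amount

    # Hypothetical: total_with_tax() used to live here and was deleted.


def print_receipt(invoice):
    # Stale call site. Nothing complains until this line executes.
    return invoice.total_with_tax()


def rarely_used_path(invoice, run_it):
    # Stands in for a branch that tests or manual checks seldom reach.
    if run_it:
        return print_receipt(invoice)
    return "skipped"


invoice = Invoice(100)
print(rarely_used_path(invoice, run_it=False))  # prints "skipped"

try:
    rarely_used_path(invoice, run_it=True)
except AttributeError as exc:
    print("failed at runtime:", exc)
```

The module imports and the first call succeed, so the breakage stays hidden until something exercises the second path.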

With R#, any method that is not referenced is gray and you can choose to display a warning.

If I delete a method, I automatically get a big red dot in the corner showing compilation errors (without compiling).

If I change a type from string to int in both my method and method signature, it automatically tells me every place that would break.
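A rough Python analogue of that last case, assuming type annotations plus an external checker such as mypy (the function names are made up for illustration). The interpreter itself ignores the annotations, so without the checker the mismatch only shows up when the call runs.

```python
def format_order_id(order_id: int) -> str:
    # Hypothetical: the parameter was changed from str to int.
    return f"ORD-{order_id:06d}"


def legacy_caller() -> str:
    # A static checker is expected to flag this call, because a str is
    # passed where an int is annotated. Plain Python accepts the code and
    # raises ValueError only when the format spec meets the str.
    return format_order_id("42")


print(format_order_id(42))  # prints "ORD-000042"

try:
    legacy_caller()
except ValueError as exc:
    print("only caught at runtime:", exc)
```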

How do you find usages in a provably correct way in a dynamic language?
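One illustration of why this is hard, as a sketch with hypothetical names: when the method name is assembled at runtime, there is no literal reference for a usage search to match.

```python
import json


class ReportExporter:
    def export_csv(self, rows):
        return ",".join(str(r) for r in rows)

    def export_json(self, rows):
        return json.dumps(rows)


def export(exporter, fmt, rows):
    # The attribute name is built from a runtime string, so a textual or
    # AST-based search for "export_csv" finds no call site in this function.
    handler = getattr(exporter, "export_" + fmt)
    return handler(rows)


print(export(ReportExporter(), "csv", [1, 2, 3]))  # prints "1,2,3"
```

Renaming or deleting `export_csv` here would look safe to a usage search and still break any caller that passes `"csv"`.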

When it comes to refactoring, there are so many automated, guaranteed safe refactors you can do with a statically typed language that you can’t do with a dynamic language.

I can’t understand why developers would want to take a tool out of their tool belt like static checking.

I’m a fan of Python as a low overhead scripting language but for large systems with multiple developers, static type checking is a godsend.

> 50K LOC python/django codebase.

How much of it is generated?

I would guess none. Code generation is very rare in the Python community. Mainly because it's generally unnecessary given the nature of the language.
Sorry for the bluntness of my question: I have only a little experience with Python, which I acquired while trying out Django, and I remember using scripts to generate certain parts of the site (things might have changed since then?).

To be fair, Django is pretty good: iirc, it induces good discipline for development, which is perhaps why it is so easy to handle a large codebase.

On the other hand, the business logic is outside the scope of Django (or any other framework), and there, if the devs do not follow good practices and develop the right tools, it might be more difficult to be productive.

The way I see it, people using dynamic languages have much more freedom in development, but also fewer safeguards, which means that they have to build them to make their code more robust. Having code generators is a way of getting back some safety, because you only have to make sure the generators are producing correct code.

Custom code, however, needs to be thoroughly tested to be considered safe, and if the compiler doesn't do any type-level consistency check, it's that many more tests that need to be written, which adds up to the LOC count.

I frequently use code generation in Python. Why write thousands of lines of boilerplate code in any language when you can just generate it? Maybe I'm just lazy, but the results are better and more consistent, and it gets things done.
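A minimal sketch of what that can look like, using only the standard library's `string.Template`. The template, `generate_class`, and the field names are all invented for illustration and are not any particular project's generator.

```python
from string import Template

# Hypothetical template for a plain data-holder class.
CLASS_TEMPLATE = Template('''\
class ${name}:
    def __init__(self, ${args}):
${assignments}
''')


def generate_class(name, fields):
    """Return Python source for a simple class with the given fields."""
    args = ", ".join(fields)
    assignments = "\n".join(f"        self.{f} = {f}" for f in fields)
    return CLASS_TEMPLATE.substitute(
        name=name, args=args, assignments=assignments
    )


source = generate_class("Customer", ["name", "email"])
print(source)  # the generated source, ready to write to a file and tweak
```

The generated text is ordinary source code, so it can be committed, reviewed, and edited by hand afterwards.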
I would regard anything other than tiny amounts of boilerplate to be a code smell in most languages.

Code generation has so many downsides and Python is so dynamic I can't think what you're doing that couldn't be done better without code generation.

Weird, typically you can use metaclasses or mixins to remove such boilerplate.
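For example, a sketch of the mixin route with made-up class names: the shared behavior is written once and each concrete class only declares its own fields.

```python
class ReprMixin:
    """Builds __repr__ from instance attributes so each class need not."""

    def __repr__(self):
        fields = ", ".join(f"{k}={v!r}" for k, v in vars(self).items())
        return f"{type(self).__name__}({fields})"


class ToDictMixin:
    """Adds a shallow dict export based on instance attributes."""

    def to_dict(self):
        return dict(vars(self))


class Customer(ReprMixin, ToDictMixin):
    def __init__(self, name, email):
        self.name = name
        self.email = email


class Product(ReprMixin, ToDictMixin):
    def __init__(self, sku, price):
        self.sku = sku
        self.price = price


print(Customer("Ada", "ada@example.com"))
print(Product("X1", 9.5).to_dict())
```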
I utilize that too, but why not generate new models with forms, CBVs, DRF endpoints, serializers, etc.? You can pick and choose what you need, add the mixins and move on. Plus I use cookiecutter to generate the initial project in a much more fitting and featureful structure.
I've been a Django developer for years and—barring Django migrations—I've never 'generated' a single line of code.
Honestly, you are missing out. I've been developing in Django for years and can't imagine not generating code anymore. Generate, tweak a bit, and the results are the same in a lot less time.