Ask HN: What are some architectural decisions that improved your codebase?

	98 points by happycoder97 2508 days ago
	Dear senior developers on HN, What are some examples of design choices that helped you reduce the effort needed to change your code according to change in requirements? What are some of the architectural choices you made that made your codebase easier to work with?

24 comments

ncmncm 2507 days ago

Eliminate threads, queues, locks, buffer allocate & free, copying, system calls, synchronous logging, file ops, dynamic memory allocation.

Replace with huge-page mapped ring buffers, independent processes, kernel-bypass set-and-forget, buffer lap checks, file-mapped self-describing binary-formatted stats, direct-mode disk block writes, caller-provided memory.

rramadass 2507 days ago

You can't just tease us like that! I demand some details on every one of the techniques in there ;-)

>Replace with huge-page mapped ring buffers, independent processes, kernel-bypass set-and-forget, buffer lap checks, file-mapped self-describing binary-formatted stats, direct-mode disk block writes, caller-provided memory

Please elaborate.

ncmncm 2507 days ago

Well, OK. All this is about high-throughput, low-latency systems.

The principle is decouple, decouple, decouple. Memory isn't just memory, it's paged and mapped, and the mappings are in a small cache called the TLB, one for each core. Each "hugetlb" page, 2MB or 1GB on x86, takes just one such cache entry, so anything big, like buffers, should live in hugepages.

A ring buffer is a kind of queue with just a head, and one writer. Each new item goes at the next place in the buffer, round-robin. A head pointer -- if it's in shared memory, an index -- gets updated "atomically" to point to the newest item. Downstream readers poll for updates to the head. New stuff overwrites the oldest stuff, so downstream readers can look until it gets overwritten, and can often avoid copying. They don't need to lock anything, but need to check that the head hasn't swept in and overwritten what they were looking at; that is called being lapped. It is their responsibility to keep up, and prevent this.
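
A minimal single-process sketch of the lap check described above, written in Python for readability rather than speed. The names `Ring`, `publish`, `read` and `Lapped` are made up for illustration; a real implementation would keep the slots in shared memory and update the head with an atomic store.

```python
class Lapped(Exception):
    """The writer overwrote (or may be overwriting) the entry being read."""


class Ring:
    def __init__(self, size: int) -> None:
        self.slots = [None] * size
        self.size = size
        self.head = 0  # count of items published so far; only the writer changes it

    def publish(self, item) -> None:
        # Single writer: store the item first, then advance the head.
        self.slots[self.head % self.size] = item
        self.head += 1

    def read(self, pos: int):
        """Return the item published at position `pos`, or None if not there yet."""
        if pos >= self.head:
            return None  # nothing new; the reader simply polls again later
        item = self.slots[pos % self.size]
        # Lap check AFTER looking at the slot. Once the head is a full ring
        # ahead, the writer's next store targets this very slot, so the
        # conservative rule treats that case as lapped too.
        if self.head - pos >= self.size:
            raise Lapped(f"position {pos} is no longer safe to read")
        return item


ring = Ring(4)
for n in range(6):  # positions 0..5; the oldest slots get reused
    ring.publish(f"msg{n}")
```

With six items published into four slots, positions 3 to 5 are still readable, position 6 is not there yet, and anything older raises `Lapped`, which is the reader's cue that it failed to keep up.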

Because there is never any question where the next entry goes, hardware devices understand ring buffers, and can be set to write to them whenever there is data. Typically a proprietary library talks to a proprietary driver to set this up, and then the hardware device runs free with no more interaction. (io_uring, AF_XDP, libexanic, ef_vi, DPDK, PF_RING, netmap, etc.)

Usually the hardware ring buffer is pretty small, a few MB, so for high-rate flows there might be cores dedicated to copying from it to one or more much, much bigger ring buffers in shared, mapped memory. Typically, multiple downstream readers watch for interesting traffic to show up on such a ring, splitting the work out to multiple cores.

Threads famously interfere with one another, mainly when competing for locks; but also, whenever they fool with the memory map, other threads may experience TLB cache stalls. Separate programs are better isolated, and can be further isolated by running on a dedicated core ("isolcpus", "NOHZ", and "taskset") that is protected against the OS sticking other threads on it, or vectoring interrupts to it. In extreme cases a core may offload its own RCU retirements, or even not run any kernel code.

A unikernel may run on such a core, running a single program, so what it thinks are system calls just call a static library. There is a lot of work going on on variations on this theme -- exokernels, parakernels, etc.

Instead of getting the file system and buffer cache all mixed up in your program, you can append to files with O_DIRECT writes, or store to mapped memory and let the kernel expose it to other processes, and spool to disk, asynchronously. A monitoring process can look at event counters in such memory as they are updated in real time. It is generally better if the program updating the counters also stores a generic description of them -- type, name, a hierarchical structure that can be read out to a JSON record, periodically, by a separate program. That might be written to a log and/or feed a status dashboard. Thus, the code doing the work just updates memory words pointed to from its working configuration, but doesn't ever need to format or write out updates. If there is any actual text logging, it goes through another ring buffer to a background logging process that, ideally, is also responsible for formatting.
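
A rough sketch of the self-describing stats idea above, assuming a made-up layout: a fixed-size JSON description at the start of a file-backed mapping, followed by 64-bit counters. The counter names, `HEADER_SIZE`, `bump` and `snapshot` are all hypothetical. Here both sides live in one process; in practice the monitor would be a separate program mapping the same file read-only.

```python
import json
import mmap
import struct
import tempfile

HEADER_SIZE = 256               # fixed room for the JSON description (assumed layout)
FIELDS = ["packets", "drops"]   # hypothetical counter names

# Writer side: lay out the file once at startup, then only touch memory words.
backing = tempfile.TemporaryFile()
backing.truncate(HEADER_SIZE + 8 * len(FIELDS))
stats = mmap.mmap(backing.fileno(), 0)
schema = [{"name": n, "type": "u64", "offset": HEADER_SIZE + 8 * i}
          for i, n in enumerate(FIELDS)]
header = json.dumps(schema).encode()
stats[:len(header)] = header


def bump(index: int, by: int = 1) -> None:
    """Hot path: add to one 64-bit counter in place; no formatting, no write calls."""
    off = HEADER_SIZE + 8 * index
    (old,) = struct.unpack_from("<Q", stats, off)
    struct.pack_into("<Q", stats, off, old + by)


def snapshot(mem) -> str:
    """Monitor side: read the description, then render the counters as JSON."""
    desc = json.loads(bytes(mem[:HEADER_SIZE]).rstrip(b"\0"))
    return json.dumps({f["name"]: struct.unpack_from("<Q", mem, f["offset"])[0]
                       for f in desc})


bump(0, 3)
bump(1)
```

The worker never formats anything; `snapshot(stats)` is the only place that knows about JSON, and it learns the counter names from the mapping itself.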

Memory management -- new and delete -- is a source of unpredictable delays. Such allocations are always OK during startup, but often not after. A function that needs memory, then, should use memory provided by its caller. The top level can handle memory deterministically, pre-allocated or on the stack, with a global view of program behavior.
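
The shape of a caller-provided-memory API can be sketched as follows. This is only an illustration of the calling convention: Python still allocates small objects behind the scenes, and the record layout and the name `encode_into` are assumptions, not anything from the systems described above.

```python
import struct

RECORD = struct.Struct("<QI")   # hypothetical record: 64-bit id, 32-bit quantity


def encode_into(buf: memoryview, record_id: int, qty: int) -> int:
    """Write one record into caller-owned memory; return the bytes used."""
    RECORD.pack_into(buf, 0, record_id, qty)
    return RECORD.size


# The top level allocates once, at startup, and hands out slices of it.
arena = bytearray(RECORD.size * 1024)
view = memoryview(arena)

used = 0
for record_id in range(3):
    used += encode_into(view[used:], record_id, qty=10 * record_id)
```

The function never decides where its output lives, so the top level keeps a global view of how much memory the program can ever use.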

Using separate processes enables starting and stopping downstream processing independently, and isolates crashes. Ring buffers being read are always mapped read-only, so a crashed reader cannot corrupt any shared state.

rramadass 2506 days ago

This is great! These are some concrete and non-trivial architectural techniques for "Systems Programming" :-)

I have had the opportunity to work on/with some of these techniques on Fast Network Protocol/Security Appliances and so have some familiarity with them. However, some of your hints (breadcrumbs?) are not known to me, and hence I have something to research and study. Thank you.

PS: Can you add some more details on the above techniques? Like System/Library/API calls to look into, books/papers/articles to read etc?

ncmncm 2506 days ago

Another hint.

Speaking of breadcrumbs, if the ring buffer has fixed-size entries, a reader can come in later and start reading old entries first, say halfway back. This is helpful if you want to start a new reader and then kill an old one, and not skip any entries.

It helps if the ring has power-of-two size, and the head pointer/index is 64 bits and increases monotonically. Then the high bits are easily masked off on each use, so that arithmetic on pairs of positions is simpler.
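
The position arithmetic this enables might look like the sketch below. `RING_SIZE` is an assumed value and the helper names are invented; Python integers do not wrap, so the 64-bit wraparound is emulated with a mask.

```python
RING_SIZE = 1 << 10          # power of two (assumed value)
MASK = RING_SIZE - 1
U64 = (1 << 64) - 1          # emulate unsigned 64-bit wraparound


def slot(pos: int) -> int:
    """Map a monotonically increasing position to a slot index."""
    return pos & MASK


def lag(head: int, pos: int) -> int:
    """How far a reader at `pos` is behind `head`, correct across u64 wraparound."""
    return (head - pos) & U64


def halfway_back(head: int) -> int:
    """Starting position for a new reader that wants to replay half the ring."""
    return max(0, head - RING_SIZE // 2)
```

Because positions only ever increase, comparing two readers or checking for a lap is a single subtraction; the mask is applied only at the moment a slot is actually touched.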

For variable-sized entries, an array of N "breadcrumbs", past positions near 1/Nth indices, allows jumping in at earlier positions. If traffic is low enough, you might be able to buffer a whole day's traffic, and get random access starting from breadcrumbs; otherwise, you can log old entries to a sequential file, and also log the breadcrumbs, translated to file offsets, as a global index.
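
A simplified sketch of breadcrumbs, using an append-only byte log (closer to the sequential-file variant than to a true ring) with length-prefixed records. `STRIDE`, `CrumbLog` and the 4-byte length prefix are assumptions made for the example.

```python
import struct

STRIDE = 64  # minimum distance in bytes between breadcrumbs (assumed)


class CrumbLog:
    def __init__(self) -> None:
        self.data = bytearray()
        self.crumbs = [0]  # offsets at which a record is known to start

    def append(self, payload: bytes) -> None:
        start = len(self.data)
        # Drop a breadcrumb at the first record boundary past each stride.
        if start >= self.crumbs[-1] + STRIDE:
            self.crumbs.append(start)
        self.data += struct.pack("<I", len(payload)) + payload

    def read_from(self, offset: int):
        """Yield payloads starting at `offset`, which must be a record boundary."""
        while offset < len(self.data):
            (length,) = struct.unpack_from("<I", self.data, offset)
            offset += 4
            yield bytes(self.data[offset:offset + length])
            offset += length


log = CrumbLog()
for n in range(20):
    log.append(b"x" * n)  # variable-sized records
```

A reader that joins late cannot guess where a variable-sized record starts, but any entry in `log.crumbs` is a safe place to begin parsing.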

Downstream processes can each sequentially log an individual field of each record, with a breadcrumb index to enable full records to be reconstructed. Often these column logs can be compressed, with enormous efficiency, between breadcrumbs: 98% compression may be easy to achieve for slowly-changing or limited-alphabet values.

Lz4 and Zstd are excellent compression engines. Lz4 really shines for fast decompression. There is no excuse for zlib/gz compression anymore.

rramadass 2503 days ago

Thanks for the hints. For one product that i had worked on, we had something like the above for runtime logging/debugging. A shared memory area (i.e. address range in Linux process space where shared libraries are loaded) at a fixed address was reserved via linker scripts with each process having its ring buffer at its own fixed offset. A separate reader process interacted with the CLI to provide comprehensive access to this data. It was all robust and worked quite well.

Unfortunately, these sorts of practical techniques are not known to many programmers and it would be nice if somebody (eg. you :-) were to list it on a website/book with some sample code for everybody's benefit.

non-entity 2506 days ago

This is all fascinating. Do you have any recommended reading on designing said high-throughput, low-latency systems?

rramadass 2506 days ago

You might find the following useful.

* Network Algorithmics: An Interdisciplinary Approach to Designing Fast Networked Devices - https://www.amazon.com/Network-Algorithmics-Interdisciplinar...

* See MIPS Run - https://www.amazon.com/Morgan-Kaufmann-Computer-Architecture...

* UNIX Systems for Modern Architectures: Symmetric Multiprocessing and Caching for Kernel Programmers - https://www.amazon.com/UNIX-Systems-Modern-Architectures-Mul...

* Advanced UNIX Programming - https://www.amazon.com/Advanced-UNIX-Programming-Marc-Rochki...

ncmncm 2506 days ago

I don't know of any. Maybe the people who know this are too busy building them. I might be in trouble for writing as much as I did. :-)

ArtWomb 2507 days ago

Was about to comment that you should write a Book, ncmncm. But I think it'd be better if you just wrote a new Operating System ;)

benologist 2507 days ago

I have been making my test suite emit structured data for the API tests which is used to document the API. This eliminated the margin for error in manually keeping the API documentation up to date. This improved the test coverage a lot as complete coverage is required for the documentation to be complete. It looks great too -

https://github.com/userdashboard/organizations/blob/master/a... derived from https://github.com/userdashboard/organizations/blob/master/t...
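
The general idea of tests emitting structured data that later becomes documentation can be sketched like this. It is not the linked project's code; `record_example`, `render_docs` and the `/api/organizations` endpoint are hypothetical stand-ins.

```python
import json

API_EXAMPLES = []  # structured records collected while the tests run


def record_example(method: str, path: str, request: dict, response: dict) -> None:
    """Called from inside a test once its assertions have passed."""
    API_EXAMPLES.append({"method": method, "path": path,
                         "request": request, "response": response})


def create_organization(request: dict) -> dict:
    """Stand-in for the real API under test (hypothetical)."""
    return {"id": 1, "name": request["name"]}


def test_create_organization() -> None:
    request = {"name": "Acme"}
    response = create_organization(request)
    assert response["name"] == "Acme"
    record_example("POST", "/api/organizations", request, response)


def render_docs(examples: list) -> str:
    """Turn the recorded examples into plain-text documentation."""
    lines = []
    for ex in examples:
        lines.append(f'{ex["method"]} {ex["path"]}')
        lines.append("  request:  " + json.dumps(ex["request"]))
        lines.append("  response: " + json.dumps(ex["response"]))
    return "\n".join(lines)


test_create_organization()
docs = render_docs(API_EXAMPLES)
```

An endpoint without a test then shows up as a hole in the generated documentation, which is what ties documentation completeness to test coverage.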

Another thing that helped was moving all my UI page tests to Puppeteer which is a NodeJS API for browsing with Chrome and tentatively Firefox web browsers. This let me automatically generate screenshots for my entire UI to publish as documentation, while simultaneously testing the responsive design under different devices which surfaced many issues.

https://userdashboard.github.io/administrators/stripe-subscr... generated by https://github.com/userdashboard/userdashboard.github.io/blo...

alzoid 2507 days ago

I did the same thing with Puppeteer when I had to do a Bootstrap upgrade. It was easier to generate screenshots at each breakpoint to make sure the pages looked OK.

domlebo70 2507 days ago

Interesting idea. I wonder if this could be incorporated into some sort of test runner for testing API components. Emit a structured summary of responses that the tests generate

2rsf 2507 days ago

> moving all my UI page tests to Puppeteer

Why did you choose Puppeteer over other options like Protractor/Selenium or Test Cafe ?

benologist 2507 days ago

I already had some familiarity with Puppeteer but mostly it's just because Puppeteer's NodeJS and my project's NodeJS, they work together without extra setup steps, configuration etc.

gitgud 2507 days ago

Stateless components, or as I like to call them dumb components.

We found it much easier to reason about logic in the code base with having many small dumb components, which didn't have any state or complex functionality. These would be controlled by a few smart parent components to coordinate them.

The result was a lot cleaner. We implemented this on a Web client, but I think the concept would work well in any codebase.... dumb classes are easier to understand
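
Outside any UI framework, the split between dumb components and a smart parent might look like the sketch below. The to-do example and all names in it are invented for illustration.

```python
from dataclasses import dataclass, field


def render_item(label: str, done: bool) -> str:
    """Dumb component: output depends only on its arguments."""
    return f"[{'x' if done else ' '}] {label}"


def render_list(items: list) -> str:
    """Another dumb component, composed from the one above."""
    return "\n".join(render_item(label, done) for label, done in items)


@dataclass
class TodoScreen:
    """Smart parent: the only place that owns and changes state."""
    items: list = field(default_factory=list)

    def add(self, label: str) -> None:
        self.items.append((label, False))

    def toggle(self, index: int) -> None:
        label, done = self.items[index]
        self.items[index] = (label, not done)

    def render(self) -> str:
        return render_list(self.items)


screen = TodoScreen()
screen.add("write tests")
screen.add("ship")
screen.toggle(0)
```

The render functions can be tested with plain arguments, and every state change is found in one class.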

rumanator 2507 days ago

> Stateless components, or as I like to call them dumb components.

You mean like pure functions?

https://en.m.wikipedia.org/wiki/Pure_function

gitgud 2507 days ago

Yes similar, I'm talking about reactive UI components though (used in React, Vue, Angular etc.). They're a class that might have many functions. In this case all the component's functions would be pure functions though.

Perhaps a better term would be pure components?

BubRoss 2506 days ago

What does that mean? Something that is read only?

myguysi 2507 days ago

Amen to that! I’m working my way through a React codebase that has state logic everywhere with even the most basic UI components connected to Redux. It’s so much easier to reason when they’re decoupled and you have those higher level coordinator components.

Funny enough, the coordinator pattern really clicked with me when I wrote Swift apps. Similar concept.

https://will.townsend.io/2016/an-ios-coordinator-pattern

Someone1234 2507 days ago

Immutable JavaScript/CSS/Blobs/etc.

We have a very typical [web] codebase, server-side code (e.g. business rules, database access, etc), server-side Html generation, and JavaScript/CSS/Images/Fonts/etc stored elsewhere. Two repositories (content and code).

So the obvious question is: How do you manage deployment? Two repositories means two deployments, which means potential timing problems/issues/rollback difficulties.

The solution we use is painfully simple: We define the JavaScript/CSS/etc as immutable (cannot edit, cannot delete) and version it. If you want to bug fix example.js then it becomes example.js 1.0.1, 1.0.2, etc. You then need to re-point to the new version. The old versions will still exist and old/outdated references will continue to function.

This also allows our cache policy to be aggressive. We don't have to worry about browsers or intermediate proxies caching our resources for "too" long. We've never found editing files in-place, regardless of cache policy, to be reliable anyway. Some browsers seemingly ignore it (Chrome!).

We always deploy the "content" repository ahead of the "code" repository. But if we needed to rollback "code," it wouldn't matter because the old versions of "content" were never deleted or altered.

There's never a situation where we'd rollback "content" because you add, you don't edit or delete. If you added a bad version/bug, just up the version number and add the fix (or reference the older version until a fix is in "content," the old version will still be there).
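
The add-only rule described above can be sketched as a store that refuses to overwrite. `ImmutableStore` and its in-memory dictionary are assumptions for the example; the real "content" side would be a file server or CDN.

```python
class ImmutableStore:
    """Content store where a (name, version) pair can be added but never changed."""

    def __init__(self) -> None:
        self._files = {}

    def publish(self, name: str, version: str, content: bytes) -> str:
        key = f"{name}/{version}"
        if key in self._files:
            raise ValueError(f"{key} already exists; publish a new version instead")
        self._files[key] = content
        return key

    def get(self, key: str) -> bytes:
        return self._files[key]


content = ImmutableStore()
old = content.publish("example.js", "1.0.1", b"buggy()")
new = content.publish("example.js", "1.0.2", b"fixed()")

# The "code" side only holds a pointer; a rollback is just the older pointer.
page_script = new
page_script_after_rollback = old
```

Since nothing is ever edited or deleted, both pointers stay valid no matter which order the two repositories are deployed or rolled back in.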

mattmanser 2507 days ago

A much easier way than this is to append a hash of the file instead of 'versioning' it. Some people add it as a query string, some add it into the filename.

Been doing this for years with infinite (well, practically) cache settings.

These days it's built into most js compression tools afaik.
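
A small sketch of the filename variant, assuming SHA-256 truncated to 12 hex digits (both choices are arbitrary here; bundlers pick their own hash and length):

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(filename: str, content: bytes, digits: int = 12) -> str:
    """Return e.g. 'app.<hash>.js', where <hash> depends only on the content."""
    digest = hashlib.sha256(content).hexdigest()[:digits]
    path = PurePosixPath(filename)
    return f"{path.stem}.{digest}{path.suffix}"


vendor_v1 = hashed_name("vendor.js", b"library code")
app_v1 = hashed_name("app.js", b"console.log('v1')")

# A later build changes only the application bundle.
vendor_v2 = hashed_name("vendor.js", b"library code")
app_v2 = hashed_name("app.js", b"console.log('v2')")
```

The unchanged vendor bundle keeps its name across builds, so a browser with a long-lived cache re-downloads only the application bundle.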

benologist 2507 days ago

This is the official way to do that I think -

https://developer.mozilla.org/en-US/docs/Web/Security/Subres...

Someone1234 2507 days ago

That doesn't work, because no one file exists in isolation. If you're using version 32.14 of this, you want version 32.14 of that, and this other thing. Versioned directories make this kind of grouping natural and easy, co-mingled hashes do not (and you could do both but you have the downsides of both and no real upsides).

Plus semantic versioning can help cross-team communication, there's no human understanding of raw hashes.

diggan 2507 days ago

You don't necessarily need to use a hash based on randomness. Either using the git commit as the version/hash or a hash based on the content of the file itself works.

So as long as your entrypoint and its references are versioned, everything follows from that. So if I load version A of index.html, it also points to version A of the scripts/styles. If you load version B, you get version B of the scripts/styles, since everything is versioned the same way.

mattmanser 2507 days ago

Git commit is a bad solution; you want to use the file hash so you can have multiple bundles that version automatically. With a commit hash, pushing a change to even some comments, or to something not related to code, changes the name of every bundle.

Often you have a bundle of library code that you rarely ever push changes to, and you don't want your clients to download each time you make minor changes.

mattmanser 2507 days ago

The file hash of the output file, that's what I meant by hash.

Your way has big downsides and is pretty old-fashioned. It's still used by libraries that only have one javascript file, but not by websites that have to have multiple bundles and multiple CSS files.

Here's webpack's advice about doing exactly what I'm advocating, it definitely works and is the industry standard:

https://webpack.js.org/guides/caching/

Firstly, the file hash mechanism is built in to most bundling tools, and the file hash means you never ever, ever, ever get any collisions or make any mistakes or forget to increase the version number. It's all handled automatically in the build process.

But on top of that, you can then also have multiple bundles and they will automatically version themselves on the fly. It's common for most sites to have multiple bundles, meaning when you commit a change and rebuild the site, some of those bundles will not have changed. With the automatic hashing, the browser will only download the bundles that changed and you aren't serving tons of unnecessary javascript. You might have one of rarely changing shared libraries, another of the sales part of the website, another for the client part of the website, another for the admin section, you might have a video player that only parts of the site use, etc.

Each release you do would only force the browser to download bundles that actually changed.

For example reddit has:

    https://www.redditstatic.com/_chat.Q8BtxnzGjSI.js
    https://www.redditstatic.com/crossposting.4zJErPF9qdo.js
    https://www.redditstatic.com/reddit-init.en.zJ5ikJ21-Gw.js
    https://www.redditstatic.com/reddit.en.BQfJLVYdPSA.js
    https://www.redditstatic.com/spoiler-text.vsLMfxcst1g.js

Or stackoverflow:

    https://cdn.sstatic.net/Js/full.en.js?v=b45d5b4c957c
    https://cdn.sstatic.net/Js/stub.en.js?v=963cc3083a37
    https://clc.stackoverflow.com/markup.js?omni=Ak4r5CHnPNcIR2AAAAAAAAACAAAAAQAAAAMAAAAAAKpYlh7uXVCYJKM&zc=24%3B4&pf=0&lw=165

Or stackoverflow's CSS:

    https://cdn.sstatic.net/Shared/stacks.css?v=897466c4b64a
    https://cdn.sstatic.net/Sites/stackoverflow/primary.css?v=2d33230dde3d

See how they have a "shared" css they use on all the stackexchange sites, and a "stackoverflow" one, and that they can release each without destroying the cached version of the other?

Someone1234 2507 days ago

You're talking about something entirely different from what I am talking about. We aren't bundling at all. We're minifying and relying on H2 for high performance concurrent delivery. Bundling is the only old-fashioned thing here. Semantic versioning is timeless.

You're talking about a mechanism that is purely designed to cache-bust. I am talking about a mechanism for humans to deploy, understand, and utilize libraries across teams (and to group different files into distinct versions). Apples and oranges. The thread was about architecture, after all...

I won't get drawn too far into your post since it has too many strongly held claims without explanation/justification and I don't feel like trying to unravel that. But, yes, if you're automatically generating bundles for HTTP 1.1, append a hash. We aren't, so we don't.

mattmanser 2505 days ago

And you should still be minifying your CSS and JS, even if it's just to get out the comments, and it's still better to use file hashes than piss around with versioning.

Doesn't matter how much you dance around it, this wasn't a good architectural decision, nor is it standard industry practice.

zawerf 2507 days ago

I have recently been struggling with versioning (and learning devops in general) myself so I would love to hear more on this topic. For example if you rollback a deployment (or if you just have browsers who haven't refreshed yet), how do you make sure browser clients are talking to the right API backend version? How do you force them to upgrade or rollback? Will they even be routed to the same API server on multiple calls?

This is especially bad with long-lived single page apps.

(I already use immutable static files auto generated/hashed by create react app. I rely on cloudflare to cache them forever rather than never deleting from the build though)

Someone1234 2507 days ago

> how do you make sure browser clients are talking to the right api backend version?

We version the URL itself.

> How do you force them to upgrade or rollback?

We don't use it often but we can embed an "obsolete" tag into the HTTP/AJAX response header which a global AJAX hook (jQuery) will read and bring up a prompt/force a page reload. We use it infrequently but it was added for just such an occasion.

It is a bad user experience but it is a useful tool.
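
Both answers can be sketched in a few lines. The header name `X-Client-Obsolete`, its `"true"` value, and the URL layout are guesses made for the example, not the scheme actually used; the real hook described above is a jQuery AJAX handler in the browser.

```python
OBSOLETE_HEADER = "X-Client-Obsolete"   # hypothetical header name


def needs_reload(response_headers: dict) -> bool:
    """Global response hook: True when the server flags this client as outdated.

    Simplification: real HTTP header names are case-insensitive; this lookup is not.
    """
    value = response_headers.get(OBSOLETE_HEADER, "")
    return value.lower() == "true"


def api_url(base: str, version: int, resource: str) -> str:
    """Version the URL itself so each client keeps talking to its own API version."""
    return f"{base}/v{version}/{resource}"
```

An old page that was never refreshed keeps calling its own `/v1/...` URLs, and the server can answer those with the obsolete flag when it wants the page reloaded.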

gitgud 2507 days ago

That's a great solution, and I think that's what a lot of webpack build systems do.

In Angular, if a src file changes, then the corresponding build file hash changes. They call it cache-busting as it breaks the cache.

What kind of web stack are you running?

Someone1234 2507 days ago

We have Java and .Net Core (trying as a replacement) internet facing and Node.js for internal APIs. All on Linux. Some of this is due to organizational reasons, not technical.

As for the "content" side, it is pretty stereotypical: Sass, TypeScript, AngularJS 1.xx (not a typo!), and too many npm dependencies. But there's too much NIH[0] between teams, which is why our structure is so important in other ways.

[0] https://en.wikipedia.org/wiki/Not_invented_here

cbanek 2507 days ago

Simplify, simplify, simplify. Don't make tomorrow's problem today's complexity.

Get rid of any configuration options that no one uses. These things get passed around in flags sometimes to deep levels and can make logic complicated. Don't add a configuration option until you are sitting at someone's desk and see they need it and why. Only add the bare minimum. Same for APIs, buttons, and features.

tnolet 2507 days ago

Don’t use Kubernetes or Microservices. Solves most problems.

Not even being sarcastic.

ellius 2507 days ago

In general I think matching your tools to your needs, and coming up with solutions that are as simple as possible (but not simpler) is a super power and hard to get right. Your goal should always be to maximize your leverage by hiding and offloading as much complexity as you can while still meeting your requirements.

rumanator 2507 days ago

In your opinion what's wrong with Kubernetes or microservices?

quickthrower2 2507 days ago

Not the OP, but I'd say only use Kubernetes if you have the time to dedicate for the team to learn that technology and its mental model.

pbar 2507 days ago

From a developer point of view, one should not have a mental model in play for Kubernetes, the standard 12 factor pattern should be it. If not, the infrastructure and the app are strongly coupled

rumanator 2507 days ago

The overall mental model is not rocket science, and managed Kubernetes services remove most of the barriers to entry.

In fact, most of Kubernetes' mental model is a direct reflection of the basic requirements to run containers on any platform.

gitgud 2507 days ago

Microservices... Instead of 1 server you have N servers to maintain and scale....

"premature optimization is the root of all evil" - Donald Knuth

oftenwrong 2506 days ago

A monolith is simpler than a bunch of services. If you can run your system as a monolith, you should.

weitzj 2507 days ago

The biggest principles for emerging good code for me are:

Inversion of control (pass in your dependencies), keep your architecture orthogonal (make it composeable and really think if you need to inherit things rather than delegate them), code-generation of a transport api via gRPC and only focus on the business logic implementation.
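
A tiny sketch of "pass in your dependencies", with an invented `ReportService` whose data source and clock are both supplied by the caller:

```python
from typing import Callable


class ReportService:
    """Receives its collaborators instead of constructing them."""

    def __init__(self, fetch_rows: Callable[[], list], now: Callable[[], str]) -> None:
        self._fetch_rows = fetch_rows
        self._now = now

    def summary(self) -> str:
        rows = self._fetch_rows()
        return f"{self._now()}: {len(rows)} rows, total {sum(rows)}"


# In a test, fakes are passed in; production code would pass a database
# query function and a real clock instead.
service = ReportService(fetch_rows=lambda: [1, 2, 3], now=lambda: "2020-01-01")
```

The class contains only business logic, so swapping the storage or the transport layer does not touch it.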

happycoder97 2507 days ago

What is orthogonal architecture?

shoo 2507 days ago

i would assume: the different parts of the architecture are independent, and each addresses a completely distinct non-overlapping responsibility. i.e. you can add or remove or adjust each part without interactions between parts

happycoder97 2507 days ago

Could you give some guidelines to keep an emerging architecture orthogonal?

valand 2507 days ago

Keep states where they are needed.

Make most things immutable.

Prefer composition to extension.

Treat Types as contracts.

Sandbox "unsafe" code (code that interacts with network, file storage, etc).

Eliminate side effects.

Eliminate premature abstractions.

Prefer explicit over implicit.

Keep components functional.

Prioritize semantic correctness and readability.

Use events for inter-component communication when those components don't need to care about each other's functionality.

Think protocol over data.

croo 2507 days ago

I nodded along except the last one. What do you mean by thinking protocol over data?

valand 2506 days ago

I meant: When creating an endpoint, a component, a feature, or a data structure, I treat them like protocol. Protocols enable other components to do more things while being robust and efficient. It must be, to certain degree, extensible and forward compatible. With that mindset, you're likely going to avoid more trouble in the future, while indirectly enforcing open-closed principle in every level.
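
One possible reading of "forward compatible", sketched with an invented message schema: a decoder that takes the fields it knows, supplies defaults for missing ones, and ignores anything a newer sender added.

```python
import json

# Hypothetical v1 schema: field name -> default used when the field is absent.
KNOWN_FIELDS = {"id": None, "name": "", "tags": ()}


def decode(message: str) -> dict:
    """Keep the fields this version understands; ignore the rest."""
    raw = json.loads(message)
    return {key: raw.get(key, default) for key, default in KNOWN_FIELDS.items()}
```

A newer sender can add a field such as `color` without breaking this reader, and an older sender that omits `tags` is still accepted.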

scarface74 2507 days ago

Ripping out as much home grown code for cross cutting concerns (logging, database access, retry logic, etc) that previous developers used and using third party packages.

hellwd 2507 days ago

There are many decisions that you can make to improve the quality of your codebase. There is no recipe that you can follow because each application is different but there are some general things that can make your life easier.

Here are some tips that helped me a lot:

- Keep your solution and tech-stack as simple as possible

- Mark those parts that can change often and try to make them configurable (when you have it configurable you don't need to change code and re-deploy every single adjustment)

- Make sure you have a good and readable logging

- Use DI

- Separate your core application logic from the infrastructure part (DAL, Network Communication, Log Provider, File readers/parsers and similar)

- Keep your functions/methods clean and without side effects

- Method has to return something (try to minimize the usage of "void" methods)

- Split each feature or functionality you are working on into small pieces and compose the final thing with them

- Be disciplined about your naming conventions and code style

tnolet 2507 days ago

One of the things in the Clean Code book really helps.

Methods and functions should be around 5 lines.

Doesn’t always work but is great to aim for.

caseymarquis 2507 days ago

Is there an article on this? I feel like I must be missing some context, as 5 lines seems short enough to be counter productive.

croo 2507 days ago

Uncle Bob (writer of the Clean Code book) argues that functions should be small (3-10 lines long, and not longer). He brings up 2 points as far as I remember.

1 - functions are(should be) well named so anyone later on will have better understanding of the intent of the writer of the code.

2 - bugs have a harder time to hide in 5 lines of code than 30 or 300 lines of function code.

If you did not read it I recommend it or the video series based on the book.

I worked on only one code base where we more or less held ourselves to this and the class length limit (classes really should not be more than 200-300 lines long) and it turned out pretty well.

tnolet 2507 days ago

I probably exaggerated a bit. This paraphrase says the limit is “hardly ever 20 lines”.

https://dzone.com/articles/rule-30-–-when-method-class-or

Sorry, don’t have the actual book at hand now. Still a great read though.

cbanek 2507 days ago

I think 5 lines is pretty short but good. At the very longest, I like a function to fit on one screen of text so I don't have to scroll to see the entire function.

juangacovas 2507 days ago

I like to use the curly brace jump shortcut and interactively debug my code and others' code to avoid being too picky about this stuff, unless you have to stick to 80x24 kernel surface ;P

pryelluw 2507 days ago

Small and simple over big and complex. Plus some functional patterns and a lot of YAGNI based thinking.

pearjuice 2506 days ago

Honestly, tests and by extension testable code. The number of enterprises processing tens to hundreds of millions of dollars (either business value or actual revenue) without tests of vital parts of their software is mind-blowing. You can sometimes not fathom how they are comfortable with changing a line without having tests to back them up. They F5 a page or recompile the server software, redeploy, click through it and "yup it works let's ship", and then a few days later find out it broke a CSV import of the external warehouse inventory system which runs once a week, because they removed a dash between SKU and title for better SEO in the online catalog. Oops, good luck finding out where the problem is, because you have zero integration tests. A few million down the drain because the import division couldn't possibly know what to forecast on due to no stock data. And this is not an exception: dumb bugs and malfunctions keep occurring because developers don't write tests.

You can start an entire business in consulting on test automation and you would never run out of work.

oftenwrong 2506 days ago

YAGNI, KISS

Choose Boring Technology http://boringtechnology.club/

Build your system to be level-triggered as much as possible. Its default mode should be reconciliation: examining its current state and transforming that into the desired state, especially if the current state is "something went wrong". Build in dumb reconciliation before worrying about making it more real-time.
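
A minimal sketch of such a reconciliation step, assuming state can be modelled as a dictionary from names to specs (the service names and counts are invented):

```python
def reconcile(current: dict, desired: dict) -> list:
    """Compare full states and return the actions that close the gap."""
    actions = []
    for name, spec in desired.items():
        if name not in current:
            actions.append(("create", name, spec))
        elif current[name] != spec:
            actions.append(("update", name, spec))
    for name in current:
        if name not in desired:
            actions.append(("delete", name, None))
    return actions


def apply(current: dict, actions: list) -> dict:
    """Apply actions to a copy of the state (stand-in for real side effects)."""
    state = dict(current)
    for verb, name, spec in actions:
        if verb == "delete":
            state.pop(name, None)
        else:
            state[name] = spec
    return state


desired = {"web": 3, "worker": 2}
observed = {"web": 1, "cron": 1}   # e.g. the state found after a crash
actions = reconcile(observed, desired)
```

The function looks only at the state as it is now, never at which events led there, so running it again after a missed event or a partial failure still converges on `desired`.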

The fewer moving parts, the better. Don't go multi-service architecture until you absolutely have to (see YAGNI, KISS).

Keep your business logic contained, separated from everything else, in ONE place. If I open up your business logic code, I shouldn't see anything about persistence, the network, etc. Similarly, I shouldn't find any business logic in your other concerns. The business logic interacts with other concerns via abstractions.

Be unforgiving when it comes to correctness guarantees. Use the type system as much as possible to make errors impossible.

diehunde 2507 days ago

Very basic ones:

- Strong test suite

- Delete duplication as much as possible by using any techniques such as method extraction and keeping classes and methods small.

he0001 2507 days ago

Reduce the number of tools, use them to the max, and know those tools intimately. When it falls short consider a new tool.

happycoder97 2507 days ago

I had been organizing all of my projects so far using layered architecture. Recently I read this article about layered architecture: https://dzone.com/articles/reevaluating-the-layered-architec... Now I feel that layered architecture was a poor choice for many of my previous projects.

So, I think, instead of layering, for example I should put everything that needs an access to a User entity's internal fields in User class itself.

For example: User.getProfileAsJson() // for sending out to frontend

Now I am confused regarding where to put methods that involve two entities. Suppose there is an Event entity which represents some online event that the User can register for.

Where is the best place to put getEventsRegisteredByUser()?

neRok 2507 days ago

I'm not a pro and don't do this for a living, but here are my 2cents...

I recently started a large project, so did some reading on architectures/patterns like DDD and Clean-Arch. One of the most important points I took from both was to clearly define your domain. But based upon past experiences, I have developed a dislike for "heavy" objects like those used by DDD and ORMs in general. I like to keep things simple, sort of "functional" in nature - what your link refers to as anemic objects. So I have stuck to the SOLID principles, and in particular the D = dependency inversion, which I apply through dependency injection. I've also taken a fancy to RPC style code, so that influences my code. BTW, clean arch isn't too different from the image of Layered-Arch in your link, more of an evolution really.

So here is how I apply my concepts to your problems...

Users want to know the Events they are registered for, and Events want to know the registered Users. You have a circular dependency! But really, the problem to me is that you haven't expanded your domain enough. I think you should have a third entity, something like UserEventRegistrations. Now Users and Events don't depend on each other, and UserEventRegistrations depends on both. No circle!

In keeping with my liking for anemic objects, I would have a User model object to hold properties like name, and a UserRepository for CRUD-style operations, with methods like GetByID() that return a User instance. The same would apply for Event, and something similar for UserEventRegistrations, except its repository would depend on the User and Event repositories so that it can offer methods like GetEventsByUserID().

Then to apply this in Clean-Arch style, I leverage whatever statically typed language I am using (Go, TypeScript, etc) to implement interfaces. So I define the domain layer as the model objects, and interfaces for the repositories. For the persistence layer, I would create a concrete implementation of the repository interfaces, and they would return instances of the domain model objects. Then for presentation, I would create a layer that expects to be dependency-injected with a concrete implementation of the repository interface. So my layers are separate, based upon the "contract" that is my domain layer.
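
A rough sketch of the layering described above, written in Python with `typing.Protocol` standing in for Go or TypeScript interfaces. The snake_case method names, the in-memory classes, and `event_titles_for_user` are illustrative assumptions; a real persistence layer would talk to the RDBMS.

```python
from dataclasses import dataclass
from typing import List, Optional, Protocol, Tuple


# Domain layer: anemic model objects plus repository interfaces.
@dataclass(frozen=True)
class User:
    id: int
    name: str


@dataclass(frozen=True)
class Event:
    id: int
    title: str


class UserRepository(Protocol):
    def get_by_id(self, user_id: int) -> Optional[User]: ...


class EventRepository(Protocol):
    def get_by_id(self, event_id: int) -> Optional[Event]: ...


class UserEventRegistrationsRepository(Protocol):
    def get_events_by_user_id(self, user_id: int) -> List[Event]: ...
    def get_users_by_event_id(self, event_id: int) -> List[User]: ...


# Persistence layer: in-memory stand-ins for a database-backed implementation.
class InMemoryUsers:
    def __init__(self, users: List[User]) -> None:
        self._by_id = {u.id: u for u in users}

    def get_by_id(self, user_id: int) -> Optional[User]:
        return self._by_id.get(user_id)


class InMemoryEvents:
    def __init__(self, events: List[Event]) -> None:
        self._by_id = {e.id: e for e in events}

    def get_by_id(self, event_id: int) -> Optional[Event]:
        return self._by_id.get(event_id)


class InMemoryRegistrations:
    def __init__(self, users: UserRepository, events: EventRepository,
                 pairs: List[Tuple[int, int]]) -> None:
        self._users = users    # injected dependency
        self._events = events  # injected dependency
        self._pairs = pairs    # (user_id, event_id) rows

    def get_events_by_user_id(self, user_id: int) -> List[Event]:
        found = (self._events.get_by_id(e) for u, e in self._pairs if u == user_id)
        return [ev for ev in found if ev is not None]

    def get_users_by_event_id(self, event_id: int) -> List[User]:
        found = (self._users.get_by_id(u) for u, e in self._pairs if e == event_id)
        return [us for us in found if us is not None]


# Presentation layer: depends only on the domain interface it is handed.
def event_titles_for_user(user_id: int,
                          registrations: UserEventRegistrationsRepository) -> List[str]:
    return [e.title for e in registrations.get_events_by_user_id(user_id)]
```

Neither `User` nor `Event` imports the other; only the registrations repository knows about both, which is what removes the circle.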

Now your example for User.getProfileAsJson() is vague in meaning, but if you wanted to return the data in a different format than the domain model, you could have another layer on the presentation side of the equation that handles this. It would utilise the repositories to build what you need. So your "Profile" might be a single JSON payload containing a User with their Events. Your function would do UserRepo.GetByID(), check you have a User, then do UserEventRegistrationsRepo.GetEventsByUserID(User.ID). Then it would stick it in your payload, and voilà.
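
The profile function could look something like this in Python. The payload shape (`"user"` plus `"events"`), the snake_case names, and the fake repositories are assumptions for illustration only.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional, Protocol


@dataclass(frozen=True)
class User:
    id: int
    name: str


@dataclass(frozen=True)
class Event:
    id: int
    title: str


class UserRepo(Protocol):
    def get_by_id(self, user_id: int) -> Optional[User]: ...


class UserEventRegistrationsRepo(Protocol):
    def get_events_by_user_id(self, user_id: int) -> List[Event]: ...


def get_profile_as_json(user_id: int, users: UserRepo,
                        registrations: UserEventRegistrationsRepo) -> Optional[str]:
    user = users.get_by_id(user_id)
    if user is None:
        return None  # no such user, so there is no profile to build
    events = registrations.get_events_by_user_id(user.id)
    payload = {"user": asdict(user), "events": [asdict(e) for e in events]}
    return json.dumps(payload)


# Throwaway fakes so the sketch runs without a database.
class FakeUsers:
    def get_by_id(self, user_id: int) -> Optional[User]:
        return User(1, "Ada") if user_id == 1 else None


class FakeRegistrations:
    def get_events_by_user_id(self, user_id: int) -> List[Event]:
        return [Event(10, "Go meetup")] if user_id == 1 else []
```

The JSON shape lives entirely in this presentation function, so the domain models stay free of serialization concerns.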

I've not completed my project yet, but I've implemented some functionality in all layers (Go server pulling data from RDBMS and sending to TypeScript UI), and it seems to be working well. I've also noticed after the fact that my domain layer ends up looking exactly like a protocol buffers definition, so maybe just use those.

link

mattbrewsbytes 2507 days ago

Specifically to allow easier changes: abstraction, encapsulation and separation of concerns.

An example would be if you have a module that calls a REST API to get/put something (say time sheets for your invoicing app), then have that be its own module that is testable.

Create internal TimeSheet data structures that you pass to/from that module. The core functionality of your app should be implemented using the TimeSheet data structures and you can have tests that use those and then separate tests around calling an API.

New customer comes along and says they want to send you CSV files via SFTP (yuck, but they got money). You just have to write a new interface that works with exchanging those files and gets them into your TimeSheet data structures, the core of your app should remain unchanged.
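
Sketched in Python, under the assumption that a time sheet is just an employee name and a number of hours (the field names, `TimeSheetSource`, and `invoice_total` are made up for illustration). The REST source takes an injected callable instead of making a real HTTP request, and the CSV source assumes the file has already been fetched over SFTP.

```python
import csv
import io
from dataclasses import dataclass
from typing import Callable, Dict, List, Protocol


@dataclass(frozen=True)
class TimeSheet:
    employee: str
    hours: float


class TimeSheetSource(Protocol):
    def fetch(self) -> List[TimeSheet]: ...


class RestTimeSheetSource:
    """Adapter for the REST API; fetch_json is injected so tests need no network."""

    def __init__(self, fetch_json: Callable[[], List[Dict]]) -> None:
        self._fetch_json = fetch_json

    def fetch(self) -> List[TimeSheet]:
        return [TimeSheet(row["employee"], float(row["hours"]))
                for row in self._fetch_json()]


class CsvTimeSheetSource:
    """Adapter for the new customer's CSV files (text already downloaded)."""

    def __init__(self, text: str) -> None:
        self._text = text

    def fetch(self) -> List[TimeSheet]:
        reader = csv.DictReader(io.StringIO(self._text))
        return [TimeSheet(row["employee"], float(row["hours"])) for row in reader]


def invoice_total(source: TimeSheetSource, hourly_rate: float) -> float:
    # Core logic: sees only TimeSheet objects, never JSON or CSV.
    return sum(ts.hours for ts in source.fetch()) * hourly_rate
```

Adding the CSV customer means writing `CsvTimeSheetSource` and its tests; `invoice_total` and its tests are untouched.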

link

srijanshetty 2507 days ago

One controversial opinion: a monorepo, which is ideal for small teams iterating really fast. The other was arriving at the 12-factor app approach by serendipity, as we were focusing on keeping our operations simple.

link

pbar 2507 days ago

Make sure to take great care of the monorepo, and break it up _before_ it becomes impossible but necessary

link

srijanshetty 2503 days ago

I completely agree, but we're constrained to two developers and won't hire anytime soon. So, the monorepo seems to be working great for us.

link

throwaway1954 2506 days ago

Write proper git commit messages.[1]

A few examples here.[2]

1. https://drewdevault.com/2019/02/25/Using-git-with-discipline...

2. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...

link

afpx 2507 days ago

Eliminate “broken windows”

https://en.m.wikipedia.org/wiki/Broken_windows_theory

link

Noumenon72 2507 days ago

What, were people swearing in the comments and putting Easter Eggs in the releases?

link

reilly3000 2507 days ago

I'm assuming they mean a clean and orderly codebase invites developers to commit clean code. It's hard to become motivated to make well-formed units when there are dumpster fires everywhere you look.

link

ndreipoppa 2505 days ago

For an event-driven app (a poker game) built with React, redux and redux-saga, we deleted almost the entire project (100k lines of code) because our logic was tightly coupled with the sagas and reducers. We have now moved our logic into the state selectors (we use reselect); the reducers are dumb, while sagas are only used to listen for and dispatch async actions.
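
To show the shape of that split without pulling in Redux, here is a language-neutral sketch in plain Python rather than JavaScript. The poker actions, `GameState`, and the selector names are invented, and `functools.lru_cache` merely plays a role similar to reselect's memoization; it is not reselect's API.

```python
from functools import lru_cache
from typing import NamedTuple, Tuple


class Bet(NamedTuple):
    player: str
    amount: int


class GameState(NamedTuple):
    # Immutable and hashable, so selector results can be cached per state.
    bets: Tuple[Bet, ...] = ()
    folded: Tuple[str, ...] = ()


def reducer(state: GameState, action: dict) -> GameState:
    # "Dumb" reducer: records what happened and derives nothing.
    if action["type"] == "BET":
        return state._replace(bets=state.bets + (Bet(action["player"], action["amount"]),))
    if action["type"] == "FOLD":
        return state._replace(folded=state.folded + (action["player"],))
    return state  # unknown actions leave the state unchanged


@lru_cache(maxsize=32)
def select_pot(state: GameState) -> int:
    # Game logic lives in selectors that derive values from the raw state.
    return sum(bet.amount for bet in state.bets)


@lru_cache(maxsize=32)
def select_active_players(state: GameState) -> Tuple[str, ...]:
    seen = dict.fromkeys(bet.player for bet in state.bets)  # keeps first-seen order
    return tuple(p for p in seen if p not in state.folded)
```

Because the reducer only appends facts, changing a rule such as how the pot is computed touches one selector and nothing else.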

link

dustingetz 2507 days ago

Clojure

link

xcubic 2495 days ago

Where to start?

link

CameronBarre 2504 days ago

Structuring software as a series of processes separated by queues in the small and large.
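
"In the small", this can be sketched in Python with threads and `queue.Queue`; the stage functions and the `_DONE` sentinel are illustrative choices, and "in the large" the same shape would presumably use OS processes or a message broker in place of in-memory queues.

```python
import queue
import threading
from typing import Callable, List

_DONE = object()  # sentinel marking the end of the stream


def stage(fn: Callable, inbox: queue.Queue, outbox: queue.Queue) -> threading.Thread:
    """Start a worker that applies fn to each item from inbox and forwards the result."""
    def run() -> None:
        while True:
            item = inbox.get()
            if item is _DONE:
                outbox.put(_DONE)  # propagate shutdown downstream
                return
            outbox.put(fn(item))

    worker = threading.Thread(target=run)
    worker.start()
    return worker


def run_pipeline(items: List[str]) -> List[int]:
    # Bounded queues between stages give back-pressure; the final one is unbounded
    # because it is only drained after the workers finish.
    raw: queue.Queue = queue.Queue(maxsize=8)
    parsed: queue.Queue = queue.Queue(maxsize=8)
    results: queue.Queue = queue.Queue()

    workers = [
        stage(int, raw, parsed),                 # stage 1: parse
        stage(lambda n: n * n, parsed, results), # stage 2: transform
    ]
    for item in items:
        raw.put(item)
    raw.put(_DONE)
    for worker in workers:
        worker.join()

    out: List[int] = []
    while True:
        result = results.get()
        if result is _DONE:
            return out
        out.append(result)
```

Each stage knows only its inbox and outbox, so a stage can be replaced, scaled, or tested without touching its neighbours.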

link