by FpUser 2063 days ago
The concepts of threads and concurrent data access are simple enough for any decent programmer to comprehend. There is no hell here. Sure, there are some complex cases, but complex cases arise in many situations when programming.

And achieving concurrency without shared memory is impossible in the general case. Sure, it is possible to isolate such access in a separate layer and make it transparent to the rest of the program, but someone still has to program that layer.

2 comments

The problem for novices is that a program that behaves correctly looks a lot like a correct program. Until one day it doesn’t.

And because you’re in production and getting random spurious failures, the panicked (but common) reaction is to wrap every shared resource in a synchronized block. Which makes an incorrect implementation worse but possibly correct.

If the resource is shared, accessed from many threads, and both written to and read from, then the correct behavior is to lock it with the proper type of lock at access time. Depending on the resource, it might be possible to split it into a few pieces with more granular access.
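
One way to picture "splitting into a few pieces with more granular access" is lock striping. The sketch below is only an illustration: the `StripedCounters` class, its method names, and the stripe count of 8 are made up for this example. Each key maps to one of several independent locks, so threads touching different stripes do not block each other.

```python
import threading

class StripedCounters:
    """Hypothetical counter map guarded by several locks instead of one."""

    def __init__(self, stripes=8):
        # One lock and one dict per stripe; 8 is an arbitrary choice.
        self._locks = [threading.Lock() for _ in range(stripes)]
        self._shards = [{} for _ in range(stripes)]

    def _index(self, key):
        # Pick the stripe responsible for this key.
        return hash(key) % len(self._locks)

    def increment(self, key):
        i = self._index(key)
        with self._locks[i]:  # only this stripe is locked
            shard = self._shards[i]
            shard[key] = shard.get(key, 0) + 1

    def get(self, key):
        i = self._index(key)
        with self._locks[i]:
            return self._shards[i].get(key, 0)

counters = StripedCounters()

def worker():
    # Each worker increments keys 0..9 one hundred times each.
    for n in range(1000):
        counters.increment(n % 10)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = sum(counters.get(k) for k in range(10))
print(total)
```

With four workers doing 1000 increments each, no increment should be lost, so the total comes out at 4000.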

As for novices: they are called that for a reason and are supposed to be under supervision rather than allowed to run wild.

Why is this being downvoted? It's the truth.

HN needs to only allow downvotes that have an accompanying explanation comment.

HN uses downvotes mostly to boo people whose opinions deviate from the common party line. As for a reasonable explanation: you're asking too much. Programming, like many other things, is often treated as a religion. No arguments, it just is.
As with most other compromised social sites, no badthink allowed here and how dare you.
Novices don't build working concurrent systems of any kind with any toolkit, period. Concurrency is hard and thinking all the "concurrency problems" go away with some message passing is both ludicrous and dangerous. Fearless concurrency can only be attained through understanding, not by thinking all your problems went away because you're using a "cool approach".
Surprisingly, this is what the Akka framework promises: message passing and immutability of objects.
Software usually has state (unless that state is kept and managed entirely externally, in a database for example). And the state mutates. A simple example is a big array that has to be processed in place.
It's pretty easy to make the leap from individual SQL statements to SQL statements which are wrapped in a transaction.
Excellent example for making my point, since "just wrap it in a transaction" usually leads to concurrency bugs like the beloved lost update.
If you're talking about database-style transactions, that "usually" leads to concurrency bugs only if the isolation level is not strictly serializable. It does not hurt to know things before labeling them.
This is not something I'm familiar with. What's the beloved lost update and what transactions are you using that suffer from it?
Transactions give varying degrees of "isolation" between them, depending on the database (and its version + configuration). For example, in what SQL would call READ COMMITTED, where transactions will only read data that has been committed, read-modify-write updates are generally bugs. The classic example:

    - Intent: both transactions deduct 50 money
    - transaction 1: SELECT balance FROM account; // = 100
    - transaction 2: SELECT balance FROM account; // = 100
    - transaction 1: UPDATE account SET balance = 50
    - transaction 1: COMMIT
    - transaction 2: UPDATE account SET balance = 50
    - transaction 2: COMMIT
    - Result: balance is 50, but should be 0
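
The interleaving above can be replayed deterministically. This is only a sketch under stated assumptions: it uses Python's built-in `sqlite3` with a made-up one-row `account` table, and a single connection steps through the schedule by hand, so it simulates the two transactions rather than running two real sessions under READ COMMITTED. The helper `read_balance` is hypothetical. The final balance it prints is 50, not 0.

```python
import sqlite3

# Assumed schema: a single-row account table in an in-memory database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE account (balance INTEGER NOT NULL)")
db.execute("INSERT INTO account (balance) VALUES (100)")

def read_balance():
    return db.execute("SELECT balance FROM account").fetchone()[0]

# Both "transactions" read before either one writes.
seen_by_t1 = read_balance()
seen_by_t2 = read_balance()

# Each writes back a value computed from its own, now stale, read.
db.execute("UPDATE account SET balance = ?", (seen_by_t1 - 50,))
db.execute("UPDATE account SET balance = ?", (seen_by_t2 - 50,))
db.commit()

# The second write silently overwrote the first deduction.
final_balance = read_balance()
print(final_balance)
```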
With serializable transactions (not all databases have this, particularly if you look beyond SQL):

    - Intent: both transactions deduct 50 money
    - transaction 1: SELECT balance FROM account; // = 100
    - transaction 2: SELECT balance FROM account; // = 100
    - transaction 1: UPDATE account SET balance = 50
    - transaction 1: COMMIT
    - transaction 2: UPDATE account SET balance = 50
    - transaction 2: COMMIT -> Fails, needs to retry
    - transaction 2b: SELECT balance FROM account; // = 50
    - transaction 2b: UPDATE account SET balance = 0
    - transaction 2b: COMMIT -> Ok!
    - Result: balance is 0
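
The "fails, needs to retry" step implies a retry loop in the application. The sketch below does not use real serializable isolation. It swaps in a version-column check (optimistic locking), which rejects a stale write in a similar way, and it replays the schedule on one `sqlite3` connection. The `version` column and the helpers `read_row` and `try_write` are assumptions made for the example.

```python
import sqlite3

# Assumed schema: the balance plus a version counter used to detect stale writes.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE account (balance INTEGER NOT NULL, version INTEGER NOT NULL)")
db.execute("INSERT INTO account (balance, version) VALUES (100, 0)")
db.commit()

def read_row():
    return db.execute("SELECT balance, version FROM account").fetchone()

def try_write(new_balance, expected_version):
    # The UPDATE matches only if nobody committed since our read.
    # A rowcount of 0 plays the role of the failed COMMIT.
    cur = db.execute(
        "UPDATE account SET balance = ?, version = version + 1 WHERE version = ?",
        (new_balance, expected_version),
    )
    db.commit()
    return cur.rowcount == 1

# Replay the schedule: both read, then both try to write.
b1, v1 = read_row()
b2, v2 = read_row()
t1_ok = try_write(b1 - 50, v1)  # first writer wins
t2_ok = try_write(b2 - 50, v2)  # stale version, rejected

# "transaction 2b": re-read and try again until the write is accepted.
retries = 0
while not t2_ok:
    retries += 1
    b2, v2 = read_row()
    t2_ok = try_write(b2 - 50, v2)

final_balance = read_row()[0]
print(retries, final_balance)  # one retry, balance 0
```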
Because this is needed so frequently, databases have calculated updates, basically atomic operations:

    - transaction 1: UPDATE account SET balance = balance - 50; // values indeterminate
    - transaction 2: UPDATE account SET balance = balance - 50; // values indeterminate
    - transactions 1,2: COMMIT
    - Result: balance is 0
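
A calculated update keeps the read and the write inside one statement, so there is no window for a stale value. A minimal sketch with `sqlite3` follows. The `deduct` helper and the extra `WHERE balance >= ?` guard against overdrafts are additions for illustration and are not part of the example above.

```python
import sqlite3

# Assumed schema: a single-row account table in an in-memory database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE account (balance INTEGER NOT NULL)")
db.execute("INSERT INTO account (balance) VALUES (100)")
db.commit()

def deduct(amount):
    # The database computes the new value itself; the application never
    # holds a copy of the balance that could go stale.
    cur = db.execute(
        "UPDATE account SET balance = balance - ? WHERE balance >= ?",
        (amount, amount),
    )
    db.commit()
    return cur.rowcount == 1  # False when the guard rejected the deduction

# Two deductions succeed; a third is refused because the balance is 0.
results = [deduct(50), deduct(50), deduct(50)]
final_balance = db.execute("SELECT balance FROM account").fetchone()[0]
print(results, final_balance)
```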
Or, one could lock the rows, like so:

    - transaction 1: SELECT balance FROM account FOR UPDATE; // = 100
    - transaction 2: SELECT balance FROM account FOR UPDATE; // transaction 2 is stalled until transaction 1 commits or rolls back
    - transaction 1: UPDATE account SET balance = 50
    - transaction 1: COMMIT
    // transaction 2 can now continue and gets balance = 50
    - transaction 2: UPDATE account SET balance = 0
    - transaction 2: COMMIT
    - Result: balance is 0
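
The same pessimistic idea can be shown with threads instead of a database. In this sketch an ordinary `threading.Lock` stands in for the row lock. It is an in-memory analogy, not a demonstration of `FOR UPDATE` itself, and the names `balance_lock` and `deduct` are made up for the example.

```python
import threading

balance = 100
balance_lock = threading.Lock()  # stands in for the row lock

def deduct(amount):
    global balance
    # Like SELECT ... FOR UPDATE: a second thread waits here until the
    # first one has finished its whole read-modify-write.
    with balance_lock:
        current = balance
        balance = current - amount

threads = [threading.Thread(target=deduct, args=(50,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(balance)  # both deductions applied: 0
```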
And this is just one simple example of the problems you can have concurrently accessing one table, even while using transactions. Not to speak of the issues you can run into when interacting with systems outside a single database, which don't interact with the transaction semantics of the DB.

Concurrency is just very non-trivial regardless of the abstraction.

What's a better alternative to synchronizing access to shared resources?
Treat it like GC and don't leave it up to the programmer.
And make a programmer unable to achieve the highest performance when needed. We live in a supposedly free world. If you want to be "protected", be my guest and use languages with GC. Plenty of those. For somebody who needs the opposite and uses "unprotected" tools: leave them alone. You have no right to decide how other people do their work unless they're under your direct control.
Does that involve using concurrency primitives that basically don't allow access to any shared mutable state?
I'd say provide concurrency primitives to disallow direct access to shared mutable state. You still do the reads and the writes, but you let the system take and release locks for you.

Let's say you wanted to turn a list into a bounded list of 4 elements.

Race-condition insert:

    if (sz < 4) {
      list.insert(x);
      sz++;
    }
Safe insert:

    atomically {
      if (sz < 4) {
        list.insert(x);
        sz++;
      }
    }
So atomically organises the locking/unlocking/rollback for you such that a fifth element will not be inserted.
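
A rough approximation of that bounded list in a language without an `atomically` block is sketched below. It is lock-based, not transactional memory, so there is no rollback. The `BoundedList` class and its methods are hypothetical. A reentrant lock (`threading.RLock`) is used so that a method already holding the lock can call `insert` without deadlocking.

```python
import threading

class BoundedList:
    """Hypothetical list that never grows beyond `capacity` elements."""

    def __init__(self, capacity):
        self._capacity = capacity
        self._items = []
        self._lock = threading.RLock()  # reentrant: nested acquisition is allowed

    def insert(self, x):
        # The size check and the append happen under one lock, so no other
        # thread can slip in between them.
        with self._lock:
            if len(self._items) < self._capacity:
                self._items.append(x)
                return True
            return False

    def insert_all(self, xs):
        # Holds the lock across several inserts; each nested insert()
        # re-acquires the same RLock from the same thread.
        with self._lock:
            return [self.insert(x) for x in xs]

    def snapshot(self):
        with self._lock:
            return list(self._items)

bounded = BoundedList(4)
threads = [threading.Thread(target=bounded.insert, args=(i,)) for i in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()

size = len(bounded.snapshot())
print(size)  # never more than 4, whichever threads won
```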
Aren't the semantics of this exactly the same as Java's synchronized? What error does it protect you against compared to synchronizing on list? What happens if .insert() also uses atomically{} somehow?
> The concepts of threads and concurrent data access is simple enough for any decent programmer to comprehend. There is no hell here.

It's notoriously difficult to reason about concurrent programs using intuition. Much more difficult than reasoning about non-concurrent imperative code. This is why there are articles like [0], why a bug in a Wikipedia article on a fundamental concurrency algorithm went unnoticed until an analysis tool detected the issue [1], and why lock-free algorithms in particular are so tricky to get right.

[0] https://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedL...

[1] [PDF] https://llvm.org/pubs/2008-08-SPIN-Pancam.pdf

The threading concept is simple; the real world is not.

If I had no idea about the C10K problem, the success of Nginx and Redis, and other concurrency success stories built on actors, CSP, and concurrency via messages over shared memory, I would say threads are OK when you can use them. But indeed they are so simple that they tempt people into designing shitty software. Software is a very welcoming medium for that. It is hell.