Hacker News new | ask | show | jobs
by steveklabnik 2108 days ago
In C++ terms, Arc is a shared_ptr. clone bumps the reference count up. You only need to do that if you want another owner. If you don't, you can take a regular old reference to the contents of the arc.

The core issue here is that it's not clear (from the snippet, but also to the compiler due to the type signatures) when the threads join. If they're joined in the same scope, they could use "scoped threads" instead, which would remove the need for the arc alltogether.

2 comments

> it's not clear (from the snippet, but also to the compiler due to the type signatures) when the threads join.

Thanks. Assuming the join indeed immediately follows the snippet, what were the needed changes in the type signatures in this case? I’d like to know the “good” and “idiomatic” example (which avoids doing more than needed) for that example.

    use scoped_threadpool::Pool;
    use std::sync::Mutex;
    
    const N: u32 = 3;
    
    fn main() {
        let mut pool = Pool::new(N);
    
        let data_mutex = Mutex::new(vec![0, 1, 2, 3, 4]);
        let res_mutex = Mutex::new(0);
    
        pool.scoped(|scoped| {
            for _ in 0..N {
                scoped.execute(|| {
                    let mut data = data_mutex.lock().unwrap();
                    
                    let result = data.iter().fold(0, |acc, x| acc + x * 2);
                    
                    data.push(result);
                    
                    *res_mutex.lock().unwrap() += result;
                });
            }
        });
        
        println!("{:?}", res_mutex);
    }
I kept the variable names similar to hopefully make it easy to see the transformation. Here, instead of using thread::spawn, we use a scoped threadpool. Effectively, rather than saying "take this closure and run it on a thread" like thread::spawn does, this uses "scoped.execute", which is like spawn, but ties the lifetime to the variable "scoped". This lets the compiler understand that all of these threads will be joined inside the given scope, and so it's able to grok the lifetimes. That variable is zero-sized and so will compile away to nothing.

(You still need the mutexes because multiple threads are writing to the same variables at the same time; imagine if we didn't push the result onto the vector; we could drop the mutex around it entirely, which would simplify things even further. See the example here, where because we are accessing disjoint parts of the vector, we can use no mutexes at all: https://crates.io/crates/scoped_threadpool

Thanks!

> You still need the mutexes because multiple threads are writing to the same variables at the same time

I understand that for the "data" for which the push exists. Do we however need it for res_mutex when the value that we want to update is a single (I guess integer) variable for which atomic add could be performed? Is there something like atomic add?

Oh, duh, yes. one line added, two lines changed, I've left them at their indent levels and left the variable name the same:

    use std::sync::atomic::{AtomicU32, Ordering};
    
        let res_mutex = AtomicU32::new(0);

                    res_mutex.fetch_add(result, Ordering::SeqCst);
I think the snippet is part of the example code for Mutex. Here: https://doc.rust-lang.org/std/sync/struct.Mutex.html

After "It is sometimes necessary to manually drop the mutex..."

Ah ha! I should have recognized it, given that I'm pretty sure I wrote that, haha.

Incidentally, this is one reason why I've been advocating for a return of scoped threads to libstd; I can't change this example to be the "good" one, because we don't refer to external packages and we don't have scoped threads built-in.