Hacker News new | ask | show | jobs
by andrewcooke 5283 days ago
how do you combine first class functions (and so closures) with no garbage collection? don't you end up with possibly multiple references to memory and no way of knowing who owns it?
2 comments

If you have a function that returns a function, basically you need to rewrite things to lift the referenced state into the outer scope. In pseudo-C:

    (void -> int) incrementer(int initial_value) {
	int count = initial_value;
	return lambda() { return count++; }
    }
might be rewritten as:

    struct hidden_environment_type {
	int count;
    }

    int incrementer_lambda(hidden_environment_type *env) {
	return env->count++;
    }

    void incrementer(hidden_environment_type *hidden_arg, int initial_value) {
	hidden_arg->count = initial_value;
    }
Now the caller will allocate a variable of type hidden_environment_type, and pass its address as the first (artificial) argument.

Note that lambda functions require a different calling convention to C function pointers, since they need to be passed a pointer to their environment in addition to their regular arguments. Introducing lambdas either requires some resolution of this complication, such as specialisation machinery (ie in C++11, templates provide the necessary specialisation). Another option is to use the environment-passing convention for every indirect call, possibly giving a minor performance hit for calling raw function pointers.

Another issue is that the size of an environment type depends on the code of a function involving it, so lambdas don't have uniform size unless some additional indirection is introduced, probably to heap storage (Apple's blocks extension to C takes this approach). This makes it either impossible or slightly more expensive to store lambda functions in data structures, depending on which implementation you choose.

After all this, your ownership/lifetime question boils down to "who is holding onto the instance of hidden_environment_type?", which is a pretty simple question to answer in C-like languages.

If you read the dissertation, it turns out that variables are captured by value. But even if they are captured by reference - solutions are possible such as the one in C++11. The decac doesn't provide more memory safety than C, so there are no problems :) Yet another solution is to inline all higher-order functions so no closures exist at runtime.
Actually, this issue is what spurred a major evolution in the language. I hadn't had a real region or effect system before, and now I do. Why? I needed some way to register what pointers (regions) are getting captured by an existential package (aka: a lexical closure). Since regions are now registered as part of an existential package, the regions captured by a package can be constrained to suit the region of the stack up to which a package is being returned.

In short, closures now remember what pointers they capture, and you won't be allowed to return a closure over a pointer further up the stack than safe. Other than that, closures capture by value, copying the captured variable into an environment structure.