by nirvana 5411 days ago
My "through the DB" solution is not as good as a heap or stack would be, but it's not as bad as it might sound, because the DB lives in memory. If you have some data in a given phase and you add it to the context, it will be there in the next phase.

It would be easy to have an API along the lines of "in the next phase, call this function and pass it this data". I could make an API that does that, or you could put the data under a key in the context and then call that function at the beginning of the next phase. If the set of functions you'd like to have called that way varies from request to request, they could be stuffed in a list under a key, and you would just process each of the functions in that list.
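To make the "list of functions under a key" idea concrete, here is a minimal sketch in Python rather than CoffeeScript. Everything in it is an assumption for illustration: the context is modeled as a plain dict, and `call_next_phase`, `run_deferred`, and the key name `deferred_calls` are hypothetical names, not part of any existing API.

```python
# Hypothetical sketch: the context is a plain dict, and all names below
# are made up for illustration.

DEFERRED_KEY = "deferred_calls"


def call_next_phase(context, func, data):
    """Queue func(context, data) to run at the start of the next phase."""
    context.setdefault(DEFERRED_KEY, []).append((func, data))


def run_deferred(context):
    """Run everything the previous phase queued, then clear the queue."""
    pending = context.pop(DEFERRED_KEY, [])
    return [func(context, data) for func, data in pending]


def greet(context, data):
    return "hello " + data


def init_phase(context):
    # "In the next phase, call this function, pass it this data."
    call_next_phase(context, greet, "world")


def start_phase(context):
    # The first thing the next phase does is process the queued calls.
    return run_deferred(context)


context = {}
init_phase(context)
results = start_phase(context)
```

Because the queue is just data in the context, the set of deferred functions can differ from request to request without any change to the phases themselves.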

I think it will be quite possible to provide something equivalent to closures via an API. I can't yet say how syntactically convenient they will be, but I don't think it will be too bad.

On further thought, I think it would be quite possible to do actor-style message passing... I'm focusing a bit much on the mechanics of implementation right now, and not making this transparent, but the context could easily be used to manage a set of mailboxes and "processes", where, in each phase, or even between phases, whenever a message is available in a mailbox, the function that it was sent to gets woken up and executed. Or rather, not the function, but the process.

So, I can add an API that provides an actor model interface. The actors can be identified by a process ID, they can send messages to each other (addressed by PID) and include arbitrary data, and this can happen concurrently in CoffeeScript.
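As a rough illustration of what such an actor interface might look like, here is a small single-threaded sketch in Python. It is not the actual API: `Scheduler`, `spawn`, `send`, and `run` are hypothetical names, a "process" is just a function paired with some state, and "concurrently" here only means cooperative interleaving of message deliveries.

```python
from collections import deque
from itertools import count


class Scheduler:
    """Hypothetical mailbox manager; all names here are illustrative."""

    def __init__(self):
        self._next_pid = count(1)
        self.procs = {}      # pid -> (function, state)
        self.mailboxes = {}  # pid -> deque of waiting messages

    def spawn(self, func, state):
        """Pair a function with some state and give the pair a pid."""
        pid = next(self._next_pid)
        self.procs[pid] = (func, state)
        self.mailboxes[pid] = deque()
        return pid

    def send(self, pid, message):
        self.mailboxes[pid].append(message)

    def run(self):
        """Wake processes one message at a time until all mailboxes are empty."""
        delivered = True
        while delivered:
            delivered = False
            # Copy the items so a process may spawn others while running.
            for pid, box in list(self.mailboxes.items()):
                if box:
                    func, state = self.procs[pid]
                    func(self, pid, state, box.popleft())
                    delivered = True


def collector(sched, pid, state, message):
    state.append(message)


def doubler(sched, pid, state, message):
    # Processes address each other by pid and may send arbitrary data.
    sched.send(state["target"], message * 2)


sched = Scheduler()
log = []
sink = sched.spawn(collector, log)
dbl = sched.spawn(doubler, {"target": sink})
sched.send(dbl, 1)
sched.send(dbl, 2)
sched.run()
```

After `run()` returns, `log` holds the doubled values in the order they were sent.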

1 comments

Wouldn't it be cleaner if you sent messages to a computation state (this request in a future phase) as an indirection, as the pid might not be allocated yet?
I think the pids are getting confused. When I say pid, I mean an id for a combination of a given function and some data: an instance, a fake sort of process that is facilitated by my code invoking the function with the data from its mailbox whenever there is a message sent to it by another "process". I'm not talking about Erlang processes or "real" processes. So, you wouldn't have the problem of "the pid might not be allocated yet", because you would have allocated it yourself.

example in pseudo coffeerlangscript:

  init ->
    pidOne = spawn(functionA, argumentlist),
    pidTwo = spawn(functionA, differentarguments),
    contextSet("pidOne", pidOne),
    contextSet("pidTwo", pidTwo),
    lookupData(bucket, key, pidOne),
    lookupData(bucket, key, functionB).

  functionA(message) -> doStuff().

So, here you're "spawning" two processes. For a function to act like a process, it is written such that it takes any messages it gets as arguments. I could set up their own contexts too, so "contextSet" in pidOne and pidTwo would write to unique namespaces. lookupData, instead of only taking a function to invoke, can take a process, and it sends that process a message when it has retrieved the data off of the disk.

functionB could send a message to pidOne and pidTwo (which it can find in the context).

So, the init phase is here, and later the start phase will be called. But the thread of execution would be: init, then the database queries happen in parallel. When they are successful, pidOne gets a message and functionB is called (possibly running in different environments). functionB sends a message to pidOne and pidTwo, both of which are invoked with these new messages. When there are no more messages waiting for any of these pseudo-processes, and no more database queries or other long-running operations running in parallel, the next phase is called.
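The thread of execution described above can be sketched roughly as follows, again in Python rather than CoffeeScript. This is only an approximation under stated assumptions: `Request`, `settle`, and `lookup_data` are hypothetical names, the spawn argument lists are omitted for brevity, and the "parallel" database queries are simulated by a queue of pending lookups whose results are fabricated placeholder strings.

```python
from collections import deque


class Request:
    """Hypothetical per-request state; all names are illustrative."""

    def __init__(self):
        self.context = {}
        self.procs = {}         # pid -> function
        self.mailbox = deque()  # (pid, message) pairs awaiting delivery
        self.pending = deque()  # simulated in-flight lookups
        self.trace = []         # records what ran, in order

    def spawn(self, func):
        pid = len(self.procs) + 1
        self.procs[pid] = func
        return pid

    def send(self, pid, message):
        self.mailbox.append((pid, message))

    def lookup_data(self, key, target):
        # target is either a pid (gets a message) or a function (gets called).
        self.pending.append((key, target))

    def settle(self):
        """Run until no messages are waiting and no lookups are pending."""
        while self.mailbox or self.pending:
            if self.mailbox:
                pid, message = self.mailbox.popleft()
                self.procs[pid](self, pid, message)
            else:
                key, target = self.pending.popleft()
                data = "data:" + key  # stand-in for a real disk read
                if callable(target):
                    target(self, data)
                else:
                    self.send(target, data)


def function_a(req, pid, message):
    req.trace.append((pid, message))


def function_b(req, data):
    # Finds both pids in the context and messages each of them.
    req.send(req.context["pidOne"], "from B")
    req.send(req.context["pidTwo"], "from B")


def init(req):
    req.context["pidOne"] = req.spawn(function_a)
    req.context["pidTwo"] = req.spawn(function_a)
    req.lookup_data("k1", req.context["pidOne"])
    req.lookup_data("k2", function_b)


def start(req):
    req.trace.append("start")


req = Request()
for phase in (init, start):
    phase(req)
    req.settle()  # the next phase runs only once everything is quiet
```

The point of `settle()` is the last sentence above: a phase is not considered finished until every mailbox is empty and nothing is still in flight.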

If you're saying there's a better way to do this, my ears are open, I just need a little more explanation.

Ah, OK, by pid I took it to mean a unixy pid or an Erlang mailbox. What you are saying is what I was thinking...