Hacker News new | ask | show | jobs
by yorwba 3217 days ago
Since I'm not sure how understandable the explanation was, here is an example, a simple interpreter for a Lisp-like language, using Python lists as the syntax tree.

    def interpret(state, function_table, expression):
        function = function_table[expression[0]]
        return function(state, function_table, *expression[1:])
    
    def interpret_all(state, function_table, expressions):
        return [interpret(state, function_table, e) for e in expressions]
    
    interpreter_table = {
        '+': lambda s, ft, *es: reduce(lambda a, b: a + b, interpret_all(s, ft, es)), # sum
        '*': lambda s, ft, *es: reduce(lambda a, b: a * b, interpret_all(s, ft, es)), # product
        '!': lambda s, ft, k, v: s.update({k: interpret(s, ft, v)}),                  # set variable
        '?': lambda s, ft, k: s[k],                                                   # get variable
        ';': lambda s, ft, *es: interpret_all(s, ft, es)[-1],                         # execute a sequence and get the last value
    }
An example program for 'x = a * (b + a), return x * x' looks like this:

    program = [';',
                ['!', 'x', ['*', ['?', 'a'], ['+', ['?', 'b'], ['?', 'a']]]],
                ['*', ['?', 'x'], ['?', 'x']]]
You can execute the interpreter with initial values for a and b like this:

    >>> interpret({'a': 10, 'b': 20}, interpreter_table, program)
    90000
The compiler can be implemented by swapping out the interpreter_table with different functions to print code instead of executing it:

    def fresh_variable(state):
        next_tmp = state.get('__next_tmp__', 0)
        var = 'tmp_%s' % next_tmp
        state['__next_tmp__'] = next_tmp+1
        return var
    
    def compile_sum(state, a, b):
        var = fresh_variable(state)
        print(var+' = '+a+' + '+b)
        return var
    
    def compile_product(state, a, b):
        var = fresh_variable(state)
        print(var+' = '+a+' * '+b)
        return var
    
    def compile_set(state, key, value):
        state[key] = key
        print(key+' = '+value)
    
    compiler_table = {
        '+': lambda s, ft, *es: reduce(lambda a, b: compile_sum    (s, a, b), interpret_all(s, ft, es)), # sum
        '*': lambda s, ft, *es: reduce(lambda a, b: compile_product(s, a, b), interpret_all(s, ft, es)), # product
        '!': lambda s, ft, k, v: compile_set(s, k, interpret(s, ft, v)),                                 # set variable
        '?': lambda s, ft, k: s[k],                                                                      # get variable
        ';': lambda s, ft, *es: interpret_all(s, ft, es)[-1],                                            # execute a sequence and get the last value
    }
Then you can easily compile the program:

    >>> interpret({'a': '10', 'b': '20'}, compiler_table, program)
    tmp_0 = 20 + 10
    tmp_1 = 10 * tmp_0
    x = tmp_1
    tmp_2 = x * x
    'tmp_2' # this is the return value, telling you where to find the result
Depending on the syntax of your target language, control flow might require additional compiler state. In Python, I would have to store the current indentation level. If someone is interested, I could show how to do if-statements.
1 comments

Somewhat like this but taking it further: https://github.com/darius/toot