	by JavaOnlyGuy 1450 days ago
	How are pointers implemented in a language that doesn't support them?

4 comments

suprjami 1450 days ago

I don't think this is how it works.

The JVM is a specification which describes a pretend computer and its instruction set.

This TruffleC doesn't translate C to Java and run a Java program. This compiles C to bytecode which operates on the JVM.

Whatever Java does or doesn't support is irrelevant to this compiler. TruffleC has nothing to do with the Java programming language at all.

Just like you can compile C and get a memory address of a stack or heap location on any physical computer supported by a C compiler, likewise you can compile C with TruffleC and get a memory address within the stack or heap of the pretend computer called the JVM.

This must be how it works, unless the JVM itself has no concept of memory addresses, which seems very unlikely to me. Let me know if I am wrong?

chrisseaton 1450 days ago

> This compiles C to bytecode which operates on the JVM.

No, it compiles C to an AST, which it then interprets. The AST, which is also the interpreter in the Truffle design, is then partially evaluated to produce machine code. No bytecode is generated at any point, and in fact you can run it on a JVM that doesn't use bytecode, and then there is no bytecode anywhere.
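To make "the AST is also the interpreter" concrete, here is a minimal sketch in plain Java. It does not use the real Truffle API: `Node`, `Const`, `ReadLocal` and `Add` are hypothetical names invented for illustration, and there is no partial evaluation here, just the self-executing tree.

```java
final class AstSketch {
    /** Every node can execute itself, so the tree doubles as the interpreter. */
    interface Node {
        long execute(long[] locals);
    }

    /** A literal constant such as `2`. */
    static final class Const implements Node {
        private final long value;
        Const(long value) { this.value = value; }
        @Override public long execute(long[] locals) { return value; }
    }

    /** A read of a local variable stored in a numbered slot. */
    static final class ReadLocal implements Node {
        private final int slot;
        ReadLocal(int slot) { this.slot = slot; }
        @Override public long execute(long[] locals) { return locals[slot]; }
    }

    /** Addition of two sub-expressions. */
    static final class Add implements Node {
        private final Node left;
        private final Node right;
        Add(Node left, Node right) { this.left = left; this.right = right; }
        @Override public long execute(long[] locals) {
            return left.execute(locals) + right.execute(locals);
        }
    }

    public static void main(String[] args) {
        // AST for the C expression `x + 2`, with x kept in local slot 0.
        Node expr = new Add(new ReadLocal(0), new Const(2));
        System.out.println(expr.execute(new long[] {40}));
    }
}
```

Nothing in this sketch emits bytecode for the C program: the only thing the JVM runs is the interpreter's own `execute` methods.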

chaosite 1450 days ago

I learned most of what I know about Truffle and Graal from your blog posts, so you obviously know more about this than me. However, I was under the impression that Truffle is quite closely integrated into GraalVM, that is, you can't use Truffle on a different JVM. Is that not true?

origin_path 1450 days ago

Not so. Truffle is just a Java library like any other. You can therefore run Truffle languages on any JVM. However, they will run slow as they are just interpreters, then. To get the speedups you need to use Graal, which recognizes Truffle as a library and treats it specially.

chaosite 1450 days ago

Well, OK, sure, but Truffle without partial evaluation is just an interpreter written in a very particular way...

I see what you mean though, thanks!

chrisseaton 1450 days ago

> Well, OK, sure, but Truffle without partial evaluation is just an interpreter written in a very particular way..

That's what it was to start with. Partial evaluation came later.

rschatz 1450 days ago

Truffle and partial evaluation also work on native-image. You could say this is a VM where there are no bytecodes anymore.

chaosite 1450 days ago

Oh, of course, but native-image is still a Graal feature, and I was asking about Truffle without Graal.

entropicdrifter 1449 days ago

native-image was created as part of the Graal project but I think it's a separate JVM implementation from GraalVM

dzaima 1450 days ago

the JVM bytecode does not have any memory address type. Just various width integers & floats, and references to managed heap objects. Arbitrary pointers would have to be done with 'long's one way or another.

rschatz 1450 days ago

You can still use pointers. It's a bit hidden, but there are things like `Unsafe.allocateMemory`, `Unsafe.getByte` and so on ;)
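For anyone who hasn't seen it, a rough sketch of what that looks like with `sun.misc.Unsafe`. Assumptions: an OpenJDK/HotSpot-style JDK where the singleton lives in the private field `theUnsafe`, and a JDK version that still permits these memory-access methods (recent releases deprecate them and may print warnings). `UnsafeSketch` and `roundTrip` are made-up names.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class UnsafeSketch {
    /** Fetches the Unsafe singleton reflectively (assumes the "theUnsafe" field exists). */
    static Unsafe unsafe() {
        try {
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return (Unsafe) field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not available", e);
        }
    }

    /** Writes one byte to off-heap memory through a raw address, then reads it back. */
    static byte roundTrip(byte value) {
        Unsafe u = unsafe();
        long address = u.allocateMemory(4);   // roughly malloc(4); the address is a plain long
        try {
            u.putByte(address + 1, value);    // roughly ((char*)p)[1] = value
            return u.getByte(address + 1);    // roughly ((char*)p)[1]
        } finally {
            u.freeMemory(address);            // roughly free(p)
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip((byte) 99));
    }
}
```

The "pointer" here is just a `long`, and pointer arithmetic is ordinary integer addition.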

dzaima 1450 days ago

right; at which point the subset of jvm you're using is a subset of any other IR/VM, the 'j' in 'jvm' being only useful as an implementation/runtime.

chaosite 1450 days ago

Sure, but don't discount all of the JIT optimizations that were implemented in the JVM and the huge number of engineer years invested in that particular implementation/runtime...

quietbritishjim 1450 days ago

I guess a really brute force way would be to have a huge dictionary mapping from "memory address" (really just an arbitrary number) to JVM object. malloc() would add to the dictionary and free() would remove an entry. Pointer dereference would look up in it but would need to be able to find the nearest lower entry (for when you have an array and dereference an entry in it, or use a pointer to a field in a struct).

I would hope that there's a much more efficient way to do it; this idea is just evidence that it could be done in principle. But I don't see what that more efficient way would be. You certainly need to keep a secret reference to each JVM object somehow, because C doesn't require you to keep any pointer to an object, e.g.:

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        intptr_t x = (intptr_t)malloc(sizeof(int));
        *(int*)x = 99;
        bool did_subtract_50 = false;
        if (x > 50) {
            did_subtract_50 = true;
            x -= 50;
        }
        // Now there is no pointer or even integer that contains the address

        // ... later ...
        // Retrieve the address and use and free it
        int* y = (int*)(x + 50 * did_subtract_50);
        printf("value: %d\n", *y);
        free(y);
        return 0;
    }
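
The brute-force dictionary described above could look roughly like this in Java, using a sorted map so that the "nearest lower entry" lookup is a single `floorEntry` call. This is only a sketch of the idea, not how TruffleC does it; `AddressMapSketch` and its method names are hypothetical, and the addresses are arbitrary numbers handed out by a counter.

```java
import java.util.Map;
import java.util.TreeMap;

final class AddressMapSketch {
    // Maps each block's base "address" to its storage. The map itself is the
    // "secret reference" that keeps every live block reachable.
    private final TreeMap<Long, byte[]> blocks = new TreeMap<>();
    private long nextAddress = 0x1000; // arbitrary non-zero start, so 0 can mean NULL

    long malloc(int size) {
        long base = nextAddress;
        blocks.put(base, new byte[size]);
        nextAddress += Math.max(size, 1); // keep address ranges disjoint
        return base;
    }

    void free(long address) {
        if (blocks.remove(address) == null) {
            throw new IllegalArgumentException("free() of an address that is not a block start");
        }
    }

    /** Finds the block containing the address via the nearest lower base address. */
    private Map.Entry<Long, byte[]> resolve(long address) {
        Map.Entry<Long, byte[]> entry = blocks.floorEntry(address);
        if (entry == null || address - entry.getKey() >= entry.getValue().length) {
            throw new IllegalStateException("dereference of an invalid address");
        }
        return entry;
    }

    byte load(long address) {
        Map.Entry<Long, byte[]> e = resolve(address);
        return e.getValue()[(int) (address - e.getKey())];
    }

    void store(long address, byte value) {
        Map.Entry<Long, byte[]> e = resolve(address);
        e.getValue()[(int) (address - e.getKey())] = value;
    }

    public static void main(String[] args) {
        AddressMapSketch heap = new AddressMapSketch();
        long p = heap.malloc(8);
        heap.store(p + 3, (byte) 99);   // interior pointer, like &array[3]
        System.out.println(heap.load(p + 3));
        heap.free(p);
    }
}
```

Because the program only ever sees numbers, tricks like subtracting 50 and adding it back later still resolve to the same block, at the cost of a tree lookup on every dereference.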

chrisseaton 1450 days ago

A class wrapping a long value with the pointer address in it.
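
A minimal sketch of what such a wrapper might look like. This is an illustration only, not TruffleC's actual class: `CPointer`, `plus` and the explicit `elementSize` parameter are all hypothetical.

```java
final class CPointer {
    private final long address; // the raw C address, held as a plain 64-bit integer

    CPointer(long address) { this.address = address; }

    long address() { return address; }

    boolean isNull() { return address == 0; }

    /** C-style pointer arithmetic: p + count, scaled by the pointee size in bytes. */
    CPointer plus(long count, int elementSize) {
        return new CPointer(address + count * elementSize);
    }

    @Override public boolean equals(Object other) {
        return other instanceof CPointer && ((CPointer) other).address == address;
    }

    @Override public int hashCode() { return Long.hashCode(address); }

    public static void main(String[] args) {
        CPointer p = new CPointer(0x1000);
        // Like `int *q = p + 3;` with sizeof(int) == 4.
        CPointer q = p.plus(3, 4);
        System.out.println(Long.toHexString(q.address()));
    }
}
```

Wrapping the `long` in a class gives the interpreter a distinct type for pointers, so they can be told apart from ordinary integers.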

sitkack 1450 days ago

Ha! Your paper is a "highly influential citation"

https://www.semanticscholar.org/paper/TruffleC%3A-dynamic-ex...

chrisseaton 1450 days ago

The side-bar says 'highly influential' but the badge lower down says 'highly influenced', which sounds like a bad thing, doesn't it?

forgotpwd16 1449 days ago

Probably meant as "[this paper has] highly influenced [citing paper]".

sitkack 1449 days ago

Semantic Scholar is calling out when it thinks the researchers were using drugs.

MaxBarraclough 1450 days ago

How is the C memory modelled? One big Java array, or are there multiple data-structures?

For instance, what happens when you call a function-pointer?

chrisseaton 1450 days ago

> How is the C memory modelled?

Using a combination of native memory and JVM managed memory, depending on what the memory is needed for.

> For instance, what happens when you call a function-pointer?

This is a good example - because TruffleC can inline-cache a function-pointer, inlining the called function!

All this is in the linked paper, of course.
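
For readers unfamiliar with inline caching, here is a rough sketch of the general idea in plain Java. It is not taken from the paper or from TruffleC: `CallSite` is a hypothetical class, C functions are stood in for by `LongUnaryOperator`, and the hit/miss counters exist only to make the behaviour visible.

```java
import java.util.function.LongUnaryOperator;

final class InlineCacheSketch {
    /** A call site that remembers the last function pointer it was asked to call. */
    static final class CallSite {
        private LongUnaryOperator cachedTarget; // null until the first call
        int hits;
        int misses;

        long call(LongUnaryOperator target, long arg) {
            if (target == cachedTarget) {
                hits++;                 // fast path: same target as last time
            } else {
                misses++;               // slow path: remember the new target
                cachedTarget = target;
            }
            return cachedTarget.applyAsLong(arg);
        }
    }

    public static void main(String[] args) {
        LongUnaryOperator increment = x -> x + 1;
        CallSite site = new CallSite();
        site.call(increment, 1); // miss: cache is empty
        site.call(increment, 2); // hit: same function pointer again
        System.out.println(site.hits + " hit(s), " + site.misses + " miss(es)");
    }
}
```

On the fast path the target is known at that call site, which is what gives a compiler the opportunity to inline the callee behind a cheap identity check.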

aardvark179 1450 days ago

It can be done in a few different ways. Native memory can be managed as plain native memory (under the hood you can use Unsafe to access that memory), but the real advantage is that pointers to many objects can be kept as managed pointers and not converted to a native value most of the time. For example, Ruby C extensions often use VALUEs to refer to Ruby objects, which are normally tagged pointers. In TruffleRuby we use ValueWrapper objects to represent these, and maintain a fast map between native values and these objects when necessary.
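
A simplified sketch of the "map between native values and objects" idea, assuming a two-way table that is only filled in when native code actually needs a raw value. This is not TruffleRuby's ValueWrapper implementation; `HandleTableSketch` and the handle numbering scheme are invented for illustration.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

final class HandleTableSketch {
    private final Map<Object, Long> handleByObject = new IdentityHashMap<>();
    private final Map<Long, Object> objectByHandle = new HashMap<>();
    private long nextHandle = 8; // hypothetical scheme: non-zero, 8-byte-aligned fake addresses

    /** Converts a managed object to a native-looking value, lazily assigning a handle. */
    long toNative(Object object) {
        Long handle = handleByObject.get(object);
        if (handle == null) {
            handle = nextHandle;
            nextHandle += 8;
            handleByObject.put(object, handle);
            objectByHandle.put(handle, object);
        }
        return handle;
    }

    /** Converts a native value handed back by C code into the managed object. */
    Object toManaged(long handle) {
        Object object = objectByHandle.get(handle);
        if (object == null) {
            throw new IllegalArgumentException("unknown handle: " + handle);
        }
        return object;
    }

    public static void main(String[] args) {
        HandleTableSketch table = new HandleTableSketch();
        Object rubyObject = new Object();
        long value = table.toNative(rubyObject);
        System.out.println(table.toManaged(value) == rubyObject);
    }
}
```

Objects that never escape to native code never get an entry, so they stay ordinary managed references.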

samus 1450 days ago

Well-behaved usages of pointers according to the C standard can be implemented by whatever means fit best. Fat pointers with metadata about the destination and a huge block of memory for generic cases come to mind. The rest is undefined behavior where the runtime can just nuke the program, aka segfaulting.
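
A minimal sketch of the fat-pointer idea, assuming the simplest possible case: a pointer into an `int` array that carries its target and an element index, and traps on an out-of-bounds dereference. `FatPointerSketch` is a hypothetical name, and a real runtime would need more cases than this (structs, casts, the generic block of memory).

```java
final class FatPointerSketch {
    private final int[] target; // metadata: the object this pointer points into
    private final int index;    // element offset within that object

    FatPointerSketch(int[] target, int index) {
        this.target = target;
        this.index = index;
    }

    /** Pointer arithmetic only moves the index; forming an out-of-range pointer is not checked. */
    FatPointerSketch plus(int count) {
        return new FatPointerSketch(target, index + count);
    }

    int load() {
        check();
        return target[index];
    }

    void store(int value) {
        check();
        target[index] = value;
    }

    /** Out-of-bounds dereference is undefined behavior in C, so the runtime may simply trap. */
    private void check() {
        if (index < 0 || index >= target.length) {
            throw new IllegalStateException("out-of-bounds dereference");
        }
    }

    public static void main(String[] args) {
        int[] array = {1, 2, 3};
        FatPointerSketch p = new FatPointerSketch(array, 0);
        p.plus(1).store(20);                 // like p[1] = 20
        System.out.println(p.plus(1).load());
    }
}
```

Since the pointer holds a normal reference to its target, the garbage collector keeps the target alive without any global address table.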