by chrixian 4032 days ago
The thing about pointers that I don't get is why do you need the memory address of the variable? Is that the only way to get the value when you want it? Like, every variable has to have a pointer in order to make use of the variable?
5 comments

This comes from a restriction of almost all existing computer architectures. You have a small number (16 on amd64) of 'variables' called registers that you can work with directly. Additional variables have to be loaded from and stored in memory, which is slower and requires you to know that variable's address (pointer) -- which is just an integer with special meaning. In C, variables you never take a pointer to might live exclusively in a register, while other local variables are typically stored at a fixed offset from a special 'stack pointer', which is kept in a dedicated register.

Some architectures take a more sophisticated approach and have some sort of 'fat pointer', in which pointer values carry a special tag and are subject to special rules so they can only point to valid objects. The exact rules, and what counts as 'valid', are specific to the architecture. Intel has introduced MPX on newer processors to check array bounds with such a scheme, and older designs such as the Lisp machines had much stronger (but less efficient) schemes.
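The same idea can be imitated in software, which is not what those architectures do in hardware but may make the concept concrete. A minimal sketch, assuming a made-up `struct int_span` that pairs a pointer with an element count, and a made-up `span_get` that refuses out-of-range reads:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical software "fat pointer": a base address plus an element count. */
struct int_span {
    int *base;
    size_t len;
};

/* Bounds-checked read: returns false instead of reading out of range. */
bool span_get(struct int_span s, size_t i, int *out)
{
    assert(out != NULL);
    if (s.base == NULL || i >= s.len) {
        return false;          /* index outside the object the span describes */
    }
    *out = s.base[i];
    return true;
}
```

A plain `int *` carries no length, so nothing like this check is possible with it alone.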

If you want arrays of variables, you need pointers. It's not enough to know the value of an int (the first element of the array); you need a pointer to an int (a pointer to the first element of the array). Then you can increment the pointer to move to the next element. You couldn't do that with a plain value.
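A small sketch of that, assuming a hypothetical helper called `sum_ints` that is handed only the address of the first element and a count:

```c
#include <assert.h>
#include <stddef.h>

/* Sum n ints starting at the address `first`.
   Incrementing the pointer moves it to the next element of the array. */
int sum_ints(const int *first, size_t n)
{
    assert(first != NULL);
    int total = 0;
    for (const int *p = first; p != first + n; ++p) {
        total += *p;           /* read the element p currently points to */
    }
    return total;
}
```

Calling it as `sum_ints(a, 4)` on `int a[] = {1, 2, 3, 4}` walks all four elements, and `sum_ints(a + 1, 2)` starts from the second one.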

Also, suppose you want to pass a variable of a large data type, like an image, to a function. Instead of copying the entire variable, just pass the cheap pointer. The analogy is giving someone a URL vs. the source code of the site for them to paste into the browser (you can't fit the latter onto a QR code, for instance, but you can fit the former).
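Roughly what that looks like in C. The `struct image` type, its size, and both function names are invented for illustration:

```c
#include <assert.h>
#include <stddef.h>

#define IMAGE_PIXELS (640 * 480)

/* Hypothetical image type, large enough that copying it would be costly. */
struct image {
    unsigned char pixels[IMAGE_PIXELS];
};

/* Modifies the caller's image in place through the pointer. */
void fill_image(struct image *img, unsigned char value)
{
    assert(img != NULL);
    for (size_t i = 0; i < IMAGE_PIXELS; ++i) {
        img->pixels[i] = value;
    }
}

/* Receives only an address, not a copy of the pixel data. */
unsigned long brightness_sum(const struct image *img)
{
    assert(img != NULL);
    unsigned long sum = 0;
    for (size_t i = 0; i < IMAGE_PIXELS; ++i) {
        sum += img->pixels[i];
    }
    return sum;
}
```

Had `brightness_sum` taken a `struct image` by value, every call would copy all 307200 bytes first.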

Of course the problem is, if you're passing a pointer to a data structure to a function, the function doesn't know the size of the data structure unless you pass that as another argument.
You meant to say, "if you're passing a pointer to an array to a function, the function doesn't know the size of the array unless you pass that as another argument".

When passing a (pointer to a) data structure to a function, in 99.99% of cases there's only one data structure you'd pass, and you build this into the function's prototype, e.g.,

  int myfunction( struct my_structure *x )
instead of

  int myfunction( void *x )
and so, yes, the function does know the size of the structure. And in the case of arrays, it's often enough to mark the end of the array (with '\0' in the case of char arrays or NULL in the case of pointer arrays). I'd only roll up my sleeves and worry about minimizing length calculations if I had actually done some profiling and determined that such nitty-gritty optimization was needed (it rarely is).
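Both points in one sketch. The members of `struct my_structure` and the helper names are assumptions made up for the example:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical contents for the structure named in the prototype above. */
struct my_structure {
    int id;
    double weight;
};

/* The parameter type fixes the size of the pointee at compile time. */
size_t pointee_size(const struct my_structure *x)
{
    assert(x != NULL);
    return sizeof *x;
}

/* Length of a '\0'-terminated char array (what strlen does). */
size_t my_strlen(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != '\0') {
        ++n;
    }
    return n;
}

/* Number of entries in a NULL-terminated array of strings. */
size_t count_strings(const char *const *list)
{
    assert(list != NULL);
    size_t n = 0;
    while (list[n] != NULL) {
        ++n;
    }
    return n;
}
```

The sentinel versions rescan the array on every call, which is the cost you'd look at only after profiling.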
You don't need the address of a variable: you can use the plain variable just fine, and most C code does a lot of that.

What a pointer does is add a level of indirection: so instead of having a value "an integer" you can have a value which is "the location of an integer". A variable holding such a value can be assigned the location of any integer variable, and importantly can also be reassigned the location of a different integer variable.
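A short sketch of that reassignment; `retarget_demo` and `larger_of` are names made up for the example:

```c
#include <assert.h>
#include <stddef.h>

/* One pointer variable, pointed first at x and then at y. */
int retarget_demo(void)
{
    int x = 1, y = 2;
    int *p = &x;   /* p holds the location of x */
    *p = 10;       /* writes x through p */
    p = &y;        /* p now holds the location of y */
    *p = 20;       /* writes y; x is untouched */
    return x + y;  /* 10 + 20 */
}

/* Returns the location of whichever int is larger (a if they are equal),
   so the caller can write to it. */
int *larger_of(int *a, int *b)
{
    assert(a != NULL && b != NULL);
    return (*a >= *b) ? a : b;
}
```

Because `larger_of` returns a location rather than a value, `*larger_of(&a, &b) = 0;` overwrites whichever variable held the larger number.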

The additional indirection also means that you can link one data structure to another without including one as an integral part of the other.
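For instance, two records can refer to the same third record without either containing a copy of it. The `struct author` and `struct book` types here are invented for illustration:

```c
#include <assert.h>
#include <stddef.h>

struct author {
    const char *name;
    int books_written;
};

/* A book links to its author instead of embedding a copy of it. */
struct book {
    const char *title;
    struct author *written_by;
};

/* Updates the one shared author record through the link. */
void credit_author(struct book *b)
{
    assert(b != NULL && b->written_by != NULL);
    b->written_by->books_written++;
}
```

If both books point at the same `struct author`, crediting each of them once leaves that single record with a count of 2.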

And why is this useful? Well, for high-level folks, pointers are used for roughly the same thing as reference variables in other languages.

For low-level folks, sometimes you need to be able to read from / write to a specific address in memory. So if you have, for instance, a system clock device that always gives you the current time if you read address 0x1234, you might do something like this:

uint64 system_time;

uint64 system_time_device = 0x1234; // A pointer to the system time device...

system_time = system_time_device; // read the contents of the memory at address 0x1234 to get the time

For anyone confused, HN's formatting system changed the two asterisks that give this meaning into italics-start and italics-end.

  uint64_t system_time; // uint64_t comes from <stdint.h>
  volatile uint64_t *system_time_device = (volatile uint64_t *)0x1234; // set pointer
  system_time = *system_time_device; // read what's pointed to
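On an ordinary desktop OS, address 0x1234 is almost certainly not mapped, so the read above would crash there. A sketch that can be tried anywhere, assuming a made-up `read_time` function that takes the device address as a parameter so a normal variable can stand in for the hardware:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 64-bit device register.
   `volatile` tells the compiler that every read must really go to memory,
   because the value can change without the program writing to it. */
uint64_t read_time(const volatile uint64_t *device)
{
    assert(device != NULL);
    return *device;
}
```

On the real hardware the call would be something like `read_time((const volatile uint64_t *)0x1234)`; in a test you can pass the address of a plain `uint64_t` instead.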
Speaking from a C++ perspective, the item pointed to may not be a simple type (like an integer) but a complex object.

Copying an object around all over the place (into functions, out of them) would be expensive. It would be like copying an entire ledger every time you wanted to make a change to the ledger. Far better would be to hand the ledger around, or when looking for it ask "where is it?" and be pointed to where it is now.

It also makes it simpler to make sure all your data is in one place, which is a good thing for program design.

If you don't have a copy of the value, you need some way to find it, right? That's a pointer, or a reference, or a handle. (These are all approximate synonyms for some way to "address" the data.) In the olden days, there was a fixed mapping from a number to a physical location in storage, but now there are many levels of indirection, such as virtual memory, pools, etc.
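As one illustration of a handle that is not a raw address, here is a sketch of a tiny fixed-size pool where the handle is just an index. The pool, the `handle` type, and both functions are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 8

typedef size_t handle;          /* hypothetical handle: an index into the pool */

static int pool[POOL_SIZE];
static size_t pool_used = 0;

/* Store a value in the pool and return a handle to it. */
handle pool_put(int value)
{
    assert(pool_used < POOL_SIZE);
    pool[pool_used] = value;
    return pool_used++;
}

/* Resolve a handle to the actual storage location. */
int *pool_get(handle h)
{
    assert(h < pool_used);
    return &pool[h];
}
```

The caller never sees where the pool lives; the extra lookup in `pool_get` is one of those levels of indirection.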