Hacker News new | ask | show | jobs
by 1718627440 301 days ago
> Having to pass the object explicitly every time feels clunky, especially compared to C++ where this is implicit.

I personally don't like implicit this. You are very much passing a this instance around, as opposed to a class method. Also explicit this eliminates the problem, that you don't know if the variable is an instance variable or a global/from somewhere else.

8 comments

Agreed, one of the biggest design mistakes in the OOP syntax of C++ (and Java, for that matter) was not making `this` mandatory when referring to instance members.
Mandatory this can also be a major hit in readability. What if you have a class that implements the abc-formula. You get

  (- this->b + sqrt(this->b * this->b - 4 this->a * this->c))/(2 * this->a)
and

  (- this->b - sqrt(this->b * this->b - 4 this->a * this->c))/(2 * this->a)
This is a readability problem for any class that is used to do computations.
C++ and Java went for the "objects as static closures" route, where it doesn't make any sense to have a `this`. Or, they made them superficially look like static closures, which in hindsight was probably not the best idea. Anyway, Java lets you use explicit `this`, I don't recall whether C++ makes it into a footgun or not.
Both languages let you use explicit `this` but don’t mandate it. The “static closure” approach is great. I don’t like having to explicitly pass `this` as a parameter to every method call as in the OP (or worse, the confusing Python approach of forcing `self` to be explicitly written in every non-static method signature but having it be implicitly passed during method calls).

What I don’t like is being able to reference instance members without `this`, e.g.

   void foo() {
      int x = bar + 1; // should be illegal: it can be hard to distinguish if `bar` is a local variable versus an instance member
      int y = this->bar + 1; // disambiguation is good
   }
> int x = bar + 1; // should be illegal: it can be hard to distinguish if `bar` is a local variable versus an instance member

If it was this->bar it could be a member, it could also be a static variable. A bar on its own could be local or it could be in any of the enclosing scopes or namespaces. Forcing "this" to be explicit doesn't make the code any clearer on its own.

The guy was referring to the explicit case where bar is a member variable. The cases where it is in the local scope under scoping rules or the global scope are not really an issue, since you can check the function definition to find if it is local. In the case that it is not in the function definition, then it is in the global scope in C. If implicit this were not done in C++, that would also be the case in C++, provided you do the sane thing and use the std namespace for everything. Just thinking about the headaches namespaces could cause when looking for such definitions with cscope gives me yet another reason to stay away from C++ whenever possible.
this in C++ is just a regular pointer, it has no special footguns, just the typical ones you have with pointers in general
that's not really true - unlike a regular pointer, `this` is not allowed to be null, thus removing `if(this == nullptr)` is always a valid optimization to do.
It absolutely is allowed to be null:

    #include <iostream>
    struct Foo {
      void bar() {
        std::cout << this << std::endl;
      }
    };
    
    int main() {
      Foo *p = nullptr;
      p->bar();
    }
will print 0
This is undefined behavior in my understanding, it just happens to work until it doesn't.

I wouldn't be surprised if any null check against this would be erased by the optimizer for example as the parent comment mentioned. Sanitizers might check for null this too.

Just to cite what the others have told you: https://godbolt.org/z/bWfaYrqoY

    /app/example.cpp:10:6: runtime error: member call on null pointer of type 'Foo'
    SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /app/example.cpp:10:6 
    /app/example.cpp:3:8: runtime error: member call on null pointer of type 'Foo *'
    SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /app/example.cpp:3:8

    Foo *p = nullptr;
    p->bar();
That's undefined behavior. It's "allowed" in the sense of "yes it's possible to write, compile and run that code", but the language makes no guarantees about what the results will be. So maybe not the most useful definition of "allowed".
That code is outside the set of valid c++ programs
Only in so much that this being null is UB, but in the real world it very much can be null and programs will sometimes crash deep inside methods called on nullptr.
Supporting an explicit this is required to be able to access the member variable when a local variable hides it under scoping rules. As another person replied, C++ does indeed have an implicit this.
You must be joking.

That would be a complete redability disaster... at least for C++. Java peeps probably won't even flinch ;)

I think the author is talking about this:

  object->ops->start(object)
Where not only is it explicit, but you need to specify the object twice (once to resolve the Vtable, and a second time to pass the object to the stateless C method implementation).
Nothing a little macro magic couldn't fix..

  #define CALL(object, function, ...) (object->ops->function(object, __VA_ARGS__))
You can also just replicate the vtable of sorts in C that keeps track of things when new objects are created.
You can just use C++ instead of reinventing a worse version of C++98!
It would be an improvement on C++ since you would avoid the horrendous error messages from ridiculously long types, slow compile times, exception handling nightmares, bloat from RTTI and constant worry that operators might not mean what you expect. That is before even mentioning entirely new classes of problems like the diamond problem. Less is more.

That said, similar macro magic is used in C generic data structures and it works very well.

What guarantee do you have that ->ops is a vtable? It could contain function pointers that don’t take an implicit this argument like struct file_operations in Linux does. It could also contain variables that are non-pointers. Neither is allowed in a vtable, but both are fine to do in C.
As it isn't the compiler that creates the vtable, you can also have the equivalent of this as the last parameter or where you want it to be.

> It could also contain variables that are non-pointers.

The convention of it being a pure vtable is that it just doesn't.

> Neither is allowed in a vtable

Who is the vtable membership authority? :-)

You can take the API for a linked list and implement a balanced binary search tree behind it, but continuing to call it a linked list after doing that is wrong. Similarly, you can do pointer indirections the same way a vtable would do them, but if the things you get are not the equivalent of member function pointers, it is not a vtable.
> In computer programming, a virtual method table (VMT), virtual function table, virtual call table, dispatch table, vtable, or vftable is a mechanism used in a programming language to support dynamic dispatch (or run-time method binding). (Wikipedia)

When its a table of function pointers used for dynamic dispatch, to me it's a vtable. I don't care about their type signatures as long as they logically belong to the object in question.

You seem to have a different very narrow definition of vtables, so the discussion is kind of useless.

Read the definition again. The programming language here is not the one using this. It is the programmer using it.
Yes I know. From the caller that might seem to be redundant, my argument was about the callee's side. Also it is not truely redundant, as you can write:

    object1->op->start(object2)
    superclass->op->start(object)
I think both of these invocations are invalid. Using object1's vtable methods on object2, obviously, but in the latter case: the vtable method should just point at the superclass impl, if not overridden. And if overridden and the child impl needs to call the superclass, it can just do so without dispatching through some vtable.
When you think of vtables as unique or owned by an object, then these example seem weird to you. When you think of them as orthogonal to your types/objects, these examples can be useful.

In the first example, object1 and object2 can very much be of the same type or compatible types/subtypes/supertypes. Having vtables per object as opposed to per class to me indicates, that it IS intended to modify the behaviour of an object by changing it's vtable. Using the behaviour of another object of the same type to treat the second object, seams valid to me.

In the second case, it's not about the child implementation dispatching to the superclass, it's about some external code wanting to treat it as an object of the supertype. It's what in other languages needs an upcast. And the supertype might also have dynamic behaviour, otherwise you of course wouldn't use a vtable.

I think it is wrong/weird for objects of the same type to have different vtables, yes. I would call those different types.

Upcasting is fine, but generally speaking the expected behavior of invoking a superclass method on an object that is actually a subclass is that the subclass method implementation is used (in C++, this would be a virtual/override type method, as opposed to a static method). Invoking a superclass-specific method impl on a subclass object is kind of weird.

In most languages, this is not possible, because they abstract over the implementation of classes. In C it is so you can be more creative. You can for example use it instead of a flag for behaviour. Why branch on a variable and then call separate methods, when you can simply assign the wanted implementation directly. If you want to know, which implementation/mode is used, comparing function pointers and scalar variables amounts to the same. It is also an easy way to get a unique number. When all implementations of that can operate on the same type, they are interchangeable.

In C you can also change the "class" of an instance as needed, without special syntax. Maybe you need to already call a method of the new/old class, before/after actually changing the class type.

> is that the subclass method implementation is used

The entire point of invoking the superclass method is, because the subclass has a different implementation and you want to use the superclass implementation.

"Orthogonal" vtables are essentially traits/typeclasses.
What I really like about C is that it supports these sophisticated concepts without having explicit support for them. It just naturally emerges from the core concepts. This is what makes it feel like it just doesn't restrict the programmer much.

Maybe it's a bit due to its evolution. It started with a language that should have all features every possible, that was to complicated to be implemented at the time. Then it was dumbed down to a really simple language. And then it evolved along side a project adding the features, that are truly useful.

You are wrong about those invocations being invalid. Such patterns happen in filesystem code fairly often. The best example off the top of my head is:

error = old_dir->i_op->rename(rd->new_mnt_idmap, old_dir, old_dentry, new_dir, new_dentry, flags);

https://github.com/torvalds/linux/blob/master/fs/namei.c#L51...

That is a close match for the first example, with additional arguments.

It is helpful to remember that this is not object oriented programming and not try to shoehorn this into the paradigm of object oriented programming. This is data abstraction, which has similarities (and inspired OOP), but is subtly different. Data abstraction does not automatically imply any sort of inheritance. Thus you cannot treat things as necessarily having a subclass and superclass. If you must think of it in OOP terms, imagine that your superclass is an abstract class, with no implemented members, except you can instantiate a child class that is also abstract, and you will never do any inheritance on the so called child class.

Now, it is possible to implement things in such a way where they actually do have something that resembles a subclass and a superclass. This is often done in filesystem inode structures. The filesystem will have its own specialized inode structure where the generic VFS inode structure is the first member and thus you can cast safely from the generic inode structure to the specialized one. There is no need to cast in the other direction since you can access all of the generic inode structure’s members. This trick is useful when the VFS calls us via inode operations. We know that the inode pointer is really a pointer to our specialized inode structure, so we can safely cast to it to access the specialized fields. This is essentially `superclass->op->start(object)`, which was the second example.

Data abstraction is a really powerful technique and honestly, object oriented programming rarely does anything that makes me want it over data abstraction. The only thing that I have seen object oriented programming do better in practice than data abstraction is marketing. The second example is similar to C++’s curiously recurring template pattern, which adds boilerplate and fighting with the compiler with absurdly long error messages due to absurdly verbose types to achieve a result that often at best is the same thing. On top of those headaches, all of the language complexity makes the compile times slow. Only marketing could convince someone that the C++ OOP way is better.

I don't agree that your example pattern matches to the example I'm complaining about. vfs_rename() is using old_dir's vtable on old_dir. The vtable matches the object.
It is not a vtable. It is a structure of function pointers called struct inode_operations. It is reused for all inodes in that filesystem. If you get it from one callback, you can safely use it on another struct inode on the same filesystem without a problem because of that, because nobody uses this like a vtable to implement an inheritance hierarchy. There are even functions in struct inode_operations that don’t even require the inode structure to be passed, such as ->readlink, which is most unlike a vtable since static member functions are never in vtables:

https://www.kernel.org/doc/html/latest/filesystems/vfs.html

As I said previously, it is helpful to remember that this is not object oriented programming and not try to shoehorn this into the paradigm of object oriented programming. Calling this a vtable is wrong.

> Also explicit this eliminates the problem, that you don't know if the variable is an instance variable or a global/from somewhere else.

People typically use some kind of naming convention for their member variables, e.g. mFoo, m_Foo, m_foo, foo_, etc., so that's not an issue. I find `foo_` much more concise than `this->foo`. Also note that you can use explicity this in C++ if you really want to.

In code I write, I can know what variables mean. The feature loses its point, when it's not mandatory. Also being explicit allows you to be more expressive with variable name and ordering.
The implicit this sounds to me like magic. Magic!

Ask how do I do this, well see it's magic. It just happens.

Something went wrong? That's also magic.

After 40 years I hate magic.

I don't quite agree, especially because the implicit this not only saves you from explicitly typing it, but also because by having actual methods you don't need to add the struct suffix to every function.

    mystruct_dosmth(s);
    mystruct_dosmthelse(s);
vs

    s->dosmth();
    s->dosmthelse();
My problem with implicit this is more, that you can access member variables, without it being explicit, i.e. about the callee, not about the caller.

For the function naming, nothing stops you from doing the same in C:

   static dosmth (struct * s);
   
   s->dosmth = dosmth;
That doesn't stop you from mentioning s twice. While it is redundant in the common case, it isn't in every case like I wrote elsewhere. Also this is easily fixable as written several times here, by a macro, or by using the type directly.
This is not the same, you introduced dynamic function resolution (i.e.a function pointer tied to a specific instance), we are talking about static function resolution (purely based on the declared type).
True, if you don't trust the compiler to optimize that, then you must live with the C naming.
You can also get clever with macros.
...and C++ added explicit this parameters (deducing this) in C++23.
“this” is a reserved keyword in C++, so you do not need to worry about it being a global variable.

That said, I like having a this pointer explicitly passed as it is in C with ADTs. The functions that do not need a this pointer never accidentally have it passed from the developer forgetting to mark the function static or not wanting to rewrite all of the function accesses to use the :: operator.

It’s not about ‘this’ being a global, it’s if you see ‘i++’ in code it’s not obvious if ‘i’ is a member or not without having to check context.
If you see "i++" in code and you don't have any context about what "i" is, then what difference does it make if "i" is a member variable, global variable, parameter, etc etc...

If all you see in code is a very tiny 3 character expression, you won't be able to make much of a judgement about it to begin with.

Not allowing a variable to implicitly refer to a member variable makes it much easier to find. If it is not declared in the function and there is no implicit dereferencing of a this pointer, the variable is global. If the variable name is commonly used and it is a member variable, it is a nightmare to hunt for the correct declaration in the codebase with cscope.
Good point. I had misunderstood the previous comment as suggesting that this be passed to the member function as an explicit argument, rather than requiring dereferences of this be explicit. The latter makes far more sense and I agree it makes reasoning about things much easier.