Hacker News new | ask | show | jobs
by Jasper_ 2187 days ago
I don't think it's the best code I've ever seen, but a lot of these are surface-level complaints.

> * goto inside 3 nested for-loops.

C has no pattern for breaking out of multiple for loops at the same time. Other languages like Java and JavaScript introduced "break label;" to handle this edge case. goto is perfectly acceptable to break out of multiple loops.

> * new/delete, with no RAII

They use RAII, but it's not applicable here. In this case, the developers only want to allocate a temporary array when it needs to be resized. Since we don't want a large stack allocation (stack sizes are super small on consoles), using a delete/free to resize an array seems fine to me. Game developers have a long-seated distrust of std::vector, and for good reasons.

RAII is used for the WriteCondLock which you criticize in the next bullet point, so it's not like they were unaware of it. Just not the right tool for the job.

> * thread specific variables and locks (?)

You seem to be upset that they have code that uses locks at all? I don't really know what this bullet-point is saying, other than "I looked for 5 minutes and didn't understand the threading structure".

If I had to take an issue with this code, it's the lack of enums for e.g. iSimClass, despite it having an enum with definitions. That's the sort of stuff that's difficult to reason about and follow along with without having a mapping in my head. And it has no overhead, so why not do it?

https://github.com/CRYTEK/CRYENGINE/blob/6c4f4df4a7a092300d6...

3 comments

I've found that a lot of more junior engineers just want to rotely pattern match on one-size-fits-all "style" rules in their code review without fully understanding why those rules exist in the first place.

Like all rules, there's always exceptions. In a hot code path like the physics engine logic, you're going for performance optimizations (which often LOOK messy) and that means making slight compromises on readability. It's quite a bit different than the code implementing the business logic in your SaaS company.

> C has no pattern for breaking out of multiple for loops at the same time.

True, but it is cleaner to reorganize the code into several functions and use the return value to propagate across layers if needed. Performance should be the same.

> it's not applicable here

Why? If there is a new/delete pair anywhere, it should have been an object.

> Game developers have a long-seated distrust of std::vector, and for good reasons.

Which reasons? std::vector (and std::unique_ptr) is universally useful (unlike many other std data structures) and codegen should be on par as a new/delete pair.

If the std is completely broken in some of the console platforms they need to support (likely from what I hear here) then it can also be done with a custom heap array wrapper.

So I don’t see the reason why that shouldn’t have been an object.

> True, but it is cleaner to reorganize the code into several functions and use the return value to propagate across layers if needed.

I disagree. Having to jump to another function definition which is inline is a bigger mental block than following a goto. The large amount of arguments you'd need to pass might also be a barrier, as is the mental overhead of checking to see if this function might be called from elsewhere. But reasonable people can disagree on this point.

I suggest you try to clean up the function yourself and see if your function-ed version is in any way improved.

> Why? If there is a new/delete pair anywhere, it should have been an object.

The case we want is that we have a large array of pairwise collisions. This is a temporary array used inside the method. If we ever have more than this, we need a new (contiguous) array. Are you suggesting that the code should have done something like this?

template<typename T> class TempArray { T* storage; void resize(int n) { delete[] storage; storage = new T[n]; };

I mean, sure, it's a very minor cleanup. It changes like, two lines though, and makes us have to access the array through an awkward ->storage pointer. Not ideal IMO.

> codegen should be on par as a new/delete pair.

Here's modern MSVC. Let's play "spot the difference": https://gcc.godbolt.org/z/KqR-LN

I'm not even doing anything but allocating the vector / array. It took me 2 minutes to navigate to godbolt and type this in. Don't just say "should be"... test it yourself!

If you enable /Ox, the codegen basically drops to what you would expect: the vector version drops down essentially identical code to the new/delete (modulo a memset to enforce the clear to zero condition)

It is a good illustration of why debug stl builds are such hot garbage though...

It is a good illustration for why using the STL is not always a good idea: you can't blindly take the perf hit from that kind of overhead in a debuggable build of a game that you still want to run at reasonably interactive frame rates.
False, most standard libraries out there allow you to configure whether you want extra checks or not.
But at least in the case of MSVC that comfiguration option leads to binary incompatibilities that make it an all or nothing option for everything that gets linked together statically. And the checks are so heavy that the "all" option becomes unbearably slow quite quickly. If your project hits a reasonable size, you end up requiring some clever solutions.
> Having to jump to another function definition which is inline is a bigger mental block than following a goto.

Local lambdas are ideal for this.

> Are you suggesting that the code should have done something like this?

Yes, but you can manage the array inside too.

> I mean, sure, it's a very minor cleanup. It changes like, two lines though

The point is that TempArray can be reused everywhere. This is a typical class that many projects use (stack if small, heap is bigger than threshold).

> Here's modern MSVC. Let's play "spot the difference"

The optimizer has been asked to leave everything as it is, so that is the expected result.

BTW, MSVC is not what you should be using if you want performance.

> Don't just say "should be"... test it yourself!

I always test codegen for all abstractions I use! So I agree.

> Local lambdas are ideal for this.

The problem with lambdas, you can't mark their operator() with __forceinline or __attribute__((always_inline)) attributes. For this reason, when writing high-performance manually vectorized code, lambdas are borderline useless.

> MSVC is not what you should be using if you want performance.

Security and compatibility has higher priority. gcc and clang don't deliver their C runtime libraries with windows updates. Also, debugging and crash diagnostic is much easier with MSVC.

It's same on Linux BTW, only with gcc.

> when writing high-performance manually vectorized code

The point was not about manually vectorized loops in particular. Why is that a problem if you are manually doing it, though?

> Security and compatibility has higher priority.

In commercial games, not really.

As for "compatibility", I am not sure what you mean.

> gcc and clang don't deliver their C runtime libraries with windows updates.

AFAIK you can use Windows libraries just fine. No need for using a different libc.

> Also, debugging and crash diagnostic is much easier with MSVC.

AFAIK, Clang can produce debugging info that you can use with VS.

I don't work on the environment, but it is what I have read here.

> Why is that a problem if you are manually doing it, though?

Here’s couple examples: https://github.com/Const-me/DtsDecoder/blob/master/Utils/App... https://github.com/Const-me/SimdIntroArticle/blob/master/Flo... I would like to use lambdas instead of classes, but can’t, due to that defect of C++.

> In commercial games, not really.

In commercial games too. While they don’t care about security, they do care about compatibility and crash report diagnostics.

> As for "compatibility", I am not sure what you mean.

A windows update shouldn’t break stuff. A software or a game should run on a Windows released at least 10 years in the future.

> Clang can produce debugging info that you can use with VS.

According to marketing announcements. This page however https://clang.llvm.org/docs/MSVCCompatibility.html only says “mostly complete” and also “Work to teach lld about CodeView and PDBs is ongoing”. PDB support is not about VC compatibility, it’s about Windows compatibility really: the debugger engine is a component of OS, even of the OS kernel. WinDbg is merely a GUI client for that engine, visual studio is another one.

Overall, in my experience, the platform-default compilers cause the least amount of issues. On Windows this means msvc, on Linux gcc, on OSX clang. Technically gcc and clang are very portable. Practically, when you’re using a non-default toolset, you’re a minority of users of that toolset, e.g. the bugs you’ll find will be de-prioritized.

I'm still getting up to speed on C++ (& ASM) so forgive the ignorance - are the differences you're referring to all the cleanup code that the vector does in the destructor?

As a side note, you're not calling delete[] on the array code, but I suppose it makes little difference if you take the vector cleanup into account.

unfortunately MSVC produces hot garbage asm, clang and gcc produce pretty much the same asm output using your example.
> C has no pattern for breaking out of multiple for loops at the same time. Other languages like Java and JavaScript introduced "break label;" to handle this edge case. goto is perfectly acceptable to break out of multiple loops.

Yeah, that's not what it does. The code looks like this:

    if (pgeom->Intersect(pentlist[i]->m_parts[j].pPhysGeomProxy->pGeom, gwd,gwd+1, &ip, pcontacts)) {
            got_unproj:
            if (dirUnproj.len2()==0) dirUnproj = pcontacts[0].dir;
            t = pcontacts[0].t;     // lock should be released after reading t
            return t;
    }       
    for(int ipart=1;ipart<m_nParts;ipart++) if (m_parts[ipart].flagsCollider & pentlist[i]->m_parts[j].flags) {
            gwd[2].R = Matrix33(qrot); 
            gwd[2].offset = pos + qrot*m_parts[ipart].pos;
            gwd[2].scale = m_parts[ipart].scale;
            gwd[2].v = -dirUnproj;
            if (m_parts[ipart].pPhysGeomProxy->pGeom->Intersect(pentlist[i]->m_parts[j].pPhysGeomProxy->pGeom, gwd+2,gwd+1, &ip, pcontacts))
                    goto got_unproj;
            }
    }
Notice (a) the if statement on the same line as the for loop, and (b) the fact that the goto jumps out of a conditional inside a for loop inside of a conditional into a conditional just before the for loop.

EDIT: Just realised the poster is referring to a different function.

IMO this one is more horrifying.

That's in a different function, so I wasn't counting it, but sure. I mean, I would write it differently, but it's fairly readable. It's jumping to a common function epilogue once it finds something it can collide with. Seriously -- try reading through it rather than gawking at the goto.
It's funny watching people reply to you without knowing anything about your skill set and experience.

Jasper_ is the real deal, people.

I still don't know who he is, but just from the arguments brought up you can tell this guy has experience in this stuff.

Those blanket statements like "never use goto" and "always trust stl" generally make me wary. I started out with gwbasic once, writing horrible goto spaghetti code. When moving through Pascal and C I eventually learned the "never use goto" mantra and naively tried to follow it at all cost. After I while I eventually encountered the "breaking out of nested loops" problem and refrained from using goto since I didn't want to look like a complete dork. I think I ended up with a flag that was set in the inner loop and checked in the outer and a break in each loop. That's what you get from blindly following rules that others try to present you as god-given.

IMO it should be clear from the top of a for loop how many iterations it will run. A goto randomly inside the loop is akin to a side effect in a function. Sure it might be the easiest solution you can think of, and might even generate optimal code, but it's less readable and forces future maintainers to exert more effort to understand the code.
> IMO it should be clear from the top of a for loop how many iterations it will run.

How would you implement something simple like a lookup in an array? From this argument even a break in a normal for-loop would be bad since I wouldn't know anymore how many iterations it will take. So if you have a nested loop and a "goto end_outer_loop" that's perfectly fine.

Either that code is misleadingly indented and you copied one more closing brace than necessary, or you omitted the opening brace for the last if's body (which contains a single goto).
Good catch. I think I copied one more than necessary, and then messed up when trying to fix up the indentation for HN.