| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by glymor 2897 days ago

A reply in the thread from David Major about performance:

"At the moment, performance is a mixed bag. Some tests are up and some are down. In particular I believe Speedometer is down a few percent."

"Note however that clang-cl is punching above its weight. These builds currently have neither LTO nor PGO, while our MSVC builds use both of those. Any regressions that we're seeing ought to be short-lived. Once we enable LTO[1] and PGO[2], I expect clang to be a clear performance win."

[1] "LTO (Link Time Optimization) is a method for achieving better runtime performance through whole-program analysis and cross-module optimization." http://blog.llvm.org/2016/06/thinlto-scalable-and-incrementa...

[2] PGO = Profile Guided Optimization

2 comments

krasin 2897 days ago

LTO can be particularly handy when doing devirtualization. Our experience ([1]) with Chromium demonstrated that there's a fair amount of virtual methods which have exactly two implementation in the code: one for production and another one for tests. While linking production code, it's trivial for this optimization to spot that, replace a virtual call with regular invocation and then often inline and allow for other in-place optimizations. In many cases of the renderer (Blink), that gives 3%-7% speedup out of nowhere.

1. https://bugs.chromium.org/p/chromium/issues/detail?id=580389...

mehrdadn 2897 days ago

Out of curiosity, why use virtual calls in those cases? Why not just have two different implementation files -- one for tests, and one for production -- and choose one based on which one you're doing? It seems like a win all around.

fulafel 2897 days ago

There's always performance hacks to be found in big apps, in exchange for implementation complexity and man-hours. And there's always a next bottleneck to be found. Performnace bugs are subject to an extended form of survivorship bias where they become bigger when the preceding perf bottlenecks get fixed.

mehrdadn 2897 days ago

I've refreshed and seen three major revisions of your comment after writing a full reply to the first one, and I can't say I've been able to clearly follow where you're coming from and where you're going with any of them. I would suggest that if you feel the need for such drastic revisions, it may be worth reconsidering whether whatever you're trying to argue is really compelling.

slededit 2897 days ago

His advice is sage and you should look past whatever deficiencies you find in his form of communication. Simply put there's always bigger fish to fry, and virtual methods are a quick and easy way to implement test stubs. Especially when the compiler devirtualizes.

mehrdadn 2897 days ago

Well I had a response to that originally but his edits kind of changed how much sense my reply made in response. Here's what I had:

Performance is only one aspect of it. It also reduces code bloat, reducing the program's size footprint. Most tests (yes, I know, not all, but most) should not make it into the final binary users are running. I also don't see what's "hacky" about making a foo.test.cc file when I want an alternate implementation for foo.cc. It seems to be quite a positive and clear way to document the fact that an implementation is only needed for testing, and vice-versa. And not only that, but it reduces compile (& link) times, since you only need to compile one of the two implementations for each use case.

fulafel 2897 days ago

Sorry! That's a risk you have when following comments tightly :)

I'll reconstruct my first take from memory if you have comments on it:

Current CPUs are pretty good at predicting indirect branches, and it's hard to tell beforehand which virtual calls still turn out to be perf problems - and it's wasted effort and a fallible process to attempt their compllete elimination beforehand.

mehrdadn 2897 days ago

Right, and I posted my reply to that here: https://news.ycombinator.com/item?id=17504680

stinos 2897 days ago

It seems like a win all around.

Depends. Virtual calls can be done purely in code, with different implementation files you need to go round via the build system. The former just always works independent of platform whereas often for the latter it's more configuration work. Also, ignoring YAGNI, you could say the virtual one makes it easy to quickly test things out etc, is easily mockable, ...

Matthias247 2897 days ago

From my point of view a benefit for virtual calls is toolability. If you have an abstract interface, it’s pretty easy to either directly create mocks for it in the test file, or use sth like gmock. If you go through the link completely different implementation route, you might need to create and link a different library which contains the faked implementation for several test cases. It also works well with pure header code.

mehrdadn 2897 days ago

If I understand you correctly aren't you violating the initial assumption that there are only 2 implementations, one for production and one for testing? At which point, you're answering a different question...

krasin 2897 days ago

>Out of curiosity, why use virtual calls in those cases? Why not just have two different implementation files -- one for tests, and one for production -- and choose one based on which one you're doing? It seems like a win all around.

I don't fully understand the proposal here. Do you have an example of a code in the wild that uses the proposed scheme?

mehrdadn 2897 days ago

I'm just saying do conditional compilation instead of virtual dispatch, since you generally shouldn't really need both the test and production implementations to run inside the same program. So if you have reason to require

  // widget.h
  class Widget { virtual void throb(); };
  // widget.cc
  class WidgetImpl : public Widget { void throb() { ... } };
  class WidgetTest : public Widget { void throb() { ... } };

then, instead of that, just do

  // widget.h
  class Widget { void throb(); };
  // widget.cc
  void Widget::throb() { ... }
  // widget.test.cc
  void Widget::throb() { ... }

where you only compile widget.cc for the production build, and only compile widget.test.cc for the test build.

Feel free to post a more specific example and I can take a stab at seeing if I can make this transformation to it (or why it might not be possible, if you think it's not).

krasin 2897 days ago

Thank you for elaborating your proposal. It might work in a theoretical scenario, but unlikely to be practical for existing projects with millions lines of codes.

Essentially, it was easier to write a compiler / linker optimization, than change the source code. On the bright side, everyone wins!

mehrdadn 2897 days ago

> It might work in a theoretical scenario, but unlikely to be practical for existing projects with millions lines of codes.

I mean there's no reason you have to change everything by tomorrow. You could introduce it as a policy moving forward, the same way I'm sure you make any other changes to other patterns that need to be followed. I'm sure you have a process for this?

> Essentially, it was easier to write a compiler / linker optimization, than change the source code.

I agree, but see above and below.

> On the bright side, everyone wins!

Well, you waste space, time, and energy making the compiler and linker do work that doesn't inherently need to be done. You also lose parallelizability of the compilation of the two implementations if they currently reside in the same file. Finally the compiler (er, linker) might not be able to actually apply the devirtualizations you expect, whereas in this case it's just a matter of inlining since the target method is already known.

acqq 2897 days ago

In the performance-insensitive code the approach you defend could be "good enough" but for the code on the "hot path" the changes like the proposed by dataflow could result in the consistent improvement and are surely better long-term "win."

"We do it that way here" is not something that has to be always applied.

bluGill 2897 days ago

I have done things like that, but the one definition rule gets in the way.

    void DoSomething(widget* param, OtherClassWithvirtual* param2) {...}

My DoSomething function needs to link to widget, but now I have two different libraries that widget could be in. The linker encodes in my binary which library to load to get widget. All attempts to use the other library instead are undefined behavior.

The only way I know of to work around that is build DoSomething twice, linking the real widget once and the test widget the other. Note that DoSomething takes two parameters, and those are classes that may themselves also take other classes with virtual functions. All the possible cases quickly gets into a combinatoric explosion and my build times goes from an hour (already way too long) to days.

If I'm wrong please tell me how, I'd love to not have so many interfaces just because someone doesn't want their unit test to become a large scale integration test.

yzmtf2008 2897 days ago

What if I want both? What if in ClassATest, I need ClassBImpl? Do I go around creating more compile cases?

Because that’s what we need, more magic in the build system.

mehrdadn 2897 days ago

Not sure I see what you're thinking of... could you give me a code example?

pdpi 2897 days ago

The implementation of TestWidget only defines some stuff of its own but otherwise shares a bunch of code with Widget is a possible scenario here.

mehrdadn 2897 days ago

You can still derive classes when you need to, right?

yoklov 2897 days ago

One reason is that other test might need the production version. Building twice for that case is more of a pain and not really worth it.

mehrdadn 2897 days ago

I mean, then you can avoid it in that situation. Or actually use the production version like I explained here: https://news.ycombinator.com/item?id=17505358 Or find another way to use it. It's not all-or-nothing.

yoklov 2897 days ago

Yes that works, but relies on devirtualization in the production code, and isn't at all what you were suggesting before about having separate implementation files.

jeremyjh 2897 days ago

Have you seen this in use in a real project?

tzahola 2897 days ago

Or just use a macro?

hohenheim 2897 days ago

Yeah that would work but also make your code base a mess. Things that you can catch during build step and move in, or out, is much better handled at that level rather than leaking it into your code base.

simcop2387 2897 days ago

That's really cool. I've primarily seen LTO benefits on the more embedded side of things where it ends up enabling far more aggressive dead code elimination between code and libraries which makes it much easier to fit into small chips. I hadn't thought much about virtual methods with C++ code (uncommon to see those in embedded code to begin with).

kjeetgill 2897 days ago

Interesting. Do they collect profiles from Nightly users or do they have some internal characteristic workload?

gsnedders 2897 days ago

It's based on the perf tests, which I think are all public.