std::list is a linked list, so it makes sense that size() is linear. Only if you store the size separately, you get constant behavior (which the c++11 enforces), it's constant.
Your friend probably could've sped his code up even more by using a std::vector. Linked lists have very bad processor cache behavior (not allocated in continuous memory), there are not a lot of things a list does better than a vector.
Only if you prepend a lot, or insert in the middle of a list, it might have some advantages over a vector, even then, there're are better alternatives.
I have timed it on several implementations and posted (mostly on on reddit.com/r/cpp), but until you get several thousand items std::vector is just faster than std::list. Of course it varies from machine to machine, for example my 4th gen i7 could insert about 8,000 items before list finally got the same speed as inserting at the front of an std::vector and my AMD 8150 (The bulldozer thing way before ryzen) was about 3,500.
I did similar tests comparing linear scans through an std::vector and lookups in a std::map or std::unsorted_map, with similar results, many thousands of items are required before dumb vectors got slow.
My default rule is to use a std::array if the size is fixed or an std::vector if the size changes until I hit 1,000 items. Then look at the complexity notation of the operations I am doing and choose what containers to then to benchmark. Then I throw a couple microbenchmarks in the test suite and use whatever that tells me will be faster in my case. Sometimes I even go so far as to use ifdefs and typedefs to change the container for different builds.
In practice I use a lot of std::vectors and I sort them a lot. Sorted vectors are very fast in the general case. In the few workloads I have with large numbers of items, even with millions of items, I use std::sort and std::binary_search with std::vector because they can be much faster than other associative containers. You do need to use operator< intelligently and that means either already passing in an overload for the comparator function or making a simple container class that wraps these concepts the way that makes sense for your workload.
Amazing how often just replacing a std::vector with a gap buffer vector implementation will get both the benefits of vector storage combined with good insert/delete behavior (presuming locality of such operations which is often the case, especially when doing something that scans through the array).
I once had some code that I managed to speed up around 90%. It was returning a massive data structure that another function would then operate on. I sped it up by removing the data structure entirely, and instead adding a callback that would be called for each element of the data structure.
It's not always possible to do, and half the time it doesn't even help, but every now and then it's a good way to optimize.
Your friend probably could've sped his code up even more by using a std::vector. Linked lists have very bad processor cache behavior (not allocated in continuous memory), there are not a lot of things a list does better than a vector.
Only if you prepend a lot, or insert in the middle of a list, it might have some advantages over a vector, even then, there're are better alternatives.