	by frde 1137 days ago
	I don't write C / C++ so I'm not too aware of what's going on there, but wouldn't someone that wants those features just switch to C++? Is there any reason to change C at this point?

5 comments

Someone 1137 days ago

Switching to C++ gets you lots of things that aren’t “C-like” (that, of course, is a vague term that, as this thread shows, people will disagree about, but I think there’s consensus that C++ has many features that aren’t C-like), and may get you subtle bugs because of small incompatibilities between C and C++. For example, sizeof('x') is 1 in C++, but >1 in C because 'x' is a char in C++ and an int in C.

https://en.wikipedia.org/wiki/Compatibility_of_C_and_C%2B%2B

nequo 1137 days ago

Why is

  sizeof('x')

equal to 4 if

  char letter = 'x';

  sizeof(letter)

is equal to 1, just like `sizeof(char)`? If `'x'` is represented as an `int` in C, shouldn't `letter` in this example also be represented as an `int`?

tialaramex 1137 days ago

No. The type of 'x' is int. It so happens on your platform (and most available systems today) sizeof(int) == 4.

The type of letter was explicitly char, and sizeof(char) == 1 by definition in C.

char letter = 'x'; involves an implicit conversion. The literal is an int with the value 120 (on ASCII systems), which is then converted to the char type, an integer type that is 8 bits wide on almost every platform (it might be signed, might not; that doesn't matter in this case).
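
A small sketch of the distinction, assuming a C11 compiler (for static_assert). The helper name `size_of_letter` is made up for illustration:

```c
#include <assert.h>  /* static_assert (C11), assert */
#include <stddef.h>  /* size_t */

/* In C, a character constant has type int, so its size is sizeof(int). */
static_assert(sizeof('x') == sizeof(int), "a character constant has type int in C");

/* sizeof(char) is 1 by definition, whatever value the char holds. */
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");

/* Hypothetical helper: size of a char object initialized from 'x'. */
size_t size_of_letter(void)
{
    char letter = 'x';      /* the int value is converted to char here */
    return sizeof(letter);  /* size of the object's declared type */
}
```

Compiled as C++, the first static_assert would fail on typical platforms, which is the incompatibility mentioned upthread.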

gallier2 1136 days ago

People often forget that multi-char immediates like 'ab' or '1ert' are allowed in C. They are almost unusable because they are highly non-portable (due to endianness issues between the front-end and the back-end).

tialaramex 1136 days ago

This is once in a while kinda useful, aside from the data layout issue for stuff like a FourCC.

Rust has e.g. u32::to_le_bytes() to turn an integer into some (little endian) bytes, but I don't know if there's a trivial way to write the opposite and turn b"1ert" (which is an array of four bytes) into a native integer.

Edited to Add: oh yeah, it has u32::from_le_bytes(*b"1ert"). I should have checked.
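
For comparison, a portable C way to build a FourCC-style integer is to assemble it from the individual bytes instead of relying on the implementation-defined value of a multi-char constant. A minimal sketch; `fourcc_le` is a made-up helper, not a standard function, and it assumes 8-bit bytes:

```c
#include <assert.h>  /* static_assert (C11) */
#include <limits.h>  /* CHAR_BIT */
#include <stdint.h>  /* uint32_t */

static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

/* Pack four bytes into a 32-bit value, first byte in the lowest 8 bits.
   The result does not depend on the endianness of the host. */
uint32_t fourcc_le(const char tag[4])
{
    return  (uint32_t)(unsigned char)tag[0]
         | ((uint32_t)(unsigned char)tag[1] << 8)
         | ((uint32_t)(unsigned char)tag[2] << 16)
         | ((uint32_t)(unsigned char)tag[3] << 24);
}
```

On an ASCII system, `fourcc_le("1ert")` is 0x74726531.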

nequo 1137 days ago

Does this mean that `word` in

  char *word = "xyz";

is a pointer to an array of four `int`s, `'x'`, `'y'`, `'z'`, and `'\0'`? When I evaluate

  sizeof(*word)

I do get 1 instead of 4, even though `*word` is pointing to `'x'`. Where are the remaining 3 bytes in memory?

jcranmer 1137 days ago

A char is 1 byte by definition. But the type of a character literal (the 'x' syntax) is not a char, but an int instead.

The C type system generally matters so little that the type of an expression has little relevance (sizeof is the most notable exception to that rule), which obscures this fact.

seba_dos1 1137 days ago

Not at all. There are no character literals in "xyz", this is a string literal and it's unrelated to what your parent was saying.

_kst_ 1137 days ago

word is of type char*, a pointer to a (single) object of type char.

The initializer means that the char object it points to happens to be the first (0th) element of an array containing 4 elements with values 'x', 'y', 'z', and '\0'.

Most manipulation of arrays in C is done via pointers to the individual elements, and arithmetic on those pointers. (Incrementing a pointer value yields a pointer to the next element in the array.)

For example, `sizeof word` gives you the size of the pointer object, but `strlen(word)` yields 3, because it calls a library function that performs pointer arithmetic to find the trailing '\0' that marks the end of the string. (A "string" in C is a data layout, not a data type.)
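
To make the pointer arithmetic concrete, this is roughly what a strlen-style function does. It is a sketch, not any library's actual implementation, and `my_strlen` is a made-up name:

```c
#include <assert.h>
#include <stddef.h>  /* size_t, NULL */

/* Walk the array one char at a time until the terminating '\0'. */
size_t my_strlen(const char *s)
{
    assert(s != NULL);          /* a null pointer is not a string */
    const char *p = s;
    while (*p != '\0')
        p++;                    /* step to the next element of the array */
    return (size_t)(p - s);     /* number of chars before the terminator */
}
```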

lucozade 1137 days ago

If you specifically type it as char * then it's a pointer to chars, each of which has size 1.

kzrdude 1137 days ago

you'll have to understand the 'x' syntax and the "xyz" syntax as two different things. Different quotes.

nequo 1137 days ago

I know. But my understanding was that `"xyz"` is an array of characters so that these two would have the same representation in memory:

  char word[] = {'x', 'y', 'z', '\0'};  // sizeof(word) = 4, sizeof(*word) = 1
  char word[] = "xyz";                  // sizeof(word) = 4, sizeof(*word) = 1

What I did not realize was that the above two are not the same as this:

  char *word = "xyz";  // sizeof(word) = 8, sizeof(*word) = 1
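
The difference can be checked at compile time. A sketch assuming a C11 compiler; the variable names are made up, and the pointer is declared `const char *` here because string literals should not be modified:

```c
#include <assert.h>  /* static_assert (C11) */

/* An array of 4 chars, initialized by copying the literal's contents. */
char word_array[] = "xyz";

/* A pointer to the first char of a separate, unnamed array. */
const char *word_ptr = "xyz";

static_assert(sizeof(word_array) == 4, "three letters plus the terminator");
static_assert(sizeof(word_ptr) == sizeof(char *), "size of the pointer, not of the string");
static_assert(sizeof(*word_ptr) == 1, "what it points to is a single char");
```

The 8 in the comment above is simply sizeof(char *) on a typical 64-bit platform.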

_kst_ 1137 days ago

In the declaration

    char letter = 'x';

the initialization expression 'x', which for historical reasons is of type int in C, is implicitly converted to the type of the object. `letter` is a `char` because you defined it that way.

If you had written

    int letter = 'x';

that would be perfectly valid, and the conversion would be trivial (int to int).

It's just like:

    double x = 42;

`sizeof 42` might be 4 (sizeof (int)), but `sizeof x` will be the same as `sizeof (double)` (perhaps 8).

eMSF 1137 days ago

The type of the expression 'x' is int, not char (in C). The type of an expression consisting of a variable name is the type of the variable (as far as sizeof is concerned).

flohofwoe 1137 days ago

Many things that C++ added on top of C aren't actually improvements (I guess the most charitable thing that can be said about C++ is that it identifies and weeds out all the stupid ideas before they can make it into C).

ho_schi 1137 days ago

A usual programmer doesn’t need most features of C++, but there are many important ones:

Generic programming with templates, a usable std::string, smart pointers, references, the whole standard library (streams, file access, containers, threads).

The controversial ones seem to be exceptions and classes. Exceptions affect program flow, exception safety is very hard, and the runtime costs are an issue depending on the environment. Classes and inheritance are complicated features; operator overloading is one of the best things I’ve seen. But I can understand why many programmers don’t want to handle all the special rules involving classes.

flohofwoe 1136 days ago

This would add a lot of complexity into the C compiler and stdlib, and C would just end up as another C++. IMHO it is still important that a C compiler and stdlib can be written in reasonable time by an individual or small team.

And just one example (because it's one of my favourite topics where the C++ stdlib really failed):

std::string as a standard string type has so many (performance) problems that it is borderline useless. You can't simply use it anywhere without being very aware of the memory management implications (when and where does memory allocation and freeing happen - and std::string does its best to obscure such important details), and this isn't just an esoteric niche problem: https://groups.google.com/a/chromium.org/g/chromium-dev/c/EU...

nly 1136 days ago

Classes, which have constructors and destructors, simplify code immensely. So do copy and move constructors and operators.

pjmlp 1137 days ago

At least WG21 actually acknowledges solving security issues and UB problems are a real problem that needs to be sorted out.

Meanwhile, WG14: "not our problem".

shrimp_emoji 1137 days ago

Trying to figure out whether it's well-defined to compare two pointers to different objects in memory...

...By reading the C++ standard: "it's well-defined"

...By reading the C standard: *self-referencing Zalgo text quoted by dozens of StackOverflow thread debates where no completely confident conclusion is ever reached, although 3/4 people way smarter than you think it's well-defined and blame observations to the contrary on compiler bugs which they've reported to all the major compiler maintainers with varying reception from said maintainers as to whether they agree that those are actually bugs, forcing you, at the end of the day, to realize that you should really be writing your compiler's C, not aspiring to a universal, platonic ideal of C*

uecker 1135 days ago

Why? I do not find the wording in the C standard less clear than the C++ wording (where the result is unspecified for unrelated pointers).

rwmj 1137 days ago

C++ drags in a ton of other baggage that a large enough number of programmers don't want.

trelane 1137 days ago

Especially when there's Go, Rust, etc. these days. There is so much legacy with C++ the language that it's pretty easy to do stuff subtly wrong if you're not rigidly careful and adhering to a style guide that forces you to only use the safer bits.

hbossy 1137 days ago

C and C++ are separate languages and lots of people don't like C++.

lou1306 1137 days ago

Maybe. But by reading the article one does get the impression that GCC devs (and C2X proposal authors) really like C++: the language is mentioned 16 times, and easily half of the features are lifted more or less as-is from there.

seba_dos1 1137 days ago

As a C programmer, it just seems like they finally took the minority of ideas added by C++ that were actually good ideas and added them back to C. Aside from `auto`, which I'm ambivalent about (I think the only place where it's useful is macros), those all make perfect sense in the context of C, and I believe the only reason for C++ having them first is that C++ simply evolved faster.

klodolph 1137 days ago

It makes a lot of sense for the languages to be harmonized with each other. Differences like noreturn versus [[noreturn]] do nobody any favors. C++ has all these wacky things you can do with constexpr functions, and C is getting a VERY LIMITED version of this that only applies to constants, addressing a long-standing deficiency, where C provides only a way to define named constants as int type (using enum) or using macros, and you really want to be able to define constants as any type you like. The "const" qualifier doesn't do that, you see... it really means a couple different things, but the main one is "read-only", which is not the same as "constant".

steveklabnik 1137 days ago

One of the benefits, historically, to both languages is that they share a very large chunk of the language in common. It's therefore in their common interest to try and maintain that common subset wherever possible. The goal here (just to be clear, from my outside perspective) isn't to unify the languages, it's to ensure that stuff that's the same stays roughly the same. If the same code produces two different things, based on the language, that's unfortunate. Code that works in one but doesn't compile in the other is totally fine, of course.

frde 1137 days ago

I guess my question is: If you want `auto`, why put it in C instead of using C++ with no other C++ specific feature besides auto?

I get people don't like classes / templates / .. but there isn't any reason one has to use those.

masklinn 1137 days ago

> I guess my question is: If you want `auto`, why put it in C instead of using C++ with no other C++ specific feature besides auto?

Because they're orthogonal, and making function bodies less verbose with no loss of expressivity is nice, without needing to significantly alter the language?

Pretty much every C competitor has local type inference, and C actually needs more than most due to the `struct` namespace, typing `struct Foo` everywhere is annoying, and needing to typedef everything to avoid it is ugly.

Also C++ is missing convenient C features, like not needing to cast `void*` pointers, and designated initialisers were only added in C++20.
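
A short sketch of those two conveniences in plain C99/C11. `struct Foo` and `foo_new` are made-up names for illustration:

```c
#include <assert.h>
#include <stdlib.h>  /* malloc */

struct Foo {
    int id;
    double weight;
};

/* Allocate a Foo on the heap; returns NULL if allocation fails. */
struct Foo *foo_new(int id)
{
    /* C converts void* to struct Foo* implicitly; C++ would need a cast. */
    struct Foo *f = malloc(sizeof *f);
    if (f == NULL)
        return NULL;

    /* Designated initializer inside a compound literal. */
    *f = (struct Foo){ .id = id };
    assert(f->weight == 0.0);  /* members not named are zero-initialized */
    return f;
}
```

The caller is responsible for passing the result to free().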

pmarin 1137 days ago

Type inference only makes the code harder to read. You end up doing mental compiler work when you could just write the damn type.

And to the people who say "I just hover the variable in my IDE": it doesn't work in a terminal with grep, you can't hover over a diff file, and not even GitHub does the hover thing.

Combine that with the implicit type promotion rules of C. Have Fun.

masklinn 1137 days ago

> Type inference only makes the code harder to read.

Nonsense.

> Combine that with the implicit type promotion rules of C. Have Fun.

This sort of trivial TI does not make that any worse. C is broken, it neither breaks nor unbreaks C.

hgs3 1137 days ago

typeof and auto are useful for writing type-generic macros.

pmarin 1137 days ago

Yeah, but looking at how auto has been abused in C++, I don't think it's worth it.

circuit10 1137 days ago

I guess one possible reason is if there’s no C++ compiler for an obscure platform as it would be too much work, but there is an up-to-date C compiler

klodolph 1137 days ago

Yeah, seems like half of the embedded architectures are like this. Well, the C compiler is not quite standards compliant, and often made to an older version of the C spec, but give it time—the C2x standard comes out this year, and it may not benefit people in the embedded space until some years down the road.

Second-best time to plant a tree, and all.

klodolph 1137 days ago

You end up having to turn a lot of C++ features off in order to get the experience you want in certain environments. In an application running on a modern Windows/Linux/Mac system, it’s no big deal to use those features.

Some platforms also just don’t have C++ compilers. Yes, they still exist. You buy some microcontroller, download an IDE from the manufacturer's web site, and you get some version of C with a couple extensions to it. And then there are all the random incompatibilities between C and C++, where C code doesn’t compile as C++, or gives you a different result.

jcelerier 1137 days ago

What's "a lot of features" ? -fno-rtti, -fno-exceptions?

> Some platforms also just don’t have C++ compilers

It's not like they're going to have C23 compilers either

flohofwoe 1137 days ago

> It's not like they're going to have C23 compilers either

Niche compilers like SDCC (https://sdcc.sourceforge.net/) are actually keeping track of recent C language improvements quite well.

klodolph 1137 days ago

C++ has a lot of funny rules when it comes to constructors and initializers. It's easy to accidentally do something unintended, and end up with code that relies on initialization order.

ori_b 1137 days ago

Yes, so why copy paste C++ into C?

pmarin 1137 days ago

I wish the C Standard Committee stopped smearing all C++ bullshit into C, now that many of the C++ people who promoted those features are abandoning ship.

It's what you get when your C compilers are implemented in C++.

dale_glass 1137 days ago

Why "bullshit"? I looked at the article, and everything looks extremely reasonable, and desirable in C.

* nullptr: fixes problems with eg, va_arg

* better enums: Who doesn't want that? C is a systems language, dealing with stuff like file formats, no? So why shouldn't it be comfortable to define an enum of the right type?

* constexpr is good, an improvement on the macro hell some projects have

* unprototyped functions removed: FINALLY! That's a glaring source of security issues.

Really I don't see what's there to complain about, all good stuff.

hgs3 1137 days ago

> nullptr: fixes problems with eg, va_arg

nullptr is an overkill solution. The ambiguity could have been solved by mandating that NULL be defined as (void*)0 rather than giving implementations the choice of (void*)0 or 0.
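
The va_arg problem shows up with sentinel-terminated argument lists. A sketch, with `count_strings` as a made-up function that requires at least one string:

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>  /* size_t, NULL */

/* Count the strings in a list terminated by a null char pointer. */
size_t count_strings(const char *first, ...)
{
    assert(first != NULL);  /* this sketch requires at least one string */

    size_t n = 0;
    va_list ap;
    va_start(ap, first);
    /* Each variadic argument is read back as a pointer, so the caller
       must actually pass a pointer as the terminator. */
    for (const char *s = first; s != NULL; s = va_arg(ap, char *))
        n++;
    va_end(ap);
    return n;
}
```

A call like `count_strings("a", "b", (char *)0)` is fine everywhere. If NULL expands to a plain 0, then `count_strings("a", "b", NULL)` passes an int where va_arg expects a pointer, which is undefined behavior and can misbehave where int and pointers differ in size.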

cremno 1137 days ago

dmr would've approved of going a step further - only nullptr, no 0 and (void*)0:

>Although it would have been a bit of a pain to adapt,

>an '89 or '99 standard in which the only source representation

>of the null pointer was NULL or nil or some other built-in token

>would have had my approval.

https://groups.google.com/g/comp.std.c/c/fh4xKnWOQuo/m/IAaOe...

zajio1am 1137 days ago

Are there any mainstream implementations where NULL is not typed as (void *)? That seems like a choice that would cause so many problems (type warnings, va_arg issues); I wonder why anyone would do that.

peterfirefly 1137 days ago

Vintage code or code written by vintage coders.

Code written by C++ programmers.

Code written to be both C and C++.

ori_b 1137 days ago

No.

peterfirefly 1137 days ago

That would have been my preference as well. Either force it to be (void*)0 or, maybe, allow it to be 0 iff it has the same size and parameter passing method.

oblio 1137 days ago

> Really I don't see what's there to complain about, all good stuff.

It's called "change" and people don't like it.

quelsolaar 1137 days ago

constexpr is terrible.

- constexpr is not anything like constexpr in C++.

- It makes no guarantees about anything being compile time.

- It in no way reflects the ability of the compiler to make something compile time.

- It adds implementation burden by forcing the implementations to issue errors that do not reflect any real information. (For instance, you may get an error saying your constexpr isn't a constant expression, but if you remove the constexpr qualifier, the compiler can happily solve it as a constant expression.)

- All kinds of floating point issues.

We should not have admitted this into the standard; please do not use it.

nullptr is the third definition of null. One should be enough, two was bad. Why three?

dale_glass 1137 days ago

Well, that's interesting.

Got any more information on that? Why does it fail in that way? Is that an implementation or a specification problem?

quelsolaar 1137 days ago

I'm in WG14, so I've been involved in the discussions.

It fails for 2 reasons:

In order to make it easy to implement, it had to be made so limited that it is in no way useful.

The second reason, and the real killer, is the "as if" rule. It states that any implementation can do whatever it wants, as long as the output of the application is the same. This means that how a compiler goes about implementing something is entirely up to the compiler. This means that any expression can be evaluated at compile time or at execution time. You can even run the pre-processor at run time if you like! This enables all kinds of optimizations.

In reality, modern compilers like gcc, llvm and MSVC are far better at optimizing than what constexpr permits. However since the specification specifies exactly what can be in a constexpr, the implementations are required to issue an error if a constexpr does something beyond this.

dale_glass 1137 days ago

Okay, so that's a good start, but I still don't get it.

> In order to make it easy to implement it had to be made so limited, that it in no way useful.

Such as?

> The second reason, and the real killer, is the "as if" rule.

Why is that a problem? It sounds like a benefit. It means that the optimization can't break anything, which to me is kind of the point.

flohofwoe 1136 days ago

"Modern compilers" still fail at:

    const int bla = 23;
    const int blub[bla] = { 0 };

(see: https://www.godbolt.org/z/hjessMhGK)

Isn't this exactly what constexpr is supposed to solve?
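
For reference, the usual pre-C23 workaround is an enumeration constant (or a macro), which is an integer constant expression where a `const int` object is not. A sketch with made-up names; it only covers values that fit in an int, and as far as I understand, writing `constexpr int bla = 23;` in C23 is meant to make the original form valid:

```c
#include <assert.h>  /* static_assert (C11) */

/* An enumeration constant is a true integer constant expression. */
enum { BLA = 23 };

/* Fine at file scope: the array size is a constant, so this is not a VLA. */
const int blub[BLA] = { 0 };

static_assert(sizeof blub / sizeof blub[0] == BLA, "array has BLA elements");
```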

kats 1137 days ago

Adding to the C language has a lot of costs, even if it's just the increased complexity in the documentation and doesn't affect C99. Every processor, OS, and programming language used in business needs to fully support a C standard. So adding to C affects every processor and computer architecture, every new OS, every new language.

If you look at CPPreference you can see how much complexity has been added to the C standard in the last few years.

dale_glass 1137 days ago

What do those have to do with that? A processor has no need to know anything about constexpr, auto, or static_assert.

In fact I don't see anything that needs support anywhere but the actual compiler.

peterfirefly 1137 days ago

constexpr is also ridiculously simple to implement -- because the existing compilers already do something similar internally for all enumeration constants.

(Enumeration constants are the identifiers defined inside enum xxx {...})

flohofwoe 1137 days ago

...and most compilers also already silently treat compile time constant expressions like constexpr, an explicit constexpr just throws an error if the expression isn't actually a compile time constant.

myrmidon 1137 days ago

This is a completely unfair mischaracterization.

A lot of these ARE relevant and useful improvements to the C language itself; constexpr reduces the need for macro-constants (which is nice), ability to specify the enum storage type is often helpful and clean keywords for static_assert etc. are a good idea too.

And getting rid of the "void" for function arguments is basically the best thing since sliced bread.

Kranar 1137 days ago

> constexpr reduces the need for macro-constants

const is sufficient to eliminate the use of macro-constants with the exception of the use of such constants by the preprocessor itself (in which case constexpr is also inapplicable).

peterfirefly 1137 days ago

    #define DEF   (ABC+(GHI<<JKL_SHIFT))

Kranar 1137 days ago

Please make a point.

peterfirefly 1137 days ago

I did. This is exactly what we need constexpr for.

pmorici 1137 days ago

I don’t understand why anyone would use the “auto” variable type thing. In my experience it makes it impossible to read and understand code you aren’t familiar with.

unwind 1137 days ago

Well, the obvious (?) reason is to type less, and also reduce the risk of doing the wrong thing and using a type that is (subtly) wrong and having values converted which can lead to precision loss.

Also it can (in my opinion, brains seem to work differently) lower the cognitive load of a piece of code, by simply reducing the clutter.

Sure it can obscure the exact type of things, but I guess that's the trade-off some people are willing to make, at least sometimes.

Something like:

    const auto got = ftell(fp);

saves you from having to remember if ftell() returns int, long, long long, size_t, ssize_t, off_t or whatever and in many cases you can still use the value returned by e.g. comparing it to other values and so on without needing to know the exact type.

If you want to do I/O (print the number) then you have to know or convert to a known type of course.

This was just a quick response off the top of my head, I haven't actually used GCC 13/C2x yet although it would be dreamy to get a chance to port some old project over.

gradstudent 1137 days ago

typing less sounds like a minor benefit to me and the downsides are major: auto makes code unintelligible to humans on casual inspection

#noauto

shrimp_emoji 1137 days ago

> you can still use the value returned by e.g. comparing it to other values and so on without needing to know the exact type.

No no no no nonononono. No!

Loose typing was a mistake. I think any sober analyst of C and C++ knows that. The languages have been trying to rectify it ever since.

But dynamic typing was an even bigger mistake. Perversely, it's one caused by a language not having a compiler that can type check the code, which C does.

I want to actually know what my code is doing, thanks. If you want "expressive" programs that are impossible to reason about, just build the whole thing in Python or JS. (And then pull the classic move of breaking out mypy or TypeScript halfway into development, tee hee.)

The only time `auto` is acceptable is when used for lambdas or things whose type is already deducible from the initializer, like `auto p = make_unique<Foo>()`.

unwind 1137 days ago

There is nothing "dynamic" about what I suggested. There is a real, concrete, static and compile-time known type at all times.

In this case it would be long. I fail to see the huge risk you're implying by operating upon a long-typed value without repeating the type name in the code.

    const auto pos_auto = ftell(fp);
    const long pos_long = ftell(fp);

I don't understand what you can do with 'pos_long' that would be dangerous doing with 'pos_auto' (again, disregarding I/O since then you typically have to know or cast).
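
The static type can even be checked mechanically with C11 `_Generic`, which selects on the type of an expression without evaluating it. A sketch; `IS_LONG` is a made-up macro:

```c
#include <assert.h>  /* static_assert (C11) */
#include <stdio.h>   /* ftell, stdin */

/* Evaluates to 1 if the expression has type long, 0 otherwise.
   The operand is only inspected for its type, never evaluated. */
#define IS_LONG(expr) _Generic((expr), long: 1, default: 0)

/* ftell is declared as returning long, so this holds at compile time. */
static_assert(IS_LONG(ftell(stdin)), "ftell is declared to return long");
```

A variable declared with a deduced type from `ftell(fp)` gets that same type, just without it being spelled out.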

shrimp_emoji 1137 days ago

> again, disregarding I/O since then you typically have to know or cast

Thank you for answering for me!

grumpyprole 1137 days ago

You are confusing dynamic types with type inference.

shrimp_emoji 1137 days ago

I'm not, although it apparently came off that way.

I meant that, to a person reading the code, `auto` tells you about as much about the type you're looking at as no type at all (like in a dynamically typed language).

This chatter said it better: https://news.ycombinator.com/item?id=35814337

dxuh 1137 days ago

There is this guideline of "Almost Always Auto" (https://herbsutter.com/2013/08/12/gotw-94-solution-aaa-style...) and I have been following it for years both in my job and my personal projects and I have never been very confused by it or had any sort of bug because of it. I felt very reluctant about using it at all for quite a while myself, but in practice it just makes stuff easier and more obvious. A huge reason it's useful in C++ is generic code (stuff that depends on template parameters or has many template parameters) or deeply nested stuff (typing std::unordered_map<std::string, MyGreatObject::SomeValue>::iterator gets annoying), but it's nice almost everywhere. Most types are repeated and getting rid of those repetitions makes refactorings a lot easier and gets rid of some source of bugs. For example sometimes you forget to change all the relevant types from uint32_t to uint64_t when refactoring some value and stuff breaks weirdly (of course your compiler should warn on narrowing conversions, but just to illustrate the point, because it is very real).

pmarin 1137 days ago

>For example sometimes you forget to change all the relevant types from uint32_t to uint64_t when refactoring some value and stuff breaks weirdly

Use size_t

KerrAvon 1137 days ago

`size_t` may not be helpful. You can argue that some other specific typedef should have been used in this case, but it's kind of water under the bridge already.

tom_ 1137 days ago

size_t is for object sizes. It might not be 64 bits.

dale_glass 1137 days ago

auto is at its best when you have something like:

    std::unordered_multimap<string, std::unordered_multimap<string, someclass>> getWidgets();

With templates you can easily have very unwieldy types, and there's not that much benefit from spelling them out explicitly.

Like any tool, there are good and bad uses of it. Well used, it removes unnecessary clutter and makes the code more readable.

HybridCurve 1137 days ago

I agree with you here. Many people might find it useful but this is something better suited for C++ which is full of nebulous typing features.

bonzini 1137 days ago

It makes sense to have it in macros. Though standard C still doesn't have statement expressions, so...

dzaima 1137 days ago

As an example of this, a generic `MAX` macro that doesn't evaluate its arguments multiple times, would be (using said GNU extension of statement expressions):

    #define MAX(A, B) ({ auto a = (A); auto b = (B); a>b? a : b; })

As-is, for such things I just use __auto_type, as it's already in the GNU-extension-land.
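
A sketch of that `__auto_type` variant next to the naive macro, to show the single-evaluation property. This is GNU C only (GCC and Clang accept it; it is not standard C), and the local names and `demo_single_evaluation` are made up:

```c
#include <assert.h>

/* GNU C: statement expression plus __auto_type. The locals get unusual
   names so an argument that happens to be called `a` is not captured. */
#define MAX(A, B) ({ __auto_type max_a_ = (A); __auto_type max_b_ = (B); \
                     max_a_ > max_b_ ? max_a_ : max_b_; })

/* Naive version for comparison: the larger argument is evaluated twice. */
#define NAIVE_MAX(A, B) ((A) > (B) ? (A) : (B))

void demo_single_evaluation(void)
{
    int i = 1;
    int m = MAX(i++, 0);        /* i++ runs exactly once */
    assert(m == 1 && i == 2);

    int j = 1;
    int n = NAIVE_MAX(j++, 0);  /* j++ runs twice: once to compare, once for the result */
    assert(n == 2 && j == 3);
}
```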

afdbcreid 1137 days ago

Do you really need to parenthesize the parameters? Is there something that can break the variable declaration into multiple statements?

kps 1137 days ago

Here, no. It's just a habit or common style guideline to always parenthesize macro parameters since so many macros can otherwise break.

dzaima 1137 days ago

Here, probably not (with proper arguments at least; without the parens something like `MAX(1;2, 3)` would compile without errors though), but I'm just used to doing it everywhere.

peterfirefly 1137 days ago

I wish ({...}) had been in C23.

uecker 1137 days ago

I agree, this is one of the more important common extensions we are still missing.

Timpy 1137 days ago

Isn't auto already a keyword for specifying scope? I know it's never used and kind of redundant, but something like `auto x = foo(y);` is a terrible direction for C. Type being abundantly clear at all times is a huge feature.

tyler569 1137 days ago

The accepted proposal does address and preserve `auto`'s use as a storage class, so `auto x = foo(y);` means type deduction based on the return type of `foo`, and `auto int x = 10;` declares an integer with automatic storage duration.

myrmidon 1137 days ago

It becomes tolerable by using a text editor that is too clever for its own good and fills the type information back in as a hint.

But I'm not a big fan either.

flohofwoe 1137 days ago

This gives a pretty good explanation why auto is useful in C:

https://thephd.dev/c23-is-coming-here-is-what-is-on-the-menu...

(TL;DR: it makes sense in macros)