

	by nabla9 2298 days ago
	> no such feature has emerged in practice

	Arrays with length constantly emerge among C users and libraries. They are just all incompatible because without standardization there is no convergence.

4 comments

simias 2298 days ago

I think the problem is that C is simply ill-suited for these "high level" constructs. The best you're likely to get is an ad-hoc special library like the one for wchar_t, wcslen and friends. Do we really want that?

I'd argue that linked lists might make a better candidate for inclusion, because I've seen the kernel's list.h or similar implementations in many projects, and that stuff is trickier to get right than stuffing a pointer and a size_t in a struct.
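
For readers who have not seen the pattern mentioned here, this is a minimal sketch of an intrusive doubly linked list in the spirit of the kernel's list.h. All names (`ListNode`, `CONTAINER_OF`, `list_init`, `list_push_back`, `list_remove`, `Item`) are hypothetical and are not the kernel's actual API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical intrusive node; embed it inside the struct you want to link. */
typedef struct ListNode {
    struct ListNode *prev, *next;
} ListNode;

/* Recover the enclosing struct from a pointer to its embedded node. */
#define CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* An empty list is a head node that points at itself. */
static void list_init(ListNode *head) {
    head->prev = head->next = head;
}

/* Insert node just before head, i.e. at the tail of the circular list. */
static void list_push_back(ListNode *head, ListNode *node) {
    assert(head != NULL && node != NULL);
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Unlink node; the easy-to-miss part is resetting its own links. */
static void list_remove(ListNode *node) {
    node->prev->next = node->next;
    node->next->prev = node->prev;
    node->prev = node->next = node;
}

/* Example payload type with the node embedded. */
typedef struct {
    int value;
    ListNode link;
} Item;
```

The four pointer updates in `list_push_back` have to happen in an order that never loses the old tail, which is the kind of detail that makes this harder to get right than a pointer-plus-length struct.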

rseacord 2298 days ago

Sounds like a good use of standardization. If there is existing implementation practice, please go ahead and submit a proposal. I would be happy to champion such a proposal if you can't attend in person.

nabla9 2298 days ago

It was an observation, not a suggestion.

When the language standardization body has not managed to add arrays with length in 48 years, I don't think they should be added at this point. The culture is backward-looking and incompatible with modern needs, and the people involved are old and incompatible with the future (no offense, so am I).

C standardization effort should focus on finishing the language, not developing it to match the modern world. I have programmed in C for over 20 years, since I was a teenager. It has long been the systems programming language I'm most familiar with. For the last 10 years I have never written an executable, just short functions callable from other languages: Python, Java, Common Lisp, Matlab, and, 'horror of horrors', C++.

I think Standard C can live for the next 50 years in gradual decline as a portable assembler called from other languages and as a compilation target.

If I were to propose a new extension to the C language, I would propose a completely new language that can optionally be compiled into C and works side by side with old C code.

apotheon 2298 days ago

> If I were to propose a new extension to the C language, I would propose a completely new language that can optionally be compiled into C and works side by side with old C code.

There are a few somewhat popular languages that fit that description already, and none of them are suitable replacements for C (as far as I've seen). That's not to say there couldn't be a suitable replacement -- just that nobody in a position to do something about it wants the suitable replacement enough for it to have emerged, apparently.

I suspect the first really suitable complete replacement for C would be something like what Checked C [1] tried to be, but a little more ambitious and willing to include wholly new (but perhaps backward-compatible) features (like some of those you've proposed) implemented in an interestingly new enough way to warrant a whole new compile-to-C implementation. Something like that could greatly improve the use cases where a true C replacement would be most appreciated, and still fit "naturally" into environments where C is already the implementation language of choice via a piecemeal replacement strategy where the first step is just using the new language's compiler as the project compiler front end's drop-in replacement (without having to make any changes to the code at all for this first step).

1: https://www.microsoft.com/en-us/research/project/checked-c/

xtian 2298 days ago

Sounds like you are describing Zig. https://ziglang.org

apotheon 2296 days ago

I haven't looked at Zig too closely yet (only started just a few minutes ago), but it immediately appears to me that this violates one of the requirements I suggested, as demonstrated by this use-case wish from my previous comment:

> > using the new language's compiler as the project compiler front end's drop-in replacement (without having to make any changes to the code at all for this first step)

I'll look into Zig more, though. Maybe I'll like it.

---

I stand corrected, given my phrasing. I should have specified that it needs to also support incrementally adding the new language's features while most of the code is still unaltered C, rather than (for instance) having to suddenly replace all the includes and function prototypes just because you want to add (in the case of Zig) an error "catch" clause.

xtian 2296 days ago

You can use the Zig compiler to compile C with no modifications, and easily call C from Zig or Zig from C, so I'm not sure what more you're hoping for. A language that allows you to mix standard C and "improved C" in the same file sounds like a mess to me.

ATsch 2298 days ago

typedef struct {uint8_t *data; size_t len;} ByteBuf; is the first line of code I write in a C project.

mobilemidget 2297 days ago

Could you add some extra information on why this is so helpful or handy to have? I think it would benefit readers who are starting out with C.

saagarjha 2297 days ago

In C, dynamically-sized vectors don’t carry around size information with them, often leading to bugs. This struct attempts to keep the two together.
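
A minimal sketch of why keeping the two together helps, using the `ByteBuf` struct from the comment above. The helpers `bytebuf_from` and `bytebuf_get` are hypothetical names for illustration, not part of any library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Pointer and length travel together as one value. */
typedef struct {
    uint8_t *data;
    size_t len;
} ByteBuf;

/* Hypothetical helper: wrap an existing array without copying it. */
static ByteBuf bytebuf_from(uint8_t *data, size_t len) {
    ByteBuf b = { data, len };
    return b;
}

/* Hypothetical helper: read one byte, refusing out-of-range indexes. */
static bool bytebuf_get(ByteBuf b, size_t i, uint8_t *out) {
    assert(out != NULL);
    if (i >= b.len) {
        return false;   /* the length is right there, so the check is cheap */
    }
    *out = b.data[i];
    return true;
}
```

A function that takes a `ByteBuf` cannot be called with a buffer but without its size, which removes one common source of mismatched arguments.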

GoblinSlayer 2297 days ago

The memory corruption in sudo's password feedback code happened because the length and the pointer sit in unrelated variables and have to be manipulated by two separate statements every time, like some kind of manually inlined function. For comparison, PuTTY's slice API handles a slice as a whole object in a single statement, keeping length and pointer consistent.
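
To illustrate the "whole object in a single statement" idea, here is a sketch of a slicing helper for the pointer-plus-length struct. `bytebuf_slice` is a hypothetical function written for this example; it is not PuTTY's API and not sudo's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t *data;
    size_t len;
} ByteBuf;

/* Hypothetical helper: return the sub-buffer [start, start + count),
 * clamped to the bounds of b. Pointer and length change together. */
static ByteBuf bytebuf_slice(ByteBuf b, size_t start, size_t count) {
    assert(b.data != NULL);
    if (start > b.len) {
        start = b.len;              /* clamp: an empty slice at the end */
    }
    if (count > b.len - start) {
        count = b.len - start;      /* clamp: never run past the end */
    }
    ByteBuf s = { b.data + start, count };
    return s;
}
```

Callers write `rest = bytebuf_slice(buf, n, buf.len)` instead of adjusting two variables by hand, so there is no intermediate state where the pointer has moved but the length has not.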

kkdwivedi 2298 days ago

Another option is a struct with a flexible array member (FAM) at the end.

  typedef struct {
      size_t len;
      uint8_t data[];
  } ByteBuf;

Then, allocation becomes

  ByteBuf *b = malloc(sizeof(*b) + sizeof(uint8_t) * array_size);
  b->len = array_size;

and data is no longer a pointer.
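
A slightly fuller sketch of the same allocation, with the overflow and failure checks that the two-line version leaves out. The helpers `bytebuf_new` and `bytebuf_at` are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    size_t len;
    uint8_t data[];   /* flexible array member: storage follows the header */
} ByteBuf;

/* Hypothetical helper: allocate header and zeroed payload in one block.
 * Returns NULL on overflow or allocation failure; release with free(). */
static ByteBuf *bytebuf_new(size_t array_size) {
    if (array_size > SIZE_MAX - sizeof(ByteBuf)) {
        return NULL;                 /* size computation would overflow */
    }
    ByteBuf *b = malloc(sizeof(*b) + array_size);
    if (b == NULL) {
        return NULL;
    }
    b->len = array_size;
    memset(b->data, 0, array_size);
    return b;
}

/* Hypothetical accessor: bounds are checked when assertions are enabled. */
static uint8_t *bytebuf_at(ByteBuf *b, size_t i) {
    assert(b != NULL && i < b->len);
    return &b->data[i];
}
```

Header and payload live in a single allocation, so one `free(b)` releases both.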

ATsch 2298 days ago

Well, your ByteBuf is still a pointer. You also now need to dereference it to get the length. It also can't be passed by value, since it's very big. You also can't have multiple ByteBufs pointing at subsections of the same region of memory.

Thing is, you rarely want to share just a buffer anyway. You probably have additional state, locks, etc. So what I do is embed my ByteBuf directly into another structure, which then owns it completely:

    typedef struct {
        ...
        ByteBuf mybuffer;
        ...
    } SomeThing;

So we end up with the same number of pointers (1), but with some unique advantages.

kkdwivedi 2298 days ago

Right, totally depends on what you're doing. My example is not a good fit for intrusive use cases.

saagarjha 2298 days ago

sizeof(ByteBuf) == sizeof(size_t), and you can pass it by value; I just don't think you can do anything useful with it because it'll chop off the data.
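
A small sketch that checks this claim on the current platform. The function names are hypothetical, and the equality is what mainstream compilers produce for this particular struct; the standard allows extra trailing padding, so treat it as an observation rather than a guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    size_t len;
    uint8_t data[];
} ByteBuf;

/* Receives a copy of the header only; the bytes stored after the
 * original header are not part of the copy, so data[] must not be read. */
static size_t header_len(ByteBuf copy) {
    return copy.len;
}

/* Checks the layout claim: the FAM adds no size of its own, and the
 * payload starts right where the header ends. */
static void check_fam_layout(void) {
    assert(sizeof(ByteBuf) == sizeof(size_t));
    assert(offsetof(ByteBuf, data) == sizeof(size_t));
}
```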

kevin_thibedeau 2298 days ago

This will be an alignment problem on any platform with data types larger than size_t. You'd need an alignas(max_align_t) on the struct, at which point some people are going to be unhappy about the wasteful padding on a memory-constrained target.

benibela 2297 days ago

Why not typedef struct {uint8_t *data, dataend} ?

Makes it easier to take subranges out of it

orenht 2297 days ago

should be

  typedef struct {uint8_t *data, *dataend}

if I'm not mistaken :)

gkfasdfasdf 2297 days ago

What are the advantages of saving the end as a pointer? Genuinely curious. Seems like a length allows the end pointer to be quickly calculated (data + len), while being more useful for comparisons, etc.

benibela 2297 days ago

You can remove the first k elements of a view with data += k.

With the length you would need to do data += k; length -= k

This is especially handy if you want to use it as a safe iterator: you can do data++ in a loop.
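
A minimal sketch of the begin/end layout, assuming the corrected declaration with two pointers. `ByteRange`, `range_skip`, `range_len` and `range_sum` are hypothetical names for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t *data;      /* first element of the view */
    uint8_t *dataend;   /* one past the last element */
} ByteRange;

/* Drop the first k elements: only one field changes. */
static ByteRange range_skip(ByteRange r, size_t k) {
    assert(k <= (size_t)(r.dataend - r.data));
    r.data += k;
    return r;
}

/* The length is still available, at the cost of a subtraction. */
static size_t range_len(ByteRange r) {
    return (size_t)(r.dataend - r.data);
}

/* Iterate by advancing data until it meets dataend. */
static unsigned range_sum(ByteRange r) {
    unsigned sum = 0;
    for (; r.data != r.dataend; r.data++) {
        sum += *r.data;
    }
    return sum;
}
```

Because `ByteRange` is passed by value, advancing `r.data` inside a function leaves the caller's view unchanged.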

zoomablemind 2296 days ago

> ...You can remove the first k elements of a view with data += k.

How would you safely free(data) afterwards? You'd need to keep an alloc'ed pointer somehow.
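
One possible answer, sketched under the assumption that the view does not own the memory: keep the pointer returned by malloc in a separate field and only ever advance the view. `OwnedBytes`, `owned_alloc` and `owned_free` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    uint8_t *data, *dataend;   /* non-owning view; safe to advance */
} ByteRange;

/* Hypothetical owner: remembers the pointer malloc returned. */
typedef struct {
    uint8_t *base;     /* only this pointer is ever passed to free() */
    ByteRange view;    /* the part of the allocation still of interest */
} OwnedBytes;

/* Allocate n bytes and point the view at all of them; 0 on success. */
static int owned_alloc(OwnedBytes *o, size_t n) {
    assert(o != NULL && n > 0);
    o->base = malloc(n);
    if (o->base == NULL) {
        return -1;
    }
    o->view.data = o->base;
    o->view.dataend = o->base + n;
    return 0;
}

/* Release the allocation; unaffected by any view.data += k. */
static void owned_free(OwnedBytes *o) {
    free(o->base);
    o->base = NULL;
    o->view.data = o->view.dataend = NULL;
}
```

The cost is a third pointer in the owning struct, while the views handed to other code stay two words wide.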

gkfasdfasdf 2296 days ago

Got it. That is really neat, going to add to my bag of tricks...

benibela 2297 days ago

Right. I always think of the pointer declaration as part of the type. (That is why I do not use C. Is there really a good reason for this C syntax?)
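
A short sketch of the rule behind that correction: in a C declaration the `*` belongs to each declarator, not to the shared type specifier, following the language's "declaration mirrors use" design. The struct and macro names are hypothetical, and `_Generic` requires C11.

```c
#include <assert.h>
#include <stdint.h>

/* The '*' binds to the declarator, so only the first name is a pointer. */
struct Mixed {
    uint8_t *data, dataend;      /* dataend is a plain uint8_t */
};

struct BothPointers {
    uint8_t *data, *dataend;     /* each name needs its own '*' */
};

/* Compile-time classification via C11 _Generic. */
#define IS_BYTE_POINTER(x) _Generic((x), uint8_t *: 1, default: 0)

static void check_declarators(void) {
    struct Mixed m = { 0, 0 };
    struct BothPointers p = { 0, 0 };
    assert(IS_BYTE_POINTER(m.data) == 1);
    assert(IS_BYTE_POINTER(m.dataend) == 0);
    assert(IS_BYTE_POINTER(p.dataend) == 1);
}
```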

enriquto 2298 days ago

That's a really bizarre layout for your struct. Why don't you put the length first?

twic 2298 days ago

Why would it matter? The bytes aren't inline, this is just a struct with two word-sized fields.

A possible tiny advantage of this layout is that a pointer to this struct can be used as a pointer to a pointer-to-bytes without having to adjust it, although I'm not sure that's not undefined behaviour.

epr 2297 days ago

I don't think that's undefined behavior. That's how C's limited form of polymorphism is utilized. For example, many data structures behind dynamic languages are implemented in this way. A concrete example would be Python's object structs, which all begin with PyObject_HEAD.
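
A sketch of the first-member conversion being discussed. It relies on the C rule that a pointer to a struct, suitably converted, points to its initial member and vice versa. `Header`, `FloatObject` and both helper functions are hypothetical and only loosely modelled on the shared-header idea; they are not CPython's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t *data;   /* first member, so it sits at offset 0 */
    size_t len;
} ByteBuf;

/* Hypothetical "base" header shared by several struct types. */
typedef struct {
    int type_tag;
} Header;

typedef struct {
    Header head;     /* must be the first member for the cast to work */
    double value;
} FloatObject;

/* View a ByteBuf as a pointer to its first member. */
static uint8_t **bytebuf_as_data_ptr(ByteBuf *b) {
    assert(offsetof(ByteBuf, data) == 0);
    return (uint8_t **)b;
}

/* Upcast: any object that starts with a Header can be seen as one. */
static Header *as_header(FloatObject *f) {
    return (Header *)f;
}
```

With the length placed first instead, `bytebuf_as_data_ptr` would need an offset, which is the small difference twic describes.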

https://github.com/python/cpython/blob/master/Include/object...

ATsch 2298 days ago

I'm not sure if it matters. It might be better for some technical reason, such as speeding up double dereferences, because you don't need to add anything to get to the pointer. But to be honest I just copied it out of existing code.

saagarjha 2298 days ago

Most platforms have instructions for dereferencing with a displacement.

fulafel 2297 days ago

The "existing practice" qualification refers to existing compiler extensions, I'd guess. In that case, lobbying for the feature should be addressed to, e.g., the LLVM and GCC developers.