by marcan_42 1535 days ago
The solution isn't to stop using rand(). The solution is to stop using newlib.

If you're doing your own custom memory management like this, you shouldn't even have a malloc implementation at all. Even newlib is too bloated for your use case. At this point, chances are you're using a trivial subset of the C library and it'd be easy to roll your own. You can import bits and pieces from other projects (I personally sometimes copy and paste bits from PDClib for this). In such a tight embedded project, chances are you don't even have threads; why even pull in that reentrancy code?

Freestanding C code with no standard library isn't scary. If you need an example, look at what we do for m1n1:

https://github.com/AsahiLinux/m1n1/tree/main/src

In particular, this is the libc subset we use (we do have malloc here, which is dlmalloc, but still not enough of libc to be worth depending on a full one):

https://github.com/AsahiLinux/m1n1/tree/main/sysinc
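
For a sense of scale, here is a minimal sketch of what the first few pieces of a hand-rolled subset might look like. It is not taken from m1n1 or PDCLib. The mini_ prefix is a made-up name so the sketch cannot clash with a real libc, and the byte-at-a-time loops favor being easy to audit over being fast.

```c
#include <stddef.h>  /* size_t: this header is available even in freestanding builds */

/* Hypothetical mini_* names, used only for illustration. */

/* Fill n bytes at dst with the low 8 bits of value. */
void *mini_memset(void *dst, int value, size_t n)
{
    unsigned char *d = dst;

    while (n--)
        *d++ = (unsigned char)value;
    return dst;
}

/* Copy n bytes from src to dst; the regions must not overlap. */
void *mini_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;
    return dst;
}

/* Count the bytes before the terminating NUL. */
size_t mini_strlen(const char *s)
{
    const char *end = s;

    while (*end)
        end++;
    return (size_t)(end - s);
}
```

In a real freestanding build you would likely use the standard names and compile with -ffreestanding. GCC documents that it can still emit calls to memcpy, memmove, memset and memcmp in that mode, so those four tend to be the first functions worth providing.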


I don’t think there’s any reason to be worried about a “bloated” library if you understand the components of that library. I’ve used Newlib in embedded projects without malloc and it’s fine. It’s easier and faster to check what parts of Newlib you are pulling in than to try and reimplement it on your own.

It’s a trap that people fall into to think that it’s easy to roll your own. It depends on what, exactly, you are rolling, what resources you have, etc. Maybe I need a couple math functions, a decent implementation of memset/memcpy/etc, and having the C library at my disposal gives me those things.

The idea that you throw out Newlib just because one function pulled in malloc seems unjustifiable to me.

On the other hand, I think it’s quite normal to want your own PRNG.
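
As a rough illustration of how little code "your own PRNG" can be, here is a sketch of Marsaglia's 32-bit xorshift generator. The names prng_seed and prng_next are made up for this example. It keeps its state in a single static word, so it pulls in neither malloc nor any reentrancy structure. It is not suitable for anything security-related.

```c
#include <stdint.h>

/* Generator state; xorshift must never be seeded with zero. */
static uint32_t prng_state = 1u;

/* Hypothetical helper: set the seed, mapping 0 to 1 to keep the state valid. */
void prng_seed(uint32_t seed)
{
    prng_state = (seed != 0u) ? seed : 1u;
}

/* One xorshift32 step using the classic (13, 17, 5) shift triple. */
uint32_t prng_next(void)
{
    uint32_t x = prng_state;

    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    prng_state = x;
    return x;
}
```

Seeding with 1 and calling prng_next() once yields 270369, which is handy as a sanity check when porting this to a new target.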

I don't know about newlib's library or code structure, but I've got a project I'm working on where I don't want a whole libc, but it's pretty easy to pull in bits and pieces of the FreeBSD libc; when I come across something else I need, I just add the C file it's in to my Makefile and that's usually enough. For the things that are self-contained, I don't need to build the environment libc expects, and if I try to include something that's not self-contained, the linker will yell about the missing symbols.
The main use case for Newlib is embedded systems, and Newlib supports more architectures than FreeBSD does. Newlib also includes a few assembly routines for functions like memcpy and memset.
Sure. My point is that the question isn't "use newlib or write your own": you can probably pick pieces out of it, rather than using the whole thing, by avoiding their build system and just using their source. I just don't know if the code (or license) is structured for that.
If rand() gives you trouble who knows what else in the C library may give you more trouble. It’s not that hard to copy and paste, or even write, your own implementation of the basic parts of the C API, minus the allocation-related things. For me I’ve found GitHub Copilot great for filling in basic stubs for C stdlib APIs.
> If rand() gives you trouble who knows what else in the C library may give you more trouble.

Ooh, I can answer this one! It turns out that I know what else in the C library can give you trouble. Newlib is open-source. For embedded projects, I find myself reading the source code to Newlib, and sometimes looking at the disassembly to double-check what version of a function I’m getting.

If I had gotten burned by having malloc included when I didn’t want it (never happened to me personally), I would consider running "ar d" to remove it from the copy of the library I’m using. It’s pretty easy to modify static libraries.

> For me I’ve found GitHub Copilot great for filling in basic stubs for C stdlib APIs.

I might want a decent implementation of snprintf, or decent implementations of memcpy/memset/memmove. I’ve implemented all of these myself, but it’s a pain, and I would generally rather grab Newlib than trust Copilot.

I've been a fan of importing pdclib functions piece by piece as I need chunks of libc in deeply embedded situations. It's simple enough that I can audit the semantics of each function's dependencies without any rigmarole, and without the sirens of yak shaving singing to me the way they do when I get the urge to write libc stuff myself. Although the CC0 license is a bit unfortunate in some cases for source code.
> Although the CC0 license is a bit unfortunate in some cases for source code.

CC0 isn't a viral license; you have literally no obligations if you use it: no attribution, no license compatibility worries, no relicensing issues, not even a need to mention it at all.

The issues are deeper than a question of virality.

> Can I apply a Creative Commons license to software?

> We recommend against using Creative Commons licenses for software. Instead, we strongly encourage you to use one of the very good software licenses which are already available. We recommend considering licenses listed as free by the Free Software Foundation and listed as “open source” by the Open Source Initiative.

> Unlike software-specific licenses, CC licenses do not contain specific terms about the distribution of source code, which is often important to ensuring the free reuse and modifiability of software. Many software licenses also address patent rights, which are important to software but may not be applicable to other copyrightable works. Additionally, our licenses are currently not compatible with the major software licenses, so it would be difficult to integrate CC-licensed work with other free software. Existing software licenses were designed specifically for use with software and offer a similar set of rights to the Creative Commons licenses.

https://creativecommons.org/faq/#can-i-apply-a-creative-comm...

Read further; it's not counting CC0 as a CC license in that context:

> Also, the CC0 Public Domain Dedication is GPL-compatible and acceptable for software. For details, see the relevant CC0 FAQ entry

Following that link: https://wiki.creativecommons.org/wiki/CC0_FAQ#May_I_apply_CC...

> May I apply CC0 to computer software? If so, is there a recommended implementation?

> Yes, CC0 is suitable for dedicating your copyright and related rights in computer software to the public domain, to the fullest extent possible under law. Unlike CC licenses, which should not be used for software, CC0 is compatible with many software licenses, including the GPL. However, CC0 has not been approved by the Open Source Initiative and does not license or otherwise affect any patent rights you may have.

The other issues still exist; not everything is GPLed.

In particular, the license has been an issue getting through legal at older companies where it might not be whitelisted, especially with their FAQ still saying not to use it for software.

picolibc seems nice enough
Newlib can be configured without reentrancy support, and you can do that in a multitasking environment while still using its malloc, provided you implement some locking callbacks. This saves a ton of static state for non-reentrant library functions you're unlikely to ever need; if you do need them, you can replace them with safe variants.
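
The locking callbacks in question are, as far as I know, the __malloc_lock and __malloc_unlock hooks that newlib's malloc calls around its critical sections. Newlib's documentation notes they can be called nested, so the lock has to be recursive. Here is a minimal sketch that masks interrupts and keeps a nesting count. The port_irq_disable and port_irq_enable functions are hypothetical stand-ins for whatever your port provides (a recursive RTOS mutex would also work), and the sketch assumes interrupts are enabled when the outermost lock is taken.

```c
struct _reent;  /* newlib's per-thread state; left opaque in this sketch */

/* Hypothetical port layer. These host stand-ins only record the state;
 * on real hardware they would mask and unmask interrupts. */
static int irq_masked;
static void port_irq_disable(void) { irq_masked = 1; }
static void port_irq_enable(void)  { irq_masked = 0; }

/* Nesting depth, because malloc may take the lock again before releasing it. */
static unsigned malloc_lock_depth;

void __malloc_lock(struct _reent *r)
{
    (void)r;
    port_irq_disable();      /* safe to repeat while already masked */
    malloc_lock_depth++;
}

void __malloc_unlock(struct _reent *r)
{
    (void)r;
    if (--malloc_lock_depth == 0)
        port_irq_enable();   /* only the outermost unlock unmasks */
}
```

Masking interrupts is the bluntest option and adds latency for the duration of every allocation, so on a threaded RTOS a recursive mutex is usually the gentler choice.
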
When you are working in a tight embedded system, instead of fighting to slim down a library that is designed to scale to significantly larger systems, it makes much more sense to start from zero and add only what you need. You also shouldn't update your dependencies blindly, ever (even updating the compiler should be done with care).

The considerations are very different from, say, developing apps for an operating system. There is a gradient between those, and the software development considerations shift as you move along. Most people doing embedded development defer to libraries way too early - usually because they're either pure hardware people who haven't learned low-level firmware bring-up and rely on vendor tooling, or people from a higher level software background who find the idea of bare metal scary.

You can see an example of this gradient in the Asahi Linux boot chain:

- m1n1: bare metal, no threads (barebones SMP support for research only), statically linked, no device model, not readily portable, 64-bit only, assumes everything is an M1/Apple Silicon, embedded libc subset, vendored (C: dlmalloc, printf, decompression algorithms, libfdt) or git submodule (Rust: FAT32 implementation) dependencies, single-purpose NVMe & FAT32 implementations (no abstraction).

- u-boot: bare metal, no threads, statically linked, basic device model, portable, embedded libc subset, vendored dependencies, basic filesystem & block device abstraction.

- GRUB: runs in an EFI environment, no threads, dynamically linked modules, portable, filesystem & block device abstractions, no actual modifications for Apple Silicon (we use vanilla bins).

- Linux kernel: bare metal, complex thread model, dynamically linked modules, rich device model, portable, embedded libc subset, vendored dependencies, rich filesystem and block device abstractions.

- Linux userspace: you know this.

Notice how none of the components before userspace use a full-fat libc; they have only minimal, carefully controlled dependencies - and this is on a system with gigabytes of RAM.

Personally, I would only use newlib on systems which are roughly equivalent, in terms of model/software stack, to a full desktop OS. That is, embedded systems with at least a threaded RTOS and a filesystem abstraction, possibly a BSD-style sockets layer. Anything smaller than that (no threads? lwip callback style sockets? no filesystem?), you're better off rolling your own.

Even with a threaded RTOS (assuming an embedded target like your typical MCUs) I still prefer going without dynamic allocation: it makes the system much easier to reason about and you don't have to worry about heap fragmentation. When you have RAM you can count in kilobytes, chances are you should be thinking about how and where you're using it.
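
One common way to get by without a general-purpose heap is a fixed-size block pool backed by a static array. The sketch below is only an illustration: the pool_* names and the block size and count are made up, and it is not thread-safe as written. Every block is the same size, so the pool cannot fragment, and the worst-case RAM use is visible at link time.

```c
#include <stddef.h>
#include <stdint.h>

#define POOL_BLOCK_SIZE  32u  /* hypothetical payload size in bytes */
#define POOL_BLOCK_COUNT 8u   /* hypothetical number of blocks */

/* A free block stores the free-list link in its own payload area. */
typedef union pool_block {
    union pool_block *next;
    uint8_t payload[POOL_BLOCK_SIZE];
} pool_block;

static pool_block pool_storage[POOL_BLOCK_COUNT];
static pool_block *pool_free_list;

/* Thread every block onto the free list; call once at start-up. */
void pool_init(void)
{
    pool_free_list = NULL;
    for (size_t i = 0; i < POOL_BLOCK_COUNT; i++) {
        pool_storage[i].next = pool_free_list;
        pool_free_list = &pool_storage[i];
    }
}

/* Pop one block, or return NULL when the pool is exhausted. */
void *pool_alloc(void)
{
    pool_block *b = pool_free_list;

    if (b != NULL)
        pool_free_list = b->next;
    return b;
}

/* Push a block obtained from pool_alloc back onto the free list. */
void pool_free(void *p)
{
    pool_block *b = p;

    if (b == NULL)
        return;
    b->next = pool_free_list;
    pool_free_list = b;
}
```

Allocation and release are both constant time, which also makes timing easier to reason about than with a general-purpose malloc.
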
There’s a footnote at the end of the article saying that’s basically what happened. They upgraded one of their toolchains, and it came with a version of newlib that had been compiled reentrant when the previous version hadn’t.
My immediate thought was to turn off the reentrancy code. There has to be a #define switch somewhere you can use to disable it, since you won't ever need it. That's the better solution, although using your own PRNG is also good.

Also, this seems to suggest they are in fact importing all of newlib, malloc code included, into their code, which is kind of a waste. If they could throw that reentrancy switch somewhere, they'd probably eke out a KB or so on their EEPROM or whatever holds the code. The better solution isn't to keep using newlib with reentrancy turned on and then implement your own rand(), which takes up a few more bytes on top of the newlib rand() you won't use. Turning off reentrancy is all around the right solution, assuming they want to keep using newlib and won't roll their own. Rolling their own is better still, assuming they have the will and the time for it: as you point out, you don't need a lot of the stdlib (most of stdio is out, as are the string-to-int converters, and so on), so it might be a worthwhile future investment to have their own mini library.