I'm Not a Fan of Strlcpy(3)
strlcpy is debated for efficiency compared to strcpy and strncpy. For optimal performance, memccpy is suggested over strlcpy or strncpy. Dynamic allocation or mem* functions are preferred for string operations.
strlcpy(3), which originated in OpenBSD, is often considered a safer alternative to strcpy(3) and strncpy(3). However, a critical view emerged when Ulrich Drepper rejected its inclusion in glibc on grounds of inefficiency. The main issue is how to copy null-terminated strings efficiently. Where truncation is irrelevant, using strlcpy or strncpy is deemed inefficient, and memccpy(3) is suggested instead for better performance. Where truncation matters, dynamic allocation or a combination of strlen(3) and memcpy(3) is recommended over strlcpy. The article argues that, contrary to a common misconception, the mem* functions are suitable for string operations. The author concludes that strlcpy is not the optimal choice in most cases and advocates memccpy, memcpy, or strdup instead.
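The two cases in the summary can be sketched roughly as follows. This is a minimal illustration, not the article's own code: the helper names copy_truncating and copy_checked are hypothetical, and memccpy(3) is assumed to be available (it is POSIX and part of C23, but may need a feature-test macro under a strict older -std setting).

```c
#include <stddef.h>
#include <string.h>

/* Truncation is irrelevant: copy src into dst (capacity n), cutting it
 * short if needed. memccpy stops after copying the terminating '\0' or
 * after n bytes, whichever comes first, so src is traversed only once. */
static void copy_truncating(char *dst, const char *src, size_t n)
{
    if (n == 0)
        return;
    if (memccpy(dst, src, '\0', n) == NULL)
        dst[n - 1] = '\0';      /* no terminator was copied: add one */
}

/* Truncation matters: copy only if src fits, including its '\0'.
 * Returns 0 on success, -1 if src is too long (dst is left untouched). */
static int copy_checked(char *dst, const char *src, size_t n)
{
    size_t len = strlen(src);
    if (len >= n)
        return -1;
    memcpy(dst, src, len + 1);  /* length is known, so memcpy suffices */
    return 0;
}
```

In both helpers the caller must ensure that dst and src do not overlap.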
Related
How much memory does a call to 'malloc' allocate?
The malloc function in C allocates memory on the heap. Allocating 1 byte incurs an 8-byte overhead. Memory alignment may result in 16-24 bytes. Avoid small allocations for efficiency; consider realloc for extensions.
How much memory does a call to 'malloc' allocates? – Daniel Lemire's blog
The malloc function in C allocates memory on the heap. Allocating 1 byte may result in 16-24 bytes due to overhead. Avoid small allocations and focus on broader concepts for efficient memory management.
Some Tricks from the Scrapscript Compiler
The Scrapscript compiler implements optimization tricks like immediate objects, small strings, and variants for better performance. It introduces immediate variants and const heap to enhance efficiency without complexity, seeking suggestions for future improvements.
Designing a Better Strcpy
Saagar Jha explores the challenges of improving strcpy in C and proposes strxcpy for efficient, null-terminated string copying with overflow indication. A comparison of strcpy variants finds strscpy the most functional, though it is not standardized. Jha notes an original bug and the complexities of C string handling, emphasizing efficiency, safety, and standardization in strcpy's evolution.
Malloc() and free() are a bad API (2022)
The post delves into malloc() and free() limitations in C, proposing a new interface with allocate(), deallocate(), and try_expand(). It discusses C++ improvements and emphasizes the significance of a robust API.
This holds even if the total length of the data in an interaction is not known ahead of time. E.g. an audio stream can be of indeterminate length, not known when the first byte is sent over the network, but each UDP packet has a well-determined length given in the header.
The length field can be made variable-size in a rather fool-proof way [1], which allows both tiny and huge sizes to be represented economically.
(Zip files, WAD files, etc. have that info at the very end, but this is because a file has a well-defined end before you start appending to it; fseek(fp, 0, SEEK_END) can't miss.)
[1]: http://personal.kent.edu/~sbirch/Music_Production/MP-II/MIDI...
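A variable-size length field of this kind might look like the sketch below. The link above is truncated, so the exact scheme the commenter has in mind is an assumption; this follows the general style of MIDI variable-length quantities (7 data bits per byte, most significant group first, high bit set on every byte except the last). The names vlq_encode and vlq_decode are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode value as 7-bit groups, most significant group first. Every
 * byte except the last has its high bit set as a continuation flag.
 * Returns the number of bytes written (1 to 10). */
static size_t vlq_encode(uint64_t value, unsigned char out[10])
{
    unsigned char tmp[10];
    size_t n = 0;

    do {                                   /* collect groups, low first */
        tmp[n++] = (unsigned char)(value & 0x7f);
        value >>= 7;
    } while (value != 0);

    for (size_t i = 0; i < n; i++) {       /* emit in reverse order */
        unsigned char b = tmp[n - 1 - i];
        out[i] = (i + 1 < n) ? (unsigned char)(b | 0x80) : b;
    }
    return n;
}

/* Decode from in[0..len). Returns the number of bytes consumed, or 0
 * if the input ends before a final byte is seen or runs past 10 bytes.
 * Overflow of malformed 10-byte inputs is not detected in this sketch. */
static size_t vlq_decode(const unsigned char *in, size_t len, uint64_t *value)
{
    uint64_t v = 0;

    for (size_t i = 0; i < len && i < 10; i++) {
        v = (v << 7) | (uint64_t)(in[i] & 0x7f);
        if ((in[i] & 0x80) == 0) {         /* last byte of the quantity */
            *value = v;
            return i + 1;
        }
    }
    return 0;
}
```

With this layout, lengths below 128 cost a single byte, while a full 64-bit length costs ten.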
If there are bugs with truncation in the resulting buffer, those are the program's bugs, and they existed before strlcpy(3) came into the picture.
strlcpy is a stopgap, whack-a-mole solution for buffer overflows. It is rationalized by the reasoning that it does not make the program less wrong, while (probably) making it more secure.
When truncation matters and you have a fixed size buffer, that buffer should be large enough in order for it to be justifiable to say that someone is misusing the application. Perhaps a tester trying to break it.
Nobody’s surname needs 128+ bytes. No reasonable URL for a firmware update download needs 4096 bytes.
If truncation matters, no, it does not always make sense to accept a gig of data and be ready for more. You can impose a limit. A violation of the limit is an error, treated like a case of bad input.
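As a rough sketch of that policy: pick a generous fixed limit and reject anything over it as bad input, without ever scanning more than the limit. The struct, the function name, and the 128-byte figure here are illustrative assumptions; strnlen(3) is POSIX.1-2008.

```c
#include <stddef.h>
#include <string.h>

#define SURNAME_MAX 128   /* hypothetical limit, including the '\0' */

struct person {
    char surname[SURNAME_MAX];
};

/* Store s as the surname. Returns 0 on success, or -1 if s exceeds the
 * limit, in which case the existing surname is left unchanged. */
static int person_set_surname(struct person *p, const char *s)
{
    /* strnlen never reads past SURNAME_MAX bytes, so a gigabyte of
     * hostile input costs no more to reject than 128 bytes. */
    size_t len = strnlen(s, SURNAME_MAX);

    if (len == SURNAME_MAX)
        return -1;                      /* over the limit: bad input */
    memcpy(p->surname, s, len + 1);     /* fits, terminator included */
    return 0;
}
```

The caller handles -1 the same way as any other validation failure, so no truncated value is ever stored.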
They're counted, zero terminated, ASCII or Unicode, and magic as far as I'm concerned.
Oh... And a string copy is an O(1) operation as it only breaks the copy on modification.
Edit: correct to O(1), thanks mort96
To be honest, every time I need to deal with strings in C I feel like I'm banging rocks together, regardless of approach. I try to avoid it at all costs.
It might also be that in some programs with different access patterns that doesn't happen and it makes sense to optimize for the slow case, sure, but the author should acknowledge that variability instead of being adamant about what's better, even to the point of calling a solution they don't understand "schizo". In my experience the pattern of optimizing the fast path makes a lot of sense.
BTW, the strlcpy/"schizo" variant could stand some improvement: realloc() already copies the part of the string within the original size of the buffer, so you can start copying at that point. Also, once you know that the destination is big enough to receive a full copy of the source, you can use good old strcpy(). Cargo cult and random "linters"/"static checkers" will tell you that you shouldn't, but you know that it's a perfectly fine function to call once you've ensured that its prerequisites are satisfied.
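The improvement described in this comment could be sketched as below. To keep the example buildable where strlcpy is unavailable, the first attempt uses memccpy(3) instead; that is a substitution, not what the comment or the article uses, and it shifts the resume offset by one (memccpy fills all cap bytes, whereas strlcpy would copy cap - 1 and terminate). The name copy_growing is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Copy src into *buf (current capacity *cap), growing the buffer only
 * when src does not fit. Returns 0 on success, -1 if realloc fails; in
 * that case *buf is still valid but may hold an unterminated prefix. */
static int copy_growing(char **buf, size_t *cap, const char *src)
{
    /* Fast path: one pass over src, done if the '\0' was reached. */
    if (*cap > 0 && memccpy(*buf, src, '\0', *cap) != NULL)
        return 0;

    /* Slow path: the first *cap bytes of src are already in *buf and
     * realloc preserves them, so only the tail has to be copied. */
    size_t done = *cap;
    size_t need = done + strlen(src + done) + 1;
    char *p = realloc(*buf, need);
    if (p == NULL)
        return -1;

    strcpy(p + done, src + done);   /* fine: room was just ensured */
    *buf = p;
    *cap = need;
    return 0;
}
```

The slow path never re-reads the prefix that already fit, and strcpy is safe there because its precondition (enough room in the destination) has just been established.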
Do you want to copy and truncate, or just copy?
Within that, do you want to manage your own allocation, or do you want that abstracted?
There are too many decision points and tradeoffs to just neatly hide behind a single "one true function" for copying C strings.
Also not a huge fan of locale controls and wchar APIs :)
So refreshing to see a common-sense take in a world of shrill low-level programming alarmists.
https://github.com/uecker/noplate
(attention: this is experimental and incomplete for trying ideas and is subject to change.)
#define strlcpy strncpy
CUT TO:
"I'm not a fan of strlcpy(3)"
It's also pretty telling that every article that tries to explain how to safely copy or concat strings in C, like this one, only ever works with ASCII, with no attempt whatsoever to handle UTF-8 and keep code points together, let alone grapheme clusters. No wonder almost all C software has problems with non-English strings...
[1] https://manpages.debian.org/testing/linux-manual-4.8/strscpy...