The difference between undefined behavior and ill-formed C++ programs
The article explains the difference between undefined behavior and ill-formed programs in C++. It highlights the risks of ill-formed, no diagnostic required (IFNDR) programs and suggests tools for mitigation.
The article discusses the distinction between undefined behavior (UB) and ill-formed programs in C++. Undefined behavior is a runtime condition: it arises when a program executes an operation for which the language standard imposes no requirements. From that point the program may do anything, and the effects can even reach backward to operations that appear to have already happened. A program that merely contains a potentially undefined operation is still fine as long as that operation is never executed; for example, a function that is only ever called with arguments that avoid the UB remains safe.
In contrast, an ill-formed program violates the language's syntactic or semantic rules, for example by attempting to modify a const variable. There are two kinds of ill-formed programs: those that require a diagnostic (the compiler must report the problem) and those that do not (ill-formed, no diagnostic required, or IFNDR). An IFNDR program can misbehave without any compiler warning, as when two translation units disagree on the definition of an inline function. The result can be erratic behavior, including memory corruption.
The article stresses that IFNDR deserves particular attention, because the program as a whole is invalid rather than just one execution of it. It also mentions Visual Studio command line options and defensive coding practices that help identify and mitigate these issues.
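To make the runtime nature of undefined behavior concrete, here is a minimal C++ sketch. It is not taken from the article: the table and the names `unchecked_lookup`, `checked_lookup`, and `lookup_examples` are hypothetical and chosen only for illustration. The same well-formed function is defined for some arguments and undefined for others.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical lookup table, for illustration only.
const std::array<int, 4> kTable{{10, 20, 30, 40}};

// Well-formed, and well-defined for index 0..3.
// std::array::operator[] performs no bounds check, so any other
// index is undefined behavior at the moment of the call.
int unchecked_lookup(std::size_t index) {
    return kTable[index];
}

// Defensive variant: rejects bad input before the risky operation runs.
int checked_lookup(std::size_t index, int fallback) {
    return index < kTable.size() ? kTable[index] : fallback;
}

// Usage sketch: every call below stays within defined behavior.
void lookup_examples() {
    assert(unchecked_lookup(2) == 30);
    assert(checked_lookup(1, -1) == 20);
    assert(checked_lookup(9, -1) == -1);
}
```

A program that only ever calls `unchecked_lookup` with an index in range never executes the undefined operation, so it stays valid. That is the sense in which UB, unlike ill-formedness, depends on what actually runs.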
Related
How GCC and Clang handle statically known undefined behaviour
Discussion on compilers handling statically known undefined behavior (UB) in C code reveals insights into optimizations. Compilers like gcc and clang optimize based on undefined language semantics, potentially crashing programs or ignoring problematic code. UB avoidance is crucial for program predictability and security. Compilers differ in handling UB, with gcc and clang showing variations in crash behavior and warnings. LLVM's 'poison' values allow optimizations despite UB, reflecting diverse compiler approaches. Compiler responses to UB are subjective, influenced by developers and user requirements.
Weekend projects: getting silly with C
The C programming language's simplicity and expressiveness, despite quirks, influence other languages. Unconventional code structures showcase creativity and flexibility, promoting unique coding practices. Subscription for related content is encouraged.
The Byte Order Fiasco
Handling endianness in C/C++ programming poses challenges, emphasizing correct integer deserialization to prevent undefined behavior. Adherence to the C standard is crucial to avoid unexpected compiler optimizations. Code examples demonstrate proper deserialization techniques using masking and shifting for system compatibility. Mastery of these concepts is vital for robust C code, despite available APIs for byte swapping.
Don't use null objects for error handling
The article critiques using null objects for error handling in programming, arguing it misleads users and propagates errors. It advocates for immediate error handling and context-based strategies instead.
C Isn't a Programming Language Anymore (2022)
The article examines the shift in perception of C from a programming language to a protocol, highlighting challenges it poses for interoperability with modern languages like Rust and Swift.
> // Check the pointer before using it
> if (p != nullptr) p->DoSomething();
I love this example.
Importantly, UB is a runtime condition in the general case, in the sense that it cannot be statically determined (again, in the general case). It may depend on input data, or detecting it may amount to solving the halting problem.
A consequence of that is that UB cannot be caught by static tools in the general case without changing the language so that a large class of previously valid programs (i.e. not containing UB) become invalid.
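A small sketch of the "depends on input data" point, using signed integer overflow as the example. The function names are hypothetical. Whether `add` is defined cannot be read off the source; it depends on the values that arrive at runtime, so a checked variant has to test them before adding.

```cpp
#include <cassert>
#include <limits>

// Well-defined unless the mathematical sum does not fit in int.
// Signed overflow is undefined behavior, and nothing in this
// function's text says which inputs a caller will pass.
int add(int a, int b) {
    return a + b;
}

// Checked variant: detects overflow *before* performing the addition,
// using only comparisons that cannot overflow themselves.
bool checked_add(int a, int b, int& result) {
    const int max = std::numeric_limits<int>::max();
    const int min = std::numeric_limits<int>::min();
    if (b > 0 && a > max - b) return false;  // would exceed max
    if (b < 0 && a < min - b) return false;  // would fall below min
    result = a + b;
    return true;
}

// Usage sketch with inputs on both sides of the boundary.
void add_examples() {
    int r = 0;
    assert(add(2, 3) == 5);
    assert(checked_add(2, 3, r) && r == 5);
    assert(!checked_add(std::numeric_limits<int>::max(), 1, r));
    assert(!checked_add(std::numeric_limits<int>::min(), -1, r));
}
```

A static tool that wanted to reject `add` outright would also reject every program that only ever passes small values, which is the trade-off described above.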
I'm sure there's a good reason why this is hard, but I'm a little surprised that this isn't caught by static analysis. Sure enough, I can't get MSVC Code Analysis to complain about the example with different inline functions.
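One plausible reason, sketched below under stated assumptions: the mismatch does not exist inside any single file. The header name `widget.h`, the macro `WIDGET_DEBUG`, and the function `widget_bytes` are hypothetical, not the article's code. If one translation unit is compiled with the macro defined and another without it, the two see different definitions of `Widget` and of the inline function, which as far as I understand the one-definition rule is ill-formed with no diagnostic required.

```cpp
#include <cassert>
#include <cstddef>

// ---- contents of a shared header, e.g. widget.h (hypothetical) ----
struct Widget {
    int id;
#ifdef WIDGET_DEBUG          // hypothetical build flag
    int debug_tag;           // present in some translation units only
#endif
};

// Inline function whose result depends on the layout seen by the includer.
inline std::size_t widget_bytes() { return sizeof(Widget); }
// ---- end of header ----

// Inside any single translation unit the two always agree, which is
// why a per-file check finds nothing to report.
void widget_example() {
    assert(widget_bytes() == sizeof(Widget));
}
```

Each file compiles cleanly on its own. The disagreement only appears when the object files are combined and the linker keeps a single copy of `widget_bytes`, so a tool that analyzes one translation unit at a time has nothing to compare against.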
This seems incorrect, as demonstrated by the other undefined behavior story I read on HN today [1]. The tl;dr, as I understood it: since UB is not allowed, the compiler can elide checks that would protect against UB for the sake of optimization. A correct program would not have caused the UB in the first place, and the compiler doesn't have to respect the semantics of incorrect programs.
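A minimal sketch of the check-elision pattern described in that comment, with hypothetical function names. The point is the order of the check and the dereference. Whether a given compiler actually removes the late check depends on the compiler and optimization level, so no particular output is claimed for the null case of the first function.

```cpp
#include <cassert>

// Dereference first, check second. Because *p is undefined for a null
// pointer, a compiler is permitted to assume p is not null here and
// treat the later comparison as always false.
int read_then_check(const int* p) {
    int value = *p;
    if (p == nullptr) return -1;  // may be optimized away
    return value;
}

// Check first, dereference second. The null case never reaches the
// dereference, so the check has to be preserved.
int check_then_read(const int* p) {
    if (p == nullptr) return -1;
    return *p;
}

// Usage sketch: only calls with defined behavior are made.
void null_check_examples() {
    int x = 7;
    assert(read_then_check(&x) == 7);
    assert(check_then_read(&x) == 7);
    assert(check_then_read(nullptr) == -1);
}
```

The quoted `if (p != nullptr) p->DoSomething();` follows the second pattern, which is why the check in that example does protect the call.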