October 14th, 2024

Self-referential variable initialization in C

The article discusses self-referential variable initialization in C, noting it leads to undefined behavior. It explores pointer usage, highlighting successful compilation of a test program with GCC 14.2.0.

The discussion revolves around self-referential variable initialization in the C programming language, highlighting a key difference from the SDL Shader Language. In C, a variable is already in scope inside its own initializer, but reading its value there, as in `int x = x + 1;`, leads to undefined behavior because the variable is not initialized at that point. While this behavior is well known, the author was intrigued by what happens when the initializer takes only the variable's address. The author suggests that using a pointer to the variable in its definition should be permissible. In C, unlike C++, implicit conversion between `void*` and other object pointer types is allowed, which makes the declaration `void *x = &x;` valid. A test program written to check this compiled with GCC 14.2.0 without warnings when using the appropriate flags. The exploration highlights the nuances of variable initialization and pointer usage in C.
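The pointer case can be checked with a few lines of C. This is a minimal sketch, not the article's own test program; the function name `points_to_itself` is made up for illustration.

```c
/* Hypothetical helper: reports whether a pointer initialized with its own
   address really points at itself. */
int points_to_itself(void)
{
    /* &x has type void **, which C converts implicitly to void *.
       Only the address of x is used; its not-yet-set value is never read. */
    void *x = &x;

    return x == (void *)&x;
}
```

Calling `points_to_itself()` from `main` should return 1. The same declaration is rejected by a C++ compiler, because C++ has no implicit conversion from `void *` back to other pointer types and `void *x = &x` would need `x` to hold a `void **`.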

- In C, reading a variable's own value in its initializer, as in `int x = x + 1;`, is undefined behavior.

- Implicit conversion between `void*` and other object pointer types is permitted in C.

- The author tested self-referential pointer initialization, which compiled successfully.

- The discussion contrasts C's behavior with that of the SDL Shader Language.

- The findings were validated using GCC 14.2.0 with specific compilation flags.

Related

Weekend projects: getting silly with C

The C programming language's simplicity and expressiveness, despite quirks, influence other languages. Unconventional code structures showcase creativity and flexibility, promoting unique coding practices. Subscription for related content is encouraged.

Initialization in C++ is Seriously Bonkers Just Start With C

Variable initialization in C++ poses challenges for beginners compared to C. C requires explicit initialization to prevent bugs, while C++ offers default constructors and aggregate initialization. Evolution from pre-C++11 to C++17 introduces list initialization for uniformity. Explicit initialization is recommended for clarity and expected behavior.

Undefined behavior in C is a reading error (2021)

Misinterpretations of undefined behavior in C have caused significant semantic issues, risking program stability and making C unsuitable for critical applications. A clearer standard interpretation is needed for reliable programming.

Lesser known tricks, quirks and features of C

The article explores lesser-known C programming features, including the comma operator, designated initializers, compound literals, and advanced topics like volatile qualifiers and flexible array members, highlighting their potential pitfalls.

The Strict Aliasing Situation Is Pretty Bad (2016)

Strict aliasing rules in C and C++ can hinder compiler optimizations and lead to undefined behavior when pointers are misused. Using the -fno-strict-aliasing flag and checking tools is recommended.

4 comments
By @Joker_vD - 3 months
It's allowed because it's in the spirit of C: both allowing and prohibiting it are almost equally easy, and it's also sometimes useful, so it's allowed. Now, there is a way [0] to statically check whether self-referential value initialization is well-founded or not but... it's kinda tricky to do and so, again in C's spirit, such a diagnostic is not required.

[0] https://www.cl.cam.ac.uk/~jdy22/papers/a-right-to-left-type-...

By @fanf2 - 3 months
The most common and idiomatic example that relies on a variable being in scope in its initializer is,

  struct something *foo = malloc(sizeof(*foo));
But that isn’t, strictly speaking, self-referential. Here’s one that is…

In BSD <sys/queue.h> there are a bunch of macros for various kinds of linked list.

https://man.freebsd.org/cgi/man.cgi?query=STAILQ_HEAD_INIT

https://cgit.freebsd.org/src/tree/sys/sys/queue.h

The STAILQ macros define a singly-linked list where the head is a pair of pointers (simplified slightly):

  struct stailq_head {
    struct stailq_elem *stqh_first;
    struct stailq_elem **stqh_last;
  };
• a pointer to the first element, which is NULL when the list is empty

• a pointer to the NULL pointer that terminates the list

The last pointer allows fast appends, O(1) instead of O(n).

When you initialize the head, the last pointer needs to point to the first pointer. The STAILQ_HEAD_INITIALIZER() macro does basically:

  struct stailq_head head = {
    NULL,
    &head.stqh_first,
  };
There, head refers to itself!

To append an element, the STAILQ_INSERT_TAIL() macro does basically:

  elem->next = NULL;
  *head.stqh_last = elem;
  head.stqh_last = &elem->next;
So normally the last pointer points into an element; the last pointer points into the head only in an empty list.
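
Putting the pieces of this comment together, the simplified head and append step can be sketched as a small compilable unit. This is not the real `<sys/queue.h>` code: the `value` payload, the plain `next` field, and the function names `stailq_append` and `stailq_demo` are assumptions made for illustration.

```c
#include <stddef.h>

struct stailq_elem {
    int value;                       /* hypothetical payload */
    struct stailq_elem *next;
};

struct stailq_head {
    struct stailq_elem *stqh_first;  /* NULL when the list is empty */
    struct stailq_elem **stqh_last;  /* address of the terminating NULL pointer */
};

/* O(1) append: overwrite the terminating NULL, then move the last pointer. */
void stailq_append(struct stailq_head *head, struct stailq_elem *elem)
{
    elem->next = NULL;
    *head->stqh_last = elem;
    head->stqh_last = &elem->next;
}

/* Appends 1 then 2 and folds the values into a number, so the traversal
   order is visible in the result. */
int stailq_demo(void)
{
    /* Self-referential initializer: the last pointer starts out pointing
       at the head's own first pointer. */
    struct stailq_head head = { NULL, &head.stqh_first };
    struct stailq_elem a = { 1, NULL };
    struct stailq_elem b = { 2, NULL };
    int digits = 0;

    stailq_append(&head, &a);
    stailq_append(&head, &b);

    for (struct stailq_elem *e = head.stqh_first; e != NULL; e = e->next)
        digits = digits * 10 + e->value;

    return digits;
}
```

Because the empty head already points its last pointer at `stqh_first`, the first append needs no special case: writing through `stqh_last` sets the first pointer directly.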
By @uecker - 3 months
The traditional use cases are circular linked lists or similar data structures.

struct foo { struct foo *next; } x = { .next = &x };
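
A slightly longer sketch of the same idea shows why the self-pointing node is convenient: a one-element ring is already a valid circular list, so insertion and traversal need no empty-list case. The helper names `ring_length`, `ring_insert_after`, and `ring_demo` are hypothetical.

```c
#include <stddef.h>

struct foo {
    struct foo *next;
};

/* Counts the nodes by walking the ring until we are back at the start. */
size_t ring_length(const struct foo *start)
{
    size_t n = 0;
    const struct foo *p = start;

    do {
        n++;
        p = p->next;
    } while (p != start);

    return n;
}

/* Links node into the ring right after pos. */
void ring_insert_after(struct foo *pos, struct foo *node)
{
    node->next = pos->next;
    pos->next = node;
}

/* Starts from a self-referential one-element ring and grows it by one. */
size_t ring_demo(void)
{
    struct foo x = { .next = &x };
    struct foo y;

    ring_insert_after(&x, &y);
    return ring_length(&x);
}
```

Here `ring_length` returns 1 for the freshly initialized `x` and `ring_demo()` returns 2 after the insertion.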

By @fuhsnn - 3 months
The compiler (or the linker, for a global) already has to reserve space for the variable, so the address comes from the layout plan, not from referencing the variable at runtime. The semantics are analogous to "fill number three in the third box."
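
The file-scope case makes this point concrete: initializers for objects with static storage duration must be constant expressions, and the address of such an object counts as one. A minimal sketch, with hypothetical names `g`, `box`, and `static_self_refs_hold`:

```c
/* The address of a static object is an address constant, so it is a valid
   file-scope initializer even though it refers to the object being defined. */
void *g = &g;

struct box {
    struct box *self;
    int number;
};

/* "Fill number three in the third box": the address is fixed by layout,
   not computed by reading the variable at runtime. */
static struct box third = { .self = &third, .number = 3 };

/* Hypothetical check that both self-references resolved as expected. */
int static_self_refs_hold(void)
{
    return g == (void *)&g && third.self == &third && third.number == 3;
}
```

`static_self_refs_hold()` should return 1. By contrast, `int g2 = g2 + 1;` at file scope does not compile, because the value of `g2` is not a constant expression.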