The perils of transition to 64-bit time_t
The transition to 64-bit time_t in Gentoo is needed to keep 32-bit applications from failing in 2038; the proposed strategies address the resulting ABI compatibility and system stability challenges.
The transition from 32-bit to 64-bit time_t in systems like Gentoo poses significant challenges, chiefly because it breaks the ABI (Application Binary Interface). A signed 32-bit time_t overflows in January 2038, so 32-bit applications that still use it will fail when they call time-related functions. While musl has already switched and distributions such as Debian have made the leap, Gentoo's source-based nature complicates the transition. The ABI change is particularly problematic because a program and all the libraries it links against must agree on the type's width; mixing binaries built with different widths can cause runtime errors. The article gives historical context by comparing the transition to Large File Support, and outlines the risks of ABI changes, including potential security vulnerabilities. To mitigate these risks, the author suggests several strategies: changing the platform tuple to distinguish the new ABI, changing the library directory so that old and new libraries can be installed side by side, and introducing a binary-level ABI distinction that prevents incompatible binaries from linking. These measures aim to ensure a smoother transition while maintaining system stability and compatibility.
- Transitioning to 64-bit time_t is critical to avoid failures in 32-bit applications in 2038.
- ABI changes pose significant risks, including runtime errors and security vulnerabilities.
- Gentoo's source-based nature complicates the transition compared to binary distributions.
- Proposed solutions include changing platform tuples and library directories to manage ABI compatibility.
- The transition requires careful planning to minimize disruption and maintain system integrity.
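To make the 2038 limit concrete, here is a minimal Python sketch of what happens when a second count is squeezed into a signed 32-bit integer. The helper names `as_time32` and `to_utc` are made up for this illustration; `ctypes.c_int32` merely stands in for a 32-bit time_t.

```python
import ctypes
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1


def as_time32(seconds: int) -> int:
    """Truncate a second count to a signed 32-bit value, like a 32-bit time_t."""
    return ctypes.c_int32(seconds).value


def to_utc(seconds: int) -> datetime:
    """Convert seconds since the Unix epoch to a UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


last_ok = to_utc(as_time32(INT32_MAX))      # last second that still fits
wrapped = to_utc(as_time32(INT32_MAX + 1))  # one second later wraps around

print(last_ok.isoformat())  # 2038-01-19T03:14:07+00:00
print(wrapped.isoformat())  # 1901-12-13T20:45:52+00:00
```

The wrapped value lands in 1901, which is why affected programs tend to misbehave rather than merely report a slightly wrong time.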
Related
Y292B Bug
The Y292B bug is a potential timekeeping issue in Unix systems due to a rollover in the year 292,277,026,596. Solutions involve using dynamic languages or GNU Multiple Precision Arithmetic Library in C, emphasizing the need for kernel-level fixes.
Building Chromium at a distro? Here's your copium
Building Chromium on Linux faced challenges with M120 changes, leading to distros like Alpine, Arch, Gentoo, and Fedora using LLVM's libc++. Despite efforts, issues persist with GCC and libstdc++, emphasizing ongoing compatibility struggles.
Gentoo Linux Drops IA-64 (Itanium) Support – Gentoo Linux
Gentoo Linux will discontinue support for the IA-64 architecture due to lack of kernel and glibc support, minimal user interest, and will remove all related profiles by September 2024.
A Time Consuming Pitfall for 32-Bit Applications on AArch64
Running 32-bit applications on 64-bit AArch64 Linux requires separate GCC toolchains and proper configuration to avoid performance issues, particularly ensuring vDSO support for efficient system calls.
Overview of cross-architecture portability problems
Michał Górny discusses cross-architecture portability challenges between 32-bit and 64-bit systems, highlighting issues with memory allocation, file size limitations, and the Y2K38 problem affecting multiple programming languages.
- Several users discuss the complexities of ABI compatibility and the potential for breaking changes during the transition.
- Some suggest alternative strategies, such as using unsigned 32-bit time_t or static linking to mitigate issues.
- There are references to how other operating systems, like Mac OS X and FreeBSD, have handled similar transitions successfully.
- Concerns are raised about the burden on Gentoo developers and the potential for user frustration during the transition.
- Many commenters express a desire for clearer solutions or a more straightforward approach to the problem.
1. Allow building against packages without installing them. The core issue here is that Gentoo package build and install happen as a single step: one can't "build a bunch of things that depend on one another" and then "atomically install all the build items into place". This means that Gentoo can easily be partially broken when one is doing updates when an ABI change occurs (modulo so versioning, see next option). This issue with 64-bit time_t is an example of an ABI change that folks are very aware of and is very widespread. It's also an example of something that causes an ABI change that isn't handled by the normal `.so` versioning scheme (see next option).
2. Extend the normal `.so` versioning to capture changes to the ABI of packages caused by packages they depend on. Normally, every `.so` (shared object/library) embeds a version number within it, and is also installed with that version number in the file name (`libfoo.so.1.0.0`, for example, would be the real `.so` file, and would have a symlink from `libfoo.so` to tell the linker which `.so` to use). This shared object version is normally managed by the package itself internally (iow: every package decides on their ABI version number to enable them to track their own internal ABI breakages). This allows Gentoo to upgrade without breaking everything while an update is going on as long as every package manages their `.so` version perfectly correctly (not a given, but does help in many cases). There is a process in Gentoo to remove old `.so.x.y.z` files that are no longer used after an install completes. What we'd need to do to support 64-bit time_t is add another component to this version that can be controlled by the inherited ABI of dependencies of each `.so`. This is very similar in result to the "use a different libdir" option from the post, but while it has the potential to set things up to enable the same kinds of ABI changes to be made in the future, it's likely that fixing this would be more invasive than using a different libdir.
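A rough sketch of what option 2 could look like, written in Python purely for illustration. The `Library` class, the `t64` tag and the idea of appending tags to the file name are all assumptions made for this example; nothing here reflects an existing Gentoo or linker feature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Library:
    name: str                             # e.g. "libfoo"
    version: str                          # the package's own ABI version, e.g. "1.0.0"
    own_abi_tags: frozenset = frozenset() # hypothetical tags such as {"t64"}
    deps: tuple = ()                      # libraries this one is built against

    def inherited_abi_tags(self) -> frozenset:
        """Collect this library's ABI tags plus those of all its dependencies."""
        tags = set(self.own_abi_tags)
        for dep in self.deps:
            tags |= dep.inherited_abi_tags()
        return frozenset(tags)

    def file_name(self) -> str:
        """File name with one extra component per inherited ABI tag."""
        suffix = "".join(f".{tag}" for tag in sorted(self.inherited_abi_tags()))
        return f"{self.name}.so.{self.version}{suffix}"


libc_old = Library("libc", "6")
libc_t64 = Library("libc", "6", own_abi_tags=frozenset({"t64"}))

foo_old = Library("libfoo", "1.0.0", deps=(libc_old,))
foo_new = Library("libfoo", "1.0.0", deps=(libc_t64,))

print(foo_old.file_name())  # libfoo.so.1.0.0
print(foo_new.file_name())  # libfoo.so.1.0.0.t64
```

The point of the sketch is that `libfoo` did not change its own version, yet the two builds get different names because the ABI of a dependency changed, so both could coexist during an update.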
Instead, the OS and its SDK are versioned, and at build time you can also specify the earliest OS version your compiled binary needs to run on. So using this, the headers ensured the proper macros were selected automatically. (This is the same mechanism by which new/deprecated-in-some-version annotations would get set to enable weak linking for a symbol or to generate warnings for it respectively.)
And it was all handled initially via the preprocessor, though now the compilers have a much more sophisticated understanding of what Apple refers to as “API availability.” So it should be feasible to use the same mechanisms on any other platform too.
For Debian it was extremely painful. A few people probably burned out. Lots of people pointed to source-based distributions and said "they will have it very easy".
The amd64 architecture did have some interesting features that made this much easier than it might have been for other cpu architectures. One of which was the automatic cast of 32 bit function arguments to 64 bit during the function call. In most cases, if you passed a 32 bit time integer to a function expecting a 64 bit time_t, it Just Worked(TM) during the platform bringup. This meant that a lot of the work on the loose ends could be deferred.
We did have some other 64 bit platforms at the time, but they did not have a 64 bit time_t. FreeBSD/amd64 was the first in its family, back in 2003/2004/2005. If I remember correctly, sparc64 migrated to 64 bit time_t.
The biggest problem that I faced was that (at the time) tzcode was not 64 bit safe. It used some algorithms in its 'struct tm' normalization that ended up in some rather degenerate conditions, eg: iteratively trying to calculate the day/month/year for time_t(2^62). IIRC, I cheated rather than change tzcode significantly, and made it simply fail for years before approx 1900, or after approx 10000. I am pretty sure that this has been fixed upstream in tzcode long ago.
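For illustration only, and not what tzcode did or does: a commonly used closed-form conversion from a day count to a calendar date does a fixed amount of arithmetic regardless of how large the input is, so a value like 2^62 costs no more than a date in 1970. The function names below are made up for this sketch, and it uses the proleptic Gregorian calendar.

```python
def civil_from_days(days: int) -> tuple[int, int, int]:
    """Convert days since 1970-01-01 to (year, month, day) in constant time."""
    z = days + 719468                # shift so the count starts at 0000-03-01
    era, doe = divmod(z, 146097)     # 400-year era and day within that era
    yoe = (doe - doe // 1460 + doe // 36524 - doe // 146096) // 365
    doy = doe - (365 * yoe + yoe // 4 - yoe // 100)  # day of year, March-based
    mp = (5 * doy + 2) // 153        # month index, 0 = March
    day = doy - (153 * mp + 2) // 5 + 1
    month = mp + 3 if mp < 10 else mp - 9
    year = yoe + era * 400 + (1 if month <= 2 else 0)
    return year, month, day


def year_of(timestamp: int) -> int:
    """Year of a Unix timestamp, ignoring time zones and leap seconds."""
    return civil_from_days(timestamp // 86400)[0]


print(civil_from_days(0))      # (1970, 1, 1)
print(civil_from_days(24855))  # (2038, 1, 19)
print(year_of(2**62))          # a year on the order of 10**11, computed instantly
```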
We did have a few years of whackamole with occasional 32/64 time mixups where 3rd party code would be sloppy in its handling of int/long/time_t when handling data structures in files or on the network.
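The kind of mixup described here can be sketched in Python with the `struct` module. The record layouts below are hypothetical (a timestamp followed by a 32-bit length); they only stand in for whatever file or wire format a sloppy program might use.

```python
import struct

REC32 = struct.Struct("<iI")  # hypothetical record: 32-bit timestamp, 32-bit length
REC64 = struct.Struct("<qI")  # same record widened to a 64-bit timestamp

after_2038 = 2**31 + 1000     # a timestamp just past the signed 32-bit limit

# A writer still using the 32-bit layout cannot store the value at all.
try:
    REC32.pack(after_2038, 42)
    pack32_failed = False
except struct.error:
    pack32_failed = True

# A reader still using the 32-bit layout silently misparses a 64-bit record:
# it sees the low half of the timestamp, then the high half as the length.
blob64 = REC64.pack(after_2038, 42)
ts, length = REC32.unpack(blob64[:REC32.size])

print(pack32_failed, ts, length)  # True -2147482648 0
```

The second case is the nastier one, since nothing fails: the reader just gets a date in 1901 and a length of zero.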
But for the most part, it was a non-issue for us. Being able to have 64 bit time_t on day 1 avoided most of the problem. Doing it from the start was easy. Linux missed a huge opportunity to do the same when it began its amd64/x86_64 port.
Aside: I did not finish 64 bit ino_t at the time. 32 bit inode numbers were heavily exposed in many, many places. Even on-disk file systems, directory structures in UFS, and many, many more. There was no practical way to handle it for FreeBSD/amd64 from the start while it was a low-tier platform without being massively disruptive to the other tier-1 architectures. I did the work - twice - but somebody else eventually finished it - and fixed a number of other unfortunately short constants as well (eg: mountpoint path lengths etc).
If legacy dates were a concern one could shift the epoch by a couple of decades, or even reduce the time granularity from 1 second to 2 seconds. Each alternative has subtle problems of its own. It depends on the use case.
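As a back-of-the-envelope check, the following Python sketch computes the last representable moment for each 32-bit alternative. The 1990 epoch is an assumed example of "a couple of decades", and `last_representable` is a name made up for this illustration.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def last_representable(max_ticks: int, seconds_per_tick: int = 1,
                       epoch: datetime = UNIX_EPOCH) -> datetime:
    """Latest moment a counter with `max_ticks` as its maximum can express."""
    return epoch + timedelta(seconds=max_ticks * seconds_per_tick)


signed_32 = last_representable(2**31 - 1)
unsigned_32 = last_representable(2**32 - 1)
two_second_ticks = last_representable(2**31 - 1, seconds_per_tick=2)
epoch_1990 = last_representable(
    2**31 - 1, epoch=datetime(1990, 1, 1, tzinfo=timezone.utc))  # assumed shift

print(signed_32.isoformat())         # 2038-01-19T03:14:07+00:00
print(unsigned_32.isoformat())       # 2106-02-07T06:28:15+00:00
print(two_second_ticks.isoformat())  # 2106-02-07T06:28:14+00:00
print(epoch_1990.isoformat())        # 2058-01-19T03:14:07+00:00
```

Each variant buys time at a cost: the unsigned one cannot express dates before 1970, the 2-second one loses resolution, and the shifted epoch moves both ends of the range.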
> Take this out and a UNIX Demon will dog your steps from now
> until the time_t's wrap around.
Obviously back in the 70s when that was written 2038 seemed unimaginably far in the future.
> Musl has already switched to that, glibc supports it as an option. A number of other distributions such as Debian have taken the leap and switched. Unfortunately, source-based distributions such as Gentoo don’t have it that easy.
While I applaud their efforts I just think as a user I want to be over and done with this problem by switching to a non source-based distribution such as Debian.
> struct {
>     int a;
>     time_t b;
>     int c;
> };
> With 32-bit time_t, the offset of c is 8. With the 64-bit type, it’s 12.
Surely it's 16 not 12, as b needs to be 64-bit aligned, so padding is added between a and b? Which also kind of makes the point the author is trying to make stronger.
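Whether the answer is 12 or 16 depends on how the target ABI aligns 64-bit integers inside a struct: the i386 System V ABI aligns them to 4 bytes, which gives 12, while 32-bit ARM and most 64-bit ABIs align them to 8 bytes, which gives 16. A quick way to see what the machine you are on does is Python's `ctypes`, which follows the native C layout rules. The class names are made up for this sketch, and fixed-width integers stand in for `int` and the two time_t widths.

```python
import ctypes


class WithTime32(ctypes.Structure):
    """The struct from the article with a 32-bit stand-in for time_t."""
    _fields_ = [("a", ctypes.c_int32), ("b", ctypes.c_int32), ("c", ctypes.c_int32)]


class WithTime64(ctypes.Structure):
    """The same struct with a 64-bit stand-in for time_t."""
    _fields_ = [("a", ctypes.c_int32), ("b", ctypes.c_int64), ("c", ctypes.c_int32)]


def offsets(struct_type) -> dict:
    """Map each field name to its byte offset under the native ABI."""
    return {name: getattr(struct_type, name).offset
            for name, _ in struct_type._fields_}


print(offsets(WithTime32), ctypes.sizeof(WithTime32))
print(offsets(WithTime64), ctypes.sizeof(WithTime64))
```

Either way the offset of `c` moves, so code built against one layout reads the wrong bytes when handed a struct built with the other.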
For what it's worth, on openbsd this problem is fixed, all architectures, even the 32bit ones, have a 64bit time_t.
See https://gcc.gnu.org/onlinedocs/libstdc++/manual/using_dual_a... for more info.
I must say, mgorny's posts are always a treat for those who like to peer under the hood! (The fact that Gentoo has remained my happy place for years doesn't influence this position, though there's probably some correlation)
Then the linker and loader could error if two incompatible objects with different ABIs were attempted to be linked together.
For instance, I suspect you could have fields to denote the size of certain types. I guess DWARF has that... But DWARF is optional and sucks to parse.
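A toy model of that idea in Python, assuming each object carried a small table of type sizes. The `ObjectFile` class, its `type_sizes` field and the check itself are hypothetical; no existing object format or linker is being described.

```python
from dataclasses import dataclass, field


class AbiMismatch(Exception):
    """Raised when two objects disagree about the size of a shared type."""


@dataclass
class ObjectFile:
    name: str
    type_sizes: dict = field(default_factory=dict)  # e.g. {"time_t": 8}


def check_link(objects) -> None:
    """Refuse to combine objects that record different sizes for the same type."""
    seen = {}  # type name -> (size, name of the object that declared it)
    for obj in objects:
        for type_name, size in obj.type_sizes.items():
            if type_name in seen and seen[type_name][0] != size:
                other_size, other_name = seen[type_name]
                raise AbiMismatch(
                    f"{type_name}: {obj.name} uses {size} bytes, "
                    f"{other_name} uses {other_size}")
            seen.setdefault(type_name, (size, obj.name))


def can_link(objects) -> bool:
    """Boolean wrapper around check_link for convenience."""
    try:
        check_link(objects)
        return True
    except AbiMismatch:
        return False


app = ObjectFile("app.o", {"time_t": 8})
lib_new = ObjectFile("libfoo.so", {"time_t": 8, "off_t": 8})
lib_old = ObjectFile("libbar.so", {"time_t": 4})

print(can_link([app, lib_new]))  # True
print(can_link([app, lib_old]))  # False
```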
Yet another is to freeze the time for those programs at the last 32-bit POSIX second. They would just appear to execute incredibly fast :). Of course some will still break, and it’s obviously not suitable for many use cases (but neither is running in the past), but some might be just fine.