September 28th, 2024

The perils of transition to 64-bit time_t

Gentoo needs to move to 64-bit time_t to keep 32-bit applications from failing in 2038; the article proposes strategies for handling the resulting ABI compatibility and system stability challenges.


The transition from 32-bit to 64-bit time_t on 32-bit systems such as Gentoo poses significant challenges, chiefly because it breaks the ABI (Application Binary Interface). A signed 32-bit time_t overflows on 19 January 2038, after which 32-bit applications that still use it will fail in time-related functions. musl has already switched to 64-bit time_t, glibc supports it as an option, and distributions such as Debian have made the switch; Gentoo's source-based nature complicates the transition. The ABI change is particularly problematic because every program and library that exchanges time_t values must agree on the type's width, and mixing binaries built with different widths can cause runtime errors. The article discusses the historical context, comparing the transition to Large File Support (LFS), and outlines the risks of the ABI change, including potential security vulnerabilities. To mitigate these risks, the author suggests several strategies: changing the platform tuple to distinguish new ABIs, altering the library directory to allow for separate installations of old and new libraries, and introducing a binary-level ABI distinction to prevent incompatible binaries from linking. These measures aim to ensure a smoother transition while maintaining system stability and compatibility.

- Transitioning to 64-bit time_t is critical to avoid failures in 32-bit applications in 2038.

- ABI changes pose significant risks, including runtime errors and security vulnerabilities.

- Gentoo's source-based nature complicates the transition compared to binary distributions.

- Proposed solutions include changing platform tuples and library directories to manage ABI compatibility.

- The transition requires careful planning to minimize disruption and maintain system integrity.

AI: What people are saying
The comments reflect a range of perspectives on the transition to 64-bit time_t in Gentoo and the challenges associated with it.
  • Several users discuss the complexities of ABI compatibility and the potential for breaking changes during the transition.
  • Some suggest alternative strategies, such as using unsigned 32-bit time_t or static linking to mitigate issues.
  • There are references to how other operating systems, like Mac OS X and FreeBSD, have handled similar transitions successfully.
  • Concerns are raised about the burden on Gentoo developers and the potential for user frustration during the transition.
  • Many commenters express a desire for clearer solutions or a more straightforward approach to the problem.
31 comments
By @codys - 7 months
There are a few options for Gentoo not discussed in the post, possibly because for Gentoo they would be a larger amount of work due to the design of their system:

1. Allow building against packages without installing them. The core issue here is that Gentoo package build and install happen as a single step: one can't "build a bunch of things that depend on one another" and then "atomically install all the build items into place". This means that Gentoo can easily be partially broken when one is doing updates when an ABI change occurs (modulo so versioning, see next option). This issue with 64-bit time_t is an example of an ABI change that folks are very aware of and is very widespread. It's also an example of something that causes an ABI change that isn't handled by the normal `.so` versioning scheme (see next option).

2. Extend the normal `.so` versioning to capture changes to the ABI of packages caused by packages they depend on. Normally, every `.so` (shared object/library) embeds a version number within it, and is also installed with that version number in the file name (`libfoo.so.1.0.0`, for example, would be the real `.so` file, and would have a symlink from `libfoo.so` to tell the linker which `.so` to use). This shared object version is normally managed by the package itself internally (iow: every package decides on their ABI version number to enable them to track their own internal ABI breakages). This allows Gentoo to upgrade without breaking everything while an update is going on as long as every package manages their `.so` version perfectly correctly (not a given, but does help in many cases). There is a process in Gentoo to remove old `.so.x.y.z` files that are no longer used after an install completes. What we'd need to do to support 64-bit time_t is add another component to this version that can be controlled by the inherited ABI of dependencies of each `.so`. This is very similar in result to the "use a different libdir" option from the post, but while it has the potential to set things up to enable the same kinds of ABI changes to be made in the future, it's likely that fixing this would be more invasive than using a different libdir.

By @eschaton - 7 months
The way this was handled on Mac OS X for `off_t` and `ino_t` might provide some insight: The existing calls and structures using the types retained their behavior, new calls and types with `64` suffixes were added, and you could use a preprocessor macro to choose which calls and structs were actually referenced—but they were hardly ever used directly.

Instead, the OS and its SDK are versioned, and at build time you can also specify the earliest OS version your compiled binary needs to run on. So using this, the headers ensured the proper macros were selected automatically. (This is the same mechanism by which new/deprecated-in-some-version annotations would get set to enable weak linking for a symbol or to generate warnings for it respectively.)

And it was all handled initially via the preprocessor, though now the compilers have a much more sophisticated understanding of what Apple refers to as “API availability.” So it should be feasible to use the same mechanisms on any other platform too.
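
The suffix-plus-macro scheme described in this comment can be sketched in C roughly as follows. This is a minimal illustration, not Apple's or glibc's actual headers: `demo_time`, `demo_time64`, `demo_time_t`, `demo_clock` and the `DEMO_TIME_BITS` macro are all hypothetical names invented for the example.

```c
#include <stdint.h>

/* Hypothetical stand-in for the real clock, in seconds since the epoch. */
int64_t demo_clock = 0;

/* Old entry point: kept unchanged so existing binaries keep working.
   The narrowing conversion loses information once the clock passes 2^31 - 1. */
int32_t demo_time(void)
{
    return (int32_t)demo_clock;
}

/* New entry point, added alongside the old one under a "64" suffix. */
int64_t demo_time64(void)
{
    return demo_clock;
}

/* Header-side selection: one macro picks the type and redirects the
   unsuffixed name, so callers never spell the suffix themselves. */
#if defined(DEMO_TIME_BITS) && DEMO_TIME_BITS == 64
typedef int64_t demo_time_t;
#define demo_time demo_time64
#else
typedef int32_t demo_time_t;
#endif

/* Caller code is written once and is identical under both settings. */
demo_time_t current_time(void)
{
    return demo_time();
}
```

Compiling the caller with `-DDEMO_TIME_BITS=64` would switch it to the 64-bit type and symbol, while binaries built earlier keep resolving the old `demo_time` symbol from the same library.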

By @grantla - 7 months
> A number of other distributions such as Debian have taken the leap and switched. Unfortunately, source-based distributions such as Gentoo don’t have it that easy.

For Debian it was extremely painful. A few people probably burned out. Lots of people pointed to source-based distributions and said "they will have it very easy".

By @darkhelmet - 7 months
Every time I see people struggling with this, I am so incredibly glad that I forced the issue for FreeBSD when I did the initial amd64 port. I got to set the fundamental types in the ABI and decided to look forward rather than backward.

The amd64 architecture did have some interesting features that made this much easier than it might have been for other cpu architectures. One of which was the automatic cast of 32 bit function arguments to 64 bit during the function call. In most cases, if you passed a 32 bit time integer to a function expecting a 64 bit time_t, it Just Worked(TM) during the platform bringup. This meant that a lot of the work on the loose ends could be deferred.

We did have some other 64 bit platforms at the time, but they did not have a 64 bit time_t. FreeBSD/amd64 was the first in its family, back in 2003/2004/2005. If I remember correctly, sparc64 migrated to 64 bit time_t.

The biggest problem that I faced was that (at the time) tzcode was not 64 bit safe. It used some algorithms in its 'struct tm' normalization that ended up in some rather degenerate conditions, eg: iteratively trying to calculate the day/month/year for time_t(2^62). IIRC, I cheated rather than change tzcode significantly, and made it simply fail for years before approx 1900, or after approx 10000. I am pretty sure that this has been fixed upstream in tzcode long ago.

We did have a few years of whackamole with occasional 32/64 time mixups where 3rd party code would be sloppy in its handling of int/long/time_t when handling data structures in files or on the network.
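
One common defence against this class of mixup is to never write a raw `time_t` (or `int`/`long`) into a file or network message, and to encode a fixed-width value instead. The sketch below is only an illustration of that idea, not what any particular FreeBSD port did; `encode_timestamp` and `decode_timestamp` are hypothetical helper names.

```c
#include <stdint.h>
#include <time.h>

/* Write a timestamp as 8 big-endian bytes, independent of sizeof(time_t).
   A 32-bit signed time_t is sign-extended to 64 bits first. */
void encode_timestamp(time_t t, unsigned char out[8])
{
    uint64_t v = (uint64_t)(int64_t)t;
    for (int i = 7; i >= 0; i--) {
        out[i] = (unsigned char)(v & 0xff);
        v >>= 8;
    }
}

/* Read the 8 big-endian bytes back. If the local time_t is narrower than
   64 bits, the final conversion may lose range; a real reader would check. */
time_t decode_timestamp(const unsigned char in[8])
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | in[i];
    return (time_t)(int64_t)v;
}
```

Because the on-disk width is fixed at 8 bytes, a reader and a writer built with different `time_t` widths still agree on the record layout.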

But for the most part, it was a non-issue for us. Being able to have 64 bit time_t on day 1 avoided most of the problem. Doing it from the start was easy. Linux missed a huge opportunity to do the same when it began its amd64/x86_64 port.

Aside: I did not finish 64 bit ino_t at the time. 32 bit inode numbers were heavily exposed in many, many places. Even on-disk file systems, directory structures in UFS, and many, many more. There was no practical way to handle it for FreeBSD/amd64 from the start while it was a low-tier platform without being massively disruptive to the other tier-1 architectures. I did the work - twice - but somebody else eventually finished it - and fixed a number of other unfortunately short constants as well (eg: mountpoint path lengths etc).

By @nobluster - 7 months
For a large legacy 32 bit unix system dealing with forward dates I replaced all the signed 32 bit time_t libc functions with unsigned 32 bit time_t equivalents. This bought the system another 68 years beyond 2038 - long after I'll be gone. The downside is that it cannot represent dates before the unix epoch, 1970, but as it was a scheduling system it wasn't an issue.

If legacy dates were a concern one could shift the epoch by a couple of decades, or even reduce the time granularity from 1 second to 2 seconds. Each alternative has subtle problems of its own. It depends on the use case.
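
The "another 68 years" figure can be checked by formatting the last second each 32-bit interpretation can hold. This is a small sketch that assumes it is compiled on a host whose own `time_t` is 64 bits wide; `format_utc` and `print_limits` are hypothetical helper names.

```c
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Format `secs` (seconds since 1970-01-01 UTC) as a UTC date string.
   Returns 0 on success, -1 if the host cannot represent the value. */
int format_utc(int64_t secs, char *buf, size_t len)
{
    if (sizeof(time_t) < sizeof(int64_t))
        return -1;              /* host time_t too narrow for this demo */
    time_t t = (time_t)secs;
    struct tm *tm = gmtime(&t);
    if (tm == NULL)
        return -1;
    return strftime(buf, len, "%Y-%m-%d %H:%M:%S UTC", tm) > 0 ? 0 : -1;
}

/* Print the last second representable by each 32-bit interpretation. */
void print_limits(void)
{
    char buf[32];
    if (format_utc(INT32_MAX, buf, sizeof buf) == 0)
        printf("signed 32-bit ends:   %s\n", buf);
    if (format_utc(UINT32_MAX, buf, sizeof buf) == 0)
        printf("unsigned 32-bit ends: %s\n", buf);
}
```

On such a host, `print_limits()` reports 2038-01-19 03:14:07 UTC for the signed type and 2106-02-07 06:28:15 UTC for the unsigned one.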

By @seanhunter - 7 months
In the original BSD man pages, the "Bugs" section for "tunefs" had the famous joke "You can tune a file system, but you can't tune a fish." but according to "Expert C Programming"[1], the source code for this manpage had a comment next to the joke saying

   > Take this out and a UNIX Demon will dog your steps from now
   > until the time_t's wrap around. 

Obviously back in the 70s when that was written 2038 seemed unimaginably far in the future.

[1] https://progforperf.github.io/Expert_C_Programming.pdf

By @kccqzy - 7 months
My biggest takeaway (and perhaps besides-the-point) is this:

> Musl has already switched to that, glibc supports it as an option. A number of other distributions such as Debian have taken the leap and switched. Unfortunately, source-based distributions such as Gentoo don’t have it that easy.

While I applaud their efforts I just think as a user I want to be over and done with this problem by switching to a non source-based distribution such as Debian.

By @n_plus_1_acc - 7 months
I'm no expert on C, but I was under the impression that type aliases like off_t are introduced to have the possibility to change them later. This clearly doesn't work. Am I wrong?

By @mhandley - 7 months
"Let’s consider a trivial example:

  struct {
      int a;
      time_t b;
      int c;
  };

With 32-bit time_t, the offset of c is 8. With the 64-bit type, it’s 12."

Surely it's 16 not 12, as b needs to be 64-bit aligned, so padding is added between a and b? Which also kind of makes the point the author is trying to make stronger.
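
Which answer is right depends on the alignment the target ABI gives a 64-bit integer inside a struct. On the i386 System V ABI it is 4 bytes, so `c` lands at offset 12; on ABIs that align 64-bit integers to 8 bytes (32-bit ARM EABI, for example) it lands at 16. The sketch below lets the compiler report the layout for whatever target it is built for; it uses `int32_t` and `int64_t` as stand-ins for the two `time_t` widths and assumes a 4-byte `int`. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the struct as built with a 32-bit time_t. */
struct with_time32 {
    int a;
    int32_t b;
    int c;
};

/* Stand-in for the same struct as built with a 64-bit time_t. */
struct with_time64 {
    int a;
    int64_t b;
    int c;
};

/* Print where each member lands on the ABI being compiled for. */
void print_offsets(void)
{
    /* The example assumes the usual 4-byte int. */
    assert(sizeof(int) == 4);

    printf("32-bit time_t: offsetof(c) = %zu, sizeof = %zu\n",
           offsetof(struct with_time32, c), sizeof(struct with_time32));
    printf("64-bit time_t: offsetof(b) = %zu, offsetof(c) = %zu, sizeof = %zu\n",
           offsetof(struct with_time64, b),
           offsetof(struct with_time64, c), sizeof(struct with_time64));
}
```

Building this with `gcc -m32` on x86 and again for a target with 8-byte alignment shows both numbers from the discussion; either way, code compiled against one layout reads the wrong bytes from a struct produced under the other.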

By @Netch - 7 months
All this suggests the insane Windows time (64 bits of 100ns periods since 01.01.1601 00:00 GMT as it would have been Gregorian) sometimes has its small advantages - both excellent resolution and it will still work even when the whole galaxy has been conquered... ;))
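
For reference, converting between that Windows representation (known as FILETIME) and Unix seconds is a fixed offset and a scale factor. The sketch below uses `int64_t` throughout for simplicity, although the Windows value is formally unsigned; the function and macro names are hypothetical.

```c
#include <stdint.h>

/* Seconds between 1601-01-01 and 1970-01-01 (UTC). */
#define EPOCH_DIFF_SECS 11644473600LL
/* Number of 100 ns ticks in one second. */
#define TICKS_PER_SEC   10000000LL

/* Unix seconds -> 100 ns ticks since 1601. */
int64_t unix_to_filetime(int64_t unix_secs)
{
    return (unix_secs + EPOCH_DIFF_SECS) * TICKS_PER_SEC;
}

/* 100 ns ticks since 1601 -> Unix seconds (sub-second part discarded). */
int64_t filetime_to_unix(int64_t ticks)
{
    return ticks / TICKS_PER_SEC - EPOCH_DIFF_SECS;
}
```

A full unsigned 64-bit count of 100 ns ticks spans on the order of 58,000 years, which is where the range advantage comes from.
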
By @somat - 7 months
Just pull the bandaid off, it will hurt a little, but not as bad as this endless agonizing about the problem.

For what it's worth, on openbsd this problem is fixed, all architectures, even the 32bit ones, have a 64bit time_t.

http://www.openbsd.org/55.html

By @wallstprog - 7 months
Briefly mentioned elsewhere in the comments, but C++11 had a similar issue around the transition from a copy-on-write (COW) to a small-string-optimization (SSO) implementation for std::string. If any type is more ubiquitous than std::string, I don't know what it could be, but the transition was reasonably painless, at least in my shop.

See https://gcc.gnu.org/onlinedocs/libstdc++/manual/using_dual_a... for more info.

By @BoingBoomTschak - 7 months
A very thoughtful way of handling a problem much trickier than the earlier /usr merge. Had thought about 2 but not 1 and 3. I also had kinda forgotten that 2038 thing, time sure is flying!

I must say, mgorny's posts are always a treat for those who like to peer under the hood! (The fact that Gentoo has remained my happy place for years doesn't influence this position, though there's probably some correlation)

By @wpollock - 7 months
I'm probably naive, but I see another way forward. If all the t64 binaries are static linked, they have no danger of mixing abis. After 2038, all the t32 code is broken anyway so there's no additional risk for going back to dynamic linking then. I feel if this was a solution the author would have mentioned it but I'm willing to look foolish to hear what others will say.
By @loeg - 7 months
The C standard does not require time_t to be signed (nor does POSIX). Just changing the 32-bit type to unsigned would (in some senses) extend the lifetime of the type out to 2106. You could at least avoid some classes of ABI breakage in this way. (On the other hand, Glibc has explicitly documented that its time_t is always signed. So they do not have this option.)
By @ndesaulniers - 7 months
I think we should start putting details of the ABI in ELF (not the compiler flags as a string). Wait.. Who owns the ELF spec??

Then the linker and loader could error if two incompatible objects with different ABIs were attempted to be linked together.

For instance, I suspect you could have fields to denote the size of certain types. I guess DWARF has that... But DWARF is optional and sucks to parse.

By @dark-star - 7 months
Are there other distros where the switch to 64-bit time_t has already happened? What's the easiest way to figure out whether $distro uses 32-bit or 64-bit time_t? Is there something easier/quicker than writing a program to print `sizeof(struct stat)` and check if that's 88, 96 or 108 bytes (as hinted in the article)?
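
The program in question is only a few lines. Note that the answer reflects the flags the program itself is compiled with, not only the distribution's default: on glibc 2.34 or later, a 32-bit build can opt in with `-D_TIME_BITS=64 -D_FILE_OFFSET_BITS=64`. The helper names below are hypothetical.

```c
#include <limits.h>
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>

/* Width of time_t, in bits, for this translation unit. */
int time_t_bits(void)
{
    return (int)(sizeof(time_t) * CHAR_BIT);
}

/* Report the sizes this translation unit was compiled with. */
void report_time_abi(void)
{
    printf("time_t is %d bits\n", time_t_bits());
    printf("sizeof(struct stat) = %zu\n", sizeof(struct stat));
}
```

Compiling it once with the distribution's default flags and once with the opt-in macros shows whether the default has already changed.
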
By @fred_is_fred - 7 months
Besides epoch time and the LFS support mentioned, are there any other 32-bit bombs waiting for Linux systems like this?
By @panzi - 7 months
The only place where this is relevant for me is running old Windows games via wine. Wonder how wine is handling this? Might as well re-map the date for 32bit wine to the late 90s/early 2000s, where my games are from. Heck, with faketime I can do that already, but don't need it yet.
By @ddoolin - 7 months
Why was it 32 bits to begin with? Wasn't it known that 2038 would be the cutoff?
By @rini17 - 7 months
Hope Gentoo alone won't get overburdened with the development and maintenance of such a hybrid toolchain, able to compile and run time32 processes alongside time64, when everyone else has moved on. Better to do a complete reinstall from stage3.
By @kazinator - 7 months
All those arguments apply to off_t also. If a small-file executable with 32 bit off_t uses a large-file library with 64 bit off_t, there is no protection. Only glibc has the multiply implemented functions, selected by header file macrology.
By @malkia - 7 months
Microsoft's famous (infamous?) "A" / "W" functions and macros controlling them -... but it works and one can link two different versions no problem. Wonder if that was possible here?
By @ok123456 - 7 months
What about making time_t 64-bit across all profiles and incrementing all the profile versions by one? Gentoo users are used to breaking changes when upgrading profiles.
By @shmerl - 7 months
LFS should have become mandatory.
By @layer8 - 7 months
> The second part is much harder. Obviously, as soon as we’re past the 2038 cutoff date, all 32-bit programs — using system libraries or not — will simply start failing in horrible ways. One possibility is to work with faketime to control the system clock. Another is to run a whole VM that’s moved back in time.

Yet another is to freeze the time for those programs at the last 32-bit POSIX second. They would just appear to execute incredibly fast :). Of course some will still break, and it’s obviously not suitable for many use cases (but neither is running in the past), but some might be just fine.
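
The "freeze" idea amounts to saturating the real time at the largest value a signed 32-bit `time_t` can hold. A minimal sketch of just that clamping step, with a hypothetical function name, is shown here; how such a value would be injected into a running legacy program is not covered.

```c
#include <stdint.h>

/* Map a real 64-bit timestamp onto what a legacy 32-bit program would see:
   unchanged while it fits, pinned to the last (or first) representable
   second once it no longer does. */
int32_t frozen_time32(int64_t real_secs)
{
    if (real_secs > INT32_MAX)
        return INT32_MAX;
    if (real_secs < INT32_MIN)
        return INT32_MIN;
    return (int32_t)real_secs;
}
```

From the program's point of view the clock stops advancing after the cutoff, so any logic that measures elapsed time would see zero-length intervals.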

By @M95D - 7 months
I believe this problem is irrelevant. 32bit platforms are mostly unsupported, with Gentoo making efforts to keep them working and mostly failing. By 2038, 32bit systems will be as rare as an i286 today.
By @cryptonector - 7 months
The simplest thing to do is to make any 32-bit `time_t`s be unsigned. That buys another 68 years to get the transition done. Not that that's exactly easy, but it's easier than switching to 64-bit time_t.
By @jeffbee - 7 months
Are we still keeping shared libraries? They are a complex solution to a problem that arguably stopped existing 20 years ago. Might be time to rethink the entire scheme.