What is the Year 2038 problem and how to fix it?

18 years from now, when the clock strikes 14 minutes and seven seconds past three on the morning of Tuesday 19 January 2038 UTC, a bug known as the Year 2038 Problem is expected to occur. Any computer, program, server or embedded system that stores time as a 32-bit signed integer will go haywire one second later unless it is upgraded in advance. Some software that works with future dates has already begun to fail, because its deadline for being patched arrived years earlier.

Many of the operating systems in use today can be traced back to UNIX. When engineers developed the first UNIX operating system in the 1970s, they decided that time would be represented as a signed 32-bit integer counting the number of seconds since 12:00:00 a.m. UTC on January 1, 1970. A signed 32-bit counter can only reach 2,147,483,647, which translates into 3:14:07 a.m. UTC on January 19, 2038. After that second, any C program that relies on a 32-bit time_t will have trouble calculating the date.
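To make the arithmetic concrete, here is a minimal Python sketch that converts the largest signed 32-bit value into a calendar date. The names INT32_MAX and last_32bit_moment are made up for this illustration; only the standard datetime module is used.

```python
from datetime import datetime, timezone

# Largest value a signed 32-bit integer can hold.
INT32_MAX = 2**31 - 1


def last_32bit_moment() -> datetime:
    """Return the UTC date and time of the largest signed 32-bit timestamp."""
    return datetime.fromtimestamp(INT32_MAX, tz=timezone.utc)


print(INT32_MAX)                        # 2147483647
print(last_32bit_moment().isoformat())  # 2038-01-19T03:14:07+00:00
```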

The issue with signed integers is that they don’t behave like an automobile’s odometer. When a 5-digit odometer reaches 99,999 miles and the driver goes one extra mile, the digits “turn over” to 00000. But when a signed integer reaches its maximum value and is then incremented, it typically jumps to its lowest possible negative value. Adding 1 to the maximum value of 2,147,483,647 causes the integer to wrap around to its minimum value of -2,147,483,648, which represents December 13, 1901, at 8:45:52 PM GMT. Any affected computer will think it has traveled back in time. This is called an ‘integer overflow’: the counter has run out of usable bits and begins reporting a negative number.
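The wraparound can be simulated in a few lines of Python. This is only a sketch: Python integers never overflow on their own, so the hypothetical helper wrap_int32 imitates a two's-complement 32-bit counter with modular arithmetic.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def wrap_int32(value: int) -> int:
    """Reduce an integer to what a signed 32-bit two's-complement counter would hold."""
    return (value + 2**31) % 2**32 - 2**31


def to_utc(seconds: int) -> datetime:
    """Convert seconds since the epoch (possibly negative) to a UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


# One second past the maximum value of the counter.
wrapped = wrap_int32(2147483647 + 1)
print(wrapped)                      # -2147483648
print(to_utc(wrapped).isoformat())  # 1901-12-13T20:45:52+00:00
```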

Many of the support functions that use the time_t data type cannot handle negative time_t values at all. They fail and return an error code, which can cause the calling program to crash spectacularly. In particular, the bug affects Unix-like operating systems, which underpin Android and Apple phones and most internet servers. Programs that work with future dates will start experiencing problems sooner. For example, a program that deals with dates 20 years ahead should have been fixed by 2018.

For Y2038 planning, an incremental and proactive approach is needed at this stage. Right now, some areas to focus on include: 1) software dealing with future times and dates; 2) on-the-wire message and file formats; 3) devices with long deployed lifetimes and their dependencies.

The most important area to focus on initially is software that deals with future dates, such as software for handling X.509 certificates (like the ones used for HTTPS) and certificate authorities (CAs), or for financial planning. In many of these cases, it has been possible to resolve the issues by moving legacy software from a 32-bit time_t to a 64-bit time_t. In other cases, more extensive changes are needed, especially when times are cast into integers for math, when message wire formats are involved, or when values are stored in databases. In testing and fixing support for 20-year CA certificates, downstream dependencies can come into play. If a date 30 years in the future gets fed into a logging or monitoring system, and those in turn feed into alerting systems, reporting databases or provisioning systems, then all of those may need fixes too.
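A quick way to find this kind of exposure is to check whether the future dates a system handles still fit in 32 bits. The sketch below is illustrative only: fits_in_32bit_time is a made-up helper, and the 20-year certificate issued on 1 January 2020 is an invented example.

```python
from datetime import datetime, timezone

INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def fits_in_32bit_time(moment: datetime) -> bool:
    """Return True if the moment can be stored as a signed 32-bit Unix timestamp."""
    return INT32_MIN <= int(moment.timestamp()) <= INT32_MAX


# Invented example: a CA certificate issued in 2020 with a 20-year lifetime.
issued = datetime(2020, 1, 1, tzinfo=timezone.utc)
expires = issued.replace(year=issued.year + 20)

print(fits_in_32bit_time(issued))   # True
print(fits_in_32bit_time(expires))  # False
```

The same check can be run over expiry dates, maturity dates or schedule entries pulled from a real system to see which records already cross the 2038 boundary.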

The impact can extend well beyond a specific system when 32-bit timestamps are put into messages, databases, or file formats. These systems have external dependencies, so more advance planning is often needed because the interactions cross system boundaries. For these collections of interoperating systems, changes may need to be released in a specific order, and backward compatibility usually comes into play. Furthermore, if there are formally or informally standardized protocols that use 32-bit epoch timestamp values in messages, any migration or fix could be predicated on fixing the standard first. These cases deserve early attention because they form a dependency chain such as:

  • Update protocol/standards to support post-Y2038 timestamps.
  • Implement support for the updated standard in software libraries.
  • Get a new version of libraries incorporated into software packages.
  • Get software packages included in a new shipping product.

If each of these takes a few years and the shipping product has a long lifespan, then the long lead-times here may already be a problem.
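To illustrate why wire formats need this kind of staged rollout, here is a sketch of a hypothetical record layout that carries a 32-bit timestamp, next to an updated layout with a version byte and a 64-bit timestamp. Both layouts and all function names are assumptions made up for this example, not any real protocol.

```python
import struct


def pack_v1(timestamp: int) -> bytes:
    """Hypothetical legacy record: a single signed 32-bit little-endian timestamp."""
    return struct.pack("<i", timestamp)


def pack_v2(timestamp: int) -> bytes:
    """Hypothetical updated record: a version byte followed by a signed 64-bit timestamp."""
    return struct.pack("<Bq", 2, timestamp)


def unpack_any(payload: bytes) -> int:
    """Read either layout, so old and new senders can coexist during a migration."""
    if len(payload) == 4:
        return struct.unpack("<i", payload)[0]
    version, timestamp = struct.unpack("<Bq", payload)
    if version != 2:
        raise ValueError(f"unsupported record version: {version}")
    return timestamp


after_2038 = 2**31  # first second that no longer fits in 32 bits

try:
    pack_v1(after_2038)
except struct.error as err:
    print("legacy format rejects the value:", err)

print(unpack_any(pack_v2(after_2038)))  # 2147483648
```

In this sketch the reader is updated before any writer starts emitting the new layout, which mirrors the release ordering described above.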

Devices with long deployment lifetimes should also be an area of focus. Embedded devices shipping with 32-bit hardware may not have the easy fix of recompiling for a 64-bit time_t and delivering it via a software update. Connected automobiles, as well as other IoT devices, are likely to be an area of specific concern. Given current trends, it is likely that over 10% of cars sold today will still be operating in 2038, and with increases in vehicle age and in the number of vehicles on the road, this may be even higher. We may end up with a significant fraction of automobiles having the potential for serious issues in eighteen years. The same pattern exists in other embedded systems such as home gaming consoles and smart televisions, where devices may ship with 20+ year CA certificates pre-installed.

Communications devices, such as cell phones and Internet appliances (routers, wireless access points), are another major use of embedded systems. They rely on storing an accurate time and date and are increasingly based on UNIX-like operating systems. Users have reported that, because of the Y2038 problem, some devices running 32-bit Android crash and do not restart when the time is changed to January 19, 2038.

Devices with long deployed lifetimes may require more comprehensive testing to confirm that the operating system and software continue to work properly before, during, and after the Y2038 transition point.
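A boundary test of this kind can be sketched as follows. The two formatters are hypothetical stand-ins for code under test: one keeps the full-width value, the other squeezes it into 32 bits first, the way unpatched legacy code might.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
ROLLOVER = 2**31  # first second a signed 32-bit counter cannot represent


def format_time_64(seconds: int) -> str:
    """Hypothetical code under test that keeps the full-width value."""
    return (EPOCH + timedelta(seconds=seconds)).isoformat()


def format_time_32(seconds: int) -> str:
    """Hypothetical legacy code that truncates the value to 32 bits first."""
    wrapped = (seconds + 2**31) % 2**32 - 2**31
    return (EPOCH + timedelta(seconds=wrapped)).isoformat()


def survives_rollover(formatter) -> bool:
    """True if formatted times keep increasing before, at, and after the rollover."""
    samples = [ROLLOVER - 2, ROLLOVER - 1, ROLLOVER, ROLLOVER + 1]
    outputs = [formatter(s) for s in samples]
    # ISO 8601 strings in the same time zone sort in chronological order.
    return all(earlier < later for earlier, later in zip(outputs, outputs[1:]))


print(survives_rollover(format_time_64))  # True
print(survives_rollover(format_time_32))  # False
```

On a real device the equivalent test usually means setting the system clock to just before the rollover and watching what happens as it passes, but the idea is the same.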

Like the Y2K bug, it’s a well-known issue, yet many people don’t consider it a serious threat. A common excuse you can find on forums and message boards is that by the time 2038 rolls around, there won’t be any 32-bit software or systems left. But the Y2K fiasco showed that everyone underestimated the longevity of software architecture and how deeply embedded it would become.

People tend to be short-sighted, thinking of now more than even the near future. Programmers thought the year 2000 was so far off, computers and software would surely be different by then! They didn’t need to worry about it—until the 1990s when the Y2K bug went from a non-problem to a mild panic, with the direst warnings talking of civilization collapse.

The total cost to fix the Y2K bug was over $300 billion, plus a few more billion spent on dealing with issues that appeared after the turn of the century. When the year 2000 rolled around, nothing catastrophic happened. None of the dire warnings about the Y2K bug came true. This led many to believe that the whole thing had been blown out of proportion.

But there was no Y2K crisis thanks to all the programmers who put in the effort to fix the problem, changing millions of lines of code so that 8 digits instead of 6 would represent the date. The irony is that if you do your job properly, either no one notices, or they may even question the need for your job in the first place.

The lack of impact of Y2K may cause organizations and technologists to under-prepare for Y2038. It is harder to explain the “Y2038 problem” to laypeople than Y2K, potentially making it harder to prioritize and focus on advance work. The spread of embedded Internet of Things (IoT) devices also makes the potential impact of Y2038 considerably higher than that of Y2K.

The solution isn’t conceptually difficult. We just need to store time in 64 bits or more, which gives a far higher maximum. Over the last decade, many personal computers have made this shift, as have organizations that already need to project time past 2038, like banks that must deal with 30-year mortgages.
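For a rough sense of how much headroom the wider counter buys, the sketch below compares how many years of seconds fit in a signed 32-bit and a signed 64-bit value. It assumes an average Gregorian year of 365.2425 days, and years_of_range is a made-up helper.

```python
# Average length of a Gregorian year in seconds (365.2425 days).
SECONDS_PER_YEAR = 365.2425 * 24 * 60 * 60


def years_of_range(bits: int) -> float:
    """Approximate number of years a signed counter of this width can count in seconds."""
    return (2 ** (bits - 1) - 1) / SECONDS_PER_YEAR


print(round(years_of_range(32)))    # 68
print(f"{years_of_range(64):.2e}")  # 2.92e+11
```

Sixty-eight years after 1970 is 2038, while a 64-bit counter lasts for roughly 292 billion years.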

Apple claims that the iPhone 5S is the first 64-bit smartphone. However, the 2038 problem applies to both hardware and software, so even if the 5S uses 64 bits, an alarm clock app could still be 32 bits and so must be updated as well.

The problem does not seem too urgent (we have 18 years to fix it!), but its scope is massive. To give you an idea of how slowly corporations can implement software updates, many ATMs were reportedly still running Windows XP, and thus vulnerable to hackers, years after Microsoft ended extended support for it in April 2014.

So it’s important to start upgrading your systems NOW, and to be aware of vendors that refuse to do so in time, so that you can avoid costly, short-term patches to your systems and software.
