Understanding How to Convert Excel Style Dates with Pandas


As data analysts and scientists, we often run into date-related issues when working with different file formats. One such case is the Excel style date, which represents a date as a serial number: the count of days elapsed since a fixed reference date, with any fractional part encoding the time of day. In this article, we will explore how to convert these numbers into regular datetime objects using pandas.

Introduction


Excel stores dates as serial numbers rather than as text. In the default 1900 date system, serial 1 corresponds to January 1, 1900, and each increment of 1 is one day. The values are floating-point numbers: the integer part counts whole days, and the fractional part is the share of a 24-hour day that has elapsed (0.5 is noon, 0.25 is 06:00). Because dates are plain numbers, they are easy to sort and to do arithmetic on.

For example, an Excel style date of 42580.3333333333 corresponds to July 29, 2016 at approximately 08:00: 42580 whole days after the reference date, plus one third of a day.
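To make the encoding concrete, the following minimal sketch splits a serial number into its day count and its time-of-day fraction using only the Python standard library. The name EXCEL_EPOCH is an illustrative assumption, and the epoch of 1899-12-30 is assumed to apply to the 1900 date system for serials of 61 and above.

```python
from datetime import datetime, timedelta

# Assumed reference date for Excel's 1900 date system (serials >= 61)
EXCEL_EPOCH = datetime(1899, 12, 30)

serial = 42580.3333333333

whole_days = int(serial)            # integer part: days after the epoch
day_fraction = serial - whole_days  # fractional part: share of a 24-hour day

date_part = EXCEL_EPOCH + timedelta(days=whole_days)
time_part = timedelta(days=day_fraction)
result = date_part + time_part

print(date_part.date())  # the calendar date, 2016-07-29
print(time_part)         # close to 8:00:00, minus a little floating-point noise
print(result)
```

The time component falls slightly short of 08:00 because 0.3333333333 is a truncated decimal expansion of one third.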

The Problem


When working with pandas, we often need to convert these Excel style dates into regular datetime objects. However, pd.to_datetime() does not recognize these numbers as Excel dates on its own: by default it interprets a number as nanoseconds since the Unix epoch (1970-01-01).

This is where the unit and origin parameters come in. These parameters allow us to specify how to interpret the date value when converting it from a float to a datetime object.
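The difference is easy to see by converting the same number twice, once with the defaults and once with unit and origin. This is a small sketch rather than a definitive recipe; the variable names naive and converted are illustrative, and the exact sub-second digits printed may vary with the pandas version.

```python
import pandas as pd

serial = 42580.3333333333

# Default behavior: the number is read as nanoseconds after 1970-01-01
naive = pd.to_datetime(serial)

# With unit and origin: the number is read as days after 1899-12-30
converted = pd.to_datetime(serial, unit='D', origin='1899-12-30')

print(naive)      # a timestamp a few microseconds after the Unix epoch
print(converted)  # a timestamp on 2016-07-29, close to 08:00
```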

The Solution


To convert an Excel style date into a regular datetime object, you can use the following code:

import pandas as pd

# Create a DataFrame with an Excel style date column
df = pd.DataFrame({'xldate': [42580.3333333333]})

# Convert the 'xldate' column to a datetime object using unit='D' and origin='1899-12-30'
df['date'] = pd.to_datetime(df['xldate'], unit='D', origin='1899-12-30')

print(df['date'])  # Output: 0   2016-07-29 07:59:59.999971200

In this example, we create a DataFrame with a single column xldate containing the Excel style date 42580.3333333333. We then use pd.to_datetime() to convert this value into a datetime object.

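The unit='D' parameter tells pandas that the values are counted in days. The origin='1899-12-30' parameter sets the reference date that the count starts from. The origin is December 30, 1899 rather than January 1, 1900 for two reasons: Excel numbers January 1, 1900 as day 1 instead of day 0, and it treats 1900 as a leap year, which inserts a nonexistent February 29, 1900 as day 60. Together these shift every date from March 1, 1900 onward by two days.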
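As a quick check on the choice of origin, the sketch below converts two serials whose Excel dates are known (61 is 1900-03-01 and 42370 is 2016-01-01 in the 1900 date system) using two candidate origins. The variable names are illustrative only.

```python
import pandas as pd

# Serial 61 is 1900-03-01 and serial 42370 is 2016-01-01 in Excel
serials = pd.Series([61, 42370])

with_dec_30 = pd.to_datetime(serials, unit='D', origin='1899-12-30')
with_jan_01 = pd.to_datetime(serials, unit='D', origin='1900-01-01')

print(with_dec_30.dt.date.tolist())  # matches the dates Excel displays
print(with_jan_01.dt.date.tolist())  # two days later than Excel's dates
```

Using 1900-01-01 as the origin looks natural but lands two days late, which is why 1899-12-30 is the usual choice for modern dates.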

How it Works


When we pass unit='D', pandas treats the float as a number of days, converts it to a time offset, and adds that offset to the origin. No digit-by-digit decoding is involved; the conversion is plain date arithmetic.

For example, the Excel style date 42580.3333333333 breaks down as follows:

  • The integer part, 42580, is the number of whole days after the origin: 1899-12-30 plus 42580 days is 2016-07-29.
  • The fractional part, 0.3333333333, is a fraction of a 24-hour day: multiplied by 24, it comes to just under 8 hours.
  • Adding the two gives a timestamp a tiny fraction of a second short of 2016-07-29 08:00:00. The shortfall comes from the truncated decimal expansion of one third.
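The same arithmetic can be spelled out by hand with a Timestamp and a Timedelta. This is a sketch of an equivalent calculation, not a description of pandas internals; rounding to whole seconds is an added assumption that sub-second precision is not needed.

```python
import pandas as pd

serial = 42580.3333333333
origin = pd.Timestamp('1899-12-30')

# Origin plus the serial interpreted as a number of days
manual = origin + pd.to_timedelta(serial, unit='D')

# Rounding to the nearest second removes the floating-point noise
clean = manual.round('s')

print(manual)
print(clean)  # 2016-07-29 08:00:00
```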

Handling Excel Ordinal Values


One potential issue when working with Excel style dates is handling values less than 60. Microsoft documents that Excel treats 1900 as a leap year (a behavior kept for compatibility with Lotus 1-2-3), so serial 60 stands for February 29, 1900, a date that never existed. Serials 1 through 59 (January 1 to February 28, 1900) therefore sit one day closer to the origin than later dates, and converting them with origin='1899-12-30' yields a result that is one day too early.

To handle these cases correctly, we can modify our conversion code to add one day to values less than 60:

import pandas as pd

# Create a DataFrame with an Excel style date column
df = pd.DataFrame({'xldate': [42580.3333333333, 12.0333333333333]})

# Convert with the usual origin, then add one day to serials below 60,
# which precede Excel's nonexistent 1900-02-29
df['date'] = pd.to_datetime(df['xldate'], unit='D', origin='1899-12-30')
df.loc[df['xldate'] < 60, 'date'] += pd.Timedelta(days=1)

print(df['date'])

In this updated example, we first convert every value with the standard origin. We then select the values that are less than 60 and shift them forward by one day, which is equivalent to converting them with origin='1899-12-31'. A serial of exactly 60 has no real calendar date; this code leaves it at 1900-02-28.
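For repeated use, the conversion and the adjustment for serials below 60 (which precede Excel's fictitious February 29, 1900) can be wrapped in one function. The helper below is a hypothetical sketch: the name excel_serial_to_datetime is not part of pandas, it assumes the 1900 date system, and it rounds to whole seconds on the assumption that sub-second precision is not needed.

```python
import pandas as pd

def excel_serial_to_datetime(serials):
    """Hypothetical helper: convert 1900-system Excel serials to datetimes."""
    serials = pd.Series(serials, dtype='float64')
    dates = pd.to_datetime(serials, unit='D', origin='1899-12-30')
    # Serials below 60 come before the fictitious leap day: shift by one day
    dates = dates.where(serials >= 60, dates + pd.Timedelta(days=1))
    # Round to whole seconds to hide floating-point noise (assumption)
    return dates.dt.round('s')

converted = excel_serial_to_datetime([1, 59, 61, 42580.3333333333])
print(converted)
```

Here serials 1 and 59 map to January 1 and February 28, 1900, serial 61 maps to March 1, 1900, and the last value maps to 2016-07-29 08:00:00.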

Further Reading


For more information on Excel style dates, including their history and implementation details, see:

By following these guidelines and using the unit and origin parameters, you can easily convert Excel style dates into regular datetime objects using pandas.


Last modified on 2025-03-07