Mastering Dates and Timestamps in Pandas (and Python in general) (2024)

  • Dates in general

More specifically, handling operations with…

  • Days
  • Weeks
  • Months
  • Years

Extra: Generating date ranges

  • Generate Sequential date ranges
  • Generate Random date ranges

(*) Before running any code, please import pandas!

import pandas as pd

Now, let’s get started.

Changing the Date Format of a DataFrame Column

You can choose any format you want by following the strftime documentation.

For example, given a DataFrame df whose date column holds date strings, change the date format with:

df['date'] = df['date'].apply(lambda x: pd.Timestamp(x).strftime('%Y-%m-%d'))

Or, we can go a bit more exotic and do:

df['date'] = df['date'].apply(lambda x: pd.Timestamp(x).strftime('%B-%d-%Y %I:%M %p'))

Remember, you can find all the available timestamp format codes in the strftime documentation.

Try it yourself! Starting from this example-built DataFrame:

df = pd.DataFrame({'date': ['2018-08-09 11:10:55','2019-03-02 13:15:21']})
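
As a possible alternative to the apply/lambda calls above, the sketch below parses the column once and then formats it with the vectorized .dt.strftime accessor. The column names date_short and date_long are made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'date': ['2018-08-09 11:10:55', '2019-03-02 13:15:21']})

# Parse the strings once, then format the whole column at once.
parsed = pd.to_datetime(df['date'])

# Hypothetical column names, used only for this example.
df['date_short'] = parsed.dt.strftime('%Y-%m-%d')
df['date_long'] = parsed.dt.strftime('%B-%d-%Y %I:%M %p')

print(df)
```

Note that strftime returns strings, so the formatted columns are no longer datetimes. Keep the parsed column around if you still need date arithmetic.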

Change column type — from Object/String to DateTime

# 4 possible options
from dateutil.parser import parse  # needed for option 3

# 1
df['date'] = pd.to_datetime(df['date'])
# 2
df['date'] = df['date'].astype('datetime64[ns]')
# 3
df['date'] = df['date'].apply(lambda x: parse(x))
# 4
df['date'] = df['date'].apply(lambda x: pd.Timestamp(x))

Example (we try only one of the 4 options, but all of them should work):

df = pd.DataFrame({'date': ['2018-08-09 11:10:55','2019-01-02 13:15:21']})
df
df.dtypes
df['date'] = pd.to_datetime(df['date'])
df
df.dtypes

Output:

The date column's dtype changes from object (string) to a datetime64 type.
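
When the input may contain malformed values, one option is to pass an explicit format together with errors='coerce', so that unparseable entries become NaT instead of raising an error. This is a minimal sketch; the sample values are invented for illustration.

```python
import pandas as pd

# Invented sample data: one valid timestamp and one bad value.
raw = pd.Series(['2018-08-09 11:10:55', 'not a date'])

# An explicit format avoids guessing; errors='coerce' turns failures into NaT.
parsed = pd.to_datetime(raw, format='%Y-%m-%d %H:%M:%S', errors='coerce')

print(parsed)
print(parsed.isna())
```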

Going from String to Date format, and from Date format to String

Example with isolated variables:

from dateutil.parser import parse

str_date = '2018-05-01'
# String to Date:
date_1 = parse(str_date)
print('date_1: ', date_1, type(date_1))
# Date to String:
date_2 = date_1.strftime('%Y-%m-%d')
print('date_2: ', date_2, type(date_2))

Output:

date_1:  2018-05-01 00:00:00 <class 'datetime.datetime'>
date_2:  2018-05-01 <class 'str'>
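
If the input format is known in advance, the same round trip can also be sketched with the standard library alone, using datetime.strptime instead of dateutil. The variable names are illustrative.

```python
from datetime import datetime

str_date = '2018-05-01'

# String to date: the format string must match the input exactly.
as_date = datetime.strptime(str_date, '%Y-%m-%d')

# Date to string, this time in a different layout (day/month/year).
as_text = as_date.strftime('%d/%m/%Y')

print(as_date, type(as_date))
print(as_text, type(as_text))
```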

From Unix/Epoch time to Readable date format

df['date'] = pd.to_datetime(df['date'],unit='s')

Example:

df = pd.DataFrame({'date': [1349720105,1349806505]})
df
df['date'] = pd.to_datetime(df['date'],unit='s')
df

Output (before and after):

The date column goes from 1349720105 and 1349806505 to 2012-10-08 18:15:05 and 2012-10-09 18:15:05.
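
The reverse direction, from a datetime column back to epoch seconds, can be sketched by subtracting the epoch origin and dividing by one second. The column names epoch and epoch_again are assumptions made for this example.

```python
import pandas as pd

df = pd.DataFrame({'epoch': [1349720105, 1349806505]})

# Epoch seconds -> readable timestamps (interpreted as UTC).
df['date'] = pd.to_datetime(df['epoch'], unit='s')

# Timestamps -> epoch seconds: the time elapsed since 1970-01-01, in whole seconds.
df['epoch_again'] = (df['date'] - pd.Timestamp('1970-01-01')) // pd.Timedelta(seconds=1)

print(df)
```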

Add and subtract dates

Use Timedelta! Example:

from datetime import datetime, timedelta
from dateutil.parser import parse
parse('2019-04-07') - timedelta(days=3)
# or, to get it as a string
(parse('2019-04-07') - timedelta(days=3)).strftime('%Y-%m-%d')

Output:

# with date format 
datetime.datetime(2019, 4, 4, 0, 0)
# with string format
'2019-04-04'
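
The same idea carries over to a whole DataFrame column. A minimal sketch, assuming the column already has a datetime dtype and using pd.Timedelta (the pandas counterpart of timedelta); the new column names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'date': pd.to_datetime(['2019-04-07', '2019-12-30'])})

# Shift every row at once; no apply/lambda needed.
df['minus_3_days'] = df['date'] - pd.Timedelta(days=3)
df['plus_1_week'] = df['date'] + pd.Timedelta(weeks=1)

print(df)
```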

Get the difference between two dates

Convert both strings into date format, and then do the calculation. Example:

from dateutil.parser import parse

d1 = parse('2018-12-01')
d2 = parse('2018-12-08')
abs((d2 - d1).days)

Output:

7
# 7 days
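
For two datetime columns, the difference can be computed row by row in one expression. This is a sketch with invented sample data; the start, end and days_between names are assumptions.

```python
import pandas as pd

df = pd.DataFrame({
    'start': pd.to_datetime(['2018-12-01', '2019-01-15']),
    'end': pd.to_datetime(['2018-12-08', '2019-01-10']),
})

# Subtracting two datetime columns gives a timedelta column;
# .dt.days extracts whole days and .abs() ignores the direction.
df['days_between'] = (df['end'] - df['start']).dt.days.abs()

print(df)
```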

Get the day from a Date

# for a column in a DataFrame (the column must have a datetime dtype)
df['day'] = df['date'].dt.day
# for a single value
from dateutil.parser import parse
parse('2018-08-09').day

Output:

9

Get week number of the year

Example:

df = pd.DataFrame({'date': ['2018-08-09 11:10:55','2019-01-02 13:15:21']})
# if date column type is a string
# (.week was removed in pandas 2.0; isocalendar() gives the ISO week number)
df['week'] = pd.to_datetime(df['date']).dt.isocalendar().week
# if date column type is a datetime
# df['week'] = df['date'].dt.isocalendar().week

Output:

The week column contains 32 and 1.

(*) To create a Week column, in the format yyyy-ww, use:

df = pd.DataFrame({'date': ['2018-08-09 11:10:55','2019-03-02 13:15:21']})
# if column type is a string/object
# (%U counts weeks starting on Sunday, so it can differ from the week number above)
df['yyyy_ww'] = pd.DatetimeIndex(df['date']).strftime('%Y-%U')
# if column type is a datetime
# df['yyyy_ww'] = df['date'].dt.strftime('%Y-%U')

And for an isolated variable:

from dateutil.parser import parse
date_1 = '2018-02-06'
parse(date_1).isocalendar()[1]

Output:

6
# 6th week of the year
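
Because %U (Sunday-based weeks, with week 00 before the first Sunday) and the ISO week number follow different rules, the two can disagree for the same date. The sketch below puts them side by side; the column names are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'date': pd.to_datetime(['2018-08-09', '2019-01-02'])})

# ISO calendar: weeks start on Monday, week 1 contains the first Thursday.
iso = df['date'].dt.isocalendar()
df['iso_yyyy_ww'] = iso['year'].astype(str) + '-' + iso['week'].astype(str).str.zfill(2)

# strftime %U: weeks start on Sunday, days before the first Sunday are week 00.
df['u_yyyy_ww'] = df['date'].dt.strftime('%Y-%U')

print(df)
```

Whichever convention you pick, it is worth using the same one everywhere a week label is created or parsed.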

Get weekday

Example:

df['weekday'] = df['date'].apply(lambda x: parse(str(x)).strftime("%A"))

Output:

A new weekday column holding the day names; for example, 2018-08-09 maps to Thursday.
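
When the column already has a datetime dtype, a vectorized sketch with the .dt accessor avoids parsing every value again. The weekday_num column is an extra added here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'date': pd.to_datetime(['2018-08-09 11:10:55', '2019-01-02 13:15:21'])})

# Day name as text, and day number with Monday=0 ... Sunday=6.
df['weekday'] = df['date'].dt.day_name()
df['weekday_num'] = df['date'].dt.dayofweek

print(df)
```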

Go from Year-Week format to yyyy-mm-dd format (getting the first and last day of the week)

Example: you want to know the start and end dates of week number 37 in the year 2018:

# define this function
from datetime import timedelta
from dateutil.parser import parse

def get_start_end_dates(yyyyww):
    year = yyyyww[:4]
    week = yyyyww[-2:]
    first_day_year = str(year) + '-' + '01' + '-' + '01'
    d = parse(first_day_year)
    # find the Monday of week 1 (the week containing the year's first Thursday)
    if d.weekday() <= 3:
        d = d - timedelta(d.weekday())
    else:
        d = d + timedelta(7 - d.weekday())
    dlt = timedelta(days=(int(week) - 1) * 7)
    return (d + dlt).strftime('%Y-%m-%d'), (d + dlt + timedelta(days=6)).strftime('%Y-%m-%d')

# run it
get_start_end_dates('201837')

Output (a tuple with the start and the end of the week):

('2018-09-10', '2018-09-16')
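
On Python 3.8 or newer, a shorter sketch of the same lookup can lean on date.fromisocalendar, which follows the ISO week rules directly. The function name iso_week_bounds is hypothetical, and the input is assumed to be a 6-character 'yyyyww' string.

```python
from datetime import date, timedelta

def iso_week_bounds(yyyyww):
    """Return (monday, sunday) of an ISO week given as a 'yyyyww' string."""
    year, week = int(yyyyww[:4]), int(yyyyww[4:])
    # Day 1 of an ISO week is Monday.
    monday = date.fromisocalendar(year, week, 1)
    sunday = monday + timedelta(days=6)
    return monday.isoformat(), sunday.isoformat()

print(iso_week_bounds('201837'))
```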

Get month number of the year

Example:

df = pd.DataFrame({'date': ['2018-08-09 11:10:55','2019-03-02 13:15:21']})
# if date column type is a string/object
df['month'] = pd.DatetimeIndex(df['date']).month
# if date column type is a datetime
# df['month'] = df['date'].dt.month

Output:

The month column contains 8 and 3.

And for an isolated variable:

from dateutil.parser import parse
date_1 = '2018-02-06'
parse(date_1).month

Output:

2
# 2nd month of the year

(*) To create a month column, in the format YYYY-MM, use:

df = pd.DataFrame({'date': ['2018-08-09 11:10:55','2019-03-02 13:15:21']})
# if column type is a string/object
df['yyyy_mm'] = pd.DatetimeIndex(df['date']).strftime('%Y-%m')
# if column type is a datetime
# df['yyyy_mm'] = df['date'].dt.strftime('%Y-%m')
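
A year-month label is often used as a grouping key. As a hedged sketch, .dt.to_period('M') produces a monthly period that can be passed to groupby; the amount column and its values are invented purely for this example.

```python
import pandas as pd

# Invented sample data: three dated rows with a hypothetical 'amount' column.
df = pd.DataFrame({
    'date': pd.to_datetime(['2018-08-09', '2018-08-21', '2019-03-02']),
    'amount': [10, 20, 5],
})

# A monthly Period behaves like a yyyy-mm label but still sorts chronologically.
df['period'] = df['date'].dt.to_period('M')
monthly = df.groupby('period')['amount'].sum()

print(monthly)
```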

Add or subtract months (go X months back or forward)

Use this function:

def monthdelta(date, delta):
    m, y = (date.month + delta) % 12, date.year + (date.month + delta - 1) // 12
    if not m:
        m = 12
    # leap years: divisible by 4, except century years not divisible by 400
    is_leap = y % 4 == 0 and (y % 100 != 0 or y % 400 == 0)
    d = min(date.day, [31, 29 if is_leap else 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31][m - 1])
    new_date = date.replace(day=d, month=m, year=y)
    return new_date.strftime('%Y-%m-%d')

Example (subtracting 4 months from a certain date):

monthdelta(parse('2019-11-09'), -4)

Output (shows the same date, but 4 months before):

'2019-07-09'
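
pandas also ships pd.DateOffset, which can shift by calendar months and clips the day to the end of the target month when needed. The sketch below is an alternative to the hand-written function, not a replacement the original article proposes.

```python
import pandas as pd

# Same example as above: four months before 2019-11-09.
four_back = pd.Timestamp('2019-11-09') - pd.DateOffset(months=4)

# Day 31 does not exist in February, so the result is clipped to the month end.
clipped = pd.Timestamp('2019-03-31') - pd.DateOffset(months=1)
clipped_leap = pd.Timestamp('2020-03-31') - pd.DateOffset(months=1)

print(four_back.strftime('%Y-%m-%d'))
print(clipped.strftime('%Y-%m-%d'), clipped_leap.strftime('%Y-%m-%d'))
```

The same offset can be added to or subtracted from a datetime column, for example df['date'] - pd.DateOffset(months=4).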

Get the year

Example:

df = pd.DataFrame({'date': ['2018-08-09 11:10:55','2019-03-02 13:15:21']})
# if date column type is a string/object
df['year'] = pd.DatetimeIndex(df['date']).year
# if date column type is a datetime
# df['year'] = df['date'].dt.year

And for an isolated variable:

from dateutil.parser import parse
date_1 = '2018-02-06'
parse(date_1).year

Output:

2018

Generate Sequential date ranges

Example: generating a date range from 01/01/2019 to 01/02/2019, with hourly frequency.

# 'h' = hourly; pandas before 2.2 documents this alias as 'H'
date_range = pd.date_range(start='01/01/2019', end='01/02/2019', freq='h')

See the pandas documentation on offset aliases for the other frequency options.
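
A few other ways to call pd.date_range, sketched under the assumption that you know either the end point or the number of periods. Passing a pd.Timedelta as freq sidesteps alias spelling differences between pandas versions.

```python
import pandas as pd

# Hourly steps between two dates, with the step given as a Timedelta.
hourly = pd.date_range(start='2019-01-01', end='2019-01-02', freq=pd.Timedelta(hours=1))

# A fixed number of periods instead of an end date.
daily = pd.date_range(start='2019-01-01', periods=5, freq='D')

# 'MS' means month start: the first day of each month.
month_starts = pd.date_range(start='2019-01-01', periods=3, freq='MS')

print(len(hourly), daily[-1], list(month_starts.strftime('%Y-%m-%d')))
```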

Generate Random date ranges

import random
import time
from dateutil.parser import parse

def str_time_prop(start, end, format, prop):
    # pick the point that lies a proportion `prop` of the way from start to end
    stime = time.mktime(time.strptime(start, format))
    etime = time.mktime(time.strptime(end, format))
    ptime = stime + prop * (etime - stime)
    return time.strftime(format, time.localtime(ptime))

selected_format = '%Y-%m-%d %H:%M:%S'

def random_date(start, end, prop):
    return parse(str_time_prop(start, end, selected_format, prop)).strftime(selected_format)

print(random_date("2020-01-01 13:40:00", "2020-01-01 14:10:00", random.random()))

def make_date(x):
    return random_date("2012-12-01 13:40:00", "2012-12-24 14:50:00", random.random())

Here’s the source for this function.

From this, we can generate random dates. For example, let’s generate a list of 10 random timestamps between Christmas and New Year:

def make_date(x):
    return random_date("2012-12-24 00:00:00", "2012-12-31 23:59:59", random.random())

[make_date(x) for x in range(10)]

We can also add it to any DataFrame, like this:

df = pd.DataFrame({'number': [1,2,3,4,5]})
df['time'] = df['number'].apply(make_date)
df
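
For larger volumes, a vectorized sketch can draw all the random timestamps in one call by sampling integer epoch seconds with NumPy. The random_timestamps helper and its seed parameter are hypothetical names introduced here, and the bounds are treated as UTC.

```python
import numpy as np
import pandas as pd

def random_timestamps(start, end, n, seed=None):
    """Draw n random timestamps (second resolution) in the range [start, end)."""
    rng = np.random.default_rng(seed)
    # Naive timestamps are treated as UTC when converted to epoch seconds.
    start_s = int(pd.Timestamp(start).timestamp())
    end_s = int(pd.Timestamp(end).timestamp())
    seconds = rng.integers(start_s, end_s, size=n)  # upper bound is exclusive
    return pd.to_datetime(seconds, unit='s')

sample = random_timestamps('2012-12-24 00:00:00', '2012-12-31 23:59:59', 10, seed=0)
print(sample)
```

Passing a seed makes the draw reproducible, which helps when the random dates feed a test or a demo.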

This is the end of the article. I hope you enjoyed it and can make good use of it!

Send me a message or leave a reply if you have any questions.

Follow me if you want to get informed about articles like this one in the future!

Author: Kimberely Baumbach CPA
