Version 0.15 MultiIndex forces Datetime.date objects to Timestamp objects #8802

eoincondron · 2014-11-13T11:59:35Z

I recently updated pandas and found this strange behaviour, which broke some of my existing code.
I was using a column of datetime.date objects as the second level in a two-level MultiIndex.
However, when setting the index with the latest version, the datetime.date objects are converted to Timestamp objects with 00:00:00 as the time component:

>>> pd.__version__
'0.15.1'
>>> df
          0  ID        date
0  0.486567  10  2014-11-12
1  0.214374  20  2014-11-13
>>> df.date[0]
datetime.date(2014, 11, 12)
>>> df.set_index(['ID', 'date']).index[0]
(10, Timestamp('2014-11-12 00:00:00'))

This doesn't happen with version 0.14 or older.

There is a hack to get around it, setting the dates to a single level index, adding the other level and then swapping:

>>> df.set_index('date').set_index('ID', append=True).index.swaplevel(0, 1)[0]
(10, datetime.date(2014, 11, 12))

This seems strange, and I wondered whether it was intentional.

jreback · 2014-11-13T12:50:49Z

see #7888 and associated PR.

There was an inconsistency in how date-likes (datetime.date, datetime.datetime, Timestamp) were inferred in a MultiIndex level. This led to the creation of an object-dtyped Index rather than a DatetimeIndex. datetime.date objects are second-class objects in pandas, as they are not efficiently represented. Is there a reason you are not using Timestamp/datetime.datetime?

If you really really want to create this, you can do this:

In [8]: pd.MultiIndex.from_arrays([Index([datetime.date(2013,1,1)]),['a']])
Out[8]: 
MultiIndex(levels=[[2013-01-01], [u'a']],
           labels=[[0], [0]])

eoincondron · 2014-11-13T13:03:14Z

Thanks for the reply. I looked for previous related issues but didn’t find them. Sorry if I’ve wasted your time.
My reason for using datetime.date objects is that I was using them in conjunction with datetime.time in a MultiIndex (3 levels altogether: (ID, date, time)).
It didn’t seem right to have a Timestamp with a 00:00:00 time component and then a time column or index level with a different time of day. I couldn’t see a way to separate date and time using pandas objects. Also, doing things like converting the date to a string is a lot messier with a Timestamp if you only want the date component, as you have the unwanted time component to deal with.
I’m pretty new to Python and programming in general (< 6 months) and I made the decision to go about it this way when I was just getting started.
Would appreciate any advice in this regard.

jreback · 2014-11-13T13:33:18Z

There is no need to keep separate date/time components, and it is quite inefficient to do so.

You can get at the date or time components in a number of ways, e.g. if you are resampling, or you can just index on the times. A more complete example would help me understand what you are trying to do.
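As a rough illustration of the point above (a sketch, not code from this thread), the date and time parts can be pulled out of a single DatetimeIndex on demand. The variable names are made up for the example; the timestamps are the ones that appear later in the discussion.

```python
import datetime

import pandas as pd

# One index holding full timestamps; no separate date/time levels are stored.
idx = pd.DatetimeIndex(["2009-04-22 08:00:00", "2009-04-22 16:35:00"])

dates = idx.date                   # array of datetime.date objects
times = idx.time                   # array of datetime.time objects
midnights = idx.normalize()        # DatetimeIndex with the time reset to 00:00:00
labels = idx.strftime("%Y-%m-%d")  # date-only strings, no time component

print(dates[0], times[1], labels[0])
```

Formatting with strftime avoids having to strip the 00:00:00 part by hand when only the date is wanted as a string.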

eoincondron · 2014-11-13T15:06:35Z

One example would be using unstack on the time component to convert a column into a DataFrame with columns corresponding to the times and an index given by the remaining levels. Is it possible to do this directly with a DatetimeIndex?
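One possible way to approach this reshaping, sketched under the assumption that the data is stored with a single DatetimeIndex: derive temporary date and time levels from the index and unstack on the time level. The names `prices`, `by_day_and_time` and `wide` are hypothetical; the values are taken from the example rows shown further down in this comment.

```python
import pandas as pd

# Prices and timestamps copied from the example rows in this thread.
stamps = pd.to_datetime([
    "2009-04-22 08:00", "2009-04-22 16:35",
    "2009-04-23 08:00", "2009-04-23 16:35",
])
prices = pd.Series([22.75, 22.88, 23.125, 22.5], index=stamps, name="price")

# Derive (date, time) levels from the timestamps only for the reshape.
by_day_and_time = prices.copy()
by_day_and_time.index = pd.MultiIndex.from_arrays(
    [stamps.normalize(), stamps.time], names=["date", "time"]
)

# One row per day, one column per time of day.
wide = by_day_and_time.unstack("time")
print(wide)
```

The stored data keeps one timestamp per row; the split into date and time exists only in the reshaped result.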

Also, consider this example using a DatetimeIndex on the second level of a MultiIndex with integers on the first. I'm trying to locate rows corresponding to a list of index tuples.
Using Timestamps, trying to locate two rows simultaneously doesn't work even though it works using each individual tuple:

In [67]: pairs = [(34142, '20090422'), (34142, '20090423')]

dt_pairs = [(34142, datetime.date(2009, 4, 22)), (34142, datetime.date(2009, 4, 23))]

In [91]: df.loc[pairs]
Out[91]:
                   price  volume  time
(34142, 20090422)    NaN     NaN   NaN
(34142, 20090423)    NaN     NaN   NaN

In [93]: df.loc[dt_pairs]
Out[93]:
                     price  volume  time
(34142, 2009-04-22)    NaN     NaN   NaN
(34142, 2009-04-23)    NaN     NaN   NaN

In [90]: df.loc[pairs[0]]
Out[90]:
                  price  volume      time
tid   date
34142 2009-04-22  22.75   31808  08:00:00
      2009-04-22  22.88  210247  16:35:00

In [94]: df.loc[dt_pairs[0]]
Out[94]:
                  price  volume      time
tid   date
34142 2009-04-22  22.75   31808  08:00:00
      2009-04-22  22.88  210247  16:35:00

However, it works perfectly fine with datetime.date objects in the index:

In [92]: df2.loc[dt_pairs]
Out[92]:
                   price  volume      time
34142 2009-04-22  22.750   31808  08:00:00
      2009-04-22  22.880  210247  16:35:00
      2009-04-23  23.125   12576  08:00:00
      2009-04-23  22.500  248969  16:35:00

I think I will stick to 0.14 for the current project which already has 2000+ lines of code depending on the use of datetime.date objects and try to incorporate Timestamps into future projects.
Thanks for the feedback.
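For the multi-tuple lookup described above, one alternative that does not depend on how .loc treats a list of tuples is a boolean mask built from the index levels. This is only a sketch: it assumes the second level holds full timestamps, the names `keys`, `wanted` and `selected` are hypothetical, and the 2009-04-24 row is invented so the filter has something to exclude.

```python
import pandas as pd

stamps = pd.to_datetime([
    "2009-04-22 08:00", "2009-04-22 16:35",
    "2009-04-23 08:00", "2009-04-23 16:35",
    "2009-04-24 08:00",  # made-up extra row, not from the thread
])
index = pd.MultiIndex.from_arrays([[34142] * 5, stamps], names=["tid", "timestamp"])
df = pd.DataFrame(
    {
        "price": [22.75, 22.88, 23.125, 22.5, 22.6],
        "volume": [31808, 210247, 12576, 248969, 100],
    },
    index=index,
)

# The (tid, day) pairs to look up, expressed with Timestamps at midnight.
wanted = [
    (34142, pd.Timestamp("2009-04-22")),
    (34142, pd.Timestamp("2009-04-23")),
]

# Build a (tid, day) key per row by truncating each timestamp to midnight,
# then keep the rows whose key is one of the wanted pairs.
keys = pd.MultiIndex.from_arrays([
    df.index.get_level_values("tid"),
    df.index.get_level_values("timestamp").normalize(),
])
selected = df[keys.isin(wanted)]
print(selected)
```

Because the mask is computed per row, it returns every row for each requested day, including days with several entries.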

jreback closed this as completed Nov 13, 2014

jreback added API Design, Datetime, Dtype Conversions, MultiIndex and removed Datetime labels Nov 13, 2014

jreback mentioned this issue May 5, 2015

Problem constructing Series from dict with datetime.date in level of MultiIndex #10060

Closed

jreback mentioned this issue Mar 9, 2017

Set multilevel index changes date to datetime in dataframe #15636

Closed
