Using pandas from master (0.16.2+590.g81b647f) in Python 3.4.2, the following code gives an IndexError: index out of bounds:
import numpy as np
import pandas as pd

# Column 'a' is entirely NaN; column 'b' holds three distinct values.
df = pd.DataFrame(dict(a=[np.nan]*3, b=[1,2,3]))
g = df.groupby(('a', 'b'))
len(g)  # IndexError
The same problem occurs when calling list(g) instead. Since NaN values are skipped according to the documentation, I guess the correct answer would be zero for len(g) and an empty list for list(g).
Strangely, iteration works, so for x in g: pass (or [x for x in g]) does not give an error (and iterates zero times). Also, g.count(), g.sum() etc. work (and return an empty DataFrame).
To add to the confusion, g.groups gives the dictionary {(nan, 1): [0], (nan, 2): [1], (nan, 3): [2]}. Shouldn’t this be empty because group keys with NaNs are dropped?
Grouping only by column 'a' or 'b' works and results in a length of 0 or 3, respectively.
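To make the expected behaviour concrete, here is a minimal pure-Python sketch of the semantics described above: a row is skipped whenever any of its group keys is NaN. It is not pandas' implementation, and the helper name `group_rows_dropping_nan` is hypothetical; it only illustrates what `len(g)`, `list(g)` and `g.groups` would look like if NaN keys were dropped consistently.

```python
import math
from collections import OrderedDict


def group_rows_dropping_nan(rows, keys):
    """Map each key tuple to the row positions that share it.

    Hypothetical helper (not a pandas API): rows whose key tuple
    contains a NaN are skipped entirely.
    """
    groups = OrderedDict()
    for pos, row in enumerate(rows):
        key = tuple(row[k] for k in keys)
        if any(isinstance(v, float) and math.isnan(v) for v in key):
            continue  # drop the row: at least one group key is NaN
        groups.setdefault(key, []).append(pos)
    return groups


nan = float("nan")
# Same data as in the report: 'a' is all NaN, 'b' is 1, 2, 3.
rows = [{"a": nan, "b": 1}, {"a": nan, "b": 2}, {"a": nan, "b": 3}]

by_a_and_b = group_rows_dropping_nan(rows, ["a", "b"])
by_a = group_rows_dropping_nan(rows, ["a"])
by_b = group_rows_dropping_nan(rows, ["b"])

print(len(by_a_and_b), list(by_a_and_b.items()))  # 0 []
print(len(by_a), len(by_b))                       # 0 3
```

Under these assumptions, grouping by both columns yields no groups at all, which matches the lengths of 0 and 3 seen when grouping by 'a' or 'b' alone.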