BUG: Series with non-unique index: "Index length did not match values" error upon assignment #4548

Closed
jgehrcke opened this issue Aug 13, 2013 · 4 comments
Labels: Bug, Indexing (related to indexing on series/frames, not to indexes themselves)

Comments

@jgehrcke
Contributor

Create a Series with non-unique index:

>>> import pandas as pd
>>> pd.__version__
'0.12.0'
>>> s1 = pd.Series(range(3))
>>> s2 = pd.Series(range(3))
>>> comb = pd.concat([s1,s2])
>>> comb
0    0
1    1
2    2
0    0
1    1
2    2
dtype: int64

Assign value by boolean mask:

>>> comb[comb<1] = 5
>>> comb
0    5
1    1
2    2
0    5
1    1
2    2
dtype: int64

This worked. Now add to the selected values in place, again using a boolean mask:

>>> comb[comb<2] += 10
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "***/pandas/core/series.py", line 852, in __setitem__
    self.where(~key,value,inplace=True)
  File "***/pandas/core/series.py", line 749, in where
    other = other.reindex(ser.index)
  File "***/pandas/core/series.py", line 2646, in reindex
    return self._reindex_with_indexers(new_index, indexer, copy=copy, fill_value=fill_value)
  File "***/pandas/core/series.py", line 2650, in _reindex_with_indexers
    return Series(new_values, index=index, name=self.name)
  File "***/pandas/core/series.py", line 492, in __new__
    subarr.index = index
  File "properties.pyx", line 74, in pandas.lib.SeriesIndex.__set__ (pandas/lib.c:29541)
AssertionError: Index length did not match values

Is this expected behavior? If it is, my apologies; it was not clear to me from the docs. I am wondering why plain assignment via = works but augmented assignment via += does not.
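
A possible explanation, offered as a reading of the traceback above rather than a confirmed description of pandas internals: Python expands `comb[mask] += 10` into a read, an add, and a write, so `__setitem__` receives a Series instead of a scalar. That Series carries its own shorter, non-unique index, and the traceback shows 0.12.0 trying to `reindex` it. The sketch below only inspects that right-hand side; the names `mask` and `rhs` are illustrative and not part of the report.

```python
import pandas as pd

# State of `comb` after the first (working) assignment in the report.
comb = pd.Series([5, 1, 2, 5, 1, 2], index=[0, 1, 2, 0, 1, 2])
mask = comb < 2

# `comb[mask] += 10` is evaluated roughly as `comb[mask] = comb[mask] + 10`.
rhs = comb[mask] + 10

# Plain `=` passed a scalar (5); `+=` passes this Series instead.
print(type(rhs).__name__)   # Series
print(list(rhs.index))      # [1, 1]
print(len(rhs), len(comb))  # 2 6
```

The value being assigned is therefore shorter than `comb` and shares its duplicated labels, which is consistent with the failing `reindex` step.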

@jreback
Contributor

jreback commented Aug 13, 2013

It's a bug.

In general, using non-unique indices is not a good idea; try using a MultiIndex instead: http://pandas.pydata.org/pandas-docs/dev/indexing.html#hierarchical-indexing-multiindex
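
A minimal sketch of the MultiIndex suggestion, assuming the two inputs can be told apart by a label: passing `keys` to `pd.concat` adds an outer index level, so every row gets a unique label. The key names "a" and "b" are made up for illustration.

```python
import pandas as pd

s1 = pd.Series(range(3))
s2 = pd.Series(range(3))

# `keys` adds an outer level, giving a unique two-level index:
# ("a", 0), ("a", 1), ("a", 2), ("b", 0), ("b", 1), ("b", 2)
comb = pd.concat([s1, s2], keys=["a", "b"])
print(comb.index.is_unique)  # True

# Same two steps as in the report, now on a unique index.
comb[comb < 1] = 5
comb[comb < 2] += 10
print(comb.tolist())  # [5, 11, 2, 5, 11, 2]
```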

@jgehrcke
Contributor Author

Thanks for confirming. Side note: I don't really need the index, so pd.concat([s1,s2], ignore_index=True) is also a good way to work around the problem.
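
Spelled out, the workaround could look like the sketch below. It assumes the original index labels carry no meaning, since `ignore_index=True` replaces them with 0..5.

```python
import pandas as pd

s1 = pd.Series(range(3))
s2 = pd.Series(range(3))

# Discard the original labels; the result gets a fresh index 0..5.
comb = pd.concat([s1, s2], ignore_index=True)

comb[comb < 1] = 5    # plain assignment by boolean mask
comb[comb < 2] += 10  # augmented assignment by boolean mask
print(comb.tolist())  # [5, 11, 2, 5, 11, 2]
```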

@jreback
Contributor

jreback commented Aug 13, 2013

Yep, thanks for the test case though. This is already fixed in #3482. Since Series is a subclass of ndarray, things like this are pretty tricky; in 0.13, via that PR, it's going to be a subclass of NDFrame (which is what DataFrame subclasses), and then things like this are much easier.

@jreback
Contributor

jreback commented Sep 9, 2013

closed by #4779
