stochastic bug in saving dataframes with "int16" or "int32" columns to HDF5 file #4096
Comments
See your modified code below
Note that if you read and create this CSV all at once, this works fine (with your existing code). This is a buglet, but as you note, tricky to reproduce. The error message is basically saying: hey, you are trying to write columns in a different order from what exists on disk; namely, Current Ric and GMT Offset are reversed. This happens because the frames are constructed internally in a different order: the internal representation stores different dtypes in different blocks, and those blocks end up in a different order. I am not exactly sure why that would be the case. Will look further.
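To make the column-order point concrete, here is a minimal sketch of how a caller might pin the column order and dtypes of each chunk before appending it to a store. It is not the fix applied in pandas and may not avoid the internal block-ordering issue described above. The helper name `align_to_schema` and the `Price` column are hypothetical; only `Current Ric` and `GMT Offset` come from the error discussed here.

```python
import pandas as pd


def align_to_schema(chunk, columns, dtypes):
    """Return a copy of `chunk` with a fixed column order and fixed dtypes.

    Hypothetical helper: it assumes every name in `columns` is present
    in `chunk`, so no missing values are introduced by the reindex.
    """
    aligned = chunk.reindex(columns=columns)  # enforce one column order
    return aligned.astype(dtypes)             # enforce one dtype per column


# Assumed schema for illustration; "Price" is an invented column.
columns = ["Current Ric", "GMT Offset", "Price"]
dtypes = {"Current Ric": "object", "GMT Offset": "int32", "Price": "float64"}

# A chunk whose columns arrive in a different order than the schema.
chunk = pd.DataFrame(
    {
        "GMT Offset": [1, 1],
        "Price": [10.5, 10.6],
        "Current Ric": ["A.N", "B.N"],
    }
)

aligned = align_to_schema(chunk, columns, dtypes)
```

Each aligned chunk could then be passed to `HDFStore.append`, so that at least the user-visible order and dtypes are identical from one append to the next.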
Hi, jreback. Thank you very much for your suggestion and prompt bug fix! Could you please point to some docs where I can read about |
essentially |
jreback, thanks again for the explanation!
There seems to be a strange random bug in saving dataframes with integer columns to HDF5 files. The error reads:
and the full traceback:
Interestingly, the error is not deterministic and depends at least on (i) the set of other (non-integer) columns in the dataframe and (ii) the size of the dataframe.
Unfortunately, I was unable to narrow it down to a code-only report, so I have to attach a piece of data and an IPython notebook. This is the minimal code with which I was able to reproduce the bug.
https://www.dropbox.com/s/myo03sbqbulzvaj/pandas_possible_bug.zip
The pandas version is 0.11.0 and the PyTables version is 3.0.0.