Description
Pandas version checks
- I have checked that the issue still exists on the latest versions of the docs on main here
Location of the documentation
Documentation problem
Looking at the first code example, the documentation essentially says that df["foo"].iloc[0] = 100 no longer works. It spends some text explaining why, and then tells the user:
This statement can be rewritten into a single statement with loc or iloc if this behavior is necessary. DataFrame.where() is another suitable alternative for this case.
I don't think this amounts to a sufficient "the pandas 3 fix is this ..." answer.
It is unclear to me how I would solve this with a where or loc statement in a non-ugly way. Also, switching to iloc usually isn't straightforward; the docs could mention df.columns.get_loc here.
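For context, here is a small sketch of what df.columns.get_loc hands back. The column names are made up for illustration. With unique labels it returns a plain integer, which is what the iloc rewrite needs; with duplicated labels it returns something else (a boolean mask in this case), so the one-cell rewrite assumes unique column labels.

```python
import pandas as pd

# Unique labels: get_loc returns the integer position of the label.
unique_cols = pd.Index(["foo", "bar", "baz"])
pos = unique_cols.get_loc("bar")

# Duplicated, unsorted labels: get_loc returns a boolean mask instead of an int.
dup_cols = pd.Index(["foo", "bar", "foo"])
dup = dup_cols.get_loc("foo")

print(pos)
print(dup)
```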
It would be great to see a "suggested pandas 3 replacement" for this line that one can simply copy.
Note also that I don't particularly like my own suggested fix. I especially dislike the loc and where variants: those boolean arrays are ugly as hell, while the iloc one seems OK.
To summarize: It would be great to see a suggestion for how best to do the assignment when you have a named column but a positional (iloc) row.
Suggested fix for documentation
To set a single value in the n-th row and column "foo" in pandas 3.0, rewrite the code as a single .loc or .iloc statement like this:
df.iloc[n, df.columns.get_loc("foo")] = 100
You can also use loc or where like this:
df.loc[[*[False] * n, True, *[False] * (len(df) - 1 - n)], "foo"] = 100
df["foo"] = df["foo"].where([*[True] * n, False, *[True] * (len(df) - 1 - n)], 100)
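A runnable sketch of the single-statement variants, for comparison. The frame, its index labels, and n are invented for illustration. The loc variant via df.index[n] is an additional option not listed above, and it assumes the index labels are unique; the mask variant uses NumPy rather than list arithmetic to build the boolean array.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"foo": [1, 2, 3], "bar": [4, 5, 6]}, index=["a", "b", "c"])
n = 1  # positional row to update (hypothetical)

# Purely positional: translate the column label to its integer position.
by_iloc = df.copy()
by_iloc.iloc[n, by_iloc.columns.get_loc("foo")] = 100

# Purely label-based: translate the row position to its index label.
# Assumes the index labels are unique.
by_loc = df.copy()
by_loc.loc[by_loc.index[n], "foo"] = 100

# Boolean row mask built with NumPy: True only at position n.
by_mask = df.copy()
by_mask.loc[np.arange(len(by_mask)) == n, "foo"] = 100

print(by_iloc)
```

All three write into the frame in one indexing step, so none of them depends on a chained intermediate object.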
Note that assigning values one cell at a time is usually inefficient.
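If several cells in the same column need updating, one way to avoid the per-cell cost is to pass all row positions in a single statement. This is only a sketch; the rows and values below are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"foo": [0, 0, 0, 0], "bar": [1, 2, 3, 4]})

rows = [0, 2, 3]           # positional rows to update (hypothetical)
values = [100, 200, 300]   # one value per row, in the same order

# One assignment covers all selected rows of column "foo".
df.iloc[rows, df.columns.get_loc("foo")] = values

print(df)
```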