BUG: fillna('') on an Int64 column causes TypeError: <U3 cannot be converted to an IntegerDtype #44289

Closed
@the21st

Description

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the master branch of pandas.

Reproducible Example

import pandas as pd
import numpy as np

df = pd.DataFrame({"A": [1, 2, np.nan], "B": [4, np.nan, 8]}, dtype="Int64")
df.fillna('nan')

Issue Description

The code above causes the following error:

TypeError                                 Traceback (most recent call last)
<ipython-input-2-fcfc27adad85> in <module>
      2 import numpy as np
      3 df = pd.DataFrame({"A": [1, 2, np.nan], "B": [4, np.nan, 8]}, dtype="Int64")
----> 4 df.fillna('nan')

~/venv/lib/python3.8/site-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
    309                     stacklevel=stacklevel,
    310                 )
--> 311             return func(*args, **kwargs)
    312 
    313         return wrapper

~/venv/lib/python3.8/site-packages/pandas/core/frame.py in fillna(self, value, method, axis, inplace, limit, downcast)
   5174         downcast=None,
   5175     ) -> DataFrame | None:
-> 5176         return super().fillna(
   5177             value=value,
   5178             method=method,

~/venv/lib/python3.8/site-packages/pandas/core/generic.py in fillna(self, value, method, axis, inplace, limit, downcast)
   6380 
   6381             elif not is_list_like(value):
-> 6382                 new_data = self._mgr.fillna(
   6383                     value=value, limit=limit, inplace=inplace, downcast=downcast
   6384                 )

~/venv/lib/python3.8/site-packages/pandas/core/internals/managers.py in fillna(self, value, limit, inplace, downcast)
    408 
    409     def fillna(self: T, value, limit, inplace: bool, downcast) -> T:
--> 410         return self.apply(
    411             "fillna", value=value, limit=limit, inplace=inplace, downcast=downcast
    412         )

~/venv/lib/python3.8/site-packages/pandas/core/internals/managers.py in apply(self, f, align_keys, ignore_failures, **kwargs)
    325                     applied = b.apply(f, **kwargs)
    326                 else:
--> 327                     applied = getattr(b, f)(**kwargs)
    328             except (TypeError, NotImplementedError):
    329                 if not ignore_failures:

~/venv/lib/python3.8/site-packages/pandas/core/internals/blocks.py in fillna(self, value, limit, inplace, downcast)
   1570         self, value, limit=None, inplace: bool = False, downcast=None
   1571     ) -> list[Block]:
-> 1572         values = self.values.fillna(value=value, limit=limit)
   1573         return [self.make_block_same_class(values=values)]
   1574 

~/venv/lib/python3.8/site-packages/pandas/core/arrays/masked.py in fillna(self, value, method, limit)
    174                 # fill with value
    175                 new_values = self.copy()
--> 176                 new_values[mask] = value
    177         else:
    178             new_values = self.copy()

~/venv/lib/python3.8/site-packages/pandas/core/arrays/masked.py in __setitem__(self, key, value)
    186         if _is_scalar:
    187             value = [value]
--> 188         value, mask = self._coerce_to_array(value)
    189 
    190         if _is_scalar:

~/venv/lib/python3.8/site-packages/pandas/core/arrays/integer.py in _coerce_to_array(self, value)
    332 
    333     def _coerce_to_array(self, value) -> tuple[np.ndarray, np.ndarray]:
--> 334         return coerce_to_array(value, dtype=self.dtype)
    335 
    336     def astype(self, dtype, copy: bool = True) -> ArrayLike:

~/venv/lib/python3.8/site-packages/pandas/core/arrays/integer.py in coerce_to_array(values, dtype, mask, copy)
    202 
    203     elif not (is_integer_dtype(values) or is_float_dtype(values)):
--> 204         raise TypeError(f"{values.dtype} cannot be converted to an IntegerDtype")
    205 
    206     if mask is None:

TypeError: <U3 cannot be converted to an IntegerDtype
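The last frames of the traceback are in the masked array's `__setitem__`, which suggests the failure is not specific to `DataFrame.fillna`. A minimal sketch, assuming that assigning a string directly into an `Int64` array takes the same path (the exact error message may differ between pandas versions):

```python
import pandas as pd

# A nullable integer array with one missing entry (pd.NA).
arr = pd.array([1, 2, None], dtype="Int64")

try:
    # Assigning a string into an Int64 array is rejected, because the
    # value cannot be coerced to the array's integer dtype.
    arr[2] = "nan"
    raised = False
except TypeError as exc:
    raised = True
    print(f"TypeError: {exc}")
```

If this raises as expected, `fillna('nan')` fails for the same reason: it tries to write the fill value into a copy of the `Int64` array without changing the dtype.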

Expected Behavior

df.fillna('nan') should return a new DataFrame in which the missing values are replaced with the string 'nan'.
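Until the behavior changes, one possible workaround is to cast to `object` before filling. This is only a sketch and not something proposed in the report itself; it assumes that losing the `Int64` dtype is acceptable, since a column holding both integers and strings cannot remain `Int64`.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, np.nan], "B": [4, np.nan, 8]}, dtype="Int64")

# Cast to object first so each column can hold both integers and strings,
# then fill the missing values with the string 'nan'.
filled = df.astype(object).fillna("nan")
print(filled)
```

The original `df` is left untouched, and the columns of `filled` have `object` dtype.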

Installed Versions

INSTALLED VERSIONS

commit : 945c9ed
python : 3.8.12.final.0
python-bits : 64
OS : Linux
OS-release : 5.10.47-linuxkit
Version : #1 SMP Sat Jul 3 21:51:47 UTC 2021
machine : x86_64
processor :
byteorder : little
LC_ALL : en_US.UTF-8
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.3.4
numpy : 1.19.5
pytz : 2021.3
dateutil : 2.8.2
pip : 21.2.4
setuptools : 58.2.0
Cython : None
pytest : 6.2.5
hypothesis : None
sphinx : 4.2.0
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.6.3
html5lib : None
pymysql : 1.0.2
psycopg2 : 2.9.1 (dt dec pq3 ext lo64)
jinja2 : 3.0.2
IPython : 7.28.0
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.4.3
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 5.0.0
pyxlsb : None
s3fs : None
scipy : 1.7.1
sqlalchemy : 1.4.25
tables : None
tabulate : 0.8.9
xarray : None
xlrd : None
xlwt : None
numba : None

Metadata

Assignees

No one assigned

    Labels

    API - Consistency: Internal Consistency of API/Behavior
    Bug
    Missing-data: np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate
    NA - MaskedArrays: Related to pd.NA and nullable extension arrays