DataFrame.itertuples() incorrectly determines when plain tuples should be used #28282

Closed
@plamut

Code Sample, a copy-pastable example if possible

>>> import pandas, sys
>>> sys.version
'3.6.7 (default, Oct 25 2018, 09:16:13) \n[GCC 5.4.0 20160609]'
>>> pandas.__version__
'0.25.1'
>>> df = pandas.DataFrame([{f"foo_{i}": f"bar_{i}" for i in range(255)}])
>>> df.itertuples(index=False)
...
SyntaxError: more than 255 arguments

The issue seems to have been caused (or revealed) by this commit, which removed the try/except block around the namedtuple class creation.

FWIW, this issue is not reproducible in version 0.24.2. It is also not a problem in Python 3.7+, as the limit on the number of arguments that can be passed to a function has been removed there (AFAIK).

Problem description

The condition in the itertuples() method does not correctly determine when plain tuples should be used instead of named tuples.

This is how the named tuple class template defines the __new__() method (in Python 3.6 at least):

"""
...
def __new__(_cls, {arg_list}):
    ...
"""

If there are 255 column names given, the total number of arguments to __new__() will be 256 because of the extra _cls parameter, which exceeds the 255-argument limit and causes the syntax error.
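The off-by-one can be sketched as follows. This is only an illustration of the arithmetic, not the actual pandas code: `namedtuple_fits` and `MAX_ARGS` are hypothetical names, and the sketch assumes that an `index=True` call adds one extra "Index" field to the named tuple.

```python
import collections
import inspect

# Argument limit in CPython before 3.7, as reported by the SyntaxError above.
MAX_ARGS = 255


def namedtuple_fits(n_columns, index):
    """Hypothetical helper: True if a named tuple row type stays within MAX_ARGS."""
    n_fields = n_columns + int(index)  # assumed: one extra "Index" field when index=True
    return n_fields + 1 <= MAX_ARGS    # +1 for the implicit _cls parameter of __new__()


# The generated __new__() takes _cls ahead of the field names.
Row = collections.namedtuple("Row", ["foo_0", "foo_1"])
params = list(inspect.signature(Row.__new__).parameters)
print(params)

print(namedtuple_fits(254, index=False))  # 254 fields + _cls = 255 arguments
print(namedtuple_fits(255, index=False))  # 255 fields + _cls = 256 arguments
print(namedtuple_fits(254, index=True))   # 254 columns + Index + _cls = 256 arguments
```

A check that counts only the fields and compares against the limit would accept the 255-column case from the code sample, which is the situation described in this report.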

Labels: Regression (functionality that used to work in a prior pandas version), Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)
