@@ -1295,57 +1295,41 @@ too many fields will raise an error by default:
12951295
12961296 You can elect to skip bad lines:
12971297
1298- .. code-block:: ipython
1299-
1300-     In [29]: pd.read_csv(StringIO(data), on_bad_lines="warn")
1301-     Skipping line 3: expected 3 fields, saw 4
1298+ .. ipython:: ipython
13021299
1303-     Out[29]:
1304-        a  b   c
1305-     0  1  2   3
1306-     1  8  9  10
1300+     pd.read_csv(StringIO(data), on_bad_lines="warn")
13071301
13081302 Or pass a callable to handle the bad line if ``engine="python"``.
13091303 The bad line will be passed as a list of strings split by the ``sep``:
13101304
1311- .. code-block:: ipython
1305+ .. versionadded:: 1.4.0
1306+
1307+ .. ipython:: ipython
1308+
1309+     external_list = []
13121310
1313-     In [30]: pd.read_csv(StringIO(data), on_bad_lines=lambda x: x[-3:], engine="python")
1314-     Out[30]:
1315-        a  b   c
1316-     0  1  2   3
1317-     1  5  6   7
1318-     2  8  9  10
1311+     def func(line):
1312+         external_list.append(line)
1313+         return line[-3:]
13191314
1320- .. versionadded:: 1.4.0
1315+     pd.read_csv(StringIO(data), on_bad_lines=func, engine="python")
13211316
1317+     external_list
13221318
13231319 You can also use the ``usecols`` parameter to eliminate extraneous column
13241320 data that appear in some lines but not others:
13251321
1326- .. code-block:: ipython
1327-
1328-     In [31]: pd.read_csv(StringIO(data), usecols=[0, 1, 2])
1322+ .. ipython:: ipython
13291323
1330-     Out[31]:
1331-        a  b   c
1332-     0  1  2   3
1333-     1  4  5   6
1334-     2  8  9  10
1324+     pd.read_csv(StringIO(data), usecols=[0, 1, 2])
13351325
13361326 To keep all data, including the lines with too many fields, you can
13371327 specify a sufficient number of ``names``. Lines with too few
13381328 fields are then filled with ``NaN``.
13391329
1340- .. code-block:: ipython
1341-
1342-     In [32]: pd.read_csv(StringIO(data), names=['a', 'b', 'c', 'd'])
1330+ .. ipython:: ipython
13431331
1344-     Out[32]:
1345-        a  b   c    d
1346-     0  1  2   3  NaN
1347-     1  4  5   6    7
1348-     2  8  9  10  NaN
1332+     pd.read_csv(StringIO(data), names=['a', 'b', 'c', 'd'])
13491333
13501334 .. _io.dialect:
13511335
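Because this hunk replaces hand-written outputs with executed directives, it may help to check the calls in a standalone script. The following is a minimal sketch, not part of the change itself. It assumes that `data` (defined above this hunk and not shown here) is the four-line CSV implied by the removed outputs, and that pandas 1.4 or later is installed, since the callable form of `on_bad_lines` was added in 1.4.0. The names `skipped`, `repaired`, and `trimmed` are illustrative only.

```python
import warnings
from io import StringIO

import pandas as pd

# Assumption: inferred from the removed outputs, where line 3 has four fields.
data = "a,b,c\n1,2,3\n4,5,6,7\n8,9,10"

# on_bad_lines="warn" drops the bad line. Depending on the pandas version,
# the notice is printed to stderr or emitted as a warning, so warnings are
# silenced here only to keep the script output quiet.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    skipped = pd.read_csv(StringIO(data), on_bad_lines="warn")

external_list = []


def func(line):
    # `line` is the bad row, already split on the separator.
    external_list.append(line)
    # Keep the last three fields so the row fits the three columns.
    return line[-3:]


# A callable is only accepted with the python engine.
repaired = pd.read_csv(StringIO(data), on_bad_lines=func, engine="python")

# usecols keeps the first three fields of every line, including the long one.
trimmed = pd.read_csv(StringIO(data), usecols=[0, 1, 2])

print(skipped.values.tolist())
print(repaired.values.tolist())
print(external_list)
print(trimmed.values.tolist())
```

The callable sees the raw string fields (`['4', '5', '6', '7']`), while the returned fields still go through the usual type inference, which is why the repaired row ends up as integers.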