Python: saving CSV split into columns by commas. How to save a NumPy array of strings (with commas) to CSV?

Latest recommended article published 2024-08-05 16:24:24

weixin_39600885

License: CC 4.0 BY-SA

Article tags: python, save csv split into columns by commas

Article link: https://blog.csdn.net/weixin_39600885/article/details/111764222

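This article covers saving a 2D array of strings that contain commas to a CSV file with Python and NumPy. numpy.savetxt does not produce the expected result, and changing the delimiter argument does not help. The final recommendation is the csv module from the Python standard library, which is more powerful for CSV files of non-numeric data and lets you configure many parameters.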

tl;dr ANSWER: Don't use numpy. Use csv.writer instead of numpy.savetxt.

I'm new to Python and NumPy. It seems like it shouldn't be so difficult to save a 2D array of strings (that contain commas) to a CSV file, but I can't get it to work the way I want.

Let's say I have an array that looks like this (made from a list of lists):

[['text1, text2', 'text3'],
 ['text4', 'text5']]

I want a CSV file that looks like this (or without quote characters) in Excel (pipe = cell separator):

'text1, text2' | 'text3'

'text4' | 'text5'

I'm using numpy.savetxt(filename, array, fmt="%s"), and I get the following CSV output (with square brackets):

['text1, text2','text3']

['text4','text5']

Which displays in Excel like this:

['text1 | text2' | 'text3']

['text4' | 'text5']

I tried fussing with the savetxt delimiter argument, but no change in output.

Do I need to do this manually? If so, let me know if there are any shortcuts I should be aware of.

Ultimately, I need to import the CSV into a PostgreSQL database. I'm not completely clear on exactly what the CSV formatting needs to be for this to work as expected, but I'm assuming that if it looks wrong in Excel, it will probably end up messed up in Postgres. The Postgres documentation says:

The values in each record are separated by the DELIMITER character. If the value contains the delimiter character, the QUOTE character, the NULL string, a carriage return, or line feed character, then the whole value is prefixed and suffixed by the QUOTE character, and any occurrence within the value of a QUOTE character or the ESCAPE character is preceded by the escape character. You can also use FORCE_QUOTE to force quotes when outputting non-NULL values in specific columns.
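As a rough check of the rule quoted above, the sketch below shows how Python's standard csv module quotes fields by default. It assumes PostgreSQL's default CSV settings, where QUOTE and ESCAPE are both the double quote; the helper name csv_line and the sample values are made up for illustration.

```python
import csv
import io


def csv_line(fields):
    """Format one record with csv.writer defaults and return it as text."""
    buf = io.StringIO(newline='')
    csv.writer(buf, lineterminator='\n').writerow(fields)
    return buf.getvalue()


# No special characters: nothing is quoted.
plain = csv_line(['text4', 'text5'])

# A field containing the delimiter is wrapped in the quote character.
with_comma = csv_line(['text1, text2', 'text3'])

# A field containing the quote character is quoted, and each embedded
# quote is doubled (the quote character acts as its own escape).
with_quote = csv_line(['say "hi"', 'x'])

print(plain, with_comma, with_quote, sep='')
```

One thing to watch with the NULL string: in a row with several columns, csv.writer writes an empty string as an empty unquoted field, which may matter if empty strings and NULLs must stay distinct on import.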

Thanks!

++++++++++++++++++++++++++++

Real input and output, in case it's relevantly different:

array:

[['8908232', 'Plant Growth Chamber Facility at the Department of Botany, University of Wisconsin-Madison', 'DBI', 'INSTRUMENTAT & INSTRUMENT DEVP', '1/1/90', '12/19/89', 'WI', 'Standard Grant', 'Joann P. Roskoski', '12/31/91', '$94,914.00 ', 'BIO', '1108', '', '$0.00 ']]

CSV output:

['8908232', 'Plant Growth Chamber Facility at the Department of Botany, University of Wisconsin-Madison', 'DBI', 'INSTRUMENTAT & INSTRUMENT DEVP', '1/1/90', '12/19/89', 'WI', 'Standard Grant', 'Joann P. Roskoski', '12/31/91', '$94,914.00 ', 'BIO', '1108', '', '$0.00 ']

Excel's version:

['8908232' 'Plant Growth Chamber Facility at the Department of Botany University of Wisconsin-Madison' 'DBI' 'INSTRUMENTAT & INSTRUMENT DEVP' '1/1/90' '12/19/89' 'WI' 'Standard Grant' 'Joann P. Roskoski' '12/31/91' '$94 914.00 ' 'BIO' '1108' '' '$0.00 ']

解决方案

Adding fmt="%s" doesn't put quotes around each field—the quotes are part of the Python string literal for the string %s, and %s just says that any value should be formatted as a string. If you want to force quotes around everything, you need to have quotes in the format string, like fmt='"%s"'.
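A minimal sketch of that format string in use, assuming a regular 2D array of strings and a comma delimiter. The temporary output path is only there to make the example self-contained.

```python
import os
import tempfile

import numpy as np

array = np.array([['text1, text2', 'text3'],
                  ['text4', 'text5']])

# Hypothetical output location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'quoted.csv')

# The double quotes inside the format string are written literally
# around every value; %s is replaced by the value itself.
np.savetxt(path, array, fmt='"%s"', delimiter=',')

with open(path) as f:
    quoted_text = f.read()

print(quoted_text)
```

Note that this only wraps each value in quotes. It does not escape a double quote that appears inside a value, so it is safe only when the data contains none.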

However, even if you don't do that, the line you showed can't possibly produce the output you showed. There is no way that NumPy is changing your commas into pipe characters, or using pipe characters as delimiters. The only way you can get that is by adding delimiter=' |'. And if you add that, it works with no changes, and you get this:

text1, text2 | text3

text4 | text5

So whatever your actual problem is, it can't be the one you described.

Meanwhile, if you're trying to write CSV files for non-numeric data as flexibly as possible, the standard library's csv module is much more powerful than NumPy. The advantage of NumPy—as the name implies—is in dealing with numeric data. Here's how to do it with csv:

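import csv

# Python 3: open in text mode with newline='' (the original 'wb' mode is for Python 2).
with open(filename, 'w', newline='') as f:
    csv.writer(f).writerows(array)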

This will default to , as a delimiter. Since some of your strings have , characters in them, by default, it will quote those strings. But you can configure the quoting/escaping behavior, the quote character, the delimiter, and all kinds of other things that NumPy can't.
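The sketch below illustrates a few of those options using the small example array from the question. The helper name dump is made up for this example, and an in-memory buffer stands in for a real file.

```python
import csv
import io

rows = [['text1, text2', 'text3'],
        ['text4', 'text5']]


def dump(rows, **options):
    """Write rows with csv.writer and return the resulting text."""
    buf = io.StringIO(newline='')
    csv.writer(buf, **options).writerows(rows)
    return buf.getvalue()


# Default: only fields that need quoting are quoted.
default_text = dump(rows)

# Force quotes around every field.
quote_all_text = dump(rows, quoting=csv.QUOTE_ALL)

# With a different delimiter, the embedded commas no longer need quoting.
pipe_text = dump(rows, delimiter='|')

# Reading the default output back recovers the original strings.
round_trip = list(csv.reader(io.StringIO(default_text, newline='')))

print(default_text, quote_all_text, pipe_text, sep='\n')
```

The writer ends lines with '\r\n' by default; pass lineterminator='\n' if the consumer expects plain line feeds.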