Python: How do I save a NumPy array of strings (with commas) to CSV?

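This post covers saving a 2D array of strings that contain commas to a CSV file with Python and NumPy. numpy.savetxt did not produce the expected result, and changing the delimiter argument did not help. The recommended approach is the csv module from Python's standard library, which is more powerful for CSV files holding non-numeric data and lets you configure many parameters.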


tl;dr ANSWER: Don't use numpy. Use csv.writer instead of numpy.savetxt.

I'm new to Python and NumPy. It seems like it shouldn't be so difficult to save a 2D array of strings (that contain commas) to a CSV file, but I can't get it to work the way I want.

Let's say I have an array that looks like this (made from a list of lists):

[['text1, text2', 'text3'],

['text4', 'text5']]

I want a CSV file that looks like this (or without quote characters) in Excel (pipe = cell separator):

'text1, text2' | 'text3'

'text4' | 'text5'

I'm using numpy.savetxt(filename, array, fmt="%s"), and I get the following CSV output (with square brackets):

['text1, text2','text3']

['text4','text5']

Which displays in Excel like this:

['text1 | text2' | 'text3']

['text4' | 'text5']

I tried fussing with the savetxt delimiter argument, but no change in output.

Do I need to do this manually? If so, let me know if there are any shortcuts I should be aware of.

Ultimately, I need to import the CSV into a Postgresql database. I'm not completely clear on exactly what the CSV formatting needs to be for this to work as expected, but I'm assuming if it looks wrong in Excel, it will probably end up messed up in Postgres. The Postgres documentation says:

The values in each record are separated by the DELIMITER character. If the value contains the delimiter character, the QUOTE character, the NULL string, a carriage return, or line feed character, then the whole value is prefixed and suffixed by the QUOTE character, and any occurrence within the value of a QUOTE character or the ESCAPE character is preceded by the escape character. You can also use FORCE_QUOTE to force quotes when outputting non-NULL values in specific columns.

Thanks!

++++++++++++++++++++++++++++

Real input and output, in case it's relevantly different:

array:

[['8908232', 'Plant Growth Chamber Facility at the Department of Botany, University of Wisconsin-Madison', 'DBI', 'INSTRUMENTAT & INSTRUMENT DEVP', '1/1/90', '12/19/89', 'WI', 'Standard Grant', 'Joann P. Roskoski', '12/31/91', '$94,914.00 ', 'BIO', '1108', '', '$0.00 ']]

CSV output:

['8908232', 'Plant Growth Chamber Facility at the Department of Botany, University of Wisconsin-Madison', 'DBI', 'INSTRUMENTAT & INSTRUMENT DEVP', '1/1/90', '12/19/89', 'WI', 'Standard Grant', 'Joann P. Roskoski', '12/31/91', '$94,914.00 ', 'BIO', '1108', '', '$0.00 ']

Excel's version:

['8908232' 'Plant Growth Chamber Facility at the Department of Botany University of Wisconsin-Madison' 'DBI' 'INSTRUMENTAT & INSTRUMENT DEVP' '1/1/90' '12/19/89' 'WI' 'Standard Grant' 'Joann P. Roskoski' '12/31/91' '$94 914.00 ' 'BIO' '1108' '' '$0.00 ']

Solution

Adding fmt="%s" doesn't put quotes around each field—the quotes are part of the Python string literal for the string %s, and %s just says that any value should be formatted as a string. If you want to force quotes around everything, you need to have quotes in the format string, like fmt='"%s"'.

However, even if you don't do that, the line you showed can't possibly produce the output you showed. There is no way that NumPy is changing your commas into pipe characters, or using pipe characters as delimiters. The only way you can get that is by adding delimiter=' | '. And if you add that… it works with no changes, and you get this:

text1, text2 | text3

text4 | text5

So whatever your actual problem is, it can't be the one you described.
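A minimal sketch of the two savetxt points above, assuming NumPy 1.14 or later (so that savetxt can write to a text buffer) and using the small example array from the question. The io.StringIO buffers and the variable names are only there for illustration. Note that savetxt does no escaping, so a field that itself contains a double quote would still produce a malformed row.

```python
import io

import numpy as np

array = np.array([['text1, text2', 'text3'],
                  ['text4', 'text5']])

# fmt='"%s"' puts literal quote characters around every field.
quoted = io.StringIO()
np.savetxt(quoted, array, fmt='"%s"', delimiter=',')
quoted_text = quoted.getvalue()

# Pipes only show up if you ask for them through the delimiter argument.
piped = io.StringIO()
np.savetxt(piped, array, fmt='%s', delimiter=' | ')
piped_text = piped.getvalue()

print(quoted_text)
print(piped_text)
```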

Meanwhile, if you're trying to write CSV files for non-numeric data as flexibly as possible, the standard library's csv module is much more powerful than NumPy. The advantage of NumPy—as the name implies—is in dealing with numeric data. Here's how to do it with csv:

import csv

with open(filename, 'w', newline='') as f:  # on Python 2, use open(filename, 'wb')
    csv.writer(f).writerows(array)

This will default to , as a delimiter. Since some of your strings have , characters in them, by default, it will quote those strings. But you can configure the quoting/escaping behavior, the quote character, the delimiter, and all kinds of other things that NumPy can't.
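To make the configuration options concrete, here is a small sketch that writes the same rows with three different settings. The to_csv helper and the sample rows are hypothetical, made up for this illustration, and the output goes to an in-memory buffer rather than a real file.

```python
import csv
import io

rows = [['text1, text2', 'text3'],
        ['say "hi"', 'text5']]


def to_csv(rows, **writer_options):
    """Write rows to an in-memory buffer and return the CSV text."""
    buf = io.StringIO()
    csv.writer(buf, **writer_options).writerows(rows)
    return buf.getvalue()


# Default: only fields that need it (delimiter or quote inside) get quoted.
minimal = to_csv(rows)

# Force quotes around every field.
everything = to_csv(rows, quoting=csv.QUOTE_ALL)

# Use a different delimiter and line terminator.
piped = to_csv(rows, delimiter='|', lineterminator='\n')

# Reading the default output back recovers the original strings.
round_trip = list(csv.reader(io.StringIO(minimal)))

print(minimal)
print(everything)
print(piped)
```

By default an embedded quote character is escaped by doubling it, which, as far as I know, is also what PostgreSQL's CSV mode expects unless you set a different ESCAPE character.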
