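I have comma-separated values that contain commas inside nested braces. Specifically, my input will be comma-separated C++11 objects.
For example, here is one input: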
std::vector<int>{32, 45, 10}, std::array<std::string, 5>{"a", "bc", "def", "ghij", "whoa, this is, a toughie"}, 8, "foo, bar", {"initializer-list?", "no problem!", "(hopefully...)"}
Here is the output I want:
[
'std::vector<int>{32, 45, 10}',
'std::array<std::string, 5>{"a", "bc", "def", "ghij", "whoa, this is, a toughie"}',
'8',
'foo, bar',
'{"initializer-list?", "no problem!", "(hopefully...)"}'
]
But Python's csv module gives me:
[
'std::vector<int>{32',
'45',
'10}',
'std::array<std::string',
'5>{"a"',
'"bc"',
'"def"',
'"ghij"',
'"whoa',
'this is',
'a toughie"}',
'8',
'foo, bar', # at least this one works :/
'{"initializer-list?"',
'"no problem!"',
'"(hopefully...)"}'
]
How can I customize the csv module to handle these cases?
Answered 2015-04-01 03:10:26
You can use a regular expression to split each line and then clean up the pieces:

import re
a = r'std::vector<int>{32, 45, 10}, std::array<std::string, 5>{"a", "bc", "def", "ghij", "whoa, this is, a toughie"}, 8, "foo, bar", {"initializer-list?", "no problem!", "(hopefully...)"}'
# split on occurrences of "}, s"
results = re.split(r'},\s+s', a)

Note: the split removes the trailing } from every piece except the last one, and the leading s from every piece except the first one.
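One way to avoid the cleanup step mentioned in the note is to match the surrounding characters with lookaround assertions, so that they are not consumed by the split. The following is only a sketch of that idea on a shortened input (the shortened string and the name `parts` are made up for illustration). It still splits only where a `}` is followed by a comma and an `s`, so it does not solve the general problem.

```python
import re

# Shortened, hypothetical input for illustration
a = r'std::vector<int>{32, 45, 10}, std::array<std::string, 5>{"a", "bc"}, 8'

# (?<=\}) requires a "}" before the comma and (?=s) requires an "s" after
# the whitespace, but neither is consumed, so both stay in the pieces.
parts = re.split(r'(?<=\}),\s+(?=s)', a)
print(parts)
```

Here the only split point is between the `std::vector` and `std::array` objects; the trailing `, 8` stays attached to the second piece because no `s` follows that comma.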
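Edit:
I wanted to take a stab at the problem and came up with the following (assuming the string literals contain no lone, unmatched characters from the set {, }, ", <, >). You could handle the <, > case more precisely by looking ahead for C++ declarations.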
a = r'std::vector<int>{32, 45, 10}, std::array<std::string, 5>{"a", "bc", "def", "ghij", "whoa, this is, a toughie"}, 8, "foo, bar", {"initializer-list?", "no problem!", "(hopefully...)"}'
l_braces = {"{", "<"}
r_braces = {"}", ">"}
def split(s):
    # Nesting depth; a quoted string is treated as one more nesting level.
    brace_count = 0
    quote_count = 0
    breaks = []
    for i, c in enumerate(s):
        if c == '"':
            quote_count += 1
            if quote_count % 2 == 1:
                brace_count += 1  # opening quote
            else:
                brace_count -= 1  # closing quote
        if c in l_braces:
            brace_count += 1
        if c in r_braces:
            brace_count -= 1
        # Only commas at depth 0 separate top-level values.
        if c == "," and brace_count == 0:
            breaks.append(i)
    pieces = []
    lag = 0
    for b in breaks:
        pieces.append(s[lag:b].strip())
        lag = b + 1
    # Using lag here also covers input without any top-level comma.
    pieces.append(s[lag:].strip())
    return pieces

print(split(a))

Here print(split(a)) prints the following:
['std::vector<int>{32, 45, 10}',
'std::array<std::string, 5>{"a", "bc", "def", "ghij", "whoa, this is, a toughie"}',
'8',
'"foo, bar"',
'{"initializer-list?", "no problem!", "(hopefully...)"}']
Answered 2015-04-01 03:05:17
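The csv module simply separates values at every comma it finds outside a quoted field. It pays no attention to other symbols such as braces or angle brackets.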
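A small reproduction makes this behavior visible. The question does not show the actual csv call, so the shortened input and the `skipinitialspace=True` option below are assumptions. That option makes the reader skip the space after each delimiter, which is what allows a quote following ", " to be recognized as the start of a quoted field.

```python
import csv

# Shortened, hypothetical input for illustration
line = 'std::vector<int>{32, 45, 10}, 8, "foo, bar"'

# csv.reader accepts any iterable of lines; take the single parsed row.
row = next(csv.reader([line], skipinitialspace=True))
print(row)
```

The quoted "foo, bar" comes back as a single field, while the braces are cut at every comma, which matches the pattern in the question's output.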
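To get the result you want, you would have to extend the module's logic so that it detects opening brackets such as "{". Once it finds an opening brace, it should ignore every comma until it finds the matching closing brace.
That should give you the desired output.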
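The csv module does not offer a hook for this kind of bracket tracking as far as its documented dialect options go, so in practice the logic described above is easier to write as a standalone function. The following is a minimal sketch under stated assumptions: the name `split_top_level` is hypothetical, `<` and `>` are always treated as template brackets (so comparison or shift operators and `->` would confuse it), and string literals use double quotes with backslash escapes.

```python
def split_top_level(s):
    """Split s on commas that lie outside string literals and outside {} or <> pairs."""
    pieces = []
    start = 0          # index where the current piece begins
    depth = 0          # current nesting depth of {} and <>
    in_string = False  # True while inside a "..." literal
    escaped = False    # True right after a backslash inside a literal
    for i, c in enumerate(s):
        if in_string:
            # Inside a literal, only the closing quote matters.
            if escaped:
                escaped = False
            elif c == '\\':
                escaped = True
            elif c == '"':
                in_string = False
        elif c == '"':
            in_string = True
        elif c in '{<':
            depth += 1
        elif c in '}>':
            depth -= 1
        elif c == ',' and depth == 0:
            pieces.append(s[start:i].strip())
            start = i + 1
    pieces.append(s[start:].strip())
    return pieces


a = r'std::vector<int>{32, 45, 10}, std::array<std::string, 5>{"a", "bc", "def", "ghij", "whoa, this is, a toughie"}, 8, "foo, bar", {"initializer-list?", "no problem!", "(hopefully...)"}'
for piece in split_top_level(a):
    print(piece)
```

On the input from the question this yields five pieces. Unlike csv, it keeps the surrounding quotes on "foo, bar", and because brackets inside string literals are skipped, a value such as "a<b" does not change the nesting depth.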
https://stackoverflow.com/questions/29375614