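I am parsing a CSV file with fgetcsv. I got a CSV export from a Magento installation. However, it is not valid CSV. Here is a problematic line from such an export:
200000,Samsung Galaxy S2,$399.00,8806085359376,null,Free ground shipping,New,In Stock,Samsung,"Vivid‧Fast‧Slim The new GALAXY SII Plus makes your life even smarter! 4.3" SUPER AMOLED Plus The 4.3" SUPER AMOLED Plus display goes a step beyond the already remarkable SUPER AMOLED to provide enhanced readability, a slimmer design, and better battery consumption for the best viewing value of any smartphone. Full-Touch Display Size: 4.3" Resolution: 480 x 800pixel Platform Operation Platform: Android v4.1 (Jelly Bean) TOUCHWiZ v4.0 User Interface (upto 7 pages widget desktop) Band^ UMTS(850 / 900 / 1900 / 2100MHz)+ Battery Capacity: 1650mAh",Mobile > Manufacturer > Samsung,
The problem is that the file uses " as shorthand for inch, and in other places as well, inside a field that is itself wrapped in double quotes.
I am looking for a RegEx to use with preg_replace on every double quote that is not followed or preceded by a comma. However, my RegEx knowledge is poor and I cannot create a working expression. This is what I think is close to a solution, but I cannot get it to work: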
private static function _fixQuotesInString($string)
{
return preg_replace('/(?<!,)"|"(?!,)/', '"', $string);
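}
With my limited knowledge, I would read it as: if you find a double quote, check that it is not preceded or followed by a comma, and if so, replace it with ".
When you post a solution, it would be great if you could add a "verbal explanation" of the RegEx so that I can understand it.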
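Answered 2013-06-03 12:51:26
Your regex will replace the quote in both ," and ", because each of them satisfies one of the two alternatives: the opening quote in ," is not followed by a comma, and the closing quote in ", is not preceded by a comma. Instead, you can just use (?<!,)"(?!,), which requires that the quote is neither preceded nor followed by a comma.
Note that this solution still has a potential problem when an inch mark " inside the text is directly followed by a comma, so you should try to fix the problem at its source.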
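To make the difference between the two patterns concrete, here is a minimal sketch run against a shortened, made-up line. The `$line` value and the `#` placeholder are assumptions chosen for illustration and are not part of the original export.

```php
<?php
// Hypothetical, shortened sample: one quoted field containing an inch mark.
$line = 'Samsung,"4.3" display",Mobile';

// Asker's pattern: a quote NOT preceded by a comma, OR a quote NOT followed by a comma.
// Every quote here satisfies at least one alternative, so all three are replaced.
$either = preg_replace('/(?<!,)"|"(?!,)/', '#', $line);

// Combined pattern: a quote that is neither preceded NOR followed by a comma.
// Only the inch mark inside the field matches.
$both = preg_replace('/(?<!,)"(?!,)/', '#', $line);

echo $either, PHP_EOL; // Samsung,#4.3# display#,Mobile
echo $both, PHP_EOL;   // Samsung,"4.3# display",Mobile
```

With the alternation, the opening and closing quotes of the field are lost as well; with both lookarounds on the same quote, only the stray inch mark is touched.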
Answered 2013-06-03 15:11:16
Description
If you simply want to parse each comma-delimited field, which may or may not be surrounded by double quotes, then you could use this regex:
(?:^|,)("?)(.*?)\1(?=,(?!\s)|$)

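Group 2 is assigned each comma-delimited value. If the value was opened with a quote, then the string is only closed by a matching quote that is followed either by a comma with no whitespace after it, or by the end of the line.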
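As a "verbal explanation" of the pattern above, the following sketch writes it in PCRE extended mode (the x flag) with one comment per part. The sample `$line` is a made-up, shortened value, and the variable names are assumptions for illustration only.

```php
<?php
// The same pattern in extended (x) mode, with a comment for each part.
$pattern = '/
    (?:^|,)          # a field starts at the beginning of the line or after a comma
    ("?)             # group 1: an optional opening double quote
    (.*?)            # group 2: the field value, as few characters as possible
    \1               # the closing quote, required only if group 1 captured one
    (?=,(?!\s)|$)    # next comes a comma not followed by whitespace, or the end of the line
/xms';

// Hypothetical, shortened sample: the quoted field contains an inch mark
// followed by a comma and a space, which must not end the field.
$line = 'Samsung,"4.3", slim display",Mobile';

preg_match_all($pattern, $line, $matches);
print_r($matches[2]);
```

Group 2 then holds three values: Samsung, then 4.3", slim display, then Mobile. The inch mark before ", slim" does not close the field because the comma after it is followed by a space.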
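PHP code example:
<?php
$sourcestring = "your source string";
// Same pattern as described above; group 2 of each match holds one field value
preg_match_all('/(?:^|,)("?)(.*?)\1(?=,(?!\s)|$)/ims', $sourcestring, $matches);
echo "<pre>" . print_r($matches, true);
?>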
$matches Array:
(
[0] => Array
(
[0] => 200000
[1] => ,Samsung Galaxy S2
[2] => ,$399.00
[3] => ,8806085359376
[4] => ,null
[5] => ,Free ground shipping
[6] => ,New
[7] => ,In Stock
[8] => ,Samsung
[9] => ,"Vivid‧Fast‧Slim The new GALAXY SII Plus makes your life even smarter! 4.3" SUPER AMOLED Plus The 4.3" SUPER AMOLED Plus display goes a step beyond the already remarkable SUPER AMOLED to provide enhanced readability, a slimmer design, and better battery consumption for the best viewing value of any smartphone. Full-Touch Display Size: 4.3" Resolution: 480 x 800pixel Platform Operation Platform: Android v4.1 (Jelly Bean) TOUCHWiZ v4.0 User Interface (upto 7 pages widget desktop) Band^ UMTS(850 / 900 / 1900 / 2100MHz)+ Battery Capacity: 1650mAh"
[10] => ,Mobile > Manufacturer > Samsung
[11] => ,
)
[1] => Array
(
[0] =>
[1] =>
[2] =>
[3] =>
[4] =>
[5] =>
[6] =>
[7] =>
[8] =>
[9] => "
[10] =>
[11] =>
)
[2] => Array
(
[0] => 200000
[1] => Samsung Galaxy S2
[2] => $399.00
[3] => 8806085359376
[4] => null
[5] => Free ground shipping
[6] => New
[7] => In Stock
[8] => Samsung
[9] => Vivid‧Fast‧Slim The new GALAXY SII Plus makes your life even smarter! 4.3" SUPER AMOLED Plus The 4.3" SUPER AMOLED Plus display goes a step beyond the already remarkable SUPER AMOLED to provide enhanced readability, a slimmer design, and better battery consumption for the best viewing value of any smartphone. Full-Touch Display Size: 4.3" Resolution: 480 x 800pixel Platform Operation Platform: Android v4.1 (Jelly Bean) TOUCHWiZ v4.0 User Interface (upto 7 pages widget desktop) Band^ UMTS(850 / 900 / 1900 / 2100MHz)+ Battery Capacity: 1650mAh
[10] => Mobile > Manufacturer > Samsung
[11] =>
)
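)
Simple replacement
Because your source text is comma-delimited and the delimiting commas have no surrounding whitespace, you can get around a problem like "excellent occasion, 4.3", samsung" with the following:
Regex: (?<!,)(")(?!,\S) Replace with: nothing
It matches a double quote that is not preceded by a comma and is not followed by a comma plus a non-whitespace character.
PHP code example: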
<?php
$sourcestring="your source string";
echo preg_replace('/(?<!,)(")(?!,\S)/ims','',$sourcestring);
?>
$sourcestring after replacement:
200000,Samsung Galaxy S2,$399.00,8806085359376,null,Free ground shipping,New,In Stock,Samsung,"Vivid‧Fast‧Slim The new GALAXY SII Plus makes your life even smarter! 4.3 SUPER AMOLED Plus The 4.3 SUPER AMOLED Plus display goes a step beyond the already remarkable SUPER AMOLED to provide enhanced readability, a slimmer design, and better battery consumption for the best viewing value of any smartphone. Full-Touch Display Size: 4.3 Resolution: 480 x 800pixel Platform Operation Platform: Android v4.1 (Jelly Bean) TOUCHWiZ v4.0 User Interface (upto 7 pages widget desktop) Band^ UMTS(850 / 900 / 1900 / 2100MHz)+ Battery Capacity: 1650mAh",Mobile > Manufacturer > Samsung,
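Since the goal in the question is to feed the line to fgetcsv, one way to check the cleaned text is to parse it with str_getcsv, which handles a single line with the same CSV rules. This is only a sketch: the `$line` value is a made-up, shortened row with a similar shape to the export, and the variable names are assumptions.

```php
<?php
// Hypothetical, shortened line with a similar shape to the export row.
$line = '200000,Samsung,"4.3" SUPER AMOLED, Size: 4.3" wide",Mobile';

// Remove every quote that is not preceded by a comma and
// not followed by a comma plus a non-whitespace character.
$fixed = preg_replace('/(?<!,)(")(?!,\S)/', '', $line);

// Parse the repaired line; delimiter, enclosure and escape are passed explicitly.
$fields = str_getcsv($fixed, ',', '"', '\\');

echo $fixed, PHP_EOL;
print_r($fields);
```

Here the two inch marks are removed and the quoted description is parsed as a single field, giving four fields in total. One caveat worth checking on real data: a closing quote at the very end of a line, or one followed by a comma and then the end of the line, also satisfies (?!,\S) and would be removed by this pattern.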