I built this regex to parse strings like the following (I need the Bytes and Time values):
1463735418 Bytes: 0 Time: 4.297
Here is the code:
string writePath = @"C:\final.txt";
string[] lines = File.ReadAllLines(@"C:\union.dat");
foreach (string txt in lines)
{
    string re1 = ".*?"; // Non-greedy match on filler
    string re2 = "\\d+"; // Uninteresting: int
    string re3 = ".*?"; // Non-greedy match on filler
    string re4 = "(\\d+)"; // Integer Number 1
    string re5 = ".*?"; // Non-greedy match on filler
    string re6 = "([+-]?\\d*\\.\\d+)(?![-+0-9\\.])"; // Float 1
    Regex r = new Regex(re1 + re2 + re3 + re4 + re5 + re6, RegexOptions.IgnoreCase | RegexOptions.Singleline);
    Match m = r.Match(txt);
    if (m.Success)
    {
        String int1 = m.Groups[1].ToString();
        String float1 = m.Groups[2].ToString();
        Debug.Write("(" + int1.ToString() + ")" + "(" + float1.ToString() + ")" + "\n");
        File.AppendAllText(writePath, int1.ToString() + ", " + float1.ToString() + Environment.NewLine);
    }
}
This works well when the whole record is on a single line, but the file I am actually reading looks like this:
1463735418
Bytes: 0
Time: 4.297
1463735424
Time: 2.205
1466413696
Time: 2.225
1466413699
1466413702
1466413705
1466413708
1466413711
1466413714
1466413717
1466413720
Bytes: 7037
Time: 59.320
... (arbitrary repetition)
With this input I get garbage data.
Expected Output:
0, 4.297
7037, 59.320
(Only match where a Bytes/Time pair exists.)
Edit: I am now trying the following approach, but I still do not get the desired result.
foreach (string txt in lines)
{
    if (txt.StartsWith("Byte"))
    {
        string re1 = ".*?"; // Non-greedy match on filler
        string re2 = "(\\d+)"; // Integer Number 1
        Regex r = new Regex(re1 + re2, RegexOptions.IgnoreCase | RegexOptions.Singleline);
        Match m = r.Match(txt);
        if (m.Success)
        {
            String int1 = m.Groups[1].ToString();
            //Console.Write("(" + int1.ToString() + ")" + "\n");
            httpTable += int1.ToString() + ",";
        }
    }
    if (txt.StartsWith("Time"))
    {
        string re3 = ".*?"; // Non-greedy match on filler
        string re4 = "([+-]?\\d*\\.\\d+)(?![-+0-9\\.])"; // Float 1
        Regex r1 = new Regex(re3 + re4, RegexOptions.IgnoreCase | RegexOptions.Singleline);
        Match m1 = r1.Match(txt);
        if (m1.Success)
        {
            String float1 = m1.Groups[1].ToString();
            //Console.Write("(" + float1.ToString() + ")" + "\n");
            httpTable += float1.ToString() + Environment.NewLine;
        }
    }
}
How can I fix this? Thanks.
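Answered 2016-06-27 23:15:03
I suggest anchoring the Time and Bytes values with lookbehinds and, when neither lookbehind applies, falling back to a plain integer category. Regex named captures then tell you which kind of value each match found.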
// Requires: using System.Linq; using System.Text.RegularExpressions;
string data = "1463735418 Bytes: 0 Time: 4.297 1463735424 Time: 2.205 1466413696 Time: 2.225 1466413699 1466413702 1466413705 1466413708 1466413711 1466413714 1466413717 1466413720 Bytes: 7037 Time: 59.320";
string pattern = @"
(?<=Bytes:\s)(?<Bytes>\d+) # Lookbehind for the bytes
| # Or
(?<=Time:\s)(?<Time>[\d.]+) # Lookbehind for time
| # Or
(?<Integer>\d+) # most likely its just an integer.
";
var results = Regex.Matches(data, pattern, RegexOptions.IgnorePatternWhitespace)
    .OfType<Match>()
    .Select(mt => new
    {
        IsInteger = mt.Groups["Integer"].Success,
        IsTime = mt.Groups["Time"].Success,
        IsByte = mt.Groups["Bytes"].Success,
        strMatch = mt.Groups[0].Value,
        AsInt = mt.Groups["Integer"].Success ? int.Parse(mt.Groups["Integer"].Value) : -1,
        AsByte = mt.Groups["Bytes"].Success ? int.Parse(mt.Groups["Bytes"].Value) : -1,
        AsTime = mt.Groups["Time"].Success ? double.Parse(mt.Groups["Time"].Value) : -1.0,
    });
The result is an IEnumerable of anonymous objects, one per match. Each object carries the three Is... flags plus the corresponding As... converted values, which are -1 when that group did not match.
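The answer classifies every match but does not show how to get from there to the "bytes, time" lines the question asks for. One possible way, sketched below, is to remember the most recent Bytes value and emit a line only when the very next match is a Time value. The class name BytesTimePairs, the method Extract, and the shortened sample string are illustrative assumptions, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class BytesTimePairs
{
    // Same three-way alternation as in the answer above.
    private const string Pattern = @"
        (?<=Bytes:\s)(?<Bytes>\d+)     # value after 'Bytes: '
      |
        (?<=Time:\s)(?<Time>[\d.]+)    # value after 'Time: '
      |
        (?<Integer>\d+)                # any other integer, e.g. a timestamp
    ";

    // Returns "bytes, time" only where a Bytes match is
    // immediately followed by a Time match.
    public static List<string> Extract(string data)
    {
        var result = new List<string>();
        string pendingBytes = null; // last Bytes value still waiting for its Time

        foreach (Match m in Regex.Matches(data, Pattern, RegexOptions.IgnorePatternWhitespace))
        {
            if (m.Groups["Bytes"].Success)
            {
                pendingBytes = m.Groups["Bytes"].Value;
            }
            else
            {
                if (m.Groups["Time"].Success && pendingBytes != null)
                    result.Add(pendingBytes + ", " + m.Groups["Time"].Value);
                // A Time or plain integer ends any pending pair.
                pendingBytes = null;
            }
        }
        return result;
    }

    public static void Main()
    {
        // Shortened sample (assumption): one full pair, one lone Time, one full pair.
        string data = "1463735418 Bytes: 0 Time: 4.297 1463735424 Time: 2.205 1466413720 Bytes: 7037 Time: 59.320";
        foreach (string line in Extract(data))
            Console.WriteLine(line);
    }
}
```

Keeping the values as strings avoids culture-dependent parsing (double.Parse reads "4.297" differently under cultures that use a decimal comma) and preserves trailing zeros such as "59.320".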

Answered 2016-06-27 22:56:29
Since you only need the values of Bytes: ... and Time: ..., match those exact strings instead of generic filler:
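To capture Bytes:
Bytes: (\d+)
To capture Time (the sign is optional, since the sample values are unsigned):
Time: ([-+]?\d*\.\d+)
A general pattern that captures both kinds (the float alternative comes first so that \d+ does not stop at the decimal point):
(Bytes|Time): ([-+]?\d*\.\d+|\d+)
https://stackoverflow.com/questions/38064416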
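To get only the Bytes/Time pairs from the multi-line file, the two exact-string patterns can be joined into one expression and run over the whole file text rather than line by line. The sketch below assumes that a Bytes line and its Time line are separated by whitespace only. The names AdjacentPairs and Extract are illustrative, and the inline sample stands in for something like File.ReadAllText(@"C:\union.dat").

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AdjacentPairs
{
    // 'Bytes: <int>' followed, after whitespace only (including line breaks),
    // by 'Time: <float>'. The sign on the float is optional.
    private static readonly Regex PairRegex =
        new Regex(@"Bytes:\s*(\d+)\s+Time:\s*([-+]?\d*\.\d+)");

    public static List<string> Extract(string text)
    {
        var result = new List<string>();
        foreach (Match m in PairRegex.Matches(text))
            result.Add(m.Groups[1].Value + ", " + m.Groups[2].Value);
        return result;
    }

    public static void Main()
    {
        // Shortened sample (assumption) in the multi-line layout from the question.
        string text = "1463735418\nBytes: 0\nTime: 4.297\n1463735424\nTime: 2.205\n1466413720\nBytes: 7037\nTime: 59.320\n";
        foreach (string line in Extract(text))
            Console.WriteLine(line);
    }
}
```

Because \s+ also matches \r\n, the same pattern should work for files with Windows line endings. A Time line with no Bytes line directly before it simply never matches.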