所以我每天都会收到包含信息的电子邮件。不幸的是,由于某些原因,数据是在电子邮件正文中发送的,而不是作为附件。那好吧。我正在使用Excel来抓取Outlook,使用VBA。
Sub mytry()
Dim olapp As Object
Dim olmapi As Object
Dim olmail As Object
Dim olitem As Object
Dim lrow As Integer
Dim olattach As Object
Dim str As String
Dim TextWeNeedToParse as String
Const num As Integer = 6
Const path As String = "C:\HP\"
Const emailpath As String = "C:\Dell\"
Const olFolderInbox As Integer = 6
Set olp = CreateObject("outlook.application")
Set olmapi = olp.getnamespace("MAPI")
Set olmail = olmapi.getdefaultfolder(num)
If olmail.items.restrict("[ReceivedTime]>=""&MacroDate&12:00am&""").Count = 0 Then
Else
For Each olitem In olmail.items.restrict("[ReceivedTime]>=""&MacroDate&12:00am&""")
TextWeNeedToParse = olitem.body
'Recursive text parsing here
Next olitem
End If好的,这段代码应该能把整个正文转换成一个字符串。现在我们可以传递字符串,并对其进行操作。
下面是我正在处理的文本示例:
WAL +300bp QTY
(M) FCTR SECURITY CPN ASK 1mPSA TYPE
0.77 1.15 458 0.04 GNR 2012-61 CA 2.00 99-16 217 SEQ
1.39 2.26 120 0.76 GNR 2005-13 AE 5.00 102-24 223 SUP
1.40 18.16 45 0.65 GNR 2015-157 NH 2.50 95-16 215 EXCH,+
1.50 21.56 25 0.94 GNR 2017-103 HD 3.00 98-08 375 PAC-2所以我可以有几种不同的方法来解决这个问题,但我不是很了解所有的部分。
1)我可以尝试计算存在多少回车符,然后进行循环。然后“计算”空间,找出所有东西的位置。不太确定它会有多好。
2)我可以正则表达式中间的唯一ID,如果我能想出如何正则表达式第n个实例(我被卡住的一个主要点),我也可以使用它来正则表达式数字-例如,第一行将是由空格包围的直数/小数的1-5个实例,以及number-number-dash-number-number的第一个实例。
我将通过它抛出的示例Regex代码:
Function regex(strInput As String, matchPattern As String, Optional ByVal outputPattern As String = "$0") As Variant
Dim inputRegexObj As New VBScript_RegExp_55.RegExp, outputRegexObj As New VBScript_RegExp_55.RegExp, outReplaceRegexObj As New VBScript_RegExp_55.RegExp
Dim inputMatches As Object, replaceMatches As Object, replaceMatch As Object
Dim replaceNumber As Integer
With inputRegexObj
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = matchPattern
End With
With outputRegexObj
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = "\$(\d+)"
End With
With outReplaceRegexObj
.Global = True
.MultiLine = True
.IgnoreCase = False
End With
Set inputMatches = inputRegexObj.Execute(strInput)
If inputMatches.Count = 0 Then
regex = False
Else
Set replaceMatches = outputRegexObj.Execute(outputPattern)
For Each replaceMatch In replaceMatches
replaceNumber = replaceMatch.SubMatches(0)
outReplaceRegexObj.Pattern = "\$" & replaceNumber
If replaceNumber = 0 Then
outputPattern = outReplaceRegexObj.Replace(outputPattern, inputMatches(0).Value)
Else
If replaceNumber > inputMatches(0).SubMatches.Count Then
'regex = "A to high $ tag found. Largest allowed is $" & inputMatches(0).SubMatches.Count & "."
regex = CVErr(xlErrValue)
Exit Function
Else
outputPattern = outReplaceRegexObj.Replace(outputPattern, inputMatches(0).SubMatches(replaceNumber - 1))
End If
End If
Next
regex = outputPattern
End If
End Function3)我可以尝试上面的一些方法,但使用递归。我的递归相当弱。
因此,一旦我提取了文本字符串,我想我将需要如下内容:
Sub QuickExample(Dim Cusip as String, Dim PriceStr as variant, Dim SpreadStr as variant)
Dim ws as WorkSheet
Set ws = thisworkbook.sheets("Results")
LastRow = ws.Cells(sht.Rows.Count, "A").End(xlUp).Row
ws.cells(Lastrow,1).value2 = Cusip
ws.cells(Lastrow,2).value2 = PriceStr
ws.cells(Lastrow,3).value2 = SpreadStr
End Sub最后:
Sub ParsingDate(EmailText as String)
Dim CarriageReturns As Long
CarriageReturns = Len(EmailText) - Len(Replace(EmailText, Chr(10), ""))
For i = 1 to CarriageReturns
'Parse out the data for the ith row, return it to the function above
Next i
End Sub这是实际的解析行为,我对此有点纠结--我如何正确地获得第n个结果,并且只得到第n个结果?即使添加了一些额外的空格或行,我如何确保它继续工作?有没有一种方法可以只使用正则表达式,并“查看”给定表达式的第n个查找结果?在没有大量递归的情况下,可以做到这一点吗?
谢谢
发布于 2018-04-27 01:03:21
WAL +300bp QTY
(M) FCTR SECURITY CPN ASK 1mPSA TYPE
0.77 1.15 458 0.04 GNR 2012-61 CA 2.00 99-16 217 SEQ
1.39 2.26 120 0.76 GNR 2005-13 AE 5.00 102-24 223 SUP
1.40 18.16 45 0.65 GNR 2015-157 NH 2.50 95-16 215 EXCH,+
1.50 21.56 25 0.94 GNR 2017-103 HD 3.00 98-08 375 PAC-2这似乎是一个格式化得很好的表。也可以再次使用split(),使用Split()将每一行放入数组,然后将每个字段放入一个数组
Sub dump()
arrLine = Split(TextWeNeedToParse, Chr(10))
For Each Line In arrLine
For Each field In Split(Line, " ")
Debug.Print field
Next
Next
End Sub它非常短,而且跑得很快。您只需要一个if语句和计数器(或正则表达式测试)即可获得所需的确切项目。
如果您删除多个空格,以便split()将每个元素放在适当的位置,则测试/计数可能会更容易。在运行以下代码之前,您可以使用一个循环来删除多个空格:
完全实现它可能是这样的:
<your code to get the bod>
'remove multiple spaces from string for parsing
Do While InStr(1, TextWeNeedToParse, " ")
TextWeNeedToParse= Replace(TextWeNeedToParse, " ", " ")
Loop
'Get each line into an array element
arrLine = Split(TextWeNeedToParse, Chr(10))
'Loop through the array
For Each Line In arrLine
'dump fields to an array
arrFields = Split(Line, " ")
'and spit out a particular element (your "unique id" is element 5)
If UBound(arrFields) >= 5 Then Debug.Print "unique id:"; arrFields(5)
Nexthttps://stackoverflow.com/questions/50034664
复制相似问题