首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >如何在没有列的情况下将文本文件导入vba?

如何在没有列的情况下将文本文件导入vba?
EN

Stack Overflow用户
提问于 2014-05-27 16:49:15
回答 1查看 342关注 0票数 0

我在VBA中构建了一个简单的webscraper,它从Google专利中提取一个表,并将innerHTML存储在一个.txt文件中(名为1234567.txt),用于大约23,000项专利。现在我想分析每个文件的内容。为此,我希望将txt文件导入到VBA中,以便能够执行一些字符串搜索,但这似乎非常困难。我阅读了大约20种将.txt文件导入VBA的解决方案,但没有一种解决方案适用于我的文件,如下所示:

代码语言:javascript
复制
US6824791 B2 ' There is a shift + Enter here
<TD class="patent-data-table-td citation-patent"><A href="/patents/US7767249">US7767249</A></TD>
<TD class="patent-data-table-td patent-date-value">Jul 25, 2005</TD>
<TD class="patent-data-table-td patent-date-value">Aug 3, 2010</TD>
<TD class="patent-data-table-td ">Hewlett-Packard Development Company, L.P.</TD>
<TD class="patent-data-table-td ">Preparation of nanoparticles</TD></TR>
<TR>
<TD class="patent-data-table-td citation-patent"><A href="/patents/US7935853">US7935853</A><SPAN class=patent-tooltip-anchor aria-label="Cited by examiner" data-tooltip-text="Cited by examiner" data-tooltip="Cited by examiner" a="null"> *</SPAN></TD>
<TD class="patent-data-table-td patent-date-value">Oct 8, 2009</TD>
<TD class="patent-data-table-td patent-date-value">May 3, 2011</TD>
<TD class="patent-data-table-td ">Bobelium S.L.</TD>
<TD class="patent-data-table-td ">Micronized composition of a 2,4-disubstituted phenol derivative</TD></TR>
<TR>
<TD class="patent-data-table-td citation-patent"><A href="/patents/US8524829">US8524829</A></TD>
<TD class="patent-data-table-td patent-date-value">Jun 17, 2008</TD>
<TD class="patent-data-table-td patent-date-value">Sep 3, 2013</TD>
<TD class="patent-data-table-td ">Brown University Research Foundation</TD>
<TD class="patent-data-table-td ">Methods for micronization of hydrophobic drugs</TD></TR>
<TR>
<TD class="patent-data-table-td citation-patent"><A href="/patents/EP2422804A1?cl=en">EP2422804A1</A></TD>
<TD class="patent-data-table-td patent-date-value">Jun 16, 2005</TD>
<TD class="patent-data-table-td patent-date-value">Feb 29, 2012</TD>
<TD class="patent-data-table-td ">Amano Enzyme USA., Ltd.</TD>
<TD class="patent-data-table-td ">Controlled release formulations of enzymes, microorganisms, and antibodies with mucoadhesive polymers</TD></TR></TBODY></TABLE>
' There is a shift + Enter here

因此,尽管这些文件结构整洁、重复,但将它们作为一个字符串导入似乎非常困难。我基本上想遍历这个文件,提取专利号和提到的两个日期(使用MidInStr),并将它们放在三个不同的列中。这是我认为是最好的,但我渴望听到更聪明的建议!

可能需要知道每个.txt文件都有不同的长度(行数),但我可以用99.9%的确定性(1 (title) + 6 * total number of citations (which I know) - 1 (last <TR> is missing)来估计确切的行数

提前感谢

西蒙

编辑:我试过的一些例子。这些代码是从网上来源获取的。我不太清楚它们应该如何工作,很可能我使用它们不正确。

代码语言:javascript
复制
`Sub Text2Excel_Click()
Dim sourcestring as String

sourcestring = GetText("C\users\...\test.txt")
sourcestring = OpenTextFileToString("C\users\...\test.txt")

Function GetText(sFile As String) As String
   Dim sText As String
   Dim nSourceFile As Integer

   ''Close any open text files
   Close
   ''Get the number of the next free text file
   nSourceFile = FreeFile
   ''Write the entire file to sText
   Open sFile For Input As #nSourceFile
   sText = Input$(LOF(1), 1)
   Close

   GetText = sText
End Function

Function OpenTextFileToString(ByVal strFile As String) As String
' RB Smissaert - Author
Dim hFile As Long
hFile = FreeFile
Open strFile For Input As #hFile
OpenTextFileToString = Input$(LOF(hFile), hFile)
Close #hFile
End Function

我还尝试了以下建议:但这给了我运行时错误424“对象所需”。

EN

回答 1

Stack Overflow用户

回答已采纳

发布于 2014-05-28 13:11:51

如果您只是尝试将文本文件放入vba中的1字符串中,则可以使用以下函数。

注意:这要求您添加对的引用。

代码语言:javascript
复制
Public Function ReadFileAsString(ByVal filePath As String) As String

Dim fso As FileSystemObject
Set fso = New FileSystemObject
Set txtstream = fso.OpenTextFile(filePath, ForReading, False)

Dim sourceString As String
Do While Not txtstream.AtEndOfStream
    sourceString = sourceString + txtstream.ReadLine
Loop

ReadFileAsString = sourceString
txtstream.Close

Set fso = Nothing
Set txtstream = Nothing

End Function

然后,您可以对上述函数返回的字符串执行InStr()和Mid()等操作。要查看文本文件,请尝试:

代码语言:javascript
复制
Public Sub Test()
    Debug.Print ReadFileAsString("C:\Users\ausername\Desktop\test.txt")
End Sub
票数 1
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/23894822

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档