Q: Find and extract text from an existing text file

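Stack Overflow user

Asked 2012-01-18 03:19:09

2 answers, 2.8K views, 0 followers, 1 vote

I need to be able to extract data from an existing text file. The structure of the text file looks something like this...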

this line contains a type of header and always starts at column 1
     this line contains other data and is always tabbed in
     this line contains other data and is always tabbed in
     this line contains other data and is always tabbed in
     this line contains other data and is always tabbed in
     this line contains other data and is always tabbed in
     this line contains other data and is always tabbed in

this line contains a type of header and always starts at column 1
     this line contains other data and is always tabbed in
     this line contains other data and is always tabbed in
     this line contains other data and is always tabbed in

this line contains a type of header and always starts at column 1
     this line contains other data and is always tabbed in
     this line contains other data and is always tabbed in
     this line contains other data and is always tabbed in
     this line contains other data and is always tabbed in

this line contains a type of header and always starts at column 1
     this line contains other data and is always tabbed in
     this line contains other data and is always tabbed in

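As you can see, the text file is arranged in sections. There is always a single header line, followed by a varying number of other data lines, and there is always a blank line between sections. Unfortunately, there is no rhyme or reason to the naming scheme of the headers or to the data contained in the other data lines; only the structure above is somewhat consistent. The data I need to search for is located in one of the other data lines, in only one of the sections, and that section can be anywhere in the text file. I can use the FIND command to locate the text, but once I have done that, I need to extract the entire section to a new text file. I can't figure out how to go up however many lines it takes to reach the first preceding blank line, then go down to the next blank line, and extract everything in between. Does that make sense? Unfortunately, VBScript is simply not an option for this application, or this would have been finished long ago. Any ideas? Thanks.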

batch-file

extract

text-files

2 Answers

Stack Overflow user

Answered 2012-01-18 04:20:20

@echo off
setlocal enableDelayedExpansion
set input="test.txt"
set output="extract.txt"
set search="MY TEXT"

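::find the line with the text
set "lineNum="
for /f "delims=:" %%N in ('findstr /n /c:!search! %input%') do set lineNum=%%N
::stop early if the text was not found, otherwise the comparisons below would fail
if not defined lineNum (
  echo Search text not found
  exit /b 1
)
set "begin=0"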

::find blank lines and set begin to the last blank before text and end to the first blank after text
for /f "delims=:" %%N in ('findstr /n "^$" %input%') do (
  if %%N lss !lineNum! (set "begin=%%N") else set "end=%%N" & goto :break
)
::end of section not found so we must count the number of lines in the file
for /f %%N in ('find /c /v "" ^<%input%') do set /a end=%%N+1
:break

::extract the section bracketed by begin and end
set /a count=end-begin-1
<%input% (
  rem ::throw away the beginning lines until we reach the desired section
  for /l %%N in (1 1 %begin%) do set /p "ln="
  rem ::read and write the section
  for /l %%N in (1 1 %count%) do (
    set "ln="
    set /p "ln="
    echo(!ln!
  )
)>%output%

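Limitations of this solution:

Lines must be terminated with <CR><LF> (Windows style)
Lines must be at most 1021 bytes long (not including the line terminator)

Trailing control characters are stripped from each line

If these limitations are a problem, a less efficient variant can be written that reads the section with FOR /F instead of SET /P.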
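
The FOR /F variant mentioned above could look roughly like the following. This is a minimal sketch, not the answerer's own code: the file names and search text are placeholders, and it assumes the same layout as in the question (sections separated by truly empty lines). It numbers every line with findstr /n and echoes only the lines whose number falls strictly between the two blank lines that bracket the match.

```batch
@echo off
setlocal disableDelayedExpansion
rem Hypothetical file names and search text: adjust to your data
set "input=test.txt"
set "output=extract.txt"
set "search=MY TEXT"

rem Find the line number of the last line that contains the search text
set "lineNum="
for /f "delims=:" %%N in ('findstr /n /c:"%search%" "%input%"') do set "lineNum=%%N"
if not defined lineNum (
  echo Search text not found
  exit /b 1
)

rem begin is the last blank line before the match, end is the first blank line after it
set "begin=0"
set "end="
for /f "delims=:" %%N in ('findstr /n "^$" "%input%"') do (
  if %%N lss %lineNum% (set "begin=%%N") else if not defined end set "end=%%N"
)
rem No blank line after the match, so use a value larger than any line number
if not defined end set "end=2147483647"

rem Number every line with findstr, then echo only the lines strictly between begin and end
>"%output%" (
  for /f "tokens=1* delims=:" %%A in ('findstr /n "^" "%input%"') do (
    if %%A gtr %begin% if %%A lss %end% echo(%%B
  )
)
```

Because delayed expansion stays disabled, exclamation marks in the data are written unchanged. The trade-offs are that the whole file is piped through findstr a third time, and that the "delims=:" parsing drops any colons at the very start of a line's content, so this sketch only suits data whose lines do not begin with a colon.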

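Votes: 1

Stack Overflow user

Answered 2012-01-18 13:33:44

The program below reads the file line by line and stores the lines of one section in an array, checking at the same time whether the search text occurs in the current section. When the section ends, the current section is output as the result if the search text was found in it; otherwise, processing moves on to the next section.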

@echo off
setlocal EnableDelayedExpansion
set infile=input.txt
set outfile=output.txt
set "search=Any text"
set textFound=
call :SearchSection < %infile% > %outfile%
goto :EOF

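:SearchSection
   rem Start a new section and reset the stored line counter
   set i=0
   :readNextLine
      rem Read one line from redirected input, an empty line or end of file leaves line undefined
      set line=
      set /P line=
      if not defined line goto endSection
      rem Store the line in the next array element
      set /A i+=1
      set "ln%i%=!line!"
      rem If removing the search text changes the line then the line contains it
      if not "!ln%i%!" == "!line:%search%=!" set textFound=True
   goto readNextLine
   :endSection
   rem A section with no lines is treated as end of input with no match
   if %i% == 0 echo Error: Search text not found & exit /B
rem No match in this section, so discard it and read the next one
if not defined textFound goto SearchSection
rem Match found, so output the stored lines of this section
for /L %%i in (1,1,%i%) do echo !ln%%i!
exit /B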

This program has the same limitations that dbenham listed for his program.

Votes: 1

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/8900374
