Q: Remove a specific LaTeX command and its closing brace from text

Unix & Linux user

Asked 2017-06-27 21:42:59

3 answers, 354 views, 0 followers, 3 votes

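How can I remove a specific LaTeX command and its closing brace from a text while keeping the text inside the braces? In the example below, the command to remove is \edit{<some stuff>}. The \edit{ and the matching } should be removed, while <some stuff> should stay unchanged.

Please feel free to suggest sed, awk, Perl, or anything else that does the job.

A nonsense example: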

We \edit{Introduce a} model for analyzing \emph{data} from various
experimental designs, \edit{such as paired or \url{http://www/}
longitudinal; as was done 1984 by NN \cite{mycitation} and by NNN
\cite{mycitation2}}.

Note that there may be one or several \command{smth} statements inside an \edit{} command. Each \command{smth} should stay as it is.

Output:

We Introduce a model for analyzing \emph{data} from various
experimental designs, such as paired or \url{http://www/}
longitudinal; as was done 1984 by NN \cite{mycitation} and by NNN
\cite{mycitation2}.

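P.S. I am making many small edits to my text file. I want those edits highlighted so that my collaborators can see them. Afterwards, I want to remove all the highlighting and send the text to a reviewer.

This question was originally asked as "AWK/SED: remove specific latex command from text with closing bracket following it", but the example given there was too easy.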

text-processing

awk

sed

perl

latex

3 Answers

Unix & Linux user

Answered 2017-06-27 22:44:45

Here is one that works in the simple case where there is at most one level of commands inside \edit{...}:

perl -00 -lpe 's,\\edit\{( (?: [^}\\]* | \\[a-z]+\{[^}]*\} )+ )\},$1,xg'

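The middle part, (?: [^}\\]* | \\[a-z]+\{[^}]*\} )+, has two alternatives: [^}\\]* matches any string with no closing brace or backslash (regular text); \\[a-z]+\{[^}]*\} matches a backslash, lowercase letters, and a matching pair of braces (such as \url{whatever...}). The grouping (?:...)+ repeats these alternatives, and the outer parentheses capture the result, so the whole match can be replaced with just the part inside \edit{...}.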
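For readers who want to experiment with the pattern outside Perl, here is a rough Python equivalent. It is a sketch only: re.VERBOSE stands in for Perl's /x, the names EDIT_ONE_LEVEL and strip_edit_one_level are invented for illustration, and the repetition is written as (A+|B)* rather than (A*|B)+, which accepts the same strings. It has the same one-level limitation as the Perl one-liner.

```python
import re

# Same structure as the Perl pattern: plain text or one \cmd{...} group, repeated.
EDIT_ONE_LEVEL = re.compile(
    r"""
    \\edit\{                  # literal \edit{
    (                         # capture the contents
      (?: [^}\\]+             # text without "}" or backslash
        | \\[a-z]+\{[^}]*\}   # one command such as \url{...}
      )*
    )
    \}                        # the closing brace of \edit
    """,
    re.VERBOSE,
)

def strip_edit_one_level(text):
    # Hypothetical helper: keep the captured contents, drop \edit{ and its }.
    return EDIT_ONE_LEVEL.sub(r"\1", text)

sample = r"We \edit{Introduce a} model, \edit{such as \url{http://www/} and \cite{x}}."
print(strip_edit_one_level(sample))
```

Because [^}\\] also matches newlines, the Python version works on a whole string at once and does not need a separate paragraph mode.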

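-00 tells Perl to process the input one paragraph at a time, with blank lines separating paragraphs. If you need to handle markup that spans paragraphs, change it to -0777 to process the whole input in one go (-0, meant for NUL-separated input, would also work here, since a text file contains no NUL bytes).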
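Why the record size matters can be shown with a small Python sketch. It assumes a deliberately simplified pattern (an \edit{...} with no braces inside), and the function names strip_by_line and strip_by_paragraph are made up; they only approximate what plain perl -p and perl -00 do.

```python
import re

# Simplified pattern for illustration only: \edit{...} with no braces inside.
EDIT_FLAT = re.compile(r"\\edit\{([^{}]*)\}")

def strip_by_line(text):
    # Roughly like plain `perl -pe`: every line is processed in isolation.
    lines = text.splitlines(keepends=True)
    return "".join(EDIT_FLAT.sub(r"\1", line) for line in lines)

def strip_by_paragraph(text):
    # Roughly like `perl -00`: records are paragraphs separated by blank lines.
    parts = re.split(r"(\n{2,})", text)  # the capture group keeps the separators
    return "".join(EDIT_FLAT.sub(r"\1", part) for part in parts)

text = "designs, \\edit{such as paired or\nlongitudinal}.\n\nNext \\edit{paragraph}.\n"
print(strip_by_line(text))       # the two-line \edit is left untouched
print(strip_by_paragraph(text))  # both \edit commands are removed
```

An \edit{...} that crosses a blank line would still be missed by the paragraph version, which is the case -0777 covers.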

For your example, this seems to work, giving:

We Introduce a model for analyzing \emph{data} from various
experimental designs, such as paired or \url{http://www/}
longitudinal; as was done 1984 by NN \cite{mycitation} and by NNN
\cite{mycitation2}.

However, it (predictably) fails for input with two levels of commands inside \edit{...}:

Some \edit{\somecmd{\emph{nested} commands} here}.

turns into:

Some \somecmd{\emph{nested} commands here}.

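(removing the wrong closing brace)

Handling balanced braces properly is somewhat more involved; it is discussed, for example, on SO: "Perl regex: matching nested brackets".

Votes: 3

Unix & Linux user

Answered 2023-02-07 03:13:49

I have a Python-based solution. It is not concise, but it handles nested commands well.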

import re

def command_remove(tex_in, keywords):
    # Remove commands together with their curly braces, keeping the contents.
    # keywords: space-separated names; "hl textbf" removes \hl{} and \textbf{}.
    # Assumes braces are balanced; an unclosed command raises IndexError.
    names = '|'.join(re.escape(k) for k in keywords.split())
    pattern = r'\\(?:' + names + r')\{'
    idxs_to_del = []  # indices of the closing } of each matched command
    for command in re.finditer(pattern, tex_in):
        depth = 0  # number of inner { still waiting for their }
        current_loc = command.end()
        # Scan forward until the } that closes this command.
        while not (tex_in[current_loc] == '}' and depth == 0):
            if tex_in[current_loc] == '}':
                depth -= 1
            elif tex_in[current_loc] == '{':
                depth += 1
            current_loc += 1
        idxs_to_del.append(current_loc)

    # Delete from the end so that earlier indices stay valid.
    tex_list = list(tex_in)
    for idx in sorted(idxs_to_del, reverse=True):
        tex_list.pop(idx)  # remove }

    tex_out = ''.join(tex_list)
    tex_out = re.sub(pattern, '', tex_out)  # remove \xxx{
    return tex_out

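It locates the target commands with a regular expression, then uses a stack-style depth counter to find the position of each closing brace. For tex_out = command_remove(tex_in, "revise textbf") and tex_in: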

\hl{Can you} \revise{can a \textbf{can} as a \emph{canner} can} can a can?

we get tex_out:

\hl{Can you} can a can as a \emph{canner} can can a can?

More details, such as how to run it from the command line, are in latex_command_remove.
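The brace-matching step this answer relies on can be isolated into a small helper, shown here as a sketch. The name find_closing_brace and the choice to return -1 for unbalanced input are assumptions made for illustration, not part of the answer's code.

```python
def find_closing_brace(text, start):
    """Return the index of the "}" closing a group whose contents begin at start.

    Returns -1 if the braces are unbalanced (hypothetical convention).
    """
    depth = 0  # inner "{" that have not been closed yet
    for i in range(start, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            if depth == 0:
                return i  # this "}" belongs to the group we started in
            depth -= 1
    return -1

text = r"\revise{can a \textbf{can} x} y"
start = len(r"\revise{")  # first character after the opening brace
end = find_closing_brace(text, start)
print(text[start:end])
```

Returning -1 instead of running past the end of the string lets a caller skip or report an unclosed command.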

Votes: 0

Unix & Linux user

Answered 2023-02-07 06:45:04

To handle \edit{...} commands that contain other LaTeX commands (that is, other {...} pairs), you can use perl's support for recursion in its regexps:

perl -pe 's{\\edit(\{((?:[^{}]++|(?1))*)\})}{$2}g' file

Here, (?1) recalls the regexp in the first (...) pair, which in this case is the one that matches a {...} pair.

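(This does not handle escaped braces, \verb, or comments, and it assumes that \edit{...}s do not span several lines; all of these could easily be added if needed.)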
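Python's built-in re module has no counterpart to (?1), so a Python version needs an explicit scan instead. The sketch below is one possible way to do that, not a translation of the one-liner: strip_command is a made-up name, and it additionally copes with escaped braces and with commands that span lines, while still ignoring \verb and % comments.

```python
def strip_command(text, name="edit"):
    """Remove \\name{...} wrappers and keep their contents (hypothetical helper)."""
    opener = "\\" + name + "{"
    out = []
    stack = []  # one flag per open "{": True if it belongs to a removed command
    i = 0
    while i < len(text):
        if text.startswith(opener, i):
            stack.append(True)           # drop "\name{" and remember its brace
            i += len(opener)
        elif text[i] == "\\" and i + 1 < len(text):
            out.append(text[i:i + 2])    # copy escapes such as \{ \} \\ unchanged
            i += 2
        elif text[i] == "{":
            stack.append(False)          # ordinary group: keep both braces
            out.append("{")
            i += 1
        elif text[i] == "}" and stack:
            if not stack.pop():          # closes an ordinary group: keep it
                out.append("}")
            i += 1
        else:
            out.append(text[i])          # plain text, or a stray "}"
            i += 1
    return "".join(out)

print(strip_command(r"Some \edit{\somecmd{\emph{nested} commands} here}."))
```

Keeping one flag per open brace is what lets nested \edit commands and ordinary groups be told apart in a single pass.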

Votes: 0

Original page content provided by Unix & Linux.

Original link:

https://unix.stackexchange.com/questions/373772
