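Rename files with a specific extension in a directory to another name using a match in bash

Stack Overflow user
Asked on 2016-03-04 21:23:40
2 answers · 125 views · 0 followers · 2 votes

I have a specific set of downloaded files (all ending in .bam) in the directory /home/cmccabe/Desktop/NGS/API/2-15-2016. What I am trying to do is rename the downloaded files using the match to $2 in name. To make things more involved, the folder's date is unique: the matching date appears as a header in name, and the entries to match are listed under that header. I am not sure how to do this, or whether it is possible. Thank you :).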

Contents of the folder /home/cmccabe/Desktop/NGS/API/2-15-2016

IonXpress_001.bam
IonXpress_002.bam
IonXpress_003.bam
IonXpress_007.bam
file1.gz
file2.gz

name

2-15-2016
IonXpress_001.bam testname1_12345
IonXpress_002.bam testname2_45678
IonXpress_003.bam testname3_9012
IonXpress_007.bam testname1_12345-
2-19-2016
IonXpress_001.bam testname5_00000
IonXpress_002.bam testname6_11111
IonXpress_003.bam testname7_1213
IonXpress_007.bam testname8_78524

Desired result

testname1_12345.bam
testname2_45678.bam
testname3_9012.bam
testname1_12345.bam
file1.gz
file2.gz

bash so far

logfile=/home/cmccabe/Desktop/NGS/API/2-15-2016/process.log
for f in /home/cmccabe/Desktop/NGS/API/2-15-2016/*.bam ; do
echo "patient identifier creation: $(date) - File: $f"
bname=$(basename $f)
pref=${bname%%.bam}
while read from to ; do
for i in $f* ; do
if [ "$i" != "${i/$from/$to}" ] ; then
  mv $i ${i/$from/$to}
fi
done < names.txt
echo "End patient identifier creation: $(date) - File: $f"
done >> "$logfile"

Edit:

for f in /home/cmccabe/Desktop/NGS/API/2-12-2016/*.bam ; do
  bname=$(basename $f)
  cmd=$(sed -n "/$f/,/[0-9]{1,2}-[0-9]{1,2}-20[0-9]{2}/{s/\(.*\.bam\) \(.*\)/mv \1 \2/p}" /home/cmccabe/Desktop/NGS/panels/names.txt)
  echo "$cmd"
done
sed: -e expression #1, char 4: extra characters after command

2 Answers

Stack Overflow user

Accepted answer

Posted on 2016-03-04 22:33:09

You can use this for loop together with awk.

cd /home/cmccabe/Desktop/NGS/

for file in API/*/*.bam; do
   f="${file##*/}"
   path="${file%/*}"
   dt="${path##*/}"
   mv "$file" "$path/$(awk -v dt="$dt" -v f="$f" 'NF==1 {
               p=$0==dt ? 1 : 0; next} p && $1==f{print $2}' names.txt)"
done
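
A more defensive variant of the loop above might look like the following sketch. It assumes the same layout (dated folders under API/ and a names file grouped under date headers). The function name rename_bams and its two arguments are hypothetical, not part of the original answer. Unlike the loop above, it skips files that have no entry in the names file, and it appends .bam to the new name so the result matches the desired output in the question.

```shell
#!/usr/bin/env bash
# Sketch: look up each .bam file's new name in a names file that is
# grouped under date headers, then rename the file inside its own folder.
# rename_bams and its arguments are hypothetical names for this example.
rename_bams() {
    local root="$1" names="$2" file f path dt new
    for file in "$root"/*/*.bam; do
        [ -e "$file" ] || continue          # the glob matched nothing
        f=${file##*/}                       # file name, e.g. IonXpress_001.bam
        path=${file%/*}                     # folder that contains the file
        dt=${path##*/}                      # folder name, e.g. 2-15-2016
        # A one-field line is a date header; p records whether it is ours.
        # Inside our section, print the second field of the first match.
        new=$(awk -v dt="$dt" -v f="$f" '
            NF == 1 { p = ($0 == dt); next }
            p && $1 == f { print $2; exit }' "$names")
        if [ -z "$new" ]; then
            echo "no entry for $f under $dt, skipping" >&2
            continue
        fi
        mv -n -- "$file" "$path/$new.bam"   # -n: never overwrite an existing file
    done
}
```

Run from /home/cmccabe/Desktop/NGS/, the call would be `rename_bams API names.txt`, mirroring the paths used in the loop above. Because of `mv -n`, two entries that map to the same new name will not overwrite each other; the second file simply keeps its old name.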
Votes: 2

Stack Overflow user

Posted on 2016-03-04 21:55:24

You can do it like this; I used the f variable in sed:

 cmd=$(sed -n "/$f/,/[0-9]{1,2}-[0-9]{1,2}-20[0-9]{2}/{s/\(.*\.bam\) \(.*\)/mv \1 \2/p}" names.txt)
 # for testing use echo and this will also save what you just tried 
 #to do to your log file :) just in case.
 echo "$cmd"
 # when it works the way you want
 # uncomment the next line and it will execute your command :)
 #eval "$cmd"

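The -n option tells sed not to print the lines it reads.

Then, for the lines from the matching date ($f) up to the next date pattern (DD-DD-20DD, regex: [0-9]{1,2}-[0-9]{1,2}-20[0-9]{2}), it runs the commands between { and }.

The command inside {} is a substitution ("s") command, which matches one pattern and replaces it with another.

I tell it to take the string all the way up to .bam and put it in a group, between \( and \), then match the rest of the line and put that in a second group.

The replacement pattern is the string mv, followed by group 1 captured in the match pattern, then the string in group 2, which effectively creates a list of mv file.bam new_filename commands.

These are then stored in the cmd variable.

eval will execute the commands...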
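
Note that the start address of the range has to be the date header itself. In the loop from the question, f holds a full path, and the slashes in that path end the /regex/ address early, which is consistent with the "extra characters after command" error shown in the question's edit. A minimal sketch of deriving the date from a path with parameter expansion follows; the helper name date_of_path is hypothetical.

```shell
# Sketch: derive the date header from a file path using parameter expansion.
# date_of_path is a hypothetical helper name.
date_of_path() {
    local dir="${1%/*}"            # strip the file name, leaving the folder path
    printf '%s\n' "${dir##*/}"     # keep only the last path component
}

date_of_path /home/cmccabe/Desktop/NGS/API/2-15-2016/IonXpress_001.bam
# prints: 2-15-2016
```

The printed value contains no slashes, so it can be used as the first address of the sed range.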

I took the sample contents of your name.txt file and ran the transformation on them to illustrate:

 ~$echo "2-12-2016
 IonXpress_001.bam testname1_12345
 IonXpress_002.bam testname2_45678
 IonXpress_003.bam testname3_9012
 IonXpress_007.bam testname1_12345-
 2-19-2016
 IonXpress_001.bam testname5_00000
 IonXpress_002.bam testname6_11111
 IonXpress_003.bam testname7_1213
 IonXpress_007.bam testname8_78524" |sed -n "/$f/,/[0-9]{1,2}-[0-9]{1,2}-20[0-9]{2}/{s/\(.*\.bam\) \(.*\)/mv \1 \2/p}"
 mv IonXpress_001.bam testname1_12345
 mv IonXpress_002.bam testname2_45678
 mv IonXpress_003.bam testname3_9012
 mv IonXpress_007.bam testname1_12345-
 mv IonXpress_001.bam testname5_00000
 mv IonXpress_002.bam testname6_11111
 mv IonXpress_003.bam testname7_1213
 mv IonXpress_007.bam testname8_78524
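
The sample output above lists all eight entries, including those under 2-19-2016. That appears to be because sed uses basic regular expressions by default, where {1,2} is matched as literal characters, so the closing address never matches and the range runs to the end of the input. The sketch below uses `sed -E` (extended regular expressions) so that the range stops at the next date header. The function name gen_mv_commands is hypothetical, and the anchors ^ and $ are an addition that is not in the original command.

```shell
# Sketch: emit one "mv OLD NEW" line for each entry listed under a date
# header, stopping at the next header. gen_mv_commands is a hypothetical name.
gen_mv_commands() {
    local dt="$1" names="$2"
    # -E enables extended regular expressions, where {1,2} is a repetition
    # count and ( ) form capture groups without backslashes.
    sed -n -E "/^${dt}\$/,/^[0-9]{1,2}-[0-9]{1,2}-20[0-9]{2}\$/{
        s/^(.*\.bam) (.*)\$/mv \1 \2/p
    }" "$names"
}
```

Calling `gen_mv_commands 2-15-2016 names.txt` would then print only the mv lines for the entries under the 2-15-2016 header. The date header lines themselves are never printed, because they do not match the substitution pattern.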

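UPDATE: From your comments and your edit I can see that I am not very good at explaining :) Here is an edited version of your script. I will assume that, when you run it, you are in the folder to process under /home/cmccabe/Desktop/NGS/API/ (for example 2-15-2016). If not, I am sure you will know how to change it or pass the folder as an argument.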

 logfile=/home/cmccabe/Desktop/NGS/API/2-15-2016/process.log
 # no need to loop over each file ending in bam, as the name file
 # will be our driver. After all, if the entry is not present in
 # the name file then we really cannot do anything.

 # First let's get the date from the folder name:
 #    pwd returns the current working directory (we are supposed
 #        to be in the directory to process)
 #    basename strips all but the last folder name, hence the date
 date_to_process=$(basename "$(pwd)")

 # variable to store the name file path (hint: change this to where it really is, or pass it as an argument to the script)
 # note: a shell assignment must not have spaces around "="
 name_file_path="/home/cmccabe/Desktop/NGS/panels/names.txt"

 # from the name file, build the file move (mv) commands using sed
 # as described before and store them in the cmd variable.
 # note that I added a couple of echo commands to produce the same output you
 # were trying to get. I also split the command over multiple lines
 # for clarity (well, I hope it makes it clearer at least).
 # the braces are escaped as \{ \} so that sed's basic regex treats them as repetition counts.
 cmd=$(sed -n "/$date_to_process/,/[0-9]\{1,2\}-[0-9]\{1,2\}-20[0-9]\{2\}/{
    s/\(.*\.bam\) \(.*\)/echo \"Start patient identifier creation: \$(date) - File: \1\"\n mv \1 \2\n echo \"End patient identifier creation: \$(date) - File: \1\"/p
 }" "$name_file_path")

 # print the generated commands so you can see what it did.
 echo "about to execute this command: 
 $cmd" 

 # execute the commands to perform the move operations and send the 
 # output to the log file. Make sure to redirect stderr (errors) to the log file 
 # too, so you will know if something failed and what. (2>&1 sends stderr to the same place as stdout.) 
 eval "$cmd" >> "$logfile" 2>&1
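
If building a command string and passing it to eval feels risky, the same names-file-driven idea can be sketched without eval. This is only an illustration under stated assumptions: the function name rename_from_names and its three arguments are hypothetical, the names file is assumed to hold one-field date headers followed by two-field "old new" lines, and .bam is appended to the new name to match the desired result in the question.

```shell
# Sketch: names-file-driven rename for one dated folder, without eval.
# rename_from_names and its arguments are hypothetical names.
rename_from_names() {
    local dir="$1" names="$2" logfile="$3" dt old new
    dt=$(basename "$dir")               # folder name doubles as the date header
    # Select the "old new" pairs listed under this folder's date header.
    awk -v dt="$dt" 'NF == 1 { p = ($0 == dt); next } p && NF == 2' "$names" |
    while read -r old new; do
        echo "Start patient identifier creation: $(date) - File: $old"
        mv -- "$dir/$old" "$dir/$new.bam"
        echo "End patient identifier creation: $(date) - File: $old"
    done >> "$logfile" 2>&1             # log messages and mv errors together
}
```

Because each pair is read into variables and passed to mv as separate arguments, nothing in the names file is ever interpreted as shell code.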
Votes: 1
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/35806287
