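I am trying to print only a predefined sequence of ATOM names, but I am not getting the expected output. I want to print the lines of the input file in the order shown in the expected output below. The chain ID can be anything from A to H.
Code: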
my $OutputDir = 'C:\test_result_file';
open my $dir, "Document1.txt" or die "Failed to open Document1.txt:$!";
chomp(my @files = <$dir>);
foreach my $file (@files) {
    my $win_len = 4;
    my @window = ();
    my $prev_chain = "";
    open my $input, $file or die "failed to open $file: $!\n";
    open my $output, '>', "$OutputDir/$file" or die "failed to open $OutputDir/$file.pdb: $!\n";
    while (<$input>) {
        my ($atom_name, $chain) = (split)[2, 4];
        next unless $atom_name =~ /\b(?:C4B|O4B|C1B|C2B|O4B|C1B|C2B|C3B|C1B|C2B|C3B|C4B|C2B|C3B|C4B|O4B|C3B|C4B|O4B|C1B)\b/;
        if ($chain eq $prev_chain) {
            if (@window == $win_len) {
                print_window($output, @window);
                shift @window;
            }
            push @window, $_;
        } else {
            print_window($output, @window) if @window;
            @window = ($_);
            $prev_chain = $chain;
        }
    }
    print_window($output, @window) if @window;
}
sub print_window {
    my $fh = shift;
    print $fh $_ foreach @_;
    print $fh "\n";
}
Input file:
HETATM10910 C4B NAD A 363 60.856 -58.575 149.282 1.00 40.44 C
HETATM10911 O4B NAD A 363 61.320 -59.488 148.275 1.00 43.48 O
HETATM10912 C3B NAD A 363 60.243 -57.426 148.473 1.00 40.37 C
HETATM10914 C2B NAD A 363 60.167 -57.970 147.054 1.00 40.90 C
HETATM10916 C1B NAD A 363 61.394 -58.766 147.056 1.00 43.29 C
HETATM10954 C4B NAD B 363 41.496 -54.407 140.932 1.00 39.26 C
HETATM10955 O4B NAD B 363 41.936 -54.715 139.568 1.00 41.96 O
HETATM10956 C3B NAD B 363 42.061 -55.476 141.894 1.00 37.13 C
HETATM10958 C2B NAD B 363 42.883 -56.336 140.942 1.00 38.13 C
HETATM10960 C1B NAD B 363 42.233 -56.127 139.593 1.00 42.92 C
Expected output:
Chain A:
HETATM10910 C4B NAD A 363 60.856 -58.575 149.282 1.00 40.44 C
HETATM10911 O4B NAD A 363 61.320 -59.488 148.275 1.00 43.48 O
HETATM10916 C1B NAD A 363 61.394 -58.766 147.056 1.00 43.29 C
HETATM10914 C2B NAD A 363 60.167 -57.970 147.054 1.00 40.90 C
HETATM10911 O4B NAD A 363 61.320 -59.488 148.275 1.00 43.48 O
HETATM10916 C1B NAD A 363 61.394 -58.766 147.056 1.00 43.29 C
HETATM10914 C2B NAD A 363 60.167 -57.970 147.054 1.00 40.90 C
HETATM10912 C3B NAD A 363 60.243 -57.426 148.473 1.00 40.37 C
HETATM10916 C1B NAD A 363 61.394 -58.766 147.056 1.00 43.29 C
HETATM10914 C2B NAD A 363 60.167 -57.970 147.054 1.00 40.90 C
HETATM10912 C3B NAD A 363 60.243 -57.426 148.473 1.00 40.37 C
HETATM10910 C4B NAD A 363 60.856 -58.575 149.282 1.00 40.44 C
HETATM10914 C2B NAD A 363 60.167 -57.970 147.054 1.00 40.90 C
HETATM10912 C3B NAD A 363 60.243 -57.426 148.473 1.00 40.37 C
HETATM10910 C4B NAD A 363 60.856 -58.575 149.282 1.00 40.44 C
HETATM10911 O4B NAD A 363 61.320 -59.488 148.275 1.00 43.48 O
HETATM10912 C3B NAD A 363 60.243 -57.426 148.473 1.00 40.37 C
HETATM10910 C4B NAD A 363 60.856 -58.575 149.282 1.00 40.44 C
HETATM10911 O4B NAD A 363 61.320 -59.488 148.275 1.00 43.48 O
HETATM10916 C1B NAD A 363 61.394 -58.766 147.056 1.00 43.29 C
Chain B:
HETATM10954 C4B NAD B 363 41.496 -54.407 140.932 1.00 39.26 C
HETATM10955 O4B NAD B 363 41.936 -54.715 139.568 1.00 41.96 O
HETATM10960 C1B NAD B 363 42.233 -56.127 139.593 1.00 42.92 C
HETATM10958 C2B NAD B 363 42.883 -56.336 140.942 1.00 38.13 C
HETATM10955 O4B NAD B 363 41.936 -54.715 139.568 1.00 41.96 O
HETATM10960 C1B NAD B 363 42.233 -56.127 139.593 1.00 42.92 C
HETATM10958 C2B NAD B 363 42.883 -56.336 140.942 1.00 38.13 C
HETATM10956 C3B NAD B 363 42.061 -55.476 141.894 1.00 37.13 C
HETATM10960 C1B NAD B 363 42.233 -56.127 139.593 1.00 42.92 C
HETATM10958 C2B NAD B 363 42.883 -56.336 140.942 1.00 38.13 C
HETATM10956 C3B NAD B 363 42.061 -55.476 141.894 1.00 37.13 C
HETATM10954 C4B NAD B 363 41.496 -54.407 140.932 1.00 39.26 C
HETATM10958 C2B NAD B 363 42.883 -56.336 140.942 1.00 38.13 C
HETATM10956 C3B NAD B 363 42.061 -55.476 141.894 1.00 37.13 C
HETATM10954 C4B NAD B 363 41.496 -54.407 140.932 1.00 39.26 C
HETATM10955 O4B NAD B 363 41.936 -54.715 139.568 1.00 41.96 O
HETATM10956 C3B NAD B 363 42.061 -55.476 141.894 1.00 37.13 C
HETATM10954 C4B NAD B 363 41.496 -54.407 140.932 1.00 39.26 C
HETATM10955 O4B NAD B 363 41.936 -54.715 139.568 1.00 41.96 O
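HETATM10960 C1B NAD B 363 42.233 -56.127 139.593 1.00 42.92 C
Description: I want to sort the HETATM records by predefined atom names (for example C4B, O4B, C1B, C2B, and so on). The script above is what I have so far, so could anyone please help me solve this problem? My current script gives me output in the same format, but not the expected result.
I do not want separate files for chain A, chain B, or any other chain ID. I want to sort the atom names according to my own (predefined) order.
My order is: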
C4B-O4B-C1B-C2B
O4B-C1B-C2B-C3B
C1B-C2B-C3B-C4B
C2B-C3B-C4B-O4B
C3B-C4B-O4B-C1B
e.g., first row: C4B
HETATM10910 C4B NAD A 363 60.856 -58.575 149.282 1.00 40.44 C
Second row: O4B
HETATM10911 O4B NAD A 363 61.320 -59.488 148.275 1.00 43.48 O
Third Row: C1B
HETATM10916 C1B NAD A 363 61.394 -58.766 147.056 1.00 43.29 C
Fourth Row: C2B
HETATM10914 C2B NAD A 363 60.167 -57.970 147.054 1.00 40.90 C
Fifth Row: O4B
HETATM10911 O4B NAD A 363 61.320 -59.488 148.275 1.00 43.48 O
Sixth Row: C1B
HETATM10916 C1B NAD A 363 61.394 -58.766 147.056 1.00 43.29 C
Seventh Row: C2B
HETATM10914 C2B NAD A 363 60.167 -57.970 147.054 1.00 40.90 C
Eighth Row: C3B
HETATM10912 C3B NAD A 363 60.243 -57.426 148.473 1.00 40.37 C
.
.
.
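and so on. The same format applies to chain B and the other chains.
This means each line has to be used several times. All of the atom names above should be present in the input file, for every chain. The lines for all of those atom names need to be copied and then pasted in the order given above.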
Answered on 2016-10-12 16:22:07
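It looks to me like your error comes from the following line:
my ($atom_name, $chain) = (split)[2, 4];
This puts the third column into $atom_name and the fifth column into $chain. For the first line of your input that gives $atom_name = NAD and $chain = 363, so the atom-name pattern never matches and every line is skipped.
I guess you want:
my ($atom_name, $chain) = (split)[1, 3];
For the first line, you will then get:
$atom_name = C4B and $chain = A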
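Fixing the indices lets the lines through the filter, but the sliding window in the question would still print them in file order rather than in the requested order. The five requested orderings are exactly the five windows of length four over the cyclic sequence C4B, O4B, C1B, C2B, C3B, so one option is to store the lines per chain and atom name and then emit them window by window. The following is only a minimal sketch under some assumptions: each chain contains one copy of each of the five atoms, the fields are whitespace-separated as in the sample input, and the names reorder_lines and @ring are illustrative and not part of the original script.

```perl
use strict;
use warnings;

# Ring atoms in cyclic order (assumed from the orderings in the question).
# Every run of four consecutive names, wrapping around, is one requested window.
my @ring    = qw(C4B O4B C1B C2B C3B);
my $win_len = 4;

# Hypothetical helper: takes PDB-style lines and returns them reordered,
# chain by chain, in the order the chains first appear in the input.
sub reorder_lines {
    my @lines = @_;
    my (%by_chain, @chain_order);
    for my $line (@lines) {
        # Whitespace split: field 1 is the atom name, field 3 the chain ID.
        my ($atom_name, $chain) = (split ' ', $line)[1, 3];
        next unless defined $chain;
        next unless grep { $_ eq $atom_name } @ring;
        push @chain_order, $chain unless exists $by_chain{$chain};
        $by_chain{$chain}{$atom_name} = $line;
    }
    my $n = @ring;
    my @out;
    for my $chain (@chain_order) {
        my $atoms = $by_chain{$chain};
        # Skip chains that are missing any of the five ring atoms.
        next if grep { !exists $atoms->{$_} } @ring;
        for my $start (0 .. $n - 1) {
            for my $offset (0 .. $win_len - 1) {
                push @out, $atoms->{ $ring[ ($start + $offset) % $n ] };
            }
        }
    }
    return @out;
}

# Demo with the chain A lines from the question, in file order.
my @input = map { "$_\n" } (
    'HETATM10910 C4B NAD A 363 60.856 -58.575 149.282 1.00 40.44 C',
    'HETATM10911 O4B NAD A 363 61.320 -59.488 148.275 1.00 43.48 O',
    'HETATM10912 C3B NAD A 363 60.243 -57.426 148.473 1.00 40.37 C',
    'HETATM10914 C2B NAD A 363 60.167 -57.970 147.054 1.00 40.90 C',
    'HETATM10916 C1B NAD A 363 61.394 -58.766 147.056 1.00 43.29 C',
);
print reorder_lines(@input);
```

In the original script, the same idea would mean collecting the lines of each file inside the while loop and printing the reordered list to $output afterwards. Note that PDB files are fixed-column, so in files where the serial number is separated from HETATM by a space, the whitespace-split indices would shift by one; reading the fields by column position may be more robust in that case.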
https://stackoverflow.com/questions/39972953