一段截取序列的perl代码

发布时间:2020-10-23编辑:脚本学堂
一段截取序列的perl代码,供大家学习参考。
复制代码 代码如下:#!/usr/bin/perluse Bio::Seq;
use Bio::SeqIO;
use Bio::DB::Fasta;die "Usage: input the LOCI files!

一段截取序列的perl代码,供大家学习参考。
 

复制代码 代码如下:

#!/usr/bin/perl

use Bio::Seq;
use Bio::SeqIO;
use Bio::DB::Fasta;

die "Usage: input the LOCI files!" if @ARGV<1;
open(LOCI, "$ARGV[0]") || die "can't to open the LOCI files!";
while(<LOCI>)
{
my @loci = split(/s+/,$_);
my $q_file1 = $loci[0];
my $q_file = ">".$q_file1; #id有“>”号,要加上
my $q_begin = $loci[1];
my $q_end = $loci[2];
# print "$q_filen";
my $seqin = Bio::SeqIO -> new (-file =>"seq.fasta",
-format => 'Fasta');
open OUT, '>', "out.fasta";
while (my $seqobj = $seqin -> next_seq()) {
my $db = Bio::DB::Fasta -> new($seqin);
my $seq_all = $db -> get_Seq_by_id($seqin);
my $name = $seqobj -> $q_file;
my $seq = $seqobj -> subseq($q_begin, $q_end);
print OUT ">$namen$seqn";
}
close OUT;
}
close LOCI;

用的模块没问题,但是你刻意加上fasta格式序列名前面的'>'恰恰是多余的。'>'是fasta格式的固有标识,不包括在序列名里面。

只写open loci文件之后的部分,最简单的方法:
 

复制代码 代码如下:

my $db = Bio::DB::Fasta -> new('seq.fasta');
my $out = Bio::SeqIO -> new (-format => 'Fasta', -file => ">out.fasta"); # 这里的'>'是IO输出方向

while (<LOCI>) {
chomp;
my @loci = split;
my $seq = $db -> subseq ($loci[0], $loci[1], $loci[2]);
my $obj = Bio::Seq -> new ( -id = $loci[0],
-seq = $seq
);
$out -> write_seq($obj);
}