One2three - Perl script. One-letter to three-letter protein sequences

From NMR Wiki


Save the script below as one2three and make it executable (chmod +x one2three).

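The script converts a one-letter amino acid sequence into three-letter codes, one residue per line, preceded by a dummy chain header (>A). This is useful for preparing input for the QUEEN software.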

Usage:

one2three file > output_file

Script

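#!/usr/bin/perl
use strict;
use warnings;
use File::Basename;

# Three-letter => one-letter codes for the 20 standard amino acids
my %AA = (ALA=>'A',TYR=>'Y',MET=>'M',LEU=>'L',CYS=>'C',GLY=>'G',
         ARG=>'R',ASN=>'N',ASP=>'D',GLN=>'Q',GLU=>'E',HIS=>'H',TRP=>'W',
         LYS=>'K',PHE=>'F',PRO=>'P',SER=>'S',THR=>'T',ILE=>'I',VAL=>'V');

# Inverted table: one-letter => three-letter
my %aa = reverse %AA;

if (scalar @ARGV == 0)
{
        print "\n\tconvert one-letter amino acid sequence file to three-letter with one residue per line\n";
        print "\t",basename($0), " <one-letter seq file>\n";
        print "\twhitespace and multiline files are allowed\n\n";
        exit 1;
}

my $file = $ARGV[0];
open my $fh, '<', $file or die "Cannot open $file: $!";
my @lines = <$fh>;
close $fh;
chomp @lines;

# Join all lines, strip whitespace, and split into single uppercase letters
my $let = join('',@lines);
$let =~ s/\s//g;
my @residues = split //, uc $let;

print ">A\n"; #use dummy chain name
foreach my $res (@residues)
{
        # Stop on letters outside the table instead of printing an empty line
        die "Unknown one-letter code '$res' in $file\n" unless exists $aa{$res};
        print $aa{$res}, "\n";
}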
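
The same conversion can also be sketched as a subroutine for use inside other Perl code. This is a minimal sketch and not part of the original script: the name one_to_three is hypothetical, and mapping unrecognized letters to 'UNK' is an assumption made for illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# One-letter => three-letter codes for the 20 standard amino acids
my %ONE2THREE = (
    A=>'ALA', R=>'ARG', N=>'ASN', D=>'ASP', C=>'CYS', Q=>'GLN', E=>'GLU',
    G=>'GLY', H=>'HIS', I=>'ILE', L=>'LEU', K=>'LYS', M=>'MET', F=>'PHE',
    P=>'PRO', S=>'SER', T=>'THR', W=>'TRP', Y=>'TYR', V=>'VAL',
);

# one_to_three (hypothetical helper): takes a one-letter sequence string,
# ignores whitespace and letter case, and returns a list of three-letter codes.
# Assumption: letters outside the table are reported as 'UNK'.
sub one_to_three {
    my ($seq) = @_;
    $seq =~ s/\s+//g;
    my @codes;
    foreach my $letter (split //, uc $seq) {
        push @codes, exists $ONE2THREE{$letter} ? $ONE2THREE{$letter} : 'UNK';
    }
    return @codes;
}

# Example: whitespace, line breaks and lowercase input are all accepted
print join(' ', one_to_three("mk T\nay")), "\n";   # prints "MET LYS THR ALA TYR"
```

Returning a list rather than printing directly leaves the output layout (one residue per line, chain header, and so on) to the caller.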