Protein Sequence Alignment
• Sequence alignment is the procedure of comparing two (pairwise
alignment) or more (multiple sequence alignment) sequences by
searching for a series of individual characters or character
patterns that are in the same order in the sequences.
• Two sequences are aligned in two rows.
• Identical or similar characters are placed in the same column,
and nonidentical characters can either be placed in the same
column as a mismatch or opposite a gap in the other sequence.
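The two-row layout described above can be sketched in a few lines of Python. This is only an illustration: the function name `classify_columns`, the `-` gap symbol, and the two example rows are assumptions, and the sketch checks identity only (it does not score "similar" residues).

```python
def classify_columns(row_a, row_b, gap="-"):
    """Label each column of a two-row alignment as match, mismatch, or gap."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    labels = []
    for a, b in zip(row_a, row_b):
        if a == gap or b == gap:
            labels.append("gap")        # residue sits opposite a gap
        elif a == b:
            labels.append("match")      # identical characters in one column
        else:
            labels.append("mismatch")   # different characters in one column
    return labels


# Hypothetical aligned rows, written one above the other.
row_a = "HEAGAWGHE-E"
row_b = "--P-AW-HEAE"
labels = classify_columns(row_a, row_b)

print(row_a)
print(row_b)
print("".join({"match": "|", "mismatch": ".", "gap": " "}[x] for x in labels))
```

Each position in `labels` corresponds to one column of the alignment, so counting the labels gives the number of matches, mismatches, and gaps.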
Global / Local Alignment
• There are two types of sequence alignment:
• GLOBAL SEQUENCE ALIGNMENT
• Global alignments, which attempt to align every residue in every
sequence, are most useful when the sequences in the query set are
similar and of roughly equal size.
• LOCAL SEQUENCE ALIGNMENT
• In local alignments, stretches of sequence with the highest density of
matches are aligned, thus generating one or more islands of matches or
subalignments in the aligned sequences.
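The difference between the two types can be illustrated with a small dynamic-programming sketch. It follows the general shape of the Needleman-Wunsch (global) and Smith-Waterman (local) recurrences, but it is a simplified illustration only: the function name `alignment_score`, the scoring values (match +1, mismatch -1, gap -1), and the two test sequences are assumptions, and only the score is returned, not the alignment itself.

```python
def alignment_score(a, b, local=False, match=1, mismatch=-1, gap=-1):
    """Return the best global (default) or local alignment score of a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    if not local:
        # Global: leading gaps are penalised, so every residue is accounted for.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap

    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            cell = max(diag, up, left)
            if local:
                # Local: a negative running score is reset to zero, so an
                # alignment may start and end anywhere in either sequence.
                cell = max(cell, 0)
                best = max(best, cell)
            score[i][j] = cell

    return best if local else score[-1][-1]


# Hypothetical sequences that share only the island "ACDEF".
seq_a = "GGGGACDEF"
seq_b = "ACDEFTTTT"
global_score = alignment_score(seq_a, seq_b)
local_score = alignment_score(seq_a, seq_b, local=True)
print("global:", global_score, "local:", local_score)
```

With these assumed scores, the local score reflects only the shared island, while the global score is pulled down by the unmatched ends that must still be aligned.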
Pairwise alignment
• Pairwise sequence alignment methods are used to find the best-
matching piecewise (local or global) alignments of two query
sequences.
• Pairwise alignments can only be used between two sequences at
a time, but they are efficient to calculate and are often used for
methods that do not require extreme precision (such as searching
a database for sequences with high similarity to a query).
Dot-matrix methods
Dynamic programming
Word methods
PAM and BLOSUM Matrix
• PAM matrices were introduced by Margaret Dayhoff in 1978.
• The calculation of these matrices was based on 1572 observed
mutations in the phylogenetic trees of 71 families of closely related
proteins.
• The proteins to be studied were selected on the basis of having high
similarity with their predecessors.
• The protein alignments included were required to display at least 85%
identity. As a result, it is reasonable to assume that any aligned
mismatches were the result of a single mutation event, rather than
several at the same location.
• BLOSUM matrices are obtained by using blocks of similar amino acid
sequences as data, then applying statistical methods to the data to
obtain the similarity scores.
https://en.wikipedia.org/wiki/Point_accepted_mutation, https://en.wikipedia.org/wiki/BLOSUM
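As a rough illustration of the "statistical methods" step for BLOSUM, the sketch below turns residue-pair counts into log-odds scores: the observed frequency of a pair is compared with the frequency expected by chance and the base-2 logarithm is scaled to half-bit units. The function name `log_odds_scores` and the toy counts are assumptions made for this example; real BLOSUM matrices are built from large collections of aligned blocks with sequence clustering, which is not shown here.

```python
import math


def log_odds_scores(pair_counts):
    """Half-bit log-odds scores from counts of unordered residue pairs."""
    total = sum(pair_counts.values())
    observed = {pair: n / total for pair, n in pair_counts.items()}

    # Background frequency of each residue: a mixed pair contributes
    # half of its frequency to each of its two residues.
    background = {}
    for (a, b), freq in observed.items():
        if a == b:
            background[a] = background.get(a, 0.0) + freq
        else:
            background[a] = background.get(a, 0.0) + freq / 2
            background[b] = background.get(b, 0.0) + freq / 2

    scores = {}
    for (a, b), freq in observed.items():
        # Chance expectation; a mixed pair can occur in two orders.
        expected = background[a] * background[b]
        if a != b:
            expected *= 2
        scores[(a, b)] = round(2 * math.log2(freq / expected))
    return scores


# Hypothetical counts for a two-letter alphabet (not real data).
toy_counts = {("A", "A"): 60, ("L", "L"): 30, ("A", "L"): 10}
scores = log_odds_scores(toy_counts)
print(scores)
```

Pairs seen more often than chance predicts receive positive scores, and pairs seen less often receive negative scores.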
PAM and BLOSUM Matrix
• PAM matrices with lower numbers are used to compare closely
related sequences; PAM matrices with higher numbers are used
to compare distantly related proteins. Higher numbers in the
PAM naming scheme denote larger evolutionary distance.
• BLOSUM matrices with higher numbers are used to compare
closely related sequences; BLOSUM matrices with lower
numbers are used to compare distantly related proteins. Larger
numbers in the BLOSUM naming scheme denote higher sequence
similarity and therefore smaller evolutionary distance.
https://en.wikipedia.org/wiki/Point_accepted_mutation, https://en.wikipedia.org/wiki/BLOSUM
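The PAM numbering can be made concrete with a toy example. In Dayhoff's scheme the PAMn mutation probability matrix is obtained by multiplying the PAM1 matrix by itself n times, so a higher number corresponds to more accumulated change. The sketch below uses a made-up 3-letter alphabet; the matrix `toy_pam1` and the helper names are assumptions for illustration, not real PAM data, and the conversion to log-odds scores is not shown.

```python
def mat_mul(x, y):
    """Multiply two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(m, power):
    """Raise a square matrix to a non-negative integer power."""
    size = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    base = m
    while power > 0:
        if power & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        power >>= 1
    return result


# Hypothetical "PAM1-like" matrix: each row sums to 1 and each residue
# has a 99% chance of staying unchanged over one unit of distance.
toy_pam1 = [
    [0.990, 0.007, 0.003],
    [0.006, 0.990, 0.004],
    [0.002, 0.008, 0.990],
]

toy_pam30 = mat_pow(toy_pam1, 30)
toy_pam250 = mat_pow(toy_pam1, 250)

for name, m in [("PAM1", toy_pam1), ("PAM30", toy_pam30), ("PAM250", toy_pam250)]:
    print(name, [round(m[i][i], 3) for i in range(3)])
```

The printed diagonal entries (the probability that a residue is unchanged) shrink as the number grows, which is why higher-numbered PAM matrices suit more distantly related proteins.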