Update: 08.04.2017

A Java tool for structure-based amino acid sequence alignments


SBAL (Structure Based Amino Acid Sequence Alignment) is intended for multiple protein sequence alignments guided by secondary structure elements. The program provides automatic and semi-automatic alignment features, and also possesses manual editing capabilities.
Sequences can be provided as individual PSIPRED output files, FASTA files, DSSP output files or PDB files. From these, SBAL calculates a multiple sequence alignment using a position-specific global alignment algorithm that accounts for both structural and sequence homology.
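To illustrate what a global alignment that weighs both sequence and secondary structure can look like, here is a minimal pairwise sketch in Python. It is not SBAL's algorithm: the Needleman-Wunsch recursion, the function name align_with_structure and all score values (match, mismatch, ss_bonus, gap) are assumptions chosen for illustration only.

```python
def align_with_structure(seq_a, ss_a, seq_b, ss_b,
                         match=2, mismatch=-1, ss_bonus=1, gap=-2):
    """Globally align two sequences; ss_a/ss_b hold one structure code per residue.

    Illustrative scoring only (assumed values, not taken from SBAL).
    """
    n, m = len(seq_a), len(seq_b)

    def pair_score(i, j):
        # Sequence term plus a bonus when the secondary structure codes agree.
        score = match if seq_a[i] == seq_b[j] else mismatch
        if ss_a[i] == ss_b[j]:
            score += ss_bonus
        return score

    # Fill the dynamic programming matrix (Needleman-Wunsch, linear gap cost).
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        table[i][0] = i * gap
    for j in range(1, m + 1):
        table[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            table[i][j] = max(
                table[i - 1][j - 1] + pair_score(i - 1, j - 1),
                table[i - 1][j] + gap,
                table[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and table[i][j] == table[i - 1][j - 1] + pair_score(i - 1, j - 1):
            out_a.append(seq_a[i - 1])
            out_b.append(seq_b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and table[i][j] == table[i - 1][j] + gap:
            out_a.append(seq_a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(seq_b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))
```

Calling align_with_structure("ACDE", "HHHH", "ACE", "HHH") places a single gap opposite the D. A real multiple alignment would additionally need a substitution matrix, position-specific gap penalties and a strategy for combining more than two sequences.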

Sequence import
Amino acid sequences can be imported from the following data formats:
Alignment tools
Analysis tools
SBAL alignments can be saved in three different formats:

Examples illustrating common applications for SBAL:
I have output files with secondary structure prediction results from PSIPRED
1. You can use PSIPRED output files in horizontal format (file extension horiz) or vertical format (file extension ss2).
2. Put all individual PSIPRED files to be analysed into one directory.
3. PSIPRED files do not include titles for individual sequences. If you wish to have an individual title for each sequence, you need to provide an individual FASTA file with the same root name for each sequence. For example, to combine the secondary structure from the PSIPRED files 1.ss2, 2.ss2, 3.ss2 with a title for each sequence, provide the files 1.fa, 2.fa, 3.fa in FASTA format. The title of each sequence will be read from the first line of its FASTA file.
4. Start SBAL, and go to Tools - Auto-Alignment. Select the directory with the PSIPRED files and choose PSIPRED files (*.ss2, *.horiz) as source. If you want the sequences to be aligned automatically, check the "automatically align sequences" option. Then click Start.
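Before starting SBAL, it can be handy to check that every PSIPRED file in the directory has a FASTA file with the same root name. The following Python sketch is not part of SBAL; the function name find_missing_titles is hypothetical, and it only assumes the file extensions named in the steps above.

```python
from pathlib import Path

# PSIPRED extensions named in the steps above.
PSIPRED_SUFFIXES = {".ss2", ".horiz"}


def find_missing_titles(directory):
    """Return the root names of PSIPRED files that have no matching .fa file."""
    missing = set()
    for path in Path(directory).iterdir():
        if path.suffix in PSIPRED_SUFFIXES:
            # 1.ss2 is expected to be accompanied by 1.fa in the same directory.
            if not path.with_suffix(".fa").exists():
                missing.add(path.stem)
    return sorted(missing)
```

An empty list means every PSIPRED file has a FASTA file from which a title can be read.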

I have many AA sequences in individual files in FASTA format
1. All files must have a title in the first line, beginning with ">". The sequence starts in the second line. The file extension needs to be "fa" (e.g. 1dk5.fa, 1abl.fa, etc).
2. Since the FASTA format has no secondary structure information, SBAL will predict secondary structure using a single-sequence prediction algorithm.
3. Put all individual FASTA files to be analysed into one directory.
4. Start SBAL, and go to Tools - Auto-Alignment. Select the directory with the FASTA files and choose FASTA files (*.fa) as source. If you want the sequences to be aligned automatically, check the "automatically align sequences" option. Then click Start.
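The file requirements from step 1 can be checked with a few lines of Python before the directory is handed to SBAL. This is a sketch under the assumptions stated in that step (title line starting with ">", sequence from the second line, extension "fa"); the function name check_fasta_file is hypothetical and not part of SBAL.

```python
from pathlib import Path


def check_fasta_file(path):
    """Return a list of problems found in one single-sequence FASTA file."""
    path = Path(path)
    problems = []
    if path.suffix != ".fa":
        problems.append("file extension is not 'fa'")
    lines = path.read_text().splitlines()
    if not lines or not lines[0].startswith(">"):
        problems.append("first line is not a title beginning with '>'")
    if len(lines) < 2 or not lines[1].strip():
        problems.append("no sequence found in the second line")
    return problems
```

An empty list means the file meets the three requirements; the check does not validate the amino acid letters themselves.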

I have an existing AA sequence alignment
Existing SBAL sequence alignments can be read in through File - Open Alignment; select your input file and choose SBAL format. Alignments in SBAL format contain amino acid sequence and secondary structure information.
Existing alignments can also be imported from the FASTA, MSF and Clustal formats using File - Open Alignment. Since these formats do not contain secondary structure information, SBAL will perform single-sequence secondary structure prediction for each sequence.

I have an existing non-SBAL AA sequence alignment and PSIPRED secondary structure prediction
Prerequisite 1: An existing alignment in one of the formats FASTA, MSF or Clustal.
Prerequisite 2: PSIPRED secondary structure prediction files in *.ss2 or *.horiz format, named for example 01.ss2, 02.ss2, 03.ss2, etc.
Prerequisite 3: FASTA files with the same root name as the PSIPRED files (i.e. 01.fa, 02.fa, 03.fa, etc).
Prerequisite 4: All files above need to be in the same directory.
Open the alignment using File - Open Alignment. SBAL will search all available FASTA files to find a match of the sequence identifier between the alignment and the individual FASTA files. The secondary structure information will then be read from the corresponding PSIPRED file. If no match can be found, SBAL will predict secondary structure using the built-in single-sequence prediction algorithm.
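The lookup described above can be pictured with the following Python sketch. It is an illustration, not SBAL's implementation: the function names are hypothetical, and it assumes that the sequence identifier is the first whitespace-delimited word of the FASTA title line and that identifiers must match exactly.

```python
from pathlib import Path


def fasta_identifier(path):
    """Return the first word of the title line, or None if there is no title."""
    with open(path) as handle:
        first = handle.readline().strip()
    tokens = first[1:].split() if first.startswith(">") else []
    return tokens[0] if tokens else None


def locate_psipred_files(directory, alignment_ids):
    """Map each alignment identifier to its PSIPRED file, or None if not found."""
    # Index the individual FASTA files by the identifier in their title line.
    fasta_by_id = {}
    for fasta in sorted(Path(directory).glob("*.fa")):
        identifier = fasta_identifier(fasta)
        if identifier is not None:
            fasta_by_id.setdefault(identifier, fasta)

    result = {}
    for identifier in alignment_ids:
        match = None
        fasta = fasta_by_id.get(identifier)
        if fasta is not None:
            # The PSIPRED file shares the root name of the FASTA file.
            for suffix in (".ss2", ".horiz"):
                candidate = fasta.with_suffix(suffix)
                if candidate.exists():
                    match = candidate
                    break
        result[identifier] = match  # None: secondary structure must be predicted
    return result
```

Identifiers mapped to None correspond to the case in which the secondary structure has to be predicted instead of read from a PSIPRED file.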

Introducing a manual domain annotation
1. Place the line cursor in the sequence above which the annotation line should appear.
2. Use the menu item Edit - Add annotation to generate the annotation line.
3. Enter the annotation text in the newly generated line and use the pipe character | to denote domain boundaries.
4. Place the line cursor in the annotation line, then use the menu item Edit - Convert to domain annotation.
5. A popup window with the recognised domains appears.
6. In the popup window, click the colour boxes to select the background colour (first colour box) of each domain and the text colour (second colour box) of its description. The annotation text cannot be changed in the popup window.
7. After confirming the choices, the individual domains are shown in the chosen colours. To change any settings, simply repeat from step 1.
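As an illustration of how an annotation line with pipe characters breaks down into domains, consider the following Python sketch. It is not SBAL's parser: the function name split_domains is hypothetical, and it assumes that each domain covers the columns between two consecutive pipe characters and that the text between them is the domain description.

```python
def split_domains(annotation):
    """Split an annotation line at '|' into (description, start, end) tuples.

    start and end are zero-based column indices; end is exclusive.
    """
    domains = []
    start = 0
    for segment in annotation.split("|"):
        end = start + len(segment)
        description = segment.strip()
        if description:
            # Segments without text (e.g. before the first pipe) are skipped.
            domains.append((description, start, end))
        start = end + 1  # step over the pipe character itself
    return domains
```

For the line "|N-term  |kinase|", this sketch yields two domains, N-term in columns 1 to 8 and kinase in columns 10 to 15.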

Download SBAL from the PCSB home page.
(480 downloads as of 17.01.2018)

How to cite
When using this program, please cite:
Wang, C.K., Broder, U., Weeratunga, S.K., Gasser, R.B., Loukas, A., Hofmann, A. (2012) SBAL: a practical tool to generate and edit structure-based amino acid sequence alignments. Bioinformatics 28, 1026-1027.