Tools
General Charts
Several statistical charts can be generated to summarize the sequence project and to follow the progress of the analysis (figure 3, figure 4 and figure 5).
- Data distribution bar chart: Bar chart showing the number of sequences with Blast (with or without hits), GO Mapping and GO Annotation results.
- Data distribution pie chart: Pie chart showing the number of sequences with Blast (with or without hits), GO Mapping and GO Annotation results.
- Analysis Progress: Bar chart showing the cumulative number of sequences with Blast hits, InterProScan, GO Mapping and GO Annotation results.
- Sequence Length Distribution: Area chart showing the number of sequences for each sequence length.
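The data behind the Sequence Length Distribution chart can be sketched as follows. This is only an illustration, not the tool's own code: it assumes the sequences are available as a dictionary mapping sequence names to sequence strings, and the function name length_distribution is hypothetical.

```python
from collections import Counter


def length_distribution(sequences):
    """Map each sequence length to the number of sequences of that length.

    `sequences` is assumed to be a dict of {sequence name: sequence string}.
    """
    counts = Counter(len(seq) for seq in sequences.values())
    # Sort by length so the result can be plotted directly as an area chart.
    return dict(sorted(counts.items()))


example = {"seq1": "ATGC", "seq2": "ATGCA", "seq3": "GGCC"}
distribution = length_distribution(example)
```

The resulting mapping holds one entry per distinct length, which corresponds to the x and y values of the area chart.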
Find Duplicated Sequences
This function quickly identifies redundant sequences (sequences that are exactly identical) within a dataset so that they can be removed.
All sequences in the dataset that share the exact same sequence string can be marked as selected, removed directly, or saved as an ID-List.
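The idea of exact-duplicate detection can be sketched as follows. This is a minimal illustration and not the tool's implementation: the function names are hypothetical, the input is assumed to be a dictionary of sequence names to sequence strings, and the comparison is assumed to ignore upper/lower case.

```python
from collections import defaultdict


def find_duplicates(sequences):
    """Return lists of sequence names that share the exact same sequence string."""
    groups = defaultdict(list)
    for name, seq in sequences.items():
        # Assumption: case is ignored when comparing sequence strings.
        groups[seq.upper()].append(name)
    return [names for names in groups.values() if len(names) > 1]


def remove_duplicates(sequences):
    """Keep only the first occurrence of each distinct sequence string."""
    seen = set()
    unique = {}
    for name, seq in sequences.items():
        key = seq.upper()
        if key not in seen:
            seen.add(key)
            unique[name] = seq
    return unique


example = {"a": "ATGC", "b": "atgc", "c": "GGCC"}
duplicate_groups = find_duplicates(example)
deduplicated = remove_duplicates(example)
```

Here find_duplicates corresponds to building an ID-List of redundant sequences, and remove_duplicates to removing them directly.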
Find Similar Sequences
This function searches for similar sequences within a dataset. The search is done via BLAT alignments: the list of sequences is searched against itself, and all alignments above a certain similarity percentage are reported. It is possible to remove similar sequences from the project or to extract a less redundant result dataset into a new project.
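The filtering step that follows the self-against-self alignment can be sketched as below. This does not run BLAT and does not reproduce the tool's logic: it assumes the alignment results are already available as hypothetical (query, target, similarity percentage) tuples, and the rule for deciding which member of a similar pair to drop is an assumption made for illustration.

```python
def similar_pairs(hits, min_similarity=90.0):
    """Return unique pairs of distinct sequences at or above the threshold."""
    pairs = set()
    for query, target, percent in hits:
        # Self-hits are always 100 % similar and carry no information.
        if query != target and percent >= min_similarity:
            pairs.add(tuple(sorted((query, target))))
    return sorted(pairs)


def less_redundant_ids(all_ids, pairs):
    """Drop one member of each similar pair (assumed rule: drop the second)."""
    removed = set()
    for first, second in pairs:
        if first not in removed and second not in removed:
            removed.add(second)
    return [seq_id for seq_id in all_ids if seq_id not in removed]


hits = [
    ("s1", "s1", 100.0),
    ("s1", "s2", 97.5),
    ("s2", "s1", 97.5),
    ("s1", "s3", 62.0),
]
pairs = similar_pairs(hits, min_similarity=90.0)
kept = less_redundant_ids(["s1", "s2", "s3"], pairs)
```

The list returned by less_redundant_ids plays the role of the less redundant result dataset that can be extracted into a new project.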
Set to Sense (Based on Best-Blast-Hit)
Converts all selected sequences whose Best-Blast-Hit has a negative reading frame to anti-sense, i.e. each such query sequence is replaced by its reverse complement (e.g. ATTG -> CAAT). The tag "_antisense" is appended to the sequence names. Use the batch rename function to undo the name change.
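The reverse-complement operation can be sketched as follows. The function names and the way the frame of the best hit is passed in are assumptions for illustration only; they are not part of the tool.

```python
# Translation table mapping each base to its complement (both cases).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Complement every base, then reverse the sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def set_to_sense(name, seq, best_hit_frame):
    """Reverse-complement a sequence whose best hit lies in a negative frame.

    `best_hit_frame` is assumed to be a signed integer such as -2 or +1.
    """
    if best_hit_frame < 0:
        return name + "_antisense", reverse_complement(seq)
    return name, seq


converted = set_to_sense("contig1", "ATTG", -1)
unchanged = set_to_sense("contig2", "ATTG", 2)
```

With the example from the text, the sequence ATTG with a negative-frame best hit becomes CAAT and receives the "_antisense" tag, while a sequence with a positive-frame best hit is left as it is.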
Batch Rename
Performs a batch rename of all selected sequences by converting, replacing or adding text to the current sequence name. Link here for a detailed explanation of how to use this tool.
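A replace-or-add style rename can be sketched as follows. The function batch_rename and its parameters are hypothetical and only illustrate the idea; the options offered by the actual tool may differ.

```python
def batch_rename(names, replace=None, prefix="", suffix=""):
    """Rename every sequence name in `names`.

    replace: optional (old, new) pair; each occurrence of `old` becomes `new`.
    prefix / suffix: text added to the start / end of every name.
    """
    renamed = []
    for name in names:
        if replace is not None:
            old, new = replace
            name = name.replace(old, new)
        renamed.append(prefix + name + suffix)
    return renamed


# Undo a tag that was appended earlier by replacing it with nothing.
restored = batch_rename(["contig1_antisense", "contig2"], replace=("_antisense", ""))
tagged = batch_rename(["contig1"], prefix="projA_")
```

Replacing a tag with an empty string is how a name change such as "_antisense" could be undone in this sketch.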
Translate Longest ORF
Converts all selected sequences to the protein sequence of their longest ORF. The tag "_ORF" is appended to the sequence names. Use the batch rename function to undo the name change. The user may select the reading frame and the genetic code, according to the species, to be used for the prediction.
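A longest-ORF translation can be sketched as follows. This is an illustration under stated assumptions, not the tool's algorithm: an ORF is assumed to start at an ATG codon and to end at the next in-frame stop codon (or at the end of the sequence), only the forward frames are searched unless others are supplied, and the standard genetic code is used. All names are hypothetical.

```python
from itertools import product

# Standard genetic code, written in the usual T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def longest_orf_protein(seq, frames=(0, 1, 2), table=CODON_TABLE):
    """Return the protein of the longest ORF found in the given frames.

    frames: 0-based offsets into the forward strand (assumption).
    table: codon -> amino acid mapping, i.e. the chosen genetic code.
    """
    seq = seq.upper()
    best = ""
    for frame in frames:
        current = None  # protein being built, or None while outside an ORF
        for i in range(frame, len(seq) - 2, 3):
            aa = table.get(seq[i:i + 3], "X")  # unknown codons become X
            if current is None:
                if aa == "M":
                    current = "M"
            elif aa == "*":
                if len(current) > len(best):
                    best = current
                current = None
            else:
                current += aa
        # Assumption: an ORF still open at the end of the sequence counts.
        if current is not None and len(current) > len(best):
            best = current
    return best


protein = longest_orf_protein("CCATGAAATTTTAGATGTAA")
```

Passing a different frames tuple or a different codon table corresponds to the user's choice of reading frame and genetic code.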