Tools

Figure 1: Tools options

General Charts

Several statistical charts can be generated to summarize the sequence project and to track the progress of the analysis (figures 3, 4 and 5).

  • Data distribution bar chart: Bar chart showing the number of sequences with Blast (with or without hits), GO Mapping and GO Annotation results.
  • Data distribution pie chart: Pie chart showing the number of sequences with Blast (with or without hits), GO Mapping and GO Annotation results.
  • Analysis Progress: Bar chart showing the cumulative number of sequences with Blast hits, InterProScan, GO Mapping and GO Annotation results.
  • Sequence Length Distribution: Area chart showing the number of sequences for each sequence length.
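As a rough illustration of what the Sequence Length Distribution chart summarizes, the counts behind it can be sketched in a few lines of Python. This is not the tool's own code; the function name `length_distribution` and the dictionary layout (sequence ID mapped to sequence string) are assumptions made for the example.

```python
from collections import Counter


def length_distribution(sequences):
    """Map each sequence length to the number of sequences of that length.

    `sequences` is a hypothetical dict of {sequence_id: sequence_string}.
    """
    counts = Counter(len(seq) for seq in sequences.values())
    # Sort by length so the result can be plotted directly along the x axis.
    return dict(sorted(counts.items()))


example = {"seq1": "ATGC", "seq2": "ATGCA", "seq3": "GGCC"}
print(length_distribution(example))
```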

Figure 2: Project Statistics

Figure 3: Data Distribution Bar Chart

Figure 4: Analysis Progress

Figure 5: Sequence Length

Find Duplicated Sequences

This function quickly identifies and removes redundant sequences (sequences that are exactly identical) within a dataset.

All sequences in the dataset that share the exact same sequence string can be marked as selected, removed directly, or collected into an ID list.
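The idea of exact-duplicate detection can be sketched as grouping sequence IDs by their sequence string. The following is a minimal Python illustration, not the tool's implementation; the function names and the choice to keep the first occurrence of each sequence are assumptions.

```python
from collections import defaultdict


def find_duplicates(sequences):
    """Return lists of IDs that share exactly the same sequence string."""
    groups = defaultdict(list)
    for seq_id, seq in sequences.items():
        groups[seq].append(seq_id)
    # Only groups with more than one member are duplicates.
    return [ids for ids in groups.values() if len(ids) > 1]


def remove_duplicates(sequences):
    """Keep the first occurrence of each sequence string (an assumption)."""
    seen = set()
    kept = {}
    for seq_id, seq in sequences.items():
        if seq not in seen:
            seen.add(seq)
            kept[seq_id] = seq
    return kept


data = {"a": "ATGC", "b": "GGCC", "c": "ATGC"}
print(find_duplicates(data))    # ID list of duplicated sequences
print(remove_duplicates(data))  # dataset without redundant entries
```

The list returned by `find_duplicates` corresponds to the ID list option, while `remove_duplicates` corresponds to direct removal.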

Figure 6: Find Duplicated Sequences wizard

Find Similar Sequences

This function searches for similar sequences within a dataset. The search is done via BLAT alignments: the list of sequences is searched against itself and all alignments above a certain similarity percentage are reported. Similar sequences can then be removed from the project, or a less redundant result dataset can be extracted into a new project.
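The all-against-all comparison with a similarity threshold can be illustrated with a small Python sketch. Note that this does not use BLAT: as a stand-in it scores each pair with `difflib.SequenceMatcher` from the standard library, which is far slower and not an alignment tool, so it only shows the shape of the procedure. The function name and the default threshold are assumptions.

```python
from difflib import SequenceMatcher
from itertools import combinations


def find_similar_pairs(sequences, min_similarity=0.9):
    """Compare every pair of sequences and report those above the threshold.

    Similarity here is SequenceMatcher.ratio(), a stand-in for the
    alignment-based similarity percentage reported by a real aligner.
    """
    pairs = []
    for (id_a, seq_a), (id_b, seq_b) in combinations(sequences.items(), 2):
        ratio = SequenceMatcher(None, seq_a, seq_b, autojunk=False).ratio()
        if ratio >= min_similarity:
            pairs.append((id_a, id_b, ratio))
    return pairs


data = {"s1": "ACGTACGTAC", "s2": "ACGTACGTAA", "s3": "TTTTTTTTTT"}
for id_a, id_b, ratio in find_similar_pairs(data, min_similarity=0.85):
    print(id_a, id_b, round(ratio, 2))
```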

Figure 7: Find Similar Sequences wizard

Set to Sense (Based on Best-Blast-Hit)

Converts all selected sequences whose best Blast hit has a negative reading frame to anti-sense, i.e. each such query sequence is replaced by its reverse complement (e.g. ATTG -> CAAT). The tag "_antisense" is appended to the sequence names. Use the batch rename function to undo the name change.
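The conversion described here can be sketched in Python as follows. This is an illustration under assumptions, not the tool's code: the function names are hypothetical, and the best-hit reading frame is assumed to be available as a signed integer per sequence ID.

```python
# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def set_to_sense(sequences, best_hit_frames):
    """Reverse-complement sequences whose best hit has a negative frame.

    `best_hit_frames` is a hypothetical dict {sequence_id: frame}, where a
    negative frame means the hit was found on the reverse strand.
    """
    result = {}
    for seq_id, seq in sequences.items():
        if best_hit_frames.get(seq_id, 0) < 0:
            result[seq_id + "_antisense"] = reverse_complement(seq)
        else:
            result[seq_id] = seq
    return result


print(reverse_complement("ATTG"))
print(set_to_sense({"c1": "ATTG", "c2": "GGCA"}, {"c1": -2, "c2": 1}))
```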

Figure 8: Set to Sense wizard

Batch Rename

Performs a batch rename of all selected sequences by converting, replacing or adding text to the current sequence names. A detailed explanation of how to use this tool is provided in the linked documentation.
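The replace and add operations can be pictured with a short Python sketch. The function `batch_rename` and its parameters are hypothetical and do not mirror the wizard's actual options; the example shows how a tag such as "_antisense" could be stripped again by replacing it with an empty string.

```python
def batch_rename(names, find="", replace="", prefix="", suffix=""):
    """Replace `find` with `replace` in each name, then add prefix/suffix."""
    renamed = []
    for name in names:
        if find:  # an empty search string leaves the name unchanged
            name = name.replace(find, replace)
        renamed.append(prefix + name + suffix)
    return renamed


names = ["contig1_antisense", "contig2"]
print(batch_rename(names, find="_antisense"))  # undo the tag
print(batch_rename(names, prefix="projA_"))    # add text to every name
```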

Figure 9: Batch Rename wizard

Translate Longest ORF

Converts each selected sequence to the protein sequence of its longest ORF. The tag "_ORF" is added to the sequence names. Use the batch rename function to undo the name change. The user may select the reading frame and, depending on the species, the genetic code to be used for the prediction.
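A longest-ORF translation can be sketched in Python as shown here. Several assumptions are made that the tool may not share: an ORF is taken to start at ATG and run to the first in-frame stop codon (or the end of the sequence), all six reading frames are searched, and only the standard genetic code is used. All function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate_frame(seq):
    """Translate codon by codon; codons with unknown bases become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def longest_orf_protein(seq):
    """Return the longest M-initiated protein over all six reading frames."""
    seq = seq.upper()
    best = ""
    for strand in (seq, reverse_complement(seq)):
        for offset in range(3):
            protein = translate_frame(strand[offset:])
            # Stop codons ('*') split the translation into candidate ORFs.
            for segment in protein.split("*"):
                start = segment.find("M")
                if start != -1 and len(segment) - start > len(best):
                    best = segment[start:]
    return best


print(longest_orf_protein("CCATGAAATTTTAGGG"))
```

Restricting the search to one user-selected frame would amount to fixing `strand` and `offset` instead of looping over them, and another genetic code would replace the `AMINO` string.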

Figure 10: Translate Longest ORF wizard