bug #9527: Consensus Sequence Details: Consensus Sequence 5' -> 3' limit characters to [aAcCgGTtUu\s]

Updated by Andreas Kohlbecker about 3 years ago

For data entered or modified in the "Consensus Sequence 5' -> 3'" text area, the characters that can be typed or pasted must be limited to those used as codes for the nucleotides of DNA. It might be a good idea, though, to also allow uracil, which replaces thymine in RNA.

 Whitespace must also be allowed.

 Depending on how it is used, a consensus sequence either gives the most frequently occurring nucleotide at each position or shows which residues are conserved and which are variable. Consider the following example DNA sequence: A[CT]N{A}YR. In this notation, A means that an A is always found in that position; [CT] stands for either C or T; N stands for any base; and {A} means any base except A. Y represents any pyrimidine, and R indicates any purine. (see https://en.wikipedia.org/wiki/Consensus_sequence)
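 To make the notation concrete, here is a minimal Python sketch that translates a consensus pattern such as A[CT]N{A}YR into a regular expression over the four DNA bases. The function name `consensus_to_regex` and the code table are made up for illustration; they are not part of the application, and only the codes named in the example (A, C, G, T, N, Y, R and the two bracket forms) are handled.

```python
import re

BASES = "ACGT"
# Single-letter codes from the example: N = any base, Y = pyrimidine, R = purine.
CODES = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "ACGT", "Y": "CT", "R": "AG"}


def consensus_to_regex(pattern: str) -> re.Pattern:
    """Translate consensus notation into a compiled regex (hypothetical helper)."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch in "[{":
            close = "]" if ch == "[" else "}"
            end = pattern.find(close, i)
            if end == -1:
                raise ValueError(f"unbalanced {ch!r} at index {i}")
            members = set(pattern[i + 1:end].upper())
            if not members or not members <= set(BASES):
                raise ValueError(f"invalid group {pattern[i:end + 1]!r}")
            if ch == "{":
                # {A} means "any base except A".
                members = set(BASES) - members
                if not members:
                    raise ValueError("group excludes every base")
            parts.append("[" + "".join(sorted(members)) + "]")
            i = end + 1
        elif ch.upper() in CODES:
            parts.append("[" + CODES[ch.upper()] + "]")
            i += 1
        elif ch.isspace():
            i += 1  # whitespace carries no meaning in the pattern
        else:
            raise ValueError(f"unsupported character {ch!r} at index {i}")
    return re.compile("".join(parts), re.IGNORECASE)


rx = consensus_to_regex("A[CT]N{A}YR")
print(rx.pattern)
print(rx.fullmatch("ACGCTA") is not None)
print(rx.fullmatch("ACGATA") is not None)
```

 The second test sequence fails because its fourth base is A, which {A} excludes.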

 Therefore we also need to allow both kinds of brackets, as well as N, Y and R. There may be other characters used in consensus sequences.

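 Regexes for validating DNA and RNA sequences (the first allows consensus notation, the second only plain bases and whitespace). The character class is repeated with `*` so that every character, not just the first, must come from the allowed set:

 ~~~ 
 ^[aAcCgGTtUuRrNnYy\s\{\}\[\]]*$
 ^[aAcCgGTtUu\s]*$
 ~~~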
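 As a rough illustration of how such a check could behave, here is a minimal Python sketch. It assumes the intent is that every character of the input must belong to the allowed set, so the character class is applied to the whole string. The names `is_valid_sequence` and `invalid_characters` are hypothetical and do not refer to existing code in the application.

```python
import re

# Plain DNA/RNA input: bases (including uracil) plus whitespace.
PLAIN = re.compile(r"[aAcCgGtTuU\s]*")
# Consensus notation: additionally N, Y, R and both kinds of brackets.
CONSENSUS = re.compile(r"[aAcCgGtTuUrRnNyY\s{}\[\]]*")


def is_valid_sequence(text: str, allow_consensus: bool = False) -> bool:
    """Return True if every character of text is in the allowed set."""
    pattern = CONSENSUS if allow_consensus else PLAIN
    return pattern.fullmatch(text) is not None


def invalid_characters(text: str, allow_consensus: bool = False) -> list:
    """Return the disallowed characters in order of first appearance."""
    pattern = CONSENSUS if allow_consensus else PLAIN
    seen = []
    for ch in text:
        if pattern.fullmatch(ch) is None and ch not in seen:
            seen.append(ch)
    return seen


print(is_valid_sequence("ACGT acgt\nUU"))
print(is_valid_sequence("A[CT]N{A}YR"))
print(is_valid_sequence("A[CT]N{A}YR", allow_consensus=True))
print(invalid_characters("AXCGZX"))
```

 This only checks the character set. It does not verify that brackets are balanced or that a bracketed group is non-empty; that would need a separate check. Listing the offending characters could be useful for an error message when pasted text is rejected.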
