Algorithm
All tests
PAM120
Proteomes:
Condition positive for GDSL lipase, input query: FVFGDSLSDA (isoforms counted once):
A protein was counted as condition positive if its GoMapMan description was GDSL lipase, GDSL esterase or hydrolase.
Link to IGLOSS article
Condition positive for MADS-box, input query: RQVTFSKRRNGLLKKA (isoforms counted once):
Condition positive for Cath-B, input query: QGQCGSCWAF (isoforms counted once):
Download Proteomes folder
Upload and output:
The upload file can be in FASTA or plain-text format. The maximum upload file size is 104 MB.
Output format can be changed at any time without losing any data.
Search for motifs:
A motif can contain the amino acids ARNDCQEGHILKMFPSTWYV and the symbol x or X, both of which stand for any amino acid.
A query can be built from a single amino-acid motif or from multiple motifs of the same length.
The maximum number of amino acids per motif is 70. The maximum number of motifs per query is 30.
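The input rules above can be checked before submitting a query. The following is a minimal sketch, not the code used by the server; the function name validate_query and the constant names are hypothetical, and only the alphabet and the limits of 70 and 30 come from the text above.

```python
# Hypothetical pre-submission check of the motif rules described above.
ALLOWED = set("ARNDCQEGHILKMFPSTWYV") | {"X"}  # X stands for any amino acid
MAX_MOTIF_LEN = 70   # maximum amino acids per motif
MAX_MOTIFS = 30      # maximum motifs per query


def validate_query(motifs):
    """Return the motifs with 'x' normalised to 'X', or raise ValueError."""
    if not motifs:
        raise ValueError("query must contain at least one motif")
    if len(motifs) > MAX_MOTIFS:
        raise ValueError(f"at most {MAX_MOTIFS} motifs are allowed")

    # Only the wildcard is case-insensitive according to the text,
    # so lower-case amino-acid letters are rejected here (an assumption).
    cleaned = [m.strip().replace("x", "X") for m in motifs]
    expected_len = len(cleaned[0])

    for motif in cleaned:
        if not 1 <= len(motif) <= MAX_MOTIF_LEN:
            raise ValueError(f"motif length must be 1..{MAX_MOTIF_LEN}: {motif!r}")
        if len(motif) != expected_len:
            raise ValueError("all motifs in a query must have the same length")
        bad = set(motif) - ALLOWED
        if bad:
            raise ValueError(f"invalid symbols {sorted(bad)} in {motif!r}")
    return cleaned
```

For example, validate_query(["FVFGDSLSDA"]) returns the motif unchanged, while a query that mixes motifs of different lengths raises ValueError.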
Input "Scale" parameter:
This parameter measures the similarity between the input query and the response: the higher the value, the higher the similarity.
A sensible range for the "Scale" parameter is roughly 3 to 15. The higher the value, the smaller the response.
Key (conserved) positions:
In addition to amino acids, the user can define which positions in the query are highly conserved (key positions) by adding a sequence of 0s (default) and 1s (highly conserved).
Example 1. The following inputs give the same output:

Example 2. Key positions in the 4th and 7th columns:
A key position puts more weight on the input in the specified column and reduces the variation of the response in that column.
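A key-position sequence can be checked against its motif as follows. This is a minimal sketch under stated assumptions: the function name key_columns is hypothetical, the sequence is assumed to need exactly one 0 or 1 per motif position, and an omitted sequence is assumed to be equivalent to all zeros (the stated default).

```python
def key_columns(motif, mask=None):
    """Return the 1-based columns marked as key positions.

    `mask` is a string of '0' (default) and '1' (highly conserved),
    assumed to have one character per motif position.
    """
    if mask is None:
        mask = "0" * len(motif)  # assumption: no mask means no key positions
    if len(mask) != len(motif):
        raise ValueError("key-position sequence must match the motif length")
    if set(mask) - {"0", "1"}:
        raise ValueError("key-position sequence may contain only 0 and 1")
    return [i + 1 for i, flag in enumerate(mask) if flag == "1"]
```

With the motif FVFGDSLSDA and the sequence 0001001000, the function reports columns 4 and 7 as key positions.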
The file should contain only amino-acid blocks of equal length. The clique filter becomes very slow for files with more than 450 rows.
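An input file can be checked against these two conditions before running the filter. The sketch below is not part of cliqueFilter.py; the function name check_blocks and the constant SLOW_ROW_THRESHOLD are hypothetical, and each non-empty line is assumed to hold one amino-acid block.

```python
SLOW_ROW_THRESHOLD = 450  # row count above which the text warns of slow runs


def check_blocks(lines):
    """Return (row_count, likely_slow) or raise ValueError.

    `lines` is any iterable of strings, for example an open text file.
    """
    rows = [line.strip() for line in lines if line.strip()]
    if not rows:
        raise ValueError("file contains no blocks")
    lengths = {len(row) for row in rows}
    if len(lengths) != 1:
        raise ValueError(f"blocks differ in length: {sorted(lengths)}")
    return len(rows), len(rows) > SLOW_ROW_THRESHOLD
```

A typical call would be check_blocks(open("blocks.txt")), where blocks.txt is a placeholder file name; the second returned value is True when the file exceeds 450 rows.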
Download Python code cliqueFilter.py