Human and Clinical Genetics - Leiden University Medical Center

RetinoschisisDB©

RS1 data: gene, mRNA, protein

(last modified November 11, 2005)



RS1-gene


Exon  Length (bp)  5' cDNA position  Splice  Length intron (kb)  GenBank   Remarks
1     >67          -15               1       >7                  AF018958  5'-UTR, 52 bp coding
2     26           53                0       0.9                 AF018959
3     106          79                1       1.6                 AF018960
4     142          185               2       1.9                 AF018961
5     196          327               0       2.2                 AF018962
6     2235         523               -       -                   AF018963  150 bp coding, 3'-UTR

Legend:
Exon: numbering of exons and intron/exon boundaries follows Sauer et al., with the first base of the Met codon counted as position 1 (see coding DNA Reference Sequence). Length (bp): size of the exon in base pairs. 5' cDNA position: first base of the exon (according to the cDNA sequence reported by Sauer et al.). Splice: splicing occurs between two coding triplets (0), after the first base (1) or after the second base (2) of a triplet. Length intron (kb): size of the intron in kilobase pairs. GenBank: accession number of the sequence in GenBank. Remarks: 5'-UTR = 5' untranslated region, 3'-UTR = 3' untranslated region.
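
The relation between the exon lengths, the 5' cDNA positions and the splice values in the table can be checked with a short calculation. The sketch below is illustrative only: it assumes that the coding part of exon 1 is 52 bp and that of exon 6 is 150 bp (as given in the Remarks column), and the function name `exon_coding_layout` is hypothetical, not part of the database.

```python
# Coding bases contributed by each exon, taken from the table above.
# Exon 1 and exon 6 use the coding lengths given in the Remarks column
# (their full lengths include untranslated sequence).
CODING_BP = {1: 52, 2: 26, 3: 106, 4: 142, 5: 196, 6: 150}


def exon_coding_layout(coding_bp):
    """Return (exon, first coding base, last coding base, splice value) rows.

    Position 1 is the first base of the Met codon. The splice value is the
    number of bases of the final triplet that lie before the exon boundary:
    0 = between two triplets, 1 = after the first base, 2 = after the second.
    """
    rows = []
    start = 1
    for exon in sorted(coding_bp):
        end = start + coding_bp[exon] - 1
        rows.append((exon, start, end, end % 3))
        start = end + 1
    return rows


layout = exon_coding_layout(CODING_BP)
for exon, start, end, splice in layout:
    print(f"exon {exon}: coding bases {start}-{end}, splice value {splice}")
```

For exons 2 to 6 the computed start positions match the 5' cDNA position column; exon 1 is listed at -15 in the table because it begins in the 5'-UTR. The value computed for exon 6 has no counterpart in the table, since no intron follows the last exon.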

primers for DNA-amplification


RS1 protein


Alignment of the human, mouse and fugu RS1 proteins


       1            15 16           30 31           45 46           60 61           75 76           90 
       *         **                                                                  *          *  *
human: MSRKIEGFLLLL--- --------------- --------------- --------------- --------------L FGYEATLGLSSTEDE     28
mouse: MPHKIEGFFLLL--- --------------- --------------- --------------- --------------L FGYEATLGLSSTEDE     28
fugu:  MHLPREAFLLALAGA FIFPSSQQEKRTQRD LRVVCFYQKDFRNGT KAEQWGKKTPTQIWR GCMS--VCNAIVVCL F----ELGVSET---     81


       91          105 106         120 121         135 136         150 151         165 166         180 
                  *                           *     *     * ****          *  **  *   *  *   ***   ***
human: GEDPWYQKACKCDCQ GGPNALW-------S AGATSLDCIPECPYH KPLGFESGEVTPDQI TCSNPEQYVGWYSSW TANKARLNSQGFGCA    111
mouse: GEDPWYQKACKCDCQ VGANALW-------S AGATSLDCIPECPYH KPLGFESGEVTPDQI TCSNPEQYVGWYSSW TANKARLNSQGFGCA    111
fugu:  ----WNSKSCKCDCE GGESPTEFPSIRTGS SMVRGVDCMPECPYH RPLGFEAGSISPDQI TCSNQDQYTAWFSSW LPSKARLNTQGFGCA    167


       181         195 196         210 211         225 226         240 241         255 256         270 
       **              *       ** * ** **  *       **   *    *               *   * **       **   * **
human: WLSKFQDSSQWLQID LKEIKVISGILTQGR CDIDEWMTKYSVQYR TDERLNWIYYKDQTG NNRVFYGNSDRTSTV QNLLRPPIISRFIRL    201
mouse: WLSKYQDSSQWLQID LKEIKVISGILTQGR CDIDEWVTKYSVQYR TDERLNWIYYKDQTG NNRVFYGNSDRSSTV QNLLRPPIISRFIRL    201
fugu:  WLSKFQDNTQWLQID LIDAKVVSGILTQGR CDADEWITKYSLQYR TDEKLNWIYYKDQTG NNRVFYGNSDRSSSV QNLLRPPIVARYIRI    257


       271         285 286         300
       **  ** *   * **   *  **
human: IPLGWHVRIAIRMEL LECVSKCA    224
mouse: IPLGWHVRIAIRMEL LECASKCA    224
fugu:  LPLGWHTRIALRLEL LLCMNKCS    280

Legend:
The mouse (Reid et al., GenBank AF073780) and fugu (Brunner et al., GenBank AF146687) RS1 proteins show 95% and 71% amino acid identity to the human RS1 protein, respectively. Identical amino acids are indicated in red. The identity between the three proteins is mainly in the discoidin domains (shown in boldface) and less in the putative leader sequences (amino acids 1-23). All amino acids showing variations in human are indicated by an asterisk above the sequence.
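
As an illustration of how an identity percentage can be derived from aligned sequences, the following minimal sketch counts identical residues over the aligned columns. It assumes that columns containing a gap in either sequence are excluded from the count; the page does not state which denominator was used for the figures above, so the function `percent_identity` is a hypothetical helper, not the method used by the authors. The example input is the final alignment block (human and mouse residues 202-224).

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Return (identical, compared, percent) for two aligned sequences.

    Assumption: columns with a gap in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    identical = 0
    compared = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return identical, compared, 100.0 * identical / compared


# Last block of the alignment above (the space between groups removed).
human_tail = "IPLGWHVRIAIRMELLECVSKCA"
mouse_tail = "IPLGWHVRIAIRMELLECASKCA"

identical, compared, pct = percent_identity(human_tail, mouse_tail)
print(f"{identical}/{compared} identical ({pct:.1f}%)")
```

Applying the same function to the full aligned sequences would give a whole-protein figure, which may differ slightly from the quoted percentages depending on rounding and on how gaps are treated.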


Alignment of discoidin domain sequences


mis          C     C    * * **         *  *C  C     * *   **C   C*C C*                 *       ** * *  Cc* 
con              C    PLGmesG I d QItASS          s W p +aRLn  g  nAW p      d   qWLQvDL + + vtGv TQGa ++d
        55                                                         g             e i i       i  i
hRS1     TSLDCIPECPYHKPLGFESGEVTPDQITCSNPEQYVGWYS-S-WTANKARLNSQGFGCAWLSK--FQ-DS-SQWLQIDLKEIKVISGILTQG--RCD---
mRS1     TSLDCIPECPYHKPLGFESGEVTPDQITCSNPEQYVGWYS-S-WTANKARLNSQGFGCAWLSK--YQ-DS-SQWLQIDLKEIKVISGILTQG--RCD---
fRS1     RGVDCMPECPYHRPLGFEAGSISPDQITCSNQDQYTAWFS-S-WLPSKARLNTQGFGCAWLSK--FQ-DN-TQWLQIDLIDAKVVSGILTQG--RCD---
TRK3     KAQVNPAICRY--PLGMSGGQIPDEDITASSQ---WS---EST-AAKYGRLDSEEGDGAWCPEIPVEPDDLKEFLQIDLHTLHFITLVGTQG--RHAGGH
EDD1     KGHFDPAKCRY--ALGMQDRTIPDSDISASS--S-WS---DST-AARHSRLESSDGDGAWCPAGS-VFPKEEEYLQVDLQRLHLVALVGTQG--RHAGGL
AEBP     WTPTEKVKCP---PIGMESHRIEDNQIRASSMLRHGLGAQRGRLNMQTGATEDDYYDGAWCA----EDDARTQWIEVDTRRTTRFTGVITQG---RDSSI
CAP      AEGWGYYGC-DEELVGPLYARS----LGASSYYSLL------T-APRFARLH---GISGWSPRIG-DPNP---WLQIDLMKKHRIRAVATQGSF--NS--
MFGM-1   AGNHCETKC--VEPLGMENGNIANSQIAASSVRVTF-LGLQH-WVPELARLNRAGMVNAWTPS----SNDDNPWIQVNLLRRMWVTGVVTQGA-SRLAS-
MFGM-2   ----ELNGCA--NPLGLKNNSIPDKQITASSSYKTWGL-HLFSWNPSYARLDKQGNFNAWVAG-SYGND---QWLQVDLGSSKEVTGIITQGA--RN-FG
HEMO-1   STVSPPPECSPDNYIDLVMGDEPLPD-TAFSASSEFS-EIFAPHNARLNRGPTNSGAGSWNPKV----NNDKQYIQVELPRREPIYGVVLQGSPIFD---
HEMO-2   PTSESPLQCTE--PLGLI-GELPLENIQVSSNSEEKD--YLSINGNRGWKPLYNT--PGWV--MFDFT-GPRNITGILTKGGN-------------D---
COAG5-1  PFLIMDRDCRM--PMGLSTGIISDSQIKAS----EF-LGY---WEPRLARLNNGGSYNAWSVEKLAAEFASKPWIQVDMQKEVIITGIQTQGA-KHYLK-
COAG5-2  ----EVNGCST--PLGMENGKIENKQITASSFKKSW--WG-DYWEPFRARLNAQGRVNAWQA-K--ANNN-KQWLEIDLLKIKKITAIITQGC-KSLSS-
COAG8-1  LFLVYSNKC--QTPLGMASGHIRDFQITASG------QYG-Q-WAPKLARLHYSGSINAWST-K----EP-FSWIKVDLLAPMIIHGIKTQGA--RQKFS
COAG8-2  ----DLNSCSM--PLGMESKAISDAQITASSYF-TNMF--AT-WSPSKARLHLQGRSNAWRP----QVNNPKEWLQVDFQKTMKVTGVTTQG-VKSLLT-
NRP-1    SSVSEDFKC-ME-ALGMESGEIHSDQITASS---QY---S-TNWSAERSRLNYPE--NGW----TPGEDSYREWIQVDLGLLRFVTAVGTQGAISKETKK
NRP-2    --KITDYPCSG--MLGMVSGLISDSQITSSNQGDRN-------WMPENIRLVT-SR-SGWALPPA-PHSYINEWLQIDLGEEKIVRGIIIQGG-KH----
NRP2-1   QEPLENFQC--NVPLGMESGRIANEQISASSTY------SDGRWTPQQSRLHGDD--NGW----TPNLDSNKEYLQVDLRFLTMLTAIATQGAISRET--
NRP2-2   --RVTDAPCSN--MLGMLSGLIADSQISASSTQEYL-------WSPSAARLVS-SR-SGWFPR-IPQAQPGEEWLQVDLGTPKTVKGVIIQGARGGDSIT
DISCa    Q-LLANAQCH--------LRTSTNYNGV-HT----QF---NSALNYKNNGTNTIDGSEAWCSSIVDTN----QYIVAGCEVPRTFMCVALQG--RGDA--

mis        *        *C   *    C               *   C **       **   C *C **  **   C   * **  C  *C
con        e    v+syki yS ng  W  y+d      kvF GN D    V +nlf PPI ARyiRi P tWh   +IaLRlELlGC
       144         f      d                      n                 fv                        224
hRS1     IDEW-MTKYSVQYR-TDERLNWIYYKDQTG-NNRVFYGNSDRTSTV-QNLLRPPIISRFIRLIPLGWHV--RIAIRMELLECVSKCA*
mRS1     IDEW-VTKYSVQYR-TDERLNWIYYKDQTG-NNRVFYGNSDRSSTV-QNLLRPPIISRFIRLIPLGWHV--RIAIRMELLECASKCA*
fRS1     ADEW-ITKYSLQYR-TDEKLNWIYYKDQTG-NNRVFYGNSDRSSSV-QNLLRPPIVARYIRILPLGWHT--RIALRLELLLCMNKCS*
TRK3     GIEF-APM--YKINYSRDGTRWISWRNRHGKQV--LDGNSNPYDIFLKDL-EPPIVARFVRFIPVTDHSMN-VCMRVELYGCVWLDGL
EDD1     GKEF--S-RSYRLRYSRDGRRWMGWKDRWGQ-E-VISGNEDPEGVVLKDL-GPPMVARLVRFYPRADRVMS-VCLRVELYGCLWRDGL
AEBP     HDDF---VTTFFVGFSNDSQTWVMYTNGYE--EMTFHGNVDKDTPVLSELPE-PVVARFIRIYPLTWNG-S-LCMRLEVLGCSVAPVY
CAP      -WDW---VTRYMLLYGDRVDSWTPFYQRGHNST--FFGNVNESAVVRHDL-HFHFTARYIRIVPLAWNPRGKIGLRLGLYGCPYKADI
MFGM-1   -HEY---LKAFKVAYSLNGHEFDFIHD-VNKKHKEFVGNWNKNA-VHVNLFETPVEAQYVRLYPTSCHT-A-CTLRFELLGC------
MFGM-2   SVQF---VASYKVAYSNDSANWTEYQDPRTGSSKIFPGNWDNHS-HKKNLFETPILARYVRILPVAWHN--RIALRLELLGC*
HEMO-1   --QYVTSY-EIMYGDDGNTFSTVDGPDGKPK-I--FRGPIDNTHPV-KQMISPPIEAKVVRIRPLTWHD-E-ISLRLEIIGC------
HEMO-2   -GWVTS-YK-VLYTSDFETFNPVIDKD--GKE-KIFPANFDGIVSVTNE-FHPPIRARYLKVLPQKWNK-N-IELRIEPIGCFEPYPE
COAG5-1  -SCY-T--TEFYVAYSSNQINWQIFKGNSTRNVMYFNGNSD-ASTIKENQFDPPIVARYIRISPTRAYN--RPTLRLELQGC------
COAG5-2  --EM--YVKSYTIHYSEQGVEWKPYRLKSSMVDKIFEGNTNTKG-HVKNFFNPPIISRFIRVIPKTWNQ-S-IALRLELFGCDIY*
COAG8-1  SL----YISQFIIMYSLDGKKWQTYRGNSTGTLMVFFGNVDSSG-IKHNIFNPPIIARYIRLHPTHYSI--RSTLRMELMGC------
COAG8-2  --SM--YVKEFLISSSQDGHQWTLFFQNGKV--KVFQGNQDSFTPV-VNSLDPPLLTRYLRIHPQSWVH-Q-IALRMEVLGCEAQDLY
NRP-1    K--Y--YVKTYKIDVSSNGEDWITIKEGNK--PVLFQGNTNPTDVVVA-VFPKPLITRFVRIKPATWET-G-ISMRFEVYGC------
NRP-2    REN-KVFMRKFKIGYSNNGSDWKMIMDDSKRKAKSFEGNNNYDTPELR-TF-PALSTRFIRIYPERATHGG-LGLRMELLGCEVEAPT
NRP2-1   -QNG-YYVKSYKLEVSTNGEDWMVYRHGKNH--KVFQANND-ATEVVLNKLHAPLLTRFVRIRPQTWHS-G-IALRLELFGC------
NRP2-2   AVEARAFVRKFKVSYSLNGKDWEYIQDPRTQQPKLFEGNMHYDTPDIRRF-D-PIPAQYVRVYPERWSPAG-IGMRLEVLGCDWTDSK
DISCa    -DQW---VTSYKIRYSLDNVSWFEYRN-GAA----VTGVTDRNTVVNH-FFDTPIRARSIAIHPLTWNG--HISLRCEFYTQ

Legend:
Discoidin domain alignment. An alignment was made between 31 proteins containing one or two discoidin domains (14 and 17 proteins, respectively). Where sequences from more than one organism were known for a protein, only the human sequence is presented (except for RS1, for which all three known discoidin sequences are displayed).
Top lines: mis marks amino acids affected by RS1 missense mutations (C if a cysteine was hit or created, asterisk for other changes); con is the consensus sequence, with an amino acid shown as a bold capital when found in at least 18/20 sequences, as a capital when found in at least 12/20 sequences, in lowercase when found at least 7 times, and as + when a positively charged amino acid was found in at least 7/20 sequences. For proteins containing two discoidin domains, the first and second domains are indicated as -1 and -2, respectively. Consensus amino acids are shown in color in the proteins.
The proteins aligned are (in parentheses: known number of species, full name and GenBank accession number): hRS1 (1* human X-linked juvenile retinoschisis precursor protein, AF014459), mRS1 (1* mouse X-linked juvenile retinoschisis precursor protein, AF073780), fRS1 (1* fugu X-linked juvenile retinoschisis precursor protein, AF146687), TRK3 (2* tyrosine receptor kinase, Q16832), EDD1 (3* epithelial discoidin domain receptor 1 precursor, Q08345), AEBP1 (2* adipocyte transcription factor, JC5256), CAP (2* contactin associated protein, U87223), MFGM (5* milk fat globule membrane protein, Q08431), HEMO (1* silkworm hemocytin, S52093), COAG5 (2* coagulation factor V precursor, M14335), COAG8 (2* coagulation factor VIII precursor, P00451), NRP (4* neuropilin, AF018956), NRP2 (3* neuropilin 2, AF022859) and DISCa (1* slime mold discoidin I chain A, J01282).
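
The consensus rules in the legend can be expressed as a small per-column function. This is a sketch under stated assumptions, not the procedure used to build the alignment: it assumes that lysine (K) and arginine (R) are the positively charged amino acids (histidine is not counted), that gaps are ignored, and that a single residue reaching a threshold takes precedence over the + symbol. The name `consensus_symbol` is hypothetical.

```python
from collections import Counter

# Assumption: K and R are treated as the positively charged amino acids.
POSITIVE = set("KR")


def consensus_symbol(column, gap="-"):
    """Return (symbol, bold) for one alignment column of 20 sequences.

    Thresholds follow the legend: >= 18 bold capital, >= 12 capital,
    >= 7 lowercase, and '+' when positively charged residues together
    occur at least 7 times. A space means no consensus.
    """
    counts = Counter(r for r in column.upper() if r != gap)
    if counts:
        residue, n = counts.most_common(1)[0]
        if n >= 18:
            return residue, True
        if n >= 12:
            return residue, False
        if n >= 7:
            return residue.lower(), False
    # Assumed precedence: '+' only when no single residue reaches 7.
    if sum(counts[r] for r in POSITIVE) >= 7:
        return "+", False
    return " ", False


# Hypothetical columns, one character per aligned sequence.
print(consensus_symbol("L" * 19 + "I"))              # strongly conserved
print(consensus_symbol("KKKKRRRR" + "ADEFGHILMNPQ"))  # mixed K/R column
```

Because bold type cannot be shown in plain text, the second element of the returned pair flags the columns that the legend describes as bold capitals.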


