GlycoProtDB ID - GPDB0002492


Protein Name Apolipoprotein B-100
Protein Accession Number E9Q414
Gene Apob
Length 4505
Status Reviewed (release: 2016-12-05)

Glycosylation Sites

Schema of N-glycosylation site(s) of the protein

Sequence

(__: Potential Sequon, N: Identified Site)

Each line below holds 100 residues in blocks of 10; the lines start at positions 1, 101, 201, and so on up to 4501.
MGPRKPALRT PLLLLFLLLF LDTSVWAQDE VLENLSFSCP KDATRFKHLR KYVYNYEAES SSGVQGTADS RSATKINCKV ELEVPQICGF IMRTNQCTLK
EVYGFNPEGK ALMKKTKNSE EFAAAMSRYE LKLAIPEGKQ IVLYPDKDEP KYILNIKRGI ISALLVPPET EEDQQELFLD TVYGNCSTQV TVNSRKGTVP
TEMSTERNLQ QCDGFQPIST SVSPLALIKG LVHPLSTLIS SSQTCQYTLD PKRKHVSEAV CDEQHLFLPF SYKNKYGIMT RVTQKLSLED TPKINSRFFS
EGTNRMGLAF ESTKSTSSPK QADAVLKTLQ ELKKLSISEQ NAQRANLFNK LVTELRGLTG EAITSLLPQL IEVSSPITLQ ALVQCGQPQC YTHILQWLKT
EKAHPLLVDI VTYLMALIPN PSTQRLQEIF NTAKEQQSRA TLYALSHAVN SYFDVDHSRS PVLQDIAGYL LKQIDNECTG NEDHTFLILR VIGNMGRTME
QVMPALKSSV LSCVRSTKPS LLIQKAALQA LRKMELEDEV RTILFDTFVN GVAPVEKRLA AYLLLMKNPS SSDINKIAQL LQWEQSEQVK NFVASHIANI
LNSEELYVQD LKVLIKNALE NSQFPTIMDF RKFSRNYQIS KSASLPMFDP VSVKIEGNLI FDPSSYLPRE SLLKTTLTVF GLASLDLFEI GLEGKGFEPT
LEALFGKQGF FPDSVNKALY WVNGRVPDGV SKVLVDHFGY TTDGKHEQDM VNGIMPIVDK LIKDLKSKEI PEARAYLRIL GKELSFVRLQ DLQVLGKLLL
SGAQTLQGIP QMVVQAIREG SKNDLFLHYI FMDNAFELPT GAGLQLQVSS SGVFTPGIKA GVRLELANIQ AELVAKPSVS LEFVTNMGII IPDFAKSSVQ
MNTNFFHESG LEARVALKAG QLKVIIPSPK RPVKLFSGSN TLHLVSTTKT EVIPPLVENR QSWSTCKPLF TGMNYCTTGA YSNASSTESA SYYPLTGDTR
YELELRPTGE VEQYSATATY ELLKEDKSLV DTLKFLVQAE GVQQSEATVL FKYNRRSRTL SSEVLIPGFD VNFGTILRVN DESAKDKNTY KLILDIQNKK
ITEVSLVGHL SYDKKGDGKI KGVVSIPRLQ AEARSEVHTH WSSTKLLFQM DSSATAYGST ISKRVTWRYD NEIIEFDWNT GTNVDTKKVA SNFPVDLSHY
PRMLHEYANG LLDHRVPQTD VTFRDMGSKL IVATNTWLQM ATRGLPYPQT LQDHLNSLSE LNLLKMGLSD FHIPDNLFLK TDGRVKYTMN RNKINIDIPL
PLGGKSSKDL KMPESVRTPA LNFKSVGFHL PSREVQVPTF TIPKTHQLQV PLLGVLDLST NVYSNLYNWS ASYTGGNTSR DHFSLQAQYR MKTDSVVDLF
SYSVQGSGET TYDSKNTFTL SCDGSLHHKF LDSKFKVSHV EKFGNSPVSK GLLTFETSSA LGPQMSATVH LDSKKKQHLY VKDIKVDGQF RASSFYAQGK
YGLSCERDVT TGQLSGESNM RFNSTYFQGT NQIVGMYQDG ALSITSTSDL QDGIFKNTAS LKYENYELTL KSDSSGQYEN FAASNKLDVT FSTQSALLRS
EHQANYKSLR LVTLLSGSLT SQGVELNADI LGTDKINTGA HKATLKIARD GLSTSATTNL KYSPLLLENE LNAELGLSGA SMKLSTNGRF KEHHAKFSLD
GRAALTEVSL GSIYQAMILG ADSKNIFNFK LSREGLRLSN DLMGSYAEMK LDHTHSLNIA GLSLDFFSKM DNIYSGDKFY KQNFNLQLQP YSFITTLSND
LRYGALDLTN NGRFRLEPLK LNVGGNFKGT YQNNELKHIY TISYTDLVVA SYRADTVAKV QGVEFSHRLN ADIEGLTSSV DVTTSYNSDP LHFNNVFHFS
LAPFTLGIDT HTSGDGKLSF WGEHTGQLYS KFLLKAEPLA LIVSHDYKGS TSHSLPYESS ISTALEHTVS ALLTPAEQTS TWKFKTKLND KVYSQDFEAY
NTKDKIGVEL SGRADLSGLY SPIKLPFFYS EPVNVLNGLE VNDAVDKPQE FTIIAVVKYD KNQDVHTINL PFFKSLPDYL ERNRRGMISL LEAMRGELQR
LSVDQFVRKY RAALSRLPQQ IHHYLNASDW ERQVAGAKEK ITSFMENYRI TDNDVLIAID SAKINFNEKL SQLETYAIQF DQYIKDNYDP HDLKRTIAEI
IDRIIEKLKI LDEQYHIRVN LAKSIHNLYL FVENVDLNQV SSSNTSWIQN VDSNYQVRIQ IQEKLQQLRT QIQNIDIQQL AAEVKRQMDA IDVTMHLDQL
RTAILFQRIS DIIDRVKYFV MNLIEDFKVT EKINTFRVIV RELIEKYEVD QHIQVLMDKS VELAHRYSLS EPLQKLSNVL QRIEIKDYYE KLVGFIDDTV
EWLKALSFKN TIEELNRLTD MLVKKLKAFD YHQFVDKTNS KIREMTQRIN AEIQALKLPQ KMEALKLLVE DFKTTVSNSL ERLKDTKVTV VIDWLQDILT
QMKDHFQDTL EDVRDRIYQM DIQRELEHFL SLVNQVYSTL VTYMSDWWTL TAKNITDFAE QYSIQNWAES IKVLVEQGFI VPEMQTFLWT MPAFEVSLRA
LQEGNFQTPV FIVPLTDLRI PSIRINFKML KNIKIPLRFS TPEFTLLNTF HVHSFTIDLL EIKAKIIRTI DQILSSELQW PLPEMYLRDL DVVNIPLARL
TLPDFHVPEI TIPEFTIPNV NLKDLHVPDL HIPEFQLPHL SHTIEIPAFG KLHSILKIQS PLFILDANAN IQNVTTSGNK AEIVASVTAK GESQFEALNF
DFQAQAQFLE LNPHPPVLKE SMNFSSKHVR MEHEGEIVFD GKAIEGKSDT VASLHTEKNE VEFNNGMTVK VNNQLTLDSH TKYFHKLSVP RLDFSSKASL
NNEIKTLLEA GHVALTSSGT GSWNWACPNF SDEGIHSSQI SFTVDGPIAF VGLSNNINGK HLRVIQKLTY ESGFLNYSKF EVESKVESQH VGSSILTANG
RALLKDAKAE MTGEHNANLN GKVIGTLKNS LFFSAQPFEI TASTNNEGNL KVGFPLKLTG KIDFLNNYAL FLSPRAQQAS WQASTRFNQY KYNQNFSAIN
NEHNIEASIG MNGDANLDFL NIPLTIPEIN LPYTEFKTPL LKDFSIWEET GLKEFLKTTK QSFDLSVKAQ YKKNSDKHSI VVPLGMFYEF ILNNVNSWDR
KFEKVRNNAL HFLTTSYNEA KIKVDKYKTE NSLNQPSGTF QNHGYTIPVV NIEVSPFAVE TLASSHVIPT AISTPSVTIP GPNIMVPSYK LVLPPLELPV
FHGPGNLFKF FLPDFKGFNT IDNIYIPAMG NFTYDFSFKS SVITLNTNAG LYNQSDIVAH FLSSSSFVTD ALQYKLEGTS RLMRKRGLKL ATAVSLTNKF
VKGSHDSTIS LTKKNMEASV RTTANLHAPI FSMNFKQELN GNTKSKPTVS SSIELNYDFN SSKLHSTATG GIDHKFSLES LTSYFSIESF TKGNIKSSFL
SQEYSGSVAN EANVYLNSKG TRSSVRLQGA SKVDGIWNVE VGENFAGEAT LQRIYTTWEH NMKNHLQVYS YFFTKGKQTC RATLELSPWT MSTLLQVHVS
QLSSLLDLHH FDQEVILKAN TKNQKISWKG GVQVESRVLQ HNAQFSNDQE EIRLDLAGSL DGQLWDLEAI FLPVYGKSLQ ELLQMDGKRQ YLQASTSLLY
TKNPNGYLLS LPVQELADRF IIPGIKLNDF SGVKIYKKLS TSPFALNLTM LPKVKFPGID LLTQYSTPEG SSVPIFEATI PEIHLTVSQF TLPKSLPVGN
TVFDLNKLAN MIADVDLPSV TLPEQTIVIP PLEFSVPAGI FIPFFGELTA RAGMASPLYN VTWSAGWKTK ADHVETFLDS MCTSTLQFLE YALKVVETHK
IEEDLLTYNI KGTLQHCDFN VEYNEDGLFK GLWDWQGEAH LDITSPALTD FHLYYKEDKT SLSASAASST IGTVGLDSST DDQSVELNVY FHPQSPPEKK
LSIFKTEWRY KESDGERYIK INWEEEAASR LLGSLKSNVP KASKAIYDYA NKYHLEYVSS ELRKSLQVNA EHARRMVDEM NMSFQRVARD TYQNLYEEML
AQKSLSIPEN LKKRVLDSIV HVTQKYHMAV MWLMDSFIHF LKFNRVQFPG YAGTYTVDEL YTIVMKETKK SLSQLFNGLG NLLSYVQNQV EKSRLINDIT
FKCPFFSKPC KLKDLILIFR EELNILSNIG QQDIKFTTIL SSLQGFLERV LDIIEEQIKC LKDNESTCVA DHINMVFKIQ VPYAFKSLRE DIYFVLGEFN
DFLQSILQEG SYKLQQVHQY MKALREEYFD PSMVGWTVKY YEIEENMVEL IKTLLVSFRD VYSEYSVTAA DFASKMSTQV EQFVSRDIRE YLSMLTDING
KWMEKIAELS IVAKETMKSW VTAVAKIMSD YPQQFHSNLQ DFSDQLSSYY EKFVGESTRL IDLSIQNYHV FLRYITELLR KLQVATANNV SPYIKLAQGE
LMITF

N-glycosylation sites

Potential sites     Identified Peptide*1
Position  Sequon    Position      Sequence                                  Unique or Shared
34        NLS
95        NQC
185       NCS
476       NEC
974       NYC
983       NAS       961 - 1000    QSWSTCKPLFTGMNYCTTGAYSNASSTESASYYPLTGDTR  Shared (2)
1368      NWS       1345 - 1380   THQLQVPLLGVLDLSTNVYSNLYNWSASYTGGNTSR      Shared (2)
1377      NTS       1345 - 1380   THQLQVPLLGVLDLSTNVYSNLYNWSASYTGGNTSR      Shared (2)
1523      NST
2126      NAS       2117 - 2132   LPQQIHHYLNASDWER                          Shared (2)
                    2112 - 2132   AALSRLPQQIHHYLNASDWER                     Shared (2)
2244      NTS
2554      NIT
2773      NVT
2823      NFS
2929      NFS
2976      NYS       2968 - 2979   LTYESGFLNYSK                              Shared (2)
3095      NFS
3331      NFT
3353      NQS
3460      NSS
3747      NLT       3738 - 3753   KLSTSPFALNLTMLPK                          Shared (3)
                    3739 - 3753   LSTSPFALNLTMLPK                           Shared (3)
                    3739 - 3753   LSTSPFALNLTMLPK                           Shared (3)
                    3738 - 3753   KLSTSPFALNLTMLPK                          Shared (3)
3860      NVT       3852 - 3868   AGMASPLYNVTWSAGWK                         Shared (3)
                    3852 - 3868   AGMASPLYNVTWSAGWK                         Shared (3)
4081      NMS
4264      NES
4489      NVS
Kidney Liver Liver: b4GalT-I(+/+) Liver: b4GalT-I(-/-) Lung Serum Stomach
RCA120 ConA RCA120 RCA120 RCA120 RCA120 Amide80 ConA
*1: _: Potential Sequon; residue markers: Asn (glycosylated), Gln (deaminated: pyroGlu), Met (oxidized), Cys (carbamidomethylated and deaminated)
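
The potential sites in the table above can be reproduced approximately by scanning the sequence for N-X-[S/T/C] motifs. The following is a minimal sketch, not the database's own method: the rule that X may be any residue except proline is an assumption (the inclusion of Cys in the third position is inferred from listed sequons such as NQC and NEC), and the function name `find_sequons` is hypothetical. The example uses only residues 1 to 100 of the sequence shown above.

```python
import re

# Assumed sequon rule: Asn, any residue except Pro, then Ser/Thr/Cys.
# The lookahead lets overlapping motifs (e.g. "NNSS") be reported separately.
SEQUON = re.compile(r"(?=(N[^P][STC]))")


def find_sequons(sequence, offset=1):
    """Return (position, sequon) pairs, with positions counted from `offset`."""
    # Remove the spaces and line breaks used to group residues in blocks of 10.
    residues = "".join(sequence.split()).upper()
    return [(m.start() + offset, m.group(1)) for m in SEQUON.finditer(residues)]


# Residues 1-100, copied from the first sequence line of this entry.
FIRST_100 = (
    "MGPRKPALRT PLLLLFLLLF LDTSVWAQDE VLENLSFSCP KDATRFKHLR "
    "KYVYNYEAES SSGVQGTADS RSATKINCKV ELEVPQICGF IMRTNQCTLK"
)

if __name__ == "__main__":
    for position, sequon in find_sequons(FIRST_100):
        print(position, sequon)
```

For this fragment the scan reports positions 34 (NLS) and 95 (NQC), which agree with the first two rows of the table. Passing a later stretch of the sequence with a matching `offset` gives positions in the full-length numbering.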

Publications

Sugahara D, Kaji H, Sugihara K, Asano M, Narimatsu H. Large-scale identification of target proteins of a glycosyltransferase isozyme by Lectin-IGOT-LC/MS, an LC/MS-based glycoproteomic approach. Sci Rep 2012;2. PMID: 23002422
Kaji H, Shikanai T, Sasaki-Sawa A, et al. Large-scale Identification of N-Glycosylated Proteins of Mouse Tissues and Construction of a Glycoprotein Database, GlycoProtDB. J Proteome Res 2012;11(9):4553-4566. PMID: 22823882
Kaji H, Yamauchi Y, Takahashi N, Isobe T. Mass spectrometric identification of N-linked glycopeptides using lectin-mediated affinity capture and glycosylation site-specific stable isotope tagging. Nat Protocols 2007;1(6):3019-3027. PMID: 17406563

Entry information

Entry version 2016-12-05
Protein sequence version 2016-12-05
Entry status Latest version
Entry history

create : 2016-12-05