GlycoProtDB ID - GPDB0002464


Protein Name: Apolipoprotein B-100 (Fragment)
Protein Accession Number: E9Q1Y3
Gene: Apob
Length: 4456
Status
release : 2016-12-05

Glycosylation Sites

Schema of N-glycosylation site(s) of the protein

Sequence

(__: Potential Sequon, N: Identified Site)

Each row below holds 100 residues in blocks of 10; rows start at positions 1, 101, 201, and so on up to 4401.
MGPRKPALRT PLLLLFLLLF LDTSVWAQDA TRFKHLRKYV YNYEAESSSG VQGTADSRSA TKINCKVELE VPQICGFIMR TNQCTLKEVY GFNPEGKALM
KKTKNSEEFA AAMSRYELKL AIPEGKQIVL YPDKDEPKYI LNIKRGIISA LLVPPETEED QQELFLDTVY GNCSTQVTVN SRKGTVPTEM STERNLQQCD
GFQPISTSVS PLALIKGLVH PLSTLISSSQ TCQYTLDPKR KHVSEAVCDE QHLFLPFSYK NKYGIMTRVT QKLSLEDTPK INSRFFSEGT NRMGLAFEST
KSTSSPKQAD AVLKTLQELK KLSISEQNAQ RANLFNKLVT ELRGLTGEAI TSLLPQLIEV SSPITLQALV QCGQPQCYTH ILQWLKTEKA HPLLVDIVTY
LMALIPNPST QRLQEIFNTA KEQQSRATLY ALSHAVNSYF DVDHSRSPVL QDIAGYLLKQ IDNECTGNED HTFLILRVIG NMGRTMEQVM PALKSSVLSC
VRSTKPSLLI QKAALQALRK MELEDEVRTI LFDTFVNGVA PVEKRLAAYL LLMKNPSSSD INKIAQLLQW EQSEQVKNFV ASHIANILNS EELYVQDLKV
LIKNALENSQ FPTIMDFRKF SRNYQISKSA SLPMFDPVSV KIEGNLIFDP SSYLPRESLL KTTLTVFGLA SLDLFEIGLE GKGFEPTLEA LFGKQGFFPD
SVNKALYWVN GRVPDGVSKV LVDHFGYTTD GKHEQDMVNG IMPIVDKLIK DLKSKEIPEA RAYLRILGKE LSFVRLQDLQ VLGKLLLSGA QTLQGIPQMV
VQAIREGSKN DLFLHYIFMD NAFELPTGAG LQLQVSSSGV FTPGIKAGVR LELANIQAEL VAKPSVSLEF VTNMGIIIPD FAKSSVQMNT NFFHESGLEA
RVALKAGQLK VIIPSPKRPV KLFSGSNTLH LVSTTKTEVI PPLVENRQSW STCKPLFTGM NYCTTGAYSN ASSTESASYY PLTGDTRYEL ELRPTGEVEQ
YSATATYELL KEDKSLVDTL KFLVQAEGVQ QSEATVLFKY NRRSRTLSSE VLIPGFDVNF GTILRVNDES AKDKNTYKLI LDIQNKKITE VSLVGHLSYD
KKGDGKIKGV VSIPRLQAEA RSEVHTHWSS TKLLFQMDSS ATAYGSTISK RVTWRYDNEI IEFDWNTGTN VDTKKVASNF PVDLSHYPRM LHEYANGLLD
HRVPQTDVTF RDMGSKLIVD HLNSLSELNL LKMGLSDFHI PDNLFLKTDG RVKYTMNRNK INIDIPLPLG GKSSKDLKMP ESVRTPALNF KSVGFHLPSR
EVQVPTFTIP KTHQLQVPLL GVLDLSTNVY SNLYNWSASY TGGNTSRDHF SLQAQYRMKT DSVVDLFSYS VQGSGETTYD SKNTFTLSCD GSLHHKFLDS
KFKVSHVEKF GNSPVSKGLL TFETSSALGP QMSATVHLDS KKKQHLYVKD IKVDGQFRAS SFYAQGKYGL SCERDVTTGQ LSGESNMRFN STYFQGTNQI
VGMYQDGALS ITSTSDLQDG IFKNTASLKY ENYELTLKSD SSGQYENFAA SNKLDVTFST QSALLRSEHQ ANYKSLRLVT LLSGSLTSQG VELNADILGT
DKINTGAHKA TLKIARDGLS TSATTNLKYS PLLLENELNA ELGLSGASMK LSTNGRFKEH HAKFSLDGRA ALTEVSLGSI YQAMILGADS KNIFNFKLSR
EGLRLSNDLM GSYAEMKLDH THSLNIAGLS LDFFSKMDNI YSGDKFYKQN FNLQLQPYSF ITTLSNDLRY GALDLTNNGR FRLEPLKLNV GGNFKGTYQN
NELKHIYTIS YTDLVVASYR ADTVAKVQGV EFSHRLNADI EGLTSSVDVT TSYNSDPLHF NNVFHFSLAP FTLGIDTHTS GDGKLSFWGE HTGQLYSKFL
LKAEPLALIV SHDYKGSTSH SLPYESSIST ALEHTVSALL TPAEQTSTWK FKTKLNDKVY SQDFEAYNTK DKIGVELSGR ADLSGLYSPI KLPFFYSEPV
NVLNGLEVND AVDKPQEFTI IAVVKYDKNQ DVHTINLPFF KSLPDYLERN RRGMISLLEA MRGELQRLSV DQFVRKYRAA LSRLPQQIHH YLNASDWERQ
VAGAKEKITS FMENYRITDN DVLIAIDSAK INFNEKLSQL ETYAIQFDQY IKDNYDPHDL KRTIAEIIDR IIEKLKILDE QYHIRVNLAK SIHNLYLFVE
NVDLNQVSSS NTSWIQNVDS NYQVRIQIQE KLQQLRTQIQ NIDIQQLAAE VKRQMDAIDV TMHLDQLRTA ILFQRISDII DRVKYFVMNL IEDFKVTEKI
NTFRVIVREL IEKYEVDQHI QVLMDKSVEL AHRYSLSEPL QKLSNVLQRI EIKDYYEKLV GFIDDTVEWL KALSFKNTIE ELNRLTDMLV KKLKAFDYHQ
FVDKTNSKIR EMTQRINAEI QALKLPQKME ALKLLVEDFK TTVSNSLERL KDTKVTVVID WLQDILTQMK DHFQDTLEDV RDRIYQMDIQ RELEHFLSLV
NQVYSTLVTY MSDWWTLTAK NITDFAEQYS IQNWAESIKV LVEQGFIVPE MQTFLWTMPA FEVSLRALQE GNFQTPVFIV PLTDLRIPSI RINFKMLKNI
KIPLRFSTPE FTLLNTFHVH SFTIDLLEIK AKIIRTIDQI LSSELQWPLP EMYLRDLDVV NIPLARLTLP DFHVPEITIP EFTIPNVNLK DLHVPDLHIP
EFQLPHLSHT IEIPAFGKLH SILKIQSPLF ILDANANIQN VTTSGNKAEI VASVTAKGES QFEALNFDFQ AQAQFLELNP HPPVLKESMN FSSKHVRMEH
EGEIVFDGKA IEGKSDTVAS LHTEKNEVEF NNGMTVKVNN QLTLDSHTKY FHKLSVPRLD FSSKASLNNE IKTLLEAGHV ALTSSGTGSW NWACPNFSDE
GIHSSQISFT VDGPIAFVGL SNNINGKHLR VIQKLTYESG FLNYSKFEVE SKVESQHVGS SILTANGRAL LKDAKAEMTG EHNANLNGKV IGTLKNSLFF
SAQPFEITAS TNNEGNLKVG FPLKLTGKID FLNNYALFLS PRAQQASWQA STRFNQYKYN QNFSAINNEH NIEASIGMNG DANLDFLNIP LTIPEINLPY
TEFKTPLLKD FSIWEETGLK EFLKTTKQSF DLSVKAQYKK NSDKHSIVVP LGMFYEFILN NVNSWDRKFE KVRNNALHFL TTSYNEAKIK VDKYKTENSL
NQPSGTFQNH GYTIPVVNIE VSPFAVETLA SSHVIPTAIS TPSVTIPGPN IMVPSYKLVL PPLELPVFHG PGNLFKFFLP DFKGFNTIDN IYIPAMGNFT
YDFSFKSSVI TLNTNAGLYN QSDIVAHFLS SSSFVTDALQ YKLEGTSRLM RKRGLKLATA VSLTNKFVKG SHDSTISLTK KNMEASVRTT ANLHAPIFSM
NFKQELNGNT KSKPTVSSSI ELNYDFNSSK LHSTATGGID HKFSLESLTS YFSIESFTKG NIKSSFLSQE YSGSVANEAN VYLNSKGTRS SVRLQGASKV
DGIWNVEVGE NFAGEATLQR IYTTWEHNMK NHLQVYSYFF TKGKQTCRAT LELSPWTMST LLQVHVSQLS SLLDLHHFDQ EVILKANTKN QKISWKGGVQ
VESRVLQHNA QFSNDQEEIR LDLAGSLDGQ LWDLEAIFLP VYGKSLQELL QMDGKRQYLQ ASTSLLYTKN PNGYLLSLPV QELADRFIIP GIKLNDFSGV
KIYKKLSTSP FALNLTMLPK VKFPGIDLLT QYSTPEGSSV PIFEATIPEI HLTVSQFTLP KSLPVGNTVF DLNKLANMIA DVDLPSVTLP EQTIVIPPLE
FSVPAGIFIP FFGELTARAG MASPLYNVTW SAGWKTKADH VETFLDSMCT STLQFLEYAL KVVETHKIEE DLLTYNIKGT LQHCDFNVEY NEDGLFKGLW
DWQGEAHLDI TSPALTDFHL YYKEDKTSLS ASAASSTIGT VGLDSSTDDQ SVELNVYFHP QSPPEKKLSI FKTEWRYKES DGERYIKINW EEEAASRLLG
SLKSNVPKAS KAIYDYANKY HLEYVSSELR KSLQVNAEHA RRMVDEMNMS FQRVARDTYQ NLYEEMLAQK SLSIPENLKK RVLDSIVHVT QKYHMAVMWL
MDSFIHFLKF NRVQFPGYAG TYTVDELYTI VMKETKKSLS QLFNGLGNLL SYVQNQVEKS RLINDITFKC PFFSKPCKLK DLILIFREEL NILSNIGQQD
IKFTTILSSL QGFLERVLDI IEEQIKCLKD NESTCVADHI NMVFKIQVPY AFKSLREDIY FVLGEFNDFL QSILQEGSYK LQQVHQYMKA LREEYFDPSM
VGWTVKYYEI EENMVELIKT LLVSFRDVYS EYSVTAADFA SKMSTQVEQF VSRDIREYLS MLTDINGKWM EKIAELSIVA KETMKSWVTA VAKIMSDYPQ
QFHSNLQDFS DQLSSYYEKF VGESTRLIDL SIQNYHVFLR YITELLRKLQ VATANN
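
The potential sequons listed for this entry can be recovered from the sequence itself. The following is a minimal sketch, assuming a potential sequon is an Asn followed by any residue except Pro and then Ser, Thr, or Cys. Entries such as NQC suggest that Cys is accepted in the third position; the Pro exclusion is the conventional rule and is an assumption here, not something this page states. The name `find_sequons` is illustrative, and the input is the first 100-residue row of the sequence above.

```python
import re

# Assumed sequon rule: N, then any residue except P, then S, T, or C.
# The lookahead lets overlapping candidates be reported independently.
SEQUON = re.compile(r"(?=(N[^P][STC]))")


def find_sequons(sequence, offset=1):
    """Return (position, sequon) pairs; positions are 1-based from `offset`."""
    residues = sequence.replace(" ", "")  # drop the 10-residue block spacing
    return [(m.start() + offset, m.group(1)) for m in SEQUON.finditer(residues)]


# First row of the sequence (positions 1-100), copied from the entry.
row_1 = ("MGPRKPALRT PLLLLFLLLF LDTSVWAQDA TRFKHLRKYV YNYEAESSSG "
         "VQGTADSRSA TKINCKVELE VPQICGFIMR TNQCTLKEVY GFNPEGKALM")

print(find_sequons(row_1))  # [(82, 'NQC')]
```

For later rows, pass the row's start position as `offset` (for example 101 for the second row) so the reported positions stay in protein coordinates.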

N-glycosylation sites

Potential sites   Identified Peptide*1
Position  Sequon  Position     Sequence                                  Unique or Shared
82        NQC
172       NCS
463       NEC
961       NYC
970       NAS     948-987      QSWSTCKPLFTGMNYCTTGAYSNASSTESASYYPLTGDTR  Shared (2)
1335      NWS     1312-1347    THQLQVPLLGVLDLSTNVYSNLYNWSASYTGGNTSR      Shared (2)
1344      NTS     1312-1347    THQLQVPLLGVLDLSTNVYSNLYNWSASYTGGNTSR      Shared (2)
1490      NST
2093      NAS     2084-2099    LPQQIHHYLNASDWER                          Shared (2)
                  2079-2099    AALSRLPQQIHHYLNASDWER                     Shared (2)
2211      NTS
2521      NIT
2740      NVT
2790      NFS
2896      NFS
2943      NYS     2935-2946    LTYESGFLNYSK                              Shared (2)
3062      NFS
3298      NFT
3320      NQS
3427      NSS
3714      NLT     3705-3720    KLSTSPFALNLTMLPK                          Shared (3)
                  3706-3720    LSTSPFALNLTMLPK                           Shared (3)
                  3706-3720    LSTSPFALNLTMLPK                           Shared (3)
                  3705-3720    KLSTSPFALNLTMLPK                          Shared (3)
3827      NVT     3819-3835    AGMASPLYNVTWSAGWK                         Shared (3)
                  3819-3835    AGMASPLYNVTWSAGWK                         Shared (3)
4048      NMS
4231      NES
Kidney Liver Liver: b4GalT-I(+/+) Liver: b4GalT-I(-/-) Lung Serum Stomach
RCA120 ConA RCA120 RCA120 RCA120 RCA120 Amide80 ConA
*1: _: Potential Sequon; modified residues: Asn (glycosylated), Gln (deaminated: pyroGlu), Met (oxidized), Cys (carbamidomethylated and deaminated)
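
The site positions and the identified-peptide ranges in the table are tied together by a simple offset: a residue at index i of a peptide sits at protein position start + i. A minimal sketch of that mapping follows, assuming the same N-X-[S/T/C] sequon rule with Pro excluded at X (an assumption, as the page does not state its rule). The name `sites_in_peptide` is illustrative.

```python
import re

# Assumed sequon rule: N, then any residue except P, then S, T, or C.
SEQUON = re.compile(r"(?=(N[^P][STC]))")


def sites_in_peptide(peptide, start):
    """Map each sequon in `peptide` to its protein position.

    `start` is the 1-based protein position of the peptide's first residue.
    """
    return [(start + m.start(), m.group(1)) for m in SEQUON.finditer(peptide)]


# Peptides and start positions copied from the table above.
print(sites_in_peptide("LTYESGFLNYSK", 2935))
# [(2943, 'NYS')]
print(sites_in_peptide("THQLQVPLLGVLDLSTNVYSNLYNWSASYTGGNTSR", 1312))
# [(1335, 'NWS'), (1344, 'NTS')]
```

The second peptide covers two sequons, which is why the table lists the same 1312-1347 peptide under both position 1335 and position 1344.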

Publications

Sugahara D, Kaji H, Sugihara K, Asano M, Narimatsu H. Large-scale identification of target proteins of a glycosyltransferase isozyme by Lectin-IGOT-LC/MS, an LC/MS-based glycoproteomic approach. Sci Rep 2012;2. PMID:23002422
Kaji H, Shikanai T, Sasaki-Sawa A, et al. Large-scale Identification of N-Glycosylated Proteins of Mouse Tissues and Construction of a Glycoprotein Database, GlycoProtDB. J Proteome Res 2012;11(9):4553-4566. PMID:22823882
Kaji H, Yamauchi Y, Takahashi N, Isobe T. Mass spectrometric identification of N-linked glycopeptides using lectin-mediated affinity capture and glycosylation site-specific stable isotope tagging. Nat Protocols 2007;1(6):3019-3027. PMID:17406563

Entry information

Entry version 2016-12-05
protein sequence version 2016-12-05
Entry status Latest version
Entry history

create : 2016-12-05