Human Gene NSD1 (ENST00000439151.7) from GENCODE V43
  Description: Homo sapiens nuclear receptor binding SET domain protein 1 (NSD1), transcript variant 1, mRNA. (from RefSeq NM_172349)
RefSeq Summary (NM_022455): This gene encodes a protein containing a SET domain, 2 LXXLL motifs, 3 nuclear translocation signals (NLSs), 4 plant homeodomain (PHD) finger regions, and a proline-rich region. The encoded protein enhances androgen receptor (AR) transactivation, and this enhancement can be increased further in the presence of other androgen receptor associated coregulators. This protein may act as a nucleus-localized, basic transcriptional factor and also as a bifunctional transcriptional regulator. Mutations of this gene have been associated with Sotos syndrome and Weaver syndrome. One version of childhood acute myeloid leukemia is the result of a cryptic translocation with the breakpoints occurring within nuclear receptor-binding Su-var, enhancer of zeste, and trithorax domain protein 1 on chromosome 5 and nucleoporin, 98-kd on chromosome 11. Multiple transcript variants encoding distinct isoforms have been identified for this gene. [provided by RefSeq, Sep 2018].
Gencode Transcript: ENST00000439151.7
Gencode Gene: ENSG00000165671.22
Transcript (Including UTRs)
   Position: hg38 chr5:177,133,773-177,300,213 Size: 166,441 Total Exon Count: 23 Strand: +
Coding Region
   Position: hg38 chr5:177,135,104-177,295,459 Size: 160,356 Coding Exon Count: 22 
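
The Size values above are consistent with 1-based, inclusive coordinates, where a span covers end - start + 1 bases. The following is a minimal sketch that checks this arithmetic; the helper names `parse_position` and `span_size` are illustrative and not part of any UCSC tool.

```python
import re


def parse_position(position: str) -> tuple[str, int, int]:
    """Split a 'chrom:start-end' string whose coordinates use comma grouping."""
    match = re.fullmatch(r"(\w+):([\d,]+)-([\d,]+)", position.strip())
    if match is None:
        raise ValueError(f"unrecognized position: {position!r}")
    chrom, start, end = match.groups()
    return chrom, int(start.replace(",", "")), int(end.replace(",", ""))


def span_size(position: str) -> int:
    """Length of a span, assuming 1-based inclusive start and end coordinates."""
    _, start, end = parse_position(position)
    return end - start + 1


# Positions copied from the Transcript and Coding Region lines above.
transcript_size = span_size("chr5:177,133,773-177,300,213")
coding_size = span_size("chr5:177,135,104-177,295,459")
print(transcript_size, coding_size)
```

Both results match the sizes listed on the page (166,441 and 160,356). These are genomic spans that include introns, not spliced mRNA or CDS lengths.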

Data last updated at UCSC: 2023-02-17 13:02:02

-  Sequence and Links to Tools and Databases
Genomic Sequence (chr5:177,133,773-177,300,213) | mRNA (may differ from genome) | Protein (2696 aa)
Gene Sorter | Genome Browser | Other Species FASTA | Gene interactions | Table Schema | AlphaFold
BioGPS | Ensembl | Entrez Gene | ExonPrimer | Gencode | GeneCards

-  Comments and Description Text from UniProtKB
DESCRIPTION: RecName: Full=Histone-lysine N-methyltransferase, H3 lysine-36 and H4 lysine-20 specific; EC=; AltName: Full=Androgen receptor coactivator 267 kDa protein; AltName: Full=Androgen receptor-associated protein of 267 kDa; AltName: Full=H3-K36-HMTase; AltName: Full=H4-K20-HMTase; AltName: Full=Lysine N-methyltransferase 3B; AltName: Full=Nuclear receptor-binding SET domain-containing protein 1; Short=NR-binding SET domain-containing protein;
FUNCTION: Histone methyltransferase. Preferentially methylates 'Lys-36' of histone H3 and 'Lys-20' of histone H4 (in vitro). Transcriptional intermediary factor capable of either negatively or positively influencing transcription, depending on the cellular context.
CATALYTIC ACTIVITY: S-adenosyl-L-methionine + L-lysine-[histone] = S-adenosyl-L-homocysteine + N(6)-methyl-L-lysine-[histone].
SUBUNIT: Interacts with the ligand-binding domains of RARA and THRA in the absence of ligand; in the presence of ligand the interaction is severely disrupted but some binding still occurs. Interacts with the ligand-binding domains of RXRA and ESRRA only in the presence of ligand. Interacts with ZNF496 (By similarity). Interacts with AR DNA- and ligand-binding domains.
SUBCELLULAR LOCATION: Nucleus. Chromosome (Probable).
TISSUE SPECIFICITY: Expressed in the fetal/adult brain, kidney, skeletal muscle, spleen, and the thymus, and faintly in the lung.
DISEASE: Defects in NSD1 are the cause of Sotos syndrome type 1 (SOTOS1) [MIM:117550]; also known as cerebral gigantism. It is a disorder characterized by excessively rapid growth, acromegalic features, and a nonprogressive cerebral disorder with mental retardation. High-arched palate and prominent jaw are noted in several patients. Most cases of Sotos syndrome are sporadic and may represent new dominant mutations.
DISEASE: Defects in NSD1 are the cause of Weaver syndrome type 1 (WVS1) [MIM:277590]. A syndrome of accelerated growth and osseous maturation, unusual craniofacial appearance, hoarse and low-pitched cry, and hypertonia with camptodactyly. Distinguishing features of Weaver syndrome include broad forehead and face, ocular hypertelorism, prominent wide philtrum, micrognathia, deep horizontal chin groove, and deep-set nails. In addition, carpal bone development is advanced over the rest of the hand.
DISEASE: Defects in NSD1 are a cause of Beckwith-Wiedemann syndrome (BWS) [MIM:130650]. BWS is a genetically heterogeneous disorder characterized by anterior abdominal wall defects including exomphalos (omphalocele), pre- and postnatal overgrowth, and macroglossia. Additional less frequent complications include specific developmental defects and a predisposition to embryonal tumors.
DISEASE: Note=A chromosomal aberration involving NSD1 is found in childhood acute myeloid leukemia. Translocation t(5;11)(q35;p15.5) with NUP98.
DISEASE: Note=A chromosomal aberration involving NSD1 is found in an adult form of myelodysplastic syndrome (MDS). Insertion of NUP98 into NSD1 generates a NUP98-NSD1 fusion product.
SIMILARITY: Belongs to the histone-lysine methyltransferase family.
SIMILARITY: Contains 1 AWS domain.
SIMILARITY: Contains 4 PHD-type zinc fingers.
SIMILARITY: Contains 1 post-SET domain.
SIMILARITY: Contains 2 PWWP domains.
SIMILARITY: Contains 1 SET domain.
WEB RESOURCE: Name=Atlas of Genetics and Cytogenetics in Oncology and Haematology; URL="";
WEB RESOURCE: Name=GeneReviews; URL="";

-  Primer design for this transcript

Primer3Plus can design qPCR primers that straddle exon-exon junctions, so they amplify only cDNA, not genomic DNA.

Exonprimer can design one pair of Sanger sequencing primers around every exon, located in non-genic sequence.

To design primers for a non-coding sequence, zoom to a region of interest and select from the drop-down menu: View > In External Tools > Primer3

-  MalaCards Disease Associations
  MalaCards Gene Search: NSD1
Diseases sorted by gene-association score: sotos syndrome 1* (1702), beckwith-wiedemann syndrome* (914), weaver syndrome 1* (100), weaver syndrome (34), leukemia, acute myeloid* (31), deletion 5q35* (25), 5q35 microduplication syndrome* (25), lipedema (18), marshall-smith syndrome (12), macroglossia (7), mental retardation, x-linked syndromic, lubs type (6), myeloid leukemia (6), autosomal genetic disease (1)
* = Manually curated disease association

-  RNA-Seq Expression Data from GTEx (53 Tissues, 570 Donors)
  Highest median expression: 10.03 RPKM in Brain - Cerebellar Hemisphere
Total median expression: 270.69 RPKM

-  mRNA Secondary Structure of 3' and 5' UTRs
Region    Fold Energy (kcal/mol)    Bases    Energy/Base
5' UTR    -113.40                   197      -0.576
3' UTR    -1561.50                  4754     -0.328

The RNAfold program from the Vienna RNA Package is used to perform the secondary structure predictions and folding calculations. The estimated folding energy is in kcal/mol. The more negative the energy, the more secondary structure the RNA is likely to have.
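
The Energy/Base column appears to be the fold energy divided by the number of bases in the region. The sketch below, which assumes that interpretation, recomputes the column from the two UTR rows reported above; the function name `energy_per_base` is illustrative.

```python
def energy_per_base(fold_energy_kcal_mol: float, bases: int) -> float:
    """Fold energy normalized by region length, rounded to three decimals."""
    if bases <= 0:
        raise ValueError("bases must be a positive integer")
    return round(fold_energy_kcal_mol / bases, 3)


# (fold energy in kcal/mol, number of bases) taken from the table above.
utr_folds = {
    "5' UTR": (-113.40, 197),
    "3' UTR": (-1561.50, 4754),
}

per_base = {
    region: energy_per_base(energy, bases)
    for region, (energy, bases) in utr_folds.items()
}
print(per_base)
```

Normalizing by length makes the two regions comparable: the 3' UTR has the far larger total folding energy, but the 5' UTR has the more negative energy per base.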

-  Protein Domain and Structure Information
  InterPro Domains:
IPR006560 - AWS
IPR003616 - Post-SET_dom
IPR000313 - PWWP
IPR001214 - SET_dom
IPR019786 - Zinc_finger_PHD-type_CS
IPR011011 - Znf_FYVE_PHD
IPR001965 - Znf_PHD
IPR019787 - Znf_PHD-finger
IPR001841 - Znf_RING

Pfam Domains:
PF00628 - PHD-finger
PF00855 - PWWP domain
PF00856 - SET domain

Protein Data Bank (PDB) 3-D Structure
3OOI - X-ray

ModBase Predicted Comparative 3D Structure on Q96L73

-  Orthologous Genes in Other Species
  Orthologies between human, mouse, and rat are computed by taking the best BLASTP hit, and filtering out non-syntenic hits. For more distant species reciprocal-best BLASTP hits are used. Note that the absence of an ortholog in the table below may reflect incomplete annotations in the other species rather than a true absence of the orthologous gene.
Species            Ortholog links
Mouse              Genome Browser, Gene Details, Gene Sorter, Protein Sequence
Rat                Genome Browser, Protein Sequence
Zebrafish          No ortholog
D. melanogaster    No ortholog
C. elegans         No ortholog
S. cerevisiae      No ortholog
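
The reciprocal-best-hit rule described for the more distant species can be sketched as follows. This is a toy illustration only: the protein identifiers and scores are invented, the function names are hypothetical, and a real pipeline would parse BLASTP output. The synteny filter used for mouse and rat is not shown.

```python
Scores = dict[tuple[str, str], float]


def best_hits(scores: Scores) -> dict[str, str]:
    """Map each query to the subject with its highest alignment score."""
    best: dict[str, tuple[str, float]] = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward: Scores, reverse: Scores) -> dict[str, str]:
    """Keep only pairs where each protein is the other's best hit."""
    fwd = best_hits(forward)
    rev = best_hits(reverse)
    return {q: s for q, s in fwd.items() if rev.get(s) == q}


# Invented scores: human proteins searched against fly, and fly against human.
human_vs_fly = {
    ("humanA", "flyX"): 250.0,
    ("humanA", "flyY"): 90.0,
    ("humanB", "flyY"): 120.0,
}
fly_vs_human = {
    ("flyX", "humanA"): 240.0,
    ("flyY", "humanA"): 95.0,
    ("flyY", "humanB"): 80.0,
}

orthologs = reciprocal_best_hits(human_vs_fly, fly_vs_human)
print(orthologs)
```

In this toy data, humanB's best hit is flyY, but flyY's best hit is humanA, so humanB is left without an ortholog. A "No ortholog" entry can arise this way even when a related gene exists in the other species.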

-  Gene Ontology (GO) Annotations with Structured Vocabulary
  Molecular Function:
GO:0000978 RNA polymerase II core promoter proximal region sequence-specific DNA binding
GO:0003682 chromatin binding
GO:0003712 transcription cofactor activity
GO:0003714 transcription corepressor activity
GO:0005515 protein binding
GO:0008168 methyltransferase activity
GO:0008270 zinc ion binding
GO:0016740 transferase activity
GO:0018024 histone-lysine N-methyltransferase activity
GO:0030331 estrogen receptor binding
GO:0042799 histone methyltransferase activity (H4-K20 specific)
GO:0042974 retinoic acid receptor binding
GO:0046872 metal ion binding
GO:0046965 retinoid X receptor binding
GO:0046966 thyroid hormone receptor binding
GO:0046975 histone methyltransferase activity (H3-K36 specific)
GO:0050681 androgen receptor binding

Biological Process:
GO:0000122 negative regulation of transcription from RNA polymerase II promoter
GO:0000414 regulation of histone H3-K36 methylation
GO:0006325 chromatin organization
GO:0006351 transcription, DNA-templated
GO:0006355 regulation of transcription, DNA-templated
GO:0010452 histone H3-K36 methylation
GO:0016571 histone methylation
GO:0032259 methylation
GO:0033135 regulation of peptidyl-serine phosphorylation
GO:0034770 histone H4-K20 methylation
GO:0034968 histone lysine methylation
GO:0045893 positive regulation of transcription, DNA-templated
GO:1903025 regulation of RNA polymerase II regulatory region sequence-specific DNA binding

Cellular Component:
GO:0005634 nucleus
GO:0005654 nucleoplasm
GO:0005694 chromosome

-  Descriptions from all associated GenBank mRNAs
  AK126591 - Homo sapiens cDNA FLJ44628 fis, clone BRACE2017872, highly similar to Mus musculus nuclear receptor-binding SET-domain protein 1 (Nsd1).
AF395588 - Homo sapiens putative nuclear protein NSD1 mRNA, complete cds.
AF380302 - Homo sapiens androgen receptor-associated coregulator 267-a mRNA, complete cds.
AF322907 - Homo sapiens NSD1 (NSD1) mRNA, complete cds.
AY049721 - Homo sapiens androgen receptor associated coregulator 267-b (ARA267b) mRNA, complete cds.
BC144629 - Homo sapiens cDNA clone IMAGE:9053160.
BC150628 - Homo sapiens nuclear receptor binding SET domain protein 1, mRNA (cDNA clone MGC:183538 IMAGE:9056998), complete cds.
BC139789 - Homo sapiens nuclear receptor binding SET domain protein 1, mRNA (cDNA clone IMAGE:40127049), complete cds.
AK056667 - Homo sapiens cDNA FLJ32105 fis, clone OCBBF2001402, moderately similar to Mus musculus NSD1 protein mRNA.
BC021961 - Homo sapiens cDNA clone IMAGE:3908832, complete cds.
JD466090 - Sequence 447114 from Patent EP1572962.
AK091358 - Homo sapiens cDNA FLJ34039 fis, clone FCBBF2005658, moderately similar to Mus musculus NSD1 protein mRNA.
AX746932 - Sequence 457 from Patent EP1308459.
AK055187 - Homo sapiens cDNA FLJ30625 fis, clone CTONG2001820, highly similar to Mus musculus NSD1 protein mRNA.
KU695532 - Homo sapiens clone 1 NUP98/NSD1 fusion protein (NUP98/NSD1 fusion) mRNA, complete cds.
KU695534 - Homo sapiens clone 3 NUP98/NSD1 fusion protein (NUP98/NSD1 fusion) mRNA, complete cds.
KU695533 - Homo sapiens clone 2 NUP98/NSD1 fusion protein (NUP98/NSD1 fusion) mRNA, complete cds.
AK026066 - Homo sapiens cDNA: FLJ22413 fis, clone HRC08475.
AL832983 - Homo sapiens mRNA; cDNA DKFZp666C163 (from clone DKFZp666C163).
AK025916 - Homo sapiens cDNA: FLJ22263 fis, clone HRC03036.
AF085858 - Homo sapiens full length insert cDNA clone YN49B07.
JD324316 - Sequence 305340 from Patent EP1572962.
JD052912 - Sequence 33936 from Patent EP1572962.
AK001546 - Homo sapiens cDNA FLJ10684 fis, clone NT2RP3000220.
JD145737 - Sequence 126761 from Patent EP1572962.
JD090191 - Sequence 71215 from Patent EP1572962.
JD412962 - Sequence 393986 from Patent EP1572962.
JD490419 - Sequence 471443 from Patent EP1572962.
JD329581 - Sequence 310605 from Patent EP1572962.
JD194302 - Sequence 175326 from Patent EP1572962.
JD276658 - Sequence 257682 from Patent EP1572962.
JD120318 - Sequence 101342 from Patent EP1572962.
JD553949 - Sequence 534973 from Patent EP1572962.
JD503038 - Sequence 484062 from Patent EP1572962.
JD377243 - Sequence 358267 from Patent EP1572962.
JD183569 - Sequence 164593 from Patent EP1572962.
JD339938 - Sequence 320962 from Patent EP1572962.
JD090616 - Sequence 71640 from Patent EP1572962.
JD455799 - Sequence 436823 from Patent EP1572962.
JD347966 - Sequence 328990 from Patent EP1572962.
JD242142 - Sequence 223166 from Patent EP1572962.
JD081897 - Sequence 62921 from Patent EP1572962.
JD502143 - Sequence 483167 from Patent EP1572962.
JD278728 - Sequence 259752 from Patent EP1572962.
JD122318 - Sequence 103342 from Patent EP1572962.
JD516533 - Sequence 497557 from Patent EP1572962.
JD561883 - Sequence 542907 from Patent EP1572962.
JD555448 - Sequence 536472 from Patent EP1572962.
JD520066 - Sequence 501090 from Patent EP1572962.
JD402762 - Sequence 383786 from Patent EP1572962.
JD491089 - Sequence 472113 from Patent EP1572962.
JD049520 - Sequence 30544 from Patent EP1572962.
JD221107 - Sequence 202131 from Patent EP1572962.
JD332760 - Sequence 313784 from Patent EP1572962.
JD344206 - Sequence 325230 from Patent EP1572962.
JD483190 - Sequence 464214 from Patent EP1572962.
JD129246 - Sequence 110270 from Patent EP1572962.
JD226404 - Sequence 207428 from Patent EP1572962.

-  Biochemical and Signaling Pathways
  KEGG - Kyoto Encyclopedia of Genes and Genomes
hsa00310 - Lysine degradation

Reactome (by CSHL, EBI, and GO)

Protein Q96L73 (Reactome details) participates in the following event(s):

R-HSA-4827383 WHSC1 (KMT3G), NSD1 (KMT3B), SMYD2 (KMT3C) methylate lysine-37 of histone H3 (H3K36)
R-HSA-5638157 WHSC1 (KMT3G), NSD1 (KMT3B), SMYD2 (KMT3C), ASH1L methylate methyl-lysine-37 of histone H3 (H3K36)
R-HSA-3214841 PKMTs methylate histone lysines
R-HSA-3247509 Chromatin modifying enzymes
R-HSA-4839726 Chromatin organization

-  Other Names for This Gene
  Alternate Gene Symbols: ARA267, ENST00000439151.1, ENST00000439151.2, ENST00000439151.3, ENST00000439151.4, ENST00000439151.5, ENST00000439151.6, KMT3B, NM_172349, NSD1_HUMAN, Q96L73, Q96PD8, Q96RN7, uc003mfr.1, uc003mfr.2, uc003mfr.3, uc003mfr.4, uc003mfr.5, uc003mfr.6
UCSC ID: ENST00000439151.7
RefSeq Accession: NM_022455
Protein: Q96L73 (aka NSD1_HUMAN)
CCDS: CCDS4412.1

-  GeneReviews for This Gene
  GeneReviews article(s) related to gene NSD1:
sotos (Sotos Syndrome)
wilms-ov (Wilms Tumor Predisposition)
