
Information about assembly Zm-EP1-REFERENCE-TUM-1.0 (also known as EP1)

Genome Sequencing Project Information

   The enormous diversity of maize is reflected by a large number of SNPs and substantial structural variation. To remedy the scarcity of sequence resources for the Flint pool, a reference sequence was generated de novo from inbred line EP1. The EP1 reference sequence complements the maize pan-genome with European Flint diversity.



The Plant Breeding Group (Technical University of Munich) and the Plant Genome and Systems Biology Group (Helmholtz Center Munich), in collaboration with NCBI and MaizeGDB, have released a beta version of the EP1 maize genome prior to scientific publication, in accordance with the guidelines set forth by the Toronto Agreement for prepublication data sharing (Nature. 2009 461:168). The above groups reserve the first right to publish on the available EP1 data, including but not limited to whole-genome comparisons, genes, structural annotations, functional annotations, and genome-wide association studies, and to improve the sequence and its annotations. Under the Toronto Agreement, researchers can use the sequence to study individual genes, small sets of genes, and localized regions of the genome. Any redistribution of these data should include the full text of the data use policy. This work was funded by the Bavarian State Ministry of the Environment and Consumer Protection (Project BayKlimaFit; http://www.bayklimafit.de/).
   This sequence has been released under the Toronto Agreement. No whole-genome research may be submitted for publication until the official publication for this genome assembly has been published.

   GenBank BioProject   PRJNA360920  
   Project PI   Chris-Carolin Schön
   Project start date   2016-10-01
   Release date   February 2017 (under Toronto Agreement)
   Browse Genome   Genome browser at MaizeGDB
   Data download   ftp://ftp.ncbi.nlm.nih.gov/genomes/genbank/plant/Zea_mays/latest_assembly_versions/GCA_001984235.2_Zm-EP1-REFERENCE-TUM-1.0
   Publication status   pre-print
   Project reference   European Flint reference sequences complement the maize pan-genome. Unterseer, Sandra*; Seidel, Michael A.*; Bauer, Eva; Haberer, Georg; Hochholdinger, Frank; Opitz, Nina; Marcon, Caroline; Baruch, Kobi; Spannagl, Manuel; Mayer, Klaus F.X.; Schön, Chris-Carolin
* These authors contributed equally.

Stock and Biosample Information

Stock information
   Stock name   EP1_TUM_2015
   Stock record   25146
   Stock details   EP1_TUM_2015
   Stock provided by   Technical University of Munich, Plant Breeding
Biosample information
   Sample description   Pedigree: Spanish population 'Lizargarate'; important line in early European hybrid breeding programs; released in the 1950s
   Collection date   missing
   Collected by   Eva Bauer
   Plant structure   leaf

Sequencing and Assembly Information

   Assembly name   Zm-EP1-REFERENCE-TUM-1.0
   Sequencing description   Sequencing technologies: Illumina technologies
   Assembly description   Assembly methods: NRGene de novo assembly. The assembly was done using DeNovoMAGIC 2.0, after which NRGene’s internal maize ancestral genome was used to build pseudochromosomes from the de novo assembled scaffolds.
Construction of pseudomolecules: Yes
   Sequencing method   Illumina technologies
   Finishing strategy   Complete genome, 320X coverage
   Comment   Scaffolds
Ns are inserted between contigs based on paired-end and mate-pair information; the number of Ns is determined by the estimated insert sizes.
Negative gaps
Mate-pair and paired-end information is used to estimate the unfilled gap sizes in the scaffolds. In cases where the linking information indicated a "negative" gap size (a gap of undetermined size), an artificial gap size of 10 Ns is used.
Pseudomolecules
Scaffolds in all the rest of the chromosomes are separated by 100 Ns. The unfilled gaps within scaffolds are represented by a variable number of Ns according to the estimated gap size between their contigs.
   Genome coverage   320X
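The gap conventions described in the Comment above can be checked on a sequence with a short script. The following is a minimal Python sketch, not part of the assembly pipeline: the function name `classify_gaps` and the three labels are hypothetical, and the classification is only a heuristic, since an estimated gap could by chance also be exactly 10 or 100 Ns long.

```python
import re
from typing import List, Tuple

# Gap lengths taken from the assembly comment; the labels are illustrative only.
NEGATIVE_GAP_LEN = 10    # artificial size used for "negative" (undetermined) gaps
SCAFFOLD_SEP_LEN = 100   # separator between scaffolds in a pseudomolecule


def classify_gaps(sequence: str) -> List[Tuple[int, int, str]]:
    """Return (0-based start, length, label) for every run of Ns in a sequence."""
    gaps = []
    for match in re.finditer(r"[Nn]+", sequence):
        length = match.end() - match.start()
        if length == NEGATIVE_GAP_LEN:
            label = "negative_gap"
        elif length == SCAFFOLD_SEP_LEN:
            label = "scaffold_boundary"
        else:
            label = "estimated_gap"
        gaps.append((match.start(), length, label))
    return gaps


if __name__ == "__main__":
    # Toy sequence: a 10-N gap, a 100-N separator, and a 37-N estimated gap.
    toy = "ACGT" + "N" * 10 + "GGCC" + "N" * 100 + "TTAA" + "N" * 37 + "AC"
    for start, length, label in classify_gaps(toy):
        print(start, length, label)
```

For a real chromosome, the sequence string would first have to be read from the downloaded FASTA file; that step is omitted here.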
Assembly statistics
   Scaff num   71,196   Total number of scaffolds in the assembly.
   Longest scaff   29,676,304 bp   Length of the longest scaffold in the assembly.
   N50 scaff length   6,134,295 bp   Length of the scaffold at which the cumulative length (summing from longest to shortest scaffold) passes 50% of the total assembly size.
   N50 scaff count   121   Number of scaffolds counted in reaching the N50 threshold.
   N90 scaff length   1,226,173 bp   Length of the scaffold at which the cumulative length (summing from longest to shortest scaffold) passes 90% of the total assembly size.
   N90 scaff count   439   Number of scaffolds counted in reaching the N90 threshold.
   Longest contig   766,959 bp   Length of the longest contig.
   N50 contig length   82,295 bp   Length of the contig at which the cumulative length (summing from longest to shortest contig) passes 50% of the total assembly size.
   N50 contig count   8,811   Number of contigs counted in reaching the N50 threshold.
   N90 contig length   17,687 bp   Length of the contig at which the cumulative length (summing from longest to shortest contig) passes 90% of the total assembly size.
   N90 contig count   31,904   Number of contigs counted in reaching the N90 threshold.
A contig is a contiguous consensus sequence derived from a collection of overlapping reads.
A scaffold is a set of ordered and oriented contigs that are linked to one another by mate pairs of sequencing reads.
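The N50 and N90 definitions above translate directly into a cumulative sum over lengths sorted from longest to shortest. The following is a minimal Python sketch under stated assumptions: the function name `nx_stats` is hypothetical, the total is taken to be the sum of the lengths passed in, and the threshold counts as reached when the running sum is greater than or equal to the target (the record does not say whether equality counts).

```python
from typing import Sequence, Tuple


def nx_stats(lengths: Sequence[int], percent: int) -> Tuple[int, int]:
    """Return (Nx length, Nx count) for a percentage such as 50 (N50) or 90 (N90)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range 1..100")
    total = sum(lengths)
    running = 0
    # Walk from the longest to the shortest sequence, accumulating length.
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * percent:
            return length, count
    raise AssertionError("unreachable: the full sum always meets the threshold")


if __name__ == "__main__":
    toy_lengths = [80, 70, 50, 40, 30, 20]  # toy scaffold lengths, total 290
    print(nx_stats(toy_lengths, 50))  # (70, 2)
    print(nx_stats(toy_lengths, 90))  # (30, 5)
```

In the toy example the first two scaffolds sum to 150, which passes half of 290, so the N50 length is 70 and the N50 count is 2. The same function applies to contig lengths.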

Annotation

   Annotation Identifier   Zm00010a.1