|
|
|
Information about assembly Zm-F7-REFERENCE-TUM-1.0
(also known as F7)
|
|
|
Click here to learn about maize genome and gene model nomenclature rules.
|
Genome Sequencing Project Information |
|
|
The enormous diversity of maize is reflected by a large number of SNPs and substantial structural variation. To remedy the scarcity of sequence resources for the Flint pool, a reference sequence was generated de novo from inbred line F7. The F7 reference sequence complements the maize pan-genome with European Flint diversity.
This project is part of the European Maize project (http://www.europeanmaize.net/). The work was funded by the Bavarian State Ministry of the Environment and Consumer Protection (Project BayKlimaFit; http://www.bayklimafit.de/). |
|
|
GenBank BioProject |
PRJNA360923 |
|
Project PI |
Chris-Carolin Schön |
|
Project start date |
2016-10-01 |
|
Release date |
1st of February 2017 (pre-release) |
|
Browse Genome |
Genome browser at MaizeGDB |
|
Data download |
ftp://ftp.ncbi.nlm.nih.gov/genomes/genbank/plant/Zea_mays/latest_assembly_versions/GCA_001990705.1_Zm-F7-REFERENCE-TUM-1.0
|
|
Publication status |
pre-print |
|
Project reference |
European Flint reference sequences complement the maize pan-genome.
Unterseer, Sandra*; Seidel, Michael A.*; Bauer, Eva; Haberer, Georg; Hochholdinger, Frank; Opitz, Nina; Marcon, Caroline; Baruch, Kobi; Spannagl, Manuel; Mayer, Klaus F.X.; Schön, Chris-Carolin. * These authors contributed equally.
|
Stock and Biosample Information |
|
Stock information |
|
Stock name |
F7_TUM_2015 |
|
Stock record |
25145 |
|
Stock details |
F7_TUM_2015 |
|
Stock provided by |
Technical University of Munich, Plant Breeding |
|
|
Biosample information |
|
GenBank BioSample |
SAMN06216672 |
|
Sample description |
Pedigree: French population 'Lacaune'; important line in early European hybrid breeding programs; released in the 1950s |
|
Collection date |
missing |
|
Collected by |
Eva Bauer |
|
Plant structure |
leaf |
|
Sequencing and Assembly Information |
|
|
Assembly name |
Zm-F7-REFERENCE-TUM-1.0 |
|
WGS accession |
MTTB00000000 |
|
Sequencing description |
Sequencing technologies: Illumina technologies. Sequencing method: Illumina technologies |
|
Assembly description |
Assembly methods: NRGene de novo assembly. The assembly was done using DeNovoMAGIC 2.0, after which NRGene’s internal maize ancestral genome was used to build pseudochromosomes from the de novo assembled scaffolds. Construction of pseudomolecules: Yes |
|
Sequencing method |
Illumina technologies |
|
Finishing strategy |
Complete genome, 225X coverage |
|
Comment |
Scaffolds: Ns are inserted between contigs using paired-end and mate-pair information, and the number of Ns is determined by the estimated insert sizes.
Negative gaps: Mate-pair and paired-end information is used to estimate the unfilled gap sizes in the scaffolds. Where the linking information indicated a "negative" gap size (a gap of undetermined size), an artificial gap of 10 Ns is used.
Pseudomolecules: Scaffolds in all the rest of the chromosomes are separated by 100 Ns. The unfilled gaps within scaffolds are represented by a variable number of Ns according to the estimated gap size between their contigs. |
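The gap conventions in this comment can be checked against a downloaded sequence by measuring runs of N. The following is a minimal sketch, not part of the assembly pipeline: the function name `classify_gaps` and the category labels are hypothetical, and the toy sequence is invented for illustration. Note that an estimated gap could happen to be exactly 10 or 100 Ns long, so the classification is only a heuristic.

```python
import re
from collections import Counter


def classify_gaps(sequence):
    """Count runs of N in a sequence, grouped by the gap conventions above.

    The labels are illustrative assumptions:
      - "scaffold_boundary": exactly 100 Ns (scaffolds joined in a pseudomolecule)
      - "undetermined": exactly 10 Ns ("negative" gap of undetermined size)
      - "estimated": any other run length (gap size estimated from read pairs)
    """
    counts = Counter()
    for match in re.finditer(r"[Nn]+", sequence):
        run_length = len(match.group())
        if run_length == 100:
            counts["scaffold_boundary"] += 1
        elif run_length == 10:
            counts["undetermined"] += 1
        else:
            counts["estimated"] += 1
    return counts


# Toy sequence (hypothetical, not taken from the F7 assembly).
toy_sequence = "ACGT" + "N" * 10 + "GGCC" + "N" * 37 + "TTAA" + "N" * 100 + "ACGT"
summary = classify_gaps(toy_sequence)
print(dict(summary))
```

For a real pseudomolecule, the sequence string would be read from the FASTA file in the data download directory; reading the file is left out here to keep the sketch self-contained.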
|
Genome coverage |
225X |
|
Assembly statistics |
|
Scaff num |
77,899 |
|
Longest scaff |
43,780,027 bp |
|
N50 scaff length |
9,483,450 bp |
|
N50 scaff count |
70 |
|
N90 scaff length |
1,716,395 bp |
|
N90 scaff count |
284 |
|
Longest contig |
704,566 bp |
|
N50 contig length |
96,432 bp |
|
N50 contig count |
7,368 |
|
N90 contig length |
19,659 bp |
|
N90 contig count |
26,852 |
|
Scaff num: total number of scaffolds in the assembly.
|
Longest scaff: the longest scaffold in the assembly.
|
N50 scaff length: the length of the scaffold that takes the cumulative length (summing from the longest to the shortest scaffold) past 50% of the total assembly size.
|
N50 scaff count: the number of scaffolds counted in reaching the N50 threshold.
|
N90 scaff length: the length of the scaffold that takes the cumulative length (summing from the longest to the shortest scaffold) past 90% of the total assembly size.
|
N90 scaff count: the number of scaffolds counted in reaching the N90 threshold.
|
Longest contig: the longest contig in the assembly.
|
N50 contig length: the length of the contig that takes the cumulative length (summing from the longest to the shortest contig) past 50% of the total assembly size.
|
N50 contig count: the number of contigs counted in reaching the N50 threshold.
|
N90 contig length: the length of the contig that takes the cumulative length (summing from the longest to the shortest contig) past 90% of the total assembly size.
|
N90 contig count: the number of contigs counted in reaching the N90 threshold.
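The N50 and N90 definitions above can be expressed as a short calculation. This is a minimal sketch under stated assumptions: the function name `nx_stats` is hypothetical, the toy lengths are invented and are not values from the F7 assembly, and the threshold is treated as reached when the cumulative length is greater than or equal to the target fraction, which is a common convention but is not specified by this page.

```python
def nx_stats(lengths, fraction):
    """Return (Nx length, Nx count) for a collection of sequence lengths.

    Sequences are summed from longest to shortest. The Nx length is the
    length of the sequence at which the cumulative sum first reaches
    `fraction` of the total; the Nx count is how many sequences were
    summed to get there.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    threshold = fraction * sum(lengths)
    cumulative = 0
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        cumulative += length
        if cumulative >= threshold:
            return length, count


# Toy scaffold lengths in bp (hypothetical), total = 300.
toy_lengths = [100, 80, 60, 40, 20]
n50 = nx_stats(toy_lengths, 0.5)  # cumulative 100, 180 -> reaches 150 at the 2nd scaffold
n90 = nx_stats(toy_lengths, 0.9)  # cumulative 100, 180, 240, 280 -> reaches 270 at the 4th
print("N50 (length, count):", n50)
print("N90 (length, count):", n90)
```

The same function applies to contigs or scaffolds; only the input list of lengths changes.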
|
|
|
|
A contig is a contiguous consensus sequence that is
derived from a collection of overlapping reads.
A scaffold is a set of ordered and oriented contigs
that are linked to one another by mate pairs of sequencing reads.
|
Annotation |
|
|
Annotation Identifier |
Zm00011a.1 |
|
Annotation Date |
March 2018 |
|
Annotation Description |
Annotation version 1.0 was produced in a two-step process using an in-house annotation pipeline at PGSB (Plant Genome and Systems Biology, Helmholtz Zentrum München). In step 1, transcriptome assemblies were made from a pooled RNAseq library of 27 different tissues/conditions using Bridger and Trinity. These assemblies were unified with the Evigene5 pipeline to retrieve a transcript set for each line. The transcriptome assemblies were then used together with public transcriptome assemblies and proteome data from Sb, Bd, Os, B73_v4 and PH207 to identify optimal spliced alignments to the reference sequence using GenomeThreader. From these alignments, consensus models were derived and used for AHRD annotation and subsequent filtering for transposons. In step 2, we made pairwise whole-genome alignments (WGA) between EP1, F7, B73_v4 and PH207 to identify syntenic WGA blocks. Coding sequences of each line were mapped to all other lines. If a coding sequence of another line mapped with high confidence in a syntenic block where initially no gene was annotated, we kept the gene as a novel gene model if a stringent bit-score threshold was surpassed and coverage was >0.97. These novel gene models were added to those of step 1 to provide annotation v1.0. |
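The step 2 acceptance rule can be illustrated as a simple filter. This is a sketch of the rule as described above, not the PGSB pipeline itself: the `MappedCds` record, its field names, and the function `is_novel_gene_model` are hypothetical, and the bit-score threshold is left as a required parameter because its value is not given here. Only the coverage cutoff of 0.97 comes from the description.

```python
from dataclasses import dataclass


@dataclass
class MappedCds:
    """A coding sequence from one line mapped onto another line (hypothetical record)."""
    cds_id: str
    bit_score: float
    coverage: float                # fraction of the coding sequence covered by the mapping
    in_syntenic_block: bool        # mapping falls inside a syntenic WGA block
    overlaps_annotated_gene: bool  # a step 1 gene model already exists at this locus


def is_novel_gene_model(hit, min_bit_score, min_coverage=0.97):
    """Apply the step 2 rule: syntenic, unannotated locus, and both thresholds surpassed.

    `min_bit_score` has no default because the description does not state its value.
    """
    return (
        hit.in_syntenic_block
        and not hit.overlaps_annotated_gene
        and hit.bit_score > min_bit_score
        and hit.coverage > min_coverage
    )


# Invented example records; the bit-score cutoff of 500 is an arbitrary placeholder.
hits = [
    MappedCds("cds_a", bit_score=900.0, coverage=0.99, in_syntenic_block=True, overlaps_annotated_gene=False),
    MappedCds("cds_b", bit_score=900.0, coverage=0.95, in_syntenic_block=True, overlaps_annotated_gene=False),
    MappedCds("cds_c", bit_score=900.0, coverage=0.99, in_syntenic_block=True, overlaps_annotated_gene=True),
]
novel_ids = [hit.cds_id for hit in hits if is_novel_gene_model(hit, min_bit_score=500.0)]
print(novel_ids)
```

In this toy input, only the first record passes: the second falls below the coverage cutoff and the third maps to a locus that already carries a gene model.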
|