Genome Overview

Carica papaya L. is a tropical fruit crop grown for both export and local consumption. Papaya ‘omics’ research has grown steadily in recent years, and the genomic, genetic and bioinformatic data it produces are a useful resource for papaya researchers. With advances in high-throughput sequencing and molecular marker technology, omics approaches can now be used to support plant breeding programmes.

Eksotika is an important papaya cultivar in Malaysia, grown mainly for domestic consumption and for the papaya export industry (Chan 1997).

Leaves from four-month-old Eksotika papaya plants were harvested and ground in liquid nitrogen. The extracted DNA was treated with RNase and then used for sequencing library preparation according to the manufacturer’s protocol for Illumina HiSeq 2000 sequencing (Illumina, Inc., San Diego, CA, USA). The resulting library was sequenced for 202 cycles, generating a total of 40 GB of raw sequence data with a read length of 101 bp. The cleaned short reads of Eksotika were deposited in the ENA database under accession code PRXXXX.
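
The sequencing figures above imply a rough read count and depth of coverage. The following is a minimal sketch of that arithmetic and is not part of the original pipeline. It assumes that "40 GB" means 40 gigabases, that the 202 cycles correspond to 2 x 101 bp paired-end reads, and that the genome size is the 287,712,485 bp total scaffold length given in the Feature Summary. The function and constant names are illustrative only.

```python
def estimate_read_count(total_bases: float, read_length: int) -> float:
    """Approximate number of reads for a given yield and read length."""
    return total_bases / read_length


def estimate_depth(total_bases: float, genome_size: float) -> float:
    """Average depth of coverage: sequenced bases per genome base."""
    return total_bases / genome_size


# Assumption: "40 GB" is read as 40 gigabases of raw sequence.
TOTAL_BASES = 40e9
READ_LENGTH = 101            # bp per read, from the text
CYCLES = 202                 # assumed to be 2 x 101 bp paired-end reads
GENOME_SIZE = 287_712_485    # bp, total scaffold length (Feature Summary)

reads = estimate_read_count(TOTAL_BASES, READ_LENGTH)
depth = estimate_depth(TOTAL_BASES, GENOME_SIZE)

print(f"Reads per fragment (cycles / read length): {CYCLES // READ_LENGTH}")
print(f"Approximate raw reads: {reads / 1e6:.0f} million")
print(f"Approximate raw depth: {depth:.0f}x")
```

Under these assumptions the raw data correspond to roughly 396 million reads and about 139x depth. The depth of the cleaned reads would be somewhat lower.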

Feature Summary

Estimated genome size (Mb): 287.71
Number of scaffolds: 26,208
Total length of scaffolds (bp): 287,712,485
N50 of scaffolds (bp): 34,039
N90 of scaffolds (bp): 5,401
Longest scaffold (bp): 253,788
Number of contigs: 48,911
Total length of contigs (bp): 287,573,036
N50 of contigs (bp): 14,132
N90 of contigs (bp): 2,463
Longest contig (bp): 105,578
Predicted coverage of the assembled sequences (%): 93
GC content of the genome (%): 34.67
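
For readers unfamiliar with the N50 and N90 values listed above, the sketch below shows the usual way such statistics are computed from a list of scaffold or contig lengths. It uses the standard definition and is not necessarily the tool used for this assembly. The function name `nx` and the toy lengths are illustrative assumptions, not data from the Eksotika assembly.

```python
from typing import Iterable


def nx(lengths: Iterable[int], fraction: float) -> int:
    """Return the Nx statistic (fraction=0.5 gives N50, 0.9 gives N90).

    Sequences are sorted longest first; the result is the length of the
    sequence at which the running total first reaches `fraction` of the
    total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no sequence lengths given")
    threshold = fraction * sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    return ordered[-1]  # only reachable through floating-point rounding


# Toy example (hypothetical lengths, total = 290 bp)
toy_lengths = [80, 70, 50, 40, 30, 20]
print("N50:", nx(toy_lengths, 0.5))  # 80 + 70 = 150 >= 145, so N50 = 70
print("N90:", nx(toy_lengths, 0.9))  # running total reaches 261 at length 30
```

A higher N50 means that half of the assembled bases sit in longer sequences, which is why the scaffold N50 here is larger than the contig N50.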

Predicted Gene Statistics

Number of predicted protein-coding genes: 16,165
Total length of all predicted genes (Mb): 19.33
Average gene length (bp): 1,196
Total peptide length (amino acids): 5,857,362
Total number of exons: 100,727
Total number of CDS: 98,999
Total number of 5’ UTRs: 6,915
Total number of 3’ UTRs: 5,945
Annotated to nr database: 15,234 (94.24%)
Annotated to SwissProt database: 12,648 (78.24%)
Annotated to InterProScan database: 12,777 (79.04%)
Annotated to Gene Ontology: 13,576 (83.98%)
Annotated to KEGG pathway database: 9,079 (56.16%)
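
The annotation percentages above are each database's gene count divided by the 16,165 predicted protein-coding genes. The short sketch below reproduces that calculation from the counts in the table. The variable and function names are illustrative assumptions.

```python
TOTAL_GENES = 16_165  # predicted protein-coding genes, from the table

# Annotated gene counts per database, copied from the table
annotated = {
    "nr": 15_234,
    "SwissProt": 12_648,
    "InterProScan": 12_777,
    "Gene Ontology": 13_576,
    "KEGG": 9_079,
}


def percent_annotated(count: int, total: int = TOTAL_GENES) -> float:
    """Share of predicted genes with an annotation, in percent (2 d.p.)."""
    return round(100 * count / total, 2)


percentages = {db: percent_annotated(n) for db, n in annotated.items()}

for db, pct in percentages.items():
    print(f"{db}: {pct:.2f}%")
```

Run as written, this gives the same percentages as the table, which confirms that they are all expressed relative to the full predicted gene set.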