Genome Overview
Carica papaya L. is a tropical fruit crop grown for both export and local consumption. Research in papaya ‘omics’ has grown steadily in recent years, and the information generated by genomics, genetics and bioinformatics research is useful for papaya researchers. With advances in high-throughput sequencing and molecular marker technology, it is now possible to support plant breeding programmes using omics approaches.
Eksotika is an important papaya cultivar in Malaysia, grown mainly for domestic consumption and for the papaya export industry (Chan 1997).
Leaves from four-month-old Eksotika papaya plants were harvested and ground in liquid nitrogen. The extracted DNA was treated with RNase and then used to prepare a sequencing library according to the manufacturer’s protocol for Illumina HiSeq 2000 sequencing (Illumina, Inc., San Diego, CA, USA). The resulting library was sequenced for 202 cycles, generating a total of 40 GB of raw sequence reads, each 101 bp long. Cleaned short reads of Eksotika were deposited in the ENA database under accession code PRXXXX.
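As a rough illustration of what the sequencing yield above implies, the sketch below estimates the number of raw reads and the average raw depth. It rests on two assumptions that the text does not confirm: that "40 GB" means 40 gigabases of sequence, and that the total scaffold length reported in the Feature Summary is a reasonable stand-in for the genome size. The function names are hypothetical and are not part of any published pipeline.

```python
# Back-of-the-envelope sequencing yield, using figures quoted in the text.
# Assumption: "40 GB" is read as 40 gigabases (4e10 bases) of raw sequence.
RAW_BASES = 40_000_000_000
READ_LENGTH_BP = 101
# Assumption: total scaffold length is used as a proxy for genome size.
ASSEMBLY_LENGTH_BP = 287_712_485


def approx_read_count(total_bases, read_length):
    """Number of whole fixed-length reads that fit in the total yield."""
    return total_bases // read_length


def approx_depth(total_bases, genome_length):
    """Average raw sequencing depth (fold coverage) over the genome."""
    return total_bases / genome_length


reads = approx_read_count(RAW_BASES, READ_LENGTH_BP)
depth = approx_depth(RAW_BASES, ASSEMBLY_LENGTH_BP)
print(f"approx. raw reads: {reads:,}")
print(f"approx. raw depth: {depth:.0f}x")
```

Depth after read cleaning would be lower than this raw estimate, since trimming and filtering remove some bases.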
Feature Summary
Estimated genome size (Mb) | 287.71 |
Number of scaffolds | 26,208 |
Total length of scaffold (bp) | 287,712,485 |
N50 of scaffolds (bp) | 34,039 |
N90 of scaffolds (bp) | 5,401 |
Longest scaffold (bp) | 253,788 |
Number of contigs | 48,911 |
Total length of contigs (bp) | 287,573,036 |
N50 of contigs (bp) | 14,132 |
N90 of contigs (bp) | 2,463 |
Longest contig (bp) | 105,578 |
Predicted coverage of the assembled sequences (%) | 93 |
GC content of the genome (%) | 34.67 |
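The N50 and N90 values in this table follow the usual assembly convention: sort the sequences from longest to shortest and report the length at which the running total first reaches 50% (or 90%) of the assembly. A minimal sketch of that calculation is given below. The function name `nx` and the toy lengths are illustrative only, and this is not necessarily the tool that produced the figures above.

```python
def nx(lengths, percent):
    """Return the Nx statistic for a list of sequence lengths.

    Nx is the length L such that sequences of length >= L together
    account for at least `percent` % of the total assembly length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length


# Toy example (hypothetical scaffold lengths, total = 300 bp).
toy_lengths = [100, 80, 60, 40, 20]
n50 = nx(toy_lengths, 50)  # running total 100, then 180 >= 150
n90 = nx(toy_lengths, 90)  # running total reaches 280 >= 270 at length 40
print(n50, n90)
```

Applied to the real scaffold or contig lengths of an assembly (for example, parsed from a FASTA file), the same function gives values comparable to those listed in the table.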
Predicted Gene Statistics
Number of predicted protein-coding genes | 16,165 |
Total length of all predicted genes (Mb) | 19.33 |
Average gene length (bp) | 1,196 |
Total peptide length | 5,857,362 amino acids |
Total number of exons | 100,727 |
Total number of CDS | 98,999 |
Total number of 5’UTR | 6,915 |
Total number of 3’UTR | 5,945 |
Annotated to nr database | 15,234 (94.24%) |
Annotated to SwissProt database | 12,648 (78.24%) |
Annotated to InterProScan database | 12,777 (79.04%) |
Annotated to Gene ontology | 13,576 (83.98%) |
Annotated to KEGG pathway database | 9,079 (56.16%) |
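The percentages in the annotation rows are each database's annotated gene count divided by the 16,165 predicted protein-coding genes. The short sketch below reproduces them from the counts in the table. The dictionary keys and the function name are chosen here for illustration.

```python
# Counts copied from the Predicted Gene Statistics table.
TOTAL_GENES = 16_165
ANNOTATED_COUNTS = {
    "nr": 15_234,
    "SwissProt": 12_648,
    "InterProScan": 12_777,
    "Gene Ontology": 13_576,
    "KEGG": 9_079,
}


def percent_annotated(count, total=TOTAL_GENES):
    """Share of predicted genes with an annotation, as a percentage (2 d.p.)."""
    return round(100 * count / total, 2)


percentages = {db: percent_annotated(n) for db, n in ANNOTATED_COUNTS.items()}
for db, pct in percentages.items():
    print(f"{db}: {pct:.2f}%")
```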