Complete genome sequence of Parasphingopyxis sp. CP4, an asparaginase-producing marine bacterium
Korean J. Microbiol. 2020;56(4):426-429
Published online December 31, 2020
© 2020 The Microbiological Society of Korea.

Yong Min Kwon, Dawoon Chung, Eunseo Cho, and Youngik Yang*

National Marine Biodiversity Institute of Korea (MABIK), Seocheon 33662, Republic of Korea
Correspondence to: E-mail:;
Tel.: +82-41-950-0956; Fax: +82-41-950-0951
Received November 27, 2020; Revised December 9, 2020; Accepted December 14, 2020.
A novel marine bacterium belonging to the genus Parasphingopyxis, designated strain CP4, was isolated from seawater around Chujado island in Korea. In the present study, we report the complete genome of Parasphingopyxis sp. CP4, which was determined using a hybrid approach combining the PacBio RS II and Illumina HiSeq platforms. The genome consists of one circular chromosome of 2,963,413 bp with a G + C content of 58.55 mol%. Furthermore, the genome contains 2,867 protein-coding sequences, 3 rRNA genes, 44 tRNA genes, 4 non-coding RNA genes, and 9 pseudogenes. The CP4 strain harbors asparaginase genes, which have applications in clinical pharmacology research and in the food industry.
Keywords: Sphingomonadaceae, Parasphingopyxis, asparaginase, Chujado island, genome

The family Sphingomonadaceae, belonging to the class α-Proteobacteria, was first proposed by Kosako et al. (2000) and comprises 21 validly published genera. Members of this family are widespread in nature owing to their physiological and metabolic versatility, particularly their bioremediation capabilities (White et al., 1996; Balkwill et al., 2006). The genus Parasphingopyxis, a member of the family Sphingomonadaceae, was first proposed by Uchida et al. (2012) and comprises two species with validly published names, P. lamellibrachiae JAMH 0132T (Uchida et al., 2012) and P. algicola ATAX6-5T (Jeong et al., 2017). The reported genome sequences of strains within the genus Parasphingopyxis, including JAMH 0132T, ATAX6-5T, and GrpM-11, are approximately 2.95–3.67 Mb in size, encode 2,944–3,575 genes, and have G + C contents of 50.2–64.3 mol%. In this study, we report the complete genome sequence of the newly isolated marine bacterium Parasphingopyxis sp. CP4, which possesses genes related to industrially important enzymes.

Parasphingopyxis sp. CP4 was isolated from seawater (at an approximate depth of 20–30 m) around Chujado island in Korea. The strain was isolated using a dilution-plating method on instant salt agar (11 g/L Instant Ocean Sea Salt [Aquarium Systems] and 15 g/L agar [BD]) supplemented with vitamin B12 and, after primary isolation, was routinely cultured on marine agar 2216 (MA; Difco) solid medium. Genomic DNA was obtained from cells cultivated for two days on MA using an Exgene DNA extraction kit (GeneAll). The 16S rRNA gene sequence obtained by Sanger sequencing indicates that the strain is a new member of the genus Parasphingopyxis, exhibiting the highest similarity to P. algicola ATAX6-5T (98.02%).

Whole genome sequencing was performed using the PacBio RS II and Illumina HiSeq 2500 sequencing platforms at Macrogen. In total, 116,798 long reads (586,543,371 bp) were sequenced using the PacBio RS II platform. Likewise, 7,933,904 short reads (1,198,019,504 bp) were sequenced from an Illumina 151 bp paired-end library. De novo assembly was conducted using the HGAP assembler (v3.0) with PacBio reads only, followed by error correction of contig bases with Illumina reads using Pilon (v1.21). Structural and functional annotations were performed via the National Center for Biotechnology Information’s Prokaryotic Genome Annotation Pipeline (PGAP v4.11). The clusters of orthologous groups (COGs) (Tatusov et al., 1997) were extracted using the eggNOG-mapper (v2.0) (Jensen et al., 2008) online tool. KEGG pathways and orthology assignments were obtained using BlastKOALA (Kanehisa et al., 2016). Putative secondary metabolite biosynthetic gene clusters in the genome were predicted using antiSMASH 5.0 (Medema et al., 2011).
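The read totals reported above can be sanity-checked with simple arithmetic. The following Python sketch is illustrative only and is not part of the authors' pipeline; the function names are hypothetical. It derives the mean read length and the raw (unfiltered) sequencing depth per platform from the read counts, base totals, and genome size given in the text.

```python
# Read totals and genome size as reported in the text.
# Function and variable names are hypothetical, chosen for illustration.
GENOME_SIZE_BP = 2_963_413

READ_SETS = {
    "PacBio RS II": {"reads": 116_798, "bases": 586_543_371},
    "Illumina HiSeq 2500": {"reads": 7_933_904, "bases": 1_198_019_504},
}


def mean_read_length(reads: int, bases: int) -> float:
    """Average number of bases per read."""
    return bases / reads


def raw_depth(bases: int, genome_size: int) -> float:
    """Raw sequencing depth: total sequenced bases divided by genome size."""
    return bases / genome_size


for platform, stats in READ_SETS.items():
    length = mean_read_length(stats["reads"], stats["bases"])
    depth = raw_depth(stats["bases"], GENOME_SIZE_BP)
    print(f"{platform}: mean read length {length:.1f} bp, raw depth {depth:.1f}x")
```

The Illumina total works out to exactly 151 bp per read, consistent with the stated library. These are raw depths computed from all reported bases; the coverage value listed in Table 1 is a separately reported figure and need not equal either number.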

The complete CP4 genome consists of a single circular chromosome (2,963,413 bp, with a 58.55 mol% G + C content), whose map is shown in Fig. 1. The chromosome is composed of 2,867 (92.12%) coding DNA sequences (CDSs), of which 2,402 (83.49%) were assigned to COGs. The most abundant COG category in the CP4 genome was “general function prediction only”, which comprised 241 proteins, followed by “lipid transport and metabolism”, “amino acid transport and metabolism”, “translation, ribosomal structure, and biogenesis”, and “cell wall/membrane/envelope biogenesis”, which encompassed 226, 210, 202, and 192 proteins, respectively (Fig. 1). In addition, the genome encodes 9 pseudogenes, 3 rRNAs (5S, 16S, and 23S), 44 tRNAs, and 4 ncRNAs (Table 1). The genome-based average nucleotide identities between CP4 and the three neighboring strains, JAMH 0132T, ATAX6-5T, and GrpM-11, were 76.5%, 78.2%, and 74.9%, respectively, when calculated using the Orthologous Average Nucleotide Identity Tool (OAT) software available on the EzBioCloud server (Lee et al., 2016).
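To put the COG counts above in proportion, the short Python sketch below expresses each of the five most abundant categories as a share of the 2,402 COG-assigned CDSs. The counts are taken from the text; the names `TOP_COG_CATEGORIES` and `share_percent` are hypothetical and used only for illustration.

```python
# Counts of the five most abundant COG categories, as reported in the text.
ASSIGNED_TO_COG = 2_402

TOP_COG_CATEGORIES = {
    "general function prediction only": 241,
    "lipid transport and metabolism": 226,
    "amino acid transport and metabolism": 210,
    "translation, ribosomal structure, and biogenesis": 202,
    "cell wall/membrane/envelope biogenesis": 192,
}


def share_percent(count: int, total: int) -> float:
    """Percentage of `total` represented by `count`."""
    return 100.0 * count / total


for category, count in TOP_COG_CATEGORIES.items():
    print(f"{category}: {count} ({share_percent(count, ASSIGNED_TO_COG):.2f}%)")

top_five_total = sum(TOP_COG_CATEGORIES.values())
print(f"top five combined: {top_five_total} "
      f"({share_percent(top_five_total, ASSIGNED_TO_COG):.2f}%)")
```

Under these figures, the five listed categories together account for somewhat less than half of the COG-assigned proteins.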

Table 1. Parasphingopyxis sp. CP4 genome assembly and general features

| Item | Description |
|---|---|
| Genome assembly data | |
| Sequencing technology | PacBio RS II / Illumina HiSeq 2500 |
| Assembly method | HGAP / Pilon |
| Coverage | 155× |
| Contigs | 1 |
| Finishing strategy | Complete |
| Genomic features | Chromosome |
| Submitted to NCBI | CP051130 |
| Size (bp) | 2,963,413 |
| G + C content (%) | 58.55 |
| Protein coding genes | 2,867 |
| rRNA genes | 3 |
| tRNA genes | 44 |
| ncRNA genes | 4 |
| Pseudogenes | 9 |

Fig. 1. Circular map of the Parasphingopyxis sp. CP4 genome. From the outside to the center: The 2nd (forward strand) and 3rd (reverse strand) rings indicate genes; CDSs are indicated in light blue, rRNAs in yellow, and tRNAs in red. The 1st (forward strand) and 4th (reverse strand) rings indicate the COG functions corresponding to the CDS sites. The values in parentheses within the COG legends correspond to the gene count in each COG category. The 5th ring shows the G + C content (brown). The 6th ring shows the GC skew (pink/purple). The 7th ring depicts the position on the genome.
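The G + C content and GC skew rings in a map of this kind are typically computed over sliding windows along the chromosome. The Python sketch below is a minimal illustration of that calculation, using the standard definition GC skew = (G − C) / (G + C); it is not the software used to draw Fig. 1, and the window and step sizes as well as all function names are assumptions chosen for the example.

```python
from typing import Iterator, List, Tuple


def gc_content(seq: str) -> float:
    """G + C content as a percentage of unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    acgt = g + c + seq.count("A") + seq.count("T")
    return 100.0 * (g + c) / acgt if acgt else 0.0


def gc_skew(seq: str) -> float:
    """GC skew, (G - C) / (G + C); 0.0 if the window has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def sliding_windows(seq: str, window: int, step: int) -> Iterator[Tuple[int, str]]:
    """Yield (start, subsequence) for each full-length window."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, seq[start:start + window]


def gc_profile(seq: str, window: int, step: int) -> List[Tuple[int, float, float]]:
    """Per-window (start, G + C %, GC skew) along a linearized sequence."""
    return [(start, gc_content(w), gc_skew(w))
            for start, w in sliding_windows(seq, window, step)]


# Toy sequence for illustration; a real run would read the chromosome FASTA.
for start, gc, skew in gc_profile("GGGGCCCCAATTGGCA", window=4, step=4):
    print(start, f"{gc:.1f}", f"{skew:+.2f}")
```

This sketch treats the sequence as linear; for a circular chromosome, windows that wrap around the origin would need to be handled separately.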

The genomes of CP4 and the three neighboring strains, JAMH 0132T, ATAX6-5T, and GrpM-11, have secondary metabolite gene clusters related to terpenes. In addition, KEGG analysis predicted 36 proteins responsible for the degradation and metabolism of aromatic and xenobiotic compounds, including alkane 1-monooxygenase, homogentisate 1,2-dioxygenase, protocatechuate 3,4-dioxygenase, ferredoxin reductase, and salicylate hydroxylase. Importantly, the genome harbors genes for asparagine biosynthesis. Moreover, it codes for asparaginase (locus tag: HFP51_08165) and isoaspartyl peptidase/L-asparaginase (locus tag: HFP51_10290); these enzymes are widely used in medications and food supplements, and for treating acute lymphoblastic leukemia, respectively. Sequence analyses using BlastKOALA revealed that the genome also harbors genes for asparagine synthetase (asnB) and isoaspartyl peptidase (iaaA). The genes associated with asparagine biosynthesis and asparaginase are also found in the genomes of the neighboring strains, JAMH 0132T, ATAX6-5T, and GrpM-11.

The genome information of Parasphingopyxis sp. CP4 may serve as an important resource for future research on the asparaginase machinery and on the degradation of aromatic and xenobiotic compounds.

Nucleotide sequence and strain accession numbers

The complete genome sequence of Parasphingopyxis sp. CP4 has been deposited in the DDBJ/EMBL/GenBank databases under the accession number CP051130. The partial sequence of the 16S rRNA gene of Parasphingopyxis sp. CP4 has been deposited in the GenBank database under the accession number MT280029. Strain CP4 was deposited at the Korean Culture Center of Microorganisms (KCCM) and the Japan Collection of Microorganisms (JCM) as KCCM 43361 and JCM 34130, respectively.
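For readers who wish to retrieve the deposited records programmatically, one option is the NCBI E-utilities EFetch endpoint. The Python sketch below only constructs the request URLs for the two accession numbers given above and does not perform the download; the helper name `efetch_url` is hypothetical, and the choice of the `nuccore` database with the `fasta` and `gb` return types is an assumption about what a reader would typically want.

```python
from urllib.parse import urlencode

# NCBI E-utilities EFetch endpoint.
EFETCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# Accession numbers as given in the text.
GENOME_ACCESSION = "CP051130"
RRNA_16S_ACCESSION = "MT280029"


def efetch_url(accession: str, rettype: str = "fasta") -> str:
    """Build an EFetch URL for a nucleotide record.

    rettype "fasta" requests the sequence; "gb" requests the GenBank flat file.
    """
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": rettype,
        "retmode": "text",
    }
    return f"{EFETCH_BASE}?{urlencode(params)}"


print(efetch_url(GENOME_ACCESSION))            # chromosome sequence (FASTA)
print(efetch_url(GENOME_ACCESSION, "gb"))      # annotated GenBank record
print(efetch_url(RRNA_16S_ACCESSION))          # partial 16S rRNA gene (FASTA)
```

The resulting URLs can be passed to any HTTP client, for example `urllib.request.urlopen`; NCBI's usage guidelines on request rates apply.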

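Summary

A novel marine bacterium belonging to the genus Parasphingopyxis was isolated from seawater off Chujado island, Republic of Korea, and designated strain CP4. In this study, the genome of Parasphingopyxis sp. CP4 was analyzed using the PacBio RS II and Illumina HiSeq platforms. The genome of this strain consists of a single chromosome of 2,963,413 bp with a G + C content of 58.55 mol%. It also contains 2,867 protein-coding sequences, 3 rRNA genes, 44 tRNA genes, 4 non-coding RNA genes, and 9 pseudogenes. The genome of strain CP4 was found to contain asparaginase genes that can be applied in clinical pharmaceutical research and in the food industry.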


This work was supported by the MABIK in-house programs 2020M00500 and 2020M00600.

  1. Balkwill DL, Fredrickson JK, and Romine MF. 2006. Sphingomonas and related genera, pp. 605-629. In Dworkin M, Falkow S, Rosenberg E, Schleifer KH, and Stackebrandt E (eds.). The Prokaryotes. Springer, New York, USA.
  2. Jensen LJ, Julien P, Kuhn M, Mering CV, Muller J, Doerks T, and Bork P. 2008. eggNOG: automated construction and annotation of orthologous groups of genes. Nucleic Acids Res. 36, D250-D254.
  3. Jeong SE, Kim KH, Baek K, and Jeon CO. 2017. Parasphingopyxis algicola sp. nov., isolated from a marine red alga Asparagopsis taxiformis and emended description of the genus Parasphingopyxis Uchida et al. 2012. Int. J. Syst. Evol. Microbiol. 67, 3877-3881.
  4. Kanehisa M, Sato Y, and Morishima K. 2016. BlastKOALA and GhostKOALA: KEGG tools for functional characterization of genome and metagenome sequences. J. Mol. Biol. 428, 726-731.
  5. Kosako Y, Yabuuchi E, Naka T, Fujiwara N, and Kobayashi K. 2000. Proposal of Sphingomonadaceae fam. nov., consisting of Sphingomonas Yabuuchi et al. 1990, Erythrobacter Shiba and Shimidu 1982, Erythromicrobium Yurkov et al. 1994, Porphyrobacter Fuerst et al. 1993, Zymomonas Kluyver and van Niel 1936, and Sandaracinobacter Yurkov et al. 1997, with the type genus Sphingomonas Yabuuchi et al. 1990. Microbiol. Immunol. 44, 563-575.
  6. Lee I, Kim YO, Park SC, and Chun J. 2016. OrthoANI: An improved algorithm and software for calculating average nucleotide identity. Int. J. Syst. Evol. Microbiol. 66, 1100-1103.
  7. Medema MH, Blin K, Cimermancic P, de Jager V, Zakrzewski P, Fischbach MA, Weber T, Takano E, and Breitling R. 2011. antiSMASH: rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genome sequences. Nucleic Acids Res. 39, W339-W346.
  8. Tatusov RL, Koonin EV, and Lipman DJ. 1997. A genomic perspective on protein families. Science 278, 631-637.
  9. Uchida H, Hamana K, Miyazaki M, Yoshida T, and Nogi Y. 2012. Parasphingopyxis lamellibrachiae gen. nov., sp. nov., isolated from a marine annelid worm. Int. J. Syst. Evol. Microbiol. 62, 2224-2228.
  10. White DC, Sutton SD, and Ringelberg DB. 1996. The genus Sphingomonas: physiology and ecology. Curr. Opin. Biotechnol. 7, 301-306.
