
The family Sphingomonadaceae, belonging to the class α-Proteobacteria, was first proposed by Kosako et al. (2000) and comprises 21 genera with validly published names (http://www.bacterio.net). Members of this family are widespread in nature owing to their physiological and metabolic versatility, and are noted in particular for their bioremediation capabilities (White et al., 1996; Balkwill et al., 2006). The genus Parasphingopyxis, a member of the family Sphingomonadaceae, was first proposed by Uchida et al. (2012) and comprises two species with validly published names, P. lamellibrachiae JAMH 0132T (Uchida et al., 2012) and P. algicola ATAX6-5T (Jeong et al., 2017). The reported genome sequences of strains within the genus Parasphingopyxis, including JAMH 0132T, ATAX6-5T, and GrpM-11, are approximately 2.95–3.67 Mb in size, encode 2,944–3,575 genes, and have G + C contents of 50.2–64.3 mol%. In this study, we report the complete genome sequence of the newly isolated marine bacterium Parasphingopyxis sp. CP4, which possesses genes related to industrially important enzymes.
Parasphingopyxis sp. CP4 was isolated from seawater (at an approximate depth of 20–30 m) around Chujado Island in Korea. The strain was isolated using a dilution-plating method on instant salt agar (11 g/L Instant Ocean Sea Salt [Aquarium Systems] and 15 g/L agar [BD]) supplemented with vitamin B12, and was routinely cultured on marine agar 2216 (MA; Difco) solid medium after primary isolation. Genomic DNA was obtained from cells cultivated for two days on MA using an Exgene DNA extraction kit (Gene All). The 16S rRNA gene sequence obtained by Sanger sequencing indicated that the strain is a new member of the genus Parasphingopyxis, exhibiting the highest similarity (98.02%) to P. algicola ATAX6-5T.
Whole-genome sequencing was performed using the PacBio RS II and Illumina HiSeq 2500 sequencing platforms at Macrogen. In total, 116,798 long reads (586,543,371 bp) were generated on the PacBio RS II platform, and 7,933,904 short reads (1,198,019,504 bp) were generated from an Illumina 151 bp paired-end library. De novo assembly was conducted using the HGAP assembler (v3.0) with PacBio reads only, followed by error correction of contig bases with Illumina reads using Pilon (v1.21). Structural and functional annotations were performed using the National Center for Biotechnology Information's Prokaryotic Genome Annotation Pipeline (PGAP v4.11). Clusters of orthologous groups (COGs) (Tatusov et al., 1997) were assigned using the eggNOG-mapper (v2.0) (Jensen et al., 2008) online tool. KEGG pathways and orthology assignments were obtained using BlastKOALA (Kanehisa et al., 2016). Putative secondary metabolite biosynthetic gene clusters in the genome were predicted using antiSMASH 5.0 (Medema et al., 2011).
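As a rough illustration of what these read totals imply, the following minimal Python sketch estimates mean read length and sequencing depth for each platform. It assumes depth can be approximated as total sequenced bases divided by the chromosome size reported for CP4 (2,963,413 bp); the function and variable names are hypothetical and are not part of the assembly workflow described here.

```python
# Sketch: approximate mean read length and sequencing depth from read totals.
# Assumption: depth ~= total bases / genome size (no filtering or trimming).

GENOME_SIZE_BP = 2_963_413  # CP4 chromosome size reported in this paper


def mean_read_length(total_bases: int, n_reads: int) -> float:
    """Average number of bases per read."""
    if n_reads <= 0:
        raise ValueError("n_reads must be positive")
    return total_bases / n_reads


def estimated_depth(total_bases: int, genome_size: int = GENOME_SIZE_BP) -> float:
    """Approximate fold coverage of the genome."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


# Read statistics as reported in the text.
PACBIO = {"reads": 116_798, "bases": 586_543_371}
ILLUMINA = {"reads": 7_933_904, "bases": 1_198_019_504}

if __name__ == "__main__":
    for name, stats in (("PacBio RS II", PACBIO), ("Illumina HiSeq 2500", ILLUMINA)):
        length = mean_read_length(stats["bases"], stats["reads"])
        depth = estimated_depth(stats["bases"])
        print(f"{name}: mean read length {length:.1f} bp, depth {depth:.1f}x")
```

Under this approximation, the PacBio data correspond to roughly 198-fold and the Illumina data to roughly 404-fold coverage of the chromosome, and the Illumina mean read length works out to exactly 151 bp, consistent with the stated library type.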
The complete CP4 genome consists of a single circular chromosome (2,963,413 bp, with a G + C content of 58.55 mol%), whose map is shown in Fig. 1. The chromosome contains 2,867 (92.12%) coding DNA sequences (CDSs), of which 2,402 (83.49%) were assigned to COGs. The most abundant COG category in the CP4 genome was “general function prediction only”, which comprised 241 proteins, followed by “lipid transport and metabolism”, “amino acid transport and metabolism”, “translation, ribosomal structure, and biogenesis”, and “cell wall/membrane/envelope biogenesis”, which encompassed 226, 210, 202, and 192 proteins, respectively (Fig. 1). In addition, the genome encodes 11 pseudogenes, 3 rRNAs (5S, 16S, and 23S), 44 tRNAs, and 4 ncRNAs (Table 1). The genome-based average nucleotide identities between CP4 and the three neighboring strains, JAMH 0132T, ATAX6-5T, and GrpM-11, were 76.5%, 78.2%, and 74.9%, respectively, as calculated using the Orthologous Average Nucleotide Identity Tool (OAT) software available on the EzBioCloud server (www.ezbiocloud.net/sw/oat) (Lee et al., 2016).
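The G + C content quoted above is simply the percentage of G and C bases among all unambiguous bases. The following minimal Python sketch shows that calculation on a FASTA-formatted string; the helper names and the toy sequence are hypothetical, and the sketch is not the procedure used by the annotation pipeline.

```python
# Sketch: compute mol% G + C from a FASTA-formatted string.
# Assumption: ambiguous bases (e.g. N) are excluded from the denominator.


def read_fasta_sequence(fasta_text: str) -> str:
    """Concatenate all sequence lines, skipping header lines that start with '>'."""
    lines = (line.strip() for line in fasta_text.splitlines())
    return "".join(line for line in lines if line and not line.startswith(">"))


def gc_content(sequence: str) -> float:
    """Return the G + C content as a percentage of unambiguous bases."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / (gc + at)


if __name__ == "__main__":
    toy_fasta = ">toy_contig (hypothetical example, not CP4 data)\nGGCCAT\nGCNNAT\n"
    sequence = read_fasta_sequence(toy_fasta)
    print(f"G + C content: {gc_content(sequence):.2f} mol%")
```

Applied to the chromosome sequence deposited for CP4, a calculation of this kind would be expected to give a value close to the reported 58.55 mol%, although small differences can arise from how ambiguous bases are handled.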
The genomes of CP4 and the three neighboring strains, JAMH 0132T, ATAX6-5T, and GrpM-11, contain secondary metabolite gene clusters related to terpenes. In addition, KEGG analysis predicted 36 proteins responsible for the degradation and metabolism of aromatic and xenobiotic compounds, including alkane 1-monooxygenase, homogentisate 1,2-dioxygenase, protocatechuate 3,4-dioxygenase, ferredoxin reductase, and salicylate hydroxylase. Importantly, the genome harbors genes for asparagine biosynthesis. Moreover, it codes for asparaginase (locus tag: HFP51_08165) and isoaspartyl peptidase/L-asparaginase (locus tag: HFP51_10290); the former enzyme is widely used in medications and food supplements, and the latter in the treatment of acute lymphoblastic leukemia. Sequence analyses using BlastKOALA revealed that the genome also harbors genes for asparagine synthetase (ansB) and isoaspartyl peptidase (iaaA). The genes associated with asparagine biosynthesis and asparaginase are also found in the genomes of the neighboring strains JAMH 0132T, ATAX6-5T, and GrpM-11.
The genome information of Parasphingopyxis sp. CP4 may serve as an important resource for future research on asparaginase machinery and on the degradation of aromatic and xenobiotic compounds.
The complete genome sequence of Parasphingopyxis sp. CP4 has been deposited in the DDBJ/EMBL/GenBank databases under the accession number CP051130. The partial 16S rRNA gene sequence of Parasphingopyxis sp. CP4 has been deposited in the GenBank database under the accession number MT280029. Strain CP4 has been deposited at the Korean Culture Center of Microorganisms (KCCM) and the Japan Collection of Microorganisms (JCM) as KCCM 43361 and JCM 34130, respectively.
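A novel marine bacterium belonging to the genus Parasphingopyxis was isolated from seawater off Chujado Island, Republic of Korea, and designated strain CP4. In this study, the genome of Parasphingopyxis sp. CP4 was analyzed using the PacBio RS II and Illumina HiSeq platforms. The genome of this strain consists of a single chromosome of 2,963,413 bp with a G + C content of 58.55 mol%. It contains 2,867 protein-coding sequences, 3 rRNA genes, 44 tRNA genes, 4 non-coding RNA genes, and 9 pseudogenes. The genome of strain CP4 was found to harbor asparaginase genes with potential applications in pharmaceutical clinical research and the food industry.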
This work was supported by the MABIK in-house programs 2020M00500 and 2020M00600.