Salmonella is a foodborne pathogenic bacterium that is mainly transmitted via the consumption of contaminated food or water (Hoelzer et al., 2011). It is found in poultry, eggs, milk, vegetables, and meat products. Currently, there are 2,659 known serotypes of Salmonella, which are classified on the basis of their unique O (somatic) and H (flagellar) surface antigens (Issenhuth-Jeanjean et al., 2014). These serotypes vary greatly in their host range, virulence, and epidemiology. With the rise of next-generation sequencing technologies, genomic typing tools have become increasingly popular and effective (Hu et al., 2021).
In this study, we determined the complete genome sequence of Salmonella enterica serovar Typhimurium strain MFDS1021937, which was isolated from an egg garnish associated with food poisoning incidents reported in Gyeongsangnam-do, South Korea, in 2022.
To acquire high-quality genomic DNA, the MFDS1021937 isolate was grown on tryptic soy agar at 37°C overnight, and genomic DNA was extracted using a DNeasy Blood and Tissue Kit (Qiagen). After quantitation and qualification, the DNA was used to prepare sequencing libraries. To obtain the complete genome sequence of the strain, sequencing was performed on the PacBio Sequel platform. The PacBio sequencing library was constructed through shearing and SMRTbell preparation per the manufacturer's instructions. Library quality was assessed using a Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific) and an Agilent 2100 Bioanalyzer (Agilent Technologies).
The raw sequence reads were assembled de novo using the HGAP assembler in the PacBio SMRT Analysis system. The hierarchical approach of HGAP, combined with PacBio long-read sequencing, facilitated the generation of a highly accurate and contiguous assembly, which is crucial for obtaining a complete genome. The total reads were assembled into three contigs with a total length of 5,136,881 bp and an N50 of 4,857,986 bp. The genome coverage was 541×. The genome was annotated using RASTtk (Brettin et al., 2015). Virulence-associated genes were predicted using the Virulence Factor Database (VFDB) (Liu et al., 2019), and antimicrobial resistance genes were predicted using ResFinder v4.3.1 (Bortolaia et al., 2020). The genome of S. Typhimurium MFDS1021937 consists of a 4,857,986-bp chromosome and two plasmids of 98,402 bp and 93,838 bp, with G + C contents of 52.2%, 50.4%, and 53.1%, respectively. The three contigs contain 4,859, 133, and 142 coding sequences (CDSs), respectively. The chromosome harbors 84 tRNAs and 22 rRNAs. One of the two plasmids contains the IncI1-Iα replicon type, and the other contains the IncFIB and IncFII replicon types (Table 1). The serovar was determined to be a Typhimurium monophasic variant (I 4,[5],12:i:-) using SeqSero2 v1.1.0 (Zhang et al., 2019). VFDB predicted 136 virulence genes, and ResFinder predicted the presence of the antimicrobial resistance gene aac(6')-Iaa. Fig. 1 presents a circular genome map based on the genome analysis data. In addition, MFDS1021937 contains Salmonella pathogenicity island (SPI) genes, including SPI-1 (invA, sipB, sipC, prgH, and sopE) and SPI-2 (sifA, pipB2, and spvC) genes (Table 2). The identification of these virulence genes provides valuable insights into the mechanisms underlying bacterial invasion and intracellular survival.
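As an illustration of how the N50 statistic relates to the replicon sizes reported above, the following minimal Python sketch computes an N50 from a list of contig lengths. It is not the assembler's own code. The function name `n50` and the dictionary labels `plasmid_1` and `plasmid_2` are hypothetical names chosen for this example; the lengths are the chromosome and plasmid sizes given in the text.

```python
def n50(lengths):
    """Return the N50 of an assembly.

    Contigs are sorted from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches at least half of
    the summed contig length.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no contigs given")


# Replicon sizes (bp) as reported in the text; the labels are hypothetical.
replicons = {
    "chromosome": 4_857_986,
    "plasmid_1": 98_402,
    "plasmid_2": 93_838,
}

print(n50(replicons.values()))  # prints 4857986
```

Because the chromosome alone accounts for more than half of the summed replicon length, the N50 equals the chromosome length, as expected for a closed bacterial genome.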
This complete genome information is vital for understanding foodborne pathogens and food poisoning as it provides a comprehensive genetic basis for investigating virulence factors and identifying potential sources of foodborne outbreaks.
The complete genome sequence of Salmonella enterica serovar Typhimurium MFDS1021937 has been deposited in NCBI GenBank under accession number JBHFPU000000000. The strain has been deposited in the Korean Culture Collection for Foodborne Pathogens under strain number MFDS1021937.
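Salmonella is a foodborne pathogen that can cause serious illness when contaminated food is consumed. In this study, we analyzed the genome of Salmonella strain MFDS1021937, which was isolated from an egg garnish presumed to be the causative food of a food poisoning incident that occurred at a restaurant in Gyeongsangnam-do in 2022. Salmonella enterica serovar Typhimurium MFDS1021937 consists of a 4,857,986-bp chromosome and two plasmids of 98,402 bp and 93,838 bp, with G + C contents of 52.2%, 50.4%, and 53.1%, respectively. In addition, 4,859 protein-coding genes, 84 transfer RNAs, and 22 ribosomal RNAs were predicted on the chromosome, and 133 and 142 protein-coding genes were identified on the two plasmids, respectively. The genome information of MFDS1021937 provides in-depth genetic information for elucidating the genetic basis of the virulence and pathogenicity of S. Typhimurium.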
This study was financially supported by the Ministry of Food and Drug Safety, Republic of Korea (23194MFDS017).
The authors have no conflict of interest to report.