Development of a PCR-Based Assay for Identifying Members of the Pseudomonas syringae Species Complex from Environmental Samples

Progress report for GNE20-232

Project Type: Graduate Student
Funds awarded in 2020: $15,000.00
Projected End Date: 08/31/2022
Grant Recipient: The Pennsylvania State University
Region: Northeast
State: Pennsylvania
Graduate Student:
Faculty Advisor:
Kevin Hockett
The Pennsylvania State University

Project Information


There is a growing appreciation for the diversity of bacteria within the Pseudomonas syringae species complex (Pssc) that are capable of causing disease on tomato. Further, non-pathogenic lineages of the Pssc that are present in environmental reservoirs have been shown to be capable of adapting to agricultural settings and emerging as highly virulent pathogens. The goal of the proposed research is to develop an assay for screening various types of environmental samples that is both highly specific to the Pssc and offers a high degree of discriminatory ability within the Pssc. This assay involves the amplification of a genomic region partially encoding an R-type tailocin ubiquitous in the Pssc. Using primers that have been developed and tested both in-silico and in-planta, I will assess the ability of the assay to specifically amplify Pssc members from plant material, soil, and rain - all of which are important reservoirs for the Pssc. I will also determine the effectiveness and lower detection limit on commercial tomato seed, a common means of transmission for Pssc tomato pathogens. In addition to validating the molecular biology, I will also develop a software pipeline for accurately and reproducibly analyzing data obtained from the assay. The results of this study will provide extension staff and researchers with a valuable tool for studying and managing Pssc-mediated disease on tomato. Results will be communicated in publications and presentations geared toward extension staff, farmers, and agricultural researchers.

Project Objectives:

In this project, I propose to develop a diagnostic tool for researchers to detect and monitor a broad range of Pssc lineages associated with disease in tomato, and to investigate the potential for tailocins, ubiquitous anti-competitor toxins found within the PSSC, to be used as an indicator of tissue- and host-specific virulence.

The specific objectives of my proposal are:

  1. Investigate the evolutionary history of tailocin tail fibers within the PSSC. This will provide insight into the evolutionary pressures currently being exerted on tailocin-mediated microbial antagonism within the PSSC, which will in turn improve our understanding of the importance of microbe-microbe interactions for plant pathogens.
  2. Investigate the ecological role of the tailocin-encoding region of PSSC as it pertains to host- and tissue-specific virulence. This objective will provide insight into the usefulness of the region as an epidemiological marker for future work.
  3. Develop a software pipeline and online application to predict identities and virulence factors carried by unknown isolates detected through a PCR-based assay. Expected outcome: This objective will provide extension faculty and plant pathologists with a simple, accurate approach for analyzing the data obtained from PCR amplification with currently used primers. The output of the pipeline will include the most probable identities of unique lineages detected, as well as summaries of suspected virulence factors carried by the unknown isolate based on closely related reference strains.

The expected outcome for all three objectives together is to provide a diagnostic tool for extension faculty and epidemiologists to efficiently and accurately survey natural sources of Pssc members, identify known tomato pathogens throughout the complex, and better detect environmental lineages that might represent emerging tomato pathogens. We will do this by applying current knowledge of PSSC taxonomy and virulence factors of concern (VFOC) as well as investigating the potential of the tailocin region to be considered the first VFOC linked to microbe-microbe interactions.



The purpose of this project is to provide extension faculty and plant pathologists with a tool that will allow for pathovar-level identification of the economically significant plant pathogen Pseudomonas syringae.


The Pseudomonas syringae species complex (Pssc) is composed of both environmental lineages of bacteria and plant-pathogenic lineages called pathovars. Phylogenetic evidence suggests that the pathogenic lineages currently infecting tomato, bean, and cucurbits emerged from environmental lineages (Morris et al. 2013), and there is still frequent intermixing between agricultural and non-agricultural Pssc strains (Monteil et al. 2013). The emergence of a highly virulent strain of P. syringae pathovar actinidiae (pv. actinidiae) in 2008, responsible for widespread destruction of kiwifruit that forced growers in many regions to switch to more resistant but less profitable cultivars, is currently hypothesized to be the result of such intermixing of pathogenic and environmental strains (McCann et al. 2017).

As in kiwifruit, there have been considerable shifts throughout the last century in the dominant lineages of Pseudomonas syringae pv. tomato (Pto) found on diseased tomato (Cai et al. 2011), displaying a pattern of pathogen emergence, proliferation, and replacement by even more successful lineages. Emerging pathogenic strains are the result of environmental lineages adapting to agricultural settings or intermixing with current pandemic strains, introducing new virulence factors into these pathogenic lineages (Monteil et al. 2013). Genomic analysis of pathogenic and environmental Pseudomonas syringae species complex (PSSC) lineages reveals multiple independent instances of pathogen emergence from environmental lineages (Monteil et al. 2016). Therefore, methods for identifying both pathogenic and environmental PSSC lineages are imperative to understanding the ecology of pathogen emergence as well as to the timely detection of new pathogen lineages as they emerge.

Understanding the diversity of Pssc lineages co-existing in and near agricultural settings is crucial for sustainable management of current and emerging pathogens. An important example of why studying the full diversity of Pssc pathogens is crucial to disease management can be seen by examining an economically important pathovar found in the northeast United States: Pseudomonas syringae pv. tomato (Pto), the causative agent of bacterial speck in tomato. Although Pto is the most common Pssc pathovar to cause disease in tomato, many other members of the group have been identified as tomato pathogens, including pv. maculicola, pv. apii, pv. antirrhini, pv. syringae, and Pseudomonas viridiflava, all causing symptoms very similar to bacterial speck (Morris et al. 2019, Goumas et al. 1999). In a recent preliminary survey of bacterial isolates taken from tomato plants throughout New York exhibiting speck-like symptoms, only 44 out of 57 were confirmed with whole genome sequencing to be Pto. Two of the disease-associated isolates were identified as P. syringae pv. syringae, whereas the majority of the others were various non-syringae pseudomonads, including one instance of P. viridiflava. The extent to which these and other Pssc lineages might be intermixing has consequences for pathogen evolution and emergence that affect short- and long-term management strategies.






Materials and methods:

Primer design and preliminary testing of primer specificity conducted in the summer of 2019 found that roughly 80% of sequences on tomato leaves obtained from a tomato research field at The University of Florida and amplified by the Pssc-tailocin primers belonged to P. syringae. Analysis of in-silico PCR results from 191 published Pssc genomes showed that phylogenetic relationships inferred by amplification products broadly recapitulated known phylogeny based on multilocus sequence analysis (MLSA). Further, the amplicon-derived phylogeny was able to accurately discriminate between the known biovars within pv. actinidiae, including being able to identify strains belonging to the recently emerged and highly virulent lineage responsible for the 2008 kiwifruit pandemic. Further validation of the technique is necessary to determine usefulness under field conditions and will be accomplished through the following approaches:

Objective 1 

Note: Objective 1 has since been altered. The methods below are for the original Objective 1:

"Assess the specificity and discriminatory ability of a Pssc-tailocin primer set compared to culture-based method of detection within natural microbial communities"

Field collections: Asymptomatic leaves from tomato and the nearby weeds Chenopodium and Amaranthus were collected from the Russell E. Larson Agricultural Center at Rock Springs, Pennsylvania in late September 2020. Approximately five grams of plant material were removed from three individual plants of each species, immediately stored on ice for transport, and kept at 4°C for 48 hours. In late July 2021, asymptomatic leaves from tomato, squash, and the nearby weeds Chenopodium and Amaranthus were collected from a single 50x50 meter plot at the Gates West research farm in Geneva, New York, using the same procedure.


Processing of microbial communities: Leaf samples were suspended in 50 mL of phosphate-buffered saline containing 0.2% Tween. Suspended samples were vortexed for 20 seconds, sonicated in a water bath for 5 minutes, and sonicated a final time for 20 seconds. Leaf material was then removed. Bacterial cells in the remaining supernatant were concentrated on 0.4mm MicroFunnel Disposable Filter Funnels (Pall Laboratory), and DNA was extracted using the DNeasy PowerWater kit (Qiagen).


PCR amplification: Reactions used Phusion High-Fidelity DNA Polymerase with the following program: 98°C for 30 seconds; 30 cycles of 98°C for 5 seconds, 62°C for 30 seconds, and 72°C for 120 seconds; and a final extension at 72°C for 10 minutes. PCR products were visualized on a 1% agarose gel containing ethidium bromide.

Objective 3:

Figure 1: Bioinformatic pipeline for PSSC isolate identification and prediction of carried virulence factors.

PSSC genome curation: 2,473 genome assemblies belonging to the ‘Pseudomonas syringae group’ were downloaded from GenBank in November 2021. BUSCO analysis was performed on the genomes against the pseudomonadales_odb10.2019-04-24 reference database (Figure 1). The 2,161 assemblies scoring 99% or higher were kept for inclusion in the application’s dataset.
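
The filtering step above can be sketched in a few lines of Python. This is a minimal illustration, not the project's code: it assumes the BUSCO completeness percentages have already been parsed into a mapping from assembly name to score, and the assembly names and scores shown are hypothetical.

```python
from typing import Dict, List


def filter_by_completeness(scores: Dict[str, float], threshold: float = 99.0) -> List[str]:
    """Return the assemblies whose completeness score is at or above the threshold."""
    return sorted(name for name, pct in scores.items() if pct >= threshold)


# Hypothetical assembly names and scores, for illustration only.
example_scores = {
    "assembly_A": 99.6,
    "assembly_B": 98.9,
    "assembly_C": 99.0,
}

kept = filter_by_completeness(example_scores)
print(kept)
```

The threshold is inclusive, matching the "99% or higher" criterion described in the text.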

Screening genomes for virulence factors: The canonical type three secretion system (T3SS), all known subfamilies of type three effector proteins associated with the T3SS, and the WHOP genetic region associated with infection of woody tissue were chosen as virulence factors of concern (VFOC). For each VFOC, at least 12 translated reference sequences were aligned using MAFFT with the default settings, and the resulting multiple sequence alignments were used as input for a HMMER search of the curated PSSC genome assemblies (Figure 1).

In-silico PCR of PSSC marker genes: Published primer sets for the common marker genes RpoD, RpoB, GyrB, and Cts and an in-house Perl script were used to extract predicted amplicon sequences from the curated database of PSSC genome assemblies. Predicted amplicons for each gene were aligned using MAFFT with default settings (Figure 1).
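
The idea behind in-silico PCR can be illustrated with a short Python sketch. It is not the in-house Perl script mentioned above, whose details are not given here. It assumes exact primer matches only (no degenerate bases or mismatches), and the toy template and primer sequences are hypothetical.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def in_silico_pcr(template: str, forward: str, reverse: str,
                  max_len: int = 5000) -> Optional[str]:
    """Return the predicted amplicon (primers included), or None if none is found.

    Both strands of the template are searched. Matching is exact, which is a
    simplifying assumption of this sketch.
    """
    template = template.upper()
    forward = forward.upper()
    # On the amplified strand, the reverse primer appears as its reverse complement.
    reverse_site = reverse_complement(reverse.upper())
    for strand in (template, reverse_complement(template)):
        start = strand.find(forward)
        while start != -1:
            end = strand.find(reverse_site, start + len(forward))
            if end != -1:
                amplicon = strand[start:end + len(reverse_site)]
                if len(amplicon) <= max_len:
                    return amplicon
            start = strand.find(forward, start + 1)
    return None


# Toy sequences, for illustration only (not real primers).
toy_template = "TTTTACGTACGGGGCCCCTTGACAAAAA"
toy_forward = "ACGTAC"
toy_reverse = "TGTCAA"

amplicon = in_silico_pcr(toy_template, toy_forward, toy_reverse)
print(amplicon)
```

Searching both strands means the result does not depend on the orientation in which a genome assembly happens to be stored.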

PSSC phylogenetic tree: A multi-locus sequence alignment of 120 conserved housekeeping genes will be generated using GTDB-Tk, from which an approximate maximum likelihood tree will be constructed using FastTree2 (Figure 1).

Linking user-submitted amplicon sequences to reference databases: Each user-submitted sequence is added to the pre-computed multiple sequence alignment for the chosen marker gene using MAFFT --add, and pairwise distances between the submitted sequence and all reference sequences in the alignment are calculated using the R package ‘phangorn’. Genome assemblies whose reference amplicon sequences produce the lowest distance scores to the user’s submitted sequence are returned to the user, along with the VFOC they contain.
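
The ranking step can be sketched in Python as follows. The pipeline itself computes distances in R, and the distance model it uses is not stated here, so this sketch substitutes a simple p-distance (the proportion of differing sites among aligned, non-gap positions) purely for illustration. The sequence names and aligned sequences are hypothetical.

```python
from typing import Dict, List, Tuple


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences, skipping gaps."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # ignore positions where either sequence has a gap
        compared += 1
        if x != y:
            differences += 1
    if compared == 0:
        return float("inf")  # no comparable sites; rank such references last
    return differences / compared


def rank_references(query: str, references: Dict[str, str]) -> List[Tuple[str, float]]:
    """Return (reference name, distance) pairs sorted from closest to farthest."""
    scored = [(name, p_distance(query, seq)) for name, seq in references.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical aligned sequences, for illustration only.
query_seq = "ACGT-ACGTA"
reference_seqs = {
    "ref_1": "ACGT-ACGTA",
    "ref_2": "ACGTTACGTT",
    "ref_3": "TCGT-ACGAA",
}

ranked = rank_references(query_seq, reference_seqs)
for name, dist in ranked:
    print(name, round(dist, 3))
```

The references at the top of the ranking correspond to the "lowest distance scores" described above; their genome assemblies would then be looked up to report the VFOC they carry.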

Figure 2: User interface of the web application. The user submits amplicon sequences and is returned an interactive report containing information about the taxonomic prediction and possible VFOC carried by the unknown isolate.

User interface: Users can submit multiple amplicon sequences in FASTA format through a text field (Figure 2). After the marker gene is selected and the sequences are submitted for analysis on a remote server, the user is returned an interactive report giving an overview of genetic distances between their sequence and all reference sequences, along with detailed information about the best match. The user can also set a threshold genetic distance to be used for generating a summary of the closest relatives. The summary report includes all species, pathovars, and phylogroups found among the closest relatives, as well as the frequency of each of 132 genes marked as VFOC. The user interface is implemented using the ReactJS framework.
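
The threshold-based summary described above might look like the following Python sketch. The data layout, field names, and gene names are assumptions made for illustration; they are not taken from the application itself.

```python
from collections import Counter
from typing import Any, Dict


def summarize_relatives(distances: Dict[str, float],
                        metadata: Dict[str, Dict[str, Any]],
                        threshold: float) -> Dict[str, Any]:
    """Summarize the references whose distance to the query is within the threshold."""
    close = [name for name, dist in distances.items() if dist <= threshold]
    pathovars = set()
    gene_counts: Counter = Counter()
    for name in close:
        pathovars.add(metadata[name]["pathovar"])
        gene_counts.update(metadata[name]["vfoc"])
    n = len(close)
    # Frequency = fraction of close relatives that carry each gene.
    frequency = {gene: count / n for gene, count in gene_counts.items()} if n else {}
    return {
        "n_relatives": n,
        "pathovars": sorted(pathovars),
        "vfoc_frequency": frequency,
    }


# Hypothetical distances and annotations, for illustration only.
example_distances = {"ref_1": 0.00, "ref_2": 0.01, "ref_3": 0.20}
example_metadata = {
    "ref_1": {"pathovar": "tomato", "vfoc": {"effector_X", "effector_Y"}},
    "ref_2": {"pathovar": "tomato", "vfoc": {"effector_X"}},
    "ref_3": {"pathovar": "syringae", "vfoc": {"effector_Y"}},
}

summary = summarize_relatives(example_distances, example_metadata, threshold=0.05)
print(summary)
```

Raising the threshold widens the set of relatives included, which trades specificity of the taxonomic summary for a broader view of the virulence factors found nearby.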

Research results and discussion:
  1. Assess the specificity and discriminatory ability of a Pssc-tailocin primer set compared to culture-based method of detection within natural microbial communities


Based on preliminary testing of the PSSC tailocin-region primer set in the summer of 2019, the primers were thought to be sensitive enough to detect PSSC members in natural communities without culturing or isolation. However, in the summers of both 2020 and 2021, the primers failed to yield any PCR products indicative of the region of interest, and the method was therefore judged unreliable. A suspected reason for these failures, despite promising preliminary data, is that the preliminary work used samples displaying symptoms of bacterial speck, so the relative abundance of PSSC in those samples may have been much higher than in the later samples, which consisted entirely of asymptomatic leaves. While PCR amplification can detect DNA at very low concentrations, we suspect that the large amplicon size generated by our primers (3-5 kb) severely reduced sensitivity.

Considering our results, we decided to alter objectives 1 and 2 to focus on understanding the evolutionary history and ecological role of the tailocin region as it pertains to host- and tissue-specific virulence, and the potential of the region as a VFOC. While the approach we originally outlined did not allow for use of the tailocin region as an epidemiological marker, the region encodes a ubiquitous and potent toxin and is therefore still likely to be important for the survival and establishment of PSSC plant pathogens on their host plants. Our updated objectives will provide important insights into how the tailocin region could be used in the future to predict pathogenicity at the host and/or tissue level, and will motivate future work utilizing the region as an epidemiological marker.


  3. Develop a software pipeline to predict identities of lineages detected through a PCR-based assay


We have developed a taxonomic and functional prediction pipeline based on curated databases representing the currently known diversity of the PSSC, along with their VFOC and predicted amplicon sequences generated from popular marker genes. Our pipeline is accessible through a web application with an intuitive and responsive user interface, allowing rapid and reproducible predictions of important taxonomic classifications within the PSSC, including species, phylogroup, and pathovar designations. Along with taxonomic information, our web application provides predictions of ecologically significant genes associated with virulence. We believe our application will encourage wider adoption of marker genes for identification purposes and represents an important tool for precision agriculture.

Participation Summary

Education & Outreach Activities and Participation Summary

Education/outreach description:

Currently no educational or outreach events have been undertaken.

Project Outcomes

Project outcomes:

Our results have shown that our proposed method of assaying natural communities was not feasible, most likely because of the large amplicon size generated by our primers. However, in developing our primers, we uncovered unexpected patterns in the distribution of alleles of the tail fiber associated with the tailocin region. Investigating these patterns through our updated objectives 1 and 2 has the potential to increase our knowledge of the role the tailocin region plays in tissue- and/or host-specific pathogenicity.

The diagnostic tool has shown great promise in allowing researchers to more efficiently use PCR marker genes, and therefore improve their utility in epidemiological tracking and identification.

Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the view of the U.S. Department of Agriculture or SARE.