Geminiviruses are plant viruses having circular single-stranded DNA genomes encapsidated in twinned icosahedral particles (
Harrison and Robinson, 1999). Papaya leaf curl China virus (PaLCuCNV), classified within the genus
Begomovirus of the family
Geminiviridae, is a substantial contributor to global crop losses (
Wang et al., 2004). The initial report of PaLCuCNV associated with leaf curl disease of papaya was from Guangxi Province of China (
Wang et al., 2004). In recent years, both the geographical distribution and the host range of PaLCuCNV have expanded, resulting in escalating economic damage. Weeds serve as intermediate hosts for a wide range of geminiviruses and function as natural virus reservoirs that play an important role in virus transmission and evolution (
Huang and Zhou, 2006;
Passos et al., 2017;
Wang et al., 2021).
Ageratum conyzoides is an annual weed extensively distributed across tropical and subtropical regions of Asia and is a crucial host for a variety of geminiviruses (
Huang and Zhou, 2006;
Li et al., 2018). The heterogeneous begomovirus complex PaLCuCNV/ageratum yellow vein China betasatellite (AYVCNB) induced the severe symptoms typical of yellow vein disease in
A. conyzoides, whereas PaLCuCNV alone produced asymptomatic infections in
A. conyzoides (
Jiao et al., 2013).
In the “virus-host-environment” triangle, plant viruses are subject to both host and environmental pressures, and molecular evolution is a valuable tool for revealing the molecular basis of viral adaptation to novel hosts and geographic spread, as well as for developing more effective epidemic control strategies (
Cuevas et al., 2012;
Guan et al., 2024). Mutation, recombination, genetic drift, selective pressure, and migration are the primary forces driving the molecular evolution of plant viruses. In this study, following molecular identification of the pathogen causing yellow vein disease of
A. conyzoides, the population genetic diversity of PaLCuCNV was analyzed. Sequences obtained in this research were compared with published isolates from diverse regions and hosts in NCBI GenBank to assess phylogeny, population variation, and genetic diversity. Additionally, the study evaluated the role of genetic forces such as recombination, selective pressure, and genetic drift in population differentiation. The outcomes of this investigation enhance the understanding of the genetic diversity and molecular evolutionary dynamics of PaLCuCNV within natural populations.
In the present study, 34 samples of
A. conyzoides plants exhibiting yellow vein symptoms were collected from fields located in Yunnan, Guangdong, and Guangxi Provinces, China. Total DNA was extracted from the leaf tissues of samples according to an established method (
Lee et al., 2024). The degenerate primer pair PA/PB was used to amplify a partial viral DNA fragment of about 500 bp, and samples positive in the PA/PB amplification were then amplified with the PaLCuCNV-specific primer pair PA-F36/PATY-R (
Deng et al., 1994). Samples positive in the specific detection were selected for cloning and sequence determination. The primer pair PaLCuCNV99-F (5′-CTGGTGGGCCAGTATGCAA-3′)/PaLCuCNV99-R (5′-CACCAGTAACAGTCGCCT-3′) was custom-designed from the sequences of the obtained fragments to amplify the full-length viral genomic DNA. The PCR conditions comprised initial denaturation at 94°C for 3 min; 30 cycles of denaturation at 98°C for 10 s, annealing at 53°C for 15 s, and extension at 72°C for 3 min; and a final extension at 72°C for 10 min. The PCR products were recovered, purified, and ligated into the cloning vector of the pEASY-Blunt Cloning Kit, and the complete genome sequence of PaLCuCNV was obtained after sequence determination.
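The cycling program described above can be written out as a small data structure to make explicit which steps are repeated. This is only a sketch: it assumes that the 30 cycles apply to the denaturation, annealing, and extension steps, and it ignores ramp times between temperatures, so the total is a lower bound on the actual run time.

```python
# Thermocycler program as described in the text (name, temperature in °C, hold in s).
# Assumption: the 30 cycles cover denaturation, annealing, and extension.
INITIAL_DENATURATION = ("initial denaturation", 94, 3 * 60)
CYCLE_STEPS = [
    ("denaturation", 98, 10),
    ("annealing", 53, 15),
    ("extension", 72, 3 * 60),
]
N_CYCLES = 30
FINAL_EXTENSION = ("final extension", 72, 10 * 60)


def total_hold_seconds():
    """Sum of all hold times, ignoring temperature ramping."""
    per_cycle = sum(seconds for _, _, seconds in CYCLE_STEPS)
    return INITIAL_DENATURATION[2] + N_CYCLES * per_cycle + FINAL_EXTENSION[2]


total = total_hold_seconds()
print(f"{total} s = {total / 60:.1f} min")
```

Under these assumptions the hold times add up to 6,930 s, or about 115.5 min.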
Sequence editing and analysis were conducted utilizing DNAStar software (version 7.0, Madison, WI, USA) and DNAMAN software (version 5.2.2, Lynnon Biosoft, Quebec, Canada). The complete genome sequences of PaLCuCNV isolates were aligned using the Clustal W program within MEGA 11.0 software (
Tamura et al., 2021). Pairwise identity analysis of the complete nucleotide sequences was performed using Sequence Demarcation Tool software (SDT v1.2). The phylogenetic tree was constructed using the neighbor-joining method based on the complete DNA sequences of the isolates obtained in this study and 68 PaLCuCNV isolates retrieved from GenBank, with 1,000 bootstrap replications. Information on the PaLCuCNV isolates used in this study is listed in
Supplementary Table 1. To investigate the potential occurrence of recombination events during the evolutionary trajectory of PaLCuCNV, the complete genome sequences of 71 PaLCuCNV isolates were subjected to recombination analysis using seven algorithms, including RDP, GeneConv, BootScan, MaxChi, CHIMAERA, SiScan, and 3SEQ, within Recombination Detection Program version 4.0 (RDP v4.0) software (
Jiao et al., 2013). A recombination event was deemed significant when detected by three or more methods, with an associated
P < 0.05 (
Freire et al., 2015). To examine the historical dynamics of the PaLCuCNV populations, the haplotype diversity (Hd) and nucleotide diversity (Pi) values were estimated using DnaSP 6.0 software (
Rozas et al., 2017), with 0 < Hd < 1 and 0 < Pi < 0.1. Additionally, three tests of selective neutrality, namely Tajima’s D, Fu & Li’s D, and Fu & Li’s F, were performed (
Fu and Li, 1993;
Tajima, 1989). Selection pressure was assessed from the ratio of the non-synonymous substitution rate (dN) to the synonymous substitution rate (dS). Following the exclusion of sequences containing recombination sites, the values of dN and dS of PaLCuCNV were computed separately using DnaSP 6.0 software to determine the selection pressure within each population (
Sun et al., 2021;
Tomitaka and Ohshima, 2006). The genetic differentiation and gene flow of PaLCuCNV were analyzed using DnaSP 6.0 software. Genetic differences between populations were assessed using the sequence statistic (Kst), the rank statistic (Z), and the nearest-neighbor statistic (Snn), and populations were considered significantly differentiated when all three statistics yielded significant values (
P < 0.05) (
Hudson, 2000;
Hudson et al., 1992). The fixation index (Fst), a coefficient of genetic differentiation, and the number of migrants per generation (Nm) were calculated to quantify the degree of genetic differentiation and the extent of gene flow between populations (
Sun et al., 2021).
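To make the two diversity measures concrete, the following minimal sketch computes haplotype diversity (Hd) and nucleotide diversity (Pi) from a toy alignment. It uses the standard definitions (Hd as the unbiased probability that two sampled sequences differ, Pi as the mean number of pairwise differences per site) and assumes gap-free sequences of equal length. The toy sequences are invented for illustration, and the code is not a reproduction of DnaSP's handling of gaps or missing data.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's unbiased haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site (gap-free alignment assumed)."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)


# Hypothetical aligned sequences, for illustration only.
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC", "ACGTACGTAC"]
print(round(haplotype_diversity(toy), 4))
print(round(nucleotide_diversity(toy), 4))
```

An Hd of 1 means that every sampled sequence is a distinct haplotype, which is why Hd alone saturates quickly for whole-genome data and is usually read together with Pi.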
In the present study, total DNA extracted from the 34
A. conyzoides samples was used as the PCR template with the degenerate begomovirus primer pair PA/PB. Specific bands of about 500 bp were amplified from 28 samples, indicating that 82.4% of the samples might be infected by geminiviruses (
Table 1). The 28 fragments obtained from the positive samples were subsequently cloned and sequenced. Upon comparing the sequencing results with sequences available in GenBank, it was found that 22 of the specific fragment sequences exhibited more than 98% similarity to PaLCuCNV, suggesting potential PaLCuCNV infection in these
A. conyzoides plant samples. To further characterize the PaLCuCNV infecting
A. conyzoides plants at the molecular level, one of the 22 positive samples was randomly selected from each province. Following amplification, cloning, and sequencing, three distinct PaLCuCNV isolates were obtained, and their complete genome sequences were deposited in the GenBank database under accession numbers MF417764 (GD5, Guangdong), MF417765 (BN3, Yunnan), and MF417766 (GX128, Guangxi). Analysis of the genome organization showed that BN3 and GD5 each had a genome size of 2,754 nucleotides (nt), whereas GX128 had a full length of 2,732 nt. All three isolates had the typical genome organization of
Begomovirus, with closed circular single-stranded DNA genomes encoding six open reading frames.
To further clarify the presence of PaLCuCNV, all 34 samples were tested with the PaLCuCNV-specific primer pair PA-F362/PATY-R, which showed that 22 samples were infected with PaLCuCNV (
Table 1), a detection rate of 64.7%, in agreement with the sequencing results of the PA/PB assay. The detection rate of PaLCuCNV was highest (85.7%) in the samples collected from Yunnan Province, whereas only 50.0% of the samples collected from Guangdong Province were infected with PaLCuCNV.
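The two overall detection rates follow directly from the sample counts reported above, as this short arithmetic check shows. The variable names are illustrative only.

```python
TOTAL_SAMPLES = 34        # A. conyzoides samples collected
PA_PB_POSITIVE = 28       # positive with the degenerate pair PA/PB
PALCUCNV_POSITIVE = 22    # positive with the PaLCuCNV-specific pair


def detection_rate(positive, total):
    """Detection rate in percent, rounded to one decimal place."""
    return round(100 * positive / total, 1)


geminivirus_rate = detection_rate(PA_PB_POSITIVE, TOTAL_SAMPLES)
palcucnv_rate = detection_rate(PALCUCNV_POSITIVE, TOTAL_SAMPLES)
print(geminivirus_rate, palcucnv_rate)
```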
Comparative analysis was performed against 68 other PaLCuCNV isolates registered in GenBank. The results showed that the nucleotide identity between the three isolates obtained in this study and the other 68 sequences ranged from 85.3% to 99.9% (
Fig. 1). Notably, both BN3 and GD5 exhibited the highest similarity with PaLCuCNV isolate Fz10 (GenBank accession no. JF682837, Fujian, China) at 99.9% and 99.8%, respectively, and GX128 showed the highest similarity with PaLCuCNV isolate G111 (GenBank accession no. HG003651, Guangxi, China) at 99.5%.
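As an illustration of what a pairwise nucleotide identity value represents, the sketch below computes percent identity between two already aligned sequences, skipping columns in which either sequence has a gap. This is a simplified stand-in and is not SDT's implementation; the gap treatment and the toy sequences are assumptions made for the example.

```python
def pairwise_identity(seq_a, seq_b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gap columns are ignored
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100 * matches / compared


# Hypothetical aligned fragments, for illustration only.
identity = pairwise_identity("ACGT-ACGTA", "ACGTTACGAA")
print(f"{identity:.1f}%")
```

In this toy case nine columns are compared and eight match, giving about 88.9% identity.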
Subsequently, a phylogenetic tree was constructed based on the complete genomes of the isolates (
Fig. 2). The clustering analysis revealed five distinct groups, denoted I, II, III, IV, and V, containing 10, 10, 14, 16, and 21 members, respectively. BN3 and GD5 from this study fell within group IV, and GX128 within group V. Within each group, the isolates came from diverse hosts and were not geographically homogeneous. Group V was the richest in host species, including isolates from six distinct hosts, whereas the isolates in group IV had the most diverse geographic origins, being distributed in Vietnam as well as in Guangxi, Jiangxi, Guangdong, Yunnan, and Fujian Provinces of China. These results indicated that the genetic differentiation of PaLCuCNV was not significantly correlated with host species or geographical location.
To investigate the occurrence of recombination events during the evolutionary process of PaLCuCNV, recombination analysis of the full-length genome sequences of 71 PaLCuCNV isolates was conducted. The results revealed a total of 13 recombination events among the 71 isolates. Notably, the major and minor parents of all recombinants were PaLCuCNV isolates, that is, recombination was intraspecific, and a total of 20 isolates were involved (
Table 2). The results for the isolates involved in the 13 identified recombination events were then integrated with the phylogenetic tree (
Table 3). The isolates engaged in recombination were dispersed across the five populations. Interestingly, the recombinants and their parental isolates did not always fall within the same population, indicating inter-population recombination events. For instance, the major parent of isolate GX4, situated in population I, was NN1 from population IV, while its minor parent JZ1 originated from population III.
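The acceptance rule used for recombination events (support from at least three of the seven detection methods at P < 0.05) can be expressed as a simple filter. The sketch below is illustrative only: the P-values are hypothetical, and the dictionary layout is an assumption rather than RDP's output format.

```python
METHODS = ("RDP", "GENECONV", "BootScan", "MaxChi", "Chimaera", "SiScan", "3Seq")


def is_significant_event(p_values, alpha=0.05, min_methods=3):
    """True if at least `min_methods` methods report P < alpha.

    `p_values` maps method name to P-value; None means the method found no signal.
    """
    supporting = [
        m for m in METHODS
        if p_values.get(m) is not None and p_values[m] < alpha
    ]
    return len(supporting) >= min_methods


# Hypothetical P-values for two candidate events (not taken from the study).
event_a = {"RDP": 1e-6, "GENECONV": 3e-4, "MaxChi": 2e-3, "SiScan": None}
event_b = {"RDP": 0.01, "GENECONV": 0.2, "BootScan": None}

print(is_significant_event(event_a))  # three methods below 0.05
print(is_significant_event(event_b))  # only one method below 0.05
```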
The complete genome sequences of PaLCuCNV isolates, grouped according to the phylogenetic clustering, were subjected to population genetic diversity analysis and neutrality tests. Population I exhibited the highest Hd and Pi values, 1 and 0.03314, respectively. Conversely, population II displayed the lowest Hd value of 0.956, while population III had the lowest Pi value of 0.00847. Overall, across all PaLCuCNV isolates, the Hd and Pi values were 0.997 and 0.05736, respectively, indicating a notably high level of genetic diversity for PaLCuCNV (
Table 3).
The neutrality tests of the grouped PaLCuCNV full-length genome sequences showed that Tajima’s D, Fu and Li’s D, and Fu and Li’s F values were negative for all populations except populations II and IV, for which all three values were positive. None of the three test statistics was significant in any population (P > 0.05). These results suggest an overall expansion trend and indicate that PaLCuCNV evolution follows a neutral model.
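For readers unfamiliar with how the sign of Tajima's D arises, the following sketch implements the standard formula (Tajima, 1989) for a gap-free alignment. D compares the mean number of pairwise differences with the number of segregating sites scaled by sample size; negative values are commonly read as an excess of rare variants (as under population expansion or purifying selection), positive values as an excess of intermediate-frequency variants. The toy sequences are invented, and the sketch does not compute significance, which requires comparison with a reference distribution.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for aligned, gap-free sequences of equal length."""
    n = len(seqs)
    if n < 4:
        raise ValueError("at least four sequences are needed")
    length = len(seqs[0])
    # S: number of segregating (polymorphic) sites
    seg_sites = sum(1 for i in range(length) if len({s[i] for s in seqs}) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")
    # k: mean number of pairwise nucleotide differences
    pairs = list(combinations(seqs, 2))
    k = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (k - seg_sites / a1) / sqrt(variance)


# Hypothetical aligned sequences, for illustration only.
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC", "ACGTACGTAC"]
print(round(tajimas_d(toy), 3))
```

In the toy alignment both variable sites are singletons, so the statistic is negative.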
After excluding the 13 sequences identified as recombinants by the recombination analysis, the dN and dS values of PaLCuCNV were calculated separately to estimate its selection pressure. The results showed that all populations had dN/dS < 1 except population II, which had dN/dS > 1, and population V had the smallest dN/dS value of 0.24219 (
Table 4). This indicates that most of the mutations in the PaLCuCNV populations are synonymous mutations and that PaLCuCNV evolution is mainly affected by negative selection pressure, which is consistent with the results of the tests of selective neutrality.
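The interpretation of the dN/dS ratio used here follows the usual convention: a ratio below 1 points to negative (purifying) selection, a ratio above 1 to positive selection, and a ratio of 1 to neutral evolution. A minimal sketch of that decision rule is given below; the function names and the example dN and dS values are hypothetical, and only the ratio 0.24219 comes from the text.

```python
def dn_ds_ratio(dn, ds):
    """Ratio of non-synonymous to synonymous substitution rates."""
    if ds <= 0:
        raise ValueError("dS must be positive for the ratio to be defined")
    return dn / ds


def classify_selection(omega):
    """Conventional reading of a dN/dS ratio."""
    if omega < 0:
        raise ValueError("dN/dS cannot be negative")
    if omega < 1:
        return "negative (purifying) selection"
    if omega > 1:
        return "positive selection"
    return "neutral evolution"


# Ratio reported for population V in the text.
print(classify_selection(0.24219))
# Hypothetical dN and dS values, for illustration only.
print(classify_selection(dn_ds_ratio(0.012, 0.048)))
```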
The analysis of genetic differentiation and gene flow among the five populations, categorized according to the clustering patterns, revealed significant values of Kst, Z, and Snn (
P < 0.05) (
Table 5). These significant values indicate substantial genetic differences among the PaLCuCNV populations. Moreover, the Fst values between populations were all > 0.33, indicating infrequent gene flow, and the Nm values among the five populations were all < 1, suggesting that genetic drift readily occurs among these populations. These results suggest that genetic drift is the main reason for the significant genetic differentiation among PaLCuCNV populations.
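The two thresholds quoted above are linked. Under the island-model approximation often used for haploid genomes, Fst ≈ 1/(1 + 2Nm), so Nm = (1/Fst − 1)/2, and Fst = 1/3 corresponds to Nm = 1. The sketch below assumes this haploid form; whether it matches the exact setting used in the study is an assumption, and the input values are illustrative.

```python
def nm_from_fst(fst, factor=2):
    """Estimate Nm from Fst under an island model.

    factor=2 assumes a haploid genome (Fst ~ 1 / (1 + 2*Nm));
    factor=4 would correspond to the diploid nuclear case.
    """
    if not 0 < fst <= 1:
        raise ValueError("Fst must lie in (0, 1]")
    return (1 / fst - 1) / factor


# Illustrative values: higher Fst implies fewer migrants per generation.
for fst in (0.2, 1 / 3, 0.5, 0.8):
    print(f"Fst = {fst:.3f} -> Nm = {nm_from_fst(fst):.3f}")
```

With this relationship, any Fst above 1/3 yields Nm below 1, the range in which drift is generally expected to outweigh the homogenizing effect of gene flow.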
In this study, a total of 34 A. conyzoides samples with yellow vein symptoms were collected from Guangxi, Guangdong, and Yunnan Provinces. With the degenerate primer pair PA/PB, the detection rate of geminiviruses was 82.4%, whereas the detection rate of PaLCuCNV was 64.7%. The yellow vein symptoms of some field samples may therefore have resulted from infection by pathogens other than geminiviruses, and geminiviruses other than PaLCuCNV were also present in the samples; whether mixed infections occur needs to be verified by subsequent experiments. PaLCuCNV host weeds such as A. conyzoides should be cleared from farmland to minimize the damage caused by the virus to crops.
The current study revealed that the correlation between the level of genetic diversity of PaLCuCNV and plant host and geographic factors was not significant, which is consistent with the previous report that PaLCuCNV isolates infecting tomato and other hosts in Guangxi, Fujian and Henan Provinces cannot be sub-clustered based on host and distribution (
Zhang et al., 2010). A similar phenomenon has been observed in studies of the plant RNA viruses cymbidium mosaic virus (CymMV) and odontoglossum ringspot virus (ORSV) (
Rao et al., 2014). However, the number of known PaLCuCNV isolates is limited, and most come from Guangxi Province, China, with fewer from other parts of China and from other countries, which may influence the assessment of the relationship between PaLCuCNV phylogeny and geographical factors. Additional genome sequences of PaLCuCNV isolates from other regions are needed to fully assess the phylogeny of PaLCuCNV.
Recombination plays a crucial role in driving the evolution of geminiviruses (
Hou and Gilbertson, 1996). Many viral virulence variants, such as those causing cassava mosaic disease, cotton leaf curl disease, and tomato yellow leaf curl disease, are generated through recombination events in the viral genome (
Farooq et al., 2011;
Monci et al., 2002). Moreover, recombination between different viruses can lead to the emergence of new viruses, as in the case of conyza yellow vein virus (CoYVV), which arose through recombination between viruses related to tomato yellow leaf curl Yunnan virus (TYLCYnV) and tomato yellow leaf curl China virus (TYLCCNV) (
Li et al., 2022). Recombination events within PaLCuCNV populations were found to be common in this study, indicating that recombination was an important factor influencing PaLCuCNV variation. These findings underscore the importance of further research to understand the potential for PaLCuCNV to recombine with other viruses.
Population genetic diversity can be used as a pivotal indicator of the pathogen's adaptability to its environment, with rapidly mutating pathogens demonstrating enhanced survivability and evolutionary advantage (
Zhu and Zhan, 2012). Our findings indicated that PaLCuCNV has high genetic diversity, and the neutrality tests suggested that the PaLCuCNV population is expanding, indicating enhanced adaptability of the virus to a broader range of hosts, environmental conditions, and insect vectors. In the present study, PaLCuCNV was detected on
A. conyzoides in Yunnan, Guangxi, and Guangdong Provinces of China, showing that PaLCuCNV infection is expanding. This raises concerns regarding the potential for PaLCuCNV to infect other crops and cause more severe damage; the transmission and spread of PaLCuCNV should be closely monitored, and quarantine of the virus should be strengthened. The findings of this study contribute to our understanding of the mutation trend and evolutionary potential of PaLCuCNV and lay a foundation for the prevention and control of diseases caused by this virus.