TY - JOUR
T1 - Paving the way for deeper insights into nematode community composition with long-read metabarcoding
T2 - ecological and biogeographical coverage of the sequences
AU - Noshadi, Arash
AU - Ghaderi, Reza
AU - Nielsen, Uffe N.
AU - Hayden, Helen L.
AU - He, Ji Zheng
PY - 2026/1
Y1 - 2026/1
N2 - Long-read metabarcoding has excellent potential to advance nematode community ecology beyond the limitations of morphological and short-read approaches. The ecological and biogeographical background of existing sequence databases offers important insights into the potential application of high-throughput sequencing approaches in nematode community analyses. This study searched public databases for the three universal marker genes used in nematode molecular taxonomy studies, i.e., 18S ribosomal RNA (18S rRNA), 28S ribosomal RNA (28S rRNA), and cytochrome c oxidase subunit I (COI) genes, to retrieve full-length sequences suitable for long-read metabarcoding of soil nematode communities. The most full-length sequences were found for COI with 17534, followed by 4898 for 18S rRNA and 800 for 28S rRNA. These full-length sequences represented 185, 54, and 163 unique families; 626, 160, and 609 unique genera; and 1320, 235, and 1527 unique species for 18S rRNA, 28S rRNA, and COI markers, respectively. Nucleotide composition and diversity analyses across the three markers revealed distinct patterns affecting their utility for taxonomic studies. Geographically, the majority of the sequences were from the United States, China, Japan or Germany. Additionally, precise country-of-origin information was lacking for the majority of sequences, highlighting the limitations of the current databases and rendering robust geographic analyses difficult. Full-length sequences were assigned to an ecological framework, revealing that, for nematode trophic groups, herbivores were the most numerous group (10735 sequences), followed by animal parasites (6588 sequences), bacterivores (1785 sequences) and entomopathogenic nematodes (1513 sequences), whereas other trophic groups had fewer representative sequences. Assigning the sequences to colonizer-persister (c-p) classification revealed that all c-p groups were covered by the retrieved full-length sequences, particularly c-p 3, which had the highest number of sequences (6691). This study provides a foundational understanding of the molecular data currently available for use in long-read metabarcoding databases, facilitating ecological research on nematode communities.
AB - Long-read metabarcoding has excellent potential to advance nematode community ecology beyond the limitations of morphological and short-read approaches. The ecological and biogeographical background of existing sequence databases offers important insights into the potential application of high-throughput sequencing approaches in nematode community analyses. This study searched public databases for the three universal marker genes used in nematode molecular taxonomy studies, i.e., 18S ribosomal RNA (18S rRNA), 28S ribosomal RNA (28S rRNA), and cytochrome c oxidase subunit I (COI) genes, to retrieve full-length sequences suitable for long-read metabarcoding of soil nematode communities. The most full-length sequences were found for COI with 17534, followed by 4898 for 18S rRNA and 800 for 28S rRNA. These full-length sequences represented 185, 54, and 163 unique families; 626, 160, and 609 unique genera; and 1320, 235, and 1527 unique species for 18S rRNA, 28S rRNA, and COI markers, respectively. Nucleotide composition and diversity analyses across the three markers revealed distinct patterns affecting their utility for taxonomic studies. Geographically, the majority of the sequences were from the United States, China, Japan or Germany. Additionally, precise country-of-origin information was lacking for the majority of sequences, highlighting the limitations of the current databases and rendering robust geographic analyses difficult. Full-length sequences were assigned to an ecological framework, revealing that, for nematode trophic groups, herbivores were the most numerous group (10735 sequences), followed by animal parasites (6588 sequences), bacterivores (1785 sequences) and entomopathogenic nematodes (1513 sequences), whereas other trophic groups had fewer representative sequences. Assigning the sequences to colonizer-persister (c-p) classification revealed that all c-p groups were covered by the retrieved full-length sequences, particularly c-p 3, which had the highest number of sequences (6691). This study provides a foundational understanding of the molecular data currently available for use in long-read metabarcoding databases, facilitating ecological research on nematode communities.
KW - Barcode sequences
KW - High-throughput amplicon sequencing
KW - Nematode databases
KW - Nematode trophic groups
KW - Sequence collation
KW - Taxonomic resolution
UR - http://www.scopus.com/inward/record.url?scp=105017689773&partnerID=8YFLogxK
U2 - 10.1016/j.soilbio.2025.110001
DO - 10.1016/j.soilbio.2025.110001
M3 - Article
AN - SCOPUS:105017689773
SN - 0038-0717
VL - 212
JO - Soil Biology and Biochemistry
JF - Soil Biology and Biochemistry
M1 - 110001
ER -