Introduction
A suitable pH environment is critical for bacterial survival, and knowing a bacterium's optimal pH aids the preparation of culture media. Traditional experimental methods for determining it, however, are labor-intensive, time-consuming, and wasteful of materials, and most bacteria remain unculturable because suitable media are unknown. A new approach is therefore needed. In recent years, machine learning has become a preferred alternative: prediction models built on the expanding body of high-throughput sequencing data and on existing experimental pH data can supplement the experimental determination of optimal pH. Such models address the limitations of experiments, promote the use of massive sequencing data, and offer a reference for other studies of bacterial environmental preferences, where sequencing likewise outpaces experimental measurement. For silage, which is produced by a low-pH anaerobic fermentation process, this tool enables rapid screening of high-value bacterial agents adapted to silage acidity from among numerous microbes, including unculturable ones. This avoids tedious culture-based screening, uncovers strains that were previously hard to isolate, and supports the optimization of silage agents to improve fermentation efficiency and quality.
Flowchart of the collection and preprocessing of public data on the optimal growth pH of microorganisms

Flowchart of the acquisition and preprocessing of representative genomic data from the GTDB




