Donovan H. Parks edited this page May 14, 2026 · 1 revision

  • Download started on the 14th of September 2024
  • LPSN downloaded on the 18th of September
  • Assembly summary for Fungi downloaded on the 21st of September

When running hmmsearch, the following genomes are skipped because their protein file (_protein.faa) is empty:

  • GCA_002870245.1
  • GCA_000722275.1
  • GCA_000722025.1
  • GCA_000722095.1
  • GCA_000722035.1
  • GCA_000722415.1
  • GCA_000722115.1
  • GCA_000715975.1
  • GCA_000721965.1
  • GCA_000716225.1
  • GCA_000716135.1
  • GCA_000716285.1
  • GCA_001578415.1
  • GCA_001341675.1
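
The skip described above can be sketched as a small pre-check run before hmmsearch. This is only an illustration, not the pipeline's actual code: the function names and the `{accession: path}` input are hypothetical, and a file is treated as empty when it is missing, has zero bytes, or contains no FASTA header line.

```python
from pathlib import Path


def has_protein_records(faa_path):
    """Return True if the file exists and contains at least one FASTA header."""
    path = Path(faa_path)
    if not path.is_file() or path.stat().st_size == 0:
        return False
    with path.open() as handle:
        return any(line.startswith(">") for line in handle)


def split_for_hmmsearch(protein_files):
    """Split {accession: faa_path} into (searchable, skipped) accession lists.

    The input mapping is a hypothetical structure used for illustration.
    """
    searchable, skipped = [], []
    for accession, faa_path in sorted(protein_files.items()):
        target = searchable if has_protein_records(faa_path) else skipped
        target.append(accession)
    return searchable, skipped
```

Logging the `skipped` list gives a record like the one above for each release.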

The following genomes are removed from the genome_dirs file because they do not have a genomic_fna file:

  • GCF_041410205.1
  • GCF_033353635.1
  • GCA_041877085.1
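
A minimal sketch of this filtering step follows. It makes several assumptions that are not confirmed by this page: the genome_dirs entries are taken to be (accession, directory) pairs, and the genomic FASTA is taken to be a file whose name ends in `_genomic.fna` (optionally compressed), as in NCBI assembly directories. The `_cds_from_genomic` and `_rna_from_genomic` files found in those directories are deliberately not counted. All function names are hypothetical.

```python
from pathlib import Path


def has_genomic_fna(genome_dir):
    """Return True if the directory holds a *_genomic.fna file (plain or compressed).

    Files such as *_cds_from_genomic.fna and *_rna_from_genomic.fna are ignored.
    """
    directory = Path(genome_dir)
    if not directory.is_dir():
        return False
    return any(
        "_from_genomic" not in path.name
        for path in directory.glob("*_genomic.fna*")
    )


def filter_genome_dirs(rows):
    """Split (accession, genome_dir) pairs into kept rows and removed accessions."""
    kept, removed = [], []
    for accession, genome_dir in rows:
        if has_genomic_fna(genome_dir):
            kept.append((accession, genome_dir))
        else:
            removed.append(accession)
    return kept, removed
```

The `removed` list corresponds to the accessions reported above.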

Versions used for this release:

  • SILVA: 138.2
  • LTP: 08_2023

When updating the NCBI taxonomy for all the genomes in the database, 5 of them did not have a taxonomy, so I added their taxonomy manually:

  • GCF_025895605.1
  • GCF_037414725.1
  • GCF_037414705.1
  • GCF_014656475.1
  • GCF_037478275.1

With the new parsing of LPSN, where a strain can be only one letter and does not need a digit, some genomes have a parsing problem. This is the case for GCF_000306055.1.
GCF_000306055.1 has the NCBI organism name “Xanthomonas arboricola pv. juglandis str. NCPPB 1447” and the strain ID “NCPPB1447”. However, on the LPSN website, “pv. Juglandis” is considered a type strain. I had to manually edit the record to flag it as a non-type strain.

There are 2 more columns in this release: ncbi_not_trusted_as_type (previously ncbi_untrustworthy_type) and ncbi_excluded_from_refseq. With ncbi_excluded_from_refseq we run an extra step: if 'derived from metagenome' and 'not used as type' are both in the ncbi_excluded_from_refseq column, we update gtdb_type_designation_ncbi_taxa to 'not used as type'.
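
The extra step can be illustrated with the sketch below. It assumes each genome is available as a dictionary keyed by column name and that the ncbi_excluded_from_refseq value is a free-text string listing the exclusion reasons. Both are assumptions made for the example, not a description of the actual database code, and the function name is hypothetical.

```python
METAGENOME = "derived from metagenome"
NOT_TYPE = "not used as type"


def apply_refseq_exclusion_rule(record):
    """Return a copy of the record with the type designation updated if needed.

    The designation is changed only when BOTH reasons appear in the
    ncbi_excluded_from_refseq value; otherwise the record is returned unchanged.
    """
    reasons = record.get("ncbi_excluded_from_refseq") or ""
    updated = dict(record)
    if METAGENOME in reasons and NOT_TYPE in reasons:
        updated["gtdb_type_designation_ncbi_taxa"] = NOT_TYPE
    return updated
```

Returning a copy rather than editing in place makes it easy to compare records before and after the step and to list which genomes were changed.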
