Taxonomic Relative Abundance Estimation via Kraken2 and Bracken
Published on: 2025-03-15 | By: EDITOR
1. Build a Custom Kraken2/Bracken Database Using the Reference Genomes in FelMGDB
(1) Generate .dmp Files for Kraken2
Use gtdb_to_taxdump to create taxonomic dump (dmp) files. Move the generated files to the ./${db_path}/taxonomy directory, where ${db_path} is the path to your database directory.
# Installation: https://github.com/nick-youngblut/gtdb_to_taxdump
gtdb_to_taxdump.py taxonomy_file.txt > taxID_info.tsv
The taxonomy_file.txt follows the structure below:
GPISO0023 d__Bacteria;p__Bacillota;c__Bacilli;o__Lactobacillales;f__Enterococcaceae;g__Enterococcus_B;s__Enterococcus_B hirae
MAGPD00321 d__Bacteria;p__Bacillota;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Lactococcus;s__Lactococcus lactis
(2) Replace Sequence Names in Genome Files and Add IDs
- Action: Manually extract the genome names and tax IDs from the .dmp files. Save this information in a file named MAG_to_taxid.txt and upload it to the genome directory.
- Format Requirements: The file must follow the structure below:
GPISO0001 98
GPISO0003 220
GPISO0004 222
GPISO0007 297
GPISO0010 185
... ...
Run the Renaming Script in the Genome Directory:
python replace_genome_sequence_name.py -i MAG_to_taxid.txt
(3) Build Kraken2 and Bracken Databases
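Kraken2 only accepts a custom FASTA file via --add-to-library if each sequence header carries either an NCBI accession or an explicit taxonomy ID of the form `kraken:taxid|<ID>`. The source of replace_genome_sequence_name.py is not shown in this post, so the following is only a minimal sketch of what such a renaming step could look like, not the authors' script. It assumes that each genome is stored as `<genome_id>.fasta` (a hypothetical naming scheme) and writes `<genome_id>_taxid.fasta`, which matches the `*taxid.fasta` pattern used in the next step. The function names are illustrative.

```python
from pathlib import Path


def parse_taxid_map(text):
    """Parse 'genome_id taxid' lines (whitespace-separated) into a dict."""
    mapping = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank, malformed, or non-numeric lines.
        if len(fields) != 2 or not fields[1].isdigit():
            continue
        mapping[fields[0]] = fields[1]
    return mapping


def tag_headers(fasta_text, taxid):
    """Append |kraken:taxid|<taxid> to the sequence ID of every header."""
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            seq_id, _, rest = line[1:].partition(" ")
            line = f">{seq_id}|kraken:taxid|{taxid}"
            if rest:
                line += f" {rest}"
        out.append(line)
    return "\n".join(out) + "\n"


def tag_genomes(genome_dir, map_path):
    """Write <genome_id>_taxid.fasta for each genome listed in the map."""
    genome_dir = Path(genome_dir)
    mapping = parse_taxid_map(Path(map_path).read_text())
    for genome_id, taxid in mapping.items():
        src = genome_dir / f"{genome_id}.fasta"  # assumed input file name
        if not src.is_file():
            continue
        dst = genome_dir / f"{genome_id}_taxid.fasta"
        dst.write_text(tag_headers(src.read_text(), taxid))
```

Calling `tag_genomes(".", "MAG_to_taxid.txt")` from the genome directory would leave the original files untouched and create the tagged copies next to them.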
Incorporate your custom genome sequences into the database:
db="${db_path}"
for file in ./*taxid.fasta
do
    kraken2-build --add-to-library "$file" --db "${db}"
done
Construct the Kraken2 database using the added genomes and taxonomy:
kraken2-build --build --db "${db}"
Generate the Bracken database for accurate species-level abundance estimation:
bracken-build -d "${db}" -t 4 -l 150
2. Abundance Estimation with Kraken2 and Bracken
Note: You are not required to build the Kraken2 and Bracken databases using the above workflow. You can download our pre-built databases and use them directly for your analysis. If you utilize our databases, your relative abundance results can be compared with the values from this study.
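Whether the database is built locally or downloaded, it can save a failed run to confirm that the directory is complete before classifying reads. A built Kraken2 database consists of hash.k2d, opts.k2d, and taxo.k2d, and Bracken looks for a k-mer distribution file named after the read length (database150mers.kmer_distrib for 150 bp reads). The sketch below simply checks for those files; the function name is illustrative, and it assumes the pre-built download uses the same layout as a locally built database.

```python
from pathlib import Path

# Files written by `kraken2-build --build`.
KRAKEN2_FILES = ("hash.k2d", "opts.k2d", "taxo.k2d")


def missing_db_files(db_dir, read_len=150):
    """Return the expected database files that are absent from db_dir."""
    db_dir = Path(db_dir)
    # File written by `bracken-build -l <read_len>`.
    expected = KRAKEN2_FILES + (f"database{read_len}mers.kmer_distrib",)
    return [name for name in expected if not (db_dir / name).is_file()]
```

An empty list means the directory has everything both tools need for the given read length; any names returned point to a build step that has not finished or a download that is incomplete.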
(1) Run Kraken2 for Taxonomic Classification
Use Kraken2 to classify metagenomic reads against the custom database.
kraken2 --paired --db /public/home/zzs000190/Giant_Panda/Kraken2_Result_50_10 --use-names --threads 8 --report-zero-counts --report ${id}_kraken2.report ${id}_nonrRNA_fwd.fq ${id}_nonrRNA_rev.fq > ${id}.out
(2) Estimate Species-Level Abundance with Bracken
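Before refining a report with Bracken, it can be worth checking what share of reads Kraken2 classified at all, since a very low share usually points to a database or input problem rather than a biological signal. The sketch below assumes the standard six-column, tab-delimited Kraken2 report (percentage, clade reads, directly assigned reads, rank code, taxonomy ID, name), in which unclassified reads are reported on a line with rank code U. The function name is illustrative.

```python
def classified_fraction(report_text):
    """Return the fraction of reads that Kraken2 assigned to any taxon."""
    total = 0
    unclassified = 0
    for line in report_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 6:
            continue  # skip blank or truncated lines
        # Column 3 counts reads assigned directly to this taxon, so the
        # column sums to the total number of reads in the sample.
        direct_reads = int(fields[2])
        total += direct_reads
        if fields[3] == "U":
            unclassified += direct_reads
    if total == 0:
        return 0.0
    return (total - unclassified) / total
```

The function takes the report contents as a string, for example the text read from one sample's `_kraken2.report` file.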
Process the Kraken2 report with Bracken to refine abundance estimates at the species level.
bracken -d /public/home/zzs000190/Giant_Panda/Kraken2_Result_50_10 -i ${id}_kraken2.report -o ${id}.bracken -r 150 -l S -t 4
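The Bracken output file is a tab-delimited table with a header row; its fraction_total_reads column holds the estimated relative abundance of each species. As a minimal sketch of how these values could be pulled out for downstream comparison, the helper below reads one Bracken table into a dictionary and lists the most abundant species. The function names are illustrative, and the column names are those written by Bracken's standard output.

```python
import csv
import io


def read_bracken(text):
    """Map species name to fraction_total_reads from Bracken output text."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row["name"]: float(row["fraction_total_reads"]) for row in reader}


def top_species(abundance, n=10):
    """Return the n most abundant (name, fraction) pairs, highest first."""
    ranked = sorted(abundance.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]
```

For a single sample, `top_species(read_bracken(open(path).read()))` gives a quick view of the dominant species, where `path` points to that sample's `.bracken` file.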