Predict

The predict subcommand predicts drug resistance for a sample using an index.

At its simplest:

drprg predict -i reads.fq -x mtb -o outdir

drprg is a bit "new-age" in that it assumes the reads are Nanopore. If they're Illumina, use the -I/--illumina option.

See the Prediction Output documentation for a detailed description of the output files and formats to expect.

Required

Index

The index is provided via the -x/--index option. It can be either a path to an index or the name of a downloaded index. As with the index subcommand, you can specify a version if you don't want to use the latest.

Input reads

A fastq (or fasta) file of the reads you want to predict resistance from, provided via the -i/--input option. If you have paired reads in two files, simply combine them and pass the combined file; interleave order doesn't matter. For example:

cat r1.fq r2.fq > combined.fq
drprg predict -i combined.fq ...

Gzip-compressed files are also accepted.
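A concatenation of gzip files is itself a valid (multi-member) gzip stream, so compressed pairs can in principle be combined without decompressing them first. The sketch below illustrates this with two tiny, made-up fastq files in a scratch directory and checks that the combined file holds both records. It assumes drprg's reader handles multi-member gzip input, which is not confirmed here; if in doubt, decompress, combine, and recompress instead.

```shell
# Work in a scratch directory so nothing in the current directory is overwritten.
workdir=$(mktemp -d)

# Two tiny, made-up FASTQ files standing in for the two files of a paired run.
printf '@read1/1\nACGT\n+\nIIII\n' | gzip -c > "$workdir/r1.fq.gz"
printf '@read1/2\nTGCA\n+\nIIII\n' | gzip -c > "$workdir/r2.fq.gz"

# Concatenating gzip members yields a valid multi-member gzip file.
cat "$workdir/r1.fq.gz" "$workdir/r2.fq.gz" > "$workdir/combined.fq.gz"

# Check the integrity of the combined file, then count its records
# (a FASTQ record is 4 lines).
gzip -t "$workdir/combined.fq.gz"
n_records=$(( $(gzip -dc "$workdir/combined.fq.gz" | wc -l) / 4 ))
echo "records in combined file: $n_records"
```

The combined file would then be passed in the same way as an uncompressed one, via -i/--input.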

Optional

Sample name

Identifier to use for your output files. By default, it will be set to the file name prefix (e.g. name for a fastq named name.fq.gz). Provided via the -s/--sample option.
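To preview the default, the helper below approximates it in plain shell. It is a sketch only: default_sample_name is a hypothetical function, and it assumes the "prefix" is the file name up to the first dot, which matches the name.fq.gz example above but may not match drprg's exact logic for every file name.

```shell
# Hypothetical helper: approximate the default sample name as the file name
# up to (not including) the first dot. This is an assumption based on the
# example in the text, not drprg's actual implementation.
default_sample_name() {
    base=${1##*/}                  # drop any leading directories
    printf '%s\n' "${base%%.*}"    # strip everything from the first dot onwards
}

default_sample_name /data/run1/name.fq.gz
```

Under this assumption a file such as sample.v2.fq.gz would be shortened to sample, so for file names that contain extra dots it is safer to set the identifier explicitly with -s/--sample.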

Minimum allele frequency

Provided via the -f/--maf option. If an alternate allele has at least this fraction of the depth, a minor resistance ("r") prediction is made. By default, this is set to 1.0 for Nanopore data (i.e. minor allele detection is off) and 0.1 when using the --illumina option. For example, if a variant is called as the reference allele for Illumina reads, but an alternate allele has at least 10% of the depth at that position, a minor resistance call is made for the alternate allele.
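The threshold comparison can be illustrated with a little arithmetic. The sketch below is not drprg's implementation: is_minor_call is a hypothetical helper that models only the "at least this fraction of the depth" test, and the depths are made-up numbers.

```shell
# Hypothetical helper: succeed (exit status 0) when the alternate allele's
# share of the depth is at least the given threshold.
# Arguments: alternate-allele depth, total depth at the position, threshold.
is_minor_call() {
    awk -v alt="$1" -v total="$2" -v maf="$3" \
        'BEGIN { exit !(total > 0 && alt / total >= maf) }'
}

# 12 of 100 reads support the alternate allele; threshold 0.1 (the --illumina default).
if is_minor_call 12 100 0.1; then echo "minor call"; else echo "no minor call"; fi

# Same depths with the Nanopore default of 1.0: the threshold is not met.
if is_minor_call 12 100 1.0; then echo "minor call"; else echo "no minor call"; fi
```

With these numbers the alternate allele holds 12% of the depth, so it clears a threshold of 0.1 but not one of 1.0.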

Ignore synonymous

Using the -S/--ignore-synonymous option will prevent synonymous mutations from appearing as unknown resistance calls. However, any synonymous mutations in the catalogue will still be considered.

Quick usage

$ drprg predict -h
Predict drug resistance

Usage: drprg predict [OPTIONS] --index <DIR> --input <FILE>

Options:
  -v, --verbose        Use verbose output
  -t, --threads <INT>  Maximum number of threads to use [default: 1]
  -h, --help           Print help (see more with '--help')

Input/Output:
  -x, --index <DIR>      Name of a downloaded index or path to an index
  -i, --input <FILE>     Reads to predict resistance from
  -o, --outdir <DIR>     Directory to place output [default: .]
  -s, --sample <SAMPLE>  Identifier to use for the sample
  -I, --illumina         Sample reads are from Illumina sequencing

Filter:
  -S, --ignore-synonymous     Ignore unknown (off-catalogue) variants that cause a synonymous substitution
  -f, --maf <FLOAT[0.0-1.0]>  Minimum allele frequency to call variants [default: 1]

Full usage

$ drprg predict --help
Predict drug resistance

Usage: drprg predict [OPTIONS] --index <DIR> --input <FILE>

Options:
  -p, --pandora <FILE>
          Path to pandora executable. Will try in src/ext or $PATH if not given

  -v, --verbose
          Use verbose output

  -m, --makeprg <FILE>
          Path to make_prg executable. Will try in src/ext or $PATH if not given

  -t, --threads <INT>
          Maximum number of threads to use

          Use 0 to select the number automatically

          [default: 1]

  -M, --mafft <FILE>
          Path to MAFFT executable. Will try in src/ext or $PATH if not given

  -h, --help
          Print help (see a summary with '-h')

Input/Output:
  -x, --index <DIR>
          Name of a downloaded index or path to an index

  -i, --input <FILE>
          Reads to predict resistance from

          Both fasta and fastq are accepted, along with compressed or uncompressed.

  -o, --outdir <DIR>
          Directory to place output

          [default: .]

  -s, --sample <SAMPLE>
          Identifier to use for the sample

          If not provided, this will be set to the input reads file path prefix

  -I, --illumina
          Sample reads are from Illumina sequencing

Filter:
  -S, --ignore-synonymous
          Ignore unknown (off-catalogue) variants that cause a synonymous substitution

  -f, --maf <FLOAT[0.0-1.0]>
          Minimum allele frequency to call variants

          If an alternate allele has at least this fraction of the depth, a minor resistance ("r") prediction is made. Set to 1 to disable. If --illumina is passed, the default is 0.1

          [default: 1]

      --debug
          Output debugging files. Mostly for development purposes

  -d, --min-covg <INT>
          Minimum depth of coverage allowed on variants

          [default: 3]

  -D, --max-covg <INT>
          Maximum depth of coverage allowed on variants

          [default: 2147483647]

  -b, --min-strand-bias <FLOAT>
          Minimum strand bias ratio allowed on variants

          For example, setting to 0.25 requires >=25% of total (allele) coverage on both strands for an allele.

          [default: 0.01]

  -g, --min-gt-conf <FLOAT>
          Minimum genotype confidence (GT_CONF) score allow on variants

          [default: 0]

  -L, --max-indel <INT>
          Maximum (absolute) length of insertions/deletions allowed

  -K, --min-frs <FLOAT>
          Minimum fraction of read support

          For example, setting to 0.9 requires >=90% of coverage for the variant to be on the called allele

          [default: 0]