Expert rules

Expert rules are blanket rules that describe resistance (or susceptibility). They are provided as a CSV file, with each row representing one rule, which is passed to drprg build via the --rules option. The format of each row is:

vartype,gene,start,end,drug
  1. vartype: the variant type of the rule. Supported types are:
    • frameshift - Any insertion or deletion whose length is not a multiple of three
    • missense - A DNA change that results in a different amino acid
    • nonsense - A DNA change that results in a stop codon instead of an amino acid
    • absence - Gene is absent
  2. gene: the name of the gene the rule applies to
  3. start: An optional start position for the rule to apply from. The position is 1-based inclusive. For rules that apply to amino acid changes (missense and nonsense) it is in codon coordinates; for frameshift rules, which apply to nucleotides, it is in base coordinates. If not provided, the start of the gene is inferred. If you want to include the upstream (promoter) region of the gene, use negative coordinates.
  4. end: An optional end position for the rule to apply to. The position is 1-based inclusive and uses the same coordinate space as start (codons for missense and nonsense rules, bases for frameshift rules). If not provided, the end of the gene is inferred.
  5. drug: A semicolon-delimited (;) list of drugs the rule impacts. If the rule confers susceptibility, use NONE for this column.
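
To make the row format concrete, here is a minimal Python sketch of how such a file could be read. It is not drprg's own parser: the Rule class and parse_rules function are hypothetical names introduced for illustration, and the sketch assumes the file has no header row, as in the example further down.

```python
import csv
import io
from dataclasses import dataclass
from typing import List, Optional

# Variant types listed in the format description above.
VARTYPES = {"frameshift", "missense", "nonsense", "absence"}


@dataclass
class Rule:  # hypothetical container, not part of drprg
    vartype: str
    gene: str
    start: Optional[int]  # None means "start of the gene"
    end: Optional[int]    # None means "end of the gene"
    drugs: List[str]      # empty list means the drug column was NONE


def parse_rules(text: str) -> List[Rule]:
    """Parse headerless CSV rows of the form vartype,gene,start,end,drug."""
    rules = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        if len(row) != 5:
            raise ValueError(f"expected 5 columns, got {len(row)}: {row}")
        vartype, gene, start, end, drug = (field.strip() for field in row)
        if vartype not in VARTYPES:
            raise ValueError(f"unsupported vartype: {vartype!r}")
        # The drug column is a semicolon-delimited list, or NONE.
        if drug == "NONE":
            drugs = []
        else:
            drugs = [d.strip() for d in drug.split(";") if d.strip()]
        rules.append(Rule(
            vartype=vartype,
            gene=gene,
            start=int(start) if start else None,  # int() also accepts negatives
            end=int(end) if end else None,
            drugs=drugs,
        ))
    return rules


example = "missense,rpoB,426,452,Rifampicin\nnonsense,katG,,,Isoniazid\n"
rules = parse_rules(example)
```

Empty start and end fields are kept as None rather than filled in, since resolving them requires knowing the gene's length.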

If there are other rules you need for your species of interest, raise an issue and we can look at implementing them.

Example

This is an example of the M. tuberculosis expert rules file used in our paper.

missense,rpoB,426,452,Rifampicin
nonsense,rpoB,426,452,Rifampicin
frameshift,rpoB,1276,1356,Rifampicin
nonsense,katG,,,Isoniazid
frameshift,katG,,,Isoniazid
absence,katG,,,Isoniazid
nonsense,ethA,,,Ethionamide
frameshift,ethA,,,Ethionamide
absence,ethA,,,Ethionamide
nonsense,gid,,,Streptomycin
frameshift,gid,,,Streptomycin
absence,gid,,,Streptomycin
nonsense,pncA,,,Pyrazinamide
frameshift,pncA,,,Pyrazinamide
absence,pncA,,,Pyrazinamide
missense,katG,315,315,Isoniazid
missense,gid,125,125,Streptomycin
missense,rpoB,425,425,Rifampicin
missense,gid,136,136,Streptomycin

The row

frameshift,pncA,,,Pyrazinamide

says that a frameshift anywhere within the pncA gene will cause resistance to Pyrazinamide.

nonsense,rpoB,426,452,Rifampicin
frameshift,rpoB,1276,1356,Rifampicin

These two rules illustrate how the meaning of the start and end coordinates depends on the variant type. The first row says that any nonsense mutation between codons 426 and 452 of rpoB causes resistance to Rifampicin. Because nonsense mutations are defined by amino acid changes, the coordinates are in codon space. The second row describes a frameshift, which is defined at the nucleotide level; therefore, 1276 and 1356 are in base space (i.e. 1276 is the 1276th nucleotide/base). As an aside, both rules apply to the same region: the rifampicin resistance-determining region (RRDR).
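
The relationship between the two coordinate spaces in the rpoB rules above can be checked with a short calculation. The helper below is a hypothetical illustration, not part of drprg, and it assumes positive, 1-based inclusive positions within the gene.

```python
from typing import Tuple


def codon_range_to_bases(start_codon: int, end_codon: int) -> Tuple[int, int]:
    """Convert a 1-based inclusive codon range to a 1-based inclusive base range."""
    first_base = (start_codon - 1) * 3 + 1  # first base of the first codon
    last_base = end_codon * 3               # third base of the last codon
    return first_base, last_base


print(codon_range_to_bases(426, 452))  # (1276, 1356)
```

Codons 426 to 452 therefore span bases 1276 to 1356, which is why the nonsense and frameshift rules cover the same region despite having different numbers.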

missense,katG,315,315,Isoniazid

This row says that any missense mutation at codon 315 of katG causes resistance to Isoniazid.
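
As a rough illustration of how a rule like the ones above could be tested against an observed variant, consider the following sketch. It is an assumption about the general logic, not drprg's actual implementation, and rule_applies and its parameters are hypothetical. The variant position must already be in the same coordinate space as the rule (codons for missense and nonsense, bases for frameshift).

```python
from typing import Optional


def rule_applies(
    rule_vartype: str,
    rule_gene: str,
    rule_start: Optional[int],  # None: no lower bound (start of gene)
    rule_end: Optional[int],    # None: no upper bound (end of gene)
    var_vartype: str,
    var_gene: str,
    var_pos: int,
) -> bool:
    """Return True if the variant falls under the rule (illustrative only)."""
    if rule_vartype != var_vartype or rule_gene != var_gene:
        return False
    if rule_start is not None and var_pos < rule_start:
        return False
    if rule_end is not None and var_pos > rule_end:
        return False
    return True


# missense,katG,315,315,Isoniazid
print(rule_applies("missense", "katG", 315, 315, "missense", "katG", 315))  # True
print(rule_applies("missense", "katG", 315, 315, "missense", "katG", 316))  # False
# frameshift,pncA,,,Pyrazinamide matches a frameshift at any position in pncA
print(rule_applies("frameshift", "pncA", None, None, "frameshift", "pncA", 200))  # True
```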