Class TargetedPcrMetrics


  • public class TargetedPcrMetrics
    extends MultilevelMetrics
    Metrics class for the analysis of reads obtained from targeted PCR experiments, e.g. the TruSeq Custom Amplicon (TSCA) kit (Illumina).
    • Field Summary

      Fields 
      Modifier and Type Field Description
      long AMPLICON_TERRITORY
      The number of unique bases covered by the intervals of all amplicons in the amplicon set
      double AT_DROPOUT
      A measure of how regions with low GC content (<= 50%) are undercovered relative to mean coverage.
      String CUSTOM_AMPLICON_SET
      The name of the amplicon set used in this metrics collection run
      double FOLD_80_BASE_PENALTY
      The fold over-coverage necessary to raise 80% of bases in "non-zero-cvg" targets to the mean coverage level in those targets.
      double FOLD_ENRICHMENT
      The fold by which the amplicon region has been amplified above genomic background.
      double GC_DROPOUT
      A measure of how regions of high GC content (>= 50% GC) are undercovered relative to the mean coverage value.
      long GENOME_SIZE
      The number of bases in the reference genome used for alignment
      double HET_SNP_Q
      The Q Score of the theoretical HET SNP sensitivity.
      double HET_SNP_SENSITIVITY
      The theoretical HET SNP sensitivity.
      long MAX_TARGET_COVERAGE
      The maximum coverage of reads that mapped to target regions of an experiment.
      double MEAN_AMPLICON_COVERAGE
      The mean read coverage of all amplicon regions in the experiment.
      double MEAN_TARGET_COVERAGE
      The mean read coverage of all target regions in an experiment.
      double MEDIAN_TARGET_COVERAGE
      The median coverage of reads that mapped to target regions of an experiment.
      long NEAR_AMPLICON_BASES
      The number of PF_BASES_ALIGNED that mapped within a fixed interval of an amplified region, but not on the amplified region itself.
      long OFF_AMPLICON_BASES
      The number of PF_BASES_ALIGNED that mapped neither on nor near an amplicon.
      long ON_AMPLICON_BASES
      The number of PF_BASES_ALIGNED that mapped to an amplified region of the genome.
      double ON_AMPLICON_VS_SELECTED
      The fraction of bases mapping on or near amplicons that mapped directly on an amplicon rather than near one, ON_AMPLICON_BASES/(NEAR_AMPLICON_BASES + ON_AMPLICON_BASES).
      long ON_TARGET_BASES
      The number of PF_BASES_ALIGNED that mapped to a targeted region of the genome.
      long ON_TARGET_FROM_PAIR_BASES
      The number of bases from PF_SELECTED_UNIQUE_PAIRS that mapped to a targeted region of the genome.
      double PCT_AMPLIFIED_BASES
      The fraction of PF_BASES_ALIGNED that mapped to or near an amplicon, (ON_AMPLICON_BASES + NEAR_AMPLICON_BASES)/PF_BASES_ALIGNED.
      double PCT_EXC_BASEQ
      The fraction of aligned bases that were filtered out because they were of low base quality.
      double PCT_EXC_DUPE
      The fraction of aligned bases that were filtered out because they were in reads marked as duplicates.
      double PCT_EXC_MAPQ
      The fraction of aligned bases that were filtered out because they were in reads with low mapping quality.
      double PCT_EXC_OFF_TARGET
      The fraction of bases that were filtered out because they did not map to a base within a target region.
      double PCT_EXC_OVERLAP
      The fraction of aligned bases that were filtered out because they were the second observation from an insert with overlapping reads.
      double PCT_OFF_AMPLICON
      The fraction of PF_BASES_ALIGNED that mapped neither on nor near an amplicon, OFF_AMPLICON_BASES/PF_BASES_ALIGNED.
      double PCT_PF_READS
      The fraction of reads passing filter, PF_READS/TOTAL_READS.
      double PCT_PF_UQ_READS
      The fraction of TOTAL_READS that are PF and not marked as duplicates, PF_UNIQUE_READS/TOTAL_READS.
      double PCT_PF_UQ_READS_ALIGNED
      The fraction of PF_READS that are unique and align to the reference genome, PF_UQ_READS_ALIGNED/PF_READS.
      double PCT_TARGET_BASES_10X
      The fraction of all target bases achieving 10X or greater coverage depth.
      double PCT_TARGET_BASES_1X
      The fraction of all target bases achieving 1X or greater coverage.
      double PCT_TARGET_BASES_20X
      The fraction of all target bases achieving 20X or greater coverage depth.
      double PCT_TARGET_BASES_2X
      The fraction of all target bases achieving 2X or greater coverage depth.
      double PCT_TARGET_BASES_30X
      The fraction of all target bases achieving 30X or greater coverage depth.
      long PF_BASES
      The total number of bases within the PF_READS of the SAM or BAM file to be examined
      long PF_BASES_ALIGNED
      The number of bases from PF_READS that align to the reference genome with mapping score > 0
      long PF_READS
      The total number of reads passing filter (PF), where the filter(s) can be platform/vendor quality controls
      long PF_SELECTED_PAIRS
      Tracks the number of PF read pairs (used to calculate library size)
      long PF_SELECTED_UNIQUE_PAIRS
      Tracks the number of unique PF read pairs observed (used to calculate library size)
      long PF_UNIQUE_READS
      The number of PF_READS that were not marked as sample or optical duplicates.
      long PF_UQ_BASES_ALIGNED
      The number of bases from PF_UNIQUE_READS that align to the reference genome and have a mapping score > 0
      long PF_UQ_READS_ALIGNED
      The total number of PF_UNIQUE_READS that align to the reference genome with mapping scores > 0
      long TARGET_TERRITORY
      The number of unique bases covered by the intervals of all targets that should be covered
      long TOTAL_READS
      The total number of reads in the SAM or BAM file examined
      double ZERO_CVG_TARGETS_PCT
      The fraction of targets that did not reach coverage=1 over any base.
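Several of the fractional fields summarized above are defined directly as ratios of the base-count fields. The following is a minimal sketch of those three formulas only. The class `AmpliconRatioSketch`, its method names, the zero-denominator convention, and the sample counts are all hypothetical and are not part of this class or of Picard.

```java
// Hypothetical helper for illustration; it is not part of the Picard API.
final class AmpliconRatioSketch {

    /** Returns numerator / denominator, or 0.0 for a zero denominator (an assumed convention). */
    static double ratio(long numerator, long denominator) {
        return denominator == 0 ? 0.0 : (double) numerator / (double) denominator;
    }

    /** PCT_AMPLIFIED_BASES = (ON_AMPLICON_BASES + NEAR_AMPLICON_BASES) / PF_BASES_ALIGNED */
    static double pctAmplifiedBases(long onAmplicon, long nearAmplicon, long pfBasesAligned) {
        return ratio(onAmplicon + nearAmplicon, pfBasesAligned);
    }

    /** PCT_OFF_AMPLICON = OFF_AMPLICON_BASES / PF_BASES_ALIGNED */
    static double pctOffAmplicon(long offAmplicon, long pfBasesAligned) {
        return ratio(offAmplicon, pfBasesAligned);
    }

    /** ON_AMPLICON_VS_SELECTED = ON_AMPLICON_BASES / (NEAR_AMPLICON_BASES + ON_AMPLICON_BASES) */
    static double onAmpliconVsSelected(long onAmplicon, long nearAmplicon) {
        return ratio(onAmplicon, nearAmplicon + onAmplicon);
    }

    public static void main(String[] args) {
        // Made-up counts, chosen only to exercise the formulas.
        long on = 800, near = 100, off = 100, aligned = 1000;
        System.out.println("PCT_AMPLIFIED_BASES     = " + pctAmplifiedBases(on, near, aligned));
        System.out.println("PCT_OFF_AMPLICON        = " + pctOffAmplicon(off, aligned));
        System.out.println("ON_AMPLICON_VS_SELECTED = " + onAmpliconVsSelected(on, near));
    }
}
```

Recomputing the ratios this way can serve as a sanity check on a metrics file, since each value should agree with the corresponding stored field up to rounding.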
    • Field Detail

      • CUSTOM_AMPLICON_SET

        public String CUSTOM_AMPLICON_SET
        The name of the amplicon set used in this metrics collection run
      • GENOME_SIZE

        public long GENOME_SIZE
        The number of bases in the reference genome used for alignment
      • AMPLICON_TERRITORY

        public long AMPLICON_TERRITORY
        The number of unique bases covered by the intervals of all amplicons in the amplicon set
      • TARGET_TERRITORY

        public long TARGET_TERRITORY
        The number of unique bases covered by the intervals of all targets that should be covered
      • TOTAL_READS

        public long TOTAL_READS
        The total number of reads in the SAM or BAM file examined
      • PF_READS

        public long PF_READS
        The total number of reads passing filter (PF), where the filter(s) can be platform/vendor quality controls
      • PF_BASES

        public long PF_BASES
        The total number of bases within the PF_READS of the SAM or BAM file to be examined
      • PF_UNIQUE_READS

        public long PF_UNIQUE_READS
        The number of PF_READS that were not marked as sample or optical duplicates.
      • PCT_PF_READS

        public double PCT_PF_READS
        The fraction of reads passing filter, PF_READS/TOTAL_READS.
      • PCT_PF_UQ_READS

        public double PCT_PF_UQ_READS
        The fraction of TOTAL_READS that are PF and not marked as duplicates, PF_UNIQUE_READS/TOTAL_READS.
      • PF_UQ_READS_ALIGNED

        public long PF_UQ_READS_ALIGNED
        The total number of PF_UNIQUE_READS that align to the reference genome with mapping scores > 0
      • PF_SELECTED_PAIRS

        public long PF_SELECTED_PAIRS
        Tracks the number of PF read pairs (used to calculate library size)
      • PF_SELECTED_UNIQUE_PAIRS

        public long PF_SELECTED_UNIQUE_PAIRS
        Tracks the number of unique PF read pairs observed (used to calculate library size)
      • PCT_PF_UQ_READS_ALIGNED

        public double PCT_PF_UQ_READS_ALIGNED
        The fraction of PF_READS that are unique and align to the reference genome, PF_UQ_READS_ALIGNED/PF_READS.
      • PF_BASES_ALIGNED

        public long PF_BASES_ALIGNED
        The number of bases from PF_READS that align to the reference genome with mapping score > 0
      • PF_UQ_BASES_ALIGNED

        public long PF_UQ_BASES_ALIGNED
        The number of bases from PF_UNIQUE_READS that align to the reference genome and have a mapping score > 0
      • ON_AMPLICON_BASES

        public long ON_AMPLICON_BASES
        The number of PF_BASES_ALIGNED that mapped to an amplified region of the genome.
      • NEAR_AMPLICON_BASES

        public long NEAR_AMPLICON_BASES
        The number of PF_BASES_ALIGNED that mapped within a fixed interval of an amplified region, but not on the amplified region itself.
      • OFF_AMPLICON_BASES

        public long OFF_AMPLICON_BASES
        The number of PF_BASES_ALIGNED that mapped neither on nor near an amplicon.
      • ON_TARGET_BASES

        public long ON_TARGET_BASES
        The number of PF_BASES_ALIGNED that mapped to a targeted region of the genome.
      • ON_TARGET_FROM_PAIR_BASES

        public long ON_TARGET_FROM_PAIR_BASES
        The number of bases from PF_SELECTED_UNIQUE_PAIRS that mapped to a targeted region of the genome.
      • PCT_AMPLIFIED_BASES

        public double PCT_AMPLIFIED_BASES
        The fraction of PF_BASES_ALIGNED that mapped to or near an amplicon, (ON_AMPLICON_BASES + NEAR_AMPLICON_BASES)/PF_BASES_ALIGNED.
      • PCT_OFF_AMPLICON

        public double PCT_OFF_AMPLICON
        The fraction of PF_BASES_ALIGNED that mapped neither on nor near an amplicon, OFF_AMPLICON_BASES/PF_BASES_ALIGNED.
      • ON_AMPLICON_VS_SELECTED

        public double ON_AMPLICON_VS_SELECTED
        The fraction of bases mapping on or near amplicons that mapped directly on an amplicon rather than near one, ON_AMPLICON_BASES/(NEAR_AMPLICON_BASES + ON_AMPLICON_BASES).
      • MEAN_AMPLICON_COVERAGE

        public double MEAN_AMPLICON_COVERAGE
        The mean read coverage of all amplicon regions in the experiment.
      • MEAN_TARGET_COVERAGE

        public double MEAN_TARGET_COVERAGE
        The mean read coverage of all target regions in an experiment.
      • MEDIAN_TARGET_COVERAGE

        public double MEDIAN_TARGET_COVERAGE
        The median coverage of reads that mapped to target regions of an experiment.
      • MAX_TARGET_COVERAGE

        public long MAX_TARGET_COVERAGE
        The maximum coverage of reads that mapped to target regions of an experiment.
      • FOLD_ENRICHMENT

        public double FOLD_ENRICHMENT
        The fold by which the amplicon region has been amplified above genomic background.
      • ZERO_CVG_TARGETS_PCT

        public double ZERO_CVG_TARGETS_PCT
        The fraction of targets that did not reach coverage=1 over any base.
      • PCT_EXC_DUPE

        public double PCT_EXC_DUPE
        The fraction of aligned bases that were filtered out because they were in reads marked as duplicates.
      • PCT_EXC_MAPQ

        public double PCT_EXC_MAPQ
        The fraction of aligned bases that were filtered out because they were in reads with low mapping quality.
      • PCT_EXC_BASEQ

        public double PCT_EXC_BASEQ
        The fraction of aligned bases that were filtered out because they were of low base quality.
      • PCT_EXC_OVERLAP

        public double PCT_EXC_OVERLAP
        The fraction of aligned bases that were filtered out because they were the second observation from an insert with overlapping reads.
      • PCT_EXC_OFF_TARGET

        public double PCT_EXC_OFF_TARGET
        The fraction of bases that were filtered out because they did not map to a base within a target region.
      • FOLD_80_BASE_PENALTY

        public double FOLD_80_BASE_PENALTY
        The fold over-coverage necessary to raise 80% of bases in "non-zero-cvg" targets to the mean coverage level in those targets.
      • PCT_TARGET_BASES_1X

        public double PCT_TARGET_BASES_1X
        The fraction of all target bases achieving 1X or greater coverage.
      • PCT_TARGET_BASES_2X

        public double PCT_TARGET_BASES_2X
        The fraction of all target bases achieving 2X or greater coverage depth.
      • PCT_TARGET_BASES_10X

        public double PCT_TARGET_BASES_10X
        The fraction of all target bases achieving 10X or greater coverage depth.
      • PCT_TARGET_BASES_20X

        public double PCT_TARGET_BASES_20X
        The fraction of all target bases achieving 20X or greater coverage depth.
      • PCT_TARGET_BASES_30X

        public double PCT_TARGET_BASES_30X
        The fraction of all target bases achieving 30X or greater coverage depth.
      • AT_DROPOUT

        public double AT_DROPOUT
        A measure of how regions with low GC content (<= 50%) are undercovered relative to mean coverage. For each GC bin in [0..50], we calculate a = fraction of target territory, and b = fraction of aligned reads aligned to these targets. AT_DROPOUT is then sum(a-b when a-b > 0), i.e. the total shortfall over the bins that received a smaller share of reads than their share of territory. For example, an AT_DROPOUT value of 5% implies that 5% of total reads that should have mapped to GC<=50% regions mapped elsewhere.
      • GC_DROPOUT

        public double GC_DROPOUT
        A measure of how regions of high GC content (>= 50% GC) are undercovered relative to the mean coverage value. For each GC bin in [50..100], we calculate a = % of target territory, and b = % of aligned reads aligned to these targets. GC_DROPOUT is then sum(a-b when a-b > 0), i.e. the total shortfall over the bins that received a smaller share of reads than their share of territory. For example, a GC_DROPOUT value of 5% implies that 5% of total reads that should have mapped to GC>=50% regions mapped elsewhere.
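The binning behind AT_DROPOUT and GC_DROPOUT can be sketched as follows. This is an illustration under stated assumptions, not the collector's actual code. It reads "undercovered" as a bin whose share of aligned reads (b) is below its share of target territory (a), normalizes both shares over all 101 GC bins, and reports the result in percentage points. The class `GcDropoutSketch`, its method, and the input arrays are hypothetical.

```java
// Hypothetical helper for illustration; it is not part of the Picard API.
final class GcDropoutSketch {

    /**
     * Computes {AT dropout, GC dropout} in percentage points.
     *
     * targetByGc[i]  = target territory (bases) in targets whose GC content is i%.
     * alignedByGc[i] = aligned reads (or bases) falling in those targets.
     * Both arrays must have 101 entries, one per GC percentage 0..100.
     */
    static double[] dropouts(long[] targetByGc, long[] alignedByGc) {
        if (targetByGc.length != 101 || alignedByGc.length != 101) {
            throw new IllegalArgumentException("expected 101 GC bins (0..100)");
        }
        double totalTarget = 0;
        double totalAligned = 0;
        for (int gc = 0; gc <= 100; gc++) {
            totalTarget += targetByGc[gc];
            totalAligned += alignedByGc[gc];
        }
        if (totalTarget == 0 || totalAligned == 0) {
            return new double[] {0.0, 0.0}; // assumed convention for empty input
        }
        double atDropout = 0;
        double gcDropout = 0;
        for (int gc = 0; gc <= 100; gc++) {
            double a = targetByGc[gc] / totalTarget;   // share of target territory
            double b = alignedByGc[gc] / totalAligned; // share of aligned reads
            double shortfall = (a - b) * 100.0;
            if (shortfall > 0) {                       // bin is undercovered
                if (gc <= 50) atDropout += shortfall;
                if (gc >= 50) gcDropout += shortfall;  // the 50% bin counts toward both
            }
        }
        return new double[] {atDropout, gcDropout};
    }
}
```

Because both documented ranges include 50%, the sketch counts the 50% bin toward both values. Whether the real implementation does the same is an assumption drawn from the ranges stated above.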
      • HET_SNP_SENSITIVITY

        public double HET_SNP_SENSITIVITY
        The theoretical HET SNP sensitivity.
      • HET_SNP_Q

        public double HET_SNP_Q
        The Q Score of the theoretical HET SNP sensitivity.
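The relationship between HET_SNP_SENSITIVITY and HET_SNP_Q is not spelled out above. A minimal sketch, assuming the usual Phred convention in which the "error" probability is 1 - sensitivity, looks like this. The class `HetSnpQSketch` and its method are hypothetical, and any rounding the real field applies is not modeled.

```java
// Hypothetical helper for illustration; it is not part of the Picard API.
final class HetSnpQSketch {

    /**
     * Converts a sensitivity in [0, 1] to a Phred-scaled score,
     * Q = -10 * log10(1 - sensitivity). A sensitivity of exactly 1.0
     * yields positive infinity.
     */
    static double phredFromSensitivity(double sensitivity) {
        if (sensitivity < 0.0 || sensitivity > 1.0) {
            throw new IllegalArgumentException("sensitivity must be in [0, 1]");
        }
        return -10.0 * Math.log10(1.0 - sensitivity);
    }

    public static void main(String[] args) {
        // Illustrative inputs only.
        System.out.println(phredFromSensitivity(0.9));
        System.out.println(phredFromSensitivity(0.99));
    }
}
```

Under this convention, a sensitivity of 0.99 corresponds to a score of about 20 and 0.999 to about 30, which makes small differences near 1.0 easier to compare.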
    • Constructor Detail

      • TargetedPcrMetrics

        public TargetedPcrMetrics()