Package picard.sam
Class AbstractAlignmentMerger
java.lang.Object
    picard.sam.AbstractAlignmentMerger
Direct Known Subclasses:
SamAlignmentMerger
public abstract class AbstractAlignmentMerger extends Object
Abstract class that coordinates the general task of taking in a set of alignment information, possibly in SAM format, possibly in other formats, and merging that with the set of all reads for which alignment was attempted, stored in an unmapped SAM file.
The order of processing is as follows:
1. Get records from the unmapped BAM and the alignment data
2. Merge the alignment information and public tags ONLY from the aligned SAMRecords
3. Do additional modifications -- handle clipping, trimming, etc.
4. Fix up mate information on paired reads
5. Do a final calculation of the NM and UQ tags (coordinate sorted only)
6. Write the records to the output file.
Concrete subclasses which extend AbstractAlignmentMerger should implement getQuerynameSortedAlignedRecords. If these records are not in queryname order, mergeAlignment will throw an IllegalStateException. Subclasses may optionally override ignoreAlignment(), which can be used to skip over certain alignments.
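The queryname-order requirement described above can be illustrated with a small standalone check. This is only a sketch and not Picard's code: the class and method names are hypothetical, plain read names stand in for SAMRecords, and String.compareTo is assumed as the ordering, which may differ from the comparator htsjdk actually applies.

```java
import java.util.List;

// Hypothetical helper, not part of Picard or htsjdk.
final class QuerynameOrderCheck {

    /**
     * Walks the read names in order and throws IllegalStateException as soon as
     * a name sorts before its predecessor. Equal neighbours are allowed, since
     * both ends of a pair share one read name.
     */
    static void requireQuerynameOrder(List<String> readNames) {
        String previous = null;
        for (String name : readNames) {
            // Assumption: lexicographic String ordering stands in for queryname order.
            if (previous != null && previous.compareTo(name) > 0) {
                throw new IllegalStateException(
                        "Records are not queryname sorted: " + previous + " came before " + name);
            }
            previous = name;
        }
    }
}
```

Calling requireQuerynameOrder(List.of("read1", "read1", "read2")) returns normally, while List.of("read2", "read1") triggers the exception.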
-
-
Nested Class Summary
Modifier and Type Class Description
static class
AbstractAlignmentMerger.UnmappingReadStrategy
-
Field Summary
Modifier and Type Field Description
static int
MAX_RECORDS_IN_RAM
protected File
referenceFasta
-
Constructor Summary
Constructor Description
AbstractAlignmentMerger(File unmappedBamFile, File targetBamFile, File referenceFasta, boolean clipAdapters, boolean bisulfiteSequence, boolean alignedReadsOnly, htsjdk.samtools.SAMProgramRecord programRecord, List<String> attributesToRetain, List<String> attributesToRemove, Integer read1BasesTrimmed, Integer read2BasesTrimmed, List<htsjdk.samtools.SamPairUtil.PairOrientation> expectedOrientations, htsjdk.samtools.SAMFileHeader.SortOrder sortOrder, PrimaryAlignmentSelectionStrategy primaryAlignmentSelectionStrategy, boolean addMateCigar, boolean unmapContaminantReads)
Constructor with a default setting for unmappingReadsStrategy.
AbstractAlignmentMerger(File unmappedBamFile, File targetBamFile, File referenceFasta, boolean clipAdapters, boolean bisulfiteSequence, boolean alignedReadsOnly, htsjdk.samtools.SAMProgramRecord programRecord, List<String> attributesToRetain, List<String> attributesToRemove, Integer read1BasesTrimmed, Integer read2BasesTrimmed, List<htsjdk.samtools.SamPairUtil.PairOrientation> expectedOrientations, htsjdk.samtools.SAMFileHeader.SortOrder sortOrder, PrimaryAlignmentSelectionStrategy primaryAlignmentSelectionStrategy, boolean addMateCigar, boolean unmapContaminantReads, AbstractAlignmentMerger.UnmappingReadStrategy unmappingReadsStrategy)
Constructor
-
Method Summary
Modifier and Type Method Description
protected static void
clipForOverlappingReads(htsjdk.samtools.SAMRecord read1, htsjdk.samtools.SAMRecord read2)
Checks to see whether the ends of the reads overlap and soft-clips them if necessary.
void
close()
static void
createNewCigarsIfMapsOffEndOfReference(htsjdk.samtools.SAMRecord rec)
Soft-clip an alignment that hangs off the end of its reference sequence.
static void
fixNmMdAndUq(htsjdk.samtools.SAMRecord record, htsjdk.samtools.reference.ReferenceSequenceFileWalker refSeqWalker, boolean isBisulfiteSequence)
Calculates and sets the NM, MD, and UQ tags from the record and the reference.
static void
fixUq(htsjdk.samtools.SAMRecord record, htsjdk.samtools.reference.ReferenceSequenceFileWalker refSeqWalker, boolean isBisulfiteSequence)
Calculates and sets the UQ tag from the record and the reference.
Set<String>
getAttributesToReverse()
Gets the set of attributes to be reversed on reads marked as negative strand.
Set<String>
getAttributesToReverseComplement()
Gets the set of attributes to be reverse complemented on reads marked as negative strand.
protected abstract htsjdk.samtools.SAMSequenceDictionary
getDictionaryForMergedBam()
protected htsjdk.samtools.SAMFileHeader
getHeader()
protected htsjdk.samtools.SAMProgramRecord
getProgramRecord()
protected abstract htsjdk.samtools.util.CloseableIterator<htsjdk.samtools.SAMRecord>
getQuerynameSortedAlignedRecords()
protected boolean
ignoreAlignment(htsjdk.samtools.SAMRecord sam)
boolean
isClipOverlappingReads()
protected boolean
isContaminant(picard.sam.HitsForInsert hits)
boolean
isKeepAlignerProperPairFlags()
protected boolean
isReservedTag(String tag)
void
mergeAlignment(File referenceFasta)
Merges the alignment data with the non-aligned records from the source BAM file.
protected void
resetRefSeqFileWalker()
void
setAddPGTagToReads(boolean addPGTagToReads)
Sets addPGTagToReads.
void
setAttributesToReverse(Set<String> attributesToReverse)
Sets the set of attributes to be reversed on reads marked as negative strand.
void
setAttributesToReverseComplement(Set<String> attributesToReverseComplement)
Sets the set of attributes to be reverse complemented on reads marked as negative strand.
void
setClipOverlappingReads(boolean clipOverlappingReads)
void
setIncludeSecondaryAlignments(boolean includeSecondaryAlignments)
void
setKeepAlignerProperPairFlags(boolean keepAlignerProperPairFlags)
If true, keep the aligner's idea of proper pairs rather than letting alignment merger decide.
void
setMaxRecordsInRam(int maxRecordsInRam)
Allows the caller to override the maximum records in RAM.
protected void
setProgramRecord(htsjdk.samtools.SAMProgramRecord pg)
protected void
setValuesFromAlignment(htsjdk.samtools.SAMRecord rec, htsjdk.samtools.SAMRecord alignment, boolean needsSafeReverseComplement)
Sets the values from the alignment record on the unaligned BAM record.
protected void
updateCigarForTrimmedOrClippedBases(htsjdk.samtools.SAMRecord rec, htsjdk.samtools.SAMRecord alignment)
-
-
-
Field Detail
-
MAX_RECORDS_IN_RAM
public static final int MAX_RECORDS_IN_RAM
- See Also:
- Constant Field Values
-
referenceFasta
protected final File referenceFasta
-
-
Constructor Detail
-
AbstractAlignmentMerger
public AbstractAlignmentMerger(File unmappedBamFile, File targetBamFile, File referenceFasta, boolean clipAdapters, boolean bisulfiteSequence, boolean alignedReadsOnly, htsjdk.samtools.SAMProgramRecord programRecord, List<String> attributesToRetain, List<String> attributesToRemove, Integer read1BasesTrimmed, Integer read2BasesTrimmed, List<htsjdk.samtools.SamPairUtil.PairOrientation> expectedOrientations, htsjdk.samtools.SAMFileHeader.SortOrder sortOrder, PrimaryAlignmentSelectionStrategy primaryAlignmentSelectionStrategy, boolean addMateCigar, boolean unmapContaminantReads)
Constructor with a default setting for unmappingReadsStrategy. See the full constructor for parameters.
-
AbstractAlignmentMerger
public AbstractAlignmentMerger(File unmappedBamFile, File targetBamFile, File referenceFasta, boolean clipAdapters, boolean bisulfiteSequence, boolean alignedReadsOnly, htsjdk.samtools.SAMProgramRecord programRecord, List<String> attributesToRetain, List<String> attributesToRemove, Integer read1BasesTrimmed, Integer read2BasesTrimmed, List<htsjdk.samtools.SamPairUtil.PairOrientation> expectedOrientations, htsjdk.samtools.SAMFileHeader.SortOrder sortOrder, PrimaryAlignmentSelectionStrategy primaryAlignmentSelectionStrategy, boolean addMateCigar, boolean unmapContaminantReads, AbstractAlignmentMerger.UnmappingReadStrategy unmappingReadsStrategy)
Constructor
Parameters:
unmappedBamFile - The BAM file that was used as the input to the aligner, which will include info on all the reads that did not map. Required.
targetBamFile - The file to which to write the merged SAM records. Required.
referenceFasta - The reference sequence for the map files. Required.
clipAdapters - Whether adapters marked in the unmapped BAM file should be marked as soft clipped in the merged BAM. Required.
bisulfiteSequence - Whether the reads are bisulfite sequence (used when calculating the NM and UQ tags). Required.
alignedReadsOnly - Whether to output only those reads that have alignment data.
programRecord - Program record for target file SAMRecords created.
attributesToRetain - Private attributes from the alignment record that should be included when merging. This overrides the exclusion of attributes whose tags start with the reserved characters X, Y, and Z.
attributesToRemove - Attributes from the alignment record that should be removed when merging. This overrides attributesToRetain if they share common tags.
read1BasesTrimmed - The number of bases trimmed from the start of read 1 prior to alignment. Optional.
read2BasesTrimmed - The number of bases trimmed from the start of read 2 prior to alignment. Optional.
expectedOrientations - A List of SamPairUtil.PairOrientations that are expected for aligned pairs. Used to determine the properPair flag.
sortOrder - The order in which the merged records should be output. If null, output will be coordinate-sorted.
primaryAlignmentSelectionStrategy - What to do when there are multiple primary alignments, or multiple alignments but none primary, for a read or read pair.
addMateCigar - True if we are to add or maintain the mate CIGAR (MC) tag, false if we are to remove or not include it.
unmapContaminantReads - If true, identify reads having the signature of cross-species contamination (i.e. mostly clipped bases), and mark them as unmapped.
unmappingReadsStrategy - An enum describing how to deal with reads whose mapping information is being removed (currently this happens due to cross-species contamination). Ignored unless unmapContaminantReads is true.
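The precedence between attributesToRetain, attributesToRemove, and the reserved X/Y/Z tag prefixes can be summarised as a single decision function. The sketch below is one reading of the parameter descriptions, not Picard's implementation: the class and method names are hypothetical, and the only rules encoded are the ones stated above (removal wins over retention, and retention wins over the reserved-prefix exclusion).

```java
import java.util.Collection;

// Hypothetical helper, not part of Picard or htsjdk.
final class AttributeMergeRule {

    /** True when the tag starts with one of the reserved characters X, Y, or Z. */
    static boolean hasReservedPrefix(String tag) {
        if (tag.isEmpty()) {
            return false;
        }
        char first = tag.charAt(0);
        return first == 'X' || first == 'Y' || first == 'Z';
    }

    /** Decides whether a tag on the alignment record would be carried over when merging. */
    static boolean shouldCopy(String tag,
                              Collection<String> attributesToRetain,
                              Collection<String> attributesToRemove) {
        if (attributesToRemove.contains(tag)) {
            return false; // removal overrides retention for shared tags
        }
        if (attributesToRetain.contains(tag)) {
            return true;  // retention overrides the reserved-prefix exclusion
        }
        return !hasReservedPrefix(tag);
    }
}
```

Under these assumptions, a tag such as XS is dropped by default, kept when listed in attributesToRetain, and dropped again when it appears in both lists.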
-
-
Method Detail
-
getDictionaryForMergedBam
protected abstract htsjdk.samtools.SAMSequenceDictionary getDictionaryForMergedBam()
-
getQuerynameSortedAlignedRecords
protected abstract htsjdk.samtools.util.CloseableIterator<htsjdk.samtools.SAMRecord> getQuerynameSortedAlignedRecords()
-
ignoreAlignment
protected boolean ignoreAlignment(htsjdk.samtools.SAMRecord sam)
-
isContaminant
protected boolean isContaminant(picard.sam.HitsForInsert hits)
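The unmapContaminantReads parameter describes the contamination signature as "mostly clipped bases". As a rough illustration of that idea only, the sketch below measures the soft-clipped fraction of a CIGAR string. Everything here is an assumption: the class and method names are hypothetical, the 0.5 threshold is arbitrary, and the real isContaminant works on a HitsForInsert rather than a CIGAR string, with criteria that this page does not specify.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Picard or htsjdk.
final class ClippedFractionSketch {

    private static final Pattern CIGAR_OP = Pattern.compile("(\\d+)([MIDNSHP=X])");

    /** Fraction of read bases that are soft clipped (operator S) in the given CIGAR string. */
    static double clippedFraction(String cigar) {
        int clipped = 0;
        int readBases = 0;
        Matcher m = CIGAR_OP.matcher(cigar);
        while (m.find()) {
            int length = Integer.parseInt(m.group(1));
            char op = m.group(2).charAt(0);
            // M, I, S, = and X are the operators that consume bases of the read.
            if (op == 'M' || op == 'I' || op == 'S' || op == '=' || op == 'X') {
                readBases += length;
            }
            if (op == 'S') {
                clipped += length;
            }
        }
        return readBases == 0 ? 0.0 : (double) clipped / readBases;
    }

    /** Assumed threshold: more than half of the read bases are soft clipped. */
    static boolean mostlyClipped(String cigar) {
        return clippedFraction(cigar) > 0.5;
    }
}
```

With this threshold, a read aligned as 80S20M counts as mostly clipped, while 10S90M and 100M do not.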
-
getAttributesToReverse
public Set<String> getAttributesToReverse()
Gets the set of attributes to be reversed on reads marked as negative strand.
-
setAttributesToReverse
public void setAttributesToReverse(Set<String> attributesToReverse)
Sets the set of attributes to be reversed on reads marked as negative strand.
-
getAttributesToReverseComplement
public Set<String> getAttributesToReverseComplement()
Gets the set of attributes to be reverse complemented on reads marked as negative strand.
-
setAttributesToReverseComplement
public void setAttributesToReverseComplement(Set<String> attributesToReverseComplement)
Sets the set of attributes to be reverse complemented on reads marked as negative strand.
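The difference between the two attribute sets comes down to the operation applied to each value when a read is on the negative strand. The following standalone sketch shows both operations on string values. It is not taken from Picard: the class and method names are hypothetical, and which tags belong in either set is not stated on this page.

```java
// Hypothetical helper, not part of Picard or htsjdk.
final class NegativeStrandAttributes {

    /** Reverses a per-base value such as a quality string. */
    static String reverse(String value) {
        return new StringBuilder(value).reverse().toString();
    }

    /** Reverses a base string and complements every base. */
    static String reverseComplement(String bases) {
        StringBuilder out = new StringBuilder(bases.length());
        for (int i = bases.length() - 1; i >= 0; i--) {
            out.append(complement(bases.charAt(i)));
        }
        return out.toString();
    }

    /** Complements A/C/G/T in either case; any other character (for example N) is unchanged. */
    static char complement(char base) {
        switch (base) {
            case 'A': return 'T';
            case 'T': return 'A';
            case 'C': return 'G';
            case 'G': return 'C';
            case 'a': return 't';
            case 't': return 'a';
            case 'c': return 'g';
            case 'g': return 'c';
            default:  return base;
        }
    }
}
```

A per-base quality attribute would only need reverse, whereas an attribute holding bases would need reverseComplement so that it stays consistent with the stored read sequence.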
-
setMaxRecordsInRam
public void setMaxRecordsInRam(int maxRecordsInRam)
Allows the caller to override the maximum records in RAM.
-
setAddPGTagToReads
public void setAddPGTagToReads(boolean addPGTagToReads)
Sets addPGTagToReads. If true, the PG tag will be added to reads when applicable. If false, the PG tag will not be added. Default is true.
-
mergeAlignment
public void mergeAlignment(File referenceFasta)
Merges the alignment data with the non-aligned records from the source BAM file.
-
fixNmMdAndUq
public static void fixNmMdAndUq(htsjdk.samtools.SAMRecord record, htsjdk.samtools.reference.ReferenceSequenceFileWalker refSeqWalker, boolean isBisulfiteSequence)
Calculates and sets the NM, MD, and UQ tags from the record and the reference.
Parameters:
record - the record to be fixed
refSeqWalker - a ReferenceSequenceFileWalker that will be used to traverse the reference
isBisulfiteSequence - a flag indicating whether the sequence came from bisulfite sequencing, which would imply a different calculation of the NM tag.
No return value; modifies the provided record.
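To make the effect of the bisulfite flag concrete, here is a deliberately simplified sketch of a mismatch count and a mismatch-quality sum. It rests on several assumptions that are not confirmed by this page: it handles only gapless, forward-strand alignments, it treats a reference C read as T as a non-mismatch in bisulfite mode, and it treats UQ as the sum of base qualities at mismatching positions. A full NM calculation also counts inserted and deleted bases, and the MD tag is not covered at all. All names are hypothetical.

```java
// Hypothetical helper, not part of Picard or htsjdk.
final class MismatchTagSketch {

    /** Assumption: in bisulfite mode a forward-strand C->T change is a conversion, not a mismatch. */
    static boolean basesMatch(char ref, char read, boolean bisulfite) {
        if (ref == read) {
            return true;
        }
        return bisulfite && ref == 'C' && read == 'T';
    }

    /** Mismatch count for a gapless alignment of read against ref (same length). */
    static int countMismatches(String ref, String read, boolean bisulfite) {
        requireSameLength(ref.length(), read.length());
        int mismatches = 0;
        for (int i = 0; i < ref.length(); i++) {
            if (!basesMatch(ref.charAt(i), read.charAt(i), bisulfite)) {
                mismatches++;
            }
        }
        return mismatches;
    }

    /** Sum of base qualities at mismatching positions (assumed stand-in for UQ). */
    static int sumMismatchQualities(String ref, String read, int[] qualities, boolean bisulfite) {
        requireSameLength(ref.length(), read.length());
        requireSameLength(read.length(), qualities.length);
        int sum = 0;
        for (int i = 0; i < ref.length(); i++) {
            if (!basesMatch(ref.charAt(i), read.charAt(i), bisulfite)) {
                sum += qualities[i];
            }
        }
        return sum;
    }

    private static void requireSameLength(int a, int b) {
        if (a != b) {
            throw new IllegalArgumentException("Inputs must have equal length: " + a + " vs " + b);
        }
    }
}
```

For reference ACGT and read ATGT with qualities 10, 20, 30, 40, the sketch reports one mismatch and a quality sum of 20 in normal mode, and zero for both in bisulfite mode.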
-
fixUq
public static void fixUq(htsjdk.samtools.SAMRecord record, htsjdk.samtools.reference.ReferenceSequenceFileWalker refSeqWalker, boolean isBisulfiteSequence)
Calculates and sets the UQ tag from the record and the reference.
Parameters:
record - the record to be fixed
refSeqWalker - a ReferenceSequenceFileWalker that will be used to traverse the reference
isBisulfiteSequence - a flag indicating whether the sequence came from bisulfite sequencing.
No return value; modifies the provided record.
-
clipForOverlappingReads
protected static void clipForOverlappingReads(htsjdk.samtools.SAMRecord read1, htsjdk.samtools.SAMRecord read2)
Checks to see whether the ends of the reads overlap and soft-clips them if necessary.
-
setValuesFromAlignment
protected void setValuesFromAlignment(htsjdk.samtools.SAMRecord rec, htsjdk.samtools.SAMRecord alignment, boolean needsSafeReverseComplement)
Sets the values from the alignment record on the unaligned BAM record. This preserves all data from the unaligned record (ReadGroup, NoiseRead status, etc.) and adds all the alignment info.
Parameters:
rec - The unaligned read record
alignment - The alignment record
-
createNewCigarsIfMapsOffEndOfReference
public static void createNewCigarsIfMapsOffEndOfReference(htsjdk.samtools.SAMRecord rec)
Soft-clip an alignment that hangs off the end of its reference sequence. Checks both the read and its mate, if available.
Parameters:
rec
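The arithmetic behind clipping an alignment that runs past the end of its reference can be sketched as follows. This is a simplified illustration rather than the method's actual logic: it assumes 1-based inclusive coordinates, an alignment made of a single match block with no indels or existing clipping, and hypothetical class and method names.

```java
// Hypothetical helper, not part of Picard or htsjdk.
final class OffEndClipSketch {

    /** Number of aligned bases that extend beyond the last reference position (1-based). */
    static int overhang(int alignmentStart, int alignedLength, int referenceLength) {
        int alignmentEnd = alignmentStart + alignedLength - 1;
        return Math.max(0, alignmentEnd - referenceLength);
    }

    /**
     * Builds a CIGAR for a read aligned as one match block, soft clipping
     * whatever hangs off the end of the reference.
     */
    static String clippedCigar(int alignmentStart, int readLength, int referenceLength) {
        int clip = overhang(alignmentStart, readLength, referenceLength);
        if (clip == 0) {
            return readLength + "M";
        }
        if (clip >= readLength) {
            // The whole read lies beyond the reference; this sketch does not handle that case.
            throw new IllegalArgumentException("Alignment starts beyond the end of the reference");
        }
        return (readLength - clip) + "M" + clip + "S";
    }
}
```

For a 10-base read starting at position 96 on a 100-base reference, the alignment would end at 105, so five bases are clipped and the CIGAR becomes 5M5S.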
-
updateCigarForTrimmedOrClippedBases
protected void updateCigarForTrimmedOrClippedBases(htsjdk.samtools.SAMRecord rec, htsjdk.samtools.SAMRecord alignment)
-
getProgramRecord
protected htsjdk.samtools.SAMProgramRecord getProgramRecord()
-
setProgramRecord
protected void setProgramRecord(htsjdk.samtools.SAMProgramRecord pg)
-
isReservedTag
protected boolean isReservedTag(String tag)
-
getHeader
protected htsjdk.samtools.SAMFileHeader getHeader()
-
resetRefSeqFileWalker
protected void resetRefSeqFileWalker()
-
isClipOverlappingReads
public boolean isClipOverlappingReads()
-
setClipOverlappingReads
public void setClipOverlappingReads(boolean clipOverlappingReads)
-
isKeepAlignerProperPairFlags
public boolean isKeepAlignerProperPairFlags()
-
setKeepAlignerProperPairFlags
public void setKeepAlignerProperPairFlags(boolean keepAlignerProperPairFlags)
If true, keep the aligner's idea of proper pairs rather than letting alignment merger decide.
-
setIncludeSecondaryAlignments
public void setIncludeSecondaryAlignments(boolean includeSecondaryAlignments)
-
close
public void close()
-
-