Package htsjdk.samtools
Class SRAIndex
java.lang.Object
htsjdk.samtools.SRAIndex
All Implemented Interfaces:
BAMIndex, BrowseableBAMIndex, Closeable, AutoCloseable
Emulates a BAM index so that chunks of records can be requested from SRAFileReader.
Here is how it works:
SRA supports fast reading of alignments by reference position, so the part of the "file" range used for alignments
is as long as all references combined. Reading unaligned reads is also fast if read positions are used for lookup and
aligned fragments are filtered out internally.
The total SRA "file" range is the sum of all reference lengths plus the number of reads (both aligned and unaligned)
in the SRA archive.
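The range arithmetic described above can be sketched as follows. This is only an illustration, not the htsjdk implementation: the class name SraRangeSketch, its methods, and the sample numbers are hypothetical.

```java
// Hypothetical sketch of the SRA "file" range arithmetic; not htsjdk code.
final class SraRangeSketch {

    /** Sum of all reference lengths: the part of the range used for alignment lookups. */
    static long alignedRangeLength(long[] referenceLengths) {
        long total = 0;
        for (long length : referenceLengths) {
            total += length;
        }
        return total;
    }

    /** Total range: all reference lengths plus the number of reads (aligned and unaligned). */
    static long totalRangeLength(long[] referenceLengths, long totalReadCount) {
        return alignedRangeLength(referenceLengths) + totalReadCount;
    }

    public static void main(String[] args) {
        long[] referenceLengths = {1_000L, 2_500L}; // made-up reference lengths
        long totalReadCount = 300L;                 // made-up read count
        System.out.println(alignedRangeLength(referenceLengths));
        System.out.println(totalRangeLength(referenceLengths, totalReadCount));
    }
}
```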
Chunks can now be used to look up both aligned and unaligned fragments.
BAM index bins are emulated by mapping SRA reference positions to bin numbers.
Each bin number is then mapped to a list of chunks, which represent SRA "file" positions (these are simply reference
positions).
Only the last level of BAM index bins is emulated; each such bin refers to a portion of the reference SRA_BIN_SIZE bases long.
For all other bins a RuntimeException is thrown (since only the SRAIndex class creates bins,
that is fine).
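A minimal sketch of this last-level emulation is shown below. It assumes the numbering of the standard BAM binning scheme, where the last level starts at bin 4681 and each of its bins covers 16384 bases; whether SRA_BIN_SIZE has that value is not stated here, so treat both constants as assumptions. The class and method names are hypothetical.

```java
// Hypothetical sketch of last-level bin emulation; constants are assumptions.
final class SraBinSketch {
    static final int BIN_SIZE = 16_384;            // bases per last-level bin (assumed)
    static final int FIRST_LAST_LEVEL_BIN = 4_681; // first bin number of the last level

    /** Maps a 1-based reference position to its last-level bin number. */
    static int binForPosition(int position) {
        if (position < 1) {
            throw new IllegalArgumentException("position must be 1-based: " + position);
        }
        return FIRST_LAST_LEVEL_BIN + (position - 1) / BIN_SIZE;
    }

    /** Rejects bins that do not belong to the last level. */
    static void requireLastLevel(int binNumber) {
        if (binNumber < FIRST_LAST_LEVEL_BIN) {
            throw new RuntimeException("only last-level bins are supported: " + binNumber);
        }
    }

    public static void main(String[] args) {
        System.out.println(binForPosition(1));
        System.out.println(binForPosition(BIN_SIZE + 1));
        requireLastLevel(binForPosition(1)); // passes silently
    }
}
```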
Because the last level of bins was not meant to refer to fragments that only partially overlap the bin's reference
positions, the returned chunk also starts 5000 bases to the left of the beginning of the bin. This ensures that fragments
which start before the bin but still overlap it can be retrieved by the SRA reader.
Support for querying the maximum number of bases to go left to retrieve such fragments is planned for the NGS API.
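The left extension can be illustrated with a small sketch. The 5000-base value comes from the description above; the 0-based offsets, the clamping at the start of the reference, and all names are assumptions made for illustration, not the library's code.

```java
// Hypothetical sketch of the 5000-base left extension; not htsjdk code.
final class SraOverlapSketch {
    static final int LEFT_EXTENSION = 5_000; // bases to go left, as described above

    /**
     * Returns {start, end} of the region to read for a bin, as 0-based offsets
     * within one reference (end exclusive). Clamping at 0 is an assumption.
     */
    static long[] regionForBin(int binIndexInLevel, int binSize) {
        long binStart = (long) binIndexInLevel * binSize;
        long start = Math.max(0L, binStart - LEFT_EXTENSION);
        long end = binStart + binSize;
        return new long[] {start, end};
    }

    public static void main(String[] args) {
        long[] region = regionForBin(1, 16_384); // bin size is an assumed value
        System.out.println(region[0] + ".." + region[1]);
    }
}
```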
Created by andrii.nikitiuk on 9/4/15.
-
Field Summary
Fields
Modifier and Type | Field | Description
static final int | SRA_BIN_SIZE | Number of reference bases that a bin in the last level can represent
static final int | SRA_CHUNK_SIZE | Chunks of this size are created when the SRA index is used
Fields inherited from interface htsjdk.samtools.BAMIndex
BAI_INDEX_SUFFIX, BAMIndexSuffix, CSI_INDEX_SUFFIX
Constructor Summary
Constructors -
Method Summary
Modifier and Type | Method | Description
void | close() | Close the index and release any associated resources.
 | getBinsOverlapping(int referenceIndex, int startPos, int endPos) | Provides a list of bins that contain bases at the requested positions.
int | getFirstLocusInBin(Bin bin) | Gets the first locus that this bin can index into.
int | getLastLocusInBin(Bin bin) | Gets the last locus that this bin can index into.
int | getLevelForBin(Bin bin) | SRA only operates on bins from the last level.
int | getLevelSize(int levelNumber) | Gets the size (number of bins in) a given level of a BAM index.
 | getMetaData(int reference) | Gets meta data for the given reference including information about number of aligned, unaligned, and noCoordinate records.
 | getSpanOverlapping(int referenceIndex, int startPos, int endPos) | Gets the compressed chunks which should be searched for the contents of records contained by the span referenceIndex:startPos-endPos, inclusive.
 | getSpanOverlapping(Bin bin) | Perform an overlapping query of all bins bounding the given location.
long | getStartOfLastLinearBin() | Gets the start of the last linear bin in the index.
-
Field Details
-
SRA_BIN_SIZE
public static final int SRA_BIN_SIZE
Number of reference bases that a bin in the last level can represent
-
SRA_CHUNK_SIZE
public static final int SRA_CHUNK_SIZE
Chunks of this size are created when the SRA index is used
-
-
Constructor Details
-
SRAIndex
Parameters:
header - SAM header
recordRangeInfo - info about record ranges within the SRA archive
-
-
Method Details
-
getLevelSize
public int getLevelSize(int levelNumber)
Gets the size (number of bins in) a given level of a BAM index.
Specified by:
getLevelSize in interface BrowseableBAMIndex
Parameters:
levelNumber - Level for which to inspect the size.
Returns:
Size of the given level.
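For orientation, the level sizes of the standard BAM binning scheme can be computed as in the sketch below. It assumes the six-level scheme from the BAM specification (1, 8, 64, 512, 4096 and 32768 bins); the values SRAIndex actually returns are not stated here, and the class and method names are hypothetical.

```java
// Hypothetical sketch of the standard BAM binning levels; not the SRAIndex source.
final class BamLevelSketch {
    static final int LEVEL_COUNT = 6; // levels 0..5 in the standard scheme

    /** Number of bins in a level: each level has 8 times as many bins as the previous one. */
    static int levelSize(int levelNumber) {
        checkLevel(levelNumber);
        return 1 << (3 * levelNumber);
    }

    /** Number of the first bin in a level: the count of bins in all previous levels. */
    static int levelStart(int levelNumber) {
        checkLevel(levelNumber);
        return ((1 << (3 * levelNumber)) - 1) / 7;
    }

    private static void checkLevel(int levelNumber) {
        if (levelNumber < 0 || levelNumber >= LEVEL_COUNT) {
            throw new IllegalArgumentException("no such level: " + levelNumber);
        }
    }

    public static void main(String[] args) {
        for (int level = 0; level < LEVEL_COUNT; level++) {
            System.out.println(level + ": start=" + levelStart(level) + ", size=" + levelSize(level));
        }
    }
}
```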
-
getLevelForBin
SRA only operates on bins from the last level.
Specified by:
getLevelForBin in interface BrowseableBAMIndex
Parameters:
bin - The bin for which to determine the level.
Returns:
bin level
-
getFirstLocusInBin
Gets the first locus that this bin can index into.
Specified by:
getFirstLocusInBin in interface BrowseableBAMIndex
Parameters:
bin - The bin to test.
Returns:
first position associated with the given bin number
-
getLastLocusInBin
Gets the last locus that this bin can index into.
Specified by:
getLastLocusInBin in interface BrowseableBAMIndex
Parameters:
bin - The bin to test.
Returns:
last position associated with the given bin number
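The relationship between a last-level bin number and its first and last locus can be sketched as follows. The constants (last level starting at bin 4681, 16384 bases per bin) follow the standard BAM binning scheme and are assumptions about SRA_BIN_SIZE; the names are hypothetical and loci are taken to be 1-based.

```java
// Hypothetical sketch of bin-to-locus arithmetic; constants are assumptions.
final class SraLocusSketch {
    static final int BIN_SIZE = 16_384;            // bases per last-level bin (assumed)
    static final int FIRST_LAST_LEVEL_BIN = 4_681; // first bin number of the last level

    /** First 1-based locus covered by the given last-level bin. */
    static int firstLocusInBin(int binNumber) {
        if (binNumber < FIRST_LAST_LEVEL_BIN) {
            throw new IllegalArgumentException("not a last-level bin: " + binNumber);
        }
        return (binNumber - FIRST_LAST_LEVEL_BIN) * BIN_SIZE + 1;
    }

    /** Last 1-based locus covered by the given last-level bin. */
    static int lastLocusInBin(int binNumber) {
        return firstLocusInBin(binNumber) + BIN_SIZE - 1;
    }

    public static void main(String[] args) {
        int bin = FIRST_LAST_LEVEL_BIN + 1;
        System.out.println(firstLocusInBin(bin) + "-" + lastLocusInBin(bin));
    }
}
```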
-
getBinsOverlapping
Provides a list of bins that contain bases at the requested positions.
Specified by:
getBinsOverlapping in interface BrowseableBAMIndex
Parameters:
referenceIndex - sequence of desired SAMRecords
startPos - 1-based start of the desired interval, inclusive
endPos - 1-based end of the desired interval, inclusive
Returns:
a list of bins that contain relevant data
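How a 1-based inclusive interval might translate into last-level bin numbers is sketched below. This is not the method's source: the constants are assumed values from the standard BAM binning scheme, the reference index is left out, and the class and method names are hypothetical.

```java
// Hypothetical sketch of interval-to-bin enumeration; constants are assumptions.
final class SraBinsOverlappingSketch {
    static final int BIN_SIZE = 16_384;            // bases per last-level bin (assumed)
    static final int FIRST_LAST_LEVEL_BIN = 4_681; // first bin number of the last level

    /** Returns the last-level bin numbers that contain bases in [startPos, endPos]. */
    static int[] binsOverlapping(int startPos, int endPos) {
        if (startPos < 1 || endPos < startPos) {
            throw new IllegalArgumentException("invalid interval: " + startPos + "-" + endPos);
        }
        int firstBin = FIRST_LAST_LEVEL_BIN + (startPos - 1) / BIN_SIZE;
        int lastBin = FIRST_LAST_LEVEL_BIN + (endPos - 1) / BIN_SIZE;
        int[] bins = new int[lastBin - firstBin + 1];
        for (int i = 0; i < bins.length; i++) {
            bins[i] = firstBin + i;
        }
        return bins;
    }

    public static void main(String[] args) {
        for (int bin : binsOverlapping(16_000, 17_000)) {
            System.out.println(bin);
        }
    }
}
```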
-
getSpanOverlapping
Description copied from interface: BrowseableBAMIndex
Perform an overlapping query of all bins bounding the given location.
Specified by:
getSpanOverlapping in interface BrowseableBAMIndex
Parameters:
bin - The bin over which to perform an overlapping query.
Returns:
The file pointers
-
getSpanOverlapping
Description copied from interface: BAMIndex
Gets the compressed chunks which should be searched for the contents of records contained by the span referenceIndex:startPos-endPos, inclusive. See the BAM spec for more information on how a chunk is represented.
Specified by:
getSpanOverlapping in interface BAMIndex
Parameters:
referenceIndex - The contig.
startPos - Genomic start of query.
endPos - Genomic end of query.
Returns:
A file span listing the chunks in the BAM file.
-
getStartOfLastLinearBin
public long getStartOfLastLinearBin()
Description copied from interface: BAMIndex
Gets the start of the last linear bin in the index.
Specified by:
getStartOfLastLinearBin in interface BAMIndex
Returns:
a position where aligned fragments end
-
getMetaData
Description copied from interface: BAMIndex
Gets meta data for the given reference including information about number of aligned, unaligned, and noCoordinate records
Specified by:
getMetaData in interface BAMIndex
Parameters:
reference - the reference of interest
Returns:
meta data for the reference
-
close
public void close()
Description copied from interface: BAMIndex
Close the index and release any associated resources.
-