Package htsjdk.variant.vcf
Class AbstractVCFCodec
- All Implemented Interfaces:
FeatureCodec<VariantContext,LineIterator>, NameAwareCodec
public abstract class AbstractVCFCodec
extends AsciiFeatureCodec<VariantContext>
implements NameAwareCodec
-
Field Summary
Fields
Modifier and Type, Field - Description
protected boolean doOnTheFlyModifications - If true, then we'll magically fix up VCF headers on the fly when we read them in
protected String[] genotypeParts
protected VCFHeader header
protected int lineNo
protected final String[] locParts
static final int MAX_ALLELE_SIZE_BEFORE_WARNING
protected String name
protected static final int NUM_STANDARD_FIELDS
protected String[] parts
protected String remappedSampleName - If non-null, we will replace the sample name read from the VCF header with this sample name.
static boolean validate
protected VCFHeaderVersion version
protected boolean warnedAboutNoEqualsForNonFlag
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method - Description
static boolean canDecodeFile(String potentialInput, String MAGIC_HEADER_LINE)
LazyGenotypesContext.LazyData createGenotypeMap(String str, List<Allele> alleles, String chr, int pos) - create a genotype map
decode - decode the line into a feature (VariantContext)
decodeLoc - the fast decode function
final void disableOnTheFlyModifications() - Forces all VCFCodecs to not perform any on the fly modifications to the VCF header of VCF records.
protected void generateException(String message)
protected static void generateException(String message, int lineNo)
getAltHeaderLine(String headerLineString, VCFHeaderVersion sourceVersion) - Create and return a VCFAltHeaderLine object from a header line string that conforms to the sourceVersion
protected String getCachedString(String str) - Return a cached copy of the supplied string.
getMetaHeaderLine(String headerLineString, VCFHeaderVersion sourceVersion) - Create and return a VCFMetaHeaderLine object from a header line string that conforms to the sourceVersion
getName() - get the name of this codec
VCFPedigreeHeaderLine getPedigreeHeaderLine(String headerLineString, VCFHeaderVersion sourceVersion) - Create and return a VCFPedigreeHeaderLine object from a header line string that conforms to the sourceVersion
VCFSampleHeaderLine getSampleHeaderLine(String headerLineString, VCFHeaderVersion sourceVersion) - Create and return a VCFSampleHeaderLine object from a header line string that conforms to the sourceVersion
getTabixFormat - Define the tabix format for the feature, used for indexing.
protected static Allele oneAllele - create an allele from an index and an array of alleles
parseAlleles(String ref, String alts, int lineNo) - parse out the alleles
parseFilters(String filterString) - parse the filter string, first checking to see if we already have parsed it in a previous attempt
protected static List<Allele> parseGenotypeAlleles(String GT, List<Allele> alleles, Map<String, List<Allele>> cache) - parse genotype alleles from the genotype string
protected VCFHeader parseHeaderFromLines(List<String> headerStrings, VCFHeaderVersion version) - create a VCF header from a set of header record lines
protected static Double parseQual - parse out the qual value
void setName - set the name of this codec
void setRemappedSampleName(String remappedSampleName) - Replaces the sample name read from the VCF header with the remappedSampleName.
setVCFHeader(VCFHeader newHeader, VCFHeaderVersion newVersion) - Explicitly set the VCFHeader on this codec.
Methods inherited from class htsjdk.tribble.AsciiFeatureCodec
close, decode, isDone, makeIndexableSourceFromStream, makeSourceFromStream, readActualHeader, readHeader
Methods inherited from class htsjdk.tribble.AbstractFeatureCodec
decodeLoc, getFeatureType
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface htsjdk.tribble.FeatureCodec
canDecode, getPathToDataFile
-
Field Details
-
MAX_ALLELE_SIZE_BEFORE_WARNING
public static final int MAX_ALLELE_SIZE_BEFORE_WARNING
-
NUM_STANDARD_FIELDS
protected static final int NUM_STANDARD_FIELDS
-
header
-
version
-
alleleMap
-
validate
public static boolean validate
-
parts
-
genotypeParts
-
locParts
-
filterHash
-
name
-
lineNo
protected int lineNo
-
stringCache
-
warnedAboutNoEqualsForNonFlag
protected boolean warnedAboutNoEqualsForNonFlag
-
doOnTheFlyModifications
protected boolean doOnTheFlyModifications
If true, then we'll magically fix up VCF headers on the fly when we read them in
-
remappedSampleName
If non-null, we will replace the sample name read from the VCF header with this sample name. This feature works only for single-sample VCFs.
-
-
Constructor Details
-
AbstractVCFCodec
protected AbstractVCFCodec()
-
-
Method Details
-
parseFilters
parse the filter string, first checking to see if we already have parsed it in a previous attempt
- Parameters:
filterString - the string to parse
- Returns:
- a set of the filters applied
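
The "check whether we already parsed it" step can be illustrated with a small memoizing parser. This is a minimal sketch, not the htsjdk implementation: the class name `FilterCacheSketch`, the use of plain strings, and the handling of `PASS` and `.` (following the VCF convention of a semicolon-separated FILTER column) are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the codec's filter cache; not part of htsjdk.
final class FilterCacheSketch {
    // Maps a raw FILTER column value to its already-parsed set of filter names.
    private final Map<String, Set<String>> filterCache = new HashMap<>();

    Set<String> parseFilters(String filterString) {
        // Assumption: "." (missing) is reported as null, i.e. "no filtering information".
        if (filterString.equals(".")) {
            return null;
        }
        // Assumption: "PASS" means no filter failed, so the set is empty.
        if (filterString.equals("PASS")) {
            return Collections.emptySet();
        }
        // Reuse the result of a previous parse of the same string.
        Set<String> cached = filterCache.get(filterString);
        if (cached != null) {
            return cached;
        }
        Set<String> parsed = Collections.unmodifiableSet(
                new LinkedHashSet<>(Arrays.asList(filterString.split(";"))));
        filterCache.put(filterString, parsed);
        return parsed;
    }

    public static void main(String[] args) {
        FilterCacheSketch sketch = new FilterCacheSketch();
        Set<String> first = sketch.parseFilters("q10;s50");
        Set<String> second = sketch.parseFilters("q10;s50");
        System.out.println(first + " reused=" + (first == second));
    }
}
```

Because many records in a file tend to share the same FILTER value, caching the parsed set avoids splitting the same string repeatedly. The sets are unmodifiable so that a shared instance cannot be altered by one caller.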
-
parseHeaderFromLines
create a VCF header from a set of header record lines
- Parameters:
headerStrings - a list of strings that represent all the ## and # entries
- Returns:
- a VCFHeader object
-
getHeader
- Returns:
- the header that was either explicitly set on this codec, or read from the file. May be null. The returned value should not be modified.
-
getVersion
- Returns:
- the version number that was either explicitly set on this codec, or read from the file. May be null.
-
setVCFHeader
Explicitly set the VCFHeader on this codec. This will overwrite the header read from the file and the version state stored in this instance; conversely, reading the header from a file will overwrite whatever is set here.
- Parameters:
newHeader
newVersion
- Returns:
- the actual header for this codec. The returned header may not be identical to the header argument since the header lines may be "repaired" (i.e., rewritten) if doOnTheFlyModifications is set.
- Throws:
TribbleException- if the requested header version is not compatible with the existing version
-
getAltHeaderLine
Create and return a VCFAltHeaderLine object from a header line string that conforms to the sourceVersion
- Parameters:
headerLineString - VCF header line being parsed without the leading "##ALT="
sourceVersion - the VCF header version of the source from which the header line was retrieved. The resulting header line object should be valid for this header version.
- Returns:
- a VCFAltHeaderLine object
-
getPedigreeHeaderLine
public VCFPedigreeHeaderLine getPedigreeHeaderLine(String headerLineString, VCFHeaderVersion sourceVersion)
Create and return a VCFPedigreeHeaderLine object from a header line string that conforms to the sourceVersion
- Parameters:
headerLineString - VCF header line being parsed without the leading "##PEDIGREE="
sourceVersion - the VCF header version of the source from which the header line was retrieved. The resulting header line object should be valid for this header version.
- Returns:
- a VCFPedigreeHeaderLine object
-
getMetaHeaderLine
Create and return a VCFMetaHeaderLine object from a header line string that conforms to the sourceVersion
- Parameters:
headerLineString - VCF header line being parsed without the leading "##META="
sourceVersion - the VCF header version of the source from which the header line was retrieved. The resulting header line object should be valid for this header version.
- Returns:
- a VCFMetaHeaderLine object
-
getSampleHeaderLine
public VCFSampleHeaderLine getSampleHeaderLine(String headerLineString, VCFHeaderVersion sourceVersion)
Create and return a VCFSampleHeaderLine object from a header line string that conforms to the sourceVersion
- Parameters:
headerLineString - VCF header line being parsed without the leading "##SAMPLE="
sourceVersion - the VCF header version of the source from which the header line was retrieved. The resulting header line object should be valid for this header version.
- Returns:
- a VCFSampleHeaderLine object
-
decodeLoc
the fast decode function
- Parameters:
line - the line of text for the record
- Returns:
- a feature, (not guaranteed complete) that has the correct start and stop
-
decode
decode the line into a feature (VariantContext)
- Specified by:
decode in class AsciiFeatureCodec<VariantContext>
- Parameters:
line - the line
- Returns:
- a VariantContext
-
getName
get the name of this codec
- Specified by:
getName in interface NameAwareCodec
- Returns:
- our set name
-
setName
set the name of this codec
- Specified by:
setName in interface NameAwareCodec
- Parameters:
name - new name
-
getCachedString
Return a cached copy of the supplied string.
- Parameters:
str - string
- Returns:
- interned string
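
A per-codec string cache of this kind can be sketched as below. This is an illustration under assumptions, not the library's code: the class name `StringCacheSketch` is hypothetical, and a plain `HashMap` is used here rather than `String.intern()`, so "interned" only means "one canonical instance per cache".

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the codec's stringCache field; not part of htsjdk.
final class StringCacheSketch {
    // Maps each distinct string value to the first instance seen with that value.
    private final Map<String, String> stringCache = new HashMap<>();

    String getCachedString(String str) {
        // putIfAbsent returns the existing canonical instance, or null if str was just stored.
        String existing = stringCache.putIfAbsent(str, str);
        return existing != null ? existing : str;
    }

    public static void main(String[] args) {
        StringCacheSketch sketch = new StringCacheSketch();
        String a = new String("GT:DP");
        String b = new String("GT:DP");
        // Both calls hand back the same object, so the duplicate can be garbage collected.
        System.out.println(sketch.getCachedString(a) == sketch.getCachedString(b));
    }
}
```

The benefit is memory: values that repeat on nearly every record, such as FORMAT keys, are held once instead of once per line.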
-
oneAllele
create an allele from an index and an array of alleles
- Parameters:
index - the index
alleles - the alleles
- Returns:
- an Allele
-
parseGenotypeAlleles
protected static List<Allele> parseGenotypeAlleles(String GT, List<Allele> alleles, Map<String, List<Allele>> cache)
parse genotype alleles from the genotype string
- Parameters:
GT - GT string
alleles - list of possible alleles
cache - cache of alleles for GT
- Returns:
- the allele list for the GT string
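
The relationship between oneAllele, parseGenotypeAlleles, and the cache argument can be sketched as follows. This is a simplified illustration, not the htsjdk code: alleles are plain strings instead of `Allele` objects, the class name `GenotypeAlleleSketch` is hypothetical, and the `/` and `|` separators and the `.` no-call marker are taken from the VCF GT convention.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch using String alleles; not part of htsjdk.
final class GenotypeAlleleSketch {
    static final String NO_CALL = ".";

    // Resolves one GT entry ("0", "1", ".", ...) against the record's allele list.
    static String oneAllele(String index, List<String> alleles) {
        if (index.equals(NO_CALL)) {
            return NO_CALL;
        }
        int i = Integer.parseInt(index);
        if (i < 0 || i >= alleles.size()) {
            throw new IllegalArgumentException("Allele index " + i + " out of range");
        }
        return alleles.get(i);
    }

    // Splits a GT string such as "0/1" or "1|1" and looks up each index.
    // The cache is keyed by GT string, so it is only valid for one allele list.
    static List<String> parseGenotypeAlleles(String gt, List<String> alleles,
                                             Map<String, List<String>> cache) {
        List<String> cached = cache.get(gt);
        if (cached != null) {
            return cached;
        }
        List<String> resolved = new ArrayList<>();
        for (String index : gt.split("[/|]")) {
            resolved.add(oneAllele(index, alleles));
        }
        List<String> result = Collections.unmodifiableList(resolved);
        cache.put(gt, result);
        return result;
    }

    public static void main(String[] args) {
        List<String> alleles = Arrays.asList("A", "T");
        Map<String, List<String>> cache = new HashMap<>();
        System.out.println(parseGenotypeAlleles("0/1", alleles, cache));
        System.out.println(parseGenotypeAlleles("./.", alleles, cache));
    }
}
```

In a multi-sample record many samples share the same GT string, so a cache scoped to a single record lets each distinct genotype be resolved once.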
-
parseQual
parse out the qual value
- Parameters:
qualString - the quality string
- Returns:
- return a double
-
parseAlleles
parse out the alleles
- Parameters:
ref - the reference base
alts - a string of alternates to break into alleles
lineNo - the line number for this record
- Returns:
- a list of alleles, and a pair of the shortest and longest sequence
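
Breaking the alternates string into alleles can be sketched as below. This is a minimal illustration under assumptions, not the library's method: alleles are plain strings, the class name `AlleleListSketch` is hypothetical, the comma separator and the `.` "no alternate" marker come from the VCF ALT column convention, and the shortest/longest bookkeeping mentioned in the return description is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch using String alleles; not part of htsjdk.
final class AlleleListSketch {
    // Returns the reference allele first, followed by each alternate in file order,
    // so that position 0 is the reference and position n is the n-th alternate.
    static List<String> parseAlleles(String ref, String alts) {
        List<String> alleles = new ArrayList<>();
        alleles.add(ref);
        // Assumption: "." in the ALT column means the record has no alternate alleles.
        if (!alts.equals(".")) {
            alleles.addAll(Arrays.asList(alts.split(",")));
        }
        return alleles;
    }

    public static void main(String[] args) {
        System.out.println(parseAlleles("A", "T,TG"));
        System.out.println(parseAlleles("A", "."));
    }
}
```

Keeping the reference at position 0 matters because genotype strings refer to alleles by their position in this list.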
-
canDecodeFile
-
createGenotypeMap
public LazyGenotypesContext.LazyData createGenotypeMap(String str, List<Allele> alleles, String chr, int pos)
create a genotype map
- Parameters:
str - the string
alleles - the list of alleles
- Returns:
- a mapping of sample name to genotype object
-
disableOnTheFlyModifications
public final void disableOnTheFlyModifications()
Forces all VCFCodecs to not perform any on the fly modifications to the VCF header of VCF records. Useful primarily for raw comparisons such as when comparing raw VCF records
-
setRemappedSampleName
Replaces the sample name read from the VCF header with the remappedSampleName. Works only for single-sample VCFs; attempting to perform sample name remapping for multi-sample VCFs will produce an Exception.
- Parameters:
remappedSampleName - replacement sample name for the sample specified in the VCF header
-
generateException
-
generateException
-
getTabixFormat
Description copied from interface: FeatureCodec
Define the tabix format for the feature, used for indexing. Default implementation throws an exception. Note that only AsciiFeatureCodec can read tabix files as defined in AbstractFeatureReader.getFeatureReader(String, String, FeatureCodec, boolean, java.util.function.Function, java.util.function.Function)
- Specified by:
getTabixFormat in interface FeatureCodec<VariantContext,LineIterator>
- Returns:
- the format to use with tabix
-