Package htsjdk.samtools.util
Class StringUtil
- java.lang.Object
  - htsjdk.samtools.util.StringUtil
public class StringUtil extends Object
Grab-bag of stateless String-oriented utilities.
-
Field Summary
static String EMPTY_STRING
    Returns Object.toString() of the provided value if it isn't null; "" otherwise.
-
Constructor Summary
StringUtil()
-
Method Summary
static String asEmptyIfNull(Object string)
static String assertCharactersNotInString(String illegalChars, char... chars)
    Checks that a String doesn't contain one or more characters of interest.
static String bytesToHexString(byte[] data)
    Convert a byte array into a String hex representation.
static String bytesToString(byte[] data)
static String bytesToString(byte[] buffer, int offset, int length)
static char byteToChar(byte b)
    Convert ASCII byte to ASCII char.
static void charsToBytes(char[] chars, int charOffset, int length, byte[] bytes, int byteOffset)
    Convert chars to bytes merely by casting.
static byte charToByte(char c)
    Convert ASCII char to byte.
static int fromHexDigit(char c)
static int hammingDistance(String s1, String s2)
    Calculates the Hamming distance (number of character mismatches) between two strings s1 and s2.
static byte[] hexStringToBytes(String s)
    Convert a String containing hex characters into an array of bytes with the binary representation of the hex string.
static String humanReadableByteCount(long bytes)
    Takes a long value representing the number of bytes and produces a human readable byte count.
static String intValuesToString(int[] intVals)
static String intValuesToString(short[] shortVals)
static boolean isBlank(String str)
    Checks if a String is whitespace, empty ("") or null.
static boolean isWithinHammingDistance(String s1, String s2, int maxHammingDistance)
    Determines if two strings s1 and s2 are within maxHammingDistance of each other using the Hamming distance metric.
static <T> String join(String separator, Collection<T> objs)
static <T> String join(String separator, T... objs)
static int levenshteinDistance(String string1, String string2, int swap, int substitution, int insertion, int deletion)
static String readNullTerminatedString(BinaryCodec binaryCodec)
static String repeatCharNTimes(char c, int repeatNumber)
static String reverseString(String s)
    Reverse the given string.
static int split(String aString, String[] tokens, char delim)
    Split the string into tokens separated by the given delimiter.
static int splitConcatenateExcessTokens(String aString, String[] tokens, char delim)
    Split the string into tokens separated by the given delimiter.
static byte[] stringToBytes(String s)
static byte[] stringToBytes(String s, int offset, int length)
static char toHexDigit(int value)
static byte toLowerCase(byte b)
static byte toUpperCase(byte b)
static void toUpperCase(byte[] bytes)
    Converts in place all lower case letters to upper case in the byte array provided.
static String wordWrap(String s, int maxLineLength)
    Return input string with newlines inserted to ensure that all lines have length <= maxLineLength.
static String wordWrapSingleLine(String s, int maxLineLength)
-
Field Detail
-
EMPTY_STRING
public static final String EMPTY_STRING
Returns Object.toString() of the provided value if it isn't null; "" otherwise.
- See Also:
  Constant Field Values
-
Method Detail
-
join
public static <T> String join(String separator, Collection<T> objs)
- Parameters:
  separator - String to interject between each item in objs
  objs - List of objs to be joined
- Returns:
  String that concatenates the result of each item's toString method for all items in objs, with separator between each of them.
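The contract described above can be illustrated with a small stand-alone sketch. JoinSketch is a hypothetical class written for this example only; it is not the htsjdk implementation, and its handling of null items (rendered as "null") is an assumption.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;

class JoinSketch {
    // Hypothetical stand-in for StringUtil.join: appends the string form of
    // each item and places the separator between neighbouring items only.
    static <T> String join(String separator, Collection<T> objs) {
        StringBuilder sb = new StringBuilder();
        Iterator<T> it = objs.iterator();
        while (it.hasNext()) {
            sb.append(it.next());
            if (it.hasNext()) {
                sb.append(separator);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(join(", ", Arrays.asList(1, 2, 3))); // prints "1, 2, 3"
    }
}
```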
-
split
public static int split(String aString, String[] tokens, char delim)
Split the string into tokens separated by the given delimiter. Profiling has revealed that the standard String.split() method typically takes more than half the total time when used for parsing ASCII files. Note that if the tokens arg is not large enough to hold all the tokens in the string, excess tokens are discarded.
- Parameters:
  aString - the string to split
  tokens - an array to hold the parsed tokens
  delim - character that delimits tokens
- Returns:
  the number of tokens parsed
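A minimal sketch of the pattern described here: the caller supplies a reusable array, the method fills it and returns the token count, and tokens that do not fit are dropped. SplitSketch is a hypothetical class for illustration, not the library code; in particular, how empty or trailing tokens are treated is an assumption and may differ from htsjdk.

```java
class SplitSketch {
    // Hypothetical stand-in for StringUtil.split: fills the caller's array
    // and returns how many slots were written. Tokens beyond tokens.length
    // are silently discarded.
    static int split(String s, String[] tokens, char delim) {
        int count = 0;
        int start = 0;
        while (count < tokens.length) {
            int end = s.indexOf(delim, start);
            if (end < 0) {
                // No further delimiter: the rest of the string is the last token.
                tokens[count++] = s.substring(start);
                break;
            }
            tokens[count++] = s.substring(start, end);
            start = end + 1;
        }
        return count;
    }

    public static void main(String[] args) {
        String[] fields = new String[2];
        int n = split("chr1\t100\t200", fields, '\t');
        System.out.println(n + " " + fields[0] + " " + fields[1]); // prints "2 chr1 100"
    }
}
```

Reusing one array across many lines avoids allocating a new array per call, which is the kind of saving the profiling remark points at.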
-
splitConcatenateExcessTokens
public static int splitConcatenateExcessTokens(String aString, String[] tokens, char delim)
Split the string into tokens separated by the given delimiter. Profiling has revealed that the standard String.split() method typically takes more than half the total time when used for parsing ASCII files. Note that the string is split into no more elements than the tokens arg will hold, so the final tokenized element may contain delimiter chars.
- Parameters:
  aString - the string to split
  tokens - an array to hold the parsed tokens
  delim - character that delimits tokens
- Returns:
  the number of tokens parsed
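The difference from split can be sketched as follows: splitting stops one slot early, and the last slot receives whatever remains, delimiters included. SplitConcatSketch is a hypothetical class for illustration only; edge-case behaviour (empty input, trailing delimiters) is assumed rather than taken from htsjdk.

```java
class SplitConcatSketch {
    // Hypothetical stand-in for StringUtil.splitConcatenateExcessTokens.
    static int splitConcatenateExcessTokens(String s, String[] tokens, char delim) {
        int count = 0;
        int start = 0;
        // Split normally while more than one free slot remains.
        while (count < tokens.length - 1) {
            int end = s.indexOf(delim, start);
            if (end < 0) {
                break;
            }
            tokens[count++] = s.substring(start, end);
            start = end + 1;
        }
        // The next free slot takes the remainder, which may contain delimiters.
        if (count < tokens.length) {
            tokens[count++] = s.substring(start);
        }
        return count;
    }

    public static void main(String[] args) {
        String[] fields = new String[2];
        int n = splitConcatenateExcessTokens("a,b,c,d", fields, ',');
        System.out.println(n + " " + fields[0] + " " + fields[1]); // prints "2 a b,c,d"
    }
}
```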
-
toLowerCase
public static byte toLowerCase(byte b)
- Parameters:
  b - ASCII character
- Returns:
  lowercase version of arg if it was uppercase, otherwise returns arg
-
toUpperCase
public static byte toUpperCase(byte b)
- Parameters:
  b - ASCII character
- Returns:
  uppercase version of arg if it was lowercase, otherwise returns arg
-
toUpperCase
public static void toUpperCase(byte[] bytes)
Converts in place all lower case letters to upper case in the byte array provided.
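For readers unfamiliar with byte-level case conversion, the three case methods above could be sketched like this. AsciiCaseSketch is a hypothetical class for illustration; it assumes plain ASCII input and is not the htsjdk source.

```java
import java.nio.charset.StandardCharsets;

class AsciiCaseSketch {
    // In ASCII, 'a'..'z' sit exactly ('a' - 'A') = 32 code points above 'A'..'Z'.
    static byte toUpperCase(byte b) {
        return (b >= 'a' && b <= 'z') ? (byte) (b - ('a' - 'A')) : b;
    }

    static byte toLowerCase(byte b) {
        return (b >= 'A' && b <= 'Z') ? (byte) (b + ('a' - 'A')) : b;
    }

    // In-place variant: the caller's array is modified, nothing is returned.
    static void toUpperCase(byte[] bytes) {
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = toUpperCase(bytes[i]);
        }
    }

    public static void main(String[] args) {
        byte[] bases = "acgTn".getBytes(StandardCharsets.US_ASCII);
        toUpperCase(bases);
        System.out.println(new String(bases, StandardCharsets.US_ASCII)); // prints "ACGTN"
    }
}
```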
-
assertCharactersNotInString
public static String assertCharactersNotInString(String illegalChars, char... chars)
Checks that a String doesn't contain one or more characters of interest.
- Parameters:
  illegalChars - the String to check
  chars - the characters to check for
- Returns:
  the input String, for convenience
- Throws:
  IllegalArgumentException - if the String contains one or more of the characters
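A sketch of the check-and-return pattern described above. IllegalCharSketch is a hypothetical class for illustration; the exception message is invented, and only the exception type comes from the documentation.

```java
class IllegalCharSketch {
    // Hypothetical stand-in for StringUtil.assertCharactersNotInString.
    // Returns the input unchanged so the call can be used inline,
    // e.g. when assigning a validated field.
    static String assertCharactersNotInString(String s, char... chars) {
        for (char c : chars) {
            if (s.indexOf(c) >= 0) {
                throw new IllegalArgumentException(
                        "String contains illegal character (code " + (int) c + "): " + s);
            }
        }
        return s;
    }

    public static void main(String[] args) {
        String name = assertCharactersNotInString("sample_1", '\t', '\n');
        System.out.println(name); // prints "sample_1"
    }
}
```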
-
wordWrap
public static String wordWrap(String s, int maxLineLength)
Return input string with newlines inserted to ensure that all lines have length <= maxLineLength. If a word is too long, it is simply broken at maxLineLength. Does not handle tabs intelligently (due to implementer laziness).
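One way to picture the wrapping rule is a greedy sketch such as the one below. WordWrapSketch is a hypothetical class for illustration and not the htsjdk algorithm: it assumes words are separated by single spaces, that the input contains no newlines, and that maxLineLength is positive.

```java
class WordWrapSketch {
    // Greedy wrap: keep adding words to the current line while they fit;
    // words longer than maxLineLength are cut at exactly maxLineLength.
    static String wordWrap(String s, int maxLineLength) {
        if (maxLineLength <= 0) {
            throw new IllegalArgumentException("maxLineLength must be positive");
        }
        StringBuilder out = new StringBuilder();
        StringBuilder line = new StringBuilder();
        for (String word : s.split(" ")) {
            while (word.length() > maxLineLength) {
                // Flush the pending line, then emit one full-width chunk of the word.
                if (line.length() > 0) {
                    out.append(line).append('\n');
                    line.setLength(0);
                }
                out.append(word, 0, maxLineLength).append('\n');
                word = word.substring(maxLineLength);
            }
            if (line.length() > 0 && line.length() + 1 + word.length() > maxLineLength) {
                out.append(line).append('\n');
                line.setLength(0);
            }
            if (line.length() > 0) {
                line.append(' ');
            }
            line.append(word);
        }
        return out.append(line).toString();
    }

    public static void main(String[] args) {
        System.out.println(wordWrap("the quick brown fox", 10));
    }
}
```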
-
intValuesToString
public static String intValuesToString(int[] intVals)
-
intValuesToString
public static String intValuesToString(short[] shortVals)
-
bytesToString
public static String bytesToString(byte[] data)
-
bytesToString
public static String bytesToString(byte[] buffer, int offset, int length)
-
stringToBytes
public static byte[] stringToBytes(String s)
-
stringToBytes
public static byte[] stringToBytes(String s, int offset, int length)
-
readNullTerminatedString
public static String readNullTerminatedString(BinaryCodec binaryCodec)
-
charsToBytes
public static void charsToBytes(char[] chars, int charOffset, int length, byte[] bytes, int byteOffset)
Convert chars to bytes merely by casting.
- Parameters:
  chars - input chars
  charOffset - where to start converting from chars array
  length - how many chars to convert
  bytes - where to put the converted output
  byteOffset - where to start writing the converted output
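"Merely by casting" means each 16-bit char is narrowed to its low 8 bits, which is lossless only for characters in the 0 to 255 range. A sketch of what that looks like, using a hypothetical CharsToBytesSketch class that is not the library code:

```java
class CharsToBytesSketch {
    // Copies `length` chars starting at charOffset into bytes starting at
    // byteOffset. The cast keeps only the low 8 bits of each char, so
    // non-ASCII input would be silently mangled.
    static void charsToBytes(char[] chars, int charOffset, int length,
                             byte[] bytes, int byteOffset) {
        for (int i = 0; i < length; i++) {
            bytes[byteOffset + i] = (byte) chars[charOffset + i];
        }
    }

    public static void main(String[] args) {
        char[] in = "xACGTx".toCharArray();
        byte[] out = new byte[4];
        charsToBytes(in, 1, 4, out, 0);
        System.out.println((char) out[0] + "" + (char) out[3]); // prints "AT"
    }
}
```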
-
charToByte
public static byte charToByte(char c)
Convert ASCII char to byte.
-
byteToChar
public static char byteToChar(byte b)
Convert ASCII byte to ASCII char.
-
bytesToHexString
public static String bytesToHexString(byte[] data)
Convert a byte array into a String hex representation.
- Parameters:
  data - Input to be converted.
- Returns:
  String twice as long as data.length with hex representation of data.
-
hexStringToBytes
public static byte[] hexStringToBytes(String s) throws NumberFormatException
Convert a String containing hex characters into an array of bytes with the binary representation of the hex string.
- Parameters:
  s - Hex string. Length must be even because each pair of hex chars is converted into a byte.
- Returns:
  byte array with binary representation of hex string.
- Throws:
  NumberFormatException
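The two hex conversions are inverses of each other: every byte becomes two hex digits and every digit pair becomes one byte. The sketch below illustrates the round trip with a hypothetical HexSketch class. Emitting upper-case digits and accepting both cases on input are assumptions, not documented behaviour.

```java
import java.util.Arrays;

class HexSketch {
    private static final char[] DIGITS = "0123456789ABCDEF".toCharArray();

    // Each byte yields two digits: high nibble first, then low nibble.
    static String bytesToHexString(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(DIGITS[(b >> 4) & 0xF]).append(DIGITS[b & 0xF]);
        }
        return sb.toString();
    }

    // Each pair of hex digits yields one byte; odd lengths and non-hex
    // characters are rejected with NumberFormatException.
    static byte[] hexStringToBytes(String s) {
        if (s.length() % 2 != 0) {
            throw new NumberFormatException("Hex string must have even length: " + s);
        }
        byte[] out = new byte[s.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(s.charAt(2 * i), 16);
            int lo = Character.digit(s.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new NumberFormatException("Not a hex string: " + s);
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] data = {0, 15, (byte) 0xAB, (byte) 0xFF};
        String hex = bytesToHexString(data);
        System.out.println(hex); // prints "000FABFF"
        System.out.println(Arrays.equals(data, hexStringToBytes(hex))); // prints "true"
    }
}
```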
-
toHexDigit
public static char toHexDigit(int value)
-
fromHexDigit
public static int fromHexDigit(char c) throws NumberFormatException
- Throws:
NumberFormatException
-
reverseString
public static String reverseString(String s)
Reverse the given string. Does not check for null.
- Parameters:
  s - String to be reversed.
- Returns:
  New string that is the reverse of the input string.
-
isBlank
public static boolean isBlank(String str)
Checks if a String is whitespace, empty ("") or null.
  StringUtil.isBlank(null) = true
  StringUtil.isBlank("") = true
  StringUtil.isBlank(" ") = true
  StringUtil.isBlank("sam") = false
  StringUtil.isBlank(" sam ") = false
- Parameters:
  str - the String to check, may be null
- Returns:
  true if the String is null, empty or whitespace
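The truth table above can be reproduced with a few lines. BlankSketch is a hypothetical class for illustration; using Character.isWhitespace as the whitespace test is an assumption about what "whitespace" means here.

```java
class BlankSketch {
    // null and "" are blank; otherwise every character must be whitespace.
    static boolean isBlank(String str) {
        if (str == null) {
            return true;
        }
        for (int i = 0; i < str.length(); i++) {
            if (!Character.isWhitespace(str.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isBlank(null));    // prints "true"
        System.out.println(isBlank(" sam ")); // prints "false"
    }
}
```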
-
repeatCharNTimes
public static String repeatCharNTimes(char c, int repeatNumber)
-
levenshteinDistance
public static int levenshteinDistance(String string1, String string2, int swap, int substitution, int insertion, int deletion)
-
hammingDistance
public static int hammingDistance(String s1, String s2)
Calculates the Hamming distance (number of character mismatches) between two strings s1 and s2. Since Hamming distance is not defined for strings of differing lengths, we throw an exception if the two strings are of different lengths. Hamming distance is case sensitive and does not have any special treatment for DNA.
- Parameters:
  s1 - The first string to compare
  s2 - The second string to compare. Note that if s1 and s2 are swapped the value returned will be identical.
- Returns:
  Hamming distance between s1 and s2.
- Throws:
  IllegalArgumentException - If the two strings have differing lengths.
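The definition translates almost directly into code. HammingSketch below is a hypothetical illustration of the documented contract, not the htsjdk source; the exception message is invented.

```java
class HammingSketch {
    // Counts positions at which the two strings differ. Comparison is by
    // exact char value, so 'a' and 'A' count as a mismatch.
    static int hammingDistance(String s1, String s2) {
        if (s1.length() != s2.length()) {
            throw new IllegalArgumentException(
                    "Strings differ in length: " + s1.length() + " vs " + s2.length());
        }
        int mismatches = 0;
        for (int i = 0; i < s1.length(); i++) {
            if (s1.charAt(i) != s2.charAt(i)) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println(hammingDistance("ACGT", "ACGA")); // prints "1"
        System.out.println(hammingDistance("acgt", "ACGT")); // prints "4"
    }
}
```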
-
isWithinHammingDistance
public static boolean isWithinHammingDistance(String s1, String s2, int maxHammingDistance)
Determines if two strings s1 and s2 are within maxHammingDistance of each other using the Hamming distance metric. Since Hamming distance is not defined for strings of differing lengths, we throw an exception if the two strings are of different lengths. Hamming distance is case sensitive and does not have any special treatment for DNA.
- Parameters:
  s1 - The first string to compare
  s2 - The second string to compare. Note that if s1 and s2 are swapped the value returned will be identical.
  maxHammingDistance - The largest Hamming distance the strings can have for this function to return true.
- Returns:
  true if the two strings are within maxHammingDistance of each other, false otherwise.
- Throws:
  IllegalArgumentException - If the two strings have differing lengths.
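A threshold check like this does not need the full distance: it can stop as soon as the mismatch count exceeds the limit. The sketch below shows that early-exit idea with a hypothetical WithinHammingSketch class. Whether htsjdk short-circuits in the same way is an assumption.

```java
class WithinHammingSketch {
    // Returns false as soon as more than maxHammingDistance mismatches
    // have been seen, without scanning the rest of the strings.
    static boolean isWithinHammingDistance(String s1, String s2, int maxHammingDistance) {
        if (s1.length() != s2.length()) {
            throw new IllegalArgumentException(
                    "Strings differ in length: " + s1.length() + " vs " + s2.length());
        }
        int mismatches = 0;
        for (int i = 0; i < s1.length(); i++) {
            if (s1.charAt(i) != s2.charAt(i) && ++mismatches > maxHammingDistance) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isWithinHammingDistance("ACGT", "ACGA", 1)); // prints "true"
        System.out.println(isWithinHammingDistance("ACGT", "TGCA", 2)); // prints "false"
    }
}
```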
-
humanReadableByteCount
public static String humanReadableByteCount(long bytes)
Takes a long value representing the number of bytes and produces a human-readable byte count.
- Parameters:
  bytes - The number of bytes to create a human-readable string for.
- Returns:
  A human-readable string of the number of bytes given.
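The documentation does not state the output format, so the following is only a sketch of one common convention: powers of 1024, one decimal place, and unit labels B, KB, MB and so on. ByteCountSketch is a hypothetical class, and every formatting detail in it is an assumption rather than the documented behaviour of humanReadableByteCount.

```java
import java.util.Locale;

class ByteCountSketch {
    // Assumed convention: divide by 1024 until the value drops below 1024,
    // then print it with one decimal place and the matching unit prefix.
    static String humanReadableByteCount(long bytes) {
        if (bytes < 1024) {
            return bytes + " B";
        }
        String prefixes = "KMGTPE";
        double value = bytes;
        int exp = 0;
        while (value >= 1024 && exp < prefixes.length()) {
            value /= 1024;
            exp++;
        }
        // Locale.ROOT keeps the decimal separator a '.' on every machine.
        return String.format(Locale.ROOT, "%.1f %sB", value, prefixes.charAt(exp - 1));
    }

    public static void main(String[] args) {
        System.out.println(humanReadableByteCount(512));     // prints "512 B"
        System.out.println(humanReadableByteCount(1536));    // prints "1.5 KB"
        System.out.println(humanReadableByteCount(1048576)); // prints "1.0 MB"
    }
}
```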
-