vg
tools for working with variation graphs
minimizer_main.cpp File Reference
#include "subcommand.hpp"
#include <vg/io/vpkg.hpp>
#include <algorithm>
#include <iostream>
#include <vector>
#include <getopt.h>
#include <omp.h>
#include "../index_manager.hpp"
#include <gbwtgraph/index.h>
#include "../min_distance.hpp"
#include "../handle.hpp"
#include "../utility.hpp"

Functions

int get_default_threads ()
 
size_t get_default_k ()
 
size_t get_default_w ()
 
void help_minimizer (char **argv)
 
int main_minimizer (int argc, char **argv)
 

Variables

constexpr int DEFAULT_MAX_THREADS = 16
 

Detailed Description

Defines the "vg minimizer" subcommand, which builds the experimental minimizer index.

For each window of w successive kmers and their reverse complements, the index stores the lexicographically smallest kmer (the minimizer). Kmers containing characters other than A, C, G, and T are not indexed.
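The definition above can be illustrated on a plain string. The following is a minimal sketch that follows the description only, not the code in gbwtgraph or vg: it works on a linear sequence rather than a graph, and the names `Minimizer`, `reverse_complement`, `is_valid_kmer`, and `find_minimizers` are hypothetical helpers introduced for illustration. Ties are broken in favor of the leftmost kmer, which is also an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record for one minimizer occurrence (not a vg/gbwtgraph type).
struct Minimizer {
    std::string kmer;    // the selected kmer, in the orientation that was chosen
    std::size_t offset;  // start of the forward kmer in the sequence
    bool is_reverse;     // true if the reverse complement was the smaller one
};

// Reverse complement over the alphabet A, C, G, T.
std::string reverse_complement(const std::string& kmer) {
    std::string result(kmer.rbegin(), kmer.rend());
    for (char& c : result) {
        switch (c) {
            case 'A': c = 'T'; break;
            case 'C': c = 'G'; break;
            case 'G': c = 'C'; break;
            case 'T': c = 'A'; break;
            default: break;
        }
    }
    return result;
}

// Kmers with characters other than A, C, G, and T are skipped.
bool is_valid_kmer(const std::string& kmer) {
    return kmer.find_first_not_of("ACGT") == std::string::npos;
}

// Naive scan: for every window of w successive kmers, pick the
// lexicographically smallest valid kmer or reverse complement.
// Consecutive windows that select the same occurrence report it once.
std::vector<Minimizer> find_minimizers(const std::string& seq, std::size_t k, std::size_t w) {
    std::vector<Minimizer> result;
    if (k == 0 || w == 0 || seq.size() < k + w - 1) {
        return result;
    }
    const std::size_t kmer_count = seq.size() - k + 1;
    for (std::size_t start = 0; start + w <= kmer_count; start++) {
        bool found = false;
        Minimizer best{"", 0, false};
        for (std::size_t i = start; i < start + w; i++) {
            const std::string forward = seq.substr(i, k);
            if (!is_valid_kmer(forward)) {
                continue;
            }
            const std::string reverse = reverse_complement(forward);
            const bool use_reverse = reverse < forward;
            const std::string& candidate = use_reverse ? reverse : forward;
            if (!found || candidate < best.kmer) {
                best = Minimizer{candidate, i, use_reverse};
                found = true;
            }
        }
        const bool is_new = result.empty() ||
            result.back().offset != best.offset ||
            result.back().is_reverse != best.is_reverse;
        if (found && is_new) {
            result.push_back(best);
        }
    }
    return result;
}
```

Calling `find_minimizers("GATTACA", 3, 2)` in this sketch reports three occurrences, because the first two windows both select the reverse complement of the kmer at offset 1. This repetition is the reason a naive scan does redundant work when many windows share a minimizer.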

The index contains either all minimizers or only the haplotype-consistent ones. Indexing all minimizers from complex graph regions can take a long time (e.g. 65 hours for all minimizers vs. 10 minutes for haplotype-consistent minimizers on 1000GP), because many windows have the same minimizer. As the total number of minimizers is manageable (e.g. 2.1 billion vs. 1.4 billion for 1000GP), it should be possible to develop a better algorithm for finding the minimizers.

A quick idea:

Function Documentation

◆ get_default_k()

size_t get_default_k ( )

◆ get_default_threads()

int get_default_threads ( )

◆ get_default_w()

size_t get_default_w ( )

◆ help_minimizer()

void help_minimizer (char **argv)

◆ main_minimizer()

int main_minimizer (int argc, char **argv)

Variable Documentation

◆ DEFAULT_MAX_THREADS

constexpr int DEFAULT_MAX_THREADS = 16