main
function defines the long options in its global g_longOpts
array. The g_longOpts
and g_longEnd
global variables are not declared
for general use in a header file but are declared in class specific headers
when needed. Currently that's only scenario/scenario.ih
.
int main(int argc, char **argv) try { Arg const &arg = Arg::initialize("a:B:C:D:hi:n:oP:R:S:t:vV", g_longOpts, g_longEnd, argc, argv); arg.versionHelp(usage, Icmake::version, arg.option('o') ? 0 : 1); Simulator simulator; // construct the simulator simulator.run(); // and run it. }
The simulations themselves are controlled by a Simulator
, having only one
public member: run
, performing the simulations.
Simulator
is constructed by main
. Its constructor determines the
source of the analysis specifications (setAnalysisSource
).
The class defines the following data members:
bool d_next = false; // true if 'run()' should run // an analysis uint16_t d_lineNr = 1; // updated by 'fileAnalysis' std::ifstream d_ifstream; // multiple analysis specs std::string (Simulator::*d_nextSpecs)(); // ptr to function handling // the (next) analysis spec.
If the command-line option -o
was specified then analysis specifications
may be provided as command-line arguments, handled by the member
cmdLineAnalysis
.
Otherwise a specification file must be provided as first command-line
argument. This file is read until a line begins with analysis:
, which may
then be followed by specifications, read by the member fileAnalysis
.
In both cases the Simulator
's member d_next
is set to true
.
The member d_nextSpecs
is set to either cmdLineAnalysis
or
fileAnalysis
. Eventually these members set d_next
to false
, ending
the analyses.
run
member runs all analyses. At the construction d_next
may be
set to true
indicating that an analysis must be performed:
void Simulator::run() { while (d_next) { uint16_t lineNr = d_lineNr; // read the next analysis specs string spec = (this->*d_nextSpecs)(); Analysis analysis{ istringstream{ spec }, lineNr }; analysis.run(); // run the simulation } }
If a simulation must be performed then the non-default specifications are
provided by the member to which d_nextSpec
points.
The actual simulation analysis is then performed by an Analysis
object,
which also has a run
public member performing the actual analysis.
Analysis
class objects handle one simulation analysis. Since multiple
analyses are performed independently from each other each Analysis
object
initializes its own error count (class Error
) and option handling (class
Options
.
Since the options may specify the name of the file containing the
analysis-parameters Analysis
objects also define a configuration file
object (ConfFile).
In simrisc
's scientific context simulation parameters are also known as
`scenarios'. Scenarios contain information like the number of
iterations to perform and the number of cases to simulate.
The analysis's Scenario
object is initialized by
Analysis
's constructor, and used by its run
member.
Here is an overview of Analysis
's data members:
// for this analysis: Error d_error; // error handling Options d_options; // options ConfFile d_confFile; // configuration (parameters) file Scenario d_scenario; // performs one (1) simulation
The class Analysis
uses the classes Scenario, Loop, ConfFile,
Options
and Error
, which are covered in separate sections.
run
member may immediately end if errors were encountered in the
specifications of the Scanario
and/or ConfigFile
parameters.
One of the options defines the base directory into which the output is
written. The member requireBase
checks whether the base directory exists,
and if not creates it. If that fails the program ends, showing the message
Cannot create base directory ...where
...
is replaced by the name of the base directory provided by the
Options
object.
If all's well, the actual simulation is then performed by a Loop
object.
Scenario
class objects are defined by Analysis
class objects and
contain parameter values of the simulation to perform. For each separate
analysis a new Scenario
object is constructed.
Most of its members are accessors: const
members returning values of
scenario parameters.
Some parameter values are stored in the Scenario
object itself:
d_iterations
, the number of iterations to perform (defaults to 1);
d_labels
, a vector of strings containing the values of label:
lines that may be specified in analysis:
sections to briefly
describe the essence of simulations;
d_nWomen
, the number of cases to simulate (defaults to 100). This
name was used in the original versions of the simrisc
program, and may be
changed to d_nCases
in future versions. For now it is kept;
d_seed
, the initial seed of the random number generator(s);
d_spread
, when set to true
some (otherwise fixed) configuration
parameters may show random fluctuations.
Configuration parameters start with identifying names, like costs:
or
screeningRounds:
. Those names are then followed by the parameter's
specifications. Those specifications are made available by the members
value
, returning the value of a numeric parameter;
lines
, containing iterators to the lines in a ConfFile
identified
by the identifying name;
find
, returning the iterator to a single line identified by the
identifying name.
Loop
is the `workhorse' class. It performs the simulation as
specified by the Scenario
object which is passed to its objects when they
are constructed. The class Loop
uses many other classes, most of which
encapsulate parameters specified in the configuration file. Those classes read
configuration file parameters and prepare these parameters for their use by
Loop
.
The constructor of a Loop
object defines the following objects:
d_costs
: a Costs
object providing the treatment costs
parameters (cf. section 3.1);
d_densities
: a Densities
object providing the breast-densities
parameters (cf. section 3.2);
d_modalities
: a Modalities
object providing the details about
which screening modalities are used in a specific simulation
(cf. section 3.3);
d_screening
: a Screening
object providing screening parameters
(cf. section 3.4);
d_tumor
: a Tumor
object handling the simulation related to the
occurrence of tumors (cf. section 3.5);
d_tumorInfo
: a TumorInfo
object providing tumor-related
configuration parameters (like incidence, growth and survival
parameters. Cf. section 3.6);
It also defines a Random
object (d_random), generating random numbers
(cf. section 4.1).
iterate
member performs the number of iterations (scenario:
iterations). At each iteration
TumorInfo
(cf. section 3.6) parameters may be
refreshed;
Loop
's counters are reset:
d_totalCost
, the total simulated treatment cost;
d_sumDeathAge
, the total simulated dying ages;
d_nIntervals
,;
d_nRoundFP
, the number of false positive diagnoses per age round;
d_nRoundFN
, the number of false negative diagnoses per age round;
d_nDetections
, the number of self detected cancers per age round;
d_roundCost
, the total treatment costs per age round;
d_modalities
, the Modalities
counters.
womenLoop
;
womenLoop
).
womenLoop
performs one complete iteration. The option
nCases
may be used to analyze a specific number of cases, and then only
write the data of the final case to file. By default the number of cases
specified in `scenario nWomen' are simulated.
womenLoop
initializes the random number generator with the current
scenario
seed. Then it iterates over all cases. For each case:
ALIVE
if (Nscr > 0 && (naturalDeathAge < 1st screening age || (tumor present && tumor.selfDetectAge() < 1st screening age)))
No prescreening is therefore performed by complementing this expression:
not (Nscr > 0 && (naturalDeathAge < 1st screening age || (tumor present && tumor.selfDetectAge() < 1st screening age)))
or (using De Morgan's rule: a && b == !a || !b):
not (Nscr > 0) or not (naturalDeathAge < 1st screening age || (tumor present && tumor.selfDetectAge() < 1st screening age))
So no screening is performed if there are no screening rounds (not (Nscr > 0)) and also if
not (naturalDeathAge < 1st screening age || (tumor present && tumor.selfDetectAge() < 1st screening age))is true.
Applying the not-operator to the condition and using De Morgan's rule !(a || b) == !a && !b results in
naturalDeathAge >= 1st screening age and not (tumor present && tumor.selfDetectAge() < 1st screening age)
Once more aplying De Morgan's rule finally results in:
naturalDeathAge >= 1st screening age and (not tumor present or tumor.selfDetectAge() >= 1st screening age)
This condition is tested and if true no pre-screening is performed.
When pre-screening is used:
If there's no tumor or if there is a tumor, but the case's natural death happens before the death cause by the tumor, then there is a natural induced death in the pre-screening phase.
Alternatively, there is a self-detected tumor. In that case death may happen before the tumor causes the case's dath (a natural death at the case's natural death age) or the death caused by the tumor may be before the case's natural dath, resulting in a tumor caused pre-screen death at the simulated death age caused by the tumor.
The screening phase ends if the case is no alive anymore. At each round, if the case is alive, then it may be that death occurs.
To determine whether death occurs the following checks are performed:
If the natural death age occurs before the tumor-caused death age the case dies of a natural death. Otherwise, the death is caused by the cancer at the tumor determined death age;
If death has not occurred so far, then the screening continues for the current screening age. At this point the modalities are considered. Each of the modalities configured for the current screening age is considered in turn. They are considered in their order of specification in the configuration file. E.g., when specifying
screeningRound: 50 Mammo MRI screeningRound: 52 MRI MammoMammo is considered before MRI at screening age 50, and MRI is considered before Mammo at screening age 52.
For each modality it may be that it is not considered. Whether it is
considered is determined by comparing the configured attendance rate (e.g.,
.80) with the random value round nr * modality nr
from the pool of
attendance random values
note: this computation also results in multiple uses of random values for different round and modality combinations. E.g., round 1 and modality 2 use the same random value as round 2 and modality 1. So far multiple modalities haven't been used, but here, too, the used attendance pool index should avoid multiply using random values
If the screening rate's value is less than the random value then screening is not used for the current case at this screening age
note: this looks a bit strange: I would have expected that an attendance rate of .80 would mean that there is a probability of .80 that a screening round is used. This implementation does the opposite: 20% of the screening rounds are used. In the original code I readif ( (rndPoolAttendance[ screeningRound*(modality+1) ] <= attendanceRate) && (( (modality == 0) && (scr[screeningRound].T0)) || ( (modality == 1) && (scr[screeningRound].T1))) )in which case screening is performed. Thus no screening is performed when applying the not-operator to this condition:not ( (rndPoolAttendance[ screeningRound*(modality+1) ] <= attendanceRate) && (( (modality == 0) && (scr[screeningRound].T0)) || ( (modality == 1) && (scr[screeningRound].T1))) )The checks for used modality have already been performed, so they can be omitted, resulting in:
rndPoolAttendance[ screeningRound*(modality+1) ] > attendanceRateor:
attendanceRate < rndPoolAttendance[ screeningRound*(modality+1) ]resulting in not using this screening round: here too a high attendance rate results in low probabilities that the screening round is used.
If a round is used then its costs are added to the costs so far, and if there is a tumor then its characteristics are determined.
Then, if there is a tumor and the current screening age is at least equal to the tumor's detectable age it may be detected; otherwise there may be a false positive.
Maybe detecting the tumor uses the betafunction. It returns true if the random sensitivity pool value (again: collisions are possible) is at most equal to the computed beta function value. As a reference, here is the beta function's implementation:
bool Loop::betaFunction(uint16_t modalityNr) { double diameter = d_tumor.diameter(); double mValue = d_screening.m().value(); double expBeta = exp( d_screening.beta(0).value() + d_screening.beta(1).value() * diameter + d_screening.beta(2).value() * mValue + d_screening.beta(3).value() * mValue / (diameter * diameter) ); return d_random.uniform() <= 0.9 * expBeta / (1 + expBeta); }
If the beta function returns true then if the case's natural death age is less than the tumor cause dying age the case dies naturally.
note: which looks weird to me. The original code specifies
naturalDeathAge < tumorDeathAge
, but if the tumor caused death age
clearly exceeds the case's natural death age why would the case then die?
Consider the situation where the screening age is 50, and the natural
death age is 83.5: why would the case die at the screening round for age
50?
When the beta function returns true and the case's natural death age exceeds the tumor cause dying age then death is observed during this screening round, and a false positive occurs if the pool's false positives random value for the current round and modality exceeds the modality's specificity (and again: multiply used random values may occur).
If there is a tumor ad the tumor self-detection ages is earlier than the case's natural death then it is concluded that there either is a natural death or the tumor's characteristics may have caused death before the natural death age.
If there is no tumor or if the tumor's self-detection age is after the case's natural death then there is a natural death at the case's natural death age. In this latter case (there is a tumor, but it hasn't caused death) the tumor's characteristics are also determined (cf. section 3.5.2).