Chapter 2: Design

2.1: the main() function

The main function defines the long options in its global g_longOpts array. The g_longOpts and g_longEnd global variables are not declared for general use in a header file but are declared in class-specific headers when needed. Currently that's only scenario/scenario.ih.

    int main(int argc, char **argv)
    try
    {
        Arg const &arg = Arg::initialize("a:B:el:C:D:hoP:R:S:s:t:vV",
                                          g_longOpts, g_longEnd, argc, argv);
        arg.versionHelp(usage, Icmake::version, arg.option('o') ? 0 : 1);
    
        Options::instance();            // construct the Options object
    
        Simulator simulator;            // construct the simulator
        simulator.run();                // and run it.
    }
                                        // (catch clauses are not shown here)

The simulations themselves are controlled by a Simulator, having only one public member: run, performing the simulations.

2.2: Simulator

The Simulator is constructed by main. Its constructor determines the source of the analysis specifications (setAnalysisSource).

The class defines the following data members:

        bool d_next = false;                        // true if 'run()' should run
                                                    // an analysis
    
        uint16_t d_lineNr = 1;                      // updated by 'fileAnalysis'
        std::ifstream d_ifstream;                   // multiple analysis specs
    
        std::string (Simulator::*d_nextSpecs)();    // ptr to function handling 
                                                    // the (next) analysis spec.

If the command-line option -o was specified then analysis specifications may be provided as command-line arguments, handled by the member cmdLineAnalysis.

Otherwise a specification file must be provided as first command-line argument. This file is read until a line that begins with analysis:, which may then be followed by specifications, which are read by the member fileAnalysis.

In both cases the Simulator's member d_next is set to true.

Specifications that are specified in the analysis specification file (or at the command line when -o was specified) are appended to a specification string, separated by newline characters. The member fileAnalysis then returns these specifications.

The member d_nextSpecs is set to either cmdLineAnalysis or fileAnalysis. Eventually these members set d_next to false, ending the analyses.

If d_next is true, run uses the analysis-specific modifications to initialize a stream that is read by the Analysis object performing the next analysis. Refer to section 2.3 for a description of how the Analysis class modifies default parameter settings.
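
Selecting the specification handler through a pointer to a member function can be sketched as follows. The class MiniSimulator, its constructor parameter and the returned strings are hypothetical; only the names d_next, d_nextSpecs, cmdLineAnalysis and fileAnalysis are taken from the description above.

```cpp
#include <string>

// Hypothetical, stripped-down stand-in for Simulator: it only shows how a
// pointer to a member function selects the source of the specifications.
class MiniSimulator
{
    bool d_next = true;                             // an analysis is pending
    std::string (MiniSimulator::*d_nextSpecs)();    // the selected handler

    public:
        explicit MiniSimulator(bool cmdLine)        // cmdLine: -o was given
        :
            d_nextSpecs(cmdLine ? &MiniSimulator::cmdLineAnalysis
                                : &MiniSimulator::fileAnalysis)
        {}

        bool next() const
        {
            return d_next;
        }

        std::string nextSpecs()                     // call via the pointer
        {
            return (this->*d_nextSpecs)();
        }

    private:
        std::string cmdLineAnalysis()
        {
            d_next = false;                         // no further analyses
            return "specs from the command line\n";
        }

        std::string fileAnalysis()
        {
            d_next = false;                         // pretend: file exhausted
            return "specs from the file\n";
        }
};
```

The caller doesn't need to know which handler is active: it merely calls nextSpecs as long as next returns true.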

2.2.1: run

The run member performs all analyses. At construction d_next may be set to true, indicating that an analysis must be performed:

    void Simulator::run()
    {
        while (d_next)
        {
            uint16_t lineNr = d_lineNr;        
                                                // read the next analysis specs
            string spec = (this->*d_nextSpecs)();   
    
            emsg.setCount(0);
    
            Analysis analysis{ istringstream{ spec }, lineNr };
            analysis.run();                     // run the simulation
        }
    }

If a simulation must be performed then the non-default specifications are provided by the member to which d_nextSpecs points.

The actual simulation is then performed by an Analysis object, which also has a run public member performing the actual analysis.

2.3: Analysis

Analysis class objects handle one simulation analysis. Since multiple analyses are performed independently from each other, each Analysis object initializes its own error count (class Error) and option handling (class Options).

As options may specify the name of the file containing the analysis parameters, Analysis objects also define a configuration file object (ConfFile).

In simrisc's scientific context simulation parameters are also known as `scenarios'. Scenarios contain information like the number of iterations to perform and the number of cases to simulate. The analysis's Scenario object is initialized by Analysis's constructor, and is used by its run member.

Figure 2 provides an overview of Analysis's data members.

Figure 2: The Analysis data members

The class Analysis uses the classes Scenario, Loop, ConfFile, Options and Error, which are covered in separate sections of this technical manual.

When the Analysis object is constructed it constructs its own Scenario object. That object receives the modifications that are specific for the current analysis (e.g., those specified following an analysis: line in an analysis-specification file). When the Scenario object (cf. section 2.4) returns specifications (e.g., using its lines members), it returns the modified specifications if available. Modified specifications must be complete, as they replace the corresponding specifications in the used (standard) configuration file.

2.3.1: run

The run member may immediately end if errors were encountered in the specifications of the Scenario and/or ConfFile parameters.

One of the options defines the base directory into which the output is written. The member requireBase checks whether the base directory exists, and if not creates it. If that fails the program ends, showing the message


    Cannot create base directory ...
    
where ... is replaced by the name of the base directory provided by the Options object.
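
Checking for the base directory and creating it when it is missing can be sketched with the standard filesystem library. The function ensureBaseDir is a hypothetical helper written for this illustration; simrisc's own requireBase member may be implemented differently.

```cpp
#include <filesystem>
#include <string>
#include <system_error>

// Hypothetical helper: returns true if 'base' is available as a directory
// when the function returns, creating it (and its parents) if necessary.
bool ensureBaseDir(std::string const &base)
{
    namespace fs = std::filesystem;

    std::error_code ec;                 // avoids exceptions on failure

    if (fs::is_directory(base, ec))
        return true;                    // the directory already exists

    fs::create_directories(base, ec);   // also creates missing parents

    return not ec and fs::is_directory(base, ec);
}
```

A caller receiving false could then show the Cannot create base directory message and end the program.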

If all's well, the actual simulation is then performed by a Loop object.

2.4: Scenario

Scenario class objects are defined by Analysis class objects and contain parameter values of the simulation to perform. For each separate analysis a new Scenario object is constructed.

Most of its members are accessors: const members returning values of scenario parameters.

Some parameter values are stored in the Scenario object itself. Refer to the simrisc(1) manual page for a description of their default values:

Configuration parameters start with identifying names, like costs: or screeningRounds:. Those names are then followed by the parameter's specifications. Those specifications are made available by the members

2.5: Loop

The class Loop is the `workhorse' class. It performs the simulation as specified by the Scenario object which is passed to its objects when they are constructed. The class Loop uses many other classes, most of which encapsulate parameters specified in the configuration file. Those classes read configuration file parameters and prepare these parameters for their use by Loop.

The constructor of a Loop object defines the following objects:

It also defines a Random object (d_random), generating random numbers (cf. section 4.1).

2.5.1: iterations

The iterate member performs the number of iterations (scenario: iterations). At each iteration

2.5.2: Simulate Cases

The member genCases performs one complete iteration over all screening rounds for all simulated cases. The option nCases may be used to simulate a specific number of cases. When nCases is specified only the data of the final case are written to file. By default as many cases as specified at the `nWomen' parameter (stored in the Scenario object) are simulated, and the data of all those simulated cases are written to file. Analyses up to a specific number of cases, or a single simulation using a preset death-age (see option --death-age in the simrisc(1) man-page) can also be performed. The number of cases to simulate is determined in the nCases member.

For each simulated case:

2.5.3: Pre-screening

The original program uses the following condition to determine whether prescreening must be performed:

    if (Nscr > 0 && (naturalDeathAge < 1st screening age || (tumor present
        && tumor.selfDetectAge() < 1st screening age)))
    
This results in a needlessly complex implementation of the pre-screening phase. It's much simpler to use the complement of this expression, skipping the pre-screening phase if the complementary condition is true. The pre-screening phase is therefore skipped if the following condition holds true:

    not (Nscr > 0 && (naturalDeathAge < 1st screening age || (tumor present
        && tumor.selfDetectAge() < 1st screening age)))
    
The expression can be simplified using De Morgan's rule !(a && b) == !a || !b:

    not (Nscr > 0) or 
    not (
         naturalDeathAge < 1st screening age or
         (tumor present and tumor.selfDetectAge() < 1st screening age)
        )
    
Consequently, pre-screening is skipped if there are no screening rounds (not (Nscr > 0)) and also if the following condition holds true:

    not (
        naturalDeathAge < 1st screening age or
        (tumor present and tumor.selfDetectAge() < 1st screening age)
    )
    

Distributing the not-operator over the terms of the above condition, and applying De Morgan's rule !(a || b) == !a && !b we get:


        naturalDeathAge >= 1st screening age and 
        not (
            tumor present and 
            tumor.selfDetectAge() < 1st screening age
        )
    

Applying De Morgan's rule once more this finally results in:


        naturalDeathAge >= 1st screening age and 
        (
            not tumor present or 
            tumor.selfDetectAge() >= 1st screening age
        )
    

Thus, pre-screening is skipped if the above condition holds true.
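
The derivation can be checked exhaustively, since only four truth values are involved. The sketch below is not part of simrisc: the function and parameter names are hypothetical, with rounds standing for Nscr > 0, deathBefore for naturalDeathAge < 1st screening age, tumor for tumor present, and selfBefore for tumor.selfDetectAge() < 1st screening age.

```cpp
// The original program's condition for performing a pre-screening.
bool preScreen(bool rounds, bool deathBefore, bool tumor, bool selfBefore)
{
    return rounds and (deathBefore or (tumor and selfBefore));
}

// The simplified condition for skipping the pre-screening: no screening
// rounds, or the final condition derived above ('not deathBefore' stands
// for naturalDeathAge >= 1st screening age, and so on).
bool skipPreScreen(bool rounds, bool deathBefore, bool tumor, bool selfBefore)
{
    return not rounds or
           (not deathBefore and (not tumor or not selfBefore));
}

// Returns true if the two conditions are each other's complement for all
// 16 combinations of the four truth values.
bool complementary()
{
    for (unsigned bits = 0; bits != 16; ++bits)
    {
        bool rounds      = bits & 1;
        bool deathBefore = bits & 2;
        bool tumor       = bits & 4;
        bool selfBefore  = bits & 8;

        if (preScreen(rounds, deathBefore, tumor, selfBefore) ==
            skipPreScreen(rounds, deathBefore, tumor, selfBefore))
            return false;               // both true or both false: an error
    }
    return true;
}
```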

2.5.3.1: Performing a pre-screening

If a pre-screening phase is used then the following happens:

2.5.4: Self-detected tumors during the pre- and post-screen phases

The function Loop::characteristics may be called during the pre-screening phase and during the post-screening phase.

In both phases the tumor is self-detected, the tumor characteristics are determined (Tumor::characteristics, Tumor::setDeathAge, cf. sections 3.5.2 and 3.5.3), and the treatment costs are determined using the tumor's induced death age and the tumor's diameter (Costs::treatment, cf. section 3.1).

If the case's natural death occurs before the tumor would have caused the case's death then the case leaves the pre-screening or post-screening simulation with status (respectively) LEFT_PRE and LEFT_POST. Otherwise death was caused by the tumor and the case leaves the pre-screening or post-screening simulation with status (respectively) TUMOR_PRE and TUMOR_POST.
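
The status selection can be sketched as a small function. The status names are the ones mentioned above; the enumeration types, the function selfDetectStatus and its parameters are hypothetical and merely illustrate the decision.

```cpp
enum class Phase                        // hypothetical: the current phase
{
    PRE,
    POST
};

enum class Status                       // status names as used in the text
{
    LEFT_PRE,
    LEFT_POST,
    TUMOR_PRE,
    TUMOR_POST
};

// Returns the status with which the case leaves the simulation after a
// self-detected tumor, given the natural death age and the death age
// induced by the tumor.
Status selfDetectStatus(Phase phase, double naturalDeathAge,
                        double tumorDeathAge)
{
    bool natural = naturalDeathAge < tumorDeathAge; // natural death first

    if (phase == Phase::PRE)
        return natural ? Status::LEFT_PRE : Status::TUMOR_PRE;

    return natural ? Status::LEFT_POST : Status::TUMOR_POST;
}
```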

2.5.5: Screening

Following the pre-screening phase the screening phase itself starts. However, when a case's simulation has already ended during the pre-screening phase the screening phase is skipped. Other than that, a case's simulation may also end during the screening phase itself, ending the screening phase.

As long as the case simulation has not ended (i.e., the case's state is PRESENT) a screening is performed for each of the screening rounds defined by the Screening object (cf. section 3.4), initialized in Loop's constructor.

At each screening round two actions are performed:

2.5.5.1: Leaving screening rounds

Whether the case leaves the simulation before an actual screening at the current screening age is determined by the member Loop::leaving. In the original program this is determined as follows:

After converting this condition, the case leaves the simulation if

If at this point the case hasn't left the simulation, then

2.5.5.2: Performing a screening

If the case has not yet left the simulation, then the member Loop::screen simulates a screening at a given screening age.

At this point the modalities are considered. Each of the modalities configured for the current screening age is considered in turn. They are considered in their order of specification in the configuration file. E.g., when specifying


    screeningRound:     50  Mammo MRI
    screeningRound:     52  MRI Mammo 
    
then Mammo is considered before MRI at screening age 50, and MRI is considered before Mammo at screening age 52.
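
Keeping the modalities in their order of specification can be sketched as follows. The struct Round and the function parseRound are hypothetical and only illustrate reading a single screeningRound: line; they are not simrisc's actual parser, and storing the age as a double is an assumption.

```cpp
#include <sstream>
#include <string>
#include <vector>

struct Round                    // hypothetical: one screeningRound: line
{
    double age = 0;                         // the screening age
    std::vector<std::string> modalities;    // in order of specification
};

// Parses a line like "screeningRound:     50  Mammo MRI". Returns false,
// leaving 'round' unaltered, if the line doesn't start with the
// screeningRound: keyword followed by an age.
bool parseRound(std::string const &line, Round &round)
{
    std::istringstream in{ line };
    std::string key;
    Round parsed;

    if (not (in >> key >> parsed.age) or key != "screeningRound:")
        return false;

    std::string modality;
    while (in >> modality)                  // keep the specified order
        parsed.modalities.push_back(modality);

    round = parsed;
    return true;
}
```

Because the modalities are stored in a vector, visiting them from front to back considers them in the order in which they were specified.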

Modalities are made available by the Modalities member (d_modalities, cf. section 3.3). The use member of this member returns the information of all modalities that have been configured for the current screening round.

Whether a configured modality is actually going to be used at a particular screening round is determined by chance. In the configuration file the parameter attendanceRate defines the probability that a case attends a screening round. If the next random value drawn from the uniform random distribution exceeds the configured attendance rate then the screening round for that modality is skipped for the current case.
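
The attendance decision can be sketched as follows. Both functions are hypothetical; the sketch assumes that the attendance rate is a probability between 0 and 1 and uses the standard library's random facilities instead of simrisc's Random class.

```cpp
#include <random>

// True if the case attends: the round is skipped when the random value
// exceeds the attendance rate.
bool attends(double randomValue, double attendanceRate)
{
    return not (randomValue > attendanceRate);
}

// Hypothetical convenience overload: draws the random value from the
// uniform distribution on [0, 1) using the provided engine.
bool attends(std::mt19937 &engine, double attendanceRate)
{
    std::uniform_real_distribution<double> uniform{ 0.0, 1.0 };

    return attends(uniform(engine), attendanceRate);
}
```

Separating the comparison from the random draw keeps the decision itself deterministic, which makes it easy to verify.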

If a case attends a screening round then the screening round's costs are determined (cf. section 3.1) and are added to the costs so far (cf. Loop::addCost): the costs are added to the case's accumulated cost and to the accumulated costs of the current screening round.

In addition, if a tumor exists at a screening round then the tumor's characteristics are determined for the current screening age.

Two factors determine whether a tumor may be detected or whether its detection may be a false positive. One factor (factor-1) is the (apparent) presence of a tumor, the other factor (factor-2) is whether the screening round's age is at least equal to the age at which the tumor can be detected.

If both factors are true, then the tumor may be detected. Otherwise there may be a false positive tumor detection.

Maybe detecting the tumor (maybe a false negative conclusion):

The member maybeDetect (cf. Loop::maybeDetect) is used to decide whether a tumor may be found during the screening. A false negative screening result is obtained if a random value exceeds the current modality's sensitivity. The sensitivities of the various modalities are returned by the ModBase::sensitivity member, which in turn calls the vSensitivity member as overridden by the class of the actually used modality.

If a false negative result isn't obtained then a tumor was detected: its treatment costs are added to the accumulated costs and the dying age because of the tumor is set to the age of the current screening round.

If the natural dying age is earlier than the dying age caused by the cancer, then the case leaves the simulation (using status LEFT_DURING). Otherwise the case leaves the simulation using status TUMOR_DURING.
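
The possible outcomes of a detection attempt can be sketched as follows. The enumeration Outcome (including its value MISSED) and the function detectOutcome are hypothetical, and the sketch assumes that the sensitivity is a probability between 0 and 1; the status names LEFT_DURING and TUMOR_DURING are the ones mentioned above.

```cpp
enum class Outcome
{
    MISSED,             // hypothetical name: a false negative result
    LEFT_DURING,        // detected, natural death comes first
    TUMOR_DURING        // detected, death is caused by the tumor
};

// 'randomValue' is a draw from the uniform distribution on [0, 1).
Outcome detectOutcome(double randomValue, double sensitivity,
                      double naturalDeathAge, double tumorDeathAge)
{
    if (randomValue > sensitivity)
        return Outcome::MISSED;         // the tumor is not detected

    return naturalDeathAge < tumorDeathAge ?
                Outcome::LEFT_DURING
            :
                Outcome::TUMOR_DURING;
}
```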

Maybe incorrectly detecting the tumor (maybe a false positive conclusion):

Once a tumor has apparently been observed it may in fact not exist, in which case a false positive observation was made. The member maybeFalsePositive handles this situation.

The specificity of the used modality, given the current screening age, is compared to a random value from the uniform random distribution. If the generated random value exceeds the modality's specificity then the simulation has encountered a false positive tumor detection. The numbers of false positive decisions for the modality and for the screening round are incremented and addCost is called with the biopsy costs at the current screening age as its argument.
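
The bookkeeping of a false positive decision can be sketched as follows. The struct FalsePositiveTally and this free function are hypothetical stand-ins for the member's actual data; the sketch assumes that the specificity is a probability between 0 and 1, and it adds the biopsy costs to a plain accumulator where simrisc calls its cost-adding member.

```cpp
struct FalsePositiveTally       // hypothetical bookkeeping
{
    unsigned modality = 0;      // false positives of the current modality
    unsigned round = 0;         // false positives of the current round
    double cost = 0;            // accumulated biopsy costs
};

// 'randomValue' is a draw from the uniform distribution on [0, 1).
// Returns true if a false positive decision was made.
bool maybeFalsePositive(double randomValue, double specificity,
                        double biopsyCost, FalsePositiveTally &tally)
{
    if (not (randomValue > specificity))
        return false;               // correctly concluded: no tumor

    ++tally.modality;
    ++tally.round;
    tally.cost += biopsyCost;       // stands in for adding the biopsy costs

    return true;
}
```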

2.5.6: Post-screening

Post screening is skipped if the case has left the simulation process during the pre-screening or screening phases.

If there is a tumor and the tumor's self-detection age is before the case's natural death then the tumor characteristics and treatment costs are determined (cf. section 2.5.4).

On the other hand, if there is no tumor or if the tumor's self-detection age would have been after the case's natural death then the case leaves the simulation at the case's natural death age. In this latter case (although there is a tumor, it hasn't caused death) the tumor's characteristics are determined as well (cf. section 3.5.2).
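
The choice between the two post-screening alternatives can be sketched as follows. The enumeration PostAction and the function postScreen are hypothetical names that merely illustrate the decision described in this section.

```cpp
enum class PostAction           // hypothetical names
{
    SELF_DETECTED,      // determine tumor characteristics and treatment costs
    NATURAL_DEATH       // the case leaves at its natural death age
};

PostAction postScreen(bool tumorPresent, double selfDetectAge,
                      double naturalDeathAge)
{
    return tumorPresent and selfDetectAge < naturalDeathAge ?
                PostAction::SELF_DETECTED
            :
                PostAction::NATURAL_DEATH;
}
```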