CIBC:SoftwareDesign:Scheduler
From NCRR Biomedical Software Development, Engineering, and Dissemination Wiki
Scheduler Design
Rules for scheduling
Module states
We assume the module can be in one of the following eight states:
- Unresolved: An unresolved module has not adjusted its output to its input or to the variables set. When a new network is loaded, all modules are in this state. The objective of the Scheduler is to resolve all modules.
- Resolved: This module's output matches the input and variables given. Hence it is in perfect equilibrium. The scheduler does not need to reexecute the module.
- Computing new data: This module is currently executing and will have new data shortly. All modules downstream wait to execute until this one is done.
- Waiting for new data upstream: This module will execute as a consequence of a module upstream that is executing.
- Computing intermediate: This module is currently executing, but it is forwarding intermediate results, which are valid results and the modules downstream can execute as soon as we have intermediate results.
- Waiting for downstream to resolve: This module automatically reexecutes once no modules downstream are executing or are waiting to execute.
- Disabled: This module is disabled and should not be considered to be part of the network.
- Aborting: This module has been ordered to stop and will not post a result on its output ports.
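The eight states above can be written down as an enumeration. The following is only a sketch: the names `ModuleState` and `blocks_downstream` are made up for illustration and do not come from the SCIRun sources. The helper encodes the one distinction the list makes explicit, namely that a module computing new data holds back its downstream modules while a module computing intermediate results does not.

```python
from enum import Enum, auto


class ModuleState(Enum):
    """Hypothetical names for the eight module states listed above."""
    UNRESOLVED = auto()
    RESOLVED = auto()
    COMPUTING_NEW_DATA = auto()
    WAITING_FOR_UPSTREAM = auto()
    COMPUTING_INTERMEDIATE = auto()
    WAITING_FOR_DOWNSTREAM = auto()
    DISABLED = auto()
    ABORTING = auto()


def blocks_downstream(state):
    """True if downstream modules must wait for this module to finish.

    Only a module computing new data blocks its downstream modules;
    intermediate results are valid and may be consumed right away.
    """
    return state is ModuleState.COMPUTING_NEW_DATA


if __name__ == "__main__":
    for state in ModuleState:
        print(state.name, blocks_downstream(state))
```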
Execution triggers
- A module can be executed by the user directly, in which case execution is immediate.
- A module can be forced by the network to execute, in which case execution is delayed until no module upstream has an execution pending anymore.
- A module can reexecute automatically when all modules downstream have finished execution.
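The three triggers differ only in the condition that has to hold before execution may start. A minimal sketch of that rule follows; `Trigger`, `may_start`, and the two boolean flags are hypothetical names chosen for this example, and how the flags would be computed from the network is left out.

```python
from enum import Enum, auto


class Trigger(Enum):
    """Hypothetical names for the three execution triggers."""
    USER = auto()            # user executes the module directly
    NETWORK = auto()         # execution forced by the network
    AUTO_REEXECUTE = auto()  # module reexecutes on its own


def may_start(trigger, upstream_pending, downstream_busy):
    """Decide whether a triggered module may start executing now.

    upstream_pending: some upstream module still has an execution pending.
    downstream_busy:  some downstream module has not finished executing.
    """
    if trigger is Trigger.USER:
        return True                  # immediate
    if trigger is Trigger.NETWORK:
        return not upstream_pending  # delayed until upstream is done
    return not downstream_busy       # wait for downstream to finish
```

With these assumptions, a user trigger always starts at once, while a network trigger with `upstream_pending=True` stays delayed.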
Intermediate Results
In the current design we use intermediate results in two cases: to display intermediate results of a solving process and to create animations. The needs of the two differ, however. When displaying solver progress it is not necessary that every intermediate is displayed: if the solver produces intermediates faster than they can be displayed, we should skip displaying some of them. Such behavior is not wanted when making animations or looping through files. Hence we need to split this behavior:
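One way to picture the two behaviors is as two kinds of output port. This is an illustrative sketch only, not the design chosen for SCIRun; the class names `LatestOnlyPort` and `EveryFramePort` are invented here. The first keeps just the newest intermediate and silently drops older ones that were never read, which suits solver previews. The second queues every intermediate so that none is skipped, which suits animations and file loops.

```python
from collections import deque


class LatestOnlyPort:
    """Solver preview: a newer intermediate overwrites an unread older one."""

    def __init__(self):
        self._value = None
        self._has_value = False
        self.dropped = 0  # intermediates that were never displayed

    def post(self, result):
        if self._has_value:
            self.dropped += 1
        self._value = result
        self._has_value = True

    def take(self):
        """Return the newest unread intermediate, or None if there is none."""
        if not self._has_value:
            return None
        self._has_value = False
        return self._value


class EveryFramePort:
    """Animation or file loop: every intermediate is delivered, in order."""

    def __init__(self):
        self._frames = deque()

    def post(self, result):
        self._frames.append(result)

    def take(self):
        """Return the oldest unread intermediate, or None if there is none."""
        return self._frames.popleft() if self._frames else None
```

The trade-off is that the second variant needs either unbounded buffering or a way to make the producer wait for the consumer, which the sketch does not address.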
Execution model
All actions run through the scheduler, which recomputes the state of each module upon every change in the network. The scheduler is the only entity that can start the execution of a module. For this behavior we need to alter the module code so that a module waits for a signal from the scheduler to start execution, or to abort in case the module is deleted.
The following events trigger recomputing the state of the network:
- Module finished execution:
  - Update the list of modules that depend on this module and schedule them to wait for new data (make sure that a module cannot wait for dependencies that depend on the module itself).
  - Check whether the output should be considered valid. If a pipe was added or deleted while the module was executing, all results of the module should be considered invalid and the module should reexecute.
  - If execution was aborted, all modules that depend on this one should go into the unresolved state. However, modules that also depend on other modules that are still executing remain scheduled for execution and continue to wait for new data.
  - Determine which of the modules waiting for new data can now execute because no module upstream of them is computing new data anymore.
- Adding/deleting pipe:
  - If the module is executing, try to abort it and reschedule it for execution.
  - If the module is not executing but is waiting for data, recalculate the dependency scheme and keep the module in the Waiting for new data state. This module may then trigger execution when the modules upstream are finished.
  - If the module is in the resolved state, push it to the unresolved state, but do not execute it.
- Adding module:
  - Compute whether the execution of a module upstream will trigger this module. In that case the module should go to the Waiting for new data state immediately.
  - If no upstream execution is scheduled, the module stays in the unresolved state.
- Deleting a module:
  - The module that is deleted is removed from the network.
  - The module is issued an abort.
  - When the module finishes, it is deleted from SCIRun.
  - Recompute all dependencies and all states of modules when the module is deleted.
  - As deletion causes pipes to be deleted, some modules will be forced into an unresolved state.
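As an illustration of how one of these events could map onto state changes, the sketch below handles the "adding/deleting pipe" case for a single module. Everything named here (`Module`, `on_pipe_changed`, the state strings, the action strings) is hypothetical. Moving an executing module to an "aborting" state is also an assumption, since the text only says the scheduler should try to abort it.

```python
from dataclasses import dataclass

EXECUTING = ("computing_new_data", "computing_intermediate")


@dataclass
class Module:
    """Hypothetical stand-in for a network module."""
    name: str
    state: str = "unresolved"


def on_pipe_changed(module):
    """Apply the pipe added/deleted rules to one module.

    Returns the follow-up actions the scheduler would have to carry out.
    """
    if module.state in EXECUTING:
        # Results computed against the old wiring are invalid.
        module.state = "aborting"  # assumption: abort is modelled as a state
        return ["abort", "reschedule"]
    if module.state == "waiting_for_upstream":
        # Stay in the waiting state; only the dependencies change.
        return ["recompute_dependencies"]
    if module.state == "resolved":
        module.state = "unresolved"  # pushed back, but not executed
        return []
    return []


if __name__ == "__main__":
    solver = Module("solver", "computing_new_data")
    print(on_pipe_changed(solver), solver.state)
```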
Sketch of Scheduler Inner Workings
- An event (module execution finished, module intermediate execution finished, add pipe, remove pipe, add module, remove module, user execute module) occurs and is reported to the mailbox of the Scheduler.
- The Scheduler locks and prevents anything from changing the current state. We will do an event-by-event recomputation of the full state of the scheduler.
- First, make a list of all modules that are currently executing. Use the information from the events to see if a module finished execution or was ordered by the user to execute. If the user executed a module, it is added to the execution list; if a module finished, it is removed from the execution list.
- Second, determine which modules are resolved, which are not, and which are disabled. Check whether the generation numbers of the dataflow objects used for the most recent execution are the most recent ones from the upstream modules. If a module is disabled, no state information is calculated for it: it is in the disabled state.
- Determine all modules that depend on modules executing in the network; these modules go into the wait for new data list.
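The steps above can be sketched as a small event loop. This is a simplified illustration under several assumptions: only two event kinds are handled, the network is a plain dictionary mapping each module to the modules it feeds, and all names (`Scheduler`, `downstream_of`, `is_resolved`, the event strings) are invented for this example rather than taken from SCIRun.

```python
import queue
import threading
from collections import deque


def downstream_of(sources, pipes, disabled=frozenset()):
    """Return every enabled module reachable from `sources` through `pipes`.

    `pipes` maps a module name to the names of the modules it feeds.
    The `seen` set keeps the walk finite even if the network has a loop.
    """
    sources = set(sources)
    seen = set()
    frontier = deque(sources)
    while frontier:
        current = frontier.popleft()
        for target in pipes.get(current, ()):
            if target in seen or target in sources or target in disabled:
                continue
            seen.add(target)
            frontier.append(target)
    return seen


def is_resolved(used_generations, latest_generations):
    """A module is resolved if it last ran on the newest upstream data.

    Both arguments map an input port name to a generation number.
    """
    return used_generations == latest_generations


class Scheduler:
    def __init__(self, pipes, disabled=()):
        self.pipes = pipes
        self.disabled = frozenset(disabled)
        self.mailbox = queue.Queue()  # events are reported here
        self.lock = threading.Lock()  # held while state is recomputed
        self.executing = set()
        self.waiting = set()

    def post(self, kind, module):
        self.mailbox.put((kind, module))

    def process_next_event(self):
        """Take one event from the mailbox and recompute the full state."""
        kind, module = self.mailbox.get()
        with self.lock:
            if kind == "user_execute":
                self.executing.add(module)
            elif kind == "finished":
                self.executing.discard(module)
            self.waiting = downstream_of(
                self.executing, self.pipes, self.disabled)
```

Recomputing the waiting list from scratch on every event is simple to reason about, at the cost of a full traversal per event. Whether that cost matters depends on network size, which the sketch makes no claim about.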
