Execution model
NOTE: This is a proposed execution model for the new CAWI.
Most aspects have been implemented (as a proof-of-concept) in the feature--datastack branch of cawi-engine, but the implementation is still experimental.
Background - XCAWI2
XCAWI2 implemented a "flat" memory model - every time a section of DSC was executed (whether this was from the "next" or "previous" button) it would make changes on top of the existing data - all changes were cumulative.
This meant that once a question was asked and stored in the data, the answer would stay in the data unless it was explicitly deleted with a CLEAR() statement, even if that question was now "off-route" due to changes to answers to other questions.
When the previous button was pressed, XCAWI2 would identify the previous page on the route by re-running the ROUTE code from the start to generate a list of pages on the route. No history of the route taken so far was kept.
The CAPMI worked slightly differently in this respect - it kept a list of the pages displayed on the ROUTE (the CW_fullPath variable), and going back would "pop" the previous page from this list and display that page.
The XCAWI2 approach had the advantage that it adapted to new versions of the DSC with changes to the pages. This is something that sometimes caused problems in the CAPMI (inserting a new page to a live survey could break the routing for existing, partially-completed interviews).
Proposal
The new CAWI will use the CAPMI's "CW_fullPath" variable to keep track of the pages visited so far. Mechanisms will be introduced to manage changes to the page structure during a live survey, and this is something the DSC scripter will need to be aware of.
The major conceptual change which is proposed is that the "back" button acts as "undo".
A "restore point" will be created immediately before each NEWPAGE() statement, containing the state of all the QUESTIONS and VARIABLES at that point.
When the respondent navigates to the previous page, the data will be restored to the values it contained at the corresponding restore point.
A "response cache" will be kept of values entered by the respondent, so that when the user goes forward (after going back), the page will be presented with the answers previously entered.
Questions and variables defined with the STATIC keyword are not included in this mechanism and will retain their values when the respondent goes back in the route.
INVALID DATA AND DATA TYPES
In XCAWI2, all data entered by the respondent was first saved before being validated. If there were errors in the data, it would be presented back to the respondent, but it would still be saved in the data.
The intention in the new CAWI is to only save valid responses in the main part of the data store. Invalid values would be saved in the cache, which would allow them to be presented back to the user for correction.
Data saved in the main part of the data store should be stored with the correct type (number, string, boolean, array). In both XCAWI2 and the CAPMI, everything was a string.
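The split between valid and invalid responses could look roughly like the following Python sketch. The helper name save_response, the AGE and KIDS question names, and the use of int as the parser are all assumptions made for illustration; none of them come from cawi-engine.

```python
def save_response(store, page_id, name, raw, parse):
    """Save a typed value in 'current' if the raw text parses; otherwise cache the raw text."""
    try:
        value = parse(raw)
    except ValueError:
        # Invalid input stays out of the main store; the cache keeps it for redisplay.
        store["cache"].setdefault(page_id, {})[name] = raw
        return False
    store["current"][name] = value  # stored with its real type, not as a string
    return True


# Hypothetical questions AGE and KIDS, both expecting an integer.
store = {"stack": [{}], "current": {}, "cache": {}}
age_ok = save_response(store, "0001", "AGE", "42", int)
kids_ok = save_response(store, "0001", "KIDS", "two", int)
```

Here "42" ends up in current as the number 42, while "two" is only held in the cache for page 0001.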
NOT ASKED vs NO RESPONSE
Questions not asked on the route will not have an entry in the data store.
If a question is asked, but no response was given (and the question is not REQUIRED), then "null" will be stored in the data for that question.
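A small Python sketch of how a consumer of the data might tell the two cases apart, assuming JSON null is mapped to None. The NOT_ASKED sentinel and the response_status helper are hypothetical names used only for this illustration.

```python
NOT_ASKED = object()  # hypothetical sentinel meaning "no entry in the data store"


def response_status(data, name):
    """Classify a question as not asked, asked without a response, or answered."""
    value = data.get(name, NOT_ASKED)
    if value is NOT_ASKED:
        return "not asked"
    if value is None:  # JSON null: asked, not REQUIRED, left blank
        return "no response"
    return "answered"


# Q1 was answered, Q2 was asked but left blank, Q3 was never on the route.
data = {"Q1": 2, "Q2": None}
statuses = [response_status(data, q) for q in ("Q1", "Q2", "Q3")]
```

The key point is that the check uses key presence rather than truthiness, so a stored null is not mistaken for a missing entry.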
LOOPS
FOR, FOREACH and WHILE loops are all supported, and can include pages inside them.
When pages are displayed multiple times within a loop, the page number will appear multiple times in CW_fullPath, and there will be one entry in the data stack for each time the page is displayed. So the respondent will be able to go backwards in loops.
GOTO
The GOTO statement will still be supported. This will always have the effect of going forwards in the route (i.e. adding a new page to the end of the CW_fullPath list). This means that the "back button" will go back through GOTO statements.
REWIND
A new "REWIND" statement will be added, which takes a page ID as a parameter. This can only be used to go back to pages already visited on the route and will be equivalent to multiple presses of the "back" button.
This could be used when the answers given to questions across multiple pages are inconsistent with each other, and the respondent is asked to go back and correct their answers.
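Since REWIND is described as equivalent to repeated presses of the "back" button, it could be sketched in Python as a loop around a single back step. Both function names are hypothetical. The sketch also assumes that the cache is merged rather than overwritten, and that REWIND targets the most recent earlier visit when a page appears more than once in CW_fullPath; neither point is settled by the text above.

```python
def go_back(store):
    """One press of the back button: cache the page data and drop the last restore point."""
    path = store["system"]["CW_fullPath"]
    leaving = path.pop()
    # Assumption: merge into existing cache entries instead of replacing them.
    store["cache"].setdefault(leaving, {}).update(store["current"])
    store["cache"].setdefault(path[-1], {}).update(store["stack"].pop())
    store["current"] = {}


def rewind(store, page_id):
    """Go back until page_id is the current page; only already-visited pages are allowed."""
    path = store["system"]["CW_fullPath"]
    if page_id not in path[:-1]:
        raise ValueError(f"page {page_id} has not been visited earlier on the route")
    go_back(store)
    while path[-1] != page_id:
        go_back(store)


store = {
    "stack": [{}, {"Q1": ["2"]}, {"Q2": ["1"]}, {"Q3": ["2"]}],
    "current": {},
    "system": {"CW_fullPath": ["0001", "0002", "0003", "0004"]},
    "cache": {},
}
rewind(store, "0002")
```

After the call, the route ends at page 0002 and the answers from pages 0002 and 0003 are held in the cache.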
CALL
The DSC language does not yet have support for defining functions. However, it will be possible to treat a "block" as a function using a new "CALL" statement. This has the same syntax as "GOTO", but when the end of the block that the label is attached to is reached, execution will return to the instruction after the CALL statement.
This is implemented using a "call stack", meaning that blocks can call other blocks (or even call themselves recursively).
The call stack does not yet store data, but the intention is to use it to store function parameters and local variables when that syntax is added to the DSC language.
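The return-address behaviour can be illustrated with a toy interpreter in Python. This is only a sketch of the call-stack idea: the instruction tuples, the run function and the "main" block are invented for the example and do not reflect how cawi-engine represents DSC internally.

```python
def run(blocks, entry="main"):
    """Run labelled instruction lists; CALL returns to the following instruction."""
    trace = []        # pages displayed, in order
    call_stack = []   # return addresses: (block label, index of next instruction)
    label, pc = entry, 0
    while True:
        block = blocks[label]
        if pc >= len(block):
            if not call_stack:
                return trace              # end of the entry block
            label, pc = call_stack.pop()  # end of a called block: return to caller
            continue
        op, arg = block[pc]
        if op == "CALL":
            call_stack.append((label, pc + 1))
            label, pc = arg, 0
        else:  # "PAGE": stands in for a NEWPAGE() ... ENDPAGE() pair
            trace.append(arg)
            pc += 1


blocks = {
    "main": [("CALL", "blockA"), ("PAGE", "END")],
    "blockA": [("PAGE", "A1"), ("CALL", "blockC"), ("PAGE", "A2")],
    "blockC": [("PAGE", "C1")],
}
trace = run(blocks)
```

With these blocks, the pages are visited in the order A1, C1, A2, END, showing that a nested CALL returns into the middle of the calling block.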
FAST-FORWARD
The engine could support a feature to "fast-forward" from the current page to the first page with validation errors or no response - for when the respondent has gone back and then wants to "fast-forward" to the next page they have not previously seen.
This could also be used when a respondent is resuming an interview - to recalculate the route in case a new version of the interview script is being used.
Data storage
The engine stores the current interview data in a JSON object. When a new interview starts, it is initialised as follows:
{
stack: [ {} ],
current: {},
static: {},
system: { CW_track: '...', CW_fullPath: [] },
cache: {},
}
The initial object on the stack is used to store any pre-populated data. i.e. it is the state of the data at the start of the ROUTE.
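As a rough Python equivalent of this initialisation, the following sketch builds the same structure and places any pre-populated data in the first stack entry. The function name new_interview and the REGION variable are hypothetical; the CW_track value is left as the placeholder used above.

```python
import copy


def new_interview(prepopulated=None):
    """Build an empty interview store; pre-populated data becomes the first stack entry."""
    return {
        "stack": [copy.deepcopy(prepopulated) if prepopulated else {}],
        "current": {},
        "static": {},
        "system": {"CW_track": "...", "CW_fullPath": []},  # CW_track value elided in the text
        "cache": {},
    }


# REGION is a made-up pre-populated variable.
store = new_interview({"REGION": "North"})
```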
The system.CW_fullPath variable is an array of page names - the
route the respondent has taken up to and including the current page.
Each page in the ROUTE is automatically assigned a sequential ID - a four-character string starting at '0001'. However, the ID is arbitrary and does not need to be sequential. The page ID can be overridden by attaching a label to the NEWPAGE() statement, in which case the label is used as the ID (and the sequential counter is not incremented for that page).
This is to allow new pages to be added without changing the IDs of existing pages.
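The ID assignment rule can be sketched in Python as follows. The function name and the "consent" label are illustrative assumptions; the input is simply one entry per NEWPAGE() in script order, with None where no label is attached.

```python
def assign_page_ids(labels):
    """Return page IDs: the label where one is given, otherwise the next sequential ID."""
    ids = []
    counter = 1
    for label in labels:
        if label is None:
            ids.append(f"{counter:04d}")
            counter += 1  # only unlabelled pages consume a sequential number
        else:
            ids.append(label)
    return ids


# A labelled page inserted between existing pages.
ids = assign_page_ids([None, None, "consent", None])
```

Because the labelled page does not consume a number, the page after it still receives '0003', the ID it would have had without the insertion.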
(TO BE REVIEWED - THIS MAY NOT BE NEEDED) If a page is inside a loop,
then the name of the page in CW_fullPath is appended with the
current value(s) of the loop iterator(s) surrounding the page.
e.g. 0004:1:2 if a page is inside two nested loops.
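If this naming scheme is kept, building the name is a simple join. A hypothetical Python helper, assuming the iterators are listed from the outermost loop inwards:

```python
def page_name(page_id, iterators=()):
    """Append the surrounding loop iterator values to the page ID, separated by colons."""
    return ":".join([page_id, *(str(i) for i in iterators)])


nested = page_name("0004", (1, 2))
plain = page_name("0004")
```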
Writing data to the store
All data written to the store is saved in .current. This contains
all data writes since the last NEWPAGE(). It will be moved to the
stack when the next page starts.
Reading data from the store
When a value is read from the store, the get function first looks
in .current and will return the value if found.
If not, it will search the objects in stack, starting from the last one and working back to 0. It will return the first value found (if any).
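The write and read rules above can be sketched in Python as below. The function names are hypothetical, and STATIC values are left out because the text does not say where they sit in the lookup order.

```python
def write_value(store, name, value):
    """All writes go to 'current' until the next NEWPAGE()."""
    store["current"][name] = value


def read_value(store, name, default=None):
    """Look in 'current' first, then search the stack from the newest entry to the oldest."""
    if name in store["current"]:
        return store["current"][name]
    for frame in reversed(store["stack"]):
        if name in frame:  # key presence, so a stored null (None) still counts as found
            return frame[name]
    return default


store = {"stack": [{"Q1": 1}, {"Q1": 2, "Q2": None}], "current": {"Q3": 5}}
q1_before = read_value(store, "Q1")
write_value(store, "Q1", 9)
q1_after = read_value(store, "Q1")
```

Before the write, Q1 resolves to the most recent stack entry; after it, the value in current takes precedence.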
Navigating forwards in the ROUTE
When the respondent successfully submits a page, execution continues
with the next statement after ENDPAGE() in the ROUTE, with any data
writes still being made to .current.
When the next NEWPAGE() is found, the contents of .current are
pushed onto the stack, and .current is reset to be an empty object.
This creates a "restore point" at that location in the route.
At the same time, the name of the next page will be pushed onto
CW_fullPath - the stack contains one entry for each page (plus one
for the initial state).
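A minimal Python sketch of what happens at NEWPAGE(), following the description above. The function name new_page is an assumption, and the starting state is taken from page 0001 of Example 1 after Q1 has been answered.

```python
def new_page(store, page_id):
    """Create a restore point from 'current' and record the new page on the route."""
    store["stack"].append(store["current"])
    store["current"] = {}
    store["system"]["CW_fullPath"].append(page_id)


# The respondent is on page 0001 and has answered Q1.
store = {
    "stack": [{}],
    "current": {"Q1": ["2"]},
    "system": {"CW_fullPath": ["0001"]},
}
new_page(store, "0002")
```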
Navigating backwards in the ROUTE
When the user presses the "previous page" button, execution resumes from the "NEWPAGE()" statement at the start of the previous page in CW_fullPath.
For example, if the respondent is on page 0004, and the previous page was 0003, the following will happen:
cache['0004'] = current
cache['0003'] = stack.pop()
current = {}
current contains any data changes made since the NEWPAGE() at the start of page 0004.
stack.pop() contains the changes made between the NEWPAGE() at the start of page 0003, and the NEWPAGE() at the start of page 0004.
We reset current to an empty object because we are starting page 0003 again from the start. The cache will be used when the questions are ASKed().
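The back step can be sketched in Python as follows. One point is an assumption rather than something stated above: the sketch merges data into an existing cache entry instead of assigning over it. With plain assignment, pressing back twice would replace the cached answers for page 0003 with the empty current of the second press, whereas merging reproduces the cache contents shown in Example 1. The function name go_back is hypothetical.

```python
def go_back(store):
    """Undo the current page and return the ID of the page to redisplay."""
    path = store["system"]["CW_fullPath"]
    if len(path) < 2:
        raise ValueError("already on the first page of the route")
    leaving = path.pop()
    previous = path[-1]
    cache = store["cache"]
    # Assumption: merge into any existing cache entry instead of overwriting it.
    cache.setdefault(leaving, {}).update(store["current"])
    cache.setdefault(previous, {}).update(store["stack"].pop())
    store["current"] = {}  # the previous page restarts from its NEWPAGE()
    return previous


# State at the end of Example 1, before the respondent presses back three times.
store = {
    "stack": [{}, {"Q1": ["2"]}, {"Q2": ["1"]}, {"Q3": ["2"]}],
    "current": {},
    "system": {"CW_fullPath": ["0001", "0002", "0003", "0004"]},
    "cache": {},
}
for _ in range(3):
    page = go_back(store)
```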
Using the cache
When a question is ASKed on a page, its value is not pre-populated using the normal "get" function. Instead, the different parts of the data store are checked in the following order, and the first value found is used:
- "current" (meaning it has been previously asked, and is being redisplayed with errors),
- "cache" (meaning the user has previously provided a value)
- "stack" (we search the history to see if it was asked on a previous page, or set to a value in the ROUTE)
Note: This means the DSC scripter should not set the value of a
question using an assignment statement within the page (which will be
stored in current). This will conflict when going backwards -
preventing any value in the cache being used. Any pre-population
should therefore be done in the part of the ROUTE before the
NEWPAGE().
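The lookup order used when a question is ASKed can be sketched in Python as below, assuming the per-page cache described in this document. The function name prefill_value is hypothetical.

```python
def prefill_value(store, page_id, name):
    """Return the value to display for a question: current, then cache, then stack."""
    if name in store["current"]:
        return store["current"][name]  # redisplayed with errors
    page_cache = store["cache"].get(page_id, {})
    if name in page_cache:
        return page_cache[name]        # entered before the respondent went back
    for frame in reversed(store["stack"]):
        if name in frame:
            return frame[name]         # asked on an earlier page or set in the ROUTE
    return None


store = {
    "stack": [{}, {"Q1": ["2"]}],
    "current": {},
    "cache": {"0002": {"Q2": ["1"]}},
}
q2 = prefill_value(store, "0002", "Q2")  # found in the cache for page 0002
q1 = prefill_value(store, "0002", "Q1")  # found in the stack
q3 = prefill_value(store, "0002", "Q3")  # not found anywhere
```

This also shows why an assignment inside the page is a problem: a value in current is found first, so the cached answer would never be reached.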
Clearing the cache
When a user successfully submits a page and moves to the next page, any cache for that page is deleted - it has been successfully incorporated into "current" (and will shortly be pushed onto the stack when the next page is reached).
NOTE: The cache could either be implemented per page (which includes the loop iterator if the pages are in a loop), or in a single "global" cache. The current implementation is "per-page" - this means that if the same question is asked multiple times in a loop, then each of the entries by the respondent will be remembered.
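With a per-page cache, clearing on submit is a single deletion keyed by page. A Python sketch, assuming a hypothetical submit_page function that receives the already-validated answers for the page:

```python
def submit_page(store, page_id, answers):
    """Record validated answers in 'current' and drop the cache entry for this page only."""
    store["current"].update(answers)
    store["cache"].pop(page_id, None)  # other pages keep their cached answers


store = {
    "stack": [{}],
    "current": {},
    "cache": {"0001": {"Q1": ["2"]}, "0002": {"Q2": ["1"]}},
}
submit_page(store, "0001", {"Q1": ["1"]})
```

The entry for page 0002 is untouched, so its answer is still available if the route reaches that page again.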
Potential issues
- If the respondent completes the survey, then goes back several pages and closes their browser, the last few pages will be removed from the route. This could be resolved by cleaning the data (re-running the route and using the cache for the last few pages).
Examples
Example 1 - basic operation
Given the following DSC:
QUESTIONS
Q1 "What is your sex?" : { 1 "Male", 2 "Female" }
Q2 "Are you pregnant?" : { 1 "Yes", 2 "No" }
Q3 "Is this your first pregnancy?" : { 1 "Yes", 2 "No" }
ROUTE
NEWPAGE()
ASK(Q1)
ENDPAGE()
IF (Q1 IN {2}) THEN {
NEWPAGE()
ASK(Q2)
ENDPAGE()
}
IF (Q2 IN {1}) THEN {
NEWPAGE()
ASK(Q3)
ENDPAGE()
}
NEWPAGE()
MESSAGE("End of survey")
ENDPAGE()
Here is the content of the data after the following actions:
- Q1 answered "Female"
- Q2 answered "Yes"
- Q3 answered "No"
{
"cache": {},
"stack": [
{},
{ "Q1": [ "2" ] },
{ "Q2": [ "1" ] },
{ "Q3": [ "2" ] }
],
"current": {},
"system": { "CW_fullPath": [ "0001", "0002", "0003", "0004" ] }
}
The user then presses the back button three times and goes back to page 0001. At this point, Q1 is not in the "on-route" data (current or stack), but the question is presented to the respondent using the value from the cache ("Female").
{
"cache": {
"0001": { "Q1": [ "2" ] },
"0002": { "Q2": [ "1" ] },
"0003": { "Q3": [ "2" ] },
"0004": { }
},
"stack": [
{}
],
"current": {},
"system": { "CW_fullPath": [ "0001" ] }
}
The respondent now changes the answer to "Male" and goes forwards to page 0004. The earlier answer to Q2 is only in the cache, not in current or the stack, so the routing conditions ignore it and pages 0002 and 0003 are skipped.
{
"cache": {
"0002": { "Q2": [ "1" ] },
"0003": { "Q3": [ "2" ] },
"0004": { }
},
"stack": [
{},
{ "Q1": [ "1" ] }
],
"current": {},
"system": { "CW_fullPath": [ "0001", "0004" ] }
}
Example 2 - A navigation question with WHILE and CALL
For a large survey with independent sections, the following approach is possible, giving the respondent the opportunity to complete the blocks in an arbitrary order, with a summary question showing which are completed:
QUESTIONS
SOMETHING "Which block?" : { 1 "Block a", 2 "Block b", 3 "Finished" }
A1 "A1" : INTEGER
A2 "A2" : INTEGER
B1 "B1" : INTEGER
C1 "C1" : INTEGER
VARIABLES
i : INTEGER
COMPLETED : BOOLEAN
ROUTE
COMPLETED = FALSE
WHILE (NOT COMPLETED) DO {
NEWPAGE()
ASK(SOMETHING)
ENDPAGE()
IF (SOMETHING IN {1}) THEN { CALL blockA ; i = 3 }
ELSE IF (SOMETHING IN {2}) THEN CALL blockB
ELSE IF (SOMETHING IN {3}) THEN COMPLETED = TRUE
}
NEWPAGE()
MESSAGE("END OF SURVEY")
ENDPAGE()
STOP("")
blockA : {
NEWPAGE()
ASK(A1)
ENDPAGE()
CALL blockC
NEWPAGE()
ASK(A2)
ENDPAGE()
}
blockB : {
NEWPAGE()
ASK(B1)
ENDPAGE()
}
blockC : {
NEWPAGE()
ASK(C1)
ENDPAGE()
}