Quantum User’s Guide Volume 1 Data Editing TUM90518U1
COPYRIGHT 2000 BY SPSS LIMITED

All rights reserved as an unpublished work, and the existence of this notice shall not be construed as an admission or presumption that publication has occurred. No part of the materials may be used, reproduced, disclosed or transmitted to others in any form or by any means except under license by SPSS Ltd. or its authorized distributors.

SPSS Limited
Maygrove House
67 Maygrove Road
LONDON NW6 2EG
England

Please address any comments or queries about this manual to the Support Department at the above address, or via e-mail to: [email protected]

All trademarks acknowledged.
Contents

About this guide

1 Introduction
    1.1 What Quantum does
    1.2 Stages in a Quantum run

2 Your Quantum program
    2.1 Storing your program on the computer
    2.2 Components of a program
        Edit statement
        Checking and verification statements
        Dealing with errors in your data
        Loops and routing
        Tabulation statements

3 Writing in the Quantum language
    3.1 The character set
    3.2 Formatting your program
    3.3 Comments
    3.4 Continuation
    3.5 Dealing with possible syntax errors
    3.6 Printing error messages on the screen

4 Basic elements
    4.1 Data constants
        Individual constants
        Strings of data constants
    4.2 Numbers
        Whole numbers
        Real numbers
    4.3 Variables and arrays
        Data variables
        Integer variables
        Real variables
        Reading real numbers from columns
    4.4 Subscription

5 Expressions
    5.1 Arithmetic expressions
        Combining arithmetic expressions
        Counting the number of codes in a column
        Generating a random number
    5.2 Logical expressions
        Comparing values
        Comparing data variables and data constants
        Checking the arithmetic value of a field of columns
        Combining logical expressions
        Comparing variables and arithmetic expressions to a list
    5.3 Speeding up large programs

6 How Quantum reads data
    6.1 Types of record
        Ordinary records
        Multicard records
        Multicard records with trailer cards
    6.2 Reading data into the C array
        Ordinary records
        Multicard records
        Ignoring card types
    6.3 Processing the data
    6.4 Trailer cards
        thisread
        allread
        firstread and lastread
        Examples
    6.5 Columns 1 to 100
    6.6 Reserved variables
    6.7 Using spare columns
    6.8 Describing the data structure
        Record type
        Record length
        Serial number location
        Card type location
        Required card types
        Repeated card types
        Highest card type number
        Dealing with alphanumeric card types
        Merge sequence for trailer cards
    6.9 Merging data files
        Merging complete cards
        Merging a field of data from an external file
    6.10 Multicard records of more than 100 columns per card
    6.11 Reading non-standard data files

7 Writing out data
    7.1 Print files
        Printing out individual records
        Writing out parts of records
    7.2 Data files
        Creating new cards
    7.3 Writing to a report file
        Data variables
        Integer variables
        Real variables
        Text and white space
        Examples
    7.4 Defining the file type
    7.5 Default print parameters for write statements
    7.6 Writing out data in a user-defined format

8 Changing the contents of a variable
    8.1 Assignment statements
        Copying codes
        Partial column replacement
        Storing arithmetic values
        Assignment with and, or and xor
    8.2 Adding codes into a column
    8.3 Deleting codes from a column
    8.4 Forcing single-coded answers
    8.5 Setting a random code in a column
    8.6 Reading numeric codes into an array
    8.7 Clearing variables
    8.8 Checking array boundaries in assignment statements
    8.9 Assigning values to T variables in the data file

9 Flow control
    9.1 Statements of condition – if
    9.2 Statements of condition – else
    9.3 Routing around statements
    9.4 continue
    9.5 Loops
        do with individually specified numeric values
        do with numeric ranges
        do with codes
        Nested loops
        Routing with loops
    9.6 Rejecting records
    9.7 Jumping to the tabulation section
    9.8 Stopping the processing of data by the edit
    9.9 Canceling the run
    9.10 Going temporarily to the tab section

10 Examining records
    10.1 Holecounts
        Creating a holecount
        Filtered holecounts
        Counting trailer cards
        Multiplied holecounts
    10.2 Frequency distributions
        Creating a frequency distribution
        Multiplied frequency distributions

11 Data validation
    11.1 require
    11.2 Column and code validation
        The action code
        Checking type of coding
        Comments with require
        Checking codes in columns
        Exclusive codes
        Automatic error correction
        Defaults in a require statement
    11.3 Validating logical expressions
    11.4 Testing the equivalence of logical expressions
    11.5 Actions when a require statement fails
    11.6 Combining testing sentences

12 Data correction
    12.1 Forced editing (forced cleaning)
    12.2 On-line data correction
    12.3 On-line editing commands
        Displaying columns in the record
        Correcting records
        Accepting and rejecting records
        Creating and deleting cards
        Record editing commands
        Canceling the online edit
        Redefining on-line edit command names
    12.4 Creating clean and dirty data files
    12.5 Correcting data from a corrections file
    12.6 Missing values in numeric fields
        Facilities provided by missing values processing
        Switching missing values processing on and off
        Missing values in arithmetic expressions and assignments
        Manual assignment of the missing value to a variable
        Testing whether a variable has the value missing_

13 Using subroutines in the edit
    13.1 Calling up subroutines
    13.2 Subroutines in the Quantum library
        Using look-up files
        Converting multicoded data to single-coded data
    13.3 Writing your own routines
        Writing subroutines in C
        Writing subroutines in Quantum
    13.4 Calling functions from C libraries

14 Creating new variables
    14.1 Naming variables
    14.2 Defining variables
    14.3 The default variables file
    14.4 Naming variables in your program

15 Data-mapped variables
    15.1 Advantages of data-mapping files
    15.2 Contents of a data-mapping file
    15.3 Defining data-mapped variables
    15.4 Using data-mapped variables
    15.5 Assigning values to data-mapped variables
    15.6 Testing the value of a data-mapped variable
        Test the numeric value
        Test the categoric response
    15.7 Using data-mapped variables in analysis specifications
    15.8 Using parameter substitution with data-mapped variables
    15.9 Additional features using data-mapped variables
    15.10 Automatically generating a Quantum spec
        Quancept CAPI and Quancept Web text-formatting options
        Reducing the response texts
        Files produced by qdiaxes

16 Running Quantum under Unix and DOS
    16.1 Which version to use
    16.2 The Quantum command
        Compressed and non-standard data files
    16.3 Compiling your program file
    16.4 Loading the C code
    16.5 Reading the data
    16.6 Weighting, accumulation and manipulation
    16.7 Creating tables
    16.8 Log files and running in the background
    16.9 Running more than one job in a directory
    16.10 The Quantum temporary directory
    16.11 The Quantum permanent directory

Index
About this guide

The Quantum User’s Guide is written primarily for Quantum spec writers. It is also a useful reference for Quanvert database administrators and others who prepare data for use with Quanvert or Quanvert Text.

This guide is not intended as a tutorial or teach-yourself document. Instead, it provides a complete and detailed description of the Quantum language and the Quantum programs. However, the guide has been designed with your needs in mind. If you are an experienced user, you will find the Quick Reference boxes at the start of each section helpful as a reminder of syntax. If you are less experienced, you will probably prefer the more detailed explanations and examples in the main body of each section.

The Quantum User’s Guide is divided into four volumes, which are described in more detail below. All the volumes contain a comprehensive index that covers all four volumes.
Volume 1, Data editing

Volume 1 of the Quantum User’s Guide covers data editing, validation and cleaning:

• Chapters 1 to 3 give you an overview of the language and explain the basic concepts of Quantum spec writing.
• Chapter 4, ‘Basic Elements’, describes constants, numbers and variables.
• Chapter 5, ‘Expressions’, describes arithmetic and logical expressions.
• Chapter 6, ‘How Quantum reads data’, describes types of records, data structure, trailer cards, reserved variables, merging data files and reading non-standard data files.
• Chapter 7, ‘Writing out data’, describes creating a new data file, copying records to a print file, and writing to a report file.
• Chapter 8, ‘Changing the contents of a variable’, describes the Quantum assignment statements, adding and deleting codes in a column, forcing single-coded answers, setting a random code in a column, reading numeric codes into an array and clearing variables.
• Chapter 9, ‘Flow control’, describes the if and else statements, routing around statements, loops, rejecting records, jumping to the tabulation section and canceling the run.
• Chapter 10, ‘Examining records’, describes holecounts and frequency distributions.
• Chapter 11, ‘Data validation’, describes the require statement, column and code validation, and validating logical expressions.
• Chapter 12, ‘Data correction’, describes forced cleaning, on-line data correction, creating clean and dirty data files, correcting data from a corrections file, and missing values in numeric fields.
• Chapter 13, ‘Using subroutines in the edit’, describes how to call up subroutines, the subroutines in the Quantum library, writing your own subroutines and calling functions from C libraries.
• Chapter 14, ‘Creating new variables’, describes how to name and define variables in your Quantum spec.
• Chapter 15, ‘Data-mapped variables’, describes the data-mapped variables feature.
• Chapter 16, ‘Running Quantum under Unix and DOS’, describes how to compile and run your Quantum program.
Volume 2, Basic tables Volume 2 of the Quantum User’s Guide covers axes and creating basic tables: •
Chapter 1, ‘Introduction to the tabulation section’, provides an introduction to creating tables in Quantum.
•
Chapter 2, ‘The hierarchy of the tabulation section’, describes the components of a tabulation program, the hierarchies of Quantum, how to define run conditions, the options that are available on the a, sectbeg, flt and tab statements, the default options file and some sample tables.
•
Chapter 3, ‘Introduction to axes’, describes how to create an axis, the types of elements within an axis, how to define conditions for an element, the n count creating elements, subheadings, netting and axes within axes.
•
Chapter 4, ‘More about axes’, describes the col, val, fld and bit statements, filtering within an axis, and options on axis elements.
•
Chapter 5, ‘Statistical functions and totals’, describes totals, averages, means, the standard deviation, standard error and error variance statements and how to create percentiles.
•
Chapter 6, ‘Using axes as columns’, describes special considerations for when axes are used for the columns of a table.
•
Chapter 7, ‘Creating tables’, describes the syntax of the tab statement, multidimensional tables, multilingual surveys, combining tables, printing more than one table per page, and suppressing percentages and statistics with small bases.
•
Chapter 8, ‘Table texts’, describes table titles, underlining titles, printing text at the foot of a page, table and page numbers and controlling table justification.
•
Chapter 9, ‘Filtering groups of tables’, describes general filter statements, named filters and nested filter sections.
•
Chapter 10, ‘Include and substitution’, describes filing and retrieving statements, symbolic parameters and grid tables.
•
Chapter 11, ‘A sample Quantum job’, provides an example of a Quantum specification and the tables it produces.
•
Appendix A, ‘Limits’, describes the limits built into Quantum.
•
Appendix B, ‘Error messages’, contains a list of compilation error messages with suggestions as to why you may see them and how to solve the problems which caused them to appear.
•
Appendix C, ‘Options in the tabulation section’, provides a summary of the options available in the tabulation section.
Volume 3, Advanced tables
Volume 3 of the Quantum User’s Guide covers advanced tables and statistics:
•
Chapter 1, ‘Weighting’, describes the weighting methods that you can use in Quantum.
•
Chapter 2, ‘Row and table manipulation’, describes how to create new rows and tables using previously created tables or parts of previously created tables.
•
Chapter 3, ‘Dealing with hierarchical data’, describes how to use analysis levels in Quantum.
•
Chapter 4, ‘Descriptive statistics’, describes the axis-level and table-level statistical tests that are available in Quantum and provides details of the chi-squared tests, non-parametric tests on frequencies and Friedman’s two-way analysis of variance.
•
Chapter 5, ‘Z, T and F tests’, describes the Z, T and F tests that are available in Quantum.
•
Chapter 6, ‘Other tabulation facilities’, describes how to include C code and edit statements in the tabulation section and how to sort tables.
•
Chapter 7, ‘Special T Statistics’, describes the special T statistics that are available in Quantum.
•
Chapter 8, ‘Creating a table of contents’, describes how to create a formatted list of the tables that are produced by a Quantum run.
•
Chapter 9, ‘Laser printed tables with PostScript’, describes how to convert the standard tabulation output into a file suitable for printing on a PostScript laser printer.
•
Appendix A, ‘Options in the tabulation section’, provides a summary of the options available in the tabulation section.
Volume 4, Administrative functions
Volume 4 of the Quantum User’s Guide covers administrative functions:
•
Chapter 1, ‘Files used by Quantum’, describes files you may need to create in order to use certain Quantum facilities, including the variables file, the levels file, the default options file, the run definitions file, the merges file, the corrections file, the rim weighting parameters file and the C subroutine code file, as well as aliases for Quantum statements, customized texts and user-definable limits.
•
Chapter 2, ‘Files created by Quantum’, describes many of the files created during a run and draws your attention to those of particular interest.
•
Chapter 3, ‘Quantum Utilities’, describes how to tidy up after a Quantum run and how to check column and code usage.
•
Chapter 4, ‘Data conversion programs’, describes the q2cda and qv2cda programs that convert tables into comma-delimited ASCII format, the qtspss and nqtspss programs that convert Quantum data into SPSS format, and the qtsas and nqtsas programs that convert Quantum data into SAS format.
•
Chapter 5, ‘Preparing a study for Quanvert’, describes the tasks you need to perform before converting a Quantum spec and data file into a Quanvert database.
•
Chapter 6, ‘Files for Quanvert users’, describes files that are specific to either Quanvert Text or Windows-based Quanvert.
•
Chapter 7, ‘Creating and maintaining Quanvert databases’, describes how to create and maintain Quanvert databases.
•
Chapter 8, ‘Transferring databases between machines’, describes how to transfer databases between machines and the programs provided to help you achieve this.
•
Appendix A, ‘Limits’, lists limits built into Quantum.
•
Appendix B, ‘Error messages’, contains a list of compilation error messages with suggestions as to why you may see them and how to solve the problems that cause them to appear.
•
Appendix C, ‘Quantum data format’, describes the Quantum data format.
•
Appendix D, ‘Using the extended ASCII character set’, explains how you can use Quantum with data that contains characters in the extended ASCII character set.
•
Appendix E, ‘ASCII to punch code conversion table’, provides a table showing ASCII to punch code conversions.
•
Appendix F, ‘Will this job run on my machine’, offers suggestions on how you can check whether a particularly large job will run on your computer.
Symbols and typographical conventions
Words which are keywords in the Quantum language are normally printed in italics in the text. In the main description of each keyword, the keyword is shown in bold the first time it is mentioned. When showing the syntax of a statement, as in the Quick Reference sections, all keywords are printed in bold. Parameters, such as question texts or responses, whose values are user-defined are shown in italics. Optional parameters are enclosed in square brackets, that is, [ ]. All examples are shown in fixed width type.
The ✎ symbol marks a note or other point of particular interest.
The ☞ symbol marks a reference for further reading related to the current topic.
Comments
SPSS MR welcomes any comments you may have about this guide, and any suggestions for ways in which it could be improved.
1 Introduction
Quantum is a highly sophisticated and very flexible computer language designed to simplify the process of obtaining useful information from a set of questionnaires. Quantum has been designed with market researchers in mind so its syntax and grammar are similar to English. Nevertheless, it is still a computer language and as such should be used with precision and understanding.
The four volumes of the Quantum User’s Guide have three basic functions:
•
To explain the Quantum language.
•
To provide you with enough information about how Quantum works to enable you to carry out a specific task.
•
To help you work out what went wrong when errors occur or when your output is not what you expected.
1.1 What Quantum does
Quantum is a very flexible language which performs a variety of tasks. It can:
•
Check and validate your data.
•
Edit and correct your data.
•
Produce different types of lists and reports of data.
•
Produce new data files.
•
Recode data and produce new variables.
•
Generate tables (in different languages, provided that the translated texts exist).
•
Perform statistical calculations.
Any Quantum run may perform as many or as few of these tasks as you like, but for each run the basic format is the same.
1.2 Stages in a Quantum run
First, the data is read onto a disk. Data on disk can come from a number of different sources, for example:
•
It may be entered directly via a terminal by a telephone interviewer using Quancept CATI.
•
It may be collected over the World Wide Web using software such as Quancept Web.
•
It may be entered directly into a computer by an interviewer conducting a personal interview using Quancept CAPI.
•
It may be entered by a data entry clerk using a data entry package.
Next, the tasks to be performed are defined using the Quantum language. Then, Quantum translates these tasks into instructions that the computer can understand. Finally, the computer itself uses this program to run your job.
Quantum comprises two sections: an edit section and a tabulation section. The edit section checks and validates the data, generates lists and reports, corrects data, produces new data files, and recodes data and creates new variables. The tabulation section produces tables and performs statistical calculations.
Quantum reads the records in the data file one at a time and passes them through the various parts of the Quantum program. As long as there are records remaining in the data file, the loop of ‘read a record -> edit -> tabulate’ is repeated; once the last record has been processed, the tables are ready for printing.
If errors occur at any point in a Quantum run, an error message is printed telling you what is wrong.
☞ For details of the error messages that can occur, see appendix B, ‘Error messages’ in the Quantum User’s Guide Volume 2.
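The record-by-record cycle described in this section can be pictured with a short Python sketch. This is not Quantum code: the functions edit and tabulate, the sample records and the rule applied in the edit are all assumptions made purely for illustration.

```python
# Illustrative model of the 'read a record -> edit -> tabulate' loop.
# Not Quantum code: edit(), tabulate() and the sample data are hypothetical.
from collections import Counter

def edit(record):
    """Stand-in for the edit section: accept a record or reject it."""
    # Assumed rule for this sketch: reject records whose first column is blank.
    return record[:1].strip() != ""

def tabulate(record, table):
    """Stand-in for the tabulation section: count the code in column 1."""
    table[record[0]] += 1

def run(records):
    table = Counter()
    rejected = []
    for record in records:           # records are handled one at a time
        if edit(record):             # the edit comes first
            tabulate(record, table)  # then the tabulation
        else:
            rejected.append(record)
    return table, rejected           # complete once the last record is done

table, rejected = run(["1abc", "2def", " ghi", "1jkl"])
```

In this sketch the table only becomes meaningful after the loop has finished, which mirrors the point that tables are ready for printing once the last record has been processed.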
2 Your Quantum program
Your Quantum program is the basic requirement for any Quantum run. It tells the computer what tasks it has to perform. All Quantum programs are written in the Quantum language which both you and the computer can understand. When writing in this language you must take care that you say exactly what you mean; otherwise your output may not be quite what you expect. The computer cannot guess at what you mean it to do; it only does what you tell it.
2.1 Storing your program on the computer
All Quantum programs are stored in separate files on the computer. Each file has a unique name which may be made up of any characters on your keyboard, but you are advised to use only letters and numbers in your filenames. A typical program might look like this:
*include edit
a;dsp;spechar=–*;decp=0;flush
*include tabs
*include axes
where the file called edit contains editing instructions, the file called tabs contains statements defining the tables required, and axes contains statements which define the individual rows and columns each table is to have. The a; statement lists characteristics that all tables are to have, although some of these characteristics can be overridden for individual tables or individual table elements.
2.2 Components of a program
A Quantum program is made up of a series of statements defining the actions to be taken. If you are typing Quantum programs on your screen you will notice that statements of more than 80 characters wrap around onto the next line and appear to be on two lines in the file. As long as these statements have 200 characters or fewer, Quantum can read them, but you may prefer to make the lines shorter for ease of reading on your screen.
In the following sections, we will explain briefly the types of statements you can use.
Edit statement
Quantum edit statements contain a Quantum keyword and other texts and numbers. Statements in the edit section can generally start in any column, although comments and continuation characters must start in column 1. A line may contain one or more statements, as long as each statement is separated by a semicolon. Edit statements may be preceded by a label number of up to five digits allowing them to be referenced by other parts of the program, for example:
total = c56 + c57 + c58
if (total.gt.8) go to 100
require sp c(66,70)
100 write
Here we are adding the number in column 56 to those in columns 57 and 58 and saving the result in a variable called ‘total’. If this value is greater than eight we go to statement 100, otherwise we continue with the statement immediately after the if line.
Checking and verification statements
Quantum offers you the ability to check and verify your data prior to tabulation. Suppose your questionnaire contains a series of questions to be answered only by people buying a specific brand of tea. You may want to check that everyone who didn’t buy tea has a blank in all columns related to tea. On the other hand, if they did buy a specific brand of tea, you could check whether the codes in the following columns were within a specific range. The statement that you would use for this type of test is require. To perform the test given as an example, we might write:
if (c24’1’) r nb c(25,30); else; r b c(25,30)
This says that if column 24 contains a ‘1’, then columns 25 to 30 must not be blank, otherwise, if column 24 does not contain a ‘1’, then columns 25 to 30 must all be blank. More generalized checking facilities exist which enable you to produce frequency distributions of numeric data (e.g., how many respondents have the number 201 in columns 13 to 15) or holecounts (marginals) which show the broad pattern of coding across all columns in the data. Words associated with these are list and count.
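To make the idea of a frequency distribution and a holecount concrete, here is a small Python sketch. It only illustrates the concepts; Quantum’s own list and count facilities are what you would use in practice. The sketch assumes that each record is a fixed-width string holding one code per column, so multicoded columns are not modelled, and the function names are hypothetical.

```python
# Illustrative only: a frequency distribution and a simple holecount.
# Assumes fixed-width records with exactly one code per column.
from collections import Counter

def frequency(records, start, end):
    """Count the values found in columns start..end (1-based, inclusive)."""
    return Counter(rec[start - 1:end] for rec in records)

def holecount(records):
    """For each column number, count how often each code appears."""
    counts = {}
    for rec in records:
        for col, code in enumerate(rec, start=1):
            counts.setdefault(col, Counter())[code] += 1
    return counts

# Three made-up records of 15 columns each.
records = ["0" * 12 + "201", "0" * 12 + "201", "0" * 12 + "117"]
freq = frequency(records, 13, 15)   # how many have 201 in columns 13 to 15?
holes = holecount(records)          # pattern of coding across all columns
```

Here freq["201"] answers the question posed above (how many respondents have the number 201 in columns 13 to 15), while holes[13] shows which codes occur in column 13 alone.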
Dealing with errors in your data
When errors are found in the data, you have several courses of action open to you. You may:
•
Write out incorrect records for further investigation (write or require).
•
Copy the record to a different file (split or write).
•
Correct the errors (Quantum statements, online edit, file of corrections).
For example, we may write:
if (c224’5’) write
to write out all records in which column 24 of card 2 contains a 5. The records are written to the default print file, out2.
Incidentally, many of the statements mentioned in this section may be used for other purposes, rather than just to deal with errors.
Loops and routing
Quantum offers you many aids to efficient programming. Repetitive checks may be specified once with instructions to Quantum to repeat them a given number of times or until a certain condition is satisfied. The word associated with loops of this kind is do. There are two sorts of routing: you may either go to another edit statement (go to) or you may send the record straight on to the tabulation section (return).
Tabulation statements
Tabulation statements tell Quantum which tables are required and how to create them. They consist of a start letter or keyword to identify the type, and may be followed by other keywords, numbers or text. They are used to define rows and columns (elements), the variables that are to be crosstabulated (axes) and finally, the tables themselves. There are also statements for weighting your data and for creating tables by manipulating the contents of tables created previously in the current run or even in other runs.
3 Writing in the Quantum language
Writing in the Quantum language is very easy, but as with all computer languages it needs to be done with care and precision to obtain the required results.
3.1 The character set
The characters and symbols that you may use in Quantum are:
•
The 26 uppercase letters A to Z
•
The 26 lowercase letters a to z
•
The 10 digits 0 to 9
•
The space (blank) character
•
The special symbols + – / * . , ; : ’ $ = & ( ) { } ! | > # @ %
Some of these symbols have special meanings:
+     Addition sign or continuation of a long statement
–     Subtraction sign or the 11-punch
/     Slash sign for division, or an abbreviation for ‘through’ (i.e., 1/9 is 1 to 9 inclusive)
*     Asterisk for multiplication
,     Comma for separating column numbers in field specifications
;     Semicolon for separating statements on the same line
:     Colon for specifying ranges
’     Single quotes for single codes (always used in pairs)
$     Dollar signs for fields of codes (always used in pairs)
=     Equal sign for assignments
&     Ampersand (12) punch or end of g statement
()    Parentheses to enclose field specifications
{}    Braces to enclose vectors (i.e., lists of numbers)
!     Exclamation mark for splitting long words
|     Vertical bar to split words or create vertical lines in tables
#     Hash (pound) sign for identifying rows/tables for manipulation
@     ‘At’ sign for identifying rows/tables for manipulation
>     Greater than sign for identifying tables for manipulation from other runs
%     Percent sign for introducing options on col, val, fld and bit statements
Where symbols have two meanings, the meaning required will become clear in the context in which the symbol is employed.
3.2 Formatting your program
Quantum is a ‘free-format’ language which means that within reason you may enter your program however you like. Statements occupy columns 1 to 200 of successive lines and may be written in uppercase or lowercase or a combination. Thus:
IF (C132’1’) WRITE; REJECT
is exactly the same as:
if (c132’1’) write; reject
The exception to this is text in tables, where the text is printed on the tables in the same case as you write it in your Quantum program. Additionally, you must set up table text so that it fits on the paper when you print your tables. Therefore, if you want the table title to be printed on two lines, you must write it on two lines in your program.
Generally, spaces are allowed anywhere in a Quantum program except within Quantum keywords. Blank lines in a program are ignored.
As we mentioned earlier, Quantum has separate edit and tabulation sections which may or may not be in the same file. If your program contains an edit, it must precede the tabulation statements and must be enclosed by the words ed and end, each on a separate line, thus:
ed
.
edit statements
.
end
Errors will occur if either of these words is missing. If there is no edit, these statements are not needed.
3.3 Comments
Comment statements insert comments or information into the Quantum program. They do not affect the way your program works because they are ignored when the program is run to produce tables. Comments are identified either by a capital C in column 1 or by a slash and an asterisk in columns 1 and 2 respectively (/*). If a comment needs more than one line, each line must start with the appropriate notation; otherwise it will be assumed to require some sort of action.
/* This is the first comment
/* This is the second comment
It is a good idea to put comment statements in your program in case someone else has to take over your job or alternatively to remind yourself what you are doing and why. For example:
/* Edam is 1 if Edam mentioned at Q1, Q3 or Q6
if (c110’1’.or.c114’1’.or.c121’1’) edam=1
3.4 Continuation
Any Quantum statement may be continued over several lines by starting the second and subsequent lines with + or ++, depending on where the statement is split. A single plus sign is used when the statement is split between keywords. This assumes that a semicolon appears at the end of each continued line, whether or not there is actually one there. Take the statement:
if (c132’12’.and.t5.gt.50) write $t5 incorrect$; else; write ofil
This could be split in three places with a single plus sign for a continuation:
if (c132’12’.and.t5.gt.50)
+write $t5 incorrect$
+else
+write ofil
We have omitted the semicolons at the end of each line, but it would not be wrong to leave them in. The double-plus sign introduces an internal continuation of a long statement over several lines. Statements may be split between lexics; that is, between keywords, conditions, lists of numbers, and so on, but not in the middle of any of these. In our previous example, we could write:
if (c132’12’.and.
++t5.gt.50) write $t5 incorrect$; else; write ofil
A double plus is needed here because we have split an expression in which one parameter is dependent on the other. The statement on the first line means nothing on its own, neither does the second line, hence the ++. We could equally well have split the expression before the .and. or before or after the .gt.. To split it between t and 5, or in any other similar place, is incorrect because the two characters by themselves do not mean anything.
✎ There is no limit to the number of consecutive continuations of either type.
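The two kinds of continuation can be modelled in a few lines of Python. This is only a sketch of the rules as stated in this section, not a description of how the Quantum compiler is implemented, and the function name join_continuations is hypothetical. Plain ASCII quotes are used in place of the typeset ones.

```python
# Illustrative model of '+' and '++' continuation lines (not part of Quantum).
def join_continuations(lines):
    """Merge continuation lines into complete statements.

    '++' glues the text straight onto the previous line.
    '+'  starts a new sub-statement, so a semicolon is supplied
         if the previous line did not already end with one.
    """
    statements = []
    for line in lines:
        if line.startswith("++") and statements:
            statements[-1] += line[2:]
        elif line.startswith("+") and statements:
            previous = statements[-1].rstrip()
            if not previous.endswith(";"):
                previous += ";"          # the implied semicolon
            statements[-1] = previous + " " + line[1:]
        else:
            statements.append(line)      # an ordinary line starts a statement
    return statements

single_plus = join_continuations([
    "if (c132'12'.and.t5.gt.50)",
    "+write $t5 incorrect$",
    "+else",
    "+write ofil",
])
double_plus = join_continuations([
    "if (c132'12'.and.",
    "++t5.gt.50) write $t5 incorrect$; else; write ofil",
])
```

Taking the implied-semicolon rule literally places a semicolon after the if condition in single_plus, which the one-line form of the statement does not show. How Quantum itself treats a semicolon in that position is not modelled here.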
3.5 Dealing with possible syntax errors
Quick Reference
To have possible syntax errors (that is, ones which Quantum can process even though they are not quite perfect) treated as fatal, type:
check_
at the start of the edit.
To have possible syntax errors flagged but ignored, type:
nocheck_
at the start of the edit. This is the default.
When the Quantum compiler is checking your program and finds an error it flags the incorrect statement with an explanatory error message and continues with the next statement. If any of these errors are fatal — that is, Quantum cannot convert your statement into C code — the run will be terminated. Sometimes Quantum finds statements which are not quite correct, but which it can still convert into C. In these cases the compiler flags the statement with the message ‘Possible syntax error’ and continues as if nothing were wrong. You can choose to have this type of error treated as fatal and have the run terminated at the end of the compilation by entering the statement check_ (note the underscore at the end) at the start of your edit. The statement nocheck_ causes possible syntax errors to be flagged but ignored, and this is the default.
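The effect of check_ and nocheck_ on the outcome of a compilation can be summarized in a short Python sketch. The names FATAL, POSSIBLE and compilation_succeeds are hypothetical and exist only for this illustration; the sketch says nothing about how the compiler actually classifies statements.

```python
# Illustrative model of how check_ / nocheck_ affect a compilation.
FATAL = "fatal"
POSSIBLE = "possible syntax error"

def compilation_succeeds(severities, check=False):
    """Return True if the run may continue after compilation.

    severities: the kind of each flagged statement (FATAL or POSSIBLE).
    check:      True models check_; False models the default, nocheck_.
    """
    for severity in severities:
        if severity == FATAL:
            return False                 # fatal errors always end the run
        if severity == POSSIBLE and check:
            return False                 # check_ promotes these to fatal
    return True

default_result = compilation_succeeds([POSSIBLE])
strict_result = compilation_succeeds([POSSIBLE], check=True)
```

With the default setting a possible syntax error is flagged but the run continues; with check_ the same message ends the run at the end of the compilation.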
3.6 Printing error messages on the screen
Quick Reference
To have more or fewer than the default of 20 error messages displayed on your screen, type:
errprint n
before the edit and tabulation sections, where n is the number of messages you wish to see.
When the Quantum compiler finds errors in your program, it copies them to the compilation listing file. It also displays the first twenty messages on your screen. You may increase or decrease this number by placing the statement:
errprint n
at the top of your main program file, before the edit and tabulation sections. n is the number of messages you want to see on your screen: it must be an integer. Thus:
errprint 5
prints the first five error messages on the screen and in the listing file, and then any others only in the file.
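As a rough illustration of this behavior, the following Python sketch splits a list of messages into those shown on the screen and those written to the listing file. The function name report_errors and the sample messages are assumptions made for the example.

```python
# Illustrative model of errprint: every message is filed, the first n are shown.
def report_errors(messages, errprint=20):
    """Return (on_screen, in_listing) for a list of compiler messages."""
    if not isinstance(errprint, int):
        raise ValueError("errprint must be an integer")
    on_screen = messages[:errprint]   # only the first n reach the screen
    in_listing = list(messages)       # the listing file receives them all
    return on_screen, in_listing

messages = [f"error {i}" for i in range(1, 8)]     # seven made-up messages
screen, listing = report_errors(messages, errprint=5)
```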
4 Basic elements
There are three basic elements in Quantum:
•
Data constants.
•
Integer numbers.
•
Real numbers.
which are stored in variables:
Data variables       store    data constants
Integer variables    store    whole numbers
Real variables       store    real numbers
4.1 Data constants
Individual constants
Quick Reference
To refer to one or more codes in a single column, type:
’codes’
An individual constant is one or more of the codes 1234567890–& or blank. The – is sometimes referred to as the 11 or X punch, and & is sometimes called the 12, V or Y punch. Each code represents one answer to a question. For example, let’s take the question ‘What is your favorite color?’ which has the response list:
Red       1
Yellow    2
Blue      3
Green     4
Black     5
White     6
coded into one column. If my favorite color is green, this will appear in the data file as a 4 in the appropriate column, just as if your favorite color is red, there will be a 1 in that column.
To refer to these answers inside your Quantum program (maybe we only want our table to include those respondents whose favorite color is blue), type in the code enclosed in single quotes:
’3’
You will also have to tell Quantum which column to look in.
☞ To find out how to refer to columns, see ‘Data variables’ later in this chapter.
Several codes may be combined in the same column and are called multicodes. Throughout this manual when we talk of multicodes or multicoding we mean two or more codes in the same column. Suppose the next question asks me to choose three colors from the same list; I pick yellow, black and white. If these answers were all coded in the same column (a multicoded column), we would refer to them by typing:
’256’ or ’526’ or ’652’
or any other variation of those three codes. Quantum does not care what order you enter the codes in.
If you have a series of consecutive codes in the order &–01234567890–&, you may either type each code separately or you may enter the first and last codes separated by a slash (/) meaning ‘through’, as shown below:
’1/7’    means    ’1234567’
’&/4’    means    ’&–01234’
’&/9’    means    ’&–0123456789’ (all 12 codes)
’1/&’    means    ’1234567890–&’ (all 12 codes)
As you can see, the last two examples mean exactly the same thing. However, the notations ’0/&’ and ’0–&’ are not the same: ’0/&’ means ’01234567890–&’ whereas ’0–&’ is ’0’, ’–’ and ’&’ only.
Some combinations of codes represent ASCII characters; that is, they represent characters which you can type on your screen:
’&1’    is the equivalent of    ’A’
’&2’    is the equivalent of    ’B’
The only time you would use letters rather than codes (that is, ‘A’ rather than ‘&1’) is when the questionnaire tells you that a column should contain a letter.
☞ For further information, see appendix E, ‘ASCII to punch code conversion table’ in the Quantum User’s Guide Volume 4.
Sometimes we may need to write a notation for ‘no codes’ — for instance, if my favorite color does not appear in the list of choices. To do this, we write ’ ’ (that is, a blank enclosed in single quotes).
✎ The notation ’ ’ is a special case since blank is not really a code. If you type a blank inside single quotes with any other characters Quantum will follow its usual rule of ignoring spaces. This means that references of the form ’ 12 ’ are read as ’12’.
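The rules for individual constants (the slash meaning ‘through’, the irrelevance of code order, and the special treatment of blank) can be tried out with a small Python sketch. It is an illustration only: the helper names are hypothetical, a plain ASCII hyphen stands in for the 11-punch, and only a single first/last pair is modelled.

```python
# Illustrative helpers for the code notation (not part of Quantum).
# Assumption: a plain ASCII '-' stands for the 11-punch.
CODE_ORDER = "&-01234567890-&"

def normalize(codes):
    """Drop spaces, except that an all-blank constant means 'no codes'."""
    stripped = codes.replace(" ", "")
    return stripped if stripped else " "

def expand(codes):
    """Expand 'first/last' into the run of codes it stands for."""
    codes = normalize(codes)
    if "/" not in codes:
        return codes                      # a plain list of codes
    first, last = codes.split("/")
    if len(first) != 1 or len(last) != 1:
        raise ValueError("only a single 'first/last' pair is modelled here")
    start = CODE_ORDER.index(first)       # first place the start code occurs
    end = CODE_ORDER.index(last, start)   # next place the end code occurs
    return CODE_ORDER[start:end + 1]

def same_codes(a, b):
    """Multicodes match whatever order the codes are typed in."""
    return set(expand(a)) == set(expand(b))

examples = {text: expand(text) for text in ["1/7", "&/4", "&/9", "1/&", "0/&", "0-&"]}
```

For instance, same_codes("256", "652") is true, and expand("0/&") and expand("0-&") give different results, in line with the distinction drawn above.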
Strings of data constants
Quick Reference
To refer to a string of codes in a field of columns, type:
$codes$
The list of codes contains one code per column.
When data constants are single-coded or the multicodes correspond to ASCII characters (for example, ‘A’, ‘B’) they may be strung together. Strings of data constants are sometimes called literals or column fields. Strings are enclosed in dollar signs, with the component single codes losing their single quotes. For example:
$12345$
$ABC$
$916 7&$
The first string is five columns long with 1 in the first column, 2 in the second, 3 in the third, and so on. The third string is six columns wide with the fourth column being blank. Times when you might use strings are: •
When you want to refer to a questionnaire serial number.
•
When the answers to a question are represented by codes of more than 1 digit. For example, in a car ownership survey the car make and model owned may be represented by a 3-digit code. To pick up respondents owning a particular type of car you would need to check whether the relevant columns contained the code for that car. For instance, to look for owners of Ford Escorts you might ask Quantum to search for the string $132$ in a particular field of columns.
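Comparing a field of columns with a string of codes can be pictured as follows. This Python sketch assumes a single-coded, fixed-width record, and the record layout shown (serial number in columns 1 to 5, car code in columns 6 to 8) is invented for the example; the function names are likewise hypothetical.

```python
# Illustrative only: testing a field of columns against a string of codes.
def field(record, start, end):
    """Return columns start..end of a record (1-based, inclusive)."""
    return record[start - 1:end]

def field_is(record, start, end, literal):
    """True if the field holds exactly the codes in literal, one per column."""
    if len(literal) != end - start + 1:
        raise ValueError("literal must supply one code per column")
    return field(record, start, end) == literal

# Hypothetical layout: serial number in columns 1-5, car code in columns 6-8.
record = "00417" + "132" + "  "
serial = field(record, 1, 5)
owns_escort = field_is(record, 6, 8, "132")
```

The length check reflects the rule that a string contains one code per column.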
4.2 Numbers
Quantum can print figures in tables with up to ten characters; figures that require more than ten characters are printed as asterisks. For example, 12345678.12 appears as 12345678.1 when displayed with one decimal place, but as asterisks (*) when displayed with two decimal places. However, you can use the scale= option to apply a scaling factor before printing.
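The ten-character limit can be illustrated with a short Python sketch. The function name table_figure is hypothetical, and the number of asterisks printed is an assumption (the sketch simply fills the ten-character width); the scale= option is not modelled.

```python
# Illustrative model of the ten-character limit on printed figures.
def table_figure(value, decimals, width=10):
    """Format value with the given decimal places, or asterisks if too wide."""
    text = f"{value:.{decimals}f}"
    # Assumption: an over-wide figure is shown as `width` asterisks.
    return text if len(text) <= width else "*" * width

one_place = table_figure(12345678.12, 1)
two_places = table_figure(12345678.12, 2)
```

Note that the decimal point counts as one of the ten characters, which is why the same number fits with one decimal place but not with two.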
Whole numbers
Quantum can deal with whole numbers in the range −1,073,741,824 to +1,073,741,823 with an accuracy of up to six significant figures. Numbers with more than six significant figures are rounded up or down depending on the value of the remaining figures.
☞ For some examples of how Quantum rounds figures up and down, see ‘Real numbers’ later in this chapter.
Your data will contain whole numbers whenever there are questions requiring numeric responses: for example, the question ‘How many children do you have?’ can only be answered with a whole number. If the respondent has three children, the number 3 will appear in the appropriate column in his or her data record, whereas a respondent with five children will have a 5 in that column instead. Whole numbers are also used if you want to perform arithmetic calculations during the run, for instance to multiply a field by a number.
☞ For further information on arithmetic in Quantum, see chapter 5, ‘Expressions’.
Real numbers
Real numbers are numbers containing decimal points. To be valid, they must have at least one digit on either side of the decimal point:
0.1 and 1.0 are correct
.1 and 1. are not
Quantum deals with real numbers of any size with accuracy up to six significant figures. Numbers with more than six significant figures have the sixth figure rounded up or down depending on the value of the remaining figures.
Here are some examples of rounding:
96.82529      is rounded to    96.8253
189462.1      is rounded to    189462.0
123456.5      is rounded to    123457.0
123456.444    is rounded to    123456.0
123456.445    is rounded to    123456.0
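The rounding examples above can be reproduced with a short Python sketch that rounds to six significant figures. It is an approximation under stated assumptions: it uses decimal arithmetic and rounds exact halves upward (suggested by the 123456.5 example), whereas Quantum itself works in binary single precision, so results for other borderline values may differ. The function name round_significant is hypothetical.

```python
# Illustrative rounding to six significant figures (decimal arithmetic,
# halves rounded up). Quantum's own single-precision results may differ.
from decimal import Decimal, ROUND_HALF_UP

def round_significant(text, figures=6):
    """Round the number written in text to the given significant figures."""
    number = Decimal(text)
    if number == 0:
        return number
    # Exponent of the last significant digit that is kept.
    exponent = number.adjusted() - (figures - 1)
    unit = Decimal(1).scaleb(exponent)
    return number.quantize(unit, rounding=ROUND_HALF_UP)

examples = ["96.82529", "189462.1", "123456.5", "123456.444", "123456.445"]
rounded = [round_significant(x) for x in examples]
```

Passing the numbers as text avoids introducing binary floating-point error before the rounding step is applied.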
By default, Quantum calculates cell values in single precision. However, when working with very large numbers, you can produce more accurate results by using the double precision option (dp) on the a statement.
☞ For further details on double precision, see chapter 2, ‘The hierarchy of the tabulation section’ in the Quantum User’s Guide Volume 2.
4.3 Variables and arrays
There are three types of variable — data, integer and real — each used for storing different types of information. You may create your own variables with names representing the type of information stored (for example, the variable called meals might contain a count of the number of meals eaten during the day) or you may use the ones offered automatically by Quantum.
Sometimes it is useful for a group of variables to have the same name. Each variable can then be addressed by its position in the group. This arrangement is known as an array. Arrays are discussed further in the following sections.
✎ There is an additional type of variable, called a data-mapped, or mapvar, variable. These variables can be used to store the answers to questions, both numerical and categorical, and are typically used in conjunction with one or more data-mapping files. This enables Quantum specs to be written without needing column and code information. Currently, only the qdi mapping file format is supported. Quancept CATI, Quancept CAPI and Quancept Web all produce a qdi file that can be used as a data-mapping file.
☞ For details of data-mapped variables, see chapter 15, ‘Data-mapped variables’.
Data variables
Quick Reference
To refer to a single data variable in the C array, type:
cnumber
To refer to a field of data variables in the C array, type:
c(start_pos,end_pos)
To define a data variable, type:
data var_name sizes
before the edit section. To refer to it, use the same notation as above but replace the c with the variable’s name.
At the start of every job, Quantum provides you with an array of 1,000 data cells called C. This array is sometimes referred to as the C matrix. The individual cells are called C-variables. Each C-variable stores one ‘column’ of data. Quantum reads data from your data file into this array: we will discuss exactly how it does this in chapter 6, ‘How Quantum reads data’. For the time being, let’s say we have a very small questionnaire which uses 43 columns to store the data. Quantum will read the data for each respondent into cells 1 to 43 of the C array, one respondent at a time. The codes from column 1 of the data are copied into cell 1 of the C array, the codes from column 2 of the data are copied into cell 2, and so on. When Quantum has finished with that respondent’s data it clears out the cells in the C array and reads the data for the next respondent, placing it in cells 1 to 43 of the array.
We can access this data by defining the columns whose contents we wish to inspect or change. Let’s take the questions about color that we mentioned earlier. The printed questionnaire tells us that the respondent’s favorite color will be coded into column 15. To look at this column we would write:
c15 or c(15)
The C may be in uppercase or lowercase, and the parentheses around the column number are optional. To refer to column 43 we would write:
c43 or c(43)
Now suppose we want to look at a field of columns such as the questionnaire serial number in columns 1 to 5. All we have to do is tell Quantum that the serial number is in a field starting in column 1 and ending in column 5, as follows:
c(1,5)
Here the parentheses around the column numbers are obligatory.
C variables are reset to blank before a new respondent’s data is read. Thus, you can be certain that Quantum never muddles the contents of column 10 for the first respondent with those of c10 for the second respondent.
As we mentioned above, you may create your own data variables to store specific pieces of data. For instance, in a shopping survey we may want to store data about visits to Sainsburys in an array called ‘sains’ and data about visits to Safeways in an array called ‘safe’. Before we can use these arrays, we must create them. If each array is to contain 100 cells or columns of data, we would write:
data sains 100s
data safe 100s
before the edit section. The s at the end of each statement causes Quantum to recognize that, for example, safe1 is the same as safe(1), just as it knows that c15 and c(15) refer to the same column of data. If you created the arrays without the s, then Quantum would not recognize safe1 as being the same as safe(1).
Data variables which you create remain blank until you copy data into them. If the data about visits to Sainsburys is stored in columns 30 to 45, then we might copy this into cells 30 to 45 of the array called sains. If we then want to use this data we can write statements which refer to sains30 to sains45. Unless you subsequently change the data in sains(30,45), each time you refer to one of those cells it is exactly the same as referring to c30, c45, and so on, in the C array, and to columns 30, 45, and so on, in the data file.
In this simple example, there is not much to be gained (apart from an immediate improvement in readability) by using your own data variables. However, when you have many columns of data per respondent, or a complicated Quantum program, named data variables can be very useful for improving readability and also for providing simple yet powerful facilities for data manipulation. Here are some further examples:
c80
means column 80 of the C array
c(130,145)
means columns 130 to 145 inclusive of the C array
total(17)
means cell 17 of the array called total (parentheses are optional)
visits(134,136)
means cells 134, 135 and 136 of the array called visits
c(1,80)
means columns 1 to 80 inclusive of the C array
☞ To find out more about creating and using named data variables, see chapter 14, ‘Creating new variables’.
Integer variables Quick Reference To define an integer variable, type: int var_name sizes To refer to an integer variable, type: name[cell_number]
Integer variables store whole numbers. Strings of integer variables are called integer arrays, and each cell in the array may store any whole number from −1,073,741,824 to +1,073,741,823. At the start of each run, Quantum provides an array of 200 integer variables called T. The first cell in this array is the integer variable t1 which may store any value within the given range; the second cell in the array is the integer variable called t2 which may also store any value within the given range.
To illustrate the difference between a data variable and an integer variable, let’s suppose that our data contains the value of the respondent’s car to the nearest whole pound. If the value is £6,000, this will take up 4 columns in the data (assuming that we are only concerned with the digits); that is, four data variables, the first of which will contain the 6, and the other three of which will all contain zeroes. If we placed this same value in an integer variable, we would only need one variable to store the whole value because each variable can store any value in the range given above.
We have already mentioned that Quantum provides an integer array of 200 integer variables. You may create your own arrays using statements similar to those shown above for data variables. Suppose you have a household survey in which you have collected the value of each car that the family owns. You want to set up an integer array in which to store each value, so you write:
int carval 10s
This creates an array called carval which contains ten separate integer variables called carval1 to carval10. Notice that we have followed the array size with the letter s so that we can omit the parentheses from the individual variable names. We can then copy the value of the first car into carval1, the value of the second car into carval2, and so on. If a particular household owns three cars valued at £6,000, £2,500 and £500, then carval1 would have a value of 6,000, carval2 would be 2,500 and carval3 would be 500. If you create your own integer variables, it is recommended that you give them names that reflect their purpose in the run, as we have done in our example.
☞ To find out more about creating and using named integer variables, see chapter 14, ‘Creating new variables’.
All integer variables have a value of zero at the start of a run, and they are not reset between respondents. If you want your integer variables to store information about the current record only, you must include statements in the edit to reset those variables to zero when a new record is read. For example, we might write: carval1 = 0
at the start of the edit to reset the first integer variable of the carval array to zero.
✎ You can also reset an integer variable to zero by using a clear statement.
☞ For further information about the clear statement, see section 8.7, ‘Clearing variables’.
T-variables with non-zero values are printed out at the end of the run.
Real variables Quick Reference To define a real variable, type: real var_name sizes To refer to a real variable, type: name[cell_number]
You may define real variables and arrays to store real numbers with accuracy up to six significant figures. Values with more than six significant figures have the sixth figure rounded up or down according to the value of the extra figures.
☞ For further information about real values, see ‘Real numbers’ earlier in this chapter. As with integer variables, the names of real variables should give some clue to the type of information they contain. Real arrays are created by statements of the form: real liters 5s
This example creates a real array called liters which has five real variables named liters1 to liters5. It can store five real values, the first in liters1 and the fifth in liters5.
☞ To find out more about creating and using named real variables, see chapter 14, ‘Creating new variables’.
Quantum also provides a set of 100 real variables named X which you may use.
✎ All real variables start with a value of 0.0 and are not reset to zero between respondents. As an example, let’s say that the data contains information on how long, on average, each person in the household spent watching television during a given week. We want to manipulate these figures so we create an array of real variables in which to store the average viewing figures: real tvwatch 8s
This provides room for up to eight people’s figures. If our household contains four people with viewing averages of 20.8 hours, 15.75 hours, 9.75 hours and 10.0 hours, then tvwatch1 will have a value of 20.8, tvwatch2 will have a value of 15.75, tvwatch3 will be 9.75 and tvwatch4 will be 10.0 hours. The rest of the variables in the array have values of 0.0. Real variables with non-zero values at the end of the run are not printed out automatically. If you want to see these values, you will need to write them using a report statement.
☞ For further information about report, see section 7.3, ‘Writing to a report file’.
Reading real numbers from columns Quick Reference To read real values from the C array, type: cx(start_col,end_col)
As we have already said, data from the questionnaire is read into columns for use during the run. When the data contains real numbers you will have to tell Quantum that the dot is to be treated as a decimal point rather than as a multicode representing a number of different answers. The way to do this is to refer to the field as cx: cx(15,20)
cx(131,135)
Here we have two fields containing real numbers: the first is six columns wide including the decimal point, which means that the number itself contains five digits, whereas the second is only five columns wide with four digits. Notice that there is no need to tell Quantum where the decimal point is.
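The idea can be modelled outside Quantum. The following Python sketch is not Quantum code: it assumes, purely for illustration, that a field is held as a list with one character per column, and the function name cx_value is hypothetical. It shows only that the position of the dot within the field determines the value, so nothing else needs to be declared.

```python
def cx_value(columns):
    """Read a field of single-character columns as a real number.

    `columns` is a list with one character per column; this layout is an
    assumption made for the sketch, not Quantum's internal format.
    """
    text = "".join(columns).strip()
    return float(text)   # the dot is taken as the decimal point


# A six-column field such as cx(15,20): five digits plus the dot
print(cx_value(list("123.45")))    # 123.45
# A five-column field such as cx(131,135): four digits plus the dot
print(cx_value(list("12.75")))     # 12.75
```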
4.4 Subscription As we have shown above, you may refer to specific variables in integer and real arrays and cells or columns in data arrays by naming their position in the array. For example: c1
is the first column of the C array
t5
is the fifth variable in the T array
time3
is the third variable in the array called time
seg(2)
is the second variable in the array called seg
Variables within an array may also be referred to using any arithmetic expression. In this case, parentheses must be used. For example: c(t1)
The column number depends on the value of t1. If t1 has a value of 10, then the variable is c10; if t1 is 67, the variable is c67.
c(t4,t5)
The field delimiters depend on the values of t4 and t5. If t4 has a value of 12 and t5 has a value of 19, the column field referred to is c(12,19).
t(c4)
The variable number depends on the value in c4. If c4 contains a single code in the range 1 to 9, the integer variable will be one of t1 to t9 depending on the exact value in c4. If c4 is multicoded, then the result is nonsense.
time(c4*23)
The variable number is the result of multiplying the value in c4 by 23. As in the previous example, c4 must be single-coded in the range 1 to 9 for this example to make sense. Thus, if c4 contains just a 4, the value of the expression is 92 so the variable referred to is time92.
Variables referenced in this way are called subscripted variables, and they greatly increase the flexibility with which you can write your edit. The value of the subscript expression must be positive. The expression c(t1-5) is acceptable as long as t1 is at least 6. If the expression has a zero or negative value, Quantum will issue an array dimension error when it reads the data during the datapass. Also, if the variable refers to columns, the value of the subscript must not exceed 32,767.
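To make the subscript rules concrete, here is a small Python model. It is a sketch only: the class QArray and its methods are invented for illustration, and the error text is not Quantum's actual message. Cells are numbered from 1, and any subscript that evaluates to zero, a negative number, or a value beyond the end of the array is rejected.

```python
class QArray:
    """Tiny 1-based array used only to illustrate subscripting."""

    def __init__(self, size):
        self.size = size
        self.cells = [0] * size

    def _index(self, subscript):
        # A subscript must be positive and inside the array.
        if subscript < 1 or subscript > self.size:
            raise IndexError(f"array dimension error: subscript {subscript}")
        return subscript - 1

    def get(self, subscript):
        return self.cells[self._index(subscript)]

    def set(self, subscript, value):
        self.cells[self._index(subscript)] = value


t = QArray(200)           # stands in for the T array
time = QArray(100)        # stands in for a user array called time
t.set(1, 10)
c4 = 4                    # a single code read from column 4

time.set(c4 * 23, 7)      # like time(c4*23): refers to cell 92
print(time.get(92))       # 7

try:
    t.get(t.get(1) - 10)  # the subscript evaluates to 0
except IndexError as err:
    print(err)            # array dimension error: subscript 0
```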
✎ Subscription may be used in repetitive processes to save you writing the same thing over and over again.
☞ For an example, see section 9.5, ‘Loops’.
5 Expressions Quantum recognizes two types of expression — arithmetic and logical. Arithmetic expressions are used to produce numeric values and logical expressions, when evaluated, produce a value of true or false.
5.1 Arithmetic expressions
The simplest form of arithmetic expression is a single positive or negative number such as 10 or -26.5 or an integer or real variable. Although the C array is data, columns may also be used in arithmetic when the response coded into those columns is a numeric response, such as a respondent’s age or the number of different shops he or she visited. For example, if columns 243 to 247 contain the codes 4,7,2,6 and 0 respectively the value in c(243,247) could be read as 47,260. Similarly, if columns 45 to 48 contain 7, 8, a dot and 2 respectively, the value in cx(45,48) would be 78.2.
Blank columns in a field are ignored when the codes in those columns are evaluated. Thus, if columns 20 to 21 contain the codes 6 and 7 respectively, and column 22 is blank, the codes in c(20,22) will be evaluated as 67. A similar result is produced if the blank column appears anywhere else in the field. All the examples of c(20,22) below produce an arithmetic value of 67:
+----2----+
     67
+----2----+
      67
+----2----+
     6 7
The same applies to multicoded columns. If you use a multicoded column as part of an arithmetic expression, the multicoded column will be ignored. The exception to this is a multicode of a digit and a minus sign which creates a negative number: a minus sign anywhere in a numeric field negates the value in the field as a whole, not just the number it is multicoded with. For example:
----+----1----+----2
5 3778
9 0
is 5378
2---+----3----+----4
12-4
3
is -1234
4---+----5----+----6
83-
is -83
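The rules for reading a field as a number can be summarised in a short Python model. This is a sketch, not Quantum code. It assumes each column is given as a string of the codes it contains (an empty string for a blank column), that a minus sign multicoded with more than one digit still negates the field while the column itself is skipped, and that a field with no usable digits evaluates to 0. None of these assumptions is confirmed by the text above, and the function name field_value is hypothetical.

```python
def field_value(columns):
    """Arithmetic value of a field given as one code string per column."""
    digits = []
    negative = False
    for codes in columns:
        codes = set(codes.replace(" ", ""))
        if "-" in codes:
            negative = True          # a minus anywhere negates the field
            codes.discard("-")
        if len(codes) == 1:
            (code,) = codes
            if code.isdigit():
                digits.append(code)
        # blank columns and other multicoded columns contribute nothing
    value = int("".join(digits)) if digits else 0
    return -value if negative else value


print(field_value(["6", "7", ""]))               # 67
print(field_value(["", "6", "7"]))               # 67
print(field_value(["6", "", "7"]))               # 67
print(field_value(["1", "2", "3-", "4"]))        # -1234
print(field_value(["8", "3", "-"]))              # -83
print(field_value(["5", "3", "79", "7", "8"]))   # 5378 (the '79' column is skipped)
```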
Combining arithmetic expressions
Quick Reference
To combine arithmetic expressions, type: variable operator variable [operator variable ... ]
where variable is a numeric value or the name of a variable containing a numeric value, and operator is one of the arithmetic operators +, -, * (multiply) or / (divide).
More often than not, you will want to combine numeric expressions to form a larger expression, for instance to count the number of records read with a given code in a named column. Arithmetic expressions are linked with any of the arithmetic operators listed below:
+ (addition)
* (multiplication)
- (subtraction)
/ (division)
Expressions may contain more than one of these operators, for instance:
t5 + c(134,136) / otot
c(150,152) * 10 + 2.5
Quantum evaluates such expressions in the following order:
1. Expressions in parentheses
2. Multiplication and division
3. Addition and subtraction
If you wish to change this order you should enclose the expressions which go together in parentheses. The first expression in the example above will be evaluated by dividing the value in columns 134 to 136 by otot and adding the result to t5. If you change the expression to:
(t5 + c(134,136)) / otot
this adds the values of t5 and c(134,136) first and then divides that by otot. Let’s substitute numbers and compare the results. If t5=10, otot=5 and the value in c(134,136) is 125, the two versions of the expression would read as follows: 10 + 125 / 5 = 35
and
(10 + 125) / 5 = 27
Where two integer expressions are combined, the result is integer (any decimal places are ignored), but if an expression contains a real then the result will be real. Therefore, if t1=5 and t2=3, then:
t1 + 4         = 9
t1 + 4.0       = 9.0
t1 * t2        = 15
t1 / t2        = 1
t1 * 1.0       = 5.0
t1 * 1.0 / t2  = 1.66667
If you use parentheses in expressions which contain both integer and real variables, you need to take extra care to ensure that your expression is producing the correct results. Let’s look at an example to illustrate how an expression can look correct but can still produce unexpected results. If we assume that t40=2 and t41=70, the expression: t40 * 100.0 / t41
yields a result of 2.85714 (that is, 200.0/70). The final value will be 2.85714 if the result is saved in a real variable, or 2 if it is saved in an integer variable. If we use parentheses: (t40 / t41) * 100.0
the result is 0.0 (or 0 if saved in an integer variable). The reason for this is as follows. Because Quantum evaluates expressions in parentheses before it deals with the rest of the expression, it treats that expression as integer arithmetic. The rules for integer arithmetic dictate that real results are truncated at the decimal point, so the true result of 0.0285714 becomes 0. Any multiplication involving zero is always zero, so the final result is zero. If you find that a run gives unexpected zero results, try looking for expressions of this type and checking whether the parenthesized part of the expression has been truncated because the integer division results in a decimal number.
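The effect of integer truncation can be reproduced in a few lines of Python. This is a model of the behaviour described above, not Quantum code; the helper q_div is hypothetical. It drops the decimal places only when both operands are integers, which is how the text describes integer arithmetic.

```python
def q_div(a, b):
    """Divide as the text describes: two integers give a truncated integer."""
    if isinstance(a, int) and isinstance(b, int):
        return int(a / b)      # any decimal places are dropped
    return a / b               # a real operand gives a real result


t40, t41 = 2, 70

# t40 * 100.0 / t41: the multiplication makes the value real first
print(round(q_div(t40 * 100.0, t41), 5))   # 2.85714

# (t40 / t41) * 100.0: the parenthesized part is integer arithmetic
print(q_div(t40, t41) * 100.0)             # 0.0

# t1 / t2 and t1 * 1.0 / t2 with t1=5 and t2=3
print(q_div(5, 3))                         # 1
print(round(q_div(5 * 1.0, 3), 5))         # 1.66667
```

Multiplying by a real constant before dividing, as in the first form, is the simplest way to keep the decimal places.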
Counting the number of codes in a column Quick Reference To count the number of codes in a column or list of columns, type: numb(cn1[’codes’], cn2[’codes’], ... ) If any columns are followed by a code reference, only those codes will be counted for those columns.
The function numb is an arithmetic expression which counts the number of codes in a column or list of columns. Its format is: numb(cn1,cn2, ... cnn) where cn1 to cnn are the columns whose codes are to be counted. So, if we wanted to count the number of codes in columns 132 to 135 we would type: numb(c132,c133,c134,c135)
Notice that even though the columns are consecutive, each one is entered separately, with each column number preceded by a ‘c’. It is incorrect to define only the start and end columns of a field when using numb. Therefore it is wrong to write numb(c(132,135)) and, if you write a statement such as this, Quantum will flag it as an error.
Sometimes you will only be interested in certain codes, for instance you may want to know how many 1, 2 or 3 codes there are in a group of columns. In this case the function is entered as: numb(cn’p1’,cn’p2’, ... cnn’pn’) where p1 to pn are the codes to be counted. Only the named codes are counted; any others appearing in the columns are ignored.
Let’s say that, in our data on card 1, column 115 contains the codes 1, 6, 8 and 9, column 121 contains the codes 2 to 6, and column 157 contains the codes 1 to 7,
and we want to count the number of codes in column 115 and also the number of codes in the range ‘5/8’ in columns 121 and 157. The expression would be entered as: numb(c115,c121’5/8’,c157’5/8’)
When Quantum checks these columns and codes, it will tell us that there are 9 codes in these columns which are within the given ranges. These codes are all four codes in column 115 (we did not specify which codes to count in that column), codes 5 and 6 in column 121 (codes 2 to 4 are outside the given range), and codes 5 to 7 in column 157 (codes 1 to 4 are outside the given range).
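The count of 9 can be checked with a Python model of numb. This is a sketch rather than Quantum code: each column is represented as a set of code characters, the helper expand is hypothetical, and the code order 1 to 9, 0, -, & used to expand ranges such as ‘5/8’ is an assumption.

```python
CODE_ORDER = "1234567890-&"   # assumed code order for expanding ranges


def expand(spec):
    """Expand a code string such as '5/8' or '1/46' into a set of codes."""
    codes = set()
    i = 0
    while i < len(spec):
        if i + 2 < len(spec) and spec[i + 1] == "/":
            lo = CODE_ORDER.index(spec[i])
            hi = CODE_ORDER.index(spec[i + 2])
            codes.update(CODE_ORDER[lo:hi + 1])
            i += 3
        else:
            codes.add(spec[i])
            i += 1
    return codes


def numb(*items):
    """Count codes; an item is a column or a (column, codes) pair."""
    total = 0
    for item in items:
        column, wanted = item if isinstance(item, tuple) else (item, None)
        total += len(column if wanted is None else column & expand(wanted))
    return total


c115 = set("1689")      # four codes
c121 = expand("2/6")    # codes 2 to 6
c157 = expand("1/7")    # codes 1 to 7

# Models numb(c115,c121'5/8',c157'5/8'): 4 + 2 + 3
print(numb(c115, (c121, "5/8"), (c157, "5/8")))   # 9
```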
Generating a random number Quick Reference To generate a random number in the range 1 to n, type: random(n) in the edit section.
Quantum can generate random numbers automatically with the random function: random(n) where n is the maximum value the random number may take. So, to generate a random number in the range 1 to 100, the expression would read: random(100)
The number produced may be saved for later use in an integer variable or column, thus:
rnum=random(32)
c(110,112)=random(156)
When using random with columns, always make sure that the number of columns allocated to the number is sufficient to store the highest possible number that can be generated. In our example, we need three columns in order to store numbers up to 156.
✎ random generates a different random value each time it is run, even on reruns of the same job. If you want to retain the same set of random values between runs, copy them into the data the first time you run the job.
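As an illustration of the column-width point, the following Python sketch models random(n) with the standard library and checks whether the largest possible value fits in a given number of columns. The function names q_random and fits are hypothetical, and Python's generator is only a stand-in for whatever Quantum uses internally.

```python
import random


def q_random(n):
    """Return a whole number from 1 to n inclusive, like random(n)."""
    return random.randint(1, n)


def fits(n, width):
    """True if the largest possible value n fits in `width` columns."""
    return len(str(n)) <= width


rnum = q_random(32)
print(1 <= rnum <= 32)    # True

print(fits(156, 3))       # True: c(110,112) is wide enough for random(156)
print(fits(156, 2))       # False: two columns could not hold 156
```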
5.2 Logical expressions Logical expressions are used for comparing values, codes and variables.
Comparing values
Quick Reference
To compare the values of two arithmetic expressions, type: arith_exp log_operator arith_exp
where log_operator is one of the operators .eq., .gt., .ge., .lt., .le. or .ne.
Values are compared when you need to check whether an expression has a given value; for example, did the respondent buy more than 10 pints of milk? Values are compared by placing arithmetic expressions on either side of one of the following operators:
.eq.  Equal to
.gt.  Greater than
.ge.  Greater than or equal to
.lt.  Less than
.le.  Less than or equal to
.ne.  Not equal to / unequal to
If the number of pints of milk that the respondent bought is stored in columns 114 and 115, the expression to check whether he bought more than ten pints would be: c(114,115) .gt. 10
If the number in these columns is greater than ten the expression is true, otherwise it is false. In chapter 4, ‘Basic elements’, we said that integer variables may take numeric values or the logical values true and false depending upon whether or not the value is zero. To check whether the respondent bought any packets of frozen vegetables, we can either write: fveg .gt. 0
to check the numeric value of the variable fveg, or we can simply say: fveg
to check whether the logical value of fveg is true. To check whether fveg is false (that is, zero), we would write: .not. fveg
☞ For further information about .not., see ‘Combining logical expressions’ later in this chapter.
Comparing data variables and data constants In virtually every Quantum run you will want to check which codes occur in which columns. This is easily done using logical expressions. There are several forms of expression depending on whether you are checking a column or a field of columns.
Data variables
Quick Reference
To test whether a data variable contains at least one of a list of codes, type: var_name’codes’
To test whether a data variable contains none of the listed codes, type: var_namen’codes’
To test whether a data variable contains exactly the given codes and nothing else, type: var_name = ’codes’
To test whether a data variable contains exactly the given letter and nothing else, type: var_name = ’letter’
To test whether two data variables contain identical codes, type: var_name1 = var_name2
To test whether a data variable contains codes other than those listed, type: var_nameu’codes’
To test whether two data variables do not contain identical codes, type: var_name1uvar_name2
To check whether a column or data variable contains certain codes, place the codes, enclosed in single quotes, immediately after the name of the column or data variable. For example: c1’1’
c156’23’
brand’5’
The expression: Cn’p’
checks whether a column (n) contains a certain code or codes (p). The expression is true as long as column n contains at least one of the given codes. It does not matter if there are other codes present since these are ignored. For example, to check whether column 6 contains any of the codes 1 through 4 we would type: c6’1/4’
The expression is true if c6 contains any of the codes 1, 2, 3 or 4 or any combination of those codes, regardless of what other codes may also be present. For instance, the expression is true when column 6 is multicoded ‘168&’, ‘1234’ or ‘130’, but it is false when column 6 is multicoded ‘579-’.
In our original example we chose the codes 1 through 4. You can, of course, use any codes you like and they may be entered in any order. The opposite of cn’p’ is:
cnN’p’
which checks that a column does not contain the given code or codes. The expression is true as long as the column does not contain any of the listed codes.
For example: c478n’5/7&’
is true as long as column 478 does not contain a 5, 6, 7 or & or any combination of them. A multicode of ‘189’ returns the logical value true, because it does not contain any of the codes ‘5/7&’ whereas a multicode of ‘1589’ makes the expression false because it contains a ‘5’. The ‘=’ operator is used to check that the contents of a column are identical to either the given codes or the given letters. The expression: c312=’1/46’
is true as long as c312 contains all of the codes 1 through 4 and 6, and nothing else. The expression: c142=’ ’
checks that column 142 is blank. The equals sign is optional when checking for blanks, so we could simply write: c142’ ’
to check whether column 142 is blank. The expression: c124=’A’
checks that column 124 contains the letter A and nothing else. The ‘=’ operator may also be used to compare the contents of two data variables. For example: c56=c79
checks whether c56 contains exactly the same codes as c79. If so, the expression is true, otherwise it is false. If columns 56 and 79 are both multicoded ‘15’, the expression is true, but if column 56 is multicoded ‘15’ and column 79 is multicoded ‘159’, the expression yields the value false because column 79 contains a ‘9’ when column 56 does not.
If you have defined your own data variables, you could write a statement of the form:
brand1=c79
to check whether the data variable called brand1 contains the same codes as c79. The opposite of ‘=’ is ‘U’ (unequal): cnU’p’
This checks whether column n contains something other than just the code ‘p’. Suppose we have two sets of data, one in which column 34 is multicoded ‘147’ and one in which column 34 is multicoded ‘159’, and we write:
c34u’7’
The expression is true for both sets of data. In the first example, the ‘7’ is multicoded with a ‘1’ and a ‘4’, while in the second example, column 34 does not contain a ‘7’ at all. The only time this expression is false is when column 34 contains a ‘7’ and nothing else.
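The four column tests differ only in how they compare two sets of codes, which a short Python model makes explicit. This is a sketch, not Quantum code: a column is represented as a string of the codes it contains, code ranges such as ‘1/4’ are written out in full as ‘1234’, and the function names are hypothetical.

```python
def has(column, codes):          # models cn'p'
    """True if the column contains at least one of the codes."""
    return bool(set(column) & set(codes))


def has_none(column, codes):     # models cnN'p'
    """True if the column contains none of the codes."""
    return not has(column, codes)


def equals(column, codes):       # models cn='p'
    """True if the column contains exactly these codes and nothing else."""
    return set(column) == set(codes)


def unequal(column, codes):      # models cnU'p'
    """True unless the column contains exactly these codes."""
    return not equals(column, codes)


print(has("168&", "1234"))        # True: the 1 is enough
print(has("579-", "1234"))        # False
print(has_none("189", "567&"))    # True
print(has_none("1589", "567&"))   # False: the column contains a 5
print(equals("12346", "12346"))   # True
print(unequal("147", "7"))        # True: 7 is multicoded with 1 and 4
print(unequal("159", "7"))        # True: no 7 at all
print(unequal("7", "7"))          # False: a 7 and nothing else
```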
Fields of data variables Quick Reference To test whether a field contains a given list of codes, type: var_name(start,end) = $codes$ To test whether a field contains a given list of letters, type: var_name(start,end) = $letters$ To test whether two fields contain identical strings, type: var_name1(start1,end1) = var_name2(start2,end2) To test whether the codes in one field differ from a given string, type: var_name(start,end)u$codes$ To test whether the codes in one field differ from those in another, type: var_name1(start1,end1)uvar_name2(start2,end2)
The contents of data fields must be enclosed in dollar signs with each code in the string referring to a separate column in the field. For instance, to check whether columns 47 to 50 contain the codes -, 6, 4 and 9 respectively we would type: c(47,50)=$-649$
The only data for which this expression is true is: +----5----+ -649
However, if our data read: +----5----+ -529 164&
the expression would be false because all columns are multicoded.
Just as you can test whether a field contains a given list of codes, you can also check whether a field contains a given list of letters. For example, to check whether columns 55 to 57 contain the string AAA, we would type: c(55,57)=$AAA$
The only data for which this expression is true is: +----5-----+ AAA
All our examples have used columns, but the same rules apply to data variables that you define yourself. For example: rating(1,4)=$1234$
checks whether the field rating1 to rating4 contains the codes 1, 2, 3 and 4 in that order. That is, it checks whether rating1 contains a 1, whether rating2 contains a 2, and so on. When checking the contents of fields in this way, make sure that you enter as many columns as there are codes in the string (that is, five codes require five columns). The exception to this rule occurs when you are checking for blanks when the expression may be shortened to: c(50,80)=$ $
This type of statement may also be used to compare two fields, to check whether the second field contains exactly the same codes as the first field. When you compare one field with another, Quantum takes each column in the first field in turn and looks to see whether the corresponding column in the second field contains exactly the same codes. For example, if the first column of the first field contains a code 1 and a code 2 and nothing else, then Quantum will check whether the first column of the second field also contains a code 1 and a code 2 and nothing else. If all columns of the second field are identical to their counterparts in the first field, then the expression is true; otherwise it is false. Here is an example: c(129,132)=c(356,359)
For this expression to be true, column 129 must contain exactly the same codes as column 356, column 130 must be exactly the same as column 357, and so on. Once again, the two expressions on either side of the equals sign must be the same length.
✎ Comparisons of one data variable against another are concerned with columns and codes: they are not concerned with the arithmetic values of the codes in the fields as a whole. If columns 24 and 25 contain 02, and columns 34 and 35 contain a blank followed by a 2, the expression:
c(24,25)=c(34,35)
is false because the string $02$ is not the same as the string $ 2$. If you want to compare fields arithmetically (for example, is 02 the same as 2) then you will need to use the .eq. operator:
c(24,25).eq.c(34,35)
to test whether the value in c(34,35) is equal to the value in c(24,25).
☞ For further information about the .eq. operator, see ‘Comparing values’, earlier in this chapter. To check whether the codes in one field do not match a given string or the codes in another field, we can use the u (unequals) operator: c(m,n)U$codes$
cmUcn
c(m,n)Uc(m1,n1)
If codes in the field c(m,n) do not match the given string or the codes in c(m1,n1) then the expression is true. If the two fields are identical, then the expression is false.
✎ The comparison is of codes in columns, where the columns are compared on a one to one basis. It is not a comparison of a field with a numeric value, or of the numeric values in two fields. Numeric comparisons for inequality are written with the .ne. operator.
☞ For further information about numeric comparisons, see ‘Comparing values’, earlier in this chapter.
Let’s look at an example of the unequals operator. The statement: c(67,69)u$123$
is true at all times unless our data reads: +----7----+ 123
The expression: c(67,69)uc(77,79)
is true as long as columns 67 to 69 differ by at least one code from columns 77 to 79. If our data is: +----7----+----8 123 256
the expression is true because each of columns 77 to 79 differs from its counterpart in columns 67 to 69. Also, if we have: +----7----+----8 123 123 5
the expression is true because column 77 is multicoded ‘15’. The only time the expression is false is when columns 67 to 69 are identical to columns 77 to 79.
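The difference between comparing codes column by column and comparing arithmetic values can be shown with a small Python model. It is a sketch, not Quantum code: a field is a list with one string of codes per column (a single space for a blank column), the function names are hypothetical, and the numeric helper assumes single-coded columns.

```python
def field_equal(first, second):
    """Column-by-column code comparison, like c(m,n)=c(m1,n1)."""
    if len(first) != len(second):
        raise ValueError("fields must be the same length")
    return all(set(a) == set(b) for a, b in zip(first, second))


def field_unequal(first, second):
    """The opposite test, like c(m,n)Uc(m1,n1)."""
    return not field_equal(first, second)


def numeric(field):
    """Arithmetic value of single-coded columns, ignoring blanks."""
    return int("".join(field).replace(" ", "") or 0)


c24_25 = ["0", "2"]
c34_35 = [" ", "2"]
print(field_equal(c24_25, c34_35))          # False: $02$ differs from $ 2$
print(numeric(c24_25) == numeric(c34_35))   # True: both have the value 2

c67_69 = ["1", "2", "3"]
print(field_unequal(c67_69, ["2", "5", "6"]))    # True
print(field_unequal(c67_69, ["15", "2", "3"]))   # True: first column is multicoded
print(field_unequal(c67_69, ["1", "2", "3"]))    # False: the fields are identical
```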
Checking the arithmetic value of a field of columns Quick Reference To test whether a value in a field is within a specified range, type: range(start,end,minimum,maximum) Blanks at the start of the field cause this statement to give a false result. To ignore leading blanks, type: rangeb(start,end,minimum,maximum)
The logical expression range checks whether the number in a field of columns is within a given range. If so, the expression is true, otherwise it is false. The format of this statement is: range(start,end,min,max) where start and end are column numbers and min and max are the range delimiters. For example, the statement: range(137,139,100,150)
will return the value true if the number in columns 37 to 39 of card 1 is in the range 100 to 150.
✎ It is important to remember that this statement is designed for use with purely numeric columns. Columns which contain blanks, multicodes or an ampersand (12 punch) automatically cause the statement to be false. The exception to this is a multicode of a digit and a minus sign (11 code) which converts the whole field to a negative number.
A variation of range is rangeb which allows columns to the left of the field to be blank if the number is right-justified in the field. In all other respects it is exactly the same as range. If our data is: ----+----2 123 6
the expression: rangeb(17,18,1,10)
will be true because the string $ 6$ will be read as 6. With range the value would be false. However, the expression: rangeb(15,18,2000,3000)
returns false because of the blank in c17.
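The behaviour of range and rangeb with blanks can be modelled in Python as follows. This is a sketch only: the field is passed as a string with one character per column (a space for a blank), the names q_range and q_rangeb are hypothetical, and the handling of minus signs and multicodes described in the note above is left out.

```python
def q_range(field, minimum, maximum):
    """True if the field is purely numeric and within the limits."""
    if field == "" or any(ch not in "0123456789" for ch in field):
        return False          # blanks or other characters make it false
    return minimum <= int(field) <= maximum


def q_rangeb(field, minimum, maximum):
    """As q_range, but blanks to the left of the number are allowed."""
    return q_range(field.lstrip(" "), minimum, maximum)


print(q_range("125", 100, 150))       # True
print(q_range(" 6", 1, 10))           # False: leading blank
print(q_rangeb(" 6", 1, 10))          # True: read as 6
print(q_rangeb("23 6", 2000, 3000))   # False: the blank is inside the number
```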
Combining logical expressions
Quick Reference
To combine logical expressions, type: expression operator expression
where operator is one of .or., .and., or .xor.
Two or more logical expressions may be combined into a single expression using the operators:
.and.  Both/all true.
.or.   One or the other or both/all true.
.not.  Negates (reverses) an expression.
Any number of subexpressions may be combined to form a larger expression, but whether the result is true or false depends upon the values of the subexpressions and also upon the operators used to combine them.
The .and. operator requires that all the expressions preceding and following the .and. be true for the whole expression to be true. Thus, the statement: int1.eq.9 .and. c116’1’
is true if the integer variable int1 has a value of 9 and column 116 contains a 1. If either subexpression is false, the whole expression is false too. By comparison, the .or. operator requires that one expression or the other, or both, be true in order for the whole expression to be true. c(249,251)=$159$ .or. numb(c132,c135) .gt. 4
For this expression to be true, columns 249 to 251 must contain nothing but a ‘1’, ‘5’ and ‘9’ respectively or the number of codes in columns 132 to 135 must be greater than 4. It is also true if both expressions are true. However, if both are false, the overall result is false. Expressions are reversed (negated) simply by preceding them with the keyword .not. Although it is not wrong to use it with a single variable, it is more generally used to reverse an expression containing the keywords .and. and .or. Thus, it is not wrong to write .not.c15’1/5’ but it is much simpler to write this as c15n’1/5’.
✎ Take care when using .not. with the .eq. operator. Statements of the form: .not. c(1,3) .eq. 100
are incorrect and will not work. They should be written as either: (.not.(c(1,3).eq.100))
with the expression to be reversed enclosed in parentheses, or, more efficiently, as: (c(1,3).ne.100)
Any of the operators .and., .or. and .not. may appear in a statement more than once, as long as you use parentheses to define the order of evaluation. For example: (c15’1/47’ .or. c16’3579’) .and. c22’&’
causes Quantum to check whether the .or. condition is true before dealing with the .and. Suppose our data is: ----+----2----+ 13 & 79
The first expression (c15’1/47’) is true because column 15 contains a 1 and a 7 and the second expression (c16’3579’) is also true since the codes it contains are amongst those listed as acceptable. Thus, the .or. condition is true. Column 22 contains an ampersand so the last expression is also true, and therefore the expression as a whole is true.
If both expressions in the parentheses were false, the whole expression would be false.
.not. with .and. and .or. When you use .not. with expressions in parentheses, be very careful that what you write is what you mean. Let’s take the conditions male and married and forget about columns and codes for the minute. The condition: (Male .and. Married)
refers only to married men. The opposite of this is: .not. (Male .and. Married)
which refers to unmarried men and all women. This can also be written as: .not. Male .or. .not. Married
The first .not. collects all the women, the second collects everyone who is not married (for example, single, widowed, and so on), and together they collect people who are female or unmarried (or both). We use .or. instead of .and. here because the latter will gather unmarried women but will ignore the unmarried men and married women. Reversing .or. expressions works in exactly the same way. The expression: (Male .or. Married)
means anyone who is Male, or anyone who is Married, or anyone who is Male and Married. The opposite of this is: .not. (Male .or. Married)
which means anyone who is neither Male nor Married; that is, anyone who is a woman and is unmarried. This can be written as: .not. Male .and. .not. Married
Thus, we can summarize, as follows:

Positive        Negative              Is the same as
(A .and. B)     .not. (A .and. B)     .not. A .or. .not. B
(A .or. B)      .not. (A .or. B)      .not. A .and. .not. B
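The equivalences summarized above can be checked exhaustively, since two conditions have only four possible combinations of true and false. The following Python sketch is an illustration rather than Quantum code. The function name is hypothetical, and Python's not, and and or stand in for .not., .and. and .or.

```python
from itertools import product


def negation_rules_hold():
    """Check both rewriting rules for every combination of A and B."""
    for a, b in product([False, True], repeat=2):
        # .not. (A .and. B)  is the same as  .not. A .or. .not. B
        if (not (a and b)) != ((not a) or (not b)):
            return False
        # .not. (A .or. B)  is the same as  .not. A .and. .not. B
        if (not (a or b)) != ((not a) and (not b)):
            return False
    return True
```

Taking A as Male and B as Married, the case A true and B false (an unmarried man) satisfies .not. (A .and. B) but not .not. (A .or. B).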
Here is an example using columns and codes: .not. (c(135,137)=$519$ .or. c160’6/0’)
If our data is: 3----+----4----+----5----+----6----+ 519 1 9 &
the expression is true because c(135,137) does not contain just the codes 5, 1 and 9 (c135 is multicoded), and c160 does not contain any of the codes 6 through 0. The expression will be false if either:
• column 135 contains a 5 only, column 136 contains a 1 only and column 137 contains a 9 only, or
• column 160 contains any of the codes 6 through 0, either singly or as a multicode.
We could therefore write the expression as: .not. c(135,137)=$519$ .and. .not. c160’6/0’
Comparing variables and arithmetic expressions to a list Quick Reference To compare the value of a variable or an arithmetic expression to a list of numbers, type: item .in. (value1,value2, ... ) in the edit section. Ranges of numbers may be entered in the list as start:end. If the item is a reference to a field containing blanks, enter the values as strings of codes enclosed in dollar signs.
From time to time you may need to check whether a variable or arithmetic expression has one of a given list of values. For example, if the questionnaire codes brands of frozen vegetables as 3-digit codes into columns 145 to 147 we might want to check that only valid codes appeared in this field. This is achieved using the logical expression .in. as follows: variable-name .in. (list) or arithmetic-exp .in. (list)
where variable-name is that of the variable to be checked and list is a list of permissible values. The arithmetic expression is an expression consisting of data or integer variables, arithmetic operators and integer values as described earlier in this chapter. If the variable or arithmetic expression has one of the listed values, the expression is true; if not, it is false.
The left-hand side of the expression may contain integer variables, columns or data variables containing whole numbers, or expressions using these types of variables. If it is a data variable, then the list may contain codes enclosed in dollar signs. Quantum will then compare the codes in the data variable with the codes inside the dollar signs. We could therefore check that the frozen vegetables have been coded correctly by keying in a statement which says: c(145,147) .in. ($205$,$206$,$207$,$210$,$215$,$220$)
Quantum will flag any records in which c(145,147) does not contain exactly 205, 206, 207, 210, 215 or 220 (that is, three single-coded columns) as incorrect. If the data variable contains a valid positive or negative whole number, then the list may also contain such values. Ranges of values may be entered in the form min:max, where min is the lowest acceptable value and max is the highest. Since the frozen vegetables have numeric codes, we could write the expression as: c(145,147) .in. (205:207,210,215,220)
Any columns in the field which contain non-numeric data (for example, multicodes) will be flagged as incorrect, as will any which contain values which do not match the specification. Sometimes, though, the codes and numbers will not be interchangeable. If you have 2-digit codes in a 3-column field, the statement: c(206,208) .in. ($ 10$,$ 11$,$ 12$,$ 13$)
is not the same as: c(206,208) .in. (10:13)
unless column 206 is always blank. If the 2-digit codes have been padded on the left with zeros instead of blanks (that is, 010, 011) or if they all start in column 206 (that is, $10 $, $11 $), then the first expression will be false, even though the second one will still be true.
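The distinction between comparing codes and comparing numbers can be modelled as follows. This Python sketch is an illustration, not Quantum code. The function names are hypothetical, a field is assumed to be a plain string with one character per column, negative numbers and multicodes are not modelled, and treating a trailing blank as acceptable in a numeric comparison simply follows the statement in the text above.

```python
def matches_codes(field, codes):
    """Code comparison: the field must equal one of the strings exactly."""
    return field in codes


def matches_numbers(field, numbers):
    """Numeric comparison: the field is read as a whole number first."""
    text = field.strip(" ")
    if not text or any(ch not in "0123456789" for ch in text):
        return False
    return int(text) in numbers


CODES = {" 10", " 11", " 12", " 13"}   # models ($ 10$,$ 11$,$ 12$,$ 13$)
NUMBERS = range(10, 14)                # models (10:13)
```

With these definitions the field " 10" satisfies both comparisons, whereas "010" and "10 " satisfy only the numeric one.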
☞ For a fuller explanation of the difference between codes and numbers, see the earlier sections of this chapter.
If the left-hand side of the expression is an integer variable or an arithmetic expression, the list may contain positive or negative whole numbers: total .in. (100,200,500:1000)
Lists may contain up to 247 values or codes, which may be entered in any order. Our examples list them in ascending order, but Quantum does not require this. The exception is numeric ranges, which must be entered in the form lowest:highest.
Naming lists Quick Reference To assign a name to a list of values, type: definelist name=(list) in the edit section, where list is a comma-separated list of numbers, ranges or code strings enclosed in dollar signs.
If you have a list that is used more than once you may give it a name and refer to it by that name instead of typing in the complete list each time. To name a list, write: definelist name=(list) For example: definelist fveg=(205:207,210,215,220)
To use a defined list, simply replace the list with the name: c(145,147) .in. fveg
✎ You cannot use a definelist in an .in. statement with a data-mapped variable. Quantum cannot handle this syntax because it needs to read the data in the definelist differently for data-mapped variables (as strings instead of column punches) but does not know at the time the definelist is parsed whether it will be used with a data-mapped variable.
5.3 Speeding up large programs Quick Reference To speed up your Quantum program by converting expressions of the form c(1,4)=$1234$ into C in a more efficient way, type: inline n where n is the maximum field width to be converted in this manner. This statement must appear at the start of the edit.
If you have a large edit, you can speed up the time it takes to run by including the inline statement in your edit. This instructs the Quantum compiler to convert expressions of the form c(1,4)=$1234$ into statements in the C programming language in a different way to the way it normally does. You need not worry about these different methods of conversion, apart from deciding whether or not to use them. If you want to speed your program up, place a statement of the form: inline n at the beginning of the edit section, where n is the maximum field width to be converted in the special way. For example: inline 6
Here we are saying that fields of six columns or less should be converted in the special way rather than in the normal way.
6 How Quantum reads data In order for the answered questionnaire to be processed, the information contained on the questionnaire must be read into the computer into a location where Quantum can access it. This is done by reading the data into the data variable array called C which is supplied automatically with every Quantum run. You may then access this data by addressing this array. Different types of records are read into the C array in different ways.
6.1 Types of record Quantum deals with three types of record: ordinary, multicard and multicard with trailer cards.
Ordinary records These are strings of codes and numbers, one per respondent, up to a maximum of 32,767 characters per respondent.
Multicard records When data originates from punched cards and each questionnaire requires more than 80 columns, the data is spread over several cards. So that all cards belonging to a particular respondent may be easily identified, each questionnaire is assigned a serial number which is entered as part of the data for each card. Within this, each card has a unique card type or card number to distinguish it from others in the group. It is important that both the serial number and card type be in the same relative positions on all cards in the file, since this is the only way that Quantum can tell which data belongs to which respondent. If the questionnaire serial number is in columns 1 to 4 of each card and the card type is in column 5, and we are looking at questionnaire 1005, we will see that it has two cards whose first five columns are 10051 and 10052 respectively. Quantum can deal with records that contain up to 327 cards per respondent. Occasionally you may have multicard records in which each ‘card’ is greater than 80 columns. The notes that follow refer to multicard records of up to 100 columns per card.
☞ For information on how Quantum deals with ‘cards’ of more than 100 columns, see section 6.10, ‘Multicard records of more than 100 columns per card’.
Multicard records with trailer cards
Sometimes a record contains very repetitive data which is tabulated over and over again in the same way. For instance, a shopping survey may ask the respondent a series of identical questions for each store he visited. In this case, there may be a separate card for each store. Processing this type of data is often easier if we treat all cards containing the same questions as if they were, in fact, one card with one card number. These cards are called trailer cards. Thus, if the respondent visited five stores, and the questions about these stores are coded on a card 2, the record for that respondent would contain five cards of type 2. If demographic details were stored on a card 1, the whole record would be 6 cards in all. In Quantum, the demographic data would be described as the higher level and the stores as the lower level.
Another example of data gathered at different levels might be a travel survey in which respondents are asked about the places they visited and their method of travelling. The highest level may be demographic information about the respondents, the second level would be the various trips they made and the third level might be information about the various modes of transport they used. If we were to draw a chart of a record, it would look like this:

Respondent
    Trip1
        Tran1   Tran2   Tran3
    Trip2
        Tran1   Tran2
    Trip3
        Tran1   Tran2   Tran3
Here, we have three groups of data at level 2 and eight groups of data at level 3.
6.2 Reading data into the C array Data is read into the C array automatically, one record at a time. The way data is read depends upon the record structure. If a record contains carriage return characters (CTRL+M), those characters are always ignored.
Ordinary records Ordinary records are read into cell 1 onwards of the array. Therefore, for example, the 50th column is referenced as c50 and the 200th cell as c200.
Multicard records Records are read into c101 to c200 for card 1, c201 to c300 for card 2, and so on. For example, 80-column cards are read into c101 to c180 for card 1 and c201 to c280 for card 2. Columns 181– 200, 281–300, and so on remain blank. In this case, the C array may be pictured as ten rows of 100 cells each. Column 50 of card 1 is then accessed by referring to it as c150, and column 67 of card 8 is referred to as c867.
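The addressing rule for multicard records is simple arithmetic: the cell number is the card type multiplied by 100, plus the column number. The Python sketch below illustrates this for cards of up to 100 columns. It is not Quantum code, and the function name is hypothetical.

```python
def cell_number(card, column):
    """Return the C array cell holding a given column of a given card."""
    if card < 1:
        raise ValueError("card types start at 1")
    if not 1 <= column <= 100:
        raise ValueError("this sketch covers cards of up to 100 columns")
    return card * 100 + column
```

For example, cell_number(1, 50) gives 150 and cell_number(8, 67) gives 867, matching c150 and c867 in the text.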
☞ For information on longer records, see section 6.10, ‘Multicard records of more than 100 columns per card’.
If you have records with more than nine cards, you need to extend the size of the C array by using max=. This also tells Quantum which cells to clear between records.
☞ For further details on max=, see ‘Highest card type number’ later in this chapter.
Ignoring card types It is also possible to read cards into the array sequentially regardless of card type: the first card goes in c(101,200), the second in c(201,300), the third in c(301,400), and so on.
☞ For further information, see ‘Record type’ in section 6.8, ‘Describing the data structure’.
6.3 Processing the data Each time an ordinary record or set of cards comprising a multicard record is read in, that data is processed first by the edit section and then by the tabulation section of your program. The complete record is edited and tabulated in one go. The exception to this is the trailer card record where processing can take place a number of times within each record for each lower level. To ensure that only the part of the edit section applying to a particular level is used, the edit section is defined separately for each level. Similarly, the table instructions specify the level at which the table should be incremented.
☞ For more information about levels, see chapter 3, ‘Dealing with hierarchical data’ in the Quantum User’s Guide Volume 3.
6.4 Trailer cards
By using the Levels facility, the user need not know how Quantum deals with trailer card data internally. However, there are occasions when it may be necessary to edit or tabulate the data without using levels. To do this, it is necessary to know more about how trailer cards are processed. Quantum deals with trailer cards in a number of ‘reads’. Cards are read into the appropriate rows of the C array until:
• A card is located with a card type matching that of the previous card (for example, two consecutive card 2’s), or
• A card is read with a type lower than its predecessor and matching one of the card types already read in during the current ‘read’ (for example, a card 2, a card 3, and then another card 2).
In order to produce useful tables, you will need to know which cards are currently in the C array. Quantum has four reserved variables — thisread, allread, firstread and lastread — which it uses to keep track of which cards it has read for each respondent.
thisread The array called thisread is used to check which cards have been read in during the current read. thisread1 will be true (or 1) if a card type 1 has just been read in; thisread2 will be true if a card 2 has just been read, and so on. There are nine such variables (thisread1 to thisread9) available unless extra card types have been specified using the max= option. In this case, these variables will be numbered 1 to max; for example, if there are 13 cards, we will have thisread1 to thisread13.
☞ For further details on max=, see ‘Highest card type number’ later in this chapter.
allread allread notes which cards have been read in so far for this questionnaire. If cards 1, 2 and 3 have been read so far, allread1, allread2 and allread3 will all be true. Additionally, each cell of allread will contain the number of cards of the given type read in — for instance, if two cards of type 3 have been read, allread3 will be true and it will contain the number 2. As with thisread, there are nine allread variables available unless extra card types have been specified with max=.
firstread and lastread The variables firstread and lastread become true when the first and last cards in a record have been read in.
Examples
You can use these variables in your program to associate specific parts of the edit or tabulation section with specific types of data. For instance:

    if (.not. thisread3) go to 400
    /* card 3 edit follows
    .
    .
    400 continue
    /* calculate average when all cards read for respondent
    if (lastread) average=sum / num
    .
    /* update table when all cards read for this respondent
    tab brand demo;c=lastread
Let’s take an example and look at the contents of the C array and the values of thisread, allread, firstread and lastread. Suppose the record has five cards: 1, 2, 2, 2 and 3 of 80 columns each. The first ‘read’ places card 1 in c(101,180) and the first card 2 in c(201,280). The second card 2 is not read into the array yet because it has the same card type as the previous card. As this is the start of a new respondent, firstread is true (or 1), and because cards 1 and 2 have been read, thisread1, thisread2, allread1 and allread2 are also true. The second ‘read’ deals only with the second card 2 since it is followed by another card of the same type. thisread2 is true, as are allread1 and allread2. Also, allread2 contains the value 2 because we have read in 2 card 2s so far. Note that thisread1 is now false (or 0) as no card 1 was read this time. On the third and final ‘read’ the third card 2 is read into c(201,280) and card 3 is copied into c(301,380). lastread is true because we have reached the end of the record, thisread2 and thisread3 are true because we have just read cards 2 and 3, and allread1, allread2 and allread3 are true because this record contains cards 1, 2 and 3. allread2 now contains the value 3 because there were 3 card 2s altogether. The chart below summarizes the cards read and the variables which will be true after each read.
                Read 1      Read 2      Read 3
c(101,180)      Card 1
c(201,280)      Card 2a     Card 2b     Card 2c
c(301,380)                              Card 3
thisread        1, 2        2           2, 3
allread         1, 2        1, 2        1, 2, 3
firstread       1
lastread                                1
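The sequence of reads in this example can be reproduced with a small model of the two stopping rules given at the start of this section. The Python sketch below is an illustration, not Quantum code. The function names are hypothetical, card types are assumed to be integers listed in file order, and the insertion of blank cards for out-of-sequence records is not modelled.

```python
def split_into_reads(card_types):
    """Group a record's cards into the 'reads' described in the text."""
    reads, current = [], []
    for card in card_types:
        if current:
            previous = current[-1]
            same_as_previous = card == previous
            lower_and_seen = card < previous and card in current
            if same_as_previous or lower_and_seen:
                reads.append(current)   # this card starts a new read
                current = []
        current.append(card)
    if current:
        reads.append(current)
    return reads


def trace_reads(card_types):
    """Return the modelled reserved-variable values after each read."""
    reads = split_into_reads(card_types)
    counts = {}                          # models allread (count per card type)
    states = []
    for index, read in enumerate(reads):
        for card in read:
            counts[card] = counts.get(card, 0) + 1
        states.append({
            "thisread": sorted(set(read)),
            "allread": dict(counts),
            "firstread": index == 0,
            "lastread": index == len(reads) - 1,
        })
    return states
```

Calling split_into_reads([1, 2, 2, 2, 3]) yields three reads, [1, 2], [2] and [2, 3], and trace_reads reports a count of 3 for card type 2 after the final read.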
If Quantum reads a record in which the repeated cards are out of sequence, it inserts blank cards of the appropriate types wherever necessary to force the cards into the correct sequence. For example, if the record contains the cards 1, 2, 4, 3, 4, 4 in that order, Quantum will generate a completely blank card 3 when it reads the first card 4. The record is then processed as if it contained cards 1, 2, 3, 4, 3, 4, 4.
6.5 Columns 1 to 100 It is sometimes useful to know that in the case of multicard records the first card of the next record is waiting in columns 1 to 100 of the array. Beware of overwriting these columns.
6.6 Reserved variables
In section 6.4, ‘Trailer cards’, we discussed the reserved variables thisread, which keeps track of which cards have been read in during the current read, and allread, which keeps track of all cards read in for the current record. Other reserved variables associated with reading in data are:
lastrec       Set to true when the last record in the file has been read or, in the case of trailer cards, the last read of the last record has occurred.
rec_count     Stores the number of records read in so far.
card_count    Counts the number of cards read so far.
6.7 Using spare columns You can use spare columns in the C array for data manipulation and storing additional information. However, it may be clearer to store this information in named variables where the name gives some indication of the type of data stored. In ordinary records you can use the space beyond the end of the record. If the record length is 120 columns, you can use columns 121 to 1000.
✎ For ordinary records, only columns 1 to reclen are reset to blanks, where reclen is the maximum record length as defined by the reclen= keyword on the struct statement.
☞ For further information about defining the record length, see ‘Record length’ in the next section.
In multicard records you may not use c(1,100). However, you may use any columns between the end of the card (reclen) and the end of that row of the C array. For instance, when reclen=80 you may use c(181,200), c(281,300) and so on. You may also use full sets of columns in which there is no data: that is, if the record has only four cards (1, 2, 3 and 4), then c(501,1000) are the spare columns you may use. Note that only cells c101 to c(100+reclen), c201 to c(200+reclen), and so on are reset to blanks before the next record is read in.
6.8 Describing the data structure Quick Reference To describe the structure of the data, type: struct; options All programs dealing with multicard records must contain a struct statement unless the data contains trailer cards which will be read and tabulated using the levels facility. In this case you may choose between using a struct statement or using a levels file. If the run has no struct statement and no levels file, Quantum assumes that the data contains ordinary records to be read into c1 onwards of the C array.
☞ For information about levels and how to describe the levels data structure, see chapter 3, ‘Dealing with hierarchical data’ in the Quantum User’s Guide Volume 3. The struct statement is used to define the type of records, the location of the serial number and card type in the record and the number of the highest card type if greater than 9. Its format is: struct; options
Record type Quick Reference To define the record type, type: struct; read=n where n is 0 for ordinary records, 2 to read multicard records in sections according to the card type, or 3 to read multicard records all in one go.
Quantum recognizes two types of record: single card and multicard. The type of record is defined by the keyword read= on the struct statement:
• Ordinary records: Ordinary records are defined using read=0. Each record is read into c1 onwards of the array. Since it is the default, you need only use it when other options are required; for example, when the records contain serial numbers and you wish to have the serial number printed out as part of the record, or when you are working with long records of more than 100 columns.
• Multicard records: Multicard records are identified by the keyword read=2. Each card in the record is read into the row corresponding to the card type of that card; that is, card 1 in c(101,200), card 2 in c(201,300), and so on.
We mentioned briefly that it is possible to read all cards in a multicard record in at once and ignore the card type. The first card goes in c(101,200), the second in c(201,300), and so on. This is achieved with read=3.
Record length Quick Reference To define the record length of records greater than 100 columns, type: struct; reclen=n
The keyword reclen=n defines the maximum number of characters to be read into the C array, the number of cells to be reset to blanks and the number of cells to be written out by the write statement. With ordinary records reclen may take any value, but with multicard records the maximum is reclen=1000. In both cases, the default is reclen=100. When data is read into the array, any record which is longer than reclen characters is truncated to that length and a warning message is printed. When ordinary records are written out with write or split, cells c1 to c(reclen) are copied, with any trailing blanks being ignored. For instance, if we have: struct;read=0;reclen=200
and the current record is only 157 characters long, the record written out will be 157 characters long. This length can be overridden by an option on a filedef statement. When multicard records are written out, columns c101 to c(100+reclen), c201 to c(200+reclen), and so on will be output. Thus, if we write: struct;read=2;reclen=70
and we have 2 cards per record, Quantum will write out c(101,170) and c(201,270).
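The cells written out for a multicard record follow directly from reclen and the card types present. The Python sketch below illustrates the calculation. It is not Quantum code, the function name is hypothetical, and rows are assumed to be 100 cells wide as described earlier in this chapter.

```python
def written_ranges(reclen, card_types):
    """Return (first, last) cell numbers output for each card row."""
    if not 1 <= reclen <= 100:
        raise ValueError("this sketch covers cards of up to 100 columns")
    return [(100 * card + 1, 100 * card + reclen) for card in card_types]
```

With reclen=70 and cards 1 and 2, the function returns (101, 170) and (201, 270), matching c(101,170) and c(201,270) above.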
Finally, with ordinary records cells c1 to c(reclen) are reset to blanks between records, but with multicard records cells c101 to c(100+reclen), c201 to c(200+reclen), and so on are reset.
☞ For information about the write statement, see section 7.1, ‘Print files’. For information about the split statement, see section 12.4, ‘Creating clean and dirty data files’. For information about the filedef statement, see section 7.4, ‘Defining the file type’.
Serial number location Quick Reference To define the location of the serial number in each record, type: struct; ser=c(m,n)
The keyword ser=c(m,n) defines the field of columns containing the respondent serial number. For example, if the serial number is in columns 1 to 5 of an ordinary record we would write: struct;read=0;ser=c(1,5)
Similarly, if it is in columns 1 to 5 of a multicard record the statement would be: struct;read=2;ser=c(1,5)
Notice that even with multicard records we only give the actual column numbers containing the serial number, rather than card type and column number as is usually the case when identifying columns in such records. This is because the column numbers refer to all cards in the data set rather than to a single card in the file.
Card type location Quick Reference For multicard records only, to define the location of the card type in the record, type: struct; crd=cn
Defining the card type location is much the same as defining the position of the serial number in the record. The keyword is crd=cn for a single digit card type or crd=c(m,n) for a card type of more than one digit. Once again, m and n are column numbers only, not card type and column number.
For example: struct;read=2;ser=c(1,4);crd=c5
tells us that we have a multicard record with serial numbers in columns 1 to 4 and the card type in column 5 of each card. Each card will be read into the row corresponding to its card number.
Required card types Quick Reference For multicard records only, to define cards which must be present in each record, type: struct; req=card_numbers where card_numbers is either a comma-separated list of card numbers, or a range of sequential card numbers in the form start:end or start/end.
Sometimes some cards will be optional and others mandatory. You define the cards which must appear in every record by using the keyword req= followed by the numbers of the cards that each respondent must have. For example: req=1,2
tells us that cards 1 and 2 must be present in each record for that record to be accepted. Any other cards are optional. If a record is read without one of these cards, the error message ‘Card Missing in Set’ and a note of the record’s position in the file are printed and the record is ignored. If you have ranges for required card types, you may type the numbers of the lowest and highest cards separated by a slash (/) or a colon (:) rather than listing each card type separately. For example, if cards 1 to 4 are all required, you may type: req=1,2,3,4
or
req=1/4
or
req=1:4
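The three notations above describe the same set of cards. The Python sketch below shows one way such a specification could be expanded. It is an illustration only, not how Quantum parses struct statements: the function name is hypothetical, and accepting a mixture of single numbers and ranges in one specification is an assumption that the text does not confirm.

```python
def expand_card_numbers(spec):
    """Expand a specification such as '1,2,3,4', '1/4' or '1:4'."""
    cards = []
    for item in spec.split(","):
        item = item.strip()
        for separator in ("/", ":"):
            if separator in item:
                start, end = item.split(separator)
                cards.extend(range(int(start), int(end) + 1))
                break
        else:
            # No range separator found: a single card number.
            cards.append(int(item))
    return cards
```

All three forms shown above expand to the list 1, 2, 3, 4.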
Repeated card types Quick Reference For multicard records only, to define cards which may appear more than once in a record, type: struct; rep=card_numbers where card_numbers is either a comma-separated list of card numbers, or a range of sequential card numbers in the form start:end or start/end.
If the data contains trailer cards and the Levels facility is not used, you must list their card types with the keyword rep=. For instance, if card 2 is a trailer card we would write rep=2. Where there is more than one trailer card, each card type is listed separated by a comma. If cards 2, 3 and 4 are all trailer cards we could write: rep=2,3,4
If you have ranges for repeated card types, you may type the numbers of the lowest and highest cards separated by a slash (/) or a colon (:) rather than listing each card type separately. For example, if cards 2 to 4 are all repeated, you may type: rep=2,3,4
or
rep=2/4
or
rep=2:4
If rep= is not used and a record is read with two or more cards of the same type, the last card of that type will be accepted and the message ‘Identical duplicate’ or ‘Non-identical duplicate’ and a note of the record’s position in the file will be printed. For example: Record structure error: serial 026, card 234 in run, card 234 in dfile card type 2 — non-identical duplicate Because rep= refers to trailer cards only, it will be ignored if read=2 and crd= are not both present on the struct statement.
Highest card type number Quick Reference For multicard records only, to define the highest card type in the record, if there are more than nine cards per record, type: struct; max=n
The only time you need to inform Quantum of the highest card type is when you have records with more than nine cards. This is so that Quantum can allocate sufficient cells in the C array to store the extra cards. The highest card type is defined with max=n, where n is the number of the highest card type. Cells 1 to max*reclen are then cleared between respondents. For example, to read a data set with 11 cards per respondent we might write: struct;read=2;ser=c(1,4);crd=c5;req=1,2,3,4;max=11
If you forget max=, and a record is read with more than nine cards, the message ‘Too many cards per record’ is printed and the record is rejected. On the other hand, if a card is read with a card type higher than that defined with max=, the record is rejected with the message ‘Card number out of range’.
✎ Since the maximum size of the C array is 32,767 cells, the maximum value you can set with max= is 327 cards.
Dealing with alphanumeric card types
Quick Reference
For multicard records only, to define the location in the C array of cards with alphanumeric card types, type:
struct; order=card_types
where card_types is a list of card type numbers and letters in the order they are to appear in the C array.

From time to time you may need to read in records with alphabetic as well as numeric card types. This generally happens in a multicard data set containing more than nine cards per record where only one column has been allocated to the card type. Quantum can deal with this data but first you have to say where in the C array the alphabetic card types should go. This is done with the keyword:
order=n
where n is one or more of the codes ‘1234567890-&’ or the letters A to Z (in upper or lower case) not separated by spaces. The card type bearing the first code in the list is read into c(101,200), the card bearing the second code in the list is read into c(201,300), and so on. For example, suppose each record has ten cards (1 to 9 and A); our struct statement might say:
struct;read=2;ser=c(1,4);crd=c5;max=10;order=123456789A
Data from card A would be read into cells 1001 to 1100 of the C array.
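The positional rule behind order= can be modelled outside Quantum. The following Python sketch is illustrative only, not part of Quantum: the function name card_cell_ranges is hypothetical, and it assumes the default layout of 100 cells per card.

```python
def card_cell_ranges(order, reclen=100):
    """Map each card type in an order= list to its (first, last) C array cells.

    Assumes the default layout of `reclen` cells per card, where the card in
    position 1 of the list occupies cells 101 to 200.
    """
    ranges = {}
    for position, card_type in enumerate(order, start=1):
        first = position * reclen + 1
        ranges[card_type] = (first, first + reclen - 1)
    return ranges


ranges = card_cell_ranges("123456789A")
print(ranges["1"])  # first card type in the list
print(ranges["A"])  # tenth card type in the list
```

With the order list from the example above, card A is tenth in the list, so the sketch places it in cells 1001 to 1100, matching the text.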
Merge sequence for trailer cards
Quick Reference
For multicard records only, to define the location of the merge sequence number in trailer cards, type:
struct; seq=cn
When trailer card data is merged during a run with the merge facility, you may wish trailer cards to be merged in a specific order, according to a sequence number entered as part of the data. The location of this sequence number can be defined with the keyword seq=cn for a single column code or seq=c(m,n) for a multicolumn code. For more information on merging data see the next section.
6.9 Merging data files
When we say that Quantum allows you to merge data files, we do not mean that Quantum takes data from a number of files and merges it to create a new file. Rather, we mean that data can be read from a series of files during a Quantum run. Of course, the merged data can then be written out to a new file for future use.
Quantum provides two methods for merging data. The first is designed for studies where you have different card types in different files; for example, cards 1 and 2 in the file data1 and card 3 in the file data2. In this case, merging is by serial number and, optionally, card type and trailer card sequence number.
The second method is designed for situations where you want to merge a field of data from an external file into records from the main data file. For example, you may have a file of manufacturers’ codes which refer to a number of products. If each record in the main data file contains the product the respondent preferred, you may wish to merge the appropriate manufacturer’s code from the external file into the main data in the C array. In this case, merging is based on finding matching keys in the main record and the records in the external file.
Both options are described in detail below.

Merging complete cards
Data for a study may be spread across a number of files. This is particularly useful with large surveys because it means that you can put each card type in a different file and simply merge in the cards required for the current batch of tables. For example, if we require tables from cards 4 and 5, we need not even read in cards 1, 2, 3 and 6.
Data from up to 16 files may be merged; that is, the main data file and 15 others. It may be merged on serial number and, within that, on card type. With trailer card data, you also have the option of merging trailer cards according to a sequence number entered as part of the data.
In order for the merge to be successful, all files must be sorted in ascending order with the serial number, card type and sequence number in the same position. Quantum reads the locations from the keywords ser=, crd= and seq= on the struct statement.
To merge data files you must create a file called merges telling Quantum which items to merge on, and which files to merge. The type of merge is represented by a number:
1   Merge on serial number. Cards are read in from each data file according to their serial number only; the card type and sequence number, if any, are ignored. You might use this option when you have two files, dat01 containing cards of type 1 and dat02 containing cards of type 2, and you want the files to be merged so that card type 1 is read into the C array, followed by card type 2.
3   Merge on serial number and card type (default). With this option, cards with the same serial number read from different data files are merged to form a single record by comparing the serial number and card type. Cards within a record are then sorted sequentially from 1 so that each card is read into the appropriate cells of the C array. For example, if dat01 contains cards 1 and 3, and dat02 contains cards of type 2, the merge will produce records containing cards 1, 2 and 3 in that order.
5   Merge on serial number, card type and sequence number. This is similar to merge type 3, except that trailer cards are merged according to their sequence number. For example, if dat01 contains cards 1 and 2, where card 2 is a trailer card with a sequence number of 2, and dat02 contains cards 2 and 3, where card 2 is a trailer card with a sequence number of 1, the merged record will contain cards 1, 2/1, 2/2, and 3, in that order.
The type of merge is the first item in the merges file, and is followed by the names of the files to be merged with the main data file named in the Quantum command line. Items may be entered on separate lines or all on the same line separated by semicolons. For example, if we want to merge data in files dat02 and dat03 with data in the main file, dat01, by serial number, card type and sequence number, the merges file would look like this:
5; dat02; dat03
Notice that we have not mentioned dat01 in the merges file because it will be named on the Quantum command line instead.
✎ This facility is not designed to work with merge files that contain *include or #include statements to read additional data files into the current data file. All merge files must be named in the merges file, which accepts pathnames if the data files are not in the project directory.
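The three merge types differ only in how much of each card's identity is compared. The Python sketch below models that idea; it is not how Quantum is implemented. The function merge_cards, the tuple layout (serial, card type, sequence number) and the use of sequence number 0 for non-trailer cards are all assumptions made for illustration.

```python
import heapq


def merge_cards(files, merge_type=3):
    """Merge already-sorted lists of (serial, card_type, sequence) tuples.

    merge_type 1 compares the serial number only, 3 compares serial number and
    card type, and 5 also compares the trailer sequence number. Cards with
    equal keys come out in the order in which their files were listed.
    """
    if merge_type == 1:
        key = lambda card: card[0]
    elif merge_type == 3:
        key = lambda card: card[:2]
    elif merge_type == 5:
        key = lambda card: card[:3]
    else:
        raise ValueError("merge type must be 1, 3 or 5")
    return list(heapq.merge(*files, key=key))


# The merge type 5 example from the text: card 2 is a trailer card.
dat01 = [("0001", 1, 0), ("0001", 2, 2)]
dat02 = [("0001", 2, 1), ("0001", 3, 0)]
merged = merge_cards([dat01, dat02], merge_type=5)
print([(card, seq) for _, card, seq in merged])
```

As in Quantum, the sketch relies on every input already being sorted in ascending order on the fields being compared; it does not check this.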
Merging a field of data from an external file
Quick Reference
To merge extra data from an external data file into the data currently in the C array, type:
int_variable=mergedata($ex_file$, key_field, key_start, copy_to, data_start)
where
ex_file
is the name of the file containing the extra data.
key_field
is the location of the key in the main data file, entered using the standard Quantum notation for columns and fields.
key_start
is the start column of the key in the external data file.
copy_to
is the field in the main data record in which to place the external data. The field is defined using the standard Quantum notation for columns and fields.
data_start
is the start column of the data to be copied.
This statement returns 1 in int_variable if a match is found, 0 if no match is found.
The mergedata statement merges a field of data from an external file with the main data at the datapass stage of the Quantum run. Merging is by means of a data key present in both the main records and the records in the external file. If a record in the external file has a key which matches that of a record in the main data file, the external data will be merged into a user-defined field of the main record when it is read into the C array.
In order for data to be merged correctly, both the main data file and the external file must be sorted in ascending order by key value. If the key is the record serial number then the data file will already be sorted in the correct order (assuming, of course, that the data is sorted by serial number). If you are using a key that is not the record serial number you must sort the data file so that it is ordered by key rather than by serial number.
The syntax for mergedata is:
int_variable=mergedata($ex_file$, key_field, key_start, copy_to, data_start)
where:
int_variable
is the name of an integer variable in which the function can place its return value.
ex_file
is the name of the file containing the extra data. It must be enclosed in dollar signs.
key_field
is the location of the key in the main data file, entered using the standard Quantum notation for columns and fields.
key_start
is the start column of the key in the external data file, for example, 1 if the key starts in column 1. The length of the key is taken from the length of key_field.
copy_to
is the field in the main data record in which to place the external data. The field is defined using the standard Quantum notation for columns and fields.
data_start
is the start column of the data to be copied. Quantum copies as many columns as are defined by copy_to.
For example:
t1 = mergedata($manuf_codes$,c(178,180),15,c(168,175),1)
tells Quantum to compare the key in columns 178 to 180 of the main record with the key which starts in column 15 of the external records in the file manuf_codes. Because the key field in the main record is 3 columns long, Quantum reads columns 15 to 17 of each external record to obtain its key. If the keys match, Quantum copies the data from the external record into columns 168 to 175 of the main record in the C array. The external data to be copied starts in column 1 and, since the destination field is 8 columns long, Quantum copies 8 columns starting at that column. This statement returns a value of 1 if a match was found (i.e., merging took place), or 0 if not. There is no limit on the number of mergedata statements in a specification, but you may only merge data from up to nine different files per record.
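The way the key length and the copy length are derived from key_field and copy_to can be seen in a small model. This Python sketch is an illustration under stated assumptions, not Quantum's implementation: the function name, the record held as a list of characters, and the sample external lines are invented, and the sketch simply scans the external lines rather than relying on both files being sorted.

```python
def mergedata(record, external_lines, key_field, key_start, copy_to, data_start):
    """Copy a field from the first external line whose key matches the record.

    record is a list of single characters; columns are numbered from 1.
    The key length comes from key_field and the copy length from copy_to.
    Returns 1 if a match was found and the field copied, otherwise 0.
    """
    key_first, key_last = key_field
    key_len = key_last - key_first + 1
    key = "".join(record[key_first - 1:key_last])
    copy_first, copy_last = copy_to
    width = copy_last - copy_first + 1
    for line in external_lines:
        if line[key_start - 1:key_start - 1 + key_len] == key:
            # Pad short external lines so the destination field is always filled.
            data = line[data_start - 1:data_start - 1 + width].ljust(width)
            record[copy_first - 1:copy_last] = list(data)
            return 1
    return 0


record = [" "] * 200
record[177:180] = list("042")                 # key in columns 178 to 180
external = ["ACME0001".ljust(14) + "017",     # key starts in column 15
            "BOLT0002".ljust(14) + "042"]
t1 = mergedata(record, external, (178, 180), 15, (168, 175), 1)
print(t1, "".join(record[167:175]))
```

Here the second external line carries the matching key, so its first eight columns are copied into columns 168 to 175 of the record and the function returns 1.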
Errors
Errors can occur if your run contains a mergedata statement and either the main data file or the file of supplementary data for merging has records with duplicate keys or records that are out of sequence. In some cases the run is also canceled after all data has been read, when a complete error report is available. The following table lists the situations when duplicate or out of sequence data may occur and shows what happens to your job.
Circumstance | Message | Run canceled?
read=0 and the main data file contains records with duplicate keys | WARNING: FILE name CONTAINS DUPLICATES IN key_field | No
read=2 and the main data file contains records with duplicate keys | WARNING: FILE name CONTAINS DUPLICATES IN key_field | Yes
read=0 or read=2 and the supplementary data file contains records with duplicate keys | WARNING: FILE name CONTAINS DUPLICATES IN key_field | Yes
read=0 or read=2 and records in the main data file are out of sequence | WARNING: FILE name OUT OF SEQUENCE IN key_field | Yes
read=0 or read=2 and records in the supplementary data file are out of sequence | WARNING: FILE name OUT OF SEQUENCE IN key_field | Yes
6.10 Multicard records of more than 100 columns per card
Occasionally you may have multicard records in which each card contains more than 100 columns. To process this data, Quantum extends the width of the C array to 10 rows of 1,000 cells each (that is, 10,000 cells in all) when a struct statement with reclen>100 is present. Data is read into c(1001,2000) for card 1, c(2001,3000) for card 2, and so on. The last three digits are used for the column number and the other digits are used for the card number.
All other points mentioned previously for multicard records apply, but column numbers refer to the extended rather than the default C array. For example, in the default C array c(1,100) stores the first card of the next record, whereas in the extended C array this data is stored in c(1,1000).
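The cell numbering in the default and extended C arrays follows the same rule with a different row width. A minimal Python sketch of that arithmetic is shown here; the function name cell_number is hypothetical and the sketch only models the addressing described above.

```python
def cell_number(card, column, reclen=100):
    """Return the C array cell that holds a given column of a given card.

    Rows are 100 cells wide by default and 1,000 cells wide when reclen
    is greater than 100, as described for the extended C array.
    """
    if not 1 <= column <= reclen:
        raise ValueError("column lies outside the card")
    width = 1000 if reclen > 100 else 100
    return card * width + column


print(cell_number(3, 8))              # default array: card 3, column 8
print(cell_number(3, 8, reclen=150))  # extended array: card 3, column 8
```

In the default array, column 8 of card 3 is cell 308; with reclen=150 the same column is cell 3008.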
6.11 Reading non-standard data files
Occasionally you may have to process data which does not come in the standard formats described in this chapter. For instance, records may be strung out one after the other without being separated by a new-line character. Quantum provides limited facilities for reading non-standard data.
☞ For further details, see ‘Reading non-standard data files’ in chapter 10, ‘Include and substitution’ of the Quantum User’s Guide Volume 2.
7 Writing out data
There are three ways of writing out your data once it has been read into the C array. You may:
• Create a new data file.
• Copy records to a print file.
• Write information to a report file.
Data and print files are both accessed by the write statement, but the exact format of the statement varies according to the type of file and the information being written. You write to report files using the report statement.
7.1 Print files
Print files are printouts of records or parts of records with headings, descriptive texts and page numbers. They cannot be used as data for subsequent Quantum runs.

Printing out individual records
Quick Reference
To write a record or part of a record to a print file, type:
write [file_name] [field] [$text$]
If no file name is specified, the out2 print file is used.
The word write by itself prints out a whole record in the form it is when the write statement is executed, together with a ruler showing which codes fall in which columns, the line number of the record in the data file and the message ‘write’ indicating that the record was generated by a write statement. Any multicodes in the record are shown as asterisks, but you may change this with an option on the filedef statement.
☞ For information on the filedef statement, see section 7.4, ‘Defining the file type’.
Writing out data – Chapter 7 / 65
If the record contains more than one card, each card is listed separately beneath the ruler. For example, the statement:
write
by itself might give us:
Quantum edit report
1 in file ----+----1----+----2-- ... --9----+----0
columns 1 - 100 are |12345
write
2 in file ----+----1----+----2-- ... --9----+----0
columns 1 - 100 are |23456
write

Each write statement will produce a line in the default print file, out2, telling you how many records were written out, as follows:
2 (1%) write
Which cards are printed from multi-card records depends upon which cards have been read in so far. Quantum looks at the ‘allread’ variables and writes out cards for those which are true; so for example, if allread1, allread2 and allread3 are true, cards 1, 2 and 3 will be printed. If you have changed the contents of these variables prior to printing out the record, you will see the cards for which allread is true rather than those which were originally read.
The example above was very simple; more often than not your program will contain several write statements and you will want some way of identifying which records were printed by which statement and why. If the write is dependent upon some other statement (for instance, it is part of an if statement), the whole statement is printed underneath each record, thus:
67 in file ----+----1----+----2-- ... --9----+----0
columns 1 - 100 are |0015263-16*735 *837361 ... 79&
if (c14n'1/4') write

Here, as you can see, we are checking whether column 14 contains any of the codes 1 to 4. This record has been printed out because it contains a ‘5’ instead.
Sometimes it is more helpful to have an explanatory text printed instead of the statement itself. In this case all that is necessary is to follow the word write with the text to be printed enclosed in dollar signs:
if (c308n'1/5') write $c308 incorrect$
if (numb(c117,c118,c119).gt.3) write $too many choices$

might give us:
Quantum edit report
Record 17 51 in file ----+----1----+----2-- ... --9----+----0
columns 101 - 200 are |00170116548986131*46*1 ...
columns 201 - 300 are |0017026464515 875 ** ...
columns 301 - 400 are |0017031929-5897231 ...
c308 incorrect
too many choices
Record 32 94 in file ----+----1----+----2-- ... --9----+----0
columns 101 - 200 are |003201837021 **53798 ...
columns 201 - 300 are |0032021353452 763736 ...
columns 301 - 400 are |003203212 & ...
too many choices
Our first statement writes out all records in which column 308 does not contain any of the codes 1/5, and the second picks up all records having more than 3 codes in columns 117 to 119.
Normally all output from write goes to the default print file, and whenever the current record is written to this file, the variable printed_ becomes true. You may change the output file by following the word write with the name of the file to write to. For example:
write pfile $First Print$
writes to the file ‘pfile’, whereas:
write errors $Second Print$
writes to a file called ‘errors’. All files named on write statements must be defined on a filedef statement before they are used.
☞ For information on the filedef statement, see section 7.4, ‘Defining the file type’.
If two or more write statements apply to a single record, the record is printed out once in the state it was when the first applicable write was read, with all relevant write statements or texts listed below it. If a record satisfies two or more write statements which write to different files, Quantum writes the record out once for each statement, in the state it is when each write is executed.
✎ If you want to write out more than one field at a time, or to print more than one text, you can define those fields and/or texts on an ident statement. All write statements from that point on will then print those fields and texts.
☞ To find out more about ident, read section 7.5, ‘Default print parameters for write statements’.
Writing out parts of records
Often you will not want to write out the whole record, especially if it contains several cards. Therefore Quantum allows you to include a field specification in a write statement to print only selected portions of an incorrect record. For example:
if (c110'2'.and.c119'2') write c(110,120) $Married woman$
checks that columns 110 and 119 both contain a 2, and if so prints out columns 110 to 120 in the print file, followed by the text Married woman. If you are writing out fewer than ten columns, Quantum does not print a ruler above the codes.
If you are dealing with multi-card records, you may prefer to use this form of write to print only the card containing the error, rather than all cards in the record. If we take our previous example where we were checking the contents of column 308:
if (c308n'1/5') write $c308 incorrect$
prints all three cards in the record, whereas:
if (c308n'1/5') write c(301,380) $C308 incorrect$
prints only card 3.
To write selected parts of a record to a particular file the notation is:
write filename c(m,n) [$text$]
✎ The write statement can only write out information from the C array.
7.2 Data files
Quick Reference
To write records or fields to a data file, type:
write file_name [c(start_col, end_col)]

write may also be used to copy records to a data file. This is useful if you want to separate a particular card type from the rest of the data, or if you want to correct errors and save the corrected data in a new file for later tabulation. To write records to a data file the command is:
write filename
to write the whole record to the named file, or
write filename c(m,n)
to write columns m to n only.
If you use write in a levels job to write data to a new data file, the statement
write datafile
at any level will write out data for that level only. Additionally, if the write statement is inside an if clause, or a return statement is encountered, then only relevant data is written for that level. To write out data for all levels, you will need one write statement per level.
In all cases, records are written in the state they are when the write is executed, and all cards read in with the current read are copied; that is, all cards for which thisread is true. For instance, if thisread1, thisread2 and thisread3 are true, Quantum will write out cards 1, 2 and 3. To prevent any of these cards being written, you may set the appropriate variable to false (zero); therefore to write only card 1 of our three cards, we would write:
thisread2=0; thisread3=0
write newdat
Any number of writes to data files are allowed in the edit, and each one may write to a different file. Records written by write are normally as long as the record length defined with reclen on the struct statement. You may change this with len= on the filedef statement. The exception is where records end with blank columns. In this case Quantum ignores the blank columns. If you want to create a data file of fixed length records, and your data is single coded, you can use the reportn statement. If your data is multicoded you can convert it to single coded first by using the explode statement.
☞ For further information about explode, see ‘Converting multicoded data to single-coded data’ in chapter 13, ‘Using subroutines in the edit’.
If your data is multicoded and you need to preserve the multicodes, the only way of writing out fixed length records if the data currently has trailing blank columns is to insert a dummy code in the last column of those records.
Creating new cards
New cards can be created by copying information into spare columns of the C array. To save these as part of a new data file you will have to give each new card the same respondent serial number as the rest of the data in the array and a card type which may or may not be unique.
In the example below, we are moving some information from card 1 of a 2-card data set into a new card 3. The comments explain what each statement is doing.
/* Copy the data into the new card
c(310,341)=c(148,179)
/* Delete it from its original place
c(148,179)=$ $
/* Give it a serial number and card type
c(301,304)=c(101,104); c380'3'
/* Set thisread true for card 3
thisread3=1
/* Define pfil as a data file
filedef pfil data
/* Copy cards 1, 2 and 3 to pfil
write pfil
7.3 Writing to a report file
Quick Reference
To write information to a report file, type:
report[n] file_name variable_names
where variable_names is a comma-separated list of the variables and texts to print. Use reportn rather than just report to start a new line each time the statement is executed.
A report file is a special type of print file in which you can print out records, fields or variables in the format of your choice. To write information in a report file, use the report statement, as follows:
report filename parameters
where filename is the name of the file to be written to, and parameters define exactly what is to be written. Lines in a report may be up to 1024 characters long.
Report does not start a new line automatically at the end of each write, but you may tell it to do so by following the keyword report with the letter n:
reportn filename parameters
In both cases, the named file must be identified as a report file using a filedef statement, as described in section 7.4, ‘Defining the file type’.
The parameter list defines what is to be printed in the report file. It may contain variables, texts, and special characters representing tabs and spaces.
Data variables
Quick Reference
To print the contents of a data variable, type:
var_name
or
var_name(start,end)
To print the contents of a field, evaluated as an integer right-justified in a field of a given width, type:
var_name:field_width
To print the contents of every column in a field, even if they are multicoded or blank, type:
start:field_width
where start is the first position in the field. You may also use this notation to print fields whose contents evaluate to a value greater than the maximum integer value Quantum can deal with.

To print the contents of a data variable, type the variable’s name.
All data variables that are single coded are printed using as many positions as there are columns in the variable. For example, if the data is:
----+----4
511 538253
  2
  &
the statement:
report rfile c31,c35,c40
prints the contents of columns 31, 35 and 40 one after the other, as follows:
553
The statement:
report rfile c(35,40)
prints the contents of columns 35 to 40:
538253
In both the examples the last column of the field has contained a code. If the last column or columns of a field are blank, Quantum omits those columns when printing the contents of the field. (You can get round this by entering the field specification as start:field_width as described later in this section.)
A single data variable that is blank is printed as such, while a single data variable that is multicoded is printed as an asterisk. The statement:
report rfile c35, c34, c33, c32
creates the line:
5 *1
If a variable refers to a string that contains multicoded or blank columns, Quantum ignores the multicodes and blanks and evaluates the contents of the remaining columns as an integer. For example, using the data shown above, the statement:
report rfile c(31,40)
prints a line containing the value 51538253. The value starts in the first print position available.
✎ If the field you wish to print is very long, its contents may produce an incorrect value when evaluated as an integer (the maximum integer value which Quantum can deal with is 1,073,741,824). You can get round this by specifying the first column and the field width as described below.
If you want to see all columns in a field which contains blanks or multicodes, or you need to have the correct evaluation of a long field, you will need to deal with each column in the field separately. You could type each column number separately, but it is quicker just to specify the start column and the total number of columns you want to print starting at that column. The format for this type of reference is:
start:field_width
For instance, to print columns 31 to 40 you would type:
report rfil c31:10
The output from this command would be 51* 538253, the same as if you had typed each column number separately. As before, the data is printed starting in the first print position available.
You can use this alternative notation with field specifications too. In this instance Quantum will evaluate the contents of the field as an integer and will print the result right-justified in a field of the given width. If you type, for example:
report rfil c(31,40):10
Quantum will print the value 51538253 in positions 3 to 10 of a ten-position field. The first two positions will be blank. This notation is also useful if you need to create data files with fixed length records, and some records end with blank columns. Writing records to a data file preserves multicodes but ignores trailing blank columns. Writing to a report file allows you to create a single-coded data file with fixed length records. If your data is multicoded you will need to convert it to single-coded form before writing it out. You can do this by ‘exploding’ any multicodes into a field of single codes. You use the explode statement for this.
☞ For information on how to use explode, see ‘Converting multicoded data to single-coded data’ in chapter 13, ‘Using subroutines in the edit’.
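The difference between printing a field column by column and printing it as an evaluated integer can be modelled in a few lines. The Python sketch below is illustrative only: the function names are hypothetical, a multicoded column is represented by an asterisk, and returning an empty string for a field with no digits is a placeholder choice, since the text does not say what Quantum prints in that case.

```python
def print_columns(columns):
    """Model c31:10, which prints one character for every column in the field."""
    return "".join(columns)


def field_as_integer(columns, width=None):
    """Model c(31,40) and c(31,40):10.

    Blank and multicoded columns are ignored and the remaining digits are
    evaluated as an integer, optionally right-justified in `width` positions.
    """
    digits = "".join(ch for ch in columns if ch.isdigit())
    text = str(int(digits)) if digits else ""  # empty result is an assumption
    return text.rjust(width) if width else text


cols = list("51* 538253")  # columns 31 to 40; * marks the multicoded column
print(repr(print_columns(cols)))
print(repr(field_as_integer(cols)))
print(repr(field_as_integer(cols, 10)))
```

For the sample data the three calls give 51* 538253, 51538253, and 51538253 preceded by two blanks, which agrees with the outputs described for c31:10, c(31,40) and c(31,40):10.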
Once your data is in single-coded form you can then write the whole record out to a report file using a reportn statement as follows:
reportn repdata c101:80
reportn repdata c201:80
Integer variables
Quick Reference
To print the contents of an integer variable, type:
var_name[:field_width]
If the report statement names a variable by itself, Quantum prints the variable’s value starting in the first print position available. If the specification includes a field width, Quantum prints the variable’s value right-justified in a field of the given width. Any extra columns on the left of the field width are shown as blanks.

To print the value of an integer variable, type:
var_name[:field_width]
If you type the variable name by itself, without a field width, Quantum prints it left-justified starting in the first available position on the line. If you would prefer values to be printed right-justified, follow the variable name with a colon and a field width. Quantum will then print all values for that variable right-justified in a field of the width you have given. For example:
report rfile codenums:5
prints the values of the variable called codenums right-justified in a field five positions wide. Values that are shorter than five characters are padded on the left with blanks.
Real variables
Quick Reference
To print the value of a real variable, type:
var_name[:field_width.dec_places]
where field_width is the width of the field in which the values are to be printed (values are right-justified and padded on the left with blanks if necessary) and dec_places is the number of decimal places to be shown for each number. If you omit these parameters, Quantum prints the values starting in the first available print position and with six decimal places.

To print the value of a real variable, type:
var_name[:field_width.dec_places]
If you type the variable name by itself, without a field width and a number of decimal places, Quantum will print the variable’s value with six decimal places and starting in the first print position available. You can control the layout by defining a field width and the number of decimal places required. For example, by typing:
report rfil cost:6.2
you can create a neat column of figures all with two decimal places and all right-justified in a field six characters wide.
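The layouts produced by var_name:field_width and var_name:field_width.dec_places correspond to ordinary right-justified formatting. The Python sketch below shows the equivalent padding; the function names are hypothetical, and the rounding rule and the treatment of values wider than the field are Python's, which may differ from Quantum's.

```python
def format_integer(value, width=None):
    """Model var:width, left-justified when no width is given."""
    text = str(value)
    return text.rjust(width) if width else text


def format_real(value, width=None, dec_places=6):
    """Model var:width.dec_places, with six decimal places by default."""
    text = f"{value:.{dec_places}f}"
    return text.rjust(width) if width else text


print(repr(format_integer(42, 5)))   # like codenums:5
print(repr(format_real(3.5)))        # like cost with no parameters
print(repr(format_real(3.5, 6, 2)))  # like cost:6.2
```

The value 42 in a five-position field is padded with three blanks on the left, and 3.5 printed as cost:6.2 occupies six positions as two blanks followed by 3.50.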
Text and white space
Quick Reference
To print text, type:
$text$
To create blank space on a line, type:
[number]x
to leave the given number of spaces (the default is one), or:
print_pos followed by the letter t (for example, 25t)
to tab to the given position on the line.

Most reports require some sort of text or spacing on the line, either on the same line as the values or on lines by themselves to create titles, column headings, and the like. To print text on a report, type:
$text$
The text may contain spaces.
To print spaces between the values on a line, you can use either spaces or tabs. To print a given number of spaces between one value and the next, type:
[number]x
where number is the number of spaces required. The default is one space.
If you are producing tabular or columnar output you’ll probably find tabs are more useful for creating blank space since they allow you to skip to a particular print position on the line. For example, typing:
25t
takes you directly to position 25 on the line, regardless of the current print position. Compare this with 25x which moves you 25 positions on from your current position.
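The contrast between relative spacing (x) and absolute tabbing (t) can be sketched as a small line builder. This Python model is an illustration only: build_line and its item tuples are invented for the example, and leaving the line unchanged when a tab position has already been passed is an assumption, since the text does not describe that case.

```python
def build_line(items):
    """Build one report line from ('text', s), ('x', n) and ('t', pos) items.

    'x' adds n spaces after the current position; 't' pads the line so that
    the next item starts at print position pos (positions count from 1).
    """
    line = ""
    for kind, value in items:
        if kind == "text":
            line += value
        elif kind == "x":
            line += " " * value
        elif kind == "t":
            line = line.ljust(value - 1)  # no effect if already past pos
        else:
            raise ValueError(f"unknown item kind: {kind}")
    return line


line = build_line([("t", 20), ("text", "Bought Brand A"), ("x", 1),
                   ("text", str(5).rjust(3)), ("x", 1), ("text", "times")])
print(repr(line))
```

Starting the list with a tab to position 20 leaves 19 blanks before the first text, whereas the x items only add space relative to wherever the previous item ended.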
Examples
Here are some examples of report statements:
reportn summary 20t,$Bought Brand A$,1x,brda:3,1x,$times$
prints a report of the form:
                   Bought Brand A   5 times
                   Bought Brand A   9 times
                   Bought Brand A  13 times
in a file called summary. Printing starts in position (column) 20 because we started the parameter list with the keyword 20t. The variable brda is an integer variable whose value is to be right-justified in a field three columns wide. Notice also how we have inserted spaces between the texts and the value of brda.
The statements:
/* only print title if this is the first record in the data file
if (.not. rchk)
+reportn yogurt 30t,$Serial Numbers for Yogurt Buyers$
if (c119'1') reportn yogurt c(1,4)
.
rchk = 1
produce a report showing the serial numbers of all respondents who buy yogurt. As you can see, we have given our report a title.
As a final example, let’s look at the difference between printing a field of columns all in one go and printing them one at a time. If our data is:
+----4----+
  18  036
   &   /
       7
the statement:
reportn test $c(37,43) is $,c(37,43)
will produce the line:
c(37,43) is 106
However, if we deal with each column separately: reportn test $c(37,43) is $,c37,c38,c39,c40,c41,c42,c43
will report that: c(37,43) is 1* 0*6
✎ You cannot write information to the standard print file (usually called out2) using report. To do this use the function qfprnt.
☞ For information about qfprnt, see section 7.6, ‘Writing out data in a user-defined format’.
7.4 Defining the file type Quick Reference To define a report file, type: filedef filename[=pathname] report [len=rec_len] To define a data file, type: filedef filename[=pathname] data [len=rec_len] To define a print file, type: filedef filename[=pathname] print [mpa][mpd][mpe][len=n][norule][noser][$text$] where mpa, mpd and mpe indicate that multipunches should be printed across the page, down the page, or as an asterisk and then listed below the record.
All files named on write and report statements must be defined by a filedef statement before they are used. This tells Quantum whether the file is a report, print or data file, and defines more specifically how the output should be written. So that you can be sure that all filenames will be recognized, you are advised to place all filedef statements at the beginning of the edit. For report files, the definition is: filedef filename[=pathname] report [len=rec_len] where filename is the name of the report file and report is a mandatory keyword indicating that the file is a report file.
✎ If you are writing out more than 200 characters to a report file, you need to set len= on the filedef statement to more than 200 to ensure that no lines are truncated.
Quantum normally creates report files in the main project directory. If you want the report file to be created in a different directory, follow the filename with =pathname. When specifying a pathname, the filename acts as a short-hand reference (tag). This means that you still have to tell Quantum the filename by appending it to the pathname. For example, to declare a report file called repfile1 that is to be created in the directory /home/ben, you would write: filedef repfile1=/home/ben/repfile1 report
✎ The maximum length for filename is set at 31 characters. For data files, the file definition statement is: filedef filename[=pathname] data [len=rec_len] where filename is the name of the output file and data is a mandatory keyword indicating that the named file is a data file. As with report files you may use the optional =pathname parameter to name the directory in which the data file should be created. All records written to data files are as long as the record length defined with reclen on the struct statement. If you wish to change this, add the option len=rec_len to the filedef statement, thus: filedef newdat1 data len=80
This example says that records written to the data file newdat1 must be 80 columns long. The file definition statement for print files is: filedef filename[=pathname] print options where filename is the name of the print file with an optional pathname, print is a mandatory keyword indicating that the file is a printout file, and options is a list of optional keywords defining more specifically how the records should be written. Filename lengths are as described above for data files.
The options are: len=n
Length of output record if different from reclen= on the struct statement.
$text$
Heading text to be printed at the top of each page.
mpa
Prints the codes in a multicode across the page enclosed in curly brackets. For example:
000401 635495{134}45111
Here, we have a multicode of ‘134’. The ruler is of little use when multicodes are printed in this manner, so you may prefer to suppress it with the option norule.
mpd
Prints the codes in a multicode down the page, thus:
----+----1----+----2
000401 635495145111
             3
             4
mpe
Prints multicodes as an asterisk, but lists the individual codes within each multicode beneath the record. For example:
----+----1----+----2
000401 635495*45111
Column 14 contains codes 134
norule
Turns off the ruler.
noser
Prevents the messages ‘Record nnn’ and ‘n in File’ from being printed.
The default output file is a print file called out2, and the default output style is as described above. To change the output style for this (for example, to suppress the ruler or print multicodes in a different format), simply use a filedef statement naming this file and giving the appropriate options from the list above: filedef out2 print norule mpe
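To make the three multicode styles easier to compare, here is a minimal Python sketch that renders the sample record from the option list in each style. It is an illustration only, not Quantum code: the function render and the representation of a record as a list of code strings (one string per column) are assumptions made for this example, and the ruler and serial-number messages are left out.

```python
def render(record, style):
    """Return the output lines for a record in the given multicode style.

    `record` is a list with one string per column; a string holding more
    than one code represents a multicode.
    """
    if style == "mpa":
        # Across the page: multicodes enclosed in curly brackets.
        return ["".join(c if len(c) == 1 else "{" + c + "}" for c in record)]
    if style == "mpd":
        # Down the page: the first code of every column on the record line,
        # further codes on the lines beneath, in the same column.
        depth = max(len(c) for c in record)
        return [
            "".join(c[row] if row < len(c) else " " for c in record).rstrip()
            for row in range(depth)
        ]
    if style == "mpe":
        # Asterisk in the record, then one explanatory line per multicode.
        lines = ["".join(c if len(c) == 1 else "*" for c in record)]
        for position, codes in enumerate(record, start=1):
            if len(codes) > 1:
                lines.append(f"Column {position} contains codes {codes}")
        return lines
    raise ValueError(f"unknown style: {style}")


# The sample record: column 14 holds the multicode 134.
record = list("000401 635495") + ["134"] + list("45111")
for style in ("mpa", "mpd", "mpe"):
    print(style)
    print("\n".join(render(record, style)))
```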
7.5 Default print parameters for write statements Quick Reference To define default print parameters for write statements, type: ident[±] [$text$] [,variable_name] [,variable_name, …] Any number of texts, variable names and fields are allowed. Items are printed in the order they are listed. To turn off ident defaults and return to the standard write behavior, type: noident
The ident statement gives you increased control over the content of the print file by allowing you to print more than one field of columns and one text per write statement. The format of this statement is: ident[±] [$text$] [,variable_name] [,variable_name, …] Each ident statement may contain any number of texts, variable names and columns as long as each one is separated from the others by a comma. The order in which you define items with this statement controls the order in which they will be printed. For example, if you type: ident $bad film code$, c(1,10) if (films0 .gt. 0) write $check c(1,6)$
and Quantum finds a record which fails this test, it will print the following: bad film code Column c(1,10) is |----+----| 040506 check c(1,6)
Notice that the text defined with ident does not replace the text given with write. If you do not define a message on the write statement, Quantum will print the complete statement as it usually does. In this example there is not much difference between using ident and writing the test as: if (films0 .gt. 0) write c(1,10) $bad film code - check c(1,6)$
The real power comes when you want to write out more than one field and/or text per write statement, or if you want to write out the values of data, integer or real variables. For example, if you type: ident t1, t2, t3 write
Quantum will write messages of the form:
t(1) is 10
t(2) is 15
t(3) is 20
in the print file (the values reported will, of course, be the values of the variables as they are in your run).
✎ In ident statements you can refer to a field of adjacent entries in a data variable array by specifying the first and last entries. For example, you can specify c(1,12) to refer to columns 1 through 12 of the C array. However, as in most other Quantum statements, you cannot use this syntax for other types of variable, such as integer arrays. So the example above must be specified as: ident t1, t2, t3
and not as: ident t(1,3)
You can combine texts, columns and variable names. The statements: ident $Bad film code$, c(1,10), films0, films1, films2, films3 if (films0 .gt. 0) write
might print:
Bad film code
Column c(1,10) is |----+----| 010209
if (films0 .gt. 0) write
films(0) is 1
films(1) is 1
films(2) is 1
films(3) is 0
You could use this type of output for checking records which may be incorrectly coded for use with field and bit statements.
☞ For information about field, see section 8.6, ‘Reading numeric codes into an array’. For information about bit, see section 4.4, ‘Responses with numeric codes: bit’ in the Quantum User’s Guide Volume 2.
When ident writes out data variables, it prints the data according to the specification on the filedef statement for the file to which you are writing the data. If the filedef statement includes the keyword norule to suppress the ruler, the data is written out without a ruler; otherwise the ruler is always printed above the data, as in the previous example. You can alter this behavior without having to respecify the filedef command by typing a + or - sign at the end of the ident keyword. If filedef normally requests a ruler, type: ident- data variables to print the listed variables without a ruler. If filedef normally suppresses the ruler, type: ident+ data variables to print the variables with a ruler. To switch off ident and revert to the standard write behavior, type: noident
7.6 Writing out data in a user-defined format Quick Reference To write data to the standard print file (usually called out2) in a format of your choice, type: call qfprnt(0, $format$, variables) where format defines the format in which the data is to be written and the data types of the variables used. variables is a comma-separated list of the variables to be written out. Variables must be listed in the order they are used in the format statement. The format string consists of optional text interspersed with references to variables in the list: %num_posi
Print an integer variable in the next num_pos positions on the line. If the variable has a negative value the value is printed starting with a minus sign.
%num_pos.dec_plr
Print a real variable in the next num_pos positions on the line and with dec_pl decimal places. The number of print positions must allow for the required number of decimal places and a decimal point.
%num_colc
Print num_col columns starting with the column whose name or number appears in the variable list. Columns are printed as texts not punch codes; that is, multicodes are converted to letters where possible.
%numberb
Print number blank spaces.
write and report are both powerful statements for writing out data, but they do have limitations which you may find restrictive in some circumstances. The write statement lets you write data out to a print file, including the standard print file (usually called out2), but it always writes the data in a fixed format that you cannot change. The report statement lets you write out data and text in any format you like, but only to a report file. You cannot write to a print file with report. The qfprnt function brings together the functionality of write and report by writing text and data to the standard print file in a format of your choice. To use it, type: call qfprnt(0, $format$, variables) where format defines the format in which the data is to be written and the data types of the variables used. variables is a comma-separated list of the variables to be written out. Variables must be listed in the order they are used in the format statement. Here is a simple example to start with: call qfprnt(0,$Number of products tested is: %2i$,t1)
If the respondent tested five products this statement will appear in the standard print file as: Number of products tested is: _5
The underscore character in front of the 5 represents a space and appears as such in the print file. We’ll explain why we have printed it here shortly. First, let’s look at the qfprnt statement itself. The format section of the statement consists of text to be printed exactly as it is written and references to variables whose values are to be substituted in the text at the given points. In this example we are writing out the value of the numeric (integer) variable t1. The variable is named in the variable list section of the statement and is represented by the characters %2i in the format section. There are three parts to the variable’s reference. The % sign signals to Quantum that it has reached a variable reference: all references start with a % sign. The i says that the variable is an integer variable and the 2 says how many print positions to reserve for printing this variable. In the example two positions are reserved for printing the value of t1, but since the value of t1 is only 5, Quantum prints the value on the right of the reserved space and fills the remaining positions with spaces. In the sample output we have used an underscore to represent this space. Here is another example using two integer variables: call qfprnt(0,$Record %4i tested %2i products$,recnum,t1)
This produces lines of the form: Record 1004 tested _5 products
As before, the underscore represents a space used to pad a value to the full field width. This qfprnt statement produces the correct results because the variables are in the same order as their references in the format section. This is your responsibility. As long as a variable has the same type as the reference in the corresponding position in the format section, Quantum will print its value at that point in the statement. So, if we had written: call qfprnt(0,$Record %4i tested %2i products$,t1,recnum)
Quantum would write out: Record ___5 tested ** products
As you can see, Quantum does not increase the number of print positions to accommodate the value it needs to print. Instead, it prints asterisks. In this example, the asterisks would alert you to the fact that there is something wrong with the qfprnt specification, but this would not always be so. More often than not you’ll be printing positive values. If Quantum needs to print a negative number, it prints the minus sign directly in front of the first digit, just as you would write it manually.
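The behavior of an integer reference such as %4i or %2i can be mimicked in a few lines of Python. This is only a sketch of the rules described above (right-justify in the reserved positions, print asterisks when the value does not fit); format_int and record_line are hypothetical helper names, not part of Quantum.

```python
def format_int(value, width):
    """Right-justify an integer in `width` positions, or fill with asterisks."""
    text = str(value)  # a negative value carries its own leading minus sign
    if len(text) > width:
        return "*" * width  # the field is never widened to fit the value
    return text.rjust(width)


def record_line(recnum, tested):
    """Imitate the format $Record %4i tested %2i products$."""
    return f"Record {format_int(recnum, 4)} tested {format_int(tested, 2)} products"


print(record_line(1004, 5))  # Record 1004 tested  5 products
print(record_line(5, 1004))  # Record    5 tested ** products
```

The second call passes the two values in the wrong order, which reproduces the asterisks shown in the example above.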
Besides integer variables, you can also print real variables, columns or fields of columns and blank strings. You use a reference similar to the one you’ve seen for integer variables. To print a real variable, type: %num_pos.dec_plr where num_pos is the number of print positions required and dec_pl is the number of decimal places. As an example, the statement: call qfprnt(0,$%5.2r liters bought$,liters)
prints the value of the real variable called liters in a field 5 positions wide. The value is printed with two decimal places so, allowing for the decimal point, the maximum value that can be printed is 99.99:
15.27 liters bought
 9.01 liters bought
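The arithmetic behind a real reference such as %5.2r can be sketched in Python as follows. The helper format_real is a hypothetical name for this illustration. Two points are assumptions rather than documented behavior: that a value needing more decimal places is rounded (Python's default), and that a value too large for the field is shown as asterisks in the same way as an integer.

```python
def format_real(value, width, decimals):
    """Print a real value in `width` positions with `decimals` decimal places."""
    text = f"{value:.{decimals}f}"  # assumption: Python rounding, not Quantum's
    if len(text) > width:
        return "*" * width  # assumption: same overflow rule as for integers
    return text.rjust(width)


print(format_real(15.27, 5, 2) + " liters bought")
print(format_real(9.01, 5, 2) + " liters bought")
# Five positions minus the decimal point and two decimals leaves two
# positions for the whole-number part, hence the 99.99 maximum.
print(format_real(99.99, 5, 2))
```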
Quantum can also print the text values of a column, a field of columns or a data variable. By this we mean that Quantum converts multicodes to letters or other keyboard characters before printing them. Multicodes that do not correspond to letters or characters are printed as asterisks. For example, the multicode ‘&1’ translates into the letter A and would be printed as such; the multicode ‘&123’ is simply a collection of codes and would therefore be printed as an asterisk. To print single columns, type: %numberc in the format section, where number is the number of print positions required, and the name of a single column in the corresponding position in the variable list. Quantum will then print number columns starting at the named column. For example: call qfprnt(0,$Record %4c tested %2i products$,c1,t1)
might produce: Record 1234 tested  5 products
Now suppose that the data is: ----+---2 9462&5736 5 1 8 9
The statement: call qfprnt(0,$Columns 11 to 20 are: %10c$,c11)
reports the following: Columns 11 to 20 are: 9*62A5*36
To print strings of blanks, type: %numberb where number is the number of blanks you want. You’ll find this useful if you want to indent lines or print values in columns.
8 Changing the contents of a variable This chapter describes how to assign values to variables and the statements emit, delete and priority, all of which may be used to alter the contents of a variable. Emit, delete and priority are used only with columns whereas assignment statements can deal with character, integer and real variables. When we say that these statements change the contents of a column we mean that they change the contents of that column as it exists during the run: at no time do they change the corresponding column in the data file.
8.1 Assignment statements An assignment statement normally means ‘put the specified information into the given variable overwriting anything already in that variable’. It can be used with any type of variable to perform any of the following tasks: •
To copy codes from one column into another.
•
To replace certain codes in one column with those from a second column.
•
To assign the value of an arithmetic expression to a variable.
•
To copy codes from groups of columns into another column using the logical operators and, or and xor.
In spite of the diversity of these functions the basic format of any assignment statement is: variable=item where item defines what is to be copied into the variable. Remember that comments can be identified by an uppercase C in column 1. If the first variable in your statement starts with a C, make sure that you type it in lower case otherwise the whole line will be read as a comment and will be ignored. For example: c(15,16)=$12$
is correct, but C(15,16)=$12$ is read as a comment even though the syntax is correct.
Alternatively, you may precede assignment statements with the word set, thus: set c(15,16)=$12$
In this manual we will omit set from all such statements.
Copying codes Quick Reference To copy codes into a single data variable, overwriting the variable’s original contents, type: variable=’codes’ To copy a string of codes into a field, type: var_name(start,end)=$codes$ To copy the contents of one variable or field into another, type: variable1 = variable2
Assignment statements are most commonly used to copy codes into a column or to copy the contents of one variable into another. For instance: c121=’159’ c121=c134
In the first example we are copying the codes 1, 5 and 9 into column 121 overwriting whatever is already there. The second example copies everything in column 134 into column 121, again overwriting what was originally there. Column 134 remains unchanged. You can also copy strings of characters into fields of columns. Let’s say we want to copy the code 59642 into columns 76 to 80 of card 3; we would write: c(376,380)=$59642$
Notice that the characters to be copied into the array are enclosed in dollar signs as is the rule when dealing with strings. If you need to use a semicolon in a string, you must type it as: \; Quantum uses a semicolon to mark the end of a statement, and will issue an error message if it finds a semicolon by itself in the middle of a string. The backslash in front of the semicolon tells Quantum to read the next character as an ordinary character with no special meaning. For example: c(376,380)=$59\;42$
inserts 59 in c(376,377), a semicolon in column 378, and 42 in c(379,380).
When characters are being copied into columns, the equals sign may be omitted:
c10’4’ is the same as c10=’4’
c(11,14)$6353$ is the same as c(11,14)=$6353$
Just as the contents of a single column can be copied into another, so the contents of one field can be copied into another field. For example: c(10,19)=c(70,79)
or
c(20,22)=c(45,47)
copies the contents of c(70,79) into c(10,19) and the contents of c(45,47) into c(20,22), in both cases overwriting the original contents of those columns. Data variables in assignment statements may be subscripted. The following are valid: c(t1)=c145 c(178,180)=c(t4,t5) c(t3,t5)=c(t10,t10+2)
The equals sign in these kinds of operation may not be omitted. When subscripting columns, remember that the current values of the integer variables will be substituted in the expression before the statement itself is executed. If t3=120 and t10=240, the statement: c(t3,t3+2)=c(t10,t10+2)
means: c(120,122)=c(240,242)
Generally you will know how many columns are required to hold the information they will receive, but this is not always the case. What if the field on the left of the equals sign is longer than the string to be copied into it? Quantum always copies a string starting with the right-most column and transferring it into the right-most column of the field. It continues in this way until all characters have been copied, then if there are still columns left in the field they are reset to blanks. When strings are copied in this way they are called ‘right-justified and blank-padded’.
Let’s clarify this with a couple of examples. Suppose we have:
----+----9--   ...   ---4----+----5
    100                  84635
and we enter: c(241,245)=c(185,187)
We will then have:
----+----9--   ...   ---4----+----5
    100                    100
If there are fewer characters than there are columns in the field, the characters are right-justified in the field with the remaining columns set to blanks. If the reverse is true, and there are more characters than there are columns in the field, the error message ‘Attempt to set too many columns into too few columns’ is issued. Columns in assignment statements may overlap; for instance: c(145,150)=c(143,148)
copies the contents of columns 143 to 148 into columns 145 to 150, so:
----+----5
  83645902
becomes
----+----5
  83836459
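The copying rules in this section (right-justify, blank-pad, reject a source that is too long, and allow overlapping fields) can be checked with a small Python model. This is a sketch for illustration, not how Quantum is implemented: the Record class and its get, put and copy methods are hypothetical, and columns are numbered from 1 as in the manual.

```python
class Record:
    """A record modelled as a list of single-character columns (1-based)."""

    def __init__(self, width):
        self.cols = [" "] * width

    def get(self, start, end):
        return "".join(self.cols[start - 1:end])

    def put(self, start, end, text):
        width = end - start + 1
        if len(text) > width:
            raise ValueError("Attempt to set too many columns into too few columns")
        # Right-justify the characters and blank-pad on the left.
        self.cols[start - 1:end] = list(text.rjust(width))

    def copy(self, dst, src):
        # The source is read in full before anything is written, so the two
        # fields may overlap, as in the c(145,150)=c(143,148) example.
        self.put(dst[0], dst[1], self.get(src[0], src[1]))


rec = Record(300)
rec.put(185, 187, "100")
rec.put(241, 245, "84635")
rec.copy((241, 245), (185, 187))       # c(241,245)=c(185,187)
print(repr(rec.get(241, 245)))         # '  100'

rec.put(143, 150, "83645902")
rec.copy((145, 150), (143, 148))       # c(145,150)=c(143,148)
print(rec.get(143, 150))               # 83836459
```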
When a field is set to blanks it is never wrong to type in as many blanks (enclosed in dollar signs) as there are columns in the field, but it is much quicker and more efficient to type, say: c(301,380)=$ $
Partial column replacement Quick Reference To replace a code or set of codes in one data variable with a code or set of codes in a second data variable, type: variable1’codes1’=variable2’codes2’ codes1 and codes2 must contain the same number of codes, and the codes must be in superimposable order (e.g., ‘123’ and ‘456’, but not ‘123’ and ‘135’).
Assignment statements are also used to replace parts of one column with those of another, leaving the remaining contents of that column intact. Note that this is the only time that assignment does not overwrite everything in the recipient variable. Let’s start with a simple example. Suppose we have: ----+----3 3 / 7
...
----+----6 6 / 8
and we want column 124 to contain a ‘1’ only if column 159 contains a ‘7’. We would write: c124’1’=c159’7’
Once this statement has been executed, we will have: ----+----3 1 3 / 7
...
----+----6 6 / 8
However, if we wrote: c124’3’=c159’3’
meaning that c124 should only contain a ‘3’ if c159 contains a ‘3’, Quantum would give us: ----+----3 4 / 7
...
----+----6 6 / 8
As you can see, the ‘3’ in c124 has been deleted because there is no ‘3’ in c159. Both examples could equally well be written using if, else, emit and delete, but an assignment statement is much more efficient when you have a set of codes to check for.
☞ For further information about if, see section 9.1, ‘Statements of condition – if’. For further information about else, see section 9.2, ‘Statements of condition – else’. For further information about emit, see section 8.2, ‘Adding codes into a column’. For further information about delete, see section 8.3, ‘Deleting codes from a column’.
Let’s say we type: c10’123’=c11’456’
and our data is: +----1----+ 14 35 4
Quantum will give us: +----1----+ 14 25 4
Column 10 contains a ‘1’ and a ‘2’ because c11 contains a ‘4’ and a ‘5’. The ‘3’ that was originally there has been removed because there was no ‘6’ in c11. The ‘4’ in column 10 remains untouched because it has no corresponding code in c11. Partial assignment need not have different column numbers either side of the equals sign. Quantum accepts statements of the form: c127’0/3’ = c127’1/4’
which can be used for recoding incorrectly coded data. The example we have used will recode a ‘0’ in column 127 as a ‘1’, a ‘1’ in column 127 as a ‘2’, and so on. When entering codes with this type of statement, make sure that there are the same number of codes on either side of the equals sign and that they are in the same relative positions in the order &-0123456789. In the previous example we used ‘123’ and ‘456’. We could also have used ‘&-0’, ‘789’ or ‘234’ instead of ‘456’, to name but a few alternatives. The important thing is that the two groups follow the same pattern: if the first set names alternate codes (for example, ‘1357’) then so must the second (for example, ‘&024’). The following statements are valid: c21’&-0’=c92’456’ c21’05’=c86’49’
these are not: c56’ 0’=c91’15’ c78’123’=c81’367’
The statement for columns 56 and 91 is incorrect because blank is not a valid code here; the statement for columns 78 and 81 is wrong because the codes ‘367’ cannot be superimposed on ‘123’ (either 345 or 567 would be correct).
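The superimposability rule and the effect of a partial replacement can be expressed compactly in Python. The sketch below is an illustration only: superimposable and partial_assign are hypothetical names, a column is modelled as a set of single-character codes, and code ranges such as ‘0/3’ are not expanded. It assumes that each code on the left is switched on or off according to whether its counterpart on the right is present, which is how the c10’123’=c11’456’ example behaves.

```python
CODE_ORDER = "&-0123456789"  # the order quoted in the text


def superimposable(codes1, codes2):
    """True if the two code lists follow the same pattern in CODE_ORDER."""
    if not codes1 or len(codes1) != len(codes2):
        return False
    if any(code not in CODE_ORDER for code in codes1 + codes2):
        return False  # for example, blank is not a valid code here
    shifts = {
        CODE_ORDER.index(b) - CODE_ORDER.index(a) for a, b in zip(codes1, codes2)
    }
    return len(shifts) == 1  # every pair is the same distance apart


def partial_assign(target, codes1, source, codes2):
    """Model target'codes1' = source'codes2' on sets of codes."""
    if not superimposable(codes1, codes2):
        raise ValueError("code lists cannot be superimposed")
    result = set(target)
    for left, right in zip(codes1, codes2):
        if right in source:
            result.add(left)      # counterpart present: code is set
        else:
            result.discard(left)  # counterpart absent: code is removed
    return result                 # codes not named in codes1 are untouched


# c10'123'=c11'456' with c10 holding 1, 3, 4 and c11 holding 4, 5
print(sorted(partial_assign({"1", "3", "4"}, "123", {"4", "5"}, "456")))
print(superimposable("05", "49"), superimposable("123", "367"))
```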
Storing arithmetic values Quick Reference To store the value of an arithmetic expression in a variable, type: variable = expression To copy a real value into a data variable, type: var_name(start,end) :dp = expression where dp is the number of decimal places required.
In many of your Quantum programs you will need to save the result of some arithmetic expression in a variable. The variable may be a column or an integer or real variable and the arithmetic information may be the contents of a column, integer or real variable, an integer or real number, or the results of the functions numb or random. It can also include arithmetic expressions which have been manipulated using the arithmetic operators +, -, / and *. Here are some examples to start with:
var1=100
/* Next statement expects that variable ntim is < 10
c135=ntim
/* In next example, if c31’5678’, variable np=4
np=numb(c31)
/* Increment rect (record total) by 1 for each record processed
rect=rect+1
Copying a number into an integer or real variable is easy because the variable has no predetermined size; that is, Quantum does not say that such variables may only store numbers of up to, say, three digits. Integer variables can store any whole number in the range -2,147,483,648 to +2,147,483,647 and real variables may take values of any magnitude with six-digit accuracy. Suppose our questionnaire tells us how many pints of milk a respondent bought and we want to save this in an integer variable called npt. Here’s what we might write: npt=c(125,126)
Similarly, if we know how many miles the respondent travels to work each day, and we want to convert this to kilometers, we could save the conversion in a real variable called km: km=c(213,214) * 1.609
If the respondent travels 5 miles, km will have the value 8.045, but if he or she travels 9 miles, km would be 14.481. The main difference between the two examples is the type of variable in which the results are saved. The number of pints bought will always be a whole number so we save it in an integer variable, whereas the conversion from miles to kilometers is likely to produce a real number so we save it in a real variable.
Real and integer variables
When copying a real value into an integer variable or vice versa, remember that the accuracy of the result depends upon the type of variable in which the value is saved. Real values saved in integer variables are truncated at the decimal point, thus:
t1=2.5 + 3.4 gives t1=5
but integer values placed in a real variable are saved as reals with decimal places and accuracy to 6 significant figures:
x1=1 + 7 gives x1=8.0
Integer variables are often used to count the number of respondents having a specific characteristic. For instance, to count the number of respondents holidaying at home and the number taking holidays abroad we can say, /* Home is c113’1’; abroad is c113’2’; both is c113’12’ if (c113’1’) home=home+1 if (c113’2’) abroad=abroad+1
☞ This example uses the if statement that is described in chapter 9, ‘Flow control’.
Whenever a record is read with c113’1’, the variable home will be incremented by one and whenever a record is read with c113’2’ the variable abroad will be increased by 1. Let’s say we have five respondents who took the following holidays:
Respondent 1   Home              c113’1’
Respondent 2   Home and abroad   c113’12’
Respondent 3   Home              c113’1’
Respondent 4   No holiday        c113’ ’
Respondent 5   Abroad            c113’2’
At the start of the run, the variables home and abroad are both zero. After these records have been processed, home will equal 3 and abroad will be 2. The person unlucky enough to have no holiday at all will be ignored. In the example above we were accumulating information about holiday habits for all respondents together, but on many occasions you will want to store information on a per respondent basis instead. Normally, integer and real variables are not reset between respondents, but all you need do to overcome this is to enter a statement at the start of your edit to reset the variable in question to zero each time a new record is read. For instance: home=0
☞ We will discuss in more detail the times when you might want to do this when we describe the do statement in section 9.5, ‘Loops’.
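The difference between a variable that accumulates over the whole run and one that is reset for each record can be seen in this short Python sketch. It models the five respondents listed earlier; the list records and the per-record counter taken are hypothetical names introduced for the illustration, and each string stands for the codes found in column 113.

```python
# Codes in c113 for respondents 1 to 5 (a blank means no holiday).
records = ["1", "12", "1", " ", "2"]

home = 0      # run totals: never reset between respondents
abroad = 0
per_record = []

for c113 in records:
    taken = 0  # reset at the start of every record, like home=0 in the edit
    if "1" in c113:
        home += 1
        taken += 1
    if "2" in c113:
        abroad += 1
        taken += 1
    per_record.append(taken)

print(home, abroad)  # 3 2
print(per_record)    # [1, 2, 1, 0, 1]
```

The run totals match the figures quoted above (home is 3 and abroad is 2), while the per-record counter starts again from zero for each respondent.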
Columns which contain single codes may be treated as a whole number. For instance, if our data is:
+----2----+
    4922
the statement: value=c(219,222)
will assign the value 4922 to value. If any of the columns are blank or multicoded in any way, they are ignored.
+----2----+
    49 2
and
+----2----+
    4912
      2
both give value=492.
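The rule that blank and multicoded columns are skipped when a field is read as a number can be sketched in Python. The function field_value is a hypothetical name, each column is modelled as a string of the codes it holds, and returning zero for a field with no usable digits is an assumption that the text does not confirm.

```python
def field_value(columns):
    """Read a field as a whole number, ignoring blank or multicoded columns."""
    digits = [codes for codes in columns if len(codes) == 1 and codes.isdigit()]
    # Assumption: a field with no single-coded digits is treated as zero.
    return int("".join(digits)) if digits else 0


print(field_value(["4", "9", "2", "2"]))   # 4922
print(field_value(["4", "9", " ", "2"]))   # 492 (third column blank)
print(field_value(["4", "9", "12", "2"]))  # 492 (third column multicoded)
```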
Columns Columns may also store arithmetic information, but unlike other variables they have a predefined size which means they can only store numbers of a certain size. For instance, c(1,10) can store numbers of up to ten digits whereas c(1,3) only stores numbers of up to three digits.
If the number is negative Quantum places the minus sign in the column immediately to the left of the first digit, but if there are no spare columns the first digit will be dropped and the minus sign placed in the left-hand column. If t5=-278, the statement: c(46,49)=t5
gives
4----+----5
      -278
but: c(47,49)=t5
yields
4----+----5
       -78
Note that this does not hold true for negative numbers whose length exceeds the field width by more than one character. Then, the number is copied into the field from the right and the minus sign and any excess digits are ignored. Thus, if t5=-1278, c(42,44) will contain the number 278. If the value to be saved has fewer digits than there are columns in the field, it will be right-justified in the field and the remaining columns padded with zeros. Here are some more examples:
/* Room to store values of t60 between -99 and +999
c(110,112)=t60
/* visits*4 should be between -999 and +9999, otherwise truncated
c(34,37)=visits*4
/* Result never truncated since maximum value is 81
c(10,11)=c7*c8
/* Total holidays taken
c(224,230)=home + abroad
/* Count the number of codes
pch=numb(c21,c22,c23’1/5’,c24’1/9’)
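The rules for storing a whole number in a field of columns can be collected into one Python function. This is a sketch under stated assumptions, not Quantum's implementation: store_int is a hypothetical name, an over-long positive value is assumed to keep its right-most digits (the text only says it is truncated), and the case of a negative value that leaves spare columns is deliberately not modelled because the text does not show how it is padded.

```python
def store_int(value, width):
    """Return the characters placed in a field `width` columns wide."""
    digits = str(abs(value))
    if value >= 0:
        # Right-justified and zero-padded. Assumption: when the value is too
        # long, the right-most digits are the ones that are kept.
        return digits[-width:].rjust(width, "0")
    if len(digits) + 1 == width:
        return "-" + digits          # sign immediately left of the first digit
    if len(digits) == width:
        return "-" + digits[1:]      # no spare column: first digit is dropped
    if len(digits) > width:
        return digits[-width:]       # sign and excess digits are ignored
    raise NotImplementedError(
        "padding of a short negative value is not described in the text"
    )


print(store_int(-278, 4))   # -278
print(store_int(-278, 3))   # -78
print(store_int(-1278, 3))  # 278
print(store_int(5, 3))      # 005
```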
When copying real numbers into columns, Quantum needs to know how many decimal places are required. This is done by following the variable with a colon and a digit defining the number of places. For example, if x5=10.22, the statement: cx(15,19):2=x5
results in:
----+----2---
    10.22
If the real number has more decimal places than we have allowed for, say 3 instead of 2, the extra decimal places will be ignored.
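Dropping extra decimal places, rather than rounding them, can be illustrated in Python with the decimal module. The function store_real is a hypothetical name for this sketch. Padding a short value with zeros follows the rule given earlier for whole numbers and is only assumed to apply to reals as well, and raising an error for a value that is too long is likewise an assumption.

```python
from decimal import Decimal, ROUND_DOWN


def store_real(value, width, decimals):
    """Return the characters for field:decimals = value, truncating decimals."""
    step = Decimal(10) ** -decimals  # 0.01 for two decimal places
    text = str(Decimal(str(value)).quantize(step, rounding=ROUND_DOWN))
    if len(text) > width:
        # Assumption: the text does not say what happens in this case.
        raise ValueError("value does not fit in the field")
    # Assumption: zero padding, as described for whole numbers.
    return text.rjust(width, "0")


print(store_real(10.22, 5, 2))   # 10.22
print(store_real(10.229, 5, 2))  # 10.22 (the third decimal place is ignored)
```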
Assignment with and, or and xor Quick Reference To copy codes which are present in at least one of a list of columns, type: data_var_name=or(cnum1[’codes1’], cnum2[’codes2’], ...) To copy codes which are present in all of a list of columns, type: data_var_name=and(cnum1[’codes1’], cnum2[’codes2’], ...) To copy codes which are present in only one of a list of columns, type: data_var_name=xor(cnum1[’codes1’], cnum2[’codes2’], ...) If any of these statements includes codes, only those codes are checked for; any unlisted codes are ignored.
The final type of assignment is copying codes from a set of columns. The codes copied depend upon the type of operator used:
    and    Copy codes present in all columns.
    or     Copy codes present in one or more columns.
    xor    Copy codes present in one column only.
The format of the statement is:
    column = operator(ca,cb,cc, ...)
where ca, cb, and cc are the columns whose codes are to be compared. Note that even if you are comparing codes in consecutive columns, each column must be identified separately, preceded by a c.
Suppose we have:
    ----+----4
          111
          /22
          453
          77
and we type:
    c181=and(c137,c138,c139)
we will then have:
    ----+----4 ... ---8----+
          111          1
          /22          2
          453
          77
Notice that even though the codes ‘3’ and ‘7’ appear in more than one column they are not copied to c181 because they are not common to all columns. Let’s take the same three columns with the or operator. We type:
    c182=or(c137,c138,c139)
which gives us:
    ----+----4 ... ---8----+
          111           1
          /22           /
          453           5
          77            7
c182 contains a list of all codes present in at least one of the named columns.
Now, look at the same columns with xor:
    c183=xor(c137,c138,c139)
yields:
    ----+----4 ... ---8----+
          111            4
          /22            5
          453
          77
Here only two codes have been copied because all other codes appear in more than one column. If one column was blank, this would be ignored if there were other codes unique to one column. Only if there were no other unique codes would column 183 be blank. For instance, if we have c11=’ ’, c12=’12’, c13=’13’ and we type:
    c14=xor(c11,c12,c13)
we would have c14=’23’, but if c13 were to contain a ‘12’ instead, c14 would be blank.
All our examples so far have referred to whole columns, but sometimes you will only be interested in specific codes in those columns. To write this in Quantum, follow each column number with the positions to be checked enclosed in single quotes. Any unnamed codes in those columns are then automatically ignored. Here is an example. Our data is:
    ----+----4----+----5
    1         1   2
    /         3   /
    5         5   6
the statement:
    c85=and(c31’1/3’,c41’1/3’,c45’1/3’)
gives us:
    ----+----4----+----5 ... 8----+----9
    1         1   2               3
    /         3   /
    5         5   6
Even though columns 31, 41 and 45 all contain a ‘3’ and a ‘5’, Quantum only copies the ‘3’ because the ‘5’ is not part of our specification. We have used the same code specification for all three columns, but you can use whatever combination you like.
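The three operators behave like set operations on the codes held in each column. The following Python sketch models that behaviour; it is not Quantum code, the function names are hypothetical, and each column is represented simply as a string of the codes it contains (an empty string for a blank column).

```python
from collections import Counter


def only(column: str, codes: str) -> set:
    """Restrict a column to the codes of interest (the quoted code list)."""
    return set(column) & set(codes)


def and_codes(*columns) -> set:
    """Codes present in every column."""
    return set.intersection(*(set(c) for c in columns))


def or_codes(*columns) -> set:
    """Codes present in at least one column."""
    return set().union(*(set(c) for c in columns))


def xor_codes(*columns) -> set:
    """Codes present in exactly one column."""
    counts = Counter(code for c in columns for code in set(c))
    return {code for code, n in counts.items() if n == 1}


c11, c12, c13 = "", "12", "13"
print(sorted(and_codes(c12, c13)))         # ['1']
print(sorted(or_codes(c11, c12, c13)))     # ['1', '2', '3']
print(sorted(xor_codes(c11, c12, c13)))    # ['2', '3']
print(sorted(xor_codes(c11, c12, "12")))   # []
```

Wrapping each argument in only(...) mirrors the quoted code list: and_codes(only("12345", "123"), only("135", "123"), only("23456", "123")) leaves just the code 3.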
✎ These types of statement are extremely useful for setting up shorthand references to the codes present in a group of columns. Say, for instance, that you wanted various statements throughout the edit to be executed only if there was a ‘1’ in one or more of c110, c112, c120 and c125. You can always write out each column and code separately each time:
    if(c110’1’.or.c112’1’.or.c120’1’.or.c125’1’) .....
but it is simpler and much more efficient to say:
    c181=or(c110,c112,c120,c125)
    if (c181’1’) ...
especially if you will need to refer to the contents of these columns again later on in the edit. This facility may also be used to simplify what would otherwise be complicated filter conditions in the tabulation section.
8.2 Adding codes into a column
Quick Reference
To add codes into a column in addition to those that are already there, type:
    emit cn1’codes1’ [, cn2’codes2’ ... ]
The emit statement inserts codes into a column leaving the original contents intact. Its format is:
    emit cn’p’
Suppose we have:
    ----+----7
          4
          5
          &
and we write:
    emit c567’3’
we will have:
    ----+----7
          3
          4
          5
          &
More than one column may be entered on each line, provided that each one is separated by a comma.
    emit c567’7’, c110’2’, c(t5+6)’7’
✎ emit can only be used with single columns; string variables are not valid: emit c(109,110)$99$ does not work.
8.3 Deleting codes from a column
Quick Reference
To delete selected codes from a column, type:
    delete cn1’codes1’ [, cn2’codes2’ ... ]
The delete statement is the opposite of emit in that it deletes codes from a column leaving the remainder intact. Its format is:
    delete cn’p’
Suppose we have:
    +----1----+
         5
         6
         8
         9
and we write:
    delete c110’5’
we are left with:
    +----1----+
         6
         8
         9
More than one deletion may be effected with the same delete statement as long as each column is separated by a comma.
    delete c110’5’, c(t1+3)’6’, c179’56’
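If each column is thought of as a set of codes, emit and delete correspond to adding and removing members. The following Python sketch illustrates this model only; the function names mirror the Quantum keywords but the code is not Quantum.

```python
def emit(column: set, codes: str) -> None:
    """Add codes to a column, leaving its existing codes intact."""
    column.update(codes)


def delete(column: set, codes: str) -> None:
    """Remove codes from a column, leaving the remainder intact."""
    column.difference_update(codes)


c567 = set("45&")
emit(c567, "3")
print(sorted(c567))   # ['&', '3', '4', '5']

c110 = set("5689")
delete(c110, "5")
print(sorted(c110))   # ['6', '8', '9']
```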
8.4 Forcing single-coded answers
Quick Reference
To force single-coding of multicoded columns, type:
    priority cn’code1’, ’code2’, ’code3’,[cn2’code1a’, ’code2a’, ’code3a’, ... ]
where a code at the start of the list should be accepted in preference to any later in the list.
Sometimes when you are cleaning your data you will come across a column which is multicoded when it ought to contain only one code. You can either print out the record and change the incorrect codes later or you can have Quantum do it for you automatically.
When data is to be corrected automatically, you will need to write a statement saying which codes should be discarded and which are to be kept. Obviously, there can be no hard and fast rule since the codes may vary between questionnaires, so what you may do is assign each code a priority so that when a certain code is found Quantum knows that all others in that column are to be deleted. The statement used for this is:
    priority cn’code1’, ’code2’, …’coden’,[cn2’code1a’, ’code2a’, ’code3a’, ... ]
where cn is the column whose codes are to be checked and ‘code1’ to ‘coden’ are the positions to check, entered in order of priority, the most important first.
✎ priority checks only the listed positions; if any other codes are present they are ignored.
Let’s work through an example to clarify this. Suppose one of the questions in a survey asks respondents to give their overall opinion of a product, rated on a scale of 1 (Poor) to 5 (Excellent). You have been told that if the question has accidentally been multicoded you are to assume that the higher rating is correct and delete the lower rating from the column. You will not know beforehand exactly what multicodes there are, if any, but you will know the column and the possible codes it may contain, and also that low codes should be discarded in favor of high ones. If this question is coded into column 249, you could write:
    priority c249’5’, ’4’, ’3’, ’2’, ’1’
This causes Quantum to scan column 249 to see first whether it contains a ‘5’ and, if so, to delete all subsequent codes in the list. If c249 contains a ‘5’ and nothing else, obviously there will be no extra codes to delete; this does not matter. If there is no ‘5’ in c249, Quantum then checks whether it contains a ‘4’; if so, any other codes in the range ‘1/3’ are deleted, otherwise the program skips to the next code in the list and checks for that. If none of the listed codes are found, the column remains unchanged.
If our first record has c249’53’ Quantum will give us c249=’5’, but if the second has c249’942’ we will end up with c249’94’; the ‘9’ has not been removed because it was not one of the named positions.
You can also use priority to force a field to be single-coded simply by listing the columns and codes to be checked in order of importance. If a listed code is found in the first column, any other listed codes will be removed from that column, as will any that appear in subsequent columns. For example, if our record is:
    -----+----6
         22
          3
          5
and we write:
    priority c55’2’, ’3’, ’4’, c56’1’, ’2’, ’3’, ’4’, ’5’
this gives us:
    -----+----6
         2
However:
    -----+----6
         22
          3
          &
would become:
    -----+----6
         2&
In the previous example, we have named two different columns on the same priority statement because together they form a field which must be single coded overall. If you want to force two completely separate columns to be single-coded, you must write two priority statements, one for each column. If our data is:
    +----3----+
        21
        33
         6
the statement:
    priority c129’1’,’2’,’3’,c130’1’,’2’,’3’
will give us:
    +----3----+
        26
but:
    priority c129’1’, ’2’, ’3’
    priority c130’1’, ’2’, ’3’
results in:
    +----3----+
        21
         6
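The behaviour of priority can be summarised as: scan the listed column/code pairs in order, keep the first one that is present, and delete every other listed code that is found. The Python sketch below models this reading of the text. It is not Quantum code; the record layout (a dictionary of sets) and the function signature are assumptions made for illustration, and the two-column data at the end is illustrative.

```python
def priority(record: dict, spec: list) -> None:
    """spec is a list of (column_name, codes_in_priority_order) pairs."""
    listed = [(col, code) for col, codes in spec for code in codes]
    found = False
    for col, code in listed:
        if code not in record[col]:
            continue
        if found:
            record[col].discard(code)   # lower-priority listed code: delete
        else:
            found = True                # highest-priority code present: keep


rec = {"c249": set("53")}
priority(rec, [("c249", "54321")])
print(sorted(rec["c249"]))   # ['5']

rec = {"c249": set("942")}
priority(rec, [("c249", "54321")])
print(sorted(rec["c249"]))   # ['4', '9']  (9 is unlisted, so it stays)

rec = {"c55": set("2"), "c56": set("23&")}
priority(rec, [("c55", "234"), ("c56", "12345")])
print(sorted(rec["c55"]), sorted(rec["c56"]))   # ['2'] ['&']
```

Calling priority once per column, rather than once for both, treats the columns as separate fields, which is the distinction drawn in the last example above.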
8.5 Setting a random code in a column
Quick Reference
To choose a random code from a list of codes, type:
    data_var_name=rpunch(’codes’)
To choose a random code from the codes present in a column, type:
    data_var_name=rpunch(col_number)
Occasionally you may wish to set a random code into a column, perhaps because the code in that column is incorrect. To do this, write: cvar = rpunch(’p’)
where cvar is the column into which one of the codes ‘p’ is to go. For example: c115 = rpunch(’1/5’)
will place one of the codes 1 through 5 in column 115. Alternatively, you may use rpunch with another C-variable, thus: c115 = rpunch(c120)
Once this statement has been executed, column 115 will contain one of the codes present in column 120.
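As a rough model, rpunch picks one member from a set of candidate codes. The Python sketch below assumes each candidate is equally likely, which this section does not state, and expands the range ’1/5’ by hand; it is an illustration, not Quantum code.

```python
import random


def rpunch(codes: str, rng=random) -> str:
    """Return one code chosen at random from the codes supplied.

    Uniform selection is an assumption; a blank column (empty string)
    raises IndexError in this sketch.
    """
    return rng.choice(sorted(set(codes)))


c115 = rpunch("12345")   # like c115 = rpunch('1/5')
c120 = "37"
c115 = rpunch(c120)      # like c115 = rpunch(c120): one of '3' or '7'
```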
8.6 Reading numeric codes into an array
Quick Reference
To set up an array based on numeric codes in the data, type:
    field array_name=column_spec [,$code$=cell_number, ...]
column_specs are references to the fields containing the numeric codes. code is a non-numeric code present in those fields and cell_number is the cell of the array which should be incremented whenever that code is encountered.
Cells in the array are reset to zero at the start of each new record. To prevent this happening, enter the statement name as fieldadd rather than field. The rest of the statement is as shown.
On some studies you will find responses which are represented by numbers rather than codes. There are various methods of checking and tabulating these responses. Which one you use depends on whether you want to know the number of respondents whose record contains a given code in a field or group of fields, or the number of times a code appears in a group of fields. To illustrate this, let’s suppose the question and response list in the questionnaire are as follows:
    Q6A: Which films did you see on your last three visits to the cinema?
                                   (12-13)  (14-15)  (16-17)
    Columbus ...................     01       01       01
    Aliens 3 ...................     02       02       02
    Pretty Woman ...............     03       03       03
    Green Card .................     04       04       04
    Batman 2 ...................     05       05       05
If you want a table which shows how many people saw each film, one way of tabulating this data is to use a fld statement in the axis which tells Quantum which columns to read and which codes represent each film.
☞ For information about the fld statement, see section 4.3, ‘Responses with numeric codes: fld’ in the Quantum User’s Guide Volume 2.
Another way is to use a combination of field in the edit and bit in the axis. This is particularly efficient if, rather than wanting to count the number of people who saw each film, you want to count the number of times each film was seen.
The field statement counts the number of times a particular code appears in a list of fields for each respondent. It stores these counts in an integer array that consists of as many cells as there are codes to count. In the films example, the array will have five cells. Cell 1 will hold the number of times code 01 appears in the fields c(12,13), c(14,15) and c(16,17). If the respondent saw Green Card then Batman 2 and then Green Card again, his/her data will be:
    1----+----2
      040504
Cell 4 (Green Card) of the array will be set to 2, and cell 5 (Batman 2) of the array will be set to 1. You can then tabulate the contents of this array using a bit statement in the axis.
The format of the field statement is:
    field output_array = column_specs [,special_specs]
output_array is the name of the array in which you wish to store the counts of responses. You can use spare columns in the C array, but you may find your program is easier to read if you define an integer array of your own with a name which reflects the type of information it contains. For example, if you want an integer array called films, you might write:
    int films 5s
    ed
    field films = .....
When you define the integer array, make sure that you request as many cells as there are codes in the data. In this example there are five films so you define the array as having five cells. Quantum automatically creates an extra cell (cell 0) which it uses to count responses for which there is no cell allocated. If there were six films, for example, Quantum would increment cell 0 each time it found code 06 in the films columns. You might like to check the value of this cell as a means of reporting on invalid codes: if (films0 .gt. 0) write c(1,20) $Bad film code$
Negative and zero values also cause cell zero to be incremented. Codes which are shorter than the field width are accepted as long as they are left-padded with blanks or zeros. Codes which are shorter than the field width and which are right-padded with blanks only increment cell zero.
The column_specs part of the statement defines the columns to read. You have a number of choices here. First, you may list each column or field reference one after the other, separated by commas. The list must be enclosed in parentheses. In our example this would be:
    field films = (c(12,13), c(14,15), c(16,17))
Second, if you have sequential fields as you do here, you can type the start columns of each field followed by the field length. The list of start columns is separated by commas and enclosed in parentheses, and the field length comes after the closing parenthesis and starts with a colon. If you use this notation for the film example you would write: field films = (c12, c14, c16) :2
If you wish, you can abbreviate this further by typing just the start columns of the first and last fields, followed by the field length. field films = c12, c16 :2
Third, if the fields are not sequential, you list the start columns and field width of each group of columns (as shown above) and separate each group with a slash. For example, to read data from columns 12 to 17 and 52 to 57, with each field being two columns wide, you would type: field films = c12, c16 / c52, c56 :2
This reads c(12,13), c(14,15), c(16,17), c(52,53), c(54,55) and c(56,57). You can also use this notation for single non-sequential fields. For example: field films = c23 / c36 / c71 :2
means c(23,24), c(36,37) and c(71,72).
The special_specs part of the statement is optional. You use it when a field contains non-numeric codes such as $&&$ for None of these films. If you want to count codings of this type, you must remember to allocate cells in the array for each code or group of codes you wish to count. You then include the notation:
    $code$ = cell_number
to count those codes. For example:
    int films 6s
    ed
    field films = (c12, c14, c16) :2, $&&$=6
If you want to count more than one non-numeric code, list each one individually, separated by commas.
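The counting rules for field can be modelled in a few lines. The Python sketch below is not Quantum code: the function signature is hypothetical, the record is a plain string with columns numbered from 1, and skipping entirely blank fields is an assumption because this section does not say how they are treated.

```python
def field(record, starts, width, ncells, specials=None, cells=None):
    """Count numeric codes found in fixed-width fields of one record.

    Pass the previous `cells` list back in to mimic fieldadd; omit it to
    start from zero for each record, as field does.
    """
    specials = specials or {}
    if cells is None:
        cells = [0] * (ncells + 1)       # cell 0 collects unallocated codes
    for start in starts:
        text = record[start - 1:start - 1 + width]
        if text in specials:
            cells[specials[text]] += 1   # non-numeric code with its own cell
            continue
        if text.strip() == "":
            continue                     # assumption: blank fields are skipped
        digits = text.lstrip(" ")
        # Left-padding with blanks or zeros is accepted; right-padding is not.
        valid = text[-1] != " " and digits.isdecimal()
        cell = int(digits) if valid else 0
        if not 1 <= cell <= ncells:
            cell = 0                     # zero, negative or out-of-range code
        cells[cell] += 1
    return cells


record = " " * 11 + "040504"             # columns 12 to 17
print(field(record, [12, 14, 16], 2, 5))  # [0, 0, 0, 0, 2, 1]
```

With specials={"&&": 6} and ncells=6, a field containing && increments cell 6, matching the $&&$=6 notation.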
✎ To tabulate data counted by a field statement, you use a bit statement which names the integer array you have created and defines the element texts associated with each cell of the array.
☞ For further information about the bit statement, see section 4.4, ‘Responses with numeric codes: bit’ in the Quantum User’s Guide Volume 2.
Quantum normally resets the cells of the integer array to zero at the start of each record. If you want counts to continue from one record to another, use a fieldadd statement instead of field. For example: fieldadd films = (c12, c14, c16) :2
✎ The advantage of using field or fieldadd is that they automatically count the number of times a code appears in a list of fields. If you want a table which uses this information, you just tell Quantum to increment the counts in the table by the values stored in the appropriate cells of the array. You can also manipulate the values stored in the cells before you tabulate the data. For example, if you had codes for Aliens 1, 2 and 3, you might wish to merge them into a single cell for all Aliens films so that the tabulation spec is easier to write.
8.7 Clearing variables
Quick Reference
To remove values from variables, type:
    clear var_name1, var_name2, var_name3
Variables of any type may be cleared using a clear statement: clear var1, var2, .... varn where var1 to varn are any valid Quantum variable or range of variables. For example: clear c(109,180), t(1,200), myarray(29,33), myint, myreal
Data variables are reset to blank, integer variables are reset to 0 and real variables are reset to 0.0. Variables can also be cleared using assignment statements (e.g., t1=0), but there are advantages to using clear instead. Firstly, clear is much easier to write. Secondly, with clear the compiler checks that the subscripts are in the correct range (e.g., 1 to 33 if ‘myarray’ has only 33 cells); this is not possible when you clear a range of cells with an assignment statement inside a loop, because the subscript is then a variable. However, if you use variables as subscripts with clear (e.g., clear c(t1,t1+5)), subscript checking once again cannot be done.
8.8 Checking array boundaries in assignment statements
Quick Reference
To prevent Quantum from checking array boundaries during a run, type:
    nobounds
at or near the start of the edit.
Quantum normally terminates if it detects that you are writing beyond the end of an array. For example:
    int number 10s
    ed
    do 5 t1=1,12,1
    number(t1)=c(132,135)*t1
    5 continue
Here, we have defined an integer array called ‘number’ as having 10 cells. When Quantum reads the assignment statement and detects that it refers to ‘number(11)’ it will terminate because there are only 10 cells in the array, not 11. The same would be true for statements which referred to, say, t201 when the size of the T array had not been extended past the default of 200 cells. The exceptions to this are emit, delete, partial column moves and reads from fetch files.
☞ emit, delete and partial column moves are discussed earlier in this chapter. For further information about fetch files, see ‘The fetch statement’ in chapter 13, ‘Using subroutines in the edit’.
While they may save you time in the long run, these checks do mean that your job will run slightly slower than it otherwise would. If you wish to run without these checks, insert a nobounds statement near the start of the edit.
8.9 Assigning values to T variables in the data file
Quick Reference
To assign a value to a T variable in the data file, type:
    *set tn = value
You may use a *set statement in the data file to assign a value to a T variable. Its format is: *set tn = value where n is a number between 1 and 200 (unless you have increased the number of T-variables). The statement must start in column 1. You may type ‘set’ in upper or lower case, and may follow it with any number of spaces. If Quantum reads anything that it cannot interpret as a T variable, it terminates the run immediately. This facility is available in all jobs with or without levels (trailer cards). You may use it as many times as you need throughout the data file to assign different values to the same T-variable, or to assign different values to a number of T-variables.
9 Flow control
Statements in the edit section are usually dealt with in the order in which they occur in the program. Quantum provides statements which may be used to alter this normal order of execution, for example, by missing out a statement or repeating a group of statements a number of times.
9.1 Statements of condition – if
Quick Reference
To define statements to be executed if a certain condition is true, type:
    if (logical_exp) statement1[; statement2; ... ]
The if statement has exactly the same meaning as in English; it defines a statement whose execution depends upon the value of a logical expression. Let’s first take an English sentence to explain this: we might say ‘If it is raining, I will take my umbrella’. Here, the statement is ‘I will take my umbrella’ and it depends upon the logical expression ‘It is raining’. If the expression is true (i.e., it is raining), the statement is executed (I take my umbrella), if it is false (no rain) it is ignored (I don’t even think about my umbrella).
Now let’s take a Quantum sentence. We have a shopping survey in which respondents have been asked to name the supermarkets in which they shop at least once a week. These responses are coded into column 21 of card 1, and we want to keep a count of the number of respondents shopping in Safeway (code 4). Our sentence would say ‘If column 21 contains a 4, increment our counter by 1’.
A Quantum if statement consists of three items:
1. The word if.
2. The logical expression whose value controls the action to be taken, enclosed in parentheses.
3. The statement(s) to be executed if the expression is true.
☞ For further information about logical expressions, see section 5.2, ‘Logical expressions’.
Thus, to translate our sentence into the Quantum language, we would write:
    if (c121’4’) safe=safe+1
Here is another statement: if (numb(c10,c11,c12).gt.3) emit c20’9’
The logical expression to be tested states that the number of codes in columns 10, 11 and 12 is greater than three. If it is true, and there are, say, 5 codes altogether in those columns, we will add a 9 into column 20 in addition to what is already there. On the other hand, if there are 3 or fewer codes in that field we leave column 20 as it is and continue with the statement on the line immediately after the if. For instance:
+----1----+----2----+ 621 0 / 4
yields
+----1----+----2----+ 621 0 / 4 9
but:
+----1----+----2----+ 21 0 / 4
yields
+----1----+----2----+ 21 0 / 4
Once the emit statement has been executed, Quantum continues with the statement on the next line. The statement to be executed if the expression is true may be any Quantum statement, even another if. For example: if (c130’1’); if (c131’9’) c181’19’
says ‘if c130 contains a ‘1’, and then if c131 contains a ‘9’, then put the multicode ‘19’ in c181’. This statement is not incorrect, but it can be more efficiently written as:
    if (c130’1’.and.c131’9’) c181’19’
The if keyword may be followed by a whole series of statements as long as each one is separated by a semicolon. These statements will then be executed in the order in which they appear. For example: if (t4.le.5) c235’45’; emit c567’2’; delete c789’0’
This says, if the value of t4 is less than or equal to 5, put the multicode ‘45’ in column 235 overwriting whatever is there already, then add a ‘2’ into column 567 and, finally, remove the ‘0’ from column 789.
✎ You cannot switch missing values processing on or off with an if statement. A missingincs statement is always executed wherever it appears in the edit. This means that although the compiler will accept statements of the form: if (....) missingincs 1
Quantum will, in fact, switch on missingincs for the rest of the edit or until a missingincs 0 statement is read. It does not switch on missingincs selectively for only those records that satisfy the expression defined by the if clause.
☞ For further information about missingincs, see section 12.6, ‘Missing values in numeric fields’.
9.2 Statements of condition – else
Quick Reference
To define statements to be executed if a given condition does not exist, type:
    if (logical_expr) statement(s); else; statement1[; statement2; ...]
In Quantum the keyword else means ‘otherwise’. In English we would say ‘If it’s raining I’ll take the car, otherwise I’ll walk’; in Quantum we write: if (expression) statement(s); else; statement(s) This says, if the expression is true, execute the statements immediately after the if, but if it is false, execute those following the else. For example: if (c76’4’) t3=1; delete c76’3’; else; t3=2; emit c77’2’
Here, if c76 contains a ‘4’, t3 is set to 1 and a ‘3’ is deleted from c76. However, if c76 does not contain a ‘4’, t3 is set to 2 and a ‘2’ is added into c77. The else keyword may only be used as part of an if statement and must be separated from the if by at least a semicolon. Statements of the form: if (c115’1’); else; emit c140’&’
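For readers more familiar with block-structured languages, the if ... else example maps onto an ordinary two-branch conditional. The Python sketch below is only an analogy: the record is modelled as a dictionary in which c76 and c77 are sets of codes and t3 is an integer, which is an assumption made for illustration.

```python
def edit(record: dict) -> dict:
    # if (c76'4') t3=1; delete c76'3'; else; t3=2; emit c77'2'
    if "4" in record["c76"]:
        record["t3"] = 1
        record["c76"].discard("3")
    else:
        record["t3"] = 2
        record["c77"].add("2")
    return record


print(edit({"c76": set("34"), "c77": set(), "t3": 0})["t3"])   # 1
print(edit({"c76": set("3"), "c77": set(), "t3": 0})["t3"])    # 2
```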
are correct, but since action is only required if the expression is not true, it is more usual to write: if (c115n’1’) emit c140’&’
9.3 Routing around statements
Sometimes your Quantum program will include statements which refer to certain respondents only; for instance you will only want to check the data associated with a particular brand of soap powder if the respondent bought that powder. These statements may be routed over when the respondent does not buy the powder by using the go to (or goto) statement, followed by a statement number. The statement:
    if (c121n’1’) go to 50
causes Quantum to go immediately to the statement labeled 50 if column 121 does not contain a ‘1’ (for example, the respondent did not buy Brand A soap powder). Any statements between this if statement and statement 50 are ignored whenever a record is read where c121n’1’ is true. The statement labeled 50 may be any Quantum statement, but many people just write: 50 continue
to gather all respondents together before continuing through the rest of the program. This statement is described in the next section.
All labels must be attached to statements: a label by itself is an error and Quantum will tell you so.
You may route forwards or backwards in your program, but when routing backwards, take care that you are not creating a situation from which it is impossible to escape: the following will go on and on forever if you let it:
    10 t1=t1+1
       - other statements -
       go to 10
The only way to avoid situations like this is to make sure that somewhere between statement 10 and go to is another statement that routes you past the go to at some time, for example:
    10 t1=t1+1
       - other statements -
       if (t1.gt.10) go to 15
       go to 10
    15 continue
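The backward route with an escape condition is equivalent to a loop with an explicit exit test. The Python sketch below shows the same control flow; it assumes t1 starts at 0, which the example does not state, and the function name is hypothetical.

```python
def count_up(limit: int = 10) -> int:
    t1 = 0                  # assumed starting value
    while True:             # label 10: top of the backward route
        t1 = t1 + 1
        # - other statements -
        if t1 > limit:      # if (t1.gt.10) go to 15
            break
        # falling through here plays the role of 'go to 10'
    return t1               # label 15: continue


print(count_up())   # 11
```

Without the exit test the while loop would never end, which is the situation the first example warns against.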
Do not write statements of the form: if (c73’7’) go to 10; print $errors$
because any statements after the go to will never be executed.
9.4 continue
Quick Reference
Attach the keyword:
    continue
to a label to mark a place in the edit.
This statement is a dummy statement whose sole purpose is to join various bits of a program together. It is often used with a statement label as a destination for routing with go to, or to identify the end of a loop.
☞ To find out more about using continue with loops, see ‘do with individually specified numeric values’ in the following section.
9.5 Loops
Quick Reference
To define a set of repetitive statements, type:
    do label_number int_variable=value_list
    statements
    label_number statement
Loops are extremely important structures because they enable the same set of basic statements to be executed over and over again on a changing series of numbers, columns or codes. Their use can reduce the work involved in checking data. The statement which introduces a loop is do which is formatted as follows:
• The word do.
• A label number identifying the last statement in the loop.
• An integer variable (for numbers or columns) or a letter (for codes) whose value is to be used by the statements in the loop.
• An equals sign.
• A list of whole numbers, integer variables or codes which are the values the integer variable or letter is to take. These may be entered in two ways (see below).
Loops should be terminated by any statement other than go to, stop, return, another do or an if containing any of these words. The main purpose of the terminating statement is to identify the end of the loop and send the program back to the start of the loop. Go to and return send the record elsewhere, stop terminates the run and another do indicates the start of another loop. The statement most often used to terminate a loop is the dummy statement continue. Any statement that terminates a loop must be preceded by a label number.
☞ For information about the return statement, see section 9.7, ‘Jumping to the tabulation section’.
Thus, the usual format of a loop is:
    do label.number int.var = value list
    - statements to be executed -
    label.number statement
We will now go on to discuss the various ways of defining the values in the value list.
do with individually specified numeric values
Quick Reference
To define a loop to be repeated for a set of given values, type:
    do label_number int_variable=(val1,val2, ... )
The simplest way to define the values for the loop is to list them individually. In this case, values must be whole numbers, separated by commas with the whole list enclosed in parentheses. For example:
    do 20 t5 = (125,130,140,145)
    if (c(t5,t5+4).gt.3000) c(t5,t5+4)=$ $
    20 continue
Before we discuss what this loop is doing, let’s look at the way it has been written. The do statement tells us three things, namely that the loop is terminated by the statement labeled 20, the integer variable to be used is t5, and the statements within the loop are to be repeated four times (there are four values in the list). The statement labeled 20 is continue which just sends Quantum back to do.
The purpose of this loop is to check whether the contents of four fields are greater than 3000, and if so to reset those columns to blank. The first time through the loop, t5=125. When substituted into the if statement it yields: if (c(125,129).gt.3000) c(125,129)=$ $
The next statement is continue which sends us back to the top of the loop. t5 is now pointing to the second value in the list, 130. The if statement reads: if (c(130,134).gt.3000) c(130,134)=$ $
This process is repeated until t5 has taken all values in the list. There is no need to include statements which check the value of t5 and jump out of the loop when the last value is reached: Quantum keeps a count of how many values there are and it knows that once the last value has been reached it should continue with the statements following the loop.
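The same loop can be pictured as a for-loop over the listed start columns. The Python sketch below is an analogy rather than Quantum code: the record is a plain string with columns numbered from 1, the function name is hypothetical, and leaving non-numeric fields untouched is an assumption.

```python
def blank_large_fields(record: str, starts=(125, 130, 140, 145),
                       width: int = 5, limit: int = 3000) -> str:
    cols = list(record)
    for t5 in starts:                       # do 20 t5 = (125,130,140,145)
        lo, hi = t5 - 1, t5 - 1 + width     # c(t5,t5+4) as a Python slice
        try:
            value = int("".join(cols[lo:hi]))
        except ValueError:
            continue                        # assumption: skip non-numeric fields
        if value > limit:                   # if (c(t5,t5+4).gt.3000) ...
            cols[lo:hi] = " " * width       # ... reset the field to blank
    return "".join(cols)                    # 20 continue


rec = [" "] * 160
rec[124:129] = "03500"    # columns 125 to 129: above the limit
rec[129:134] = "02999"    # columns 130 to 134: within the limit
out = blank_large_fields("".join(rec))
print(repr(out[124:134]))   # '     02999'
```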
do with numeric ranges
Quick Reference
To define a loop which will be executed for a range of values, type:
    do label int_var=start_val1, end_val1, [inc_val1] [/ start_val2, end_val2, [inc_val2]]
If the incremental value is 1 and the loop has one range only, the incremental value may be omitted.
Sometimes there will be a pattern to the numbers in the list: for example, they may increase in steps of 5. You may list them all individually if you prefer, but it is quicker to enter them as a range with a start, end and incremental value (in our example, 5) separated by commas. The start value must be smaller than the end value, and the increment must be positive. Quantum checks the start and end values and if the start is larger than the end value, the statements inside the loop will not be executed at all. If the increment is negative, the loop will be executed for the start value only.
    do 20 t5 = 125,145,5
    if (c(t5,t5+4).gt.3000) c(t5,t5+4)=$ $
    20 continue
This loop is very similar to that used in the previous section. It will be executed for all values of t5 between t5=125 and t5=145 where the value is incremented by 5 each time. The loop says:
    if (c(125,129).gt.3000) c(125,129)=$ $
    if (c(130,134).gt.3000) c(130,134)=$ $
    if (c(135,139).gt.3000) c(135,139)=$ $
    if (c(140,144).gt.3000) c(140,144)=$ $
    if (c(145,149).gt.3000) c(145,149)=$ $
Flow control – Chapter 9 / 121
You may enter as many range specifications as you like on one line, as long as each one is separated by a slash (/):
    do 15 t1 = 25,35,2 / 50,62,3
    if (numb(c(t1)).gt.1) c(t1)’ ’
    15 continue
This loop replaces eleven if statements: t1 will take the values 25, 27, 29, 31, 33, 35, 50, 53, 56, 59 and 62.

If the loop has only one range, and the incremental value is 1, the 1 may be omitted. If t3=11 and t4=15:
    do 15 t2 = t3,t4
    if (numb(c(t2)).gt.1) c(t2)’ ’
    15 continue
checks that columns 11, 12, 13, 14 and 15 each contain no more than 1 code. If not, the column is reset to blank.
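The range rules described above can be sketched in Python (this is an illustration, not Quantum code). The helper name do_values is hypothetical. It assumes ranges are inclusive, that the start/end check is applied before the increment check, and that a zero increment, which the text does not cover, should simply be refused.

```python
def do_values(*ranges):
    """Return the values a do loop variable would take.

    Each range is (start, end) or (start, end, inc) and corresponds to one
    slash-separated range specification on the do statement.
    """
    values = []
    for spec in ranges:
        start, end = spec[0], spec[1]
        inc = spec[2] if len(spec) > 2 else 1  # increment defaults to 1
        if inc == 0:
            # Assumption: the guide does not say what a zero increment does.
            raise ValueError("zero increment is not described in the guide")
        if start > end:
            continue  # the statements in the loop are not executed at all
        if inc < 0:
            values.append(start)  # executed for the start value only
            continue
        values.extend(range(start, end + 1, inc))
    return values


# do 15 t1 = 25,35,2 / 50,62,3
print(do_values((25, 35, 2), (50, 62, 3)))
```

Calling do_values((11, 15)) models the second example, in which the increment of 1 is omitted.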
do with codes

Quick Reference
To repeat a set of statements for all codes in a given range, type:
    do label ’variable’ = ’code1’,’code2’
To repeat a set of statements for each of a given list of codes, type:
    do label ’variable’ = (’code1’,’code2’, ... )
Sometimes you will want to repeat a statement or set of statements for a given set of codes, rather than for columns or other types of variable. The way to do this is to write a do statement which, instead of naming an integer variable and whole numbers, defines a list of codes and a temporary variable which points to each code in turn. When you want to refer to the current code, you simply enter the name of the temporary variable and Quantum will substitute the value of the current code in the statement before it is executed. The format of a do statement for codes is therefore:
    do label.num ’var.name’ = ’p1’, ’p2’
to execute statements for all codes in the range ‘p1’ to ‘p2’, where the sequence of codes is &-01234567890–&;
or:
    do label.num ’var.name’ = (’p1’, ’p2’, ’p3’, ... )
to execute statements for the listed codes only.

In both formats, note that the variable name and the codes must all be enclosed in single quotes. Additionally, you may not use the notation ’ ’ to indicate a blank code, nor may you use the temporary variable in partial column moves (that is, in statements of the form c(1,4)=c(3,6)).

Here is an example which illustrates how to check for certain codes in a series of columns:
    do 10 ’code’ = (’1’,’3’,’5’)
    if (c110’code’ .or. c111’code’) emit c180’code’
    10 continue
This loop is executed three times, once for each of the three listed codes. The first time the loop is executed, the statement will read:
    if (c110’1’ .or. c111’1’) emit c180’1’
The second time it will be:
    if (c110’3’ .or. c111’3’) emit c180’3’
and the third time it will be:
    if (c110’5’ .or. c111’5’) emit c180’5’
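The substitution just described can be mimicked with a few lines of Python. This is only a sketch of the textual expansion, not of how Quantum works internally; the function name expand_code_loop is hypothetical and plain ASCII quotes stand in for the single quotes of the specification.

```python
def expand_code_loop(template, codes, var="code"):
    """Substitute each code in turn for the quoted temporary variable."""
    placeholder = "'" + var + "'"
    return [template.replace(placeholder, "'" + code + "'") for code in codes]


# do 10 'code' = ('1','3','5')
statements = expand_code_loop(
    "if (c110'code' .or. c111'code') emit c180'code'", ["1", "3", "5"]
)
for statement in statements:
    print(statement)
```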
Nested loops

Loops may contain other loops: this is called nesting. Loops may be nested up to six levels deep, but they must not overlap. Also, each loop must have a separate terminating statement. In other words, they must always take the form:
    do 60 t2 =
      do 70 t3 =
        do 80 t4 =
        .
        .
        80 continue
      70 continue
    60 continue
or
    do 60 t2 =
      do 70 t3 =
      .
      70 continue
      do 80 t4 =
      .
      80 continue
    60 continue
Routing with loops

It is possible to route from inside a loop to outside, but not from outside to inside. The following is permissible:
    do 150 t1 = 125,145,5
    if (numb(c(t1)).eq.1) c189’1’; go to 76; else; c(t1)’ ’
    150 continue
    76 continue
What we are saying in this loop is that if a given column specified by t1 is single-coded (i.e., contains one code only) we set a spare column equal to 1 and send the record out of the loop. If not, we set the column being checked to blank and return to the top of the loop to get the next value of t1. This process continues until a single-coded column is found, or until all values of t1 have been tried.

What is not permissible is:
    if (c176’3’) go to 76
    .
    .
    do 150 t1 = 125,145,5
    76 if (numb(c(t1)).eq.1) c(t1+1)’&’
    150 continue
If c176’3’, the program would jump into the middle of the loop and have an unidentified value for t1. An error message will be printed under the offending statement.
9.6 Rejecting records

Quick Reference
To reject a record from the tables, but to include it in the rest of the edit, type:
    reject [level_name]
In a levels job, include a level name to reject all data at the given level.
Normally all records are passed straight from the edit to the tabulation section regardless of whether or not they contain errors. The reject statement tells Quantum to continue editing the record but not to include it in the tables. The record is also rejected from the weighting and where split is used, it is rejected from the clean file and may be found in the dirty file.
For instance, we might write:
    if (c73’8’) reject
    if (c80’1’) t5=t5+1
    end
to reject records in which column 73 contains an ‘8’ from the tabulations but not from the rest of the edit. Therefore, even if c73’8’, the record is still checked for a ‘1’ in column 80 and if one is found, t5 is incremented.

Whenever a record is rejected the variable rejected_ becomes true. You may use this variable in your program to deal with rejected records in a different way to accepted records. For instance, we may wish to write all rejected records out in the file rejfil for later inspection and correction:
    if (rejected_) write rejfil
The variables rec_rej and rec_acc count the total number of records rejected and accepted so far. You may wish to check these variables and terminate the run if too many records are rejected.
☞ There is an example of how to do this in section 9.9, ‘Canceling the run’ below.

If you are working with hierarchical (levels or trailer card) data, reject at a given level will reject all data at that level. Additionally, data at a level higher than that currently being edited may be rejected from tables: for instance, in the edit of data at the item level, you may reject all data at person level. The syntax for this is:
    reject levelname
where levelname is the name of the parent level to be rejected.

✎ When used with split, reject at any level rejects the whole record from the clean file.

☞ For more information about levels data, see chapter 3, ‘Dealing with hierarchical data’ in the Quantum User’s Guide Volume 3.
9.7 Jumping to the tabulation section

Quick Reference
To send the record to the tabulation section, type:
    return

The word return in Quantum bears no relation to the same word in English. It does not mean go back to the start of the edit or anything like that; rather, it means ‘terminate the edit immediately and jump to the tabulation section’. Once the record is tabulated Quantum reads in another record as usual. If there is no tabulation section, the next record is read in straight away.

The return keyword is often used with reject to reject a record without finishing the edit. For example:
    if (c73’8’) reject; return
    if (c80’1’) t5=t5+1
    end
Here any records in which c73’8’ are rejected from the tables, but, because reject is followed by return which sends records to the tabulation section, editing is terminated immediately. Thus, only records in which c73n’8’ will be tested for a ‘1’ in column 80. Compare this example with the one in section 9.6, ‘Rejecting records’ above.
✎ Do not put reject after return because it will never be reached. Once the return is read, the edit is terminated immediately and the record is passed to the tabulation section without the rest of the statement ever being read. With the statement:
    if (c73’8’) return;reject
all records are tabulated and none are rejected.
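The difference between reject, reject followed by return, and return followed by reject can be modelled in Python. This is a simplified sketch, not Quantum itself: a record is a dictionary with one code per column, return is modelled by a hypothetical ReturnToTabs exception, and all function names are illustrative only.

```python
class ReturnToTabs(Exception):
    """Models return: leave the edit at once and go to the tabulation section."""


def run_edit(record, edit):
    """Pass one record through an edit and report what happened to it."""
    state = {"rejected": False, "t5": 0}
    try:
        edit(record, state)
    except ReturnToTabs:
        pass  # the remaining edit statements are skipped
    state["tabulated"] = not state["rejected"]
    return state


def edit_reject_only(rec, st):
    if rec["c73"] == "8":
        st["rejected"] = True      # reject: editing continues
    if rec["c80"] == "1":
        st["t5"] += 1


def edit_reject_return(rec, st):
    if rec["c73"] == "8":
        st["rejected"] = True      # reject; return
        raise ReturnToTabs
    if rec["c80"] == "1":
        st["t5"] += 1


def edit_return_reject(rec, st):
    if rec["c73"] == "8":
        raise ReturnToTabs         # return;reject
        st["rejected"] = True      # never reached
    if rec["c80"] == "1":
        st["t5"] += 1


rec = {"c73": "8", "c80": "1"}
for edit in (edit_reject_only, edit_reject_return, edit_return_reject):
    print(edit.__name__, run_edit(rec, edit))
```

Only the first edit increments t5 for this record, and only the third lets the record reach the tables.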
9.8 Stopping the processing of data by the edit

Quick Reference
To stop editing records and start tabulating records read so far, type:
    stop [num_times_execute]

On some surveys you may want to run test tables on a few records only. This can be done using the word stop. stop tells Quantum to stop the run and print tables once editing has been completed on the current record. For example, we may want test tables for 50 people who own goldfish, so we set up a counter and terminate the run when it reaches 50:
    /* gfish counts those owning goldfish
    if (c113’5’) gfish=gfish+1
    if (gfish.gt.49) stop
If we did not wish to restrict ourselves to goldfish owners, and were satisfied with just the first 100 respondents, we could use the reserved variable rec_count in our test and stop when it reached 100:
    if (rec_count.eq.100) stop
Alternatively, to be sure that we stop when 100 records have been accepted for tabulation, we could write:
    if (rec_acc.eq.100) stop
When the stop statement is executed, the reserved variable stopped_ becomes true.
A variation of stop is stop n where n is the number of times the statement is to be executed. If stop is part of a routing pattern in the edit, it may be necessary to read in more than n records to execute the statement n times. As an example, here is another way of counting goldfish owners:
    /* only deal with goldfish owners
    if (c113n’5’) goto 20
    - - other statements - -
    stop 50
    /* everyone comes here
    20 continue
Here, the stop statement is executed only when we find someone who owns a goldfish. We may need to read data for 72 respondents before we reach our target of 50 goldfish owners.

When either form of stop is used, editing and tabulation are completed for the respondent at which the condition is fulfilled, and no more records are read. Therefore, if we have to process 72 respondents in order to find 50 goldfish owners, a holecount requested by the edit would include 72 records and errors in those 72 records would be included in the error listings.
☞ To find out how to create holecounts, see section 10.1, ‘Holecounts’.
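The relationship between the number of times stop n is executed and the number of records read can be illustrated with a short Python sketch. It is a simplified model under stated assumptions: each record is reduced to the single code in column 113, and the function name read_until_stop is hypothetical.

```python
def read_until_stop(records, is_owner, n):
    """Return (records_read, times_stop_executed) for a stop n statement."""
    read = 0
    executed = 0
    for rec in records:
        read += 1
        if not is_owner(rec):
            continue       # if (c113n'5') goto 20: stop is skipped
        executed += 1      # stop n is reached for goldfish owners only
        if executed == n:
            break          # the current record is completed; no more are read
    return read, executed


# Hypothetical data: 22 non-owners followed by 78 goldfish owners.
records = ["1"] * 22 + ["5"] * 78
print(read_until_stop(records, lambda code: code == "5", 50))
```

With this invented data the fiftieth owner is the 72nd record, so 72 records are read and would appear in any holecount or error listing.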
9.9 Canceling the run

Quick Reference
To cancel a run, type:
    cancel [num_times_execute]

The word cancel, which is similar in format to stop, terminates the run immediately, producing tables only for those respondents already passed to the tabulation section. It is often used to halt a run when too many errors have been detected in the data. For instance, to cancel the run when more than 100 errors have been found, we might have:
    /* ect is the error counter
    if (c110n’1’) write $error in c110$; ect=ect+1
    if (c145n’ ’) write $c145 not blank$; ect=ect+1
    if (ect.gt.100) cancel
To cancel the run when more than 50 records have been rejected, we could write:
    if (rec_rej.gt.50) cancel
Alternatively, cancel may be followed by a number indicating that the run should be canceled when the statement has been executed a specific number of times:
    cancel 100
cancels the run when this statement has been executed 100 times.

As with stop, holecounts and error listings will only contain information about records read prior to the cancellation condition being fulfilled. If 400 records are read before 101 errors are found, we will see the errors for those 400 records.
9.10 Going temporarily to the tab section

Quick Reference
To send a record temporarily to the tab section, type:
    process
The record is returned to the statement immediately after process.

The process statement is similar to return but must not be confused with it. When return is executed, the record is sent on to the tabulation section; after the tables are completed for that record, the program returns to the start of the edit section and the next record is read in.

When process is executed, the record is also sent immediately to the tabulation section where it is used in table creation. However, after the record has been tabulated, control is passed back to the edit section to the statement immediately following the word process. The record continues through the edit and any statements after process applicable to the record are executed. At the end of the edit the record is passed through the tabulation section again.
The process statement is used when you need to tabulate portions of a record more than once. For example, if our survey asks shoppers about the brands of bread they purchased the last four times they visited the shops, our data may be set out as follows:
    c134 : Brand purchased first time (’1’=Brand A; ’2’=Brand B; ’3’=Brand C; ’4’=Brand D)
    c135 : Number of loaves purchased at that time
    c136 : Brand purchased second time
    c137 : Number purchased second time
    c138 : Brand purchased third time
    c139 : Number purchased third time
    c140 : Brand purchased fourth time
    c141 : Number purchased that time
Suppose we wish to create a table showing the total number of loaves of each brand bought by all (or selected groups of) respondents during their four trips to the store. The simplest way to do this is to set up an axis of the form:
    l brd;inc=c135
    n23Number of Loaves Bought
    col 134;Brand A;Brand B;Brand C;Brand D
in the tabulation section, and to write the statement:
    process
in the edit at the point you want to tabulate the record for the first brand. The next set of edit statements will be:
    c(134,135)=c(136,137)
    process
This overwrites the information about the first purchase with information about the second purchase, and the record is processed a second time. The total number of loaves bought on the second trip will be added to the total number of loaves bought on the first trip. The statements continue:
    c(134,135)=c(138,139)
    process
    c(134,135)=c(140,141)
    process
When we finish, the total number of loaves of each brand bought by all respondents during those four visits will be contained in the relevant cells of the axis.
In a situation like this we would probably put the process statements in a loop at the end of the edit, although this is not strictly necessary. For example:
    do 10 t1 = 134,140,2
    c(134,135)=c(t1,t1+1)
    process
    10 continue
This performs exactly the same task as the list of statements shown earlier; it is just a more efficient way of writing them.
✎ Be careful if process is the last statement in your edit: the record will be passed to the tabulation section by process and then again by the end statement. If this is not what you want, omit the last process.
☞ For another example of process, see ‘Incrementing tables more than once per respondent’ in chapter 4, ‘More about axes’ in the Quantum User’s Manual Volume 2.
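The effect of process in the bread example, including the extra pass made by the end of the edit, can be sketched in Python. This is an illustrative model rather than Quantum code: a record is a dictionary keyed by column number, tabulate stands in for the brd axis with inc=c135, and the names tabulate, edit_with_process and final_process are hypothetical.

```python
from collections import Counter

BRANDS = {"1": "Brand A", "2": "Brand B", "3": "Brand C", "4": "Brand D"}


def tabulate(record, totals):
    """Model the brd axis: add the number in c135 to the brand coded in c134."""
    totals[BRANDS[record[134]]] += int(record[135])


def edit_with_process(record, totals, final_process=True):
    record = dict(record)                  # work on a copy of the data
    for t1 in range(134, 141, 2):          # do 10 t1 = 134,140,2
        record[134], record[135] = record[t1], record[t1 + 1]
        if t1 < 140 or final_process:
            tabulate(record, totals)       # process
    tabulate(record, totals)               # end of edit: tabulated once more


# One hypothetical respondent: A x2, B x1, A x3, D x1.
record = {134: "1", 135: "2", 136: "2", 137: "1",
          138: "1", 139: "3", 140: "4", 141: "1"}

doubled, correct = Counter(), Counter()
edit_with_process(record, doubled, final_process=True)
edit_with_process(record, correct, final_process=False)
print(dict(doubled))
print(dict(correct))
```

When process is the last statement, the fourth purchase is counted twice; omitting the last process leaves that purchase to be tabulated by the end of the edit alone.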
10 Examining records

There are a number of ways of examining your data once it has been read into the C array. You may:
• Produce a holecount showing the total number of codes in each column.
• Create a frequency distribution reporting the different values found in a column or field of columns.
• Write out specific records and examine them individually, as discussed in chapter 7, ‘Writing out data’.
10.1 Holecounts

Holecounts are used to obtain an overall picture of the data before you write your edit program. For each column they show:
• A distribution of the codes: for example, how many respondents have a 2 in column 56.
• The density of coding: how many respondents have one, two, or three or more codes in each column.
• The total number of codes for the whole data file.
There is an example of a holecount in Figure 10.1. The first column tells us the columns for which codes are being counted; in this case it is columns 1 to 16 of card 1. The numbers across the top are the individual codes, and the total in the top left-hand corner is the total number of respondents (records): our data has 605 respondents.

As you can see, there are two numbers in each cell: an absolute figure and a percentage. The former tells us how many records were found with a specific code in a column and the latter tells us what percentage of the total data that is. For example, there are 169 records with a code 1 in column 14 and this is 27.9% of the total. Similarly, 32 records have a code 4 in column 15 which is 5.3% of the total records. Notice that when the cell total is zero, no percentage figure is printed: this makes it easier to see the pattern of coding in each column.

The four right-hand columns of the holecount show the density of coding in each column. The column headed Den1 shows the total number of records with only one code of any sort in the column. Den2 is the number of records with two codes in the column, and Den3+ tells us how many records were multicoded with three or more codes in that column. The Total is the total number of codes in that column, so a record counted under Den2 contributes two codes to it and a record counted under Den3+ contributes three or more. The figures printed beneath are Den1, Den2 and Den3+ as percentages of all records and, under Total, the average number of codes per record.
Visitor Survey - British Museum (Natural History)
ALL VISITORS
Total = 605

Col        &      -      0      1      2      3      4      5      6      7      8      9  Blank |   Den1   Den2  Den3+  Total
------------------------------------------------------------------------------------------------------------------------------
101        0      0    605      0      0      0      0      0      0      0      0      0      0 |    605      0      0    605
                    100.0%                                                                       | 100.0%                 1.00
102        0      0    605      0      0      0      0      0      0      0      0      0      0 |    605      0      0    605
                    100.0%                                                                       | 100.0%                 1.00
103        0      0      0      0      0      0      0      0    605      0      0      0      0 |    605      0      0    605
                                                              100.0%                             | 100.0%                 1.00
104        0      0      0    304    301      0      0      0      0      0      0      0      0 |    605      0      0    605
                            50.2%  49.8%                                                         | 100.0%                 1.00
105        0      0      0    348    257      0      0      0      0      0      0      0      0 |    605      0      0    605
                            57.5%  42.5%                                                         | 100.0%                 1.00
106        0      0     99     99     99    100    100     99      9      0      0      0      0 |    605      0      0    605
                     16.4%  16.4%  16.4%  16.5%  16.5%  16.4%   1.5%                             | 100.0%                 1.00
107        0      0     68     59     60     59     60     60     60     60     60     59      0 |    605      0      0    605
                     11.2%   9.8%   9.9%   9.8%   9.9%   9.9%   9.9%   9.9%   9.9%   9.8%        | 100.0%                 1.00
108        0      0     60     61     61     60     60     61     61     61     61     59      0 |    605      0      0    605
                      9.9%  10.1%  10.1%   9.9%   9.9%  10.1%  10.1%  10.1%  10.1%   9.8%        | 100.0%                 1.00
109        0      0      0    605      0      0      0      0      0      0      0      0      0 |    605      0      0    605
                           100.0%                                                                | 100.0%                 1.00
110        0      0      0    341    264      0      0      0      0      0      0      0      0 |    605      0      0    605
                            56.4%  43.6%                                                         | 100.0%                 1.00
111        0      0      0     38     82     96    194     91     55     33     16      0      0 |    605      0      0    605
                             6.3%  13.6%  15.9%  32.1%  15.0%   9.1%   5.5%   2.6%               | 100.0%                 1.00
112        0      0      0    480    125      0      0      0      0      0      0      0      0 |    605      0      0    605
                            79.3%  20.7%                                                         | 100.0%                 1.00
113        0      0      0      2      0      0      0      0    478      0      0      0    125 |    480      0      0    480
                             0.3%                               79.0%                       20.7% |  79.3%                 0.79
114        0      0      0    169    436      0      0      0      0      0      0      0      0 |    605      0      0    605
                            27.9%  72.1%                                                         | 100.0%                 1.00
115        0      0      0     17     84     22     32     22      0      0      0      0    436 |    162      6      1    177
                             2.8%  13.9%   3.6%   5.3%   3.6%                               72.1% |  26.8%   1.0%   0.2%   0.29
116        0      0      0    306    299      0      0      0      0      0      0      0      0 |    605      0      0    605
                            50.6%  49.4%                                                         | 100.0%                 1.00
Figure 10.1 A Holecount
Let’s look at column 115. 162 records have one code only in that column; six have two codes and one has three or more codes. The total number of codes in this column is 177, and each card has an average of 0.29 codes in this column. The holecount is the starting place in your search for errors. There are many holecounts in which it is immediately apparent that the presence of certain codes indicates an error. It is also clear whether or not the column should be multicoded.
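The density figures for a single column can be reproduced with a short Python sketch. It is not Quantum code: each record is represented by the string of codes found in the column (an empty string for blank), the function name column_summary is hypothetical, and the way the multicoded records are split between codes below is invented so that the totals match column 115 of Figure 10.1.

```python
from collections import Counter


def column_summary(column_values):
    """Summarize one column; each value holds the codes found in one record."""
    codes = Counter()
    density = {"Den1": 0, "Den2": 0, "Den3+": 0, "Blank": 0}
    for value in column_values:
        codes.update(value)                # count every code in the column
        if len(value) == 0:
            density["Blank"] += 1
        elif len(value) == 1:
            density["Den1"] += 1
        elif len(value) == 2:
            density["Den2"] += 1
        else:
            density["Den3+"] += 1
    total_codes = sum(codes.values())
    return codes, density, total_codes, total_codes / len(column_values)


# Hypothetical records: 436 blank, 162 single-coded, 6 with two codes, 1 with three.
column_115 = ([""] * 436 + ["1"] * 17 + ["2"] * 77 + ["3"] * 21
              + ["4"] * 31 + ["5"] * 16 + ["25"] * 6 + ["234"])
codes, density, total_codes, average = column_summary(column_115)
print(density, total_codes, round(average, 2))
```

The total of 177 is 162 + (2 × 6) + (3 × 1), which is why it is larger than the number of coded records.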
Creating a holecount

Quick Reference
To create a holecount, type:
    count c(start_col, end_col) [$text$]
where text is the holecount title.

To create a holecount you will use the count statement:
    count c(start_col,end_col) [$text$]
where text is the heading to be printed at the top of each page. This is optional; if it is omitted the holecount will simply be headed ‘Holecount’. Our example was created by the statement:
    count c(101,116) $Visitor Survey - British Museum (Natural History) ALL VISITORS$
Quantum itself accepts double quotes in the holecount heading, but the C compiler which processes the code that Quantum creates from your specification does not. Generally, it will issue an error message that refers to a missing ) symbol at the point the double quote occurs. To prevent this happening, precede the double quote with a backslash. For example:
    count c(101,116) $Demo for \"Quantum User’s Guide\"$
You may count as many or as few columns as you like, as long as the columns to be counted are consecutive: to count, say, columns 135 to 140 and columns 160 to 180 you will need two statements, one for each field. Records are counted at the stage they are when the count is read. If you have previously altered any columns, say, with assignment or emit statements, the count will refer to the columns as they are after the alterations rather than as they were in the original data file. Similarly, any changes which are effected after the count are not reflected in the output.
✎ If you place a count statement in a loop, Quantum sums the counts for all the columns in the statement and reports the total number of codes as the count for the first column only.
Filtered holecounts

A filtered holecount is one in which only records fulfilling a specific condition are counted. It is created by using the if statement to define the occasions when a record should be counted. For example, suppose we only wish to include male respondents in our holecount. Our statement might be:
    if (c106’1’) count c(101,108) $Demonstration Survey – Males$

Counting trailer cards

Normally, trailer cards of a given type are treated as one card and are counted together. Thus, the number of codes in a column for a particular trailer card contains the sum of all codes found in that column on all trailer cards of the given type (e.g., all card 2s). You may, however, prefer to produce holecounts on such cards based on their relative position within the group of trailer cards.

For example, suppose card 2 is a trailer card and we wish to make a holecount on the third card 2 of each group. In chapter 6 we said that the variable allread2 is true when a card 2 has been read in for the current record, and that it keeps count of the number of card 2s read. So, to produce a holecount for the third card 2, we would write:
    if (allread2.eq.3) count c(201,280) $Card 2 – Third Card$
We can also create filtered holecounts of trailer cards based on characteristics of the individual cards. Suppose we have a trailer card for each store visited, in which the store is identified in c79. The trailer card is the 5-card. We would write:
    if (c579’1’) count c(501,580) $Harrods$
Multiplied holecounts

Quick Reference
To create a multiplied or weighted holecount, type:
    count c(start_col, end_col) [$text$] c(m_start, m_end)
where text is the holecount title and c(m_start,m_end) is the field in the C array containing the multiplier or weight for each record.
In ordinary holecounts, the cells are simply counts of records: each time a record is read with a specific code in a given column, the relevant cell in the holecount is incremented by one. If 231 records have a 7 in column 79, the figure in that cell will be 231. Holecounts may also be created by incrementing each cell by the value found in a column field in the record. This value is the record’s ‘multiplier’. If the multiplier is 15, and the record has a 6 in column 152, the count for c152’6’ will be incremented by 15 rather than by 1 for this record. You may hear this type of holecount referred to as a weighted holecount because multiplying a record by a given value is the equivalent of weighting it.
✎ If the multiplier is being calculated during the run, it must be placed in the C array using wttran before the holecount is requested.
☞ For further details on weighting and wttran, see section 1.9, ‘Copying weights into the data’ in the Quantum User’s Guide Volume 3.
A multiplied holecount is created using the count statement as shown below:
    count c(m,n) [$text$] c(x,y)
where c(m,n) is the field to be counted, text is the optional heading to be printed at the top of each page, and c(x,y) is the field containing the multiplier for the record. If this field contains a real number, it must be referenced as cx(x,y), otherwise the decimal point will be ignored (for example, 1.5 will be read as 15).

The number labeled TOTAL at the top of each page of output is no longer the total number of records in the data file; rather, it is the number of records after each record has been multiplied by its multiplier. This is best illustrated by an example. If we are producing a holecount for c(20,30), and of our 50 respondents, 20 have a multiplier of 2.5, 15 have a multiplier of 2.6 and 15 have a multiplier of 3.0, the total at the top of the page will be 134 respondents, calculated as follows:
    (20 × 2.5) + (15 × 2.6) + (15 × 3.0) = 134
Multipliers may be part of the original data file or they may be calculated during the edit. Both real and integer values are valid, even though the cell counts in the output will always be shown as whole numbers. This does not mean that you lose accuracy with real multipliers. Quantum stores the cell counts with as many decimal places as are necessary until the count is complete, whereupon it rounds all values ending in .49 or less down and all values ending in .5 or more up.
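The arithmetic behind the multiplied total, and the rounding applied when counts are printed, can be checked with a few lines of Python. This is a sketch of the calculation only; the function name printed_count is hypothetical, and decimal arithmetic is used here simply to keep the example free of binary floating-point noise.

```python
from decimal import Decimal, ROUND_HALF_UP


def printed_count(value):
    """Round a stored count to a whole number: .5 or more goes up."""
    return int(Decimal(value).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


# (number of respondents, multiplier) for the three groups in the example
groups = [(20, Decimal("2.5")), (15, Decimal("2.6")), (15, Decimal("3.0"))]
total = sum(n * multiplier for n, multiplier in groups)
print(total, printed_count(total))
```

A stored count of 12.49 would be printed as 12 and a stored count of 12.5 as 13.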
For example, we might write:
    /* House owners have multiplier of 22.4
    if (c104’2’) cx(177,180):1=22.4; go to 10
    /* Tenants have multiplier of 12.7;
    /* Others have multiplier of 11.9
    if (c104’3’) cx(177,180):1=12.7; else; cx(177,180):1=11.9
    10 continue
    - - other statements - -
    count c(101,180) $Multiplied Holecount – Card 1$ cx(177,180)
The figures used to create the multiplied holecount would then be 22.4, 12.7, or 11.9, depending upon the contents of c104 in each record. Suppose we have 27 home owners (that is, 27 people have c104’2’): the count for a ‘2’ in column 4 of card 1 would be 604.8 (27 × 22.4), which would appear in the output file as 605.

Other points to notice are:
• Since we are copying a real number into a field of columns we use the notation cx to refer to the columns and follow them with the number of decimal places required.
• Because the word count is written in lower case it may start in column 1. If it had been written in upper case it would need to start in a column other than 1 to prevent it being read as a comment.
10.2 Frequency distributions

A frequency distribution enables you to inspect the contents of a field of columns containing alphabetic or numeric data. For example, in a shopping survey the price the respondent paid for a bottle of mineral water may be stored in columns 112 to 114. A frequency distribution will tell you how many respondents bought mineral water at a particular price. This is very useful for determining how the values in these fields should be grouped for tabulation, as well as for rough estimates of medians.

By default, each distribution has two parts. In the first part, the values in the column field are sorted in alphabetic or numeric order; in the second, they are sorted in rank order, according to the number of times each one occurs in the data. Any multicodes in the field are decoded and the constituent codes are listed. Each distribution shows both absolute and cumulative figures as well as percentages for both.

At the end of the alphabetic sort, Quantum prints:
• The number of categories (that is, different values) found.
• The number of numeric items found.
• The sum of factors, that is, the sum of all wholly numeric items (values which occur more than once are counted as many times as they occur).
• The mean for the numeric items listed (that is, the sum of factors divided by the number of numeric items).
• Standard deviation for the numeric values listed.

If the field is numeric and the run has missing values processing switched on, fields that are nonnumeric will contain the value missing_. This value is counted as zero by the sum of factors, mean and standard deviation lines of the report.

Statements are provided for requesting a frequency distribution sorted in alphabetic or numeric order only.
Creating a frequency distribution

Quick Reference
To create a frequency distribution sorted in alphabetic and rank orders, type:
    list c(start_col, end_col) [$text$]
where text is the heading to be printed.
To produce a frequency distribution sorted in alphabetic order only, type lista instead of list. For a distribution sorted in rank order only, type listr instead of list.

A frequency distribution, as shown in Figure 10.2, is created with the list statement, as follows:
    list c(m,n) [$text$]
where c(m,n) is the column field whose contents are to be listed and text is the heading to be printed at the top of each page. If no heading text is given, the heading ‘Frequency Distribution’ is used instead.

The list statement, as shown above, produces both the alphabetic and numerically-sorted distributions. To request an alphabetic distribution only, type:
    lista c(m,n) [$text$]
and for a ranked distribution only, type:
    listr c(m,n) [$text$]
Here are some examples:
    listr c(107,108) $Contents of cols 7 and 8$
    lista c(t1,t1+4) $First Set of Car Brands$
The first example produces a frequency distribution of the contents of c(107,108) sorted in rank order; the second example generates a list of car brands which will be sorted in alphabetic order. Additionally, we are using subscripts to represent the column numbers. If t1 has a value of 36, Quantum will list the values found in columns 36 to 40.

The rules for double quotes in the text are the same as for holecounts, that is, you must precede them with a backslash.

Figure 10.2 shows a frequency distribution for the column field c(123,125). It was created by the statement:
    list c(123,125) $PRICE PAID$
Since it was run on a data file containing 200 respondents, the total is 200. Let’s start with the first table, the alphabetical sort. The figures in the column headed ‘string’ are the values found in columns 123 to 125, in this case, the price paid for a bottle of mineral water. The next column (item) tells us how many times each code occurred in those columns, that is, how many people paid each price. We can see the actual number of people and also what percentage of the total sample that is. For instance, 31 respondents paid 111p which is 15.5% of the total (200). The columns labeled cumulative show accumulated totals and percentages for each value found. There are 86 respondents who paid between 111p and 114p, and these are 43.0% of the total respondents.

The second table shows exactly the same information presented in rank order, with the most frequently occurring value first. The example shows that this is 212, and that 41 respondents or 20.5% of all the respondents paid 212p for a bottle of mineral water.

Unlike count, if list is part of a loop, it will be executed once for each pass through the loop. All values found will be entered in the same list: Quantum does not create a separate listing for each pass through the loop.
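The two sorts and the cumulative columns can be sketched in Python using the counts from the alphabetical table in Figure 10.2. This is an illustration, not Quantum's implementation: the function name frequency_tables is hypothetical, and breaking ties in the rank sort by value is an assumption based on the order seen in the figure.

```python
from collections import Counter


def frequency_tables(values):
    """Return (alphabetic_rows, rank_rows); each row is
    (value, count, percent, cumulative_count, cumulative_percent)."""
    counts = Counter(values)
    total = len(values)

    def with_cumulative(pairs):
        rows, running = [], 0
        for value, n in pairs:
            running += n
            rows.append((value, n, 100 * n / total,
                         running, 100 * running / total))
        return rows

    alpha = with_cumulative(sorted(counts.items()))
    # Assumption: equal counts are listed in value order.
    rank = with_cumulative(sorted(counts.items(),
                                  key=lambda kv: (-kv[1], kv[0])))
    return alpha, rank


# Counts taken from the alphabetical sort in Figure 10.2
prices = {"111": 31, "112": 29, "113": 17, "114": 9, "121": 17, "122": 21,
          "123": 4, "124": 1, "211": 3, "212": 41, "213": 1, "214": 3,
          "311": 9, "312": 14}
values = [price for price, n in prices.items() for _ in range(n)]

alpha, rank = frequency_tables(values)
numeric = [int(v) for v in values if v.isdigit()]
print(len(prices), len(numeric), sum(numeric), round(sum(numeric) / len(numeric), 2))
print(alpha[3], rank[0])
```

The first printed line corresponds to the number of categories, number of numeric items, sum of factors and mean value reported at the end of the alphabetical sort.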
PRICE PAID
Total = 200

Alphabetical Sort

string        item             cumulative
111         31   15.5%       31    15.5%
112         29   14.5%       60    30.0%
113         17    8.5%       77    38.5%
114          9    4.5%       86    43.0%
121         17    8.5%      103    51.5%
122         21   10.5%      124    62.0%
123          4    2.0%      128    64.0%
124          1     .5%      129    64.5%
211          3    1.5%      132    66.0%
212         41   20.5%      173    86.5%
213          1     .5%      174    87.0%
214          3    1.5%      177    88.5%
311          9    4.5%      186    93.0%
312         14    7.0%      200   100.0%

Number of categories = 14
Number of numeric items = 200
Sum of factors = 32218.00
Mean Value = 161.09
Std deviation = 67.97

PRICE PAID
Total = 200

Rank Sort

string        item             cumulative
212         41   20.5%       41    20.5%
111         31   15.5%       72    36.0%
112         29   14.5%      101    50.5%
122         21   10.5%      122    61.0%
113         17    8.5%      139    69.5%
121         17    8.5%      156    78.0%
312         14    7.0%      170    85.0%
114          9    4.5%      179    89.5%
311          9    4.5%      188    94.0%
123          4    2.0%      192    96.0%
211          3    1.5%      195    97.5%
214          3    1.5%      198    99.0%
124          1     .5%      199    99.5%
213          1     .5%      200   100.0%

Figure 10.2 A frequency distribution
Multiplied frequency distributions

Quick Reference
To create a multiplied or weighted frequency distribution, type:

list c(start_col, end_col) $text$ c(m_start, m_end)

where text is the frequency distribution title and c(m_start,m_end) is the field in the C array containing the multiplier or weight for each record. If the multiplier contains a decimal point, reference it as cx(m_start,m_end). For a distribution sorted in alphabetic or rank order only, type lista or listr as appropriate instead of list.

Creating multiplied frequency distributions is exactly the same as creating multiplied holecounts:

list c(m,n) [$text$] c(x,y)

As with count, c(m,n) is the column field whose values are to be listed, text is the optional heading to be printed at the top of the page, and c(x,y) is the field containing the multiplier. If the multiplier contains a decimal point, reference it as cx(x,y), otherwise the decimal point will be ignored and, for example, 1.5 will be read as 15.

Multipliers may either be part of the original data, or they may be created during the edit, in which case they must be placed in the C array with a wttran statement before the frequency distribution is requested.

Multiplied frequency distributions are generally required when you are producing weighted tables and you want to check that you have the correct number of people in each row of a table.
☞ For further information about weighting and wttran, see section 1.9, ‘Copying weights into the data’ in the Quantum User’s Guide Volume 3.
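The effect of a multiplier, and of dropping its decimal point, can be illustrated outside Quantum. The following Python sketch is only a model of the behavior described above, not Quantum's implementation; the function name, the record layout and the keep_decimal_point switch are all assumptions made for the illustration.

```python
from collections import Counter
from decimal import Decimal


def weighted_counts(records, value_cols, weight_cols, keep_decimal_point=True):
    """Accumulate a multiplied frequency distribution.

    records     -- list of strings, one per respondent (column 1 is index 0)
    value_cols  -- (start, end) columns of the value to list, inclusive
    weight_cols -- (start, end) columns of the multiplier, inclusive
    """
    totals = Counter()
    for rec in records:
        value = rec[value_cols[0] - 1:value_cols[1]]
        raw = rec[weight_cols[0] - 1:weight_cols[1]].strip()
        if not keep_decimal_point:
            # Models a c(x,y) reference: the point is ignored, so 1.5 reads as 15.
            raw = raw.replace(".", "")
        totals[value] += Decimal(raw)
    return dict(totals)


# Hypothetical records: value in columns 1-3, multiplier in columns 4-6.
records = ["1111.5", "1112.0", "2121.0"]
print(weighted_counts(records, (1, 3), (4, 6)))
print(weighted_counts(records, (1, 3), (4, 6), keep_decimal_point=False))
```

With the decimal point kept, each value accumulates the sum of its multipliers; with it dropped, every multiplier is ten times too large, which is the error the cx reference avoids.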
11 Data validation

In earlier chapters, we discussed ways of examining the data for a set of records (with count) or for an individual record (with write). In general, however, we want to check the validity of the data for individual records by putting in the edit a set of testing sentences which will tell us not only whether a record contains an error but also what that error is.

There are two types of checking sentence. The first involves checking whether a column contains the correct type of coding (single-coding/multicoding) and whether the codes in that column are valid. Take the question on a respondent’s sex which may be Male, coded c106’1’, or Female, coded c106’2’. c106 must be single-coded because a person cannot have two sexes, and the only codes which may appear in that column are 1 and 2. Any record in which c106 is not single-coded with a 1 or a 2 will be flagged as incorrect.

The second type of checking involves making sure that columns whose contents depend on the contents of other columns contain the correct codes. For instance, suppose the questionnaire asks whether the respondent has ever used a particular brand of washing up liquid. The answer is coded into c125 as ‘1’ for Yes and a ‘2’ for No. If the answer is Yes, the next questions concerning price and quality are asked. If c125’2’ indicating that the respondent has not used that brand of washing up liquid, the following columns must be blank. Conversely, if c125’1’, the following columns must be coded according to the codes on the questionnaire.
11.1 require

Both tasks listed above can be carried out using if but sometimes they can become very complicated and repetitive. Therefore, Quantum has an additional testing statement, require, specifically designed to increase the efficiency of this checking process.

☞ For more information on the if statement, see section 9.1, ‘Statements of condition – if’.

The require statement is used in three different ways:

• Column validation. Tests columns against a given set of characteristics and deals with records not meeting the requirements according to a specified action code.

• Testing the validity of a logical expression. Tests a logical expression and, if it is true, continues with the next statement. If the expression is false, the record is dealt with according to the given action code.

• Testing the equivalence of logical expressions. Compares the logical value of a group of logical expressions. If all are true or all are false, the run continues with the next statement, but if the expressions yield a mixture of values the specified error action is carried out.

The actions which are carried out when the stated conditions are violated are determined by an error action code defined either in the require statement itself or in a global statement placed at the start of the edit.

☞ For information about the error action code, see ‘The action code’ in the following section.

The require statement has three forms, depending upon the function it performs, and these are described in the subsequent sections. Each one must start with the word require which may be abbreviated to r.
11.2 Column and code validation

Quick Reference
To validate columns and codes, type:

require [/code/] condition col1 [,col2 ...]

where code is the error action code, condition is the type of coding required, and col1 and col2 are the columns or fields to be tested.

This form of the require statement has four basic parts:

1. The word require or the letter r followed by a space.
2. An optional error action code enclosed in forward slashes.
3. A code defining the type of coding required.
4. The column or columns to be checked, separated by commas.

It looks like this:

require [/code/] condition ca [,cb, c(m,n)]

For example:

r /5/ nb c110, c125

Our example checks that columns 110 and 125 are not blank (nb). Any records in which this is not the case are written out to a new file and rejected from any tables that may be produced (/5/). Let’s deal with each of these items separately.
The action code

Quick Reference
To define a default error action code, type:

rqd number

where number is a number between 0 and 7 inclusive.

The action code is a number between 0 and 7 which tells Quantum what to do with records that do not match the required conditions (for example, records which are blank but which should contain codes). The action code may either be entered as a parameter on each require statement or, if it is the same for all statements, on an rqd statement. Action codes are:

0   Print a summary of errors only: records are not listed individually, but a count is kept of the number of records failing each require statement. This is printed out at the end of the run.
1   Reject the record from the tables.
2   Print the whole record in the print file, out2.
3   Print the record and reject it from the tables. This is the default.
4   Write the record to the output data file, punchout.q.
5   Write the record into the output data file, punchout.q and reject it from the tables.
6   Print the record in the print file, out2, and write it into the output data file, punchout.q.
7   Print, write and reject the record.
To write a statement which would print out incorrect records but include them in the tables, we would write:

r /2/ ....

Similarly, to have all incorrect records printed in the print file, written into the output data file and rejected from the tables, we would write:

r /7/ ....

In both cases the action code is part of the individual require statement, but where the same action applies to all requires, it is quicker and more efficient to define the action code on an rqd statement at the beginning of the edit. For instance, if all erroneous records are to be written out and rejected we would write:

rqd 5

The default action is to print the record out and reject it from the tables:

r /3/ ....

or

rqd 3

and if no action code is defined, this will be assumed.
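One way to remember the eight action codes is to notice that the list above behaves like a sum of three independent actions: 1 for reject, 2 for print and 4 for write. The guide does not describe the codes in these terms, so treat the following Python sketch as a reading aid only; decode_action is a hypothetical helper, not part of Quantum.

```python
def decode_action(code):
    """Split an action code (0-7) into the three actions it combines.

    Code 0 selects none of the three, which corresponds to the
    summary-only behavior in the list of action codes.
    """
    if not 0 <= code <= 7:
        raise ValueError("action code must be between 0 and 7")
    return {
        "reject": bool(code & 1),  # exclude the record from the tables
        "print": bool(code & 2),   # list the record in the print file
        "write": bool(code & 4),   # copy the record to the output data file
    }


for code in range(8):
    print(code, decode_action(code))
```

For example, code 5 decodes to write plus reject, and the default code 3 decodes to print plus reject, matching the descriptions given for those codes.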
Checking type of coding

Checking with require can be as simple or complex as you like. In this section, we will start with the simplest checks and deal with each extra feature in turn. We will assume, unless otherwise stated, that the error action code is the default Print and Reject (code 3) and will omit it from most of the examples accordingly.

The most basic form of the require statement simply checks whether the column or field of columns contains the correct type of code; it does not check the individual codes themselves. Code types may be:

b     Blank
nb    Not blank (single-coded or multicoded)
sp    Single-coded (literally, single-punched)
spb   Single-coded or blank

One of these types must follow the word require since it tells Quantum what to check for. All that remains is to say which columns are to be inspected; just list each column or field of columns at the end of the statement. If more than one column or field is defined, each one must be separated by a comma.

Here are some examples in which the record to be checked is:

----+----1----+----2----+----3----+----4----+ 002411123481231&*1927235537*&& 1 1 1
The statement:

r nb c10, c(25,35)

checks that columns 10, and 25 to 35 inclusive are not blank; they may contain any number of codes. This record satisfies both conditions so it passes on to the next statement in the edit. The statement:

r sp c11, c15, c23, c41

looks to see whether columns 11, 15, 23 and 41 are single-coded. In our record they are, but if this were not the case (say c11’123’) the record would be printed out and rejected from any tables that may be produced. Additionally, Quantum would tell us ‘Column 11 is 123’.

✎ Be careful when using field specifications with require: the condition applies to each column individually, not to the field as a whole. For instance:

r sp c(1,4)

means that each of columns 1, 2, 3 and 4 must contain one code. It does not mean that the field must contain one code overall. To check that a field contains one code only, use numb.

Very often some columns on the questionnaire are not used, so you might like to check that all such columns are blank in the data file. In our example, let’s say that columns 51 to 70 are not used. To check that there are no stray codes in these columns we would write:

r b c(51,70)

☞ For further information about numb, see chapter 5, ‘Expressions’.
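The four coding types, and the rule that a condition applies to each column of a field separately, can be modeled in a few lines of Python. This is a sketch under assumed data structures (a record is represented here as a dictionary mapping a column number to the set of codes punched in it); it is not how Quantum stores or tests data.

```python
def check_type(condition, codes):
    """Test one column's codes against a require coding type."""
    count = len(codes)
    if condition == "b":      # blank
        return count == 0
    if condition == "nb":     # not blank: single-coded or multicoded
        return count >= 1
    if condition == "sp":     # single-coded
        return count == 1
    if condition == "spb":    # single-coded or blank
        return count <= 1
    raise ValueError("unknown coding type: " + condition)


def failing_columns(condition, record, columns):
    """Return the columns that fail; each column is tested on its own."""
    return [col for col in columns
            if not check_type(condition, record.get(col, set()))]


# Hypothetical record: column 2 is multicoded and column 4 is blank.
record = {1: {"1"}, 2: {"1", "2", "3"}, 3: {"4"}, 4: set()}
print(failing_columns("sp", record, range(1, 5)))
```

The sp test over columns 1 to 4 reports columns 2 and 4 individually, which is the per-column behavior the note on field specifications warns about.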
Comments with require

Quick Reference
To define a message to be printed when a record fails a test, type:

r [/err_code/] condition columns $message$

When incorrect records are printed out, require automatically prints a short text describing the error. Normally, it tells you what codes were found in the column which is wrong, but if this is not what you want, you may define your own error text by entering it enclosed in dollar signs at the end of the statement. This text will then be printed in place of the default text when errors are found. For example, if c329 is multicoded when it should be single-coded, the statement:

r sp c329

will print the whole record and tell us which codes were found in that multicode:

Column 329 is 13

Instead of being told which codes the column contains, you may prefer to see a message linking the error to a question on the questionnaire. In this case you will need to add your own error text as follows:

r sp c329 $q21a not sp$

These texts may be as long or short as you like.
Checking codes in columns

Quick Reference
To check for specific codes in a column, type:

r [/err_code/] condition col1’codes1’ [, col2’codes2’ ... ]

where codes1 are the codes to be tested for in column or field col1, and codes2 are the codes to be tested for in column or field col2. Any codes which are present in col1 but are not listed in codes1 are ignored. The same applies to any other column and code pairs listed.

Sometimes it is not sufficient to check just the type of coding, and you will want to know whether the codes found are valid for that column. To do this, we use the information given in the previous section as a base, and add on our first ‘optional extra’.

To check whether a column or field of columns contains specific codes, follow the column specification with the codes to be checked, enclosed in single quotes. For example:

r /5/ sp c223’1/5’

tells us that column 223 should be single-coded within the range of codes 1 through 5. Any other codes in this column are ignored. Thus, a record in which c223’14’ is incorrect because it contains two of the listed codes, whereas a record in which c223’27’ is correct because it contains only a 2 from the range ‘1/5’. Of course, any record which does not contain a 1, 2, 3, 4 or 5 at all is also incorrect, regardless of whether or not it is single-coded: c223’9’ is just as wrong as c223’789&’.

Codes may also be defined with all other code types, thus:

r /3/ nb c156’2/6’

If c156 does not contain at least one of the codes 2 through 6 (regardless of anything else it may contain) the record is printed out. Column 156 may be multicoded as long as at least one of the codes is within the required range.

----+----6
     1
     2

and

----+----6
     2
     7
     8

and

----+----6
     2
     5
     8

are valid, but:

----+----6
     9

is not because ‘9’ is not one of the listed codes.

Even though it checks for blanks, require b may be followed by columns and codes. You would do this when you are checking that a column is either blank or, if not blank, that it does not contain certain codes. Here’s an example to clarify this:

r b c134’1/8’

This statement tells Quantum that column 134 must never contain any of the codes 1 through 8: only ‘09-&’ or blank are acceptable. This is the opposite of r sp and r nb, both of which list valid codes. Any record failing this condition will be printed and rejected via the default action code 3.
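The three code checks described in this section differ only in how many of the listed codes must be present; codes outside the list are ignored. The Python sketch below models that rule using the examples from the text. The function name and the representation of a column as a string of its codes are assumptions, and spb is left out because the text does not spell out its behavior with a code list.

```python
def passes(condition, column_codes, listed):
    """Model a require check against listed codes (no 'only' flag).

    column_codes -- string of the codes punched in the column, '' if blank
    listed       -- string of the individual codes named in the statement
    """
    hits = len(set(column_codes) & set(listed))
    if condition == "sp":   # exactly one listed code; other codes ignored
        return hits == 1
    if condition == "nb":   # at least one listed code
        return hits >= 1
    if condition == "b":    # none of the listed codes
        return hits == 0
    raise ValueError("condition not modeled: " + condition)


# r sp c223'1/5'
print(passes("sp", "14", "12345"), passes("sp", "27", "12345"))
# r nb c156'2/6'
print(passes("nb", "278", "23456"), passes("nb", "9", "23456"))
# r b c134'1/8'
print(passes("b", "90", "12345678"), passes("b", "3", "12345678"))
```

Each pair prints a passing case followed by a failing one, mirroring the valid and invalid records discussed above.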
Exclusive codes

Quick Reference
To check that a column or field contains no codes other than those listed, type:

r [/err_code/] condition col1’codes1’o

If col1 contains any codes other than those given in codes1, the test is false.

Now that you know how to check codes, the next thing to discuss is how to check that all other code positions are blank. We have said that statements of the form:

r sp ca’p’

accept all records containing only one of the codes ‘p’ in column a, regardless of what other codes are also present. To check that a column contains only the listed codes and nothing else, follow the code specification with the letter o (for only) in upper or lower case. For example, to indicate that c356 must be single-coded in the range ‘1/5’ and that all other positions (‘6/&’) must be blank, you should type:

r sp c356’1/5’o

which is the same as:

if (c356’6/&’.or.numb(c356).ne.1) write; reject

Any of the following would cause the record to be printed and rejected:

c356’34’
c356’59’
c356’8’
c356’ ’

The require statement may define conditions for more than one column. Just follow each column with the code positions to be checked and separate each set with a comma:

r sp c164’12-’, c165’1/70’, c166’1/3’, c167’1/9-’, c168’1/5’

Here the columns to be checked are consecutive but have been listed separately because they each have different sets of valid codes. If all columns could be single-coded in the range 1 to 7 we might abbreviate this to:

r sp c(164,168)’1/7’ $q10a/e$

since this notation means that each column in the field must be single-coded within the given range rather than that the field as a whole may contain only one of those codes.
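The only flag can be read directly off the equivalent if statement given in this section: the record fails when the column holds any unlisted code or when it does not hold exactly one code. The Python sketch below restates that rule; the name sp_only and the set of twelve code positions in ALL_CODES are assumptions made for the illustration.

```python
ALL_CODES = "1234567890-&"  # assumed full set of code positions in a column


def sp_only(column_codes, listed):
    """Model r sp c'listed'o: one listed code and nothing else."""
    present = set(column_codes)
    unlisted = set(ALL_CODES) - set(listed)
    has_unlisted = bool(present & unlisted)
    # Fails if any unlisted code is present or the column is not single-coded.
    return not has_unlisted and len(present) == 1


# The four failing examples for r sp c356'1/5'o, then one passing column.
for codes in ["34", "59", "8", "", "3"]:
    print(repr(codes), sp_only(codes, "12345"))
```

The blank column fails for the same reason as in the text: sp requires one code to be present, and the only flag does not relax that.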
Automatic error correction

Quick Reference
To define a correction code to be used as a replacement for codes which fail the required condition, type:

r [/err_code/] condition col1’codes1’ :’new_code’

new_code is the code or codes to be inserted in col1 if it fails the test condition. Any codes already in that column are overwritten.

As you know, records found to have errors are printed, coded and/or rejected according to the error action code. When the run is finished you will look at these records and, if possible, correct the errors by using the on-line edit or correction file facilities.

☞ For information about on-line editing and the corrections file, see chapter 12, ‘Data correction’.

Occasionally you will know in advance what to do with certain types of error; say, for instance, the respondent’s sex has been miscoded. You may decide or be told to recode this person as a ‘3’ in the appropriate column indicating that the sex was not known. The way to do all this in one go is to write the normal require statement that checks columns and codes, and to follow the code specification with a colon (:) and the replacement code (in this case ‘3’) enclosed in single quotes, thus:

r /2/ sp c106’12’ :’3’

Any record in which c106 is not single-coded with either a ‘1’ or a ‘2’ will have the contents of c106 overwritten with a ‘3’. The equivalent using if and an assignment statement would be written:

if (numb(c106’12’).ne.1) c106’3’;
+write $c106 incorrect$

Once again, the require is shorter and quicker.

When working with fields, it is not possible to define replacement strings for the field as a whole. You should, however, note that if a single replacement code is given for a field of columns, any incorrect columns in that field will be overwritten with the replacement code. The correct columns remain untouched. If we have:

+----4----+
  1927

and the require statement specifies c(237,240)’1/5’ :’&’ we will have:

+----4----+
  1&2&

✎ If you use this facility, remember that the replacement code is an alteration to the data, and as such is operative only as long as each record is in the C array. If you want to save these modifications you must include a statement in your edit which will write records to another file. Statements which write out new data files are split and write. Alternatively, you can use one of the action codes which writes records to the output data file.

☞ For information about split, see section 12.4, ‘Creating clean and dirty data files’. For information about write, see section 7.1, ‘Print files’.
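The field example above, in which 1927 becomes 1&2&, follows from applying the check and the replacement to each column in turn. The Python sketch below reproduces it. It assumes the coding type is sp with no only flag, which the example in the text does not state, and correct_field is a hypothetical name.

```python
def correct_field(columns, listed, replacement):
    """Overwrite each failing column with the replacement code.

    columns -- list of strings, one per column, each holding that
               column's codes ('' for a blank column)
    A column passes when it holds exactly one of the listed codes
    (assumed sp check, other codes ignored).
    """
    corrected = []
    for codes in columns:
        hits = set(codes) & set(listed)
        corrected.append(codes if len(hits) == 1 else replacement)
    return corrected


# Columns 237 to 240 hold 1, 9, 2 and 7; the valid codes are 1 to 5.
print(correct_field(["1", "9", "2", "7"], "12345", "&"))
```

Columns 237 and 239 pass and are left alone, while columns 238 and 240 are overwritten, giving the same result as the guide's example.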
Defaults in a require statement

Quick Reference
To define defaults for all columns or fields tested, type:

r type [’codes’][o] [:’new_code’] columns

The defaults may be overridden for an individual column by following the column with the required coding, only flag and replacement code as usual.

By now you will have guessed that require statements can become lengthy things, especially when specific codes have to be checked, replacement characters defined and error texts entered. In many cases some, if not all, of these items will be common to the majority of the columns listed in the statement; for instance, several non-consecutive columns may have the same set of valid codes. When this happens you may enter these common items at the beginning of the require statement as defaults for that statement. There are several ways of doing this, so let’s take the statement:

r spb c127’0/9’o, c129’0/9’o, c131’0/9’o, c133’0/9’o

as an example. This can be more efficiently written as:

r spb ’0/9’o c127, c129, c131, c133

Both statements check whether columns 127, 129, 131 and 133 are single-coded in the range 0 to 9 or are blank. If the − or & codes appear in any of these columns, or if the columns are multicoded, the offending records will be printed and rejected.

Defaults defined at the start of a require may be overridden for an individual column or field by following that item with the new specification. For example:

r sp ’1/5’ c10, c12, c15, c20’1/3’

tells us that columns 10, 12 and 15 must be single-coded in the range 1 to 5 while column 20 must be single-coded in the range 1 to 3. Here is another example which uses the Only operator:

r sp ’1/5’o c10, c12, c15, c20’1/7’, c24

This checks that columns 10, 12, 15 and 24 are single-coded in the range 1 to 5 and that none of the codes ‘6/&’ are present in those columns. Column 20 has its own code specification which overrides not only the default codes but also the Only operator. Quantum will check that c20 contains only one of the codes 1 to 7, but it will ignore anything it finds in the range ‘8/&’.

Finally, let’s look at one more statement:

r sp ’1/5’o :’&’ c10, c12, c20’1/7’, c24

This is exactly the same as the previous example except that we have added a replacement code to be used when errors are found. This code refers to all columns named with this require, even though column 20 has a different set of valid codes.
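The way defaults and per-column overrides combine in the last example can be written out as a small resolution step. The Python sketch below is an interpretation of the rules stated above (a column's own code list replaces both the default codes and the default only flag, while the default replacement code still applies); the dictionary layout and the name resolve_columns are assumptions.

```python
def resolve_columns(defaults, columns):
    """Work out the effective specification for each column.

    defaults -- dict with keys 'codes', 'only' and 'replacement'
    columns  -- list of (column, override) pairs; override is a dict
                that may contain 'codes', 'only' and 'replacement'
    """
    resolved = {}
    for column, override in columns:
        spec = dict(defaults)
        if "codes" in override:
            # A column's own codes cancel the default only flag too.
            spec["codes"] = override["codes"]
            spec["only"] = override.get("only", False)
        if "replacement" in override:
            spec["replacement"] = override["replacement"]
        resolved[column] = spec
    return resolved


# r sp '1/5'o :'&' c10, c12, c20'1/7', c24
defaults = {"codes": "12345", "only": True, "replacement": "&"}
columns = [(10, {}), (12, {}), (20, {"codes": "1234567"}), (24, {})]
for column, spec in resolve_columns(defaults, columns).items():
    print(column, spec)
```

Column 20 ends up with its own codes and no only flag but keeps the & replacement, while the other columns take every default.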
11.3 Validating logical expressions

Quick Reference
To evaluate a logical expression, type:

r [/err_code/] (logical_expr) [$message$]

The require statement can be used to evaluate a logical expression. If the expression is false, the record will be dealt with according to the specified (or default) action code. If the expression is true, the program continues with the next statement. This type of require also has four parts, two of which are optional:

1. The word require or the letter r followed by a space.
2. An optional action code enclosed in slashes.
3. A logical expression enclosed in parentheses.
4. An optional error text enclosed in dollar signs.

☞ Items 1, 2 and 4 are exactly as described in section 11.2, ‘Column and code validation’, above. For further information about logical expressions, see chapter 5, ‘Expressions’.

For example:

r /3/ (c133’4’ .and. c140n’5’) $Cols 33/40 incorrect$

says that c133 must contain a ‘4’ and c140 must not contain a ‘5’. If one or other or both expressions are false, Quantum prints the record out with the message ‘Cols 33/40 incorrect’ and rejects it from the tables.

This type of require statement is often used to check the number of codes present in a column or group of columns. For example, if the questionnaire specifies that the respondent should name no more than three products in his answer, you might write:

r (numb(c139).le.3)

causing any record in which column 139 is multicoded with more than 3 codes to be printed and rejected. This statement has no error text, so any records printed will be followed by the require statement itself.
11.4 Testing the equivalence of logical expressions

Quick Reference
To test whether a group of logical expressions all have the same logical value, type:

r = [/err_code/] (expression1) (expression2) ...(expressionn) [$message$]

There must be a space between r and the = sign.

Require can evaluate groups of expressions and perform given tasks depending on whether all expressions are true or all are false. When all the expressions have the same value (i.e., all true or all false) Quantum continues with the next statement in the program, whereas if some are true and some are false, the record being tested will be dealt with according to the given (or default) error action code.

This statement has five parts:

1. The word require or the letter r.
2. An equals sign which must be preceded by a space.
3. An optional action code.
4. The expressions to be evaluated, each one enclosed in parentheses.
5. Optional error text enclosed in dollar signs.

This type of statement is generally used to check routing patterns. For example, if a ‘2’ in c125 means that the respondent did not try Brand A washing powder, we would expect columns 126 to 145 which record his opinion of it to be blank. On the other hand, if he tried the washing powder, we would expect to find his opinions about it coded in columns 126 to 145. This can be written:

r = (c125’2’) (c(126,145)=$ $)

which says that to be accepted, a record must either have a ‘2’ in column 125 and blanks in columns 126 to 145, or something other than a ‘2’ in c125 with at least one code somewhere in c(126,145). The following data is designed to clarify this.

----+----3----+----4----+----5 2 15

is accepted, so is

----+----3----+----4----+----5 15 42674 262&03 37 73 9 4 0

but

----+----3----+----4----+----5 2 6 8 15

is rejected, so is

----+----3----+----4----+----5 3 635

The first example is accepted because both expressions are true; the second is accepted because both expressions are false. The third and fourth examples are both rejected because one expression is true and the other is false. Note that in this example, if column 125 does not contain a ‘2’ we are only checking that columns 126 to 145 contain at least one code; we are not checking whether those codes are correct.
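The equivalence test reduces to one rule: the record passes when every expression is true or every expression is false. The Python sketch below applies that rule to the routing check from this section. The record representation (a dictionary from column number to the string of codes in that column) and the function names are assumptions made for the illustration.

```python
def all_same(*values):
    """True when the logical values are all true or all false."""
    return all(values) or not any(values)


def field_is_blank(record, start, end):
    """True when every column from start to end holds no codes."""
    return all(record.get(col, "") == "" for col in range(start, end + 1))


def routing_ok(record):
    # Models: r = (c125'2') (c(126,145)=$ $)
    did_not_try = "2" in record.get(125, "")
    return all_same(did_not_try, field_is_blank(record, 126, 145))


print(routing_ok({125: "2"}))             # both true: accepted
print(routing_ok({125: "1", 130: "4"}))   # both false: accepted
print(routing_ok({125: "2", 130: "6"}))   # true and false: rejected
print(routing_ok({125: "3"}))             # false and true: rejected
```

The four hypothetical records correspond to the four cases discussed above: two in which the expressions agree and two in which they differ.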
11.5 Actions when a require statement fails

When Quantum executes a require statement, it sets the variable failed_ to True if the data fails the require statement or to False if the record passed the requirement. You can then test whether failed_ is True and take whatever actions you wish. For example, if you are checking that the respondent’s sex is coded as a ‘1’ or a ‘2’ only, you may wish to blank out the column if it contains any other code or codes. You could write this as:

r sp c123’12’
if (failed_) set c123’ ’

The test for failure is made on the last require statement executed for the current record. This may not always be the most recent require statement in the program, and it may not be the require statement you intend Quantum to execute. If you write:

r sp c112’1/5’
if (c115’1’) r b c116
if (failed_) set c116’ ’

the test for failure could apply to either of the previous statements. If column 115 does not contain a ‘1’, the second require statement will not be executed and failed_ will be True if column 112 is not single-coded in the range ‘1/5’. If column 115 contains a ‘1’, then failed_ will be True if column 116 is not blank. You can get around this potential problem by setting failed_ to zero (the equivalent of False) just before the require statement you wish to test. For example:

r sp c112’1/5’
failed_ = 0
if (c115’1’) r b c116
if (failed_) set c116’ ’
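The stale-flag problem is easy to reproduce in a model. The Python sketch below walks one record through the three statements above, with and without the reset. It only imitates the behavior described in this section; the record layout and the function run_edit are assumptions, and the local variable failed stands in for failed_.

```python
def run_edit(record, reset_before_second):
    """Model the c112/c115/c116 example, optionally resetting the flag.

    record maps a column number to the string of codes in that column.
    """
    rec = dict(record)

    # r sp c112'1/5': fails unless exactly one of the codes 1 to 5 is present
    failed = len(set(rec.get(112, "")) & set("12345")) != 1

    if reset_before_second:
        failed = False                      # failed_ = 0

    if "1" in rec.get(115, ""):
        failed = rec.get(116, "") != ""     # r b c116

    if failed:
        rec[116] = ""                       # set c116' '
    return rec


# Column 112 is wrong, column 115 is not '1', so r b c116 never runs.
record = {112: "9", 115: "2", 116: "7"}
print(run_edit(record, reset_before_second=False)[116])
print(run_edit(record, reset_before_second=True)[116])
```

Without the reset, the failure left over from the first require blanks column 116 even though it was never tested; with the reset, column 116 keeps its code.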
11.6 Combining testing sentences

Require is often part of an if statement saying “If this is true, then that also must be true”. In our previous example with r= we were saying two things:

• if the respondent didn’t try Brand A, the columns associated with it must be blank, or

• if he tried Brand A, there must be a code in at least one of the associated columns.

Using r= is more efficient than writing:

if (c125’2’) r b c(126,145); else; r (c(126,145)u$ $)

which will perform exactly the same test.

Sometimes this type of test is too stringent and will reject records in which the data is perfectly correct. For example, the extra questions for people who tried the product may not contain a specific code for Refused or No Answer, so anyone who tried the product but refused to answer the extra questions would have blanks in the relevant columns. This data is perfectly correct but would be rejected by the r= statement which expects at least one column to contain a code. Therefore, we need to write a statement that will only check whether columns 126 to 145 are all blank if the respondent didn’t try the product; if the respondent tried the product we do not care whether he answered the extra questions or not. The statement for this is:

if (c125’2’) r b c(126,145) $Incorrect routing$

but we can make this statement more powerful by writing:

if (c125’2’) r b c(126,145); else; r spb c(126,145)’1/7’

This says that if the respondent did not try Brand A, all columns associated with it must be blank, but if he tried the product we expect those columns to be single-coded in the range ‘1/7’ or blank.

One can also make require statements apply to smaller sets of data by having records for which they would be irrelevant go around the statements. Let’s say c112 records whether there are children in the household. If c112’1’ there are children and c113 and c114 must contain answers. We could write:

if (c112n’1’) go to 30
r nb c(113,114)
30 continue

This means that all irrelevant records (respondents without children) would not be tested.

✎ This system makes sense when there are several requires and you want to avoid a whole set of identical if statements. It’s more efficient and it’s easier to follow. Remember, as well, that you can put in comments to remind yourself what you are doing and why.
12 Data correction

It is always possible to deal with data which has been incorrectly coded and/or entered. If the errors themselves cannot be corrected because correct codes cannot be determined, the incorrect data can be collected under some miscellaneous heading in the tabulations. However, a cleaner data set can be obtained by correcting or removing invalid data whenever possible.

There are four ways to correct data:

• Correct the data in the original data file.

• Correct the data in the C array interactively.

• Replace the incorrect codes with specific codes using edit forcing statements.

• Write a file of corrections to be merged with the original data when it is read in by a Quantum program.

Changing the contents of the original data file is not a function of Quantum: you will need to use the data editing program, ded, for this. If you do need to edit the original data file, you should always take a copy of it first in case your editing does not have the desired effect.

☞ For further information about ded, see the SPSS MR Utilities Manual.
12.1 Forced editing (forced cleaning)

This section does not introduce any new keywords; instead it tells you how to combine the statements that you already know in order to clean your data.

A record which generates too many error messages, or which is clearly incorrect, can be removed, as noted. Suppose its serial number is 2004. Then we have:

if (c(101,104)=$2004$) reject; return

This rejects the record from the rest of the edit and the tabulation section as well. This statement should be at the beginning of the edit to avoid unnecessary editing of a useless record.

Columns within a record can be removed by blanking them out or setting them to a common reject code, often a minus or ampersand. For example:

if(c125n’12’) c125’&’; c(126,145)=$ $

All records in which c125 contains neither a 1 nor a 2 will have the contents of that column replaced with an ampersand, and whatever is in c(126,145) blanked out. As a real-life example, suppose a 1 in c125 means that the respondent visited the market, and a 2 in that column means he did not. Information about purchases made at the market is stored in c(126,145). If column 125 contains neither a 1 nor a 2, we cannot clearly establish whether or not the respondent visited the market so we set c125 to a special code and blank out any information about purchases.

Inserting correct data is generally more difficult than removing invalid data, because you very often don’t know what the correct data is. However, if you do know, you can correct the data record by record, or make the same correction for any record which is incorrect. For instance:

if(c(101,104)=$2222$) c112’2’; c(113,114)=$ $

corrects the record whose serial number is 2222 by setting a 2 into c112 and blanking out c(113,114).

If you do not know what the correct data is, you may decide to replace the incorrect code or codes with a valid code chosen at random. For example:

if (c(101,104)=$3625$) c145=rpunch(’1/5’)

replaces whatever was in column 145 with one of the codes 1 through 5 for the record whose serial number is 3625.

✎ When correcting data on a record-by-record basis, it is more convenient to use the methods outlined below.
12.2 On-line data correction Quick Reference To allow interactive correction of errors, type: online [label_number] at the point at which you want to make corrections. label_number is the label of the statement to execute when the record is returned to the main edit with an rt command. The default is to return to the start of the edit.
On-line correction is a method whereby Quantum interrupts processing when incorrect records are found, so that corrections, if any, may be made interactively. The record may then be re-edited straight away to check for further errors.

When an incorrect record is found, the current contents of the C array are written to the print file, out2, as usual, and a message is displayed on your screen indicating the record’s position in the data file. Any messages associated with the write or require statement finding the error are also displayed, and you then have the opportunity to accept the record as it is, reject it, correct it or re-edit it. The record itself is not displayed unless you request it.

To use this facility, enter the word:

    online

in the edit at the point you want to be able to correct records. You may put in as many online statements as you like, but as long as there is one online statement in the edit, on-line editing will be possible both at the point where the statement occurs and also at the end of the edit. If there are no errors to be corrected, Quantum ignores the online statements.

Once an incorrect record has passed through the on-line edit, you may leave it to continue through the rest of the standard edit until it reaches the end statement or you may return it to the start of the edit to be retested. If you prefer, you may name a statement to which records should return simply by giving that statement a label number and following online with that number. For example:

    online 45
returns records to statement 45 rather than the start of the edit.
✎ Runs containing on-line edits must be run from a terminal rather than in the background until the edit section is finished; otherwise you will not know when there is a record awaiting correction.

Any corrections made during on-line editing are effective only during the current run unless your edit contains one of the commands split or write to create a new data file. If your program calls the on-line editor but does not contain split or write, a warning message will be displayed when your program is checked.
12.3 On-line editing commands

Like any other editor, the Quantum on-line editor has its own set of commands, many of which are similar in appearance and function to statements you would write in a normal Quantum edit. There are three types of editing command:

• Those which determine what happens to the record.
• Those which correct errors in the record.
• Those which terminate on-line editing either for the individual record or for the file as a whole.
Displaying columns in the record

Quick Reference
To display the record being edited, type:
    di [column(s)]
As we said in the introduction to on-line editing, Quantum displays any messages associated with the write or require statement finding the error, but does not automatically display the record itself. It also displays an arrow prompting you for a command.

To display the full record in its current state, type display or di. The whole record is displayed underneath a ruler, as with the write statement. Sometimes it is easier to see the error if you print out the incorrect column or columns separately rather than looking at the whole record. To see a column or field only, just follow the di command with the numbers of the columns you wish to see. For example:

    di c10           displays column 10
    di c(115,130)    displays columns 15 to 30 of card 1

Column fields may be entered as just two column numbers separated by a comma, the parentheses and the C being optional. Thus, the second example could equally well be written:

    di 115,130
When a single column is displayed, the individual codes comprising a multicode are shown, but when fields are displayed, a ruler is printed and multicodes appear as asterisks (*). Here is an example:

    -> di 25,35
    +----3----+
    613*9 2 144
    -> di 28
    159
    ->
In the first example, the asterisk represents a multicode, whereas in the second example where only one column is displayed, the codes 1, 5 and 9 are a multicode in column 28.
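The display rule just described can be illustrated with a short Python sketch. This is not Quantum code and not how the on-line editor is implemented; the function names and the idea of holding each column as a string of its codes are assumptions made purely for the illustration.

```python
def display_field(columns):
    """Show a field: one character per column, '*' for a multicode."""
    shown = []
    for codes in columns:
        if len(codes) == 0:
            shown.append(" ")      # blank column
        elif len(codes) == 1:
            shown.append(codes)    # single code is shown as it is
        else:
            shown.append("*")      # multicode is shown as an asterisk
    return "".join(shown)


def display_column(codes):
    """Show a single column: every code in it is listed."""
    return codes if codes else " "


# Columns 25 to 35 of the example record; column 28 holds codes 1, 5 and 9.
field_25_35 = ["6", "1", "3", "159", "9", "", "2", "", "1", "4", "4"]

print(display_field(field_25_35))      # the field, with column 28 as '*'
print(display_column(field_25_35[3]))  # column 28 on its own
```

The first call reproduces the line shown under the ruler in the example, and the second reproduces the result of di 28.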
Correcting records

Quick Reference
To overwrite the current contents of a column or field with a new code or string, type:
    [s] column(s) codes
To insert additional codes into a column or field, type:
    e column(s) codes
To delete codes from a column or field, type:
    de column(s) codes
In all cases, columns are defined as numbers only, without c or parentheses.
The words used for correcting records are set, emit and delete which are usually abbreviated to s, e and de. They work in exactly the same way as their counterparts in the ordinary edit section: s overwrites the original contents of a column or field with new information; e appends a single code to the codes that are already in a column and de removes one or more codes from a column leaving the remainder intact.
There are many variations of these commands, all of which are equally correct. Just choose the one that you find most convenient. Here are some examples. The first group are set statements for overwriting the contents of a column or field with the given code or string of codes.

    set c5’7’           s c5=7              s 5=7        s 5 7
    set c9’45&’         s c9=45&            s 9=’45&’    s 9 ’45&’
    s 123,126=4567      s 123,126 $4567$    set c(123,126)=$4567$
If you want to overwrite a single column with a single code, use one of the four formats on the first line. In all cases you may type in the full command word (set) or the abbreviation (s). All four variations replace whatever is currently in c5 with a code 7.

The examples on the second line are for overwriting a single column with a multicode. Notice that if you use the = notation, the single quotes enclosing the multicode are optional.

The last line illustrates how to overwrite a field of columns with a string: in this case, to replace the current contents of columns 123 to 126 with the codes 4, 5, 6 and 7 respectively.

In all on-line set statements you may omit the set or s at the beginning of the command, thus:

    c5=7
    9=’45’
    123,126 $4567$
When it comes to adding codes to columns, the on-line editor has an option that the ordinary editor does not. Whereas the ordinary emit statement only allows you to specify single columns, the on-line editor also allows you to emit strings of single-codes into a field of columns. Thus, the syntax of the on-line emit statement is:

    emit c321’7’          e 321=7             e 321 7
    emit c(16,17)=$77$    e c(16,17) $77$     e 16,17 77
The same notes apply to deleting codes: the on-line edit allows you to delete codes from a single column or a field:

    delete c123’7’          de c123 7            de 123 7
    delete c54’34’          de c54 ’34’          de 54 34
    delete c(16,17)=$77$    de c(16,17) $77$     de 16,17 77
✎ In all the examples we have just shown, the c, equals sign, single quotes and dollar signs are optional as long as the components of each statement are separated by spaces. Additionally, in assignments, set (or s) is optional.
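The difference between set, emit and delete can be illustrated with a small Python model of a column as a collection of codes. This is a sketch for explanation only, not Quantum code: the function names, the dictionary used to hold the record, and the choice to keep codes in the order 1234567890-& are all assumptions made for the example.

```python
CODE_ORDER = "1234567890-&"   # assumed ordering of the twelve codes


def normalise(codes):
    """Return the given codes once each, in CODE_ORDER."""
    present = set(codes)
    return "".join(ch for ch in CODE_ORDER if ch in present)


def set_codes(record, col, codes):
    """set: overwrite whatever is in the column."""
    record[col] = normalise(codes)


def emit_codes(record, col, codes):
    """emit: add codes, keeping those already in the column."""
    record[col] = normalise(record.get(col, "") + codes)


def delete_codes(record, col, codes):
    """delete: remove the named codes, leaving the remainder intact."""
    record[col] = normalise(ch for ch in record.get(col, "") if ch not in codes)


record = {9: "1"}                # column 9 currently holds a 1
set_codes(record, 9, "45&")      # like: s 9 '45&'
emit_codes(record, 9, "7")       # like: e 9 7
delete_codes(record, 9, "4&")    # like: de 9 '4&'
print(record[9])
```

After the three steps the column holds the codes 5 and 7: the set replaced the original 1, the emit added a 7 alongside the existing codes, and the delete removed only the 4 and the &.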
Whenever you alter columns with set, emit or delete, the on-line edit checks that the columns you are editing are within the range of the C array for the current job. If you are using the default array of 1,000 cells, c1001 and above are out of range for editing.
Accepting and rejecting records

Quick Reference
To accept a record whether or not it has been corrected, type:
    ac
To terminate the edit and send the record to the tabulation section, type:
    rt
To reject the record from the tables but continue the edit, type:
    rj

The following commands may be used to determine a record’s path through the remainder of the edit section and the tabulation section:

ac (accept)
    Accepts the record up to the point at which the online statement occurs, whether or not it has been corrected. The record continues on through the rest of the edit and will only be re-presented for correction by other online statements or at the end of the edit if other errors are found. Records accepted in this way are written to the clean data file if split or write are used.

rt (return)
    Terminates the edit for that record: that is, the record is assumed to have reached the end statement. If split or write has not yet been reached, the record will not be written to the clean data file even though it will be included in any tables produced by the run.

rj (reject)
    Rejects the record. The record continues through the edit unless it is terminated with rt. The record is copied to the dirty data file.
Creating and deleting cards

Quick Reference
To add new cards to the output data file, type:
    ad card_num1[card_num2 ... ]
To remove cards from the output data file, type:
    rm card_num1[card_num2 ... ]

The add command adds new cards to the output data file and rm removes cards from it. To add a card type, type add or ad followed by the number of the card type to be added. If you are adding several different cards at once, separate the card type numbers by spaces. Quantum will then set the appropriate thisread variable to be true so that the new card type will be written out with the rest of the data. Thus:

    -> ad 3 4
will set thisread3 and thisread4 to be true so that the new cards 3 and 4 will be written out. Each card will contain as many columns as the record length defined for the current run. If the C array already contains data for a card 3 or 4, Quantum issues an error message to this effect. Removing cards is exactly the same, except that the appropriate thisread variables are reset to false to prevent the unwanted cards from being written out. It does not alter the data in your original data file. If you try to delete a card that is not currently in the C array (i.e., the thisread variable is already false) an error message is displayed.
Record editing commands

Quick Reference
To return the record to the start of the main edit section, type:
    ed
or press the RETURN key.
The edit command (abbreviation, ed) re-edits the record by sending it back to the start of the edit or to the statement number given with online. If no more errors occur, the record is copied to the clean data file. If you prefer, you may hit the return key instead of typing ed.
Canceling the on-line edit

Quick Reference
To cancel on-line editing for the rest of the data file, type:
    ca
cancel (abbreviation, ca) cancels on-line editing but continues passing records through the standard edit program. Any errors found subsequently are not displayed on the screen for correction, but records are still placed in the clean or dirty files as appropriate.
Redefining on-line edit command names

The on-line edit commands we have just described are the defaults which are programmed into Quantum. If you wish, you may redefine these command names or translate them into a language other than English, or define your own abbreviations. You do this in the translatable texts file.
☞ To find out about this file, see section 1.9, ‘Customized texts’ in the Quantum User’s Guide Volume 4.
12.4 Creating clean and dirty data files

Quick Reference
To write correct records out to a clean data file and incorrect records out to a dirty data file, type:
    split [only]
at the point at which records are to be written out. Type split only if the edit does not alter the contents of the record and you want to copy records directly from the original data file rather than from an intermediate file.
Clean and dirty data files are the terms used to refer to files of correct and incorrect or rejected records created automatically by the edit statement split. Each time a record is read and reaches split, it is written out to the appropriate file in its current state. If any changes have been made with assignment statements, emit, delete, priority, require or the on-line edit, they will be saved in the clean data file if the record is now correct or in the dirty data file if the record still contains errors or has been rejected.
Split may occur several times in the edit, but each record will be written out once only. In the example below, the second split is redundant since all records will have been written out by the first one. The data to be checked is:

    Card 1         Card 2         Card 3
    +----5----+    3----+----4    +----1----+
     5                 2              3

and the program is:

    r sp c234’1/5’, c309’1/5-&’ :’&’
    split
    if (c146’12’) emit c180’1’; else; reject
    split
Let’s suppose that the record has reached the require statement without error. Since c234’2’ and c309’3’, the record is correct so it is copied to the clean file. However, when the next statement is read and the contents of c146 are checked, we find that it contains a ‘5’ which means that it must be rejected and should be copied to the dirty file by the second split. This does not happen because it has already been written out by the previous split. For this example to place the record in the dirty file instead, it should read:

    r sp c234’1/5’,c309’1/5-&’ :’&’
    if (c146’12’) emit c180’1’;else; reject
    split
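The "written out once only" rule can be illustrated with a small Python simulation. This is a deliberately simplified sketch and not a description of how Quantum works internally: the run function, the dictionary standing in for the record, and the rejected flag are assumptions made for the example, and the require statement is left out because the record passes it.

```python
def check_c146(record):
    """Model of: if (c146'12') emit c180'1'; else; reject"""
    if record["c146"] in ("1", "2"):
        record["c180"] = "1"
    else:
        record["rejected"] = True


def run(steps, record):
    """Pass one record through the edit and report which file it lands in."""
    written_to = None
    for step in steps:
        if step == "split":
            # Only the first split reached by a record writes it out.
            if written_to is None:
                written_to = "dirty" if record["rejected"] else "clean"
        else:
            step(record)
    return written_to


sample = {"c146": "5", "rejected": False}

# First version: split both before and after the check on c146.
print(run(["split", check_c146, "split"], dict(sample)))

# Corrected version: a single split after the check.
print(run([check_c146, "split"], dict(sample)))
```

With the split placed before the check, the record is already in the clean file by the time it is rejected; with the single split at the end, it goes to the dirty file.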
Split is often used at the end of an edit after online. This causes all records found in error by write and require statements to be offered in the on-line edit for correction and then saved in the clean or dirty file according to the type of on-line commands you use. For example, if a record is flagged as incorrect and you correct those errors, the record will be placed in the clean data file. The same is true if you use ac to accept the record even if you do not make corrections. If you reject the record with rj, the record will be placed in the dirty data file. By putting both statements at the end of the edit you can be sure of seeing all erroneous records and of saving all records in their final state.

If some records are rejected from the run using reject;return, these records will not be included in the clean or dirty files unless the data is split before the records are rejected:

    split
    if (c132n’1/9’) reject; return
In this example, because split appears in the edit before reject;return, all records will appear in one or other of the clean or dirty files (depending on whether or not they contain errors) even though records in which c132 does not contain any of the codes 1 through 9 have their edit terminated and are rejected from the tables.
    if (c132n’1/9’) reject; return
    split

Here, because split appears after reject; return, only records in which c132 contains any of the codes 1 through 9 will appear in the clean or dirty files. Again, which file the records are written to depends on whether or not they contain errors.
☞ For further information about using reject, see section 9.6, ‘Rejecting records’. For further information about using return, see section 9.7, ‘Jumping to the tabulation section’.
By default, an intermediate data file is created for splitting. The name of this file is clean.q. If the run does not contain statements which alter the data (for example, recoding with assignment statements or creating new columns) then this file will be identical to the original data file. In such cases, you may save disk space during the run by splitting the original data file instead with the statement:

    split only

When we talk about the original data file, we do not mean that Quantum alters your original data file in any way; merely that it reads records directly from this file and allocates them to the clean and dirty files rather than taking a backup copy of this file and reading records from there.
✎ You may not use split only when the datapass reads input from another program (for example, when you use a corrections file to correct records rather than writing a forced edit or using the on-line edit). Instead, you should run Quantum using the corrections file only and write all records to a new data file. Then run the datapass on this new data file.

If you do an on-line edit but forget split or write, your changes will not be saved. Also, if you have created new cards and have not made thisread true for the new cards (for example, thisread3=1 for a new card 3), they will not be written out.

If you use split on a levels (trailer card) job, splitting is switched on for all levels and must therefore be part of the top level edit. Additionally, it must appear once only and must not be part of an if statement. A reject statement at any level rejects the whole record and writes it to the dirty file.
12.5 Correcting data from a corrections file

Quick Reference
To correct data using a corrections file, create a file called corrfile containing statements of the form:
    serial_number [/trailer_read_number]; correction1 [;correction2; ...]
where correction is a statement of the form:
    command column = codes
command is s (set), e (emit), or d or de (delete).
The last method of correcting errors is to create a file of corrections which will be merged with the original data when it is read by a Quantum program. The corrections file must exist in the directory or partition in which you will be running your job.

Corrections are made by comparing the serial number of the record currently in the C array with the serial number given with each correction in corrfile. Consequently, all serial numbers in corrfile must be in the same order as those in the data file. The format for a correction record is:

    serial ; corrections

for non-trailer card records, and

    serial /n ; corrections

for records containing trailer cards. In both cases, serial is the record serial number and corrections are the corrections to be made. The /n in the trailer card format is the read number defining the trailer card to be corrected; it can be found from the error listing. For example, if our data contains a card 1, three card 2s and a card 3, and we want to correct an error on the third card 2, the read number would be /3 because the third card 2 is read into the C array during the third read. If /n is omitted, the read number is assumed to be 1.

Corrections are entered as follows:

    s cn = ’p’     To overwrite a column.
    e cn = ’p’     To add a code into a column.
    d cn = ’p’     To delete a code from a column.
    de cn = ’p’    To delete a code from a column.

As in the on-line edit, the s and the equals signs may be omitted. If the correction refers to a field of columns, you may define a string of codes in place of a single code.
Any number of corrections may be specified for a record as long as each correction is separated by a semicolon. The data to be corrected may be a single column or a field, and the corrections may be single-codes or multicodes enclosed in single quotes or strings enclosed in dollar signs. If the data variable is larger than the string it is to contain, the string will be right-justified and padded with blanks. If the string is longer than the data variable, a warning message is issued.

Here is part of a sample corrections file:

    0010; s c112’1’; e c212’3’ ; c314=’34’ ; de c115’3’
    0123 /4; c224’3’ ; c212’4’
    0246 c(316,318)=$123$
    0555; c(140,180)=’ ’
The first record to be corrected is that with serial number 10. Column 112 is to be overwritten with a ‘1’, a ‘3’ is to be added into column 212, column 314 is to be overwritten with the multicode ‘34’ and the ‘3’ in column 115 is to be deleted. The second correction is to the cards in the C array after the fourth read for serial number 123. Both corrections involve overwriting the original data with new codes.
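To make the layout of a correction record concrete, here is a rough Python sketch that splits one line of a corrections file into its serial number, read number and list of corrections. It is an illustration only and says nothing about how Quantum itself reads corrfile. The function name is hypothetical, plain ASCII quotes are used in place of the printed ones, and the sketch assumes that the serial number is always followed by a semicolon and that no semicolon appears inside a string.

```python
COMMANDS = ("s", "e", "d", "de")


def parse_correction_line(line):
    """Split 'serial [/n]; corr1 [; corr2 ...]' into its parts."""
    head, _, rest = line.partition(";")
    serial, _, read = head.partition("/")
    # If /n is omitted the read number is taken to be 1.
    read_number = int(read) if read.strip() else 1

    corrections = []
    for item in rest.split(";"):
        item = item.strip()
        if not item:
            continue
        first, _, remainder = item.partition(" ")
        if first in COMMANDS:
            command, target = first, remainder.strip()
        else:
            # No command word given, so the correction is a set.
            command, target = "s", item
        if command == "d":
            command = "de"      # d and de both mean delete
        corrections.append((command, target))
    return serial.strip(), read_number, corrections


print(parse_correction_line("0010; s c112'1'; e c212'3' ; c314='34' ; de c115'3'"))
print(parse_correction_line("0123 /4; c224'3' ; c212'4'"))
```

The first call yields serial 0010 with read number 1 and four corrections (set, emit, set, delete); the second yields serial 0123 with read number 4 and two set corrections.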
✎ Correcting data with a corrections file is considerably faster than using a forced edit of the form:

    if (c(101,103)=$123$) c109’2’
Corrections in corrfile are made before the statements in the edit section of your program are executed. If you are rerunning your previous job to correct errors and you have not altered the edit in any way, you may save more time by telling Quantum to read the data but not to recompile and load your program. This is done with the option –r on the Quantum command line.
☞ For further information about options for Quantum runs, see chapter 16, ‘Running Quantum under Unix and DOS’.
12.6 Missing values in numeric fields

The term missing values refers to data in numeric fields that is either non-numeric or totally blank. You may find them in data gathered from questions of the type shown below:

    1.  Have you ever rented a video?                          (8)
            Yes, have rented a video ........ 1
            No, have not rented a video ..... 2    (GOTO Q.3)
            Not answered .................... &    (GOTO Q.3)

    2.  How many videos have you rented in the last month?
            ____________________                   (9 - 10)

    3.  ......... continue with questionnaire
If the respondent replies ‘no’ to question 1 or does not answer it at all, question 2 is not asked and columns 9 and 10 are left blank. If the respondent replies ‘yes’ to question 1 then question 2 should be coded either with a numeric value or, perhaps, with && for a don’t know answer. The blank data and && are missing values. You may also find missing values when a numeric field is incorrectly coded with a combination of numbers and letters. This is usually the result of mistyping when the data is entered and can often be corrected by looking at the questionnaire itself and then cleaning the data within the edit section of the run.
Facilities provided by missing values processing

Missing values processing is an optional feature. If you use it, Quantum automatically detects missing values and provides a variety of facilities for dealing with them in both the edit and tabulation sections of your run. In the edit section you have:

• Automatic replacement of missing values with the special value missing_.
• An ismissing function to check whether a variable has the special value.
• Manual assignment of the special value missing_ to variables of your choice within the edit.

Switching missing values processing on and off

You can use missing values processing in the edit section, in the tabulation section, or both. To switch it on in the edit section, type:

    missingincs 1

and to switch it off, type:

    missingincs 0
You may use these statements any number of times in the edit to toggle between using and not using the missing values features.
✎ The missingincs statement is always executed wherever it appears in the edit. This means that although the compiler will accept statements of the form:

    if (....) missingincs 1

Quantum will, in fact, switch on missingincs for the rest of the edit or until a missingincs 0 statement is read. It does not switch on missingincs selectively for only those records that satisfy the expression defined by the if clause.
If a job contains an edit and a tab section and missing values processing is used in the edit, the setting of missingincs carries forward from the edit to the tab section. If the edit uses missing values processing but the tab section does not require it, remember to end the edit with a missingincs 0 statement.
Missing values in arithmetic expressions and assignments

The general rules for non-numeric data variables in arithmetic assignments are as follows:

• Blanks in an otherwise numeric field are ignored, but totally blank fields are read as zero.
• &’s in an otherwise numeric field are ignored, but fields full of &’s are read as zero.
• A – in an otherwise numeric field makes the number negative.
• Multicodes in an otherwise numeric field are ignored, but a field in which all columns are multicoded is read as zero.
If you switch on missing values processing these rules are modified so that any field that is not totally numeric or a combination of numbers and blanks is counted as missing. Missing values are represented by the special value missing_.
Here is a table showing samples of data in a numeric field and the difference missing values processing makes to the way that data is interpreted:

    Data in numeric field    missingincs 0    missingincs 1
    123                      123              123
    13                       13               13
    –10                      –10              –10
    ABC                      zero             missing_
    1AB                      1                missing_
    000                      zero             zero
    &&&                      zero             missing_
    1&1                      11               missing_
    three blanks             zero             missing_
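The rules and the table can be cross-checked with a short Python sketch. This is only a model written to reproduce the table, not Quantum's own logic. The function name read_numeric is hypothetical, an ASCII hyphen stands in for the minus sign, the field shown as 13 is assumed to be a three-column field with a leading blank, and every character other than a digit or a minus is treated as one that is ignored when missing values processing is off.

```python
MISSING = -1048576    # the number the manual says is printed for missing_
DIGITS = "0123456789"


def read_numeric(field, missingincs):
    """Interpret a numeric field with missingincs switched off (0) or on (1)."""
    if missingincs:
        # Only digits and blanks (with an optional leading minus) are numeric.
        body = field.replace(" ", "")
        if body.startswith("-"):
            body = body[1:]
        if body == "" or any(ch not in DIGITS for ch in body):
            return MISSING
    # Otherwise anything that is not a digit is ignored.
    digits = "".join(ch for ch in field if ch in DIGITS)
    if digits == "":
        return 0      # nothing numeric in the field at all
    value = int(digits)
    return -value if "-" in field else value


SAMPLES = ["123", " 13", "-10", "ABC", "1AB", "000", "&&&", "1&1", "   "]
for field in SAMPLES:
    print(repr(field), read_numeric(field, 0), read_numeric(field, 1))
```

Each printed line corresponds to one row of the table, with missing_ appearing as the number used to represent it.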
If you print variables whose values are missing_ in a report file or write them out to a data file, Quantum will show their values as −1,048,576 rather than as the word missing_.

If an arithmetic expression uses a variable whose value is missing, the value of the expression differs depending on whether or not missing values processing is switched on. If missing values processing is switched on the value of the expression is always missing_. If it is switched off, the value of the expression is always zero. For example, if c(1,3) contains the string ABC:

    missingincs 1
    t1 = c(1,3) * 100

sets t1 to missing_, but:

    missingincs 0
    t1 = c(1,3) * 100
sets t1 to zero.
Manual assignment of the missing value to a variable

If you have other values that you want to replace with the missing value in the edit, you may do so by typing a standard assignment of the form:

    variable_name = missing_

The variable may be an integer, real or data variable.
Testing whether a variable has the value missing_

Since missing_ is a special value you cannot use statements of the form:

    if (t4 .eq. missing_) ...

to test whether a variable has the special missing value. Instead, use the function:

    ismissing(variable_name)

For example:

    if (ismissing(t4)) ....
13 Using subroutines in the edit

A subroutine is a collection of statements which perform a specific task. Subroutines may be written in the C programming language or in the Quantum language. Each subroutine must have a unique name by which it can be called up when required.

Subroutines can be used to make your program more readable by eliminating the need to use go tos in certain circumstances. If you use a subroutine with a name describing its purpose it will be immediately apparent what is to be done, and it will mean you don’t have to go skipping backwards and forwards in the program in order to understand what it is doing.
13.1 Calling up subroutines

Quick Reference
To call a subroutine, type:
    call name [(arguments)]

To use any subroutine, enter the call statement at the point at which the routine is required. The call statement simply says:

    call routine[(arguments)]

where routine is the name of the subroutine to be used and arguments are any other items of information required by the routine. These will differ from routine to routine and are clearly explained in the appropriate section below.
13.2 Subroutines in the Quantum library

Quantum has its own library of subroutines which you may call from within your Quantum program.

Using look-up files

Quick Reference
To load data from a look-up file, type:
    call fetch($file_name$, key_start_col,put_start_col)
To load data from a look-up file and generate a report of used and unused keys, type:
    call fetchx($file_name$, key_start_col,put_start_col,keys)
where keys is a number whose value determines whether used or unused keys are listed in the report.
Sometimes you will have additional information available that is not part of each respondent’s data record but that nevertheless needs to be read into the C array for use in the analysis. For instance, suppose we did some additional work on a chocolate purchasing survey and collected information about the cost of various types of chocolate bars. We can transfer this information to the array in two ways. We can either write an edit to check which brand has been bought and then copy the appropriate price into the record using if and an assignment statement, or, in a much simpler operation, we can put the costs into a look-up file and call them up as required with the fetch statement.
Creating a look-up file

A look-up file is one which contains information to be transferred into the data record at a given point. Each item in the file has a unique key associated with it; this is very often the code representing that information in the data. If brands A, B, C and D are represented by the codes 1 through 4 in the data, the costs for those brands must have the keys 1 through 4 as well. Similarly, if a Ford Escort car is coded 274 in the data, the additional information for a Ford Escort would be identified by the key 274 in the look-up file.

Data in the look-up file must be sorted in alphabetical order and must be formatted as follows:

• The first line must contain exactly two whole numbers anywhere on the line. The first is the key length, the second is the total record length including the key.
• All other lines must start with the key which may be followed by any other information as necessary.
The look-up file for our chocolate survey is named costs and is as follows:

    1 4
    1 14
    2 15
    3 21
    4 17
The first line tells us that the key is 1 character long and that the record length is four characters long (the space in column 2 is part of that information). The other lines refer to the individual chocolate bars. Brand A (coded 1) costs 14 pence, Brand B (coded 2) costs 15 pence, Brand C costs 21 pence, Brand D costs 17 pence.
The fetch statement

To transfer data from the look-up file to the record in the C array, enter the fetch statement in your edit at the point at which data is to be copied. Fetch is a C routine and is invoked by typing:

    call fetch($file_name$,key_col,put_col)

where file_name is the name of the look-up file, key_col is the start column of the key in the record, and put_col is the start column of the field into which the data is to be copied.

Data copied from a look-up file does not retain its key at the beginning. If you look at the example in the previous section, the data transferred for records with key 1 will be $ 14$.

Suppose, in our chocolate survey, that the first brand bought is stored in c135, the second in c150 and the third in c165. Brands are coded 1 through 4 as noted above, and costs are to be copied into fields starting in columns 136, 151 and 166 respectively. To deal with all three purchases we will call fetch three times, once per purchase. For the first purchase we would write:

    call fetch($costs$,c135,c136)
When the first record is read, Quantum inspects c135 and compares its contents with the first field of the look-up file. If c135’1’ (brand A was bought) and a matching key is found in costs, the information associated with that key is copied into the C array starting at c136. In our example, brand A chocolate bars cost 14 pence so c(136,138) will contain $ 14$. If a matching key cannot be found in costs, the destination area c(136,138) will be blanked out.

Calls for the second and third purchases would be entered as:

    call fetch($costs$,c150,c151)
    call fetch($costs$,c165,c166)
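The behaviour of fetch described above can be modelled in a few lines of Python. The sketch below is an illustration under stated assumptions and is not the C routine supplied with Quantum: load_lookup and fetch are hypothetical names, the C array is held as a list of single characters with element 0 unused so that c[135] stands for column 135, and the look-up file is passed in as text rather than read from disk.

```python
def load_lookup(text):
    """Read a look-up file: the first line gives key length and record length."""
    lines = text.splitlines()
    key_len, rec_len = (int(n) for n in lines[0].split())
    table = {}
    for line in lines[1:]:
        line = line.ljust(rec_len)[:rec_len]      # pad or trim to record length
        table[line[:key_len]] = line[key_len:]    # the data is stored without its key
    return key_len, rec_len, table


def fetch(lookup, c, key_col, put_col):
    """Copy the data for the key at key_col into the array starting at put_col."""
    key_len, rec_len, table = lookup
    width = rec_len - key_len
    key = "".join(c[key_col:key_col + key_len])
    data = table.get(key, " " * width)            # no match: blank the destination
    c[put_col:put_col + width] = list(data)


COSTS = "1 4\n1 14\n2 15\n3 21\n4 17\n"
lookup = load_lookup(COSTS)

c = [" "] * 1001          # c[0] is unused so that c[n] is column n
c[135] = "1"              # first brand bought is brand A
fetch(lookup, c, 135, 136)
print(repr("".join(c[136:139])))

c[150] = "9"              # a code with no key in the look-up file
c[151:154] = list("xxx")
fetch(lookup, c, 150, 151)
print(repr("".join(c[151:154])))
```

The first call leaves a blank followed by 14 in columns 136 to 138, matching the example in the text; the second shows the destination field being blanked out when no matching key is found.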
When you read additional data in from fetch files, Quantum writes a summary of what it has done to the file out2. The format of the report is as shown here:

    Records    Used    Unused    Calls    Hits    Misses    File
          7       5         2      893     869        24    cost1
          3       3         0      196     196         0    cost2
This tells you that the run used two fetch files. The first file, cost1, contained seven keys; five were present in the data and two were not. The file was called 893 times altogether and 869 times the key in the data was found in the fetch file. The 24 misses refer to keys that were present in the data but not in cost1. The second file was called cost2 and contained three keys, all of which were present in the data. The file was called 196 times and every time the key in the data was found in cost2. Nine digits are allowed for each column, making the maximum count in a column 999,999,999.
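The relationship between the columns of the summary can be spelled out with a small Python sketch. It is an illustration of the arithmetic only, using made-up data far smaller than the run reported above; fetch_summary is a hypothetical name and nothing here reflects how Quantum gathers its counts.

```python
def fetch_summary(file_keys, data_keys):
    """Work out the summary columns for one fetch file.

    file_keys: the keys held in the fetch file.
    data_keys: the key found in the data on each call, in call order.
    """
    in_file = set(file_keys)
    in_data = set(data_keys)
    used = [k for k in file_keys if k in in_data]
    unused = [k for k in file_keys if k not in in_data]
    hits = sum(1 for k in data_keys if k in in_file)
    return {
        "Records": len(file_keys),       # keys in the fetch file
        "Used": len(used),               # keys that occurred in the data
        "Unused": len(unused),           # keys that never occurred in the data
        "Calls": len(data_keys),         # times the file was called
        "Hits": hits,                    # calls where the key was found
        "Misses": len(data_keys) - hits, # calls where it was not
        "unused_keys": unused,
    }


file_keys = ["1", "2", "3", "4", "5", "6", "7"]
data_keys = ["2", "3", "3", "9", "4", "6", "7", "8"]   # eight calls in all
print(fetch_summary(file_keys, data_keys))
```

In every row, Used plus Unused equals Records and Hits plus Misses equals Calls, which is a quick way to check that a report has been read correctly.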
Listing used and unused keys

If you want a list of which keys were used and unused, use fetchx instead of fetch. fetchx has the same syntax as fetch except that it has an extra parameter at the end which tells Quantum which additional information is required. Possible values for this parameter are:

    0   No extra information (same as fetch)
    1   List unused keys
    2   List used keys
    3   List unused and used keys
So, to load data from a fetch file called costs and to see a list of used and unused keys, you would type: call fetchx($costs$,c150,c151,3)
Keys are listed one per line, as follows:

    Records   Used   Unused   Calls   Hits   Misses   File
          7      5        2     893    869       24   cost1
    "1" - key unused
    "5" - key unused
    "2" - key used
    "3" - key used
    "4" - key used
    "6" - key used
    "7" - key used
If you use fetchx more than once, the key listings are printed after the summary line to which they refer. If the listing goes over onto a new page the column headings are repeated at the top of the page.
Converting multicoded data to single-coded data

Quick Reference
To write a multicoded column out as a single-coded field, type:

    call explode(mc_start_col, num_cols, ’codes’, sc_start_col)
When Quantum converts multicoded data into single-coded data, it takes the codes in the multicode and transfers each one to a separate column in the data, thus creating a single-coded field of columns in addition to the original multicode. You may choose which codes should be exploded in this manner, and also the start column of the single-coded field. This conversion is done by the subroutine explode which is formatted as follows:

    call explode(mc_start_col,num_cols,’codes’,sc_start_col)

where mc_start_col is the first multicoded column to be converted, num_cols is the number of sequential columns to be converted, codes are the codes to be written out as single codes, and sc_start_col is the first column in the single-coded field. Codes are exploded in the order 1234567890–&. If the first code specified in codes is present in the multicode, that code will be copied into the first column of the single-coded field. If the code is not present, the column is blank. For instance, if our data is:

    ----+----5
       1
       /
       4

and we write:

    call explode (c144,1,’1/4’,c151)

we will have:

    ----+----5----+
       1      1234
       /
       4
If we write:

    call explode (c132,2,’1/5’,c140)

then:

    ----+----4
     14
     25
     46
     7

becomes

    ----+----4----+----5
     14      12 4    45
     25
     46
     7
The explode statement says ‘explode codes 1 to 5 in the two columns starting at column 132 into a field starting at column 140’. Quantum copies a ‘1’ into c140 because there is a ‘1’ in c132, and a ‘2’ into c141 because there is also a ‘2’ in c132. Column 142 is blank because there is not a ‘3’ in c132, and so on. Notice that the ‘7’ in c132 and the ‘6’ in c133 have been ignored because they are not part of the code specification with explode. If explode is called for any record in the data file, Quantum prints a map in the out2 print file listing the contents of the multicoded columns and the columns into which the codes were transferred. If explode is not called for any record, no map is produced.
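The rule that explode applies to each column can be sketched in C. This is only a model of the behavior described above, not Quantum's implementation: it assumes each multicoded column is represented as a string of the codes it contains, and that the code range has already been expanded (for example ’1/5’ into 12345). The name explode_sketch is made up for the example.

```c
#include <string.h>

/*
 * Sketch of explode. Each multicoded column is modeled as a string holding
 * the codes present in it (e.g. "1247"). `codes` lists the codes to explode,
 * already expanded (e.g. "12345" for '1/5'). For every source column, one
 * output cell is written per requested code: the code itself if it is
 * present in the multicode, otherwise a blank. `out` must have room for
 * num_cols * strlen(codes) characters plus a terminating NUL.
 */
void explode_sketch(const char *const *mc_cols, int num_cols,
                    const char *codes, char *out)
{
    size_t n_codes = strlen(codes);
    size_t pos = 0;
    size_t k;
    int col;

    for (col = 0; col < num_cols; col++) {
        for (k = 0; k < n_codes; k++) {
            /* strchr looks for the code within this column's multicode */
            out[pos++] = strchr(mc_cols[col], codes[k]) ? codes[k] : ' ';
        }
    }
    out[pos] = '\0';
}
```

With the two columns from the second example (codes 1, 2, 4, 7 in the first and 4, 5, 6 in the second) and the codes 1 to 5, the sketch produces the ten characters shown in columns 140 to 149 above; the 7 and the 6 are ignored because they are not in the code specification.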
13.3 Writing your own routines

You may write your own subroutines either in C or in Quantum.

Writing subroutines in C

Quick Reference
To write C subroutines, either type them into a file called private.c in the project directory, or insert them in the Quantum run immediately before or after the edit section as follows:

    #c
    C statements
    #endc

You can also include executable C statements in the edit section itself, as long as you enclose the code within #c and #endc statements.
Subroutines written in the C language must be filed in the file private.c in the current directory so that they will be compiled automatically with the rest of your Quantum program. If you have already compiled your subroutines before doing your Quantum run, the compiled version must be stored in the file private.o in the current directory. Alternatively, you can insert complete C functions immediately before or after the edit section as long as you enclose the code between #c and #endc statements as shown here:

    #c
    /* C code
    #endc

Here are some examples of how to include a function, square, that (despite its name) calculates the square root of a number. The Quantum edit that calls this function may look something like this:

    real square 1f
    ed
    cx(181,190):4 = square(cx(1,3))
    filedef srdata data
    write srdata
    end
When calling C functions, be sure to add the f option (where f stands for function) to the end of the declaration as shown above. If you omit this, Quantum will not recognize the function name and will issue a syntax error.
✎ If the function you are calling does not return a value, or if you do not need to save the return value, you can use call to call the function and you do not need to declare it.
☞ For further information about call, see section 13.1, ‘Calling up subroutines’.

The first example is a C function in the private.c file:

    #include <math.h>

    double square(double dval)
    {
        return (sqrt(dval));
    }
In the second example, the C code for the function has been included directly after the end statement in the Quantum run, and is enclosed by #c and #endc statements.
    real square 1f
    ed
    .
    end
    #c
    #include <math.h>
    double square(double dval)
    {
        return (sqrt(dval));
    }
    #endc
It is also possible to include executable C statements directly in the Quantum edit section. Again, the code must be surrounded by #c and #endc statements. Here is an example that calls the standard C sqrt function directly and assigns the result to the Quantum variable x1.

    ed
    #c
    #include <math.h>
    x1 = sqrt(2.0);
    #endc
    cx(181,190):4 = x1
    filedef srdata data
    write srdata
    end
In addition, any standard C library function, such as sqrt, can be declared and used directly in Quantum. So the above example can also be written as:

    real sqrt 1f
    ed
    x1=2
    cx(181,190):4 = sqrt(x1)
    filedef srdata data
    write srdata
    end
☞ For more details, see ‘Calling functions from C libraries’, later in this chapter.
Writing subroutines in Quantum

Quick Reference
To define a subroutine written in the Quantum language, type:

    return
    subroutine name [(var1 [,var2, ...]) ]
    Quantum statements
    return

at the end of the edit, just before the end statement.
Subroutines written in Quantum must be placed at the end of the edit section, before the end statement, and preceded by a return, thus:

    /* main edit program here
    return
    /* subroutines here
    end

Each subroutine starts with a subroutine statement and ends with a return. The format of the subroutine statement is:

    subroutine name[(var1, var2, ... ) ]

where name is the name of the subroutine. If you define more than one subroutine, their names must be unique within the first six characters of the name so, for example, sqroot and sqrt are acceptable whereas sqroot and sqroot1 are not. var1, var2, and so on, are variables which the subroutine will use. These variables are generally referred to as the arguments of the subroutine.
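The six-character rule amounts to comparing only the leading part of each name. The following C sketch shows that comparison; the function name names_clash is invented for the illustration, and it assumes both names are already in the same case.

```c
#include <string.h>

/* Number of leading characters that must differ, as stated in the text. */
#define SIGNIFICANT_CHARS 6

/* Returns 1 if two subroutine names clash, that is, if they are identical
 * within their first six characters; returns 0 otherwise. */
int names_clash(const char *a, const char *b)
{
    return strncmp(a, b, SIGNIFICANT_CHARS) == 0;
}
```

Under this check sqroot and sqrt do not clash, because they differ at the fourth character, whereas sqroot and sqroot1 do, because they only differ at the seventh.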
Passing information between the edit and a subroutine When you use subroutines, Quantum differentiates between variables defined in the variables file or before the ed statement and those defined after the ed or subroutine statements.
☞ For more information on the variables file, see chapter 14, ‘Creating new variables’ in this volume, and chapter 1, ‘Files used by Quantum’ in the Quantum User’s Guide Volume 4.
Variables defined in the variables file or before ed are called external variables and may be accessed and changed by statements within a subroutine. Variables defined after ed or inside a subroutine are local variables and cannot be changed by a subroutine. For example:

    real cost 1
    int items 1
    ed
    int nshop 1
    /* edit statements
    return
    /* subroutines
    end
The variables cost and items are defined before the ed statement. This means they are external variables and can have their values changed by a subroutine. The variable nshop is defined after ed so it is a local variable. This means it cannot have its value changed by the subroutine, even though its value can be passed to the subroutine for use by it.

Information stored in external variables is always available within a subroutine, and may be accessed and changed regardless of whether you pass it as an argument to the subroutine. For example, if we define an integer variable called items in the variables file, we can read its contents and change them in the subroutine even if we do not include items as part of the call statement. We might write:

    call sub1
    return
    subroutine sub1
    if (items.gt.5) emit c134’1’
    return
    end
This checks, inside the subroutine, whether the value of items is greater than 5 and, if so, inserts a ‘1’ in column 134. We do not pass the value of items to the subroutine because it is an external variable which is available to the subroutine as a matter of course. Because items is an external variable we could change its value in the subroutine if we wished. For instance, we could reset it to zero.
Local variables which are required in the subroutine must be passed to the routine as arguments. If the items variable was defined after the ed statement we would have to name it on the call statement and on the subroutine statement thus:

    ed
    int items 1
    call sub1(items)
    return
    subroutine sub1(items)
    if (items.gt.5) emit c134’1’
    return
    end
This example performs the same task as the previous one. The difference is that this time items is a local variable, so we must pass it to the subroutine. Once inside the subroutine, we cannot change the value of items in any way. In neither example is it necessary to pass c134 as an argument as all cells in the C array are external variables.

When you use a subroutine which requires arguments, be sure that you call it with as many arguments as are listed on the subroutine statement for that subroutine. If you give too many or too few arguments, errors will occur. For example:

    call conv(gallons,liters)
    .
    subroutine conv(gallons,liters)

is correct because we call the subroutine with the same number of arguments as there are in its definition, but:

    call conv(aa,bb,cc)
    .
    subroutine conv(aa,bb,cc,dd)
is incorrect because we are calling conv with one argument fewer than its definition specifies.
When you return to the edit from a subroutine, any changes made to external variables will still exist, but values assigned to local variables defined in the subroutine will not be accessible from the main edit program. For example:

    call sub1
    return
    subroutine sub1
    int doneit 1
    if (items.gt.5) emit c134’1’
    items = 0
    doneit = 1
    return
    end
Once the subroutine has been executed and control has returned to the edit, the value of items will be zero but doneit will have no value at all.
Arguments

Generally, subroutines only need arguments when you are passing the values of local edit variables to the subroutine. All arguments on the call statement must have a corresponding argument of the same type on the subroutine statement. This is because Quantum does not compare the names of the arguments on the call and subroutine lines. It simply passes the value of the first argument given with call to the first argument named with subroutine and so on. For instance, if gallons and liters are local edit variables and we want to use their values in the subroutine calc, we might write:

    int gallons 1s
    real liters 1s
    ed
    call calc(gallons,liters)
    .
    subroutine calc(input,output)
    int input
    real output
Here, the value of gallons is passed to input while the value of liters is passed to output. Input and output are variables used solely within the subroutine so they are defined in the subroutine.
Calling a subroutine more than once

As we have said, external variables can always be changed by a subroutine whether or not they are passed as arguments. If the subroutine is called once only, you would call it without any arguments and then refer to the variables to be changed by name inside the routine. For example:

    if (numb(c119).gt.2) call pchk
    .
    subroutine pchk
    r sp c120’1/9&’
    if (c131n’1’) emit c141’1’
    .
    return

However, if you have a subroutine that is called more than once with different external variables, you would represent them with local variables in the subroutine. For instance:

    if (numb(c119).le.2) call pchk(c120,total)
    if (numb(c119).gt.2) call pchk(c220,tot2)
    .
    subroutine pchk(n1,n2)
    data n1
    data n2
    .
    return
Here, n1 represents c120 or c220 and n2 represents total or tot2. n1 and n2 are local to the subroutine so they are defined after the subroutine statement.
Defining variables in a subroutine

All local variables named on the subroutine statement must be defined in that subroutine. Real or integer variables passed to the subroutine must be defined as such in the routine. For example:

    subroutine conv(gallons,liters,price)
    /* number of gallons bought
    int gallons
    /* equivalent in liters
    real liters
    /* price per gallon
    real price
Single data variables (columns in the C array or user-defined data variables with one cell only) are passed to a subroutine by naming the variable on a data statement as shown here:

    subroutine chk(flav,prefb)
    /* flavors bought
    data flav
    /* brand preferred
    data prefb

Fields of data variables are passed as integers with the definition:

    subroutine ctyp(car)
    /* make of car owned
    int car
Any multicodes present in this field are ignored. If you have a multicoded field and you want to be able to access the codes in each multicode, you must treat the field as a series of single data variables and pass each one separately, using a data statement, rather than passing the field as a whole.

When variables are passed with call they are written in exactly the same way as you would write them anywhere else in your edit. For example:

    call sub1(c15,gallons,cost,c(20,28))

passes the address of the data variable c15, and the integer values of the variables gallons and cost and the field c(20,28).

Here is a chart summarizing how to define variables for subroutines:

    Main definition   Call argument   Subroutine argument   Subroutine definition
    int item 1        item            purch                 int purch
    int shop 5s       shop3           shop                  int shop
    real cost         cost            cost                  real cost
    data c 1000s      c(10,11)        week                  int week
    data c 100s       c15             pref                  data pref
    data tried        tried           tried                 data tried
Notice that in the main definitions the size of the variable is defined, whereas in the subroutine definition no size is required since all values are passed as integer values or, in the case of a single data variable, as an address.
An example of a Quantum subroutine

We have conducted a survey to test the market for a new TV station which would be available via the satellite network. When it comes to asking how likely respondents would be to take this new channel, people who already subscribe to the satellite network are asked slightly different questions from those who do not. However, the possible responses to each set of questions are identical. One way of checking these answers is to write a subroutine and call it up using variables to define the columns to be checked. For example:

    ed
    /* c(21,23) is for those already subscribing
    /* c(24,26) is for those who don’t subscribe
    if (c17’1’) call subchk(21,22,23); else; call subchk(24,25,26)
    /* rest of edit
    return
    subroutine subchk(high,low,dep)
    /* high – willingness to take at $20
    /* low – willingness to take at $10
    /* dep – willingness to pay advance deposit
    int high
    int low
    int dep
    r sp ’1/59’ c(high), c(low), c(dep)
    return
    end
As our comments show, the fields to be checked are c(21,23) for those already subscribing to the satellite network and c(24,26) for non-subscribers. Both calls to the subroutine subchk name the columns in the field individually. This is because we want to look at the codes present in each column. We have not defined the data variables at the start of the edit because they are read automatically from Quantum’s variables file. This means that they are external variables and can have their values accessed by the subroutine. The subroutine statement uses local variables with names describing the contents of the variables they represent. The variable high represents c21 and c24 which tell us how likely the respondent would be to take the new station if it cost $20 a month. Similarly the variable low represents c22 and c25 and dep represents c23 and c26. All local variables are defined in the subroutine as the name of the variable they represent. The require statement simply checks whether each column is single-coded in the range ‘1/59’.
If you glance back at the example, you’ll notice that although we’re talking about columns in the data, we’ve actually treated them as integers. The call to the subroutine simply gives the column numbers without a preceding ‘c’. The subroutine itself defines its arguments as integers and then uses them as pointers into the C array. There are two reasons for this:

• First, it allows Quantum to report the column numbers correctly if it finds records which fail the require statement. Passing columns to a subroutine as data variables causes Quantum always to refer to column 0 in the output from require regardless of the true column number which is in error.

• Second, it enables you, if you wish, to set new codes into the columns used in the subroutine. Normally, any changes made to the C array inside a subroutine are forgotten when control passes back to the main program. Referring to the columns as pointers into the C array, as in this example, causes any changes to the C array to be remembered when the subroutine finishes.
13.4 Calling functions from C libraries

✎ The notes in this section are for guidance only. SPSS does not own the source code for functions in the C libraries and therefore cannot support them. If you have any problems, consult your C compiler reference guide.
The C runtime and maths libraries contain a number of general-purpose functions, some of which may be useful in Quantum programs. For example, if you want to square a number or calculate a square root, you will almost certainly find functions that do this in one of the C libraries.

Before you use a C function in Quantum, read the documentation on that function to find out what parameters it needs, and of what type. Having done this, you then need to provide this information in a format Quantum understands. In order to explain how you do this, we’ll use the pow function which raises a value to a given power. The Unix documentation for pow( ) states that the function expects two arguments, both of which are double precision real variables. This means that your Quantum program will need to hold the value and the power (exponent) in x variables:

    x1 = 5
    x2 = 2
    x3 = pow(x1, x2)
Even if one of the arguments is a constant, as both are in this example, you must assign the values to variables as Quantum will not accept real constants within the function’s parentheses.
pow( ) returns a value which you want to use in your Quantum program. In order to do this, you must define the function in the variables section of your run (that is, in the variables file or at the top of your program, before the ed statement). The function’s type must be set to the type of data the function returns. pow( ) returns a double precision value so we define it as: real pow 1f
The f at the end of the declaration means that pow is a function. Here is the complete example:

    real pow 1f
    ed
    x1 = cx(11,14)
    x2 = 2.0
    x3 = pow(x1, x2)
    end
If cx(11,14) contains the value 12.5, the value of x3 will be 156.25.

The table below lists the various C return types and shows how to define them in Quantum:

    C return type     Quantum return type
    char              int
    short             int
    int               int
    long              int
    unsigned char     int
    unsigned short    int
    unsigned int      int
    unsigned long     int
    float             real
    double            real
When looking things up in this table, bear in mind the following points:

• Quantum uses long integers, so all integer variable types except ‘unsigned long’ can be accommodated.

• Quantum does not support unsigned values, but this is only a problem with ‘unsigned long’ variables.

• Quantum real variables are double precision.
If you are not interested in the value the function returns, or the function does not return a value at all, you can treat it as a subroutine and run it using call, as you would for the standard Quantum functions. For example: call printf($Print this text$)
displays the words ‘Print this text’ on your screen. Whether you call C library functions as subroutines or functions, you need to specify the arguments correctly in Quantum so that they are converted to the appropriate C variable types. In general, the safest option is to store any real or integer arguments in Quantum real or integer variables, as in the pow( ) example, and then call the function with those variables as the arguments. This is particularly important when dealing with Quantum data variables. You can pass text strings as they are, as you saw for printf, but you cannot pass text held in data variables.
✎ Quantum stores all names in lower case. So if you want to reference an external function whose name includes upper case characters, you need to define a function in private.c using a name in lower case, to call the external function.
☞ For more information about private.c, see section 1.12, ‘C subroutine code file’ in The Quantum User’s Guide Volume 4.
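A lower-case wrapper of the kind described in the note above might look like the following sketch. Both function names are invented for the illustration: ScaleValue stands in for whatever mixed-case external function you need to reach, and is given a trivial body here only so that the example is complete.

```c
/* Stand-in for an external library function whose name contains upper-case
 * letters. The name and behavior are invented for this sketch. */
double ScaleValue(double value)
{
    return value * 2.0;
}

/* Lower-case wrapper of the kind that would go in private.c, so that a
 * Quantum program can refer to the function by an all-lower-case name. */
double scalevalue(double value)
{
    return ScaleValue(value);
}
```

Following the conventions described earlier in this section, the wrapper would then presumably be declared in the Quantum program as real scalevalue 1f, since it returns a double.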
14 Creating new variables

In chapter 4, ‘Basic elements’, we said that Quantum automatically provides you with an array of 1,000 data variables in which to store data, 200 integer variables for storing whole numbers and 100 real variables for storing real numbers. We also said that you may create your own data, integer and real variables with names representing the type of information they contain. In this chapter we will discuss how to increase the number of variables that Quantum provides and how to create your own named variables.
14.1 Naming variables

All variables in a program must have a unique name, which can be up to 253 characters long. The name must not contain spaces and must start with a letter. You can use only the following characters in a name: A through Z _ 1234567890. You may choose any name you like, but you are advised to use names which have some relevance to the type of data they contain, for instance total_income for a variable which contains a respondent’s total income. Also, remember that Quantum is case insensitive and therefore does not distinguish between uppercase and lowercase letters. For example, COUNTRIES_Visited is the same as Countries_Visited.

Although variable names can include digits, if you do include a digit and you are using the ‘s’ option, you still have to refer to the individual columns using parentheses. For example, if you create a data variable by writing:

    data safe 15s

you can refer to column 12 of this variable by writing:

    safe12

However, if you create a data variable whose name ends with a number by writing:

    data safe1 15s

Quantum does not recognize safe112 as column 12 of the data. So you have to write:

    safe1(12)

which defeats the purpose of using the ‘s’ flag.
So, to avoid unexpected conflicts during a Quantum run, it is probably simpler to name your variables using only the letters A through Z and the underscore character.
14.2 Defining variables

Quick Reference
To define a data variable, type:

    data name size[s]

To define an integer variable, type:

    int name size[s]

To define a real variable, type:

    real name size[s]

Type s after the variable’s size if you want to be able to omit the parentheses from references to single cells in the variable.
Before Quantum will recognize named variables in your program, you must say what type of information the variable is to contain and how many cells it should have. If you wish to increase the size of the C array, you must indicate how many cells you require. There are three places where you can declare named variables:

• In the variables file. Variables declared here are available in the edit and tab sections of your program and also in subroutines, and may be changed by the edit or by a subroutine.

• At the start of your program before the ed statement. Variables declared here are available in the edit and tab sections of your program and also in subroutines, and may be changed by the edit or by a subroutine.

• In the edit after the ed statement. Variables declared here are available in the edit section only and may only be changed there. They are unknown to the tab section and to subroutines.
All variable definitions are made up of three items, separated by spaces:

• The variable type:

    data    for data variables
    int     for integer variables
    real    for real variables

• The variable name: C, T or X to increase the number of data, integer or real variables available; any name for a new variable.

• The variable size. This is generally the number of cells the variable is to have.
Here are some examples:

    data c 1500

increases the size of the C array to 1500 cells. This provides space for records with up to 14 cards per respondent.

    int number_of_trips 5

creates an integer variable called number_of_trips which can store up to five whole numbers.

    real price 10

creates a real variable with room to store ten real numbers.
✎ Increasing the C array with a data, int or real statement does not cause Quantum to clear the extra cells between records. However, when you increase the C array by using the max= option on the struct statement, Quantum automatically clears the entire array between records.
☞ For further information on max=, see ‘Highest card type number’ in chapter 6, ‘How Quantum reads data’.
When we first talked about variables we said that the individual cells of an array may be referenced by following the name of the array by the cell number enclosed in parentheses. Therefore:

    meals(3)    means the third cell of the variable meals
    c(100)      is the 100th cell of the C array
We also mentioned that you may omit the parentheses when you are referring to a single cell in the C array so that c100 means the same as c(100). To make this possible you must follow the variable size with the letter ‘s’. This is particularly important when you are increasing the size of the C array as, without it, any references to, say, c15 will cause errors. For instance, if we write:

    data c 1200s

we are increasing the size of the C array to 1200 cells, enough for 11 cards per record. Because the array size is followed by ‘s’ we can write c1056 when we mean c(1056): Quantum will substitute the parentheses automatically.
The dimension of the C array will be taken automatically from the value of max= on the struct statement if this is greater than the dimension requested in the variables file or at the start of your program file. For example, if you have:

    int c 1300s

in your variables file or at the start of your program file, and:

    struct;max=15;ser=c(1,4);crd=c(79,80); ....

in your program, the C array will be increased to 1600 cells to accommodate card type 15.

Do not confuse a declaration of the form:

    int brand 1s

with a similar one which omits the ‘s’:

    int brand 1

The former creates the variable ‘brand’ as an array, and you can refer to it in your program as brand1. The latter creates a single named variable that must be referred to as brand.
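The relationship between card numbers and C array sizes used earlier in this section (1200 cells for 11 cards, 1500 cells for 14 cards, 1600 cells for card type 15) can be written out as simple arithmetic. The C sketch below assumes 100 columns per card, with card n occupying cells n*100+1 through n*100+100; that layout is inferred from the figures quoted above, and the function names are invented for the illustration.

```c
/* Cells needed so that card type `card` fits. Card n is assumed to occupy
 * cells n*100+1 through n*100+100, which matches the figures in the text. */
int cells_for_card(int card)
{
    return (card + 1) * 100;
}

/* Highest card type that fits in an array of `cells` cells. */
int highest_card(int cells)
{
    return cells / 100 - 1;
}

/* Dimension actually used: the larger of the requested size and the size
 * implied by max= on the struct statement. */
int c_array_size(int requested_cells, int max_card)
{
    int needed = cells_for_card(max_card);
    return needed > requested_cells ? needed : requested_cells;
}
```

For a requested size of 1300 cells and max=15, c_array_size gives 1600, in line with the example above; with a smaller max= value the requested 1300 cells would stand.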
14.3 The default variables file

If you are not increasing the number of data, integer or real variables or creating new variables, there is no need to set up a variables file. Quantum will read the default values from its own variables file, as follows:

    data c 1000s
    colreal cx c
    real x 100s
    int t 200s
This gives you the 1000 data variables, 100 real variables and 200 integer variables mentioned in chapter 4, ‘Basic elements’. The second statement (colreal cx c) informs Quantum that variables referred to as cx are, in fact, data variables whose contents are to be treated as real numbers.
14.4 Naming variables in your program

An alternative method of naming variables is to define them as part of your Quantum program. Variables which you want to use within your program and which you want to be able to change in a subroutine must be defined before the ed statement. These are called external variables. Variables which are to be used during the edit, and whose values may be passed to a subroutine but not changed by it, may be defined after the ed statement. These are termed local variables. Here is an example:

    /* pints and liters are external variables because they are
    /* defined outside the edit. Their values may be changed by
    /* the subroutine conv.
    int pints 1
    real liters 1
    ed
    /* cost is a local variable which may only be changed in the edit,
    /* not by a subroutine. Its value may be passed to the subroutine
    /* for use in calculations
    int cost 1
    .
    /* count pints bought per week
    .
    /* convert pints to liters
    call conv
    /* calculate cost @ 21 pence per liter
    cost = liters * 21
    return
    subroutine conv
    /* statements comprising subroutine conv go here
    return
    end
☞ For further information about external and local variables, see ‘Passing information between the edit and a subroutine’ in chapter 13, ‘Using subroutines in the edit’.
15 Data-mapped variables

Data-mapped variables can be used to store the answers to questions, both numerical and categorical. When storing numerical information, a data-mapped variable can be treated in the same way as other numerical variables. Categorical values are generally stored and retrieved as text strings, that is, the response texts of a question.

As the name suggests, data-mapped variables are typically used in conjunction with one or more data-mapping files and allow Quantum specs to be written without needing column and code information. Instead, the Quantum spec can be written so that it automatically retrieves the information it needs from the data-mapping files used. Using this technique, you can specify conditions in your Quantum run by referring to the response texts that appear in your questionnaire, rather than having to specify the columns and codes that are involved. An example of this could be:

    n01Blue;     c=colors $Blue$
    n01Green;    c=colors $Green$
    n01Red;      c=colors $Red$
where colors is one of your mapvar variables. While it is possible to use data-mapped variables on their own, they do not offer much beyond what is already available. The real power of these variables comes when you start using data-mapping files.
15.1 Advantages of data-mapping files

A data-mapping file contains information about what is contained in a data file and where specific information is located within it. The Quancept data acquisition package produces such a file, the project qdi (questionnaire data information) file. This file contains details of all the variables defined, their possible values (that is, the possible responses), and where the information for particular variables is located in the data records. There are many advantages to using data-mapping files, for example:

• Refer to data fields by name. Instead of having to specify the columns which refer to the data, you can simply use the names they were given in the data-mapping file. You do not need to write specifications to transfer this information; it happens automatically in the same way as Quantum automatically sets entries in the C array according to the data read in. This means that Quantum will set the values of data-mapped variables for variables whose names appear in the mapping file that is being used.
•
Refer to responses by name. The data-mapping file contains the column and code values for each of the responses in a categorical question. Because of this, there is no need to write them in your Quantum specifications; you can just refer to the response text itself. At first, this may seem a bit cumbersome. For example, it may seem easier to write:

c=c233’1’

as opposed to:

c=opinions$I liked the first brand much more than the second$

However, you do not need to specify the whole response text, just enough to uniquely identify it. (In addition, it is very likely that the specifications will have been automatically generated rather than hand written.) The above example could therefore be written as:

c=opinions$I liked\$

You type in the characters which uniquely identify the text and then append the \ character to ignore the remaining characters in the string. This is described in more detail later.
•
Recoding is not required if the data layout changes. When referring to data fields by name, and response codes by the response text, the data locations are derived entirely from the data-mapping files. The advantage of this is that if the data layout changes, all you need to do is use the data-mapping file for the new data set. Furthermore, Quantum lets you analyze many data sets with many different mapping files — all in the same run. You do not need to write complex recoding specifications as this is handled automatically, which in turn, means less chance of error.
•
Automatically generate specifications. Data-mapping files can contain a complete description of one or more data files. This can include field names, response texts and their locations. In fact, everything that would be required to generate the main body of a Quantum run can be held in data-mapping files, and the Quantum specifications can be generated automatically. There will usually be some reason why specifications generated in this way need to be adjusted manually, but this can be kept to a minimum, freeing you from the routine tasks (where mistakes are likely) and allowing you to concentrate on the more complex requirements.
☞ For more details about generating a Quantum specification automatically, see section 15.10, ‘Automatically generating a Quantum spec’.
All this does not mean that you must have data-mapping files to use data-mapped variables. However, without a data-mapping file, you would have to manually load values into the data-mapped variables, which removes many of these advantages.
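The idea that conditions survive a change of data layout can be illustrated with a small Python model. This is only a sketch: the dictionaries below are a hypothetical stand-in for a data-mapping file (they are not the qdi format), the column numbers and codes are invented, and only single-coded columns are modeled.

```python
# Hypothetical in-memory stand-ins for two data-mapping files. For each
# variable they give the column holding the answer and the punch code
# that represents each response text. None of this is the real qdi format.
LAYOUT_A = {"colors": {"column": 10, "codes": {"1": "Blue", "2": "Green", "3": "Red"}}}
LAYOUT_B = {"colors": {"column": 25, "codes": {"7": "Red", "8": "Blue", "9": "Green"}}}


def load_mapped(record, layout):
    """Return {variable name: set of response texts} for one data record."""
    values = {}
    for name, item in layout.items():
        code = record[item["column"] - 1]      # columns are counted from 1
        text = item["codes"].get(code)
        values[name] = {text} if text else set()
    return values


def is_blue(values):
    """A layout-independent condition, in the spirit of c=colors $Blue$."""
    return "Blue" in values["colors"]


record_a = " " * 9 + "1"     # 'Blue' is code 1 in column 10 under layout A
record_b = " " * 24 + "8"    # 'Blue' is code 8 in column 25 under layout B
```

The same `is_blue` test works for both records because the layout details live in the mapping, not in the condition.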
15.2 Contents of a data-mapping file

The data-mapping file describes the content and layout of one or more data files. It contains the following information:
•
The names of the fields in the data file.
•
For numerical fields, the location of the fields in the data file (that is, the card and column specifications).
•
For categorical fields, the response text and location (that is, the card, column, and codes) and, possibly, the unique ID for each category.
•
Additional information that is not used by Quantum such as the limits of a numerical range.
✎ Currently, only the qdi mapping file format is supported.
15.3 Defining data-mapped variables

You can define data-mapped variables in two ways:
•
As a normal variable definition, using the mapvar variable type. The syntax is:

mapvar variable_name [size]

For example:

mapvar my_variable 1

Where data-mapped variables are defined using mapvar, the following should be noted:

— If you define an array of mapvar variables (that is, by specifying a size greater than 1), the actual size of the array is determined by its use and not by the size specified. For example, if you have the question ‘Which colors did you paint the walls of each room’, you could specify an array of rooms as:

mapvar rooms 2
and then refer to rooms(1)$Red$, rooms(2)$Blue$ ... rooms(3)$Green$ and so on. Note, however, that defining a size of 1 will always define a single variable.

— The s (special) and f (function) options have no meaning and so are not valid for mapvar variables.
•
Using the *usemap statement to introduce a map file. The syntax for this statement is:

*usemap mapfile_name

For example:

*usemap project.qdi
If you use a *usemap statement to introduce a map file, a variable is automatically defined for each item in the file.
✎ Although you may use the same name for a mapvar variable and a *usemap file, you cannot use the same name for a data-mapped variable and any other type of variable. For example, you could assign the name preferences to both a data-mapped variable and a map file, but you could not then use this name for a data, integer or real variable.
15.4 Using data-mapped variables

As mentioned earlier, data-mapped variables can hold both numerical and categorical data. If a data-mapped variable is storing numerical data, it can be used in exactly the same way as any other numerical variable. For example, you can:
•
Assign a value to the variable, for example: my_mapvar=t1+23
•
Assign the value of the variable to another variable, for example: x1=my_mapvar
•
Use the value as part of an arithmetic expression, for example: t1=t1+my_mapvar-4
•
Test the value using logical operators (that is, .eq., .ne., .lt., .le., .ge. and .gt.), for example: r(my_mapvar .ne. 0)$Zero value given$
•
Use the value as an increment, for example: n01Total bought;c=my_mapvar.gt.0;inc=my_mapvar
•
Analyze the value using the val statement, for example: val my_mapvar;=;base=Total;1;2;3;i;4-5;6+
•
Analyze the value using the var statement (described later), for example: var my_mapvar;=;base=Total;1;2;3;i;4-5;6+

Data-mapped variables storing categorical data are used in much the same way as data variables that store categorical data. In this case, you can:
•
Assign the value to another data-mapped variable, for example: my_mapvar1=my_mapvar2
•
Test for the presence of a response, for example: n01Drank Pepsi;c=my_mapvar$Pepsi$
•
Test for an exact response, for example: if(my_mapvar=$Coke$) print $Drank only Coke$
•
Analyze the value using the var statement (described later), for example: var my_mapvar;base=Total;Coke;Pepsi;Other=$_other$;DK=rej
In addition to normal response code names, packages such as Quancept allow certain special responses in the data. In order to check or set these, Quantum recognizes the following special response texts, which cover exclusive responses, non-exclusive responses, response groups and miscellaneous items:

$_null$        Checks for the null response
$_dk$          Checks for the don’t know response
$_ref$         Checks for the refused response
$_other$       Checks for the specified other response
$_base$        Checks for any response or no response (that is, any situation)
$_possible$    Checks if a response was possible
$_answered$    Checks for any response present
$_normal$      Checks for any normal response (that is, not $_other$)
$_precode$     Checks for any precoded response (that is, not $_other$, $_null$, $_dk$ or $_ref$)
$_special$     Checks for any special response (that is, $_other$, $_null$, $_dk$ or $_ref$)
$_na$          Checks for no present responses
$_uniqid$      Represents the response whose unique ID text matches that specified
When using data-mapped variable arrays, you can refer to an array element just as you would with any other variable array, that is, by specifying the element using the numerical index. However, if the data-mapping file contains names for the array elements, then you can use those names to reference specific array elements. (A qdi file generated by Quancept creates arrays for variables that are iterated, and names the elements after the iterations.) For example, if you had the variable array wrate in the Quantum run that stores the rating given to the widget suppliers Wilsons Wonderful Widgets and Just Widgets, you could refer to the rating for each supplier as:

wrate(1)
wrate(2)

or:

wrate($wilsons wonderful widgets$)
wrate($just widgets$)

Also, if the array elements have unique IDs associated with them, then these too may be used to refer to the elements. As with other uses of unique IDs, the ID text is converted to a response text format by placing it within parentheses and prepending the underscore character. Therefore, using the same example as above, you could write this as:

wrate($_(wilsons)$)
wrate($_(justwids)$)
If you are not using a mapping file, or the data-mapped variable is not represented in the mapping file, then each array element will be created when it is first used.
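The three ways of referring to an array element can be pictured with the following Python sketch. It is an illustration only, not how Quantum resolves references: the function name is hypothetical, and the case-insensitive comparison of element names is an assumption based on the lowercase spelling used in the examples above.

```python
def element_index(ref, names, ids=None):
    """Resolve an array element reference to a 1-based index.

    ref   -- an integer index, an element name, or a unique ID written '_(id)'
    names -- element names in array order
    ids   -- optional unique ID texts in array order
    Raises ValueError if a name or ID is not found.
    """
    if isinstance(ref, int):
        return ref
    key = ref.strip().lower()
    if key.startswith("_(") and key.endswith(")") and ids:
        wanted = key[2:-1]                      # text between '_(' and ')'
        return [i.lower() for i in ids].index(wanted) + 1
    return [n.lower() for n in names].index(key) + 1


names = ["Wilsons Wonderful Widgets", "Just Widgets"]
ids = ["wilsons", "justwids"]
```

With these lists, `element_index(2, names)`, `element_index("just widgets", names)` and `element_index("_(justwids)", names, ids)` all select the second element.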
15.5 Assigning values to data-mapped variables

One of the most powerful features of data-mapped variables is that you don’t have to assign values to them. This is done automatically: as records are read, these variables are initialized according to the data. All you have to do is introduce the data-mapping file before reading the data. The data-mapping file contains item definitions for the fields in each data record. Where an item name matches the name of a data-mapped variable, the variable is initialized as each data record is read. (Variables whose names do not correspond to any of the items in the map file are cleared.)

So, if you just want to analyze your data, simply introduce the map file at the beginning of your Quantum spec to define the variables, write your analysis specs using those variables, then introduce the map file again immediately before reading your data. Here is how it works:

In your run file:          In your data file:
*usemap map_filename       *usemap map_filename
analysis specs             *include data_filename
Of course, if you have a second data file with a different mapping scheme, you would have:

In your run file:          In your data file:
*usemap map_filename1      *usemap map_filename1
analysis specs             *include data_filename1
                           *usemap map_filename2
                           *include data_filename2
You can see above that your Quantum run specifications do not change at all. Also note that you only need to introduce one of the maps in your Quantum specifications. This is because you are just using the map file to define your variables. If the same items exist in both files, you do not need to define them twice.

There are various reasons why you may wish to assign values to data-mapped variables explicitly, and Quantum lets you do so. Below is a summary of how you can achieve this:
•
Assign a numerical value. You can assign the value of any numerical expression directly to a data-mapped variable as follows:

variable_name = arithmetic expression
For example:

q23 = t1 + 7

If the variable is either clear or already holds a numerical value, then the result of the arithmetic expression is stored as a numerical value. If, however, the variable is already set to one or more categorical responses, then Quantum attempts to set the categorical response that corresponds to the result of the arithmetic expression. For example, if the result of the expression is 5, then Quantum sets the 5th categorical response and all other categorical responses are cleared. Consider the categorical question:

Q.23 Which of these newspapers do you read regularly?
1. Times
2. Telegraph
3. Independent
4. Mail
5. Other
Y. Don’t know

You may then have a data-mapped variable called Q23. Typically, you would expect the variable to be tested for the exclusive response Mail as:

if (q23=$Mail$) ...

However, you could refer to it by its numeric value (that is, 4):

if (q23.eq.4) ...

So, if you wanted to explicitly set Q23 to be $Mail$, you can do it in one of two ways:

q23 = $Mail$
q23 = 4

However, since the variable is associated with a list of responses, the following would give a data error:

q23 = 7

This is because there is not a seventh response in the list. If, however, the variable were a true numeric type, this would be fine.
•
Assign a response. You can assign a specific response either by using the method for assigning a numerical value, or by using the following syntax:

variable_name = $response_text$

For example:

q1 = $Once a week$

If unique ID texts are defined for responses in the data-mapping file, you can assign a response using its unique ID text. The syntax for this is:

variable_name = $_(unique_ID_text)$

For example:

q1 = $_(Once a week)$

All previous responses assigned to the variable are cleared.
•
Copy the values from another data-mapped variable. You can assign the value(s) of one data-mapped variable to another data-mapped variable simply by specifying:

target_variable = source_variable

For example:

aware_copy = q23

If either the source variable or the target variable holds a numerical value, then the numerical value of the source variable is copied to the target variable. If this is not the case, then the categorical responses are copied. Categorical responses are transferred by matching the text. This means that the target variable may not contain the same positional value for a response text as the source. For example, if the variable drank_most_recently contained the response texts:

$Coke$
$Pepsi$

then here the response text $Coke$ would be referenced as response number 1.
If, however, the variable drank_at_all contained the response texts:

$Pepsi$
$Coke$

then here, the response text $Coke$ would be referenced as response number 2. If a respondent had $Coke$ as the answer to drank_most_recently, then the statement:

drank_at_all = drank_most_recently

would result in both variables having the value $Coke$. This would, however, be response number 1 in drank_most_recently, but response number 2 in drank_at_all.
•
Collecting the logical OR of the responses from several data-mapped variables. If two or more variables contain categorical responses, then you can use the OR function to assign the combination of all of the responses as follows:

target = OR(source1, source2 [, source3, ...])

For example:

all_tried = OR(tried_first, tried_second)

Responses are transferred using the response text as described above. Bear in mind that the following special responses are exclusive and can only appear in the absence of all other responses:

$_ref$
$_dk$
$_null$
$_na$

If an assignment results in a combination of any of these exclusive codes and one or more other responses, Quantum removes the exclusive special responses from the target variable. If an assignment results in more than one exclusive special response and no other responses, Quantum removes all but one of the exclusive special responses using a defined order of precedence. The order of precedence is $_ref$, $_dk$, $_null$, $_na$. So, if an assignment results in $_null$ and $_dk$, Quantum removes the $_null$ response and leaves the $_dk$.
•
Collecting the logical AND of the responses from several data-mapped variables. In a similar way to the OR function, you can use the AND function to collect only responses that appear on every one of the specified list of variables. You can assign the result to a variable as follows:

target = AND(source1, source2 [, source3, ...])
For example:

tried_both_times = AND(tried_first, tried_second)

As with the OR function, exclusive special responses are unset if they are not valid.
•
Collecting the logical XOR (exclusive OR) of the responses from several variables. Again, similar to the OR function, you can use XOR to collect responses that appear on only one variable of a specified list of variables. This means that if a response is not mentioned on any of the variables, it is not collected. In addition, if the same response is mentioned on two or more of the variables, that response will also not be collected. You can assign the result to a variable as follows:

target = XOR(source1, source2 [, source3, ...])

For example:

tried_once_only = XOR(tried_first, tried_second)
As with the OR function, exclusive special responses are unset if they are not valid.
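The behavior of the three functions, including the treatment of exclusive special responses, can be modeled in a few lines of Python. This is a minimal sketch under stated assumptions, not Quantum's implementation: each variable is represented as a plain set of response texts (written without the $ delimiters), and the rule for exclusive responses is applied to the result of every function.

```python
from collections import Counter

# Exclusive special responses, in the order of precedence given in the text.
EXCLUSIVE = ["_ref", "_dk", "_null", "_na"]


def resolve_exclusive(responses):
    """Apply the exclusive-response rules to a combined set of responses."""
    responses = set(responses)
    others = responses - set(EXCLUSIVE)
    if others:
        return others                      # exclusives cannot coexist with others
    special = [r for r in EXCLUSIVE if r in responses]
    return set(special[:1])                # keep only the highest-precedence one


def OR(*sources):
    """Responses present on at least one of the variables."""
    return resolve_exclusive(set().union(*sources))


def AND(*sources):
    """Responses present on every one of the variables."""
    return resolve_exclusive(set.intersection(*map(set, sources)))


def XOR(*sources):
    """Responses present on exactly one of the variables."""
    counts = Counter(r for s in sources for r in set(s))
    return resolve_exclusive({r for r, n in counts.items() if n == 1})
```

For instance, in this model `OR({"_null"}, {"_dk"})` yields `{"_dk"}`, matching the precedence example above, and `OR({"_dk"}, {"Coke"})` yields `{"Coke"}`.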
15.6 Testing the value of a data-mapped variable

Although data-mapped variables can hold either numerical or categorical data, they can, in fact, have four possible states:
•
Storing a numerical value
•
Storing a single categorical response
•
Storing several categorical responses
•
Unset (that is, they have no value)
Test the numeric value

Regardless of the kind of data being stored, you can always test the numerical value of a data-mapped variable by using the standard logical operators: .eq., .ne., .lt., .le., .ge., and .gt. Quantum determines the numerical value of a data-mapped variable as follows:
•
If the data-mapped variable contains a numerical value, then the value of the variable is tested.
•
If the data-mapped variable contains a single categorical response, then the value of the variable is the response number (that is, the first response is counted as 1).
•
If the data-mapped variable contains several categorical responses, the value of the variable is zero.
•
Finally, if the data-mapped variable is unset, then the value of the variable is zero.
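The four rules above can be summarized in a short Python sketch. The representation is an assumption made for illustration: a numerical value is a Python number, categorical data is a list of the 1-based response numbers that are set, and an unset variable is None.

```python
def numeric_value(value):
    """Numerical value of a modeled data-mapped variable.

    value -- a number, a list of 1-based response numbers that are set,
             or None for an unset variable
    """
    if value is None:
        return 0                 # unset
    if isinstance(value, (int, float)):
        return value             # numerical value is used as it is
    if len(value) == 1:
        return value[0]          # single response: its response number
    return 0                     # several responses
```

So a variable holding only the fourth response compares equal to 4, while a variable holding the first and third responses compares equal to 0.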
Test the categoric response

You can test any data-mapped variable to see if it holds a specific categorical response text in the following ways:
•
To test if a data-mapped variable has only one categorical response, and it is the one specified, you can use the = operator.
•
To test if a data-mapped variable has one or more categorical responses stored, including the one specified, you can use the & operator.
Testing for categorical response texts is achieved by specifying the variable name, the test operator and the response text. The syntax is very similar to the way the presence of response punch codes is tested when using standard data variables. However, it is not possible to test for the presence of several response texts in a single test. The following examples show how you might check for responses using standard data variables with punch codes, and using data-mapped variables with response texts:

Standard data variables    Mapped data variables
c123’1’                    q23$Yes$
c146n’7’                   q46n$I liked the taste$
c109=’7’                   q9=$Portable CD Player$
c127’34’                   q27 $Yesterday$.or.q27$Today$
In the same way as using standard variables, you can omit the test operator, in which case & is assumed. You can also combine or negate tests using the logical operators .or., .and. and .not., and adjust the order of evaluation using parentheses.

In addition to using just the response texts associated with a given variable, you can also:
•
Use one of the special response texts described earlier (that is, one of $_base$, $_normal$, $_dk$, $_ref$, $_other$, $_na$, $_null$, $_precode$, $_special$, $_answered$ and $_possible$).
•
Use the unique ID associated with a response using the syntax $_(unique_ID)$.
•
If you have specified the unique ID on an element (using the uniqid= keyword), then you may use the special response text $_uniqid$ as a shorthand for $_(unique_ID)$.
•
You may specify only as much of the response text as is needed to uniquely identify it. When doing so, you must append a \ character to the text. For example, $Very impressed$ may become $Very\$ and $Quite impressed$ may become $Quite\$ (or, in fact, just $V\$ and $Q\$ if these strings are unique).
•
You may also specify response texts (or the unique ID) using uppercase characters, lowercase characters, or any combination of the two.
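The abbreviation and case rules can be modeled as follows. This Python sketch is illustrative only: the function name is hypothetical, response texts are written without the $ delimiters, and raising an error for an ambiguous or unknown text is an assumption about how such a case might be handled.

```python
def find_response(pattern, responses):
    """Return the response text selected by pattern.

    A pattern ending in a backslash matches by prefix; otherwise the whole
    text must match. Comparison ignores case. Raises ValueError unless
    exactly one response matches.
    """
    wanted = pattern.lower()
    if wanted.endswith("\\"):
        prefix = wanted[:-1]
        hits = [r for r in responses if r.lower().startswith(prefix)]
    else:
        hits = [r for r in responses if r.lower() == wanted]
    if len(hits) != 1:
        raise ValueError(f"{pattern!r} matches {len(hits)} responses")
    return hits[0]


responses = ["Very impressed", "Quite impressed"]
```

Here `find_response("V\\", responses)` selects "Very impressed", as does `find_response("VERY IMPRESSED", responses)`.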
15.7 Using data-mapped variables in analysis specifications

The main use for data-mapped variables in analysis specifications is to define conditions for tables and table elements. You can do this in the following ways:
•
You can define conditions using the c= keyword. As already discussed, you can test the values of data-mapped variables as part of logical expressions. This means that you can use the standard c= keyword on tab and l statements to create analysis conditions. For example, you might create the following axis:

l q23
ttlQ.23 Which of the following have you bought in the last week?
n10Total
n03
n01Fridge; c=q23 $Fridge$
n01Freezer; c=q23 $Freezer$
n01Microwave oven; c=q23 $Microwave\$
n01Coffee maker; c=q23 $Filter coffee maker$
n01Toaster; c=q23 $Toaster$
n01None of these; c=q23 $None\$
n01Don’t know/Not answered; c=-
•
Write specifications using the var and val statements. Just as col and val statements make it a lot easier to write simple specifications using standard data variables, the var statement provides the same shortcuts for data-mapped variables. For example, you could write the above axis as follows:

l q23
ttlQ.23 Which of the following have you bought in the last week?
var q23;base=Total;hd;Fridge;Freezer;Microwave oven;
+Coffee maker=$filter coffee\$;Toaster;
+None of these;Don’t know/Not answered=rej
By default, the var statement uses the element text as the response text to create the condition required. If this is not correct (as with the Coffee maker element), you can specify the required response text using the = operator.

When analyzing data-mapped variables that contain numerical values, you can use either the var or val statements. For example, the following two statements are equivalent:

val q17;=;base=Total;hd=Number Bought;hd=-------------;
+1;2;3;4;5;6;7;8;9;10 or more;Don’t know/Not answered=rej

var q17;=;base=Total;hd=Number Bought;hd=-------------;
+1;2;3;4;5;6;7;8;9;10 or more;Don’t know/Not answered=rej
Whether analyzing numerical or categorical data, var has an added advantage over its col and val equivalents in that you can combine several variables on one statement. This ability extends the power of the var statement so that it becomes an equivalent to the fld statement too. To combine two or more variables, simply place a comma-separated list of variables where you would normally specify the single variable. For example, if the variable q25 held information about the first appliance purchased and the variable q26 held information about the second, you could use the var statement to combine them as follows:

l q25_26
var q25,q26;base=Total;hd=Appliances purchased;hd=--------------;
+Fridge;Freezer;Microwave oven;
+Coffee maker=$filter coffee\$;Toaster;
+None of these;Don’t know/Not answered=rej
Often, lists of items such as these are specified in the data using a code number. Therefore, if a numerical code is given to each type of appliance instead of the actual name, and the codes are assigned to q25 and q26, the above example can be written as:

l q25_26
var q25,q26;base=Total;hd=Appliances purchased;hd=-------------;
+Fridge=134;Freezer=135;Microwave oven=102;
+Coffee maker=117;Toaster=203;
+None of these=0;Don’t know/Not answered=rej
15.8 Using parameter substitution with data-mapped variables

You can substitute variable names, response texts or array element names by using the text substitution keywords available on *include and *def statements in exactly the same way as you use them for any other purpose in Quantum. For grid axes, however, the following two new keywords are available to facilitate the substitution of variable names and response texts:
•
You can use the var(#) keyword to substitute variable names. If you wish to use a different data-mapped variable for each column in a grid axis, you can use the var(#)= keyword. In the grid column specification, this keyword is used to specify the name of the variable to be substituted. In the grid side specifications, use var# in place of the variable name (where # can be any number in the range 1 to 9). For example:

l gridax
n01First purchase; var(1)=q25
n01Second purchase; var(1)=q26
side
var var1;base=Total;hd=Appliances purchased;hd=----------------;
+Fridge;Freezer;Microwave oven;
+Coffee maker=$filter coffee\$;Toaster;
+None of these;Don’t know/Not answered=rej
n01Fridge or Freezer;c= var1$Fridge$ .or. var1$Freezer$

If you are using a data-mapped variable array, then you must follow the var(#)= keyword with a specific array element. For example:

l rates
n01Rating for Wilsons Wonderful Widgets;var(1)=wrate($wilsons\$)
n01Rating for Widgets R Us;var(1)=wrate($widgets\$)
side
var var1;=;1;2;3;4;5;Don’t know/No Answer=rej
•
You can use the resp(#) keyword to substitute response texts. Where a grid axis requires different response texts to be used for each column in the grid, then the resp(#)= keyword must be used to specify the response texts on the column specifications. The value you assign must be the full specification, including the quoting dollar characters. In the side specification, use the special response text $resp(#)$.
Below is an example of response text substitution:

l rates
n01Fridge;resp(1)=$fridge$
n01Freezer;resp(1)=$freezer$
n01Microwave oven;resp(1)=$microwave\$
n01Coffee maker;resp(1)=$filter coffee\$
n01Toaster;resp(1)=$toaster$
side
n01First purchase;c=q25$resp(1)$
n01Second purchase;c=q26$resp(1)$
n01Any purchase;c=or(q25,q26)$resp(1)$
✎ The # parameter for resp(#) may only have a value of 1.
15.9 Additional features using data-mapped variables

The following features provide you with further flexibility:
•
Using the AND, OR and XOR functions directly. Earlier, it was mentioned that you could use the AND, OR and XOR functions to combine the values of several categorical data-mapped variables. As well as assigning the result to another variable, Quantum also allows the value to be tested directly. Using the & and = operators, you can test for response texts in the result of these functions. For example:

if ( or(q25,q26) $Coffee\$ ) ....
n01Purchased fridge on both occasions;c=and(q25,q26)$fridge$
r ( .not. xor(q17,q18) $_other$ ) $Other mentioned only once$

This is a useful feature. However, if two or more tests are to be performed on the result of the same operation, it may be better to assign the result to a new variable and test that. This saves you repeating the AND, OR or XOR operation many times.
•
Counting responses with the numb function. Sometimes, you need to know the number of responses to a categorical question (for example, to calculate the average number of mentions). You can use the numb function to count the number of precoded categories that are set in one or more data-mapped variables. Similarly to AND, OR and XOR, the numb function is given a comma-separated list of data-mapped variables to act upon. The function returns a count of the number of precoded responses set on all of the given variables. This can be used in the usual way.
For example:

t1 = numb(q25, q26)
n01Average number of purchases;inc=numb(q23)
flt ;c=numb(q23).gt.1
numb counts only precoded responses. This means that it will include all user-defined responses and also the $_other$ response; it does not include the $_dk$, $_ref$, $_na$ or $_null$ responses.
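The counting rule can be sketched in Python as follows. This is a model under assumptions rather than the real function: each variable is a set of response texts written without the $ delimiters, and the count for several variables is taken to be the sum of the per-variable counts (the text does not say whether a response set on two variables is counted once or twice).

```python
# Special responses that the text says are not counted.
NOT_COUNTED = {"_dk", "_ref", "_na", "_null"}


def numb(*variables):
    """Count the responses set on the given variables, skipping the
    special responses listed in NOT_COUNTED."""
    return sum(len(set(v) - NOT_COUNTED) for v in variables)
```

In this model, `numb({"Fridge", "Toaster"}, {"Freezer"})` gives 3 and `numb({"_dk"})` gives 0, while the `_other` response is counted like any user-defined response.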
15.10 Automatically generating a Quantum spec

Quick Reference
To create a data-mapped Quantum spec from an existing qdi file, at the command line, type:

qdiaxes [–a][–t n] input_qdi_file output_filename

Using an existing Quancept qdi file, you can automatically generate a data-mapped Quantum specification. The qdi file contains details of all the variables defined, their possible values (that is, the possible responses), and where in the data records the information for particular variables is located. The generated Quantum run file includes the statement that introduces the qdi file, which is then referred to for information during the Quantum run. You may need to adjust the generated specifications manually but, in general, automatic creation can save a great deal of time (especially for new spec writers) and reduces the likelihood of errors.

The main body of the Quantum specification is generated using the qdiaxes program. This program reads the qdi file and creates the following files:
•
A run file containing a struct statement, a *usemap statement for the specified qdi file, *include statements for the ‘tab’ and ‘axes’, and a dummy breakdown axis.
•
A table specification file containing a tab statement line for each data item in the qdi file.
•
Axis specifications — either standard axes statements or grid axis statements.
✎ The Quancept utility, qditum, can also generate a basic Quantum specification from a qdi file. However, the specification that it creates does not use the data-mapping feature.
☞ For information about qditum, see the Quancept Utilities Manual.
To create a Quantum spec file from an existing Quancept qdi file, type:

qdiaxes [–a][–t n] input_qdi_file output_filename

where:

–a
This option causes qdiaxes to remove all text strings of the form < ... > from question texts. This is useful for Quancept Web projects where the text may contain embedded HTML directives. Note that text-formatting codes resulting from a Quancept CAPI script are always removed since they are meaningless to Quantum.

–t n
This parameter is optional. It can be used to specify the minimum length of a response text in the generated axes file. n can be set to any length provided it is enough to uniquely identify the response text in question (in the qdi file). If this option is omitted, the text is truncated, by default, to 12 characters.

input_qdi_file
The name of the qdi file, with or without the qdi suffix.

output_filename
The base name for the output files. qdiaxes appends the relevant suffix to each Quantum output file.
For example, the statement:

qdiaxes holidays.qdi holidays

reads the qdi input file holidays.qdi and generates the corresponding Quantum files (that is, holidays.run, holidays.tab and holidays.axs). Since the –t parameter is not specified, the response texts (written to the holidays.axs file) are truncated to 12 characters.
Quancept CAPI and Quancept Web text-formatting options

Quancept CAPI: Because Quancept CAPI runs in a graphical environment, more text formatting control is provided on the interviewing screen than is possible in character-based Quancept. These text formatting options are defined in the script by a formatting code enclosed in angle brackets. For example, to switch bold on and off, the following codes are used: text

These formatting codes serve no purpose in the generated Quantum texts and so are always removed by the qdiaxes program.

Quancept Web: In addition to the formatting control provided with Quancept CAPI, the Quancept Web product allows the scriptwriter to embed HTML directives into texts. Such directives are usually enclosed in angle brackets and can optionally be removed from the qdiaxes output by using the –a option.
☞ For details on all the text-formatting options available with Quancept CAPI and Quancept Web, see the documentation relating to these products.
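The effect of removing strings of the form < ... > from a question text can be pictured with a short Python sketch. This is not the qdiaxes code; the function name and the sample markup are hypothetical, and the pattern assumes that angle-bracket strings are not nested.

```python
import re


def strip_angle_text(text):
    """Remove every substring of the form <...> from a question text."""
    return re.sub(r"<[^<>]*>", "", text)
```

For example, `strip_angle_text("How <b>satisfied</b> are you?")` returns "How satisfied are you?".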
Reducing the response texts

To reduce the size of specifications, Quantum allows you to truncate the response text specification to the least number of characters that will uniquely identify the response in question. qdiaxes uses this feature in generated specifications but allows you to define the minimum number of characters. To show how this works, consider the following question:

    Q1 Which of these statements best describes your opinion of the product?
    1. I liked the product very much
    2. I thought the product was satisfactory
    3. I have no real feelings about the product
    4. The product was not really to my taste
    5. Nothing at all about the product pleases me
If the –t option was set to a number greater than the longest response text, say 50, then the conditions generated would read:

    c=Q1 & $I liked the product very much$
    c=Q1 & $I thought the product was satisfactory$
    c=Q1 & $I have no real feelings about the product$
    c=Q1 & $The product was not really to my taste$
    c=Q1 & $Nothing at all about the product pleases me$
If, however, the text was truncated to the minimum number of unique characters, in this case 3, then the conditions would read:

    c=Q1 & $I l\$
    c=Q1 & $I t\$
    c=Q1 & $I h\$
    c=Q1 & $The\$
    c=Q1 & $Not\$
✎ The \ character informs Quantum to ignore the remaining characters in the string.
But, by applying the default truncation length of 12, they would then read:

    c=Q1 & $I liked the \$
    c=Q1 & $I thought th\$
    c=Q1 & $I have no re\$
    c=Q1 & $The product \$
    c=Q1 & $Nothing at a\$
You may prefer the texts to be shorter or longer. Either way, the –t option on the command line can accommodate your preference.
✎ Note that response texts are never reduced below a minimum threshold; that is, either the limit set by the –t option, or a default of 12.
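The truncation rule described in this section can be sketched in a few lines of Python. This is only an illustration, not the qdiaxes implementation: the function names are hypothetical, and the sketch assumes that the length used is the larger of the –t value and the shortest length at which every response text is distinct, and that texts no longer than that length are written out in full.

```python
def min_unique_prefix_length(texts):
    """Return the shortest prefix length at which every text is distinct."""
    longest = max(len(t) for t in texts)
    for length in range(1, longest + 1):
        if len({t[:length] for t in texts}) == len(texts):
            return length
    raise ValueError("response texts are not unique")


def truncated_conditions(item, texts, t=12):
    """Build c= conditions (hypothetical helper, assumed truncation rule)."""
    # Assumption: never go below the -t value or the minimum unique length.
    length = max(t, min_unique_prefix_length(texts))
    conditions = []
    for text in texts:
        if len(text) > length:
            # The trailing backslash tells Quantum to ignore the rest of the text.
            conditions.append(f"c={item} & ${text[:length]}\\$")
        else:
            conditions.append(f"c={item} & ${text}$")
    return conditions


responses = [
    "I liked the product very much",
    "I thought the product was satisfactory",
    "I have no real feelings about the product",
    "The product was not really to my taste",
    "Nothing at all about the product pleases me",
]
for line in truncated_conditions("Q1", responses):
    print(line)
```

Calling truncated_conditions with t=3, t=12 or t=50 on the five responses to Q1 gives the three sets of conditions shown in this section.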
Files produced by qdiaxes

qdiaxes generates three types of files: the run file, the tab file and the axs file. Each file is given the same base name and qdiaxes adds the appropriate suffix to each. The following subsections look at each of these files in turn.
The run file

The generated Quantum filename.run file contains the following seven lines:

    struct;read=2;ser=c(1,5);crd=c(6,7)
    *usemap input_qdi_file
    *include output_filename.tab
    *include output_filename.axs
    l XXXXXX
    n10All Respondents
The *usemap statement instructs Quantum to refer to the qdi file for all the relevant variable information. The two *include statements tell Quantum to read in the contents of the generated tab and axis files. The last two statements form the dummy breakdown by which all axes are tabbed. This enables top lines to be produced immediately. These statements act as a template for your specification; you can add to, delete and amend the statements accordingly.
The table specifications file

The filename.tab file contains a tab statement for each data item in the qdi file being processed. There are two types of tab statements generated. Apart from grid axes produced for data items with more than one iteration, all axes are tabbed by the dummy breakdown ‘XXXXXX’, that is:

    tab item_name XXXXXX

These can easily be stream-edited to use any standard analysis breakdown which you may wish to define. Data items with multiple iterations produce grid axes which are tabbed as usual, that is:

    tab item_name GRID

where item_name is the name of a data item in the qdi file and XXXXXX or GRID is a dummy name against which to tabulate the first axis. You will, no doubt, need to make changes to these statements in the axes file.
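As an illustration of the stream edit mentioned above, the following Python sketch replaces the dummy breakdown in the generated tab statements. The function name and the breakdown name bk01 are hypothetical; bk01 stands for an axis that you would define yourself. Grid statements are left unchanged.

```python
import re


def replace_dummy_breakdown(tab_text, breakdown):
    """Swap the XXXXXX placeholder in tab statements for a real breakdown name."""
    pattern = re.compile(r"^(tab\s+\S+\s+)XXXXXX\s*$")
    edited = []
    for line in tab_text.splitlines():
        # Only statements of the form 'tab item_name XXXXXX' are changed;
        # 'tab item_name GRID' and any other lines pass through untouched.
        edited.append(pattern.sub(lambda m: m.group(1) + breakdown, line))
    return "\n".join(edited)


generated = "tab q1 XXXXXX\ntab q2 GRID\ntab q3 XXXXXX"
print(replace_dummy_breakdown(generated, "bk01"))
```

The same substitution could equally be made with any stream editor available on your system.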
The axis specifications file

The axes file, filename.axs, holds the Quantum axis names and specifications. Axes are generated in the following format:

• All axes have the following in common:
  - Each axis is named using the same name as the data item in the qdi file.
  - A left-justified table title (ttl) is generated containing the name of the axis.
  - The required number of ttl statements for each line of the item text (up to 80 characters per line). Longer texts are wrapped onto multiple ttls.
  - For each axis, the following three element statements are generated:

        n01All Respondents
        n00 ;c=.not.(item_name=$_na$)
        n01All Answering

• Grid axes also contain some additional statements:
  - For each iteration, a grid column specification of the form:

        n01iteration_text;var(1)=item_name($iteration_name$)

  - Following this specification, a side statement is generated. (side is used to separate the column definitions from the row definitions.)

• The main elements then follow:
  - Numeric response items generate the following four lines:

        Comment - ** Insert suitable bands **
        Comment var item_name;hd=
        n25 ;inc=item_name
        n12Mean

  - Categoric items will produce an n01 element for each category as follows:

        n01response_text;c=item_name & $truncated_text$

    Any category that has a unique ID associated with it will also have the appropriate Quantum uniqid= keyword generated. For example:

        n01response_text;c=item_name & $truncated_text$;uniqid=unique_id
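To make the element templates above concrete, here is a minimal Python sketch that fills them in for a categoric item. It is not how qdiaxes itself is written; the function name and the sample item are hypothetical, and the truncated text is assumed to be supplied already, including any trailing backslash.

```python
def categoric_axis_elements(item_name, categories):
    """Fill in the element templates for one categoric item.

    categories is a list of (response_text, truncated_text, unique_id)
    tuples; unique_id is None when the category has no unique ID.
    """
    # The three elements that every generated axis has in common.
    elements = [
        "n01All Respondents",
        f"n00 ;c=.not.({item_name}=$_na$)",
        "n01All Answering",
    ]
    for response_text, truncated_text, unique_id in categories:
        element = f"n01{response_text};c={item_name} & ${truncated_text}$"
        if unique_id is not None:
            element += f";uniqid={unique_id}"
        elements.append(element)
    return elements


sample = categoric_axis_elements(
    "Q1",
    [("Yes", "Yes", None), ("No opinion at all", "No opinion a\\", "id7")],
)
for line in sample:
    print(line)
```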
16 Running Quantum under Unix and DOS

A complete Quantum run consists of seven stages:

1. Compile your program and convert it into the C programming language.
2. Convert the C code into a C program.
3. Read and process the data using the program created at step 2.
4. Weight the data (optional).
5. Accumulate cell values for tables.
6. Manipulate the data (optional).
7. Write out the tables.

You can either run all stages automatically one after the other, or you can run a specific stage in isolation.
16.1 Which version to use

Your computer may have more than one version of Quantum available, for example, a standard version for client tables and a newer version for in-house testing. To indicate which version you wish to use, you must assign the pathname of that version to an environment variable called QTHOME and then add the Quantum bin directory to your path. On Unix systems, you define QTHOME in your login file with a setenv statement. For example:

    setenv QTHOME /usr/qtime/qt/v5.7

You then define your path as:

    set path = ($path $QTHOME/bin)

Under DOS, the version to use is set at the time the software is installed. For information on how to switch between versions, see your installation instructions.

Each release of Quantum comes with two command files:

quantum
This version silently deletes all temporary files created during a run unless you include the option –k on the command line.

quantumx
This version does not delete temporary files.
In the examples of commands in the rest of this section, we will use the word quantum to mean quantum or quantumx. At installations where automatic deletion of temporary files is not desirable, you may find that the administrator has renamed the files so that quantumx is called quantum, and vice versa. You should check this before you run your first job.
☞ For further details on file deletion, see section 3.1, ‘Tidying up after a Quantum run’, in the Quantum User’s Guide Volume 4.
16.2 The Quantum command

Quick Reference
To run a complete Quantum job, type:

    quantum [options] [program_file] [data_file] [tables_file]

To run a Quantum job, type:

    quantum [options] [program_file] [data_file] [tables_file]

If you omit the program and/or data file names, Quantum will prompt you for them as it needs them. If you omit the name of the tables output file, Quantum will save any tables in a file called tab_.

Options on the command line allow you to:

• Run only one section of the job such as the compilation stage or the table creation stage.
• Define a run ID when you want to do more than one run in the directory.
• Define the names of directories in which Quantum should look for program and data files or create intermediate files.
• Convert the Quantum program and data files into a Quanvert database.

They are:

–c
Compile the program file into C code.
–d
Keep intermediate files required for an edit-only datapass.
–id
Define the run ID.
–k
Keep all intermediate files.
–l
Load the C code (DOS). Save a log of the run (Unix).
–lo or –ld
Load the C code (Unix).
–o
Compile and output.
–p
Keep intermediate files needed by pstab, tabcon and quantum –o.
–pd
Name the directory in which Quantum should look for program and data files.
–r
Read the data.
–t
Return the duration time for various sections of the run (Unix).
–td
Name the directory in which Quantum should create its intermediate files.
–v
Create a Quanvert database.
We will look at many of these options in more detail below.
✎ The option to create a Quanvert database is only available if the Quanvert Database Administration software is installed.
☞ For further information about creating a Quanvert database, see chapter 7, ‘Creating and maintaining Quanvert databases’ in the Quantum User’s Guide Volume 4.
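If you drive Quantum from a script, the command line described in this section can be assembled programmatically. The Python sketch below is a hypothetical helper, not part of Quantum. It assumes that options precede the file names, as in the examples in this chapter, and that a later file name cannot be given without the earlier ones. It only builds the argument list; it does not run anything.

```python
def quantum_command(program=None, data=None, tables=None, *,
                    run_id=None, temp_dir=None, perm_dir=None,
                    keep=False, command="quantum"):
    """Build an argument list for the quantum command (illustrative only)."""
    args = [command]
    if keep:
        args.append("-k")            # keep all intermediate files
    if run_id:
        args += ["-id", run_id]      # define the run ID
    if temp_dir:
        args += ["-td", temp_dir]    # directory for intermediate files
    if perm_dir:
        args += ["-pd", perm_dir]    # directory for program and data files
    # File names are positional: program, then data, then tables.
    for name in (program, data, tables):
        if name is None:
            break  # assumption: Quantum prompts for whatever is missing
        args.append(name)
    return args


print(" ".join(quantum_command("run1", "data", run_id="abc")))
```

Passing command="quantumx" builds the equivalent command for the version that does not delete temporary files.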
Compressed and non-standard data files

Quantum can deal with compressed data files whose file names end with a .Z suffix. If you have a data file of this type, there is no need to specify the suffix on the command line. Quantum always checks first for a file with the exact name you typed on the command line. If it cannot find this file, it makes a second search for that file with a .Z suffix.

Quantum can also cope with files that start with records you wish to ignore, or in which records are not terminated by a new-line character. We refer to these globally as non-standard data files. If you have a file of this type, create a dummy data file and enter the name of that file on the quantum command line.
☞ For further information about dummy data files see ‘Reading non-standard data files’ in chapter 10, ‘Include and substitution’, in the Quantum User’s Guide Volume 2.
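The search order for compressed data files described above (exact name first, then the same name with a .Z suffix) can be sketched as follows. This is a hypothetical Python helper that mirrors the rule for use in your own scripts; it is not Quantum's own code and it does not decompress anything.

```python
import os


def resolve_data_file(name, directory="."):
    """Return the path Quantum would be expected to pick, or None if neither exists."""
    exact = os.path.join(directory, name)
    if os.path.isfile(exact):
        return exact              # the exact name always wins
    compressed = exact + ".Z"
    if os.path.isfile(compressed):
        return compressed         # second search: same name with .Z
    return None
```

A script could use a check like this before starting a long run, to confirm that the data file named on the command line will actually be found.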
16.3 Compiling your program file

Quick Reference
To compile a Quantum program, type:

    quantum -c [program_file]

The first step in any Quantum run is to check the syntax of your Quantum specification and to convert it into C code. We call this compilation. You can run the compilation stage by itself by typing the quantum command with the –c option:

    quantum -c [program_file]

The compilation creates many files, the most important of which are:

out1
The program file listing made as the program is checked. If errors are found, Quantum marks them in this file.
colmap
A listing of all the columns and codes referred to by all non-ignored axes.
☞ For further information on the contents of these files, see chapter 2, ‘Files created by Quantum’, in the Quantum User’s Guide Volume 4.
16.4 Loading the C code

Quick Reference
To load the C code created by a compilation under Unix, type one of:

    quantum -lo data_file
    quantum -ld data_file

If you are working with DOS, type:

    quantum -l data_file

After a successful compilation, Quantum converts the C code created by the Quantum compile into a program and, if there are no problems, reads the data. We call this program the datapass program. You can run this stage as a separate task on Unix systems by typing:

    quantum -lo data_file

or:

    quantum -ld data_file

If you are working on DOS, type:

    quantum -l data_file

This stage also creates a number of files, most of which are normally deleted at the end of the run. The file you need to know about is:

qtm_ex_ (qtm_ex_.exe under DOS)
The datapass program.
16.5 Reading the data

Quick Reference
To read the data file after a previous compile and load, type:

    quantum -r data_file

The datapass program reads and processes data according to the definitions in your Quantum program file. Normally, this happens as an automatic extension of the load phase, but if you have corrected errors in the data or added more data to the data file, you may rerun the datapass without recompiling and reloading your program file. To do this, type:

    quantum -r data_file

The datapass reads and processes each record separately. If you requested that data should be separated into clean and dirty data files, or that it should be written out to another file, Quantum will do so during this stage. Any holecounts or frequency distributions are also created now. Finally, Quantum sets flags indicating the cells and tables in which each record is to be included.

Files created during this phase are:

clean.q
Clean data file
dirty.q
Dirty data file
hct_
Holecount output
lst_
Frequency distribution (list) output
out2
Listing of records failing write and require statements
punchout.q
Records written out by require
sum_
Sorted summary of datapass errors
16.6 Weighting, accumulation and manipulation

Quick Reference
The weight, accumulation and manipulation programs cannot be run separately.

The weighting program, weight, weights records according to the figures given in your Quantum program file. If the run has no weighting, the weighting program is ignored.

The accumulation program, accum, builds a file containing the cell values for each table.

If your job uses row or table manipulation, Quantum runs a program called manip. This carries out your manipulation requests and creates a second file of cell values. Note that this file contains values for all tables whether or not they are the result of manipulation.

You cannot run the weighting, accumulation or manipulation stages in any way except as part of a complete Quantum run. Quantum creates the following files, amongst others, during these stages:

weightrp
Summary report of weighting
nums
Unmanipulated cell values
nums.man
Final cell values for tables
16.7 Creating tables

Quick Reference
To create tables, type:

    quantum -o [program_file]

The final step in most runs is to take the cell values and use them to create tables. Quantum reads the page and table headings and positions them as requested. If tables are to be sorted, added or placed side by side, the relevant figures are rearranged or combined.

To change the table layout without changing the cell counts (for example, to print more decimal places for percentages, or to use special characters for absolute zero or rounding) you may rerun just the compilation and output stages using the command:

    quantum -o [program_file]
Files created during this phase which you should know about are:

out3
Cumulative output summary
tab_
Tables
If you want to rerun a single table only, you may run the Quantum output program by name rather than via the Quantum shell script. Type:

    qout -o tab_file -t table_num

where tab_file is the name of the file to which the table will be written and table_num is the number of the table you wish to reprint. For example, to rerun table 10 and save it in the file tab_10 you would type:

    qout -o tab_10 -t 10
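When several individual tables need reprinting, the qout command above can be generated once per table. The following Python sketch is a hypothetical helper that only builds the argument lists; the tab_N naming of the output files simply follows the tab_10 example and is not a Quantum requirement.

```python
def qout_commands(table_numbers, prefix="tab_"):
    """Build one qout argument list per table number (illustrative only)."""
    return [
        ["qout", "-o", f"{prefix}{number}", "-t", str(number)]
        for number in table_numbers
    ]


for args in qout_commands([10, 12]):
    print(" ".join(args))
```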
16.8 Log files and running in the background

Quick Reference
To create a log file under Unix, type:

    quantum -l [options] [file_names]

To run the job in the background, append & to the end of the command line.

The notes in this section do not apply to DOS Quantum since these facilities are not available on that platform.

Quantum normally runs interactively. With large jobs, this can lock up your terminal for a considerable time, so you may wish to use facilities provided with your operating system to run your jobs in the background. This then frees up your terminal for other uses.

When you run jobs in the background, they still write messages to your screen unless you redirect them to a log file. Quantum provides for this on the systems which need it with the –l option. This writes any messages which would normally appear on your screen into a file called log instead. You use it on the quantum command in addition to any other options required for the job. For example, to run a complete job in the background under Unix, you might type:

    quantum -l run1 data &
✎ On some systems your system manager may prefer you to run large jobs via the batch system.
16.9 Running more than one job in a directory

Quick Reference
To run more than one job in a directory, assign a unique suffix to each run by typing:

    quantum -id suffix [options] [file_names]

You may run more than one job in a directory without overwriting existing files by assigning a unique suffix to each run. All files created during this run will have names which end with a dot and the given string. For example:

    quantum -id abc run1 data

will create the files out1.abc, out2.abc, tab_.abc, and so on. File names that already contain a dot will not have a suffix appended. If your run creates clean and dirty data files, these will retain their original names, clean.q and dirty.q. We advise you to avoid a suffix of Z since this is the suffix assigned to compressed files and it may lead to confusion if compressed files also exist.
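The naming rule for run IDs can be expressed as a short Python sketch. The function name is hypothetical and the code only restates the rule given above: the suffix is appended after a dot unless the file name already contains a dot.

```python
def run_file_name(base_name, run_id):
    """Return the name a file would be expected to have for a given run ID."""
    if "." in base_name:
        # Names that already contain a dot, such as clean.q, are left alone.
        return base_name
    return f"{base_name}.{run_id}"


for name in ("out1", "out2", "tab_", "clean.q", "dirty.q"):
    print(name, "->", run_file_name(name, "abc"))
```

A helper like this is useful when a script needs to locate the output of a particular run, for example to collect tab_.abc after the job finishes.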
16.10 The Quantum temporary directory

Quick Reference
To create intermediate files in a directory other than the project directory, type:

    quantum -td directory_name [options] [file_names]

Quantum can create its temporary work files in a directory other than that in which the job is running. The directory is named using the option –td on the command line:

    quantum -td temp run1 data

This example tells Quantum to create temporary files in a subdirectory called temp in the project directory.

Creating temporary files in a different directory is one way of improving the performance of large jobs running under DOS. When the number of files associated with a job rises above 500, you’ll find that the job runs more quickly if the temporary files are created in a different directory. You’ll also find it more convenient to scan directories’ contents when the number of files in each one is reduced.
You may also find that using –td when creating a Quanvert database helps to keep the project directory clean of unwanted files. It is also useful if you need to do multiple Quantum runs to create the database. As long as you use a different temporary directory for each run, you can then combine the directories with qvmerge to create the Quanvert database.
✎ The option to create a Quanvert database is only available if the Quanvert Database Administration software is installed.
☞ For further information about creating a Quanvert database, see chapter 7, ‘Creating and maintaining Quanvert databases’ in the Quantum User’s Guide Volume 4.
16.11 The Quantum permanent directory

Quick Reference
To read program, data or include files from a directory other than the one in which the program is being run, and to create permanent files such as report files in that same directory, type:

    quantum -pd directory_name [options] [file_names]

Quantum normally reads its program, data and include files from the directory in which you are running the program, and creates permanent output files such as print or report files in that directory. If you want to use a different directory, define it on the command line with the option -pd. An example using Unix pathname notation is:

    quantum -pd /usr/barbara/qjobs run1 data

The exceptions are filedef and include with absolute pathnames. In these cases Quantum uses the directory named in the pathname.
Index

This index covers all four volumes of the Quantum User’s Guide. The page references consist of the volume number followed by the page number; for example 2-6 is page 6 of Volume 2, 3-166 is page 166 of Volume 3, and so on.
A a, global tabulation parameters 2-8 in tabcon format file 3-190 options on 2-9 Absolutes decimal places with 2-113 position of percentages relative to 2-18, 2-117 print character before/after 2-41, 2-114 print characters next to 2-81 requesting in tables 2-16 side by side with percentages 2-17 suppress small 2-21, 2-117 ac, accept codes in online edit 1-165 Access rights on files in Quanvert Text 4-81 Accum error messages 297, 4-163 Accum program 1-229 acr100, 100% on base row 2-10, 2-32 Action codes with require 1-145, 1-146 ad, create cards in online edit 1-166 add, add tables 2-182 dummy elements with 2-186 example of 2-184 options with 2-186 Quanvert 4-71 sample program for 2-183 with offsets 2-183 Adding table 2-182 Addition 1-26 Aided and unaided awareness, example of 2-230 Alias file for qvpack/qvtrans 4-128 Aliases for Quantum statements 4-6 allread, cards read for current respondent 1-50 with write 1-66 alp files 4-94 Alpha variables, for Quanvert 4-73, 4-74 Alphanumeric card types 1-58 alter, texts in Quanvert Text 4-83 Analysis levels see Levels Analysis of variance Friedman’s two-way 3-85 one-way 3-110 example 3-110 formula 3-119 .and., logical comparison (both) 1-39 and, axes for additional tables 2-178 with flt 2-218 and, logical operator for assignment 1-100
anlev=, analysis level level at which to update 2-10, 2-40, 3-53 weighting with 3-12 with grids 2-245 anova, one-way analysis of variance 3-110 Arguments for subroutines 1-188 Arithmetic equality, element conditions 2-89 Arithmetic expressions 1-26 blanks in 1-25, 1-39 combining 1-26 comparing 1-30 data-mapped variables 1-204, 1-207 increment cell counts using 2-27 missing values in 1-173 mixing integers and reals 1-27 multicodes in 1-25 numb 1-28 order of evaluation 1-26 random 1-29 saving value of 1-97 Arithmetic values, storing 1-95 Arrays checking boundaries for assignments 1-112 referring to cells in 1-18, 1-197 triangular of statistics 3-71 types of 1-17 ASCII character set, octal punch code file 4-171 ASCII characters, punch code equivalents 4-175 Assignment 1-89 and 1-100 checking array boundaries for 1-112 copying codes 1-90 data-mapped variables 1-204, 1-207 missing values 1-173 or 1-100 replacing codes 1-92 storing arithmetic values 1-95 xor 1-101 Association, test for 3-76 Asterisks in tables 1-16 Audio files 4-74 availang, available languages file 4-84 Averages 2-137 creating with manipulation 3-33 exclude elements from 2-116 ax files 4-94 axcount, count records by axis name 2-25, 2-32
Axes analysis level 2-40, 3-53 bases in 2-56, 2-113 blank lines in 2-58 column and code map for 1-226 column width 2-41, 2-113 column, nested subheadings in 2-63 creation of, in Quanvert (Windows) 4-96 creation of, in Quanvert Text 4-83, 4-96 declare weighting in 3-14 defining for Quanvert 4-68, 4-69 double spacing in 2-41 elements per create in Quanvert Text 4-83 flag as single coded 2-44 generated from qdi file 1-221 grids 2-238 introduction to 2-39 long element texts in 2-66 maximum characters per axis 4-9 maximum per run 4-9 mutually exclusive elements 3-72 naming 2-39, 4-15 naming of files 4-15, 4-95 no double spacing in 2-45 no sorting 2-45 on tab statements 2-171 reflip incorrect 4-98 require single coding 2-40 reset flags between trailer cards 2-41 restrict access to, in Quanvert Text 4-84 sorting 2-44 special characters for laser printing 3-199 subaxes within 2-76 subheadings 2-41 axes.inf files 4-94 axes=, maximum number of axes per run 4-9 Axis names, table titles from 2-10 Axis subgroups 2-76 Axis-level statistics, list of 3-68 axreq=, axis coding requirements 2-25, 2-40 axtt, table titles using axis names 2-10, 2-32
B b, breakdown element for manipulation 3-41 baft, print base titles last 2-10 Banners see Breakdowns Base creating 2-56, 2-113 effective 2-119, 2-153, 3-147 enclose in parentheses 2-81 flag cells with small for stats 2-20 force export to SAS or SPSS 2-114 minimum effective for T statistics 2-29, 3-150 percentage against redefined 2-16 print base title last 2-10
Base (continued) redefining 2-103 required for statistics 3-71 small for special T statistics 2-20, 3-150 sort on element other than 3-126 suppress elements with small 2-21 suppress percentages with small 2-196 suppress statistics with small 2-196 suppress tables with small 2-21 use to define segments in an axis 3-69 base, base element 2-113 binasc.dat, octal punch codes for ASCII character set 4-171 bineas.dat, octal punch codes for extended ASCII character set 4-171 bintab, convert extended ASCII character set 4-174 bintab.qt, characters in extended ASCII character set 4-171 bit arguments per fld statement 2-267, 4-133 bit files 4-94 bit, elements with numeric codes 2-97 inc= with 2-99 when better than fld 2-99 Blank lines after column headings 2-14, 2-162 before column headings 2-14, 2-162 in tables 2-58 Blanks allowing in arithmetic tests 1-39 with col 2-84 bot, titles at bottom of page 2-210 Quanvert 4-71 with flt 2-218 with hitch/squeeze 2-191 boxe, end of box 3-207 Boxes in tables 3-206 boxg, box above G texts 3-207 boxl, draw line inside box 3-207 boxs, start of box 3-207 Brackets, print multicodes in 1-80 Break points, define in element texts 2-163, 3-199 Breakdowns, example of 2-167 btx files 4-94 byrows, export grids row-by-row in Quanvert 2-40, 2-249
C #c, start C code 1-183, 3-123 C array columns 1-18 defining size of 1-198 increasing 1-196 C code in Quantum spec 3-123 C compiler error messages 296, 4-162 C library functions, calling 1-192
C subroutine code file 4-11 compiled 4-22 c=+, net cases counted so far 2-48 example of 2-134 with the effective base 2-153, 3-147 c=, conditions 2-26, 2-40, 2-46, 2-119 data-mapped variables 1-213 with weights 3-13 c=-, count cases not counted so far 2-48 with the effective base 2-153, 3-147 ca, cancel online edit 1-167 Calculation of effective base 3-147 call, run a subroutine 1-177 passing variables with 1-190 cancel, cancel the run 1-128 cann, symbolic parameters for columns 2-228 Card type alphanumeric 1-58 highest 1-57, 1-198, 3-47 ignoring when reading data 1-49 location of 1-55, 3-47, 4-2 repeated 1-56 required 1-56 card_count, number of cards read so far 1-52 Cards first in record read 1-51 last in file read 1-52 last in record read 1-51 maximum per record with levels 4-2 more than 100 columns in multicard records 1-63 number read so far 1-52 read in during current read 1-50 read in for current record 1-50 cards=, defining levels 3-46, 4-2 Cell counts cancel incremental values for 2-45 file 4-22 incremental values for 2-42, 2-120 celllev=, update table at higher level than axes 2-174, 3-54 comparison with uplev 3-58 example of 3-58 statistics with 3-61 with grids 2-246 Center tables on page 2-22 Change record length using len= 1-78 Changes, before and after, test for 3-83 Character set 1-7, 4-169, 4-171 Characters allowed in variable names 1-195 Characters in extended ASCII character set 4-171 Characters per axis, set maximum 4-9 check_, possible syntax errors are fatal 1-10 chi1, one dimensional chi-squared test 3-74 chi2, two dimensional chi-squared test 3-76 chis, single class chi-squared test 3-78
Chi-squared test one dimensional 3-73 example of 3-74 formula 3-89 single classification 3-78 example of 3-80 formula 3-90 two dimensional 3-76 example of 3-77 formula 3-89 Clean data file 1-228, 4-16 clean.q, clean data file 1-228, 4-16 clear, reset variables to initial state 1-111 advantages over assignment 1-111 clear=, reset axis cells 2-41, 3-64 clevel confidence level for special T stats 2-26, 3-156 test for significance with chi-squared test 3-78 Codes / with 1-15 adding into columns 1-102 checking exclusive 1-150 checking number in column 1-154 checking type of 1-146 checking with require 1-144, 1-148 comparing 1-31 copying 1-90 counting, in columns 1-28 deleting 1-103 entering 1-14 list of 1-13 replacing 1-92 set random into columns 1-107 symbolic parameters for 2-232 Coding, defining axis requirements 2-25 Coding, summarizing for axes 2-25 col, basic count elements 2-83 blanks with 2-84 conditions 2-86 semicolons in text 2-85 text-only elements 2-88 col, column element 2-115, 2-140 colmap, column/code map for axes 1-226, 4-16 colrep, check column and code usage 4-27 coltxt, print text in main body of table 2-61 Column and code map for axes 1-226, 4-16 Column and code usage, check 4-27 Column headings 2-159 blank lines after 2-14, 2-162 blank lines before 2-14, 2-162 defining for Quanvert 4-71 in laser printed tables 3-199 line titles up with start of 2-204 splitting long texts 2-163 suppress with squeeze=2 2-193 text differs from row text 2-118 underlining 3-203 using colwid= 2-164
Column headings (continued) using pagwid and side= only 2-160 with g and p statements 2-164 with sid 2-181 Column offsets with added tables 2-183 Column percentages 2-16 example of 2-57 force to round to 100% 2-19 suppress small 2-21 Columns 1 to 100 1-52 checking contents of 1-34 checking with require 1-144 delete codes from 1-103 fields of 1-18 insert codes in 1-102 listing contents of 1-139 real numbers in 1-23 referring to 1-18 resetting to blank 1-52 set random code into 1-107 spare, using 1-52 symbolic parameters for 2-228 Columns in tables Newman-Keuls test 3-165 position of subheadings above 2-65 ranks 2-16 sorting 2-10, 3-127 suppress small 2-20 t-test on means 3-164 t-test on proportions 3-160 vertical lines between 2-167 width 2-10, 2-41, 2-113, 2-164 colwid=, column width 2-10, 2-41, 2-113, 2-164 Combine several variables on one statement 1-214 Combining tables 2-179, 2-188 Combining testing sentences 1-157 Comma-delimited ASCII, Quantum/Quanvert Text tables into 4-32, 4-35 Command availability for Quanvert Text 4-83 comment, comment statement 1-9 Comments with require 1-147 Comparing data variables 1-31 Compilation listing file 1-226, 4-13 Compilation, files created by 1-226 Compiled C subroutine code file 4-22 Compiler error messages 271, 4-137 Compiling your program file 1-226 Components of a program 1-3 Compressed data files, reading 1-225 Conditions c=+ and c=- 2-48, 2-153, 3-147 count cases not counted so far 2-48 net cases counted so far 2-48 on elements 2-46, 2-52, 2-119 Quanvert axes 4-69 ranges 2-92 simplifying complex 2-52
Conditions (continued) types of 2-48 with c= 2-26, 2-46 with col statements 2-86 Confidence level for special T stats 3-156 Constants comparing 1-31 individual 1-13 strings 1-15 Continuation elements in sorted tables 3-137 long element texts 2-66 long statements 1-9 continue, read next statement 1-119 Continuity correction for t-test 3-161 Copying weights into the data 3-24 Correcting data forced edits 1-159 methods of 1-159 online 1-160 split 1-161 write 1-161 Corrections file 1-170, 4-4 corrfile, corrections file 4-4 count, create a holecount 1-135 crd=, card type location 1-55, 3-47, 4-2 Create new data files split 1-167 write 1-69 Creating a table of contents 3-189 Creating new cards 1-70 Cross-referencing in panel studies 4-73 csort, sort columns 2-10, 3-127 Cumulative output summary file 4-22 Cumulative percentages 2-16 example of 2-34 Currency symbols, print next to absolutes 2-81 Customized text file, define 4-8 C-variables 1-18
D d, delete codes in online edit 1-163 Data automatic filtering of in Quanvert Text 4-84 C array 1-18 checking and verifying 1-4 compressed, reading 1-225 convert to Quanvert database 4-93 convert to SAS format 4-56, 4-65 convert to SPSS format 4-38, 4-44 converting multicoded to single coded 1-181 correcting 1-159 counting responses with numeric codes 1-108 define structure in levels file 3-47 merging cards from different files 1-59
Data (continued) merging fields from an external file 1-61 merging files 4-4 non-standard format 1-63, 1-225, 2-250 output file for require 4-18 overlapping, with special T stats 2-30, 3-159 Quantum format 4-167 reading into C array 1-48 types of 1-47 write out fixed length records 1-69, 1-73 write out in user-defined format 1-84 Data files #include with 2-227 define T variables in 1-113 non-standard 1-63, 1-225, 2-250 Databases access Unix with PC-NFS 4-130 add variables to 4-99 convert unpacked files 4-130 copy packed 4-125 create 4-93 do not compress 4-124 files 4-94 icon 4-90 join split for unpacking 4-127 levels 4-72, 4-73 link similar 4-101 make secure 4-116 maximum size of packed file 4-124 new format 4-67 old format 4-67 pack and split 4-124, 4-129 Quanvert (Windows) 4-86 security level 4-117 split large packed 4-127 store variables in subdirectories 4-80 transfer format 4-125 transfer programs for 4-125 unknown file formats 4-128 unpack 4-126 weighted 4-71 see also Quanvert, Quanvert Text, Quanvert (Windows), Multiproject databases Data-mapped variables 1-201 assigning values to 1-207 defining 1-203 testing values of 1-211 using in analysis specifications 1-213 Data-mapping files 1-201, 1-203 Datapass error messages 297, 4-163 Datapass error summary file 4-18 Datapass program 1-227 date, print date on table 2-10, 2-32 db.ico file for Quanvert (Windows) 4-90 db.nts file for Quanvert (Windows) 4-90 db.ptf, translation file 2-176, 4-23, 4-77 dbhelp.msg file for Quanvert (Windows) 4-90 debug, intermediate figures for special T stats 3-157
dec=, decimal places for absolutes 2-10, 2-113 with means 2-139 with stat= 3-68 Decimal places 1-16 absolutes 2-10, 2-113 in significance levels 3-68, 3-71 in statistics 3-68, 3-71 means 2-139 percentages 2-11, 2-113 decp=, decimal places for percentages 2-11, 2-113 with stat= 3-68 #def, global values for symbolic parameters 2-237 with grids 2-243 *def see #def Default options file 2-32, 4-3 definelist, name a list 1-44 limits 4-9 delete, delete codes from columns 1-103 descrips.inf 4-24, 4-94 Descriptive statistics, exclude elements from 2-116 di, display columns in online edit 1-162 Difference between .eq. and = 1-37 Differences between celllev and uplev 3-58 Digits in variable names 1-195 Dirty data file 1-228, 4-16 dirty.q, dirty data file 1-228, 4-16 Disk space check machine has enough for job 4-177 reduce amount needed for Quanvert 4-75 temporary required during run 4-178 Display wide files in Quanvert Text 4-85 Distribution, comparing 3-76, 3-81 div, divide one table by another 2-186 Division 1-26 DNA, missing values in Quanvert 4-74 do, start a loop 1-119 nested loops 1-123 with individual values 1-120 with ranges of values 1-121 Dollar signs with strings 1-15 Don’t know, data-mapped variables 1-205 Double precision calculations 2-27, 2-32 Double quotes, in holecount/list headings 1-135, 1-140 dp, double precision calculations 2-27, 2-32 dsp, double spacing 2-11, 2-32, 2-41, 2-113 Dummy axis, name in Quanvert Text 4-84 Dummy elements 2-113 with add 2-186 dummy, create a dummy element 2-113, 2-186
E e, insert codes in online edit 1-163, 3-124 #ed, start edit in tab section 3-124 ed, re-edit current record online 1-166
ed, start of edit section 1-8 with levels 3-50 edheap=, limit for edit statement 4-9 Edit, processing missing values 1-172 Editing axis coding requirements 2-25 in tabulation section 3-124 interactive correction of errors 1-160 with levels 3-50 effbase, effective base 2-119, 2-153, 3-147, 3-149 Effective base 2-119, 2-153, 3-147, 3-149 Element texts define breakpoints in 2-163 printing | and ! in 3-202 Elements all zero, ignoring 2-116 assign to subgroups 2-79, 2-114 base 2-56, 2-57 non-printing 2-57 basic counts 2-50 non-printing 2-56 required for statistics 3-71 blank lines 2-58 cases already counted 2-48 cases not yet counted 2-48 conditions on 2-46, 2-52 count creating 2-49 distribution of records between 2-129 excluding from totals 2-116 extra text 2-58 ignore in column axes 2-115 ignore in higher dimensions 2-115 ignore in row axes 2-116 indent text when split 2-115 intermediate figures for special T stats 3-157 maximum values of inc= 2-124 minimum values of inc= 2-124 number per create in Quanvert Text 4-83 percentage differences 2-124 print all-zero 2-45 rejecting one from another 2-125 reprint at top of continued tables 2-109 responses with numeric codes 2-94, 2-97 selecting for special T stats 3-145 set maximum per run 4-9 simplifying complex conditions 2-52 splitting long texts 2-51 subheadings 2-62 sum of suppressed 2-118 suppress all-zero 2-15, 2-44 suppressed, accumulating in tables of nets 2-72 text continuation 2-66 types of 2-45 underlining text on 2-119 unsorted, in sorted table 2-116 weight factors for 3-15 weighted target for 3-14 elms=, elements for special t-tests 3-155
elms=, maximum number of elements per axis 4-9 else, conditional actions 1-117 emit, insert codes in columns 1-102 #end, finish edit in tab section 3-124 #endc, end C code 1-183, 3-123 End of data file, checking for 1-52 end, end of edit section 1-8 endlevel, edit at end of level 3-51 endnet, end a net 2-67, 2-113 #endpostscript, end PostScript code 3-213 endsort, end secondary level sorting 2-113, 3-134 terminating more than one level 3-135 Environment variables QTAXES 4-10 QTEDHEAP 4-10 QTELMS 4-10 QTFORM 3-201 QTHEAP 4-10 QTHOME 1-223 QTINCHEAP 4-10 QTINCS 4-10 QTINLISTHEAP 4-10 QTLEXCHARS 4-10 QTMANIPHEAP 4-10 QTNAMEVARS 4-10 QTNOPAGE 4-23 QTNOWARN 4-11 QTSPSSRC 4-55 QTTEXTDEFS 4-10 .eq., logical equality 1-30 Error messages accum stage 297, 4-163 C compilation stage 296, 4-162 compilation stage 271, 4-137 datapass stage 297, 4-163 include files 2-226 percentiles 2-151 printing on the screen 1-11 Error variance of the mean 2-136 formula 2-157 in weighted jobs 2-143 suppress if has small base 2-20 suppress if small base 2-196 Errors, correcting 1-5, 1-10, 1-170 errprint, print error messages on the screen 1-11 ex, table manipulation 3-34 ex=, manipulation expression 2-119, 3-26, 3-32 secure databases 2-45, 2-118, 4-116, 4-118 Examining records count 1-133 list 1-138 online edit 1-160 qfprnt 1-84 report 1-70 require 1-145 write 1-65
Examples aided and unaided awareness 2-230 anlev= 3-53 brand awareness questions 2-52 breakdown 2-167 c=+ 2-134 chi-squared test 3-74 column percentages 2-57 cumulative percentages 2-34 data-mapped variables 1-201, 1-213, 1-215 div 2-187 editing with levels 3-51 Friedman’s test 3-87 grids 2-239, 2-240, 2-241 hitch/squeeze 2-191 indices 2-35 Kolmogorov-Smirnov test 3-82 manipulation 3-40, 3-42 maxim and minim 2-37 McNemar’s test for differences 3-84 multidimensional tables 2-172 Newman-Keuls test 3-113 one sample T-test 3-102 one sample Z-test 3-94 one-way analysis of variance 3-110 paired T-test 3-102 percentaging against redefined base 2-103 percentaging with nets 2-73 process 1-130, 2-100 product tests 2-247 smbase= 2-199 subtotals 2-136 suppress percents with small bases 2-199 symbolic parameters 2-229, 2-232 table of means 2-36 table with inc= 2-136 total percentages 2-33 total rows in tables 2-121 totals 2-136 Exclude respondents from weighting 3-6 exp, exponentiation manipulation operator 3-27 explode, convert multicoded data to single coded 1-181 export, export element to SAS or SPSS 2-114 Exporting data, suppressing elements 2-115 exportmp, force an axis to be multicoded when exporting to SPSS 2-41, 4-50 Expressions arithmetic 1-25 combining arithmetic 1-26 combining logical 1-39 comparing data variables 1-31 comparing values 1-30 logical 1-30 manipulation 3-26 mixed mode arithmetic in 1-27 mixing logical operators 1-41 numb 1-28
Expressions (continued) random 1-29 range 1-38 with table manipulation 3-34 Extended ASCII character set defining 4-169 laser printed tables 3-212 octal punch code file 4-171 External data file, merge a field from 1-61 External variables 1-199 with subroutines 1-186
F F and T values with nft 3-108 formula 3-117 fac=, factors for statistics 2-119, 2-138 in same axis as inc= 2-139 on col and val 2-120 on row elements, for T-test 3-101 percentiles 2-144, 2-145 with stat= 3-93 Factor weighting 3-2, 3-7 factor, factor weighting 3-7 Factors decrementing by a constant 2-120 defining 2-119 incrementing by a constant 2-120 on col and val 2-120 percentiles from 2-144, 2-145, 2-146, 2-148 reverse sequential order for percentiles 2-146 scaling 2-117 switching off 2-120 failed_, action when require fails 1-156 fen, font encoding files 3-212 fetch, load data from a look-up file 1-178 fetchx, load data from a look-up file 1-180 field, count numeric codes across fields 1-108 fieldadd, count numeric codes 1-111 Fields checking codes in 1-37 comparing 1-35 copying codes into 1-91 merging from an external file 1-61 referring to 1-18 figbracket, print characters around absolutes 2-41, 2-81, 2-114 figchar=, character to print next to absolutes 2-41, 2-81, 2-114 figpost, print character after absolutes 2-41, 2-81, 2-114 figpre, print character before absolutes 2-41, 2-81, 2-114 File formats, unknown for databases 4-128 filedef, define output file type 1-78 override ruler printing with ident 1-83
Files aliases 4-6 alp 4-94 ax, axis information files 4-94 axes.inf 4-94 binasc.dat 4-171 bineas.dat 4-171 bintab.qt 4-171 bit 4-94 btx 4-94 C subroutine code 4-11 cell counts 3-38, 4-22 clean data 1-167, 4-16 column and code map 1-226, 4-16 comm.qsp 4-44 commands 4-65 commands.qsp 4-51 compilation listing 1-226, 4-13 compiled C subroutine code 4-22 compiled subroutines 1-183 compressed data 1-225 corrections 1-170, 4-4 created at compilation stage 1-226 created by flip 4-94 cumulative output summary 1-230, 4-22 customized table texts 4-7 data merge file 4-4 data.qsp 4-44, 4-51 data-mapping 1-201, 1-203 datapass error summary 4-18 default options 2-32, 4-3 deletion of temporary 1-223 descrips.inf 4-24, 4-94 dirty data 1-167, 4-16 fen 3-212 fli, inverted data files 4-94 flip.cnf 4-78 format file for table of contents 3-194 frequency distribution 4-17 generated by qdiaxes 1-220 graphics output 4-22 holecount 4-17 inc 4-94 intermediate figures for special T stats 3-157 levels 3-45, 4-2, 4-94 log 1-230 machine.def 4-128 manipulated cell counts 3-38 merges 4-4 merging data from different files 1-59 mul 4-75, 4-94 nums 1-229 nums.man 1-229 output data from require 4-18 PostScript 3-198 private.c 4-11 private.o 4-22 ptf, translation file 2-176, 4-23, 4-77
Files (continued) qdi 1-201, 1-217 Quanvert levels cross-reference 4-95 numdir.qv 4-80 required for 4-96 tstatdebug 4-76 Quanvert (Windows) 4-86 db.ico 4-90 db.nts 4-90 dbhlp.msg 4-90 qextras file 4-91 qnaire.txt 4-91 sound files 4-74 stats.ini 4-86 Quanvert Text 4-81 access rights 4-81 availang 4-84 foreign language prompts 4-81 mfwaves 4-111 profopts 4-82, 4-85 qotext.dat 4-82 qvtext.dat 4-82 users 4-83 records written by write/require 1-145, 4-17 rim weighting parameters 4-5 run definitions 3-38, 4-3 statdata 4-65 subroutine source 1-183 table of contents format 3-190 tables 1-230, 4-22 texts.qt 4-8 user-defined limits 4-9 variables 1-196, 4-1 weighting report 1-229, 4-19 Filtered holecounts 1-136 Filters canceling 2-219 groups of tables 2-217 in grid tables 2-247 n00 in axis 2-104 named 2-11, 2-220 nested sections 2-221 on per-user basis in Quanvert Text 4-84 Quanvert 4-71 sample program 2-219 firstread, first card in record read 1-51, 3-64 Fixed length records, writing out 1-69, 1-73 fld, elements with numeric codes 2-94 bit argument limit 2-267, 4-133 options on 2-112 when to use bit instead 2-99 fli files 4-94 Flip, create Quanvert database 4-68, 4-93 configuration file 4-78 files created by 4-94 reasons axes excluded 4-69 remove files used by 4-97
Flip, create Quanvert database (continued) reserved words 4-70 flip.cnf, flip configuration file 4-78 flipclean, remove files used by flip 4-97 flt, filter groups of tables 2-217 and with 2-218 bot with 2-218 foot with 2-218 inc= with in levels jobs 2-218 options on 2-9, 2-217 tt with 2-218 flt=, named filters 2-11, 2-220 flush, percentages flush with absolutes 2-11, 2-32 Font encoding in PostScript tables 3-212 font=, fonts for laser printing 2-11, 3-209 Fonts for titles 3-205 in laser printed tables 3-197, 3-209 #fonttable, define fonts for table 3-209 foot, footnotes on tables 2-208 switching off 2-209 with flt 2-218 with hitch/squeeze 2-191 Footnotes on tables 2-208 overlapping data 2-15, 3-160 switching off 2-15, 2-209 with flt 2-218 with hitch/squeeze 2-191 Forced editing with if 1-159 with require 1-151 Forcing single-coded answers 1-104 Format file for table of contents 3-190 naming 3-194 Formulae analysis of variance 3-119 chi-squared test one dimensional 3-89 single classification 3-90 two dimensional 3-89 error (sample) variance 2-157 F and T values from nft 3-117 Friedman’s test 3-91 Kolmogorov-Smirnov test 3-90 least significant difference test 3-182 McNemar’s test for differences 3-91 mean 2-156 Newman-Keuls test 3-121, 3-184 one-way analysis of variance 3-119 paired preference test 3-181 rim weighting efficiency 4-21 root mean square 4-20 significant net difference test 3-181 standard deviation 2-156 standard error 2-157 sum of factors 2-156 T-test on column means 3-177
Formulae (continued) on column proportions 3-179 one sample 3-117 paired 3-117 two sample 3-117 Z-test one sample 3-115 overlapping samples 3-116 subsample proportions 3-116 two sample on proportions 3-115 Frequency distribution file 4-17 Frequency distributions 1-138 alphabetic 1-139 double quotes in headings 1-140 missing values in 1-139 multiplied 1-142 ranked 1-139 weighted 1-142 friedman, two-way analysis of variance 3-85 Friedman’s test 3-85 example of 3-87 formula 3-91 F-test see Analysis of variance, one-way, ANOVA Functions, C library 1-192
G g, layout column headings 2-165 combining groups of 2-166 in laser printed tables 3-199 sid statements with 2-181 spacing with 2-166 .ge., greater than or equal to 1-30 Generate Quantum spec from qdi file 1-217 go to, routing in edit section 1-118 graph=, create graphics input files 2-13, 2-32 files created by 4-22 Grid axes see Grids Grid tables see Grids grid, identify a grid table 2-244 Grids #def with 2-243 components of 2-238 creating tables 2-244 data-mapped variables 1-215 example of 2-241 code symbolic parameters 2-240 column and code symbolic parameters 2-240 column symbolic parameters 2-239 export to SAS/SPSS from Quanvert 2-40, 2-249 filtered columns in 2-247 in levels jobs 2-245 increments in 2-243 inctext= invalid with 2-123
Grids (continued) recognizing 2-238 rotated, op= with 2-245 weighted 2-246 group=, axis group for element 2-79, 2-114 groupbeg, start of subaxis 2-77 groupend, end of subaxis 2-77 Groups in Quanvert (Windows) 4-68 .gt., greater than 1-30
H Harvard Graphics 4-32 hct_, holecount file 1-228, 4-17 hd=, axis subheading 2-41 hdlev=, nested subheadings for column axes 2-63 hdpos=, position of subheadings above columns 2-65 header=, header length in non-std data file 2-250 heap=, maximum number of characters per axis 4-9 Hierarchical data process with 3-63 processing with clear= 3-64 processing with levels 3-45 see also Levels Highest card type 1-57, 1-198, 3-47, 4-2 hitch=, print table on same page as previous table 2-13, 2-188 how Quantum compares table texts 2-194 numbering printed pages 2-19 paper saving mode 2-191 paste one table under another 2-195 print page numbers logically/physically 2-196 short tables with 2-190 table texts with 2-191 hold=, rows to reprint at top of continued tables 2-109, 2-114 Holecount file 4-17 Holecounts 1-133 basic 1-135 double quotes in headings 1-135 filtered 1-136 multiplied 1-136 weighted 1-136 hug=, space required at bottom of page 2-108
I Icons for Quanvert (Windows) 4-90 ID text, and data-mapped variables 1-209 –id, multiple runs in a directory 1-231 id=, manipulation id 2-115, 2-174, 3-36, 3-41 on n/col/val/fld/bit 3-28
ident, default print parameters for write 1-81 print/suppress ruler with 1-83 turn off defaults 1-83 Identical statements, filing and retrieving 2-225 IDs for manipulation 2-115 if, conditional actions 1-115 forced editing with 1-159 with missingincs 1-173 with require 1-157 ignorezeros, with n07 2-137 .in., comparing values to a list 1-42 inc files 4-94 inc(), increments in grids 2-243 inc=, increment for cell counts 2-27, 2-32, 2-42, 2-120 data-mapped variables 1-205 element for maximum values of 2-124 element for median values of 2-124 element for minimum values of 2-124 example of 2-121 exclude missing values from calculations 2-142 in grids 2-243 in same axis as fac= 2-139 missing values with 2-122 on flt in levels jobs 2-218 on n25, for T-test 3-101 percentiles 2-144, 2-149 Quanvert databases 4-78 sample table with 2-136 switching off 2-122 table of maximum values of 2-28 table of mean values of 2-28 table of minimum values of 2-29 with levels 3-59 with statistics 2-138 incheap=, number of characters for inc= names 4-9 #include, read contents of another file 2-226 symbolic parameters with 2-228, 2-234 *include see #include Include files compressed 1-225 nesting 2-227 #includes, read non-standard data file 2-250 Incorrect axes, reflipping 4-98 Increasing limit for element manipulation 4-9 limit for text strings 4-10 maximum complexity of edit statement 4-9 maximum size of definelist 4-9 number of axes per run 4-9 number of characters for inc= names 4-9 number of characters per axis 4-9 number of elements per axis 4-9 number of inc= per run 4-9 number of named variables per run 4-9 number of text symbolic parameters 4-9 size of C array 1-196 incs=, maximum number of inc= per run 4-9
inctext=, text for numeric variable 2-27, 2-42, 2-123 indent=, indent folded element text 2-13, 2-115 Indices 2-16 example of 2-35 Individual constants 1-13 inline, convert to inline code 1-45 inlistheap=, limit on complexity of a definelist 4-9 input, weighting with proportions 3-7, 3-16 Integer variables 1-20 reset to zero 1-111 Integers 1-16 and reals in the same expression 1-27 defining in subroutines 1-189 saving in real variables 1-96 Intermediate files, summary of 4-23 Internal variable names 4-15, 4-24, 4-94 Interpolation method for percentiles 2-28, 2-146, 2-151, 2-152 Inverted databases 4-93 ismissing, check for missing_ 1-175
J Jobs check whether sufficient disk space to run 4-177 compile only 1-226 complete run 1-224 create log file 1-230 create Quanvert database 4-93 creating tables 1-229 deletion of temporary files 1-223 load C code 1-227 modifying for Quanvert 4-68 multiple runs in a directory 1-231 read and process data 1-228 rerun compilation & output stages only 1-229 run in background 1-230 speeding up 1-45 stages in 1-223 temporary space for 4-178 Join split databases 4-125 Jumping to tab section 1-126 Justification column headings in laser printed tables 3-199 row text in laser printed tables 3-203
K keep, percentage differences 2-123, 2-126, 2-175 Kolmogorov-Smirnov test 3-81 formulae 3-90 ks, Kolmogorov-Smirnov test 3-81
L l, name an axis 2-40 Labels 1-4 with do 1-119 with go to 1-118 lang=, specify the language 2-13, 2-176, 4-77 Languages Quanvert (Windows) 4-77 Quanvert Text 4-81, 4-84 SAS 4-56, 4-64 specify 2-13, 2-176 SPSS 4-38, 4-44, 4-54 tables 2-176 Large numbers, printing 2-27 Laser printed tables fonts for 2-11 justification of column headings 3-199 justification of row text 3-203 personalized PostScript code 3-213 printing extended ASCII characters in 3-212 special characters with 3-201 suppressing border 3-206 lastread, last card in record read 1-51, 3-65 lastrec, last record in file read 1-52 .le., less than or equal to 1-30 Least significant difference test 3-175 formula 3-182 len=, change the record length 1-78 levbase, increment base at anlev=level 2-124, 3-57 level, edit for a specific level 3-50 Levels analysis level for tables 2-10 cross-reference files for Quanvert 4-94, 4-95 cross-tabulating axes at different levels 3-53 define data structure in level file 3-47 defining in levels file 3-45, 4-2 defining with struct 3-48 example of edit 3-51 grids with 2-245 how tables are produced 3-51 inc= on flt statements 2-218 introduction to 3-45 levels file 3-45 maximum allowed 3-45 maximum cards per record 4-2 maximum sub-records per record 3-48 naming in edit section 3-50 numeric variables 3-59 preparing for Quanvert 4-72, 4-73 process with 3-63 record length 3-48 special T statistics 3-62, 3-149, 4-88 statistics with 3-61 updating bases in uplev= tables 2-124, 3-57 updating cells with anlev= 3-53 updating tables at higher level than axes 3-54 weighting 3-12
Levels file 4-2 lexchars=, increase limit for text strings 4-10 License expiry warning 4-11 Limits increasing 4-9 list of 2-265, 4-131 numbers 1-16 linesaft, blank lines after column headings 2-14, 2-162 linesbef, blank lines before column headings 2-14, 2-162 list, create frequency distribution 1-139 lista, alphabetic frequency distribution 1-139 listr, ranked frequency distribution 1-139 Lists alphabetic 1-139 creating 1-139 named 1-44 preventing use of in Quanvert Text 4-85 ranked 1-139 Local variables 1-199 with subroutines 1-186 Location, test for in matched samples 3-85 Log files 1-230 Logical expressions arithmetic value of field 1-38 checking equivalence of 1-154 combining 1-39 comparing data variables 1-31 comparing values 1-30 comparing variables to a list 1-42 data-mapped variables 1-205, 1-210 negating 1-40 range 1-38 validating 1-153 with c= 2-119 with if 1-115 Logos, printing on tables 3-209 Long texts, splitting 2-51 Look-up files 1-178 list used/unused keys 1-180 Loops function of 1-119 nesting 1-123 with routing 1-124 Lotus-123, convert Quantum data for use with 4-32 lsd, least significant difference test 3-175 lst_, frequency distribution file 1-228, 4-17 .lt., less than 1-30
M m, create a manipulated row 3-25 define manipulation expression 3-26 options on 3-25 machine.def, qvpack/qvtrans alias file 4-128
manipclean, delete all except manipulation files 4-25 manipheap=, limit for element manipulation 4-9 Manipulated cell counts file 3-38, 4-22 Manipulated elements, in sorted tables 3-141 Manipulation apply spechar and nz options to manipulated elements 2-14, 3-31 averages 3-33 example of 3-40, 3-42 expressions with 3-41 manipulated cell counts file 3-38 more than one table 3-39 on n statements 3-32 parts of tables 3-41 program 1-229 replacing numbers in tables 3-35 row, example of 3-30 run definitions file 3-38, 4-3 run ids for 3-38 tables from dummy data 3-43 tables from other runs 3-38 using automatic table ids 3-36 using element ids 3-28 using overall position 3-37 using previously manipulated figures 3-38 using relative position 3-29, 3-37 using row texts 3-27 using your own ids 3-36 whole tables 3-34 manipz, apply spechar and nz options to manipulated elements 2-14, 3-31 mapvar, define data-mapped variable 1-203 Matched samples, testing difference in location 3-85 max, maximum manipulation operator 3-26 max=, highest card type 1-57, 1-198, 3-47, 4-2 maxim, maximum values of inc= 2-28, 2-124 example of use 2-37 maxima.qt, limits file 4-10 maxsub=, maximum sub-records per record in levels data 3-48, 4-2 maxwt=, maximum weight 3-8 mcnemar, McNemar’s test for differences 3-83 McNemar’s test for differences 3-83 formula 3-91 mean, t-test on column means 3-164 Means analysis levels with 3-61 decimal places with 2-139 error variance 2-136 formula 2-156 least significant difference test 3-175 print maximum values of 2-37 print minimum values of 2-37 produced by list 1-139 sorted table of 3-141 standard deviation 2-136 standard error 2-136 suppress if have small base 2-20, 2-196
Means (continued) table of means 2-28, 2-36 test difference between 3-110, 3-112, 3-165 test for specific values 3-101 test paired differences between 3-101 t-test on column 3-164 two sample T-test for comparing 3-105 with fac= 2-140 with inc= 2-142 median, median values of inc= 2-124 Medians, see percentiles medint=, interpolation method for percentiles 2-28, 2-151, 2-152 mergedata, merge data from an external file 1-61 merges file 1-59, 4-4 mflip program 4-107 mfwaves file 4-111 min, minimum manipulation operator 3-26 minbase=, very small base for T stats 2-29, 3-150, 3-151 minim, minimum values of inc= 2-29, 2-124 example of use 2-37 Minimum weight, defining 3-8, 3-18 minwt=, minimum weight 3-8 Missing values assignments 1-173 checking for 1-175 counting with val 2-94 exporting as missing_ 2-124 in arithmetic expressions 1-173 in frequency distribution 1-139 processing in the edit 1-172 Quanvert 4-74 switch on/off in edit 1-172 switch processing on/off in tab section 2-29, 2-32 treat other values as 2-30, 2-42, 2-124 when found 1-172 with inc= 2-122 with n25;inc= 2-142 with pre/postweights 3-8 missing=, treat other values as missing 2-30, 2-42, 2-124 missing_, missing values 1-173, 1-174 missingincs, switch missing values processing on/off 1-172, 2-29, 2-32 with if 1-173 missingval, export missing data as missing_ 2-124 mul files 4-75, 4-94 Multicard records definition of 1-47 more than 100 columns per card 1-63 reading 1-49 writing 1-66 Multicodes convert to single codes 1-181 entering 1-14 printing 1-80
Multidimensional tables, example of 2-172 Multilingual surveys 2-13, 2-176 Quanvert (Windows) 4-77 Multiplication 1-26 Multiproject databases 4-101 add new variables to 4-107 axes with duplicate element texts 4-103 command file for 4-110 create in Quanvert (Windows) 4-101 create in Quanvert Text 4-101, 4-107 how common axes are combined 4-102 merging components 4-107 select projects from 4-111 things to check 4-106 Mutually exclusive elements in axes 3-72
N n statements, options on 2-112 n00, filtering within an axis 2-104 example of use 2-241 with n04 and n05 2-134 with redefined base 2-103 n01, basic counts 2-50 percentiles with inc= 2-144, 2-149 n03, text only 2-58 n04, total 2-133, 2-134 example of 2-121 n05, subtotal 2-133, 2-134 n07, average 2-137 n09, start new page 2-108 n10, base 2-57 n11, base, non-printing 2-57 in Quanvert Text 4-85 n12, mean 2-140 analysis levels with 3-61 suppress if has small base 2-20, 2-196 with ANOVA 3-110 with T-tests 3-101 with two sample T-tests 3-105 n13, sum of factors 2-144 n15, basic counts, non-printing 2-56 n17, standard deviation 2-136 suppress if has small base 2-20, 2-196 with ANOVA 3-110 with T-tests 3-101 with two sample T-tests 3-105 n19, standard error of the mean 2-136 alternative formula for 2-143 calculate using weighted figures 2-31 suppress if has small base 2-20, 2-196 with ANOVA 3-110 with T-tests 3-101 with two sample T-tests 3-105 n20, error variance of the mean 2-136 suppress if has small base 2-20, 2-196
n23, subheading 2-62 n25, component values for means etc. 2-140 manipulating components of 3-30 print in column axes 2-115, 2-140 print in row axes 2-116, 2-140 Quanvert databases 4-76 n30, percentiles 2-144, 2-145 n31, effective base 2-136, 2-153, 3-147, 3-148 n33, text continuation 2-66 NA, missing values in Quanvert 4-74 Name, refer to data fields & responses by 1-201 Named filters 2-11, 2-220 prevent creation of, in Quanvert Text 4-83 Named lists 1-44 Named variables, increasing limits for 4-9 namedalpha, alpha variables 4-73 namedinc, numeric variables 4-42, 4-49, 4-63 Quanvert 4-70 namevars=, number of named variables per run 4-9 Naming of variable files 4-15, 4-95 Naming variables 1-195, 4-15 axes 2-39 for use in Quanvert 4-68 Naming weighting matrices 4-71 nand, force same table number for and tables 2-178, 2-211 ndi, distribute element cases across axis 2-129 .ne., not equal to 1-30 Nested filter sections 2-221 Nested subheadings in column axes 2-63 net, start a net 2-67 Nets accumulation of suppressed elements in 2-72 cases in previous elements 2-48 cases not yet counted 2-48 collecting suppressed elements 2-14, 2-43 description of 2-67 for previous lines 2-69 for subsequent lines 2-67 percentaging with 2-73 sorting by net level 2-14, 2-43, 2-70, 3-128, 3-129 example 3-129, 3-134 switching off 2-70 with totals 2-134 netsm, small suppression with nets 2-14, 2-32, 2-43 netsort, sort nets by net level 2-14, 2-32, 2-43, 2-70, 3-129 New cards, creating, example of 2-53 New page, starting 2-108 Newman-Keuls test 3-112, 3-113, 3-165 formula 3-121, 3-184 News file for Quanvert (Windows) 4-90 nft, F and T statistics 3-108 formula 3-117 nk, Newman-Keuls test 3-112 nkl, Newman-Keuls test 3-165 No response, data-mapped variables 1-206
noacr100, suppress 100% on base row 2-32 noaxcount, switch off axcount 2-32 noaxttl, suppress table headings 2-32 nobounds, switch off array bounds checking 1-112 nocheck_, possible syntax errors not fatal 1-10 nocol, not a column element 2-115, 2-116 Quanvert databases 4-75 nodate, suppress date 2-32 nodp, suppress double precision calculations 2-32 nodsp, no double spacing 2-32, 2-45 noexport, don’t export element to SAS or SPSS 2-115 noexportsp, force an axis to be multicoded when exporting to SPSS 4-50 nofac, no factors 2-120 noflush, percentages not flush with absolutes 2-32 nograph, suppress graphics 2-32 nohigh, not a higher dimension element 2-115 Quanvert databases 4-75 noident, switch off default write parameters 1-83 noignorezeros, switch off ignore zeros 2-137 noinc, suppress incremental values 2-32, 2-45, 2-122 nomanipz, turn off manipz 3-31 nomissingincs, switch missing values processing off 2-32 nonetsm, no small suppression with nets 2-32 nonetsort, turn off sorting by net level 2-32, 2-70, 3-129 Non-identical statements, filing and retrieving 2-227 Non-standard data 1-63, 1-225, 2-250 nontot, exclude element from totals 2-116 nonz, print all-zero elements 2-45 nonzcol, print all-zero columns 2-32 nonzrow, print all-zero rows 2-32 nooverlapfoot, suppress overlap footnotes for T stats 2-15, 3-160 nopage, suppress page numbers 2-18, 2-32 #nopagebox, suppress border on laser printed tables 3-206 nopc, suppress percent signs 2-18, 2-32 noprint, suppress printing of tables 2-15 noround, element not force rounded 2-19, 2-32, 2-124 norow, not a row element 2-116 Quanvert databases 4-75 noscale, ignore scaling factor 2-32 nosmcol, print small columns 2-32 nosmrow, print small rows 2-32 nosmtot, print small totals 2-32 nosort, unsorted axis 2-45 nosort, unsorted element in sorted table 2-116, 3-129 nosort, unsorted table in sorted run 2-32 nosummary, keyword for secure databases 2-45, 4-119 .not., negate logical expressions 1-40
notauto, suppress automatic titles for T statistics 2-15 notbl, suppress table numbers 2-15 Notes file for Quanvert (Windows) 4-90 notitle, suppress table titles 2-32 notopc, suppress percent sign at top of column 2-32 notstat, exclude element from T stats 2-32, 2-44, 2-45, 2-116, 3-145 notstatdebug, no intermediate figures for T stats 2-32 notype, suppress output type message 2-24, 2-32 nouseeffbase, don’t use weighted counts for standard error 2-32 nowmerrors, suppress weighting errors 2-32, 3-10 nqtsas, convert Quantum data & spec to SAS 4-65 nqtspss, convert Quantum data & spec to SPSS 4-44 how differs from qtspss 4-44 options with 4-52 nsw, squared weight element 2-30, 2-49, 2-143, 3-147 ntd, significant net difference test 3-166 ntot, exclude element from totals 2-116, 2-130, 4-87 ntt, text-only net element 2-71 Null response, check for 1-205 numb, number of codes in a column 1-28 data-mapped variables 1-216 Numbering tables 2-210 with hitch and squeeze 2-196 Numbers 1-16 large, in tables 2-27 numcode, flag axis as single coded 2-44 numdir.qv, number of variables per directory 4-80 Numeric codes elements for 2-94, 2-97 exporting to SAS 4-63 exporting to SPSS 4-42, 4-49 Numeric conditions, defining with val 2-89 Numeric fields, missing values in edit section 1-172 Numeric variables compress in Quanvert Text 4-115 create for Quanvert 4-70 define which to flip 4-78 levels with 3-59 prevent creation of, in Quanvert Text 4-83 nums, unmanipulated cell counts file 1-229 nums.man, manipulated cell counts file 1-229, 4-22 nz, suppress all-zero elements 2-44, 2-116 apply to manipulated elements 3-31 nzcol, suppress all-zero columns 2-15, 2-32 apply to manipulated elements 3-31 nzrow, suppress all-zero rows 2-15, 2-32 apply to manipulated elements 3-31
O One dimensional chi-squared test 3-73 formula 3-89 One sample T-test 3-101 example 3-102, 3-103 formula 3-117 One sample Z-test 3-93 example 3-94 formula 3-115 One-way analysis of variance 3-110 example 3-110 formula 3-119 Online edit accepting records 1-165 canceling 1-167 correcting data 1-163 creating new cards 1-166 delete codes from column 1-163 deleting cards 1-166 displaying columns 1-162 e 1-163 ed 1-166 insert codes in column 1-163 overwrite column 1-163 redefine command names 1-167, 4-7 re-edit current record 1-166 reject record in 1-165 rt 1-165 s 1-163 split 1-161 terminate for current record 1-165 write 1-161 online, interactive data correction 1-160 op=, output types 2-15, 2-117 A/B percentage differences 2-124, 2-126 order of printing with 2-17 separate tables for different output types 2-17 with rotated grid axes 2-245 Open ended responses 4-74 Options defining run defaults 2-32 on a 2-8, 2-9 on add 2-186 on col 2-112 on div 2-187 on fld 2-112 on flt 2-9, 2-217 on l 2-40 on m 3-25 on n statements 2-112, 2-117 on sectbeg 2-9 on sid 2-180 on tab 2-9, 2-174 on und 2-180 on val 2-112 on wm 3-7 switching off 2-32
.or., logical or 1-40 or, logical operator for assignment 1-100 ord, line layout for table of contents 3-192 order=, alphanumeric card types 1-58 ori, justification of table titles 2-215 out1, compilation listing 1-226, 4-13, 4-23 out2, records failing write/require 1-228, 4-17, 4-23 out3, cumulative output summary 1-230, 4-22, 4-23 Output data file for require 4-18 Output options display width in Quanvert Text 4-85 order of with percent diffs 2-127 printing multicoded data 1-80 Output program 1-230 Output type descriptions, with hitch/squeeze 2-191 Output types defining 2-15 order of printing 2-17 print on tables 2-24 separate tables for different types 2-17 suppress printing of 2-24, 2-32 overlap, overlapping data with T stats 2-30, 3-159 overlapfoot, print overlap message for T stats 3-160 Overlapping data footnote about 3-160 special T stats 2-30, 3-159
P p, position cell counts 2-167 Packed databases 4-124 join split database 4-127 maximum size of 4-124 split file 4-127 unpack packed file 4-126 Packing databases 4-129 extra files for Quanvert (Windows) 4-91 <>, page numbers on tt statements 2-213 pag, page numbers 2-213 Page break suppress between all tables 2-191 suppress between split wide tables 2-190 suppress between tables 2-190 Page length 2-18 Page numbers switching off 2-213 user-defined, positioning with tt statements 2-213 with and 2-178 with hitch/squeeze 2-191 with multidimensional tables 2-174 Page width 2-18 set for Quanvert Text 4-85 suggestions for Quanvert 4-71 page, automatic page numbering 2-18, 2-32
Pages center tables on 2-22 number of lines on 2-18 numbering 2-18, 2-213 print more than one table on 2-188 start new 2-108 suppress numbering 2-18, 2-32 width of 2-18, 4-71, 4-85 Pagination automatic 2-105 order in split tables 2-107 precedence of rows & columns 2-19 paglen, page length 2-18 pagwid, page width 2-18 Quanvert Text 4-85 Paired preference test 3-170 formula 3-181 P-values for 3-173 Paired T-test 3-101 example 3-102, 3-103, 3-104 formula 3-117 Panel studies cross-referencing levels in 4-73 flip individual waves 4-112 link waves in 4-113 weighting in 4-113 Paper saving output 2-191 Parentheses, with data variables 1-197 Partial column replacement 1-92 pc, print percent signs 2-18, 2-32 PC-NFS, access Unix databases with 4-130 pcpos=, position of percentages 2-18, 2-117 pcsort, sort on percentages 2-18, 3-128 pczerona, print NA for percents with zero bases 2-19 –pd, directory for permanent files 1-232 Penetration tables creating with celllev= 3-58 creating with clear= 3-65 Percentage differences 2-125 flag table for 2-175 order of op= options with 2-127 Percentages 100% on base row 2-10, 2-16 against redefined base 2-16 column 2-16 example of 2-57 suppress small 2-21, 2-117 cumulative 2-16, 2-34 decimal places 2-11, 2-113 forced rounding to 100% 2-19 nets 2-73 position relative to absolutes 2-117 print NA for percents with zero bases 2-19 print percent signs 2-18, 2-22 printing flush with absolutes 2-11 redefined bases, example of 2-103 row 2-16 suppress small 2-21
Percentages (continued) side by side with absolutes 2-17 sorting 2-18, 3-128 suppress if have small base 2-20, 2-196 suppress percent signs 2-18, 2-32 suppressing for a single row 2-118 total 2-15 example of 2-33 suppress small 2-21, 2-117 with sid and und 2-182 Percentiles factors in reverse sequential order 2-146 from absolute values 2-30, 2-144, 2-149 from factors 2-144, 2-145, 2-146, 2-148 interpolation method 2-28, 2-146, 2-151 Permanent files, directory for 1-232 physpag, page numbering with hitch and squeeze 2-19, 2-32 Position of cell counts in tables 2-167 post=, postweighting 3-8, 3-16 inctext= invalid with 2-123 Postprocessors for Quanvert Text 4-82, 4-85 PostScript personalized code for laser printed tables 3-213 printing tables with 3-198 special characters in axes 3-199 suppress table of contents 3-211 user-definable characters 3-201 #postscript, start PostScript code 3-213 Postweights 3-6, 3-16 Pounds signs in tables 3-198 ppt, paired preference test 3-170 pre=, preweighting 3-8, 3-16 inctext= invalid with 2-123 Precoded response, check for 1-206 Prevent access to unweighted data in Quanvert 4-116 Preweights 3-5, 3-16 Print files define default output for 1-81 PostScript 3-198 turn off default parameters for 1-83 printed_, current record has been written out 1-67 Printing | and ! in element texts 3-202 Printing DNA and NA for missing values 4-74 Printing multicodes, output options 1-80 Printing records ident 1-81 qfprnt 1-84 require 1-145 write 1-65 printz, print all-zero tables 2-19 priority, force single-coding 1-104 private.c, C subroutine code file 4-11 private.o, compiled C subroutine code file 4-22 process, tabulate record 1-129 effect on Quanvert databases 4-75 example of 1-130, 2-100 position in edit 1-131
process, tabulate record (continued) with levels 3-63 Product tests, example of 2-247 Profiles postprocessors for Quanvert Text 4-82, 4-85 prevent use of in Quanvert Text 4-85 profopts, Quanvert Text postprocessor file 4-82, 4-85 Programs accum 1-229 bintab 4-174 colrep 4-27 components of 1-3 datapass 1-227 flipclean 4-97 format of 1-8 manip 1-229 manipclean 4-25 mflip 4-101, 4-107 nqtsas 4-65 nqtspss 4-44 pstab 3-198 q2cda 4-32 qout 1-230 qsj 4-125, 4-127 qteclean 4-25 qtext 4-167 qtlclean 4-25 qtoclean 4-25 qtsas 4-56 qtspss 4-38 quclean 4-25 qvclean 4-98 qvpack 4-129 qvpk 4-124 qvq2cda 4-32 qvsecure 4-116 qvshrinc 4-115 qvtr 4-126 qvtrans 4-130 qvupdate 4-119 storing 1-3 tabcon 3-189 textq 4-167 weight 1-229 Project selection file 4-111 Project text file 2-176, 4-23, 4-77 Projects, select from multiproject database 4-111 Prompts, translating for Quanvert Text 4-81 prop, t-test on column proportions 3-160 propcorr, continuity correction for t-test 3-161 propmean, t-test on column props & means 3-161 Proportions compare with significant net difference test 3-166 test for given values 3-93
Proportions (continued) test of differences between overlapping samples 3-99 between subsamples 3-97 t-test on column 3-160 two sample test of difference 3-95 pstab, create PostScript tables 3-198 ptf, translation file 2-176, 4-23, 4-77 Punch codes, ASCII equivalents 4-175 punch()=, symbolic parameters for codes 2-232 punchout.q, records written out by require 1-228, 4-18 pvals, print P-values for special T stats 3-159 P-values Newman-Keuls test 3-165 paired preference test 3-173 significant net difference test 3-169 t-test on column means 3-164 t-test on column proportions 3-163
Q q2cda, Quantum tables to CDA 2-82, 4-32 column headings 2-169 options with 4-35 qdi files 1-201, 1-217 qdiaxes, generate Quantum spec 1-217 qextras.lst file for Quanvert (Windows) 4-91 qfprnt, write out data in user-defined format 1-84 qnaire.txt file for Quanvert (Windows) 4-91 qotext.dat 4-82 qout, output program 1-230 qqhct, holecount file 4-17 qsj, split or join databases 4-125, 4-127 QTAXES, maximum number of axes per run 4-10 qteclean, delete files created by edit-only run 4-25 QTEDHEAP, to adjust edit statement complexity 4-10 QTELMS, max number of elements per axis 4-10 qtext, convert Quantum data to text format 4-167 QTFORM define special characters for laser printing 3-201 QTHEAP, max number of characters per axis 4-10 QTHOME, Quantum home directory 1-223 QTINCHEAP, max number of characters for inc= variables 4-10 QTINCS, maximum different inc= per run 4-10 QTINLISTHEAP, adjust definelist complexity 4-10 qtlclean, delete temporary compilation files 4-25 QTLEXCHARS, max size of long text strings 4-10 qtm_ex_, datapass program 1-227 QTMANIPHEAP, max size of expressions 4-10 QTNAMEVARS, max num of named variables 4-10 QTNOPAGE, suppress blank page 4-23 QTNOWARN, suppress license expiry warning 4-11 qtoclean, delete files created by quantum -o 4-25
qtsas, convert Quantum data & spec to SAS 4-56 qtspss, convert Quantum data & spec to SPSS 4-38 how differs from nqtspss 4-44 QTSPSSRC, nqtspss options 4-55 QTTEXTDEFS, max num of text symbolic params 4-10 Quancept 1-201, 1-205, 1-217, 1-218 Quantum program components of 1-3 format of 1-8 modify for Quanvert 4-68 options with 1-224 storing 1-3 which version to use 1-223 Quanvert 4-67 add with 4-71 allow creation of new axes 4-96 allow use of special T statistics 4-76 alpha variables 4-73, 4-74 axis titles 4-68 create database 4-93 create uniq_id variable 4-121 defining axes 4-68 effective base elements 3-149 export grids to SAS and SPSS 2-40, 2-249 files 4-94 files which must be present 4-96 filters 4-71 levels cross-reference files 4-94, 4-95 levels data 4-72, 4-73 missing values 4-74 n25 with 4-76 naming weighting matrices 4-71 norow/nocol/nohigh with 4-75 numeric variables 2-123, 3-60, 4-70 page width suggestions 4-71 prepare weighted databases 4-71 prevent access to weighted/unweighted data 4-116 process with 4-75 reduce disk space for database 4-75 respondent serial numbers 4-71 secure databases 2-45, 2-118 special T statistics 4-76 temporary directories 1-232 text at bottom of tables 4-71 trailer cards with 4-72 weighting matrices 3-8 Quanvert (Windows) 4-67 database icon 4-90 databases 4-86 languages 4-77 levels data 4-88 news file 4-90 notes file 4-90 packing extra files 4-91 percentiles 2-151 questionnaire file 4-91
Quanvert (Windows) (continued) set up to use .wav files 4-74 special T statistics 2-116 stats.ini file 4-86 variable groups 4-68 Quanvert Create Utility 4-67 Quanvert Menus 4-67 Quanvert Text 4-67 access rights to files 4-81 command availability 4-83 convert tables to CDA format 4-32, 4-35 creating large axes 4-83 display width 4-85 dummy axis 4-84 filtering on per-user basis 4-84 languages 4-84 multiproject databases 4-107 n11 4-85 page width 4-85 panel studies 4-73 postprocessors for profiles 4-82, 4-85 prevent alteration of texts 4-83 prevent creation of variables 4-83 prevent use of profiles 4-85 restrict access to axes and variables 4-84 row text width 4-85 translation file for prompts 4-82 translation of prompts 4-81 Quartiles, see percentiles quclean, delete temporary files 4-25 wildcard characters with 4-26 Questionnaire data information files 1-201 generate Quantum spec from 1-217 Questionnaire file for Quanvert (Windows) 4-91 qvclean, remove all files for a survey 4-98 qvgroup, groups in Quanvert Windows 4-68 qvlv files, levels cross-reference files for Quanvert 4-95 qvmerge, merge variables into existing database 4-100 qvpack, pack databases 4-129 alias file for 4-128 files required by 4-125 qvpk, pack databases 4-124 qvq2cda, Quanvert Text tables to CDA 4-32, 4-35 qvsecure, create secure Quanvert database 4-116 qvshrinc, compress .inc files 4-115 qvtext.dat 4-82 qvtr, unpack databases 4-126 qvtrans, convert unpacked files 4-130 alias file for 4-128 qvupdate, update Quanvert 4-119
R Random code, set into column 1-107 Random numbers, generating 1-29 random, generate random numbers 1-29 range, check arithmetic value of field 1-38 rangeb, test arithmetic value of field, with blanks 1-39 Ranges as conditions 2-92 Ranking see Sorting Ranks in Friedman test 3-85 Raw counts in secure databases 2-45, 2-118, 4-116, 4-118 read=, how to read data 1-54 Real numbers 1-16 copying into columns 1-98 saving in integer variables 1-96 significant figures with 1-16 Real variables 1-21 defining in subroutines 1-189 reset to zero 1-111 Reals and integers in the same expression 1-27 rec_acc, number of records accepted 1-125 rec_count, number of records read so far 1-52 rec_rej, number of records rejected 1-125 reclen=, record length 1-54, 2-250, 4-2 Record length 1-54, 1-78 in levels data 3-48 in non-std data files 2-250 with levels 4-2 Record structure, defining 1-53 Record type, defining 1-53 Records counting by axis name 2-25 distribute one element across the axis 2-129 examining with list 1-138 last in file, checking for 1-52 maximum cards in, in levels jobs 4-2 maximum sub-records per, in levels data 3-48 multicard with more than 100 cols per card 1-63 number read in so far 1-52 printing 1-145 rejecting from tables 1-145 types of 1-47 writing out parts of 1-68 Redefined base, percentaging against 2-16 Reformatting data 2-53 Refused, data-mapped variables 1-205 rej=, excluding elements from the base 2-125 reject, omit record from tables 1-124 with require 1-126 rejected_, current record has been rejected 1-125 Rejecting records from tables 1-124, 1-145 rep=, repeated card types 1-56 Repeated card types defining 1-56 in unusual order 1-52 missing 1-52
report, write data to report file 1-70 report=, report type for rim weighting 3-21 req=, required card types 1-56 require, validating codes and columns 1-144 action codes 1-145 actions when test fails 1-156 automatic error correction 1-151 checking codes in columns 1-148 checking exclusive codes 1-150 checking logical expressions 1-153 checking routing 1-155 checking type of coding 1-146 comments with 1-147 correcting errors from 1-160 data output file for 4-18 data validation 1-143 defaults with 1-152 equivalence of logical expressions 1-154 file of records failing 4-17 with if 1-157 Required card types 4-2 defining 1-56, 3-46 Reserved variables allread 1-50 card_count 1-52 firstread 1-51, 3-64 lastread 1-51, 3-65 lastrec 1-52 number of cards read so far 1-52 number of records accepted 1-125 number of records read so far 1-52 number of records rejected 1-125 printed_ 1-67 rec_acc 1-125 rec_count 1-52 rec_rej 1-125 record written to out 1-67 rejected_ 1-125 stop statement executed 1-127 stopped_ 1-127 this record rejected 1-125 thisread 1-50 with trailer cards 1-50 Reserved words with flip 4-70 Resetting variables between respondents 1-97 resp(#)=, substitution for data-mapped variables 1-215 Response, assign to data-mapped variable 1-209 return, go to tabulation section 1-126 with levels 3-50 with reject 1-126 rgrid, rotated grid tables 2-245 Rim weighting 3-3, 3-7, 3-19 efficiency, formula 4-19, 4-21 parameters file 4-5 report for each iteration 3-21 root mean square 3-4, 3-20, 4-20 summary information for 4-19
rim, rim weighting 3-7 rinc, rows take precedence when paginating large tables 2-19, 2-107 Risk level for special T stats 3-156 rj, reject record in online edit 1-165 rm, delete cards in online edit 1-166 Root mean square 3-4, 3-20 formula 4-20 Rotated grid tables 2-245 round, forced rounding to 100% 2-19, 2-32 Rounding to 100% 2-19, 2-32 Routing checking 1-155 using go to 1-118 with loops 1-124 Row manipulation 3-25 expressions for 2-119 ids for 2-115 Row offsets with added tables 2-184 Row percentages 2-16 force to round to 100% 2-19 suppress small 2-21 Row ranks in tables 2-16 row, row element 2-116, 2-140 Rows alignment of text in laser printed tables 3-203 basic counts 2-83 created with col 2-83 indenting folded text 2-13 reprint at top of continued tables 2-109, 2-114 sorting 2-20, 3-126 suppressing small 2-21 text width 2-20 text width in Quanvert Text 4-85 rpunch, set a random code into a column 1-107 rqd, default action code for require 1-146 rsort, sort rows 2-20, 3-125 rt, terminate online edit for current record 1-165 Run conditions, defining 2-8 Run defaults file see Default options file Run definitions file 4-3 Run file, generate from qdi file 1-220 Run ids for table manipulation 3-38
S s, assignment in online edit 1-163 s, side element for manipulation 3-41 Sample Quantum job 2-253 Sample tables cumulative percentages 2-34 hitch/squeeze 2-191 inc= 2-136 indices 2-35 means 2-36 multidimensional tables 2-172
Sample tables (continued) subtotals 2-136 suppress percents with small bases 2-199 total percentages 2-33 totals 2-136 totals and subtotals 2-121 Sample variance, see Error variance SAS convert Quantum data/spec to 4-56, 4-65 don’t export element to 2-115 export grid in Quanvert 2-40, 2-249 export missing data as missing_ 2-124 numeric data 4-63 scale=, scaling factor 2-30, 2-32, 2-117 Scaling factors, defining 2-30, 2-117 sectbeg, start nested table section 2-222 sectend, end nested table section 2-222 Secure Quanvert databases 2-45, 2-118, 4-116 security level 4-117 Segments, defining in an axis 3-69 sel, titles to print in table of contents 3-193 Semicolons in strings 1-90 in texts 2-51 ser=, serial number location 1-55, 3-47, 4-2 Serial number, location of 1-55, 3-47, 4-2 Serial numbers in Quanvert 4-71 *set, define T variable in data file 1-113 set, assignment statement 1-89 sid, tables side by side 2-180 column headings with 2-181 g statements with 2-181 options with 2-180 percentages with 2-182 sorting with 2-182 table headings with 2-181 side, identify rows in grid axes 2-238 side=, row text width 2-20 Quanvert Text 4-85 Significant net difference test 3-166 formula 3-181 P-values for 3-169 Similar projects, linking 4-101 Simplifying your spec by reformatting the data 2-53 Single class. chi-squared test 3-78 example of 3-80 formula 3-90 Single columns, checking contents of 1-32 Single quotes, with codes 1-14 Single-coded axes, testing for 2-40 Single-coded data, from multicoded data 1-181 Single-coded, flag axes as 2-44 Small suppression, switching off 2-32 smallbase=, small base for T stats 2-20, 3-150 smbase=, suppress percents/stats with small bases 2-20, 2-196 smcol, suppress small columns 2-20, 2-32 smflag=, flag cells with small bases 2-20
smrow, suppress small rows 2-21, 2-32 smsup+, sum of suppressed elements 2-118 smsupa=, suppress small absolutes 2-21, 2-117 smsupc=, suppress small percentages 2-21, 2-117 smsupp=, suppress small percentages 2-21, 2-117 smsupt=, suppress small total percentages 2-21, 2-117 smtot, suppress small base values 2-21, 2-32 sort, sort rows 3-125 sort, sorted table or axis 2-21, 2-32, 2-44 sortcol, sort on this column 2-118 Sorting axes 2-44 cancel global for one axis 2-45 column on which to sort 2-118 columns 2-10, 3-127 effect on text-only elements 3-137 end of subgroup 2-113 example with three levels 3-131, 3-135 manipulated elements 3-141 nesting subsorts 3-135 nets 3-128 on non-base element 3-126 percentages 2-18, 3-128 rows 2-20, 3-126 secondary levels, example of 3-129, 3-134 start of subgroup 2-118 statistical elements 3-141 tables 3-125 tables of means 3-141 tables of summary statistics 3-142 text-only rows as sublevel headings 3-138 totals 3-141 unsorted rows 3-129 with sid and und 2-182 within nets, example of 3-129, 3-134 Sound files in Quanvert (Windows) 4-74 spechar=, special characters 2-21 apply to manipulated elements 3-31 with statistics 2-22, 2-137 Special response, check for 1-206 Special T statistics continuity correction 3-161 effective base 2-119, 2-153, 3-147 elements with ntot 4-87 exclude elements from 2-32, 2-44, 2-45, 2-116 formulae 3-175 include elements in 2-31, 2-45, 2-119 intermediate figures for 2-22, 3-157 least sig. diff. test 3-175 levels 3-62, 3-149, 4-88 minimum effective base for 2-29 Newman-Keuls test 3-165, 3-184 nsw elements for 3-147 on weighted jobs 3-147 overlapping data with 2-30, 3-159 paired preference test 3-170
Special T statistics (continued) print overlap message 3-160 P-values for 3-159 Quanvert (Windows) 2-116, 4-86 Quanvert databases 4-76 requesting 3-154 selecting elements for 3-145 significant net difference test 3-166 small base for 2-20, 3-150 suppress automatic titles 2-15 suppress overlap footnotes 2-15, 3-160 titles for 3-151 t-test on column means 3-164 t-test on column proportions 3-160 t-test on column proportions & means 3-161 very small base for 3-150, 3-151 Specified other, check for 1-205 Split database files 4-127 joining 4-127 Split or join databases 4-125 split, create clean & dirty files 1-167 Splitting long column headings 2-163 SPSS convert Quantum data/spec to 4-38, 4-44 don’t export element to 2-115 export grids from Quanvert 2-40, 2-249 export missing data as missing_ 2-124 force an axis to be multicoded 2-41, 4-50 numeric data 4-42, 4-49 sqrt, square root manipulation operator 3-26 Square roots 1-183, 3-26 Squared weighting elements 2-30, 2-49, 2-143, 3-147 squeeze=, squeeze table onto one page 2-22, 2-188 how Quantum compares table texts 2-194 numbering printed pages 2-19 paper saving mode 2-191 print page numbers logically/physically 2-196 suppress column headings with 2-193 table texts with 2-191 with wide tables 2-190 Stages in a Quantum run 1-223 Standard deviation 2-136 formula 2-156 function of 2-139 produced by list 1-139 suppress if has small base 2-20, 2-196 weighted base less than 1.0 2-143 Standard error of the mean 2-136 calculate using weighted figures 2-31 formula 2-157 function of 2-139 in weighted jobs 2-143 suppress if has small base 2-20, 2-196 use weighted counts in 2-143 stat=, axis-level statistics 3-68 stat=, table-level statistics 2-31, 3-70 statdata, SAS data file 4-65
Statements aliases for 4-6 continuation of 1-9 length of 1-4 statistical 2-136 Statistical elements, in sorted tables 3-141 Statistical statements, list of 2-136 Statistics analysis levels with 3-61 exclude missing values from 2-142 F and T values 3-108 factors for 2-119 flag cells with small bases 2-20 general notes about 3-71 more than one per axis 3-69 more than one per table 3-71 Quanvert (Windows) 4-86 sorted summary tables of 3-142 spechar with 2-22, 2-137 squared weighting elements for 2-30, 2-49, 2-143, 3-147 summary table of requirements 3-72 table-level 2-31, 3-70 triangular array of 3-71 see also Special T statistics stats.ini file for Quanvert (Windows) 4-86 stop, terminate the edit 1-127 stopped_, stop statement executed 1-127 Storing your program 1-3 Strings of data constants 1-15 Strings, semicolons in 1-90 struct, define record structure 1-53 with levels data files 3-48 Subaxes end of group 2-77 naming groups on elements 2-79, 2-114 start of group 2-77 tables from 2-80 Subdirectories, store variables in 4-80 Subheadings in sorted tables 3-137 in tables 2-62 nesting in column axes 2-63 positioning above columns 2-65 underline 2-63 Subroutines arguments with 1-188 convert multicoded data to single coded 1-181 defining variables in 1-189 explode 1-181 fetch 1-178 fetchx 1-180 load data from look-up file 1-178, 1-180 using 1-177 writing your own 1-182 Subscription 1-23, 1-91 subsort, start secondary level sorting 2-118, 3-134
Substitute variable names in data-mapped variables 1-215 Subtotals 2-133, 2-134 sample table with 2-136 Subtraction 1-26 Sum of factors 2-144 formula 2-156 produced by list 1-139 sum_, sorted summary of datapass errors 1-228, 4-18, 4-23 summary, keyword for secure databases 2-45, 2-118, 4-116, 4-118 supp, suppress percentages for a row 2-118 Suppressed elements, sum of 2-118 Suppressing percentages with small bases 2-196 Suppressing small absolutes 2-117 Suppressing small column percentages 2-117 Suppressing small total percentages 2-117 Suppressing statistics with small bases 2-196 Suppressing tables 2-15 Suppressing the base on continuation pages 2-57 Switching off options 2-32 SYLK format files, creating 2-13 Symbolic parameters codes 2-232 columns 2-228, 2-229 function of 2-227 global values for 2-237 how Quantum interprets 2-229 in grid axes 2-239, 2-240 text 2-234 variables 2-235 with col and val 2-231
T T and F values with nft 3-108 T statistics see Special T statistics T variables, define in data file 1-113 t1, one sample/paired T-test 3-101 t2, two sample T-test for comparing means 3-105 <>, table numbers on tt statements 2-211 Tab section, jump to from edit 1-126 tab, name axes for table 2-171 options on 2-9, 2-174 tab_, tables file 1-230, 4-24 font numbers on right side 2-12 suppressing blank page 4-23 tabcent, center tables on the page 2-22 tabcon 3-189 Table numbers 2-210 justification of 2-211 suppress 2-15 switching off 2-211 user-defined, positioning with tt 2-211 with and 2-211
Table numbers (continued) with hitch/squeeze 2-191 Table of contents create 3-189 format file 3-190 format file, naming 3-194 suppress for PostScript tables 3-211 Table texts customizing 4-7 how Quantum compares with hitch/squeeze 2-194 see also Titles #tableleft, print table on left of page 3-211 Table-level statistics 2-31, 3-70 Tables adding 2-182 dummy elements 2-186 example of 2-184 sample program for 2-183 with column offsets 2-183 with row offsets 2-184 adjacent absolutes & percentages 2-17 analysis level for 2-10, 3-53 asterisks in 1-16 boxes in 3-206 center on page 2-22 column width 2-10, 2-41 combining 2-179 convert to CDA format 4-32, 4-35 dividing 2-186 double spacing in 2-11, 2-113 filtering 2-11 fonts for laser printing 2-11, 3-209 footnotes on 2-208 generating from qdi file 1-221 grids 2-238 incrementing cells by arithmetic values 2-120 introduction to 2-1 languages 2-13, 2-176 large numbers in 2-27 laser printed, justification of column headings 3-199 logos on 3-209 manipulating see Manipulation maximum values of inc=s 2-28 mean values of inc=s 2-28 minimum values of inc=s 2-29 more than one per page 2-188 multidimensional 2-171 naming axes for 2-171 numbering 2-210 numbering with and 2-211 numbers in 1-16 one beneath the other 2-180 order of titles 2-24 page numbers for 2-213 pagination order in 2-107 pagination with wide breakdowns 2-190
Tables (continued) paste one under the other 2-188, 2-195 placing side by side 2-180 position of cell counts in 2-167 position on page 3-211 pounds signs in 3-198 precedence of rows & columns when paginating 2-19 print base title last 2-10 print date on 2-10 print output type on 2-24 print text in main body of 2-61 reprint rows at top of continued 2-109, 2-114 row text width 2-20 separate for different output types 2-17 sorted means 3-141 sorted summary statistics 3-142 sorting 2-21, 3-125 suppress column headings 2-193 suppress if base less than given value 2-21 suppress numbering 2-15 suppress output type on 2-24 suppress page break between 2-190 suppress the base on continuation pages 2-57 suppressing all-zero 2-19, 2-32 suppressing printing 2-15 texts 2-5, 4-7 titles and other texts with hitch/squeeze 2-191 titles at bottom of page 2-210 titles for 2-181, 2-203 titles from axis names 2-10 titles from hd= text 2-22 titles to print first 2-23 titles to print last 2-24 types of data in 2-3 unsorted where default is sorted 2-32 updating cells at higher level than axes 3-54 using dummy data 3-43 using subaxes 2-80 vertical lines in 2-167 Tables file 4-22 tabn.syl, graphics files 4-22 Tabulation section C code in 3-123 components of 2-7 editing in 3-124 hierarchies in 2-8 Tabulation statements, format of 1-5 Tags, internal variable names 4-15, 4-24, 4-94 Target weighting 3-2, 3-7 target, target weighting 3-7 tb, table numbers 2-210 tba, left justify table numbers on first page 2-211 tbb, right justify table numbers on first page 2-211 tc.def, table of contents format file 3-194 –td, directory for temporary files 1-231 Temporary disk space for a run 4-178
Temporary files delete 4-25 directory for 1-231 summary of 4-23 Terminating the edit 1-127 Terminating the run 1-128 with tables 1-127 without tables 1-128 termwid, output width in Quanvert Text 4-85 Testing values of data-mapped variables 1-211 Text at the bottom of tables 4-71 break points 2-163 continuing in axes 2-66 indent element when split 2-115 numeric variables 2-27, 2-42, 2-123, 4-42 prevent alteration of, in Quanvert Text 4-83 print in body of table 2-61 row, indenting folded 2-13 symbolic parameters for 2-234 table titles 2-203 underlining on elements 2-119 Text files, convert to Quantum format 4-167 Text strings, limit for 4-10 Text variables, for Quanvert 4-73, 4-74 textconv, translate Quanvert Text prompts 4-82 textdefs, number of text symbolic parameters per run 4-9 Text-only elements 2-58 sorted tables 3-137 with col/val/fld/bit 2-88 textq, convert text to Quantum data format 4-167 texts.qt, customized text file 4-8 thisread, cards read during current read 1-50 title, table titles from axis titles 2-22, 2-32 Titles 2-203 altering default order 2-205 at bottom of page 2-210 creating from axis names 2-10 default printing order 2-205 defining for Quanvert 4-68 footnotes on tables 2-208 in laser printed tables 3-205 justification of 2-203, 2-215 order of 2-24 prevent alteration of in Quanvert Text 4-83 print base last 2-10 suppress automatic for special T statistics 2-15 T statistics 3-151 table description, customizing 4-7 table, from hd= 2-22 underlining 2-207 which to print first 2-23 which to print last 2-24 with hitch/squeeze 2-191 with nested filter sections 2-221 with sid and und 2-181 topc, percent signs at top of column 2-22, 2-32
toptext=, column text 2-118 Total percentages 2-15 example of 2-33 suppress small 2-21 Total, weighting to a given total 3-5, 3-7, 3-16 total=, weighting to a given total 3-7, 3-16 Totals 2-133 excluding elements from 2-116 in sorted tables 3-141 sample table with 2-136 with nets 2-134 Trailer cards correcting 1-170 definition of 1-48 preparing for Quanvert 4-72 reading 1-49 tabulating without levels 3-64 weighting 3-13 see also Repeated cards Translations 2-13, 2-176 Quanvert (Windows) 4-77 Quanvert Text 4-81 tstat, include element in T stats 2-31, 2-32, 2-45, 2-119, 3-145 tstat, request a special T stat 3-154 tstat.dmp, intermediate figures for T stats 3-157 tstatdebug, intermediate figures for T stats 2-22, 2-32, 3-158 tt, titles 2-203 in tabcon format file 3-191 with flt 2-218 with hitch/squeeze 2-191 tta, left justification of titles on first page 2-204 ttb, right justification of titles on first page 2-204 ttbeg=, titles to print first 2-23, 2-205 ttc, centered title 2-203 ttend=, titles to print last 2-24, 2-205 T-test exclude elements from 2-32 include elements in 2-31 on column means 3-164 formula 3-177 P-values for 3-164 on column proportions 3-160 formula 3-179 P-values for 3-163 one sample 3-101 example 3-102, 3-103 formula 3-117 in weighted runs 3-101 paired 3-101 example 3-102, 3-103, 3-104 formula 3-117 two sample 3-105 example of 3-106 formula 3-117 ttg, line up title with start of column headings 2-204 ttl, left justified title 2-203
ttn, indented title 2-204 ttord=, order for printing titles 2-24, 2-205 ttr, right justified title 2-203 T-variables 1-20 Two dimensional chi-squared test 3-76 example of 3-77 formula 3-89 Two sample T-test 3-105 example 3-106 formula 3-117 Two sample Z-test on proportions 3-95 tx=, text-only element with col/val/fld/bit 2-88 type, print output types 2-24, 2-32 Types of output 2-15
U u, underline column headings 2-168 und, tables one under the other 2-180, 2-181, 2-182 Underlining column headings 2-168 column headings with pstab 3-203 element texts 2-119 for separate column texts in q2cda 2-169 in laser printed tables 3-203 in table of contents 3-190 subheadings 2-63 titles 2-207 Uniform distribution, test for 3-73 uniq_id, unique respondent numbers for Quanvert 4-121 uniqid=, in element texts 4-105 axes generated by qdiaxes 1-222 Unique ID text, and data-mapped variables 1-209 Unknown file formats for databases 4-128 unl, underline text 2-63, 2-119, 2-207 Unpack databases 4-126 Unweighted data, prevent Quanvert access 4-85, 4-116 uplev=, axis update level 2-45, 3-56 comparison with celllev 3-58 example of intermediate file with 3-57 statistics with 3-61 update base for all records at anlev= level 2-124, 3-57 with grids 2-245 useeffbase, use weighted counts for standard error 2-31, 2-32, 2-143 *usemap, define data-mapping file 1-204 User-definable limits 4-9 Quanvert Text 4-83 Users file, for Quanvert Text 4-83
V
val, elements with numeric conditions 2-89 abbreviated notation for arithmetic equality 2-91 arithmetic equality with 2-89 count missing values 2-94 data-mapped variables 1-205, 1-213 options on 2-112 ranges with 2-92 text-only elements 2-89 var(#)=, substitution for data-mapped variables 1-215 var, count elements for data-mapped variables 1-213 Variable groups for Quanvert (Windows) 4-68 Variables add new to multiproject directories 4-107 add to database 4-99 alpha for Quanvert 4-73, 4-74 blank out 1-111 C array 1-18 checking contents of 1-31, 1-34 comparing 1-31, 1-35 data 1-18 data-mapped 1-201 defaults 1-198 defining in subroutines 1-189 external 1-199 integer 1-20 reset to zero 1-111 lastrec 1-52 local 1-199 naming 1-195, 4-15 naming in program 1-199 naming of files 4-95 numeric for Quanvert 4-70 passing with call 1-190 prevent creation of, in Quanvert Text 4-83 real 1-21 reset to zero 1-111 replacing in a database 4-98 resetting between respondents 1-97 restrict access in Quanvert Text 4-84 storing in subdirectories 4-80 subscription of 1-23 symbolic parameters for 2-235 T, define in data file 1-113 types of 1-17 see also Reserved variables Variables file 1-196, 4-1 varname=, variable name alpha variables 4-73 numeric variables 4-70 weighting matrices 3-8, 4-71 vartext=, description of variable 4-70, 4-73 Vectors in manipulation expressions 3-27, 3-35 Verbatim responses 4-74 Version of Quantum, selecting 1-223 Vertical lines in tables 2-167
W
.wav files 4-74 Waves flipping for panel studies 4-112 link into a single database 4-113 Weighted data, prevent access to 4-116 Weighted databases, prepare for Quanvert 4-71 Weighted panel studies 4-113 Weighting anlev= with 3-12 c= with 3-13 characteristics not known 3-19 declare in axes 3-14 defining characteristics for 3-7 effective base 2-119, 2-153, 3-147 entering weights 3-7 error handling 2-31, 3-10 error variance with 2-143 example of 3-9, 3-10 exclude respondents from 3-6 factors 3-2 frequency distributions 1-142 grid tables 2-246 holecounts 1-136 input 3-5 methods of 3-1 missing values with pre/postweights 3-8 multidimensional matrices 3-13 name matrix to use 2-31, 2-125, 3-23 naming matrices 4-71 number of matrices 3-7 one dimensional T-tests with 3-101 options for 3-7 postweights 3-6 preweights 3-5 program 1-229 proportions 3-5 Quanvert 4-71 report at each rim weighting iteration level 3-21 rim 3-3, 3-19 special T stats with 3-147 standard error with 2-143 summary information 4-19 targets 3-2 to a given total 3-5, 3-7, 3-16 trailer cards 3-13 unweighted records 3-2 uses of 3-1 using weights from record alone 3-17 see also Rim weighting Weighting report file 1-229, 4-19 weightrp, weighting report file 1-229, 4-19, 4-23 Weights abbreviating lists of 3-9 copying into data file 3-24 entering 3-9 for elements 3-14, 3-15
Weights (continued)
  minimum 3-18
  switching off 3-23
  using 3-23
Whole numbers 1-16
Wide tables, print all on one page 2-190
Width of terminal display for Quanvert Text 4-85
Wildcard characters with quclean, qteclean, qtoclean & manipclean 4-26
Windows-based Quanvert see Quanvert (Windows)
wm, define a weight matrix 3-7
wm=, weighting matrix to use 2-31, 2-125, 3-23
wmerrors, weighting error handling 2-31, 2-32, 3-10
write, write out records 1-65
  as part of another statement 1-66
  correcting errors from 1-160
  creating data files 1-69
  default output file 1-67
  define default print parameters for 1-81
  defining the file type 1-78
  file of records failing 4-17
  override use of ruler with ident 1-83
  specifying an output file 1-67
  turn off default print parameters 1-83
  with explanatory texts 1-67
  writing selected fields only 1-68
wtfactor=, factor weighting 3-15
wttarget=, target weighting 3-14
wttran, copy weights into data 3-24
X
xor, logical operator for assignment 1-101
X-variables 1-22

Z
z1, one sample Z-test on proportions 3-93
z2, two sample Z-test on proportions 3-95
z3, Z-test on subsample proportions 3-97
z4, Z-test on overlapping samples 3-99
Zero
  exclude from averages 2-137
  special characters for 2-21
  suppressing columns 2-15
  suppressing elements 2-32
  suppressing rows 2-15
  suppressing tables 2-19, 2-32
Z-test
  one sample 3-93
    example of 3-94
    formula 3-115
  overlapping samples 3-99
    example of 3-100
    formula 3-116
  subsample proportions 3-97
    example of 3-96, 3-98
    formula 3-116
  two sample on proportions 3-95
    formula 3-115