U.S. Census Bureau Statistical Quality Standards

Transcript

Reissued July 2013

U.S. Census Bureau Statistical Quality Standards
The leading source of quality data about the nation's people and economy.

U.S. CENSUS BUREAU

Thomas L. Mesenbourg, Jr., Deputy Director and Chief Operating Officer
Nancy A. Potok, Associate Director for Demographic Programs
Arnold A. Jackson, Associate Director for Decennial Census
Frank A. Vitrano, Associate Director for 2020 Census
Ted A. Johnson, Associate Director for Administration and Chief Financial Officer
Steven J. Jost, Associate Director for Communications
Roderick Little, Associate Director for Research and Methodology
David E. Hackbarth (Acting), Associate Director for Field Operations
Brian E. McGrath, Associate Director for Information Technology and Chief Information Officer
William G. Bostic, Jr., Associate Director for Economic Programs

METHODOLOGY AND STANDARDS COUNCIL

Roderick Little, Chair, Methodology and Standards Council, and Associate Director for Research and Methodology
Ruth Ann Killion, Chief, Demographic Statistical Methods Division
Xijian Liu, Chief, Office of Statistical Methods and Research for Economic Programs
David Whitford, Chief, Decennial Statistical Studies Division
Tommy Wright, Chief, Center for Statistical Research and Methodology

TABLE OF CONTENTS

PREFACE ... i
PLANNING AND DEVELOPMENT ... 1
A1 Planning a Data Program ... 2
A2 Developing Data Collection Instruments and Supporting Materials ... 6
Appendix A2: Questionnaire Testing and Evaluation Methods for Censuses and Surveys ... 12
A3 Developing and Implementing a Sample Design ... 23
COLLECTING AND ACQUIRING DATA ... 27
B1 Establishing and Implementing Data Collection Methods ... 28
B2 Acquiring and Using Administrative Records ... 32
CAPTURING AND PROCESSING DATA ... 34
C1 Capturing Data ... 35
C2 Editing and Imputing Data ... 38
C3 Coding Data ... 41
C4 Linking Data Records ... 44
PRODUCING ESTIMATES AND MEASURES ... 47
D1 Producing Direct Estimates from Samples ... 48
D2 Producing Estimates from Models ... 51
D3 Producing Measures and Indicators of Nonsampling Error ... 56
Appendix D3-A: Requirements for Calculating and Reporting Response Rates: Demographic Surveys and Decennial Censuses ... 61
Appendix D3-B: Requirements for Calculating and Reporting Response Rates: Economic Surveys and Censuses ... 70
ANALYZING DATA AND REPORTING RESULTS ... 82
E1 Analyzing Data ... 83
E2 Reporting Results ... 87
Appendix E2: Economic Indicator Variables ... 96
E3 Reviewing Information Products ... 98
Appendix E3-A: Event Participation Approval Form and Instructions ... 106
Appendix E3-B: Statistical Review Contacts ... 108
Appendix E3-C: Policy and Sensitivity Review Checklist for Division and Office Chiefs ... 109
RELEASING INFORMATION ... 112
F1 Releasing Information Products ... 113
Appendix F1: Dissemination Incident Report ... 119
F2 Providing Documentation to Support Transparency in Information Products ... 133
F3 Addressing Information Quality Guideline Complaints ... 137
Appendix F3: Procedures for Correcting Information that Does Not Comply with the Census Bureau's Information Quality Guidelines ... 139
SUPPORTING STANDARDS ... 141
S1 Protecting Confidentiality ... 142
S2 Managing Data and Documents ... 144
WAIVER PROCEDURE ... 147
GLOSSARY ... 151

Preface

1. Introduction

Purpose

This document specifies the statistical quality standards for the U.S. Census Bureau. As the largest statistical agency of the federal government, the Census Bureau strives to serve as the leading source of quality data about the nation's people and economy. The Census Bureau has developed these standards to promote quality in its information products and the processes that generate them. These standards provide a means to ensure consistency in the processes of all the Census Bureau's program areas, from planning through dissemination. By following these standards, the Census Bureau's employees and contractors will ensure the utility, objectivity, and integrity of the statistical information provided by the Census Bureau to Congress, to federal policy makers, to sponsors, and to the public.

Background

In 2002, the United States Office of Management and Budget (OMB) issued Information Quality Guidelines (OMB, Guidelines for Ensuring and Maximizing the Quality, Objectivity, Utility, and Integrity of Information Disseminated by Federal Agencies, February 22, 2002, 67 FR 8452-8460), directing all federal agencies to develop their own information quality guidelines. In October 2002, the Census Bureau issued its information quality guidelines (U.S. Census Bureau, U.S. Census Bureau Section 515 Information Quality Guidelines, 2002). These guidelines established a standard of quality for the Census Bureau and incorporated the information quality guidelines of the OMB and the Department of Commerce, the Census Bureau's parent agency.

Following the OMB's information quality guidelines, the Census Bureau defines information quality as an encompassing term comprising utility, objectivity, and integrity. Our definition of information quality is the foundation for these standards.

Utility refers to the usefulness of the information for its intended users.
We assess the usefulness of our information products from the perspective of policy makers, subject matter users, researchers, and the public. We achieve utility by continual assessment of customers' information needs, anticipation of emerging requirements, and development of new products and services.

 The statistical quality standards related to utility include: Planning a Data Program (A1), Developing Data Collection Instruments and Supporting Materials (A2), Developing and Implementing a Sample Design (A3), Acquiring and Using Administrative Records (B2), Reviewing Information Products (E3), Releasing Information Products (F1), and Providing Documentation to Support Transparency in Information Products (F2).

1 Please note that this document contains some Intranet links that are accessible only within the U.S. Census Bureau.

Objectivity focuses on whether information is accurate, reliable, and unbiased, and is presented in an accurate, clear, complete, and unbiased manner. Objectivity involves both the content of the information and the presentation of the information. It requires complete, accurate, and easily understood documentation of the sources of data, with a description of the sources of errors that may affect the quality of the information, when appropriate.

 The statistical quality standards related to objectivity include: Developing Data Collection Instruments and Supporting Materials (A2), Developing and Implementing a Sample Design (A3), Establishing and Implementing Data Collection Methods (B1), Acquiring and Using Administrative Records (B2), Capturing Data (C1), Editing and Imputing Data (C2), Coding Data (C3), Linking Data from Multiple Sources (C4), Producing Direct Estimates from Samples (D1), Producing Estimates from Models (D2), Producing Measures and Indicators of Nonsampling Error (D3), Analyzing Data (E1), Reporting Results (E2), Reviewing Information Products (E3), Releasing Information Products (F1), Providing Documentation to Support Transparency in Information Products (F2), Addressing Information Quality Complaints (F3), and Managing Data and Documents (S2).

Integrity refers to the security of information – protection of the information from unauthorized access or revision, to ensure that the information is not compromised through corruption or falsification. Several federal statutes and Census Bureau policies govern the protection of information, most notably Title 13 and Title 26.

 Protecting Confidentiality (S1) directly addresses issues concerning the integrity of the data. All the statistical quality standards contain requirements for protecting information from unauthorized access or release.
In September 2006, the OMB issued Standards and Guidelines for Statistical Surveys, which specify requirements for federal statistical agencies to ensure that their information products satisfy the information quality guidelines. The OMB standards are not intended to describe all the efforts that an agency may undertake to ensure the quality of its information. These Census Bureau statistical quality standards provide additional guidance that focuses on the Census Bureau's statistical programs and activities and that addresses the Census Bureau's unique methodological and operational issues.

2. Scope

The Census Bureau's statistical quality standards apply to all information products released by the Census Bureau and the activities that generate those products, including products released to the public, sponsors, joint partners, or other customers. All Census Bureau employees and Special Sworn Status individuals must comply with these standards; this includes contractors and other individuals who receive Census Bureau funding to develop and release Census Bureau information products.

The Census Bureau often conducts data collections and performs associated work for sponsoring agencies on a reimbursable basis. The work performed by the Census Bureau under such contracts is in the scope of these statistical quality standards, whether performed under Title 13, Title 15, or another authorization. If a sponsor's requirements or funding constraints result in noncompliance with these standards, the Census Bureau's manager for the program must obtain a waiver, except where noted in the standards.

For the purposes of these standards, information products include printed, electronic, or digital formats (e.g., Web, CD, DVD, and tape) of: news releases; Census Bureau publications; working papers (including technical papers or reports); professional papers (including journal articles, book chapters, conference papers, poster sessions, and written discussant comments); abstracts; research reports used to guide decisions about Census Bureau programs; public presentations at external events (e.g., seminars or conferences); handouts for presentations; tabulations and custom tabulations; public-use data files; statistical graphs, figures, and maps; and the documentation disseminated with these information products.

Exclusions to the Scope

None of the following exclusions apply to Statistical Quality Standard S1, Protecting Confidentiality, or the requirements for protecting confidentiality in the individual standards.

These standards do not apply to:
 Information products intended for internal Census Bureau use that are not intended for public dissemination.
 Information products delivered to agencies within the Department of Commerce for their internal use.
 Internal procedural or policy manuals prepared for the management of the Census Bureau and the Department of Commerce that are not intended for public dissemination.
 Information products that result from the Census Bureau's administrative or management processes.
 Information products released in response to a Freedom of Information Act request.
 Documents intended only for communications between agencies, within agencies, or with individuals outside the Census Bureau if the documents contain no data and do not discuss analyses or methodological information.
 Informal communications between Census Bureau employees and colleagues in other organizations that do not disseminate Census Bureau data or results based on Census Bureau data.
 Information products delivered to sponsors or oversight agencies, including the Congress, relating to the management of Census Bureau programs.
 Information products authored by external researchers at the Census Bureau's Research Data Centers.
 Information products that use Census Bureau data and are authored by Special Sworn Status individuals employed by other federal agencies or organizations for their agencies (e.g., SSA, GAO, and CBO).

 Information products generated by other agencies or organizations to which the Census Bureau has given only technical assistance or training. However, Census Bureau staff providing such assistance should consider these standards as guidelines.
 Information products developed from surveys intended to measure Census Bureau customers' or users' satisfaction with Census Bureau products or to measure Census Bureau employees' job satisfaction. However, any public release of results of such surveys must explain that they do not meet the Census Bureau's statistical quality standards because the respondents are self-selected and may not be representative of all customers, all users, or all employees.
 Communications released via social media. Social media must not be used to disseminate data or statistical analyses not previously cleared for external release. Such communications must follow the Census Bureau's Policies and Procedures Governing the Use of Social Media.

The scope statements of the individual standards provide additional information to clarify the scope and to list exclusions specific to each standard.

3. Responsibilities

All Census Bureau employees and Special Sworn Status individuals are responsible for following the Census Bureau's statistical quality standards in their work to develop, deliver, and release information products.

Responsibilities of the Program Areas and the Supporting Directorates and Divisions

Divisions and offices within the Economic Programs, Demographic Programs, and Decennial Census Directorates plan, process, analyze, and disseminate data. The Census Bureau's Center for Statistical Research and Methodology supports all three directorates in areas of statistical, methodological, behavioral, and technological research and development. The Field Operations Directorate and Information Technology Directorate collect, transmit, and process data for demographic household surveys, the Decennial Census, the Economic Census and surveys, and the Government Census and surveys. The Census Bureau's other directorates and divisions provide various types of administrative, logistical, and strategic support to the program areas.

The responsibilities of the program areas and the supporting directorates and divisions with respect to these statistical quality standards include:
 Ensuring that the necessary resources are available to comply with the statistical quality standards.
 Implementing and verifying compliance with the statistical quality standards. Guidance on implementing the standards and verifying compliance can be obtained from the program area's Methodology and Standards (M&S) Council representative as shown in Table 1.

Table 1. M&S Council Representatives

Program Directorate                 M&S Council Representative
Decennial Census Directorate        Chief, Decennial Statistical Studies Division
Demographic Programs Directorate    Chief, Demographic Statistical Methods Division
Economic Programs Directorate       Chief, Office of Statistical Methods and Research for Economic Programs
All other directorates              Chief, Center for Statistical Research and Methodology

 Reporting situations where requirements of the standards might need revision (e.g., a program's processes or products may have changed so that some requirements of the statistical quality standards may also need to be revised).
 Following the procedure to obtain a waiver if unable to comply with one or more of the statistical quality standards.

Responsibilities of the Methodology and Standards Council

The Census Bureau's M&S Council consists of the division and office chiefs of the statistical methodology groups in the various program areas. The Council advises the Census Bureau's Program Associate Directors on policy and issues affecting research and methodology for Census Bureau programs. The Council also ensures the use of sound statistical methods and practices, and facilitates communication and coordination of statistical methodology and research throughout the Census Bureau and the broader statistical community.

The responsibilities of the M&S Council with respect to these statistical quality standards include:
 Promoting awareness of and compliance with the Census Bureau's statistical quality standards.
 Reviewing waiver requests and forwarding their recommendation for approval or denial of the waiver to the Program Associate Director.
 Conducting periodic reviews and evaluations of the standards to study how well the standards are working and to identify difficulties in implementation.
 Maintaining an archive of evaluation findings, waiver requests, and suggestions for improvement to inform future revisions of the Census Bureau's statistical quality standards.
 Updating the standards as needed.

The responsibilities of the individual M&S Council members for their directorates (see Table 1) include:

 Provide guidance on interpreting the standards to the programs in their directorates and to directorates that participate in conducting and implementing their programs (e.g., the Field Operations Directorate).
 Provide assistance in implementing and verifying compliance with the standards to the programs in their directorates and to directorates that participate in conducting and implementing their programs (e.g., the Field Operations Directorate).

4. Interpreting and Using the Standards

The complete set of statistical quality standards includes process standards (designated with "A" through "F") and supporting standards (designated with "S"). The process standards are organized according to the different processes associated with developing and releasing information products. The organizational framework for these process standards is:

A. Planning and Development
B. Collecting and Acquiring Data
C. Capturing and Processing Data
D. Producing Estimates and Measures
E. Analyzing Data and Reporting Results
F. Releasing Information

The two supporting standards address issues that cut across all the process standards. The supporting standards are S1, Protecting Confidentiality, and S2, Managing Data and Documents.

The standards are written at a broad level of detail, to apply to all the Census Bureau's programs and products. They describe what is required and do not delineate procedures for how to satisfy the requirements. Each standard has a list of key terms that are used in the standard. These terms are defined in the glossary to provide clarification on their use in relation to these standards.

To help managers interpret the requirements of the standards, examples are often provided. These examples are intended to aid the program manager in understanding the requirements and to provide guidance on the types of actions that may be useful in satisfying the requirements.
It is important to note that the examples listed under a requirement are not all-inclusive; nor will every example apply to every program or product. Finally, there may be more than one acceptable way to comply with a requirement. That is, several equally acceptable actions might be performed to comply with a requirement, rather than only one unique set of actions.

Program managers must use their judgment to determine which actions must be performed for their program to comply with a requirement. The program manager is expected to carry out all the actions needed to comply with a requirement. This may include performing activities not listed in the examples. The expectation is that program managers will balance the importance of the information product and the size of the project with the constraints of budget, schedule, and resources when determining how to comply with the requirements.

If the program manager believes it is not feasible to comply with a requirement, the program manager must request a waiver. The Waiver Procedure provides a standard mechanism to exempt a program from compliance with a statistical quality standard when such an exemption is warranted. The Waiver Procedure also promotes proper management and control in implementing the standards. Finally, the Waiver Procedure ensures that appropriate documentation of exceptions to the standards is generated and maintained to inform future revisions of the statistical quality standards.

5. History of the Development of the Standards

The Census Bureau has a long history of delivering high quality data about the nation's people and economy. Technical Paper 32, Standards for Discussion and Presentation of Errors in Data, issued in March 1974, is an example of the Census Bureau's commitment to promote transparency in the quality of the information and data products it delivers to the public and to its sponsors.2

Over the years, the Census Bureau has developed additional guidance regarding the quality of its products and in 1998 began to formalize its efforts to ensure quality in its products and processes. The Census Bureau began this more formal approach by instituting a quality program based on a foundation of quality principles, standards, and guidelines. The paper, Quality Program at the U.S. Census Bureau, describes the beginnings of the Census Bureau's Quality Program (Proceedings of the International Conference on Quality in Official Statistics, 2001).

In 2001, the Census Bureau issued the first of eleven new statistical quality standards. Several of these standards updated the content of Technical Paper 32. Over the next four years, ten more standards were developed.
In 2005, after conducting a benchmarking study of the standards of other statistical organizations, the M&S Council initiated a more coordinated approach for developing a comprehensive set of statistical quality standards. While the existing standards were a good start, this approach aimed to improve consistency and cohesion among the standards, as well as to reflect all the requirements of the OMB's Standards and Guidelines for Statistical Surveys in the context of the Census Bureau's programs, products, and processes.

The new approach to developing statistical quality standards relied on five key components: 1) a dedicated staff to develop the standards, rather than ad hoc teams; 2) contractor assistance; 3) multiple reviews of draft standards to obtain feedback from the program areas; 4) focus groups to obtain more thoughtful and attentive input from the program areas; and 5) a documented, consistent development process.

The Census Bureau began developing these standards in May 2006. The process was completed in May 2010, when the Census Bureau issued these statistical quality standards.

2 Technical Paper 32 is available from the U.S. Government Printing Office, Washington, DC 20401. It was revised in: Gonzalez, M., Ogus, J., Shapiro, G., and Tepping, B., "Standards for Discussion and Presentation of Errors in Survey and Census Data," Journal of the American Statistical Association, Vol. 70, No. 351, Part 2 (Sep., 1975), pp. 5-23. http://www.jstor.org/stable/2286149

PLANNING AND DEVELOPMENT

A1 Planning a Data Program
A2 Developing Data Collection Instruments and Supporting Materials
Appendix A2: Questionnaire Testing and Evaluation Methods for Censuses and Surveys
A3 Developing and Implementing a Sample Design

Statistical Quality Standard A1
Planning a Data Program

Purpose: The purpose of this standard is to ensure that plans are developed when initiating a new or revised data program.

Scope: The Census Bureau's statistical quality standards apply to all information products released by the Census Bureau and the activities that generate those products, including products released to the public, sponsors, joint partners, or other customers. All Census Bureau employees and Special Sworn Status individuals must comply with these standards; this includes contractors and other individuals that receive Census Bureau funding to develop and release Census Bureau information products.

In particular, this standard applies to planning data programs (e.g., surveys, censuses, and administrative records programs) that will release information products to the public, to sponsors, or to other customers.

Exclusions: The global exclusions to the standards are listed in the Preface. No additional exclusions apply to this standard.

Note: Specific planning requirements for each stage of the data program are addressed in other statistical quality standards. For example, Statistical Quality Standard E1, Analyzing Data, includes requirements for planning data analyses.

Key Terms: Administrative records, bridge study, business identifiable information, census, data collection, data program, information products, microdata, personally identifiable information, reimbursable project, response rate, sample design, sample survey, stakeholder, target population, and users.

Requirement A1-1: The provisions of federal laws (e.g., Title 13, Title 15, and Title 26) and Census Bureau policies and procedures on privacy and confidentiality (e.g., Data Stewardship Policies) must be followed in planning and designing any programs that will collect personally identifiable information or business identifiable information. (See Statistical Quality Standard S1, Protecting Confidentiality.)

Requirement A1-2: An overall program plan must be developed that includes the following:

1. A justification for the program, including:
   a. A description of the program goals.
   b. A description of stakeholder requirements and expectations.
   c. A description of the intended information products (e.g., tabulations, confidential microdata, or public-use files).
   d. A description of revisions to an ongoing program, including:
      1) Changes to key estimates, methods, or procedures.
      2) The usefulness of the revisions for conducting analyses and for informing policymakers and stakeholders.

      3) Planned studies to measure the effects of the changes to key estimates and time series (e.g., overlap samples or bridge studies).
   e. For sample survey and census programs (i.e., programs that do not rely solely on administrative records), a description of the steps taken to prevent unnecessary duplication with other sources of information, including a list of related (current and past) federal and non-federal studies, surveys, and reports that were reviewed.

   Notes:
   (1) The Office of Management and Budget's (OMB) Guidance on Agency Survey and Statistical Information Collections provides information on preparing OMB clearance packages for surveys used for general purpose statistics or as part of program evaluations or research studies.
   (2) The OMB's Standards for Maintaining, Collecting, and Presenting Federal Data on Race and Ethnicity provides standards for programs collecting data on race and ethnicity.
   (3) The OMB's Standards for Defining Metropolitan and Micropolitan Statistical Areas provides standards for collecting, tabulating, and publishing statistics for geographic areas.

2. An initial schedule that identifies key milestones for the complete program cycle from planning to data release. Generally, the program cycle includes the following stages:
    Planning a data program (Statistical Quality Standard A1).
    Developing the data collection instrument and sample design (Statistical Quality Standards A2 and A3).
    Establishing and implementing data collection methods and acquiring administrative records (Statistical Quality Standards B1 and B2).
    Capturing and processing data (Statistical Quality Standards C1, C2, C3, and C4).
    Producing estimates and quality measures (Statistical Quality Standards D1, D2, and D3).
    Analyzing data and reporting results (Statistical Quality Standards E1 and E2).
    Reviewing information products (Statistical Quality Standard E3).
    Releasing information products (Statistical Quality Standards F1 and F2).
Note: Managers responsible for each stage of the program generally are expected to prepare milestone schedules for their stages. The overall program manager can use these individual schedules to prepare the overall milestone schedule.

3. An initial, overall cost estimate that identifies the resources needed and itemizes the costs to carry out the program.

Note: Managers responsible for each stage of the program generally are expected to prepare cost estimates for their stages. The overall program manager can use these estimates to prepare the overall cost estimate.

4. A description of deliverables to be received as the result of any contracts originated by the Census Bureau, including any documentation to be provided by contractors. Examples of such deliverables include:
    Computer software or hardware.
    Data files.
    Advertising or outreach services and materials.
    Specifications for software or hardware.
    Quality control or quality assurance procedures, criteria, and results.

Sub-Requirement A1-2.1: When the sponsor of a reimbursable project requests the Census Bureau to carry out activities that do not comply with our Statistical Quality Standards or deliver products that do not conform with the standards, the program manager must:

1. Obtain a waiver to carry out the noncompliant activities or to deliver the nonconforming products before agreeing to conduct the project. (See the Waiver Procedure for the procedures on obtaining a waiver.)
2. Obtain from the sponsor a copy of the clearance package approved by the OMB, including any associated terms of clearance.
3. Deliver to the sponsor written documentation that describes the following for each area of noncompliance:
   a. The details regarding the noncompliance issue.
   b. The consequences of performing the noncompliant work.
   c. The actions recommended by the Census Bureau that would result in compliance.

Requirement A1-3: For sample survey and census programs, a preliminary survey design must be developed that describes the:
1. Target population and sampling frame.
2. Sample design.
3. Key data items and key estimates.
4. Response rate goals.
5. Data collection methods.
6. Analysis methods.

Requirement A1-4: For administrative records projects, a preliminary study design must be developed that describes the:
1. Target population.
2. Coverage of the target population by the administrative records.
3. Key data items and key estimates.
4. Methods of integrating data sources, if more than one is used.
5. Analysis methods.
Note: See the Administrative Records Handbook for complete information on planning a project that uses administrative records.

Requirement A1-5: Any contract or statement of work originated by the Census Bureau for deliverables that will be used in information products released by the Census Bureau must

include provisions that the contractor comply with the Census Bureau's statistical quality standards.

Requirement A1-6: Quality control checks must be performed to ensure the accuracy and completeness of the program plans, including all schedules, cost estimates, agreements (e.g., memoranda of understanding, statements of work, and contracts), survey designs, and study designs.

Requirement A1-7: Documentation needed to replicate and evaluate the data program must be produced. The documentation must be retained, consistent with applicable policies and data-use agreements, and must be made available to Census Bureau employees who need it to carry out their work. (See Statistical Quality Standard S2, Managing Data and Documents.)

Examples of documentation include:
 Program plans, including cost estimates and schedules, after all revisions.
 Survey designs.
 Study designs.
 Decision memoranda.

Notes:
(1) The documentation must be released on request to external users, unless the information is subject to legal protections or administrative restrictions that would preclude its release. (See Data Stewardship Policy DS007, Information Security Management Program.)
(2) Statistical Quality Standard F2, Providing Documentation to Support Transparency in Information Products, contains specific requirements about documentation that must be readily accessible to the public to ensure transparency of information products released by the Census Bureau.

Statistical Quality Standard A2
Developing Data Collection Instruments and Supporting Materials

Purpose: The purpose of this standard is to ensure that data collection instruments and supporting materials are designed to promote the collection of high quality data from respondents.

Scope: The Census Bureau’s statistical quality standards apply to all information products released by the Census Bureau and the activities that generate those products, including products released to the public, sponsors, joint partners, or other customers. All Census Bureau employees and Special Sworn Status individuals must comply with these standards; this includes contractors and other individuals that receive Census Bureau funding to develop and release Census Bureau information products.

In particular, this standard applies to the development or redesign of data collection instruments and supporting materials. The types of data collection instruments and supporting materials covered by this standard include:
• Paper and electronic instruments (e.g., CATI, CAPI, Web, and touch tone data entry).
• Self-administered and interviewer-administered instruments.
• Instruments administered by telephone or in person.
• Respondent letters, aids, and instructions.
• Mapping and listing instruments used for operations, such as address canvassing, group quarters frame development, and the Local Update of Census Addresses (LUCA).

Exclusions: In addition to the global exclusions listed in the Preface, this standard does not apply to:
• Data collection instruments and supporting materials where the Census Bureau does not have control over the content or format, such as the paper and electronic instruments used for collecting import and export merchandise trade data.
Key Terms: Behavior coding, CAPI, CATI, cognitive interviews, data collection instrument, field test, focus group, graphical user interface (GUI), imputation, integration testing, methodological expert review, nonresponse, pretesting, questionnaire, record linkage, respondent burden, respondent debriefing, split panel test, and usability testing.

Requirement A2-1: Throughout all processes associated with data collection, unauthorized release of protected information or administratively restricted information must be prevented by following federal laws (e.g., Title 13, Title 15, and Title 26), Census Bureau policies (e.g., Data Stewardship Policies), and additional provisions governing the use of the data (e.g., as may be specified in a memorandum of understanding or data-use agreement). (See Statistical Quality Standard S1, Protecting Confidentiality.)

Requirement A2-2: A plan must be produced that addresses:
1. Program requirements for the data collection instrument and the graphical user interface (GUI), if applicable (e.g., data collection mode, content, constraints, and legal requirements).
2. Supporting materials needed for the data collection (e.g., brochures, flashcards, and advance letters).
3. Pretesting of the data collection instrument and supporting materials.
4. Verification and testing to ensure the proper functioning of the data collection instrument and supporting materials.

Notes:
(1) Statistical Quality Standard A1, Planning a Data Program, addresses overall planning requirements, including the development of schedules and costs.
(2) See the Guidelines for Designing Questionnaires for Administration in Different Modes and the Economic Directorate Guidelines on Questionnaire Design for guidance on designing data collection instruments.
(3) Data Stewardship Policy DS016, Respondent Identification Policy, contains policy requirements for data collection operations involving households where respondents in households provide information.

Requirement A2-3: Data collection instruments and supporting materials must be developed and tested in a manner that balances (within the constraints of budget, resources, and time) data quality and respondent burden.

Sub-Requirement A2-3.1: Specifications for data collection instruments and supporting materials, based on program requirements, must be developed and implemented.
Examples of topics that specifications might address include:
• Requirements for programming the instrument to work efficiently. For example:
o Built-in edits or range checks for electronic data collection instruments (e.g., edits for numeric data that must be within a pre-specified range).
o Compliance with the CATI/CAPI Screen Standards for GUI (Windows-based) Instruments and Function Key Standards for GUI Instruments. (See the Technologies Management Office’s Blaise Authoring Standards for Windows Surveys.)
o Input and output files for data collection instruments.
• Segmented boxes for paper data collection instruments to facilitate scanning.
• Paper size, color, thickness, and formatting to ensure compatibility with data capture and processing systems for paper data collection instruments.
• Frequently Asked Questions about the data collection.
• Supporting materials, such as Help materials and instructions.

Note: The Census Bureau Guideline Presentation of Data Edits to Respondents in Electronic Self-Administered Surveys presents recommendations for designing editing functionality, presentation, and wording in both demographic and economic self-administered electronic surveys.
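The built-in edits and range checks mentioned above can be illustrated with a short sketch. This is purely an illustrative example, not Census Bureau code: the item names and valid ranges below are hypothetical assumptions, and a real instrument would define its edits in the instrument authoring software (e.g., Blaise) rather than in standalone code.

```python
# Illustrative sketch of a built-in range-check edit for an electronic
# data collection instrument. Item names and ranges are hypothetical.

RANGE_EDITS = {
    # item name: (minimum, maximum) allowed numeric value
    "age": (0, 125),
    "hours_worked_last_week": (0, 168),
}

def check_range(item, value):
    """Return None if the value passes the edit, otherwise an error
    message the instrument could display to the respondent."""
    low, high = RANGE_EDITS[item]
    if not (low <= value <= high):
        return f"Value for '{item}' must be between {low} and {high}."
    return None

print(check_range("age", 42))                      # passes: None
print(check_range("hours_worked_last_week", 200))  # fails: error message
```

In practice such edits fire at data entry time so the respondent or interviewer can correct the value immediately, which is the point of specifying them in the instrument requirements.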

Sub-Requirement A2-3.2: Data collection instruments and supporting materials must clearly state the following required notifications to respondents:
1. The reasons for collecting the information.
2. A statement on how the data will be used.
3. An indication of whether responses are mandatory (citing authority) or voluntary.
4. A statement on the nature and extent of confidentiality protection to be provided, citing authority.
5. An estimate of the average respondent burden associated with providing the information.
6. A statement requesting that the public direct comments concerning the burden estimate and suggestions for reducing this burden to the appropriate Census Bureau contact.
7. The OMB control number and expiration date for the data collection.
8. A statement that the Census Bureau may not conduct, and a person is not required to respond to, a data collection request unless it displays a currently valid OMB control number.

Sub-Requirement A2-3.3: Data collection instruments and supporting materials must be pretested with respondents to identify problems (e.g., problems related to content, order/context effects, skip instructions, formatting, navigation, and edits) and then refined, prior to implementation, based on the pretesting results.

Note: On rare occasions, cost or schedule constraints may make it infeasible to perform complete pretesting. In such cases, subject matter and cognitive experts must discuss the need for and feasibility of pretesting. The program manager must document any decisions regarding such pretesting, including the reasons for the decision. If no acceptable options for pretesting can be identified, the program manager must apply for a waiver. (See the Waiver Procedure for the procedures on obtaining a waiver.)

1. Pretesting must be performed when:
a. A new data collection instrument is developed.
b. Questions are revised because the data are shown to be of poor quality (e.g., unit or item response rates are unacceptably low, measures of reliability or validity are unacceptably low, or benchmarking reveals unacceptable differences from accepted estimates of similar characteristics).
c. Review by cognitive experts reveals that adding pretested questions to an existing instrument may cause potential context effects.
d. An existing data collection instrument has substantive modifications (e.g., existing questions are revised or new questions added).

Note: Pretesting is not required for questions that performed adequately in another survey.

2. Pretesting must involve respondents or data providers who are in scope for the data collection. It must verify that the questions:
a. Can be understood and answered by potential respondents.
b. Can be administered properly by interviewers (if interviewer-administered).
c. Are not unduly sensitive and do not cause undue burden.

Examples of issues to verify during pretesting:
• The sequence of questions and skip patterns is logical and easy to follow.
• The wording is concise, clear, and unambiguous.
• Fonts (style and size), colors, and other visual design elements promote readability and comprehension.

3. One or more of the following pretesting methods must be used:
a. Cognitive interviews.
b. Focus groups, but only if the focus group completes a self-administered instrument and discusses it afterwards.
c. Usability techniques, but only if they are focused on the respondent’s understanding of the questionnaire.
d. Behavior coding of respondent/interviewer interactions.
e. Respondent debriefings in conjunction with a field test or actual data collection.
f. Split panel tests.

Notes:
(1) Methodological expert reviews generally do not satisfy this pretesting requirement. However, if a program is under extreme budget, resource, or time constraints, the program manager may request cognitive experts in the Center for Statistical Research and Methodology or on the Response Improvement Research Staff to conduct such a review. The results of this expert review must be documented in a written report. If the cognitive experts do not agree that an expert review would satisfy this requirement, the program manager must apply for a waiver.
(2) Multiple pretesting methods should be used as budget, resources, and time permit to provide a thorough evaluation of the data collection instrument and to document that the data collection instrument “works” as expected. In addition, other techniques used in combination with the pretesting methods listed above may be useful in developing data collection instruments. (See Appendix A2, Questionnaire Testing and Evaluation Methods for Censuses and Surveys, for descriptions of the various pretesting methods available.)

4. When surveys or censuses are administered using multiple modes and meaningful changes to questions are made to accommodate the mode differences, all versions must be pretested. Meaningful changes to questions to accommodate mode differences include changes to the presentation of the question or response format to reflect mode-specific functional constraints or advantages. In these cases, the proposed wording of each version must be pretested to ensure consistent interpretation of the intent of the question across modes, despite structural format or presentation differences. As long as the proposed wording of each version is pretested, testing of the mode (e.g., paper versus electronic) is not required, although it may be advisable.

5. Data collection instruments in any languages other than English must be pretested in the languages that will be used to collect data during production. Pretesting supporting materials in these languages is not required, but is recommended.

Note: The Census Bureau Guideline Language Translation of Data Collection Instruments and Supporting Materials provides guidance on translating data collection instruments and supporting materials from English to another language.

Sub-Requirement A2-3.4: Data collection instruments and supporting materials must be verified and tested to ensure that they function as intended.
Examples of verification and testing activities include:
• Verifying that the data collection instrument’s specifications and supporting materials reflect the sponsor’s requirements (e.g., conducting walk-throughs to verify the appropriateness of specifications).
• Verifying that the data collection instrument and supporting materials meet all specifications (e.g., verifying correctness of skip patterns, wording, instrument fills, and instrument edits).
• Conducting integration testing using mock input files with realistic scenarios to test all parts of the data collection instrument together (e.g., front, middle, and back modules).
• Conducting usability testing to discover and eliminate barriers that keep respondents from completing the data collection instrument accurately and efficiently.
• Conducting output tests to compare the output of the data collection instrument under development with that of its predecessor (if the data collection has been done with a similar instrument in the past).
• Verifying that user interfaces work according to specifications.
• Verifying that user interfaces for electronic forms adhere to IT Standard 15.0.2, Web Development Requirements and Guidelines, and any other guidance applicable to the program.
• Verifying that Web-based data collection instruments comply with requirements of Section 508 of the U.S. Rehabilitation Act.
• Verifying that paper data collection instruments are compatible with the program’s data capture and processing systems.

Note: The Census Bureau Guideline Computer Assisted Personal Interviewing reflects recommended practices for ensuring the quality of CAPI.

Requirement A2-4: Documentation needed to replicate and evaluate the development of data collection instruments and supporting materials must be produced. The documentation must be retained, consistent with applicable policies and data-use agreements, and must be made available to Census Bureau employees who need it to carry out their work. (See Statistical Quality Standard S2, Managing Data and Documents.)
Examples of documentation include:
• Plans for the development and testing of the data collection instrument and supporting materials.

• Specifications for the data collection instruments and supporting materials.
• Results of questionnaire development research (e.g., pretesting results, expert review reports, and site visit reports).
• Input files used to test the final production instrument and reports of testing results.
• Computer source code for the production data collection instrument along with information on the version of software used to develop the instrument.
• Quality measures and evaluation results. (See Statistical Quality Standard D3, Producing Measures and Indicators of Nonsampling Error.)

Notes:
(1) The documentation must be released on request to external users, unless the information is subject to legal protections or administrative restrictions that would preclude its release. (See Data Stewardship Policy DS007, Information Security Management Program.)
(2) Statistical Quality Standard F2, Providing Documentation to Support Transparency in Information Products, contains specific requirements about documentation that must be readily accessible to the public to ensure transparency of information products released by the Census Bureau.

Appendix A2 1
Questionnaire Testing and Evaluation Methods for Censuses and Surveys

Pretesting is critical to the identification of problems for both respondents and interviewers with regard to question content, order/context effects, skip instructions, and formatting. Problems with question content, for example, include confusion over the meaning of the question as well as misinterpretation of individual terms or concepts. Problems with skip instructions may result in missing data and frustration by interviewers and/or respondents. Formatting concerns are relevant to self-administered questionnaires and may lead to respondent confusion and a loss of information.

“Pretesting” is a broad term that applies to many different methods or combinations of methods that can be used to test and evaluate questionnaires. These methods are valuable for identifying problems with draft questionnaires, but they have different strengths and weaknesses, and may be most useful at different stages of questionnaire/instrument development. Typically, using several pretesting methods is more effective in identifying problem questions and suggesting solutions than using just a single method. This appendix briefly describes the different types of pretesting methods, their strengths and weaknesses, and situations where they are most beneficial.

The enumeration and description of potential pretesting and evaluation methods in this appendix is meant to cover all the available techniques; however, some techniques do not satisfy the pretesting requirement of Statistical Quality Standard A2: Developing Data Collection Instruments and Supporting Materials. Other methods satisfy the requirement only under special circumstances. The pretesting requirement of Standard A2 identifies the methods that must be used to pretest census and survey questions.
Although the pretesting requirement of Standard A2 must be satisfied, the appropriateness of the methods and the resources available to implement them should be considered in determining which pretesting methods to use.

Pretesting and evaluation techniques fall into two major categories – pre-field and field techniques. Generally, pre-field techniques are used during the preliminary stages of questionnaire development. Pre-field techniques include:
• Respondent focus groups. (This method does not satisfy the pretesting requirement, unless the focus group completes and discusses a self-administered questionnaire.)
• Exploratory or feasibility visits to companies or establishment sites. (This method does not satisfy the pretesting requirement.)
• Cognitive interviews. (This method satisfies the pretesting requirement.)

1 This appendix is based on two sources: 1) Protocol for Pretesting Demographic Surveys at the Census Bureau, prepared by Theresa DeMaio, Nancy Mathiowetz, Jennifer Rothgeb, Mary Ellen Beach, and Sharon Durant, dated June 28, 1993; and 2) Evolution and Adaptation of Questionnaire Development, Evaluation and Testing in Establishment Surveys, by Diane Willimack, Lars Lyberg, Jean Martin, Lilli Japec, and Patricia Whitridge, Monograph Paper for the International Conference on Questionnaire Development, Evaluation and Testing Methods, Charleston, SC, November 2002.

• Usability techniques. (This method does not satisfy the pretesting requirement unless it is focused on respondent understanding of a self-administered or interviewer-administered questionnaire.)
• Methodological expert reviews. (This method does not satisfy the pretesting requirement.)

Field techniques are used to evaluate questionnaires tested under field conditions, either in conjunction with a field test or during production data collection. Using field techniques during production data collection would be appropriate only for ongoing or recurring surveys. Field techniques include:
• Behavior coding of interviewer-respondent interactions. (This method satisfies the pretesting requirement.)
• Respondent debriefings. (This method satisfies the pretesting requirement.)
• Interviewer debriefings. (This method does not satisfy the pretesting requirement.)
• Analysts’ feedback. (This method does not satisfy the pretesting requirement.)
• Split panel tests. (This method satisfies the pretesting requirement.)
• Analysis of item nonresponse rates, imputation rates, edit failures, or response distributions. (This method does not satisfy the pretesting requirement.)

PRE-FIELD TECHNIQUES

Respondent Focus Groups are used early in the questionnaire development cycle and can be used in a variety of ways to assess the question-answering process. Generally, the focus group technique does not satisfy the pretesting requirement, because it does not expose respondents to a questionnaire. The only use of focus groups that satisfies the pretesting requirement is to have the group complete a self-administered questionnaire, followed by a discussion of the experience. This provides information about the appearance and formatting of the questionnaire and reveals possible content problems.

Focus groups can be used before questionnaire construction begins to gather information about a topic, such as:
• How potential respondents structure their thoughts about a topic.
• How respondents understand general concepts or specific terminology.
• Respondents’ opinions about the sensitivity or difficulty of the questions.
• How much burden is associated with gathering the information necessary to answer a question.

Focus groups can also be used to identify variations in language, terminology, or the interpretation of questions and response options. Used in this way, they may provide quicker access to a larger number of people than is possible with cognitive interviews. One of the main advantages of focus groups is the opportunity to observe an increased amount of interaction on a topic in a short time. The group interaction is of central importance – it can result in information and insights that may be less accessible in other settings. However, precisely because of this group interaction, the focus group does not permit a good test of an individual’s response process

when alone. Moreover, in focus groups the researcher does not have as much control over the process as with cognitive interviews or interviewer-administered questionnaires. One or two people in the group may dominate the discussion and restrict the input from other group members.

Exploratory or Feasibility Studies are another common method for evaluating survey content relative to concepts. Economic survey practitioners typically call these studies company or site visits because they carry out the studies at the site of the business or institution. Because these visits are conducted before the questionnaire has been developed, they do not satisfy the pretesting requirement.

Because economic surveys rely heavily on business or institutional records, the primary goal of these site visits is to determine the availability of the desired data in records, their periodicity, and the definition of the concept as used in company records. Other goals include assessment of response burden and quality and the identification of the appropriate respondent.

The design of these company or site visits tends to vary a great deal. Because they are exploratory in nature, the activity may continue until the economic survey or program staff sufficiently understands the respondents’ views of the concepts, resources permitting of course. Purposive or convenience samples are selected that target key data providers. Sample sizes are small, perhaps as few as five and rarely more than thirty. Typically, several members of the survey or program staff, who may or may not include questionnaire design experts, conduct meetings with multiple company employees involved in government reporting. Information gained during these visits helps determine whether the survey concepts are measurable, what the specific questions should be, how to organize or structure the questions related to the concept of interest, and to whom the form should be sent.
Exploratory or feasibility studies may be multi-purpose. In addition to exploring data availability for the concept of interest, survey or program staff may also set up reporting arrangements and review operating units to ensure correct coverage. A common by-product of these visits is to solidify relationships between the companies and the survey or program staff.

Cognitive Interviews are used in the later part of the questionnaire development cycle, after a questionnaire has been constructed based on information from focus groups, site visits, or other sources. They consist of one-on-one interviews using a draft questionnaire in which respondents describe their thoughts while answering the survey questions. Cognitive interviews provide an important means of learning about problems with the questionnaire directly from respondents themselves. Because this technique tests the questionnaire with potential respondents, it satisfies the pretesting requirement.

In addition, small numbers of interviews (as few as fifteen) can yield information about major problems if respondents repeatedly identify the same questions and concepts as sources of confusion. Because sample sizes are small, iterative pretesting of an instrument is often possible. After one round of interviews is complete, researchers can diagnose problems, revise question wording to solve the problems, and conduct additional interviews to see if the new questions are successful.

Cognitive interviews may or may not be conducted in a laboratory setting. The advantage of the laboratory is that it offers a controlled environment for conducting the interview, and provides the opportunity for video as well as audio recording. However, laboratory interviews may be impractical or unsuitable. For example, economic surveys rarely conduct cognitive interviews in a laboratory setting. Rather, cognitive testing of economic surveys is usually conducted on-site at the offices or location of the business or institutional respondent. One reason for this approach is to enable business or institutional respondents to have access to records. Another is business respondents’ reluctance to meet outside their workplaces for these interviews. In many economic surveys, which tend to be relatively lengthy and require labor-intensive data retrieval from records, testing may be limited to a subset of questions or sections rather than the entire questionnaire. Thus, researchers must be careful to set the proper context for the target questions.

“Think aloud” interviews, as cognitive interviews have come to be called, can be conducted either concurrently or retrospectively – that is, the respondents’ verbalizations of their thought processes can occur either during or after the completion of the questionnaire. As the Census Bureau conducts them, cognitive interviews incorporate follow-up questions by the researcher in addition to the respondent’s statement of his or her thoughts. Probing questions are used when the researcher wants to have the respondent focus on particular aspects of the question-response task. For example, the interviewer may ask how respondents chose among response choices, how they interpreted reference periods, or what a particular term meant. Paraphrasing (asking the respondents to repeat the question in their own words) permits the researcher to learn whether the respondent understands the question and interprets it in the manner intended, and it may reveal better wordings for questions.

In surveys of businesses or institutions, in which data retrieval often involves business records, probing and paraphrasing techniques are often augmented by questions asking respondents to describe those records and their contents or to show the records to the researcher. Since data retrieval tends to be a labor-intensive process for business respondents, frequently requiring the use of multiple sources or consultation with colleagues, it is often unrealistic for researchers to observe the process during a cognitive interview. Instead, hypothetical probes are often used to identify the sources of data, discover respondents’ knowledge of and access to records, recreate likely steps taken to retrieve data from records or to request information from colleagues, and suggest possible estimation strategies.

Usability Techniques are used to aid development of automated questionnaires. Objectives are to discover and eliminate barriers that keep respondents from completing an automated questionnaire accurately and efficiently with minimal burden. Usability tests that are focused on respondent understanding of the questionnaire satisfy the pretesting requirement. Usability tests that are focused on the interviewers’ ability to administer the instrument do not satisfy the pretesting requirement; however, they are recommended for interviewer-administered electronic questionnaires.

Aspects that deserve attention during usability testing include the language, fonts, icons, layout, organization, and interaction features, such as data entry, error recovery, and navigation. Typically, the focus is on instrument performance in addition to how respondents interpret survey questions. Problems identified during testing can then be eliminated before the instrument is finalized.

As with paper questionnaires, different usability techniques are available depending upon the stage of development. One common technique is called the usability test. These tests are similar to cognitive interviews – that is, one-on-one interviews that elicit information about the respondent’s thought process. Respondents are given a task, such as “Complete the questionnaire,” or smaller subtasks, such as “Send your data to the Census Bureau.” The think aloud, probing, and paraphrasing techniques are all used as respondents complete their assigned tasks. Early in the design phase, usability testing with respondents can be done using low fidelity versions of questionnaire prototypes (i.e., mocked-up paper screens). As the design progresses, the automated questionnaire can be tested to choose or evaluate basic navigation features, error correction strategies, etc.

Disability accommodation testing is a form of usability testing which evaluates the ability of a disabled user to access the questionnaire through different assistive technologies, such as a screen reader. Expert reviews (see below) are also part of the repertoire of usability techniques. Research has shown that as few as three participants can uncover half of the major usability problems; four to five participants can uncover 80 percent of the problems; and ten participants can uncover 90 percent of the problems (Dumas and Redish, 1999).

Finally, in a usability heuristic review, an expert compares the electronic survey instrument with principles that should be followed by all user interfaces (Nielsen, 1993).

Methodological Expert Reviews, conducted by survey methodologists or questionnaire-design experts, evaluate any difficulties potential interviewers and respondents may have with the questionnaire. Seasoned survey researchers who have extensive exposure to either the theoretical or practical aspects of questionnaire design use their expertise to achieve this goal. Because respondents do not provide direct input in these reviews, in general they do not satisfy the pretesting requirement. Usually these reviews are conducted early in the questionnaire development process and in concert with other pretest methods.

Expert reviews may be used instead of respondent-based pretesting only as a last resort, when extreme time constraints prevent the use of other pretesting methods. In such instances, survey methodology experts must conduct the reviews and document the results in a written report. The decision to use expert reviews rather than respondent-based pretesting must be made by subject-matter areas in consultation with the methodological research areas in the Center for Statistical Research and Methodology and on the Response Improvement Research Staff.

The cognitive appraisal coding system (Forsyth and Lessler, 1991) is a tool providing a systematic approach to the methodological expert review process. Like methodological expert reviews, results are used to identify questions that have potential for reporting errors. This tool is

particularly effective when used by questionnaire design experts who understand the link between the cognitive response process and measurement results. However, novice staff or subject-area staff also can use this tool as a guide in their reviews of questionnaires.

Methodological expert reviews also can be conducted as part of a usability evaluation. Typically, this review is performed with an automated version of the questionnaire, although it need not be fully functional. Experts evaluate the questionnaire for consistency and application of user-centered principles of user-control, error prevention and recovery, ease of navigation, training, and recall.

FIELD TECHNIQUES

Field techniques may be used with pretests or pilot tests of questionnaires or instruments and survey processes. They may also be employed in ongoing periodic (or recurring) surveys. The value of testing draft questionnaires with potential survey respondents cannot be overstated, even if it simply involves observation and evaluation by questionnaire developers. However, the following pretesting methods can be used to maximize the benefits of field testing.

Behavior Coding of Respondent/Interviewer Interactions involves systematic coding of the interaction between interviewers and respondents from live or taped field or telephone interviews to collect quantitative information. Using this pretesting method satisfies the pretesting requirement.

The focus here is on specific aspects of how the interviewer asks the question and how the respondent reacts. When used for questionnaire assessment, the behaviors that are coded focus on behaviors that indicate problems with the question, the response categories, or the respondent’s ability to form an adequate response. For example, if a respondent asks for clarification after hearing the question, it is likely that some aspect of the question caused confusion.
Likewise, if a res pondent interrupts the question befo re the interviewer finishes reading it, then the respondent misses informati on that might be important to giving a correct answer. For interviewer-administered economic surveys, the coding scheme may need to be modified from traditional househol d applications, because intervie wers for establishment surveys tend to be allowed greater flexibility. In contrast to the pre-field t echniques described earlier, the us e of behavior coding requires a sample size sufficient to address analytic requi if the questionnaire rements. For example, contains many skip patterns, it is necessary to select a large enough sample to permit observation of various paths through the questionnaire. In addition, the determination of sample sizes for behavior coding should take in to account the relevant populati on groups for which separate analysis is desired. Because behavior coding evaluates all questions on the questionnaire, it promotes systematic detection of questions that elicit large numbers of behaviors that reflect problems. However, it is not usually designed to identify the source of the problems. It also may not be able to distinguish ons of a question is better. which of several similar versi 17
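The quantitative side of behavior coding reduces to a tally: for each question, count the coded behaviors that signal problems and flag questions whose problem-behavior rate exceeds a threshold. The behavior codes, the 15 percent threshold, and the data below are illustrative assumptions, not part of the standard; a minimal sketch:

```python
from collections import Counter

# Illustrative behavior codes (assumed, not from the standard):
# "E" = exact question reading, "C" = respondent asked for clarification,
# "I" = respondent interrupted the question, "A" = adequate answer.
PROBLEM_CODES = {"C", "I"}

def problem_rates(coded_interactions):
    """coded_interactions: list of (question_id, behavior_code) pairs."""
    totals, problems = Counter(), Counter()
    for question, code in coded_interactions:
        totals[question] += 1
        if code in PROBLEM_CODES:
            problems[question] += 1
    return {q: problems[q] / totals[q] for q in totals}

def flag_questions(coded_interactions, threshold=0.15):
    """Return questions whose problem-behavior rate exceeds the threshold."""
    rates = problem_rates(coded_interactions)
    return sorted(q for q, r in rates.items() if r > threshold)

codes = [("Q1", "E"), ("Q1", "C"), ("Q1", "C"), ("Q1", "A"),
         ("Q2", "E"), ("Q2", "A"), ("Q2", "A"), ("Q2", "A")]
print(flag_questions(codes))  # ['Q1'] -- half of Q1's codes signal problems
```

Consistent with the limitation noted above, a tally like this detects which questions elicit problem behaviors, but not why.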

Finally, behavior coding does not always provide an accurate diagnosis of problems. It can only detect problems that are manifest in interviewer or respondent behavior. Some important problems, such as respondent misinterpretations, may remain hidden because both respondents and interviewers tend to be unaware of them. Behavior coding is not well-suited for identifying such problems.

Respondent Debriefing uses a structured questionnaire after data are collected to elicit information about respondents' interpretations of survey questions. Use of this method satisfies the pretesting requirement.

The debriefing may be conducted by incorporating structured follow-up questions at the end of a field test interview or by re-contacting respondents after they return a completed self-administered questionnaire. In economic surveys, respondent debriefings sometimes are called "response analysis surveys" ("RAS") or "content evaluations." Respondent debriefings usually are interviewer-administered, but may be self-administered. Some Census Bureau economic surveys have conducted respondent debriefings by formulating them as self-administered questionnaires and enclosing them with survey forms during pilot tests or production data collections.

Sample sizes and designs for respondent debriefings vary. Sample sizes may be as small as 20 or as large as several hundred. Designs may be either random or purposive, such as conducting debriefings with respondents who exhibited higher error rates or errors on critical items. Since the debriefing instrument is structured, empirical summaries of results may be generated.

When used for testing purposes, the primary objective of respondent debriefing is to determine whether the respondents understand the concepts and questions in the same way that the survey designers intend. Sufficient information is obtained to evaluate the extent to which reported data are consistent with survey definitions. For instance, respondents may be asked whether they included or excluded particular items in their answers, per definitions. In economic surveys, the debriefings may ask about the use of records or estimation strategies. In addition, respondent debriefings can be useful in determining the reason for respondent misunderstandings.

Sometimes results of respondent debriefings show that a question is superfluous and can be eliminated from the final questionnaire. Conversely, it may be discovered that additional questions need to be included in the final questionnaire to better operationalize the concept of interest. Finally, the data may show that the intended meaning of certain concepts or questions is not clear or able to be understood.

A critical requirement to obtain a successful respondent debriefing is that question designers and researchers have a clear idea of potential problems so that good debriefing questions can be developed. Ideas about potential problems can come from pre-field techniques (e.g., cognitive interviews conducted prior to the field test), from analysis of data from a previous survey, from careful review of questionnaires, or from observation of earlier interviews.

Respondent debriefings may be able to supplement the information obtained from behavior coding. As noted above, behavior coding demonstrates the existence of problems but does not always identify the source of the problem. When designed properly, the results of respondent debriefings can provide information about the sources of problems. Respondent debriefings also may reveal problems not evident from the response behavior.

Interviewer Debriefing has traditionally been the primary method used to evaluate field or pilot tests of interviewer-administered surveys. It also may be used following production data collection prior to redesigning an ongoing periodic or recurring survey. Interviewer debriefing consists of holding group discussions or administering structured questionnaires with the interviewers to obtain their views of questionnaire problems. The objective is to use the interviewers' direct contact with respondents to enrich the questionnaire designer's understanding of questionnaire problems. Although it is a useful evaluation component, it is not sufficient as an evaluation method and does not satisfy the pretesting requirement.

Interviewers may not always be accurate reporters of certain types of questionnaire problems for several reasons. When interviewers report a problem, it is not always clear if the issue caused trouble for one respondent or for many. Interviewers' reports of problem questions may reflect their own preference regarding a question, rather than respondent confusion. Finally, experienced interviewers sometimes change the wording of problem questions as a matter of course to make them work, and may not even realize they have done so.

Interviewer debriefings can be conducted in several different ways: in a group setting, through rating forms, or through standardized questionnaires. Group setting debriefings are the most common method. They essentially involve conducting a focus group with the field test interviewers to learn about their experiences in administering the questionnaire. Rating forms obtain more quantitative information by asking interviewers to rate each question in the pretest questionnaire on selected characteristics of interest to the researchers (e.g., whether the interviewer had trouble reading the question as written, whether the respondent understood the words or ideas in the question). Standardized interviewer debriefing questionnaires collect information about the interviewers' perceptions of a problem, the prevalence of a problem, the reasons for a problem, and proposed solutions to a problem. Interviewer debriefings also can ask about the magnitude of specific kinds of problems, to test the interviewers' knowledge of subject-matter concepts.

Analysts' Feedback with a questionnaire is a method of learning about problems specific to the economic area. At the Census Bureau, most economic surveys are self-administered; so survey or program staff analysts in the individual subject areas, rather than interviewers, often have contact with respondents. While collecting feedback from analysts is a useful evaluation component, it does not satisfy the pretesting requirement.

Feedback from analysts about their interactions with respondents may serve as an informal evaluation of the questionnaire and the data collected. These interactions include "Help Desk" phone inquiries from respondents and follow-up phone calls to respondents by analysts investigating suspicious data flagged by edit failures. Analyst feedback is more useful when analysts systematically record comments from respondents in a log. The log enables qualitative evaluation of the relative severity of questionnaire problems, because strictly anecdotal feedback sometimes may be overstated.

Another way to obtain analyst feedback is for questionnaire design experts to conduct focus groups with the analysts who review data and resolve edit failures. These focus groups can identify questions that may need to be redesigned or evaluated by other methods. Regardless of how respondent feedback is captured, analysts should provide feedback early in the questionnaire development cycle of recurring surveys to identify problematic questions.

Split Panel Tests are controlled experimental tests of questionnaire variants or data collection modes to determine which one is "better" or to measure differences between them. Split panel testing satisfies the pretesting requirement.

Split panel experiments may be conducted within a field or pilot test or embedded within production data collection for an ongoing periodic or recurring survey. For pretesting draft versions of a questionnaire, the search for the "better" questionnaire requires that an a priori standard be determined by which the different versions can be judged. Split panel tests can incorporate a single question, a set of questions, or an entire questionnaire.

It is important to select adequate sample sizes when designing a split panel test so that differences of substantive interest can be measured. In addition, these tests must use randomized assignment within replicate sample designs so that differences can be attributed to the question or questionnaire and not to the effects of incomparable samples.

Another use of the split panel test is to calibrate the effect of changing questions. Although split panel tests are expensive, they are extremely valuable in the redesign and testing of surveys for which the comparability of the data collected over time is an issue. They provide an important measure of the extent to which different results following a major survey redesign are due to methodological changes, such as the survey instrument or interview mode, rather than changes over time in the subject-matter of interest. Split panel testing is recommended for data with important policy implications.

Comparing response distributions in split panel tests produces measures of differences but does not necessarily reveal whether one version of a question produces a better understanding of what is being asked than another. Other question evaluation methods, such as respondent debriefings, interviewer debriefings, and behavior coding, are useful to evaluate and interpret the differences observed in split panel tests.

Analysis of Item Nonresponse Rates, Imputation Rates, Edit Failures, or Response Distributions from the collected data can provide useful information about how well the questionnaire works. Use of this method in combination with a field test does not satisfy the pretesting requirement.

In household surveys, examination of item nonresponse rates can be informative in two ways. First, "don't know" rates can determine the extent to which a task is too difficult for respondents. Second, refusal rates can determine the extent to which respondents find certain questions or versions of a question to be more sensitive than others.
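Comparing response distributions between two panels amounts to a contingency-table test. The sketch below computes the chi-square statistic by hand for invented counts; in practice the test must account for the replicate sample design (a design-adjusted test) rather than assuming simple random sampling, and a significant difference still does not say which version is better understood.

```python
def chi_square_statistic(table):
    """Chi-square statistic for a table of observed counts:
    one row per panel, one column per response category."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    return stat

# Invented counts: "yes" / "no" / "don't know" under two question wordings.
panel_a = [310, 150, 40]   # version A
panel_b = [260, 180, 60]   # version B
stat = chi_square_statistic([panel_a, panel_b])
print(round(stat, 2))  # 11.11, above the 5.99 critical value (df=2, alpha=0.05)
```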

In economic surveys, item nonresponse may be interpreted to have various meanings, depending on the context of the survey. In some institutional surveys (e.g., hospitals, prisons, schools) where data are abstracted from individual person-level records, high item nonresponse is considered to indicate data not routinely available in those records. Item nonresponse may be more difficult to detect in other economic surveys where questions may be left blank because they are not applicable to the responding business or the response value may be zero. In these cases, the data may not be considered missing at all.

Response distributions are the frequencies with which respondents provided answers during data collection. Evaluation of the response distributions for survey items can determine whether variation exists among the responses given by respondents or if different question wordings or question sequencings produce different response patterns. This type of analysis is most useful when pretesting either more than one version of a questionnaire or a single questionnaire for which some known distribution of characteristics exists for comparative purposes.

The quality of collected data also may be evaluated by comparing, reconciling, or benchmarking to data from other sources. This is especially true for economic data, but benchmarking data are also available for some household surveys.

CONCLUSION

At least one of the following techniques must be used to satisfy the pretesting requirement:
 Cognitive interviews.
 Usability techniques focused on the respondent's understanding of the questionnaire.
 Focus groups involving the administration of questionnaires.
 Behavior coding of respondent/interviewer interactions.
 Respondent debriefings in conjunction with a field test or actual data collection.
 Split panel tests.

However, pretesting typically is more effective when multiple methods are used. Additional pretesting techniques should be carefully considered to provide a thorough evaluation and documentation of questionnaire problems and solutions. The relative effectiveness of the various techniques for evaluating survey questions depends on the pretest objectives, sample size, questionnaire design, and mode of data collection. The Census Bureau advocates that both pre-field and field techniques be undertaken, as time and funds permit.

For continuing surveys that have a pre-existing questionnaire, cognitive interviews should be used to provide detailed insights into problems with the questionnaire whenever time permits or when a redesign is undertaken. Cognitive interviews may be more useful than focus groups with a pre-existing questionnaire because they mimic the question-response process. For one-time or new surveys, focus groups are useful tools for learning what respondents think about the concepts, terminology, and sequence of topics prior to drafting the questionnaire. In economic surveys, exploratory/feasibility studies, conducted as company or site visits, also provide information about structuring and wording the questionnaire relative to data available in business/institutional records. Usability techniques are increasingly important as surveys move to automated data collection.

Pre-field methods alone may not be sufficient to test a questionnaire. Some type of testing in the field is encouraged, even if it is evaluated based only on observation by questionnaire developers. More helpful is small-to-medium-scale field or pilot testing with more systematic evaluation techniques. The various methods described in this appendix complement each other in identifying problems, the sources of problems, and potential solutions.

REFERENCES

Dumas, J. and Redish, J., 1999. A Practical Guide to Usability Testing, Portland, OR: Intellect.

Forsyth, B. H., and Lessler, J. T., 1991. "Cognitive Laboratory Methods: A Taxonomy," in Measurement Errors in Surveys, Biemer, P. P., Groves, R. M., Lyberg, L. E., Mathiowetz, N. A., and Sudman, S. (eds.), New York: John Wiley and Sons, Inc., pp. 393-418.

Nielsen, Jakob, 1993. Usability Engineering, New York, NY: Morgan Kaufmann.

Statistical Quality Standard A3
Developing and Implementing a Sample Design

Purpose: The purpose of this standard is to ensure that statistically sound frames are designed and samples are selected to meet the objectives of the survey.

Scope: The Census Bureau's statistical quality standards apply to all information products released by the Census Bureau and the activities that generate those products, including products released to the public, sponsors, joint partners, or other customers. All Census Bureau employees and Special Sworn Status individuals must comply with these standards; this includes contractors and other individuals who receive Census Bureau funding to develop and release Census Bureau information products.

In particular, this standard applies to the design and selection of statistically sound samples used to produce estimates or make inferences. This standard covers:
 Frame development for censuses and sample surveys.
 The design and selection of samples or subsamples for surveys.
 The design and selection of samples or subsamples for secondary data analysis, evaluations, or quality assessments.

Exclusions: In addition to the global exclusions listed in the Preface, this standard does not apply to:
 Selection of focus groups.
 Cognitive interviewing.
 Samples that will not be used to produce estimates or make inferences (e.g., samples used for operational tests, pilot studies, or quality control).
 Frames and samples provided to the Census Bureau by a sponsor.
 Activities performed to produce sample estimates (e.g., weighting, estimation, and variance estimation). Statistical Quality Standard D1, Producing Direct Estimates from Samples, addresses requirements related to producing estimates.

Key Terms: Cluster, coverage, cut-off samples, estimate, estimation, frame, housing unit, peer review, precision, primary sampling unit (PSU), probability of selection, probability sampling, sample design, sample size, sampling frame, sampling weights, sequential sampling, strata, stratification, systematic sampling, target population, unduplication, variance, and weights.

Requirement A3-1: Throughout all processes associated with frame development and sample design, unauthorized release of protected information or administratively restricted information must be prevented by following federal laws (e.g., Title 13, Title 15, and Title 26), Census Bureau policies (e.g., Data Stewardship Policies), and additional provisions governing the use of the data (e.g., as may be specified in a memorandum of understanding or data-use agreement). (See Statistical Quality Standard S1, Protecting Confidentiality.)

Requirement A3-2: A plan must be developed that addresses:
1. Statistical requirements of the program using the sample (e.g., the target population, the key estimates, the required precision of the estimates, and the expected response rates).
2. Development of the sampling frame.
3. Sampling methodologies that improve efficiency and minimize the costs of data collection (e.g., probability sampling, oversampling, stratification, sorting, unduplication requirements, and cluster sizes).
4. Verification and testing of systems associated with the sampling operations.
5. Monitoring and evaluating the accuracy of the frame and the sample (e.g., the coverage of the target population by the frames, timeliness of the frames, efficiency of stratification, and verification of the sample).

Notes:
(1) The Census Bureau Guideline Sample Design and Selection identifies steps to follow and issues to consider when designing and selecting a sample.
(2) Statistical Quality Standard A1, Planning the Data Program, addresses overall planning requirements, including the development of schedules and costs.

Requirement A3-3: Sampling frames that meet the data collection objectives must be developed using statistically sound methods.

Examples of frame development activities include:
 Describing the target population.
 Constructing the frame using sources that promote accuracy and completeness.
 Combining multiple frames and unduplicating among them or adjusting probabilities of selection to address units appearing in multiple frames.
 Updating frames (e.g., for new construction and establishment "births" and "deaths").
 Identifying limitations of the frame, including timeliness and accuracy of the frame (e.g., misclassification, eligibility, and coverage).

Requirement A3-4: The sample design must be developed to meet the objectives of the survey, using statistically sound methods. The size and design of the sample must reflect the level of detail needed in tabulations and other information products and the precision required of key estimates. Any use of nonprobability sampling methods (e.g., cut-off) must be justified statistically.

Examples of sample design activities include:
 Setting the requirements and rules for how to define primary sampling units (PSUs), secondary units (e.g., clusters of housing units), and criteria for self-representing PSUs.
 Defining measures of size.
 Determining whether oversampling of population subgroups is needed.
 Defining sampling strata and criteria for clustering.
 Defining the sample size by stratum and the allocation methodology.
 Determining the order of selection and the probabilities of selection.
 Describing the sample selection methods (e.g., systematic sampling, sequential sampling, and probability proportional to size).

 Grouping sample units into representative panels and identifying the duration a unit will remain in sample.
 Determining sample rotation patterns.
 Addressing the issues involved with replacing a current sample design with a new one (e.g., phase-in/phase-out periods, minimizing/maximizing overlap, and accounting for any bias associated with the redesign).
 Developing and maintaining sample design information needed for weighting, estimation, and variance estimation (e.g., probabilities of selection, noninterview adjustment cells, and sample replicates).
 Assessing the potential bias from using the cut-off sampling method.

Requirement A3-5: Sampling frames must be implemented and samples selected to ensure high quality data.

Sub-Requirement A3-5.1: Specifications and procedures for creating frames and selecting samples, based on the statistical requirements, must be developed and implemented.

Examples of issues that specifications and procedures might address include:
 Stratum definitions, stratification algorithms, and clustering algorithms.
 Addition or deletion of records to update frames.
 Algorithms for creating PSUs.
 Sampling algorithms.
 Unduplication of the sample between surveys or between different waves of the same survey.
 Creation of sample replicates needed for weighting, estimation, and variance estimation.
 Assignment of sampling weights appropriate for the sample design to selected units.

Sub-Requirement A3-5.2: Systems and procedures must be verified and tested to ensure all components function as intended.

Examples of verification and testing activities include:
 Verifying that specifications conform to the technical requirements for the frame and sample design (e.g., using walk-throughs and peer reviews).
 Validating computer code against specifications.
 Performing tests of the individual modules and an integrated test of the full sample selection operation.
 Verifying the accuracy of frame information.
 Verifying the selection of the sample for accuracy (e.g., sample sizes are as expected).

Sub-Requirement A3-5.3: Systems and procedures must be developed and implemented to monitor and evaluate the accuracy of the frame development and sample selection operations and to take corrective action if problems are identified.

Examples of activities to monitor and evaluate the accuracy include:
 Comparing weighted sample counts with frame counts.

 Verifying that sample sizes are within expectations.
 Evaluating the accuracy and coverage of the frames against the target population.
 Evaluating changes in the sample design to understand how the revisions might affect the estimates.

Requirement A3-6: Documentation needed to replicate and evaluate frame development and sample design operations must be produced. The documentation must be retained, consistent with applicable policies and data-use agreements, and must be made available to Census Bureau employees who need it to carry out their work. (See Statistical Quality Standard S2, Managing Data and Documents.)

Examples of documentation include:
 Plans, requirements, specifications, and procedures for the systems and processes of frame development and sample selection.
 Sampling design information needed to produce estimates and variance estimates.
 Descriptions of the frame and its coverage.
 Techniques used to evaluate the coverage of the frame and the adequacy of the sample design.
 Quality measures and evaluation results. (See Statistical Quality Standard D3, Producing Measures and Indicators of Nonsampling Error.)

Notes:
(1) The documentation must be released on request to external users, unless the information is subject to legal protections or administrative restrictions that would preclude its release. (See Data Stewardship Policy DS007, Information Security Management Program.)
(2) Statistical Quality Standard F2, Providing Documentation to Support Transparency in Information Products, contains specific requirements about documentation that must be readily accessible to the public to ensure transparency of information products released by the Census Bureau.
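Several activities named in this standard (systematic sampling with probability proportional to size in A3-4, and verifying that sample sizes are as expected in A3-5.2) can be illustrated together. The frame, measures of size, and sample size below are invented; a production design would also handle certainty units, stratification, and replicate creation.

```python
import random

def pps_systematic_sample(frame, n, seed=None):
    """Select n units by systematic sampling with probability proportional
    to size. frame: list of (unit_id, measure_of_size) pairs. Assumes no
    unit's size exceeds total_size / n (i.e., no certainty units)."""
    total = sum(size for _, size in frame)
    interval = total / n
    rng = random.Random(seed)
    start = rng.uniform(0, interval)     # random start in [0, interval)
    selections, cumulative, k = [], 0.0, 0
    for unit_id, size in frame:
        cumulative += size
        # Select this unit for every selection point it covers.
        while k < n and start + k * interval < cumulative:
            selections.append(unit_id)
            k += 1
    return selections

# Invented frame of establishments with measures of size summing to 300.
frame = [(f"EST{i:03d}", size) for i, size in
         enumerate([12, 40, 7, 55, 23, 31, 9, 18, 44, 61], start=1)]
sample = pps_systematic_sample(frame, n=4, seed=2013)
assert len(sample) == 4          # verification: sample size as expected
assert len(set(sample)) == 4     # no duplicate selections
print(sample)
```

The trailing assertions mirror the A3-5.2 verification step: an automated check that the realized sample size matches the design before the sample is released downstream.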

COLLECTING AND ACQUIRING DATA

B1 Establishing and Implementing Data Collection Methods
B2 Acquiring and Using Administrative Records

Statistical Quality Standard B1
Establishing and Implementing Data Collection Methods

Purpose: The purpose of this standard is to ensure that methods are established and implemented to promote the collection of high quality data from respondents.

Scope: The Census Bureau's statistical quality standards apply to all information products released by the Census Bureau and the activities that generate those products, including products released to the public, sponsors, joint partners, or other customers. All Census Bureau employees and Special Sworn Status individuals must comply with these standards; this includes contractors and other individuals who receive Census Bureau funding to develop and release Census Bureau information products.

In particular, this standard applies to establishing and implementing data collection methods for data programs that obtain information directly from respondents, including reimbursable surveys and surveys in which interviewers collect information from establishments.

Exclusions: In addition to the global exclusions listed in the Preface, this standard does not apply to:
 Administrative records data acquired under agreements with other organizations and not collected by interviewers.

Key Terms: CAPI, CATI, coverage, data collection, dress rehearsal, fax imaging, field test, load testing, mail-out/mail-back, measurement error, nonresponse bias, nonresponse follow-up, reinterview, response error, response rate, supplemental reinterview, systems test, and touch-tone data entry (TDE).

Requirement B1-1: Throughout all processes associated with data collection, unauthorized release of protected information or administratively restricted information must be prevented by following federal laws (e.g., Title 13, Title 15, and Title 26), Census Bureau policies (e.g., Data Stewardship Policies), and additional provisions governing the use of the data (e.g., as may be specified in a memorandum of understanding or data-use agreement). (See Statistical Quality Standard S1, Protecting Confidentiality.)

Requirement B1-2: A plan must be developed that addresses:
1. Data collection methods (e.g., interview mode, use of incentives, and reference periods), systems, and procedures.
2. Definitions for what constitutes an interview or response (i.e., a complete interview, a sufficient partial interview, or an insufficient partial interview).
3. Verification and testing of the data collection methods, systems, and procedures. (Statistical Quality Standard A2, Developing Data Collection Instruments and Supporting Materials, addresses questionnaire content pretesting and instrument testing.)
4. Training for staff involved in the data collection effort.
5. Monitoring and evaluating the quality of the data collection operations.

Note: Statistical Quality Standard A1, Planning a Data Program, addresses overall planning requirements, including estimates of schedule and costs.

Requirement B1-3: Data collection methods must be designed and implemented in a manner that balances (within the constraints of budget, resources, and time) data quality and measurement error with respondent burden.

Sub-Requirement B1-3.1: Systems and procedures must be developed to implement the data collection.

Examples of data collection activities for which systems and procedures should be developed include:
 Listing possible sampling units.
 Producing paper questionnaires and related materials (e.g., printing and assembling mail-out packages). (Statistical Quality Standard A2, Developing Data Collection Instruments and Supporting Materials, addresses the design of questionnaires and materials.)
 Providing OMB-required notifications to respondents.
 Providing telephone questionnaire assistance for mail-out/mail-back data collection.
 Transmitting information (by mail, electronically, the Internet, TDE, fax imaging, or other method) between respondents or interviewers and the Census Bureau.
 Formatting CAPI/CATI output files to be compatible with processing systems.
 Conducting interviews.
 Conducting nonresponse follow-up operations.

Sub-Requirement B1-3.2: Data collection systems and methods must be verified and tested to ensure that all components function as intended.

Examples of verification and testing activities include:
 Verifying that the specifications and procedures reflect the requirements of the program.
 Verifying that the materials used for data collection operations meet specifications (e.g., ensure that forms are printed properly).
 Verifying the physical assembly of mailing packages (e.g., ensure that mailing pieces fit properly in the envelopes).
 Testing the electronic data management systems (e.g., the systems used to manage cases and data between headquarters and the interviewers and between headquarters and the data processing systems) for accuracy, capacity (e.g., load testing), and reliability.
 Conducting a systems test to verify the functioning of the data collection instrument in combination with the data management systems.
 Conducting a field test to test systems and methods under realistic conditions (e.g., the dress rehearsal for the decennial census).

40 Sub-Requirement B1-3.3: ers staff involved in the data Training for field and headquart collection effort (as identified during pl anning) must be developed and provided. Examples of training topics include:  (e.g., Data Stewardship Policy DS016, Respondent Relevant Census Bureau policies Identification Policy ).  The goals and objectives of the data collection.  Survey specific concepts and definitions.  The uses of the data.  Techniques for obtaining respondent cooperation.  Follow-up skills. Sub-Requirement B1-3.4: Systems and procedures must be developed and implemented to monitor and evaluate the data co rrective actions if problems are llection activities and to take co identified. Examples of monitoring and evaluating activities include:  Tracking unit response rates, progress in co mpleting interviews, and costs of the data goals are not met. rrective action when collection, and taking co  re all cases are accounted for and investigating to locate Tracking returned cases to ensu missing cases.  Verifying that interviewers follow interviewi ng procedures and do not falsify data (e.g., by conducting field observations, conducting re interviews, or monitoring telephone if necessary, taking appropr iate corrective action (e.g., center interviewers) and, smissing interviewers). retraining, reassigning, or di  Collecting, tracking, and analyzing interviewe r performance statistics (e.g., refusals, completed interviews, refusal conversions, login hours, and completed interviews per login hour), and providing feedback or other corrective action when necessary.  Verifying that analysts follow data collecti on review procedures, and providing feedback when necessary.  Reviewing response data for accuracy a nd completeness, and taking appropriate corrective action when necessary to improve accuracy or completeness. 
• Reviewing response data for unexpected results and unusual patterns (e.g., a pattern of an unusually high number of vacant households) and taking corrective action when needed (e.g., providing feedback, retraining interviewers, or conducting supplemental reinterviews).
• Conducting evaluation studies (e.g., nonresponse bias analysis, coverage evaluation study, and response error reinterview study).
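The tracking activities in Sub-Requirement B1-3.4 are computational in practice. The sketch below is a minimal, hypothetical illustration of two of them: computing a unit response rate and flagging interviewers whose completed interviews per login hour fall outside an expected range. The field names, interviewer IDs, and thresholds are illustrative assumptions, not Census Bureau parameters.

```python
# Hypothetical sketch of B1-3.4 monitoring: unit response rates and
# interviewer productivity flags. Thresholds are illustrative only.

def unit_response_rate(completed, eligible):
    """Completed interviews divided by eligible sample units."""
    if eligible == 0:
        raise ValueError("no eligible units")
    return completed / eligible

def flag_interviewers(stats, min_per_hour=0.5, max_per_hour=4.0):
    """Return interviewer IDs whose completed interviews per login hour
    fall outside the expected range (candidates for feedback or review)."""
    flagged = []
    for interviewer_id, s in stats.items():
        if s["login_hours"] <= 0:
            continue
        rate = s["completed"] / s["login_hours"]
        if not (min_per_hour <= rate <= max_per_hour):
            flagged.append(interviewer_id)
    return flagged

if __name__ == "__main__":
    stats = {
        "fr-001": {"completed": 40, "login_hours": 20},  # 2.0/hr, in range
        "fr-002": {"completed": 2, "login_hours": 25},   # 0.08/hr, flagged
    }
    print(unit_response_rate(completed=850, eligible=1000))  # 0.85
    print(flag_interviewers(stats))                          # ['fr-002']
```

In a real operation the thresholds would come from historical performance statistics, and a flag would trigger review rather than automatic action.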

Requirement B1-4: Documentation needed to replicate and evaluate the data collection methods must be produced. The documentation must be retained, consistent with applicable policies and data-use agreements, and must be made available to Census Bureau employees who need it to carry out their work. (See Statistical Quality Standard S2, Managing Data and Documents.)

Examples of documentation include:
• Plans, requirements, specifications, and procedures for the data collection.
• Test designs and results.
• Instructions to respondents and interviewers about the data collection instrument.
• Quality measures and evaluation results. (See Statistical Quality Standard D3, Producing Measures and Indicators of Nonsampling Error.)

Notes:
(1) The documentation must be released on request to external users, unless the information is subject to legal protections or administrative restrictions that would preclude its release. (See Data Stewardship Policy DS007, Information Security Management Program.)
(2) Statistical Quality Standard F2, Providing Documentation to Support Transparency in Information Products, contains specific requirements about documentation that must be readily accessible to the public to ensure transparency of information products released by the Census Bureau.

Statistical Quality Standard B2 Acquiring and Using Administrative Records

Purpose: The purpose of this standard is to ensure the quality of information products derived from administrative records data acquired from non-Census Bureau organizations.

Scope: The Census Bureau's statistical quality standards apply to all information products released by the Census Bureau and the activities that generate those products, including products released to the public, sponsors, joint partners, or other customers. All Census Bureau employees and Special Sworn Status individuals must comply with these standards; this includes contractors and other individuals who receive Census Bureau funding to develop and release Census Bureau information products.

In particular, this standard applies to the acquisition and use of administrative records data (e.g., demographic, business, and geographic administrative records data) from non-Census Bureau organizations.

Exclusions: The global exclusions to the standards are listed in the Preface. No additional exclusions apply to this standard.

Key Terms: Administrative records, data-use agreement, and record linkage.

Requirement B2-1: Throughout all processes associated with acquiring, using, and disposing of administrative records data, the provisions of federal laws (e.g., Title 13, Title 15, and Title 26), data-use agreements, and Census Bureau policies and procedures on privacy and confidentiality (e.g., Data Stewardship Policies) must be followed to protect administrative records data from unauthorized release. (See Statistical Quality Standard S1, Protecting Confidentiality.)

Note: For detailed procedures on acquiring, using, and disposing of administrative records data, see the Administrative Records Handbook.

Requirement B2-2: A study plan must be developed that addresses verification and evaluation of the quality of the acquired data, in addition to the requirements of the Administrative Records Handbook.
Note: Statistical Quality Standard A1, Planning a Data Program, addresses the overall planning requirements for a data program, including estimates of schedule and costs.

Requirement B2-3: Acquired data must be reviewed to ensure that they meet the requirements specified in the data-use agreement and in the technical documentation provided by the source agency.

Examples of review activities include:
• Verifying that the data are readable and match the record layout.

• Verifying that the number of records is consistent with counts provided by the source agency.
• Comparing distributions of variables with historical averages or expected values.
• Reviewing address lists for extraneous characters and to ensure that the format of incoming information is consistent with information contained within Census Bureau databases.

Sub-Requirement B2-3.1: The quality of the acquired data must be evaluated.

Examples of evaluation activities include:
• Calculating the missing data rates within the records.
• Calculating coverage rates.
• Evaluating the accuracy of the records (e.g., values of variables are within acceptable ranges).

Sub-Requirement B2-3.2: If the data do not meet the requirements, timely feedback on the problems must be provided and corrective actions taken, following the procedures described in the Administrative Records Handbook.

Requirement B2-4: Documentation needed to replicate and evaluate administrative records projects must be produced. The documentation must be retained, to the extent allowed by applicable policies and data-use agreements, and must be made available to Census Bureau employees who need it to carry out their work. (See Statistical Quality Standard S2, Managing Data and Documents.)

Examples of documentation, in addition to the documentation specified by the Administrative Records Handbook, include:
• Descriptions of processes and procedures used to verify the data and evaluate their quality.
• Descriptions of processes and procedures used to develop estimates.
• Research reports used to guide decisions.
• Quality measures and evaluation results. (See Statistical Quality Standard D3, Producing Measures and Indicators of Nonsampling Error.)

Notes:
(1) The documentation must be released on request to external users, unless the information is subject to legal protections or administrative restrictions that would preclude its release.
(See Data Stewardship Policy DS007, Information Security Management Program.)
(2) Statistical Quality Standard F2, Providing Documentation to Support Transparency in Information Products, contains specific requirements about documentation that must be readily accessible to the public to ensure transparency of information products released by the Census Bureau.
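The review and evaluation activities in Requirement B2-3 above (record-count checks, missing data rates, range checks) can be sketched in a few lines of code. This is a hypothetical illustration only; the record structure and the "age" field and its acceptable range are assumptions for the example, not part of the standard.

```python
# Hypothetical sketch of B2-3 review activities on acquired records:
# control-count verification, missing data rates, and a range check.

def check_record_count(records, control_count):
    """Verify the received record count against the source agency's count."""
    return len(records) == control_count

def missing_rate(records, field):
    """Share of records where `field` is absent or empty."""
    if not records:
        return 0.0
    missing = sum(1 for r in records if r.get(field) in (None, ""))
    return missing / len(records)

def out_of_range(records, field, lo, hi):
    """Records whose value for `field` falls outside [lo, hi]."""
    return [r for r in records
            if r.get(field) is not None and not (lo <= r[field] <= hi)]

if __name__ == "__main__":
    recs = [{"age": 34}, {"age": 210}, {"age": None}, {"age": 61}]
    print(check_record_count(recs, 4))        # True
    print(missing_rate(recs, "age"))          # 0.25
    print(out_of_range(recs, "age", 0, 120))  # [{'age': 210}]
```

Out-of-range records would feed the timely feedback to the source agency required by Sub-Requirement B2-3.2.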

CAPTURING AND PROCESSING DATA
C1 Capturing Data
C2 Editing and Imputing Data
C3 Coding Data
C4 Linking Data Records

Statistical Quality Standard C1 Capturing Data

Purpose: The purpose of this standard is to ensure that methods are established and implemented to promote the accurate capture and conversion of paper forms or image files into data files for further processing.

Scope: The Census Bureau's statistical quality standards apply to all information products released by the Census Bureau and the activities that generate those products, including products released to the public, sponsors, joint partners, or other customers. All Census Bureau employees and Special Sworn Status (SSS) individuals must comply with these standards; this includes contractors and other individuals who receive Census Bureau funding to develop and release Census Bureau information products.

In particular, this standard applies to the development, modification, and implementation of post-collection data capture operations, such as:
• Operations to convert data on paper forms or maps into data files (e.g., key from paper (KFP) data entry, optical mark recognition (OMR), and optical character recognition (OCR)).
• Operations to convert image files (e.g., fax image files received directly from respondents and geographic image files) into data files (e.g., key from image (KFI) data entry, the Economic Programs' Paperless Fax Imaging Retrieval System (PFIRS), and operations to convert geographic image files into data files).

Exclusions: In addition to the global exclusions listed in the Preface, this standard does not apply to:
• Electronic data collections (e.g., CATI, CAPI, and the Web). Statistical Quality Standard A2, Developing a Data Collection Instrument, addresses data capture performed within an instrument during data collection.

Key Terms: Data capture, key from image (KFI), key from paper (KFP), optical character recognition (OCR), and optical mark recognition (OMR).
Requirement C1-1: Throughout all processes associated with data capture, unauthorized release of protected information or administratively restricted information must be prevented by following federal laws (e.g., Title 13, Title 15, and Title 26), Census Bureau policies (e.g., Data Stewardship Policies), and additional provisions governing the use of the data (e.g., as may be specified in a memorandum of understanding or data-use agreement). (See Statistical Quality Standard S1, Protecting Confidentiality.)

Requirement C1-2: A plan must be developed that addresses:
1. Requirements for the data capture systems.
2. Required accuracy levels for data capture.
3. Verification and testing of the data capture systems.
4. Training for the staff who perform the data capture operations (including SSS contractors).

5. Monitoring and evaluation of the quality of the data capture operations.

Note: Statistical Quality Standard A1, Planning a Data Program, addresses overall planning requirements, including estimates of schedule and costs.

Requirement C1-3: Data collected on paper forms or in image files must be converted accurately into an electronic format suitable for subsequent processing.

Sub-Requirement C1-3.1: Specifications and procedures for the data capture operations must be developed and implemented.

Examples of activities that specifications and procedures might address include:
• KFP data entry.
• Scanning systems for paper forms and maps (e.g., OMR and OCR).
• Operations to convert image files (e.g., fax image files and geographic image files) into data files (e.g., KFI data entry and PFIRS).

Sub-Requirement C1-3.2: Data capture systems and procedures must be verified and tested to ensure that all components function as intended.

Examples of verification and testing activities include:
• Verifying that data capture specifications reflect the system requirements.
• Verifying that data capture systems and software adhere to the specifications.
• Verifying that data capture systems and software capture data accurately.

Sub-Requirement C1-3.3: Training for the staff (including SSS contractors) who perform the data capture operations (as identified during planning) must be developed and provided.

Sub-Requirement C1-3.4: Systems and procedures must be developed and implemented to monitor and evaluate the quality of the data capture operations and to take corrective actions if problems are identified.

Examples of monitoring and evaluation activities include:
• Monitoring captured data (keyed or captured through an automated system) to ensure that it meets the specified accuracy requirements.
• Monitoring and documenting the frequency and types of errors.
• Taking corrective actions when data do not meet accuracy requirements (e.g., rejecting and repairing unacceptable batches, retraining key-entry staff, and adjusting automated systems and retesting).

Requirement C1-4: Documentation needed to replicate and evaluate the data capture operations must be produced. The documentation must be retained, consistent with applicable policies and data-use agreements, and must be made available to Census Bureau employees who need it to carry out their work. (See Statistical Quality Standard S2, Managing Data and Documents.)

Examples of documentation include:

• Plans, requirements, specifications, and procedures for the data capture system.
• Problems encountered and solutions implemented during the data capture operations.
• Quality measures from monitoring and evaluating the data capture operations (e.g., error rates). (See Statistical Quality Standard D3, Producing Measures and Indicators of Nonsampling Error.)

Notes:
(1) The documentation must be released on request to external users, unless the information is subject to legal protections or administrative restrictions that would preclude its release. (See Data Stewardship Policy DS007, Information Security Management Program.)
(2) Statistical Quality Standard F2, Providing Documentation to Support Transparency in Information Products, contains specific requirements about documentation that must be readily accessible to the public to ensure transparency of information products released by the Census Bureau.
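The accuracy monitoring described in Sub-Requirement C1-3.4 (verifying captured data against a source sample, computing batch error rates, rejecting unacceptable batches) can be sketched as below. The 1% maximum error rate is an illustrative assumption, not a Census Bureau accuracy requirement.

```python
# Hypothetical sketch of C1-3.4 monitoring: comparing a sample of captured
# fields to independently verified values and accepting or rejecting batches.

def batch_error_rate(captured, verified):
    """Fraction of verified fields where the captured value disagrees."""
    if not captured or len(captured) != len(verified):
        raise ValueError("samples must be nonempty and aligned")
    errors = sum(1 for c, v in zip(captured, verified) if c != v)
    return errors / len(captured)

def accept_batch(captured, verified, max_error_rate=0.01):
    """Accept the batch only if its error rate meets the requirement;
    rejected batches would be repaired and rekeyed."""
    return batch_error_rate(captured, verified) <= max_error_rate

if __name__ == "__main__":
    captured = ["1234", "MAIN ST", "O5"]   # OCR confused '0' with 'O'
    verified = ["1234", "MAIN ST", "05"]
    print(batch_error_rate(captured, verified))  # 0.3333333333333333
    print(accept_batch(captured, verified))      # False
```

Logging each disagreement by field type would also support the "frequency and types of errors" documentation named in the same sub-requirement.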

Statistical Quality Standard C2 Editing and Imputing Data

Purpose: The purpose of this standard is to ensure that methods are established and implemented to promote the accurate correction of missing and erroneous values in survey, census, and administrative records data through editing and imputation.

Scope: The Census Bureau's statistical quality standards apply to all information products released by the Census Bureau and the activities that generate those products, including products released to the public, sponsors, joint partners, or other customers. All Census Bureau employees and Special Sworn Status individuals must comply with these standards; this includes contractors and other individuals who receive Census Bureau funding to develop and release Census Bureau information products.

In particular, this standard applies to the development and implementation of editing and imputation operations for survey, census, administrative records, and geospatial data.

Exclusions: In addition to the global exclusions listed in the Preface, this standard does not apply to:
• Estimation methods, such as nonresponse adjustments, that compensate for missing data. Statistical Quality Standard D1, Producing Direct Estimates from Samples, addresses requirements for estimation methods.

Key Terms: Editing, truth deck, outliers, skip pattern, and imputation.

Requirement C2-1: Throughout all processes associated with editing and imputation, unauthorized release of protected information or administratively restricted information must be prevented by following federal laws (e.g., Title 13, Title 15, and Title 26), Census Bureau policies (e.g., Data Stewardship Policies), and additional provisions governing the use of the data (e.g., as may be specified in a memorandum of understanding or data-use agreement). (See Statistical Quality Standard S1, Protecting Confidentiality.)

Requirement C2-2: A plan must be developed that addresses:
1. Requirements for the editing and imputation systems.
2. Verification and testing of the editing and imputation systems.
3. Monitoring and evaluation of the quality of the editing and imputation operations.

Note: Statistical Quality Standard A1, Planning a Data Program, addresses overall planning requirements, including estimates of schedule and costs.

Requirement C2-3: Data must be edited and imputed using statistically sound practices, based on available information.

Sub-Requirement C2-3.1: Specifications and procedures for the editing and imputation operations must be developed and implemented to detect and correct errors or missing data in the files.

Examples of issues that specifications and procedures might address include:
• Checks of data files for missing data, duplicate records, and outliers (e.g., checks for possible erroneous extreme responses in income, price, and other such variables).
• Checks to verify the correct flow through prescribed skip patterns.
• Range checks or validity checks (e.g., to determine if numeric data fall within a prespecified range or if discrete data values fall within the set of acceptable responses).
• Consistency checks across variables within individual records to ensure non-contradictory responses (e.g., if a respondent is recorded as 5 years old and married, the record contains an error).
• Longitudinal consistency checks for data fields not measuring period-to-period changes.
• Editing and imputation methods and rules (e.g., internal consistency edits, longitudinal edits, hot deck edits, and analyst corrections).
• Addition of flags on the data files to clearly identify all imputed and assigned values and the imputation method(s) used.
• Retention of the unedited values in the file along with the edited or imputed values.
• Checks for topology errors in geospatial data (e.g., lack of coincidence between boundaries that should align, gaps, overshoots, and floating segments).
• Checks for address range errors in geographic data (e.g., parity inconsistencies, address range overlaps and duplicates, and address range direction irregularities).
• Checks for duplicate map features.
• Standardization of street name information in geographic data (e.g., consistency of abbreviations and directionals, and consistent formatting).
• Rules for when data not from the data collection qualify as "equivalent-quality-to-reported-data" for establishment data collections.

Sub-Requirement C2-3.2: Editing and imputation systems and procedures must be verified and tested to ensure that all components function as intended.
Examples of verification and testing activities include:
• Verifying that edit and imputation specifications reflect the requirements for the edit and imputation systems.
• Validating edit and imputation instructions or programming statements against specifications.
• Verifying that the imputation process is working correctly using test files.
• Verifying that edit and imputation outcomes comply with the specifications.
• Verifying that edit and imputation rules are implemented consistently.
• Verifying that the editing and imputation outcomes are consistent within records and consistent across the full file.
• Verifying that the editing and imputation outcomes that do not use randomization are repeatable.

Sub-Requirement C2-3.3: Systems and procedures must be developed and implemented to monitor and evaluate the quality of the editing and imputation operations and to take corrective actions if problems are identified.

Examples of monitoring and evaluation activities include:
• Monitoring and documenting the distributions of, and reasons for, edit and imputation changes to determine if corrections are needed in the system.
• Evaluating and documenting editing results for geospatial files (e.g., edits resulting in improvements in boundaries, feature coverage, and feature accuracy) and geographic files (e.g., address ranges, address parity, and geographic entity names and codes).
• Reviewing and verifying data when edits produce results that differ from the past.
• Using a truth deck to evaluate the accuracy of the imputed values.

Requirement C2-4: Documentation needed to replicate and evaluate the editing and imputation operations must be produced. The documentation must be retained, consistent with applicable policies and data-use agreements, and must be made available to Census Bureau employees who need it to carry out their work. (See Statistical Quality Standard S2, Managing Data and Documents.)

Examples of documentation include:
• Plans, requirements, specifications, and procedures for the editing and imputation systems, including edit rules.
• Distributions of changes from edits and imputations.
• Original responses (before editing/imputation) retained on data files along with the final edited/imputed responses.
• Problems encountered and solutions implemented during the editing and imputing operations.
• Quality measures from monitoring and evaluating the editing and imputation operations (e.g., imputation rates and edit change rates). (See Statistical Quality Standard D3, Producing Measures and Indicators of Nonsampling Error.)

Notes:
(1) The documentation must be released on request to external users, unless the information is subject to legal protections or administrative restrictions that would preclude its release. (See Data Stewardship Policy DS007, Information Security Management Program.)
(2) Statistical Quality Standard F2, Providing Documentation to Support Transparency in Information Products, contains specific requirements about documentation that must be readily accessible to the public to ensure transparency of information products released by the Census Bureau.
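A few of the C2-3.1 checks (a range check, a cross-variable consistency check, and a hot deck imputation that flags the imputed value while retaining the original) can be sketched as follows. The field names, the age threshold for the marital-status check, and the flag naming convention are illustrative assumptions, not Census Bureau edit rules.

```python
# Hypothetical sketch of C2-3.1 editing and imputation activities.

def range_check(record, field, lo, hi):
    """True if the value is present and falls within [lo, hi]."""
    v = record.get(field)
    return v is not None and lo <= v <= hi

def consistency_check(record):
    """Cross-variable check: a respondent recorded as very young should
    not also be recorded as married (the standard's 5-year-old example).
    The age threshold of 15 is illustrative."""
    if record.get("age") is not None and record.get("marital") == "married":
        return record["age"] >= 15
    return True

def hot_deck_impute(record, field, donor):
    """Fill a missing value from a donor record, retain the unedited
    value, and flag the field so imputed values are clearly identified."""
    if record.get(field) is None:
        record[f"{field}_unedited"] = record.get(field)
        record[field] = donor[field]
        record[f"{field}_imputed"] = True
    return record
```

A real hot deck would select the donor from a pool of similar responding records; here the donor is passed in directly to keep the sketch short.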

Statistical Quality Standard C3 Coding Data

Purpose: The purpose of this standard is to ensure that methods are established and implemented to promote the accurate assignment of codes, including geographic entity codes, to enable analysis and tabulation of data.

Scope: The Census Bureau's statistical quality standards apply to all information products released by the Census Bureau and the activities that generate those products, including products released to the public, sponsors, joint partners, or other customers. All Census Bureau employees and Special Sworn Status individuals must comply with these standards; this includes contractors and other individuals who receive Census Bureau funding to develop and release Census Bureau information products.

In particular, this standard applies to the development and implementation of post-collection coding operations, including the assignment of:
• Codes to convert text and numerical data into categories.
• Geographic entity codes (geocodes) and geographic attribute codes to distinguish and describe geographic entities and their characteristics within digital databases.

Exclusions: In addition to the global exclusions listed in the Preface, this standard does not apply to:
• Behavior coding activities associated with cognitive interviewing.

Key Terms: American National Standards Institute codes (ANSI codes), coding, geocoding, geographic entity code (geocode), Master Address File (MAF), North American Industry Classification System (NAICS), Standard Occupational Classification System (SOC), and Topologically Integrated Geographic Encoding and Referencing (TIGER).
Requirement C3-1: Throughout all processes associated with coding, unauthorized release of protected information or administratively restricted information must be prevented by following federal laws (e.g., Title 13, Title 15, and Title 26), Census Bureau policies (e.g., Data Stewardship Policies), and additional provisions governing the use of the data (e.g., as may be specified in a memorandum of understanding or data-use agreement). (See Statistical Quality Standard S1, Protecting Confidentiality.)

Requirement C3-2: A plan must be developed that addresses:
1. Required accuracy levels for the coding operations, including definitions of errors.
2. Requirements for the coding systems, including requirements for input and output files.
3. Verification and testing of the coding systems.
4. Training for staff involved in the clerical coding operations.
5. Monitoring and evaluation of the quality of the coding operations.

Notes:
(1) Statistical Quality Standard A1, Planning a Data Program, addresses overall planning requirements, including estimates of schedule and costs.

(2) The Census Bureau Guideline, Coding Verification, provides guidance on coding procedures.

Requirement C3-3: Processes must be developed and implemented to accurately assign codes for converting text and numerical data to categories and geocodes to identify and distinguish geographic entities and their attributes within a digital database.

Sub-Requirement C3-3.1: Specifications and procedures for the coding systems and operations must be developed and implemented.

Examples of issues that coding specifications and procedures might address include:
• A list and description of the admissible codes or values for each item on the questionnaire.
• A list of acceptable reference sources, printed and electronic, that may be used by the coding staff (e.g., Employer Name List).
• Procedures to add to the list of admissible codes or to add text responses to match existing codes.
• Consistency of codes across data collection periods.
• Procedures to assign and associate geocodes with other information within geographic files (e.g., the Master Address File/Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) database).

Sub-Requirement C3-3.2: Standardized codes, when appropriate, must be used to convert text data.

Examples of current coding standards include:
• American National Standards Institute (ANSI) Codes.
• North American Industry Classification System (NAICS).
• Standard Occupational Classification System (SOC).

Sub-Requirement C3-3.3: Coding systems must be verified and tested to ensure that all components function as intended.

Examples of verification and testing activities include:
• Verifying that coding specifications and procedures satisfy the coding requirements.
• Validating coding instructions or programming statements against specifications.
• Verifying that coding rules are implemented consistently.
• Using a test file to ensure that the codes are assigned correctly.

Sub-Requirement C3-3.4: Training for staff involved in clerical coding operations (as identified during planning) must be developed and provided.

Sub-Requirement C3-3.5: Systems and procedures must be developed and implemented to monitor and evaluate the quality of the coding operations and to take corrective actions if problems are identified.

Examples of monitoring and evaluation activities include:
• Establishing a quality control (QC) system to check coding outcomes and providing feedback to coders or taking other corrective action.
• Monitoring QC results (such as referral rates and error rates), determining the causes of systematic errors, and taking corrective action (e.g., providing feedback or retraining to coders, updating coder reference materials, or other corrective actions).
• Incorporating a geocode verification within automated instruments and correcting geocodes when errors are detected.
• Evaluating the accuracy of geocoding and determining the cause of errors in incorrect geocodes.
• Reviewing and updating coding guidelines.
• Reviewing software and procedures to reflect any changes in the coding guidelines.

Requirement C3-4: Documentation needed to replicate and evaluate the coding operations must be produced. The documentation must be retained, consistent with applicable policies and data-use agreements, and must be made available to Census Bureau employees who need it to carry out their work. (See Statistical Quality Standard S2, Managing Data and Documents.)

Examples of documentation include:
• Plans, requirements, specifications, and procedures for the coding systems.
• Problems encountered and solutions implemented during the coding operations.
• Quality measures from monitoring and evaluating the coding operations (e.g., error rates and referral rates). (See Statistical Quality Standard D3, Producing Measures and Indicators of Nonsampling Error.)

Notes:
(1) The documentation must be released on request to external users, unless the information is subject to legal protections or administrative restrictions that would preclude its release. (See Data Stewardship Policy DS007, Information Security Management Program.)
(2) Statistical Quality Standard F2, Providing Documentation to Support Transparency in Information Products, contains specific requirements about documentation that must be readily accessible to the public to ensure transparency of information products released by the Census Bureau.
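Two of the QC ideas in this standard, checking assigned codes against the list of admissible codes (C3-3.1) and computing a coder error rate from adjudicated QC cases (C3-3.5), can be sketched as below. The tiny NAICS subset and the case IDs are illustrative assumptions only.

```python
# Hypothetical sketch of C3 coding QC: admissible-code checks and a
# coder error rate computed against adjudicated quality-control codes.

ADMISSIBLE_NAICS = {"111110", "236115", "541511"}  # illustrative subset

def invalid_codes(assignments):
    """Return case IDs whose assigned code is not on the admissible list."""
    return [case_id for case_id, code in assignments.items()
            if code not in ADMISSIBLE_NAICS]

def coder_error_rate(coder_codes, adjudicated_codes):
    """Disagreement rate between a coder's codes and adjudicated QC codes
    over the cases both have coded."""
    keys = coder_codes.keys() & adjudicated_codes.keys()
    if not keys:
        return 0.0
    errors = sum(1 for k in keys if coder_codes[k] != adjudicated_codes[k])
    return errors / len(keys)
```

A persistently high error rate for one coder would point toward the feedback or retraining actions named in Sub-Requirement C3-3.5, while a high rate across coders would suggest a problem with the coding guidelines themselves.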

Statistical Quality Standard C4 Linking Data Records

Purpose: The purpose of this standard is to ensure that methods are established and implemented to promote the accurate linking of data records.

Scope: The Census Bureau's statistical quality standards apply to all information products released by the Census Bureau and the activities that generate those products, including products released to the public, sponsors, joint partners, or other customers. All Census Bureau employees and Special Sworn Status individuals must comply with these standards; this includes contractors and other individuals who receive Census Bureau funding to develop and release Census Bureau information products.

In particular, this standard applies to both automated and clerical record linkage used for statistical purposes. It covers linking that uses characteristics of an entity to determine whether multiple records refer to the same entity.

Exclusions: In addition to the global exclusions listed in the Preface, this standard does not apply to:
• Statistical attribute matching.
• Linkages performed using only a unique identifier (e.g., Protected Information Key or serial number).
• Linkages performed for quality assurance purposes.

Key Terms: Automated record linkage, blocking, clerical record linkage, field follow-up, record linkage, scoring weights, and statistical attribute matching.

Requirement C4-1: Throughout all processes associated with linking, unauthorized release of protected information or administratively restricted information must be prevented by following federal laws (e.g., Title 13, Title 15, and Title 26), Census Bureau policies (e.g., Data Stewardship Policies), and additional provisions governing the use of the data (e.g., as may be specified in a memorandum of understanding or data-use agreement). (See Statistical Quality Standard S1, Protecting Confidentiality.)

Requirement C4-2: A plan must be developed that addresses:
1. Objectives for linking the files.
2. Data sets and files to be linked.
3. Verification and testing of the linking systems and processes.
4. Training for staff involved in the clerical record linkage operations.
5. Evaluation of the results of the linkage (e.g., link rates and clerical error rates).

Notes:
(1) Statistical Quality Standard A1, Planning a Data Program, addresses overall planning requirements, including estimates of schedule and costs.

(2) The Data Stewardship Policy DS014, Record Linkage, states the principles that must be met for record linkage activities and includes a checklist that must be filled out before beginning record linkage activities.
(3) The Census Bureau Guideline, Record Linkage, provides guidance on procedures for automated and clerical record linkage.

Requirement C4-3: Record linkage processes must be developed and implemented to link data records accurately.

Sub-Requirement C4-3.1: Specifications and procedures for the record linkage systems must be developed and implemented.

Examples of issues that specifications and procedures for automated record linkage systems might address include:
• Criteria for determining a valid link.
• Linking parameters (e.g., scoring weights and the associated cut-offs).
• Blocking and linking variables.
• Standardization of the variables used in linking (e.g., state codes and geographic entity names are in the same format on the files being linked).

Examples of issues that specifications and procedures for clerical record linkage systems might address include:
• Criteria for determining that two records represent the same entity.
• Criteria for assigning records to a specific geographic entity or entities (i.e., geocoding).
• Linking variables.
• Guidelines for situations requiring referrals.
• Criteria for sending cases to field follow-up.

Sub-Requirement C4-3.2: Record linkage systems must be verified and tested to ensure that all components function as intended.

Examples of verification and testing activities for automated record linkage systems include:
• Verifying that the specifications reflect system requirements.
• Verifying that the systems and software implement the specifications accurately.
• Performing a test linkage to ensure systems work as specified.

Examples of verification and testing activities for clerical record linkage systems include:
• Verifying that the specifications reflect system requirements.
• Verifying that the instructions will accomplish what is expected.
• Testing computer systems that support clerical linking operations.

Sub-Requirement C4-3.3: Training for the staff involved in clerical record linkage (as identified during planning) must be developed and provided.

Examples of training activities include:

• Instructing clerks on how to implement the specifications.
• Providing a training database to give clerks a chance to practice their skills.
• Assessing error rates of clerks and providing feedback.

Sub-Requirement C4-3.4: Systems and procedures must be developed and implemented to monitor and evaluate the accuracy of the record linkage operations and to take corrective actions if problems are identified.

Examples of monitoring and evaluation activities for automated record linkage operations include:
• Evaluating the accuracy of automated linkages by a manual review.
• Monitoring link rates, investigating deviations from historical results, and taking corrective action if necessary.

Examples of monitoring and evaluation activities for clerical record linkage operations include:
• Establishing an acceptable error rate.
• Establishing quality control sampling rates.
• Monitoring clerks' error rates and referrals, and taking corrective action if necessary (e.g., feedback or retraining).

Requirement C4-4: Documentation needed to replicate and evaluate the linking operations must be produced. The documentation must be retained, consistent with applicable policies and data-use agreements, and must be made available to Census Bureau employees who need it to carry out their work. (See Statistical Quality Standard S2, Managing Data and Documents.)

Examples of documentation include:
• Plans, requirements, specifications, and procedures for the record linkage systems.
• Programs and parameters used for linking.
• Problems encountered and solutions implemented during the linking operations.
• Evaluation results (e.g., link rates and clerical error rates).

Notes:
(1) The documentation must be released on request to external users, unless the information is subject to legal protections or administrative restrictions that would preclude its release. (See Data Stewardship Policy DS007, Information Security Management Program.)
(2) Statistical Quality Standard F2, Providing Documentation to Support Transparency in Information Products, contains specific requirements about documentation that must be readily accessible to the public to ensure transparency of information products released by the Census Bureau.
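The automated-linkage concepts named in Sub-Requirement C4-3.1 — standardization of linking variables, blocking, and scoring weights with cut-offs — can be sketched as follows. This is a minimal illustration, not Census Bureau production code: the field names, agreement weights, and cut-off values are invented for the example.

```python
def standardize(rec):
    """Put linking variables in a common format (cf. the state-code and
    entity-name standardization example in Sub-Requirement C4-3.1)."""
    return {
        "state": rec["state"].strip().upper(),
        "name": " ".join(rec["name"].split()).upper(),
        "zip": rec["zip"].strip()[:5],
    }

# Illustrative agreement weights and decision cut-offs (assumptions).
WEIGHTS = {"name": 4.0, "zip": 2.0}
LINK_CUTOFF = 5.0      # score >= 5.0 -> accept as a link
REVIEW_CUTOFF = 2.0    # between the cut-offs -> clerical referral

def link(file_a, file_b):
    """Block on state, score candidate pairs by summed agreement
    weights, and classify each pair by the cut-offs."""
    a = [standardize(r) for r in file_a]
    b = [standardize(r) for r in file_b]
    results = []
    for i, ra in enumerate(a):
        for j, rb in enumerate(b):
            if ra["state"] != rb["state"]:   # blocking: skip other states
                continue
            score = sum(w for f, w in WEIGHTS.items() if ra[f] == rb[f])
            if score >= LINK_CUTOFF:
                results.append((i, j, "link", score))
            elif score >= REVIEW_CUTOFF:
                results.append((i, j, "clerical review", score))
    return results
```

Pairs scoring between the two cut-offs fall to clerical review, mirroring the referral guidelines the standard requires for clerical linkage.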

PRODUCING ESTIMATES AND MEASURES

D1 Producing Direct Estimates from Samples
D2 Producing Estimates from Models
D3 Producing Measures and Indicators of Nonsampling Error
Appendix D3-A: Requirements for Calculating and Reporting Response Rates: Demographic Surveys and Decennial Censuses
Appendix D3-B: Requirements for Calculating and Reporting Response Rates: Economic Surveys and Censuses

Statistical Quality Standard D1
Producing Direct Estimates from Samples

Purpose: The purpose of this standard is to ensure that statistically sound practices are used for producing direct estimates from samples for information products.

Scope: The Census Bureau's statistical quality standards apply to all information products released by the Census Bureau and the activities that generate those products, including products released to the public, sponsors, joint partners, or other customers. All Census Bureau employees and Special Sworn Status individuals must comply with these standards; this includes contractors and other individuals who receive Census Bureau funding to develop and release Census Bureau information products.

In particular, this standard applies to the production of direct estimates from samples and estimates of their variances for Census Bureau information products. The standard applies to estimates derived from:
• Samples selected for surveys or the Economic Census.
• Samples or subsamples selected for data analyses, evaluations, or quality assessments of surveys, censuses, or programs using administrative records.

Exclusions: In addition to the global exclusions listed in the Preface, this standard does not apply to:
• 100 percent enumerations.
• Activities related to producing estimates from models. (See Statistical Quality Standard D2, Producing Estimates from Models.)

Key Terms: Calibration, coefficient of variation (CV), coverage error, cross-sectional studies, direct estimates, estimation, generalized variance function, imputation, longitudinal studies, post-stratification, raking, ratio estimation, replication methods, sanitized data, and Taylor series method for variance estimation.
Requirement D1-1: Throughout all processes associated with estimation, unauthorized release of protected information or administratively restricted information must be prevented by following federal laws (e.g., Title 13, Title 15, and Title 26), Census Bureau policies (e.g., Data Stewardship Policies), and additional provisions governing the use of the data (e.g., as may be specified in a memorandum of understanding or data-use agreement). (See Statistical Quality Standard S1, Protecting Confidentiality.)

Requirement D1-2: A plan must be developed that addresses:
1. Key estimates that will be produced.
2. Estimation methodologies (e.g., population controls, post-stratification, nonresponse adjustments, ratio estimation, calibration, and raking).
3. Variance estimation methodologies (e.g., sampling formula variances, Taylor series (linearization) methods, replication methods, and generalized variance functions).
4. Verification and testing of the systems for generating estimates.
5. Verification of the estimates and evaluation of their quality.

Note: Statistical Quality Standard A1, Planning a Data Program, addresses overall planning requirements and development of schedules and costs.

Requirement D1-3: Estimates and their variances must be produced using statistically sound practices that account for the sample design and reduce the effects of nonresponse and coverage error.

Examples of statistically sound practices include:
• Calculating estimates and variances in ways that take into account the probabilities of selection, stratification, and clustering.
• Developing generalized variance formulas for computing variances.
• Using auxiliary data or performing post-sampling adjustments to improve the precision and the accuracy of estimates (e.g., ratio or raking weighting adjustments for unit nonresponse and post-stratification).
• Accounting for post-sampling adjustments when computing variances (e.g., imputation effects in variance estimates).
• Generating weights or adjustment factors to allow both cross-sectional and longitudinal estimates for longitudinal surveys.

Note: Statistical Quality Standard A3, Developing and Implementing a Sample Design, specifies requirements for the design and selection of probability samples used to produce estimates or make inferences.

Sub-Requirement D1-3.1: Specifications for the estimation systems must be developed and implemented.

Examples of issues that specifications might address include:
• Methodological requirements for generating the estimates and variances.
• Data files used or saved during the estimation process (e.g., files used for program validation, verification, and research).

Sub-Requirement D1-3.2: Estimation systems must be verified and tested to ensure that all components function as intended.

Examples of verification and testing activities include:
• Verifying that specifications conform to the estimation methodologies.
• Validating computer code against specifications.
• Verifying that the estimates are computed according to the specifications.
• Using subject matter and statistical experts to review the estimation methodology.
• Conducting peer reviews (e.g., reviews of specifications, design documents, and programming code).
• Conducting verification and validation tests.
• Conducting internal user acceptance tests for estimation software.
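As a concrete, hypothetical illustration of the first practice under Requirement D1-3 — estimates that take the probabilities of selection into account — the following sketch computes a Horvitz-Thompson total and a per-stratum variance under simple random sampling without replacement. These are textbook forms, not a production estimation system.

```python
def ht_total(values, probs):
    """Horvitz-Thompson estimate of a population total: sum(y_i / pi_i),
    weighting each observation by its inverse probability of selection."""
    return sum(y / p for y, p in zip(values, probs))

def srswor_total_variance(values, N):
    """Estimated variance of the expanded total for one stratum under
    simple random sampling without replacement:
    N^2 * (1 - n/N) * s^2 / n, with s^2 the sample variance."""
    n = len(values)
    mean = sum(values) / n
    s2 = sum((y - mean) ** 2 for y in values) / (n - 1)
    return N ** 2 * (1 - n / N) * s2 / n
```

For a stratified design the stratum variances are summed; clustering and post-sampling adjustments would require the fuller methods the standard lists.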

Sub-Requirement D1-3.3: Methods and systems must be developed and implemented to verify the estimates and evaluate their quality.

Examples of verification and evaluation activities include:
• Comparing current estimates against historical results.
• Comparing the estimates derived from the survey to other independent collections of similar data.
• Comparing coefficients of variation (CVs) or variances of the estimates against historical results.
• Examining relationships among the estimates.
• Conducting studies to evaluate the performance of variance estimates.

Note: Statistical Quality Standard D3, Producing Measures and Indicators of Nonsampling Error, provides requirements for measuring and evaluating nonsampling error.

Requirement D1-4: Documentation needed to replicate and evaluate the estimation operations must be produced. The documentation must be retained, consistent with applicable policies and data-use agreements, and must be made available to Census Bureau employees who need it to carry out their work. (See Statistical Quality Standard S2, Managing Data and Documents.)

Examples of documentation include:
• Plans, requirements, specifications, and procedures for the estimation systems.
• Final weighting specifications, including calculations for how the final sample weights are derived.
• Final variance estimation specifications.
• Computer source code.
• Data files with weighted data and any design parameters that would be needed to replicate estimates and variances.
• Methodological documentation.
• Quality measures and evaluation results. (See Statistical Quality Standard D3, Producing Measures and Indicators of Nonsampling Error.)

Notes:
(1) The documentation must be released on request to external users, unless the information is subject to legal protections or administrative restrictions that would preclude its release. (See Data Stewardship Policy DS007, Information Security Management Program.)
(2) Statistical Quality Standard F2, Providing Documentation to Support Transparency in Information Products, contains specific requirements about documentation that must be readily accessible to the public to ensure transparency of information products released by the Census Bureau.
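The "replication methods" listed among the variance methodologies in Requirement D1-2 can be illustrated with a toy delete-one jackknife; production surveys would instead use replicate weights that carry the full design.

```python
def jackknife_variance(values, estimator):
    """Delete-one jackknife: recompute the estimate with each unit removed
    in turn, then combine the replicate estimates:
    v = (n-1)/n * sum((t_(i) - t_bar)^2)."""
    n = len(values)
    replicates = [estimator(values[:i] + values[i + 1:]) for i in range(n)]
    t_bar = sum(replicates) / n
    return (n - 1) / n * sum((t - t_bar) ** 2 for t in replicates)

def sample_mean(v):
    """Toy estimator; any statistic of the sample could be substituted."""
    return sum(v) / len(v)
```

For the sample mean this reproduces the familiar s²/n, which is a handy check when verifying a replication system against its specifications.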

Statistical Quality Standard D2
Producing Estimates from Models

Purpose: The purpose of this standard is to ensure that statistically sound practices are used to generate estimates from models for information products.

Scope: The Census Bureau's statistical quality standards apply to all information products released by the Census Bureau and the activities that generate those products, including products released to the public, sponsors, joint partners, or other customers. All Census Bureau employees and Special Sworn Status individuals must comply with these standards; this includes contractors and other individuals who receive Census Bureau funding to develop and release Census Bureau information products.

In particular, this standard applies to the production of estimates from models for Census Bureau information products. This standard applies to models (e.g., regression, economic, and log-linear) used to produce estimates, such as:
• Small domain estimates, including small area estimates.
• Demographic estimates and projections.
• Seasonal adjustment of estimates.
• Census coverage estimates.
• Synthetic data to protect microdata from disclosure.

Exclusions: In addition to the global exclusions listed in the Preface, this standard does not apply to:
• Models that are not used to produce estimates for Census Bureau information products (e.g., models used for imputation or disclosure avoidance, which are addressed in Statistical Quality Standard C2, Editing and Imputing Data, and Statistical Quality Standard S1, Protecting Confidentiality, respectively).

Key Terms: Autocorrelation function, autoregressive integrated moving average (ARIMA), cross-validation, goodness-of-fit, heteroscedastic, homoscedastic, model, model validation, Monte Carlo simulation, multicollinearity, projection, regression, residual, revisions history, sanitized data, seasonal adjustment, sensitivity analysis, sliding spans, small area estimation, and spectral graphs.

Requirement D2-1: Throughout all processes associated with estimation, unauthorized release of protected information or administratively restricted information must be prevented by following federal laws (e.g., Title 13, Title 15, and Title 26), Census Bureau policies (e.g., Data Stewardship Policies), and additional provisions governing the use of the data (e.g., as may be specified in a memorandum of understanding or data-use agreement). (See Statistical Quality Standard S1, Protecting Confidentiality.)

Requirement D2-2: A plan must be developed that addresses:
1. Purpose and rationale for using a model (e.g., data to compute precise estimates are not available, or modeling with additional data will provide more accuracy).
2. Key estimates that will be generated and the domain of application for the model.

3. Methodologies and assumptions related to the model, such as the:
a. Model structure (e.g., functional form, variables and parameters, error structure, and domain of interest).
b. Model estimation procedure (e.g., least squares estimation, maximum likelihood estimation, and demographic estimation methods).
c. Data source and how the data will be used in the model, including key modifications to the data.
4. Criteria for assessing the model fit (e.g., goodness-of-fit statistics and R-squared) and the model specification (e.g., measures of multicollinearity).
5. Verification and testing of the systems for generating estimates.
6. Verification of the modeled estimates and evaluation of their quality.

Note: Statistical Quality Standard A1, Planning a Data Program, addresses overall planning requirements, including estimates of schedule and costs.

Requirement D2-3: Models must be developed and implemented using statistically sound practices.

Examples of statistically sound model development practices include:
• Ensuring definitions of variables are accurate (e.g., definitions of the geographic areas used in the model, and eligibility criteria in administrative records).
• Specifying a model that has a basis in verified empirical relationships.
• Examining preliminary model results for internal consistency and to ensure that logical relationships among the data are maintained (e.g., population estimates are not negative, and sub-domains (e.g., counties) sum to super-domains (e.g., states)).
• Estimating measures of statistical uncertainty (e.g., prediction error variances, measures of error associated with using synthetic data, or the Bayesian equivalents of these measures).
• Modifying the functional form, the variables, or the parameters of the model to address problems revealed by the model diagnostics and error estimates.
• Having experts perform a methodological review.
• Producing estimates using weighted data, when appropriate.
• Providing justification that the sample design and selection are adequately accounted for in the estimation process.

Examples of statistically sound practices for demographic estimates and projections include:
• Basing assumptions about future relationships among variables on empirical data or on assumptions that are considered statistically sound.
• Comparing raked and unraked data to ensure logical relationships are maintained.
• Providing quantitative or qualitative assessments of uncertainty for each estimated or projected data point, whenever possible.

Examples of statistically sound practices for seasonal adjustments include:
• Before the first seasonal adjustment of a series, conducting a seasonal analysis to determine whether seasonal patterns exist, and periodically repeating the analysis.
• Seasonally adjusting a time series only when data exhibit seasonal patterns.

• Seasonally adjusting only those component series that show identifiable seasonality for aggregate series derived from the combination of component series.
• Using autoregressive integrated moving average (ARIMA) extrapolations in calculating seasonal factors (e.g., the X-12-ARIMA method).
• Reviewing appropriate modeling and seasonal adjustment diagnostics (e.g., revisions history, spectral graphs, plots of the sample autocorrelation function of the model residuals, forecast performance, and sliding spans) for valuable information about model adequacy and adjustment stability.

Sub-Requirement D2-3.1: Model results must be evaluated and validated, and the results of the evaluation and validation must be documented.

Examples of evaluation and validation activities include:
• Validating the model by comparing with independent information sources.
• Generating and reviewing goodness-of-fit statistics (e.g., R-squared and F-tests).
• Generating and reviewing model diagnostics and graphical output (e.g., reviewing for outliers, multicollinearity, heteroscedasticity, homoscedasticity, and influential observations).
• Cross-validating the model using a subset of data withheld from the model fitting.
• Conducting sensitivity analyses to violations of the assumptions (e.g., Monte Carlo simulations).

Note: Evaluation and validation is required when the model is developed. Models used in a continuing production setting must be re-evaluated periodically as appropriate.

Sub-Requirement D2-3.2: Specifications for the modeling and estimation systems must be developed and implemented.

Examples of issues that specifications might address include:
• Descriptions of data files to be used in the model.
• Equations for computing estimates and variances.
• Instructions for running production software.
• Estimation algorithms.
• Convergence criteria for iterative models.
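The cross-validation bullet under Sub-Requirement D2-3.1 — withholding a subset of data from the model fitting — can be sketched as follows; the simple linear model and the holdout indices are illustrative assumptions, not a prescribed method.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b  # intercept a, slope b

def holdout_mse(xs, ys, holdout):
    """Fit on units not in `holdout`; return the mean squared prediction
    error on the withheld units."""
    train = [i for i in range(len(xs)) if i not in holdout]
    a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
    return sum((ys[i] - (a + b * xs[i])) ** 2 for i in holdout) / len(holdout)
```

A large holdout error relative to the in-sample fit is the kind of diagnostic that would prompt the model revisions described under Requirement D2-3.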
Sub-Requirement D2-3.3: Estimation systems must be verified and tested to ensure that all components function as intended.

Examples of verification and testing activities include:
• Using subject matter and statistical experts to review the estimation methodology.
• Checking that the appropriate equations were used.
• Verifying that the specifications reflect requirements.
• Validating computer code against specifications.
• Assessing computer code to ensure that the appropriate data and variables are used and the code is correctly programmed.

• Performing test runs and debugging computer code.
• Using different random starts to ensure models using maximum likelihood estimates converge consistently.

Sub-Requirement D2-3.4: Methods and systems must be developed and implemented to verify the modeled estimates and evaluate their quality.

Examples of verification and evaluation activities include:
• Performing sensitivity analyses using alternative assumptions to inform users of model stability.
• Examining measures of statistical uncertainty.
• Ensuring that variances reflect both sampling error and modeling error.
• Comparing production estimates against comparable data from other sources, including previous estimates for the program or projections from prior cycles.
• Reviewing goodness-of-fit statistics and model diagnostics and documenting unexpected results to aid the revision of the model for the next cycle.
• Reviewing (during each seasonal adjustment run) newly identified outliers and changes to previously identified extreme values that may cause large revisions in the seasonally adjusted series.

Note: Statistical Quality Standard D3, Producing Measures and Indicators of Nonsampling Error, provides requirements for measuring and evaluating nonsampling error.

Sub-Requirement D2-3.4.1: The seasonal adjustment process and results must be reviewed annually by the program manager (or the appropriate mathematical statistician) to identify needed changes in the X-12-ARIMA specification files. Using the required secure data transmission protocols, the program manager (or the appropriate mathematical statistician) must provide the following to the Time Series Methods Staff (TSMS) of the Office of Statistical Methods and Research for Economic Programs (OSMREP):
1. The new final X-12-ARIMA specification files and the data used.
2. The revised X-12-ARIMA specification file and the data used, whenever the seasonal adjustment options must be changed outside of the annual review period.
This information must be provided immediately after release of the adjusted data.

Sub-Requirement D2-3.4.2: For indicator releases, any routine revisions to the annual review process, such as benchmarking and updating of seasonality factors, must be consolidated and released simultaneously. See Statistical Policy Directive No. 3. Deviations from this requirement must be approved as specified in the directive.

Requirement D2-4: Documentation needed to replicate and evaluate the modeling activities must be produced. The documentation must be retained, consistent with applicable policies and data-use agreements, and must be made available to Census Bureau employees who need it to carry out their work. (See Statistical Quality Standard S2, Managing Data and Documents.)

Examples of documentation include:
• Plans, requirements, specifications, and procedures for the estimation systems.

• Data files with weighted and unweighted data.
• Computer source code.
• Results of outlier analyses, including information on cause of outliers, if available.
• Results of model diagnostics.
• Output data file with "predicted" results for every unit of analysis.
• Seasonal adjustment diagnostic measures (e.g., revisions history values and graphs, spectral graphs, forecast error values and graphs, and sliding spans results).
• Error estimates, parameter estimates, and overall performance statistics (e.g., goodness-of-fit and other such statistics).
• Methodologies used to improve the estimates.
• Quality measures and evaluation results. (See Statistical Quality Standard D3, Producing Measures and Indicators of Nonsampling Error.)

Notes:
(1) The documentation must be released on request to external users, unless the information is subject to legal protections or administrative restrictions that would preclude its release. (See Data Stewardship Policy DS007, Information Security Management Program.)
(2) Statistical Quality Standard F2, Providing Documentation to Support Transparency in Information Products, contains specific requirements about documentation that must be readily accessible to the public to ensure transparency of information products released by the Census Bureau.

Statistical Quality Standard D3
Producing Measures and Indicators of Nonsampling Error

Purpose: The purpose of this standard is to ensure that measures and indicators of nonsampling error are computed and documented to allow users to interpret the results in information products, to provide transparency regarding the quality of the data, and to guide improvements to the program.

Scope: The Census Bureau's statistical quality standards apply to all information products released by the Census Bureau and the activities that generate those products, including products released to the public, sponsors, joint partners, or other customers. All Census Bureau employees and Special Sworn Status individuals must comply with these standards; this includes contractors and other individuals who receive Census Bureau funding to develop and release Census Bureau information products.

In particular, this standard applies to activities associated with producing measures or indicators of nonsampling error associated with estimates for Census Bureau information products.

Examples of nonsampling error sources include:
• Nonresponse (e.g., bias from household/establishment nonresponse, person nonresponse, and item nonresponse).
• Coverage (e.g., listing error, duplicates, undercoverage, overcoverage, and mismatches between the frame of administrative records and the universe of interest for the information product).
• Processing (e.g., errors due to coding, data entry, editing, weighting, linking records, disclosure avoidance methods, and inaccuracies of assumptions used to develop estimates).
• Measurement (e.g., errors due to interviewer and respondent behavior, data collection instrument design, data collection modes, definitions of reference periods, reporting unit definitions, and inconsistencies in administrative records data).
Exclusions: In addition to the global exclusions listed in the Preface, this standard does not apply to:
• Errors strictly associated with a modeling methodology. Statistical Quality Standard D2, Producing Estimates from Models, addresses these types of error.

Key Terms: Convenience sample, coverage, coverage error, coverage ratio, equivalent quality data, item allocation rate, item nonresponse, key variables, latent class analysis, longitudinal survey, measurement error, nonresponse bias, nonresponse error, nonsampling error, probability of selection, quantity response rate, reinterview, release phase, respondent debriefing, response analysis survey, total quantity response rate, and unit nonresponse.

Requirement D3-1: Throughout all processes associated with producing measures and indicators of nonsampling error, unauthorized release of protected information or administratively restricted information must be prevented by following federal laws (e.g., Title 13, Title 15, and Title 26), Census Bureau policies (e.g., Data Stewardship Policies), and additional provisions governing the use of the data (e.g., as may be specified in a memorandum of understanding or data-use agreement). (See Statistical Quality Standard S1, Protecting Confidentiality.)

Requirement D3-2: A plan must be developed that addresses:
1. The general measures and indicators of nonsampling error that will be produced (e.g., coverage ratios, unit nonresponse rates, item nonresponse rates, data entry error rates, coding error rates, and interviewer quality control (QC) results).
2. Any special evaluations to be conducted (e.g., studies of interviewer variance, measurement error, and nonresponse bias). Identify the:
a. Motivation for the study.
b. Types of errors addressed by the study.
c. Measures and indicators to be generated.
d. Data needed to conduct the evaluation and their sources.
e. Methods for collecting and analyzing the data.
3. Verification and testing of systems for producing measures and indicators of nonsampling error.
4. Evaluating the measures and indicators to guide improvements to the program.

Note: Statistical Quality Standard A1, Planning a Data Program, addresses overall planning requirements, including estimates of schedule and costs.

Requirement D3-3: Except in the situations noted below, weighted response rates must be computed to measure unit and item nonresponse. The weights must account for selection probabilities, including probabilities associated with subsampling for nonresponse follow-up. Response rates may be computed using unweighted data when:
1. Monitoring and managing data collection activities.
2. Making comparisons with surveys using unweighted response rates.
3. Using weighted response rates would disrupt a time series.
4. A weighted response rate would be misleading because the sampling frame population in an establishment survey is highly skewed, and a stratified sample design is employed. (See Sub-Requirement D3-3.2.)
5. The Census Bureau simply collects data for a sponsor and performs no post-collection estimation.
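The weighted rates required by Requirement D3-3 can be sketched as below: base weights are the inverse selection probabilities, and the rate is the weighted share of eligible units that responded. This is a generic textbook form; the binding formulas are those in Appendices D3-A and D3-B.

```python
def weighted_unit_response_rate(cases):
    """cases: (selection_probability, eligible, responded) per sample unit.
    Returns weighted respondents divided by weighted eligible units,
    weighting each unit by 1 / selection_probability."""
    responded = sum(1 / p for p, eligible, r in cases if eligible and r)
    eligible = sum(1 / p for p, eligible, r in cases if eligible)
    return responded / eligible
```

Dropping the 1/p weights gives the unweighted rate permitted in the monitoring and comparison situations listed above.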
Note: In general, computing response rates is not appropriate for samples that are not randomly selected (e.g., convenience samples or samples with self-selected respondents).

Sub-Requirement D3-3.1: For demographic surveys and decennial censuses, when computing unit response rates, item response rates or item allocation/imputation rates (for key variables), and total item response rates:
1. Standard formulas must be used. (See Appendix D3-A.)
2. The final edited data or edited outcome codes must be used, when available. If the final edited data are not used to compute the response rates, it must be noted.
3. The definition or threshold of a sufficient partial interview must be noted if partial interviews are counted as interviews.

Sub-Requirement D3-3.2: For economic surveys and censuses, when computing unit response rates, quantity response rates (for key variables), and total quantity response rates:
1. Standard formulas must be used. (See Appendix D3-B.)
2. The type of response rate must be noted: unweighted response rate, quantity response rate, or total quantity response rate.
3. The variable used in computing the response rate must be noted (e.g., total retail sales of an establishment).
4. The definition of responding units must be noted.
5. For total quantity response rates, the sources of equivalent quality data for nonresponding tabulation units must be listed (e.g., administrative records or qualified other sources such as Securities and Exchange Commission (SEC) filings or company annual reports).
6. The edited data at the time of each estimate's release phase must be used, when available.
7. The final edited data for the final release must be used, when available. If the final edited data are not used to compute the response rates, it must be noted.

Sub-Requirement D3-3.3: Rates for the types of nonresponse (e.g., refusal, unable to locate, no one home, temporarily absent, language problem, insufficient data, or undeliverable as addressed) must be computed to facilitate the interpretation of the unit response rate and to better manage resources.

Sub-Requirement D3-3.4: For panel or longitudinal surveys, cumulative response rates must be computed using weighted data, or cumulative total quantity response rates must be computed, to reflect the total attrition of eligible units over repeated waves of data collection. If a survey uses respondents from another survey or census as its sampling frame, then the response rate of the survey (or census) serving as the frame must be included in the computation of the cumulative response rate.
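The quantity and total quantity response rates of Sub-Requirement D3-3.2 can be sketched as the share of an estimated total (e.g., total retail sales) accounted for by respondent reports, with equivalent-quality sources added for the total quantity version. The unit tuples and source labels below are invented for illustration; the standard formulas are those in Appendix D3-B.

```python
def quantity_response_rates(units):
    """units: (weight, quantity, source) tuples, where source is
    'respondent', 'equivalent' (e.g., administrative records), or 'imputed'.
    Returns (quantity response rate, total quantity response rate)."""
    total = sum(w * q for w, q, s in units)
    reported = sum(w * q for w, q, s in units if s == "respondent")
    equivalent = sum(w * q for w, q, s in units if s == "equivalent")
    return reported / total, (reported + equivalent) / total
```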
Sub-Requirement D3-3.5: Cumulative response rates must be computed using weighted data over successive stages of multistage data collections (e.g., a screening interview followed by a detailed interview). If estimated probabilities of selection must be used and the accuracy of the response rate might be affected, then a description of the issues affecting the response rate must also be provided.

Note: In most situations, a simple multiplication of response rates for each stage is appropriate. In other situations, a more complex computation may be required.

Sub-Requirement D3-3.6: Nonresponse bias analyses must be conducted when unit, item, or total quantity response rates for the total sample or important subpopulations fall below the following thresholds:
1. The threshold for unit response rates is 80 percent.
2. The threshold for item response rates of key items is 70 percent.
3. The threshold for total quantity response rates is 70 percent. (Thresholds 1 and 2 do not apply for surveys that use total quantity response rates.)

Note: If response rates fall below these thresholds in a reimbursable data collection, the sponsor is responsible for conducting the nonresponse bias analysis.
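The usual case under Sub-Requirement D3-3.5 — simple multiplication of per-stage response rates — and the unit-response threshold of Sub-Requirement D3-3.6 can be sketched together; the stage rates in the example are invented.

```python
UNIT_RR_THRESHOLD = 0.80   # Sub-Requirement D3-3.6, threshold 1

def cumulative_response_rate(stage_rates):
    """Product of per-stage (or per-wave) response rates — the simple
    multiplication the note describes as appropriate in most situations."""
    cumulative = 1.0
    for rate in stage_rates:
        cumulative *= rate
    return cumulative

def needs_nonresponse_bias_analysis(unit_response_rate):
    """True when the unit response rate falls below 80 percent."""
    return unit_response_rate < UNIT_RR_THRESHOLD
```

A screening stage at 90 percent followed by a detailed interview at 85 percent yields a cumulative rate of 76.5 percent, below the 80 percent threshold, so a nonresponse bias analysis would be triggered.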

Requirement D3-4: Coverage ratios must be computed to measure coverage error, as an indicator of potential bias, using statistically sound methods (e.g., computing coverage ratios as the uncontrolled estimate of population for a demographic-by-geographic group divided by the population control total for the demographic-by-geographic cell used in post-stratification adjustments, or using capture-recapture methods).

Note: If computing coverage ratios is not appropriate, a description of the efforts undertaken to ensure high coverage must be made available.

Requirement D3-5: Measures or indicators of nonsampling error associated with data from administrative records must be computed to inform users of the quality of the data.

Examples of measures and indicators include:
• Coverage of the target population by the set of administrative records.
• The proportion of administrative records that have missing data items or that have been imputed to address missing data.
• The proportion of data items with edit changes because the data items were invalid.
• The proportion of records lost from the analysis or estimate due to nonmatches between linked data sets.

Requirement D3-6: Measures or indicators of nonsampling error associated with data collection and processing activities must be computed to inform users of the quality of the data.

Examples of indicators of nonsampling error include:
• Error rates for data entry/data capture operations.
• Error rates and referral rates for coding operations.
• Imputation rates and edit change rates for editing and imputation operations.

Examples of analyses or studies that generate measures or indicators of nonsampling error include:
• Geocoding evaluation studies (e.g., address matching rates and analysis of rates of allocation to higher level geographic entities based on postal place-name or ZIP Code matches).
 Analyses of geospatial accuracy (e.g., analys is of locational information in relation to geodetic control points).  Response error evaluation studies (e.g., re interview and latent class analysis).  Interviewer variance studies.  Respondent debr iefing studies.  Response analysis surveys.  Record check or validation studies.  Mode effect studies. Requirement D3-7: Methods and systems for calcula ting measures and indicators of nonsampling error must be verified and tested to ensure all component s function as intended. nd testing activities include: Examples of verification a 59

- Verifying that calculations are correct.
- Validating computer code against specifications.
- Conducting peer reviews of specifications and coding.
- Using test data to check computer programs.

Requirement D3-8: Measures and indicators of nonsampling error must be evaluated to guide improvements to the program.

Examples of evaluation activities include:
- Analyzing the quality control results of processing systems (e.g., error rates from clerical coding and clerical record linkage) and developing improvements to the systems (e.g., improving clerical coding tools or improving training for clerks).
- Evaluating the results of nonsampling error studies (e.g., response analysis surveys, respondent debriefing studies, and response error reinterview studies) and implementing improvements (e.g., revising questionnaire wording for problematic questions, revising interviewer procedures, or revising interviewer training).
- Analyzing the results of interviewer quality control systems (e.g., Quality Control (QC) reinterviews, Computer Assisted Telephone Interviewing (CATI) monitoring, and observations) and developing improvements (e.g., improving interviewer training programs or revising questionnaires to address systemic problems).

Requirement D3-9: Documentation needed to replicate and evaluate the activities associated with producing measures and indicators of nonsampling error must be produced. The documentation must be retained, consistent with applicable policies and data use agreements, and must be made available to Census Bureau employees who need it to carry out their work. (See Statistical Quality Standard S2, Managing Data and Documents.)

Examples of documentation include:
- Plans, requirements, specifications, and procedures for the systems.
- Computer source code.
- Results of quality control activities.
- Results of nonsampling error studies and evaluations.
- Quality measures and indicators (e.g., final coverage ratios and response rates).

Notes:
(1) The documentation must be released on request to external users, unless the information is subject to legal protections or administrative restrictions that would preclude its release. (See Data Stewardship Policy DS007, Information Security Management Program.)
(2) Statistical Quality Standard F2, Providing Documentation to Support Transparency in Information Products, contains specific requirements about documentation that must be readily accessible to the public to ensure transparency of information products released by the Census Bureau.
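The coverage-ratio computation named in Requirement D3-4 (uncontrolled estimate divided by the population control total for a demographic-by-geographic cell) can be sketched as below; the cell labels and numbers are hypothetical.

```python
# Illustrative sketch of the coverage-ratio computation in Requirement D3-4
# (not official Census Bureau code). The ratio compares the uncontrolled survey
# estimate for a demographic-by-geographic cell to that cell's population
# control total, as used in post-stratification adjustments.

def coverage_ratio(uncontrolled_estimate, control_total):
    """Coverage ratio for one demographic-by-geographic cell."""
    if control_total <= 0:
        raise ValueError("control total must be positive")
    return uncontrolled_estimate / control_total

# Hypothetical cells: {(demographic group, geography): (estimate, control total)}
cells = {
    ("males 18-29", "Region A"): (46_200.0, 55_000.0),
    ("females 30-44", "Region A"): (61_750.0, 65_000.0),
}
for cell, (est, control) in cells.items():
    # Ratios below 1.0 indicate undercoverage of the cell.
    print(cell, round(coverage_ratio(est, control), 3))
```
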

Appendix D3-A
Requirements for Calculating and Reporting Response Rates: Demographic Surveys and Decennial Censuses

1. Terms and Variables

The variables needed to calculate demographic survey and decennial census response rates are based on classifications suggested by the American Association for Public Opinion Research (AAPOR), 2008. This effort helps to ensure consistency with external standards while allowing the Census Bureau to adapt the classification to our specific circumstances.

The terms and variables are partitioned into three sections. The first section describes eligibility status. Variables in this section distinguish among sample units that are known to be eligible for data collection, are known to be ineligible for data collection, or have an unknown eligibility for data collection. The data collection target population guides the distinction between eligible and ineligible units. The second section describes the response status for eligible sample units. The third section provides detail on nonrespondents by identifying the type of (or the reason for) the nonresponse.

1.1 Eligibility Status

The total number of units selected for a sample is defined as n. These units can be classified by their eligibility status: eligible for data collection (E), ineligible for data collection (I), or of unknown eligibility (U). The target population determines the classification of a unit as eligible or ineligible. The target population refers to persons, households, or other units upon which inferences (estimates) are made. Specific units may be considered eligible for one census or survey but ineligible for another, depending upon the target population. For example, in a survey of housing, vacant units may be part of the target population, but these same vacant units may be outside the target population in an income survey and would therefore be classified as ineligible.
Variable: p_i (Probability of selection)
Definition: Probability of selecting a unit for the sample, including all subsampling, even subsampling for nonresponse follow-up.

Variable: w_i (Sample weight)
Definition: The inverse of the final probability of selecting a unit for the sample, including all subsampling, such as subsampling for nonresponse follow-up.
Computation: w_i = 1 / p_i

Term: E (Eligible)
Definition: The weighted count of sample units that are eligible for data collection. A person, household, or other unit is eligible if an attempt has been made to collect data and the unit is confirmed to be a member of the target population. Both occupied and vacant units can be considered eligible.
Variable: e_i – An indicator variable for whether a unit selected for the sample is eligible for data collection. If a sample unit is eligible, e_i = 1, else e_i = 0.
Computation: Sum of the sample weight for all eligible units.
E = Σ_{i=1}^{n} (w_i * e_i)
Reference: Equivalent to the sum of AAPOR "Interview" disposition code (1.0) and "Eligible, non-interview" disposition code (2.0).

Term: I (Ineligible)
Definition: The weighted count of sample units that are ineligible for data collection. This is the number of units for which an attempt has been made to collect data and it is confirmed that the unit is not a member of the target population.
Variable: i_i – An indicator variable for whether a unit selected for the sample is confirmed as not being a member of the target population at the time of data collection. Information confirming ineligibility may come from observation, from a respondent, or from another source. Some examples of ineligible units include: demolished structure, entire household in armed forces, unit under construction, unit screened out, nonresidential unit, fax/data line or disconnected number (in random-digit dial surveys), and vacant unit. If a sample unit is ineligible, i_i = 1, else i_i = 0.
Computation: Sum of the sample weight for all ineligible units.
I = Σ_{i=1}^{n} (w_i * i_i)
Reference: Equivalent to AAPOR "Not Eligible" disposition code (4.0).

Term: U (Unknown eligibility)
Definition: The weighted count of sample units for which eligibility is unknown.
Variable: u_i – An indicator variable for whether the eligibility of a unit selected for the sample could not be determined.
This occurs if data are not collected from a unit and there is no information available about whether or not the unit is a member of the target population. Some examples of units with unknown eligibility include: unable to locate unit, unable to reach/unsafe area, address never assigned/worked, or number always busy or call screening/blocking (in random-digit dial surveys). If a sample unit is of unknown eligibility, u_i = 1, else u_i = 0.
Computation: Sum of the sample weight for all units with an unknown eligibility.
U = Σ_{i=1}^{n} (w_i * u_i)

Note: Surveys that have a large number of units with unknown eligibility (e.g., random-digit-dial surveys) may estimate the proportion of cases of unknown eligibility that are eligible, ee. This estimated proportion may be used to adjust the estimates of E and I. The survey must have a defensible basis for estimating ee (e.g., assume that the ratio of eligible to not eligible cases among the known cases applies to the unknown cases). Without a defensible basis, ee may not be used to adjust the estimates of E and I. The number of eligible units may be adjusted by adding (ee * U) to E. The number of ineligible units may be adjusted by adding (U - (ee * U)) to I. The basis for estimating ee must be stated explicitly and the justification described clearly.
Reference: Equivalent to AAPOR "Unknown Eligibility, Non-Interview" disposition code (3.0).

Term: T (Total count)
Definition: The weighted count of all units (eligible, ineligible, and of unknown eligibility) selected for the sample.
Computation: Sum of the sample weights for the eligibility status outcome of all units.
T = Σ_{i=1}^{n} [w_i * (e_i + i_i + u_i)]
The relationship between T and E, I, U is T = E + I + U. For the i-th unit, e_i + i_i + u_i = 1.

1.2 Response Status

Response status is determined only for eligible sample units. The definition of sufficient data for a unit to be classified as a response will vary across surveys and will impact the count of responding units.

Term: R (Response)
Definition: The weighted count of eligible sample units with sufficient data to be classified as a response. In a multi-mode survey or census, responses may be obtained by mail, Internet, telephone, fax, touch-tone data entry/voice recognition, or personal visit.
Variable: r_i – An indicator variable for whether an eligible unit selected for the sample responded to the survey and provided sufficient data.
If a unit responded, r_i = 1, else r_i = 0 (note r_i = 0 for units classified as U or I and units that did not respond with sufficient data).
Computation: Sum of the sample weights for all response outcomes.
R = Σ_{i=1}^{n} (w_i * r_i)
Reference: Equivalent to AAPOR I+P (complete interviews + partial interviews) disposition codes (1.1) and (1.2).
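The weighted counts defined in Sections 1.1 and 1.2 can be sketched as simple weighted sums over the sample; the unit records below are hypothetical.

```python
# Illustrative sketch of the weighted counts E, I, U, T, and R from Appendix D3-A
# (not official Census Bureau code). Each sample unit carries a weight w_i = 1/p_i
# and exactly one eligibility status; responses are a subset of eligible units.

units = [
    # (weight w_i, status in {"E", "I", "U"}, responded?)
    (200.0, "E", True),
    (200.0, "E", False),   # eligible nonrespondent
    (150.0, "I", False),   # e.g., demolished structure
    (250.0, "U", False),   # e.g., unable to locate unit
    (100.0, "E", True),
]

E = sum(w for w, status, _ in units if status == "E")
I = sum(w for w, status, _ in units if status == "I")
U = sum(w for w, status, _ in units if status == "U")
T = E + I + U  # identity T = E + I + U, since e_i + i_i + u_i = 1 for every unit
R = sum(w for w, status, resp in units if status == "E" and resp)

print(E, I, U, T, R)  # 500.0 150.0 250.0 900.0 300.0
```
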

1.3 Reasons for Nonresponse

To improve interpretation of the response rate and better manage resources, it is recommended that whenever possible, reasons for (or types of) nonresponse be measured. Six specific terms describing nonresponse reasons are defined below. These terms (REF, NOH, TA, LB, INSF, and OTH) define specific nonresponse reasons for sample units.

Term: REF (Refusal)
Definition: The weighted count of eligible sample units that refused to respond to the survey.
Variable: ref_i – An indicator variable for whether an eligible sample unit refused to respond to the survey. If a unit refused to respond, ref_i = 1, else ref_i = 0.
Computation: Sum of the sample weights for all "refusal" outcomes.
REF = Σ_{i=1}^{n} (w_i * ref_i)
Reference: Equivalent to AAPOR "R" (refusal and break-off) – disposition code (2.10).

Term: NOH (No one home)
Definition: The weighted count of eligible sample units that did not respond because no one was found at home during the interviewing period.
Variable: noh_i – An indicator variable for whether an eligible sample unit did not respond to the survey because no one was found at home during the interviewing period. If a unit was "no one home," noh_i = 1, else noh_i = 0.
Computation: Sum of the sample weights for all "no one home" outcomes.
NOH = Σ_{i=1}^{n} (w_i * noh_i)
Reference: Equivalent to AAPOR "No one at residence" – disposition code (2.24).

Term: TA (Temporarily absent)
Definition: The weighted count of eligible sample units that did not respond because the occupants were temporarily absent during the interviewing period.
Variable: ta_i – An indicator variable for whether an eligible sample unit did not respond to the survey because the occupants were temporarily absent during the interviewing period. If a unit was "temporarily absent," ta_i = 1, else ta_i = 0.
Computation: Sum of the sample weights for all "temporarily absent" outcomes.
TA = Σ_{i=1}^{n} (w_i * ta_i)
Reference: Equivalent to AAPOR "Respondent away/unavailable" – disposition code (2.25).

Term: LB (Language barrier)
Definition: The weighted count of eligible sample units that did not respond because an interviewer or interpreter was not available to conduct the interview in the required language.
Variable: lb_i – An indicator variable for whether an eligible sample unit selected for the sample did not respond to the survey because an interviewer or interpreter was not available to conduct the interview in the required language. If a unit did not respond due to a language barrier, lb_i = 1, else lb_i = 0.
Computation: Sum of the sample weights for all "language barrier" outcomes.
LB = Σ_{i=1}^{n} (w_i * lb_i)
Reference: Equivalent to AAPOR "Language" – disposition code (2.33).

Term: INSF (Insufficient data)
Definition: The weighted count of eligible sample units selected for the sample that participated but did not provide sufficient data to qualify as a response.
Variable: insf_i – An indicator variable for whether an eligible sample unit that was selected for the sample returned a questionnaire, but did not provide sufficient data to qualify as a response. If a unit returned a questionnaire but fails to provide sufficient data to qualify as a response, insf_i = 1, else insf_i = 0.
Computation: Sum of the sample weights for "insufficient data" outcomes.
INSF = Σ_{i=1}^{n} (w_i * insf_i)
Reference: Equivalent to AAPOR "Break off" and "Break-off questionnaire too incomplete to process" – disposition code (2.12).

Term: OTH (Other nonresponse)
Definition: The weighted count of sample units that did not respond for a reason other than refusal, no one home, language barrier, temporarily absent, insufficient data, or if a unit was classified as unknown eligibility.
Variable: oth_i – An indicator variable for whether a unit selected for the sample was a nonresponse for a reason other than refusal, no one home, language barrier, temporarily absent, or insufficient data, or if the unit was classified as unknown eligibility.
If a unit does not respond for reasons other than refusal, no one home, language barrier, temporarily absent, insufficient data, or if a unit was classified as unknown eligibility, oth_i = 1, else oth_i = 0.
Computation: Sum of the sample weights for "other nonresponse" outcomes.
OTH = Σ_{i=1}^{n} (w_i * oth_i)
Reference: Equivalent to AAPOR "Other," "Dead," "Physically or mentally unable," and "Miscellaneous" – disposition codes (2.30), (2.31), (2.32), and (2.35).

2. Unit Response and Nonresponse Rates

2.1 Primary Response Rates

Rate: UR rate (Unit Response Rate)
Definition: The ratio of responding units to the sum of eligible units and units of unknown eligibility (expressed as a percentage).
Computation: UR rate = [R / (E + U)] * 100
Reference: Equivalent to AAPOR Response Rate 2 (RR2).

Rate: AR rate (Alternative Response Rate)
Definition: The ratio of responding units to estimated eligible units (expressed as a percentage).
Computation: AR rate = [R / (E + ee * U)] * 100, where ee = estimated proportion of cases of unknown eligibility that are actually eligible. The survey must have a defensible basis for estimating ee. If such a basis does not exist, then ee may not be used to adjust the estimates of E and I, and the survey may not estimate the AR rate.
Reference: Equivalent to AAPOR Response Rate 3 (RR3).

Rate: UR_M rate (Cumulative Unit Response Rate for multistage surveys)
Definition: The product of unit response rates for all stages of the survey.
Computation: UR_M rate = Π_{j=1}^{k} UR_j, where UR_j is the unit response rate at stage j of the survey and k is the total number of stages. If another equation yields a more accurate estimate of the cumulative unit response rate because it uses additional information about the frame, then that equation should be used. If the cumulative response rate is misleading or inaccurate, an explanation of the problems must be documented.

2.2 Detailed Eligibility and Nonresponse Rates

Rate: REF rate (Refusal Rate)
Definition: The ratio of units classified as "refusals" to the sum of eligible units and units of unknown eligibility (expressed as a percentage).
Computation: REF rate = [REF / (E + U)] * 100
Reference: Equivalent to AAPOR Refusal Rate 1 (REF1).

Rate: NOH rate (No One Home Rate)
Definition: The ratio of units classified as "no one home" to the sum of eligible units and units of unknown eligibility (expressed as a percentage).
Computation: NOH rate = [NOH / (E + U)] * 100
Reference: No AAPOR equivalent.
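The primary response rates in Section 2.1 can be sketched directly from the weighted counts; the counts and the ee value below are hypothetical, and ee may only be used with a defensible basis.

```python
# Illustrative sketch of the Section 2.1 primary response rates (not official
# Census Bureau code). E, U, and R are the weighted counts from Section 1; ee is
# the estimated proportion of unknown-eligibility cases that are actually eligible.

def ur_rate(R, E, U):
    """Unit Response Rate (AAPOR RR2): R / (E + U), as a percentage."""
    return R / (E + U) * 100

def ar_rate(R, E, U, ee):
    """Alternative Response Rate (AAPOR RR3): R / (E + ee*U), as a percentage."""
    return R / (E + ee * U) * 100

def cumulative_ur_rate(stage_rates):
    """UR_M: product of per-stage unit response rates (percentages in and out)."""
    product = 1.0
    for rate in stage_rates:
        product *= rate / 100
    return product * 100

# Hypothetical weighted counts:
R, E, U = 300.0, 500.0, 250.0
print(round(ur_rate(R, E, U), 1))                 # 40.0
print(round(ar_rate(R, E, U, ee=0.6), 1))         # 46.2
print(round(cumulative_ur_rate([90.0, 85.0]), 1)) # 76.5
```

The detailed rates in Section 2.2 follow the same pattern, replacing R in the numerator with REF, NOH, TA, LB, INSF, OTH, or U.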

Rate: TA rate (Temporarily Absent Rate)
Definition: The ratio of units classified as "temporarily absent" to the sum of eligible units and units of unknown eligibility (expressed as a percentage).
Computation: TA rate = [TA / (E + U)] * 100
Reference: No AAPOR equivalent.

Rate: LB rate (Language Barrier Rate)
Definition: The ratio of units classified as "language barriers" to the sum of eligible units and units of unknown eligibility (expressed as a percentage).
Computation: LB rate = [LB / (E + U)] * 100
Reference: No AAPOR equivalent.

Rate: INSF rate (Insufficient Data Rate)
Definition: The ratio of units classified as having "insufficient data" to the sum of eligible units and units of unknown eligibility (expressed as a percentage).
Computation: INSF rate = [INSF / (E + U)] * 100
Reference: No AAPOR equivalent.

Rate: OTH rate (Other Reason for Nonresponse Rate)
Definition: The ratio of units classified as "other nonresponse" to the sum of eligible units and units of unknown eligibility (expressed as a percentage).
Computation: OTH rate = [OTH / (E + U)] * 100
Reference: No AAPOR equivalent.

Rate: U rate (Unknown Eligibility Rate)
Definition: The ratio of units classified as having an "unknown eligibility" to the sum of eligible units and units of unknown eligibility (expressed as a percentage).
Computation: U rate = [U / (E + U)] * 100
Reference: No AAPOR equivalent.

3. Item Response and Allocation Rates

3.1 Item Response Rates

Term: IREQ_A (Weighted total of responses required for data item A)
Definition: The weighted count of sample units for which a response to item A is required. A response is required for item A unless it is a valid skip item.
Variable: ireq_Ai – An indicator variable for whether a response to item A is required. If a response is required, ireq_Ai = 1, else ireq_Ai = 0.

Computation: Sum of the sample weight for all units requiring a response to item A.
IREQ_A = Σ_{i=1}^{n} (w_i * ireq_Ai)

Term: ITEM_A (Total valid responses for data item A)
Definition: The weighted count of sample units for which a valid response to item A is obtained.
Variable: item_Ai – An indicator variable for whether a valid response to item A is obtained. If a valid response is obtained, item_Ai = 1, else item_Ai = 0.
Computation: Sum of the sample weight for all units requiring a response to item A for which a valid response is obtained.
ITEM_A = Σ_{i=1}^{n} (w_i * item_Ai)

Rate: IR_A rate (Item response rate for data item A)
Definition: The ratio of the weighted count of units with a valid response to item A to the weighted count of units that required a response to item A.
Computation: IR_A rate = ITEM_A / IREQ_A

Rate: TIR_A rate (Total item response rate for data item A)
Definition: The product of the weighted item response rate for item A and either the unit response rate or, for multistage surveys, the cumulative unit response rate, reflecting the response rate to item A after accounting for both unit nonresponse and item nonresponse.
Computation: TIR_A rate = IR_A * UR, or TIR_A rate = IR_A * UR_M

3.2 Item Allocation Rates

Item nonresponse is measured through the calculation of allocation rates. Allocation involves using statistical procedures, such as within-household or nearest neighbor matrices populated by donors, to impute for missing values.

Term: ALLO_A (Total number of responses allocated for item A)
Definition: The weighted count of sample units for which a response is allocated to item A.
Variable: allo_Ai – An indicator variable for whether a response is allocated to item A. If a response is allocated, allo_Ai = 1, else allo_Ai = 0.
Computation: Sum of the sample weight for all units requiring a response to item A for which a response is allocated.
ALLO_A = Σ_{i=1}^{n} (w_i * allo_Ai)

Rate: IA_A rate (Item allocation rate for data item A)
Definition: The ratio of the weighted count of units with an allocated response to item A to the weighted count of units that required a response to item A.
Computation: IA_A rate = ALLO_A / IREQ_A = 1 - IR_A rate

References

The American Association for Public Opinion Research. 2008. Standard Definitions: Final Dispositions of Case Codes and Outcome Rates for Surveys. 5th edition. Lenexa, Kansas: AAPOR.
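The item-level rates in Section 3 can be sketched together; the unit records and the unit response rate below are hypothetical, and the sketch assumes every required item is either validly reported or allocated, so that IA rate = 1 - IR rate.

```python
# Illustrative sketch of the Section 3 item-level rates (not official Census
# Bureau code). For item A: IREQ is the weighted count of units required to
# respond, ITEM the weighted count with a valid response, and ALLO the weighted
# count with an allocated (imputed) response.

units = [
    # (weight w_i, response required?, valid response?, allocated?)
    (120.0, True, True, False),
    (100.0, True, False, True),   # missing value imputed from a donor
    (80.0, True, True, False),
    (90.0, False, False, False),  # valid skip: item A not required
]

IREQ = sum(w for w, req, _, _ in units if req)
ITEM = sum(w for w, req, valid, _ in units if req and valid)
ALLO = sum(w for w, req, _, allo in units if req and allo)

ir_rate = ITEM / IREQ  # item response rate IR_A
ia_rate = ALLO / IREQ  # item allocation rate IA_A; equals 1 - IR_A here
print(round(ir_rate, 3), round(ia_rate, 3))

# Total item response rate: item rate times a (hypothetical) unit response rate.
UR = 0.80
tir_rate = ir_rate * UR
print(round(tir_rate, 3))
```
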

Appendix D3-B
Requirements for Calculating and Reporting Response Rates: Economic Surveys and Censuses

1. Terms and Variables

For many economic programs, there is a need to distinguish between the survey (sampling) unit, the reporting unit, and the tabulation unit:

A survey unit is an entity selected from the underlying statistical population of similarly-constructed units. Examples of survey units for different economic programs include establishments, Employer Identification Numbers (EIN), firms, state and local government entities, and building permit-issuing offices. Some programs use different survey units for different segments of the total population. Examples include the Annual Retail Trade Survey (ARTS) and the Survey of Construction (SOC). The ARTS samples EINs and firms (which can be comprised of one or more establishments), and the SOC samples residential housing permits and newly constructed housing units in areas where no permit is required. For cross-sectional or longitudinal surveys, the survey unit may change in composition over time (perhaps due to mergers, acquisitions, or divestitures).

A reporting unit is an entity from which data are collected. Reporting units are the vehicle for obtaining data and may or may not correspond to a survey unit for several reasons. First, the composition of the originally-sampled entity can change over the sample's life cycle, as noted above. Second, for some surveys, an entity may request (or the Census Bureau may ask the entity) to report data in several separate pieces corresponding to different parts of the business or other entity type. For example, a large, diverse company in a company-based collection may request a separate form for each region or line of business in which it operates or may ask to report separately for each of its establishments to align with their record keeping practices.
Similarly, many government programs have a central collection agency that provides the data for several governments, but issue additional mail-outs to obtain supplemental items that are not obtained by the central collection agency.

A tabulation unit houses the data used for estimation (or tabulation, in the case of a census). As with reporting units, the tabulation units may not correspond to a survey unit. Some programs consolidate establishment or plant-level data to a company level or parent government level to create tabulation units, so that the tabulation unit is often equivalent to the survey unit. Other programs create artificial units that split a reporting unit's data among the different categories in which the reporting unit operates; for example, creating separate tabulation units by industry. In this case, the tabulation unit represents a portion of a survey unit.

For each program, the "statistical period" describes the reference period for the data collection. For example, an annual program might collect data on the prior year's business; the statistical period refers to the prior year, but the data are obtained in the calendar year. During a given statistical period, all three types of units can be active, inactive, or ineligible. An active unit is in business and is in-scope for the program during the statistical period. An inactive unit is not operating or is not in-scope during the statistical period but is believed to have been active in the

past and can potentially become active and in-scope in the future; examples include seasonal businesses for monthly or quarterly programs (temporarily idle) or businesses that operate in more than one industry, with the primary activity for a given statistical period being conducted in an "out-of-scope" industry. Finally, a survey unit may become ineligible and permanently excluded from subsequent computations due to a merger or acquisition, a permanent classification category change, or a death. All units are considered active until verified evidence otherwise is provided.

Economic programs compute two different types of response rates: a unit response rate and weighted item response rates. The Unit Response Rate (URR) is defined as the ratio of the total unweighted number of "responding units" to the total number of units eligible for collection. URRs are indicators of the performance of data collection for obtaining usable responses. Consequently, the majority of business programs base URRs on their reporting units,¹ whereas the majority of ongoing government programs base URRs on the survey units that correspond to the tabulation units. Other exceptions are addressed on a case-by-case basis. The formulae for the URR provided in Section 2.1 and the detailed unit nonresponse rate breakdowns presented in Section 2.2.1 use the term "reporting unit" for simplicity. A program can produce at most one URR per statistical period and per release phase.²

Quantity and Total Quantity Response Rates (QRR and TQRR) are item-level indicators of the "quality" of each estimate. In contrast to the URR, these weighted response rates are computed for individual data items, so that a program may produce several QRRs and TQRRs per statistical period and release. Both are weighted measures that take the size of the tabulation unit into account as well as the associated sampling parameters. These rates measure the proportion of each estimate obtained directly or indirectly from the survey unit and are consequently based on the tabulation units. The QRR measures the weighted proportion of an estimate obtained directly from the respondent for the survey/census; the TQRR expands the rate to include data from equivalent quality sources.

To compute the weighted item response rates, it is necessary to determine the source of the final tabulated value of the associated data item for each tabulation unit i. This value could be directly obtained from respondent data, indirectly obtained from other equivalent quality data sources, or imputed. The classification process is straightforward for items that are directly obtained from the survey questionnaire (i.e., form items), less so for items that are obtained as functions of collected items (i.e., derived items). The formulae provided in Sections 2.1 and 2.2.2 can be applied to either form or derived items, but require that the item value classification be performed immediately prior and that the classification process or rules be documented.

¹ The central collection unit may provide the responses for the majority of the program data (e.g., providing responses from all associated sample units for most of the program items). Supplemental mailings are used to obtain the rest of the items.
² Leading indicator surveys often have more than one official release of the same estimate. For example, a program might release a preliminary estimate for the current statistical period along with a revised estimate from the prior period. Response rates should be computed at each release phase, and it is expected that the response rates (unit or item) will generally increase for the same estimate with each release.

1.1 Eligibility Status

The total number of active reporting units in a statistical period is defined as N_RU. These reporting units can be classified by their eligibility status: eligible for data collection (E), ineligible (IA), unknown eligibility (U), or data obtained from qualified administrative sources (A). Reporting units that have been determined to be out-of-scope for data collection during the statistical period are excluded from all computations, as are inactive cases. Note that the U cases are assumed to be active and in-scope in the absence of evidence otherwise. Reporting units may be considered eligible in one survey or census but ineligible for another, depending upon the target population. For example, a reporting unit that was in business after October 2004 is eligible for the 2004 Annual Retail Trade Survey, but is ineligible for the October 2004 Monthly Retail Trade Survey.

Term: E (Total Eligible)
Definition: The count of reporting units that were eligible for data collection in the statistical period.
Variable: e_i – An indicator variable for whether a reporting unit is eligible for data collection in the statistical period. These include chronic refusal units (eligible reporting units that have notified the Census Bureau that they will not participate in a given program). If a reporting unit is eligible, e_i = 1, else e_i = 0.
Computation: The sum of the indicator variable for eligibility (e_i) over all the reporting units in the statistical period.
E = Σ_{i=1}^{N_RU} e_i

Term: IA (Total Ineligible/Inactive)
Definition: The count of reporting units that were ineligible for data collection in the current statistical period.
Variable: ia_i – An indicator variable for whether a reporting unit in the statistical period has been confirmed as not a member of the target population at the time of data collection. An attempt was made to collect data, and it was confirmed that the reporting unit was not a member of the target population at that time. These reporting units are not included in the URR calculations for the periods in which they are ineligible. Information confirming ineligibility may come from observation, from a respondent, or from another source. Some examples of ineligible reporting units include firms that went out of business prior to the survey reference period, firms in an industry that is out-of-scope for the survey in question, and governments that reported data from outside of the reference period. If a reporting unit is ineligible, ia_i = 1, else ia_i = 0.
Computation: The sum of the indicator variable for ineligibility (ia_i) over all the reporting units in the statistical period.
IA = Σ_{i=1}^{N_RU} ia_i

Term: U (Total Unknown Eligibility)

83 al period for which e ligibility could not Definition The count of reporting units in the statistic be determined. Variable u e eligibility of a reporting unit in the – An indicator variable for whether th i ined. If a reporti ng unit is of unknown statistical period could not be determ eligibility, u = 1, else u = 0. For example, units w hose returns are marked as i i have unknown eligibility ( u “undeliverable as addressed” = 1), as do unreturned i mailed forms. The sum of the indicator vari able for unknown eligibility ( u Computation ) over all the i N RU U  u reporting units in the statistical period.  i i 1  A Term (Administrative data used as source) Definition The count of reporting units in the statis tical period that be long to the target population and were pre-selected to use ad ministrative data rather than collect survey data. a Variable – An indicator variable for whether admi nistrative data of equivalent-quality- i to-reported data rather than survey data was obtained for an eligible reporting unit in the statistical period. The decision not to collect survey data must have been made for survey efficiency or to reduce respondent burden and not because that reporting unit had been a refusal in the past. These reporting units are excluded from the URR calculations because they were not sent questionnaires, and thus their data are included in the calculation of the could not respond, although TQRRs. If a reporting unit is pre-select ed to receive administrative data, a = 1, i else a = 0. i pre-selected to use administrative data Computation The sum of the indicator variable for units N RU ing units in the statistical period. A  a ) over all the report ( a i  i i 1  The relationship among the counts of reporting units in the statistical period in the four eligibility th categories is given by N For the i e reporting unit, = E + IA + U + A. + ia + u + a = 1. Note i i i i RU that the value of N may change by statistical period. 
1.2 Response Status

Response status is determined only for the eligible active reporting units in the statistical period.

Term  R (Response)

Definition  The count of reporting units in the statistical period that were eligible for data collection in the statistical period and classified as a response.

Variable  r_ui – An indicator variable for whether an eligible reporting unit in the statistical period responded to the survey. To be classified as a response, the respondent for the reporting unit must have provided sufficient data, and the data must satisfy all the critical edits.

The definition of sufficient data will vary across surveys. Programs must designate required data items before the data collection begins. If a reporting unit responded, r_ui = 1, else r_ui = 0 (note r_ui = 0 for reporting units which were eligible but did not respond and for reporting units classified as IA, U, or A).

Computation  The sum of the indicator variable for eligible reporting units that responded (r_ui) over all the reporting units in the statistical period: R = Σ_{i=1}^{N_RU} r_ui

1.3 Reasons for Nonresponse

To improve interpretation of the response rates and better manage resources, it is recommended that, whenever possible, reasons for (or types of) nonresponse be measured on a flow basis. These terms are used to describe "unit nonresponse" and will be presented in unweighted tabulations. Five specific terms describing nonresponse reasons are defined below. The first three terms (REF, CREF, and INSF) define nonresponse reasons for eligible reporting units. The final two terms (UAA and OTH) define the reasons for reporting units with unknown eligibility.

Term  REF (Refusal)

Definition  The count of eligible reporting units in the statistical period that were classified as "refusal."

Variable  ref_i – An indicator variable for whether an eligible reporting unit in the statistical period refused to respond to the survey. If a reporting unit refuses to respond, ref_i = 1, else ref_i = 0.

Computation  The sum of the indicator variable for "refusal" (ref_i) over all the reporting units in the statistical period: REF = Σ_{i=1}^{N_RU} ref_i

Term  CREF (Chronic refusal)

Definition  The count of eligible reporting units in the statistical period that were classified as "chronic refusals."

Variable  cref_i – An indicator variable for whether an eligible reporting unit in the statistical period was a "chronic refusal." A chronic refusal is a reporting unit that informed the Census Bureau that it would not participate in a given program. The Census Bureau does not send questionnaires to chronic refusals, but they are considered to be eligible reporting units. Chronic refusals comprise a subset of refusals. If a reporting unit is a chronic refusal, cref_i = 1, else cref_i = 0.

Computation  The sum of the indicator variable for "chronic refusal" (cref_i) over all the reporting units in the statistical period: CREF = Σ_{i=1}^{N_RU} cref_i
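The unweighted nonresponse-reason counts can be tallied the same way. The boolean field names below are illustrative assumptions; the only structural facts taken from the standard are that CREF is a subset of REF and that UAA and OTH describe units of unknown eligibility.

```python
# Illustrative tally of the unweighted nonresponse-reason counts.
# The dict field names are assumptions for this sketch.
def nonresponse_counts(units):
    """units: list of dicts with 0/1 flags 'ref', 'cref', 'insf', 'uaa',
    and 'oth', one dict per nonresponding reporting unit."""
    REF  = sum(u['ref']  for u in units)
    CREF = sum(u['cref'] for u in units)  # chronic refusals, a subset of REF
    INSF = sum(u['insf'] for u in units)
    UAA  = sum(u['uaa']  for u in units)  # units of unknown eligibility
    OTH  = sum(u['oth']  for u in units)  # units of unknown eligibility
    assert CREF <= REF  # every chronic refusal is also counted as a refusal
    return REF, CREF, INSF, UAA, OTH

units = [
    {'ref': 1, 'cref': 1, 'insf': 0, 'uaa': 0, 'oth': 0},
    {'ref': 1, 'cref': 0, 'insf': 0, 'uaa': 0, 'oth': 0},
    {'ref': 0, 'cref': 0, 'insf': 1, 'uaa': 0, 'oth': 0},
    {'ref': 0, 'cref': 0, 'insf': 0, 'uaa': 1, 'oth': 0},
]
print(nonresponse_counts(units))  # -> (2, 1, 1, 1, 0)
```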

Term  INSF (Insufficient data)

Definition  The count of eligible reporting units in the statistical period that were classified as providing insufficient data.

Variable  insf_i – An indicator variable for whether an eligible reporting unit in the statistical period returned a questionnaire but did not provide sufficient data to qualify as a response. If a reporting unit returned a questionnaire but failed to provide sufficient data to qualify as a response, insf_i = 1, else insf_i = 0.

Computation  The sum of the indicator variable for "insufficient data" (insf_i) over all the reporting units in the statistical period: INSF = Σ_{i=1}^{N_RU} insf_i

Term  UAA (Undeliverable as addressed)

Definition  The count of reporting units in the statistical period that were classified as "undeliverable as addressed."

Variable  uaa_i – An indicator variable for whether a reporting unit in the statistical period had a questionnaire returned as "undeliverable as addressed." These reporting units are of unknown eligibility. If a questionnaire is returned as "undeliverable as addressed," uaa_i = 1, else uaa_i = 0.

Computation  The sum of the indicator variable for "undeliverable as addressed" (uaa_i) over all the reporting units in the statistical period: UAA = Σ_{i=1}^{N_RU} uaa_i

Term  OTH (Other nonresponse)

Definition  The count of reporting units in the statistical period that were classified as "other nonresponse."

Variable  oth_i – An indicator variable for whether a reporting unit in the statistical period was a nonresponse for a reason other than refusal, insufficient data, or undeliverable as addressed. These reporting units are of unknown eligibility. If a reporting unit does not respond for reasons other than refusal, insufficient data, or undeliverable as addressed, oth_i = 1, else oth_i = 0.

Computation  The sum of the indicator variable for "other nonresponse" (oth_i) over all the reporting units in the statistical period: OTH = Σ_{i=1}^{N_RU} oth_i

1.4 Quantity Response Rate Terms

The total number of active tabulation units in the statistical period is defined as N_TU. Recall that the number of tabulation units N_TU may differ from the number of reporting units N_RU, depending on the economic program. After a program creates tabulation units and performs any necessary data allocation procedures (from reporting unit(s) to tabulation unit(s)), the individual data items are classified according to the final source of data obtained for the units: data reported by the respondent, equivalent-quality-to-reported data obtained from program-approved outside sources (such as company annual reports, Securities and Exchange Commission (SEC) sites, trade association statistics), or imputed data. Tabulation units that have been determined to be out-of-scope for data collection during the statistical period are excluded from all computations, as are inactive cases.

Variable  v_ti (Tabulated value of data item t for tabulation unit i in the statistical period)

Definition  The quantity stored in the variable for item t for the i-th tabulation unit in the statistical period. This quantity may be reported, equivalent-quality-to-reported, or imputed.

Term  R_t (Reported data tabulation units for item t)

Definition  The count of eligible tabulation units that provided reported data for item t during the studied statistical period that satisfied all critical edits. This count will vary by item and by statistical period.

Variable  r_ti – An indicator variable for whether tabulation unit i in the statistical period provided reported data for item t that satisfied all edits. If the tabulated item t value for unit i (t_i) contains reported data, then r_ti = 1, else r_ti = 0.

Computation  The sum of the indicator variable for reported data (r_ti) over all the tabulation units (N_TU) in the statistical period: R_t = Σ_{i=1}^{N_TU} r_ti

Term  Q_t (Equivalent-quality-data tabulation units for item t)

Definition  The count of eligible tabulation units that use equivalent-quality-to-reported data for item t. Note that these data are indirectly obtained for the tabulation unit. This count will vary by item and by statistical period.

Variable  q_ti – An indicator variable for whether tabulation unit i in the statistical period contains equivalent-quality-to-reported data for item t. Such data can come from three sources: data directly substituted from another census or survey for the same reporting unit, data item concept, and time period (s); administrative data (d); or data obtained from some other equivalent source (c) validated by a study approved by the program manager in collaboration with the appropriate Research and Methodology area (e.g., company annual reports, Securities and Exchange Commission (SEC) filings, trade association statistics). If the tabulated item t value for unit i (t_i) contains equivalent-quality-to-reported data, then q_ti = 1, else q_ti = 0.

Computation  The sum of the indicator variable for equivalent-quality-to-reported data (q_ti) over all tabulation units (N_TU) in the statistical period: Q_t = Σ_{i=1}^{N_TU} q_ti

Term  S_t (Substituted data tabulation units for item t)

Definition  The count of eligible tabulation units containing directly substituted data for item t. This count will vary by item and by statistical period.

Variable  s_ti – An indicator variable for whether a tabulation unit in the statistical period contains directly substituted data from another census or survey for item t. The same reporting unit must provide the item value (in the other program), and the item concept and time period for the substituted values must agree between the two programs. If the tabulated item t value for unit i (t_i) contains directly substituted data from another survey, s_ti = 1, else s_ti = 0.

Computation  The sum of the indicator variable for directly substituted data (s_ti) over all tabulation units (N_TU) in the statistical period: S_t = Σ_{i=1}^{N_TU} s_ti

Term  D_t (Administrative data tabulation units for item t)

Definition  The count of eligible tabulation units containing administrative data for item t. This count will vary by item and by statistical period.

Variable  d_ti – An indicator variable for whether a tabulation unit in the statistical period contains administrative data for item t. If the tabulated item t value for unit i (t_i) contains administrative data, d_ti = 1, else d_ti = 0.

Computation  The sum of the indicator variable for administrative data (d_ti) over all tabulation units (N_TU) in the statistical period: D_t = Σ_{i=1}^{N_TU} d_ti

Term  C_t (Equivalent source data tabulation units for item t)

Definition  The count of eligible tabulation units containing equivalent-source data that is neither administrative data nor data substituted directly from another economic program for item t. This count will vary by item and by statistical period.

Variable  c_ti – An indicator variable for whether a tabulation unit in the statistical period contains equivalent-source data validated by a study approved by the program manager in collaboration with the appropriate Research and Methodology area (e.g., company annual reports, SEC filings, trade association statistics) for item t. If the tabulated item t value for unit i (t_i) contains equivalent-source data, then c_ti = 1, else c_ti = 0.

Computation  The sum of the indicator variable for equivalent-source data (c_ti) over all tabulation units (N_TU) in the statistical period: C_t = Σ_{i=1}^{N_TU} c_ti

Term  M_t (Imputed data tabulation units for item t)

Definition  The count of eligible tabulation units containing imputed data for item t. This count will vary by item and by statistical period.

Variable  m_ti – An indicator variable for whether a tabulation unit in the statistical period contains imputed data for item t. If the tabulated item t value for unit i (t_i) contains imputed data, m_ti = 1, else m_ti = 0.

Computation  The sum of the indicator variable for imputed data (m_ti) over all tabulation units (N_TU) in the statistical period: M_t = Σ_{i=1}^{N_TU} m_ti
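For a single item t, the source counts above can be tallied from each tabulation unit's final data source. This is a sketch under assumed source labels; because the sources are mutually exclusive, the counts also exhibit the partitions Q_t = S_t + D_t + C_t and N_TU = R_t + Q_t + M_t implied by the definitions.

```python
# Illustrative source classification for one item t. Each tabulation unit's
# final source is one of the labels below (the labels are assumptions).
from collections import Counter

def item_source_counts(sources):
    c = Counter(sources)
    R = c['reported']                       # R_t
    S, D, C = c['substituted'], c['admin'], c['equiv_other']
    Q = S + D + C                           # Q_t = S_t + D_t + C_t
    M = c['imputed']                        # M_t
    assert R + Q + M == len(sources)        # N_TU = R_t + Q_t + M_t
    return {'R': R, 'Q': Q, 'S': S, 'D': D, 'C': C, 'M': M}

sources = ['reported', 'reported', 'substituted', 'admin', 'imputed']
print(item_source_counts(sources))
# -> {'R': 2, 'Q': 2, 'S': 1, 'D': 1, 'C': 0, 'M': 1}
```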

The relationship among Q_t, S_t, D_t, and C_t for item t in a statistical period is given by Q_t = S_t + D_t + C_t. The relationship among the counts of tabulation units for item t in the statistical period is given by N_TU = R_t + Q_t + M_t.

Variable  f_i (Nonresponse weight adjustment factor)

Definition  A tabulation unit nonresponse weight adjustment factor for the i-th tabulation unit in the statistical period. The variable f_i is set equal to 1 for surveys that use imputation to account for unit nonresponse.

Variable  w_i (Sample weight)

Definition  The design weight for the i-th tabulation unit in the statistical period. The design weight includes subsampling factors and outlier adjustments, but excludes post-sampling adjustments for nonresponse and for coverage. This variable represents the inverse unbiased probability of selection for the tabulation unit.

Variable  t_i (Design-weighted value of item t for tabulation unit i)

Definition  The design-weighted tabulated quantity of the variable for item t for the i-th tabulation unit in the statistical period (i.e., t_i = w_i · v_ti). Note that this value has not been adjusted for unit nonresponse.

Term  T (Total value for item t)

Definition  The estimated (weighted) total of data item t for the entire population represented by the tabulation units in the statistical period. T is based on the value of the data provided by the respondent, equivalent-quality-to-reported data, or imputed data. The calculation of T incorporates subsampling factors, weighting adjustment factors for unit nonresponse (adjustment-to-sample procedures only), and outlier-adjustment factors, but does not include post-stratification or other benchmarking adjustments.

Computation  The product of the design-weighted tabulated value of item t for the i-th tabulation unit in the statistical period (t_i) and the nonresponse weight adjustment factor (f_i), summed over all tabulation units (N_TU) in the statistical period: T = Σ_{i=1}^{N_TU} t_i f_i

2. Response and Nonresponse Rates

The rates defined below serve as quality indicators in the process control sense for non-negatively valued items such as total employees or total payroll. For items that can take on positive and negative values, such as income or earnings on investments, the program should plan to develop two sets of weighted item response rates (QRRs and TQRRs), one from negatively valued data and one from non-negatively valued data, or propose alternative quality indicators that provide adequate transparency into data quality and assist in taking corrective actions.
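The total T combines the pieces just defined: t_i = w_i · v_ti is the design-weighted value and f_i is the nonresponse weight adjustment factor. A minimal sketch, assuming the values, weights, and factors arrive as parallel lists (the layout and numbers are invented for the example):

```python
# T = sum over tabulation units of t_i * f_i, where t_i = w_i * v_ti is the
# design-weighted value and f_i is the nonresponse weight adjustment factor
# (f_i = 1 for surveys that handle unit nonresponse by imputation).
def weighted_total(v, w, f):
    return sum(wi * vi * fi for vi, wi, fi in zip(v, w, f))

T = weighted_total(v=[100.0, 250.0, 40.0],
                   w=[2.0, 1.5, 3.0],
                   f=[1.0, 1.0, 1.5])
print(T)  # 200.0 + 375.0 + 180.0 = 755.0
```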

2.1 Primary Response Rates

Rate  URR (Unit Response Rate)

Definition  The proportion of reporting units in the statistical period, based on unweighted counts, that were eligible or of unknown eligibility and that responded to the survey (expressed as a percentage).

Computation  URR = [R / (E + U)] × 100

Rate  QRR (Quantity Response Rate for data item t)

Definition  The proportion of the estimated (weighted) total (T) of data item t reported by the active tabulation units in the statistical period (expressed as a percentage).

Computation  QRR = [ ( Σ_{i=1}^{N_TU} t_i r_ti ) / T ] × 100

Rate  TQRR (Total Quantity Response Rate for data item t)

Definition  The proportion of the estimated (weighted) total (T) of data item t reported by the active tabulation units in the statistical period or obtained from sources determined to be equivalent-quality-to-reported data (expressed as a percentage).

Computation  TQRR = [ ( Σ_{i=1}^{N_TU} t_i (r_ti + q_ti) ) / T ] × 100

2.2 Detailed Response and Nonresponse Rates

2.2.1 Unit Nonresponse Rate Breakdowns

The following breakdowns provide unweighted unit nonresponse rates.

Rate  REF rate (Refusal Rate)

Definition  The ratio of reporting units in the statistical period that were classified as "refusal" to the sum of eligible units and units of unknown eligibility (expressed as a percentage).

Computation  REF rate = [REF / (E + U)] × 100

Rate  CREF rate (Chronic Refusal Rate)

Definition  The ratio of reporting units in the statistical period that were classified as "chronic refusals" to the sum of eligible units and units of unknown eligibility (expressed as a percentage).

Computation  CREF rate = [CREF / (E + U)] × 100

Rate  INSF rate (Insufficient Data Rate)

Definition  The ratio of reporting units in the statistical period that were classified as "insufficient data" to the sum of eligible units and units of unknown eligibility (expressed as a percentage).

Computation  INSF rate = [INSF / (E + U)] × 100

Rate  UAA rate (Undeliverable as Addressed Rate)

Definition  The ratio of reporting units in the statistical period that were classified as "undeliverable as addressed" to the sum of eligible units and units of unknown eligibility (expressed as a percentage).

Computation  UAA rate = [UAA / (E + U)] × 100

Rate  OTH rate (Other Reason for Nonresponse Rate)

Definition  The ratio of reporting units in the statistical period that were classified as "other reason for nonresponse" to the sum of eligible units and units of unknown eligibility (expressed as a percentage).

Computation  OTH rate = [OTH / (E + U)] × 100

Rate  U rate (Unknown Eligibility Rate)

Definition  The ratio of reporting units in the statistical period that were classified as "unknown eligibility" to the sum of eligible units and units of unknown eligibility (expressed as a percentage).

Computation  U rate = [U / (E + U)] × 100

2.2.2 Total Quantity Response Rate Breakdowns

The following breakdowns provide weighted item response rates.

Rate  Q rate (Equivalent-Quality-to-Reported Data Rate)

Definition  The proportion of the total estimate for item t derived from equivalent-quality-to-reported data for tabulation units in the statistical period (expressed as a percentage).

Computation  Q rate = [ ( Σ_{i=1}^{N_TU} t_i q_ti ) / T ] × 100 = [ ( Σ_{i=1}^{N_TU} t_i (s_ti + d_ti + c_ti) ) / T ] × 100
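The unweighted unit-level rates defined in sections 2.1 and 2.2.1 all share the denominator E + U. A hedged sketch (the function and argument names are assumptions, not a prescribed interface):

```python
# Unweighted unit response and nonresponse rates. Every rate uses the same
# denominator E + U: eligible units plus units of unknown eligibility.
def unit_rates(E, U, R, REF, CREF, INSF, UAA, OTH):
    denom = E + U
    pct = lambda x: 100.0 * x / denom
    return {'URR': pct(R), 'REF rate': pct(REF), 'CREF rate': pct(CREF),
            'INSF rate': pct(INSF), 'UAA rate': pct(UAA),
            'OTH rate': pct(OTH), 'U rate': pct(U)}

# Invented example: 80 eligible units, 20 of unknown eligibility.
rates = unit_rates(E=80, U=20, R=60, REF=10, CREF=4, INSF=5, UAA=15, OTH=5)
print(rates['URR'])  # 60 / (80 + 20) * 100 = 60.0
```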

Rate  S rate (Survey Substitution Rate)

Definition  The proportion of the total estimate for item t derived from substituted other-survey or census data for tabulation units in the statistical period (expressed as a percentage). To be tabulated in this rate, substituted data items must be obtained from the same reporting unit in the same time period as the target program, and the item concept between the two programs must agree.

Computation  S rate = [ ( Σ_{i=1}^{N_TU} t_i s_ti ) / T ] × 100

Rate  D rate (Administrative Data Rate)

Definition  The proportion of the total estimate of item t derived from administrative data for tabulation units in the statistical period (expressed as a percentage).

Computation  D rate = [ ( Σ_{i=1}^{N_TU} t_i d_ti ) / T ] × 100

Rate  C rate (Other Source Rate)

Definition  The proportion of the total estimate of item t derived from other source data validated by a study approved by the program manager in collaboration with the appropriate Research and Methodology area (such as company annual reports, SEC filings, trade association statistics) for tabulation units in the statistical period (expressed as a percentage).

Computation  C rate = [ ( Σ_{i=1}^{N_TU} t_i c_ti ) / T ] × 100

Rate  M rate (Imputation Rate)

Definition  The proportion of the total estimate of item t derived from imputes for tabulation units in the statistical period (expressed as a percentage).

Computation  M rate = [ ( Σ_{i=1}^{N_TU} t_i m_ti ) / T ] × 100
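The weighted item rates fit together: Q rate = S rate + D rate + C rate, TQRR = QRR + Q rate, and QRR + Q rate + M rate = 100 when every tabulated value comes from one of these sources and f_i = 1 (imputation-based programs). A sketch with per-unit weighted values t_i and assumed source labels:

```python
# Weighted item-rate breakdowns for one item t. Each tabulation unit carries
# its design-weighted value t_i and a final-source label (labels assumed).
def item_rates(t, sources):
    T = sum(t)  # total with f_i = 1 (unit nonresponse handled by imputation)
    pct = lambda labels: 100.0 * sum(
        ti for ti, s in zip(t, sources) if s in labels) / T
    QRR = pct({'reported'})
    S, D, C = pct({'substituted'}), pct({'admin'}), pct({'equiv_other'})
    M = pct({'imputed'})
    Q = S + D + C        # Q rate = S rate + D rate + C rate
    TQRR = QRR + Q       # TQRR = QRR + Q rate
    return {'QRR': QRR, 'TQRR': TQRR, 'Q': Q, 'S': S, 'D': D, 'C': C, 'M': M}

t = [10.0, 20.0, 30.0, 40.0]
sources = ['reported', 'substituted', 'admin', 'imputed']
r = item_rates(t, sources)
print(r['QRR'], r['TQRR'], r['M'])  # 10.0 60.0 40.0
```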

ANALYZING DATA AND REPORTING RESULTS

E1 Analyzing Data
E2 Reporting Results
   Appendix E2: Economic Indicator Variables
E3 Reviewing Information Products
   Appendix E3-A: Event Participation Approval Form and Instructions
   Appendix E3-B: Statistical Review Contacts
   Appendix E3-C: Policy and Sensitivity Review Checklist for Division and Office Chiefs

Statistical Quality Standard E1

Analyzing Data

Purpose: The purpose of this standard is to ensure that statistical analyses, inferences, and comparisons used to develop information products are based on statistically sound practices.

Scope: The Census Bureau's statistical quality standards apply to all information products released by the Census Bureau and the activities that generate those products, including products released to the public, sponsors, joint partners, or other customers. All Census Bureau employees and Special Sworn Status individuals must comply with these standards, including contractors and other individuals who receive Census Bureau funding to develop and release Census Bureau information products.

In particular, this standard applies to the analyses performed to generate information products. It includes analyses:

• Used to produce Census Bureau information products (e.g., reports, news releases, conference papers, journal articles, and maps), regardless of data source.
• Conducted using census data, survey data, administrative records data, or any data linked with any of these sources.
• Performed during research to develop improved methodologies for frame construction, survey design, sampling, data collection, data capture, processing, estimation, analysis, or other statistical processes.
• Performed to evaluate the quality of Census Bureau data, methodologies, and processes.
• Conducted to guide decisions about processes or information products of the Census Bureau's programs.

Exclusions: The global exclusions listed in the Preface apply to this standard. No additional exclusions apply to this standard.

Key Terms: Bonferroni correction, cluster, covariance, direct comparison, goodness-of-fit, hypothesis testing, implied comparison, multivariate analysis, outliers, parameter, peer review, regression, sample design, Scheffe's method, sensitivity analysis, significance level, statistical inference, and Tukey's method.

Requirement E1-1: Throughout all processes associated with analyzing data, unauthorized release of protected information or administratively restricted information must be prevented by following federal laws (e.g., Title 13, Title 15, and Title 26), Census Bureau policies (e.g., Data Stewardship Policies), and additional provisions governing the use of the data (e.g., as may be specified in a memorandum of understanding or data-use agreement). (See Statistical Quality Standard S1, Protecting Confidentiality.)

Requirement E1-2: A plan must be developed prior to the start of the analysis that addresses, as appropriate:

1. A description of the analysis, addressing issues such as:
• Purpose.

• Research questions or hypotheses.
• Relevant literature.

2. A description of the data, addressing issues such as:
• The data source(s).
• Key variables and how they relate to the concept(s) in the hypotheses.
• Design and methods used to collect and process the data.
• Limitations of the data.

3. A description of the methodology, addressing issues such as:
• Analysis methods (e.g., demographic and economic analysis techniques, ANOVA, regression analysis, log-linear analysis, nonparametric approaches, box plots, and scatter plots).
• Key assumptions used in the analysis.
• Tests (e.g., z-tests, F-test, chi-square, and R-squared) and significance levels used to judge significance, goodness-of-fit, or degree of association.
• Limitations of the methodology.

4. Appropriateness of the data and underlying assumptions and verification of the accuracy of the computations.

Notes:
(1) During a data analysis project, the focus of the analysis may change as the researcher learns more about the data. The analysis plan should be updated, as appropriate, to reflect major changes in the direction of the analysis.
(2) Statistical Quality Standard A1, Planning a Data Program, addresses overall planning requirements, including schedule and estimates of costs.

Requirement E1-3: Statistically sound practices that are appropriate for the research questions must be used when analyzing the data.

Examples of statistically sound practices include:
• Reviewing data to identify and address nonsampling error issues (e.g., outliers, inconsistencies within records, missing data, and bias in the frame or sample from which data are obtained).
• Validating assumptions underlying the analysis, where feasible.
• Developing models appropriate for the data and the assumptions. (See Statistical Quality Standard D2, Producing Estimates from Models.)
• Using multiple regression and multivariate analysis techniques, when appropriate, to examine relationships among dependent variables and independent variables.
• Using a trend analysis or other suitable procedure when testing for structure in the data over time (e.g., regression, time series analysis, or nonparametric statistics).

Sub-Requirement E1-3.1: The data analysis must account for the sample design (e.g., unequal probabilities of selection, stratification, and clustering) and estimation methodology.

Notes:
(1) If it has been documented that a particular methodological feature has no effect on the results of the analysis, then it is not necessary to account for that feature in the analysis (e.g., if using weighted and unweighted data produces similar results, then the analysis may use the unweighted data; if the variance properties for clustered data are similar to those for unclustered data, then the analysis need not account for clustering).

Requirement E1-4: Any conclusions derived from sample data must be supported by appropriate measures of statistical uncertainty.

Examples of measures of statistical uncertainty that support conclusions include:
• Confidence or probability intervals with specified confidence levels (e.g., 90% or 95%).
• Margins of error for specified confidence levels, provided the sample size is sufficiently large that the implied confidence interval has coverage close to the nominal level.
• P-values for hypothesis tests, such as are implied when making comparisons between groups or over time. Comparisons with p-values greater than 0.10, if reported, should come with a statement that the difference is not statistically different from zero.
• Confidence intervals, probability intervals, or p-values should be statistically valid and account for the sample design (e.g., accounting for covariances when the estimates are based on clustered samples). If based on a model, then the key assumptions of the model should be checked and not contradicted by the observed data. (See Statistical Quality Standard D2, Producing Estimates from Models.)

Note: Although the p-value does not indicate the size of an effect (or the size of the difference in a comparison), p-values below 0.01 constitute strong evidence against the null hypothesis, p-values between 0.01 and 0.05 constitute moderate evidence, and p-values between 0.05 and 0.10 constitute weak evidence.
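To make the requirement concrete, the following sketch tests a difference between two estimates at the 90 percent confidence level using a normal approximation, including the covariance term that matters for clustered samples. The function name and the input numbers are invented for illustration.

```python
import math

Z90 = 1.645  # standard normal critical value for a 90% confidence interval

def compare_estimates(est1, se1, est2, se2, cov=0.0):
    """Return (difference, 90% margin of error, significant?). cov is the
    covariance of the two estimates; it is nonzero when both come from the
    same sample (e.g., clustered designs), as the requirement notes."""
    diff = est1 - est2
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2 - 2.0 * cov)
    moe = Z90 * se_diff
    return diff, moe, abs(diff) > moe

# Invented example: a 2.0-point change with a standard error of 1.0 on each estimate.
diff, moe, sig = compare_estimates(52.0, 1.0, 50.0, 1.0)
print(sig)  # False: |2.0| < 1.645 * sqrt(2) ~= 2.33, not significant at 90%
```

Note how a positive covariance shrinks the standard error of the difference, so the same 2.0-point change can become significant when the two estimates come from overlapping samples.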
Sub-Requirement E1-4.1: The same significance level or confidence level must be used throughout an analysis. Table A shows the requirements for specific information products:

Table A: Significance and Confidence Levels by Information Product

  Information Product                                   Significance Level   Confidence Level
  Census Bureau publications                            0.10                 0.90
  News releases                                         0.10                 0.90
  All other information products (e.g., working
  papers, professional papers, and presentations)       0.10 or less         0.90 or more

Requirement E1-5: The data and underlying assumptions must be appropriate for the analyses and the accuracy of the computations must be verified.

Examples of activities to check the appropriateness of the data and underlying assumptions and the accuracy of the computations:
• Checking that the appropriate equations were used in the analysis.
• Reviewing computer code to ensure that the appropriate data and variables are used in the analysis and the code is correctly programmed.

• Performing robustness checks (e.g., checking that unexpected results are not attributable to errors, examining plots of residuals to assess fit of models, and comparing findings against historical results for reasonableness).
• Performing sensitivity analyses using alternative assumptions to assess the validity of measures, relationships, and inferences.
• Requesting peer reviews by subject matter, methodological, and statistical experts to assess the analysis approach and results.

Requirement E1-6: Documentation needed to replicate and evaluate the analysis must be produced. The documentation must be retained, consistent with applicable policies and data-use agreements, and must be made available to Census Bureau employees who need it to carry out their work. (See Statistical Quality Standard S2, Managing Data and Documents.)

Examples of documentation include:
• Plans, requirements, specifications, and procedures relating to the analysis.
• Computer code (e.g., SAS code).
• Data files with weighted and unweighted data.
• Outlier analysis results, including information on the cause of outliers, if available.
• Error estimates, parameter estimates, and overall performance statistics (e.g., goodness-of-fit statistics).
• Results of diagnostics relating to the analysis.

Notes:
(1) The documentation must be released on request to external users, unless the information is subject to legal protections or administrative restrictions that would preclude its release. (See Data Stewardship Policy DS007, Information Security Management Program.)
(2) Statistical Quality Standard F2, Providing Documentation to Support Transparency in Information Products, contains specific requirements about documentation that must be readily accessible to the public to ensure transparency of information products released by the Census Bureau.

Statistical Quality Standard E2

Reporting Results

Purpose: The purpose of this standard is to ensure that information products meet statistical reporting requirements; that they provide understandable, objective presentations of results and conclusions; and that conclusions are supported by the data.

Notes:
(1) Requirement F1-4 of Statistical Quality Standard F1, Releasing Information Products, contains reporting requirements regarding information products affected by serious data quality issues that may impair the suitability of the products for their intended uses.
(2) Department Administrative Order (DAO) 219-1 establishes the policy for Commerce Department employees engaging in public communications.

Scope: The Census Bureau's statistical quality standards apply to all information products released by the Census Bureau and the activities that generate those products, including products released to the public, sponsors, joint partners, or other customers. All Census Bureau employees and Special Sworn Status individuals must comply with these standards; this includes contractors and other individuals who receive Census Bureau funding to develop and release Census Bureau information products.

In particular, this standard applies to the reporting of results in information products such as:
• News releases.
• Census Bureau publications (i.e., information products that the program's Associate Director has reviewed and approved and whose content the Census Bureau has affirmed).
• Working papers (e.g., technical papers and division reports intended for release to the public).
• Professional papers (e.g., journal articles, book chapters, conference papers, poster sessions, and written discussant comments).
• Research reports used to guide decisions about Census Bureau programs.
• Abstracts.
• Presentations at public events, such as seminars or conferences. (Statistical Quality Standard E3, Reviewing Information Products, defines public events.)
• Handouts for distribution at public events.
• Tabulations, including custom tabulations, estimates, and their associated documentation.
• Statistical graphs, figures, and thematic maps.

Exclusions: In addition to the global exclusions listed in the Preface, this standard does not apply to:
• Papers, presentations, or other public communications prepared or delivered by Census Bureau employees that are not related to programs, policies, or operations of the Department of Commerce (DOC) or the Census Bureau. (The DOC Summary of Ethics Rules states that you may use your Census Bureau affiliation in non-official contexts only if it is used as part of general biographic information, and it is given no more prominence than other significant biographical details. Contact the Office of Analysis and Executive Support (OAES) for additional guidance.)

Key Terms: Census Bureau publications, coefficient of variation (CV), confidence interval, custom tabulations, derived statistics, design effect, direct comparison, estimate, implied comparison, information products, margin of error (MOE), metadata, nonsampling error, policy view, sampling error, significance level, standard error, statistical inference, statistical significance, synthetic data, transparency, and working papers.

Requirement E2-1: Throughout all processes associated with reporting results, unauthorized release of protected information or administratively restricted information must be prevented by following federal laws (e.g., Title 13, Title 15, and Title 26), Census Bureau policies (e.g., Data Stewardship Policies), and additional provisions governing the use of the data (e.g., as may be specified in a memorandum of understanding or data-use agreement). (See Statistical Quality Standard S1, Protecting Confidentiality.)

Requirement E2-2: All information products must provide accurate and reliable information that promotes transparency and must present that information in an unbiased manner.

1. Information products based on data that have "serious quality issues" are not permitted except under the restrictions in Sub-Requirement F1-5.2 of Statistical Quality Standard F1, Releasing Information Products.

Note: Requirement F1-5 in Statistical Quality Standard F1 describes serious data quality issues.

2. Except as noted below, information products (including each table, graph, figure, and map within an information product, and including stand-alone tables, such as custom tabulations) must include a source statement that:
a. Indicates the program(s) that provided the data.
b. Indicates the date of the source data.

Note: Abstracts and presentation slides do not need source statements.

3. Except as noted below, information products (including tables, graphs, figures, and maps that stand alone) must indicate that the data are subject to error arising from a variety of sources, including (as appropriate) sampling error, nonsampling error, model error, and any other sources of error. Including one of the following in the information product will satisfy this requirement:
a. An explicit statement indicating that the data are subject to error arising from a variety of sources.
b. A description of the error sources.
c. A discussion of the error sources.

Note: Abstracts and presentation slides do not need to indicate that the data are subject to error.

4. Except as noted below, information products must include a reference (i.e., URL) to the full methodological documentation of the program(s).

Note: Abstracts and presentation slides do not need to include a reference to the full methodological documentation.

5. All inferences and comparisons of estimates based on sample data must include appropriate measures of statistical uncertainty, such as margins of error, confidence intervals, or p-values for hypothesis tests.
a. Results that are not statistically significant must not be discussed in a manner that implies they are significant.
b. The same significance or confidence level must be used throughout an information product. Table A shows the requirements for specific information products:

Table A. Significance and Confidence Levels by Information Product

Information Product                                Significance Level   Confidence Level
Census Bureau publications                         0.10                 0.90
News releases                                      0.10                 0.90
All other information products (e.g., working
papers, professional papers, and presentations)    0.10 or less         0.90 or more

c. Direct comparison statements that are not statistically significant must include a statement conveying the lack of statistical significance, such as: “The 90 percent confidence interval for the change includes zero. There is insufficient evidence to conclude that the actual change is different from zero.” Such a statement may be given in a footnote. For example, “Sales of nondurable goods were down 0.6 percent (+/- 0.8%)*.” Footnote: “*The 90 percent confidence interval includes zero. There is insufficient evidence to conclude that the actual change is different from zero.”
d. The text must clearly state whether each comparison (direct or implied) is statistically significant.
This must be done either by:
1) Using a blanket statement such as, “All comparative statements in this report have undergone statistical testing, and, unless otherwise noted, all comparisons are statistically significant at the 10 percent significance level,” and specifically noting any implied comparison statements that are not significant.
2) Reporting a p-value for each comparison.
3) Stating whether or not the confidence interval includes 0.
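The confidence-interval test in option 3 can be sketched in a few lines, assuming a normal approximation to the sampling distribution; the function name and the illustrative numbers below are ours, not the standard's:

```python
# Two-sided critical value for a 90 percent confidence level under a
# normal approximation (consistent with the 0.10 significance level above).
Z_90 = 1.645

def significant_at_90(change, standard_error):
    """Return (is_significant, moe, ci) for an estimated change.

    The change is statistically significant at the 10 percent level
    exactly when its 90 percent confidence interval excludes zero.
    """
    moe = Z_90 * standard_error              # margin of error
    ci = (change - moe, change + moe)        # 90 percent confidence interval
    return not (ci[0] <= 0.0 <= ci[1]), moe, ci

# Echoing the footnote example above: a -0.6 percent change with a
# +/- 0.8 margin of error.  The interval (-1.4, 0.2) includes zero, so
# the change is not significant and the footnote would be required.
sig, moe, ci = significant_at_90(-0.6, 0.8 / Z_90)
```

Equivalently, the comparison is significant when the absolute change exceeds its margin of error, which is why reporting the MOE alongside the estimate lets readers perform the same check.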

e. Statements of equality between population quantities that are being estimated with sampling error are not allowed. For example, the following statements are not acceptable, since they refer to unknown underlying population quantities:
 “The poverty rate for state A equals the rate for state B.”
 “The poverty rate remained statistically unchanged” (for a comparison across time).
It is acceptable to say that the estimates are “not statistically different” or (for comparisons over time) “statistically unchanged,” if the difference in the estimates is not statistically significant. For example, the following statements are acceptable, since they refer to the estimates of population quantities:
 “The estimated poverty rate for state A, 8.1 percent (± 0.2), is not statistically different from the estimated poverty rate for state B, 8.1 percent (± 0.2).”
 “The estimated poverty rate remained statistically unchanged for non-Hispanic whites at 8.2 percent (± 0.2).” However, this statement must be accompanied by the abovementioned footnote.

6. Key estimates in the text must be accompanied by confidence intervals or margins of error (MOEs) or their equivalents (e.g., equivalents for Bayesian inferences or for error arising from synthetic data) for the information products indicated in the table below. Providing a URL to these measures of statistical uncertainty is not sufficient.

Table B. Confidence Intervals or MOEs for Key Estimates by Information Product

Information Product                                   Confidence Intervals or MOEs
Census Bureau publications                            Required
News releases for the economic data items
listed in Appendix E2                                 Required
News releases for all other data (e.g., economic
data items not in Appendix E2, household-level
data, and person-level data)                          Not required
Abstracts and presentation slides                     Not required
All other information products (e.g., working
papers and professional papers)                       Required

Notes:
(1) In working papers and professional papers, p-values, standard errors, coefficients of variation (CV), or other appropriate measures of statistical uncertainty may be used instead of confidence intervals or MOEs.
(2) If the width of a confidence interval rounds to zero, the interval may be replaced by a statement such as “The width of the confidence interval for this estimate rounds to zero.”

7. Except as noted below, information products must include or make available by reference (URL) information that allows users to assess the statistical uncertainty of derived statistics as well as of the estimates themselves. For example,

 Measures of statistical uncertainty (e.g., variances, CVs, standard errors, error arising from synthetic data, or their Bayesian equivalents).
 Methods to estimate the measures of statistical uncertainty (e.g., generalized variance functions or equations and design effects).
 Methods to approximate the measures of statistical uncertainty for derived statistics, such as estimates of change or ratios of estimates.

Notes:
(1) This requirement does not apply to response rates, unless the information product analyzes the response rates or draws conclusions from them.
(2) Abstracts and presentation slides need not make available information on statistical uncertainty. Custom tabulations must provide information on statistical uncertainty as specified in Sub-Requirement E2-2.2, item 4.
(3) Maps need not portray or indicate information on statistical uncertainty, but if not, they must include a URL at which users can access measures of statistical uncertainty and other information about statistical uncertainty.
(4) When information on statistical uncertainty is made available by referencing a URL, the URL must direct users specifically to the location of the information.

8. If needed for readers to assess the results presented, the information product must include:
a. A discussion of the assumptions made.
b. The limitations of the data.
c. A description of the methodology used to generate the estimates.
d. An explanation of how the methodology and the limitations might affect the results.

9. The information presented must be technically and factually correct.

10. The information must be presented logically and any results must follow from the data and the analysis.

11. Any anomalous findings must be addressed appropriately.

12. The subject matter and methodological literature must be referenced, as appropriate.

13. Policy views must never be expressed.

14. Except as noted in Sub-Requirement E2-2.1 (item 3), personal views must not be expressed.
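The usual textbook approximations for the uncertainty of derived statistics, such as estimates of change or ratios of estimates, can be sketched as follows. This is a sketch under stated assumptions: it takes the component estimates' covariance as known (zero when they are independent), which correlated survey estimates may not satisfy, and the function names are illustrative:

```python
import math

def se_of_difference(se_x, se_y, cov_xy=0.0):
    """Approximate standard error of a change or difference (x - y).

    Assumes cov_xy is the known covariance of the two estimates
    (zero for independent estimates).
    """
    return math.sqrt(se_x ** 2 + se_y ** 2 - 2.0 * cov_xy)

def se_of_ratio(x, se_x, y, se_y, cov_xy=0.0):
    """Delta-method (Taylor linearization) approximation to the
    standard error of the ratio x / y, for nonzero x and y."""
    return abs(x / y) * math.sqrt(
        (se_x / x) ** 2 + (se_y / y) ** 2 - 2.0 * cov_xy / (x * y)
    )
```

For example, two independent estimates with standard errors 3 and 4 yield a difference with standard error 5. When a program publishes a generalized variance function instead of direct standard errors, that function would supply the `se_x` and `se_y` inputs here.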
Sub-Requirement E2-2.1: In addition to the requirements for all information products, the requirements for working papers, professional papers, research reports, presentation slides, handouts for distribution at presentations, and similar products include the following:

1. Except as noted below, a disclaimer must be included on the title page. The author may determine the wording of the disclaimer as long as it indicates that any views expressed are those of the author and not necessarily those of the Census Bureau. An example of a disclaimer is: “Any views expressed are those of the author(s) and not necessarily those of the U.S. Census Bureau.”

Note: The disclaimer is not needed for:
 Census Bureau publications, news releases, abstracts, and handouts for advisory committee meetings.
 Information products that are distributed internally.
 Information products that have been reviewed and approved by the Associate Director as not needing a disclaimer because the documents do not contain personal views (e.g., working papers).
 Presentation slides, unless they will be distributed as handouts or published (e.g., in conference proceedings).

2. Working papers published on the Census Bureau’s Web site and written entirely by non-Census Bureau individuals (e.g., external researchers at the Census Bureau’s Research Data Centers) must incorporate the disclaimer described above, with an additional statement indicating that the Census Bureau has not reviewed the paper for accuracy or reliability and does not endorse its contents or conclusions.

3. Personal views may be expressed only if they are appropriate for the paper or presentation because they are on statistical, methodological, technical, or operational issues.

4. Working papers and professional papers that discuss the results of qualitative research not supported by statistical testing (e.g., based on samples that are not random, are nonrepresentative, or are too small to provide statistical support of the results) must include a caveat explaining why the qualitative methods used do not support statistical testing. The caveat also must address how the findings can (or cannot) be extended to wider populations.

5. Information products based on data with “serious data quality issues” related to
nonsampling error may be written only when their purpose is not to report, analyze, or discuss characteristics of the population or economy, but to:
 Analyze and discuss data quality issues or research on methodological improvements, or to
 Report results of evaluations or methodological research.

Note: Statistical Quality Standard F1, Releasing Information Products, describes serious data quality issues and the restrictions on releasing information products with such issues.

Note: Although not a requirement of the statistical quality standards, the Census Bureau requires presentation slides to use the PowerPoint templates featuring the Census Bureau wordmark provided at the Customer Liaison and Marketing Services Office Intranet Web site.

Sub-Requirement E2-2.2: In addition to the requirements for all information products, the requirements for tabulations include the following:

1. The level of detail for tabulations must be appropriate for the level of sampling error, nonsampling error, and any other error associated with the estimates.

2. All tabulations, except as noted for custom tabulations in item 4 below, must present estimates that take into account the sample design (e.g., weighted estimates).

3. All tabulations, except as noted for custom tabulations in item 4 below, must account for missing or invalid data items (e.g., use imputed data, adjust weights, or display the weighted total of the cases where the data were not reported).

4. Custom tabulations must:
a. Present weighted estimates unless a client requests unweighted tabulations. If unweighted tabulations are produced for a client, a discussion of the issues associated with using unweighted counts must be provided with the tabulations. Providing a reference (URL) citing the discussion is not sufficient.
b. Account for missing or invalid data items unless a client requests custom tabulations that exclude imputed data. If tabulations are produced for a client that exclude imputed data, additional metadata must be provided with the tabulations to describe and quantify the level and the extent of the missing data. Providing a reference (URL) citing the metadata is not sufficient.
c. Include measures of statistical uncertainty (e.g., CVs, standard errors, MOEs, confidence intervals, or their Bayesian equivalents) with weighted tabulations, or include a reference (URL) to the measures of statistical uncertainty.
If a program manager thinks that computing estimates of sampling error is not feasible (e.g., for reasons of cost, schedule, or resources), the program manager must work with their research and methodology Assistant Division Chief (ADC) to provide the client with acceptable measures of statistical uncertainty or the means to compute them.

Note: Although not a requirement of the statistical quality standards, program managers who produce custom tabulations must refer to and follow the requirements of Data Stewardship Policy DS021, Custom Tabulations.

5. If any differences are identified (e.g., with a footnote) as statistically significant in any table within an information product, then all statistically significant differences must be similarly identified in all the tables. However, it is not required to identify statistically significant differences in tables.

6. Tabulations must be formatted to promote clarity and comprehension of the data presented.

Examples of formatting practices that promote clarity and comprehension include:
 Presenting at most four dimensions in a cross-tabulation.
 Labeling all variables.

 Using row or column percentages to reinforce the text description of the relationships involved.
 Labeling the type of statistics being presented (e.g., frequency, percentage, means, and standard errors).
 Presenting totals and subtotals when appropriate.
 Labeling header columns for each page in multi-page tabulations.
 Indicating when a data value is suppressed because of disclosure issues.
 Footnoting anomalous values (e.g., outliers).

7. Displaying estimates that equal zero and symbols in tables must be appropriate for the content/subject matter being presented and must follow acceptable statistical practice. An estimate that equals zero should be shown as a numeric value, e.g., 0.00 for two-decimal accuracy. The exception is when the estimate is less than half of a unit of measurement from zero and there is a meaningful difference between an actual zero and a rounded zero for the particular statistics. Use the symbol without additional punctuation such as parentheses. Use an “X” instead of “(X)”.

Examples of approved standard symbols:
a. A ‘Z’ means the estimate rounds to zero.
b. An ‘S’ means that the estimate is withheld because the estimate did not meet publication standards.
c. An ‘X’ means that the estimate is not applicable.
d. An ‘N’ means that the estimate is not available or not comparable.
e. A ‘D’ means that the estimate is withheld to avoid disclosing data for individual companies; data are included in higher-level totals.

Sub-Requirement E2-2.3: In addition to the requirements for all information products, the requirements for statistical graphs, figures, and maps include the following:

1. The dimensions of graphs, figures, and maps must be consistent with the dimensions of the data (e.g., three-dimensional effects must not be used when displaying only two dimensions of data).

2. Graphs, figures, and maps must be formatted to promote clarity and comprehension of the data presented.
Examples of formatting practices that promote clarity and comprehension include:
 Labeling axes and including the unit of measure.
 Including a legend that defines acronyms, special terms, and data values.
 Preparing graphs, figures, and maps in the same format throughout the information product.
 Using consistent scales across graphs, figures, or maps that are likely to be compared.
 Using units of measure appropriate to the scale of the graph, figure, or map.
 Starting the base of the graph or figure at zero to avoid giving an inappropriate visual impression.

 Ensuring that color hues correspond to the level of measurement (e.g., a light-to-dark color scheme corresponds with low-to-high values).
 Complying with accessibility requirements of Section 508 of the U.S. Rehabilitation Act.

Note: The Census Bureau Guideline on the Presentation of Statistical Graphics and the Administrative Customer Service Division (ACSD) Chart Publishing Guidelines provide additional guidance on presenting graphics.

Appendix E2
Economic Indicator Variables

Direct comparison statements that are not statistically significant are not allowed, except for statements of changes in news releases for the economic indicator variables listed in the table below. In these news releases, the footnote below must be provided to indicate that the comparison is not statistically significant:

“The 90 percent confidence interval for the change includes zero. There is insufficient evidence to conclude that the actual change is different from zero.”

For each program, the table lists the frequency, levels, and characteristics covered and marks which of three comparisons apply: current period to prior period; current period to same period one year ago; and year-to-date to prior year-to-date.

Advance Monthly Retail and Food Services Survey (MARTS)
  Frequency: Monthly (quarterly comparisons for December only)
  Levels: Sales
  Characteristics: Total retail plus food services; Total retail

Monthly Wholesale Trade Survey (MWTS)
  Frequency: Monthly
  Levels: Sales; Inventories
  Characteristics: Total wholesale; Durable goods; Nondurable goods

Quarterly Services Survey (QSS)
  Frequency: Quarterly
  Levels: Receipts or revenue; Expenses
  Characteristics: Total; 2-digit sector totals

Manufacturing and Trade Inventories and Sales (MTIS)
  Frequency: Monthly
  Levels: Distributive trades sales plus manufacturers’ shipments, and total business inventories
  Characteristics: Total manufacturing, retail, and wholesale trade

Building Permits Survey (BPS)
  Frequency: Monthly
  Levels: Authorizations, total and by size
  Characteristics: U.S. and Region; Total; 1-unit

Survey of Construction (SOC)
  Frequency: Monthly
  Levels: Starts; Completions, total and by size
  Characteristics: U.S. and Region; Total; 1-unit

Survey of Construction (SOC), continued
  Levels: Sales
  Characteristics: U.S. and region; 1-unit

Construction Value Put-in-Place (VIP) Surveys
  Frequency: Monthly
  Levels: Total expenditures
  Characteristics: Total residential; Total nonresidential; Private total; Private residential; Private nonresidential; Public total; Public residential; Public highway

Quarterly Financial Report (QFR)
  Frequency: Quarterly
  Levels (seasonally adjusted): After-tax profits; Sales; After-tax profits per dollar of sales
  Characteristics (seasonally adjusted): Total manufacturing; Durable goods manufacturing; Nondurable goods manufacturing
  Levels (not seasonally adjusted): After-tax profits; Sales; After-tax profits per dollar of sales
  Characteristics (not seasonally adjusted): Total manufacturing; Durable goods manufacturing; Nondurable goods manufacturing; Total mining; Wholesale trade; Retail trade

Quarterly E-Commerce
  Frequency: Quarterly
  Levels: Total sales; E-commerce sales
  Characteristics: Total retail

Housing Vacancies and Homeownership (CPS/HVS)
  Frequency: Quarterly
  Levels: Rental vacancy rate; Homeowner vacancy rate; Homeownership rate
  Characteristics: U.S.; Regions

Statistical Quality Standard E3
Reviewing Information Products

Purpose: The purpose of this standard is to ensure that information products released by the Census Bureau receive the appropriate reviews required to ensure they are of high quality and do not disclose protected information or administratively restricted information. This standard also ensures that plans to participate at public events are reviewed and approved.

Scope: The Census Bureau’s statistical quality standards apply to all information products released by the Census Bureau and the activities that generate those products, including products released to the public, sponsors, joint partners, or other customers. All Census Bureau employees and Special Sworn Status individuals must comply with these standards; this includes contractors and other individuals who receive Census Bureau funding to develop and release Census Bureau information products.

In particular, this standard applies to the review and approval of:
 Information products, including internal information products that are subsequently released to the public.
 Participation at public events.

Note: Information products (e.g., professional papers, presentations, or other materials) prepared by a Census Bureau employee that pertain to the Census Bureau’s programs, policies, or operations and are related to the employee’s job or area of expertise, are covered by this standard, even if prepared on the employee’s own time, without the use of Census Bureau resources or support. See Department Administrative Order (DAO) 219-1, Section 11, Non-Official Public Communications.

Exclusions: In addition to the global exclusions listed in the Preface, this standard does not apply to:

 Information products prepared or delivered by Census Bureau employees, but which are not related to programs, policies, or operations of the Department of Commerce or the Census Bureau.
(Census Bureau employees or Special Sworn Status individuals who want to include their Census Bureau affiliation as biographical information in the communication should obtain guidance from the Office of Analysis and Executive Support.)

Key Terms: Census Bureau publications, custom tabulations, direct comparison, disclosure, implied comparison, information products, participation, policy view, public event, and working papers.

Requirement E3-1: All Census Bureau information products must be reviewed before release to ensure that disclosure avoidance techniques necessary to prevent unauthorized release of protected information or administratively restricted information have been implemented completely and correctly. Information protected by federal law (e.g., Title 13, Title 15, and Title 26) and by the Confidential Information Protection and Statistical Efficiency Act of 2002

(CIPSEA) is covered by this requirement. (Statistical Quality Standard S1, Protecting Confidentiality, addresses disclosure avoidance techniques.)

Sub-Requirement E3-1.1: The Census Bureau’s Disclosure Review Board (DRB) procedures must be followed for information products that use data protected by Title 13 to prevent unauthorized release of protected information or administratively restricted information, particularly personally identifiable information or business identifiable information. (See the DRB Intranet Web site for further guidance and procedures.)

Requirement E3-2: To maintain the Census Bureau’s position as unbiased and neutral with regard to policy and political issues, employees must submit an Event Participation Approval Form through their Division Chief to the Chief, International Relations Office, to receive approval to participate in public events within the United States, except for the conferences noted below. Appendix E3-A contains the Event Participation Approval Form. See the Census Bureau’s Intranet Web page on participation at public events for further information.

Definitions:
(1) A “public event” means that the event is open to the general public, including events that require a registration fee.
(2) “Participation” means that the employee takes an active role in the event.

Examples of the types of activities that constitute participation and require an Event Participation Approval Form include:
 Presenting a paper or poster at a professional association conference.
 Organizing and/or chairing a session at a professional association conference.
 Acting as a discussant.
 Serving as a panelist.
 Giving a seminar or workshop at colleges, universities, the Washington Statistical Society, or other organizations.
 Making a presentation as an expert member of a working group or other group.
 Staffing a Census Bureau-sponsored booth at a professional association conference or at a trade show.
 Conducting Foreign Trade Division export compliance seminars on the Foreign Trade Regulations and electronic export reporting for U.S. exporters.

Examples of the types of activities that do not constitute participation and do not require an Event Participation Approval Form include:
 Attending a conference or seminar as a member of the audience only.
 Participating in corporate recruiting sponsored by the Human Resources Division, including conducting information sessions and other presentations at colleges or universities.

Examples of events that are not public and do not require an Event Participation Approval Form include:
 Attending a meeting with a survey sponsor.
 Attending private planning meetings for cooperation between statistical agencies.

 Attending meetings sponsored by the Census Bureau to elicit input from survey users.
 Presenting a Center for Statistical Research and Methodology (CSRM) seminar at the Census Bureau.

Notes:
(1) The Event Participation Approval Form is not needed for the following conferences:
 Joint Statistical Meetings (JSM) of the American Statistical Association (ASA)
 American Association of Public Opinion Research (AAPOR)
 International Conference on Establishment Surveys (ICES)
 Population Association of America (PAA)
 American Economics Association
 International Statistical Institute (ISI)
 Association of American Geographers (AAG)
(2) Multiple employees that participate in the same session of a conference need submit only one form.
(3) Contact the Chief of the International Relations Office regarding attendance at international events or for questions regarding whether an Event Participation Form must be submitted.

Requirement E3-3: All information products must undergo review and receive approval before they are released to the public, to sponsors, or to other customers. Sub-Requirements E3-3.1 through E3-3.5 describe the types and levels of review needed.

Examples of information products covered by this requirement include, but are not limited to:
 News releases.
 Census Bureau publications (i.e., information products that the program’s Associate Director has reviewed and approved and the Census Bureau has affirmed their content).
 Working papers (e.g., technical papers and division reports intended for release to the public).
 Professional papers (e.g., journal articles, book chapters, conference papers, poster sessions, and written discussant comments).
 Research reports used to guide decisions about Census Bureau programs.
 Abstracts.
 Presentations at public events, such as seminars or conferences. (See Requirement E3-4 for additional requirements for presentations.)
 Handouts for distribution at public events.
 Data sets (e.g., public-use files) and their associated documentation.
 Tabulations, including custom tabulations, estimates, and their associated documentation.
 Statistical graphs, figures, and thematic maps.

Notes:
(1) Drafts of information products to be released for limited circulation (e.g., professional papers) outside the Census Bureau are subject to the requirements for a supervisory review as stated in Sub-Requirement E3-3.1. The other reviews (i.e., content/subject matter, statistical, and policy) are not required unless the supervisor determines that the product needs any of those reviews.
(2) While not a statistical quality requirement, Census Bureau policy requires that the Chief of the Demo-Econ Media Relations Branch in the Public Information Office (PIO) be informed of any information products or other materials being prepared for public use. (See the Census Bureau Policies and Procedures Manual, Chapter B-13 – Clearance and Release of Public Information Materials, for further guidance and procedures.)

Sub-Requirement E3-3.1: All information products must undergo a supervisory review and receive approval.

1. The following table specifies who must perform the supervisory review and approval.

Type of Information Product      Supervisory Reviewers
Census Bureau publications        Author’s immediate supervisor
                                  Author’s Division or Office Chief
                                  Associate Director of the program releasing the information product
News releases                     Author’s immediate supervisor
                                  Author’s Division or Office Chief
                                  Associate Director of the program releasing the information product
                                  Associate Director for Communications
All other information products    Author’s immediate supervisor
                                  Author’s Division or Office Chief

2. The supervisory reviewer must verify that the following requirements have been met.

All information products
a. The content of the information product is technically and factually correct.
b. All mandated disclosure avoidance procedures have been followed.
c. The provisions for reviewing and releasing information products in any data-use agreements have been followed.
d. The information product complies with the Census Bureau’s statistical quality standards.
e. If the information product is a draft to be released for limited circulation outside the Census Bureau, it must include a disclaimer that states the draft is still under review and is not for distribution.

All information products containing text
f. No policy views are expressed in the information product.
g. No personal views are expressed in Census Bureau publications or news releases.
h. Only personal views on statistical, methodological, technical, or operational issues are expressed in information products other than Census Bureau publications and news releases.
i. A disclaimer is included on the title page in all information products except as noted below. The author may determine the wording of the disclaimer as long as it indicates that any views expressed are those of the author and not necessarily those of the Census Bureau. An example of a disclaimer is: “Any views expressed are those of the author(s) and not necessarily those of the U.S. Census Bureau.”

Note: The disclaimer is not needed for:
 Census Bureau publications, news releases, abstracts, and handouts for advisory committee meetings.
 Information products that are distributed internally.
 Information products that have been reviewed and approved by the Associate Director as not needing a disclaimer because the documents do not contain personal views (e.g., working papers).
 Presentation slides, unless they will be distributed as handouts or published (e.g., in conference proceedings).

j. The information is presented logically and any results follow from the data and the analysis.
k. Any anomalous findings are addressed appropriately.
l. Correct grammar is used.
m. Presentation slides use the required Census Bureau PowerPoint template found on the Customer Liaison and Marketing Services Office (CLMSO) Intranet Web site. (Note: This is a Census Bureau Corporate Identity Standard.)

Notes:
(1) When the author is a Division or Office Chief, the supervisory reviewer is the author’s Associate Director. When the author is a higher-level manager than a Division or Office Chief, the supervisory review is waived.
(2) When the author is a Senior Technical (ST) employee, the supervisory reviewer is the Chief of the Center for Statistical Research and Methodology.

Sub-Requirement E3-3.2: All information products, except data sets and custom tabulations, must undergo a content/subject-matter review and receive approval. However, the documentation that accompanies data sets or custom tabulations must receive a content/subject-matter review.

1. The following table specifies who must perform the content/subject matter review and approval:
Type of Information Product      Content/Subject Matter Reviewers
Abstracts                         Author’s Division or Office Chief
All other information products    Reviewers who are outside the author’s organizational unit (branch), and who have expertise in the subject matter, operation, or statistical program discussed in the information product. If a qualified outside reviewer is not available, a reviewer within the author’s organizational unit is permitted.

2. The content/subject matter reviewer must verify that the following requirements have been met:

a. The content of the information product is technically and factually correct.
b. The information is presented logically and any conclusions follow from the data and the analysis. Any anomalous findings are addressed appropriately.
c. Subject-matter literature is referenced in the information product, as appropriate.

3. The content/subject matter reviewer must either approve the information product or provide the author with specific written instructions on issues to be revised.

4. The content/subject matter reviewer must review the information product again after the author addresses any recommended revisions. If the reviewer and the author disagree with how the comments are addressed, they must inform their supervisors so that a resolution can be reached.

Note: If an information product is generated from a program sponsored by an outside organization or uses data provided by an outside organization, the author's Division or Office Chief should determine whether to send the product to the outside organization for an additional review.

Sub-Requirement E3-3.3: All information products must undergo a statistical review and receive approval, even if the author believes the information product involves no statistical methodologies.

1. The following table specifies who must perform the statistical review and approval:

Type of Information Product: Statistical Reviewers
Conference papers: Reviewers who have expertise in the statistical methodology or program discussed in the information product
  Note: Appendix E3-B provides a list of statistical review contacts for conference papers.
Abstracts: Author's Division or Office Chief
  Note: If the Division or Office Chief determines that an abstract requires a more rigorous statistical review, he or she must refer the abstract to the appropriate Research and Methodology Assistant Division Chief (ADC).
All other information products: Research and Methodology ADC of the program related to the topic of the information product
  Note: Appendix E3-B provides a list of statistical review contacts by topic/subject matter.

2. The statistical reviewer must verify that the following requirements have been met:
a. The discussion of assumptions and limitations is accurate and appropriate.
b. The description of the reliability of the data is accurate and complete.

c. Statistical testing is performed correctly to support any comparison statements, whether expressed directly or implied.
d. Calculations and equations are accurate and statistically sound.
e. The content, conclusions, and any recommendations on technical, statistical, or operational issues are supported by the methodology used and the data presented.
f. A source statement is included in the information product. (See Requirement E2-2, item 2, in Statistical Quality Standard E2, Reporting Results.)
g. Statistical uncertainty is appropriately conveyed.
h. Comparison statements, such as historical comparisons, are appropriate.

3. The statistical reviewer must either approve the information product or provide the author with specific written instructions on issues to be revised.

4. The statistical reviewer must review the information product again after the author addresses any recommended revisions. If the reviewer and the author disagree on how the comments are addressed, they must inform their supervisors so that a resolution can be reached.

Notes:
(1) Media releases that do not contain estimates or discussions of statistical or survey methodology need not undergo a statistical review (e.g., media advisories such as the one titled, "Census Bureau Releases Timetable for 2008 Income, Poverty and Health Insurance Estimates and American Community Survey Data").
(2) Two types of geographic products need not undergo a statistical review:
a) Thematic maps presenting data from the census short form if the underlying data have been reviewed and approved.
b) Geographic reference products (e.g., reference maps, and documents showing lists and numbers of geographic entities and relationships between entities).

Sub-Requirement E3-3.4: All information products involving methodologies other than statistical must undergo a methodological review and receive approval.

1. The review must be conducted by individuals with expertise in the methodologies used in
the information product (e.g., cognitive psychology, economics, demographic analysis, geographic information systems, or any other specialized methodology).

2. The methodological reviewer must either approve the information product or provide the author with specific written instructions on issues to be revised.

3. The methodological reviewer must review the information product again after the author addresses any recommended revisions. If the reviewer and the author disagree on how the comments are addressed, they must inform their supervisors so that a resolution can be reached.

Sub-Requirement E3-3.5: All information products must undergo a policy and sensitivity review by the author's Division or Office Chief. The Division or Office Chief may not delegate this review.

Notes:
(1) Appendix E3-C provides a checklist developed by the Office of Analysis and Executive Support (OAES) to assist in the policy and sensitivity review. If the Division or Office Chief needs guidance on a specific issue, he or she may refer the issue to the Associate Director, the OAES, the Congressional Affairs Office (CAO), or the PIO, as appropriate.
(2) When the author is a Division or Office Chief or higher-level manager, the policy and sensitivity review is at the discretion of the author's supervisor.

Requirement E3-4: All presentations (with or without a paper) to be delivered by Census Bureau staff at meetings and conferences open to the public (including advisory and data user meetings) must undergo a dry run rehearsal.

1. A senior division manager (ADC or higher) must attend the dry run.
2. All reviewers must be invited to the dry run.
3. Authors must provide copies of their presentations and any other relevant materials to everyone invited, in advance of the dry run.

Notes:
(1) Presentations that have had a dry run and are simply being repeated at another venue do not need another dry run unless substantive changes have been made to the presentation.
(2) The dry run is optional for Division or Office Chiefs or higher and for Senior Technical (ST) employees, at the discretion of their supervisors.

Requirement E3-4.1: Authors of informal presentations (e.g., presentations without written remarks or audio-visual aids, including unwritten discussant or panelist remarks) must review their approach with their Division or Office Chief.

Note: When the author is a Division Chief, Office Chief, or other higher-level manager, this review is at the discretion of the author's supervisor.
Requirement E3-5: The results of the review and approval of information products must be documented, either electronically or on paper, and the documentation retained according to division or directorate policies and procedures.

Examples of documentation include:
• Completed approval forms.
• Approval e-mail messages from reviewers.

Appendix E3-A
Event Participation Approval Form and Instructions

Employees must submit the Event Participation Approval Form before making a firm commitment to participate in public events in order to:
• Ensure that the Census Bureau maintains neutrality on all policy and political issues and avoids any appearance of partiality on those issues.
• Keep the Deputy Director and the Executive Staff informed of staff participation in public events.

Note: This form is not needed for the following conferences:
• Joint Statistical Meetings (JSM) of the American Statistical Association (ASA)
• American Association of Public Opinion Research (AAPOR)
• International Conference on Establishment Surveys (ICES)
• Population Association of America (PAA)
• American Economics Association
• International Statistical Institute (ISI)
• Association of American Geographers (AAG)

In addition, this form does not apply to international events. Contact the Chief of the International Relations Office regarding attendance at international events, including events in U.S. territories.

Instructions

The participating employee must:
1. Provide the following information on the form:
• The name, sponsor, date, location, and description of the event.
• The topic of the session or panel. Complete a form for each session of a conference that has Census Bureau participants; one form for the entire conference is not sufficient if Census Bureau employees participate in more than one session.
• The names and affiliations of all participants in the session or panel.
• The audience anticipated.
  o Indicate whether the event is open to the public or limited to invited guests, organization members, and/or paying attendees.
  o Indicate whether media and congressional, federal, state, or local government representatives are expected to attend.
2. Obtain approval from their Division Chief.
3. Submit the completed form to the Chief, International Relations Office.

Event Participation Approval Form

Participant Name:
Date of Invitation:
Date Submitted:
Event Name and Location:
Event Date:
Event and Sponsor Description: (e.g., event theme, organization mission statement, website address)
Description of Panel, Session, or Discussion Topic:
Invited Participants: (Include affiliation of each participant)
Audience: (Indicate whether the meeting is open to the public or limited to invited guests, organization members, or paying attendees) (Indicate whether media and congressional, federal, state, or local government representatives are expected to attend)
Attached Materials: (If any)
Division Chief Approval: __________________

Submit completed forms to the Chief, International Relations Office

Appendix E3-B
Statistical Review Contacts

Statistical Review Contacts by Topic / Subject Matter

Topic / Subject Matter: Contact
Census 2000 / 2010: Assistant Division Chief (ADC) for Sampling and Estimation (DSSD)
American Community Survey (ACS): ADC for ACS Statistical Design (DSSD)
Demographic Surveys (e.g., CPS, NHIS, and SIPP): ADC for Sample Design and Estimation (DSMD)
Small Area Estimates: Chief, Center for Statistical Research and Methodology
Administrative records data: Chief, Center for Administrative Records Research and Applications (CARRA)
Economic Programs data: Chief, Office of Statistical Methods and Research for Economic Programs (OSMREP) or appropriate Research and Methodology ADC
International data, multiple data sources, or other data: Appropriate ADC for Research and Methodology for the author's directorate

Statistical Review Contacts for Conferences

Conference: Contact
Joint Statistical Meetings (JSM): Chief, Center for Statistical Research and Methodology
American Association of Public Opinion Research (AAPOR): Chief, Center for Statistical Research and Methodology
International Conference on Establishment Surveys (ICES): Chief, Office of Statistical Methods and Research for Economic Programs (OSMREP)
Population Association of America (PAA): ADC for Sample Design and Estimation (DSMD)
Federal Committee on Statistical Methodology (FCSM): Chief, Center for Statistical Research and Methodology
American Economics Association (AEA): Chief, Center for Economic Studies (CES)
Census Advisory Committee or National Academy of Sciences: Appropriate ADC for Research and Methodology for the author's directorate
International Statistical Institute (ISI): Chief, Center for Statistical Research and Methodology
International Union for the Scientific Study of Population (IUSSP): Chief, Demographic Statistical Methods Division (DSMD)
Other international conferences, including the UN Economic Commission for Europe and Organization for Economic Cooperation and Development: Appropriate ADC for Research and Methodology for the author's directorate
Association of American Geographers (AAG): Appropriate ADC for Research and Methodology for the author's directorate
SAS Users: Appropriate ADC for Research and Methodology for the author's directorate
All other conferences: Appropriate ADC for Research and Methodology for the author's directorate

Appendix E3-C
Policy and Sensitivity Review Checklist for Division and Office Chiefs

This checklist should be used to determine the suitability for publication and release of Census Bureau information products.

If the answer to any of the following questions is "yes," then the information product proposed for publication/release must not be released until the issue raised by the question has been resolved appropriately.

1. Is the information product inconsistent with the Census Bureau's mission?

2. Would publication/release of the information product compromise the Census Bureau's ability to perform its mission?

3. Does the information product express views on or discuss any of the following topics in an inappropriate manner or in a way that is inconsistent with laws, Commerce Department policies, or Census Bureau policies?
a. Laws, regulations, Federal Register notices, court cases, congressional testimony, or policy statements or decisions pertaining to the Commerce Department or the Census Bureau. Examples include:
• Sections of the Commerce Department's Code of Federal Regulations.
• Chapters of the Census Bureau's Policies and Procedures Manual.
• Census Bureau's Data Stewardship Policies.
• Census Bureau's Information Technology Security Policies and Regulations.
b. The Freedom of Information Act or the Privacy Act.
c. Matters that are currently being investigated by Congress.
d. Issues relating to privacy, confidentiality, data security, or access to and use of administrative records (including any issues related to personally or business identifiable information or data breaches).
e. Budget/appropriations issues.
f. Any issue that is politically sensitive or that has been the subject of recent news articles, correspondence, hearings, or current or potential lawsuits. Examples of sensitive issues include:
• Current poverty estimates.

• Concerns about the American Community Survey (ACS).
• Concerns about the Local Update of Census Addresses Program (LUCA).
• Concerns about the enumeration of sensitive populations (racial or ethnic populations, immigrants, the homeless, or Group Quarters (GQ) populations such as prisoners, residents of nursing homes, or college students).
• Concerns about the enumeration of overseas Americans.
• Concerns about statistical sampling or adjustment of decennial census counts.
g. Sensitive historical issues like the internment of Japanese Americans or statistical adjustment of the decennial census.

4. Is it possible that release of the information product will affect any national policy issues related to the topics it discusses?

5. Does the information product discuss matters related to sharing confidential Title 13 and/or Title 26 information/data in a way that suggests the sharing is inconsistent with laws, Census Bureau policies, or IRS policies?

6. Does the information product suggest or imply that the Census Bureau may be cooperating in any way with an enforcement, regulatory, or administrative activity of another government agency? An example would be a discussion of providing tabulations of public-use data to a federal law enforcement agency. It would be acceptable to discuss the Census Bureau's policy to encourage the agency to perform the tabulations and to inform the agency that any tabulations provided by the Census Bureau are subject to public disclosure.

7. Does the information product discuss specific contract/acquisitions issues or information in a manner that improperly discloses commercial proprietary information or trade secrets?

8. Does the information product single out a particular group or category of individuals to receive special treatment, consideration, or recognition (e.g., identifying key partners who contributed to the decennial census effort) in a manner that might compromise the Census Bureau's ability to perform its mission?

9. Does the information product contain any subject matter or language that might be deemed offensive, insensitive, or inappropriate?

10. Does the information product lack the disclaimer (if required) indicating that the information product represents the author's views (on statistical, methodological, technical, or operational issues) and does not necessarily represent the position of the Census Bureau? (Statistical Quality Standard E2, Reporting Results, specifies when the disclaimer is required.)

Note: If the disclaimer is required but missing, the author must add it before the information product may be published or released.

RELEASING INFORMATION

F1 Releasing Information Products
    Appendix F1: Dissemination Incident Report
F2 Providing Documentation to Support Transparency in Information Products
F3 Addressing Information Quality Guideline Complaints
    Appendix F3: Procedures for Correcting Information that Does Not Comply with the Census Bureau's Information Quality Guidelines

Statistical Quality Standard F1
Releasing Information Products

Purpose: The purpose of this standard is to establish quality criteria for releasing information products.

The OMB's Statistical Policy Directive No. 3 and Statistical Policy Directive No. 4 describe requirements for notifying the public of the release of information products. The Census Bureau's Product Release Notification Policy and Policies and Procedures Manual (Chapter B-13, Clearance and Release of Public Information Materials) describe procedures for notifying the PIO about information products to be released to the public.

Note: Statistical Quality Standard F2, Providing Documentation to Support Transparency in Information Products, contains specific requirements about documentation that must be readily accessible to ensure transparency in information products released outside the Census Bureau.

Scope: The Census Bureau's statistical quality standards apply to all information products released by the Census Bureau and the activities that generate those products, including products released to the public, sponsors, joint partners, or other customers. All Census Bureau employees and Special Sworn Status individuals must comply with these standards; this includes contractors and other individuals who receive Census Bureau funding to develop and release Census Bureau information products.

Exclusions: In addition to the global exclusions listed in the Preface,
(1) Requirements F1-2 and F1-3 of this standard do not apply to:
• Professional papers, presentations, and similar information products.
• Information products delivered to sponsors or clients (e.g., data files and tabulations).
(2) Requirements F1-7 through F1-10 of this standard do not apply to:
• Professional papers, presentations, and similar information products.
Key Terms: Coefficient of variation (CV), coverage ratio, dissemination, estimate, information product, metadata, nonresponse bias, nonsampling error, releases of information products, response rate, sample design, and sampling error.

Requirement F1-1: Neither protected information nor administratively restricted information may be released outside the Census Bureau, except as allowed under applicable federal laws (e.g., Title 13, Title 15, and the Confidential Information Protection and Statistical Efficiency Act) and data-use agreements.

Sub-Requirement F1-1.1: Throughout all processes associated with releasing information products, unauthorized release of protected information or administratively restricted information must be prevented by following federal laws (e.g., Title 13, Title 15, and Title 26), Census Bureau policies (e.g., Data Stewardship Policies), and additional provisions governing the use of the data (e.g., as may be specified in a memorandum of understanding or data-use agreement). (See Statistical Quality Standard S1, Protecting Confidentiality.)

Requirement F1-2: Information products released to the public by the Census Bureau must be released according to a dissemination plan that addresses:
1. What information product(s) are planned for release.
2. The release schedule. The release schedule for all regular or recurring information products for the upcoming year must be published on www.census.gov before January 1 of that year. (See OMB Statistical Policy Directive No. 4.)
3. The reviews and approvals needed before releasing the information products to the public.
4. The mode of release by the Census Bureau.

Requirement F1-3: Policies and procedures for disseminating information products, including those related to any planned data revisions or any corrections for data quality issues identified after an information product has been released, must be documented and published on the Census Bureau's Internet Web site.

Requirement F1-4: Information products must not be released outside the Census Bureau until they receive the appropriate reviews and approvals. (See Statistical Quality Standard E3, Reviewing Information Products.)

Requirement F1-5: Embargoed news releases and data files must not be released to the public by any means (including print, broadcast, Internet, podcast, blogs, or in any other form) before the specified date and time of release. (See the U.S. Census Bureau Embargo Policy.)

Requirement F1-6: Information products must comply with the Census Bureau's statistical quality standards and must be free of serious data quality issues in order to be released outside the Census Bureau without restrictions.

1. Serious data quality issues related to sampling error occur when the estimated coefficients of variation (CV) for the majority of the key estimates are larger than 30 percent.

Notes:
(1) This requirement does not apply to secondary estimates.
For example, if the estimated month-to-month change is the key estimate, and the monthly estimates are secondary, the requirement applies only to the estimated month-to-month change.
(2) Statistical Quality Standard A1, Planning a Data Program, provides requirements for identifying key estimates.

2. Serious data quality issues related to nonsampling error occur when:
a. All products:
1) The data suggest that the primary survey concepts are not clearly defined or that measurement of the concepts failed for some reason.
2) The key estimates are inconsistent with our base of knowledge about the characteristic being estimated.

3) Issues that are serious enough to raise concerns about the accuracy of the data occur in sample design, sampling methods, questionnaire or forms design, data collection, data processing, estimation procedures, or the underlying assumptions of a model.
b. Products derived primarily from census or survey data:
1) Unit response rates for surveys or censuses, or cumulative unit response rates for panel or longitudinal surveys, are below 60 percent.
2) Sample attrition from one wave to the next wave in panel or longitudinal surveys is greater than five percent.
3) Item response rates or total quantity response rates on key items are below 70 percent.
4) Coverage ratios for population groups associated with key estimates are below 70 percent.
5) Combined rates for key estimates (e.g., computed as unit response × item response × coverage) are below 50 percent.

Notes:
(1) These thresholds are provided because bias is often associated with low response rates or with low coverage ratios. If nonresponse bias analyses or other studies show that the bias associated with nonresponse is at an acceptable level, or that steps taken to mitigate nonresponse bias or coverage error are effective, these thresholds do not apply.
(2) The Census Bureau conducts a few surveys that do not use probability samples. Generally, they are establishment surveys that select the largest units in the target universe and do not attempt to collect data from the small units in the universe. For these surveys, the above thresholds do not apply. These surveys have serious data quality issues if the responding units do not comprise at least 70 percent of the target universe, based on the unit response rate or the total quantity response rate, as appropriate.
(3) Statistical Quality Standard D3, Producing Measures and Indicators of Nonsampling Error, specifies requirements on computing response rates.
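The thresholds above can be expressed as a simple arithmetic screen. The sketch below is illustrative only and is not an official Census Bureau tool: the function names and data layout are assumptions introduced for this example, and rates are expressed as proportions rather than percentages.

```python
# Illustrative screen for the Requirement F1-6 thresholds (not an official tool).
# All rates and the CV are proportions in [0, 1].

def has_serious_sampling_issue(cvs_of_key_estimates):
    """Serious sampling-error issue: CV exceeds 0.30 for a majority of key estimates."""
    high = sum(1 for cv in cvs_of_key_estimates if cv > 0.30)
    return high > len(cvs_of_key_estimates) / 2

def combined_rate(unit_response, item_response, coverage):
    """Combined rate for a key estimate: unit response x item response x coverage."""
    return unit_response * item_response * coverage

def has_serious_nonsampling_issue(unit_response, item_response, coverage):
    """Apply the census/survey thresholds in Requirement F1-6, item 2b."""
    return (unit_response < 0.60        # 1) unit response rate below 60 percent
            or item_response < 0.70     # 3) item response rate below 70 percent
            or coverage < 0.70          # 4) coverage ratio below 70 percent
            or combined_rate(unit_response, item_response, coverage) < 0.50)  # 5)

# Example: each individual rate clears its own threshold, yet the
# combined rate 0.75 * 0.80 * 0.78 = 0.468 falls below 50 percent.
print(round(combined_rate(0.75, 0.80, 0.78), 3))        # 0.468
print(has_serious_nonsampling_issue(0.75, 0.80, 0.78))  # True
```

The example illustrates why item 2b(5) exists: response and coverage losses compound multiplicatively, so a product can fail the combined-rate threshold even when every individual rate is acceptable.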
Sub-Requirement F1-6.1: Information products with data free from the serious data quality issues described in Requirement F1-6 may be released outside the Census Bureau with no restrictions, subject to confidentiality constraints.

Sub-Requirement F1-6.2: Information products with data that have any of the serious data quality issues in Requirement F1-6 may be released outside the Census Bureau only under the restrictions described below.

1. Restrictions for information products with serious data quality issues related to sampling error. The information product must:
a. Note that the CV exceeds 0.30 for a majority of the key estimates.

b. Note that data users should exercise caution when using estimates with high sampling error.
c. Indicate why the data are being released (e.g., aggregates of the estimates may be useful, or the knowledge that the estimates have extremely high magnitude or extremely low magnitude may be useful).

2. Restrictions for information products with serious data quality issues related to nonsampling error:
a. Products that are Census Bureau publications or regular or recurring products (i.e., products governed by Statistical Policy Directive No. 3 or Statistical Policy Directive No. 4):
1) The program manager must obtain a waiver before releasing the information product.
2) The information product must summarize any nonsampling error issues related to Requirement F1-6, item 2a (1 through 3).
3) If response rates, coverage ratios, or the combined rates fall below the thresholds in Requirement F1-6, item 2b:
i. The key estimates affected must be identified.
ii. A table must be included that provides the response rates or coverage ratios for key estimates in enough detail to allow users to evaluate how the issue may affect their use of the data. Other quantitative measures of the impact of the issue should be included to the extent feasible.
4) The information product must include details about the potential impact of the quality issues on the data.
5) The information product must include the URL of the complete documentation on the nonsampling error issues.
b. Products released to sponsors:
1) The information product must summarize any nonsampling error issues related to Requirement F1-6, item 2a (1 through 3).
2) If response rates, coverage ratios, or the combined rates fall below the thresholds in Requirement F1-6, item 2b:
i. The key estimates affected must be identified.
ii. A table must be included that provides the response rates or coverage ratios for key estimates in enough detail to allow users to evaluate how the issue may affect their use of the data. Other quantitative measures of the impact of the issue should be included to the extent feasible.
3) The information product must include details about the potential impact of the quality issues on the data.
4) The delivery of the product to the sponsor must include the complete documentation on the nonsampling error issues or a URL where the documentation is accessible.

c. Products that are not Census Bureau publications or are not regular or recurring products (e.g., custom tabulations, data files, professional papers, working papers, technical reports, and similar products):
1) Release to the public is not allowed, except as noted in item 2) below. The information product may be released only on request. If released on request, the information product must:
i. Include this disclaimer: "These data are being released on request, despite concerns about their quality. The Census Bureau's policy is not to withhold data that are available, unless releasing such data would violate confidentiality requirements. The Census Bureau recommends using these data only for research or evaluation purposes, and not to make statements about characteristics of the population or economy because they do not meet the criteria outlined in the Census Bureau's Statistical Quality Standard: Releasing Information Products."
ii. Summarize the nonsampling error issues.
iii. Include summary metadata describing the issues and the impact on the data.
iv. Provide the URL of the complete documentation on the nonsampling error issues.
2) Release is permitted only for information products whose purpose is not to report, analyze, or discuss characteristics of the population or economy, but whose purpose is to:
• Analyze and discuss data quality issues or research on methodological improvements, or to
• Report results of evaluations or methodological research.
3) External researchers at the Census Research Data Centers may not have access to confidential data that are affected by serious data quality issues, except to analyze the data quality issues, including developing potential solutions. If the researcher has corrected the data quality issues and the Census Bureau has determined that the researcher's solutions are appropriate, the revised data may be used for subject-matter (e.g., poverty) analyses.
Requirement F1-7: When a data quality issue that might be serious is suspected in a previously released information product, the program manager must notify Census Bureau senior management of the issue immediately after it has been identified. At a minimum, the senior managers to be notified include:
1. The Division Chief(s) responsible for the program with the suspected data quality issue.
2. The Associate Director responsible for the program with the suspected data quality issue.

Note: These senior managers will decide whether the issue should be escalated to the Deputy Director and provide guidance on the appropriate actions to take and the specific stakeholders or organizations to notify regarding the suspected data quality issue.

Requirement F1-8: When serious data quality issues are identified in a previously released information product, a notification must be disseminated to alert the public. If the product was released to a sponsor, the notifications must be made to the sponsor.

1. The notification must be disseminated immediately after identifying a serious data quality issue, even if the issue is not yet fully understood.
a. If appropriate, the data affected by the data quality issue must be removed from the Census Bureau's Internet Web site at this time.

2. The notification must include the following components, with additional information that facilitates understanding the issue and its effects as appropriate:
a. A description of the issue.
b. A description of what is known about the effect on the data.
c. A description of what is known about the cause.
d. A statement indicating the data have been removed until the issue has been fixed (if appropriate).
e. Plans for addressing the issue.
f. Expected release dates of revised products.

3. If the notification is disseminated before the issue is fully understood, it must be updated when a more complete understanding is achieved.

Note: Program managers must notify the responsible Division Chief(s) and Associate Director (Requirement F1-7) before making notifications to the public or sponsors.

Requirement F1-9: Any serious error or data quality issue identified in a previously released information product must be addressed appropriately.

Examples of appropriate actions to address serious errors and data quality issues include:
• Correct the error and re-release the product.
• Release an "errata" document for the product, describing the error and the correction.
• If it is not feasible to correct an error, release a description of the error and its likely effects on the program's estimates and results.
 If a data user or a sponsor reported the error, acknowledge the report and indicate when the issue is expected to be resolved. If the error will not be corrected, respond and explain to the user why it will not be corrected and what actions will be taken to address the error.

Sub-Requirement F1-9.1: Serious errors or data quality issues identified in a previously released information product must be documented by completing the Dissemination Incident Report found in Appendix F1 and submitting it to the Quality Program Staff.

Requirement F1-10: Information products approved for release to the public must be published on the Census Bureau's Internet Web site and must adhere to the requirements of Section 508 of the U.S. Rehabilitation Act.

Appendix F1
Dissemination Incident Report

Purpose: This report documents the nature of incidents involving the dissemination of information products with serious inaccuracies or other quality-related problems and the factors associated with those incidents, as required by Sub-Requirement F1-9.1 in Statistical Quality Standard F1 (Releasing Information Products). The information gathered in this report is needed to identify common factors that contribute to disseminating inaccurate information products and to help prevent future incidents.

Scope: Please prepare a dissemination incident report for:
 Releases of information products with errors or other quality-related problems (release to the public, release to sponsors, or other agencies within the Commerce Department).
 "Near misses" in which detection of an error, outside the normal review process, prevented the release of an erroneous product.
 Releases (or "near misses") of data without adequate disclosure avoidance measures. Do NOT report any specifics about disclosure avoidance procedures.
Note: If a product is sent to a sponsor and an error is found as part of the normal review process, it is out of scope. However, products released with errors identified after the review would be in scope.

Instructions: The dissemination incident report has two parts:
 An initial report (see Table A), which documents the incident. Program managers complete this initial report.
 A detailed report (see Tables B, C, and D), which gathers information to promote understanding of the factors associated with the incident and its impact. The Quality Program Staff will interview program managers to gather this information.
1. Within one week of discovering the incident, complete the initial dissemination incident report, Table A, and send it to the QPS at [email protected].
2. Review and answer the questions in the detailed dissemination incident report in Tables B, C, and D and be prepared to discuss these questions and answers when the QPS conducts the interview with you.
3. The QPS will contact you to schedule an interview to be held approximately 2 weeks after receipt of the initial report.
4. After the interview, review the dissemination incident report generated by the Quality Program Staff for accuracy and completeness, and provide comments to the QPS.
5. Determine and implement actions to prevent recurrence of the incident. Report these actions and the actual or planned dates of implementation to the QPS when they contact you in the follow-up.
6. After three months, the QPS will follow up with you to find out what preventative measures were taken as a result of the incident (see Table E, section II of the detailed dissemination incident report).
7. The QPS will submit a summary report based on all the incident reports to the Program Associate Directors.
This report is not a substitute for your Directorate's procedures for dealing with dissemination incidents. Contact the Quality Program Staff (QPS) at [email protected] or on 301-763-6598 if you have any questions regarding these instructions.

I. Initial Dissemination Incident Report
Please provide the following information regarding the incident in the table below and email the completed table to the Quality Program Staff at [email protected].

TABLE A: DOCUMENTATION OF THE INCIDENT
Provide the following:
A1. Contact Information (name and phone number)
A2a. Directorate
A2b. Division (use acronym)
A3. Program or Survey Name (list all affected)
A4a. Name of specific information products affected by the incident
A4b. Type(s) of specific information products affected by the incident. Select all that apply:
a) Reports (e.g., publications, working papers, summary brief, documentation, highlights)
b) Microdata or summary files
c) Tables
d) News release
e) Other – please specify
A5. Description of the incident
A6. Incident type – Select from the following:
a) Inaccurate data released
b) Incomplete data released
c) Wrong file released
d) Improper disclosure (including releasing information prematurely)
e) Geographic error
f) Display error (e.g., incorrect heading or symbol)
g) Other – please specify
A7. Date the incident was detected
A8. Description of the problem(s) that generated the incident

TABLE A: DOCUMENTATION OF THE INCIDENT (continued)
Provide the following:
A9. Problem type(s) – Select all that apply:
a) Needed action not performed
b) Wrong action performed
c) Action not performed correctly
d) Action performed out of sequence
e) Wrong data file used
f) Wrong variables used
g) Communication
h) Other – please specify
A10. Date the problem(s) that generated the incident occurred
A11. How the incident was detected
A12. Who discovered / reported the incident – Select from the following:
a) Bureau staff – within Branch
b) Bureau staff – within Division
c) Other Bureau staff
d) User – public
e) User – sponsor
f) User – Congressional or Commerce
g) Other contact outside the Census Bureau – specify
A13. Who was notified of the incident (within the Bureau and outside) – Select all that apply:
a) Your Division Chief
b) Your Associate Director
c) Deputy Director
d) Other Census Bureau divisions – specify
e) Commerce Undersecretary
f) BEA
g) Other – specify
h) None of the above
A14. Frequency of the incident (i.e., Is this the only occurrence? Has it happened before?)

TABLE A: DOCUMENTATION OF THE INCIDENT (continued)
Provide the following:
A14.1 Did the incident affect multiple releases of a recurring product?
A15. Your preliminary assessment of the severity of the incident using a scale of 1 for minor to 5 for extremely severe. Explain the basis of your assessment (What you know at this point – e.g., How many cases were affected? By how much were the estimates overstated or understated?)
A16. The immediate actions taken to address the incident – Select all that apply:
a) Removed data from web
b) Posted user note on web
c) Posted revised data
d) Notified sponsor
e) Notified selected users
f) Other – specify
A17. How many person-hours did it take to complete this table? Count the time spent by anyone who participated in completing Table A.

***After completing the initial report (Table A), please remember to review and answer the questions in the detailed report on the following pages and be prepared to discuss these questions and answers when the Quality Program Staff interviews you.***

Detailed Dissemination Incident Report
I. Review the questions in the detailed dissemination incident report and be prepared to answer these questions when the Quality Program Staff conducts the interview with you. This information will promote understanding of the incident and its impact.
B0. Please keep track of how many person-hours it takes to review the questions in Tables B, C, and D and prepare for the interview with the Quality Program Staff. Please record the number of person-hours in Question D6. Count the time spent by anyone who participated in reviewing Tables B, C, and D.

TABLE B: ORIGIN OF THE INCIDENT
(For each question, answer Y/N/NA; the Quality Program Staff will obtain details from the program manager.)
In what process did the incident originate?
B1. Planning / development
a) Stakeholder input / concepts to measure
b) Instrument development
c) Pretesting / testing
d) Frame and sample development / selection
e) Quality checks (i.e., error originated in the QC check)
f) Other – specify
B2. Collecting / acquiring data
a) Interview mode / timing
b) Interviewing
c) Quality checks (i.e., error originated in the QC check)
d) Other – specify
B3. Capturing and processing data
a) Data entry / electronic capture
b) Transmitting
c) Editing
d) Imputation
e) Coding
f) Record linkage
g) Geographic processing / geocoding
h) Quality checks (i.e., error originated in the QC check)
i) Other – specify
B4. Producing estimates and measures
a) Weighting / post collection adjustment
b) Variance estimation
c) Modeling / seasonal adjustment
d) Tabulation
e) Creating Summary file
f) Creating Microdata file
g) Quality checks / analyst data review
h) Other – specify
B5. Analyzing data / reporting results
a) Data analysis
b) Report writing / Production of tables
c) Quality checks / review of products
d) Other – specify
B6. Releasing information products
a) Dissemination
b) User documentation
c) Quality checks (i.e., error originated in the QC check)
d) Other – specify
B7. Protecting Confidentiality
a) Disclosure avoidance
b) Other – specify
B8. Other – specify

TABLE C: FACTORS ASSOCIATED WITH THE INCIDENT
(For each question, answer Y/N/NA and explain the factors that contributed to the incident.)
Procedures
C1. Did any of these factors regarding existing procedures contribute to the incident:
a) A lack of documented procedures?
b) Procedures that were not up-to-date?
c) Procedures that were not followed properly?
d) Lack of tools to ensure that procedures were followed (e.g., a checklist)?
C2. Did any of these factors contribute to the incident:
a) Inadequate production procedures?
b) Inadequate quality control procedures?
c) Inadequate change control procedures?
d) Inadequate version control procedures?
Requirements
C3. Did inadequate or incomplete requirements for the processes where the problem occurred contribute to the incident?
C4. Did the failure of requirements to reflect program needs contribute to the incident?

TABLE C: FACTORS ASSOCIATED WITH THE INCIDENT (continued)
C5. Did any of these factors contribute to the incident:
a) A lack of documented requirements?
b) Requirements were not kept up-to-date?
c) Requirements were not followed?
Specifications
C6. Did inadequate or incomplete specifications for the processes where the problem occurred contribute to the incident?
C7. Did any of these factors contribute to the incident:
a) A lack of documented specifications?
b) Specifications that were not kept up-to-date?
c) Specifications that were not followed?
Methods
C8. Did the use of suboptimal methods contribute to the incident (e.g., use of inappropriate analysis methods)?
C9. Did the application of inappropriate or incorrect methods contribute to the incident?
Computer Programming and Implementation
C10. Did any of these factors contribute to the incident:
a) Computer programs did not accurately reflect the specifications?
b) Computer programs were not run in the correct order?
c) Incorrect versions of the computer files were used?
C11. Did software errors contribute to the incident (e.g., mistakes in computer code)?

TABLE C: FACTORS ASSOCIATED WITH THE INCIDENT (continued)
Quality Control
C12. Did mistakes during the application of manual or clerical methods contribute to the incident? (e.g., copy and paste errors, data entry errors, manual geographic edits, forgetting to perform a step)
C13. Did the failure to perform any of these quality checks contribute to the incident:
a) Check the accuracy of data, results, etc.?
b) Monitor operations (e.g., monitoring data collection)?
c) Test systems to ensure that they function as intended?
d) Test and implement process or system changes?
e) Perform the required reviews of the information products (e.g., supervisory, statistical, content, and policy)?
C13.1 Did a quality check fail to catch the error?
Communication Internal to the Census Bureau
C14. Did failures in handoffs contribute to the incident (e.g., methods that are labor intensive and carry a risk of introducing errors)?

TABLE C: FACTORS ASSOCIATED WITH THE INCIDENT (continued)
C15. Did any of these communications failures contribute to the incident:
a) Failure to communicate responsibilities?
b) Failure to communicate changes to the people who need to know (e.g., changes to procedures, specifications, or requirements)?
c) Failure to communicate institutional knowledge? (e.g., staff no longer work in the program area and no documentation exists for staff taking over)
d) Failure of other internal communications?
C16. Did any of the following factors contribute to the incident:
a) Misinterpretation of procedures?
b) Misinterpretation of specifications?
c) Misinterpretation of requirements?
C17. Did inadequate training contribute to the incident?

TABLE C: FACTORS ASSOCIATED WITH THE INCIDENT (continued)
Communication with Entities Outside the Census Bureau
C18. Did insufficient communication to stakeholders (e.g., ESA, Commerce, sponsors, and users) contribute to the incident? For example,
o Data released with known problems, but discussion of these problems was inadequate.
o Data correct, but apparent anomalies were not fully explained.
Resources
C19. Did any of these factors contribute to the incident:
a) Excessive time constraints?
b) Budget limitations?
C20. Did inadequate staff resources contribute to the incident?
C21. Did misaligned skills of staff contribute to the incident?
C22. Did the experience levels of staff or managers contribute to the incident?
C23. Did conflicting priorities contribute to the incident?
Other
C24: Please select one of the following:
a) Normal process (been done before)
b) Change in the normal process
c) First time the process (or procedure) was performed
C25. What other factors contributed to the incident?

TABLE D: FINAL ASSESSMENT OF THE INCIDENT
D1. Now that you know more about the incident, what is your final assessment of the severity of the incident using a scale of 1 for minor to 5 for extremely severe?
D2. Explain the basis for this assessment (e.g., How many sample cases were affected? By how much were the estimates overstated or understated? What level of geography was affected?)
D3. What actions were (will be) taken to prevent recurrence of the problems that resulted in the incident, and when were (will) they implemented?
D4. Beyond the vulnerabilities that contributed to this incident, what other operational gaps are you aware of that increase the risk for additional incidents?
D5. (Answer if identified operational gaps in D4) What do you think you need to close those gaps (e.g., resources, software, and hardware)?
D6. How many person-hours did it take to review the questions in Tables B, C, and D and prepare for the interview with the Quality Program Staff? Count the time spent by anyone who participated in reviewing Tables B, C, and D.

II. Follow-up
After three months, the Quality Program Staff will follow up with program managers to find out what preventative measures were taken as a result of the incident.

TABLE E: PREVENTATIVE MEASURES
E1. What actions were (will be) taken to prevent recurrence of the problems that resulted in the incident, and when were (will) they implemented?
E2. What actions were (will be) taken to detect incidents prior to release, and when were (will) they implemented?

Statistical Quality Standard F2
Providing Documentation to Support Transparency in Information Products

Purpose: The purpose of this standard is to specify the documentation that must be readily accessible to the public to ensure transparency and reproducibility in information products released by the Census Bureau.

The documentation required by this standard aims to provide sufficient transparency into the Census Bureau's information products so that qualified users can reproduce the estimates and results in the products. However, federal law (e.g., Title 13, Title 15, and Title 26) and Census Bureau policies require safeguarding the confidentiality of protected information or administratively restricted information. Therefore, complete transparency and reproducibility may not always be possible. At a minimum, the documentation will allow users to assess the accuracy and reliability of the estimates and results in the Census Bureau's information products.

Note: Statistical Quality Standard F1, Releasing Information Products, addresses the required documentation and metadata to describe any serious data quality problems and the likely effects of the problems on the data and estimates in the Census Bureau's information products.

Scope: The Census Bureau's statistical quality standards apply to all information products released by the Census Bureau and the activities that generate those products, including products released to the public, sponsors, joint partners, or other customers. All Census Bureau employees and Special Sworn Status individuals must comply with these standards; this includes contractors and other individuals who receive Census Bureau funding to develop and release Census Bureau information products.

Exclusions: The global exclusions to the standards are listed in the Preface. No additional exclusions apply to this standard.
Key Terms: Administratively restricted information, data program, information product, protected information, qualified user, readily accessible, reproducibility, and transparency.

Requirement F2-1: Documentation that would breach the confidentiality of protected information or administratively restricted information, or that would violate data-use agreements with other agencies, must not be released. (See Statistical Quality Standard S1, Protecting Confidentiality.)

Requirement F2-2: Documentation must be readily accessible in sufficient detail to allow qualified users to understand and analyze the information and to reproduce (within the constraints of confidentiality requirements) and evaluate the results.
The documentation must be made readily accessible by doing one or more of the following:
1. Including the documentation in the information product if it is necessary for readers to understand the results.

2. Referencing the full methodological documentation in the information product (e.g., providing a URL) and publishing the documentation on the Census Bureau's Internet Web site.
3. Delivering the full methodological documentation to the sponsors of reimbursable programs or providing them with a URL to the documentation.
Note: The Census Bureau Geospatial Product Metadata Standard (GPMS) and the Federal Geographic Data Committee (FGDC) Content Standard for Digital Geospatial Metadata (CSDGM) provide additional requirements for geospatial products.

Sub-Requirement F2-2.1: Descriptions of the data program must be readily accessible.
Examples of information that describes the data program include:
 The purpose of the program (e.g., survey, census, evaluation study, or research).
 The organizational sponsor(s) of the program.
 The organization that conducted the program.
 The data source (e.g., organization or agency) and the database or systems from which the data are drawn for administrative records data.
 The universe of inference or target population for the program.

Sub-Requirement F2-2.2: Descriptions of the concepts, variables, and classifications that underlie the data must be readily accessible.
Examples of concepts, variables, and classifications that underlie the data include:
 Definitions of the primary concepts being measured.
 The wording of questions asked in surveys or censuses.
 Identification of the key variables.
 Descriptions of the concepts underlying all variables.
 Geographic levels of the data.
 The reference dates for the data and for the geographic levels.
 Descriptions of any derived measures.

Sub-Requirement F2-2.3: Descriptions of the methodology, including the methods used to collect and process the data and to produce estimates, must be readily accessible.
Examples of documentation of the methodology include:
 Discussion of methods employed to ensure data quality.
 Quality profiles.
(See the Census Bureau Guideline on Quality Profiles.)
 Documentation of pretesting of the data collection instruments, including qualitative studies.
 Source and accuracy statement.
 Description of the sampling frame.
 Description of the sample design.
 The size of the sample.
 Information on eligibility criteria and screening procedures.

 Description of sample weights, including adjustments for nonresponse.
 The mode and methods used to collect the data.
 The dates of data collection.
 Description of any bounding methods used to control telescoping.
 Description of estimation procedures, including weighting, editing, and imputation methods.
 Reasons for not imputing the data when imputation for item nonresponse is not carried out.
 Description of how to calculate variance estimates.
 Discussion of potential nonsampling errors (e.g., nonresponse, coverage, processing, and measurement).
 Discussion of the methods to approximate the standard errors of derived statistics.
 Description of any substantial changes in procedures or methodology over time and the known impact on the data.
 References to methodological documentation maintained by the source organization supplying administrative records data.
 Model description, including assumptions and type of model.
 Equations or algorithms used to generate estimates.
 Description of seasonal adjustment methods. (See the Census Bureau Guideline on Seasonal Adjustment Diagnostics.)
 Description of small area estimation methods.
 Any limitations or data quality problems affecting the estimates or projections.
 Descriptions of known data anomalies and corrective actions.

Sub-Requirement F2-2.3.1: Measures and indicators of the quality of the data must be readily accessible.
Examples of measures and indicators of the quality of the data include:
 The disposition of sample cases (e.g., numbers of interviewed cases, ineligible cases, and nonresponding cases).
 Unit response rates or quantity response rates.
 Item response rates, item allocation rates, total quantity response rates, or quantity response rates for key data items.
 Rates for the types of nonresponse (e.g., refusal, unable to locate, no one home, temporarily absent, language problem, insufficient data, and undeliverable as addressed).
 Coverage ratios.
 Indicators of the statistical precision of the estimates (e.g., estimates of sampling variances, standard errors, coefficients of variation, or confidence intervals).
 Coverage of the target population by the set of administrative records.
 The proportion of administrative records that have missing data items or that contain invalid data for key variables.
 The proportion of data items with edit changes because the data items were invalid or otherwise required changes.
 The proportion of records lost from the analysis or estimate due to nonmatches when linking data sets.

 Effects on the estimates related to coverage issues, nonmatches in record linking, and missing data items in surveys, censuses, or administrative records.
 Model diagnostics (e.g., goodness of fit, coefficient of variation, and percent reduction in confidence interval of the direct estimates).
Note: Statistical Quality Standard D3, Producing Measures and Indicators of Nonsampling Error, contains requirements on producing measures and indicators of nonsampling error.

Sub-Requirement F2-2.3.2: The methodology and results of evaluations or studies of the quality of the data must be readily accessible.
Examples of evaluations or studies of the quality of the data include:
 Nonresponse bias analyses.
 Evaluation studies (e.g., evaluation studies of response error, interviewer variance, respondent debriefing, record check or validation, and mode effects).
 Response analysis surveys.
 Comparisons with independent sources, if available.
 Match analyses.
 Reconciliations (e.g., a comparison of import and export data).
 Periodic summaries of quality control results (e.g., interviewer quality control (QC) results and error rates measured by data entry QC and coding QC).
Note: Results of routine reviews and verifications need not be readily accessible unless needed for data users to assess the quality of the information product.

Sub-Requirement F2-2.4: Documentation of public-use data files must be readily accessible in sufficient detail to allow a qualified user to understand and work with the files.
Examples of documentation of public-use data files include:
 File description.
 File format (e.g., SAS file or text file).
 Variable names and descriptions (e.g., data dictionary or record layout).
 Data type for each variable (e.g., numeric, alphanumeric, and length).
 Description of variables used to uniquely identify records in the data file.
 Description of flags to indicate missing and imputed items.
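Sub-Requirement F2-2.3.1 names several indicators of statistical precision: sampling variances, standard errors, coefficients of variation, and confidence intervals. For readers less familiar with how these quantities relate, the sketch below derives each from an estimate and its sampling variance. The function name and the numbers are hypothetical illustrations, not Census Bureau methodology or data.

```python
import math

def precision_measures(estimate, variance, z=1.96):
    """Derive the precision indicators named in Sub-Requirement F2-2.3.1
    from an estimate and its sampling variance (illustrative sketch)."""
    se = math.sqrt(variance)                      # standard error
    cv = se / estimate * 100                      # coefficient of variation, percent
    ci = (estimate - z * se, estimate + z * se)   # ~95% normal-approximation interval
    return se, cv, ci

# Hypothetical values: an estimate of 50,000 with sampling variance 4,000,000.
se, cv, ci = precision_measures(estimate=50_000, variance=4_000_000)
# se = 2000.0, cv = 4.0, ci = (46080.0, 53920.0)
```

The sketch assumes a normal-approximation interval with z = 1.96; actual survey programs typically obtain variances from replicate or Taylor-series methods and may use other critical values.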

Statistical Quality Standard F3
Addressing Information Quality Guideline Complaints

Purpose: The purpose of this standard is to ensure that complaints alleging that information products are not in compliance with the Census Bureau's Information Quality Guidelines are addressed.

Scope: The Census Bureau's statistical quality standards apply to all information products released by the Census Bureau and the activities that generate those products, including products released to the public, sponsors, joint partners, or other customers. All Census Bureau employees and Special Sworn Status individuals must comply with these standards; this includes contractors and other individuals who receive Census Bureau funding to develop and release Census Bureau information products.

In particular, this standard applies to information products released by the Census Bureau for which a party outside the Census Bureau alleges that the Census Bureau has not adhered to its information quality guidelines.

Exclusions: In addition to the global exclusions listed in the Preface, this standard does not apply to:
 Information released by the Census Bureau before October 1, 2002.

Key Terms: Information products, information quality, and releases of information products.

Requirement F3-1: Complaints must be reviewed by the program manager responsible for the information product being challenged.
Note: The Census Bureau Information Quality Web site contains the correction procedures complainants must follow to submit complaints for information they believe does not comply with the Census Bureau's Information Quality Guidelines.

Requirement F3-2: Except as noted below, program managers must follow the procedure in Appendix F3 to investigate and resolve complaints.
Note: These programs have developed correction procedures specific to their information products and must follow their own correction procedures.
The appeals process, when not separately defined in the program's procedures, will be managed as stated in Appendix F3.
 Count Question Resolution (CQR).
 Local Update of Census Addresses (LUCA).
 Governmental Unit Boundaries.
 Street and Address Range Information.
 Small Area Income and Poverty Estimates (SAIPE).
 Annual Estimates of the Total Population.
 Foreign Trade Statistics.

Requirement F3-3: Corrected information must be readily accessible on the Census Bureau's Internet Web site (www.census.gov), and subsequent issues of recurring information products, including subsequent annual reports, must reflect the corrected data.
Note: Because the Information Quality Guidelines under which these corrections will occur are for statistical information released after October 1, 2002, any correction of historical data suggested by a complaint with which the Census Bureau concurs will be performed at the discretion of the program area.

Requirement F3-4: In the case of a serious error that could potentially mislead policy makers, any published reports containing the erroneous data must be reissued.

Requirement F3-5: Complaints and the resulting actions must be documented by the program manager and submitted to the Chair of the Methodology and Standards Council.

Appendix F3
Procedures for Correcting Information that Does Not Comply with the Census Bureau's Information Quality Guidelines

The following procedures must be followed when complaints alleging that the Census Bureau has not adhered to its information quality guidelines are received.
Note: These procedures do not apply to the seven programs listed in Requirement F3-2 of Statistical Quality Standard F3, Addressing Information Quality Guideline Complaints. Those programs follow their own correction procedures that are specific to their data products.

1. The Census Bureau's Quality Program Staff will notify the Department of Commerce within ten business days of receiving a complaint that alleges a violation of the information quality guidelines.
2. The program manager must review:
a. The information being challenged, in consultation with the appropriate methodology staff.
b. The processes that were used to create and disseminate the information.
c. Whether the information conforms or does not conform to the Census Bureau's Information Quality Guidelines.
3. Based on the outcome of the above review, the Census Bureau will determine if a correction (or corrections) must be made.
4. If the Census Bureau concurs with a complaint, the responsible program manager will, with the concurrence of the area Associate Director in consultation with the Methodology and Standards Council, determine the appropriate corrective action, taking into account such factors as:
 The nature of the information involved.
 The significance and magnitude of the error with respect to the use of the information.
 The cost of implementing the correction.
 The effectiveness of the correction in terms of timeliness.
5. The Census Bureau will respond in writing to the affected person within 60 days of receiving the complaint.
a. If the Census Bureau has completed its review, the response will explain the process that the Census Bureau followed in its review of the complaint, the findings of the review, and the resolution.
b. If the Census Bureau has not completed its review, the response will notify the affected person that a review is underway, and provide an expected completion date. When the review is complete, the Census Bureau must again contact the affected person in writing, and explain the process that the Census Bureau followed in its review of the complaint, the findings of the review, and the resolution.
c. If a correction is warranted, the response will include a progress report, and a subsequent written response will be sent when the correction action is complete.

d. If a correction is not warranted, the Census Bureau will explain that a correction will not be made, and why.
6. If the Census Bureau declines to correct the challenged data, and the affected party appeals, a panel appointed by the Methodology and Standards Council will manage the appeal process.
a. The Census Bureau will respond to all requests for appeals within 60 days of receipt.
b. If the appeal requires more than 60 days to resolve, the Census Bureau will inform the appellant that more time is required, indicate the reason why, and provide an estimated decision date.

SUPPORTING STANDARDS
S1 Protecting Confidentiality
S2 Managing Data and Documents

Statistical Quality Standard S1
Protecting Confidentiality

Purpose: The purpose of this standard is to ensure the confidentiality of protected information and administratively restricted information.

Scope: The Census Bureau’s statistical quality standards apply to all information products released by the Census Bureau and the activities that generate those products, including products released to the public, sponsors, joint partners, or other customers. All Census Bureau employees and Special Sworn Status (SSS) individuals must comply with these standards; this includes contractors and other individuals who receive Census Bureau funding to develop and release Census Bureau information products.

In particular, this standard applies to:
 Data collected from respondents and protected under Title 13.
 Data protected under the Confidential Information Protection and Statistical Efficiency Act (CIPSEA).
 Data collected under Title 15 and protected by legislation governing sponsoring agencies.
 Administrative records provided by source agencies, such as Federal Tax Information (FTI) protected under Title 13 and Title 26.

Exclusions: The global exclusions to the standards are listed in the Preface. No additional exclusions apply to this standard.

Key Terms: Administratively restricted information, bottom-coding, business identifiable information, cell suppression, confidentiality, controlled rounding, controlled tabular adjustment, disclosure, noise infusion, personally identifiable information, protected information, random rounding, recoding, swapping, synthetic data, and top-coding.
Requirement S1-1: All Census Bureau employees and SSS individuals must follow the provisions of federal laws (e.g., Title 13, Title 15, and Title 26), Census Bureau policies (e.g., Information Technology (IT) Security policies and Data Stewardship policies, such as DS018 Data Breach Policy and DS022 Unauthorized Browsing Policy), and data-use agreements to prevent unauthorized release of protected information and administratively restricted information.

Sub-Requirement S1-1.1: Neither protected information nor administratively restricted information may be released outside the Census Bureau, except as allowed under applicable federal laws (e.g., Title 13, Title 15, and CIPSEA) and data-use agreements.

Requirement S1-2: Disclosure avoidance techniques must be used to prevent unauthorized release of protected information and administratively restricted information, particularly personally identifiable information or business identifiable information.

Examples of disclosure avoidance techniques include:
 Random rounding.
 Controlled rounding.
 Top-coding.
 Bottom-coding.
 Recoding.
 Data swapping.
 Generating synthetic data.
 Noise infusion.
 Using rules to define sensitive cells (e.g., thresholds).
 Protecting sensitive cells (e.g., cell suppression, random rounding, controlled rounding, collapsing cells, and controlled tabular adjustment).

Notes:
(1) Contact the Census Bureau’s Disclosure Review Board (DRB) for guidance on disclosure avoidance techniques.
(2) Sub-Requirement E3-1.1 of Statistical Quality Standard E3, Reviewing Information Products, addresses requirements for disclosure avoidance review.
(3) Statistical Policy Working Paper 22: Report on Statistical Disclosure Limitation Methodology, published by the Office of Management and Budget’s Federal Committee on Statistical Methodology, provides information on various techniques to prevent disclosure of protected information.
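As an illustration only (not Census Bureau production code), two of the techniques listed above, top-coding and bottom-coding, can be sketched in Python. The income values and thresholds here are hypothetical.

```python
# Hedged sketch of top-coding and bottom-coding, two of the disclosure
# avoidance techniques listed above. Thresholds and data are hypothetical.

def top_code(values, ceiling):
    """Replace any value above `ceiling` with `ceiling` itself."""
    return [min(v, ceiling) for v in values]

def bottom_code(values, floor):
    """Replace any value below `floor` with `floor` itself."""
    return [max(v, floor) for v in values]

incomes = [12_000, 48_000, 95_000, 1_250_000]  # hypothetical reported values
protected = bottom_code(top_code(incomes, 250_000), 15_000)
print(protected)  # [15000, 48000, 95000, 250000]
```

Extreme values at both tails are pulled to the thresholds, so units with unusual values can no longer be singled out in the distribution.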

Statistical Quality Standard S2
Managing Data and Documents

Purpose: The purpose of this standard is to ensure that data and documentation internal to the Census Bureau are appropriately managed (i.e., files are retained, secured, and accessible to authorized users) to promote the transparency and reproducibility of Census Bureau processes and products, and to inform future projects and improvement efforts.

Note: Statistical Quality Standard F2, Providing Documentation to Support Transparency in Information Products, contains specific requirements about documentation that must be readily accessible to the public to ensure transparency in information products released by the Census Bureau.

Scope: The Census Bureau’s statistical quality standards apply to all information products released by the Census Bureau and the activities that generate those products, including products released to the public, sponsors, joint partners, or other customers. All Census Bureau employees and Special Sworn Status individuals must comply with these standards; this includes contractors and other individuals who receive Census Bureau funding to develop and release Census Bureau information products.

In particular, this standard applies to activities related to managing Census Bureau data and documentation needed to replicate results (e.g., models or survey estimates) from research and evaluation studies, surveys, censuses, and administrative records.

Exclusions: The global exclusions to the standards are listed in the Preface. No additional exclusions apply to this standard.

Key Terms: Administratively restricted information, protected information, reproducibility, transparency, and version control.
Requirement S2-1: Throughout all processes associated with managing data and documents, unauthorized release of protected information or administratively restricted information must be prevented by following federal laws (e.g., Title 13, Title 15, and Title 26), Census Bureau policies (e.g., Information Technology (IT) Security policies and Data Stewardship policies, such as DS007 Information Security Management Program), and additional provisions governing the use of the data (e.g., as may be specified in a memorandum of understanding or data-use agreement). (See Statistical Quality Standard S1, Protecting Confidentiality.)

Requirement S2-2: A plan for data and document management must be developed that addresses:
1. Individuals and divisions responsible for managing the data and documents.
2. Data and documents to be managed.
3. Technical issues relevant to managing the data and documents (e.g., media, retention periods, storage locations, user access rules, version control, file naming conventions, and inventory of files retained).

4. Special operations needed to store and access information (e.g., scanning, encrypting, or compressing data).
5. Timetables for reviewing retained files to verify their usefulness and readability in the stored format (e.g., every five years).

Note: The Disposition of Federal Records: A Records Management Handbook provides guidance on establishing, managing, and operating a records disposition program within a Federal agency. The Census Bureau Guideline on the Long-Term Backup of Research and Evaluation Files and the ACSD records management Intranet page provide additional guidance on managing data files.

Requirement S2-3: Data and documentation needed to replicate and evaluate program or research results must be retained according to Census Bureau policies (e.g., Census Bureau Records Schedules, Records Management Policies in the Census Bureau’s Policies and Procedures Manual, and division-level policies), data-use agreements with providers of administrative records, and appropriate Federal records disposition and archival regulations (e.g., National Archives and Records Administration’s (NARA) statutes).

Examples of data and documentation to retain include:
 Data files and description of variables.
 Planning and design decisions, including the OMB (Office of Management and Budget) Information Collection Request package.
 Analysis plans.
 Field test design and results.
 Cognitive or usability testing results.
 Sampling plan and justifications, including the sampling frame used and any deviations from the plan.
 Justifications for the items on the survey instrument, including why the final items were selected.
 Instructions to respondents and interviewers.
 Description of the data collection and data processing methodologies.
 Questionnaire images.
 Description of the weighting and estimation methods, including variance estimation.
 Description of the imputation and data editing methodologies.
 Specifications and computer code (e.g., specifications and code for sampling, editing, weighting, imputation, analysis, variance estimation, and tabulation).
 Description of models used for estimates and projections.
 Documentation of disclosure avoidance techniques.
 Quality measures, including the equations and interpretations of the measures.
 Evaluation reports, including special evaluations such as nonresponse bias analyses and interviewer variance studies.
 Publicly available documentation associated with the release of data.

Sub-Requirement S2-3.1: An inventory must be developed and maintained to allow authorized users to identify and access the retained data and documents.

Note: The Census Bureau Guideline on the Long-Term Backup of Research and Evaluation Files provides information on producing an inventory to explain retained data and documents to potential users.

WAIVER PROCEDURE

Introduction

The Census Bureau’s statistical quality standards apply to all Census Bureau information products and the programs that develop and release those products, as described in the Scope statement in the Preface to these standards. If a program is not complying, or anticipates that it may be unable to comply, with any requirements of these standards, the program manager must apply for a waiver.

This waiver procedure provides a consistent mechanism to excuse a program from compliance with a statistical quality standard. Waivers will be granted when the circumstances warrant it; however, no waivers to Statistical Quality Standard S1, Protecting Confidentiality, will be granted.

This procedure promotes proper management and control in implementing the standards and ensures that appropriate documentation of exceptions to the standards is generated and maintained. This documentation is important for providing transparency into the quality of the Census Bureau’s information products and for informing future revisions of the statistical quality standards.

Procedure

1. The affected program manager, in collaboration with the program area’s M&S Council representative, must prepare a waiver request using the form Request for a Waiver of a Statistical Quality Standard. The program manager must:
    Indicate the Program(s)/Information Product(s) to be exempted by the waiver.
    Indicate the specific requirement(s) to be waived.
    Describe the noncompliance issue.
    Describe any anticipated effects that may result from the noncompliance.
    Explain why the program area is not able to comply with the specific requirements of the standard.
    Describe any actions to be taken to mitigate the effects of noncompliance.
    Describe the corrective actions planned to achieve compliance, including milestone dates for key accomplishments and the date the program(s)/information product(s) will be brought into compliance.

2. The program manager must email the waiver request to the Quality Program Staff to review for completeness and accuracy.

3. After correcting any issues noted by the Quality Program Staff, the program manager and M&S Council representative must sign the waiver request and submit the completed waiver request to the subject matter Division Chief for concurrence.

4. The Division Chief will review the waiver request and, if concurring, sign the request and forward it to the Quality Program Staff.

5. The Quality Program Staff will schedule the waiver for review by the M&S Council and, as appropriate, by additional stakeholders.

6. The M&S Council will review the waiver request and concur or not concur with the request, noting any recommendations regarding their position.

7. The M&S Council will determine which Associate Director(s) need to approve the waiver request, based on the requirements being waived and the Program(s)/Information Product(s) involved.

8. The Quality Program Staff will forward the waiver request and the Council’s recommendation to the appropriate Associate Director(s) accountable for the quality of the Program(s)/Information Product(s).

9. The Associate Director(s) will approve or deny the waiver and return the waiver request to the Quality Program Staff.

10. The Quality Program Staff will ensure that the Program Manager, Division Chief, and the M&S Council receive a copy of the approved or denied waiver request.

11. The Quality Program Staff will publish approved waiver requests on the M&S Council Intranet page.

12. The Quality Program Staff will maintain records of all waiver requests and their resolutions and use them to inform future revisions of the standards.

13. If the waiver is granted, the program manager must develop a corrective action plan and implement the corrective actions described in the waiver request, within the timeline stated on the waiver request. If the corrective actions will not be implemented on time, another waiver must be requested.

14. The Quality Program Staff must follow up on the implementation of the corrective action plan and periodically report on the progress to the M&S Council.

15. After the corrective action has been completed, the Quality Program Staff will notify the M&S Council and update the M&S Council Intranet page to indicate when the program came into compliance.

Questions

If you have questions regarding the waiver procedure or whether a waiver is needed, contact the Quality Program Staff or the appropriate M&S Council representative.

Waiver for Quality Standard Requirement

Affected Program(s) / Information Product(s): <Indicate the Program(s)/Information Product(s) to be exempted by this waiver.>

Noncompliance: <Describe how the program area is or will not be in compliance.>

Anticipated effects: <Describe any anticipated effects that may result from the noncompliance.>

Justification: <Explain why the program area is not able to comply with the specific requirement.>

Mitigating Actions: <Describe any actions being taken to mitigate the effects of noncompliance.>

Corrective Action Plan: <Describe the corrective actions planned to achieve compliance. Include milestone dates for key accomplishments, including the date when the Program(s)/Information Product(s) will be brought into compliance.>

Waiver for Quality Standard Requirement

Program Manager: <Program Manager Name>  Date:

M&S Council Representative: <M&S Council Representative Name>  Date:

Subject Matter Division Chief: <Concurs / Does not Concur>  Date:
<Subject Matter Division Chief Name>
<Subject Matter Division Chief Title>

Methodology and Standards Council: <Concurs / Does not Concur>  Date:
<Chair, Methodology and Standards Council>

Associate Director: <Approved / Denied>  Date:

Statistical Quality Standards
GLOSSARY

-A-

Accuracy of survey results refers to how closely the results from a sample can reproduce the results that would be obtained from a complete count (i.e., census) conducted using the same techniques at the same time. The difference between a sample result and the result from a complete census taken under the same conditions and at the same time is an indication of the precision of the sample result.

Administrative records and administrative record data refer to microdata records contained in files collected and maintained by administrative or program agencies and commercial entities. Government and commercial entities maintain these files for the purpose of administering programs and providing services. Administrative records (e.g., Title 26 data) are distinct from systems of information collected exclusively for statistical purposes, such as data from censuses and surveys that are collected under the authority of Titles 13 or 15 of the United States Code (U.S.C.). For the most part, the Census Bureau draws upon administrative records developed by federal agencies. To a lesser degree, it may use information from state, local, and tribal governments, as well as commercial entities. To obtain these data, the Census Bureau must adhere to a number of regulatory requirements.

The Administrative Records Tracking System (ARTS) is an electronic database on the Census Bureau’s Intranet.
It tracks Census Bureau administrative records agreements, agreement commitments, administrative data projects, and relevant external contacts.

Administratively restricted information (as defined in Data Stewardship Policy DS007, Information Security Management Program) consists of agency documentation that is not intended as a public information product and other pre-release or embargoed public information.

Examples of administratively restricted information include:
 “For Official Use Only” (FOUO) information: Internal Census Bureau documentation consisting of program or operational materials (e.g., contracting, financial, budget, security, legal, policy documents) determined by management to be either protected under the Freedom of Information Act and/or of a nature that release could negatively impact the mission of the Census Bureau.
 Embargoed data or reports that have not been released, but meet Disclosure Review Board requirements for public release.
 Proprietary contractor information, such as its cost proposal and labor rates.
 All information not otherwise protected by statutory authority, but that is subject to access and/or use restrictions, as provided in a valid Agreement with the government agency or other entity supplying the information.
 All personally identifiable information (PII) not protected by an existing legal authority.
 All business identifiable information (BII) not protected by an existing legal authority.

Allocation involves using statistical procedures, such as within-household or nearest neighbor matrices populated by donors, to impute for missing values.

American National Standards Institute codes (ANSI codes) are a standardized set of numeric or alphabetic codes issued by the American National Standards Institute (ANSI) to ensure uniform identification of geographic entities through all federal government agencies.
The autocorrelation function of a random process describes the correlation between the processes at different points in time.

Automated record linkage is the pairing of data, primarily via computer software.

An autoregressive integrated moving average (ARIMA) model is a generalization of an autoregressive moving average (ARMA) model for nonstationary time series. A nonstationary time series is a time series not in equilibrium about a constant mean level. In a nonstationary time series, the mean or variance of the series may not be the same at all time periods. The model is generally referred to as an ARIMA(p,d,q) model, where p, d, and q are integers greater than or equal to zero and refer to the order of the autoregressive, integrated (differencing), and moving average parts of the model respectively.

An autoregressive moving average (ARMA) model is a stationary model of time series data where the current data point and current stochastic error are each modeled as finite linear regressions of previous data points or stochastic errors respectively. The regression for the data points is referred to as an autoregression. The regression for the stochastic errors is referred to as a moving average. Symbolically, the model is denoted as an ARMA(p,q) model, where p and q are integers greater than or equal to zero and refer to the order of the autoregressive and moving average parts of the model respectively. A stationary time series is a time series in equilibrium about a constant mean level. These models are fitted to time series data either to better understand the data or to predict future points in the series.

-B-

Behavior coding of respondent/interviewer interactions involves systematic coding of the interaction between interviewers and respondents from live or taped field or telephone interviews to collect quantitative information. When used for questionnaire assessment, the behaviors that are coded focus on behaviors indicative of a problem with the question, the response categories, or the respondent’s ability to form an adequate response.

Bias is the difference between the expected value of an estimator and the actual population value.

Blocking is grouping the records of a set into mutually exclusive, exhaustive pieces by using a set of fields (e.g., state, last name, first initial). Usually used in the context of record linkage.

The Bonferroni correction is a method used to address the problem of multiple comparisons. It is based on the idea that if an experimenter is testing n dependent or independent hypotheses on a set of data, then one way of maintaining the family-wise error rate is to test each individual hypothesis at a statistical significance level of 1/n times what it would be if only one hypothesis were tested.

Bottom-coding is a disclosure limitation technique that involves limiting the minimum value of a variable allowed on the file to prevent disclosure of individuals or other units with extreme values in a distribution.

A bridge study continues an existing methodology concurrent with a new methodology for the purpose of examining the relationship between the new and old estimates.

Business identifiable information is information defined in the Freedom of Information Act (FOIA) as trade secrets or commercial or financial information, that is obtained from a person representing a business entity, and which is privileged and confidential (e.g., Title 13) and exempt from automatic release under FOIA. Also included is commercial or other information that, although it may not be exempt from release under the FOIA, is exempt from disclosure by law (e.g., Title 13). Also see Personally identifiable information.

-C-

The calibration approach to estimation for finite populations consists of: (a) a computation of calibration weights that incorporate specified auxiliary information and are restrained by calibration equation(s); (b) the use of these weights to compute linearly weighted estimates of totals and other finite population parameters: weight times variable value, summed over a set of observed units; (c) an objective to obtain nearly design unbiased estimates as long as nonresponse and other nonsampling errors are absent.

Cell suppression is a disclosure limitation technique where sensitive cells are generally deleted from a table and flags are inserted to indicate this condition.

A census is a data collection that seeks to obtain data directly from all eligible units in the entire target population. It can be considered a sample with a 100 percent sampling rate. The Economic Census may use administrative records data rather than interviews for some units.

Census Bureau publications are information products that are backed and released by the Census Bureau to the public. “Backed and released by the Census Bureau” means that the Census Bureau’s senior management officials (at least through the Associate Director responsible for the product) have reviewed and approved the product and the Census Bureau affirms its content. Because publications do not contain personal views, these information products do not include a disclaimer.

Clerical record linkage is record matching that is primarily performed manually.

A cluster is a set of units grouped together on the basis of some well-defined criteria. For example, the cluster may be an existing grouping of the population such as a city block, a hospital, or a household; or may be conceptual such as the area covered by a grid imposed on a map.

Coding is the process of categorizing response data using alphanumeric values so that the responses can be more easily analyzed.
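The linearly weighted (calibration) estimator described in the -C- entry above, weight times variable value summed over the observed units, can be sketched as follows. This is an illustration only; the weights and observed values are hypothetical.

```python
# Hedged sketch of the linearly weighted estimate of a population total
# described in the calibration entry above. Weights and values are
# hypothetical illustration figures, not survey data.

def weighted_total(weights, values):
    """Estimate a population total: sum of weight * value over observed units."""
    return sum(w * y for w, y in zip(weights, values))

weights = [120.0, 95.5, 200.0]  # hypothetical calibration weights
values = [3.0, 1.0, 2.0]        # hypothetical observed variable values
print(weighted_total(weights, values))  # 120*3 + 95.5*1 + 200*2 = 855.5
```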
Coefficient of variation (CV) is a measure of dispersion calculated by dividing the standard deviation of an estimate by its mean. It is also referred to as the relative standard error.

Cognitive interviews are used as a pretesting technique consisting of one-on-one interviews using a draft questionnaire to find out directly from respondents about their problems with the questionnaire. In a typical cognitive interview, respondents report aloud everything they are thinking as they attempt to answer a survey question.

Computer-assisted personal interviewing (CAPI) is an interviewing technique similar to computer-assisted telephone interviewing, except that the interview takes place in person instead of over the telephone. The interviewer sits in front of a computer terminal and enters the answers into the computer.

Computer-assisted telephone interviewing (CATI) is an interviewing technique, conducted using a telephone, in which the interviewer follows a script provided by a software application. The software is able to customize the flow of the questionnaire based on the answers provided, as well as information already known about the participant.

A confidence interval is a range of values determined in the process of estimating a population parameter. The likelihood that the true value of the parameter falls in that range is chosen in advance and determines the length of the interval. That likelihood is called the confidence level. Confidence intervals are displayed as (lower bound, upper bound) or as estimate ± MOE, where MOE = z-value * standard error of the associated estimate (when the confidence level = 90%, the z-value = 1.645).

Confidence level is the probability that an assertion about the value of a population parameter is correct.

Confidence limits are the upper and lower boundaries of the confidence interval.
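The CV and confidence interval formulas defined above can be sketched directly; the estimate and standard error below are made-up illustration values, not Census Bureau figures.

```python
# Hedged sketch of the coefficient of variation and the 90 percent confidence
# interval formula (estimate ± 1.645 * standard error) defined above.

Z_90 = 1.645  # z-value for a 90 percent confidence level, as stated above

estimate = 52_000.0       # hypothetical survey estimate
standard_error = 1_300.0  # hypothetical standard error of that estimate

cv = standard_error / estimate               # coefficient of variation
moe = Z_90 * standard_error                  # margin of error
interval = (estimate - moe, estimate + moe)  # (lower bound, upper bound)

print(f"CV = {cv:.3f}, 90% CI = ({interval[0]:.1f}, {interval[1]:.1f})")
```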
Confidentiality involves the protection of personally identifiable information and business identifiable information from unauthorized release.

Controlled rounding is a form of random rounding, but it is constrained to have the sum of the published entries in each row and column equal the appropriate published marginal totals.

Controlled tabular adjustment is a perturbative method for statistical disclosure limitation in tabular data. This method perturbs sensitive cell values until they are considered safe and then rebalances the nonsensitive cell values to restore additivity.

A convenience sample is a nonprobability sample, from which inferences cannot be made. Convenience sampling involves selecting the sample from the part of the population that is convenient to reach. Convenience sampling is not allowed for Census Bureau information products.

Covariance is a characteristic that indicates the strength of relationship between two variables. It is the expected value of the product of the deviations of two random variables, x and y, from their respective means.

Coverage refers to the extent to which elements of the target population are listed on the sampling frame. Overcoverage refers to the extent that elements in the population are on the frame more than once, and undercoverage refers to the extent that elements in the population are missing from the frame.

Coverage error, which includes both undercoverage and overcoverage, is the error in an estimate that results from (1) failure to include all units belonging to the target population or failure to include specified units in the conduct of the survey (undercoverage), and (2) inclusion of some units erroneously either because of a defective frame or because of inclusion of unspecified units or inclusion of specified units more than once in the actual survey (overcoverage).
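The random rounding on which the controlled rounding defined above builds can be sketched as follows. This is an illustration only, not Census Bureau code; the base and cell values are hypothetical, and a production controlled-rounding routine would additionally constrain the rounded cells to preserve the published marginal totals.

```python
import random

# Hedged sketch of unbiased random rounding to base 5. Each cell value is
# rounded down or up to an adjacent multiple of the base, with the probability
# of rounding up proportional to the remainder, so the expected value of the
# rounded cell equals the original value.

def random_round(value, base=5, rng=random):
    lower = (value // base) * base  # nearest multiple of `base` at or below
    remainder = value - lower
    if remainder == 0:
        return value  # already a multiple of the base; publish as-is
    # Round up with probability remainder / base, otherwise round down.
    return lower + base if rng.random() < remainder / base else lower

random.seed(12345)  # fixed seed only so the illustration is repeatable
print([random_round(v) for v in [3, 10, 17, 22]])
```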
A coverage ratio is the ratio of the population estimate of an area or group to the independent estimate for that area or group. The coverage ratio is sometimes referred to as a coverage rate and may be presented as a percentage.

Cross-sectional studies (also known as cross-sectional analysis) form a class of research methods that involve observation of some subset of a population of items all at the same time. The fundamental difference between cross-sectional and longitudinal studies is that cross-sectional studies take place at a single point in time and that a longitudinal study involves a series of measurements taken on the same units over a period of time. See Longitudinal survey.

Cross-validation is the statistical practice of partitioning a sample of data into subsets such that the analysis is initially performed on a single subset, while the other subset(s) are retained for subsequent use in confirming and validating the initial analysis.

Custom tabulations are tables prepared by the Census Bureau at the request of a data user or program sponsor. This terminology does not apply to tables produced by Census Bureau software (e.g., FERRET or American FactFinder).

A cut-off sample is a nonprobability sample that consists of the units in the population that have the largest values of a key variable (frequently the variable of interest from a previous time period). For example, a 90 percent cut-off sample consists of the largest units accounting for at least 90 percent of the population total of the key variable. Sample selection is usually done by sorting the population in decreasing order by size, and including units in the sample until the percent coverage exceeds the established cut-off.

-D-

Data capture is the conversion of information provided by a respondent into electronic format suitable for use by subsequent processes.
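The cut-off sample selection described above (sort the population in decreasing order by size and include units until the cut-off share of the total is covered) can be sketched as follows; the population sizes are hypothetical.

```python
# Hedged sketch of 90 percent cut-off sample selection as described above.
# Population size values are hypothetical illustration figures.

def cutoff_sample(sizes, cutoff=0.90):
    """Select the largest units until `cutoff` of the population total is covered."""
    total = sum(sizes)
    selected, covered = [], 0.0
    for size in sorted(sizes, reverse=True):  # decreasing order by size
        if covered / total >= cutoff:
            break  # the cut-off share of the total is already covered
        selected.append(size)
        covered += size
    return selected

population = [500, 300, 120, 50, 20, 10]  # total = 1000
print(cutoff_sample(population))  # [500, 300, 120] covers 92% of the total
```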
Data collection involves activities and processes that obtain data about the elements of a population, either directly by contacting respondents to provide the data or indirectly by using administrative records or other data sources. Respondents may be individuals or organizations.

Data collection instrument refers to the device used to collect data, such as a paper questionnaire or computer-assisted interviewing system.

A data program is a program that generates information products, often on a regular schedule. These programs include efforts such as the censuses and surveys that collect data from respondents. Data programs also include operations that generate information products from administrative records and operations that combine data from multiple sources, such as various surveys, censuses, and administrative records. Specific examples of multiple source data programs include the Small Area Income and Poverty Estimates (SAIPE) program, the Population Division’s “Estimates and Projections” program, the National Longitudinal Mortality Study, and the Annual Survey of Manufactures (ASM). One-time surveys also are considered data programs.

Data-use agreements for administrative records are signed documents between the Census Bureau and other agencies to acquire restricted state or federal data or data from vendors. These are often called Memoranda of Understanding (MOU).

Derived statistics are calculated from other statistical measures. For example, population figures are statistical measures, but population-per-square-mile is a derived quantity.

The design effect is the ratio of the variance of a statistic, obtained from taking the complex sample design into account, to the variance of the statistic from a simple random sample with the same number of cases.
Design effects differ for different subgroups and different statistics; no single design effect is universally appli rvey or analysis. cable to any given su s out a difference between estimates. direct comparison is a statement that explicitly point A Direct estimates are estimates of the true values of the target populations, based on the sample design and resulting survey data collected on the variable of interest, only from the time period of interest and only from sample units in the domain of interest. Direct estimates may be ratio adjustment, hot or cold deck imputation, adjusted using explicit or implicit models (e.g., and non-response adjustment) to correct for nonresponse and coverage errors. Disclosure is the release of personally identifiab le information or business identifiable information outside the Census Bureau. Dissemination ored distribution of information to the means Census Bureau-initiated or spons public (e.g., publishing information products on the Census Bureau Internet Web site). Dissemination does not include distribution limited to govern ment employees or agency contractors or grantees; intra-agency or inter-agency use or sharing of government information; r the Freedom on Information Act, the Privacy and response to requests for agency records unde 156</p> <p><span class="badge badge-info text-white mr-2">167</span> Act, or other similar law. This definition also does not include distribution limited to s releases, archival records, public filings, correspondence with individuals or persons, pres subpoenas, or adjudicative processes. dress rehearsal is a complete test of the data co llection components on a small sample under A Field test . conditions that mirror the full-implementation. 
-E-

Editing is the process of identifying and examining missing, invalid, and inconsistent entries and changing these entries according to predetermined rules, other data sources, and recontacts with respondents with the intent to produce more accurate, cohesive, and comprehensive data. Some of the editing checks involve logical relationships that follow directly from the concepts and definitions. Others are more empirical in nature or are obtained through the application of statistical tests or procedures.

Equivalent quality data is data obtained from another source than the respondent, which have quality equivalent to data reported by the respondent. Equivalent quality data have three possible sources: 1) data directly substituted from another census or survey (for the same reporting unit, question wording, and time period); 2) data from administrative records; or 3) data obtained from some other equivalent source that has been validated by a study approved by the program manager in collaboration with the appropriate Research and Methodology area (e.g., company annual reports, Securities and Exchange Commission (SEC) filings, and trade association statistics).

An estimate is a numerical quantity for some characteristic or attribute calculated from sample data as an approximation of the true value of the characteristic in the entire population. An estimate can also be developed from models or algorithms that combine data from various sources, including administrative records.

Estimation is the process of using data from a survey or other sources to provide a value for an unknown population parameter (such as a mean, proportion, correlation, or effect size), or to provide a range of values in the form of a confidence interval.

Exploratory studies (also called Feasibility studies) are common methods for specifying and evaluating survey content relative to concepts. In economic surveys, these studies often take the form of company or site visits.
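The editing process described above applies predetermined rules to flag invalid or inconsistent entries. A minimal sketch of a rule-based edit check (the rules and field names are invented for illustration):

```python
def edit_record(record, rules):
    """Return the names of the predetermined rules the record fails."""
    return [name for name, check in rules if not check(record)]

# Illustrative rules: age must be plausible, and a person reported
# as employed should have a nonnegative income.
rules = [
    ("age_in_range", lambda r: 0 <= r["age"] <= 115),
    ("employed_income_consistent",
     lambda r: not r["employed"] or r["income"] >= 0),
]

failures = edit_record({"age": 142, "employed": True, "income": -5}, rules)
# Both checks fail: ["age_in_range", "employed_income_consistent"]
```

In practice the failed checks would then trigger the corrective actions the entry names: a predetermined fix, a lookup in another data source, or a recontact with the respondent.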
External users – see Users.

-F-

Fax imaging is properly called the Paperless Fax Imaging Retrieval System (PFIRS). This collection method mails or faxes a paper instrument to respondents. The respondents fax it back to the Census Bureau, where it is automatically turned into an image file.</p> <p>Feasibility studies (also called Exploratory studies) are common methods for specifying and evaluating survey content relative to concepts. In economic surveys, these studies often take the form of company or site visits.

Field follow-up is a data collection procedure involving personal visits by enumerators to housing units to perform operations such as resolving inconsistent and/or missing data items on returned questionnaires, conducting a vacant/delete check, obtaining information for blank or missing questionnaires, and visiting housing units for which no questionnaire was checked in.

A field test is a test of some of the procedures on a small scale that mirrors the planned full-scale implementation. See Dress rehearsal.

A focus group is a pretesting technique whereby respondents are interviewed in a group setting to guide the design of a questionnaire based on the respondents' reactions to the subject matter and the issues raised during the discussion.

A frame consists of one or more lists of the units comprising the universe from which respondents can be selected (e.g., a Census Bureau employee telephone directory). The frame may include elements not in the universe (e.g., retired employees). It may also miss elements that are in the universe (e.g., new employees).

The frame population is the set of elements that can be enumerated prior to the selection of a sample.

-G-

Geocoding is the conversion of spatial information into computer-readable form. As such, geocoding, both the process and the concepts involved, determines the type, scale, accuracy, and precision of digital maps.
A geographic entity is a spatial unit of any type, legal or statistical, such as a state, county, place, county subdivision, census tract, or census block.

A geographic entity code (geocode) is a code used to identify a specific geographic entity. For example, the geocodes needed to identify a census block for Census 2000 data are the state code, county code, census tract number, and block number. Every geographic entity recognized by the Census Bureau is assigned one or more geographic codes. "To geocode" means to assign an address, living quarters, establishment, etc., to one or more geographic codes that identify the geographic entity or entities in which it is located.

A generalized variance function is a mathematical model that describes the relationship between a statistic (such as a population total) and its corresponding variance. Generalized variance function models are used to approximate standard errors of a wide variety of characteristics of the target population.

Goodness-of-fit means how well a statistical model fits a set of observations. Measures of goodness of fit typically summarize the discrepancy between observed values and the values expected under a model. Such measures can be used in statistical hypothesis testing (e.g., to test for normality of residuals, to test whether two samples are drawn from identical distributions, or to test whether outcome frequencies follow a specified distribution).</p> <p>A graphical user interface (GUI) emphasizes the use of pictures for output and a pointing device such as a mouse for input and control, whereas a command line interface requires the user to type textual commands and input at a keyboard and produces a single stream of text as output.

-H-

Random variables are heteroscedastic if they have different variances. The complementary concept is called homoscedasticity.

Random variables are homoscedastic if they have the same variance.
This is also known as homogeneity of variance. The complement is called heteroscedasticity.

A housing unit is a house, an apartment, a mobile home or trailer, a group of rooms, or a single room occupied as separate living quarters or, if vacant, intended for occupancy as separate living quarters. The Census Bureau's estimates program prepares estimates of housing units for places, counties, states, and the nation.

Hypothesis testing draws a conclusion about the tenability of a stated value for a parameter. For example, sample data may be used to test whether an estimated value of a parameter (such as the difference between two population means) is sufficiently different from zero that the null hypothesis, designated H0 (no difference in the population means), can be rejected in favor of the alternative hypothesis, H1 (a difference between the two population means).

-I-

An implied comparison between two (or more) estimates is one that readers might infer, either because of the proximity of the two estimates in the text of the report or because the discussion presents the estimates in a manner that makes it likely readers will compare them. For an implied comparison to exist between two estimates:
 The estimates must be for similar subgroups that it makes sense to compare (e.g., two age subgroups, two race subgroups).
 The estimates must be of the same type (e.g., percentages, rates, levels).
 The subgroups must differ by only one characteristic (e.g., teenage males versus teenage females; adult males versus adult females; teenage males versus adult males). If they differ by more than one characteristic, an implied comparison does not exist (e.g., teenage males versus adult females).
 The estimates appear close enough to each other in the report that the reader would make a connection between them.
Two estimates in the same paragraph that satisfy the first three criteria will always constitute an implied comparison.
However, if the two estimates were in different sections of a report they would not constitute an implied comparison. Estimates presented in tables do not constitute implied comparisons. However, if a table displays the difference between two estimates, it is a direct comparison.</p> <p>Imputation is a procedure for entering a value for a specific data item where the response is missing or unusable.

Information products may be in print or electronic format and include news releases; Census Bureau publications; working papers (including technical papers or reports); professional papers (including journal articles, book chapters, conference papers, poster sessions, and written discussant comments); abstracts; research reports used to guide decisions about Census Bureau programs; presentations at public events (e.g., seminars or conferences); handouts for presentations; tabulations and custom tabulations; public-use data files; statistical graphs, figures, and maps; and the documentation disseminated with these information products.

Information quality is an encompassing term comprising utility, objectivity, and integrity.

Integration testing is the phase of software testing in which individual software modules are combined and tested as a group. The purpose of integration testing is to verify functional, performance, and reliability requirements placed on major design items. Integration testing can expose problems with the interfaces among program components before trouble occurs in real-world program execution.

Integrity refers to the security of information – protection of the information from unauthorized access or revision, to ensure that the information is not compromised through corruption or falsification.

Internal users – see Users.

Interviewer debriefing has traditionally been the primary method used to evaluate field or pilot tests of interviewer-administered surveys.
Interviewer debriefing consists of group discussions or structured questionnaires with the interviewers who conducted the test to obtain their views of questionnaire problems.

An item allocation rate is the proportion of the estimated (weighted) total (T) of item t that was imputed using statistical procedures, such as within-household or nearest neighbor matrices populated by donors, for that item.

Item nonresponse occurs when a respondent provides some, but not all, of the requested information, or if the reported information is not useable.

-J-

Joint partners refers to projects where both the Census Bureau and another agency are collecting the data together, but for their own use. It is a collaborative effort to reduce overall costs to the government and increase efficiency.</p> <p>-K-

Key from image (KFI) is an operation in which keyers enter questionnaire responses by referring to a scanned image of a questionnaire for which entries could not be recognized by optical character or optical mark recognition with sufficient confidence.

Key from paper (KFP) is an operation in which keyers enter information directly from a hard-copy questionnaire that could not be read by optical character or optical mark recognition with sufficient confidence.

Key variables are geography, demographic attributes, and main classification variables (e.g., economic attributes, industry, etc.) of units to be studied.

-L-

Latent class analysis is a method for estimating one or more components of the mean squared error of an estimator.

Linear regression is a method that models a parametric relationship between a dependent variable Y, explanatory variables Xi, i = 1, ..., p, and a random term ε. This method is called "linear" because the relation of the response (the dependent variable Y) to the independent variables is assumed to be a linear function of the parameters.

Linking – see Record linkage.
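The item allocation rate defined above is a weighted proportion. A minimal sketch (the function and argument names are illustrative, not from the standards):

```python
def item_allocation_rate(weighted_values, imputed_flags):
    """Share of the estimated (weighted) total of an item that was
    imputed: imputed weighted total / overall weighted total, as a percent."""
    total = sum(weighted_values)
    imputed = sum(v for v, flag in zip(weighted_values, imputed_flags) if flag)
    return 100.0 * imputed / total

# Two of four reporting units had the item imputed:
rate = item_allocation_rate([10.0, 20.0, 30.0, 40.0],
                            [False, True, False, True])
# (20 + 40) / 100 = 60.0 percent
```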
Load testing is the process of putting demand on a system or device and measuring its response. Load testing generally refers to the practice of modeling the expected usage of a software program by simulating multiple users accessing the program concurrently.

Logistic regression is a model used for prediction of the probability of occurrence of an event. It models the logit of the probability as a linear function of the parameters using explanatory variables Xi, i = 1, ..., p.

A longitudinal survey is a correlational research study that involves repeated observations of the same items over long periods of time, often many decades. Longitudinal studies are often used in psychology to study developmental trends across the life span. The reason for this is that, unlike cross-sectional studies, longitudinal studies track the same unit of observation, and therefore the differences observed in those people are less likely to be the result of cultural differences across generations.

-M-

Mail-out/mail-back is a method of data collection in which the U.S. Postal Service delivers addressed questionnaires to housing units. Residents are asked to complete and mail the questionnaires to a specified data capture center.</p> <p>The margin of error (MOE) is a measure of the precision of an estimate at a given level of confidence (e.g., 90%). The larger the margin of error, the less confidence one should have that the reported results are close to the "true" figures; that is, the figures for the whole population.
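The logistic-regression entry above models the logit of the probability as linear in the parameters. A self-contained sketch of the prediction step (the coefficient values are made up for illustration):

```python
import math

def predict_probability(intercept, coefs, xs):
    """Invert the logit: p = 1 / (1 + exp(-(b0 + sum(bi * xi)))),
    mapping the linear predictor to a probability in (0, 1)."""
    logit = intercept + sum(b * x for b, x in zip(coefs, xs))
    return 1.0 / (1.0 + math.exp(-logit))

# With every explanatory variable at zero, only the intercept acts;
# an intercept of 0 gives probability 0.5.
p = predict_probability(0.0, [0.8, -0.3], [0.0, 0.0])   # 0.5
```

Fitting the coefficients themselves would require maximum likelihood estimation, which is beyond this sketch.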
The Master Address File (MAF)/Topologically Integrated Geographic Encoding and Referencing (TIGER) is a topologically integrated geographic database in which the topological structures define the location, connection, and relative relationship of streets, rivers, railroads, and other features to each other, and to the numerous geographic entities for which the Census Bureau tabulates data for its censuses and sample surveys.

Matching – see Record linkage.

Measurement error is the difference between the true value of the measurement and the value obtained during the measurement process.

Metadata are data about data. Metadata are used to facilitate the understanding, use, and management of data. An item of metadata may describe an individual datum or content item, or a collection of data including multiple content items.

Methodological expert reviews are independent evaluations of an information product conducted by one or more technical experts. These experts may be within the Census Bureau or outside the Census Bureau, such as advisory committees. See also Peer reviews.

A microdata file includes the detailed information about people or establishments. Microdata come from interviews and administrative records.

A model is a formal (e.g., mathematical) description of a natural system. The formal system is governed by rules of inference; the natural system consists of some collection of observable and latent variables. It is presumed that the rules of inference governing the formal system mimic in some important respect the causal relations that govern the natural system (e.g., the formal laws of arithmetic apply to counting persons).

Model validation involves testing a model's predictive capabilities by comparing the model results to "known" sources of empirical data.

Monte Carlo simulation is a technique that converts uncertainties in input variables of a model into probability distributions.
By combining the distributions and randomly selecting values from them, it recalculates the simulated model many times and brings out the probability of the output.

In multi-stage sampling, a sample of clusters is selected and then a subsample of units is selected within each sample cluster. If the subsample of units is the last stage of sample selection, it is called a two-stage design. If the subsample is also a cluster from which units are again selected, it is called a three-stage design, etc.</p> <p>Multicollinearity is a statistical term for the existence of a high degree of linear correlation amongst two or more explanatory variables in a multiple regression model. In the presence of multicollinearity, it is difficult to assess the effect of the independent variables on the dependent variable.

Multivariate analysis is a generic term for many methods of analysis that are used to investigate relationships among two or more variables.

-N-

Noise infusion is a method of disclosure avoidance in which values for each establishment are perturbed prior to table creation by applying a random noise multiplier to the magnitude data (e.g., characteristics such as first-quarter payroll, annual payroll, and number of employees) for each company.

Nonresponse means the failure to obtain information from a sample unit for any reason (e.g., no one home or refusal). There are two types of nonresponse – see Unit nonresponse and Item nonresponse.

Nonresponse bias is the deviation of the expected value of an estimate from the population parameter due to differences between respondents and nonrespondents. The impact of nonresponse on a given estimate is affected by both the degree of nonresponse and the degree that the respondents' reported values differ from what the nonrespondents would have reported.
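A common way to express the nonresponse bias described above is as the nonresponse rate times the respondent/nonrespondent difference. This is a standard textbook approximation for an unadjusted respondent mean, not a formula from the standards:

```python
def nonresponse_bias(nonresponse_rate, mean_respondents, mean_nonrespondents):
    """Approximate bias of the unadjusted respondent mean:
    bias ~ nonresponse_rate * (ybar_respondents - ybar_nonrespondents)."""
    return nonresponse_rate * (mean_respondents - mean_nonrespondents)

# 20% nonresponse, with nonrespondents averaging 5 units lower:
bias = nonresponse_bias(0.20, 50.0, 45.0)   # about 1.0
```

The two factors mirror the entry's point: bias grows with both the degree of nonresponse and the respondent/nonrespondent difference, and vanishes when either is zero.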
Nonresponse error is the overall error observed in estimates caused by differences between respondents and nonrespondents. It consists of a variance component and nonresponse bias.

Nonresponse follow-up is an operation whose objective is to obtain completed questionnaires from housing units for which the Census Bureau did not have a completed questionnaire in mail areas (mailout/mailback, update/leave, and urban update/leave).

Nonresponse subsampling is a method for reducing nonresponse bias in which new attempts are made to obtain responses from a subsample of sampling units that did not provide responses to the first attempt.

Nonsampling errors are survey errors caused by factors other than sampling (e.g., nonsampling errors include errors in coverage, response errors, non-response errors, faulty questionnaires, interviewer recording errors, and processing errors).

The North American Industry Classification System (NAICS) is the standard used by Federal statistical agencies in classifying business establishments for the purpose of collecting, analyzing, and publishing statistical data related to the U.S. business economy. Canada, Mexico, and the U.S. jointly developed the NAICS to provide new comparability in statistics about business activity across North America. NAICS coding has replaced the U.S. Standard Industrial Classification (SIC) system (for more information, see www.census.gov/epcd/www/naics.html).</p> <p>-O-

Objectivity focuses on whether information is accurate, reliable, and unbiased, and is presented in an accurate, clear, complete, and unbiased manner.

Optical character recognition (OCR) is a technology that uses an optical scanner and computer software to "read" human handwriting and convert it into electronic form.
Optical mark recognition (OMR) is a technology that uses an optical scanner and computer software to recognize the presence of marks in predesignated areas and assign a value to the mark depending on its specific location and intensity on a page.

Outliers in a set of data are values that are so far removed from other values in the distribution that their presence cannot be attributed to the random combination of chance causes.

-P-

The p-value is the probability of obtaining the observed value of the test statistic, or a value that is more extreme in the direction of the alternative hypothesis, calculated when H0 is true.

Parameters are unknown, quantitative measures (e.g., total revenue, mean revenue, total yield, or number of unemployed people) for the entire population or for specified domains that are of interest. A parameter is a constant in the equation of a curve that can be varied to yield a family of similar curves, or a quantity (such as the mean, regression coefficient, or variance) that characterizes a statistical population and that can be estimated by calculations from sample data.

Participation means that the employee takes an active role in the event.

A peer review is an independent evaluation of an information product conducted by one or more technical experts.

Personally identifiable information refers to any information about an individual maintained by the Census Bureau which can be used to distinguish or trace an individual's identity, such as their name, social security number, date and place of birth, biometric records, etc., including any other personal information which is linked or linkable to an individual. Also see Business identifiable information.

Census Bureau information products must not contain policy views. The Census Bureau's status as a statistical agency requires us to absolutely refrain from taking partisan political positions.
Furthermore, there is an important distinction between producing data and using that data to advocate for program and policy changes. The Census Bureau's duty is to produce high quality, relevant data that the nation's policy makers can use to formulate public policy and programs. The Census Bureau should not, however, insert itself into a debate about the program or policy implications of the statistics it produces. We produce poverty statistics; we do not advocate for programs to alleviate poverty.</p> <p>Population estimates (post-censal or intercensal estimates) are prepared for demographic groups and geographic areas. These estimates usually are developed from separate measures of the components of population change (births, deaths, net international migration, and net domestic migration) in each year but may be supplemented with other methodologies in the absence of current measures of components.

Post-stratification is applied to survey data by stratifying sample units after data collection, using information collected in the survey and auxiliary information to adjust weights to population control totals or for nonresponse adjustment.

Precision of survey results refers to how closely the results from a sample can be obtained across repeated samples conducted using the same techniques from the same population at the same time. A precise estimate is stable over replications.

Pretesting is a broad term that incorporates many different techniques for identifying problems for both respondents and interviewers with regard to question content, order/context effects, skip instructions, and formatting.

Primary sampling units (PSU) are clusters of reporting units selected in the first stage of a multi-stage sample.

Probabilistic methods for survey sampling are any of a variety of methods for sampling that give a known, non-zero probability of selection to each member of the frame.
The advantage of probabilistic sampling methods is that sampling error can be calculated without reference to a model assumption. Such methods include random sampling, systematic sampling, and stratified sampling.

The probability of selection is the probability that a population (frame) unit will be drawn in a sample. In a simple random selection, this probability is the number of elements drawn in the sample divided by the number of elements on the sampling frame.

Probability sampling is an approach to sample selection that satisfies certain conditions: 1. We can define the set of samples that are possible to obtain with the sampling procedure. 2. A known probability of selection is associated with each possible sample. 3. The procedure gives every element in the population a nonzero probability of selection. 4. We select one sample by a random mechanism under which each possible sample receives exactly its probability of selection.

A project is a temporary endeavor undertaken to create a unique product, service, or result.

A projection is an estimate of a future value of a characteristic based on trends.

Protected information (as defined in Data Stewardship Policy DS007, Information Security Management Program) includes information about individuals, businesses, and sensitive statistical methods that are protected by law or regulation. The Census Bureau classifies the following as protected information:
 Individual census or survey responses.</p> <p> Microdata or paradata, containing original census or survey respondent data and/or administrative records data that do not meet the disclosure avoidance requirements.
 Address lists and frames, including the Master Address File (MAF).
 Pre-release Principal Economic Indicators and Demographic Time-Sensitive Data.
 Aggregate statistical information produced for internal use or research that does not meet the Disclosure Review Board disclosure avoidance requirements, or that has not been reviewed and approved for release.
 Internal use methodological documentation in support of statistical products, such as the primary selection algorithm, swapping rates, or Disclosure Review Board checklists.
 All personally identifiable information (PII) protected by an existing legal authority (such as Title 13, Title 15, Title 5, and Title 26).
 All business identifiable information (BII) protected by an existing legal authority.

A public event means that the event is open to the general public, including events that require a registration fee.

-Q-

A qualified user is a user with the experience and technical skills to meaningfully understand and analyze the data and results. For example, a qualified user of direct estimates produced from samples understands sampling, estimation, variance estimation, and hypothesis testing.

A quantity response rate is the proportion of the estimated (weighted) total (T) of data item t reported by tabulation units in the sample (expressed as a percentage). [Note: Because the value of economic data items can be negative (e.g., income), the absolute value must be used in the numerators and denominators in all calculations.]

A questionnaire is a set of questions designed to collect information from a respondent. A questionnaire may be interviewer-administered or respondent-completed, using paper-and-pencil methods for data collection or computer-assisted modes of completion.

-R-

Raking is a method of adjusting sample estimates to known marginal totals from an independent source. For a two-dimensional case, the procedure uses the sample weights to proportionally adjust the weights so that the sample estimates agree with one set of marginal totals.
Next, these adjusted weights are proportionally adjusted so that the sample estimates agree with the second set of marginal totals. This two-step adjustment process is repeated enough times until the sample estimates converge simultaneously to both sets of marginal totals.

In random rounding, cell values are rounded, but instead of using standard rounding conventions a random decision is made as to whether they will be rounded up or down.</p> <p>Ratio estimation is a method of estimating from sample data. In ratio estimation, an auxiliary variate x_i, correlated with y_i, is obtained for each unit in the sample. The population total X of the x_i must be known. The goal is to obtain increased precision by taking advantage of the correlation between y_i and x_i. The ratio estimate of Y, the population total of the y_i, is Ŷ_R = (y/x)X, where y and x are the sample totals of the y_i and x_i, respectively.

Readily accessible means that users can access the documentation when they need it, not that it is only available on request.

Recoding is a disclosure limitation technique that involves collapsing/regrouping detail categories of a variable so that the resulting categories are safe.

Record linkage is the process of linking or matching two or more records that are determined to refer to the same person or establishment.

Regression is a statistical method which tries to predict the value of a characteristic by studying its relationship with one or more other characteristics.

A regression model is a statistical model used to depict the relationship of a dependent variable to one or more independent variables.

Reimbursable projects are those for which the Census Bureau receives payment (in part or in total) from a customer for products or services rendered.
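The two-dimensional raking procedure described above (adjust to one set of margins, then the other, and repeat until convergence) can be sketched as iterative proportional fitting. This is an unweighted toy version with invented numbers, not a production weighting routine:

```python
def rake(table, row_targets, col_targets, iterations=50):
    """Two-dimensional raking: alternately scale rows and columns of a
    table of weighted estimates until both sets of marginal totals agree."""
    t = [row[:] for row in table]
    for _ in range(iterations):
        for i, target in enumerate(row_targets):        # first margin
            s = sum(t[i])
            t[i] = [v * target / s for v in t[i]]
        for j, target in enumerate(col_targets):        # second margin
            s = sum(row[j] for row in t)
            for row in t:
                row[j] *= target / s
    return t

# Sample estimates that disagree with the known margins:
adjusted = rake([[30.0, 20.0], [10.0, 40.0]],
                row_targets=[60.0, 40.0], col_targets=[45.0, 55.0])
# Row and column sums now match the targets to high precision.
```

Note that the procedure requires the two sets of targets to share the same grand total; otherwise the alternating adjustments cannot converge simultaneously.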
Reinterview is repeated measurement of the same unit, intended to estimate measurement error (response error reinterview) or designed to detect and deter falsification (quality control reinterview).

A release phase refers to the point in the statistical process where you release the data. It may be to the public, the sponsor, or any other user for whom the data was created.

Releases of information products are the delivery or the dissemination of information products to government agencies, organizations, sponsors, or individuals outside the Census Bureau, including releases to the public.

Replication methods are variance estimation methods that take repeated subsamples, or replicates, from the data, re-compute the weighted estimate for each replicate, and then compute the variance based on the deviations of these replicate estimates from the full-sample estimate. The subsamples are generated to properly reflect the variability due to the sample design.

Reproducibility means that the information is capable of being substantially reproduced, subject to an acceptable degree of imprecision. For information judged to have more (less) important impacts, the degree of imprecision that is tolerated is reduced (increased). If the Census Bureau applies the reproducibility test to specific types of original or supporting data, the associated guidelines shall provide relevant definitions of reproducibility (e.g., standards for replication of laboratory data). With respect to analytic results, "capable of being substantially reproduced" means that independent analysis of the original or supporting data using identical methods would generate similar analytic results, subject to an acceptable degree of imprecision or error.</p> <p>A residual is the observed value minus the predicted value.
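The replication idea above can be illustrated with the delete-one jackknife, one of the simplest replication variance estimators. This is an unweighted textbook sketch, not the Bureau's production replication method:

```python
def jackknife_variance(data, statistic):
    """Re-compute the statistic on each leave-one-out replicate and
    combine squared deviations of the replicates from their mean."""
    n = len(data)
    replicates = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    mean_rep = sum(replicates) / n
    return (n - 1) / n * sum((r - mean_rep) ** 2 for r in replicates)

def mean(xs):
    return sum(xs) / len(xs)

v = jackknife_variance([1.0, 2.0, 3.0, 4.0], mean)
# For the sample mean this reproduces s^2 / n, about 0.4167.
```

Production replicate methods differ mainly in how the replicates are formed, so that, as the entry says, they properly reflect the variability due to the sample design.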
Respondent burden is the estimated total time and financial resources expended by the respondent to generate, maintain, retain, and provide census or survey information.

Respondent debriefing is a pretesting technique that involves using a structured questionnaire following data collection to elicit information about respondents' interpretations of survey questions.

A response analysis survey is a technique for evaluating questionnaires from the perspective of the respondent. It is typically a respondent debriefing conducted after a respondent has completed the main survey.

Response error is the difference between the true answer to a question and the respondent's answer. It may be caused by the respondent, the interviewer, the questionnaire, the survey procedure, or the interaction between the respondent and the interviewer.

A response rate measures the proportion of the selected sample that is represented by the responding units.

Revisions history is a stability diagnostic to compare regARIMA modeling and seasonal adjustment results over lengthening time spans. History analysis begins with a shortened series. Series values are added, one at a time, and the regARIMA model and seasonal adjustment are reestimated. Comparing different sets of adjustment options for the same series may indicate that one set of options is more stable. Among adjustment options whose other diagnostics indicate acceptable quality, options that result in fewer large revisions, that is, fewer large changes as data are added, usually are preferred.

-S-

The sample design describes the target population, frame, sample size, and the sample selection methods.

The sample size is the number of population units or elements selected for the sample, determined in relation to the required precision and available budget for observing the selected units.

A sample survey is a data collection that obtains data from a sample of the population.
sample survey is a data collection th at obtains data from a sampled population is the collection of all possible ob servation units (objects on which The measurements are taken) that might have been chosen in the sample. For example, in a presidential poll taken to determine who people w ill vote for, the target population might be all registered voters who persons who are registered to vote. The sample d population might be all can be reached by telephone. 168</p> <p><span class="badge badge-info text-white mr-2">179</span> Sampling ng a segment of a population to observe and facilitate the is the process of selecti ing of interest about the populat estimation and analysis of someth ion. The set of sampling units selected is referred to as the sample. If all the units are selected, the sample is referred to as a census. is the uncertainty associated with an estimate that is based on data gathered Sampling error from a sample of the population rather than the full population. sampling frame is any list or device that, for purposes of sampling, de-limits, identifies, and A allows access to the sampling units, which contai n elements of the frame population. The frame may be a listing of persons, housing units, businesse s, records, land segments, etc. One sampling frame or a combination of frames may be tire frame population. used to cover the en Sampling units are the basic components of a sampling frame. The sampling unit may contain, for example, defined areas, houses, people, or businesses. Sampling weight is a weight assigned to a given samp ling unit that equals the inverse of the unit's probability of being include d in the sample and is determined by the sample design. This weight may include a fact or due to subsampling. Sanitized data , used for testing, may be totally fictitious or based on real data that have been altered to eliminate the ability to identify the information of any entity represented by the data. 
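The sampling weight defined above (the inverse of the unit's inclusion probability) implies a simple weighted-total estimator. A minimal sketch with made-up values and probabilities:

```python
def weighted_total(sample_values, inclusion_probs):
    """Estimate a population total by weighting each sampled unit's
    value by the inverse of its probability of selection (its
    sampling weight)."""
    return sum(y / p for y, p in zip(sample_values, inclusion_probs))

# Hypothetical sample: three units drawn with known inclusion probabilities.
values = [10.0, 20.0, 30.0]
probs = [0.5, 0.25, 0.25]
print(weighted_total(values, probs))  # 10/0.5 + 20/0.25 + 30/0.25 = 220.0
```

A unit selected with probability 0.25 carries a weight of 4, i.e., it stands in for an estimated four population units.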
Scheffé's method is a method for adjusting significance levels in a linear regression analysis to account for multiple comparisons. It is particularly useful in analysis of variance, and in constructing simultaneous confidence bands for regressions involving basis functions. Scheffé's method is a single-step multiple comparison procedure which applies to the set of estimates of all possible contrasts among the factor level means, not just the pairwise differences considered by the Tukey method.

A scoring weight is the amount of value assigned when a pair of records agree or disagree on the same matching variable. Each matching variable is assigned two scoring weights --- a positive weight for agreement and a negative weight for disagreement. After comparing all matching variables on a matching-variable-by-matching-variable basis, the resulting set of assigned weights are added to get a total score for the record pair. Pairs of records with scores above a predetermined cut-off are classified as a match; pairs of records with scores below a second predetermined cut-off are classified as a non-match.

Seasonal adjustment is a statistical technique that consists of estimating seasonal factors and applying them to a time series to remove the seasonal variations in the estimates.

Sensitivity analysis is designed to determine how the variation in the output of a model (numerical or otherwise) can be apportioned, qualitatively or quantitatively, to changes in input parameter values and assumptions. This type of analysis is useful in ascertaining the capability of a given model, as well as its robustness and reliability.
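The scoring-weight procedure described above can be sketched as follows. The matching variables, agreement/disagreement weights, and cut-off below are purely illustrative, not values used by any actual linkage system:

```python
def record_score(rec_a, rec_b, weights):
    """Total matching score for a pair of records: for each matching
    variable, add the positive weight on agreement and the negative
    weight on disagreement."""
    score = 0.0
    for field, (agree_w, disagree_w) in weights.items():
        score += agree_w if rec_a.get(field) == rec_b.get(field) else disagree_w
    return score

# Hypothetical matching variables and scoring weights (agree, disagree).
weights = {"last_name": (4.0, -3.0), "zip": (2.0, -2.0), "birth_year": (3.0, -4.0)}
a = {"last_name": "SMITH", "zip": "20233", "birth_year": 1980}
b = {"last_name": "SMITH", "zip": "20233", "birth_year": 1981}

score = record_score(a, b, weights)      # 4.0 + 2.0 - 4.0 = 2.0
MATCH_CUTOFF = 1.0                       # hypothetical upper cut-off
print(score, "match" if score > MATCH_CUTOFF else "possible match or non-match")
```

In practice a second, lower cut-off separates non-matches from pairs sent to clerical review.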
Sequential sampling is a sampling method in which samples are taken one at a time or in successive predetermined groups, until the cumulative result of their measurements (as assessed against predetermined limits) permits a decision to accept or reject the population or to continue sampling. The number of observations required is not determined in advance, but the decision to terminate the operation depends, at each stage, on the results of the previous observations. The plan may have a practical, automatic termination after a certain number of units have been examined.

Significance level refers to the probability of rejecting a true null hypothesis.

Simple random sampling (SRS) is a basic probability selection scheme that uses equal probability sampling with no strata.

A skip pattern in a data collection instrument is the process of skipping over non-applicable questions depending upon the answer to a prior question.

Sliding spans diagnostics are seasonal adjustment stability diagnostics for detecting adjustments that are too unstable. X-12-ARIMA creates up to four overlapping subspans of the time series, seasonally adjusts each span, then compares the adjustments of months (quarters with quarterly data) common to two or more spans. Months are flagged whose adjustments differ by more than a certain cutoff. (The default cutoff is 3% for most comparisons.) If too many months are flagged, the seasonal adjustment is rejected for being too unstable. The series should not be adjusted unless other software options are found that lead to an adjustment with an acceptable number of flagged months. Sliding spans diagnostics can include comparisons of seasonally adjusted values, seasonal factors, trading day factors, month-to-month changes, and year-to-year changes. (Year-to-year change results are not used to accept or reject an adjustment.)
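Simple random sampling, as defined above, can be illustrated with Python's standard library. The frame and sample size are hypothetical:

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Equal-probability sample of n units, without replacement
    and with no strata."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

frame = list(range(1, 101))   # a toy sampling frame of 100 units
sample = simple_random_sample(frame, 10, seed=42)
print(sample)
```

Every unit in the frame has the same inclusion probability (here 10/100), so every sampled unit would carry the same sampling weight of 10.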
Small area estimation is a statistical technique involving the estimation of parameters for small sub-populations where the sample has insufficient or no observations for the sub-populations to be able to make accurate estimates for them. The term "small area" may refer strictly to a small geographical area such as a county, but may also refer to a "small domain," i.e., a particular demographic within an area. Small area estimation methods use models and additional data sources (such as census data) that exist for these small areas in order to improve estimates for them.

Special sworn status (SSS) is conferred upon individuals for whom the Census Bureau approves access to confidential Census Bureau data in furtherance of a Title 13 purpose. SSS individuals are subject to the same legal penalties for violation of confidentiality as employees.

Spectral graphs are diagnostic graphs that indicate the presence of seasonal or trading day effects. Visually significant peaks at the marked seasonal and/or trading day frequencies usually indicate the presence of these effects, in some cases as residual effects after an adjustment that is not fully successful for the span of data from which the spectrum is calculated. Spectral graphs are available for the prior-adjusted series (or original series if specified), regARIMA model residuals, seasonally adjusted series, and modified irregular.

Split panel tests refer to controlled experimental testing of questionnaire variants or data collection modes to determine which one is "better" or to measure differences between them.

Stakeholders include Congress, federal agencies, sponsors, state and local government officials, advisory committees, trade associations, or organizations that fund data programs, use the data, or are affected by the results of the data programs.
The standard deviation is the square root of the variance and measures the spread or dispersion around the mean of a data set.

The standard error is a measure of the variability of an estimate due to sampling.

The Standard Occupational Classification System (SOC) is used to classify workers into occupational categories for the purpose of collecting, calculating, or disseminating data (for more information, see www.bls.gov/soc/).

Statistical attribute matching consists of comparing two records, determining if they refer to "similar" entities (but not necessarily the same entity), and augmenting data from one record to the other.

Statistical inference is inference about a population from a random or representative sample drawn from it. It includes point estimation, interval estimation, and statistical significance testing.

A statistical model consists of a series of assumptions about a data generating process that explicitly involve probability distributions and functions on those distributions, in order to construct an estimate or a projection of one or more phenomena.

Statistical purposes refer to the description, estimation, or analysis of the characteristics of groups without identifying the individuals or organizations that compose such groups.

Statistical significance is attained when a statistical procedure applied to a set of observations yields a p-value that falls below the level of probability at which it is agreed that the null hypothesis will be rejected.

Strata are created by partitioning the frame and are generally defined to include relatively homogeneous units within strata.

Stratification involves dividing the sampling frames into subsets (called strata) prior to the selection of a sample, for statistical efficiency, for production of estimates by stratum, or for operational convenience. Stratification is done such that each stratum contains units that are relatively homogeneous with respect to variables that are believed to be highly correlated with the information requested in the survey.

Stratified sampling is a sampling procedure in which the population is divided into homogeneous subgroups or strata and the selection of samples is done independently in each stratum.

Sufficient data is determined for a survey by whether the respondent completes enough items for the case to be considered a completed response.

Supplemental reinterview allows the regional offices to select any field representative (FR) with an original interview assignment for reinterview. All assigned cases that are not selected for reinterview are available as inactive supplemental reinterview cases. The regional office may place a field representative in supplemental reinterview for various reasons: the FR was not selected for reinterview; the FR was hired during the assignment period; or the regional office needs to reinterview additional cases to investigate the FR for suspected falsification.

Swapping is a disclosure limitation technique that involves selecting a sample of records, finding a match in the database on a set of predetermined variables, and swapping all other variables.

Synthetic data are microdata records created to improve data utility while preventing disclosure of confidential respondent information. Synthetic data is created by statistically modeling original data and then using those models to generate new data values that reproduce the original data's statistical properties. Users are unable to identify the information of the entities that provided the original data.

Systematic sampling is a method of sample selection in which the sampling frame is listed in some order and every kth element is selected for the sample, beginning from a random start between 1 and k.
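Systematic sampling, as defined above, can be sketched as follows; the frame and sampling interval k are illustrative:

```python
import random

def systematic_sample(frame, k, seed=None):
    """Select every k-th element from an ordered frame,
    beginning from a random start between 1 and k."""
    rng = random.Random(seed)
    start = rng.randint(1, k)      # random start in 1..k
    return frame[start - 1::k]     # that unit, then every k-th thereafter

frame = list(range(1, 21))         # an ordered frame of 20 units
sample = systematic_sample(frame, 5, seed=7)
print(sample)                      # 4 units, spaced exactly 5 apart
```

With a frame of 20 units and k = 5, the sample always contains 4 units, and each unit's inclusion probability is 1/k.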
A systems test is used to test the data collection instrument along with the data management systems.

-T-

The target population is the complete collection of observations under study. For example, in a presidential poll taken to determine who people will vote for, the target population might be all persons who are registered to vote. The sampled population might be all registered voters who can be reached by telephone.

A Taylor series is a representation of a function as an infinite sum of polynomial terms calculated from the values of its derivatives at a single point.

The Taylor series method for variance estimation is used to estimate variances for non-linear estimators such as ratio estimators. If the sample size is large enough so that the estimator can be closely approximated by the first order (linear) terms in the Taylor series, then the variances can be approximated by using variance methods appropriate for linear statistics. The Taylor series approximation to the ratio estimator R̂ = y/x, expanded around the population totals Y and X, is:

R̂ ≈ Y/X + (y − Y)/X − Y(x − X)/X²

This approximation is linear in the survey sample totals x and y.

Testing is a process used to ensure that methods, systems, or other components function as intended.

A time series is a sequence of data values obtained over a period of time, usually at uniform intervals.

Timeliness of information reflects the length of time between the information's availability and the event or phenomenon it describes.

Top-coding is a disclosure limitation technique that involves limiting the maximum value of a variable allowed on the file to prevent disclosure of individuals or other units with extreme values in a distribution.

Topologically Integrated Geographic Encoding and Referencing (TIGER) – see definition for Master Address File (MAF)/Topologically Integrated Geographic Encoding and Referencing (TIGER).
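The linearization of the ratio estimator described above can be checked numerically. The population and sample totals below are made up for illustration; the point is that when the sample totals are near the population totals, the linear approximation is close to the exact ratio:

```python
def ratio_linearized(y, x, Y, X):
    """First-order Taylor (linearization) approximation to the ratio
    estimator y/x, expanded around the population totals (Y, X):
        R_hat ~= Y/X + (y - Y)/X - Y*(x - X)/X**2
    """
    return Y / X + (y - Y) / X - Y * (x - X) / X**2

Y, X = 500.0, 1000.0      # hypothetical population totals
y, x = 520.0, 1010.0      # hypothetical estimated sample totals

print(y / x)                          # exact ratio estimate, ~0.51485
print(ratio_linearized(y, x, Y, X))   # linear approximation, 0.515
```

Because the approximation is linear in x and y, its variance can be computed with the standard formulas for linear statistics, which is the point of the Taylor series method.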
A total quantity response rate is the proportion of the estimated (weighted) total (T) of a data item reported by tabulation units in the sample or from sources determined to be equivalent-quality-to-reported data (expressed as a percentage).

Touch-tone data entry (TDE) is a data collection method that uses an electronic instrument to collect and capture data by telephone.

Transparency refers to providing documentation about the assumptions, methods, and limitations of an information product to allow qualified third parties to reproduce the information, unless prevented by confidentiality or other legal constraints.

Truth decks are used to test imputation methods by comparing the imputed values to the original values for the items flagged as missing. The truth deck originates as a file of true responses. Certain responses are then blanked in a manner that reflects the probable nonresponse in the sample. The truth deck is then run through the imputation process in order to evaluate the accuracy of the imputed values.

Tukey's method is a single-step multiple comparison procedure and statistical test generally used in conjunction with an ANOVA to find which means are significantly different from one another. Named after John Tukey, it compares all possible pairs of means, and is based on a studentized range distribution q (this distribution is similar to the distribution of t from the t-test).

-U-

Unduplication involves the process of deleting units that are erroneously in the frame more than once to correct for overcoverage.

Unit nonresponse occurs when a sampled unit fails to respond, or when a sampled unit's response does not meet a minimum threshold and is classified as not having responded at all.

Usability testing in surveys is the process whereby a group of representative users are asked to interact and perform tasks with survey materials (e.g., computer-assisted forms) to determine if the intended users can carry out planned tasks efficiently, effectively, and satisfactorily.

A user interface is the aspects of a computer system or program that can be seen (or heard or otherwise perceived) by the human user, and the commands and mechanisms the user uses to control its operation and input data.

Users are organizations, agencies, the public, or any others expected to use the information products. Census Bureau employees, contractors, and other Special Sworn Status individuals affiliated with the Census Bureau are internal users. Users outside of the Census Bureau, including Congress, federal agencies, sponsors, other Special Sworn Status individuals, and the public, are external users.

Utility refers to the usefulness of the information for its intended users.

-V-

Variance is a measurement of the error associated with nonobservation, that is, the error that occurs because all members of the frame population are not measured. The measurement is the average of the squared differences between data points and the mean.

Version Control is the establishment and maintenance of baselines and the identification of changes to baselines that make it possible to return to the previous baseline. A baseline, in the context of documentation, is a document that has been formally reviewed and agreed on.

-W-

Weights are values associated with each sample unit that are intended to account for probabilities of selection for each unit and other errors such as nonresponse and frame undercoverage, so that estimates using the weights represent the entire population.
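Top-coding, defined earlier in this section, is the simplest of the disclosure limitation techniques to sketch; the cap and the income values below are hypothetical:

```python
def top_code(values, cap):
    """Disclosure limitation by top-coding: any value above the cap
    is replaced by the cap itself, masking extreme values that could
    identify a unit."""
    return [min(v, cap) for v in values]

incomes = [30_000, 45_000, 62_000, 1_250_000]
print(top_code(incomes, 250_000))  # [30000, 45000, 62000, 250000]
```

The distribution below the cap is unchanged; only the extreme tail, where re-identification risk is highest, is masked.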
A weight can be viewed as an estimate of the number of units in the population that the sampled unit represents.

Working papers are information products that are prepared by Census Bureau employees (or contractors), but the Census Bureau does not necessarily affirm their content. They include technical papers or reports, division reports, research reports, and similar documents that discuss analyses of subject matter topics or methodological, statistical, technical, or operational issues. The Census Bureau releases working papers to the public, generally on the Census Bureau's Web site. Working papers must include a disclaimer, unless the Associate Director responsible for the program determines that a disclaimer is not appropriate.