Attachment C: SACHRP Recommendations on Benchmarking

Approved by SACHRP October 26, 2016

In the SACHRP recommendation on Big Data, there is a brief discussion of the status of benchmarking, and whether benchmarking should be classified as human subject research. OHRP has asked that SACHRP provide further consideration of this issue, as the SACHRP recommendation did not provide as much detail as OHRP would like. The focus of this recommendation is to provide assistance in creating regulatory interpretation for determining when benchmarking is research subject to HHS regulations. SACHRP notes that benchmarking activities are generally low risk and valuable, and as such SACHRP supports the creation of efficient and consistent guidance for determining the status of benchmarking activities. SACHRP believes that most benchmarking activities do not meet the definition of research under the HHS regulations, but there are exceptions as described below.

A. Existing Recommendation

The content of the existing SACHRP recommendation is:

SACHRP statement on benchmarking (from Big Data and Human Subjects Research recommendation): http://www.hhs.gov/ohrp/sachrp-committee/recommendations/2015-april-24-attachment-a/index.html

Institutions that conduct quality assurance, benchmarking and similar studies using real world big data should consider mechanisms to minimize risk in those studies, even when they do not represent human subjects research. In guidance, HHS might suggest ways in which institutions could undertake such a process, as part of an overall program of considering how studies – both human subjects research and non-research – might pose privacy or other risks to patients and clients, and how those risks might be reduced.

Real World, Big Data Studies as Quality Improvement; “Benchmarking” as a Research Activity

Identified data may be used in real world, big data research without consent if the “research” can legitimately be classified as conducted for purposes of quality assurance, quality improvement or management and administration oversight. In these cases, however, research and the other activities may not be mutually exclusive: an activity could be both at the same time, with elements of both quality improvement and research, or a management analysis project or quality assurance program could evolve into research, depending on what is found, how the project unfolds and how and whether the intent behind it broadens or changes. OHRP has recently posted two letters that describe in some detail OHRP’s application of human subjects research regulations in the context of “big data” quality improvement studies, registry studies, and studies using de-identified data collected as part of standard of care.[16] This OHRP correspondence reiterates that institutions and facilities whose sole involvement is providing data – even identified data – for studies are not “engaged in research,” and that central or single IRBs may be useful for studies in which multiple sites contribute data. In general, SACHRP views these letters as expressing reasonable, appropriate and useful application of the Common Rule to this group of “big data” studies.

“Benchmarking” is sometimes cited as an activity that straddles the line between human subjects research and QA or management analysis, but in reality, “benchmarking” may refer to a number of distinct activities, each of which may have its own risk of crossing over into research. Benchmarking may, for example, denote such distinct activities as performance benchmarking, process benchmarking, and “best practices” benchmarking – meaning, respectively, across organizations and/or within one organization: collecting and analyzing data about performance and outcomes measures; collecting and analyzing various processes in place for production of goods and services and their various levels of success; and collecting and comparing (and analyzing for possible adoption) the practices and policies of the overall best-performing organizations within a specific economic, academic or service activity. In many cases, benchmarking requires the aggregation and systematic analysis of massive data sets, as for example, in an effort to understand and compare health outcomes as they may be influenced by various clinical and laboratory procedures.

B. Further Thoughts on Benchmarking

Definition of Benchmarking from Merriam-Webster:

1 - usually bench mark : a mark on a permanent object indicating elevation and serving as a reference in topographic surveys and tidal observations.

2a - a point of reference from which measurements may be made.

2b - something that serves as a standard by which others may be measured or judged.

2c - a standardized problem or test that serves as a basis for evaluation or comparison (as of computer system performance).

Definition of Benchmarking from BusinessDictionary.com:

A measurement of the quality of an organization's policies, products, programs, strategies, etc., and their comparison with standard measurements, or similar measurements of its peers.

The objectives of benchmarking are (1) to determine what and where improvements are called for, (2) to analyze how other organizations achieve their high performance levels, and (3) to use this information to improve performance.

The Two Components of Benchmarking

There are two essential parts to benchmarking: the first is gathering data to establish a benchmark, and the second is measuring performance against an established benchmark. These will be referred to, respectively, as the “gathering component” and the “measuring component” throughout this document. This recommendation provides analysis of both components.

Examples of benchmarking

Benchmarking activities involve comparisons based on measures of central tendency and data spread (mean, median, mode, standard deviation, variance, etc.) for some outcome. For each of these examples, there will be data collection for the gathering component and comparison in the measuring component. Examples include:

Comparing hospital lengths of stay for a specific diagnosis or procedure (shorter is presumably better, all other things being equal and assuming patients survive or are not transferred)

Comparing numbers of Warning Letters received by accredited Human Research Protection Programs versus non-accredited organizations.

Surveying/analyzing salary data from local, state and national databases to ensure employees are appropriately compensated.

Comparing rates of infection in ICUs.

Surveying patient satisfaction in small hospitals and measuring against other hospitals.

Comparing turnover of ICU nursing staff.
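
As a minimal sketch of how the two components relate arithmetically, the following Python fragment summarizes peer data into a benchmark (the gathering component) and then compares one institution's value to it (the measuring component). The function names, the choice of summary statistics, and all figures are hypothetical and invented purely for illustration; they are not drawn from any SACHRP or OHRP source.

```python
from statistics import mean, median, stdev

# Hypothetical lengths of stay (days) reported by peer hospitals for one
# procedure. These values are invented for illustration only.
peer_lengths_of_stay = [4.0, 5.0, 5.0, 6.0, 7.0, 9.0]


def establish_benchmark(values):
    """Gathering component: summarize peer data into a benchmark."""
    return {
        "mean": mean(values),
        "median": median(values),
        "stdev": stdev(values),  # sample standard deviation
    }


def measure_against(benchmark, own_value):
    """Measuring component: compare one institution's value to the benchmark."""
    difference = own_value - benchmark["mean"]
    return {
        "difference_from_mean": difference,
        "z_score": difference / benchmark["stdev"],
    }


benchmark = establish_benchmark(peer_lengths_of_stay)
result = measure_against(benchmark, own_value=7.5)
print(benchmark)
print(result)
```

The arithmetic is the same whether or not the activity is research; as the discussion that follows makes clear, the regulatory status turns on the design and purpose of the activity rather than on the calculation itself.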

Discussion

As with all projects that are potentially research involving human subjects, it is appropriate to perform a triage of whether the project is:

not research,

research that does not involve human subjects,

research in which a given institution is not engaged,

exempt research,

research that requires expedited IRB review, or

research that requires convened IRB review.
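
The triage sequence listed above can be sketched as an ordered series of checks. This is an illustrative simplification only: the class, field, and function names are hypothetical, each yes/no field stands in for a judgment that in practice requires case-by-case regulatory analysis, and the sketch is not an official determination tool.

```python
from dataclasses import dataclass
from enum import Enum


class Determination(Enum):
    NOT_RESEARCH = "not research"
    NO_HUMAN_SUBJECTS = "research that does not involve human subjects"
    NOT_ENGAGED = "research in which the institution is not engaged"
    EXEMPT = "exempt research"
    EXPEDITED = "research that requires expedited IRB review"
    CONVENED = "research that requires convened IRB review"


@dataclass
class Activity:
    # Each flag is a hypothetical stand-in for a case-by-case judgment.
    systematic_investigation: bool = False
    designed_for_generalizable_knowledge: bool = False
    intervention_or_interaction: bool = False
    identifiable_private_information: bool = False
    institution_only_releases_data: bool = False
    qualifies_for_exemption: bool = False
    eligible_for_expedited_review: bool = False


def triage(activity: Activity) -> Determination:
    """Apply the triage questions in the order listed in the text."""
    if not (activity.systematic_investigation
            and activity.designed_for_generalizable_knowledge):
        return Determination.NOT_RESEARCH
    if not (activity.intervention_or_interaction
            or activity.identifiable_private_information):
        return Determination.NO_HUMAN_SUBJECTS
    if activity.institution_only_releases_data:
        return Determination.NOT_ENGAGED
    if activity.qualifies_for_exemption:
        return Determination.EXEMPT
    if activity.eligible_for_expedited_review:
        return Determination.EXPEDITED
    return Determination.CONVENED


# Example: a systematic comparison with no intent to produce generalizable knowledge.
operational = Activity(systematic_investigation=True)
print(triage(operational).value)
```

The ordering matters: each later question is reached only if the earlier ones have not already resolved the status of the activity.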

As noted above, there are two components to benchmarking: the first is gathering data to establish a benchmark, and the second is measuring performance against an established benchmark. Neither component automatically meets the definition of research in the Common Rule, which is “a systematic investigation, including research development, testing and evaluation, designed to develop or contribute to generalizable knowledge.” In fact, SACHRP believes that most benchmarking activities do not meet the definition of research, because their purpose and their design are focused on understanding, analyzing and improving operational efficiency and effectiveness in the production of goods and services.

SACHRP would like to clarify certain threshold positions regarding assessment of the definition of research:

1. SACHRP agrees with existing OHRP guidance in the QI FAQs that publication by itself is neither a necessary nor sufficient criterion to determine whether a project is research.
2. SACHRP believes that the fact that the project is conducted in more than one institution is, by itself, neither a necessary nor sufficient criterion to determine whether a project is research, although a study that is conducted entirely within one organization or a set of organizations under common control tends to indicate that benchmarking is done for business operations purposes, and not to derive generalizable knowledge.
3. SACHRP believes that if any part of a benchmarking project is intended to develop or contribute to generalizable knowledge, even if it is a secondary goal, then that part of the project is research and needs to be triaged and treated accordingly. SACHRP does not support a primary purpose approach to defining research, whereby an activity is not research if the primary purpose does not meet the definition of research, even if a secondary purpose does meet the definition of research.
4. As a follow-on to the third position, this does not mean that if one component of the benchmarking activity (either the gathering component or the measuring component) is research, the other component is also research. One component can be designed to contribute to generalizable knowledge while the other is not, or the purpose can change as the activity progresses.
5. It is important to consider the design and purpose of the benchmarking activity to determine if it meets the definition of research. Identical or similar activities can qualify as benchmarking, Quality Improvement (QI), research, or a clinical investigation as defined by FDA depending on their design and purpose. For example, one may obtain data on ICU infection rates and submit those data to a central repository for benchmarking assessment. If the purpose is for a hospital to obtain a “top tier” ranking in its ICU infection rates compared to similar hospitals, this activity would be non-research benchmarking. Similarly, if the purpose is to fulfill requirements for measurement as required by the Centers for Medicare & Medicaid Services (CMS), this also is non-research benchmarking. If the purpose is to implement a procedure at a hospital to improve ICU infection rates and measure the outcome, and to compare that outcome to similar hospitals, that is a combined benchmarking and QI process. If, however, the purpose is to gather data and analyze them to identify reasons for the different ICU infection rates at similar hospitals, this is research. If the purpose is to provide data on the safety and efficacy of rubber gloves and facemasks, this could be a clinical investigation under FDA regulations. In other words, how the data are used affects the regulatory classification of the activity.
6. It may be helpful to consider the role of the individual(s) conducting the benchmarking activity. Whether the individual is an employee of a hospital quality improvement department, or of a clinical device manufacturer, or of a research department of a university could be considered in determining whether the benchmarking activity meets the definition of research, as the role of the person designing or leading the activity suggests the purpose of the activity.
7. OHRP should consider development of a multifactorial set of criteria for determining when an activity is research based on the role of the individuals conducting the activity and the design and purpose of the activity.

Analysis of the Gathering Component; Collection of Data to Establish a Benchmark

Definition of Research

The first triage assessment is whether the project meets the definition of research in the Common Rule, “a systematic investigation, including research development, testing and evaluation, designed to develop or contribute to generalizable knowledge.”

The first element to assess in the definition of research is whether the activity is a systematic investigation. Benchmarking is usually a systematic investigation, in that it is done according to a plan. Therefore, the gathering component is likely to satisfy that part of the definition.

The second element to assess in the definition of research is whether the activity is designed to develop or contribute to generalizable knowledge. One approach to this assessment is to focus on the design and purpose: If the design and purpose of a benchmarking project is to collect data to inform other purposes besides developing or contributing to generalizable knowledge, then it is not research. To be clear, benchmarking activities can be designed to develop or contribute to generalizable knowledge, or have embedded components to that end, and as such constitute research under the Common Rule. In the end, this approach creates a focus on design and purpose, whereby the act of collecting information may or may not be research based on the purpose of the person collecting it. The focus shifts from the act itself to the reason for doing it. In many situations the data would be collected in the same manner, for a particular purpose, even if there were no intent to do any research.

The end result of this approach is that IRBs will have to look at the stated purpose of performing the activity. If that purpose is not written down, then the IRB will have to ask for a statement of purpose in a protocol or an IRB submission form in order to document it. This may result in an encouragement for investigators and institutions to simply state that they are not intending to perform research as defined in the regulations, that they have a different planned purpose. The question is whether this is a more significant problem than considering all acts of measurement to be research and presenting all of those to the IRB, or adopting some other method of defining research.

The third element to assess in the definition of research is whether the resultant information is “generalizable knowledge.” As noted above, SACHRP does not believe that publication is a necessary or sufficient criterion to determine this, nor is the fact that the activity is performed in more than one institution. However, both publication and the number of institutions involved can be criteria to be considered in this issue along with other factors. “Generalizable knowledge” is not defined, but one can consider it to be knowledge that is widely applicable beyond the circumstances of its collection.

Research Involving Human Subjects

The next step in the triage after determining that an activity is research is to determine whether it is research involving human subjects. Many times the gathering component of benchmarking activities is not human subjects research because the individuals do not meet the definition of a human subject under 45 CFR 46.102(f). The definition is:

Human subject means a living individual about whom an investigator (whether professional or student) conducting research obtains

(1) Data through intervention or interaction with the individual, or
(2) Identifiable private information.

Intervention includes both physical procedures by which data are gathered (for example, venipuncture) and manipulations of the subject or the subject's environment that are performed for research purposes. Interaction includes communication or interpersonal contact between investigator and subject. Private information includes information about behavior that occurs in a context in which an individual can reasonably expect that no observation or recording is taking place, and information which has been provided for specific purposes by an individual and which the individual can reasonably expect will not be made public (for example, a medical record). Private information must be individually identifiable (i.e., the identity of the subject is or may readily be ascertained by the investigator or associated with the information) in order for obtaining the information to constitute research involving human subjects.

Often the gathering component can be designed so that it does not involve human subjects. It is rarely necessary to intervene or interact with individuals in benchmarking projects, although it can be done. Much more commonly, the issue is whether or not the investigator obtains identifiable private information as part of the gathering component of the benchmarking activity. As long as the research is conducted in such a way that no such information is collected, there are no human subjects in the research. Techniques to achieve this can include the use of coded data under certain conditions, as discussed in the OHRP Guidance on Research Using Coded Private Information or Specimens (2008). In addition, the data can be anonymized to help ensure that the research does not involve human subjects.

Institution Is Not Engaged in Research

The next step in the triage after determining that an activity is research involving human subjects is to determine whether a given institution is engaged in research. OHRP has clarified in several guidance documents that when an institution solely provides data to external parties for certain activities, even if the information is identifiable private information, the institution is not engaged in research.

The OHRP guidance on “Engagement of Institutions in Human Subjects Research” provides useful guidance on this issue in section B.6:

Institutions whose employees or agents release to investigators at another institution identifiable private information or identifiable biological specimens pertaining to the subjects of the research.

Note that in some cases the institution releasing identifiable private information or identifiable biological specimens may have institutional requirements that would need to be satisfied before the information or specimens may be released, and/or may need to comply with other applicable regulations or laws. In addition, if the identifiable private information or identifiable biological specimens to be released were collected for another research study covered by 45 CFR part 46, then the institution releasing such information or specimens should:

a. ensure that the release would not violate the informed consent provided by the subjects to whom the information or biological specimens pertain (under 45 CFR 46.116), or
b. if informed consent was waived by the IRB, ensure that the release would be consistent with the IRB’s determinations that permitted a waiver of informed consent under 45 CFR 46.116 (c) or (d).

Examples of institutions that might release identifiable private information or identifiable biological specimens to investigators at another institution include:

a. schools that release identifiable student test scores;
b. an HHS agency that releases identifiable records about its beneficiaries; and
c. medical centers that release identifiable human biological specimens.

Note that, in general, the institutions whose employees or agents obtain the identifiable private information or identifiable biological specimens from the releasing institution would be engaged in human subjects research.

Furthermore, OHRP has clarified in “Correspondence with Dr. Anthony Asher on behalf of the National Neurosurgery Quality and Outcomes Database” and in “Clinical Data Registries - OHRP Correspondence,” that when an institution provides data for certain activities, even research activities, it is not engaged in research. Therefore, when institutions provide information to parties outside of the institution for the gathering component of benchmarking, generally it is possible to structure the provision of the information in a manner such that the institution is not engaged in research.

The institution receiving the information for the gathering component may be engaged in research, and a separate analysis of that portion of the project is necessary. If an activity is research involving human subjects, and identifiable private information is provided to the receiving institution as part of the gathering component, that institution will be engaged in research.

Exempt Research

The gathering component of benchmarking might be exempt under 45 CFR 46.101(b)(1) through (b)(6). As with any other project, an analysis of the project will be necessary on a case-by-case basis.

Research that Needs Expedited or Convened IRB Review

If the gathering component of benchmarking does not meet any of the categories above, and thus is non-exempt research, it must have IRB review and approval. In most cases, such activities will qualify for expedited review, but theoretically there could be gathering component activities that involve sensitive data and that require convened IRB review.

Analysis of the Measuring Component; Use of Existing Data to Benchmark

Definition of Research

The first element to assess in the definition of research is whether the activity is a systematic investigation. The measuring component of benchmarking is usually a systematic investigation, in that it is done according to a plan. Therefore, it is likely to satisfy that part of the definition.

The second element to assess in the definition of research is whether the activity is designed to develop or contribute to generalizable knowledge. One approach to this assessment is to focus on the design and purpose: If the design and purpose of a benchmarking project is to collect data to inform other purposes besides developing or contributing to generalizable knowledge, then it is not research. The measuring component of benchmarking activities can be designed to develop or contribute to generalizable knowledge, or have embedded components to that end, and as such constitute research under the Common Rule. As with the gathering component analysis, this purpose-based approach creates a focus on the elements of design and purpose. The act of comparing institution data to the gathered benchmark information may or may not be research based on the purpose of the person performing the comparison and that person’s role. The focus shifts from the act itself to the reason for doing it and the role of the person doing it. In many situations the data would be compared in the same manner, for a particular purpose, even if there were no intent to do any research.

As with the gathering component analysis, the end result of this approach is that IRBs are going to have to look at the stated purpose of performing the activity. If that purpose is not written down, then the IRB is going to have to ask for a statement of purpose in a protocol or an IRB submission form in order to document it. This may result in an encouragement for investigators and institutions to simply state that they are not intending to perform research as defined in the regulations. The question again is whether this is a more significant problem than considering all acts of measurement to be research and presenting all of those to the IRB, or adopting some other method of defining research.

The third element to assess in the definition of research is whether the resultant information is “generalizable knowledge.” The term is not defined, but one can consider it to be knowledge that is widely applicable beyond the circumstances of its collection. If the sole purpose of the measuring component is to assess how a single institution compares to other institutions in terms of the benchmark, that is arguably not generalizable knowledge. However, if the benchmarking data are used to create generalizable knowledge, then the activity would be research. To illustrate this issue, if an institution performs the measuring component solely in order to assess whether it has more or fewer infections in the ICU, this would not be research. However, if an institution or an investigator performs the measuring component in order to determine the causes of the different infection rates across the ICUs, then this would be research.

Research Involving Human Subjects

The next triage assessment is whether the activity is research involving human subjects. Many measuring component activities are not human subjects research because the individuals do not meet the definition of a human subject under 45 CFR 46.102(f). The definition is:

Human subject means a living individual about whom an investigator (whether professional or student) conducting research obtains

(1) Data through intervention or interaction with the individual, or
(2) Identifiable private information.

Intervention includes both physical procedures by which data are gathered (for example, venipuncture) and manipulations of the subject or the subject's environment that are performed for research purposes. Interaction includes communication or interpersonal contact between investigator and subject. Private information includes information about behavior that occurs in a context in which an individual can reasonably expect that no observation or recording is taking place, and information which has been provided for specific purposes by an individual and which the individual can reasonably expect will not be made public (for example, a medical record). Private information must be individually identifiable (i.e., the identity of the subject is or may readily be ascertained by the investigator or associated with the information) in order for obtaining the information to constitute research involving human subjects.

The measuring component of benchmarking projects can often be designed so that it does not involve human subjects. It is rarely necessary to intervene or interact with individuals in benchmarking projects, although it can occur. Much more commonly, the issue is whether or not the investigator obtains identifiable private information as part of the benchmarking activity. As long as the measuring component is conducted in such a way that no such information is collected, there are no human subjects in the research. Techniques to achieve this can include the use of coded data under certain conditions, as discussed in the OHRP Guidance on Research Using Coded Private Information or Specimens (2008). In addition, the data can be anonymized to help ensure that the research does not involve human subjects.

Institution Is Not Engaged in Research

The third triage assessment is whether a given institution is engaged in research. If an institution is receiving identifiable private information to perform research activities, then it will be engaged in research. This is in contrast to the current OHRP interpretation that providing identifiable private information does not make an institution engaged in research, as discussed above.

Exempt Research

The measuring component of a benchmarking activity might be exempt under 45 CFR 46.101(b)(1) through (b)(6). As with any other project, an analysis of the project will be necessary on a case-by-case basis.

Research that Needs Expedited or Convened IRB Review

The measuring component of a benchmarking activity that does not meet any of the categories above, and thus constitutes non-exempt research, must have IRB review and approval. In most cases, such activities will qualify for expedited review, but theoretically there could be measuring component benchmarking activities that involve sensitive data that require convened IRB review.

C. Administrative Considerations in Obtaining IRB Review for Benchmarking Activities

When institutions and other parties are designing benchmarking activities, they should proactively consider whether or not the gathering component or the measuring component of the activity constitutes research. Steps can be taken so that the activities are not research, or not research involving human subjects, or so that given institutions are not engaged in the research. In those cases where the gathering component of a benchmarking activity meets the definition of research involving human subjects, the research can often be designed so that a single IRB can perform the review, as the institutions that are providing identifiable private information are not engaged in research. In those cases where the measuring component of a benchmarking activity meets the definition of research, it is more administratively difficult to use a single IRB if each institution engages in the measuring component. However, if the measuring component can be performed centrally, then a single IRB could review the research without requiring a waiver of IRB oversight from each individual institution.

D. SACHRP Recommendations

1. OHRP clarify that when the design and purpose of the gathering component and the measuring component of benchmarking is to inform other purposes besides developing or contributing to generalizable knowledge, then it is not designed to develop or contribute to generalizable knowledge and is not research.

2. OHRP clarify that if any part of a benchmarking project is intended to develop or contribute to generalizable knowledge, even a secondary goal, then that part of the project is research and needs to be triaged accordingly. OHRP should clarify it does not support a primary purpose approach to defining research.

3. OHRP clarify that when institutions provide identified or de-identified data for benchmarking purposes, even if the benchmarking activity involves research, the institution is not engaged in research. This would mirror the OHRP FAQs on QI and posted OHRP correspondence on that issue.

4. SACHRP suggests that the current OHRP FAQ on QI could be modified to include benchmarking in the question and as an example in the answer. Possible language is presented in Appendix I of this recommendation.

5. SACHRP recommends that OHRP consider development of a multifactorial set of criteria for determining when an activity is research based on the role of the individuals conducting the activity and the design and purpose of the activity.

Appendix I - Proposed Modification to Current OHRP Guidance

OHRP could modify the current FAQ on QI to include reference to benchmarking. Here is an example, with the additions in italics:

Do quality improvement activities fall under the HHS regulations for the protection of human subjects in research (45 CFR part 46) if their purposes are limited to: (a) delivering healthcare, and (b) measuring and reporting provider performance data for clinical, practical, or administrative uses, such as benchmarking?

No, such quality improvement activities do not satisfy the definition of “research” under 45 CFR 46.102(d), which is “…a systematic investigation, including research development, testing and evaluation, designed to develop or contribute to generalizable knowledge…” Therefore the HHS regulations for the protection of human subjects do not apply to such quality improvement activities, and there is no requirement under these regulations for such activities to undergo review by an IRB, or for these activities to be conducted with provider or patient informed consent.

The clinical, practical, or administrative uses for such performance measurements and reporting could include, for example, helping the public make more informed choices regarding health care providers by communicating data regarding physician-specific surgical recovery data or infection rates. Other practical or administrative uses of such data might be to enable insurance companies or health maintenance organizations to make higher performing sites preferred providers, to allow other third parties to create incentives rewarding better performance, or to allow institutions to participate in benchmarking activities.
