Thursday, September 8, 2016

Does The Hospital Compare 5-Star Rating Promote Public Health?


The Centers for Medicare and Medicaid Services (CMS) recently released summary “5-star” scores for hospitals on its Hospital Compare website. In advance of the release and since, hospitals have criticized these scores as inaccurate and unfair. In defense of the scores, CMS and various other stakeholders have cited the need for quality transparency that consumers can use to help guide choice of health care providers. Both are right.

The reason we measure and publicly report quality is to facilitate improvements in the health of the population by a) informing consumers about high- or low-quality providers that they might want to choose or avoid, and b) providing information to providers that they can act upon to improve the care they deliver and the scores the public will see. We expect these actions to improve public health by directing consumers to higher quality providers and by inspiring all providers to deliver higher quality care. The relevant question is whether the new 5-star hospital ratings facilitate either.

The 5-star scores are based upon 64 measures of hospital quality that are published on the Hospital Compare website. In addition, there is now a single summary star rating based upon performance on the individual measures. The methodology used to create the scores has been criticized as unfair for lack of adequate risk adjustment for the medical complexity and socioeconomic disadvantages of the patients some hospitals disproportionately serve.

This is a valid concern that should be evaluated. In our view, however, it is not the biggest problem with the 5-star summary score or the measures from which it is derived. More fundamental concerns are whether a single summary score makes sense; whether we should grade quality on a curve; and whether we are measuring the right things.

The Quality of Summary Scores

A single summary score that describes overall quality at one hospital is probably not very useful for consumers. Consumers want to know about the quality a hospital delivers for their condition or for a medical procedure. Given that the quality for different types of care can vary widely within a single institution, it is unlikely that a single summary score would accurately represent the quality of care for all conditions or procedures at one hospital.

To construct the summary star scores, some fairly complex statistical calculations are performed, which essentially use rank-order performance on individual measures, weighted by importance, to come up with a summary score. The end result is a distribution of summary scores that approximates a bell-shaped curve, with 48 percent of hospitals assigned 3 stars, about 3 percent each assigned 1 star and 5 stars, and the rest 2 or 4 stars.

There are several problems with using a curve. First, it implies a meaningful difference in performance when there might not be one. For many of the individual measures from which the summary score is derived, most hospitals are no different from the national average. Second, it implies that many stars equal high quality and few stars equal low quality. Regardless of whether quality across hospitals is uniformly high, low, or average, the curve will distribute hospitals across the five star levels. Consider the measures reported in the “effectiveness of care” domain. The average national score is over 92 percent for most of the measures; for several it approaches 100 percent. There is little clinically meaningful difference in scores across hospitals and the performance is uniformly high.

If our objective is to drive the delivery of high-quality care across our health system, it would be better to define clinically meaningful thresholds that define high quality and report performance against those standards. For some measures, most hospitals would achieve the threshold; for others few. Whether or not a hospital reaches a quality standard for an individual measure is more important to patients and policymakers than the relative performance on individual measures, especially when performance is uniformly high or low.
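To make the contrast between a curve and a threshold concrete, here is a minimal Python sketch. It is a toy illustration and not the CMS methodology: the function names, the equal-sized rank groups, the example scores, and the 95 percent threshold are all assumptions chosen for the example.

```python
def stars_by_curve(scores):
    """Assign 1 to 5 stars purely by rank, using five equal-sized groups.

    Assumption: equal-sized groups are a simplification; the point is only
    that the result depends on relative position, not on the score itself.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    stars = [0] * len(scores)
    for rank, i in enumerate(order):
        stars[i] = 1 + (rank * 5) // len(scores)
    return stars


def meets_threshold(scores, threshold):
    """Report whether each hospital reaches a fixed quality standard."""
    return [s >= threshold for s in scores]


# Hypothetical scores on one measure where performance is uniformly high.
scores = [96.0, 96.5, 97.0, 97.5, 98.0, 98.5, 99.0, 99.5, 99.8, 100.0]

curve = stars_by_curve(scores)
passed = meets_threshold(scores, threshold=95.0)  # hypothetical standard

for score, star, ok in zip(scores, curve, passed):
    print(f"score {score:5.1f}  curve stars {star}  meets standard {ok}")
```

In this sketch every hospital meets the hypothetical standard, yet the curve still hands out everything from 1 to 5 stars. Shifting all scores down by the same amount would leave the stars unchanged, which is why a curve alone cannot say whether quality is good or bad.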

Measures That Matter

The summary scores are calculated from measures across seven logical and important domains: mortality, safety, readmissions, patient experience, effectiveness of care, timeliness of care, and efficiency. From the perspective of a consumer looking to understand hospital quality for specific procedures or conditions, however, many of the individual measures may be irrelevant. For many conditions and procedures, quality measures are missing, including ones that assess competency to diagnose and manage the rare or complex.

The measure set is a good start for patients with cardiovascular disease, but less helpful to those with other conditions. Although one domain is labeled “effectiveness of care,” there are no measures in the set that assess whether patients actually get better. Do patients treated for coronary heart disease attain better health? Do patients who undergo a hip or knee replacement achieve improvements in pain and function? To be fair, these types of measures are uncommon, so it is not surprising that there are none in the measure set, but they should be developed.

Making Useful Comparisons

Although a defined set of measures is used to calculate the summary score, different measures from the set are used to construct the summary score for different hospitals depending upon the services provided by the hospital and whether a hospital has a large enough sample size for individual measures.

The summary scores of multidisciplinary hospitals are based on many measures across different disciplines; those of specialty hospitals are based on a smaller set of measures. Consequently, summary scores for different hospitals may be constructed from very different measures, making comparisons difficult to understand. For example, within the “Effectiveness of Care” domain, 25 percent of hospitals are scored on two or fewer measures, while 32 percent are scored on 27, 28, 29, or 30 measures.

With regard to sample size, there is a clear relationship between higher volumes and better outcomes. At the same time, outcome scores for individual hospitals with low volume have confidence intervals so wide that the scores are not useful. For this reason, CMS does not report scores for some measures for some hospitals if the sample size is small. These measures are also excluded from the summary score calculation for low-volume hospitals. Although this is a statistically rational approach, it contributes to an imbalance of reporting across hospitals and circumvents quality reporting for a group more likely to have low quality.
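The effect of volume on confidence intervals can be sketched with a short Python example. This is an illustration under stated assumptions, not how CMS computes its intervals: the Wilson score interval is one common choice for a proportion, and the case counts and the 10 percent event rate are hypothetical.

```python
import math


def wilson_interval(events, n, z=1.96):
    """Approximate 95% Wilson score interval for the proportion events / n."""
    p = events / n
    z2 = z * z
    denom = 1 + z2 / n
    center = (p + z2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z2 / (4 * n * n)) / denom
    return center - half, center + half


# Two hypothetical hospitals with the same observed 10% event rate.
low_volume = wilson_interval(events=2, n=20)
high_volume = wilson_interval(events=200, n=2000)

for label, (lo, hi) in [("20 cases", low_volume), ("2000 cases", high_volume)]:
    print(f"{label}: {lo:.1%} to {hi:.1%} (width {hi - lo:.1%})")
```

Both hypothetical hospitals have the same observed rate, but the interval for the low-volume hospital is many times wider, so its score says far less about its true performance.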

Perspective from a 5-Star Hospital

For full disclosure, our institution, Hospital for Special Surgery (HSS), is among the 3 percent of hospitals with a 5-star rating. For HSS, this rating is consistent with other ratings we’ve achieved, including being ranked #1 in orthopedics for the last seven years and #2 in rheumatology by U.S. News & World Report.

We believe the consistency across ratings for HSS is in large part the result of a care delivery model with focus on one set of related conditions. This facilitates the delivery of high-quality care in one area and, in terms of the star ratings, measurement of quality in a focused area without averaging across disciplines. That said, even for our highly specialized hospital, important quality constructs are missing from our score.

In our view the concept behind the Hospital Compare 5-star summary, i.e. utilizing public disclosure of provider quality information to promote the receipt of high-quality care by consumers and thus improve population health, is a good one. However, as currently constructed the scores are unlikely to achieve this goal for the following reasons: roll-up scores across conditions/procedures obfuscate quality at the level of the condition or procedure where gains in quality could happen; grading on a curve fails to identify whether quality is good or bad; and measurement is incomplete and/or imbalanced both in terms of the application of existing measures across hospitals and the absence of important measures in the set.

In other words, the current scores don’t help consumers pick a high-quality hospital for specific conditions or procedures and don’t promote meaningful quality improvement across hospitals. In fact, in a value-based market where financial rewards are given only to the highest performers rather than providers that achieve high quality, defining quality based on a curve rather than a meaningful threshold will prevent some high-quality hospitals from being rewarded and could discourage hospitals from sharing best practices.

Public reporting of hospital quality has great promise to improve population health. We believe that in order to achieve this goal the 5-Star reporting program should be revised to report at the level of the procedure or condition using measures that matter; utilize specific performance thresholds to define quality; and use the same measures when making comparisons across hospitals. This is not a small body of work, but it is important. Given all the attention 5-Star Hospital ratings have gotten, there should be great interest in improving hospital quality reporting and many parties willing to help.



from Health Affairs Blog http://ift.tt/2cGqFRj
