Department of Health and Human Services
DEPARTMENTAL APPEALS BOARD
Appellate Division
Arizona Health Care Cost Containment System
Docket No. A-19-29
Decision No. 2981
DECISION
The Arizona Health Care Cost Containment System (State), which administers the Medicaid program in Arizona, has appealed a June 26, 2018 determination by the Centers for Medicare & Medicaid Services (CMS) to disallow $19,923,489 in federal financial participation (FFP). Arizona's Medicaid program had received the disallowed FFP for its expenditures on school-based health services, including speech therapy, furnished to Medicaid-eligible children. The disallowance determination is based on findings of an audit conducted by the United States Department of Health and Human Services' Office of Inspector General (OIG).1 The audit involved a review of a statistical sample of school-based health service expenditures for which the State claimed FFP from January 2004 through June 2006.
In this appeal the State raises objections to the OIG's statistical sampling and estimation methods and to the OIG's finding that expenditures for certain speech therapy services were ineligible for FFP. We overrule these objections and therefore affirm the disallowance.
Legal Background
The federal Medicaid statute, sections 1901-1946 of the Social Security Act (Act),2 authorizes federal financial assistance to states that provide "medical assistance" (health insurance benefits) to low-income individuals and families as well as to blind and disabled persons. Act §§ 1901, 1903. Each state operates a Medicaid program in accordance with broad federal requirements and the terms of its federally approved "State plan for medical assistance," which specifies the health care items and services covered by the program. Act §§ 1902(a)(10), 1905(a); 42 C.F.R. Part 435. A state Medicaid program may pay for a health care service furnished by a school to a Medicaid-eligible child if that service falls within a medical assistance category in section 1905(a) of the Act and is either specified in the State plan or covered under Medicaid's Early and Periodic Screening, Diagnostic, and Treatment benefit. CMS, Medicaid and School Health: A Technical Assistance Guide (Aug. 1997), AZ Ex. 1, at AHC 0007-09, 0011-12; Tex. Health & Human Servs. Comm., DAB No. 2187, at 2 (2008).
If applicable federal requirements are met, a state Medicaid program is entitled to FFP – that is, federal matching funds – for a percentage of the program's medical assistance expenditures, including expenditures for covered school-based health services. Act §§ 1903(a), 1905(a); Tex. Health & Human Servs. Comm., DAB No. 2187, at 3. In general, "[o]nly those expenditures for medical assistance made by a state in accordance with the state plan are eligible for FFP." Tex. Health & Human Servs. Comm., DAB No. 2176, at 3 (2008).
Case Background
Arizona's Medicaid program covers school-based health services (services rendered by school employees or contractors) under a subprogram called Direct Service Claiming (DSC). AZ Ex. 6, at AHC 0132; AZ Ex. 2, at AHC 087. DSC's coverage extends only to Medicaid-eligible students who are also eligible for school health services under Part B of the Individuals with Disabilities Education Act (IDEA). AZ Ex. 6, at AHC 0132, 0140; AZ Ex. 2, at AHC 0087. DSC-covered services may include nursing, behavioral health, and physical, occupational, and speech therapy services. AZ Ex. 6, at AHC 0131. Local Education Agencies (LEAs), a term that includes public school districts and charter schools, bill Arizona's Medicaid program on a fee-for-service basis for services they furnish under DSC. Id. at AHC 0132 ns. 1, 3. The State, in turn, claims FFP (on form CMS-64) for payments it makes to the LEAs for those services.3 Id. at AHC 0131-32. Arizona's state plan provides that school-based health services are covered by DSC only if medically necessary and furnished in accordance with applicable federal and state statutes and regulations and with applicable program policies, procedures, and guidelines. Id. at AHC 0132; AZ Ex. 2, at AHC 0087-89.
Between 2007 and 2009, the OIG performed an audit of the State's FFP claims for school-based services to determine whether they comported with federal and state requirements. AZ Ex. 6, at AHC 0133-34. The OIG issued a final report of the audit's findings on March 22, 2010. Id. at AHC 0122.
The audit examined FFP claims for the period January 1, 2004 through June 30, 2006. Id. at AHC 0134. During that period the State claimed (and received) approximately $124 million in FFP for school-based health services. Id.
To determine whether that sum had been properly claimed, the OIG used statistical sampling and estimation methods. Id. at AHC 0127, 0134-35. The OIG first identified 9,542,367 paid school-based health services associated with the audited FFP claims. Id. at AHC 0134, 0144. The OIG grouped these services into 530,029 "student-months." Id. at AHC 0144. (A student-month was the statistical "sampling unit"4 and "represented all paid Medicaid school-based health services provided to an individual student for a calendar month." Id.) Certain student-months were excluded,5 leaving a "sampling frame," or target population, of 528,543 student-months, encompassing services for which the federal government had provided $123,614,883 in FFP. Id. From the target population the OIG selected a "simple random sample" of 100 student-months.6 Id. at AHC 0144-0145. That 100-unit sample encompassed 1,989 discrete school-based health services for which the federal government had provided $32,212 in FFP.7 Id. at AHC 0134.
The OIG then obtained and reviewed medical records and other documentation in order to determine whether each of the school-based services in the sample "was allowable [that is, eligible for FFP] in accordance with Federal and State requirements." Id. at AHC 0134-0135. The OIG found that Arizona's Medicaid program had paid for unallowable school-based health services provided during 46 of the 100 sampled student-months, and that the federal government had provided $6,764 in FFP for those services.8 Id. at AHC 0135.
Based on these results, the OIG calculated a "point estimate"9 of the total amount of FFP paid for unallowable school-based services in the population. Id. at AHC 0146. This estimate was approximately $35.7 million. Id. The OIG calculated the point estimate by dividing the amount of FFP found by the OIG to have been improperly paid for school-based health services provided during the sampled student-months ($6,764) by the number of student-months in the sample (100) – then multiplying the resulting quotient ($67.64) by the number of student-months in the population (528,543). (The quotient used to project the sample results to the population is referred to by the parties as the "mean difference estimator" or the "mean-per-unit estimator"; it is essentially the average amount of FFP paid in error, as determined by the OIG, for each student-month in the sample. See CMS Ex. 1, ¶ 9.)
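The arithmetic described in the preceding paragraph can be reproduced in a few lines. This is only an illustrative sketch: the dollar figures and counts are those recited above, while the variable names are ours and do not come from the audit report.

```python
# Figures recited in the audit narrative above; variable names are illustrative only.
sample_overpayment = 6_764    # FFP found unallowable in the sample, in dollars
sample_size = 100             # student-months in the sample
population_size = 528_543     # student-months in the sampling frame

# Mean-per-unit estimator: average unallowable FFP per sampled student-month.
mean_per_unit = sample_overpayment / sample_size

# Point estimate: project the per-unit average onto the whole sampling frame.
point_estimate = mean_per_unit * population_size

print(f"mean per unit:  ${mean_per_unit:,.2f}")
print(f"point estimate: ${point_estimate:,.0f}")
```

The result is consistent with the "approximately $35.7 million" figure reported by the OIG.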
To account for likely error in the point estimate – the actual value of unallowable FFP in the population was probably higher or lower than the estimate – the OIG constructed a two-sided 90 percent "confidence interval" around the point estimate.10 AZ Ex. 6, at AHC 0146; CMS Ex. 1, ¶¶ 11, 12, 34. Using the lower bound, or lower limit, of that confidence interval, the OIG concluded that the State had received "at least" $21,288,312 in FFP for school-based health services whose costs were ineligible for federal reimbursement and recommended that CMS disallow that amount. AZ Ex. 6, at AHC 0136, 0141, 0146.
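For readers unfamiliar with the mechanics, the following sketch shows one common way a two-sided 90 percent confidence interval for a population total can be built from a simple random sample. It rests on stated assumptions: a normal approximation, no finite-population correction, and made-up per-unit error amounts. The record excerpted here does not set out the exact formula applied by the OIG's software, and the function name `interval_for_total` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def interval_for_total(sample_values, population_size, confidence=0.90):
    """Return (lower, point, upper) for a population total.

    Assumptions (ours, not the record's): normal approximation,
    no finite-population correction.
    """
    n = len(sample_values)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.645 at 90 percent
    per_unit_mean = mean(sample_values)
    std_error = stdev(sample_values) / sqrt(n)      # standard error of the mean
    point = population_size * per_unit_mean
    margin = population_size * z * std_error
    return point - margin, point, point + margin


# Hypothetical per-unit error amounts (NOT the audit data): 54 clean units, 46 with errors.
hypothetical_errors = [0.0] * 54 + [100.0] * 23 + [200.0] * 23
lower, point, upper = interval_for_total(hypothetical_errors, 528_543)
print(f"lower {lower:,.0f} | point {point:,.0f} | upper {upper:,.0f}")
```

The lower value returned is the kind of figure on which a disallowance of this type is based: the more variable the sample, the further it falls below the point estimate.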
After the OIG issued its final report, CMS agreed to overturn or modify one or more of the report's sample findings based on additional documentation furnished by the State. AZ Ex. 12, at AHC 0216. This action reduced the amount of unallowable expenditures in the sample (as determined by the auditors) from $6,764 to $6,502.55. CMS Ex. 4, at 1. Using the lower figure (rounded up to the nearest dollar), CMS recalculated the population point estimate of unallowable expenditures and constructed a new confidence interval whose lower bound was $19,926,220. Id. at 2.
Based on the revised lower-bound estimate, CMS formally disallowed $19,923,489 in FFP claimed by the State for school-based health services for the 30-month audit period. AZ Ex. 12. (CMS disallowed slightly less than the lower-bound estimate in order to account for a small FFP refund by the State. Id. at AHC 0216.)
The State then filed a request for reconsideration, raising objections to the OIG's sampling and estimation methods. In support of that request, the State submitted a September 2010 report prepared for the State by Milliman, an actuarial consulting firm.11 AZ Ex. 7; AZ Br. at 6. The State also relied upon a February 12, 2018 analysis prepared by EconLit LLC, an economic and litigation consulting firm. AZ Ex. 11; AZ Br. at 6.
CMS denied the request for reconsideration (AZ Ex. 13), and the State then filed this appeal. With its opening brief the State submitted, among other material, the Milliman Report (AZ Ex. 7), the February 12, 2018 EconLit analysis (AZ Ex. 11), and an April 29, 2019 "Supplemental Report" by EconLit (AZ Ex. 17). In response, CMS submitted a declaration by Jared B. Smith, Ph.D., the OIG's Director of Quantitative Methods (CMS Ex. 1). The State followed up with a reply brief and a June 28, 2019 "Rebuttal Report" by EconLit (AZ Ex. 20).
CMS moved to exclude the Rebuttal Report from the record. The Board denied that motion but gave CMS an opportunity to file a surreply and supporting supplemental declaration from Dr. Smith. The Board also gave the State an opportunity to file a sur-surreply "confined to addressing arguments made in CMS's surreply." See July 19, 2019 Ruling Denying Request to Exclude Appellant's Exhibit 20.
CMS filed its surreply together with a second declaration by Dr. Smith (CMS Ex. 8). The State then moved to strike the portion of the surreply addressing its objection to the OIG's finding that certain school-based speech therapy services were ineligible for FFP. See Appellant's Aug. 15, 2019 Motion to Strike. The Board denied that motion but gave the State additional time to file its sur-surreply. See Aug. 20, 2019 Ruling Denying Motion to Strike. The State then timely filed a sur-surreply and a "Second Rebuttal Report" by EconLit (proposed AZ Ex. 21).
On September 10, 2019, CMS moved to exclude EconLit's Second Rebuttal Report, asserting, in part, that the Board did not authorize the State to file any additional expert reports, and that, in any event, the content of the Second Rebuttal Report is largely repetitive and unnecessary. In response, the State argued, in part, that the Second Rebuttal Report is no more repetitive than Dr. Smith's second declaration and responds to arguments advanced in that declaration. The Board's August 20, 2019 ruling did not expressly state that the State may not submit any additional expert report with its sur-surreply. Although the State did not then ask the Board in advance for leave to do so, the Board has determined that the State nevertheless should have an opportunity to present to the Board additional expert opinion in support of the arguments in its sur-surreply and that our decision-making would be best served by considering the State's last expert report. We accordingly deny CMS's September 10, 2019 motion and admit EconLit's Second Rebuttal Report.12
Analysis
The State contends that the disallowance should be reduced on two grounds. First, it submits that the OIG's sampling and estimation methods were "invalid" or inappropriate, resulting in a "significantly overstated disallowance." AZ Br. at 3, 21. Second, the State disputes the OIG's finding that speech therapy services provided during 12 sample student-months were ineligible for FFP. Id. at 11-20. We begin by assessing the State's objections to the OIG's sampling and estimation methods.
1. The OIG's sample-based estimate of unallowable costs is reliable evidence supporting the disallowance.
To determine the extent to which a state Medicaid agency's FFP claims include unallowable (FFP-ineligible) expenditures, CMS may rely upon statistical sampling and estimation when "individual review of the underlying records would be impractical due to volume and cost." N.Y. State Dep't of Social Servs., DAB No. 1394, at 22 (1993); see also P.R. Dep't of Health, DAB No. 2385, at 5-6 (2011) (noting that the Board and federal courts "have repeatedly upheld the use of statistical sampling in calculating disallowances of public funds"). When statistical sampling is used for that purpose, as it was in this case, CMS (as do other federal grantor agencies) typically bases the resulting disallowance determination on the estimate given by the lower bound of a two-sided 90 percent confidence interval. See, e.g., N.Y. State Dep't. of Social Servs., DAB No. 1358, at 45-46 (1992); P.R. Dep't of Health at 4, 7, 9; N.J. Dep't of Human Servs., DAB No. 2415, at 13 (2011) (noting that "[it] has long been standard practice of the [OIG] to use the lower limit of the 90% two-sided confidence interval" in auditing FFP claims). The Board has held this lower bound, if properly derived using valid methods, is "reliable evidence of the amount of unallowable costs charged to federal funds," P.R. Dep't of Health at 9, and "protect[s] [the State] with a 95% degree of confidence from having to" refund more than the true but unknown amount of the FFP overpayment, Ok. Dep't of Human Servs., DAB No. 1436, at 6 (1993).13
Because the State challenges the validity or appropriateness of the OIG's sampling and estimation methods, CMS has the burden in this proceeding to show that those methods are "scientifically valid" and yielded "reliable evidence" of the amount of FFP improperly claimed for school-based health services.14 Mid-Kansas Cmty. Action Program, Inc., DAB No. 2257, at 4 (2009); N.Y. Dep't of Social Servs., DAB No. 1358, at 54 (stating the federal agency must show that the statistical methods were "reasonable under the particular circumstances" and produced "reliable evidence" of the amount of FFP claimed for unallowable expenditures). CMS carried that burden based on the declarations of its expert, Dr. Smith, who has 10 years of experience working as a statistician and completed 60 quarter hours of graduate-level coursework relating to statistics. Citing authoritative literature in the field of statistical sampling and estimation, Dr. Smith credibly asserted that the OIG used valid procedures and methods to estimate the amount of FFP claimed for unallowable school-based health services during the 30-month audit period. Those procedures and methods included:
• identifying an appropriate sampling unit – the student-month (CMS Ex. 1, ¶¶ 10, 27-28, 30);
• defining a finite target population (the sampling frame), consisting of non-overlapping sampling units (student-months) that had an equal chance of being selected (id., ¶¶ 10, 27-28);
• drawing a simple random sample from the target population using widely accepted statistical software (RAT-STATS) developed by the federal government (id., ¶¶ 7, 10);
• using a mean-per-unit estimator or a mean difference estimator,15 a statistic calculated based on the sample findings, to derive an "unbiased"16 point estimate of unallowable expenditures for school-based health services in the population (id., ¶¶ 9, 10, 14); and
• "account[ing] for the uncertainty of the unbiased point estimate" by calculating a two-sided 90 percent confidence interval around the unbiased point estimate (id., ¶¶ 11-14).
According to Dr. Smith, the lower bound, or lower limit, of the 90 percent confidence interval is "designed to produce an estimate [of the FFP overpayment] that is less than the actual overpayment about 95 percent of the time" and "gives [the State] the benefit of the doubt for the uncertainty in the sampling process." Id., ¶ 12; see also CMS Ex. 8, ¶ 9.
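The "about 95 percent of the time" property described above can be illustrated with a small simulation. Everything in the sketch is an assumption made for illustration: the sampling frame is synthetic (a right-skewed mix of zero and positive error amounts), the interval uses a normal approximation, and the function name `lower_limit_coverage` is hypothetical. It does not use, and says nothing directly about, the audit data.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev


def lower_limit_coverage(trials=2000, frame_size=5000, n=100, seed=0):
    """Share of simulated samples whose lower limit falls below the true total."""
    rng = random.Random(seed)
    # Synthetic frame (assumption): about 45% of units carry a skewed positive error.
    frame = [rng.expovariate(1 / 150) if rng.random() < 0.45 else 0.0
             for _ in range(frame_size)]
    true_total = sum(frame)
    z = NormalDist().inv_cdf(0.95)  # one tail of a two-sided 90 percent interval
    below = 0
    for _ in range(trials):
        sample = rng.sample(frame, n)  # simple random sample without replacement
        lower = frame_size * (mean(sample) - z * stdev(sample) / sqrt(n))
        if lower < true_total:
            below += 1
    return below / trials


print(f"share of samples with lower limit below the true total: {lower_limit_coverage():.3f}")
```

Under these assumptions the lower limit should fall below the true total in the large majority of repeated samples, which is the sense in which the lower limit is said to give the auditee the benefit of the doubt.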
The State offers three critiques of the OIG's work. None persuades us that the OIG failed to apply accepted and appropriate methods of audit sampling and estimation, or that the disallowance rests upon an unreliable estimate of unallowable expenditures in the relevant population.
(1) Sample size and precision of the point estimate
The State first contends (in its opening brief) that the size of the sample – 100 student-months – was "too small to yield a result that had any acceptable precision." AZ Br. at 7 (italics and emphasis added).17
"Precision" is an attribute of the point estimate. CMS Ex. 1, ¶¶ 11, 31, 43. It is the degree to which that estimate varies across potential samples; in other words, precision captures the uncertainty inherent in the point estimate. Id., ¶ 11 (noting that "[w]hile an unbiased point estimate [of FFP overpayment] exactly equals the actual repayment total on average across potential samples, [that estimate] may be higher or lower than the actual overpayment for any given sample").
In his declaration, CMS's expert, Dr. Smith, acknowledged that "[b]etter precision can be achieved by pulling larger samples" (as well as by using more complex sampling procedures or different methods of measuring the population characteristic of interest). Id., ¶ 11. Dr. Smith also acknowledged that "[p]recision and the general match between the sample and the population is an important consideration when relying on a point estimate." Id. However, Dr. Smith emphasized that precision is less salient in this case because the estimate supporting the disallowance is not the unbiased point estimate but, rather, the lower limit of the confidence interval around that estimate. Id. Dr. Smith explained that any imprecision in the point estimate arising from "sample design and choice of estimation method" is "account[ed] for" in calculating the confidence interval's lower bound,18 and that "[b]y design, the level of assurance provided by the confidence interval is not related to the precision of the point estimate." Id., ¶¶ 12-16, 21, 31, 34, 43. Given CMS's reliance on the lower bound of the confidence interval to support the disallowance, said Dr. Smith, aspects of the sample design that tend to decrease precision (such as a smaller sample size) tend to work in the State's favor because "[t]he lower the precision is, the greater will be the reduction from the unbiased estimate." Id., ¶ 13; see also CMS Ex. 8, ¶ 13 (noting that sample "[d]esigns with worse precision tend to result in more conservative lower limits") and 14 (stating that the confidence interval's lower limit "is meant to provide an estimate that tends to be conservative regardless of the variability of the sample").
In response to these statements, the State did not (in its reply or sur-surreply briefs) press its initial claim that the sample size was too small. Nor did the State dispute Dr. Smith's opinion that imprecision in the point estimate is accounted for in calculating the confidence interval's lower limit and "does not impact the confidence associated with the lower limit that CMS is using for the overpayment calculation in this case." CMS Ex. 1, ¶ 31. Indeed, the State conceded that the point estimate's precision has "nothing to do with whether OIG's lower bound is correct." Sur-surreply at 5.19 Furthermore, the Board has recognized in other cases that the point estimate's imprecision tends to benefit the grantee when the disallowance is based on the lower bound of the relevant confidence interval. See, e.g., N.Y. Dep't of Social Servs., DAB No. 1358, at 48 (noting that because smaller samples generate wider confidence intervals, and because the federal agency "disallowed only the amount established by the lower limit of the confidence interval," the State "potentially benefited" from the fact that the sample size, and resulting precision of the point estimate, was smaller or less than they could have been); Pa. Dep't of Public Welfare, DAB No. 1508, at 10 (1995) (holding that the State "was not prejudiced, and indeed likely benefitted, from the use of a smaller sample" than called for by the grantor agency's policy because only the amount established by the confidence interval's lower limit was disallowed); Ok. Dep't of Human Servs. at 8 (stating that the lower bound of confidence interval gave the state "the benefit of any doubt raised by use of the smaller sample"); N.J. Dep't of Human Servs. at 10 (noting that the state had "not shown any prejudice to it" from the chosen sample design "given that the disallowance was not based on the point estimate"). Accordingly, we find that the size of the sample in this case did not render the population estimate supporting the disallowance invalid, and that any imprecision in the point estimate likely benefitted the State because it widened the confidence interval whose lower bound supports the disallowance.
(2) Representativeness of the sample
The next criticism leveled by the State is that the sample of 100 student-months was not "representative" of the population from which it was drawn. AZ Br. at 7, 10; Reply at 2. The sample was not representative, says the State, because the average amount of FFP paid for services performed during a sampled student-month was $322.12, while the average FFP-paid amount for student-months in the population was only $233.88. Reply at 2, 5; Sur-surreply at 2; see also AZ Ex. 7, at AHC 0158. The State provided evidence that the reason for the discrepancy was a higher-than-expected number of student-months in the sample for which the FFP-paid amounts were between $1,000 and $4,500, a range at the higher end of the frequency distribution of those amounts in the population. AZ Ex. 7, at AHC 0158-60. The State submits that it was necessary to "adjust" for that discrepancy to ensure that the disallowed amount was not "overestimated" but that CMS proffered no evidence that the OIG "considered whether [the] sample was unrepresentative, much less whether it could and should be adjusted." Sur-surreply at 2, 3.
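The two averages on which the State relies follow directly from totals recited earlier in this decision. The short sketch below recomputes them; the variable names are illustrative only, and the percentage gap is simply the ratio of the two averages.

```python
# Totals recited in the Case Background; variable names are illustrative only.
sample_paid, sample_units = 32_212, 100             # FFP paid and student-months, sample
frame_paid, frame_units = 123_614_883, 528_543      # FFP paid and student-months, frame

sample_avg = sample_paid / sample_units   # average FFP paid per sampled student-month
frame_avg = frame_paid / frame_units      # average FFP paid per student-month in the frame

# How much higher the sample average is, expressed as a percentage of the frame average.
pct_higher = (sample_avg / frame_avg - 1) * 100

print(f"sample average: ${sample_avg:,.2f}")
print(f"frame average:  ${frame_avg:,.2f}")
print(f"sample average is {pct_higher:.1f}% higher")
```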
CMS acknowledges that the "average paid amounts are higher in the sample than in the frame" but submits that the lower bound of the confidence interval calculated from the sample is a valid and reliable estimate of the actual FFP overpayment amount. Response Br. at 16. In support of that position, Dr. Smith stated that:
• A simple random sample "will tend to differ" from the population and is "not expected to match the population on all dimensions";
• A "key goal of statistics" is to measure the differences between the sample and population and "account for them in a reasonable manner";
• "One well-supported approach for handling the potential differences between the sample and the population is to rely on the confidence interval rather than the point estimate" obtained from the sample; and
• "The confidence interval is designed to cover the population total even in situations where the sample does not match the population."
CMS Ex. 1, ¶¶ 37, 44-45.
Dr. Smith further asserted that the average FFP-paid amounts identified by the State "serve as a good test case for the ability of the confidence interval to account for the differences between the sample and the population":
[The State] calls attention to the fact that the average paid amounts are higher in the sample than in the population. This argument must fail if it is possible to use the sample to reliably estimate the very quantity [FFP-paid amounts] that [the State] claims is not represented. In fact, when OIG used the same [estimation] method it used for the refund amount [the FFP overpayment] to calculate paid amounts it obtained a 90 percent confidence interval ($122,949,342 to $217,560,575) that contained the actual paid amount in the population ($123,614,883). As expected, even though the average paid amounts are higher in the sample than in the population, the confidence interval calculated from the sample still captures the correct population total.
Id., ¶ 38 (relying on the calculations in CMS Ex. 4, at 1). In short, says Dr. Smith, the sample in this case is "sufficiently representative" of the population with respect to FFP-paid amounts because the lower limit of the confidence interval calculated for that characteristic includes the total (aggregate) FFP-paid amount in the population, even though the average FFP-paid amount in the sample ($322.12) is larger than the average FFP-paid amount in the population ($233.88). Id., ¶ 44; see also CMS Ex. 8, ¶ 33 (stating that "when the paid amounts in the population are estimated using the sample values, the resulting lower limit is still less than the known paid amounts in the population").
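The test case Dr. Smith describes can be checked against the figures quoted above. The sketch below confirms that the known population total lies inside the quoted interval and, as a rough consistency check, compares the interval's midpoint with a mean-per-unit projection of the sample's paid amounts. The comparison assumes the interval is symmetric about its point estimate; the small gap between the two figures presumably reflects rounding of the $32,212 sample total, which is our inference rather than anything stated in the record.

```python
# Figures quoted from Dr. Smith's declaration and the Case Background.
ci_lower, ci_upper = 122_949_342, 217_560_575   # 90 percent interval for FFP paid
actual_paid = 123_614_883                       # known FFP paid in the population

# Does the interval capture the known population total?
contains_actual = ci_lower <= actual_paid <= ci_upper

# Rough consistency check (assumes a symmetric interval).
midpoint = (ci_lower + ci_upper) / 2
projected_paid = 32_212 / 100 * 528_543         # mean-per-unit projection of paid amounts

print(f"interval contains actual total: {contains_actual}")
print(f"margin above lower limit: ${actual_paid - ci_lower:,}")
print(f"midpoint {midpoint:,.0f} vs projection {projected_paid:,.0f}")
```

The known total sits only modestly above the lower limit, which is consistent with both parties' observation that the sample's average paid amount ran high.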
We find Dr. Smith's analysis persuasive. As he suggests, random sampling, if executed properly with respect to a properly defined population, provides substantial assurance that the resulting sample's relevant attributes will sufficiently represent those in the population and lead to unbiased estimates. See CMS Ex. 8, ¶ 3 (stating that under accepted statistical theory, a sample is "sufficiently representative if it is pulled at random from the population to which the inference will be performed"); Colo. Dep't of Social Servs., DAB No. 1272, at 33 (1991) (stating that the "assumption underlying a validly drawn sample is that it is representative of the universe from which it is drawn").20
Of course, by chance, a random sample will almost certainly not perfectly represent the population across every dimension. The parties agree with the basic maxim that a random draw does not guarantee a sample that is a perfectly representative subset of the population. CMS Ex. 1, ¶ 37 ("Because samples are random, they will tend to differ from the populations that they were drawn from."); AZ Ex. 20, at AHC 0308 (acknowledging that "[r]andom samples cannot ensure representativeness with respect to any characteristic" and that "every sample contains sampling errors"). Consequently, an estimate of the population parameter based on sample data will differ from the value that would have been found for the parameter if the entire population had been surveyed. CMS Ex. 1, ¶ 11. But as Dr. Smith explained, random sampling enables a researcher to apply probability theory to calculate that expected difference (or error), then use that calculation to construct a confidence interval to show the range of values that can be said to include the true population value with a prespecified level of confidence (absent any bias). CMS Ex. 1, ¶¶ 37, 44-45; see also FJC Ref. Guide at 240 n.83 (noting that "[r]andomness in the technical sense . . . justifies the probability calculations behind standard errors [and] confidence intervals"). Consequently, if differences between a sample and the population are due to random selection, as they appear to be in this case – the State does not allege that the OIG's methods introduced bias into the sample results – then an error in the point estimate attributable to those differences can generally be "accounted for" in the interval estimates based on the sample. See CMS Ex. 8, ¶ 33 (noting that differences between the sample and the population are generally accounted for through the use of the lower limit); FJC Ref. Guide at 241, 243, 244, 246 (explaining that a population estimate derived from a random sample will reflect some degree of error whose magnitude can be reported in terms of the "standard error" and confidence interval); id. at 296 (defining "sampling error," also called "random error," as the difference between an unbiased sample-based estimate and the true population value that results from the fact that the sample "is not a perfect microcosm of the whole").
The State asserts that the OIG "did nothing to account for the unrepresentativeness of the sample," and that this failure "produc[ed] a lower bound that is incorrect, statistically invalid, and far too high." Reply at 6. These assertions are unconvincing for at least three reasons. First, the State sidesteps Dr. Smith's main point, which is that the confidence interval's upper and lower bounds account for sample-population differences because they are calculated using the expected error, or variability, in the point estimate due to those differences. See CMS Ex. 1, ¶¶ 34, 43 (noting that the interval estimates are determined using a measurement of the point estimate's variability). The State does not cite any statistical text, theory, or formula to rebut that proposition. See Sur-surreply at 4; Reply at 2-3, 6. Nor does the State suggest that the difference between average FFP-paid amounts in the sample and population reflected mistakes by the OIG in defining the target population or selecting the sample.
Second, the State does not contend that the disparity between average FFP-paid amounts in the sample and population is so large or unusual that the sample cannot be relied upon to produce valid interval estimates. See AZ Br. at 11 (stating that the "disallowance can be recomputed without a need for further sampling," provided that a different "estimator" is used to extrapolate the sample findings). The State's exhibits include a November 2014 report prepared by Al Kvanli, Ph.D., who stated that he had 33 years of experience "specializing in statistical issues related to government audit applications." AZ Ex. 10, at AHC 0184. Dr. Kvanli noted in this report that the average FFP-paid amount in the sample was 1.64 standard deviations above the population mean for that parameter. Id. at AHC 0187. Dr. Kvanli further stated that while the sample mean was "slightly on the high side," it was not "unusually so," and that he considered "a sample mean more than 2 standard errors from the population mean as being in the 'unusual' category." Id. at AHC 0187-88. The State did not dispute these particular observations, which appear valid on their face. See FJC Ref. Guide at 244 (noting that "[r]andom errors larger in magnitude than the standard error are commonplace," while "[r]andom errors larger in magnitude than two or three times the standard error are unusual").
Third, the State's suggestion that the sample's alleged non-representativeness actually caused the OIG's lower-bound estimate to be higher than it would otherwise have been is unfounded. According to the State, because the average FFP-paid amount for the sampled student-months was 37.7 percent higher than for student-months in the population, the OIG "overstated" the lower bound of the confidence interval for FFP-overpayment amount by the same percentage. Sur-surreply at 2, 5 (citing AZ Ex. 20, at 8 and AZ Ex. 21, at 3-4). The State provided no statistical or other mathematical analysis to back up that particular claim. According to the Milliman report, the difference in average FFP-paid amounts in the sample and population stemmed from a higher-than-expected number of student-months in the sample for which the FFP-paid amount was between $1,000 and $4,500. AZ Ex. 7, at AHC 0158-60. The State does not say which student-months in that upper range included services that the OIG identified as ineligible for FFP, specify the amount(s) of FFP paid for those services, or show that those amounts (if any) had an outsized effect in calculating the mean FFP overpayment for the sample. See CMS Ex. 1, ¶ 45 (commenting that a statistical test run by Milliman provided "no evidence of any potential impact" of the sample-population differences on the confidence interval's lower limit).
Based on the foregoing analysis, we conclude that the sample's alleged non-representativeness did not invalidate, or render unreliable, the OIG's lower-bound estimate of unallowable expenditures in the target population.
(3) Appropriateness of the OIG's statistical "estimator"
The State's third main criticism assumes that the second is wellfounded. The State contends that in light of the sample's alleged nonrepresentativeness, the OIG should have used – and indeed was "required" to use – a "cluster sample analysis"21 to estimate the amount of unallowable costs in the population. AZ Br. at 710. By "cluster sample analysis," the State evidently means the use of a "ratio estimator" to project sample findings to the population. Id. at 911; Reply at 2; Sursurreply at 2. If a cluster sample ratio estimator is used, says the State, then the lowerbound of the resulting confidence interval would be several million dollars less than the lowerbound figure calculated by the OIG, warranting a comparable reduction in the disallowance. See AZ Br. at 911; Reply at 2, 4.
We pause to describe the State's ratio estimation approach, which is set out in EconLit's Supplemental and Second Rebuttal Reports (AZ Ex. 17, at AHC 0248, and AZ Ex. 21, at 2). The State calculated the ratio estimator by dividing the amount of FFP that the OIG determined had been paid in error for services performed during the sampled student-months (this amount was originally $6,764 but was reduced by CMS to $6,503 (rounded up from $6,502.55)) by total FFP payments made for those services ($32,212). The resulting quotient – the fraction of total FFP paid for services performed during the sampled student-months that was paid in error (according to the OIG) – is 0.20188. The State multiplied that fraction by total FFP paid for all student-months in the population (approximately $123.6 million) to derive a point estimate of unallowable costs ($24,953,704) and a two-sided 90 percent confidence interval whose lower bound is $11,945,011, approximately $8 million less than the disallowance amount. See AZ Ex. 17, App. B; AZ Ex. 21, at 2.
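The arithmetic of the State's ratio estimate can be reproduced as a rough check. The sketch below uses only the figures quoted above. The population total is entered as $123.6 million because the decision gives only an approximation, so the computed point estimate will differ slightly from EconLit's $24,953,704. The variable names are illustrative and do not come from the record.

```python
# Figures quoted in the decision (dollars of FFP).
sample_error_ffp = 6_503         # FFP paid in error for the sampled student-months
sample_total_ffp = 32_212        # total FFP paid for the sampled student-months
population_total_ffp = 123.6e6   # ASSUMPTION: the decision says only "approximately $123.6 million"

# Ratio estimator: share of sampled FFP dollars that was paid in error.
ratio = sample_error_ffp / sample_total_ffp

# Point estimate: apply that share to all FFP paid in the population.
ratio_point_estimate = ratio * population_total_ffp

print(f"ratio estimator: {ratio:.5f}")
print(f"point estimate:  ${ratio_point_estimate:,.0f}")
```

The confidence interval is not reproduced here because computing it would require the per-student-month data, which the decision does not set out.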
As noted, the OIG used a mean difference estimator, rather than a ratio estimator, to make its population estimates. CMS Ex. 1, ¶¶ 9-10. The mean difference estimator, which Dr. Smith described as "unbiased" (a characterization not disputed by the State), is the average FFP overpayment amount for the student-months in the sample. That estimator is derived by dividing the total amount of FFP paid in error for the sampled student-months ($6,503) by the number of student-months in the sample (100). The OIG multiplied that sample mean ($65.03) by the number of student-months in the population
Page 16
(528,543) to calculate a point estimate and confidence interval with a lower bound of approximately $19.9 million. See CMS Ex. 4, at 2; AZ Ex. 21, at 2.
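For comparison, the OIG's point estimate follows from the three figures stated above. This is a minimal sketch of that arithmetic only. The variable names are illustrative, and the lower bound of roughly $19.9 million is not recomputed because it depends on the variability of the individual sample items, which the decision does not list.

```python
# Figures quoted in the decision.
sample_error_ffp = 6_503           # FFP paid in error across the sample (dollars)
sample_size = 100                  # sampled student-months
population_size = 528_543          # student-months in the population

# Mean difference estimator: average overpayment per sampled student-month.
mean_error_per_unit = sample_error_ffp / sample_size

# Point estimate: scale the per-unit average up to the whole population.
mean_difference_point_estimate = mean_error_per_unit * population_size

print(f"mean overpayment per student-month: ${mean_error_per_unit:.2f}")
print(f"point estimate: ${mean_difference_point_estimate:,.2f}")
```

The result, about $34.37 million, corresponds to the $34.3 million figure the decision attributes to CMS's point estimate.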
According to the State and its consultant, EconLit, the ratio estimator "adjusts for the fact that the sample is not representative with respect to the amount paid (because it does not matter if the sample contains an unusually high number of large payments or an unusually high number of small payments)." AZ Br. at 10. The State further asserts that, because the OIG's overall objective was to estimate the absolute "dollar amount of improperly reimbursed dollars," it was appropriate to view the chosen sample as "clusters" of "dollars paid," rather than as "a simple random sample of student-months paid," and that viewing the sample as clusters of dollars paid "dictates the use of a ratio estimator." AZ Ex. 20, at AHC 0305, 0309; see also Sur-surreply at 2 (asserting that the OIG "chose to audit clusters (student-months) of claims, rather than individual claims"). In addition, the State submits that the ratio estimator yields a population estimate that is "more precise" than the estimate derived using the mean difference estimator, and that "[p]recision matters in this case because HHS/OIG's technique increases the risk that a single estimated improper payment amount and its associated confidence limits could be much too high or much too low." AZ Ex. 20, at AHC 0311.
In response to these assertions, Dr. Smith explained that a ratio estimator allows a statistician to account for known "auxiliary" information (here, that information is the amount of FFP paid for a student-month) when estimating the value of the population characteristic of interest (FFP paid in error). CMS Ex. 1, ¶ 33. Ratio estimation, said Dr. Smith, "can improve the precision of the point estimate when the auxiliary information is related to the measure under study." Id. Dr. Smith acknowledged that such improvement is theoretically possible in this case because "total paid amounts for the student-month are likely related to the total error amounts." Id. However, Dr. Smith asserted that it is "not necessary" that the point estimate be as accurate and efficient as possible given CMS's reliance on the lower limit of the confidence interval, CMS Ex. 8, ¶ 9, and that any potential improvement in precision obtained from a ratio estimator does not affect the "reliability" of the lower-bound estimate supporting the disallowance:
[Sample] [d]esigns with worse precision tend to result in more conservative lower limits. The reason for this relationship is that the lower limit is calculated by subtracting a measure of precision from the point estimate. When the precision is worse, the amount subtracted is greater. It follows that at the time the sample was pulled, the choice to use the difference estimator was more conservative than the ratio estimator. The fact that the lower limit of the difference estimator is higher or lower than alternatives calculated for any given sample does not implicate the reliability of the limit.
Page 17
. . . Textbooks commonly refer to the precision of the point estimate; however, they do not refer to the precision of the lower limit. For example, Cochran, 1977, and Thompson, 2012, do not provide any equations for the calculation of the precision of the lower limit and do not mention the concept anywhere in the text. [The State] also has not provided any calculations comparing the precision of the two lower limits. Even if [the State] provided such calculations, it is not clear why they would be meaningful given that the lower limit is meant to provide an estimate that tends to be conservative regardless of the variability of the sample. Moreover, both the difference estimator and the ratio estimator have the same confidence level and thus are likely to be less than the actual overpayment amount the same percentage of time.
The issue with the concept of the precision of the lower limit is apparent in EconLit's statement that the 'precision matters in this case because HHS/OIG's technique increases the risk that a single estimated improper payment amount and its associated confidence limits could be much too high or much too low.' The possibility that the confidence limits are too low works in favor of [the State] because it would result in [the State] repaying a substantially smaller amount to CMS than it was actually overpaid. Moreover, the risk that the limit is too high is captured with the confidence level and is known to be small.
CMS Ex. 8, ¶¶ 13-15 (citations and paragraph numbers omitted); see also id., ¶ 25 (stating the "reliability of the lower limit of the difference estimate can be proven regardless of the properties of the ratio estimate," and thus it is "of secondary importance whether the ratio estimate is in fact unbiased or more precise than the CMS approach for the current population"); CMS Ex. 1, ¶ 31 (stating that "precision is a measure of the point estimate and does not impact the confidence associated with the lower limit that CMS is using for the overpayment calculation in this case").
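Dr. Smith's observation that the lower limit is the point estimate minus a measure of precision can be illustrated with a normal-approximation interval. This is a sketch under stated assumptions: the material quoted here does not say which critical value or variance formula the OIG applied, and the point estimate and standard errors below are hypothetical round numbers chosen only to show the direction of the effect. The sketch also holds the point estimate fixed, whereas the two estimators in this case produced different point estimates.

```python
from statistics import NormalDist


def lower_limit(point_estimate, standard_error, confidence=0.90):
    """Lower limit of a two-sided normal-approximation confidence interval."""
    # A two-sided 90 percent interval leaves 5 percent in each tail,
    # so the critical value is the 95th percentile of the standard normal.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return point_estimate - z * standard_error


# HYPOTHETICAL inputs, not taken from the record.
point = 30_000_000
more_precise = lower_limit(point, standard_error=5_000_000)
less_precise = lower_limit(point, standard_error=9_000_000)

print(f"lower limit, smaller standard error: ${more_precise:,.0f}")
print(f"lower limit, larger standard error:  ${less_precise:,.0f}")
```

With the point estimate held constant, the less precise design yields the smaller (more conservative) lower limit, which is the relationship the quoted passage describes.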
Dr. Smith made several other points in response to the State's claim that the OIG used an inappropriate estimator. First, he asserted that defining the sampling unit as a student-month (rather than as a cluster of "dollars paid") and selecting a random sample of student-months were valid sampling methods and sufficed to produce an "unbiased" estimate of the relevant population parameter. CMS Ex. 1, ¶¶ 10, 25, 27-28 (explaining that the OIG's sampling unit satisfied accepted criteria for drawing a simple random sample). "[T]he fact that each student-month can be broken down to multiple services or dollar values," said Dr. Smith, "does not undermine the unbiased nature of the sample as long as the total [dollar-amount] for each student-month is finite." Id., ¶ 10; see also CMS Ex. 8, ¶ 21 (stating that defining the sampling unit to match the "quantity measured" (dollars) is "not required by the proofs underlying finite population sampling").
Page 18
Regarding the State's assertion that a ratio estimator was the only appropriate estimator for studies in which the sampling units are "clusters" of payments or of paid claims, Dr. Smith stated that "there is no single estimator that the meets the definition of a 'clustered sample calculation.'" CMS Ex. 1, ¶ 32. "When the divisible sample units [such as student-months] are selected using a simple random sample," said Dr. Smith, "the standard equations used for calculating point estimates and confidence intervals are unchanged by the fact [that] the sample units are divisible." Id., ¶ 30. "For example, the mean difference estimator (the equation used by the OIG) would remain the same whether the sample is called a clustered sample or a simple random sample." Id.
Dr. Smith further stated that the OIG does not use a ratio estimator because it can be "biased," and that "the lower limit of the ratio estimator tends to be less reliable than the lower limit of the difference estimator." Id., ¶ 22. Dr. Smith stated that the ratio estimator's possible bias "is described in Section 6.8 of Sampling Techniques (Cochran 1977)." Id.
Finally, Dr. Smith asserted that the State failed to show that the ratio estimator was actually "unbiased or more precise" than the mean difference estimator for the OIG's audit. CMS Ex. 8, ¶¶ 25-30 (stating, in paragraph 28, that "[w]hile it is possible that the ratio estimate is unbiased for the current data, the point has not been proven and certainly could not have been shown at the time that the sample was designed").
Having carefully considered Dr. Smith's declarations and the EconLit reports, we are unpersuaded that the OIG used an incorrect or inappropriate estimator to project its sample findings to the population. Key assertions by EconLit lack foundation, allowing Dr. Smith's opinions on the subject, which we find facially plausible, to stand unrebutted. For example, EconLit cited no literature, statistical theory, or evidence of accepted norms of audit sampling to support its suggestion that it was necessary for the OIG to use a ratio estimator merely because the sampling units could be viewed, or should have been originally defined, as "clusters" of "dollars paid." See AZ Ex. 20, at AHC 0305; see also AZ Ex. 11, at AHC 0199-200 (presenting the argument that the sample data should have been analyzed as a "cluster sample").
Also unfounded is the State's principal claim that a ratio estimator must be used to correct for a nonrepresentative sample.22 EconLit cited the following literature to support that claim: Cochran, William G., Sampling Techniques (3rd ed. 1977) at 249; Schaeffer, Richard L., et al., Elementary Survey Sampling (3rd ed.) at 205-06; and Kish, L., Survey Sampling at 204. AZ Ex. 17, at AHC 0235; AZ Ex. 20, at AHC 0309. CMS
Page 19
provided copies of these cited pages as attachments to Dr. Smith's second declaration. CMS Ex. 8, at 15-18. As Dr. Smith observed, id., ¶ 18, none of the cited pages mentions nonrepresentative samples or indicates that a ratio estimator must be used to address potential differences between the sample and the population. Those pages, Dr. Smith stated, merely confirm that "precision improvements [in the point estimate] are possible when there is a close relationship between the available auxiliary information [FFP paid for a student-month] and the quantity of interest [FFP paid in error]." Id., ¶ 19.
In addition, the State failed to substantiate its assertion that the ratio estimator is unbiased for the population of student-months defined by the OIG. The parties apparently agree that a ratio estimate is unbiased if two conditions (specified in Cochran, Sampling Techniques (3rd ed. 1977)) are met: (1) the "relationship between the paid amounts and error amounts is a straight line through zero"; and (2) "the variance of the paid amounts is proportional to the variance of the error amounts." CMS Ex. 8, ¶ 26; see also AZ Ex. 20, at AHC 0306 (quoting from Cochran text). EconLit asserted that "[b]oth of these conditions are true for the population in the instant matter," AZ Ex. 20, at AHC 0306, but we see nothing in the record to back up that assertion.23 EconLit did not, in particular, provide statistical or other quantitative analysis showing that the relationship between FFP payments and FFP overpayments is a "straight line" – that is, a relationship of direct proportionality or perfect correlation24 – nor did EconLit demonstrate that the variance of paid amounts is proportional to the variance of payment-error amounts.25 EconLit asserts in its Rebuttal Report that it provided evidence of "high correlation" – a strong linear relationship – between "improper payments and claimed payments." AZ Ex. 20, at AHC 0310. But merely "high" correlation does not appear to satisfy the condition set out in the Cochran textbook, which is that the relationship be one of perfect correlation (a straight line). Furthermore, EconLit did not, so far as we can determine, measure the correlation between a student-month's improper (disallowed) and claimed payments. Instead, EconLit measured the "correlation between the percent of reimbursed dollars . . . disallowed in a student-month and the total number of dollars reimbursed" in the month; found that these two variables were "effectively not correlated" (with a correlation coefficient of negative 0.068); and then concluded – without support or reasoning –
Page 20
that this relationship confirms that, "on average, disallowed payments are a fixed percent of claimed payments."26 AZ Ex. 17, at AHC 0233-34.
The State contends that "[t]he question [for the Board] should be which lower bound [estimate]" – the mean difference estimate or the ratio estimate – "bears the closest relationship to the unknown 'actual overpayment,' not whether CMS may think some other number is 'conservative' or 'acceptable' enough for its purposes." Sur-surreply at 6 (italics added). If one applies that standard – together with the principle that the "point estimate" is ordinarily the best estimate of the true-but-unknown population parameter of interest given the relevant study design (see CMS Ex. 1, ¶ 12) – then the lower bound of the confidence interval determined by the OIG (approximately $19.9 million) is preferable to the State's lower-bound estimate ($11.9 million) because the former is closer to either party's point estimate of the "actual overpayment" – CMS's point estimate based on the adjusted mean difference estimator ($34.3 million), and the State's point estimate based on the ratio estimator ($24.9 million). AZ Ex. 21, at 2.
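The comparison in the preceding paragraph reduces to four subtractions. The sketch below restates it using the rounded figures given in that paragraph. The dictionary keys and variable names are illustrative labels, not terms from the record.

```python
# Rounded figures from the decision (dollars).
lower_bounds = {"OIG": 19_900_000, "State": 11_900_000}
point_estimates = {
    "mean difference (CMS)": 34_300_000,
    "ratio (State)": 24_900_000,
}

closest = {}
for estimator, point in point_estimates.items():
    # Distance from each party's lower bound to this point estimate.
    gaps = {party: abs(point - bound) for party, bound in lower_bounds.items()}
    closest[estimator] = min(gaps, key=gaps.get)
    print(f"{estimator}: gaps = {gaps}, closest lower bound = {closest[estimator]}")
```

Under either point estimate, the OIG's lower bound lies nearer than the State's (gaps of $14.4 million versus $22.4 million, and $5.0 million versus $13.0 million).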
The State also contends that its lower-bound estimate is "more conservative" than the OIG's, Sur-surreply at 6, but the estimate cannot be considered conservative given the State's failure to verify the ratio estimator's unbiasedness. Furthermore, even assuming for the moment that the State's lower-bound estimate is more conservative than the OIG's (and the State has not shown that its estimate is more conservative), CMS was under no obligation to choose the more conservative of two conservative estimates of the actual FFP-overpayment amount given that either estimate is likely to be less than the true amount of unallowable costs in the population. See CMS Ex. 8, ¶ 16 (noting that "the lower limits of the ratio and difference estimator will vary in their exact value" upon repeated sampling but that one "can expect that both [estimates] will tend to be less than the actual overpayment amount in a majority of cases"). Given that the OIG's lower-bound estimate was derived using valid sampling and estimation techniques and is unbiased, the State cannot prevail merely by touting a lower interval estimate derived from a sample statistic (the ratio estimator) whose suitability has not been adequately
Page 21
demonstrated. Cf. N.J. Dep't of Human Services at 10 (noting that statistical results from a given "sample design" can be valid even if the design is not "the most appropriate one").
Finally, we note that despite its assertion that "precision matters" in choosing the estimator, the State failed to provide any statistical calculations or analysis to support its claim that the population point estimate generated by the ratio estimator was actually more precise than the point estimate yielded by the mean difference estimator. See AZ Ex. 20, at AHC 0311 (asserting, without supporting statistical analysis, that the ratio estimator is actually "more precise" and not just "potentially" more so). Dr. Smith stated that Cochran, Sampling Techniques (3rd ed. 1977) "defines the inequality that must hold in the sampling frame for the ratio estimate to be more precise than the difference estimator (referred to here as the 'mean per unit' estimate)." CMS Ex. 8, ¶ 29. "To argue that the ratio estimate is certainly more precise than the difference estimator," said Dr. Smith, "EconLit would have to show that the sample provides incontrovertible evidence that the [cited] inequality holds for the population, not just the sample." Id. However, EconLit's reports do not, as Dr. Smith accurately observed, "mention the inequality" described in the Cochran text, "much less demonstrate with certainty that the inequality holds for the population or could have been proven at the sample design phase." Id.
2. The State failed to carry its burden to show that certain expenditures for speech therapy were allowable.
We now consider the State's objection to the OIG's finding that certain school-based speech therapy services were ineligible for FFP.
Title 42 C.F.R. § 440.110(c) establishes conditions under which a state Medicaid program may receive FFP for "[s]ervices for individuals with speech . . . disorders."27 One of those conditions is that the services must have been provided "by or under the direction of a speech pathologist." 42 C.F.R. § 440.110(c)(1). A "speech pathologist" is defined in the regulation as "an individual who meets one of the following conditions":
(i) Has a certificate of clinical competence from the American Speech and Hearing Association[;]
(ii) Has completed the equivalent educational requirements and work experience necessary for the certificate[; or]
(iii) Has completed the academic program and is acquiring supervised work experience to qualify for the certificate.
Page 22
Id. § 440.110(c)(2).28
The OIG found that school-based speech therapy services provided during 12 sampled student-months were ineligible for FFP because the services were not provided by, or "under the direction of," a "speech pathologist" in accordance with section 440.110(c).29 AZ Ex. 6, at AHC 0137-0138. The State has the burden to show that this finding is, in whole or part, "legally or factually unjustified." L.A. Cty. Dep't of Public Health, DAB No. 2842, at 6 (2018) (internal quotation marks omitted); see also Pa. Dep't of Human Servs., DAB No. 2835, at 5 (2017) (noting that a grantee in a disallowance appeal has the "burden to document" the allowability of FFP claims questioned by the federal agency). It did not carry that burden. To begin, the State does not contend that the speech therapists whose services the OIG flagged as unallowable were "speech pathologists" as defined in section 440.110(c)(2). See AZ Br. at 13-16; Reply at 8-12. In other words, the State does not contend that the therapists possessed the professional credential (American Speech and Hearing Association certificate) or met the alternative academic and work requirements specified in that regulation. Instead, the State asserts that CMS "accepted" state licensure under section 36-1940.01 of the Arizona Revised Statutes (which provides for the licensing of "speech-language pathologists") "as meeting the federal standard" in section 440.110(c)(2). AZ Br. at 14, 21.
There is no evidence of any such "acceptance" by CMS, legally binding or otherwise. While Arizona's State plan (as it existed during the audit period) stated that a school-based speech therapy provider had to be a state-licensed "speech-language pathologist," the plan did not state or imply that state licensure sufficed to make a therapy provider's services eligible for FFP. To the contrary, the State plan stated that a state-licensed provider's services would be "covered in accordance with the requirements in 42 CFR § 440.110." AZ Ex. 2, at AHC 0088. That statement is most reasonably read to mean that school-based speech therapy furnished by a state-licensed provider would be covered only if the provider met the federal definition of a "speech pathologist" or if the provider acted under the direction of a speech pathologist. The State does not claim that it has – or that it has ever had – a different reasonable understanding of that particular State-plan language. Furthermore, the legal significance of state licensure is an academic issue in this case because the State failed to submit proof that the therapists whose services are
Page 23
implicated by the audit findings were state-licensed "speech-language pathologists" under section 36-1940.01 of the Arizona Revised Statutes. During the audit, the State gave the OIG copies of some state-issued credentials (including "limited licenses" issued by the Arizona Department of Health Services), but it was the State's obligation to produce those materials in this proceeding if it wished to rely upon them.
There being no allegation, much less evidence, that the questioned speech therapy services were provided "by" persons meeting the federal definition of a speech pathologist, those services are eligible for FFP only if they were provided "under the direction of" a speech pathologist meeting that definition. To prove that the services met that alternative condition, the State proffered letters written by LEA officials who averred that speech therapy delivered to students during the audit period was "monitored," "coordinated," "reviewed," supported, or "supervised" by a speech pathologist. See AZ Ex. 8. All of the letters (with one exception) bear a date in September or October 2010,30 after the OIG issued its final report and several years after the questioned services were performed.
The LEAs' letters are insufficient. They contain only bare uncorroborated assertions of compliance with section 440.110(c). They give no details about the questioned services (such as when and to whom they were provided) and fail to identify the speech pathologists allegedly responsible for directing those services. The letters also fail to describe the role that the speech pathologists actually played in the students' care beyond using generalities such as "coordination" or "supervision." In addition, the letters do not indicate that the attesting officials based their statements on a review of treatment files, written program policies and procedures, or other records documenting how the billed speech therapy services were planned and delivered. And the officials do not seek to explain their organizations' apparent inability to produce such records.
A grantee must ordinarily support a claim for FFP in a Medicaid service with appropriate "contemporaneous" documentation – documentation created in the normal course of business around the time that the service was provided or when payment for the service
Page 24
was made.31 The State has given us no reason to excuse the State's failure to produce such documentation and take the LEAs' factual assertions at face value.
Even if those assertions are true, they fail to establish compliance with section 440.110(c). In Maryland Dep't of Health and Mental Hygiene, the Board held that the requirement that speech therapy be provided "under the direction of" a speech pathologist (when the speech pathologist is not the direct service provider) "is not satisfied by a showing that the speech therapist worked under the general supervision of a speech pathologist." DAB No. 2090, at 2 (2007). To demonstrate compliance with that condition, a state must proffer evidence that the speech pathologist in some way "directed . . . the provision of services to the particular student." Id. None of the LEAs' letters states that a speech pathologist in any sense "directed" – that is, gave orders or instructions about – the speech therapy provided to specific students. At most the letters establish that speech pathologists generally supervised the therapists who delivered school-based services or provided comparable degrees of oversight or "coordination."
As a last resort, the State contends that the disputed speech therapy services are not subject to section 440.110(c) because they were furnished under Medicaid's Early and Periodic Screening, Diagnostic, and Treatment (EPSDT) benefit. AZ Br. at 18-20; Reply at 12-16. We see no merit in this contention.
The EPSDT benefit provides comprehensive health care services for individuals under age 21 who are enrolled in Medicaid. See 42 C.F.R. §§ 440.40(b), 441.50, 441.56. The Medicaid statute requires that a "State plan for medical assistance" provide or arrange for EPSDT services, as defined in section 1905(r). Act § 1902(a)(43), 1905(a)(4)(B); see also 42 C.F.R. § 441.55. Section 1905(r) defines "early and periodic screening, diagnostic, and treatment services" to include screening, vision, dental, and hearing services – plus any other health care items and services falling within section 1905(a)'s
Page 25
definition of medical assistance that are "necessary" to "correct or ameliorate" conditions discovered during screening, "whether or not such services are covered under the State plan" for EPSDT-ineligible persons.
Contrary to the State's implication, section 440.110 makes no distinction between speech therapy provided under the EPSDT benefit and speech therapy provided under other medical assistance categories. Section 440.110 is found in subpart A of 42 C.F.R. Part 440. Section 440.2(b) of subpart A states that "FFP is available in expenditures under the State plan for medical or remedial care and services as defined in" section 440.110 and other sections of that subpart. In light of that statement of general purpose, section 440.110 is properly read to govern the availability of FFP for any "expenditures under the State plan" for speech therapy. EPSDT services are, as noted, a mandatory state-plan benefit that encompasses "necessary" corrective or ameliorative speech therapy services (regardless of whether such services are available to adult (21 years or older) Medicaid-eligibles). Because EPSDT services are a mandatory state-plan benefit, expenditures for necessary speech therapy services provided under that benefit constitute "expenditures under the State plan" whose eligibility for FFP is governed by section 440.110.
Section 440.110 is also applicable here because Arizona's Medicaid plan (as it existed during the audit period) expressly required compliance with that regulation as a condition for claiming FFP for school-based speech therapy provided under the EPSDT benefit. Paragraph 4.b of Attachment 3.1-A (effective July 1, 2000) of the State plan specified the services, including outpatient speech therapy, generally available to "EPSDT recipients" (persons younger than 21 years). AZ Ex. 2, at AHC 0086-87. Paragraph 4.b further provided that, in accordance with an agreement between the State and the Arizona Department of Education, the State "will reimburse LEAs on a fee-for-service basis for a defined set of Medicaid covered services . . . provided by a qualified school-based provider to students who are Title XIX [Medicaid] eligible and eligible for school health and school-based services pursuant to the Individuals with Disabilities Education Act (IDEA), Part B." Id. at AHC 0087 (italics added). The plan then specified the "reimbursable" school-based services available to qualified students (those eligible under both Medicaid and the IDEA). Id. at AHC 0087-89. Those reimbursable services included outpatient speech therapy services "covered in accordance with the requirements in 42 C.F.R. § 440.110." Id. at AHC 0088.
Section 1903(a)(1) of the Act authorizes FFP for "amount[s] expended . . . as medical assistance under the State plan." Because school-based speech therapy provided as an EPSDT benefit during the audit period constituted medical assistance under the existing State plan, see Act § 1905(a)(4)(B) and 1905(r), and because the State plan called for school-based speech therapy services to be provided "in accordance with" section 440.110, the State's expenditures for those services were eligible for FFP only if they met that regulation's conditions. Me. Dep't of Health and Human Servs., DAB No. 2292, at 10 (2009) ("A State's expenditures are eligible for federal Medicaid reimbursement only
Page 26
if they are made in accordance with the state plan."), aff'd, Me. Dep't of Human Servs. v. U.S. Dep't of Health & Human Servs., 766 F. Supp. 2d 288 (D. Me. 2011); La. Dep't of Health and Human Resources, DAB No. 979 (1988) (upholding a disallowance of FFP for transportation services that did not meet state-plan requirements).
Conclusion
For the reasons outlined above, we sustain CMS's June 26, 2018 determination to disallow $19,923,489 in FFP for Arizona's Medicaid program.
Christopher S. Randolph Board Member
Constance B. Tobias Board Member
Susan S. Yim Presiding Board Member

1. The OIG's findings are set out in a March 2010 report titled "Review of Arizona's Medicaid Claims for SchoolBased Health Services." AZ Ex. 6.
 back to note 1 2. The current version of the Social Security Act can be found at http://www.socialsecurity.gov/OP_Home/ssact/ssact.htm. Each section of the Act on that website contains a reference to the corresponding United States Code chapter and section. Also, a crossreference table for the Act and the United States Code can be found at https://www.ssa.gov/OP_Home/comp2/GAPPH.html.
 back to note 2 3. The State pays the LEA the federal share of the LEA's costs of providing Medicaidcovered schoolbased health services; the LEA is responsible for the nonfederal share of those costs. See AZ Ex. 6, at AHC 0133.
 back to note 3 4. "Sampling units are the elements that are selected based on the chosen method of statistical sampling." AZ Ex. 19, at AHC 0292.
 back to note 4 5. The OIG excluded studentmonths that had a net claimed amount of zero or a netnegative claimed amount, as well as studentmonths for certain students that CMS had previously reviewed and studentmonths that the State's Office of Program Integrity had previously reviewed. AZ Ex. 6, at AHC 0144.
 back to note 5 6. A "simple random sample" is a subset of a population that is selected in such a way that each sampling unit has an equal probability of being selected. See CMS Ex. 1, ¶ 27; Fed. Judicial Ctr., D. Kaye & D. Freedman, Reference Guide on Statistics (FJC Ref. Guide), at 297 (available at https://www.fjc.gov/content/referenceguidestatistics2).
 back to note 6 7. The 1,989 services consisted of: 920 transportation services; 732 nursing services (including 620 health aide services); 328 occupational, physical, and speech therapy services; and 9 behavioral health services. AZ Ex. 6, at AHC 0134.
 back to note 7 8. The OIG cited various grounds for its findings that claimed services were unallowable, including inadequate documentation, student ineligibility, and service nonperformance. AZ Ex. 6, at AHC 013640.
 back to note 8 9. A point estimate is a single value that, given the chosen sampling and estimation approach, represents the "best estimate" of the population parameter of interest (which, in this case, is the amount of FFP provided for unallowable schoolbased health services). CMS Ex. 8, ¶ 12.
 back to note 9 10. A confidence interval is an estimate – computed from sample data and expressed as a range of values –that is believed with a preassigned level of confidence to include the true but unknown population value of interest. See Ok. Dep't of Human Servs., DAB No. 1436, at 4 (1993); Pa. Dep't of Public Welfare, DAB No. 1278, at 8 n.8 (1991); FJC Ref. Guide at 284 (a confidence interval is "[a]n estimate, expressed as a range, for a parameter"). The interval's confidence level "indicates the percentage of the time that intervals from repeated samples [of the target population] would cover the true value." FJC Ref. Guide at 247. A confidence interval shows the magnitude by which a samplebased estimate is likely to differ from the true but unknown population value because of random error (also called "sampling error" or "chance error") introduced by the sampling process. Id. at 240 n.83, 241, 24346 (explaining that an estimate of the population parameter based on a probability sample will differ from the exact population value because of random error), 296 (defining "sampling error" by stating that an estimate of a population is likely to be different from the true but unknown population value "because the sample is not a perfect microcosm of the whole," and that "[i]f the estimate is unbiased, the difference between the estimate and the exact value is sampling error").
11. The Milliman report (AZ Ex. 7) is titled "Statistical Issues Concerning the Arizona Health Care Cost Containment System Audit of School-Based Services by the Office of Inspector General."
12. On September 26, 2019, the Board informed the parties that it would address CMS's motion to exclude EconLit's Second Rebuttal Report in its decision.
13. "A 90% confidence interval means that there is a 10% probability that the true value of the error rate falls outside the confidence interval; or a 5% probability that the true value is greater than the upper limit or bound of the confidence interval, and a 5% probability that it is below the lower limit. Thus, [if] the disallowance [is] based on the lower limit of the confidence interval, . . . there [is] a 95% probability that the true value is above the lower limit." Ok. Dep't of Human Servs. at 6.
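The repeated-sampling reading of the passage quoted in note 13 can be checked with a small simulation. The sketch below is illustrative only: the population (normally distributed error amounts with a hypothetical mean of 65 and standard deviation of 150), the sample size, the number of trials, and the normal-approximation interval are all assumptions, not figures or methods taken from the record.

```python
import random
from statistics import NormalDist, mean, stdev

def lower_bound_90(sample):
    """Lower limit of a two-sided 90% interval (normal approximation)."""
    z = NormalDist().inv_cdf(0.95)            # about 1.645
    se = stdev(sample) / len(sample) ** 0.5   # standard error of the sample mean
    return mean(sample) - z * se

def share_above_lower_bound(true_mean=65.0, sd=150.0, n=100, trials=2000, seed=1):
    """Fraction of repeated samples in which the true mean exceeds the lower limit."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # HYPOTHETICAL population: normal error amounts, not data from the audit.
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        if lower_bound_90(sample) < true_mean:
            hits += 1
    return hits / trials

share = share_above_lower_bound()
print(f"true value above the lower limit in {share:.1%} of samples")
```

Under these assumptions the share should land near 95 percent, the figure the quoted passage gives for the lower limit of a 90 percent interval.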
14. CMS must also show that statistical sampling and estimation were appropriate tools for the OIG to use in auditing the State's FFP claims for school-based health services. Mid-Kansas Cmty. Action Program at 4. Those tools were appropriate in this case given the impracticality of reviewing the vast number of school-based health services claims made during the 30-month audit period. P.R. Dep't of Health at 5. The State does not assert that it was inappropriate for the OIG to use statistical sampling and estimation in order to achieve its stated audit objective.
15. As noted, the mean difference estimator was calculated by dividing the total amount of unallowable FFP identified across all sampled student-months ($6,724, later reduced to $6,503) by the number of sampled student-months (100). See CMS Ex. 1, ¶ 9 (stating that the estimator was the "average error amount" in the population).
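The arithmetic described in note 15 can be sketched as follows. The two dollar totals and the sample size of 100 come from the note. The frame size used for the extrapolation step is a hypothetical placeholder, and multiplying the per-unit average by the frame size is assumed here as the usual way such an average is projected to a population total; neither is taken from this note.

```python
def mean_difference_estimator(total_unallowable, sampled_units):
    """Average unallowable FFP per sampled student-month."""
    return total_unallowable / sampled_units

original = mean_difference_estimator(6724, 100)  # total before the reduction
revised = mean_difference_estimator(6503, 100)   # total after the reduction

# HYPOTHETICAL frame size, for illustration only; not a figure from the record.
hypothetical_frame_size = 10_000
point_estimate = revised * hypothetical_frame_size

print(f"original estimator: ${original:.2f} per student-month")
print(f"revised estimator:  ${revised:.2f} per student-month")
print(f"illustrative point estimate: ${point_estimate:,.2f}")
```

The per-unit figures ($67.24 and $65.03) follow directly from the note; the final line only shows the shape of the extrapolation, not the amount at issue in the appeal.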
16. "Bias" in the statistical sense means a "systematic tendency for an estimate to be too high or too low." FJC Ref. Guide at 283 (italics added). According to Dr. Smith, "an estimate based on a sample is unbiased if, on average across potential samples [drawn from the population], it will not benefit CMS or a state when compared to a review of all items." CMS Ex. 1, ¶ 9; see also EconLit Rebuttal Report, AZ Ex. 20, at 3 n.3 (stating that "[a] method of estimation is unbiased if the average value of the estimate, taken over all possible samples of a given size n, is exactly equal to the true population value" (citing Cochran, William G., Sampling Techniques (3rd ed.) at 22)). "While an unbiased point estimate exactly equals the actual repayment total [the unallowable amount] on average across potential samples, [the estimate] may be higher or lower than the actual [FFP] overpayment for any given sample" because of random error introduced by the sample selection process. CMS Ex. 1, ¶ 11; FJC Ref. Guide at 243, 246 (noting that an estimate based on a sample will differ from the true population value because of random error) and 296 (defining the term "sampling error," also known as random error).
17. In support of that assertion, the State quoted the following passage from EconLit's February 12, 2018 report:
18. "The precision of an estimate is usually reported in terms of the standard error [which can be calculated for the sample] and a confidence interval." FJC Ref. Guide at 241. Standard error, also called standard deviation, "gives the likely magnitude" of random error in the sample-based estimate of the measured parameter, "with smaller standard errors indicating better [more precise] estimates." Id. at 243. A confidence interval is calculated by adding to, or subtracting from, the point estimate a suitable multiple of the standard error, which means that a low-precision estimate will yield a confidence interval that is wider (has upper and lower bounds further away from the point estimate) than a comparable confidence interval around a higher-precision point estimate. See id. at 244; CMS Ex. 1, ¶¶ 13-14 (noting that the lower limit is calculated "by subtracting a measure of the precision of the sample"); HHS Office of Inspector General, Statistical Sampling: A Toolkit for MFCUs (2018), cited by Dr. Smith (CMS Exs. 1, ¶ 6 and 8, ¶ 6) and available at https://oig.hhs.gov/fraud/medicaidfraudcontrolunitsmfcu/files/MFCU%20Sampling%20Guidance%20Final.pdf (noting that precision "reflects how far away the upper and lower limits are expected to be from the point estimate" and that "[t]he worse the precision, the less meaningful the point estimate will be").
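The relationship described in note 18 between the standard error and the width of a confidence interval can be sketched as follows. The point estimate and standard errors are hypothetical values chosen for illustration, and the multiplier is taken from the standard normal distribution as a simplifying assumption; the record does not state that the OIG used this multiplier.

```python
from statistics import NormalDist

def confidence_interval(point_estimate, standard_error, confidence=0.90):
    """Two-sided interval: point estimate plus or minus a multiple of the standard error."""
    # Multiplier for a two-sided interval (about 1.645 at the 90% level).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return point_estimate - margin, point_estimate + margin

# HYPOTHETICAL inputs: same point estimate, different standard errors.
precise = confidence_interval(100.0, 5.0)
imprecise = confidence_interval(100.0, 20.0)

print("higher-precision interval:", precise)
print("lower-precision interval: ", imprecise)
```

With the same point estimate, the larger standard error pushes both bounds further from the center, so a disallowance set at the lower limit is smaller when the estimate is less precise.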
19. The State asserts that the OIG erroneously "overstated" the lower-bound estimate by using an "unrepresentative sample." Sur-surreply at 5. We address that assertion in the next section.
20. See also FJC Ref. Guide at 226 ("Probability sampling ensures that within the limits of chance . . ., the sample will be representative of the sampling frame."), 230 (noting that "randomness in the technical sense . . . provides assurance of unbiased estimates from a . . . probability sample"), and 295 (indicating that "representative sample" is "[n]ot a well-defined technical term" but that it generally means a "sample judged to fairly represent the population, or a sample drawn by a process likely to give samples that fairly represent the population, for example, a large probability sample").
21. As the term itself implies, cluster sampling, which is a type of random sampling, involves the use of clusters – definable subparts of a population. Sampling households at random, and then interviewing all people in the selected households, would be an example of a cluster sample of people. FJC Ref. Guide at 284 (defining "cluster sample"); see also AZ Ex. 8, at AHC 0291 (explaining that a sampling unit can be a "cluster of claims, as, for example, the claims associated with a patient" or the claims associated with a treatment-day). Like simple random sampling, cluster sampling is a sampling procedure, but the State does not object to the procedure for drawing the sample. Also, although the State maintains that the disallowance should be "recomputed" in accordance with its proposed "cluster sample analysis" (ratio estimation), it states that the new computation can and should be performed without "further" or "new" sampling. AZ Br. at 11, 21; Reply at 17.
22. The State also failed to offer any statistical or mathematical analysis showing precisely how the ratio estimator "adjusts" for a sample's nonrepresentativeness. See AZ Ex. 20, at AHC 0309, 0311, 0313.
23. The State does not discuss the potential bias of the ratio estimate in its reply or sur-surreply briefs or assert that the EconLit reports admitted to the record show that the conditions for unbiasedness were met.
24. Correlation is the measure of the "linear association" of two variables, with perfect correlation represented by observations that fall on a straight line with a positive or negative slope. FJC Ref. Guide at 261-62.
25. Variance is a measure of the deviation between the mean (average) value of a variable and the actual observed values of the variable. See FJC Ref. Guide at 240 (indicating that variance is a measure of the deviation from the mean).
26. EconLit states that a "regression analysis" of the disallowed amounts "versus" the claimed amounts in the sample shows that, on average, the disallowed amount increases by 14.4 cents for every dollar claimed. AZ Ex. 17, at AHC 0234 & n.6. That result, according to EconLit, proves that, on average, the disallowed amount is a "fixed proportion" of the claimed amount, meaning that "every increase of $1 in the claimed amount is expected to increase the disallowed amount by $0.14." Id. Dr. Smith questions EconLit's apparent position that "the relationship between paid amounts and errors amounts [or disallowed amounts] is a straight line based on fitting a linear regression model." CMS Ex. 8, ¶ 27. According to Dr. Smith, a linear regression model "can be fit on any dataset regardless of whether the relationship between the two variables is linear," and "one must compare the fit of a linear model to an alternative model that includes potential nonlinearity" to show that the data are linear. Id. In any case, Dr. Smith continued, even if the "appropriate test" is performed, "it is generally not possible to prove one way or the other whether the ratio estimate will be biased using only sample data." Id. On rebuttal (AZ Ex. 21), EconLit does not dispute Dr. Smith's statement that it is generally not possible to determine if the ratio estimator is or is not biased using only sample data; nor does EconLit assert or show that its regression analysis is adequate proof that the ratio estimator is unbiased.
27. Section 440.110 is found in subpart A of 42 C.F.R. Part 440. Section 440.2(b) of subpart A states that "[e]xcept as limited in [42 C.F.R.] part 441, FFP is available in expenditures under the State plan for medical or remedial care services as defined in" the provisions of subpart A.
28. A second condition for federal reimbursement is that the patient must have been "referred [for speech therapy] by a physician or other licensed practitioner of the healing arts within the scope of his or her practice under State law." 42 C.F.R. § 440.110(c)(1). The OIG found that the paid speech therapy services reflected in sample student-month 16 were ineligible for FFP because they were furnished to a student who lacked the required referral. AZ Ex. 6, at AHC 0135, 0139, 0147 (line 16, column 5). The State does not contest that finding, and we therefore do not address it.
29. The speech therapy services identified in the OIG's report as ineligible for FFP were associated with student-month numbers 10, 16, 36, 40, 48, 53, 63, 84, 89, 91, 92, and 96. AZ Ex. 6, at AHC 0147-49.
30. One letter is not dated, but refers to speech therapy services that were the subject of OIG's audit "in 2005 and 2006." AZ Ex. 8, at AHC 0170. This letter, like the dated letters, is insufficient for the reasons we explain in the text.
31. Pa. Dep't of Human Servs., DAB No. 2883, at 15 (2018) (noting the state's "obligation to have available contemporaneous records that a reasonable reviewer would find sufficient to verify its compliance with applicable federal requirements for reimbursement" (italics in original omitted)); Nat'l Alliance on Mental Illness, DAB No. 2612, at 7 (2014) (stating that the Board "generally will not rely on noncontemporaneous documentation as evidence to support claimed costs" and that such documentation "must be closely scrutinized"); S.E. Mich. Health Assoc., DAB No. 2682, at 9 (2016) ("source documentation created at the time a cost was incurred is more credible than any later reconstructions"); Md. Dep't of Health and Mental Hygiene at 2, 5, 15 (stating that "[n]oncontemporaneous documentation may not be used to establish that covered services were provided unless there is a basis to believe that such documentation is reliable" and rejecting "after-the-fact statement" that covered services had been provided on specific dates); Ind. Dep't of Public Welfare, DAB No. 772, at 3 (1986) (holding that an affidavit offered to prove a cost's allowability was insufficient because it contained "after-the-fact statements which were unsupported by any documentation from the time period in question, such as a job description, organizational charts, or actual evidence of work performed"); L.A. Cnty. Dep't of Public Health at 13 (holding that the grantee's declaration containing "after-the-fact" conclusions was "not an accurate and reliable substitute" for contemporaneous "source documentation"); Tenn. Protection and Advocacy, Inc., DAB No. 1454, at 4 (1993) (commenting on the grantee's failure "to offer a credible explanation for the absence of contemporaneous supporting records").
A critical flaw with HHS/OIG's extrapolated overpayment conclusion is its level of precision (i.e., +/‐ 40%). EconLit agrees with Milliman in that the level of precision of +/‐ 40% is far in excess of the more standard acceptable levels of precision of +/‐ 5% or +/‐ 10%. The large upper/lower bounds [of the 90 percent confidence interval] render the estimates for population values so imprecise that they are statistically meaningless and therefore unreliable (i.e., between $21.3 million and $50.2 million).
AZ Ex. 11, at AHC 0193 (emphasis in original) (quoted in AZ Br. at 8).