Opinion
Criminal Action No. 12–0298 (ES)
12-18-2017
Courtney A. Howard, James Brandon Nelson, Desiree Grace Latzer, Office of the U.S. Attorney District of New Jersey, Newark, NJ, Robert L. Frazer, Office of the US Attorney, Newark, NJ, Andrew Joseph Bruck, U.S. Attorney's Office, District of New Jersey, Newark, NJ, for United States of America. Richard Jasper, Michael Keith Bachrach, Susan K. Marcus, New York, NY, Richard Ware Levitt, Levitt & Kaizer, New York, NY, Stephen Turano, Newark, NJ, for Defendant.
Salas, District Judge
The United States charged Defendant Farad Roland with crimes that qualify him for possible imposition of the death penalty under 18 U.S.C. § 3591 through § 3598. (D.E. No. 273, Amended Notice of Intent to Seek the Death Penalty). Roland moved for a pretrial determination of intellectual disability under the Eighth Amendment and the Federal Death Penalty Act ("FDPA"), 18 U.S.C. § 3596(c), which provides that a "sentence of death shall not be carried out upon a person who is mentally retarded." (D.E. No. 453 ("Def. Mov. Br.")). The Government opposed Roland's motion. (D.E. No. 360 ("Gov. Opp. Br.")). The Court has thoroughly analyzed the extensive evidence (including testimony from seven expert witnesses and nine fact witnesses, and over 360 exhibits) presented during the eighteen-day evidentiary hearing and has carefully considered the parties' written and oral arguments. For the reasons that follow, the Court concludes that Roland has abundantly satisfied his burden of proving his intellectual disability by a preponderance of the evidence and is thus ineligible for the death penalty. Accordingly, the Government is precluded from seeking a sentence of death.
Some authorities refer to an intellectually disabled person as "mentally retarded." See, e.g. , 18 U.S.C. § 3596(c). More recent authorities, however, use the phrase "intellectually disabled" to avoid derogatory connotations of the label "mentally retarded." See Hall v. Florida , ––– U.S. ––––, 134 S.Ct. 1986, 1990, 188 L.Ed.2d 1007 (2014). This Court will likewise employ the term "intellectually disabled," unless necessary to quote another authority directly. The Court often refers to both "intellectual disability" and "intellectually disabled" as "ID." Unless otherwise indicated, all internal citations and quotation marks are omitted, and all emphasis is added.
I. FRAMEWORK FOR ASSESSING INTELLECTUAL DISABILITY
A. Legal Standard
In 1994, Congress enacted the FDPA, which provides that a "sentence of death shall not be carried out upon a person who is mentally retarded." 18 U.S.C. § 3596(c). The Supreme Court, in Atkins v. Virginia, later articulated the constitutional dimension to this prohibition, holding that in light of "our evolving standards of decency," executing the intellectually disabled violates the Eighth Amendment's ban on cruel and unusual punishment. 536 U.S. 304, 311, 321, 122 S.Ct. 2242, 153 L.Ed.2d 335 (2002). The Court recognized a national consensus that intellectually disabled persons are "categorically less culpable than the average criminal." Id. at 316, 122 S.Ct. 2242.
The Supreme Court categorically banned the death penalty as applied to individuals with ID for two additional reasons. First , the Court raised the "serious question" whether the deterrent or retributive justifications for the death penalty apply "to mentally retarded offenders." Atkins , 536 U.S. at 318–19, 122 S.Ct. 2242. Second , the Court stated that persons with ID "in the aggregate face a special risk of wrongful execution" because their reduced mental capacity increases the possibility of false confessions and hinders their ability to assist counsel, provide mitigation evidence, or testify on their own behalf. Id. at 320–21, 122 S.Ct. 2242. So, the Court concluded, given the impairments of ID individuals, executing them would not "measurably advance the deterrent or the retributive purpose of the death penalty." Id. at 321, 122 S.Ct. 2242.
The Atkins Court acknowledged the difficulties inherent in defining intellectual disability, but it did not define the condition. See id. at 317, 122 S.Ct. 2242 ("To the extent there is serious disagreement about the execution of mentally retarded offenders, it is in determining which offenders are in fact retarded."). Instead, it left "the task of developing appropriate ways to enforce [this] constitutional restriction" to the states. Id. Put differently, Atkins "did not provide definitive procedural or substantive guides" to determine who qualifies as intellectually disabled. Bobby v. Bies , 556 U.S. 825, 831, 129 S.Ct. 2145, 173 L.Ed.2d 1173 (2009). The Court did, however, point to the clinical definitions of intellectual disability promulgated by the American Association on Mental Retardation ("AAMR") and the American Psychiatric Association ("APA"). See Atkins , 536 U.S. at 308 n.3, 122 S.Ct. 2242 (identifying then-current clinical standards). It explained that "clinical definitions of mental retardation require not only subaverage intellectual functioning, but also significant limitations in adaptive skills such as communication, self-care, and self-direction that became manifest before age 18." Id. at 318, 122 S.Ct. 2242.
The AAMR is now known as the American Association on Intellectual and Developmental Disabilities ("AAIDD"). Ybarra v. Filson , 869 F.3d 1016, 1021 n.5 (9th Cir. 2017).
The Court observed:
The American Association on Mental Retardation (AAMR) defines mental retardation as follows: "Mental retardation refers to substantial limitations in present functioning. It is characterized by significantly subaverage intellectual functioning, existing concurrently with related limitations in two or more of the following applicable adaptive skill areas: communication, self-care, home living, social skills, community use, self-direction, health and safety, functional academics, leisure, and work. Mental retardation manifests before age 18." Mental Retardation: Definition, Classification, and Systems of Supports 5 (9th ed. 1992).
The American Psychiatric Association's definition is similar: "The essential feature of Mental Retardation is significantly subaverage general intellectual functioning (Criterion A) that is accompanied by significant limitations in adaptive functioning in at least two of the following skill areas: communication, self-care, home living, social/interpersonal skills, use of community resources, self-direction, functional academic skills, work, leisure, health, and safety (Criterion B). The onset must occur before age 18 years (Criterion C). Mental Retardation has many different etiologies and may be seen as a final common pathway of various pathological processes that affect the functioning of the central nervous system." Diagnostic and Statistical Manual of Mental Disorders 41 (4th ed. 2000). "Mild" mental retardation is typically used to describe people with an IQ level of 50–55 to approximately 70.
Atkins , 536 U.S. at 309, 122 S.Ct. 2242 (emphasis in original).
In Hall v. Florida , the Supreme Court clarified that these "clinical definitions of intellectual disability ... were a fundamental premise of Atkins. " ––– U.S. ––––, 134 S.Ct. 1986, 1995, 1999, 188 L.Ed.2d 1007 (2014) (relying on the most recent (and still current) clinical standards and holding that Florida had violated the Eighth Amendment by "disregard[ing] established medical practice"). "The legal determination of intellectual disability is distinct from a medical diagnosis, but it is informed by the medical community's diagnostic framework." Id. at 2000. "In determining who qualifies as intellectually disabled," the Court instructed, "it is proper to consult the medical community's opinions." Id. at 1993.
Further clarifying Atkins , the Supreme Court instructed in Moore v. Texas that "[e]ven if 'the views of medical experts' do not 'dictate' a court's intellectual-disability determination, ... the determination must be 'informed by the medical community's diagnostic framework.' " ––– U.S. ––––, 137 S.Ct. 1039, 1048, 197 L.Ed.2d 416 (2017) (citing Hall , 134 S.Ct. at 2000 ). The "current manuals," the Court stated, "offer the best available description of how mental disorders are expressed and can be recognized by trained clinicians." Id. at 1053. So, while "being informed by the medical community does not demand adherence to everything stated in the latest medical guide," courts may not disregard current medical standards nor "diminish the force of the medical community's consensus." Id. at 1044, 1048–49 (vacating non-ID determination based on lower court's rejection of "medical guidance" and failure "adequately to inform itself of the medical community's diagnostic framework"); see also Hall , 134 S.Ct. at 1995 (holding that Florida violated the Eighth Amendment by "disregard[ing] established medical practice"); Ybarra , 869 F.3d at 1023 (holding that the district court erred in affirming state court's non-ID determination because the state court "contradicted the very clinical guidelines that it purported to apply").
B. Procedural Standards
To warrant an evidentiary hearing, Roland bears the burden of making an initial showing of "reasonable doubt" about ID. Brumfield v. Cain, ––– U.S. ––––, 135 S.Ct. 2269, 2281, 192 L.Ed.2d 356 (2015) (A defendant needs "only to raise a 'reasonable doubt' as to his intellectual disability to be entitled to an evidentiary hearing."); United States v. Watts, No. 14-40063, 2017 WL 413164, at *1 (S.D. Ill. Jan. 31, 2017) ("[I]t is [the defendant's] burden to make an initial showing of 'reasonable doubt' about ID before an evidentiary hearing is warranted."). Roland easily met this burden in his moving submission. (See generally Def. Mov. Br.). Roland concedes that he also carries the initial burden at the evidentiary hearing of establishing a prima facie case, by a preponderance of the evidence, that he is ID. (Id. at 5). But if he is successful, Roland argues, the Government bears the burden of rebutting his prima facie case beyond a reasonable doubt or (at the very least) by clear and convincing evidence. (Id. at 5–13). Roland's burden-shifting argument is unpersuasive.
The Government does not oppose Roland's position that his motion should be resolved by the Court before trial. (See generally Gov. Opp. Br.); see also United States v. Sablan , 461 F.Supp.2d 1239, 1242–43 (D. Colo. 2006) (holding that the court should determine whether a defendant is ID prior to trial); United States v. Candelario–Santana , 916 F.Supp.2d 191, 193–94 & n.6 (D.P.R. 2013) (determining pretrial that defendant was not ID and noting that "logically, the issue should be resolved before trial begins"); United States v. Smith , 790 F.Supp.2d 482, 484, 535 (E.D. La. 2011) (stating that the "issue will be determined before trial by the Court without a jury" and finding defendant ID); United States v. Shields , No. 04-20254, slip op. at 1-2 (W.D. Tenn. May 11, 2009) (determining pretrial that defendant was ID).
As the Court indicated during prehearing oral argument, Roland bears the burden of proving by a preponderance of the evidence that he is ID and therefore cannot receive the death penalty. (See D.E. No. 381, Tr. at 118). The preponderance burden, the lowest of the three standards of proof, adequately reflects the degree of confidence society demands for establishing ID. See Watts, 2017 WL 413164, at *4. Indeed, this burden is consistent with every case, including those cited by the parties, to have addressed this issue (often without much controversy).
C. Clinical Standards
Transcripts of the June 5–9, 12–16, 19–23, 26–27 and 29, 2017 evidentiary hearing are cited as "D.E. No. –––, Tr. at ––––" and generally followed by a parenthetical notation identifying the witness.
See, e.g. , Hall , 134 S.Ct. at 2008 n.12 (noting "that most States appear to require defendants to prove each prong separately by a preponderance of the evidence"); Black v. Carpenter , 866 F.3d 734, 744 (6th Cir. 2017) (requiring that defendant prove his Atkins claim "by a preponderance of the evidence"); United States v. Umaña , 750 F.3d 320, 332, 358–60 (4th Cir. 2014), cert. denied , ––– U.S. ––––, 135 S.Ct. 2856, 192 L.Ed.2d 894 (2015) (affirming the district court's decision "that Umaña had failed to prove his [ID] by a preponderance of the evidence"); Watts , 2017 WL 413164, at *3–4 ("The Court finds that Watts should bear the burden of proving by a preponderance of the evidence that Watts is ID and therefore cannot receive the death penalty."); United States v. Williams , 1 F.Supp.3d 1124, 1134 (D. Haw. 2014) ("[I]n a federal case, the defendant bears the burden of proving an Atkins claim by a preponderance of the evidence.") (emphasis in original); United States v. Hardy , 762 F.Supp.2d 849, 852 (E.D. La. 2010) (noting that the defendant did not contest that it is his burden to prove ID by a preponderance of the evidence); Shields , slip op. at 2 (agreeing with the parties' position that the "[d]efendant bears the burden of establishing his mental retardation by a preponderance of the evidence"); United States v. Davis , 611 F.Supp.2d 472, 474 (D. Md. 2009) (finding that the defendant established his ID "by a preponderance of the evidence"); Sablan , 461 F.Supp.2d at 1243 (rejecting a clear-and-convincing burden of proof and holding that the defendant "should have the lower burden" of proving "his mental retardation by a preponderance of the evidence"); United States v. Nelson , 419 F.Supp.2d 891, 894 (E.D. La. 2006) ("Both the Government and the defense agree that the defendant bears the burden of establishing by a preponderance of the evidence that he is mentally retarded.").
As instructed by Moore , the Court will rely on the clinical definitions of ID promulgated by the AAIDD and the APA manuals: (i) AAIDD, Intellectual Disability: Definition, Classification, and Systems of Supports (11th ed. 2010) ("AAIDD–11" or "AAIDD Manual"); and (ii) APA, Diagnostic and Statistical Manual of Mental Disorders (5th ed. 2013) ("DSM–5"). See Moore , 137 S.Ct. at 1045 (relying on AAIDD–11 and DSM–5). These manuals are the most current iterations of the authoritative sources in the field, and the parties do not dispute their application. Following the Supreme Court's guidance, this Court will also rely on the AAIDD, User's Guide: Intellectual Disability: Definition, Classification, and Systems of Supports (11th ed. 2012) (the "User's Guide"), over the Government's objections.
See also Hall , 134 S.Ct. at 1990, 1991, 1993–96 (same); Atkins , 536 U.S. at 308 n.3, 317 n.22, 122 S.Ct. 2242 (relying on then-current clinical standards).
(See Def. Mov. Br. at 13–16; Gov. Opp. Br. at 5–7; D.E. No. 434 ("Def. Post–Hearing Submission") ¶¶ 30, 614; D.E. No. 442 ("Gov. Post–Hearing Opp.") ¶¶ 30, 614).
The Government argues that "the AAIDD's User's Guide was not adopted as part of the clinical standard in Moore v. Texas " and that "[t]here is no basis in the law to include the 'User's Guide' to a 'best practices manual' as part of the clinical standard." (Gov. Post–Hearing Opp. ¶ 30 (citing Moore , 137 S.Ct. at 1045, 1048–52 )). The Court is puzzled by the Government's argument, however, given that Moore relies on the User's Guide on four separate occasions. See 137 S.Ct. at 1049, 1050, 1052, 1059.
The Court also points the Government to the Supreme Court's decision in Hall (where the Court relied on the User's Guide) and to numerous other district court decisions evaluating Atkins claims. See Hall , 134 S.Ct. at 1995 ; United States v. Wilson , 170 F.Supp.3d 347, 369–70, 384 (E.D.N.Y. 2016) ; Hill v. Anderson , No. 96-0795, 2014 WL 2890416, at *24–26, 40, 46 & n.28 (N.D. Ohio June 25, 2014) ; Williams , 1 F.Supp.3d at 1139 ; United States v. Salad , 959 F.Supp.2d 865, 886 (E.D. Va. 2013) ; Hardy , 762 F.Supp.2d at 854, 882, 889–900. Accordingly, the Court declines to accept the Government's assessment of the User's Guide.
i. Three–Prong Test
The "generally accepted, uncontroversial intellectual-disability diagnostic definition ... identifies three core elements: (1) intellectual-functioning deficits (indicated by an IQ score approximately two standard deviations below the mean— i.e. , a score of roughly 70—adjusted for the standard error of measurement); (2) adaptive deficits (the inability to learn basic skills and adjust behavior to changing circumstances); and (3) the onset of these deficits while still a minor." Moore , 137 S.Ct. at 1045. Each of these three prongs must be met for a person to be positively diagnosed.
See also Hall , 134 S.Ct. at 1994 (noting that the three criteria used by the medical community are: "significantly subaverage intellectual functioning, deficits in adaptive functioning (the inability to learn basic skills and adjust behavior to changing circumstances), and onset of these deficits during the developmental period").
See O'Neal v. Bagley , 743 F.3d 1010, 1021 (6th Cir. 2013) ("[T]he failure to satisfy any one of the three criteria is enough to sink an Atkins claim."); United States v. Montgomery , No. 11-20044, 2014 WL 1516147, at *5 (W.D. Tenn. Jan. 28, 2014) ; Candelario–Santana , 916 F.Supp.2d at 194.
APA Definition. The APA defines ID as "a disorder with onset during the developmental period that includes both intellectual and adaptive functioning deficits in conceptual, social, and practical domains." DSM–5 at 33. The following three criteria must be met before an individual may receive a diagnosis of ID:
A. Deficits in intellectual functions, such as reasoning, problem solving, planning, abstract thinking, judgment, academic learning, and learning from experience, confirmed by both clinical assessment and individualized, standardized intelligence testing.
B. Deficits in adaptive functioning that result in failure to meet developmental and sociocultural standards for personal independence and social responsibility. Without ongoing support, the adaptive deficits limit functioning in one or more activities of daily life, such as communication, social participation, and independent living, across multiple environments, such as home, school, work, and community.
C. Onset of intellectual and adaptive deficits during the developmental period.
Id. "The diagnosis of intellectual disability should be made whenever Criteria A, B, and C are met." Id. at 39.
AAIDD Definition. For the AAIDD, ID is similarly "characterized by significant limitations both in intellectual functioning and in adaptive behavior as expressed in conceptual, social, and practical adaptive skills. This disability originates before age 18." AAIDD–11 at 6. Deficits in intellectual functioning are established by "an IQ score that is approximately two standard deviations below the mean, considering the standard error of measurement for the specific assessment instruments used and the instruments' strengths and limitations." Id. at 27. Deficits in adaptive functioning are measured by:
performance on a standardized measure of adaptive behavior that is normed on the general population including people with and without ID that is approximately two standard deviations below the mean of either (a) one of the following three types of adaptive behavior: conceptual, social, and practical or (b) an overall score on a standardized measure of conceptual, social, and practical skills.
Id. The purpose of the third prong—that the disability emerged before age eighteen—"is to distinguish ID from other forms of disability that may occur later in life .... Thus, disability does not necessarily have to have been formally identified, but it must have originated during the developmental period." Id.
The AAIDD Manual also specifies five assumptions that are "an explicit part of the definition" of ID "because they clarify the context from which the definition arises and indicate how the definition must be applied." Id. at 6–7. Three of those assumptions are relevant here:
Assumption 1 : "Limitations in present functioning must be considered within the context of community environments typical of the individual's age, peers and culture." This means that the standards against which the individual's functioning are compared are typical community-based environments, not environments that are isolated or segregated by ability. Typical community environments include homes, neighborhoods, schools, businesses, and other environments in which people of similar age ordinarily live, play, work, and interact.
Assumption 3 : "Within an individual, limitations often coexist with strengths." This means that people with ID are complex human beings who likely have certain gifts as well as limitations. Like all people, they often do some things better than others. Individuals may have capabilities and strengths that are independent of their ID (e.g., strengths in social or physical abilities, some adaptive skill areas, or one aspect of an adaptive skill in which they otherwise show an overall limitation).
Assumption 5 : "With appropriate personalized supports over a sustained period, the life functioning of the person with ID generally will improve." This means that if appropriate personalized supports are provided to an individual with ID, improved functioning should result.... The important point is that the old stereotype that people with ID never improve is incorrect. Improvement in functioning should be expected from appropriate supports, except in rare cases.
Id. at 7.
Severity of ID. The DSM–5 identifies four levels of severity, ranging from mild to profound. The level of severity is "defined on the basis of adaptive functioning, and not IQ, because it is adaptive functioning that determines the level of supports required." DSM–5 at 33–34. All levels of ID severity are exempt from capital punishment. See Moore , 137 S.Ct. at 1048 ("In Atkins v. Virginia , we held that the Constitution restrict[s] ... the State's power to take the life of any intellectually disabled individual.") (emphasis in original).
According to the AAIDD, 80–90 percent of all persons with ID fall within the mild classification. AAIDD–11 at 151. "Frequently, they have no identifiable cause for the disability, they are physically indistinguishable from the general population, they have no definite behavioral features, and their personalities vary widely, as is true of all people." Id. People with mild ID nonetheless "face significant challenges in society across all areas of adult life." Id. They are often more at risk and challenged than those with even lower IQs, as mild ID often goes undiagnosed and those suffering therefrom can appear more intellectually capable than they are, thus running the risk of a "lack of access to needed mental health care, medical care, dental care, nutrition, and relationship and parenting assistance." Id. at 153.
ii. Special Considerations in Assessing Intellectual Disability
Both the APA and the AAIDD provide significant diagnostic features, explanations, and qualifiers for forensic use. They also discuss other relevant considerations in assessing ID. The Court provides an overview of some of those points here, but notes that they are further discussed below when detailing each prong.
See, e.g., DSM–5 at 38 ("Adaptive functioning may be difficult to assess in a controlled setting (e.g., prisons, detention centers); if possible, corroborative information reflecting functioning outside those settings should be obtained."); id. (noting that certain "associated features," such as gullibility and lack of awareness of risk, "can be important in criminal cases, including Atkins-type hearings involving the death penalty"); User's Guide at 22–27 (discussing "best practices and clinical judgment guidelines that address how clinicians can foster justice when dealing with these forensic issues").
Clinical Judgment. The guidelines emphasize that determining whether an individual meets the clinical definition for ID involves clinical judgment. Clinical judgment is not simply an expert's opinion, and it "is different from either ethical or professional judgment based on one's professional ethics or standards." AAIDD–11 at 85. Rather, clinical judgment is "a special type of judgment rooted in a high level of clinical expertise and experience" that "emerges directly from extensive data and is based on training, experiences, and specific knowledge of the person and his or her environment." Id.
See, e.g. , AAIDD–11 at 29 ("Clinical judgment is essential."); id. at 85 ("Clinical judgment is a key component ... of professional responsibility in the field of intellectual disability.") (emphasis in original); id. at 85–107 (devoting chapter to the "Role of Clinical Judgment in Diagnosis, Classification, and Development of Systems of Supports"); DSM–5 at 37 ("Clinical training and judgment are required to interpret test results and assess intellectual performance."); id. ("Scores from standardized measures and interview sources must be interpreted using clinical judgment."); see also Hall , 134 S.Ct. at 2000 ("It must be stressed that the diagnosis of [ID] is intended to reflect a clinical judgment rather than an actuarial determination.") (quoting AAIDD–11 at 40); United States v. Lewis , No. 08-0404, 2010 WL 5418901, at *5 (N.D. Ohio Dec. 23, 2010) (noting that clinical judgment is "essential" and a "higher level of clinical judgment is frequently required in complex diagnostic and classification situations").
Comprehensive View. "A valid diagnosis of ID is based on multiple sources of information that include a thorough history (social, medical, educational), standardized assessments of intellectual functioning and adaptive behavior, and possibly additional assessments or data relevant to the diagnosis." Id. at 100.
Retrospective Diagnosis. In most non-forensic circumstances, ID determinations focus on the individual's present level of functioning. But in the Atkins context, where defendants usually do not receive an ID diagnosis during the developmental period, a retrospective diagnosis is often required. See, e.g., id. at 55 ("Many professionals rely on a retrospective assessment approach to measure the adaptive behavior of [individuals living in prisons]."). To that end, the AAIDD Manual and User's Guide provide a series of guidelines for clinicians charged with making these retrospective diagnoses. See id. at 95–96; User's Guide at 20–21. The guidelines advise clinicians to (i) conduct a thorough social, medical, and educational history; (ii) assess adaptive behavior using multiple informants and multiple contexts; (iii) assess behavior within the context of community environments typical of the individual's peers and culture (e.g., home, community, school, and work); (iv) recognize that adaptive behavior refers to typical or actual functioning (not to capacity or maximum functioning); (v) consider the standard error of measurement of the instrument when estimating the individual's true IQ score; and (vi) apply the Flynn Effect. AAIDD–11 at 95–96; User's Guide at 20–21; see also Davis, 611 F.Supp.2d at 476–77 (listing guidelines for retrospective diagnosis). Moreover, a "retrospective diagnosis should be based on multiple data points that not only involve giving equal consideration to adaptive behavior and intelligence but also require evaluating the pattern of test scores and factors that affect test scores." AAIDD–11 at 29.
Social History. "For a thorough social history, the clinician should take a holistic approach that focuses on the individual's limitations.... Compiling a thorough social history is especially important when stakes are high, such as in many legal and guardianship cases, when classification and supports planning play a significant role in the person's life or when a retrospective diagnosis is sought." AAIDD–11 at 94–95.
Medical History. "A medical history should include a thorough review of all records related to health of family members; prenatal, perinatal, and postnatal circumstances of the individual's birth; any early concerns or diagnoses; all medical intervention ...; injuries; family involvement with alcohol or other drugs; and exposure to toxins. In addition, in the medical history, the clinician should review all developmental disorders; physical or mental disorders; and challenging, difficult, and/or dangerous behaviors." Id. at 95.
Educational History. An educational history should include (among other things) (i) looking for consistent low grades in the core academic areas; (ii) indicating any failed or repeated grades; (iii) summarizing teachers' social and behavior ratings; (iv) discussing any Individualized Education Program ("IEP"), if applicable; (v) looking for other evidence of difficulties in cognitive adaptive skills besides grades and test performance (e.g., difficulty following classroom directions); (vi) looking for difficulties in practical adaptive skills (e.g., poor grooming, inability to use money correctly); and (vii) looking for evidence of difficulties in social adaptive skills (e.g., follows others, lack of self-direction, few friends, gullible, does not understand social humor). Id.
The DSM–5 likewise explains:
A comprehensive evaluation includes an assessment of intellectual capacity and adaptive functioning; identification of genetic and nongenetic etiologies; evaluation for associated medical conditions (e.g., cerebral palsy, seizure disorder); and evaluation for co-occurring mental, emotional, and behavioral disorders. Components of the evaluation may include basic pre-and perinatal medical history, three-generational family pedigree, physical examination, genetic evaluation (e.g., karyotype or chromosomal microarray analysis and testing for specific genetic syndromes), and metabolic screening and neuroimaging assessment.
DSM–5 at 39.
Risk Factors. Although etiology is not required for an ID diagnosis, certain biomedical, social, behavioral, and educational factors (i.e., "risk factors") may give rise to ID. See AAIDD–11 at 57–61. These risks can be prenatal, perinatal, or postnatal. Id. at 60–61; User's Guide at 4; DSM–5 at 39. "[A]t least one or more of the risk factors will be found in every case of" ID. AAIDD–11 at 60. The Supreme Court recognized in Moore that "[c]linicians rely on such factors as cause to explore the prospect of intellectual disability further, not to counter the case for a disability determination." 137 S.Ct. at 1051. And federal courts, including the Supreme Court, routinely point to identifiable risk factors applicable to the particular individual as corroborative of ID. See, e.g., id. at 1051–52 (noting that "traumatic experiences" such as "childhood abuse and suffering ... count in the medical community as 'risk factors' for intellectual disability") (emphasis in original).
(See also D.E. No. 386, Tr. at 192 (Dr. Hunter: "Etiology is not required to make the diagnosis. But etiology is a very important piece of understanding the developmental presentation across time."); D.E. No. 408, Tr. at 77 (Dr. Greenspan: "[W]e may have risk factors which [are] suggestive of etiology but we do not need to know the etiology.")).
According to the AAIDD, the four categories of risk include "(1) biomedical factors related to biologic processes such as genetic disorders or poor nutrition, (2) social factors related to social and family interactions such as stimulation and adult responsiveness, (3) behavioral factors related to potential causal behaviors such as dangerous (injurious) activities or maternal substance abuse, and (4) educational factors related to the availability of learning supports that promote intellectual development and the development of adaptive skills." User's Guide at 4; see also AAIDD–11 at 61.
On the timing of these risks, the AAIDD provides that (i) prenatal risks include "chromosomal disorder, poverty, parental drug use, lack of preparation of parenthood"; (ii) perinatal risks include "birth injury, lack of access to prenatal care, parental rejection of caretaking, lack of knowledge about intervention or treatment"; and (iii) postnatal risks include "traumatic brain injury, impaired child-caregiver interaction, child abuse and neglect, delayed diagnosis." User's Guide at 4.
The APA also discusses certain risk factors for ID, including prenatal "environmental influences (e.g., alcohol, other drugs, toxins, teratogens)" and postnatal "severe and chronic social deprivation." DSM–5 at 39.
See also Brumfield, 135 S.Ct. at 2280 (noting that defendant's low birth weight placed him at risk of potential neurological trauma); Hall, 134 S.Ct. at 1991 (noting defendant's "horrible family circumstances," including physical abuse inflicted by defendant's mother); Smith v. Ryan, 813 F.3d 1175, 1187 n.16 (9th Cir. 2016) (noting that state court committed legal error when it discounted expert's opinion that defendant's abusive upbringing contributed to his ID); Hardy, 762 F.Supp.2d at 904 (noting that defendant's brain injury was as likely to have caused his deficits as "the possible genetic source suggested by his mother's apparent limitations"); Davis, 611 F.Supp.2d at 477 (noting seizures and lead poisoning); Nelson, 419 F.Supp.2d at 897 (noting fetal alcohol exposure).
Reliance on Stereotypes. The AAIDD cautions against the use of stereotypes in determining if someone is intellectually disabled. See User's Guide at 26. Some incorrect stereotypes about people with ID are that they (i) "look and talk differently from persons from the general population"; (ii) "are completely incompetent and dangerous"; (iii) "cannot do complex tasks"; (iv) "cannot get driver's licenses, buy cars, or drive cars"; (v) "do not and cannot support their families"; (vi) "cannot romantically love or be romantically involved"; (vii) "cannot acquire vocational and social skills necessary for independent living"; and (viii) "are characterized only by limitations and do not have strengths that occur concomitantly with the limitations." Id. "Those stereotypes, much more than medical and clinical appraisals, should spark skepticism." See Moore, 137 S.Ct. at 1052.
Comorbidity. The diagnostic criteria for ID do not require exclusion of accompanying diagnoses. "As mental-health professionals recognize, ... many intellectually disabled people also have other mental or physical impairments ...." Id. at 1051 (citing DSM–5 at 40 ("Co-occurring mental, neurodevelopmental, medical, and physical conditions are frequent in intellectual disability, with rates of some conditions (e.g., mental disorders, cerebral palsy, and epilepsy) three to four times higher than in the general population"); and AAIDD–11 at 58–63). "The existence of a personality disorder or mental-health issue, in short, is not evidence that a person does not also have intellectual disability." Id.
II. THE EVIDENTIARY HEARING
A. Exhibits
The parties introduced over 360 exhibits at the hearing. (See D.E. No. 432, Parties' Joint Exhibit List). They include, among other things, Roland's medical records, school records, Social Security Administration ("SSA") records, New Jersey Department of Youth and Family Services ("DYFS") records, juvenile detention records, testing materials, letters, phone call recordings and transcripts, test manuals, clinical manuals, and other literature. (See id. ). The Court has thoroughly reviewed and considered each of those exhibits in assessing Roland's ID claim.
B. Expert Witnesses
During the hearing, the Court heard opinion testimony from seven expert witnesses: four for Roland and three for the Government. The Court first provides an overview of their qualifications, followed by a summary of their conclusions and credibility determinations. Indeed, "[o]ne of the crucial functions of the Court in deciding an Atkins claim is to determine the credibility of witnesses presented at the evidentiary hearing." Wilson , 170 F.Supp.3d at 379. Having reviewed each expert's qualifications, reports, testimony, and demeanor at the hearing, the Court finds that the reports and testimony of Roland's experts—Drs. Hunter, Greenspan, Bigler, and McGrew—are more thorough, internally consistent, persuasive, and thus, credible. The Court finds the Government's experts generally less credible, as they all had fundamental disagreements with, and at times disregarded, the clinical standards and oftentimes contradicted their own conclusions. The substance of the experts' testimony, however, is best understood in light of the applicable legal and clinical standards. The Court thus analyzes the specific expert testimony and evidence in the Discussion section of this Opinion.
See also Nelson, 419 F.Supp.2d at 903 (determining the credibility of expert witness testimony in the Atkins context); Davis, 611 F.Supp.2d at 491 (considering "the relative credibility of the experts in this case").
i. Roland's Experts
1. Scott J. Hunter, Ph.D.
Roland's first expert witness was Scott J. Hunter, Ph.D., a developmental clinical neuropsychologist licensed in Illinois, Indiana, and Virginia. (Def. Ex. 40 ("Dr. Hunter's Report") at 2; Def. Ex. 42 ("Dr. Hunter's CV") at 1). Dr. Hunter is the director of Neuropsychology and head of the Pediatric Neuropsychology Service for the University of Chicago Medicine and Comer Children's Hospital. (Dr. Hunter's Report at 2). He is also a professor of Psychiatry and Behavioral Neuroscience, and Pediatrics in the University of Chicago's Pritzker School of Medicine and Biological Sciences Division, where he has been a faculty member for eighteen years. (Id. ).
Dr. Hunter received his master's degree and Ph.D. in clinical and developmental psychology, with emphases in behavioral neuroscience and developmental disabilities, from the University of Illinois at Chicago. (Id. ; D.E. No. 386, Tr. at 11 (Dr. Hunter's Testimony)). Following completion of an APA-accredited pre-doctoral internship in clinical psychology in the Department of Psychiatry and Behavioral Sciences at Northwestern University Medical Center, Dr. Hunter completed a postdoctoral residency in pediatric neuropsychology at the University of Rochester. (Dr. Hunter's Report at 2). Dr. Hunter is the co-author and editor of three textbooks (one currently pending publication) addressing pediatric and lifespan neuropsychological development, and one textbook addressing the development of executive functioning and the relationship between executive dysfunction and disability in a range of neurodevelopmental and medical disorders. (Id. at 2–3). On June 6, 2017, the Court qualified Dr. Hunter as an expert for the defense on ID. (D.E. No. 386, Tr. at 21).
Dr. Hunter based his testimony and report on an interview with Roland; a review of records regarding Roland's developmental, academic, medical, and legal history; a review of Roland's SSA records; and a battery of tests administered to Roland. (Id. at 60–64, 182, 194 (Dr. Hunter's Testimony); Dr. Hunter's Report at 5–11). Dr. Hunter concluded that "Roland has significant and permanent limitations in intellectual functioning, across all three required components for a diagnosis of Mild ID." (Dr. Hunter's Report at 18; see also D.E. No. 387, Tr. at 43 (Dr. Hunter testifying that Roland "meets the criteria for diagnosis of mild intellectual disability")).
The Court finds Dr. Hunter's testimony, particularly with regard to Prong One, to be highly credible. Dr. Hunter is qualified to administer and interpret intelligence testing, and his practice routinely includes administering IQ tests. (D.E. No. 387, Tr. at 53 (Dr. Hunter's Testimony)). Dr. Hunter "ha[s] had training working with infants all the way up through geriatric age individuals." (D.E. No. 386, Tr. at 8 (Dr. Hunter's Testimony)). Dr. Hunter testified that he is "familiar with how to diagnos[e], assess, and treat and provide support for intellectual disabilities," and "[t]hat is a large part of what [he does] on a daily basis." (Id. at 21). Notably, Dr. Hunter's testimony and report comported with the clinical guidelines, and Dr. Hunter provided credible and persuasive explanations for many of the concerns raised by the Government's experts. (See infra at 519, 521-22, 525, 537-38, 543-44). Of further note, Dr. Hunter received no direct financial benefit from his work as an expert witness in this case. (D.E. No. 387, Tr. at 47 (Dr. Hunter's Testimony)). Moreover, only 10 percent of Dr. Hunter's practice is forensic, with 55 percent of that involvement being on behalf of defendants and 45 percent on behalf of plaintiffs. (Id. at 48, 94). Dr. Hunter has testified in twelve cases, but only two of them were criminal cases and none of them were in the Atkins context. (D.E. No. 386, Tr. at 20 (Dr. Hunter's Testimony)).
2. Stephen Greenspan, Ph.D.
Roland's second expert witness was Stephen Greenspan, Ph.D., a psychologist and preeminent scholar on ID. See Lewis , 2010 WL 5418901, at *2. Dr. Greenspan has a master's degree and a Ph.D. in developmental psychology from the University of Rochester, with an emphasis on developmental psychology and developmental psychopathology. (Def. Ex. 45 ("Dr. Greenspan's Report") at 29; D.E. No. 408, Tr. at 3 (Dr. Greenspan's Testimony)). He completed a postdoctoral fellowship on mental retardation and developmental disabilities at University of California at Los Angeles's Neuropsychiatric Institute. (D.E. No. 408, Tr. at 3 (Dr. Greenspan's Testimony)). Notably, Dr. Greenspan is credited with providing the three-domain framework for assessing the adaptive-functioning prong of the ID diagnosis. (Id. at 7–8). And both the AAIDD–11 and the DSM–5 rely heavily on his model of adaptive behavior. (Id. at 7). Dr. Greenspan consulted the APA on the development of the DSM–5 and is the most cited authority in both the AAIDD–11 and the online edition of DSM–5. (Id. at 6; Dr. Greenspan's Report at 2). Moreover, Dr. Greenspan is a Fellow—"a status that is awarded to a small subset of especially qualified people"—of the APA (Division of ID) and the AAIDD. (D.E. No. 408, Tr. at 17 (Dr. Greenspan's Testimony)).
Ybarra , 869 F.3d at 1021 (noting that Dr. Greenspan is the most-cited authority in the 2002 and 2010 AAIDD diagnostic manuals).
Dr. Greenspan has been published extensively on issues related to ID for over 35 years, beginning in 1980. (Id. at 16; Dr. Greenspan's Report at 2, 30–34 (listing select publications)). He is the co-editor of a book titled, "What is Mental Retardation?" and the sole or lead author of four chapters in "The Death Penalty and Intellectual Disability," a 2015 book published by the AAIDD. (Dr. Greenspan's Report at 2). He has received prestigious awards from the AAIDD, the APA, and the University of Washington in Seattle for his contributions in the field of ID. (D.E. No. 408, Tr. at 16–17 (Dr. Greenspan's Testimony)).
Dr. Greenspan's practice focuses on teaching, writing, and expert consulting or testimony. (D.E. No. 409, Tr. at 53 (Dr. Greenspan's Testimony)). Although not a clinician, Dr. Greenspan has had clinical training, particularly within the field of ID. (D.E. No. 408, Tr. at 18–19 (Dr. Greenspan's Testimony)). He indicated that he has been certified as an expert in approximately 25 federal and state Atkins proceedings. (Dr. Greenspan's Report at 2; D.E. No. 409, Tr. at 57–58 (Dr. Greenspan's Testimony)). On June 12, 2017, the Court qualified Dr. Greenspan as an expert for the defense on ID. (D.E. No. 408, Tr. at 21–22).
Dr. Greenspan based his testimony and report on an interview with Roland; interviews with 11 individuals familiar with Roland; and a review of Roland's life history records, including records regarding Roland's developmental, academic, medical, SSA, and legal history. (Dr. Greenspan's Report at 2–6; D.E. No. 408, Tr. at 72 (Dr. Greenspan's Testimony)). Dr. Greenspan concluded that a diagnosis of ID is warranted because "Roland has significant limitations in all three of the ID definitional criteria." (Dr. Greenspan's Report at 28).
Though Dr. Greenspan is undoubtedly one of the preeminent scholars on ID, the Court does have a few misgivings about his testimony, which it addresses in the Discussion section below. (See infra at 538-39). The Court nevertheless credits Dr. Greenspan's opinions, particularly with regard to Prong Two, because of his unquestionable expertise in this field and his dutiful adherence to the clinical standards in assessing Roland's ID claim. (See infra at 539-42). The Court takes note of the fact that Dr. Greenspan is credited with providing the three-domain framework for assessing Prong Two and is recognized as a leading expert in the field (as demonstrated by the fact that he is the most cited authority in both the AAIDD–11 and the online edition of the DSM–5). Indeed, Dr. Greenspan's expertise is evident from his knowledge of the guidelines and his comprehensive evaluation of Roland's adaptive functioning. (See infra at 532-34; 538-42). The Court finds Dr. Greenspan's conclusions credible for the additional reason that he demonstrated a mastery of the evidence of Roland's life history, oftentimes reciting detailed information from the record without the aid of exhibits.
3. Erin D. Bigler, Ph.D.
Roland's first rebuttal expert witness was Erin D. Bigler, Ph.D., a board-certified clinical neuropsychologist licensed in Utah, California, Texas, and Hawaii. (Def. Ex. 56 ("Dr. Bigler's CV") at 4). Dr. Bigler received his Ph.D. in experimental-psychological psychology from Brigham Young University ("BYU") and completed a postdoctoral fellowship in neurophysiology-neuropsychology at Barrow Neurological Institute at the St. Joseph's Hospital and Medical Care. (Id. at 1). He is a professor of psychology and neuroscience at BYU and an adjunct professor of psychiatry at the University of Utah. (D.E. No. 422, Tr. at 41 (Dr. Bigler's Testimony)). Dr. Bigler has directed or co-directed a clinical neuropsychology subspecialty within an APA-approved clinical psychology Ph.D. training program for over forty years. (Def. Ex. 55 ("Dr. Bigler's Report") at 1). At BYU, Dr. Bigler directs the Brain Imaging and Behavior Lab, which studies neuroimaging correlates of brain disorders, including developmental disorders. (Id.). He has also authored or co-authored textbooks that are widely used in training clinical neuropsychologists and has published over 350 peer-reviewed articles, many dealing with cognitive and intellectual assessment. (Dr. Bigler's CV at 4–103). On June 23, 2017, the Court qualified Dr. Bigler as an expert for the defense in clinical neuropsychology. (D.E. No. 422, Tr. at 43).
Dr. Bigler based his testimony on a review of Roland's records and a review of the three primary expert reports—those of Drs. Hunter, Greenspan, and Morgan. (Dr. Bigler's Report at 1). Dr. Bigler concluded that "it is clear that Mr. Roland's level of intellectual ability is limited where the preponderance of the findings support intellectual deficits, supporting the first prong of ID." (Id. at 8).
The Court finds Dr. Bigler to be a qualified and exceptionally credible witness and gives great weight to his testimony, particularly with respect to Prong One. The Court notes that this is Dr. Bigler's first time testifying in the Atkins context. (D.E. No. 422, Tr. at 180 (Dr. Bigler's Testimony)). The Court was impressed with Dr. Bigler's testimony and frankness in responding to questions from counsel and the Court. Dr. Bigler persuasively and credibly explained that the evidence proffered by the Government (particularly relating to Roland's alleged malingering and inadequate efforts) was flawed in many key respects. (See, e.g. , id. at 56–64; 156–60). And Dr. Bigler persuasively undermined the Government's experts' credibility by providing alternative, scientifically defensible explanations for some of their conclusions. (See, e.g. , id. at 117–18; 134–36; 156–60; 169–72).
4. Kevin S. McGrew, Ph.D.
Roland's second rebuttal expert witness was Kevin S. McGrew, Ph.D., a psychometrician and director of the Institute for Applied Psychometrics. (Def. Ex. 58 ("Dr. McGrew's CV") at 1–2). Dr. McGrew received a master's degree in school psychology from the Moorhead State University and a Ph.D. in educational psychology from the University of Minnesota. (Id. at 2). Notably, Dr. McGrew has served as a measurement consultant to several psychological test publishers and national and international research studies and organizations. (Def. Ex. 57 ("Dr. McGrew's Report") at 1). Dr. McGrew is a psychological-measurement expert (or psychometrician) and has extensive experience in the development and psychometric analysis of nationally standardized, norm-referenced psychological and educational assessment instruments. (Id. ). Dr. McGrew is a member of the AAIDD and the APA, among other professional organizations. (Dr. McGrew's CV at 3). He has published extensively throughout his career, including two chapters in the AAIDD's "The Death Penalty and Intellectual Disability." (Id. at 6). Dr. McGrew has also authored or co-authored over eighty professional journal articles and book chapters, four professional books on intelligence-test interpretation, and seven psychological test batteries. (Id. at 5–15). On June 26, 2017, the Court qualified Dr. McGrew as an expert for the defense in three areas: applied psychological measurements, theories of human intelligence, and interpretation of intelligence tests. (D.E. No. 423, Tr. at 54).
Dr. McGrew based his testimony on the reports and score summary sheets of Drs. Hunter and Morgan. (Dr. McGrew's Report at 2). He concluded that Roland's "obtained IQ scores meet Prong One for consideration of an intellectual disability diagnosis." (Id. at 3).
The Court finds Dr. McGrew's testimony to be highly credible. Dr. McGrew's answers to questions posed by counsel and this Court were lucid, direct, and cogent. Dr. McGrew's opinions were thoroughly researched and grounded in either the clinical standards or other scientific principles. The Court finds Dr. McGrew's thoroughness reassuring and his conclusions particularly credible on the central issues of this case. As an example of Dr. McGrew's thoroughness, the Court points to his decision to contact Dr. Alan Kaufman, the developer of the Kaufman Brief Intelligence Test ("KBIT"), to clarify an ambiguity in comparing two different versions of the KBIT test. (D.E. No. 423, Tr. at 166–67 (Dr. McGrew's Testimony)).
ii. Government's Experts
1. Joel E. Morgan, Ph.D., ABPP–CN
The Government's primary expert witness was Joel E. Morgan, Ph.D., ABPP–CN, a board-certified clinical neuropsychologist licensed in New Jersey. (Gov. Ex. 167 ("Dr. Morgan's Report") at 1). Dr. Morgan received a master's degree in school psychology from Fairleigh Dickinson University and a Ph.D. in clinical psychology from the New School for Social Research. (Gov. Ex. 350 ("Dr. Morgan's CV") at 1–2). He has served on the board of the American Academy of Clinical Neuropsychology and on the American Board of Psychology. (D.E. No. 412, Tr. at 70–71 (Dr. Morgan's Testimony)). Dr. Morgan is the co-editor of several textbooks, including "Textbook of Clinical Neuropsychology" and "Neuropsychology of Malingering Casebook," and has authored chapters in many of these books. (Id. at 73; Dr. Morgan's CV at 7–8). He has also authored articles about neurological disorders presenting in adults and children and on the neuropsychological assessment of those disorders, ethics and professional responsibility, forensic matters, and validity assessment. (D.E. No. 412, Tr. at 73–74 (Dr. Morgan's Testimony)).
Half of Dr. Morgan's practice involves clinical diagnoses of children referred because they are not performing well in school; the other half of his practice is forensic and involves making diagnoses for purposes of litigation. (Id. at 75, 79). Dr. Morgan has administered thousands of tests similar to those he administered to Roland. (Id. at 78). On June 15, 2017, the Court qualified Dr. Morgan as an expert for the Government in the field of clinical neuropsychology and child neuropsychology. (Id. at 83–84). Roland had no objection. (Id. ).
Dr. Morgan based his testimony and report on an interview with Roland; interviews with three individuals familiar with Roland; a review of Roland's developmental, academic, and medical records; a thorough review of Roland's criminal and legal history, including various DVDs, CDs, letters, emails, and transcripts of statements and interviews; letters he believed Roland authored from prison; recordings of Roland's phone conversations from prison; a review of Roland's SSA award notice; and a battery of tests administered to Roland. (Dr. Morgan's Report at 2–15). In his report, Dr. Morgan concluded that "within a reasonable degree of scientific psychological certainty, I find that Mr. Farad Roland does not have intellectual disability." (Id. at 17). During his testimony, Dr. Morgan clarified that "essentially I am opining that we do not have a valid IQ score to make the diagnosis one way or the other." (D.E. No. 427, Tr. at 137 (Dr. Morgan's Testimony)).
The Court finds that Dr. Morgan's testimony and conclusions lack credibility in several respects. (See infra at 516-24; 534-38). The most glaring problem with Dr. Morgan's conclusion—as evidenced by his answers to questions from Roland's counsel—is that he was unreasonably dismissive of anything at all that might suggest a different conclusion from his own. (See infra at 534-38). He appears to have ignored any evidence—including Roland's exposure to factors that may give rise to ID or the fact that his own standardized adaptive-behavior measure revealed that Roland had significant deficits in the conceptual domain of adaptive functioning—that would suggest the possibility that Roland has intellectual limitations and adaptive deficits. (See infra at 495-96, 516-24; 534-38). Given the significant role of clinical judgment and the highly subjective nature of an ID evaluation, the effect of having such a pervasive bias present in the evaluator is hard to overstate. Moreover, Dr. Morgan expressed numerous disagreements with the clinical standards on which this Court is instructed to rely. In expressing his disagreement, Dr. Morgan disregarded the guidelines' express guidance and relied on evidence that these clinical standards prohibit. (See infra at 534-38). Dr. Morgan's assessment of Roland's adaptive-functioning, for example, routinely failed to comport with the clinical standards. (See infra at 534-38). For these reasons, the Court finds that Dr. Morgan's testimony is lacking in credibility as well as reliability and awards little weight to his opinions.
(See, e.g. , D.E. No. 427, Tr. at 7, 160–61 (Dr. Morgan disagreeing with the AAIDD's recommendation that the best practice is to wait one year before administering another test); D.E. No. 414, Tr. at 53–54, 167–68 (Dr. Morgan disagreeing with the guidelines' proscription of verbal behavior, criminal adaptive functioning, street smarts, and conduct in structured settings); id. at 54 (Dr. Morgan: "My professional opinion is that some of the guidelines proposed for what one can and cannot look at and evaluate and use in formulating these decisions are somewhat arbitrary.")).
2. Bernice A. Marcopulos, Ph.D., ABPP
The Government's Prong–Two rebuttal expert witness was Bernice A. Marcopulos, Ph.D., ABPP, a board-certified clinical neuropsychologist licensed to practice in Virginia. (Gov. Ex. 362 ("Dr. Marcopulos's CV") at 1). Dr. Marcopulos received a master's degree and a Ph.D. in clinical neuropsychology from the University of Victoria. (Id.). Dr. Marcopulos is currently a professor at the Department of Psychology at James Madison University and is part of the associate faculty at the University of Virginia. (Id. at 1–1; D.E. No. 427, Tr. at 176–77 (Dr. Marcopulos's Testimony)). She primarily teaches graduate courses, including courses in neuropsychological assessments (which cover IQ testing and adaptive-functioning measures), human psychology (which cover ID), and forensic neuropsychology. (D.E. No. 427, Tr. at 176–78 (Dr. Marcopulos's Testimony)). She is a member of several professional organizations, including the APA, and oversees the creation of board-certification examinations for the American Board of Clinical Neuropsychology. (Id. at 179–81; 183–84). Dr. Marcopulos's clinical experience includes serving as the director and clinical neuropsychologist of the Division of Behavioral Medicine and Neuropsychology at the Western State Hospital in Staunton, Virginia. (Id. at 185–89; Dr. Marcopulos's CV at 1). Dr. Marcopulos has published several peer-reviewed articles in the field of clinical neuropsychology and has edited a book titled "Clinical neuropsychological foundations of schizophrenia." (Dr. Marcopulos's CV at 4–18; D.E. No. 427, Tr. at 182–83 (Dr. Marcopulos's Testimony)). She is a Fellow of the National Academy of Neuropsychology and the APA. (Dr. Marcopulos's CV at 3; D.E. No. 427, Tr. at 184 (Dr. Marcopulos's Testimony)). In 2015, Dr. Marcopulos received the American Academy of Clinical Neuropsychology Distinguished Neuropsychologist Award. (Dr. Marcopulos's CV at 3). The Court accepted Dr. Marcopulos as an expert for the Government in clinical neuropsychology on June 19, 2017. (D.E. No. 427, Tr. at 192–93). Roland had no objection. (Id.).
Dr. Marcopulos submitted a joint rebuttal report with Dr. Morgan on May 19, 2017. (Gov. Ex. 168 ("Joint Rebuttal Report")). Drs. Marcopulos and Morgan have worked together in several capacities: they both serve on the American Board of Clinical Neuropsychology and are both oral examiners, and they have co-authored several papers, chapters, and books. (Dr. Marcopulos's CV at 6–7; D.E. No. 416, Tr. at 15 (Dr. Marcopulos's Testimony)). Dr. Marcopulos is credited with drafting the "Prong Two – Adaptive Skills" section of the Joint Rebuttal Report, in which she joins Dr. Morgan's conclusion that Roland "does not meet any of the three prongs diagnostic of ID." (D.E. No. 427, Tr. at 67 (Government attorney stating that "the adaptive component was written by Dr. Marcopulos"); Joint Rebuttal Report at 6–9, 10).
The Court finds Dr. Marcopulos's testimony especially lacking in credibility. The problems with Dr. Marcopulos's work are legion. First, the Government vehemently objected to Roland's questioning of Dr. Morgan on the "Prong Two—Adaptive Skills" section of the Joint Rebuttal Report and represented to this Court that "the adaptive component was written by Dr. Marcopulos." (D.E. No. 427, Tr. at 67–68 (Government attorney objecting to Roland's counsel's questioning of Dr. Morgan on that section of the Joint Rebuttal Report and stating that "several times we laid out on the record, the adaptive component was written by Dr. Marcopulos. It is fair for her but not for Dr. Morgan. He didn't write it. You can't say didn't you write that? Because he didn't write that.")). Yet, Dr. Marcopulos testified that she did not author a substantial portion of the adaptive-behavior section, which (as noted below) calls into question the reliability of the entire Joint Rebuttal Report and the credibility of both authors. (See infra at 539-41).
Roland's counsel relied on the Government's representation and did not question Dr. Morgan about the "Prong Two—Adaptive Skills" section of the Joint Rebuttal Report. (See D.E. No. 416, Tr. at 152–53 (Roland's Counsel: "My understanding was that [Dr. Marcopulos] wrote the adaptive behavior section so I did not ask Dr. Morgan about that."); see also D.E. No. 427, Tr. at 68–69 (Roland's counsel withdrawing question presented to Dr. Morgan about the adaptive-functioning section of the Joint Rebuttal Report)).
Second, the Court questions the thoroughness of Dr. Marcopulos's review because her testimony comprised mostly general statements with little or no evidence from the record to support her opinions. (See infra at 539-41). Although Dr. Marcopulos expressed her frustration at the "lack of records available that could have answered" some of her questions about Roland's adaptive functioning, she appeared to have ignored many of the records that were available for her review. (See infra at 540-41). Cross-examination revealed that Dr. Marcopulos had little knowledge of Roland's life history, and it remains unclear on what evidence Dr. Marcopulos relied in forming her expert opinion. (See infra at 540-41).
Third, at the hearing, Dr. Marcopulos oftentimes contradicted her own conclusions and testified on numerous occasions that Roland did indeed have certain deficits. (See, e.g., D.E. No. 416, Tr. at 125–26, 137 (Dr. Marcopulos testifying that the record contains evidence of Roland's deficits)). Fourth, like Dr. Morgan, Dr. Marcopulos expressed several disagreements with the clinical standards. (See, e.g., id. at 182-85 (Dr. Marcopulos disagreeing with the AAIDD's guidance to avoid using evidence of someone's behavior in prison)). Finally, and again similar to Dr. Morgan's evaluation, Dr. Marcopulos acknowledged at the hearing many of the risk factors to which Roland was exposed, but omitted them from her section of the Joint Rebuttal Report. (See id. at 108 (Dr. Marcopulos's Testimony)).
In sum, the Court finds that Dr. Marcopulos's credibility was thoroughly impeached and her testimony is not helpful in evaluating the critical issues in this case.
On May 26, 2017, Roland moved to exclude the testimony of Drs. Morgan and Marcopulos. (See D.E. No. 356–1 ("Roland's Daubert Motion")). Specifically, Roland moved "for an Order precluding, not considering, or determining that no weight should be given to the testimony of" Drs. Morgan and Marcopulos under Daubert v. Merrell Dow Pharmaceuticals, 509 U.S. 579, 113 S.Ct. 2786, 125 L.Ed.2d 469 (1993), Kumho Tire Co. v. Carmichael, 526 U.S. 137, 119 S.Ct. 1167, 143 L.Ed.2d 238 (1999), Federal Rule of Evidence 702, and the Fifth, Sixth, and Eighth Amendments to the Constitution. (Id. at 1–2). Although Roland's Daubert Motion cites Daubert and Kumho Tire, it relies primarily on Rule 702(d), which allows expert testimony only if "the expert has reliably applied the principles and methods to the facts of the case." (See id. at 3–50). Roland makes the distinction because he does not challenge the tests employed by the Government's experts. (Id. at 1). Rather, Roland challenges the experts' "analysis of the results," arguing that "the reports of Drs. Morgan and Marcopulos run so far afoul of the prevailing clinical standards in the field of intellectual disability that a good faith basis exists to question whether their individual expertise fits the requirements of this case." (Id.).
The Government opposed Roland's Daubert Motion, correctly noting that it "does not challenge Dr. Morgan's methodology , it challenges his opinion. This is an improper use of Daubert. " (D.E. No. 358 at 1). While Roland's Daubert Motion raises some excellent points, they appear to go to the weight and credibility of these experts' opinions. (See, e.g. , Roland's Daubert Motion at 30 ("Drs. Morgan and Marcopulos base their conclusion on an unsupported claim that Mr. Roland was malingering. Their position is undermined, however, by the fact that Dr. Morgan administered tests to Mr. Roland to weed out the potential for malingering, and even Dr. Morgan concedes that Mr. Roland passed the tests and could not be found to be malingering on them."); id. at 44 ("Indeed, the fact that Dr. Morgan is so quick to dismiss the Social Security Administration's ability to satisfy its legal obligations, evinces a bias in Dr. Morgan's methodology that, while not in and of itself a basis to disqualify a witness under Daubert or Rule 702, is certainly a factor that may be considered in determining what weight, if any , to give to Dr. Morgan's testimony should he survive the instant admissibility challenge." (emphasis in original))).
In light of these facts, the Court agrees with the Government that "[t]o the extent that Defendant disagrees with the outcome, this is a matter for cross-examination rather than a Daubert hearing" (see D.E. No. 358 at 1). And Roland had ample opportunity to challenge the conclusions of Drs. Morgan and Marcopulos during the eighteen-day evidentiary hearing. (See D.E. Nos. 412, 414 & 427 (demonstrating that Dr. Morgan testified over the course of three days); D.E. Nos. 19 & 20 (demonstrating that Dr. Marcopulos testified for two days)). Accordingly, Roland's Daubert Motion is denied.
3. Robert L. Denney, Psy.D., ABPP
The Government's final rebuttal expert witness was Dr. Robert L. Denney, Psy.D., ABPP, one of only seven board-certified neuropsychologists and forensic psychologists in the world. (Gov. Ex. 194A ("Dr. Denney's CV") at 1, 4; D.E. No. 418, Tr. at 79–80 (Dr. Denney's Testimony)). Dr. Denney has a master's degree in psychology and a doctorate in clinical psychology from the Forest Institute of Professional Psychology. (Dr. Denney's CV at 4). He is currently a staff neuropsychologist in the Department of Neurology at the Citizens Memorial Hospital and maintains a private practice at the Neuropsychological Associates of Southwest Missouri. (Id. at 1; D.E. No. 418, Tr. at 74 (Dr. Denney's Testimony)). Dr. Denney completed his clinical internship at the U.S. Medical Center for Federal Prisoners and remained there as a staff psychologist from 1991 to 2011. (Dr. Denney's CV at 1–3). Dr. Denney is a Fellow of the APA, National Academy of Neuropsychology, American Academy of Clinical Neuropsychology, and the American Academy of Forensic Psychology. (Id. at 5). He sits on the boards of several peer-reviewed scientific journals and has chaired the National Academy of Neuropsychology. (Id. at 4–5). Dr. Denney has authored or co-authored over thirty journal articles and over twenty books or book chapters. (Id. at 6–12). The Court accepted Dr. Denney as an expert for the Government in forensic psychology and clinical neuropsychology on June 21, 2017. (D.E. No. 418, Tr. at 80). Roland had no objection. (Id.).
Dr. Denney based his testimony and report on a review of the reports and raw test data of Drs. Hunter, Morgan, and Greenspan; Roland's educational records; Roland's incarceration history; and transcripts of Roland's November 2016 telephone calls from prison. (Gov. Ex. 194 ("Dr. Denney's Report") at 1–2). Dr. Denney concluded that "the test data presented in the record are not valid reflections of Mr. Roland's genuine cognitive functioning and should not be relied upon for diagnostic determination." (Id. at 24). Specifically, Dr. Denney opined that "[t]he combined results from both the November 2016 and March 2017 examinations, along with the demonstrated cognitive abilities manifest within the phone calls, is sufficient information within the context of litigation to conclude it is more likely than not that Mr. Roland has been malingering neurocognitive dysfunction" and "those results should not be used to support a diagnosis of ID." (Id. ).
Dr. Denney also relied on a letter dated April 28, 2015, from Roland to this Court. (Dr. Denney's Report at 2, 5 (citing Gov. Ex. 187)). As discussed at the hearing, the parties agreed that this letter was drafted by another inmate. (See D.E. No. 416, Tr. at 3–12; 139–41). So, the Court will not rely on Dr. Denney's conclusions stemming from his review of this letter, if any.
Though Dr. Denney's qualifications are undoubtedly impressive, having had an opportunity to consider the testimony, conclusions, and demeanor of all experts at the hearing and all the evidence proffered by the parties, the Court finds the testimony and conclusions of Roland's experts, particularly Drs. Bigler and McGrew, to be more thorough, credible, and therefore more persuasive. The Court bases its conclusion on several reasons. First, the Court finds that, like the other Government experts, Dr. Denney espoused an approach to assessing ID that is directly at odds with the clinical standards. (See infra at 513-14; 521-23). In forming his conclusions, for example, Dr. Denney relies heavily on evidence (such as recorded phone conversations from prison) that the AAIDD proscribes. (See Dr. Denney's Report at 24–25). This is so because, again, like the other Government experts, Dr. Denney also disagrees with certain aspects of the clinical standards. (See, e.g., D.E. No. 420, Tr. at 75–76 (Dr. Denney disagreeing with the AAIDD's guidance on evaluating criminal behavior and the weight of strengths and deficits)). Second, the Court is troubled by the fact that Dr. Denney concluded that Roland was malingering without ruling out, or even mentioning, any other alternatives. (See supra at 521-22, 523-24). Third, evidence and testimony from other experts demonstrated that many of the concerns Dr. Denney raised to conclude that Roland malingered (or exhibited poor effort) could indeed be addressed by alternative explanations. (See supra at 521-23). Finally, the Court is unpersuaded by Dr. Denney's skepticism about the significance of passing validity measures designed to detect inadequate effort or malingering. (See supra at 514-15, 522). So, compared to Drs. Bigler and McGrew, the Court finds Dr. Denney's testimony to be less credible and will assign it less weight.
C. Fact Witnesses
The Court also heard testimony from nine fact witnesses: seven for Roland and two for the Government. They consisted of Roland's educators (Delores Lemon–Gresham and Kathleen Bohm), a social worker who performed educational evaluations for special needs students in Newark Public Schools (Andy D'Amato), Roland's cousin (Jeannette Carter), a correctional officer who interacted with Roland (Captain Michael Thomas), two SSA-related witnesses (Melissa Bruckner and Herman Huber), a mental health professional from the New Jersey Department of Corrections (Richard Cevasco), and the chief psychologist for the Federal Bureau of Prisons at the Metropolitan Detention Center (Michael Segal). The Court found all of these fact witnesses to be generally credible and provides further detail on their testimony below, as relevant to the Court's analysis.
III. ROLAND'S BACKGROUND
In analyzing Roland's Atkins claim, the Court will begin with an overview of Roland's background, drawing from a voluminous record extending to Roland's early childhood and including contemporaneous accounts and evaluations from family members, teachers, social workers, correctional officers, mental health professionals, and SSA records. The following facts from Roland's background are those that the Court finds most relevant to its analysis or those that provide context for later discussion. The Court does not attempt to provide a comprehensive summary of the voluminous record presented at the hearing. Additional facts adduced at the hearing appear throughout the three-prong analyses below. (See infra at 498-553).
Roland's early life was quite tumultuous, marked by parental loss, abuse, neglect, and malnutrition—or what the clinical community calls "risk factors." (See supra at 479-80; infra at 494-96). Roland was born in Newark, New Jersey, on August 18, 1984. (Def. Ex. 2e). He is the fifth of six children born to his mother, Elvena ("Tessie") Roland. (See Def. Exs. 2a-2k). Ms. Roland died of AIDS on May 31, 1995, when Roland was 10 years old. (Def. Ex. 2c). Roland's natural father, Lawrence James, spent much of Roland's life incarcerated. (See Def. Ex. 6b at 3–4; Def. Ex. 6c at 14–24). Mr. James also died of AIDS on May 21, 1996, when Roland was 11 years old. (Def. Ex. 2h).
Mother's Psychological Health and Substance Abuse. Ms. Roland had a significant history of drug and alcohol abuse and psychiatric hospitalizations, including one that occurred shortly after Roland's birth. Ms. Carter testified that Ms. Roland used to "smoke the pipe" and "sniff dope, or coke." (D.E. No. 385, Tr. at 99–100 (Ms. Carter's Testimony)). According to Ms. Carter, Ms. Roland was also an alcoholic who drank during her pregnancy with Roland, beginning in the morning "as soon as the liquor store opened up." (Id. ). Whenever Ms. Roland "had money to get something, she drank." (Id. at 103; Def. Ex. 5a at 103). Hospital records confirm Ms. Carter's account. Hospital records from October 2, 1984 (when Roland was less than two months old) document that Ms. Roland "takes drugs," "admits to abusing Doriden and cocaine whenever she has the money," "is a regular alcohol consumer of one to two beers per day and on occasion more," and "is unemployed and smokes up to one pack a day." (Def. Ex. 5a at 766, 779, 784, 1256). These records also indicate that Ms. Roland suffered from major depression and other psychological disorders during Roland's developmental years.
(See also Def. Ex. 5a at 779, 784 (hospital records from May 1986, when Roland was less than two years old, documenting that Ms. Roland "used Doriden, codeine, cocaine and alcohol over the past 6 months" and noting that Ms. Roland started to inject cocaine about 18 months prior about two-to-three times a week)).
(See Def. Ex. 5a at 752–54 (documenting that Ms. Roland was admitted to the hospital with a provisional diagnosis of major depression on April 30, 1985, when Roland was 8 months old, and later discharged with diagnoses of isolated explosive disorder, dysthymic disorder, mixed substance abuse, and chronic hepatitis); id. at 754 (documenting that in May 1985, when Roland was 8 months old, Ms. Roland "admitted feeling extremely depressed recently," "had become very frustrated and threatened to stab her uncle's wife," "had financial stress," and "lost her control"); id. at 768, 777 (documenting that Ms. Roland was admitted to the hospital on October 19, 1985, when Roland was 14 months old, and diagnosed with "Dysthymic disorder and unspecified personality disorder"); id. at 777, 779, 784 (documenting that Ms. Roland was admitted to the hospital on May 18, 1986, when Roland was 21 months old, and diagnosed with "Adjustment disorder with depressed mood, mixed substance abuse continuous")).
Neglect and Malnutrition. Given the impact of her own psychiatric and substance-abuse issues, Ms. Roland struggled to care adequately for any of her children. Ms. Roland was referred multiple times to DYFS for child neglect, as the children were "left alone on a regular basis, are not dressed properly, not fed properly, and oldest child does not attend school." (See Def. Ex. 4a at 20). DYFS records further note that the apartment where the children lived had neither "electricity nor food," and DYFS made several emergency food referrals because the family had run out of food. (See id. at 24; Def. Ex. 4b at 140, 144, 163, 166).
(See also Def. Ex. 4b at 170 (noting that on April 22, 1985, when Roland was 8 months old, Amin Roland's teacher visited the Roland home but "Ms. Roland was sleeping and the children were unsupervised"); id. (DYFS caseworker reporting that she visited the Roland home the following day, on April 23, 1985, and found that Ms. Roland "was sleeping but children let worker in. Mother said she was unaware that anyone came to her home yesterday. Mother said she sleeps a lot in the day"); Def. Ex. 4a at 23 (noting that Ms. Roland's children were referred to DYFS on September 9, 1985, when Roland was 13 months old, because "[n]atural mother left children unsupervised last week" and the "[c]hildren are presently alone in the apartment"); Def. Ex. 5a at 779, 784 (medical records from October 22, 1985, when Roland was 14 months old, documenting that Ms. Roland "felt depressed and fearful that she might explode. She left home for days prior to admission to live with her girlfriend and then came to the Emergency Room complaining of feeling helpless and hopeless"); Def. Ex. 4b at 165–64 (Ms. Roland's aunt reporting that Ms. Roland "does this all the time, that she asks her relatives to watch her children for a while and never returns")).
Corroborating these records, Ms. Carter testified vividly and compellingly that when she visited Ms. Roland's home in 1985 or 1986, she found Ms. Roland "in the kitchen with about three other people. They were all passed out. [Roland] was in a play pen. And he had feces all over his face and forehead." (D.E. No. 385, Tr. at 100 (Ms. Carter's Testimony)). Around him were a belt, a syringe, a burned spoon, and alcohol. (Id. ). She also recalled that when Roland visited Ms. Carter's parents' house, Roland was "always hungry," and she would sometimes see him eating "in the dark." (Id. at 111–13). And when Roland was 6 or 7 years old, he tried to eat canned dog food with a spoon, but Ms. Carter stopped him. (Id. ). Ms. Carter noted that the can had a picture of a dog on it. (Id. ).
The children were ultimately removed from Ms. Roland's custody. (See Def. Ex. 4b at 156). Roland and his brothers, Amin and Larry, were placed with their maternal aunt, Lethia Thomas, and her husband, Winston Thomas. (Id. ). But Ms. Thomas, who was legally blind and had three children of her own, continued to neglect Roland. DYFS records report that Ms. Thomas "has no commitment to these children," "is very quick to give away all responsibility for them," "often doesn't know where they are," "refused to participate in parenting techniques and displayed no real commitment," provided poor living conditions for the children, and treated the children as an "after-thought." (Id. at 184–87). DYFS records also note that "there is no food in the home for children" and that Ms. Thomas failed to register Roland in school. (See Def. Ex. 4a at 66).
Roland's younger sister, Sarita, was placed with her paternal grandmother. (Dr. Hunter's Report at 5). His youngest brother was removed from Ms. Roland's custody at birth (due to his testing positive for exposure to substances) and placed in foster care. (Id. ).
(See also Def. Ex. 4a at 66 (noting that Ms. Thomas's "home is dirty; the ceiling is falling in the bathroom and the bathroom sink is off the wall"); Def. Ex. 4b at 115–20 (requesting additional beds because the home did not have appropriate furniture); id. at 159 (Ms. Thomas refusing DYFS caseworker's offer to take Roland to doctor with the rest of the children because Roland "did not have any clothes and could not go")).
Abuse. The record is replete with Roland's abuse at the hands of Mr. and Ms. Thomas. In May 1997, when Roland was 12, a hospital social worker reported to DYFS that Roland had "3 scratch marks on the left side of his face, a mark under his left eye, and a deep scratch mark in the center of his throat caused by his aunt hitting him." (Def. Ex. 4b at 64). He "was also beaten by his uncle and had marks on his legs from the beating." (Id. ). The social worker reported that "her major concern is that [Roland's] aunt hit him in the head with a stick and she is concerned that if things get out of hand that [Roland] may be seriously hurt." (Id. ). In June 2001, at age 16, Roland was admitted to the hospital for a scalp laceration after Mr. Thomas struck him in the head with a stick. (See Def. Ex. 12d at 1–3, 6). When questioned about their abuse, Mr. and Ms. Thomas explained to DYFS that "they had [Roland] for 12 years and the child has always been a behavior problem," noting that Roland "had been thrown out of 5 schools in Newark." (Def. Ex. 4a at 61). They clarified, however, "that [Roland] was not a problem in the home[,] just in the community and when he's at school." (Id. ).
(Def. Ex. 4a at 61 (May 28, 1997 DYFS records noting that "Mr. Winston Thomas stated that a week ago he did strike [Roland] one time on the leg when he caught him setting a fire in their basement" and that "Ms. Thomas stated this was the second fire [Roland] started")).
(See also Def. Ex. 4a at 60 (DYFS records noting that Roland "disclosed that on 5/28/97 he was given $2.25 to go purchase cigarettes for his aunt. He put the money in his jean pocket and forgot the pocket had a hole in them. He lost the money [and] his aunt became angry and hit [Roland] with her hands"); Dr. Greenspan's Report at 21 (reporting Mr. Robinson's observation that Mr. Thomas was very abusive to Roland; that the beatings and Roland's screams could be heard from the streets; and that "[o]ther children thought it was funny, and told Winston that [Roland] did things he didn't do (like stealing a bicycle) just so they could hear the beatings")).
Academic Difficulties. Roland's academic record is one of overwhelming failure. Ms. Carter testified that Roland was slower in his developmental years than his peers, including slower to walk, talk, and read. (D.E. No. 385, Tr. at 107–11 (Ms. Carter's Testimony)). For example, at age 8, Roland could not read simple words like "kick" or "jump." (Id. at 111). Ms. Carter recounted that when Roland played video games with her son, who is younger than Roland, her son would have to instruct Roland on what to do, since Roland did not know how to read the words that appeared on the screen. (Id. ). School records similarly reveal that Roland was delayed in acquiring skills appropriate for a child his age. In 1993, at age 9, Roland was placed in special-education classes with a designation of emotionally disturbed. (See Def. Ex. 15a). His special education teacher, Delores Lemon–Gresham—who remembers Roland because of his bad hygiene—believes that Roland had deficits other than emotional disturbance, including that, by age 11, Roland still could not read. (D.E. No. 384, Tr. at 175–76 (Ms. Gresham's Testimony)).
In 1998, at approximately 13 years old, Roland began to get in trouble with the law. (See Def. Ex. 9a at 10–13 (summarizing Roland's arrest and court history)). Hence, for parts of 1998 through 2002, he was confined at a youth detention center. (See generally Def. Ex. 9a; see also D.E. No. 384, Tr. at 64–91 (Cap. Thomas's Testimony)). During this time, he attended Sojourn High School at the Essex County Juvenile Detention Center ("Sojourn"). (See Def. Ex. 15e). In March 1999, at 14, Roland took the New Jersey Grade 8 Proficiency Assessment Individual Student Report, scoring in the lowest of three categories with a 142 ("Partially Proficient") in Language Arts Literacy and a 150 ("Partially Proficient") in Mathematics. (See Def. Ex. 15e at 8; Dr. Greenspan's Report at 8–9 (specifying that Partially Proficient requires a score below 200, Proficient requires a score between 200 and 250, and Advanced Proficient requires a score above 250)). Kathleen Bohm, Roland's teacher at Sojourn, testified that Roland obtained these low scores after she had been working with him for some time, reflecting that he was "[s]till extremely limited." (D.E. No. 384, Tr. at 17 (Ms. Bohm's Testimony)).
A Sojourn report card for the 2001 through 2002 school year (when Roland was 17) lists Roland in grade 9, indicating that he was left back two grades. (See Def. Ex. 15e at 1; see also Dr. Greenspan's Report at 9). During his time at Sojourn, Roland did receive some As and Bs on his report cards. (See Def. Ex. 15e at 1, 4–7). Ms. Bohm testified, however, that in 1998, 1999, and 2001, grades at Sojourn were not based on aptitude; they were instead based primarily on participation and completing tasks. (D.E. No. 384, Tr. at 31–35 (Ms. Bohm's Testimony)). Moreover, an April 12, 2002 IEP from the Juvenile Justice Commission states that Roland "is achieving significantly below grade level." (See Def. Ex. 15b at 16). The IEP further notes that Roland "exhibits Oppositional Defiant Disorder as well as Attention–Deficit/Hyperactivity Disorder which is part of an earlier diagnoses he received in 1996. He requires frequent re-direction." (Id. at 5).
On September 7, 2001, at 17 years old, Roland took the Test of Adult Basic Education ("TABE") at Sojourn, scoring a grade-equivalent of 3.5 in reading, 3.3 in applied mathematics, 3.6 in total mathematics, 2.2 in language, and 0.3 in spelling. (See Def. Ex. 15e at 2–3). A month later, in October 2001, Roland took the New Jersey Grade 11 High School Proficiency Test, in which he failed both the reading and the writing sections, but passed the math section by one point. (See Def. Ex. 15e at 9; Dr. Greenspan's Report at 9). Roland took another TABE on December 18, 2001, at age 17, scoring a grade-equivalent of 4.4 in reading, 3.6 in total math, 2.6 in language, and 4.7 in spelling. (See Def. Ex. 15b at 5 (IEP listing Roland's results)).
Roland misspelled his last name on the September 2001 TABE scoresheet as "Ralrand FArad." (See Def. Ex. 15e at 2). When reviewing Roland's scoresheet at the hearing, Ms. Bohm testified that misspelling his name was "typical of Mr. Roland." (D.E. No. 384, Tr. at 12 (Ms. Bohm's Testimony)).
The New Jersey Department of Corrections administered several TABEs to Roland from 2005 through 2011: (i) in July 2005, at age 20, Roland scored a grade-equivalent of 6.6 in reading, 2.4 in language, 4.1 in spelling, 6.4 in mathematics computation, 2.1 in applied mathematics, and 4.2 in total mathematics; (ii) in August 2006, at age 22, Roland scored a grade-equivalent of 7.6 in reading, 2.3 in language, .0 in spelling, 5.0 in mathematics computation, 5.4 in applied mathematics, and 5.2 in total mathematics; and (iii) in March 2011, at age 26, following almost three years of incarceration, Roland scored a grade-equivalent of 6.4 in reading, 3.5 in language, 12.5 in spelling, 10.0 in mathematics computation, 6.0 in applied mathematics, and 7.8 in total mathematics. (See Gov. Ex. 113 at 5; see also Gov. Ex. 353 (compiling chart of Roland's TABE results over time)).
Reported Cognitive Difficulties. Roland's Juvenile Justice Commission records further illustrate his difficulties. On January 2, 2002 (at age 17), while Roland was in the custody of the Juvenile Justice Commission, Dena Farber, Ph.D., CAC, conducted a psychological screening that included administering a KBIT ("2002 KBIT") and a clinical interview. (See Def. Ex. 9a at 55–59, 100). Roland received a Composite IQ score of 70, plus or minus 7; a Vocabulary IQ score of 75, plus or minus 8; and a Matrices IQ score of 69, plus or minus 9, on the 2002 KBIT. (See id. at 58). Dr. Farber documented that Roland was "cooperative with this interviewer and the interview processes" and noted that the "[e]valuation reveals a young man who has very poor judgment and little insight into his behaviors." (Id. at 59). Elsewhere, Dr. Farber wrote that the "[r]esults are likely indicative of limited schooling rather than actual cognitive functioning." (Id. at 58). Dr. Farber's assessment further includes her impressions that Roland "has trouble controlling and modulating the expression/feeling of anger," that "[h]is response to insults verbal/physical-from others is to lash out in kind. The rush of emotion feeling overwhelms him and thus interferes with any ability to plan ahead and [foresee] the likely consequences of his actions." (Id. at 59).
Dr. Farber's impression of Roland as cooperative during the administration of the 2002 KBIT refutes Dr. Denney's contention that, "[g]iven [Roland's] documented oppositional and defiant behavior during this very same time frame, there is little reason for confidence [that] these KBIT scores were an accurate reflection of his genuine intellectual abilities" (see Dr. Denney's Report at 4).
The next day, on January 3, 2002 (at age 17), Lynne Gavan, CADC, conducted a comprehensive substance-abuse assessment on Roland. (Id. at 84–86). As part of her evaluation, Ms. Gavan administered to Roland a substance-abuse screening test (the "SASSI–A2"). (Id. at 84). Ms. Gavan also documented that Roland "was cooperative with the interview process" and noted that Roland "appeared to be somewhat cognitively limited which may account for the high defensive score in the SASSI–A2 results. Many of the questions had to be explained to [him]." (Id. at 85). Ms. Gavan's additional impressions include that Roland's score on one component of the SASSI–A2 "suggests a lack of insight and awareness," and his score on a different component of the test "indicates a high risk of Acting Out behavior when combined with inadequate adult supervision, poor impulse control and poor anger management techniques." (Id. at 84–85). Ms. Gavan noted a second time that Roland "appears to have poor insight and somewhat limited cognitive ability." (Id. at 85).
Exposure to Risk Factors. Defense and Government experts here agree that exposure to risk factors can inhibit brain growth and that many risk factors were present in Roland's life history.
(See D.E. No. 416, Tr. at 107–08 (Dr. Marcopulos testifying that risk factors could "negatively impact neuro development" and that "things can happen during childhood and adolescence that would either" promote or inhibit healthy brain growth); D.E. No. 386, Tr. at 192 (Dr. Hunter: "There are a number of contributing factors, both prenatally as well as post natally that can also contribute to and essentially establish a cognitive disability."); D.E. No. 422, Tr. at 65–66 (Dr. Bigler testifying that ID "is a brain disorder fundamentally" and "multiple factors" "could potentially influence the brain")).
Dr. Hunter explained in his report that
it is worth noting that in Mr. Roland's case, there are several risk factors that have been identified for his neurodevelopmental delays. These include his likely exposure to illicit substances and alcohol throughout his mother's pregnancy with him, his experience of poor attachments and neglect across his infancy and childhood, the extreme level of poverty he lived in throughout his childhood and adolescence, the record of likely malnutrition he experienced, and the substandard educational instruction and support he was provided. The intersection of these traumatic and significantly abusive experiences are understood to impact neuropsychological and behavioral development substantially, and contribute as well to ongoing vulnerabilities to increased impact of additional stressors, like violence and aggression, overtime.
Pertinent to the above discussed factors associated with potential etiology, Mr. Roland does have a reported and documented history of head traumas during his childhood and into adulthood, that have likely served to enhance and increased the impact of his ID on his ongoing development of adaptive and behavioral functioning. Mr. Roland described, and the medical records supported, trauma to his head secondary to physical abuse by his uncle. He reported a head injury, where he "saw stars" and had some bleeding of the scalp, in conjunction with a school bus accident when he was 8 or 9 years of age.... These reported head traumas are likely additional factors contributing to his sustained challenges with aspects of cognitive development; as noted previously, repeated head injuries can serve to complicate and exacerbate already significant deficits in cognitive functioning.
(Dr. Hunter's Report at 7–8; see also D.E. No. 386, Tr. at 193–94 (Dr. Hunter's Testimony)).
Dr. Greenspan testified at length about evidence of risk factors in Roland's history. (See D.E. No. 408, Tr. at 78–100 (Dr. Greenspan's Testimony)). Dr. Greenspan began by outlining the three biggest risk factors for ID: prenatal exposure to alcohol, malnutrition, and lack of parental stimulation. (Id. at 79–81). He then explained that these risk factors can be additive or duplicative, meaning "the more risk factors the more likely that it would result in ID." (Id. at 79). Dr. Greenspan ultimately concluded that "[h]ere you have a child or a person with multiple risk factors, and the three biggest that are known to cause ID. Put those together you have a kid who is very much at risk for ID." (Id. at 81; see also Dr. Greenspan's Report at 26 (discussing Roland's exposure to risk factors)).
At the hearing, Dr. Morgan also recognized Roland's exposure to a "plethora" of risk factors:
There are numerous risk factors for ID. And based on Mr. Roland's history, he actually has many of them. He came from an impoverished background. He had almost no parental support. He had intermittent school attendance. His parents were addicted to drugs and died when he was young. They died of AIDS. Mr. Roland had gone from one living situation to another, from [f]amily to family, and moved around a lot. There were numerous interventions from [DYFS]. There were problems with just basic, normal care in the home. Basic normal hygiene. Nutrition. There were a plethora of factors that were at risk for Mr. Roland.
(D.E. No. 414, Tr. at 9–10 (Dr. Morgan's Testimony)).
Dr. Marcopulos likewise testified that there was evidence in the record of Roland's exposure to several risk factors, including trauma, neglect, malnutrition, poverty, parental alcohol and drug use, and abandonment. (D.E. No. 416, Tr. at 107–10 (Dr. Marcopulos's Testimony)). Unlike Drs. Hunter and Greenspan, the Government's experts did not address Roland's exposure to risk factors in their written reports. But both Drs. Morgan and Marcopulos testified that they nevertheless considered them in their analyses. (See D.E. No. 414, Tr. at 10–11 (Dr. Morgan's Testimony); D.E. No. 416, Tr. at 109–10 (Dr. Marcopulos's Testimony)).
Social Security Administration Records. The SSA determined that Roland was "learning disabled" in 1996, when he was 11 years old. (D.E. No. 384, Tr. at 128–131 (Ms. Bruckner's Testimony); Def. Ex. 17 at 8). As a result of this determination, Roland received Supplemental Security Income ("SSI") benefit payments beginning on January 17, 1996. (See D.E. No. 384, Tr. at 108, 125, 130–134 (Ms. Bruckner's Testimony); Def. Ex. 17 at 5, 8). The SSA conducted a reevaluation (known as a "continuing disability review") of its learning-disability determination in 1999 (when Roland was 14) and determined instead that Roland was "mentally retarded" (or "MR"). (D.E. No. 384, Tr. at 95–96, 108–109 (Ms. Bruckner's Testimony); Def. Ex. 17 at 7).
The SSA does not disburse SSI payments if a claimant is incarcerated. (D.E. No. 384, Tr. at 134 (Ms. Bruckner's Testimony)).
The Court finds Roland's SSA records extremely probative, credible, and informative to the Court's analysis of all three prongs of the ID definition. Although most of Roland's SSA records were "purged" (i.e., destroyed) in the normal course of business, the remaining incontrovertible records confirm that Roland received disability payments based upon a determination that he was "mentally retarded." (D.E. No. 384, Tr. at 114, 134 (Ms. Bruckner's Testimony)). The Court heard testimony from two witnesses employed at the SSA who authenticated the documents and explained the SSA's disability-determination process: Melissa Bruckner, an SSA employee at the New York Regional Office's Center for Disability and Program Support; and Herman Huber, a clinical psychologist employed as a psychological consultant at the SSA's Division of Disability Determination Services for over 30 years. (See generally id.; D.E. No. 423 (Dr. Huber's Testimony)). Having had the opportunity to consider these witnesses' testimony and demeanor at the hearing, the Court finds both witnesses to be knowledgeable about the SSA's procedures, clear, straightforward, and credible. For that reason, the Court relies heavily on their uncontested testimony, which is summarized below.
A change in diagnosis from learning disabled to MR is made by a medical consultant. (D.E. No. 423, Tr. at 33 (Dr. Huber's Testimony)). In Roland's case, Dr. Huber was the medical consultant who changed Roland's learning-disability designation to MR. (D.E. No. 384, Tr. at 113–114, 143, 162 (Ms. Bruckner's Testimony); D.E. No. 423, Tr. at 18 (Dr. Huber's Testimony); see also Def. Ex. 17 at 7).
Dr. Huber testified that the SSA conducts a "global" assessment of the child and considers both IQ scores and adaptive functioning. (D.E. No. 423, Tr. at 26 (Dr. Huber's Testimony)). To assess intellectual functioning, Dr. Huber testified that it was his practice to request an IQ test for "nearly all" applicants suspected of having a potential ID. (Id. at 23). He described IQ tests as "central to the evaluation if there is an allegation of intellectual disability or mental retardation." (Id. ). The only time Dr. Huber would not request an IQ test was when "you have a claimant who is so disabled, so limited, that IQ testing isn't even possible. In that case you wouldn't require it because it couldn't be done." (Id. at 24). For cases involving mild ID, on the other hand, IQ scores would be part of the determination "[v]irtually all the time." (Id. ).
(D.E. No. 423, Tr. at 26 (Dr. Huber testifying that the SSA looks "to see what the IQ scores are currently" and "to assess adaptive functioning ... to see if that indicates that the child is still functioning with a learning disability, or is it more accurately labeled mental retardation")).
The SSA's disability determination was also based on the claimant's adaptive functioning. (Id. at 26–28). "[W]ith a diagnosis of mental retardation," the SSA is "looking at a child whose cognitive abilities are limited across the board, generally, in all spheres." (Id. at 27). Specifically, the SSA is "looking at various contexts in which the child operates [e.g.,] school, home, outside of the home, to see whether the functioning is consistent in all the domains that the child functions in." (Id. at 26–27).
Although Dr. Huber had no independent recollection of conducting Roland's continuing-disability review, three points from his testimony are particularly noteworthy. First , Dr. Huber testified that he would not diagnose a claimant as fitting the criteria for ID if he did not believe it to be the case after a thorough review of the claimant's records. (See id. at 20, 26–28, 30–32). Second , Dr. Huber stated that if he thought an application was incomplete, he would request additional records, testing, or both. (See id. ). Third , Dr. Huber testified that he would not change a diagnosis from learning disabled to ID based purely on statements from a child's parents or guardians. (Id. at 39).
The Government disputes the probity of these records, arguing that Roland "presented no evidence that an IQ test" or "any adaptive functioning tests ... were administered by the SSA." (Gov. Post–Hearing Opp. ¶¶ 112–15). The Government is correct that the records do not reveal any specific tests administered to Roland when he was 14. But the records do reveal that a "consultative examiner" administered at least one test to Roland. (D.E. No. 384, Tr. at 117, 120–21 (Ms. Bruckner's Testimony); see also Def. Ex. 17 at 7). While this last point does not directly refute the Government's argument, it does reinforce Dr. Huber's statement (and thus enhances his credibility) that he would not base an ID diagnosis purely on statements from a child's parents or guardians.
See Ybarra , 869 F.3d at 1026 ("We note that requiring individuals to provide formal test scores from their developmental period would likely 'creat[e] an unacceptable risk that persons with intellectual disability will be executed' because not everyone who is intellectually disabled receives formal testing at a young age.") (citing Hall , 134 S.Ct. at 1990 ).
A "consultative examiner" is an independent psychologist or medical doctor hired to conduct specific tests requested by the SSA as part of the SSA's initial or continuing-disability review. (D.E. No. 384, Tr. at 117, 120–21 (Ms. Bruckner's Testimony)).
In light of the above, the Court finds that Roland's SSA records are probative and reliable, and the Court will give them considerable weight in its analysis.
Current Allegations. On June 5, 2013, Roland was charged in a Second Superseding Indictment, alleging (among other things) six counts of Murder in Aid of Racketeering in violation of 18 U.S.C. § 1959, five of which have been authorized by the Attorney General of the United States for a sentence of death. (See D.E. No. 66, Second Superseding Indictment ("Indictment"); D.E. No. 273, Amended Notice of Intent to Seek the Death Penalty ("Death–Penalty Notice")). Specifically, the Death–Penalty Notice states that "the circumstances of the offenses charged in Counts Three, Five, Six, Seven, and Eight of the Second Superseding Indictment are such that, in the event of a conviction, a sentence of death is justified ...." (Death–Penalty Notice at 1).
Roland's pattern of racketeering activity allegedly spanned from January 2003 through March 2011, when Roland was 18 through 26 years old. (Indictment at 9–13; 34–40). The first of Roland's death-penalty-eligible murders is alleged to have occurred on December 4, 2003, when Roland was 19 years old. (Id. at 9–10, 34–35; Death–Penalty Notice at 1–2). Roland was arrested on May 17, 2012 (see D.E. dated May 17, 2012; D.E. Nos. 26, 29–30), and is currently confined at the Metropolitan Detention Center ("MDC") in Brooklyn, New York. (See Gov. Ex. 349 at 2; see also D.E. No. 418, Tr. at 41–42 (Dr. Segal testifying that Roland's confinement at MDC began on September 2, 2015)).
IV. DISCUSSION
As instructed by the Supreme Court in Atkins , Hall , and Moore , the Court relies primarily on the professional clinical standards established by the APA and AAIDD in assessing whether Roland is ID. Although those standards do not represent "a constitutional command," the Court frames its analysis of the evidence in terms of those clinical standards.
Hooks v. Workman , 689 F.3d 1148, 1172 (10th Cir. 2012) ; see also Hall , 134 S.Ct. at 2000 ("The legal determination of intellectual disability is distinct from a medical diagnosis, but it is informed by the medical community's diagnostic framework.").
The Court uses the three-prong clinical framework to structure its reasoning and cites particular exhibits or testimony to explain how that evidence factored into its decision. The parties, however, are familiar enough with the extensive factual record that the Court will not reiterate in this already-lengthy Opinion all the evidence that was presented to, and considered by, the Court at each stage. Suffice it to say, the Court recognizes the stakes and seriousness of the Atkins issues and has attempted to address each major point raised by the parties, even if some of the evidence is not discussed at length.
A. Prong One: Deficits in Intellectual Functioning
To prevail on the first prong of his ID claim, Roland must prove, by a preponderance of the evidence, that he displays "significantly subaverage intellectual functioning." See AAIDD–11 at 6; DSM–5 at 33; Hall , 134 S.Ct. at 1994 (noting that the first criterion of ID is "significantly subaverage intellectual functioning").
i. Definitional Standards
For the AAIDD, "intellectual functioning ... includes reasoning, planning, solving problems, thinking abstractly, comprehending complex ideas, learning quickly, and learning from experience." AAIDD–11 at 31. For Prong One, the APA likewise "refers to intellectual functions that involve reasoning, problem solving, planning, abstract thinking, judgment, learning from instruction and experience, and practical understanding. Critical components include verbal comprehension, working memory, perceptual reasoning, quantitative reasoning, abstract thought, and cognitive efficacy." DSM–5 at 37.
Assessing intellectual functioning, even with the aid of standardized instruments, is an inexact science. See AAIDD–11 at 31. Nevertheless, IQ tests are the best available tools for measuring intellectual functioning. Id. Accordingly, both the AAIDD and the APA frame Prong One of ID in terms of IQ scores. In this regard, the APA describes Prong One in part as follows:
Intellectual functioning is typically measured with individually administered and psychometrically valid, comprehensive, culturally appropriate, psychometrically sound tests of intelligence. Individuals with intellectual disability have scores of approximately two standard deviations or more below the population mean, including a margin for measurement error (generally +5 points). On tests with a standard deviation of 15 and a mean of 100, this involves a score of 65–75 (70 ± 5).
DSM–5 at 37.
The AAIDD Manual similarly provides:
The "significant limitations in intellectual functioning" criterion for a diagnosis of intellectual disability is an IQ score that is approximately two standard deviations below the mean , considering the standard error of measurement for the specific instruments used and the instruments' strengths and limitations.
AAIDD–11 at 31. The AAIDD emphasizes that the "intent of this definition is not to specify a hard and fast cutoff point/score for meeting the significant limitations in intellectual functioning criteria of ID." Id. at 35; see also DSM–5 at 37 ("IQ test scores are approximations of conceptual functioning but may be insufficient to assess reasoning in real-life situations and mastery of practical tasks.").
Again, the guidelines stress the importance of clinical judgment in interpreting IQ test scores. See AAIDD–11 at 35 ("The use of 'approximately' reflects the role of clinical judgment in weighing the factors that contribute to the validity and precision of a decision."); DSM–5 at 37 ("Clinical training and judgment are required to interpret test results and assess intellectual performance."); (see also supra at 477-78).
"In the Atkins context, the Court must examine the reliability and validity of IQ scores, and consider the credibility of witnesses that proffer expert opinions on those scores." Montgomery , 2014 WL 1516147, at *26 (quoting Salad , 959 F.Supp.2d at 871 ); see also Hardy , 762 F.Supp.2d at 883 (noting that "as the degree to which a matter is left to an individual clinician's judgment increases, so does the degree to which the Court must rely on its assessment of the relative competence and credibility of the individual experts before it to resolve disputes between them").
ii. Measuring Intellectual Functioning
1. Standard Error of Measurement and Confidence Intervals
Standard Error of Measurement. One factor that must be considered in the interpretation of a person's IQ score is the standard error of measurement ("SEM"). "An IQ score is subject to variability as a function of a number of potential sources of error, including variations in test performance, examiner's behavior, cooperation of test taker, and other personal and environmental factors." AAIDD–11 at 36; see also Hall , 134 S.Ct. at 1995 ("An individual's IQ test score on any given exam may fluctuate for a variety of reasons. These include the test-taker's health; practice from earlier tests; the environment or location of the test; the examiner's demeanor; the subjective judgment involved in scoring certain questions on the exam; and simple lucky guessing.") (citing User's Guide at 22). So, the SEM, "which varies by test, subgroup, and age group, is used to quantify this variability and provide a stated statistical confidence interval within which the person's true score falls." AAIDD–11 at 36.
The following exchange from Dr. Hunter's testimony is relevant to some of the variabilities that may have affected Roland's scores:
THE WITNESS: Standard error of measurement is actually a way of understanding how the variability that occurs when we take tests can be made accountable. We [c]all it error because it is something that we can't control, and so for every measure that we administer, it is understood that there will be variabilities in how a person approaches different tasks that have to do with environment. Their understanding of the question at that moment, maybe. These are things that we are trying to take into account, but also there are things about, just the situation with testing, that are harder to control. The standard of measurement that allows, you understand there is a range of variability, a range of possible scores, that will fall in any given time that one takes that test.
THE COURT: So the standard error of measurement is external factors out of your control. I understand Friday you were able to interview Mr. Roland, it was a quiet setting. Then on Saturday it was visitation so there was a lot of interruption. Does it account for that in terms of, you know, the external factors, whether it be, you know, crowded and loud facilities, or jailhouse conditions that may exist for Mr. Roland before he walks into the examination. Things like that.
THE WITNESS: Without a doubt, yes. One of the things that is really important in testing are our observations of how the day goes for the individual. This is a long day. There is a lot that is being asked of them. So the key here is to recognize, even in just doing one test, we recognize there are external factors that might apply which is why then ultimately we then work to account for that, and that is exactly what the standard error allows us to do.
(D.E. No. 387, Tr. at 3–4 (Dr. Hunter's Testimony)).
The Supreme Court "instructs that, where an IQ score is close to, but above, 70, courts must account for the test's standard error of measurement." Moore , 137 S.Ct. at 1049. "The SEM reflects the reality that an individual's intellectual functioning cannot be reduced to a single numerical score." Hall , 134 S.Ct. at 1995 ; see also Moore , 137 S.Ct. at 1049 (citing the User's Guide at 22–23). As the Supreme Court explained, the SEM is "a statistical fact, a reflection of the inherent imprecision of the test itself." Moore , 137 S.Ct. at 1049. "For purposes of most IQ tests, this imprecision in the testing instrument means that an individual's score is best understood as a range of scores on either side of the recorded score within which one may say an individual's true IQ score lies." Id. (explaining that Moore's score of 74, adjusted for the SEM, yields a range of 69 to 79); see also DSM–5 at 37 (indicating that the SEM is generally a five-point range). "[To] ignore[ ] the inherent imprecision of these tests risks execut[ion] of a person who suffers from intellectual disability." Hall , 134 S.Ct. at 2001 ; see also Brumfield , 135 S.Ct. at 2278 (finding unreasonable a state court's conclusion that a score of 75 precluded an intellectual-disability finding).
See also AAIDD–11 at 36 ("Reporting an IQ score with an associated confidence interval is a critical consideration underlying the appropriate use of intelligence tests and best practices; such reporting must be part of any decision concerning the diagnosis of ID.").
Confidence Intervals. Much of the hearing focused on the relationship between the SEM and confidence intervals. (See, e.g. , D.E. No. 423, Tr. at 112 (Dr. McGrew explaining the relationship between the two phenomena)). "The SEM is used to calculate the confidence interval, or the band of scores around the observed score, in which the individual's true score is likely to fall. Confidence intervals express test score precision and serve as reminders that measurement error is inherent in all test scores and that observed test scores are only estimates of true ability." (Def. Ex. 24 ("WAIS–IV Manual") at 46). Dr. McGrew likewise explained that a "person does not obtain a specific IQ score when tested. A person obtains a range of possible IQ test scores with a certain degree of confidence." (Dr. McGrew's Report at 7).
Although the Supreme Court thoroughly discusses the SEM in Hall, it does not indicate whether lower courts must use a 68% confidence interval (defined as IQ test score ± one SEM) or a 95% confidence interval (defined as IQ test score ± two SEMs) to determine the defendant's IQ score range. See AAIDD–11 at 36 (describing difference between one SEM and two SEMs). Rather, the Supreme Court consistently referred to the use of "the SEM" in the singular.
See, e.g. , Hall , 134 S.Ct. at 1995 ("each separate score must be assessed using the SEM"); id. at 1999 ("clinical definitions have long included the SEM"), id. at 2000 ("By failing to take into account the SEM and setting a strict cutoff at 70, Florida goes against the unanimous professional consensus."); see also Wilson , 170 F.Supp.3d at 362–63 (noting that Hall fails to identify a specific confidence interval).
Dr. Morgan explained at the hearing, however, that the "real score for a psychological test is within the 95% confidence interval. That accounts for the potential error inherent in all testing." (D.E. No. 412, Tr. at 157 (Dr. Morgan's Testimony)). Moreover, the parties' experts agree that the 95% confidence interval is to be used for index and Full Scale IQ ("FSIQ") scores. (See D.E. No. 420, Tr. at 91–92 (Dr. Denney's Testimony); D.E. No. 427, Tr. at 10 (Dr. Morgan's Testimony); D.E. No. 387, Tr. at 7, 15 (Dr. Hunter's Testimony); Dr. McGrew's Report at 7). Dr. McGrew also explained:
The concept of error tolerance in measurement and experiments is recognized in most sciences, as well as the need to account for acceptable levels of error when presenting scientific data and evidence. Since IQ tests do not possess perfect reliability, there is a degree of known error in each IQ test score. As per scientific and professional standards, each of Mr. Roland's IQ test scores should be interpreted as a range of scores—bounded by a 95% confidence interval band (+/- 5 IQ score points). The notion of an acceptable error tolerance of 5% (conversely, a 95% confidence interval) has a long history in the sciences, and is grounded in reasoned logic, mathematical and statistical theory, and statistically tractable mathematical quantification of the characteristics of the normal curve.
(Dr. McGrew's Report at 5).
The Court will, therefore, follow the experts' guidance (and the approach set forth in Wilson ) and apply a 95% confidence interval (i.e., two SEMs) in evaluating Roland's IQ scores. See Wilson , 170 F.Supp.3d at 372, 375 (interpreting Hall as requiring a 95% confidence interval and applying same to Wilson's IQ score).
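The difference between the two intervals discussed above can be illustrated with a short calculation. The following is only a sketch: the function name and the SEM value of 2.5 points are assumptions chosen for illustration (so that two SEMs equal the five-point margin the DSM–5 describes as typical), not figures drawn from any test manual in the record. Under that assumption, a recorded score of 74 yields the same 69-to-79 range noted above in connection with Moore.

```python
def confidence_interval(score, sem, n_sems=2):
    """Return the (low, high) band of score +/- n_sems * sem.

    n_sems=1 corresponds to the 68% interval and n_sems=2 to the
    95% interval, as those intervals are defined in the text.
    """
    margin = n_sems * sem
    return (score - margin, score + margin)


# Hypothetical SEM, assumed only so that two SEMs equal five points.
ASSUMED_SEM = 2.5

band_68 = confidence_interval(74, ASSUMED_SEM, n_sems=1)
band_95 = confidence_interval(74, ASSUMED_SEM, n_sems=2)

print("68% interval:", band_68)  # (71.5, 76.5)
print("95% interval:", band_95)  # (69.0, 79.0)
```

The actual SEM varies by test, subgroup, and age group, so the width of the band in any real case depends on the instrument used.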
2. Flynn Effect
The Flynn Effect, named after James R. Flynn, is the phenomenon in which standardized IQ test scores tend to increase with the age of the test (about 0.30 points per year) without a corresponding increase in actual intelligence in the general population. See AAIDD–11 at 37; User's Guide at 23. "That is, individuals tested today on an IQ test normed many years earlier will obtain inflated IQ scores, as the older test norms are obsolete for individuals in contemporary society." (Dr. McGrew's Report at 14). Both the AAIDD and the APA consider the Flynn Effect an important factor in examining IQ scores. See AAIDD–11 at 37; DSM–5 at 37. Flynn suggests, and the AAIDD recommends, a downward departure of IQ scores by 0.3 points per year based on when the IQ test was administered relative to when the IQ test's norms were produced (i.e., a Flynn adjustment or correction). See User's Guide at 20–21, 23; AAIDD–11 at 37, 95–96 (providing the precise calculation of .33 times the number of years that have elapsed from the last time the test was normed until taken by the subject).
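The arithmetic of a Flynn adjustment can be sketched as follows. This is an illustration only: the function name, the score of 80, and the norming and administration years are hypothetical and are not taken from the record. The default rate of 0.3 points per year reflects Flynn's recommendation as described above; the rate is left as a parameter because the AAIDD pages cited above use .33.

```python
def flynn_adjusted(score, year_normed, year_administered, rate=0.3):
    """Subtract `rate` IQ points for each year between norming and testing."""
    years_elapsed = max(0, year_administered - year_normed)
    return score - rate * years_elapsed


# Hypothetical example: a test normed in 2000 and administered in 2010.
adjusted = flynn_adjusted(80, year_normed=2000, year_administered=2010)
print(round(adjusted, 1))  # 77.0
```

A test administered in the same year it was normed receives no adjustment under this calculation.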
District courts, upon consideration of expert testimony, may apply or reject the Flynn Effect. See In re Cathey , 857 F.3d 221, 227 n.33 (5th Cir. 2017) (citing Ledford v. Warden, Georgia Diagnostic & Classification Prison , 818 F.3d 600, 640 (11th Cir. 2016) ); see also Montgomery , 2014 WL 1516147, at *27–28 (declining to apply Flynn-adjustment to defendant's IQ score because it would not have affected the Court's analysis); Hardy , 762 F.Supp.2d at 866 (applying Flynn-adjustment to defendant's IQ score).
Federal courts across the country have recognized the legitimacy of applying the Flynn Effect to IQ scores. See e.g. , Holladay v. Allen , 555 F.3d 1346, 1350 n.4, 1358 (11th Cir. 2009) (holding that the district court was not clearly erroneous in crediting the psychologist that concluded the IQ scores needed to be adjusted for the Flynn Effect); Walker v. True , 399 F.3d 315, 322–23 (4th Cir. 2005) (remanding for an evidentiary hearing in part because the district court "refused to consider relevant evidence, namely the Flynn Effect evidence"); Lewis , 2010 WL 5418901, at *11 ("recognize[ing] the Flynn Effect as a best practice for an intellectual disability determination"); Hardy , 762 F.Supp.2d at 864, 866 ("[T]here is in fact published, peer-reviewed research supporting the existence of the Flynn Effect for the test Hardy took and the IQ range in which his score fell," and "correcting for the Flynn Effect is a 'best practice' in the field and therefore should be done."); Wiley v. Epps , 668 F.Supp.2d 848, 894 (N.D. Miss. 2009) ("The Court finds that regardless of whether the 'Flynn effect' is considered as a precise mathematical formula in this case, it will take into consideration the obsolescence of test norms in weighing the evidence concerning Petitioner's intellectual functioning."); Davis , 611 F.Supp.2d at 488 ("[T]he Court finds the defendant's Flynn effect evidence both relevant and persuasive, and will, as it should, consider the Flynn-adjusted scores in its evaluation of the defendant's intellectual functioning."); Thomas v. Allen , 614 F.Supp.2d 1257, 1278 (N.D. Ala. 2009) ("It also is undisputed that Professor Flynn's recommendation—i.e., 'deduct 0.30 IQ points per year (3 points per decade) to cover the period between the year the test was normed and the year in which the subject took the test'—is a generally accepted adjustment.").
As it turns out, the Court need not delve too deeply into this issue because three of Roland's IQ scores—70, 71, and 75—are within the range of mild ID regardless of a Flynn correction. (See Dr. Hunter's Report at 12 (calculating a FSIQ of 71); Dr. Morgan's Report at 10 (calculating a FSIQ of 75); Def. Ex. 9a at 58 (listing Roland's 2002 KBIT composite IQ score of 70)). Generally, "a full-scale IQ score of 70–75 or lower ordinarily will satisfy the first requirement for a finding of intellectual disability." McManus v. Neal , 779 F.3d 634, 650 (7th Cir. 2015). And although the Supreme Court does not provide explicit guidance on how courts should treat multiple IQ test results, "the facts in Hall require lower courts to consider evidence of adaptive functioning if even one valid IQ test score generates a range that falls to 70 or below." Wilson , 170 F.Supp.3d at 366, 372–75 (relying on one of nine IQ scores to determine that Wilson satisfied Prong One); see also Moore , 137 S.Ct. at 1061 n.1 (noting that Hall "reached no holding as to the evaluation of IQ when an Atkins claimant presents multiple scores"); id. at 1045 & n.4, 1047 (noting that Moore had seven IQ scores—including a 78—and relying on Moore's score of 74).
(See also Dr. Denney's Report at 26 ("In my professional opinion the Flynn Effect related to the data regarding Mr. Roland is irrelevant in this case because Flynn Effect changes to the test scores would not have made a difference in terms of whether the IQ scores were, or were not, in the range of mild ID given the confidence intervals—presuming the scores were a valid reflection of his genuine cognitive functioning.")).
See also Atkins , 536 U.S. at 309 n.5, 122 S.Ct. 2242 ("It is estimated that between 1 and 3 percent of the population has an IQ between 70 and 75 or lower, which is typically considered the cutoff IQ score for the intellectual function prong of the mental retardation definition."); Lewis , 2010 WL 5418901, at *8 ("Thus, a person can still be considered intellectually disabled if his IQ score is between 70 and 75.").
Roland's fourth IQ test, the KBIT–2, also yielded a composite score of 78. (See Dr. Hunter's Report at 16; Dr. McGrew's Report at 6).
The Court will nonetheless recognize the Flynn Effect as a best practice for an ID determination. The AAIDD mandates the application of the Flynn Effect when a clinician administers a test with outdated norms, especially in light of the retrospective diagnosis here. See AAIDD–11 at 95–96; id. at 37 ("[B]est practices require recognition of a potential Flynn Effect when older editions of an intelligence test (with corresponding older norms) are used in the assessment or interpretation of an IQ score."). The DSM–5 likewise recognizes the Flynn Effect as one of the factors that may affect IQ test scores. See DSM–5 at 37. Moreover, Roland's experts posit that the Court should apply a Flynn adjustment. (See Dr. Hunter's Report at 16–17; Dr. McGrew's Report at 13; Dr. Greenspan's Report at 11–12). And Dr. Morgan testified that a Flynn correction would not affect his analysis in this case. (D.E. No. 414, Tr. at 30–31 (Dr. Morgan's Testimony); id. at 162–63 (Dr. Morgan testifying that he did not apply a Flynn-adjustment because Roland's IQ scores would nevertheless be invalid)). Dr. Denney also testified that although the "issue is still unsettled[,] I think it is fair and reasonable to consider the potential effect of Flynn ...." (D.E. No. 420, Tr. at 77 (Dr. Denney's Testimony)). In light of the AAIDD's mandate, the evidence presented by both parties, and other federal courts' practices, this Court will adjust Roland's IQ scores to correct for the Flynn Effect.
(See also Dr. McGrew's Report at 15–16 ("Key contemporary Flynn effect issues bearing on the diagnosis of intellectual disability in the Atkins context were covered in a special 2010 issue of the Journal of Psychoeducational Assessment (JPA). The consensus of almost all authors who contributed to the JPA Flynn effect issue ... was that IQ test norm obsolescence (i.e., the Flynn effect) is an established scientific fact.") (citing authors who contributed to the special issue of the JPA)).
3. IQ Tests: WAIS–IV and KBIT
Expert witnesses for both Roland and the Government described the Wechsler Adult Intelligence Scale, Fourth Edition ("WAIS–IV"), as the "gold standard" in intelligence testing. And federal courts routinely rely on Wechsler IQ test scores in making prong-one determinations. See, e.g. , Montgomery , 2014 WL 1516147, at *26 ; Smith , 790 F.Supp.2d at 501.
(See, e.g., D.E. No. 386, Tr. at 54 (Dr. Hunter testifying that there is very little dispute that the WAIS is the "gold standard" IQ test); D.E. No. 422, Tr. at 178 (Dr. Bigler testifying that WAIS is the "gold standard"); Dr. Morgan's Report at 10 (describing the WAIS–IV as "the state-of-the-art individually administered intelligence test")).
The psychometrics of an IQ test are designed to aggregate data from the item level, to the subtest level, to the index scores, to the FSIQ scores. (Dr. McGrew's Report at 20–22; D.E. No. 412, Tr. at 140 (Dr. Morgan's Testimony); D.E. No. 386, Tr. at 81–82 (Dr. Hunter's Testimony)). "[A]t each successive level of summation or aggregation[,] the reliability (and validity) of the resulting score indices increases." (Dr. McGrew's Report at 20–22). The WAIS–IV measures four indices: Verbal Comprehension Index, Perceptual Reasoning Index, Working Memory Index, and Processing Speed Index. (D.E. No. 412, Tr. at 140 (Dr. Morgan explaining the WAIS–IV indices)). These indices are further broken down into ten subtests and item levels. (Dr. McGrew's Report at 20–22; see also infra at 505-06).
The FSIQ score "is the best approximation of an individual's overall cognitive functioning." Davis , 611 F.Supp.2d at 485 ; Lewis , 2010 WL 5418901, at *10 ("The court considers the full scale IQ score as the best indicator of Prong 1 intellectual functioning."). The parties' experts agree that a FSIQ is the "most reliable measure" of determining intellectual functioning. (D.E. No. 422, Tr. at 69–70 (Dr. Bigler: "The Full Scale IQ score is the most reliable measure."); D.E. No. 423, Tr. at 66 (Dr. McGrew: "Full Scale IQ score is the most robust, reliable and valid score on intelligence tests."); D.E. No. 427, Tr. at 41 (Dr. Morgan agreeing that the composite and FSIQ scores are "the most reliable measures")). In fact, Dr. Morgan testified that "[o]nly the Full Scale IQ is really relevant, and required for the diagnosis, to satisfy prong 1." (D.E. No. 414, Tr. at 148 (Dr. Morgan's Testimony)). The AAIDD Manual also emphasizes reliance on "a global (general factor) IQ as a measure of intellectual functioning." AAIDD–11 at 41.
The WAIS–IV Manual echoes the experts' testimony:
The reliability coefficients for WAIS–IV composite scores are excellent and are generally higher than those of the individual subtests that comprise the composite scores. This difference occurs because each subtest represents only a narrow portion of an individual's entire intellectual functioning, whereas the composite scores summarize the individual's performance on a broader sample of abilities.
(WAIS–IV Manual at 43).
Intellectual functioning can also be assessed by the KBIT. The original KBIT had two subtests: Vocabulary and Matrices. (D.E. No. 414, Tr. at 45 (Dr. Morgan's Testimony)). The most recent version of the test ("KBIT–2") has added a third subtest called "Riddles." (Id. ). The subsections together yield a full-scale composite score. (Id. at 41).
Consistent with the experts' testimony, clinical guidelines, and caselaw, the Court will place significant weight on Roland's FSIQ scores on the WAIS–IV. See User's Guide at 10 (urging clinicians to "[u]se individually administered, standardized instrument(s) that yield a measure of general intellectual functioning"). IQ scores alone, however, are not dispositive of a person's intelligence; as noted above, "one needs to use clinical judgment in interpreting the obtained score." AAIDD–11 at 35.
iii. Roland's IQ Test Performance
Roland has four available IQ test scores. Dr. Hunter administered the WAIS–IV to Roland in November 2016, yielding an IQ score of 71, with a 95% confidence interval range of 68 to 76. (Dr. Hunter's Report at 12; Dr. McGrew's Report at 3–4). Dr. Morgan also administered the WAIS–IV to Roland approximately four months later, in March 2017, yielding an IQ score of 75, with a 95% confidence interval range of 71 to 80. (Dr. Morgan's Report at 3–4). Roland was also administered the 2002 KBIT by Dr. Farber, when he was 17, which resulted in a composite score of 70, with a 95% confidence interval range of 65 to 75. (Def. Ex. 9a at 55, 58; Dr. McGrew's Report at 10). Finally, in March 2017, Dr. Morgan administered the KBIT–2 to Roland, which resulted in a composite score of 78. (Gov. Ex. 354; D.E. No. 414, Tr. at 46, 163 (Dr. Morgan's Testimony)).
Applying a Flynn-adjustment to Roland's IQ scores results in a:
• 68 on Dr. Hunter's test, with a 95% confidence interval range of 63 to 73;
• 72 on Dr. Morgan's test, with a 95% confidence interval range of 67 to 77;
• 69 on the 2002 KBIT, with a 95% confidence interval range of 64 to 74; and
• 74 on the KBIT–2, with a 95% confidence interval range of 69 to 79.
(Dr. McGrew's Report at 6).
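The four Flynn-adjusted ranges listed above can be checked with a brief calculation. The sketch below assumes a symmetric margin of five points on either side of each adjusted score, consistent with the "+/- 5 IQ score points" band described in Dr. McGrew's report; the variable and function names are illustrative only. It also checks whether the low end of each range reaches 70 or below, the threshold quoted above from Wilson.

```python
# Assumed symmetric 95% margin, per the "+/- 5 IQ score points" band.
MARGIN = 5

# (test, Flynn-adjusted score, range as reported in the list above)
reported = [
    ("WAIS-IV, Dr. Hunter", 68, (63, 73)),
    ("WAIS-IV, Dr. Morgan", 72, (67, 77)),
    ("2002 KBIT", 69, (64, 74)),
    ("KBIT-2", 74, (69, 79)),
]


def band(score, margin=MARGIN):
    """Return (low, high) for score +/- margin."""
    return (score - margin, score + margin)


results = []
for label, score, reported_range in reported:
    low, high = band(score)
    matches = (low, high) == reported_range
    reaches_70 = low <= 70
    results.append((label, matches, reaches_70))
    print(f"{label}: {score} -> {low} to {high}; "
          f"matches reported range: {matches}; "
          f"low end at or below 70: {reaches_70}")
```

Under that assumption, each computed band matches the reported range, and the low end of every band is at or below 70.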
As noted earlier, the WAIS–IV measures four indices. A detailed summary of Roland's indices scores is provided in the chart below.
                          Dr. Hunter's Test      Dr. Morgan's Test
WAIS-IV Indices           Sum    SS    %ile      Sum    SS    %ile
VERBAL COMPREHENSION      17     76    5         17     76    5
PERCEPTUAL REASONING      15     71    3         20     81    10
WORKING MEMORY            15     86    18        16     89    23
PROCESSING SPEED          9      71    3         10     74    4
FULL SCALE IQ             56     71    3         63     75    5
(See generally Dr. Morgan's Report; Gov. Ex. 195, Score Comparison Chart; Dr. Hunter's Report; Def. Ex. 19E, Dr. Hunter's Score Summary Sheet).
These indices comprise additional subtests. (See generally Dr. Morgan's Report; Dr. Hunter's Report). The results of Roland's subtests are summarized in the following chart.
                               Dr. Hunter's Test    Dr. Morgan's Test
WAIS-IV Subtests               Raw    Scaled        Raw    Scaled
VERBAL COMPREHENSION
  Similarities                 13     4             14     5
  Vocabulary                   21     6             26     7
  Information                  8      7             5      5
  (Comprehension)              19     8             n/a    n/a
PERCEPTUAL REASONING
  Block Design                 20     5             20     5
  Matrix Reasoning             8      4             17     9
  Visual Puzzles               8      6             9      6
  (Figure Weights)             9      6             n/a    n/a
  (Picture Completion)         8      6             n/a    n/a
WORKING MEMORY
  Digit Span                   22     7             21     6
  Arithmetic                   12     8             14     10
  (Letter-Number Sequencing)   15     7             n/a    n/a
PROCESSING SPEED
  Symbol Search                18     5             25     7
  Coding                       34     4             29     3
  (Cancellation)               36     9             n/a    n/a
(See generally Dr. Morgan's Report; Gov. Ex. 195, Score Comparison Chart; Dr. Hunter's Report; Def. Ex. 19E, Dr. Hunter's Score Summary Sheet).
There is no statistically significant difference in the FSIQ scores between Dr. Hunter's and Dr. Morgan's testing. (Dr. Hunter's Report at 2, 7, 16; Dr. McGrew's Report at 8–9; D.E. No. 427, Tr. at 17 (Dr. Morgan's Testimony); Dr. Denney's Report at 13–14). There is also no statistically significant difference in the index scores between Dr. Hunter's and Dr. Morgan's testing. (Dr. McGrew's Report at 10–12; D.E. No. 427, Tr. at 40–41 (Dr. Morgan's Testimony); Dr. Denney's Report at 13–14; D.E. No. 418, Tr. at 185–86 (Dr. Denney: "The reality is the test protocol publishers scoring program on serial WAIS–4 assessment reveals the indices are virtually the same. They are consistent.")). And the two KBIT composite IQ scores are also within the SEM confidence band values of each other and the two WAIS–IV tests. (Dr. McGrew's Report at 8). Moreover, only one of ten WAIS–IV subtests changed significantly from Dr. Hunter's testing to Dr. Morgan's testing. (D.E. No. 427, Tr. at 9 (Dr. Morgan's Testimony); Dr. Denney's Report at 13–14; see also D.E. No. 418, Tr. at 185–86 (Dr. Denney testifying that the "subtests are virtually consistent in total score performance except for matrix reasoning")).
iv. Roland's RBANS Test Performance
As part of their test battery, Drs. Hunter and Morgan also administered the Repeatable Battery for Assessment of Neuropsychological Status ("RBANS"). The RBANS "is a grouping of tests that assess fundamental or basic neuropsychological functioning." (D.E. No. 412, Tr. at 138 (Dr. Morgan's Testimony)). Below is a summary of Roland's RBANS scores.
RBANS Indices                    Dr. Hunter's Test     Dr. Morgan's Test
                                 Raw      SS           Raw      SS
IMMEDIATE MEMORY                 38       76           34       69
VISUOSPATIAL/CONSTRUCTIONAL      12       56           26       66
LANGUAGE                         26       68           21       51
ATTENTION                        45       64           27       53
DELAYED MEMORY                   33       74           32       44
TOTAL                            338      59           283      50
(See generally Dr. Morgan's Report; Gov. Ex. 195, Score Comparison Chart; Dr. Hunter's Report; Def. Ex. 19E, Dr. Hunter's Score Summary Sheet).
v. Assessing Test Validity
1. Methods of Assessing Validity
Effort Testing. Both parties' experts indicated that effort testing is an important component of accurately measuring a person's IQ. Dr. Morgan submitted an affidavit to this Court attesting to the significance of employing multiple effort tests. (D.E. No. 289–1 ("Dr. Morgan's Aff.") at 5). And Dr. Hunter testified that effort testing "is an important component to understanding the effort and consistency of effort, and the motivation that individuals present during testing." (D.E. No. 386, Tr. at 202–03 (Dr. Hunter's Testimony)).
Drs. Hunter and Morgan each administered multiple effort tests to Roland. The parties' experts agree that Roland passed all of these tests. For Roland, the analysis ends here. He argues that the "accepted clinical standard is to accept the results of IQ and neuropsychological testing when there are passed validity scores." (Def. Post–Hearing Submission ¶¶ 136, 649).
(D.E. No. 414, Tr. at 14–15 (Dr. Morgan: "So Mr. Roland was, as we know, administered performance validity tests by both Dr. Hunter and me. I also administered symptom validity tests. By and large, Mr. Roland passed those formal tests of validity and effort."); D.E. No. 422, Tr. at 57–62 (Dr. Bigler explaining that Roland passed all effort tests administered by Drs. Hunter and Morgan)).
In support of his position, Roland relies on the following exchange from Dr. Bigler's testimony at the hearing:
Q. What is the clinical standard about accepting the results of neuropsychological testing? A. Yes. The neuropsychological tests have been done in the presence of what appear to be tasks, symptom performance validity effort measures, and the embedded aspects have been tasked, and there are ways of describing or demonstrating why any discrepancy or any aspects of the testing might be interpreted differently, but if there is a way to explain that, you accept the findings. You accept that this is a valid evaluation.
(D.E. No. 422, Tr. at 45 (Dr. Bigler's Testimony)).
Subtest–Score Comparisons. Dr. Morgan also recognized that administering effort tests is a "type[ ] of validity assessment," and failure of these effort tests is a sign of test invalidity. (Dr. Morgan's Report at 8–9; D.E. No. 414, Tr. at 11 (Dr. Morgan stating that "performance validity tests and symptom validity test[s]" are "another way to analyze validity")). He noted in his report that Roland's passing of these effort tests "[o]rdinarily ... would suggest that the test data ... are valid and a credible representation of his cognitive function." (Dr. Morgan's Report at 9). But, because this case involves secondary gain, Dr. Morgan applied another methodology: comparing subtest scores between tests. (Gov. Opp. Br. at 11; D.E. No. 414, Tr. at 11 (Dr. Morgan: "This particular method, in looking at how one's score relates to the known practice effect on any given test is an assessment analysis of validity on the individual's test taking performance.")).
The Government's contrary position is grounded in the difference between primary and secondary gain. (See Gov. Opp. Br. at 10–12; Dr. Morgan's Report at 7–8). In the primary-gain context, "patients see doctors in order to get an accurate diagnosis and treatment for their condition. They seek examinations for treatment and relief from symptoms. They want to get better." (Dr. Morgan's Report at 8). "Unlike primary gain contexts, secondary gain contexts do not involve genuine clinical concern on the part of the examinee. Rather, the examinee seeks an assessment to benefit or gain in some way from the examination." (Id. ). For Dr. Morgan, the "present criminal context arguably represents the most critical of all secondary gain contexts, the death penalty." (Id. ).
In opposing Roland's Atkins motion, the Government has implied that because this is a forensic setting, the Court should apply a somewhat heightened standard for assessing Roland's ID claim. Indeed, Dr. Morgan applied the instant methodology specifically because this is a forensic setting (i.e., a secondary-gain context). (See Gov. Opp. Br. at 12 ("A consideration of test results, and the possibility of feigned effort, is particularly important with respect to an intellectual disability determination under Atkins. This is true because ... this is a 'secondary gain' context."); Dr. Morgan's Report at 8 ("The present criminal context arguably represents the most critical of all secondary gain contexts, the death penalty."); see also D.E. No. 420, Tr. at 171–72 (Dr. Denney citing favorably an article that recommends higher cut-off scores for performance-validity tests in Atkins cases)).
The suggestion that a standard other than those listed in the clinical guidelines is appropriate because this case involves secondary gain or an Atkins claim is counter to both caselaw and common sense. Courts are required to follow the "prevailing clinical standards," not an undefined forensic standard, in determining whether a defendant is ID. See Moore , 137 S.Ct. at 1050 ; see also id. at 1049 (referring to the "prevailing clinical standards" as "established medical practice"); id. at 1050–52 (examining the multiple manners in which Texas had violated the "prevailing clinical standards"); see also Hall , 134 S.Ct. at 1999 (noting that Atkins cited the "clinical definitions for intellectual disability"). Given the high stakes in Atkins litigations, courts generally exercise caution, and the legal standard has arguably become even more protective than the clinical standard. See Wilson , 170 F.Supp.3d at 391 ("[T]he Supreme Court's decision in Hall strongly suggests that the legal standard for intellectual disability in Atkins cases has become more protective than the clinical standard."); see also Hall , 134 S.Ct. at 1990 (noting the "unacceptable risk that persons with intellectual disability will be executed"); Davis , 611 F.Supp.2d at 488 ("In the forensic context, however, where an individual's eligibility for a death sentence depends on a somewhat arbitrary numerical cutoff, precision and accuracy in determining that individual's IQ score, both at present and in the past, become critically important. Eligibility for the death penalty is not a lottery, and a greater effort to achieve accurate results is both necessary and appropriate."); Hardy , 762 F.Supp.2d at 866 (same).
So, to the extent that the Government's experts proffer opinions that run afoul of the clinical standards, the Court rejects them here.
According to the Government, "[c]omparing a subject's results from one test to another is a particularly effective method for identifying feigned effort because it is very difficult to 'fake it' in the exact same way the second time." (Gov. Opp. Br. at 12 (without citation)). For Dr. Morgan, "[w]hen examinees perform differently on the same test questions in two separate examinations (beyond the standard error of measurement), in the absence of progressive disease, the only logical and reasonable explanation is the behavior of the examinee. " (Dr. Morgan's Report at 9 (emphasis in original)). The Court will consider the parties' arguments, the methodology of the parties' experts, caselaw, and the clinical standards to assess the validity of Roland's IQ scores and determine whether Roland meets the requirements of Prong One of the ID definition.
The Court notes that there is no statistically significant difference in Roland's FSIQ and index scores between the two WAIS–IV tests, and Roland's two KBIT composite IQ scores are also within the SEM confidence band values of each other and the two WAIS–IV tests. (See supra at 506-07).
The Court again notes that there was no statistically significant difference in nine of ten WAIS–IV subtest scores between the two administrations. (See supra at 506-07).
2. Practice Effect
"The practice effect refers to gains in IQ scores on tests of intelligence that result from a person being retested on the same instrument." AAIDD–11 at 38. According to the AAIDD, "established clinical practice is to avoid administering the same intelligence test within the same year to the same individual because it will often lead to an overestimate of the examinee's true intelligence." Id. ; see also User's Guide at 23. The APA also recognizes that practice effect is one of the factors that may affect test scores. See DSM–5 at 37.
According to Dr. Hunter, the practice effect "is a commonly observed phenomenon with testing using initially novel information and tasks, that once engaged, becomes encoded and remembered. As a result, future performances can become much stronger given that experience and recognition." (Def. Ex. 41 ("Dr. Hunter's Rebuttal Rep.") at 6). Practice effect can occur from exposure to the type of task, not necessarily exposure to a specific question or item. (D.E. No. 387, Tr. at 18 (Dr. Hunter's Testimony); Dr. Hunter's Rebuttal Rep. at 5 ("[P]ractice effects represent the learning and memory of a specific context of a test and its components, not just acquired knowledge of completed items within a test.")). Dr. Morgan also explained that although "cognitive science doesn't really fully understand why we have practice effects and why it works," practice effect nevertheless "is a real phenomenon, there is no doubt." (D.E. No. 412, Tr. at 159 (Dr. Morgan's Testimony)). But these practice effects should diminish as time between two test administrations increases. (Id. at 165). Finally, regarding Roland's WAIS–IV tests, the parties' experts agree that it is possible to demonstrate practice effect on one subtest, but not others. (D.E. No. 414, Tr. at 149 (Dr. Morgan's Testimony); D.E. No. 387, Tr. at 19 (Dr. Hunter's Testimony); D.E. No. 423, Tr. at 161 (Dr. McGrew's Testimony)).
The issue here, however, is not whether the Court should take into account the general practice-effect phenomena in interpreting Roland's IQ scores. Rather, the Government posits that any explanation attributing the differences in Roland's IQ scores to practice effect is "meritless." (Gov. Opp. Br. at 14). Dr. Morgan concluded that attributing Roland's increased score to practice effects "makes little sense." (Dr. Morgan's Report at 16). Instead, Dr. Morgan compares the differences within WAIS–IV subtests to attack the scores' validity and reliability, noting that Roland exhibited a pattern of differential effort where some of the subtest scores increased while others decreased. (See id. at 9–13; D.E. No. 414, Tr. at 11 (Dr. Morgan explaining that "looking at how one's score relates to the known practice effect on any given test is an assessment analysis of validity on the individual's test taking performance"); see also infra at 512-15). The Court will address these issues in detail below. For now, the Court notes only that it will—as clinicians recommend—consider a possible practice effect as a factor, among others, in assessing the reliability or uncertainty of Roland's scores on particular IQ tests or subtests. See DSM–5 at 37 (recognizing practice effects as a factor that may affect test scores); AAIDD–11 at 40 (recommending that clinicians interpret IQ scores in reference to factors such as practice effect).
3. Roland's Effort–Test Performance
a. Dr. Hunter's Evaluation
As part of his test battery, Dr. Hunter administered the following effort tests to Roland:
• Test of Memory Malingering ("TOMM"). The TOMM is a standalone performance-validity test designed to determine whether the examinee is malingering. (Dr. Hunter's Report at 12). The TOMM is a well-accepted and valid measure of assessing the validity of testing someone with ID. (D.E. No. 422, Tr. at 58 (Dr. Bigler's Testimony)). Roland passed the TOMM. (Dr. Hunter's Report at 12).
• California Verbal Learning Test ("CVLT"). The CVLT is a forced-choice embedded validity measure designed to determine whether the examinee is malingering; the test is appropriate for individuals with ID. (D.E. No. 422, Tr. at 58–59 (Dr. Bigler's Testimony)). Roland passed the CVLT. (D.E. No. 386, Tr. at 168 (Dr. Hunter's Testimony)).
• Reliable Digit Span ("RDS"). The RDS is an embedded validity measure that assesses the subject's ability to repeat digits back in the forward direction and in the reverse direction. (D.E. No. 422, Tr. at 59 (Dr. Bigler's Testimony)). Roland passed the RDS. (Id. ; Dr. Hunter's Report at 13–15).
• List Learning. List Learning is an embedded measure in the RBANS where the subject is provided a list of 20 words and presented with a binary choice of whether a certain word was on the list. (D.E. No. 422, Tr. at 59 (Dr. Bigler's Testimony)). Roland passed the List Learning test. (Dr. Hunter's Report at 13–15).
• Rarely Missed Items. The Rarely Missed Items, a forced-choice validity measure embedded in the Wechsler Memory Scale, requires that the examiner tell a story and provide a cue to the examinee. (D.E. No. 422, Tr. at 60 (Dr. Bigler's Testimony)). The subject then selects whether the cue is true or false based on the story the subject heard. (Id. ). Roland passed the Rarely Missed Items test. (Dr. Hunter's Report at 13–15).
• Vocabulary–Digit Span Difference. Another assessment of malingering is comparing the difference between the scores of the Vocabulary and the Digit Span subtests of an intelligence test. (D.E. No. 422, Tr. at 61 (Dr. Bigler's Testimony)). A substantial difference between the two scores is an indicator of feigning. (Id. ). Roland passed this embedded measure, as there is no significant difference between those scores. (Id. at 61–62; Dr. Hunter's Report at 13–15).
In sum, Roland passed a total of six different validity measures with Dr. Hunter. Dr. Bigler testified that, according to the clinical standard, "if we were seeing someone like this in the clinic, that would be end of discussion. We would be basically accepting the assessments as a reflection of this individual's test performance." (D.E. No. 422, Tr. at 62 (Dr. Bigler's Testimony)).
Moreover, the parties' experts agree that the examiner is in the best position to assess the examinee's motivation and effort, and the examiner's impressions should also be considered. (See id. at 105–07; D.E. No. 420, Tr. at 195–96 (Dr. Denney's Testimony)). Dr. Hunter emphasized in his contemporaneous notes, written report, and testimony his clinical impression that Roland demonstrated good effort. In his notes, for example, Dr. Hunter recorded that Roland was "showing solid collaboration and motivation" and "sustained solid effort, patience." (Def. Ex. 44, Dr. Hunter's Notes). His expert report states that "[a]cross both testing sessions, Mr. Roland presented as collaborative" and evidenced a "seriousness towards the evaluation itself" and, "[w]hen observed to become most frustrated with the demands of a task, encouragement for continued effort (when allowed) was seen to help Mr. Roland remain engaged and focused. He always attempted to provide a response to the demands presented to him, although he was open in acknowledging challenge with complex and difficult tasks. Effort and motivation were appropriate ...." (Dr. Hunter's Report at 12). Dr. Hunter echoed these observations at the hearing: "I would say the key thing throughout all the testing with Mr. Roland was his willingness to try very hard. He was putting in effort, always.... I believe that he took this seriously. And I think he also understood that this was something that was important that was being asked of him. And again, the numbers allow us to test and consider motivation and effort, he passed those." (D.E. No. 386, Tr. at 129 (Dr. Hunter's Testimony)).
b. Dr. Morgan's Evaluation
Dr. Morgan also administered the following effort tests to Roland in March 2017:
• TOMM. Roland passed the TOMM. (D.E. No. 427, Tr. at 46–47 (Dr. Morgan's Testimony); D.E. No. 422, Tr. at 58 (Dr. Bigler's Testimony)).
• CVLT. Roland passed the CVLT. (D.E. No. 427, Tr. at 46–47 (Dr. Morgan's Testimony); D.E. No. 422, Tr. at 58 (Dr. Bigler's Testimony)).
• RDS. Roland passed the RDS. (D.E. No. 414, Tr. at 14–15 (Dr. Morgan's Testimony); D.E. No. 427, Tr. at 46–47 (Dr. Morgan's Testimony); D.E. No. 422, Tr. at 59 (Dr. Bigler's Testimony); D.E. No. 420, Tr. at 173 (Dr. Denney's Testimony)).
• List Learning. Roland passed the List Learning test. (D.E. No. 427, Tr. at 46 (Dr. Morgan's Testimony)).
• Structured Inventory of Malingered Symptomatology ("SIMS"). SIMS is a standalone validity measure that Roland also passed. (Dr. Morgan's Report at 9).
• Minnesota Multiphasic Personality Inventory ("MMPI"). The MMPI, a symptom-validity measure, "is an important test for determining intellectual disability because of its validity scales." (Dr. Morgan's Aff. at 4). For Dr. Morgan, "the MMPI is especially effective in gauging a subject's effort, truthfulness, and honesty, vis-à-vis neuropsychological testing, and as such, it is an effective means of determining whether the subject is 'malingering.' " (Id. ). Roland passed the MMPI. (Dr. Morgan's Report at 9; D.E. No. 427, Tr. at 47 (Dr. Morgan's Testimony); D.E. No. 414, Tr. at 143 (Dr. Morgan's Testimony); D.E. No. 422, Tr. at 57, 62 (Dr. Bigler's Testimony)).
Roland, therefore, also passed a total of six validity measures administered by Dr. Morgan. (D.E. No. 427, Tr. at 46–47 (Dr. Morgan testifying that he administered—and Roland passed—four performance-validity tests and two symptom-validity tests)). Together with Dr. Hunter's tests, Roland passed a total of twelve validity measures. (See D.E. No. 422, Tr. at 57–62 (Dr. Bigler's Testimony)).
Regarding Roland's demeanor during Dr. Morgan's testing, Dr. Morgan testified that Roland "was pleasant. He answered [Dr. Morgan's] questions. He wasn't resisting or guarded." (D.E. No. 412, Tr. at 98 (Dr. Morgan's Testimony); see also id. at 101 ("Q. Did anything about the way he was behaving during the testing cause you any concern? A. Not really.")).
vi. The Government's Test–Validity Concerns
The overarching theme of the Government's opposition to Roland's Atkins claim is that Roland was exhibiting questionable effort or malingering cognitive deficits during testing and thus his IQ scores are unreliable indicators of his true intellectual abilities. The Government argues that, as a result of Roland's malingering and failure to put forth adequate effort on psychological testing, he has not met his burden in showing that he is ID and ineligible for the death penalty. The Court must therefore consider whether Roland was exerting inadequate effort or malingering during his IQ testing. See Montgomery , 2014 WL 1516147, at *28–29 ; Nelson , 419 F.Supp.2d at 902.
Throughout their testimony and reports, the parties' experts refer to Roland's "effort," "feigning," and "malingering." (See D.E. No. 420, Tr. at 161–65 (Dr. Denney describing the difference among the three concepts)).
Malingering. Malingering occurs when a "person is willfully attempting to appear more impaired than they truly are for secondary gain." (Id. at 86, 162). Secondary gain in this context refers to external motivators, such as "monetary benefits" or release from responsibilities (like military service). (Id. ).
Feigning. Feigning refers to "impairment on tests for reasons other than what would fall under the category of malingering." (Id. at 162–63). Dr. Denney described the reasons for feigning as those that fit "psychological needs," (e.g., feigning illness to get attention). (Id. at 162). Feigning on a test can look exactly like malingering because they both mean an "intent to deceive." (Id. at 163; D.E. No. 422, Tr. at 149 (Dr. Bigler's Testimony)). The difference is in their classification (or the reason behind the subject's deceitful intent). (D.E. No. 420, Tr. at 163 (Dr. Denney's Testimony)).
Effort. Effort refers to the examinee's motivation during a test. (D.E. No. 412, Tr. at 86–87 (Dr. Morgan's Testimony)). "Ideally, you want somebody coming into the task with motivation to do well, and they put forth optimal effort. At least adequate effort. Those people produce valid results." (D.E. No. 420, Tr. at 164 (Dr. Denney's Testimony)). Poor effort or lack of attention during a test is not necessarily malingering. (D.E. No. 422, Tr. at 18 (Dr. Denney's Testimony); D.E. No. 412, Tr. at 145 (Dr. Morgan's Testimony)). But the presence of any of these concepts indicates that the test results are invalid because they don't reflect the examinee's true ability. (D.E. No. 412, Tr. at 145 (Dr. Morgan's Testimony); D.E. No. 420, Tr. at 141–43; 162–65 (Dr. Denney's Testimony)).
1. Dr. Morgan's Assessment of Roland's IQ Scores
The Government's primary expert, Dr. Morgan, applied a subtest-comparison methodology in assessing the validity of Roland's two WAIS–IV IQ scores. (Dr. Morgan's Report at 9–13). While Roland's performance on the effort tests would ordinarily suggest that the test data "are valid and a credible representation of his cognitive function," Dr. Morgan nevertheless concluded that both IQ scores are invalid as the "test data obtained does not represent valid, credible, and scientifically objective measures of Mr. Roland's cognition and his intellect." (Id. at 9). Dr. Morgan based his conclusion on what he characterized as "significant discrepancies" in some of the subtest scores between the two administrations that, according to him, can only be attributed to Roland's behavior and effort. (Id. at 9–13). Dr. Morgan's chief concerns are summarized below.
Matrix Reasoning Subtest—Increased Performance. Dr. Morgan stated that the 4-point difference obtained in the FSIQ between the two WAIS–IV administrations "is largely due to the significant difference obtained on the Perceptual Reasoning Index." (Id. at 10). This is so because of the discrepancy of Roland's score on the Matrix Reasoning subtest, which is part of the Perceptual Reasoning Index. (Id. ). "Matrix Reasoning assesses non-verbal, visual, problem solving. It relies on visual analysis, logic and abstract, analytic reasoning." (Id. ). As noted above, with Dr. Hunter, Roland obtained a raw score of 8 (for a scaled score of 4); whereas four months later with Dr. Morgan, Roland obtained an increased raw score of 17 (for a scaled score of 9). (Id. ). Dr. Morgan concluded that the practice-effect phenomena could not account for this discrepancy because (i) during Dr. Hunter's exam, Roland did not see the test items on which he later improved and so "the assertion that the practice effects contributed to his improvement makes no sense because he never saw the items"; and (ii) one cannot feign improvement or a better performance, so his improvement is not due to "anything other than his ability." (Id. ).
Information Subtest—Decreased Performance. In addition to performing better on some aspects of the March 2017 examination, Dr. Morgan also pointed out that Roland performed worse on others. Dr. Morgan highlighted Information-subtest questions that Roland knew the answers to in November 2016, but not in March 2017. (Id. at 12). In Dr. Hunter's Information subtest, Roland earned a scaled score of 7; with Dr. Morgan, Roland earned a scaled score of 5. (Id. ). When asked what water is made of, Roland answered "H2O" with Dr. Hunter but said only "oxygen" with Dr. Morgan. (Id. ). When asked on what continent Brazil is located, he correctly answered "South America" with Dr. Hunter but said he did not know with Dr. Morgan. (Id. ). Dr. Morgan asserted that "[e]ither one knows this information or one does not. One does not forget this kind of information." (Id. ). Given that there is no "positive practice effect," Dr. Morgan concluded that absent "a progressive neurologic illness that might cause such a performance discrepancy," "the differences in performance can only be attributable to Mr. Roland's behavior and test-taking attitude." (Id. ; see also D.E. No. 412 at 161 (Dr. Morgan's Testimony)).
Pattern of Differential Effort. As an additional example of Roland's test-taking behavior and differential effort between the two examinations, Dr. Morgan pointed to the index scores of the RBANS test domains. (Dr. Morgan's Report at 12–13). In some of the domains, Roland's score increased during the March 2017 administration; for others, it decreased. (Id. at 13). For Dr. Morgan, the "differences in scores obtained on the same tests over time, where some test results are improved and other test results have declined" are "a phenomenon representing unreliability of scores and invalidity." (Id. ). "These discrepancies ... are unlikely and cannot be attributed to the mental status of Mr. Roland. Rather, they have to do with his behavior and effort. At a minimum, these discrepancies on the same test on two occasions, indicate the presence of variable and questionable effort, and resultant unreliable, invalid scores; at a maximum, they suggest malingered performance." (Id. ).
2. Dr. Denney's Assessment of Roland's IQ Scores
Echoing Dr. Morgan's conclusions, Dr. Denney opined that Roland did not apply sufficient effort during Dr. Hunter's testing. (D.E. No. 418, Tr. at 181 (Dr. Denney's Testimony)). Moreover, Dr. Denney asserted that "the consistency of Dr. Morgan's data with Dr. Hunter's data also supports [his] belief that Dr. Morgan's data is equally invalid." (Id. at 197). He further opined that, "[g]iven the entirety of the record, the malingering side fits better for the more recent events" (i.e., the two WAIS–IV IQ tests), and the 2002 KBIT is invalid "due to poor task engagement." (D.E. No. 420, Tr. at 165 (Dr. Denney's Testimony); id. at 66–67 (Dr. Denney testifying that both WAIS–IV IQ scores and the 2002 KBIT do not reflect Roland's true intellectual ability, that Roland does not show a deficit in intellectual functioning that would place him in the ID range, and that Roland fails to satisfy the first criterion for ID)). In addition to endorsing Dr. Morgan's assessment of Roland's intellectual functioning, Dr. Denney cites five other reasons to support his opinion, summarized below.
Failed Easy Items. In three different subtests of Dr. Hunter's WAIS–IV administration—Block Design, Similarities, and Figure Weights—Roland failed easier items after he successfully passed harder start items. (D.E. No. 418, Tr. at 181 (Dr. Denney's Testimony); D.E. No. 420, Tr. at 196–97 (same)). "The fact that this occurred three times suggested an atypical test taking attitude consistent with poor task engagement." (Dr. Denney's Report at 11). Relying on one study that evaluated malingering on the Social Security Disability Consultative Exam, Dr. Denney concluded that "low IQ people who are motivated to do well do not cause this pattern." (D.E. No. 420, Tr. at 198 (Dr. Denney's Testimony)).
(See Gov. Ex. 373, Chafetz, Abrahams, Kohlmaier, "Malingering on the Social Security Disability Consultative Exam: A New Rating Scale").
Digit Span. Roland's performance on the Digit Span subtest, which is part of the Working Memory Index of WAIS–IV, "doesn't make sense," because "it doesn't create a pattern which is consistent with real cognitive abilities." (D.E. No. 418, Tr. at 181–82 (Dr. Denney's Testimony)). For this subtest, a series of numbers are read to the subject and the subject recites them back from memory, either in the same forward sequence, or backwards, depending on the instructions. (See D.E. No. 420, Tr. at 62–63 (Dr. Denney's Testimony)). Dr. Denney stated that Roland "could repeat up to 7 digits forward, yet he could not repeat a string of 4, 5, or 6 digits twice in a row," "indicat[ing] the WAIS–IV result from November 2016 is not valid." (Dr. Denney's Report at 11; D.E. No. 417, Tr. at 134–36 (Dr. Denney's Testimony)).
Subtest–Score Inconsistency. In reviewing Roland's March 2017 WAIS–IV subtest scores, Dr. Denney compared Roland's performance on some subtests (that are consistent between the two WAIS–IV administrations) to results of other subtests (that are also consistent between the two WAIS–IV administrations), arguing inconsistency between them. (Dr. Denney's Report at 12). For example, Dr. Denney compared Roland's (i) Arithmetic scaled score of 10 to Roland's Digit Span scaled score of 6, arguing that "[s]uch differences rarely occur" because Arithmetic requires "the very skills demonstrated in Digit Span"; (ii) Symbol Search scaled score of 7 to Roland's Coding scaled score of 3, arguing that the difference is "also considered atypical and a rare occurrence since they both measure similar cognitive skills"; and (iii) Coding subtest, in which Roland made five errors, to Roland's Arithmetic subtest, arguing that the pattern "does not make clinical sense for a person who can perform Arithmetic in his head." (Id. ). For Dr. Denney, these "inconsistencies raise serious concern about the validity of the test results." (Id. ).
Nonsensical Responses. During Dr. Hunter's Information subtest, Roland was asked "who wrote Hamlet," to which he responded, "Cat in the Hat." (Id. at 11; D.E. No. 418, Tr. at 182 (Dr. Denney's Testimony)). Dr. Denney described Roland's answer as a "nonsensical response," thus another "indication of invalidity." (D.E. No. 418, Tr. at 182 (Dr. Denney's Testimony)).
Effort Tests. Dr. Denney did not attribute any significance to the fact that Roland passed multiple effort tests. (Id. at 116). For Dr. Denney, failing effort tests conclusively means that the test data is invalid. (Id. ). But passing effort tests, on the other hand, does not automatically suggest that the results are valid; rather, one must "question the rest of the test data." (Id. ). He explained:
The important thing is if you understand the relationship between sensitivity and specificity, that tells you that if the test result is positive, you can take that to the bank. And feel comfortable about the findings. But if they are negative, you can't take that to the bank. You have to also question the rest of the test data just because those are negative doesn't mean the rest of the test data are valid. You need to go into the test data and look at the pattern and see do they make clinical sense.
(Id. ).
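The arithmetic behind the sensitivity and specificity relationship that Dr. Denney describes can be illustrated with Bayes' rule. The sketch below is not drawn from the record. The function name `predictive_values` and all three input figures (sensitivity, specificity, and base rate of invalid performance) are hypothetical values chosen only for illustration, and they say nothing about the actual properties of the effort tests administered to Roland. Here a "positive" result means a failed effort test.

```python
def predictive_values(sensitivity, specificity, base_rate):
    """Return (PPV, NPV) for a validity test using Bayes' rule.

    sensitivity: P(test failed | performance invalid)
    specificity: P(test passed | performance valid)
    base_rate:   P(performance invalid) among those tested
    """
    true_pos = sensitivity * base_rate
    false_pos = (1 - specificity) * (1 - base_rate)
    true_neg = specificity * (1 - base_rate)
    false_neg = (1 - sensitivity) * base_rate

    ppv = true_pos / (true_pos + false_pos)  # P(invalid | failed)
    npv = true_neg / (true_neg + false_neg)  # P(valid | passed)
    return ppv, npv


# HYPOTHETICAL inputs for illustration only; none of these come from the record.
ppv, npv = predictive_values(sensitivity=0.40, specificity=0.95, base_rate=0.40)
print(f"P(invalid | failed test) = {ppv:.3f}")
print(f"P(valid | passed test)   = {npv:.3f}")
```

Under these assumed inputs, a failed test is stronger evidence of invalid performance than a passed test is of valid performance. The size of that gap depends entirely on the assumed figures, and it changes when several independent measures are passed rather than one.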
vii. Analysis of Roland's IQ Scores and Test Validity
1. Effort Testing
The Court finds it significant that both parties' experts indicated that effort testing is an important component of accurately measuring a person's IQ. (See supra at 507). Dr. Morgan explained that effort tests are a method of analyzing validity, and that a failure of these effort tests indicates test invalidity. (See supra at 507-08). Given that Roland passed twelve of these test-validity measures, the Court is perplexed regarding the reason Dr. Morgan decided to apply the instant subtest-comparison methodology. Dr. Morgan explained that he did so because this is a forensic setting and therefore involves secondary gain. (See supra at 507-09). But this is a puzzling position for Dr. Morgan to take since—knowing the facts of this case—he previously attested to the Court that
[t]he standard practice when conducting an examination to determine intellectual disability is to conduct "effort tests"—tests to see if the subject is malingering—throughout the examination. Employing multiple effort tests throughout the examination is the best way to test for the feigning of selective deficits – i.e. to determine if the subject is "faking it."
(Dr. Morgan's Aff. at 5). He further attested that the MMPI—which he administered and Roland passed—"is especially effective in gauging a subject's effort, truthfulness, and honesty, vis-à-vis neuropsychological testing, and as such, it is an effective means of determining whether the subject is 'malingering.' —i.e. intentionally trying to perform poorly on the examination." (Id. at 4). Dr. Morgan's representations in his affidavit therefore conflict with Dr. Morgan's reasons for applying a different test-validity method here.
Moreover, the Court finds it concerning that Drs. Morgan and Denney glossed over the fact that Roland passed twelve effort tests, including standalone and embedded measures. (See supra at 509-12). Especially disturbing is Dr. Denney's catch-22 explanation that had Roland failed these validity measures, one could "take that to the bank," but the fact that he passed requires the Government to "question the rest of the test data." (D.E. No. 418, Tr. at 116 (Dr. Denney's Testimony)). Dr. Morgan's and Dr. Denney's decision to ignore Roland's performance on these tests indicates a resistance to recognizing Roland's cognitive deficits, which undermines these experts' credibility.
Ultimately, the Court is not persuaded by the Government's experts' reasons for employing their subtest-comparison methodology to assess the validity of Roland's IQ scores. On the contrary, the Court finds credible Roland's experts' position that the standard clinical practice is to accept the results of test data when the examinee passes the validity measures. Particularly compelling was Dr. McGrew's testimony—especially because he is an expert in "applied psychological measurements, theories of human intelligence, and interpretation of intelligence tests"—that
[i]n my 42 years of experience ... I have never seen the validity of intelligence tests with such consistency questioned based on looking at what do they say to this item, what did he say in this item, and talking about raw score differences, which you shouldn't be doing .... I have never seen going down to this level of minutia in terms of the hierarchy of how intelligence tests are built. All psychological tests are built from the bottom up, the aggregate. I have never seen this. It does not comport ... with the prevailing scientific standards, community standards in psychological testing and particularly in intelligence testing, and I am just kind of, I am, I am flabbergasted.
(D.E. No. 425, Tr. at 28–29 (Dr. McGrew's Testimony)).
(See, e.g., D.E. No. 422, Tr. at 44–45 (Dr. Bigler: "If you pass these validity measures, and the test results internally make sense with regards to history, background, information, that you have, clinical presentation, eyes on observation of the individual, then you trust the data.")).
Dr. Bigler also credibly testified that in situations when the subject passes the validity measures—as is the case here—"you accept the findings as a reflection of what the person's ability is" if there is a "potential reasonable explanation for what went on." (D.E. No. 422, Tr. at 119 (Dr. Bigler's Testimony)). So, the Court will now determine if there is any credence to the subtest-score discrepancies raised by the Government's experts and if there is a reasonable explanation for them.
2. Conclusions on Dr. Morgan's Assessment of Roland's IQ Score
Citing individual examples, Dr. Morgan claimed that Roland's subtests had "significant discrepancies" and thus both of Roland's WAIS–IV IQ scores are invalid, even though Dr. Morgan conceded that the discrepancies were not of statistical significance. The Court concludes to the contrary.
Matrix Reasoning Subtest—Increased Performance. Dr. Morgan relied on the differences in the Matrix Reasoning subtest scores as the primary evidence supporting his conclusion that Roland's IQ tests are invalid. As noted above, for Dr. Hunter, Roland obtained a raw score of 8 (for a scaled score of 4); whereas four months later for Dr. Morgan, Roland obtained an increased raw score of 17 (for a scaled score of 9). (Dr. Morgan's Report at 10). Indeed, it is this discrepancy that most contributes to the four-point increase in the March 2017 administration. (Id. ). Dr. Morgan contended that the practice-effect phenomena could not account for this discrepancy because Roland did not previously see the test items on which he later improved and, because one cannot feign improvement, Roland's improvement is not due to "anything other than his ability." (Id. ). After seeing and hearing Dr. Morgan testify and considering the entirety of the evidence presented at the hearing, the Court finds Dr. Morgan's opinions conclusory and unconvincing.
At the outset, the Court points out that Roland's overall improvement in the WAIS–IV administered by Dr. Morgan, including his improvement on the Matrix Reasoning subtest, is inconsistent with any malingering strategy and is indicative of adequate effort. Moreover, the Court credits Dr. McGrew's opinion that it is not appropriate to examine individual subtest scores to assess validity because they are the most unreliable scores and have the greatest variability. (Dr. McGrew's Report at 23). Indeed, Dr. Morgan also testified that a FSIQ is the "most reliable measure" of determining intellectual functioning. (D.E. No. 427, Tr. at 41 (Dr. Morgan agreeing that the composite and FSIQ scores are "the most reliable measures")). Having had the opportunity to observe all of the experts' testimony and demeanor at the hearing, the Court is persuaded by the reasoning and conclusions of Roland's experts (summarized below) and relies on them in rejecting Dr. Morgan's conclusions.
See Hardy , 762 F.Supp.2d at 867 (concluding that the defendant's "overall improvement during the testing by the government ... is inconsistent with any malingering strategy and is indicative of adequate effort").
The Court is unpersuaded by Dr. Morgan's assertion that Roland's performance on the Matrix Reasoning subtest cannot be explained by the phenomena of practice effect because Roland did not see the test items on which he later improved. (See supra at 512-13). Dr. Hunter compellingly explained that practice effect can occur from exposure to the type of task, not necessarily exposure to a specific question or item. (See D.E. No. 387, Tr. at 18 (Dr. Hunter's Testimony)). The point is that the subject becomes stronger at a novel task given previous experience and recognition. (Dr. Hunter's Rebuttal Rep. at 6). Again, "practice effects represent the learning and memory of a specific context of a test and its components, not just acquired knowledge of completed items within a test." (Id. at 5). Dr. McGrew agreed:
Practice effects are not based on the exposure to exact same sets of items between a first and second assessment. Mr. Roland would not necessarily need to see the "new" Matrix Reasoning items he had not seen previously during his first WAIS–IV to perform better on his second WAIS–IV Matrix Reasoning test administration. The most accepted research-based explanation of practice effects is that they occur when an individual is exposed to novel types of tasks, and as they move through the items, they may adopt certain general problem-solution or test-taking strategy for solving the tasks. Their general test-taking or problem-solving strategy may become more refined even during the first test session as they interact with successive test items. Then, during the second retest exposure, they do not necessarily recall the exact items, but they recall the general types of tasks and the general solution or test-taking strategy they used previously and invoke this task-specific strategy immediately, or continue to refine their solution strategy to become ever more effective during the second exposure (Duff et al., 2012; Estives et al., 2012).... Additionally, recent research has studied "how" individuals solve matrix reasoning items (via eye-tracking technology). This research has confirmed the general strategy refinement hypothesis (Hayes, Petrov & Sederberg, 2015). This research has suggested that up to 30% of the increase in matrix reasoning score performance (upon retesting) is due to refinement in visual-spatial problem solving strategies and not changes in the underlying inductive reasoning measured by matrix reasoning tests. That is, individuals learn a specific matrix reasoning skill for "how to" approach and solve the reasoning problems, not necessarily increase their underlying reasoning ability.
(Dr. McGrew's Report at 32–33).
Citing the WAIS–IV Manual, Dr. Hunter explained that the "range of difference observed for the Matrix Reasoning subtest (a component subtest of the Perceptual Reasoning Index) across the testings conducted of Mr. Roland by myself and Dr. Morgan is indicative of Mr. Roland's recognition of the broader context of the measure itself and its demands across a relatively short period of time (e.g., a time frame that is suggested in fact by the developers of the WAIS–IV to be short and vulnerable to significant practice effects)." (Dr. Hunter's Rebuttal Rep. at 6).
Moreover, as Drs. Bigler and McGrew explained (and common sense indicates), in addition to practice effects, another plausible explanation is that the variation in the Matrix Reasoning subtest score could be explained by Roland's impairment, guessing, or strengths kicking in during the second administration. (D.E. No. 425, Tr. at 29, 153–54 (Dr. McGrew's Testimony); D.E. No. 422, Tr. at 192–96 (Dr. Bigler's Testimony)).
Crippling to Dr. Morgan's conclusion is the fact that Matrix Reasoning has one of the lowest test-retest stability coefficients of all the subtests. (Dr. McGrew's Report at 21; WAIS–IV Manual at 48; D.E. No. 423, Tr. at 30, 116–17 (Dr. McGrew testifying that the Matrix Reasoning subtest "is tied for bottom in terms of stability over time")). According to the WAIS–IV Manual, Matrix Reasoning has a test-retest stability of .74, which indicates that "first and second scores are not expected to be perfectly stable for all individuals." (Dr. McGrew's Report at 36). The fact that the Matrix Reasoning is "tied for bottom in terms of stability over time" is contrary to—and therefore undercuts—Dr. Morgan's and Dr. Denney's assertion that the only explanation for a statistically-significant change in scores between the two administrations is poor effort or malingering.
(See also D.E. No. 423, Tr. at 117 (Dr. McGrew's Testimony: "Q. So ... what does that mean, if it is tied for bottom for stability? That means you would expect more variation? A. It means if you tested somebody time 1 and time 2, the variability in the scores between two time would be greater than it would be for like test information.... [O]n matrix reasoning, you are going to see a lot of moving around of the scores.")).
Agreeing with the persuasive, thorough, and credible explanations provided by Drs. McGrew, Hunter, and Bigler, the Court concludes that using a single statistically-different subtest score (especially in light of Matrix Reasoning's low test-retest stability coefficient) to undermine nine statistically insignificant subtest scores, four statistically insignificant index scores, and two statistically insignificant FSIQ scores on the WAIS–IV is inappropriate.
Wilson , 170 F.Supp.3d at 365 (noting that "when assessing intellectual functioning for the purpose of determining eligibility for the death penalty," Hall "suggests that ... courts should resolve uncertainty in favor of defendants").
Information Subtest—Decreased Performance. Comparing Roland's scores on the Information subtest between the two WAIS–IV administrations, Dr. Morgan pointed out that in addition to performing better on some aspects of the March 2017 examination, Roland also performed worse. (See supra at 513). Such a pattern, for Dr. Morgan, again suggests that Roland's performance cannot be explained by the phenomena of practice effect. The Court finds Dr. Morgan's blanket rejection of practice effect unconvincing, as the parties' experts—including Dr. Morgan—testified at the hearing that it is possible to demonstrate practice effect on some subtests, but not others. (D.E. No. 414, Tr. at 149 (Dr. Morgan's Testimony); D.E. No. 387, Tr. at 19 (Dr. Hunter's Testimony); D.E. No. 423, Tr. at 161 (Dr. McGrew's Testimony)).
Next, Dr. Morgan highlighted two Information-subtest questions that Roland knew the answers to in November 2016, but not in March 2017. (See supra at 513). As a reminder, Dr. Morgan focused on Roland's inconsistent responses to what water is made of and where Brazil is located. (See supra at 513). Dr. Morgan then summarily concluded that absent "a progressive neurologic illness that might cause such a performance discrepancy, ... the differences in performance can only be attributable to Mr. Roland's behavior and test-taking attitude." (Dr. Morgan's Report at 12).
Dr. Morgan's conclusion is suspect for a number of reasons. For one, he ignores Dr. Hunter's contemporaneous notes that Roland lacked confidence when responding to Dr. Hunter's question about Brazil. (See Def. Ex. 44 at 3 (Dr. Hunter noting "i.e., Brazil—came to it several times bf committed")). In explaining his notes at the hearing, Dr. Hunter stated: "The question is where is Brazil? What continent is Brazil? And the answer is one that he was getting to. But was not showing certainty about." (D.E. No. 386, Tr. at 154 (Dr. Hunter's Testimony)). Dr. Hunter also pointed out that he did not provide Roland feedback on whether that answer was correct. Because Roland demonstrated uncertainty when he gave the correct answer about Brazil with Dr. Hunter, and because Dr. Hunter did not provide any feedback on whether Roland's answer was correct, the Court finds it plausible that Roland failed to provide the correct answer with Dr. Morgan without demonstrating a poor test-taking attitude. Even more, the fact that there is another plausible explanation for Roland's response with Dr. Morgan indicates that (unlike Dr. Morgan's conclusion) the differences in performance may not "only be attributable to Mr. Roland's behavior and test-taking attitude" (Dr. Morgan's Report at 12). Given the plausibility of an alternative explanation and the preponderance standard here, the Court is not persuaded by Dr. Morgan's conclusion.
(D.E. No. 387, Tr. at 250 (Dr. Hunter's Testimony: "Q. You talk about the difference on your test, [where Roland] ultimately answered that Brazil was in South America but did not ... with Dr. Morgan? A. That's correct. Q. In your notes you note that he was very hesitant and took some time before coming to that answer? A. Yes, I did. Q. Do you give [him] feedback about whether he was right? A. No."); see also id. at 17–18 (Dr. Hunter testifying that "we don't give feedback. We don't tell them if it is right or wrong")).
Besides, Dr. McGrew provided additional credible and reasonable explanations that further undercut Dr. Morgan's conclusion. First , he pointed out that Dr. Morgan's conclusion—that absent "a progressive neurologic illness that might cause such a performance discrepancy, ... the differences in performance can only be attributable to Mr. Roland's behavior and test-taking attitude" (Dr. Morgan's Report at 12)—"is not based on any known scientific research regarding the ability of an individual to provide similar answers to individual test items across a four-month interval." (Dr. McGrew's Report at 27–28). Second , Dr. McGrew explained that to "expect 100% perfect agreement for all WAIS–IV individual test items across a four-month interval is misleading and inconsistent with the known low reliabilities of individual test items." (Id. ). He emphasized that it has "long been known in the field of psychometrics that individual subtest items have very low reliability (typically ranging from .10 to .40 ...)." (Id. at 21). Third , he noted that there is no "established scientific method for determining the validity of an individual's tested performance at the item level across time." (Id. at 27–28). Finally , and most convincingly, Dr. McGrew highlighted that Dr. Morgan provides only two examples of differences within the Information subtest, and the scores of this subtest are statistically insignificant between the two administrations. (Id. at 21, 26).
Relying on Dr. McGrew's and Dr. Hunter's credible explanations, the Court finds that Dr. Morgan's criticisms of Roland's performance on the Information subtest are unpersuasive, and the Court declines to invalidate Roland's IQ scores on this basis.
See also Lewis , 2010 WL 5418901, at *10 (disregarding the conclusions of two experts who used index scores rather than FSIQ scores in their determinations).
Pattern of Differential Effort. As an additional example of Roland's test-taking behavior, differential effort between the two examinations, and test invalidity, both Drs. Morgan and Denney pointed to the 30-point scaled score drop from November 2016 to March 2017 in one of Roland's RBANS indices. (Dr. Morgan's Report at 12–13; Dr. Denney's Report at 13). Specifically, they noted the RBANS Delayed Memory Index scaled score of 44 obtained during Dr. Morgan's assessment (as compared to a scaled score of 74 during Dr. Hunter's assessment). (See Gov. Ex. 195; see supra at 506-07).
The Court rejects Dr. Morgan's and Dr. Denney's conclusions, in part, because of a plausible explanation for the 30-point difference credibly and persuasively outlined by Dr. Bigler. Dr. Bigler clarified that the 30-point Delayed Memory scaled score drop is actually reflective of a small raw score difference of two points, which led to a greater scaled score difference because the raw score was at the tail-end of the distribution and therefore sensitive to small changes. (D.E. No. 422, Tr. at 122–25 (Dr. Bigler's Testimony)). Notably, Dr. Denney endorsed Dr. Bigler's raw-score explanation. (See Dr. Denney's Report at 21 (stating that Dr. Bigler "is correct in the description that a raw score 18 would result in a 30 Index score point drop compared to a score of 19")). In light of this evidence, the Court agrees with Dr. Bigler that the two-point difference in the raw scores reveals that "the magnitude of what [Roland] didn't recall is minimal." (Id. at 134–35). Dr. Bigler further emphasized that the RBANS also contained two embedded validity measures, which Roland passed. (D.E. No. 422, Tr. at 59 (Dr. Bigler's Testimony)). That evidence, too, is detrimental to the merits of Dr. Morgan's and Dr. Denney's conclusion.
Dr. Bigler explained:
Turning to the Delayed Memory index score, Dr. Morgan does not provide statements based on raw data comparisons, only on the standard score comparisons, which frankly are misleading. Dr. Morgan's findings are listed first and Dr. Hunter's are in parentheses and highlighted. The raw data reflects the following scores for the Delayed Memory index: List Recall 4 (5), List Recognition 18 (20), Story Recall 5 (6) and Figure Recall 5 (4). Obviously, at the raw data level, these scores are not discrepant. When looking up the standard score in the RBANS Update Administration Manual, under ages 20 – 39 (Table 26) the intersection of Dr. Morgan's evaluation places the Standard score at 44, but note this is at the end of the List Recognition statistical distribution, with a score of 18. Mr. Roland's score of 20 with Dr. Hunter, jumps his score up by 30 standard score points (74) because he recalled one more word on the List Recall. This is entirely within expected limits of variability and simply cannot be interpreted as a discrepant finding. On page 13, concerning the RBANS, Dr. Morgan argues: "These discrepancies, where some performances are improved and some declined over time are unlikely and cannot be attributed to the mental status of Mr. Roland. Rather, they have to do with his behavior and effort. At a minimum, these discrepancies on the same test on two occasions indicate the presence of variable and questionable effort, and resultant unreliable, invalid scores; at a maximum, they suggest malingered performance." Obviously, given the above explanation that the RBANS raw scores are not discrepant, this statement by Dr. Morgan cannot be supported.
(Dr. Bigler's Report at 7).
3. Conclusions on Dr. Denney's Assessment of Roland's IQ Score
Dr. Denney concluded that Roland's FSIQ scores are invalid and that Roland was malingering during the administration of the two WAIS–IV IQ tests. (D.E. No. 420, Tr. at 66–67, 165 (Dr. Denney's Testimony)). It is without question that Dr. Denney's distinction as one of only seven individuals in the world who are board-certified in both neuropsychology and forensic psychology is impressive. But, having had an opportunity to consider the testimony, conclusions, and demeanor of all experts at the hearing and all the evidence proffered by the parties, the Court finds Dr. Bigler's testimony and conclusions more thorough, credible, and therefore more persuasive. So, compared to Dr. Bigler, the Court finds Dr. Denney's testimony to be less credible and will assign it less weight.
Nonsensical Responses. In support of his conclusion that Roland was malingering during Dr. Hunter's test, Dr. Denney stated that Roland showed an "improper test taking attitude" because Roland said "Cat in the Hat" wrote Hamlet. (Dr. Denney's Report at 11; see also supra at 514-15). For Dr. Denney, this is a nonsensical response indicative of test invalidity. (See supra at 514-15). Dr. Hunter, on the other hand, testified that Roland's answer was indicative of "an impulsive response out of frustration, uncertainty"; he believed Roland "was not sure about a response." (D.E. No. 386, Tr. at 152 (Dr. Hunter's Testimony)). But, as Dr. Bigler explained during his testimony, malingering requires that someone know the right answer and provide a false one. (D.E. No. 422, Tr. at 48 (Dr. Bigler's Testimony)). In other words, malingering requires an "intent to deceive." (Id. at 149). The Court therefore finds that Roland's answer—even if impulsive, uncertain, or nonsensical—does not support Dr. Denney's inference that Roland was malingering (i.e., intended to deceive).
Failed Easy Items. Next, Dr. Denney posits that Roland's performance on four subtests during Dr. Hunter's exam further supports the conclusion that Roland was malingering. (See supra at 513-14). Specifically, in the Block Design, Similarities, and Figure Weights subtests, Roland failed easier items after he successfully passed harder start items. (See supra at 513-14). And during the Digit Span subtest, Roland "could repeat up to 7 digits forward, yet he could not repeat a string of 4, 5, or 6 digits twice in a row." (See supra at 514). Dr. Denney concluded that the only explanation was Roland's malingering or poor effort. (See supra at 513-14). This conclusory opinion is reason enough for the Court to disregard Dr. Denney's conclusion. Indeed, even the "Slick article," on which Dr. Denney relies for his assessment, does not justify his conclusion. (See D.E. No. 420, Tr. at 165–66 (Dr. Denney discussing the Slick article)). In fact, the article instructs that "[t]o conclude that a person is malingering, one must rule out the alternatives." (See Def. Ex. 85f at 558). Having had an opportunity to hear and see Dr. Denney testify, and after considering his testimony, written report, and all of the evidence in the record, the Court is not convinced that Dr. Denney's analysis ruled out alternatives or that Dr. Denney considered any explanations other than Roland's effort or malingering.
(See Def. Ex. 85f, Daniel J. Slick, Elisabeth M.S. Sherman & Grant L. Iverson, "Diagnostic Criteria for Malingered Neurocognitive Dysfunction Proposed Standards for Clinical Practice and Research").
Rather, the Court is persuaded by Dr. Bigler's credible and thorough explanation that several alternatives may be possible here. (D.E. No. 422, Tr. at 199–200 (Dr. Bigler's Testimony)). Dr. Bigler explained that many factors can affect test performance, including (among others) time of day, sleep history, and boredom. (See id. ). Addressing Dr. Denney's concerns on Roland's Digit Span backward-forward performance, Dr. Bigler testified that, rather than poor effort being the only explanation, "[i]f there is variable task engagement with the forward aspect, the person just may muster up more attention with the reverse aspect." (Id. at 90). And the Court does not need an expert to conclude that people can have variable task engagement, attention spans, and mood swings that may affect test performance. Common sense indicates that poor effort or malingering cannot possibly be the only explanations here.
Effort Tests. Moreover, the Court finds that Dr. Denney's skepticism about the significance of passing effort tests injures his credibility. (See supra at 514-15). Equally harmful to Dr. Denney's credibility is the article he cites in support of his assertion that "[i]t is typical for individuals attempting to feign on intelligence tests to place results in the range of mild ID." (Dr. Denney's Report at 20–21 (citing Gov. Ex. 368)). That article, however, cautions that it is "erroneous to assume that such observations are conclusive and we would advocate that other assessment procedures specifically developed for detecting malingering are administered in order to substantiate the results." (Gov. Ex. 368 at 315). And here, it is undisputed that Roland passed all "assessment procedures specifically developed for detecting malingering." (See id. ; supra at 509-12).
Subtest–Score Inconsistency. Finally, regarding Dr. Denney's comparisons of some of Roland's subtest scores during Dr. Morgan's administration to the results of other subtests (see supra at 514-15), the Court agrees with the concerns raised in Smith and "questions the appropriateness of comparing the results of one subtest with a different subtest and then arguing they are somehow inconsistent. It is akin to the proverbial comparing of apples with oranges." 790 F.Supp.2d at 492. In any event, the Court finds credible and persuasive Dr. Bigler's and Dr. McGrew's "straightforward ways to explain" Dr. Denney's concerns here. For starters, Dr. Bigler noted that while the subtests do have some correlation (a "shared variance" that accounts for a certain percentage of the relationship), the remaining percentage of the relationship between these subtests is unknown. (See D.E. No. 422, Tr. at 99 (Dr. Bigler's Testimony)).
(See D.E. No. 422, Tr. at 87–88 (Dr. Bigler testifying that "there are straightforward ways to explain these findings")).
Explaining the alleged "inconsistencies" in Roland's Arithmetic and Digit Span subtest scores, Dr. Bigler testified that the Arithmetic subtest has verbal, language, and context aspects that are not present in the Digit Span subtest, so that the two comprise "different tasks" and engage "different neurological function." (Id. at 87–88). Consistent with Dr. Bigler's testimony, Dr. McGrew pointed out that the correlation between the Arithmetic and Digit Span subtests is 0.60, noting that they have only 36% in common. (D.E. No. 425, Tr. at 58 (Dr. McGrew's Testimony); see also WAIS–IV Manual at 62 (listing intercorrelations of subtest scores)). In other words, the 36% overlap (the square of the 0.60 correlation) reveals that the subtests have less in common than not. (D.E. No. 425, Tr. at 59 (Dr. McGrew's Testimony)).
Comparing Roland's performance on the Symbol Search and Coding subtests, Dr. Denney revealed during cross-examination that the correlation between these two subtests is 0.65, and they therefore have only 42% in common. (D.E. No. 420, Tr. at 209 (Dr. Denney's Testimony); see also WAIS–IV Manual at 62 (listing intercorrelations of subtest scores)). Dr. Denney also testified that the correlation between the Coding and Arithmetic subtests is 0.43, which reveals that the area of overlap is only 18%. (D.E. No. 420, Tr. at 207 (Dr. Denney's Testimony); see also WAIS–IV Manual at 62 (listing intercorrelations of subtest scores)). Put differently, "82 percent of the scoring is different between the two tests." (D.E. No. 420, Tr. at 207 (Dr. Denney's Testimony)).
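The percentages recited above follow from squaring each correlation coefficient, which yields the proportion of variance that two subtests share. The short Python sketch below merely illustrates that arithmetic; it is not part of the record, and the function and variable names are hypothetical. The correlation values are those recited above from the testimony and the WAIS–IV Manual.

```python
# Illustration only: shared variance is the square of the correlation (r * r).
# The correlation values are those recited in the testimony; all names here
# are hypothetical and chosen for readability.
correlations = {
    ("Arithmetic", "Digit Span"): 0.60,
    ("Symbol Search", "Coding"): 0.65,
    ("Coding", "Arithmetic"): 0.43,
}


def shared_variance(r: float) -> float:
    """Return the proportion of variance two measures have in common."""
    return r * r


def summarize(pairs):
    """Return (subtest A, subtest B, % shared, % not shared) for each pair."""
    rows = []
    for (a, b), r in pairs.items():
        shared = shared_variance(r)
        rows.append((a, b, round(shared * 100), round((1 - shared) * 100)))
    return rows


for a, b, shared_pct, unshared_pct in summarize(correlations):
    print(f"{a} / {b}: {shared_pct}% shared, {unshared_pct}% not shared")
```

On these inputs the sketch reproduces the 36%, 42%, and 18% overlap figures, and the 82% non-overlap figure, stated in the testimony.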
Given that each of the subtests, despite some intercorrelation among them, tests different skills and overlaps with the others only to a limited degree, the Court is not inclined to accept Dr. Denney's conclusion that these alleged "inconsistencies raise serious concern about the validity of" Roland's IQ tests (see Dr. Denney's Report at 12). This result is appropriate especially in light of the preponderance standard here.
4. Conclusions on Test Validity
The Court has carefully considered the Government's experts' rationales for rejecting the validity of Roland's IQ scores, but finds in light of the entirety of the evidence that their conclusions must be rejected. As noted earlier, IQ tests are designed to aggregate data from the item level, to the subtest level, to the index scores, to the FSIQ scores. (See supra at 503-04). The Government's experts' overarching argument appears to be that subtest score discrepancies (if enough are present) raise red flags, regardless of whether the differences in scores are statistically significant. (See supra at 512-15). But, as Dr. McGrew explained, "individual subtest items have very low reliability (typically ranging from .10 to .40 ...)." (Dr. McGrew's Report at 21). The WAIS–IV Manual similarly states that subtest scores are less reliable because "each subtest represents only a narrow portion of an individual's entire intellectual functioning, whereas the composite scores summarize the individual's performance on a broader sample of abilities." (WAIS–IV Manual at 23). Considering that individual subtest scores are the least reliable scores and have the greatest variability, the Court is not persuaded by the Government's experts' position that Roland's alleged discrepancies in the subtest scores (i.e., the most unreliable pieces of information) between the two WAIS–IV administrations, of which nine of ten are statistically insignificant, suggest that both FSIQ scores are invalid. (See supra at 506-07). The Government's experts' rationale for rejecting the validity of Roland's IQ scores is unpersuasive for the independent reason that each of the concerns raised by Drs. Morgan and Denney could be addressed by alternative explanations, as provided by Drs. Hunter, McGrew, and Bigler. (See supra at 521-23). And the Government's experts' failure to even consider any of these alternatives before discounting them again calls into question their ultimate conclusions. (See supra at 512-15). Taking into account the plausibility of the alternative explanations proffered by Roland's experts and the preponderance standard here, the Court again rejects Dr. Morgan's and Dr. Denney's conclusions.
Like in Hardy , this Court also "finds this testimony unsettling, as it evidences a casual attitude towards the science that undergirds IQ testing, and more specifically, the proper statistical interpretation of the various tests at issue." 762 F.Supp.2d at 870.
Notably, the test-comparison methodology employed by the Government's experts here appears to have been rejected by all courts to have considered it, citing similar concerns. See, e.g. , Smith , 790 F.Supp.2d at 494 (holding that expert's "idiosyncratic picking apart of a few isolated responses to challenge the overall results was overreaching and simply not credible"); Hardy , 762 F.Supp.2d at 870 (rejecting experts' conclusions based on "test score discrepancies ... regardless of whether the differences in scores are statistically meaningful"); Lewis , 2010 WL 5418901, at *10 (disagreeing with experts' "decision to use the index scores, instead of the full scale scores, in their determinations"). At the hearing, the Court asked the Government whether any "precedent supports this method of validation." (D.E. No. 426, Tr. at 31). The Government relied on Montgomery (id. at 31–32), but that reliance is misplaced. There, the court found that the defendant's IQ scores were valid and that the defendant displayed adequate effort—despite failing some of the effort tests. See Montgomery , 2014 WL 1516147, at *37. So, the authority on which the Government relies also fails to support its questionable validation practices.
Considering the significance of the potential secondary gain in this case, the Court nevertheless undertook an exhaustive evaluation of whether Roland was exerting inadequate effort or malingering during his IQ testing. (See supra at 514-23). Having considered each expert's written reports, testimony, and demeanor at the hearing, and all of the evidence proffered by the parties, the Court finds that Roland's two WAIS–IV scores are not invalid due to low effort or malingering. Contrary to the Government's position, the evidence presented at the hearing provides several indications of test validity.
Wiley v. Epps , 625 F.3d 199, 213 (5th Cir. 2010) ("A defendant must also prove through appropriate testing that he is not malingering.").
Remarkably, Roland passed a total of twelve test-validity measures designed to detect poor effort and malingering, six of which were administered by Dr. Morgan. (See supra at 509-12). Both parties' experts indicated that effort testing is an important component of accurately measuring a person's IQ. (See supra at 507). As already noted, Dr. Morgan described these effort tests as the "best way to test for the feigning of selective deficits—i.e. to determine if the subject is 'faking it.' " (Dr. Morgan's Aff. at 5). And the Court finds credible Roland's experts' position that the standard clinical practice is to accept the results of test data when the examinee passes the validity measures. (See supra at 507, 515-16). Consistent with this view, the court in Montgomery, authority on which the Government relies, held that if a defendant passes most of the effort tests administered to him, that defendant satisfies his burden of showing that it is more likely than not he was demonstrating adequate effort on his IQ tests. See 2014 WL 1516147, at *44. Here, it is undisputed that Roland passed all twelve effort tests administered to him. (See supra at 509-12). Dr. Hunter repeated in his contemporaneous notes, written report, and testimony at the hearing that Roland was always putting in effort during testing, and nothing in the record suggests that Dr. Morgan noted any signs of a lack of effort during his administration. (See supra at 510-11, 512). In fact, Dr. Morgan testified that he had no concerns about the way Roland "was behaving during the testing." (D.E. No. 412, Tr. at 101 (Dr. Morgan's Testimony)). The Court thus finds that Roland did not malinger or apply poor effort during the administration of the two WAIS–IV IQ tests. The Court comes to this conclusion in part because of Dr. Hunter's and Dr. Morgan's vast experience in administering these tests and their clinical ability to spot subpar performance.
Another indication of test validity is the highly consistent IQ test results, not only in the FSIQ scores but also in the index and subtest scores. (See supra at 506-07). Roland's Prong–One experts (Drs. Hunter, Bigler, and McGrew) uniformly asserted that consistency of scores is important to assessing the validity of test results. Courts have also recognized that consistency among intelligence testing, together with other consistent evidence, can be a compelling argument against the possibility of malingering. See, e.g., Nelson, 419 F.Supp.2d at 902 ("However, the most compelling argument against the possibility of malingering in this case is the overwhelming consistency among all of the intelligence testing, academic testing, neuropsychological testing, and anecdotal recollections, including those dating from the time when Nelson had no incentive to malinger."); see also Shields, slip op. at 23 (holding that consistency among defendant's IQ scores supported their validity and discredited the position that defendant's results were "the product of poor effort or malingering to obtain some secondary gain").
(See Dr. Hunter's Report at 16; Dr. McGrew's Report at 11, 21, 28; see also D.E. No. 422, Tr. at 155 (Dr. Bigler testifying that "consistency is very important"); D.E. No. 423, Tr. at 88 (Dr. McGrew: "There is no statistically significant difference between these four different estimates of general intelligence. It also tells me that there is a high degree of convergence ... If you have multiple pieces of information telling you the same thing, you are confident that ... it is valid.")).
First , there is no statistically significant difference in the FSIQ scores between Dr. Hunter's and Dr. Morgan's testing. (See supra at 506-07). The two KBIT composite IQ scores are also within the SEM confidence band values of each other and the two WAIS–IV tests. (See supra at 506-07). The Court agrees with Dr. McGrew's assessment that "[t]he multiple score convergence (two WAIS–IV IQ test scores; two brief KBIT/KBIT–2 IQ scores) increases the confidence ... in concluding that Mr. Roland meets ... Prong One for a diagnosis of intellectual disability." (Dr. McGrew's Report at 12).
(See also D.E. No. 423, Tr. at 88 (Dr. McGrew: "If you have multiple pieces of information telling you the same thing, you are confident that ... it is valid.... this is compelling consistency.")).
Second , there is no statistically significant difference in the index scores between Dr. Hunter's and Dr. Morgan's testing. (See supra at 506-07). Dr. McGrew testified that the correlation between the patterns of the two WAIS–IV tests is "extremely high," noting that there was no statistically significant difference on any of the four WAIS–IV indices between the two administrations. (D.E. No. 423, Tr. at 107 (Dr. McGrew's Testimony); see also Dr. McGrew's Report at 29). This fact, according to Dr. McGrew, is "[a]n additional source of IQ score consistency and convergence," which means that Roland's IQ scores are "a reliable and valid estimate of his general intellectual functioning." (Dr. McGrew's Report at 10–11).
Third , there is no statistically significant difference in nine out of ten subtest scores between Dr. Hunter's and Dr. Morgan's testing. (See supra at 506-07). This means "that across the 10 WAIS–IV subtests, 4 composite index scores, and 1 Full Scale IQ score (15 total scores), Mr. Roland's performance on both WAIS–IV test administrations are remarkably convergent and consistent. That is, 14 of 15 (approximately 93% of the available WAIS–IV score information) of his WAIS–IV Time 1 to Time 2 scores are not statistically significantly different." (Dr. McGrew's Report at 23). "An objective statistical comparison of the score profiles (correlation) reveals a high degree of pattern consistency (correlation of .83/.86)." (Id. at 28). This Court agrees with Dr. McGrew that "[s]uch a high degree of similarity would be extremely difficult to fake by any individual across the 10 WAIS–IV subtests across a four-month period." (Id. ).
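The arithmetic behind the "14 of 15 (approximately 93%)" figure quoted from Dr. McGrew's report can be checked with a short sketch. The counts come from the quoted passage; the variable names are illustrative only and do not appear in the record.

```python
# Counts taken from the passage quoted from Dr. McGrew's Report at 23.
# Variable names are hypothetical and used only for illustration.
subtests = 10        # WAIS-IV subtests compared across the two administrations
index_scores = 4     # composite index scores
full_scale = 1       # Full Scale IQ score

total_scores = subtests + index_scores + full_scale

# Only one subtest score differed to a statistically significant degree.
significantly_different = 1
consistent_scores = total_scores - significantly_different

share_consistent = consistent_scores / total_scores
print(f"{consistent_scores} of {total_scores} scores ({share_consistent:.1%}) "
      "are not statistically significantly different")
```

The sketch only reproduces the proportion; it does not test statistical significance, which depends on the reliability data in the test manual.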
Finally , and significantly, the Court notes that Roland had a similar pattern of strengths and weaknesses across Dr. Morgan's and Dr. Hunter's testing. For example, Dr. McGrew pointed to a graph where he overlaid Roland's raw scores on both WAIS–IV tests and compellingly explained at the hearing that, with the exception of the Matrix Reasoning subtest, Roland's performance, i.e., "general shape, visually is the same." (D.E. No. 423, Tr. at 107 (Dr. McGrew's Testimony)). Dr. McGrew further pointed out that Roland's "[m]ountains and valleys, strengths and weaknesses" exhibit a pattern that is "very stable." (Id. ; see also Dr. McGrew's Report at 29). Equally compelling is Dr. McGrew's testimony that Roland's consistent pattern of strengths and weaknesses is "near impossible" to fake. (D.E. No. 425, Tr. at 82 (Dr. McGrew's Testimony)). He explained that "unless you know the test and you specifically know the items ... I don't see how it could be done." (Id. ).
(See also D.E. No. 422, Tr. at 153 (Dr. Bigler's Testimony: "Q. Looking at his history, looking at the overall picture, ... the 2002 KBIT score, and these two testings as well as neuropsych testing, how possible would it be to fake this profile over all of these administrations? A. I think it would be very difficult.")).
The overwhelming evidence therefore is that Roland's responses to both tests are consistent at every meaningful level. The Court agrees with the analysis in Shields that, in light of the consistency of Roland's IQ scores (from the time he was 17), it is simply not plausible that Roland's WAIS–IV scores are artificially low as a result of poor effort or malingering for secondary gain. See slip op. at 24. As the Shields court noted, "[t]o accept the Government's theory would mean that Defendant is excellent at faking low IQ scores. It would also mean that at a relatively young age Defendant adopted a strategy of generating artificially low IQ scores in order to hide his intelligence and further that Defendant has been remarkably effective at getting these scores on different tests over several years to fall close to one another in a way that would be difficult even for a psychometric expert." Id.
After considering all of the experts' written reports, testimony, and demeanor at the hearing, the evidence proffered by the parties, the clinical guidelines, the Supreme Court's instructions in Atkins , Hall , and Moore , and other relevant caselaw, the Court again concludes that using a single statistically significant subtest score difference to undermine nine statistically insignificant subtest scores, four statistically insignificant index scores, and four statistically insignificant IQ scores is inappropriate. Accordingly, the Court finds that Roland's IQ scores are valid, reliable, and demonstrate significantly subaverage intellectual functioning to satisfy Prong One of the ID definition.
viii. Other Indicia of Roland's Deficits in Intellectual Functioning
Other evidence in the record also supports the conclusion that Roland's IQ scores are valid and Roland therefore has deficits in intellectual functioning. The most compelling evidence supporting Roland's deficits in intellectual functioning is the SSA's determination in 1999 that Roland was mentally retarded, when he was 14. (See Def. Ex. 17 at 7). Dr. Huber, the medical consultant who changed Roland's designation from learning disabled to mentally retarded, testified that it was his practice to request an IQ test for "nearly all" applicants suspected of having a potential ID and that he would not diagnose a claimant as fitting the criteria for ID if he did not believe it to be the case after a thorough review of the claimant's records. (See supra at 497). Nothing in the record suggests that this process did not occur in evaluating Roland's claim. Notably, Roland began receiving SSI benefit payments in 1996, when the SSA initially determined that he was learning disabled. (See Def. Ex. 17 at 5, 8). As a result, there was no reason for Roland to malinger in 1999, when the SSA changed his designation to mentally retarded.
The APA, again, provides that deficits in intellectual functioning are evidenced by deficits in reasoning, problem solving, planning, abstract thinking, judgment, academic learning, and learning from experience. See DSM–5 at 33. The record is replete with examples of these deficits. For example, in administering the 2002 KBIT, when Roland was 17 years old, Dr. Farber noted that her evaluation "reveals a young man who has very poor judgment and little insight into his behaviors." (Def. Ex. 9a at 59). About Roland's effort, Dr. Farber recorded that Roland was "cooperative with this interviewer and the interview process," undercutting concerns of any poor test-taking attitude. (Id.). Similarly, during a substance-abuse screening and interview at the Juvenile Justice Commission, when Roland was 17 years old, the evaluator noted that Roland "appeared somewhat cognitively limited." (Id. at 85). And in 2005, while in the custody of the New Jersey Department of Corrections, Roland again "presented as cognitively, socially and behaviorally immature individual with limited judgment and decision-making." (Gov. Ex. 110 at 7).
In light of this evidence, the Court finds that Roland's IQ scores comport with Roland's clinical history. Roland's SSA determination and noted cognitive limitations, combined with his poor academic performance and exposure to a plethora of risk factors (see supra at 494-96), buttress this Court's finding that Roland's IQ scores are valid and demonstrate that Roland has significantly subaverage intellectual functioning to satisfy the requirements of Prong One.
The Government argues that "[t]hough Defendant performed poorly in school as a child, the evidence suggests ... that performance was related to behavioral and emotional issues rather than cognitive deficits." (D.E. No. 433 ("Gov. Post–Hearing Submission") ¶¶ 287, 298). This argument has been routinely rejected by courts and clinical guidelines alike. (See supra at 480). Indeed, the Supreme Court rejected the same argument in Moore as recently as three months prior to Roland's evidentiary hearing. There, the lower court held that "Moore's problems in kindergarten were more likely cause[d] by emotional problems than by intellectual disability." Moore, 137 S.Ct. at 1051. In vacating the lower court's decision, the Supreme Court held that "[t]he existence of a personality disorder or mental-health issue, in short, is not evidence that a person does not also have intellectual disability." Id.; see also Brumfield, 135 S.Ct. at 2280 (determining that the relevance of defendant's "antisocial personality" diagnosis is "unclear" and not inconsistent with ID, since "individuals with [ID] also tend to have a number of other mental health disorders, including personality disorders"). For the same reasons, the Court dismisses the Government's argument here.
ix. Prong One: Conclusion
The Court concludes that Roland has satisfied his burden of proving that he more likely than not suffers from significantly subaverage intellectual functioning to satisfy Prong One of the ID definition. See AAIDD–11 at 41; DSM–5 at 33. The Court has thoroughly analyzed the Government's experts' reasons for invalidating Roland's IQ scores. But, after carefully considering each expert's qualifications, reports, testimony, and demeanor at the hearing, and the relevant evidence proffered by the parties, the Court concludes that all of the credible evidence lends full support to the validity of those scores. The Court therefore finds that Roland's valid, consistent IQ scores—particularly the 71 and 75 FSIQ scores on the "gold standard" WAIS–IV—comfortably place him approximately two or more standard deviations below the mean, with or without a Flynn adjustment. The Court now turns to the other criteria relevant for an ID diagnosis.
See Atkins , 536 U.S. at 309 n.5, 122 S.Ct. 2242 (noting that "an IQ between 70 and 75 or lower ... is typically considered the cutoff IQ score for the intellectual function prong of the mental retardation definition"); McManus , 779 F.3d at 650 ("[A] full-scale IQ score of 70–75 or lower ordinarily will satisfy the first requirement for a finding of intellectual disability."); see also Moore , 137 S.Ct. at 1045 & n.4, 1047 (relying on Moore's IQ score of 74); (supra at 498-502).
The "intellectual-disability inquiry" does not end, "one way or the other, based on [the defendant's] IQ score. Rather, in line with Hall , [the Supreme Court] require[s] that courts continue the inquiry and consider other evidence of intellectual disability where an individual's IQ score, adjusted for the test's standard error, falls within the clinically established range for intellectual-functioning deficits." Moore , 137 S.Ct. at 1050. In fact, where the lower end of a defendant's score falls at or below 70—as two of Roland's Flynn-adjusted IQ scores do here—the deciding court must move on to consider the defendant's adaptive functioning. See id. at 1049 ("Because the lower end of Moore's score range falls at or below 70, the CCA had to move on to consider Moore's adaptive functioning."); (supra at 502-03).
B. Prong Two: Deficits in Adaptive Functioning
To prevail on the second prong of his ID claim, Roland must prove, by a preponderance of the evidence, that he displays "significant limitations ... in adaptive behavior as expressed in conceptual, social, and practical adaptive skills." See AAIDD–11 at 6; DSM–5 at 33; Hall , 134 S.Ct. at 1994 (noting that the second criterion of ID is "deficits in adaptive behavior").
i. Definitional Standards
As explained above, the APA defines adaptive functioning in terms of three broad domains:
The conceptual (academic) domain involves competence in memory, language, reading, writing, math reasoning, acquisition of practical knowledge, problem solving, and judgment in novel situations, among others. The social domain involves awareness of others' thoughts, feelings, and experiences; empathy; interpersonal communication skills; friendship abilities; and social judgment, among others. The practical domain involves learning and self-management across life settings, including personal care, job responsibilities, money management, recreation, self-management of behavior, and school and work task organization, among others.
DSM–5 at 37–38. To meet Prong Two, a person's adaptive functioning in at least one of these three domains must be "sufficiently impaired that ongoing support is needed in order for the person to perform adequately in one or more life settings at school, at work, at home, or in the community." Id. at 38; Moore , 137 S.Ct. at 1046 ("In determining the significance of adaptive deficits, clinicians look to whether an individual's adaptive performance falls two or more standard deviations below the mean in any of the three adaptive skill sets (conceptual, social, and practical)."). The AAIDD, similarly, defines Prong Two as "significant limitations ... in conceptual, social, and practical skills." AAIDD–11 at 21. Moreover, "the deficits in adaptive functioning must be directly related to the intellectual impairments described in [Prong One]." DSM–5 at 38.
The Court notes that "directly related" does not mean causation. See Wilson , 170 F.Supp.3d at 370 ("With respect to the DSM–[5]'s effect on the legal standard for prong two, the court finds that this single sentence is insufficient to impose a requirement for a defendant to prove specific causation."). The parties also agree that this single sentence in the DSM–5 does not impose a causation requirement. (See D.E. No. 423, Tr. at 5, 7).
ii. Assessing Adaptive Functioning
Standardized Measures. Although not a formal component of the diagnostic criteria in either the DSM–5 or the AAIDD Manual, both guidelines direct clinicians to use standardized measures of adaptive functioning when possible. See DSM–5 at 37 ("Adaptive functioning is assessed using both clinical evaluation and individualized, culturally appropriate, psychometrically sound measures."); AAIDD–11 at 43 ("[S]ignificant limitations in adaptive behavior should be established through the use of standardized measures normed on the general population."). The AAIDD cautions, however, that "clinicians must recognize that adaptive behavior instruments are imperfect measures of personal competence that distinguish persons with and without ID as they face the everyday demands of life." AAIDD–11 at 51.
Federal courts have been reluctant to rely heavily on such tests, particularly in the Atkins context where they often are based on retrospective recollections of an individual's youth. In Hardy, for example, the district court noted that "the selection of the tests used to assess adaptive behavior, the persons selected as informants, the conduct of the interviews, and the ultimate interpretation of the tests' results are a good deal more dependent on subjective clinical judgment than the assessment of IQ." 762 F.Supp.2d at 883. The court concluded that "as the degree to which a matter is left to an individual clinician's judgment increases, so does the degree to which the Court must rely on its assessment of the relative competence and credibility of the individual experts before it to resolve disputes between them." Id.; see also Wilson, 170 F.Supp.3d at 368 (considering the results of standardized adaptive-functioning tests but placing "significantly greater weight on the clinical judgment of the experts the court finds most credible, along with record evidence from Wilson's youth"); Williams, 1 F.Supp.3d at 1147–48 (determining that it would place "some weight" on the results of standardized tests, but noting that the "breadth of evidence enables the court to take a multifactorial approach"); Salad, 959 F.Supp.2d at 878 ("Prong two generally requires a more expansive investigation of a defendant's life history and skill levels than could be fully evaluated through use of a normed instrument.").
In this case, experts on both sides administered standardized measures of Roland's adaptive behavior. Some issues, however, have been raised about the reliability of these measures. (See infra at 533 n.85, 539 n.92, 541 n.95). In such cases, other sources of information may be used to evaluate whether a defendant has deficits in adaptive behavior. See Montgomery , 2014 WL 1516147, at *47 (listing examples of other sources and citing AAIDD–11 at 48).
The AAIDD instructs:
If a standardized assessment measure cannot be used (e.g., if the assessment cannot be reliably administered per the test's recommended administrative procedures or if there are no reliable respondents to provide adaptive behavior information regarding the assessed person), other sources of adaptive behavior information can be used. In these infrequent cases, other information-gathering methods can be employed, such as ... review of school records, medical records, and previous psychological evaluations; or interviews with individuals who know the person and have had the opportunity to observe the person in the community but may not be able to provide a comprehensive report regarding the individual's adaptive behavior in order to complete a standardized adaptive behavior scale.
AAIDD–11 at 48.
Comprehensive Assessment. The AAIDD calls for a "comprehensive assessment of adaptive behavior" and instructs that an "individual's adaptive behavior should be evaluated using multiple respondents and multiple sources of converging data. Relevant archival data may include medical evaluations, school records, prior psychoeducational evaluations, Social Security Administration records, employment history, and family history." AAIDD–11 at 49–50. "Obtaining information from multiple respondents and other relevant sources ... is essential to providing corroborating information that provides a comprehensive picture of the individual's functioning." Id. at 47. The APA likewise instructs that adaptive functioning is assessed using "both clinical evaluation and individualized ... measures." DSM–5 at 37. Similar to the AAIDD, the APA directs clinicians to other sources of information including, "educational, developmental, medical, and mental health evaluations." Id. In light of this guidance, courts routinely use multiple sources of information, beyond standardized measures, to assess adaptive functioning.
A "comprehensive assessment of adaptive behavior will likely include a systematic review of the individual's family history, medical history, school records, employment records (if an adult), other relevant records and information, as well as clinical interviews with a person or persons who know the individual well." AAIDD–11 at 45.
See, e.g. , Hall , 134 S.Ct. at 1996 (stating that "factors indicating whether the person had deficits in adaptive functioning ... include evidence of past performance, environment, and upbringing"); Williams , 1 F.Supp.3d at 1145–46 ("Prong two generally requires a more expansive investigation of a defendant's life history and skill levels than could be fully evaluated through use of a normed instrument."); Salad , 959 F.Supp.2d at 878 (same); Davis , 611 F.Supp.2d at 492 (finding "a relative consensus that the best way to retroactively assess a defendant's adaptive functioning is to review the broadest set of data possible, and to look for consistency and convergence over time").
Accordingly, in assessing Roland's adaptive functioning, the Court will consider the results of the standardized measurements, but it will place significantly greater weight on the clinical judgment of the experts the Court finds most credible, testimony from fact witnesses the Court finds most credible, and historical evidence from Roland's life. See Williams , 1 F.Supp.3d at 1147–48 ; Montgomery , 2014 WL 1516147, at *47 ; Davis , 611 F.Supp.2d at 492.
iii. Analysis of the Experts' Assessment of Roland's Adaptive Functioning
1. Dr. Morgan's Assessment of Roland's Adaptive Functioning
Dr. Morgan based his testimony and report on an interview with Roland; a review of Roland's developmental, academic, and medical records; a thorough review of Roland's criminal and legal history, including various DVDs, CDs, letters, emails, and transcripts of statements and interviews; letters he believed Roland authored from prison; recordings of Roland's phone conversations from prison; a review of Roland's SSA award notice; and interviews with three individuals. (See Dr. Morgan's Report at 2–15).
Dr. Morgan concluded that Roland does not demonstrate significant limitations in adaptive functioning because (i) Roland's "functional adaptive behavior deficits fall 1.2 standards below the mean," and (ii) "[r]ecordings of conversations and correspondence reveal none of the naiveté, confusion, or lack of ability to participate normally in conversation one would expect of someone with ID." (Id. at 15).
In assessing Roland's adaptive functioning, the Government's experts relied, in part, on "correspondences from Mr. Roland dated 2010 and 2011 while incarcerated." (See Dr. Morgan's Report at 4; D.E. No. 427, Tr. at 73, 119–20, 132–35 (Dr. Morgan testifying that he relied on Roland's letters in his assessment); D.E. No. 416, Tr. at 221–22 (Dr. Marcopulos testifying that she reviewed some, but not all, letters authored by Roland); Joint Rebuttal Report at 9 ("Roland writes letters from prison to friends and family."); Gov. Exs. 131–46; Def. Exs. 11a–q). For Dr. Morgan, these letters "convey an understanding of events, human relationships, display good memory and knowledge of activities of himself and others." (Dr. Morgan's Report at 4).
At the hearing, however, cross-examination revealed that some of these letters were not written by Roland. (See D.E. No. 427, Tr. at 73–135 (Dr. Morgan's Testimony); Gov. Exs. 131–35). And "because there were so many letters," Dr. Morgan (reasonably) could not determine at the hearing if a specific letter was more or less significant in reaching his conclusion. (D.E. No. 427, Tr. at 119–35 (Dr. Morgan's Testimony)). As indicated at the hearing, the Court awards these letters no weight. (D.E. No. 427, Tr. at 131).
Drs. Morgan and Marcopulos bear no fault for reviewing letters that Roland did not write, but relying on these letters nevertheless damages these experts' credibility and calls into question their expert judgment. This is true particularly for Dr. Morgan (who reviewed both the letters and Roland's indictment) because a thorough review of these documents reveals that in some of the letters, the author refers to Roland in the third person. (See, e.g. , Gov. Exs. 132 & 135 (referring to "UZI" in the third person); Indictment at 1 (listing "FARAD ROLAND, a/k/a 'B.U.,' a/k/a 'UZI' "); Dr. Morgan's Report at 2 ("The indictment spells out the use of murder and violence in pursuit of their drug dealing criminal trade.")). While referring to UZI in the third person does not, on its own, prove that Roland did not write the letters, it should have prompted further exploration from these experts.
Standardized Instrument. For his assessment of Roland's adaptive functioning, Dr. Morgan administered the Vineland Adaptive Behavior Scales–Third Edition ("Vineland") to Cheryl Whitehead. (Id. at 14). Ms. Whitehead was Roland's neighbor and "the mother of children that Mr. Roland went to school with and played with for many years." (Id.). "She saw him on almost a daily basis between the ages of 7 and 12 and rated his adaptive behavior on the Vineland based on her recollection of his behavior at the age of 12." (Id.). The Vineland resulted in the following scores: (i) the Communication domain (which corresponds to the conceptual domain) yielded a score of 77, with a 90% confidence interval range of 70 to 84; (ii) the Daily Living Skills domain yielded a score of 94, with a 90% confidence interval range of 86 to 102; and (iii) the Socialization domain yielded a score of 83, with a 90% confidence interval range of 77 to 89. (Id.; Gov. Ex. 357, Vineland Score Report). "The overall adaptive behavior composite (ABC) resulted in a standard score of 82 that is at the 12th percentile rank and with a confidence interval of 77 to 87." (Dr. Morgan's Report at 14).
The Communication domain on the Vineland corresponds to the conceptual domain in both the AAIDD and the APA definitions of ID. (See D.E. No. 416, Tr. at 182 (Dr. Marcopulos testifying that the Communication scores on the Vineland are "considered conceptual, part of the conceptual domain.")).
Dr. Morgan ultimately concluded that
Roland's overall adaptive behavior composite places him in the low average range. The score of 82 represents 1.2 standard deviations below the mean (mean = 100, standard deviation = 15). Mr. Roland's overall score on the adaptive behavior composite or on the three domain scores does not reach the criterion of two standard deviations below the mean for a classification of intellectual disability.
(Id. ).
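The standard-deviation arithmetic in Dr. Morgan's conclusion can be illustrated with a brief sketch. It assumes, as the quoted passage states, a scale with a mean of 100 and a standard deviation of 15, and it further assumes a normal distribution to approximate the percentile rank; the actual Vineland percentile comes from the publisher's norm tables, not from this formula. The function and variable names are hypothetical.

```python
from statistics import NormalDist

MEAN, SD = 100, 15  # scale parameters stated in Dr. Morgan's report

def sd_below_mean(score: float) -> float:
    """Number of standard deviations a standard score falls below the mean."""
    return (MEAN - score) / SD

def approx_percentile(score: float) -> float:
    """Approximate percentile rank, ASSUMING a normal distribution of scores."""
    return NormalDist(MEAN, SD).cdf(score) * 100

# Scores reported from the Vineland administered to Ms. Whitehead.
composite = 82
domain_scores = {"Communication": 77, "Daily Living Skills": 94, "Socialization": 83}

# Two standard deviations below the mean is the criterion Dr. Morgan applied.
cutoff = MEAN - 2 * SD

print(f"Composite {composite}: {sd_below_mean(composite):.1f} SD below the mean, "
      f"about percentile {approx_percentile(composite):.0f}")
for domain, score in domain_scores.items():
    print(f"{domain}: {score} ({sd_below_mean(score):.2f} SD below the mean; "
          f"cutoff is {cutoff})")
```

The sketch works from point estimates only. It does not account for the reported confidence intervals, nor does it bear on the reliability concerns the Court raises about retrospective ratings.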
Historical and Qualitative Information. In addition to the Vineland results, Dr. Morgan also relied on historical and qualitative information to conclude that Roland did not demonstrate adaptive behavior deficits that fall two standard deviations below the mean, as required for an ID diagnosis. (See id. at 14–15). Below are some of Dr. Morgan's assessments based on his review of this information:
• "Numerous examples of Mr. Roland's behavior are available in the form of telephone conversations from jail with friends and family, and a video and recording of an interview conducted by the Hillside New Jersey Police Department, where Mr. Roland demonstrates essentially normal adaptive behavior abilities in terms of communication, understanding, following conversations, providing information, understanding the relevance of various concepts, and overall functional adaptive living skills." (Id. at 14).
• "[I]n the Hillside Police Department interview, Mr. Roland ... is seen and heard conversing normally with his interviewers. He understands their questions, remembers many facts, and provides relevant information, displays memory of events and others' activities and in all, shows no confusion, lack of understanding or inability to comprehend. At times, he is appropriately guarded, protective and suspicious. He does not display naiveté or 'gullibility.' " (Id. ).
• "In transcripts of conversations with Ms. Barnes, he again converses normally, shows memory for people, places, events and issues that had transpired between them. He displays humor, concern and appropriate emotional connection." (Id. ).
• "In the phone call dated November 12, 2016, he ... indicates he knows about Google. He understands the word 'confirm.' " (Id. ).
• "Roland's emails reflect emotional relatedness to family, the use of streetwise language, the use of verbal metaphors such as 'bundle of joy' (for a baby). They also reveal logical and coherent thinking and normal comprehension." (Id. at 15).
2. Dr. Greenspan's Assessment of Roland's Adaptive Functioning
Dr. Greenspan based his testimony and report on an interview with Roland; interviews with 11 individuals familiar with Roland (including Roland's brother, cousins, former girlfriend, housemates, and teachers); and an extensive review of Roland's life history records, including records regarding Roland's developmental, academic, medical, SSA, and legal history. (Dr. Greenspan's Report at 2–6; D.E. No. 408, Tr. at 72 (Dr. Greenspan's Testimony)). Dr. Greenspan concluded that a diagnosis of ID is warranted because "Roland has significant limitations in all three of the ID definitional criteria." (Dr. Greenspan's Report at 28).
As noted at the hearing on June 14, 2017, the affidavit of Todd Carter "is stricken from the record. The Court will not consider Todd Carter's affidavit." (D.E. No. 410, Tr. at 8).
Standardized Instrument. In evaluating Roland's adaptive functioning, Dr. Greenspan administered the Adaptive Behavior Diagnostic Scale ("ABDS"), a standardized instrument published in 2016. (Id. at 16). Dr. Greenspan selected the ABDS in part because it is the only adaptive-behavior rating instrument currently on the market that was specifically devised for diagnostic purposes, in that it has been validated against ID and non-ID samples. (Id. ). So, while other instruments determine if one is in the ID range by seeing how many standard deviations one falls below the general population mean, the ABDS allows for comparisons to both ID and non-ID samples. (Id. ).
In the Joint Rebuttal Report, Dr. Marcopulos criticized Dr. Greenspan's selection of the ABDS, noting that it is a "newly published, little researched test instrument." (Joint Rebuttal Report at 6). As Dr. Greenspan emphasized at the hearing, however, the clinical guidelines do not provide any guidance on which rating instrument clinicians should use. (D.E. No. 408, Tr. at 56 (Dr. Greenspan's Testimony)). He further noted that the "Vineland–III" (which Dr. Morgan used) "has only been out a few months," and, in any event, the ABDS "correlates very strongly with the Vineland." (Id. at 57, 62–65). Dr. Greenspan also explained that, unlike the Vineland, the ABDS corresponds better with the DSM–5 and the AAIDD definition of ID. (Id. ). Finally, Dr. Greenspan testified that although the ABDS is a new instrument, it has nevertheless undergone multiple trials of validity to ensure that the test does what it is designed to do. (Id. ). For these reasons, the Court is not persuaded by Dr. Marcopulos's criticisms of the ABDS.
Dr. Greenspan advised that, for adaptive-behavior rating scales, (i) self-ratings should not be used for diagnostic purposes; (ii) one should attempt to use multiple raters, to control as much as possible for bias or for divergences of opinion, especially in retroactive ratings; (iii) retrospective ratings should be used when a subject has been incarcerated for more than a year; and (iv) the Court should note that the standardized measures often measure "overt behavior (e.g., brushes teeth) and do not pay sufficient attention to underlying cognitive underpinnings (e.g., understands the importance of brushing teeth) of such behaviors." (Id. at 16–17). To his last point, Dr. Greenspan noted that he elected to use the ABDS in part because it has more judgment-oriented items than the other instruments. (Id. at 17).
Dr. Greenspan administered the ABDS to four individuals (referred to as raters or respondents) he believed knew Roland very well during the targeted age. The raters were Amin Roland (Roland's older brother by six years) who rated Roland at the age of 10; Habeeb Robinson (Roland's childhood friend of roughly similar age) who rated Roland at the age of 9; Kaia Macon (Roland's former girlfriend and mother of Roland's child) who rated Roland at the age of 23; and Jeanette ("Gigi") Carter (Roland's older maternal cousin) who rated Roland at the age of 10 to 11. (Id. ; D.E. No. 409 at 108 (Dr. Greenspan clarifying that Gigi rated Roland "at age ten, eleven"); Def. Ex. 50 ("Dr. Greenspan's ABDS Decl.") at 2–4). All raters produced scores that show Roland to be substantially deficient in at least one of the three adaptive domains. (Dr. Greenspan's Report at 18; Dr. Greenspan's ABDS Decl. at 4). Dr. Greenspan noted that the "ratings were quite low, falling in the area of moderate (minus three deviations) rather than the Mild (minus two standard deviations) ID." (Dr. Greenspan's Report at 17). Dr. Greenspan nevertheless concluded that Roland "is probably functioning more adaptively in the mild ID range." (D.E. No. 409, Tr. at 43–44 (Dr. Greenspan's Testimony)).
Dr. Greenspan found the raters to be "credible informants" because their descriptions were "similar" and they "had ample opportunity to observe [Roland's] behavior over time." (Dr. Greenspan's Report at 23). He described them as "quite emphatic and clear, and descriptive, in discussing [Roland's] deficits." (Id. ). Dr. Greenspan also noted that "not only were the witnesses consistent with each other, but in all of their descriptions and responses on the ABDS, [he] saw a consistent picture that matched deficits in [Roland's] intellectual functioning as indicated on earlier IQ scores (the KBIT from 2002, when he was 17), and recent IQ scores, as well as neuropsychological testing." (Id. at 24).
Historical and Qualitative Information. In addition to the four raters, Dr. Greenspan interviewed seven other individuals familiar with Roland. (See id. at 6 (listing individuals and their relationship to Roland)). He noted their observations in great detail in his report and summarized them at length at the hearing. (See, e.g. , id. at 22–24). In assessing Roland's adaptive functioning, Dr. Greenspan also reviewed archival records, and noted that a "second source of qualitative information comes from the comments in the records that are available which speak to adaptive functioning." (Id. at 19). Below are some examples of information from Roland's historical records Dr. Greenspan noted in his report:
• "Roland complained that every time he tried to sell drugs he would get caught";
• Roland "lacked an ability to abide by rules of juvenile correctional facilities, and showed poor judgment when confronted by staff or other residents";
• "Roland was a 'follower' and that his disruptive behavior reflected that tendency";
• Roland "was noted to have limited insight and poor cognitive functioning in assessments";
• "[w]hile in juvenile detention he reports a medical problem in very simple terms: he has 'holes in his teeth' "; and
• Roland "misspells his last name on TABE Form in 2001, 'Ralrand Farad.' " (Id. ).
On May 1, 2017, the Government moved to exclude any testimony regarding evidence that was obtained from Dr. Greenspan's use of a structured interview that he refers to as the "Common Sense Questionnaire" ("CSQ"). (See D.E. No. 328 ("Government's Daubert Motion")). The Government renewed its motion at the evidentiary hearing during Dr. Greenspan's testimony. (D.E. No. 410, Tr. at 43 ("Your Honor, at this time the [G]overnment renews its Daubert motion with respect to CSQ and ask[s] that all testimony with regard to the CSQ be stricken from the record.")). In evaluating Roland's Atkins claim, the Court does not consider or rely on Dr. Greenspan's CSQ or any testimony relating to it. Accordingly, the Government's Daubert Motion is denied as moot.
3. Conclusions on Dr. Morgan's Assessment of Roland's Adaptive Functioning
Dr. Morgan's evaluation of Roland's adaptive functioning is even less persuasive than his evaluation of Roland's intellectual functioning. The Court finds Dr. Morgan's testimony and conclusions in this matter unreliable and substantially lacking credibility in several respects.
The most glaring problem with Dr. Morgan's credibility and ultimate conclusion is that his Prong–Two evaluation fails to comport with the clinical standards. First and foremost, Dr. Morgan failed to undertake the comprehensive assessment that is required by the clinical guidelines (see supra at 529-31). A comprehensive evaluation of Roland's academic, medical, SSA, and other relevant records is especially necessary in this case, as the stakes are high and Dr. Morgan was aware of the risk that Ms. Whitehead (the only Vineland rater) may be biased and therefore unreliable (see infra at 541 n.95). The Court finds that Dr. Morgan's failure to conduct a comprehensive assessment of Roland's adaptive behavior, as instructed by the clinical standards (see supra at 529-31), is detrimental to his credibility.
In addition to failing to consider the entirety of Roland's records available to him, the Court is concerned about Dr. Morgan's unreasonable dismissal of anything at all that might suggest a different conclusion from his own. For example, he admitted during cross-examination that there was evidence of Roland's impairments in the record, yet he failed to address them in his report. (See supra at 492-94; infra at 536-37 n.89). Nothing justifies the absence in Dr. Morgan's report of any risk factors or the fact that his own test concluded that Roland had deficits in the conceptual domain —both of which he acknowledged during cross-examination. (See supra at 495-96). It appears that he ignored data and possible inferences so as to avoid even having to acknowledge the possibility that Roland has adaptive deficits or that he is ID. Given the significant role of clinical judgment and the highly subjective nature of an ID evaluation, the effect of having such a pervasive bias present in the evaluator is hard to overstate.
The Vineland administered by Dr. Morgan to Ms. Whitehead resulted in a Communication domain score of 77, with a 90% confidence interval range of 70 to 84. (See Gov. Ex. 167 at 14). Dr. Morgan revealed during cross-examination that Roland is impaired in the Communication domain of the Vineland (which corresponds to the conceptual domain in both the AAIDD and the APA definitions of ID). (D.E. No. 427, Tr. 71–72 (Dr. Morgan's Testimony); D.E. No. 416, Tr. at 182 (Dr. Marcopulos's Testimony)). When asked why he failed to note that Roland would meet the requirement for deficits in the conceptual domain in his expert report—given that an ID-diagnosis requires deficits in only one domain of adaptive functioning—Dr. Morgan answered, "[p]erhaps I should have, in retrospect." (Id. at 73).
Dr. Morgan's failure to disclose that Ms. Whitehead's rating demonstrated that Roland had deficits in the conceptual domain is concerning for the additional reason that it suggests Dr. Morgan ignored the guidelines' requirement of establishing deficits in only one of the three adaptive-functioning domains. (See supra at 528-29).
Equally troubling, and also in contravention of the clinical guidelines, Dr. Morgan analyzed Roland's verbal behavior and emphasized Roland's strengths to the exclusion of his deficits. For example, he notes in his written report that in a recorded call from prison, Roland (i) "indicates he knows about Google"; (ii) "understands the word 'confirm' "; and (iii) "displays social awareness." (Dr. Morgan's Report at 14). The AAIDD, however, explicitly cautions against use of verbal behavior in ID assessments. See AAIDD–11 at 102 ("Do not use ... verbal behavior to infer level of adaptive behavior or about having ID."); User's Guide at 20 (same). Citing Dr. Greenspan, the AAIDD explains that this restriction is based on the fact that "[t]here is not enough available information, and there is a lack of normative information." AAIDD–11 at 102; (see also Dr. Greenspan's Rebuttal Rep. at 9–10). Dr. Greenspan likewise testified that it is inappropriate to rely on verbal behavior to assess Roland's adaptive functioning. (See D.E. No. 409, Tr. at 39 (Dr. Greenspan testifying that "we should not use verbal behaviors, including written communications, to rule out ID")).
Moreover, Dr. Morgan's cherry-picked examples (that might indicate strengths in Roland's adaptive functioning) to conclude Roland does not demonstrate adaptive-behavior deficits also contravene the guidelines because, as the Supreme Court pointed out, "the medical community focuses the adaptive-functioning inquiry on adaptive deficits. " Moore , 137 S.Ct. at 1050 (citing AAIDD–11 at 47 ("significant limitations in conceptual, social, or practical adaptive skills [are] not outweighed by the potential strengths in some adaptive skills"); and DSM–5 at 33, 38 (inquiry should focus on "[d]eficits in adaptive functioning")). For these reasons, the Court will afford little weight to Roland's statements during these recorded calls in assessing Roland's adaptive functioning.
See Lewis , 2010 WL 5418901, at *26 (rejecting government expert's conclusion based on defendant's "verbal behavior in jailhouse telephone recordings, a videotaped interrogation, and communications between Defendant and himself" and noting that the AAIDD 2010 "explicitly cautions against use of verbal behavior in an intellectual disability assessment"); Davis , 611 F.Supp.2d at 494 ("The Court finds these [jail] telephone calls largely irrelevant to the assessment of the defendant's adaptive functioning.... [T]hese conversations are evidence of what the defense experts referred to as the 'cloak of competence,' which is the powerful tendency of mildly mentally retarded people to mask or compensate for their deficits.").
Yet another issue with Dr. Morgan's evaluation of Roland's adaptive functioning, and also contrary to the clinical guidelines, is that Dr. Morgan's opinion appears to be tainted by unjustified and inaccurate conceptions of how ID individuals are expected to act and appear. For example, in his report, Dr. Morgan asserted that "[d]uring numerous routine interviews by mental health professionals in the New Jersey prison system who were Ph.D. level psychologists and master's level social workers, no one during any of those interviews documented the suspicion that he had intellectual disability or that he had cognitive impairment of any kind." (Dr. Morgan's Report at 4). He further opined that "[r]ecordings of conversations and correspondence reveal none of the naiveté, confusion, or lack of ability to participate normally in conversation one would expect of someone with ID." (Id. at 15). And, at the hearing, Dr. Morgan testified, "here is a gentleman who has been seen, numerous occasions by mental health professionals over a period of time talking to, interviewing him, doing mental status examinations with him as is required by the prison, that surely something would have been, would have emerged in terms of discussions, discussing with him, that would on have been a tip-off that perhaps there was ID present." (D.E. No. 414, Tr. at 174–175 (Dr. Morgan's Testimony)). The AAIDD, however, cautions against the use of stereotypes in determining whether someone is ID. See User's Guide at 25–26; (supra at 479-80). Indeed, Dr. Morgan testified that he "agree[s] with the stereotype issues in the AAIDD. Stereotypes are wrong for anybody, especially people with ID." (D.E. No. 414, Tr. at 171 (Dr. Morgan's Testimony)). He even testified that ID "is not something you can see by looking at someone." (Id. at 175). 
Yet, by suggesting that Roland's ID would have been suspected by professionals who interacted with him and concluding that Roland's verbal behavior in recorded conversations reveals none of the traits that "one would expect of someone with ID," Dr. Morgan appears to rely on at least the stereotype that people with ID "look and talk differently from persons from the general population." See User's Guide at 26. But, as the Supreme Court has instructed (and Dr. Morgan appears to agree), "stereotypes, much more than medical and clinical appraisals, should spark skepticism." Moore , 137 S.Ct. at 1052.
This statement also taints Dr. Morgan's credibility because mental health professionals did, in fact, note issues with Roland's cognitive impairments. (See, e.g. , Gov. Ex. 110 at 7–8 (a doctor of psychology noting that Roland "presented as a cognitively, socially and behaviorally immature individual with limited judgment and decision-making"); Def. Ex. 9a at 85 (Ms. Gavan twice noting that Roland "appears to have poor insight and somewhat limited cognitive ability"); see also supra at 494-95). Dr. Morgan's assertion, therefore, is further proof that he failed to consider Roland's comprehensive history, providing yet another reason for this Court to discredit his opinions.
Moreover, Dr. Morgan testified that a notation of Roland's "poor judgment or limited insight ... would be relevant in thinking about intellectual disability" because it "would be a red flag, it would be a clue." (D.E. No. 414, Tr. at 176 (Dr. Morgan's Testimony)). Yet, when confronted during cross-examination with evidence from (i) the New Jersey Department of Corrections that Roland "[presented] as a cognitively, socially and behaviorally immature individual with limited judgment and decision making"; (ii) Ms. Gavan's assessment that Roland "appeared to be somewhat cognitively limited"; and (iii) Dr. Farber's assessment that Roland "has poor judgment and little insight into his behavior," Dr. Morgan dismissed such evidence as either irrelevant or vague. (See id. at 177–80). Despite his testimony just minutes earlier that such a notation would be a "red flag" or a "clue," Dr. Morgan dismissed this evidence on the grounds that (i) "it is so vague as to what cognitively limited means"; and (ii) he "d[idn't] think it is relevant" because "individuals with poor judgment don't necessarily have ID ... Poor judgment is a general term. Anybody can have poor judgment. People with high IQs can have poor judgment." (Id. ). The Court views Dr. Morgan's testimony here as another example of dismissing any evidence that might suggest a different conclusion from his own. (See supra at 534-38).
In any event, the Court agrees with Roland's experts that none of the cherry-picked examples on which Dr. Morgan—or any of the Government's experts—relies is necessarily inconsistent with mild ID. (See, e.g. , D.E. No. 387, Tr. at 111–13 (Dr. Hunter testifying that Roland's statements during recorded phone calls or during the Hillside Police Department interview are not inconsistent with his finding that Roland is ID)). Dr. Greenspan, who also reviewed this evidence, explained that people with ID (i) "can carry on conversations that are syntactically adequate"; (ii) "can carry on a conversation"; and (iii) "can joke around on the phone or in person, just as you would expect a much younger person, a non-ID person to do." (D.E. No. 409, Tr. at 38–39 (Dr. Greenspan's Testimony); D.E. No. 410, Tr. at 107–07 (Dr. Greenspan's Testimony); see also D.E. No. 420, Tr. at 75–76 (Dr. Denney testifying that people with ID can have strengths and weaknesses, can support their family, can be competent to stand trial, can romantically love and be romantically involved)).
See also United States v. Shields , 480 Fed.Appx. 381, 391 (6th Cir. 2012) (Clay, J., dissenting) (dissenting on other grounds but agreeing with the district court's determination that the defendant was ID even though "[h]e is skilled at convincing others around him to accomplish tasks on his behalf, he ignores social strictures [sic] ... he deceives others to achieve his personal objectives ... [and he] is often deceptive and manipulative"); Hardy , 762 F.Supp.2d at 902–03 (noting that the defendant was a "reasonably successful street level crack cocaine distributer within the projects" but concluding that "a person with mild [ID] is capable of running such an operation," and that such a person also is capable of shooting someone and committing other capital crimes).
Moreover, unlike Dr. Greenspan, Dr. Morgan did not personally conduct extensive interviews of those well acquainted with Roland. Instead, he relied on a single rater who, as it turns out, believes that Roland killed her son. (D.E. No. 414, Tr. at 73 (Dr. Morgan's Testimony)). Dr. Morgan did not address Ms. Whitehead's possible bias in his report, though he testified that he "factored [it] in." (Id. at 74). At the hearing, Dr. Morgan conceded that Ms. Whitehead's rating might "be positively skewed, and not represent Mr. Roland's actual adaptive functioning at the time that the observations took place." (Id. ). This concession is particularly relevant because, even with the risk of Ms. Whitehead's bias against Roland, the Vineland results indicated that Roland has deficits in the conceptual domain.
Ultimately, the Court suspects that Dr. Morgan's evaluation of Roland's adaptive functioning may have been influenced by his clear belief that Roland's IQ scores did not justify a diagnosis of ID. When considering Dr. Morgan's improper reliance on stereotypes and verbal behavior, his failure to recognize that Roland met the requirements for deficits in the conceptual domain based on Ms. Whitehead's rating, and his glossing over of any evidence that could support a differing conclusion, the Court is left with the undeniable impression that Dr. Morgan failed to engage in a truly objective evaluation of Roland. The Court therefore concludes that Dr. Morgan's credibility was thoroughly impeached and awards little weight to his opinions on Roland's adaptive functioning.
4. Conclusions on Dr. Greenspan's Assessment of Roland's Adaptive Functioning
Though Dr. Greenspan is undoubtedly one of the preeminent scholars on ID, the Court does have a few misgivings about his testimony. The Court's primary concern is Dr. Greenspan's recordkeeping and oversight of information provided by two of the ABDS raters. Specifically, the Government pointed out during cross-examination that when Dr. Greenspan administered the ABDS to Ms. Carter, his notes indicated that Ms. Carter rated Roland at two different ages: 10 and 17.5. (See Gov. Ex. 168 at 8). Then Dr. Greenspan noted Ms. Carter's score as a 32, which is impossible because the lowest score on the ABDS is 40. (Id. ). When the Government's experts pointed out Dr. Greenspan's scoring error, Dr. Greenspan rescored the tests and provided additional pages that he had inadvertently left out. (See Gov. Ex. 353; Dr. Greenspan's ABDS Decl. at 4). The Government also takes issue with Dr. Greenspan's assessment of Ms. Carter as a credible rater despite the fact that her description of Roland's deficits on the ABDS would require Roland to be institutionalized. (See Gov. Ex. 168 at 8).
See, e.g. , Ybarra , 869 F.3d at 1021 (recognizing Dr. Greenspan as "the most-cited authority in the 2002 and 2010 diagnostic manuals of the American Association on Intellectual Disabilities"); Lewis , 2010 WL 5418901, at *2 (describing Dr. Greenspan as a "preeminent scholar on intellectual disability").
The Government next pointed out Dr. Greenspan's discrepancies on Ms. Macon's evaluation. The Government first noted that Ms. Macon rated Roland at age 23, which is beyond the norms for the ABDS. (Id. ). Then, the Government raised concerns about the fact that Ms. Macon did not complete the practical form for the ABDS, which Dr. Greenspan initially explained was because she lacked information to do so but later said it was because he had run out of time while interviewing her. (D.E. No. 409, Tr. at 160 (Dr. Greenspan's Testimony); Gov. Ex. 355 at 4). The Court agrees with the Government's well-founded concerns, and for those reasons, assigns no weight to this aspect of Dr. Greenspan's assessment (i.e., the ratings provided by Ms. Carter and Ms. Macon). The Court will, however, rely on Ms. Carter's testimony at the hearing because, having had an opportunity to hear and see Ms. Carter testify, it finds Ms. Carter to be a credible and reliable witness.
The Government also contends that Amin Roland is not a credible rater in part because of Amin Roland's criminal history. (D.E. No. 381, Tr. at 25–27; D.E. No. 412, Tr. at 189–96; D.E. No. 426, Tr. at 185–86). Defense and Government experts agreed, however, that a rater's criminal history does not affect that rater's credibility regarding how well they know Roland. (See D.E. No. 408, Tr. at 76 (Dr. Greenspan testifying that a rater's criminal history does not make a rater unsuitable and that the "main criterion is do they know [Roland] well and are they being truthful"); D.E. No. 427, Tr. at 60 (Dr. Morgan's Testimony)). Indeed, Dr. Greenspan testified that Habeeb Robinson and Amin Roland (both of whom have criminal records and were interviewed in prison) "know Mr. Roland better than almost anybody," and they "produced a wealth of information[ ] that was validated across with other people and other information." (D.E. No. 408, Tr. at 76 (Dr. Greenspan's Testimony)). Dr. Morgan, too, in discussing Ms. Whitehead's rating—on which the Government repeatedly relies—stated that Ms. Whitehead's criminal record was not significant and does not affect her credibility. (D.E. No. 427, Tr. at 60 (Dr. Morgan's Testimony)). The Court also notes that Amin Roland's ratings were corroborated by witnesses who testified credibly at the hearing as well as documentary evidence. (See infra at 542-45; 547-48; 549-51). For these reasons, the Court rejects the Government's contention that Amin Roland is not a credible rater because of his criminal history.
Notwithstanding these problems with Dr. Greenspan's credibility, the Court nevertheless credits Dr. Greenspan's opinions, particularly with regard to Prong Two. Although much of Dr. Greenspan's income is attributed to expert testimony, the Court takes note of the fact that Dr. Greenspan is credited with providing the three-domain framework for assessing Prong Two and is recognized as a leading expert in the field (as demonstrated by the fact that he is the most cited authority in both the AAIDD–11 and the online edition of the DSM–5). The Court is also comforted by the fact that, despite his personal views on the death penalty, Dr. Greenspan testified that he has nevertheless declined to testify in a number of capital cases because he believed that the defendants were not ID. (D.E. No. 409, Tr. at 57–59 (Dr. Greenspan's Testimony)). In fact, Dr. Greenspan testified that he determined that another defendant in a death-penalty case was not ID as recently as one month prior to Roland's evidentiary hearing. (Id. at 62).
The Court credits Dr. Greenspan's opinion in part because of his dutiful adherence to the clinical standards. Specifically, the Court finds that Dr. Greenspan conducted a comprehensive evaluation of Roland's adaptive functioning that included an extensive review of Roland's life history records, including records regarding Roland's developmental, academic, medical, SSA, and legal history. (Dr. Greenspan's Report at 2–6; D.E. No. 408, Tr. at 72 (Dr. Greenspan's Testimony)). At the hearing, Dr. Greenspan was able to recall much of the information contained in those records from memory, without the aid of exhibits. Dr. Greenspan's thoroughness is also demonstrated by the fact that he interviewed 11 individuals and acknowledged Roland's risk factors. (See supra at 495-96; 532-34). An example of Dr. Greenspan's thoroughness—and credible exercise of clinical judgment—is his further investigation when he noticed that Roland had received some high grades at Sojourn (an anomaly in light of Roland's other records and test scores). (Dr. Greenspan's Report at 9). To understand this anomaly, Dr. Greenspan interviewed Kamala Conway, a retired teacher who worked at Sojourn during the relevant period. (Id. ).
Also helpful to the Court's positive credibility determination is the fact that (aside from the issues with the two raters discussed above) Dr. Greenspan's testimony and written reports were largely unimpeached. In fact, Dr. Marcopulos, the Government's adaptive-functioning rebuttal expert, was especially lacking in credibility. For starters, the Government represented to this Court that "the adaptive component" of the Joint Rebuttal Report "was written by Dr. Marcopulos." (D.E. No. 427, Tr. at 67–68). Yet, Dr. Marcopulos testified that she did not author a substantial portion of the adaptive-functioning section, which calls into question the reliability of the entire Joint Rebuttal Report and the credibility of both authors. (See, e.g. , D.E. No. 416, Tr. at 152 (Dr. Marcopulos testifying that she did not write a certain portion of the Joint Rebuttal Report); id. at 190–91 (Dr. Marcopulos testifying that she did not write another portion of the Joint Rebuttal Report)). The Court also questions the thoroughness of Dr. Marcopulos's review because her testimony comprised mostly general statements with little or no evidence from the record to support her opinions. (See, e.g. , D.E. No. 416, Tr. at 148 (Dr. Marcopulos testifying that Roland could use a cell phone but failing to recall any evidence to support this assertion)). For example, while Dr. Marcopulos criticized Dr. Greenspan's ABDS ratings as "inconsistent with other source of information documenting that Roland could perform these behaviors" (Joint Rebuttal Report at 9), at the hearing she could not provide a single source of information that was inconsistent with the ratings. (D.E. No. 416, Tr. at 148–49 (Dr. Marcopulos: "I don't recall the evidence that I knew for that.")).
As the record currently stands, it is unclear which expert, if any, wrote the following portion of the Joint Rebuttal Report:
Moreover, these ratings are strikingly at odds with The Vineland Adaptive Behavior Scale–3 of Cheryl[ ] Whitehead, which resulted in an Adaptive Behavior Composite of 82, 1.2 standard deviations below the population mean. Recorded conversations when Mr. Roland was incarcerated, a video of a police interview, letters and emails, also reveal someone who is clearly not functioning in the extremely low range of adaptive skills. Mr. Roland consistently demonstrated a clear awareness of others, recognition of specific events from his past, including matters concerning financial aspects of his involvements, talked about how to arrange bail from "rich friends," lure someone without being suspicious, and in all, hardly demonstrated himself to be a person with extremely low adaptive skills necessitating significant supports in the environment. These observations present a very different picture of Mr. Roland compared with Dr. Greenspan's, with the ratings he obtained from informants and from the "qualitative sources" noted in his report.
Despite the assertions of Dr. Greenspan regarding Mr. Roland's adaptive functioning with ratings and qualitative statements suggesting a much more severely impaired person, real-world observations of his behavior, along with his alleged leadership role in criminal activities, speaks strongly against the characterization that he is so adaptively impaired as to not only meet ID diagnostic standards, but to be profoundly impaired. Indeed, if the ratings obtained by Dr. Greenspan from his informants were accurate and correct, Mr. Roland would likely have IQ scores in the range of the 40s to 50s.
(Joint Rebuttal Report at 9 (emphasis in original)).
Moreover, although Dr. Marcopulos expressed her frustration at the "lack of records available that could have answered" some of her adaptive-functioning questions (id. at 26), she appears to have ignored many of the records that were available to her. (See, e.g. , id. at 74 (Dr. Marcopulos testifying that, other than the evidence obtained from Dr. Greenspan's raters, there is no evidence in the record of Roland's adaptive-functioning deficits as a child or adult); id. at 125–26 (conceding during cross-examination that there is evidence in the record on Roland's deficits); id. at 134 (testifying that Roland "had limitations definitely in academic functioning")). Indeed, it is unclear from Dr. Marcopulos's testimony as well as her portion of the Joint Rebuttal Report on exactly what records she relied—if any—in forming her expert opinion. (See, e.g. , D.E. No. 416, Tr. at 221–22 (Dr. Marcopulos testifying that she reviewed some, but not all, letters authored by Roland); supra at 540-41).
Dr. Marcopulos's failure to either review or recall records from Roland's life calls into question her clinical judgment, since such judgment is based on, among other things, "specific knowledge of the person and his or her environment," AAIDD–11 at 85.
The Court further credits Dr. Greenspan's opinion regarding Roland's Prong–Two deficits because both Dr. Morgan (the Government's primary expert) and Dr. Marcopulos (the Government's adaptive-functioning rebuttal expert) conceded at the hearing that Roland had deficits, especially in the conceptual domain. (D.E. No. 427, Tr. at 71–72 (Dr. Morgan testifying that Roland is impaired in the Communication domain); D.E. No. 416, Tr. at 125–26 (Dr. Marcopulos conceding that there is evidence in the record on Roland's deficits); id. at 134 (Dr. Marcopulos testifying that Roland "had limitations definitely in academic functioning")). Indeed, both standardized measures administered by the parties' experts revealed deficits in the conceptual domain.
The Court acknowledges the Government's criticisms that the ABDS raters may be biased because they have an incentive to help Roland, just as it acknowledges Roland's criticism that Ms. Whitehead may also be biased because she blames him for her son's death. The fact that all of these potentially biased raters agreed that Roland has deficits in the conceptual domain, however, suggests that their ratings are reliable, particularly for that domain. Moreover, while the Court recognizes concerns of potential rater bias, such concerns are unavoidable in an Atkins context, which often requires a retrospective diagnosis. See AAIDD–11 at 95–96; (supra at 478-79). Using "knowledgeable informants" who are "very familiar with the person," AAIDD–11 at 47, inevitably means relying on raters who may potentially be biased. See, e.g. , Hall , 134 S.Ct. at 1991 (relying on testimony from defendant's siblings); In re Cathey , 857 F.3d at 239 (relying on defendant's sister and former wife). Because of the unavoidable risk of rater bias, it is even more important to follow the guidelines' instructions that a valid diagnosis of ID should be based on a comprehensive evaluation that includes multiple sources of information. See AAIDD–11 at 29, 95–96, 100; DSM–5 at 39; User's Guide at 20–11; (supra at 478-79 (listing the guidelines' requirement of a comprehensive evaluation and the AAIDD's guidelines for clinicians charged with making retrospective diagnoses)).
Finally, as noted in the Court's analysis of Roland's intellectual functioning, the Court awards no weight to the Government's experts' conclusion—which also departs from the clinical guidelines—that Roland's records reveal behavioral or emotional problems, or learning disabilities, rather than ID. (See supra at 480, 527-28 n.74; D.E. No. 416, Tr. at 76–79 (Dr. Marcopulos's Testimony); Dr. Morgan's Report at 17 ("[T]here is nothing in Mr. Roland's history to indicate he was ID in childhood. Educational records show emotional and behavioral problems.")).
Accordingly, the Court finds Dr. Greenspan's testimony, in light of his vast experience and comprehensive assessment, to be credible, reliable, and helpful to the Court on the issues before it. Thus, the Court credits Dr. Greenspan's opinion above those of Drs. Morgan and Marcopulos. Moreover, the Court finds that the record is replete with evidence of Roland's limitations in adaptive functioning (see infra at 541-49). Roland therefore has proved by a preponderance of the evidence that he has significant limitations in adaptive skills to satisfy Prong Two of the ID definition.
iv. Roland's Conceptual Skills
Definitional Standards. According to the AAIDD and the APA, the conceptual domain of adaptive functioning involves language skills; reading and writing ability; understanding of money, time, and number concepts; acquisition of practical knowledge; problem solving; and judgment in novel situations. See AAIDD–11 at 44; DSM–5 at 37. For mild ID, the APA describes the conceptual domain as:
For preschool children, there may be no obvious conceptual differences. For school-age children and adults, there are difficulties in learning academic skills involving reading, writing, arithmetic, time, or money, with support needed in one or more areas to meet age-related expectations. In adults, abstract thinking, executive function (i.e., planning, strategizing, priority setting, and cognitive flexibility), and short-term memory, as well as functional use of academic skills (e.g., reading, money management), are impaired. There is a somewhat concrete approach to problems and solutions compared with age-mates.
DSM–5 at 34.
Experts' Assessment. Roland's experts determined that he demonstrated significant deficits in the conceptual domain. (D.E. No. 386, Tr. at 204–05 (Dr. Hunter's Testimony); D.E. No. 408, Tr. at 102–03 (Dr. Greenspan's Testimony)). The results of the two ABDS ratings on which the Court relies (those of Mr. Robinson and Amin Roland, who used Roland's target age of 9 and 10, respectively) indicate that Roland had significant deficits in the conceptual domain. (Dr. Greenspan's ABDS Decl. at 3; Dr. Greenspan's Report at 16–18). Drs. Morgan and Marcopulos also conceded during cross-examination that Roland had deficits in the conceptual domain. (See D.E. No. 427, Tr. at 71–72 (Dr. Morgan's Testimony); D.E. No. 416, Tr. at 137 (Dr. Marcopulos's Testimony)). Considering that the parties' experts agree that Roland has significant deficits in the conceptual domain, it would be unreasonable for the Court to conclude otherwise. See Van Tran v. Colson, 764 F.3d 594, 610 (6th Cir. 2014) ("[T]he courts strain the limits of reasonableness by rejecting expert opinions based exclusively on the courts' own inexpert analysis."); see also Ybarra, 869 F.3d at 1026 (finding lower court's contradictory analysis of Prongs Two and Three "troubling" since "the only clinical experts to testify on Prongs 2 and 3 opined that the prongs were satisfied"). And Roland's deficits in the conceptual domain are further demonstrated by the wealth of evidence in the record.
Academic Record. As repeatedly noted above, Roland's academic record is one of overwhelming failure. (See supra at 492-94). School records reveal that Roland was delayed in acquiring age-appropriate skills. At age 9, Roland was placed in special-education classes and, by age 11, he still could not read. (See Def. Ex. 15a, Student Data Listing; D.E. No. 384, Tr. at 176 (Ms. Gresham's Testimony)). As Roland grew older, he fell further behind his peers. (See, e.g., Gov. Ex. 353 (compiling chart of Roland's TABE results over time)). At age 14, Roland scored in the lowest of three categories in both Language Arts Literacy and Mathematics on the New Jersey Grade 8 Proficiency Assessment Individual Student Report. (See Def. Ex. 15e at 8). At age 17, Roland had a grade-equivalent of 3.5 in reading, 3.3 in applied mathematics, 3.6 in total mathematics, 2.2 in language, and 0.3 in spelling. (See id. at 2–3). Corroborative of these scores is the fact that Roland misspelled his own last name on the test scoresheet, which his teacher described as "typical of Mr. Roland." (Id.; D.E. No. 384, Tr. at 12 (Ms. Bohm's Testimony)). Also at 17, Roland failed both the reading and writing sections of the New Jersey Grade 11 High School Proficiency Test. (See Def. Ex. 15e at 9). Roland's IEP from the same year again demonstrates that Roland continued to struggle behind his peers, with a grade-equivalent of 4.4 in reading, 3.6 in total math, 2.6 in language, and 4.7 in spelling. (See Def. Ex. 15b at 5 (IEP listing Roland's results)). Indeed, the IEP states that Roland "is achieving significantly below grade level." (See id. at 16).
Roland remained behind his peers even in his twenties. For example, at age 20, Roland had a grade-equivalent of 6.6 in reading, 2.4 in language, 4.1 in spelling, 6.4 in mathematics computation, 2.1 in applied mathematics, and 4.2 in total mathematics. (See Gov. Ex. 113 at 5). And, at age 22, Roland had a grade-equivalent of 7.6 in reading, 2.3 in language, .0 in spelling, 5.0 in mathematics computation, 5.4 in applied mathematics, and 5.2 in total mathematics. (See id. ). The Court agrees with Dr. Hunter that these scores are "letting us know at that time that [Roland] had very limited attainment of academic skills ...." (D.E. No. 386, Tr. at 184 (Dr. Hunter's Testimony)).
Roland's March 2011 TABE results, when he was 26 and incarcerated for almost three years, demonstrate an improvement in his abilities, with a grade-equivalent of 6.4 in reading, 3.5 in language, 12.5 in spelling, 10.0 in mathematics computation, 6.0 in applied mathematics, and 7.8 in total mathematics. (See Gov. Ex. 113 at 5). Dr. Morgan views Roland's improvement as "another example of differential effort." (D.E. No. 414, Tr. at 32–34 (Dr. Morgan's Testimony)). For the reasons below, the Court is unpersuaded by Dr. Morgan's conclusory assertion.
First , the AAIDD explicitly mentions, as part of its ID definition, that "[w]ith appropriate personalized supports over a sustained period, the life functioning of the person with ID generally will improve." AAIDD–11 at 7. Moreover, the Court is persuaded by the parties' experts, including Dr. Morgan, who uniformly testified that people with ID can improve. (See, e.g. , D.E. No. 427, Tr. at 76 (Dr. Morgan's Testimony); D.E. No. 416, Tr. at 113 (Dr. Marcopulos's Testimony); D.E. No. 386, Tr. at 62 (Dr. Hunter's Testimony); D.E. No. 410, Tr. at 54 (Dr. Greenspan's Testimony)). Indeed, Dr. Morgan noted in his report that "[a]daptive behavior is modifiable. In contrast to cognitive ability, which is considered relatively stable for most people over time, adaptive behavior can erode or improve as a result of intervention." (Dr. Morgan's Report at 13). Dr. Morgan also testified that (aside from Roland's score on mathematics computation) Roland's performance on the March 2011 TABE is possible (although he has never seen it). (See D.E. No. 414, Tr. at 37–38 (Dr. Morgan's Testimony)).
Second, Roland's improvement may be due to the structured environment of the correctional facility in which he was confined, and relying on improvement measured in that setting would be contrary to the AAIDD's instruction that "the person's strengths and limitations in adaptive skills should be documented within the context of the community and cultural environment typical of the person's age peers." AAIDD–11 at 45; (see also Dr. Morgan's Report at 13 (noting that "adaptive behavior can erode or improve as a result of intervention")). Experts here testified that a structured environment, such as prison, can improve an individual's skills. (See, e.g., D.E. No. 409, Tr. at 25 (Dr. Greenspan testifying that a prison "is one of the most supportive environments" and noting that "Roland greatly improved his reading and writing skills once he was in prison"); see also D.E. No. 387, Tr. at 213 (Dr. Hunter's Testimony)). This Court agrees with the court's reasoning in Wilson that "the ability to perform adequately with ongoing support does not negate a finding of intellectual disability." Wilson, 170 F.Supp.3d at 369–70; see also Hardy, 762 F.Supp.2d at 899 (noting that "an institutional environment of any kind necessarily provides 'hidden supports'").
Dr. Greenspan further testified:
People with ID do better in highly supported environments. Prison is one of the more supported environments ... people with ID ... often typically do better in highly structured environments. The real test is how they do in unstructured environments such as the community which is generally very unstructured or at least presents less opportunities to deal with novelty and ambiguity. It is clearly not appropriate to generalize from highly structured environment, how someone does in the community. The best way, or one of the ways of defining the field of psychology, or one of the principle beliefs espoused by psychologists is that the best predictor of future behavior is past behavior. That is why we do retrospective behavior of how people functioned in the community, because that gives us a basis for knowing how they really, what their real adaptive limitations might be. It also enables to us make some predictive statements about how they will do once they were, or if they are in[ ] prison.
(D.E. No. 409, Tr. at 41–42 (Dr. Greenspan's Testimony)).
Third , the AAIDD stresses that "within any individual, limitations often coexist with strengths." AAIDD–11 at 7. And Moore , relying on the AAIDD Manual, notes that significant limitations in adaptive skills are "not outweighed by potential strengths in some adaptive skills." 137 S.Ct. at 1050. So, Roland's March 2011 TABE results, even if considered a strength, do not outweigh his deficits in the conceptual domain. For these reasons, the Court finds that Roland's 2011 TABE results, while relevant, do not negate the conclusion that Roland has substantial deficits in the conceptual domain of adaptive functioning.
Documentary Evidence. Additional, unimpeached evidence in the record demonstrates that Roland has limited cognitive judgment and further supports the Court's conclusion that Roland has significant deficits in the conceptual domain. As noted earlier, at age 14, the SSA determined that Roland was mentally retarded. (Def. Ex. 17 at 7). Ms. Gavan, Roland's substance-abuse-screening evaluator at the Juvenile Justice Commission, noted that, at age 17, Roland "appeared to be somewhat cognitively limited" and repeated again that Roland "appear[ed] to have poor insight and somewhat limited cognitive ability." (See Def. Ex. 9a at 84–85). Dr. Farber's psychological assessment at the same time similarly "reveal[ed] a young man who has very poor judgement and little insight into his behaviors." (See id. at 59). The Court also agrees with Dr. Greenspan that Roland's low composite score of 70 and Matrices score of 69 on the 2002 KBIT, when he was 17, is further evidence of deficits in the conceptual domain. (D.E. No. 408, Tr. at 105 (Dr. Greenspan's Testimony)). Finally, at age 20, while in the custody of the New Jersey Department of Corrections, Roland "presented as [a] cognitively, socially and behaviorally immature individual with limited judgment and decision-making." (Gov. Ex. 110 at 7).
Witness Testimony. Moreover, Roland's deficits were not confined to test results and evaluators, but were apparent to those around him. Ms. Carter, during her particularly compelling and credible testimony, provided additional examples of Roland's deficits. Ms. Carter reported that, compared to his peers, Roland was slower to walk, talk, and read during his developmental years. (D.E. No. 385, Tr. at 107–11 (Ms. Carter's Testimony)). For example, at age 8, Roland could not read simple words like "kick" or "jump." (Id. at 111). At ages 10 to 12, Roland could not read road signs. (Id. at 128–29). He only knew that red meant stop, and green meant go, but could not read something as simple as the word "yield." (Id. ). He had difficulty understanding jokes and following stories. (Id. at 129–30). Ms. Carter recalled that, even by age 11, Roland had trouble counting, did not understand the difference in value between coins, and would return from the store with either no change or the wrong change. (Id. at 130–31).
Captain Thomas, who observed Roland in 1998 and 2001, testified that Roland was "slower" than other residents at the Essex County Youth Detention Center. (D.E. No. 384, Tr. at 79 (Cap. Thomas's Testimony)). Ms. Bohm, Roland's teacher from his time at Sojourn in 1998 through 2001, testified that she met Roland when he was 14, and he could not read or write at the time. (D.E. No. 384, Tr. at 26 (Ms. Bohm's Testimony)). She recalled that Roland had "extremely limited skill sets" and "didn't have any skill set in reading, writing and math that I could ascertain." (Id. at 13). Similar to Captain Thomas's assessment, Ms. Bohm described Roland as "less skilled than the other students of his age in that particular grouping," and testified that compared to the other students "he was at the bottom." (Id. at 18, 41).
Captain Thomas explained:
you have to talk to him on numerous occasions. I have to talk to him more than the other kids, and what I find sometime when you are talking to him, he go to you like he is listening, and he then goes back and you see him again, same thing again. You have to be talking to him repeatedly, like he seems to understand what you are saying, but then he reacts differently. We have that many, we hardly have that type of behavior coming from the other residents.
(D.E. No. 384, Tr. at 79 (Cap. Thomas's Testimony)).
Dr. Greenspan's Interviewees. The witnesses' testimony, and the record generally, were corroborated by the individuals Dr. Greenspan interviewed. Ms. Taylor, for example, reported that she had to count money for Roland, and that the local store keeper knew he could shortchange Roland. (Dr. Greenspan's Report at 20). Amin Roland reported that Roland had a "very poor understanding of money." (Id. at 19). Mr. Robinson reported that Roland "had difficulty sustaining concentration. He changed the topic often, out of the blue, because he wasn't following the conversation." (Id. at 23). Ms. Gardner and Ms. Taylor both reported that Roland could not read well into his teenage years. (Id. at 20). Ms. Macon, who lived with Roland when he was 23, reported that Roland's "writing was very poor. He wrote the way you would sound out something, 'perpus' for purpose." (Id. at 20–21). She also stated that Roland "did not even know how to spell his own middle name." (Id. at 21). Indeed, their son was named after Roland's middle name, which Roland told her was "Zakai," but Roland's birth certificate later revealed that his middle name was spelled "Zakee." (Id.).
Comorbidity. The Government maintains that Roland's isolated academic successes prove that his academic failures were the result of a learning disability or truancy. (D.E. No. 381, Tr. at 137 (Government attorney noting that Roland's academic records reveal a learning disability and truancy, rather than ID); D.E. No. 409, Tr. at 124 (Government attorney questioning Dr. Greenspan on Roland's truancy)). For support, the Government points to some As and Bs Roland received on report cards from Sojourn. The Government's point is unconvincing for three reasons. First, as noted above, the diagnostic criteria for ID do not require exclusion of accompanying diagnoses. See AAIDD–11 at 58–63; DSM–5 at 40; Moore, 137 S.Ct. at 1051; (see also supra at 480, 527-28 n.74). Second, Ms. Bohm credibly testified that, during the relevant time period, grades at Sojourn were based primarily on participation and completing tasks, not on aptitude. (D.E. No. 384, Tr. at 31–35 (Ms. Bohm's Testimony)). She also testified that "F" grades were not given out. (Id.). And finally, the Court is reluctant to rely on any perceived strengths that may be due to the structured environment of the correctional facilities in which Roland was confined. See Moore, 137 S.Ct. at 1050 ("Clinicians, however, caution against reliance on adaptive strengths developed 'in a controlled setting,' as a prison surely is."); Wilson, 170 F.Supp.3d at 389 (rejecting government's similar argument that Wilson's isolated academic successes prove that Wilson's failures resulted from a learning disability and a willful decision not to apply himself); Hardy, 762 F.Supp.2d at 899 (noting that "an institutional environment of any kind necessarily provides 'hidden supports'"); (see supra at 543-44).
Moreover, one of those report cards reveals that for the 2001–2002 school year, at age 17, Roland was in the ninth grade—indicating that he was held back two grades. (See Def. Ex. 15e at 1; Dr. Greenspan's Report at 9).
Criminal Adaptive Behavior. The Government also points to evidence of Roland's criminal behavior to argue that Roland possessed unimpaired conceptual capacity. (See Gov. Opp. Br. at 20 ("Dr. Morgan examined Defendant's current abilities by reviewing the following: audio recordings and transcripts of Defendant's jail calls; video recordings and transcripts of Defendant's conversations with police officers; transcripts of Defendant's various court hearings; letters and emails sent from Defendant in jail; and the relevant facts of Defendant's past criminal behavior.") (citing Dr. Morgan's Report at 13–15)). In reviewing such evidence, Dr. Morgan concluded that "Roland demonstrates essentially normal adaptive behavior abilities in terms of communication, understanding, following conversations, providing information, understanding the relevance of various concepts, and overall functional adaptive living skills." (Dr. Morgan's Report at 14). For example, in analyzing Roland's behavior during a recorded interview with the Hillside Police Department, Dr. Morgan opined that Roland "is seen and heard conversing normally with his interviewers. He understands their questions, remembers many facts, and provides relevant information, displays memory of events and others' activities and in all, shows no confusion, lack of understanding or inability to comprehend...." (Id. ).
While the Court recognizes that this evidence is ostensibly hard to reconcile with a finding that Roland is ID, the Court does reconcile this evidence against the backdrop of Roland's experts who maintained that such examples of adaptive capacity are not inconsistent with ID. (See supra at 537-38). Furthermore, this evidence is accorded less weight for the additional reasons that the clinical standards proscribe reliance on a "person's street smarts, behavior in jail or prison, or criminal adaptive functioning" and prohibit the use of "verbal behavior to infer level of adaptive behavior." User's Guide at 20, 22. Moreover, Dr. Morgan's conclusion here raises the same concerns about verbal behavior and relying on stereotypes that the Court addressed above, and the Court is reluctant to rely on Dr. Morgan's opinion for those same reasons. (See supra at 535-38).
At the hearing, Dr. Morgan expressed disagreement with the guidelines' prohibition of using this type of evidence, calling the prohibition "somewhat arbitrary." (D.E. No. 414, Tr. at 53–54 (Dr. Morgan's Testimony)).
Conclusion. For the conceptual domain, the evidence of Roland's deficits is almost overwhelming. Accordingly, the Court finds by a preponderance of the evidence that Roland has demonstrated significant deficits in the conceptual domain of adaptive functioning. As outlined above, these deficits developed concurrently with Roland's intellectual-functioning deficits during his developmental period, and they persisted at all relevant times. Although Roland has now met his burden at Prong Two, the Court will nevertheless address the remaining domains for the sake of completeness.
v. Roland's Social Skills
Definitional Standards. The social domain of adaptive functioning incorporates "interpersonal skills, social responsibility, self-esteem, gullibility, naïveté (i.e., wariness), follows rules/obeys laws, avoids being victimized, and social problem solving." AAIDD–11 at 44. The APA expands on this slightly: "The social domain involves awareness of others' thoughts, feelings, and experiences; empathy; interpersonal communication skills; friendship abilities; and social judgment, among others." DSM–5 at 37. Persons with mild ID may face the following problems:
Compared with typically developing age-mates, the individual is immature in social interactions. For example, there may be difficulty in accurately perceiving peers' social cues. Communication, conversation, and language are more concrete or immature than expected for age. There may be difficulties regulating emotion and behavior in age-appropriate fashion; these difficulties are noticed by peers in social situations. There is limited understanding of risk in social situations; social judgment is immature for age, and the person is at risk of being manipulated by others (gullibility).
Id. at 34 (emphasis in original).
Evidence of Roland's Social Skills Deficits. The record is replete with evidence of Roland's significant deficits in the social domain. Both Amin Roland's and Mr. Robinson's ABDS ratings indicate that Roland had significant deficits in the social domain. (Dr. Greenspan's ABDS Decl. at 3; Dr. Greenspan's Report at 16–18). Amin Roland reported that Roland "did not understand subtle cues, body language, or implications. He might understand teasing if it was direct, but he sometimes asked Amin to explain what someone meant. Conversation with [Roland] had to be straightforward." (Dr. Greenspan's Report at 21–22). Captain Thomas, who observed Roland as a teenager, corroborated Amin Roland's statement that Roland did not understand subtle cues, body language, or implications. Captain Thomas explained that Roland would get too close to him when the two spoke, and he had to "always tell him ... stay a certain distance from me. And sometimes he would do it. The next time he would forget and he would do the same thing." (D.E. No. 384, Tr. at 68–69 (Cap. Thomas's Testimony)). Captain Thomas would repeatedly raise this issue, and Roland sometimes "would start to cry." (Id. ). A probation report from 1999, when Roland was 15, similarly noted that Roland's "peers have focused on problem areas concerning his unwillingness to take in constructive criticism without hurt feelings." (Def. Ex. 9a at 33).
The probation report further stated that Roland "is also a follower and tends to attempt to impress others by feeding into negative behavior." (Id. ). Captain Thomas testified that Roland's peers at Essex County Youth House would frequently take advantage of Roland (and call him a "stunt dummy") and Roland would get into trouble as a result. (D.E. No. 384, Tr. at 70–71 (Cap. Thomas's Testimony)). He explained that the other "kids realized that ... they could get [Roland] to do anything." (Id. at 70). For example, they would "tell Mr. Roland to go and hit somebody," and "Mr. Roland would do it and get into trouble." (Id. ). Captain Thomas noted that "this continued throughout his stay." (Id. ).
Ryan , 813 F.3d at 1194 (noting that being a "follower" is "a trait the Supreme Court has identified as an indicator of impaired adaptive behavior") (citing Atkins , 536 U.S. at 318, 122 S.Ct. 2242 ("[I]n group settings [ID people] are followers rather than leaders.")).
The evidence also demonstrates that Roland was repeatedly blamed for things he did not do. Ms. Carter testified, for example, that she once watched her son break something and blame it on Roland. (D.E. No. 385, Tr. at 125–26 (Ms. Carter's Testimony)). And rather than explain that he did not do it, Roland "just took the blame." (Id. at 125–26). Captain Thomas similarly testified that, even as late as 1998 and 2001, the other residents at the Essex County Youth House would often blame Roland for things he did not do. (D.E. No. 384, Tr. at 72–73 (Cap. Thomas's Testimony)). Captain Thomas would advise Roland "and tell him to stop allowing them to use him to get himself in trouble"—advice that Roland "seemed to understand." (Id. ). But "just in a couple of minutes," he would go "back and get himself into the same trouble." (Id. ). Mr. Robinson reported that the neighborhood children would tell Roland's uncle that Roland "did things he didn't do (like stealing a bicycle) just so they could hear the beatings." (Dr. Greenspan's Report at 23). Indeed, Mr. Robinson told Dr. Greenspan that he, too, had blamed things on Roland that Roland had not done. (Id. ).
Captain Thomas described Roland as slower than the other residents, noting that he had to talk to Roland on numerous occasions ("more than the other kids") and, despite appearing to be listening, Roland would react differently. (D.E. No. 384, Tr. at 79 (Cap. Thomas's Testimony)). Mr. Robinson also told Dr. Greenspan that Roland was made fun of "because he was slow." (Dr. Greenspan's Report at 23). Ms. Carter further testified that Roland had difficulty understanding jokes and following stories. (D.E. No. 385, Tr. at 129–30 (Ms. Carter's Testimony)). She recalled that children would tease him about his hygiene and call him "stupid." (Id. at 132).
Recorded Conversations. The Government points to "actual evidence of Defendant speaking with his peers and loved ones and perceiving their social and conversational cues." (Gov. Post–Hearing Submission ¶ 252 (citing Gov. Exs. 159-65)). According to the Government, "Defendant's language is neither concrete nor immature for his age—indeed, although Defendant jokes around on some of the calls, some of the conversations are also quite profound and demonstrate that Defendant is thinking in a mature and reasoned way." (Id. (citing Gov. Ex. 165 at 5–6)). As discussed above, the Court is reluctant to rely on Roland's verbal behavior given that the AAIDD explicitly prohibits the use of "verbal behavior to infer level of adaptive behavior or about having ID." (See supra at 535-36); AAIDD–11 at 102; User's Guide at 20. Moreover, relying on Roland's verbal behavior requires reliance on stereotypes, which is also prohibited. (See supra at 536-38); User's Guide at 26; Moore , 137 S.Ct. at 1052. And, as again noted above, the Court does not view Roland's recorded phone calls as inconsistent with a diagnosis of ID. (See supra at 537-38). Dr. Greenspan, who reviewed this evidence, explained that people with ID can do many of the things that Roland does when speaking with his peers and loved ones. (See supra at 537-38). For these reasons, the Court does not view evidence of Roland's recorded conversations with his peers and loved ones while incarcerated as negating Roland's significant deficits in the social domain.
Conclusion. The Court therefore finds by a preponderance of the evidence that Roland has demonstrated significant deficits in the social domain of adaptive functioning. These deficits developed concurrently with Roland's intellectual-functioning deficits during his developmental period, and they persisted at all relevant times.
vi. Roland's Practical Skills
Definitional Standards. For the practical domain of adaptive behavior, the Court must assess Roland's "activities of daily living (personal care), occupational skills, use of money, safety, health care, travel/transportation, schedules/routines, and use of the telephone." AAIDD–11 at 44. According to the APA, the practical skills domain "involves learning and self-management across life settings, including personal care, job responsibilities, money management, recreation, self-management of behavior, and school and work task organization, among others." DSM–5 at 37. The APA expounds on the kinds of practical skill deficits that manifest themselves in persons with mild ID:
The individual may function age-appropriately in personal care. Individuals need some support with complex daily living tasks in comparison to peers. In adulthood, supports typically involve grocery shopping, transportation, home and child-care organizing, nutritious food preparation, and banking and money management. Recreational skills resemble those of age-mates, although judgment related to wellbeing and organization around recreation requires support. In adulthood, competitive employment is often seen in jobs that do not emphasize conceptual skills. Individuals generally need support to make health care decisions and legal decisions, and to learn to perform a skilled vocation competently. Support is typically needed to raise a family.
Id. at 34.
ABDS Rating. The ABDS administered to Amin Roland indicates that Roland had deficits in this domain, and there is some evidence in the record to corroborate Amin Roland's rating. (Dr. Greenspan's ABDS Decl. at 3; Dr. Greenspan's Report at 16–18).
Personal Care. Regarding personal care, the record is replete with examples of Roland's bad hygiene. Ms. Carter testified that, when Roland was about 6 or 7, children would tease him about his hygiene, and she would have to give him instructions about his hygiene "over and over." (D.E. No. 385, Tr. at 122, 132 (Ms. Carter's Testimony)). Ms. Gresham remembers Roland from the time he was 10 or 11 years old because of his "very bad hygiene." (D.E. No. 384, Tr. at 175–76 (Ms. Gresham's Testimony); id. at 177 ("I can't forget that child that had a hygiene problem.")). Ms. Taylor reported to Dr. Greenspan that (unlike her brother who was younger than Roland) she "had to teach [Roland] to brush his teeth and wash his hands. Not only did he not know to clean himself, but she had to repeat instructions several times." (Dr. Greenspan's Report at 20). Captain Thomas further testified that, even as a teenager, Roland had trouble brushing his teeth and taking care of his hair, and had to be instructed about personal hygiene matters "[a]bout every day." (D.E. No. 384, Tr. at 75–77 (Cap. Thomas's Testimony)). Corroborating these reports, medical records from the New Jersey Department of Corrections state that in 2009, at age 24, Roland had to have eight teeth extracted. (See Def. Ex. 14d at 76).
Dr. Greenspan explained that hygiene is considered an adaptive-behavior issue because "hygiene is important for health and disease prevention, but it is also important for social acceptance.... That in a sense that is not just [a] physical act, it is also a social act. It is an important part of being accepted by others and being at the same level as others." (D.E. No. 408, Tr. at 142 (Dr. Greenspan's Testimony)).
Ms. Gresham vividly recalled Roland's lack of personal care:
He had a very bad hygiene. He had bad smell. Same clothes a lot of times. You know. Dirty. And he was one of the reasons why I brought deodorant and soap in. I would pull him to the side and talk to him, and he accepted .... And it was a secret thing between me and him, now, to wash up.
(D.E. No. 384, Tr. at 175–76 (Ms. Gresham's Testimony)).
Amin Roland also reported to Dr. Greenspan that, even in his early twenties, Roland could not cook anything besides one dish: "noodles with tuna and mayonnaise." (Dr. Greenspan's Report at 21–22; D.E. No. 408, Tr. at 148 (Dr. Greenspan's Testimony)). He described Roland as frustratingly incompetent and "inept." (Dr. Greenspan's Report at 21; D.E. No. 408, Tr. at 148 (Dr. Greenspan's Testimony)).
Risk Awareness. Ms. Carter testified that when Roland played with her kids in the backyard, he "picked up a can of lighter fluid and was throwing it" because he "thought it was water in a can." (D.E. No. 385, Tr. at 125 (Ms. Carter's Testimony)). When her daughter alerted her, Ms. Carter explained to Roland that the can—which had "the danger sign with the fire"—was not water. (Id. at 122, 132). Indeed, some of the abuse that Roland suffered at the hands of Mr. Thomas was because of his lack of risk awareness. (See Def. Ex. 4a at 61 (May 28, 1997 DYFS records noting that "Mr. Winston Thomas stated that a week ago he did strike [Roland] one time on the leg when he caught him setting a fire in their basement. Ms. Thomas stated this was the second fire [Roland] started ....")).
Travel and Transportation. Amin Roland reported to Dr. Greenspan that Roland had a "poor sense of direction" and, "[d]espite having been to Amin's house repeatedly, he still drove past it." (Dr. Greenspan's Report at 22; D.E. No. 408, Tr. at 146–47 (Dr. Greenspan: "He also had trouble finding where he was going. He would continually drive around the block and not able to find Amin's house, even though he had been there many times.")). Similarly, a 2001 intake form from the Juvenile Justice Commission indicates that Roland "never went to probation because he didn't know where it was located." (Def. Ex. 9a at 50).
Use of Money. Dr. Greenspan recounted during his testimony that "[s]everal people mentioned that he had difficulty with money, that was part of the reason he was having problems with his aunt and uncle, because ... he would be shortchanged by shop keepers and he wouldn't be able to understand what was happening." (D.E. No. 408, Tr. at 116 (Dr. Greenspan's Testimony)). DYFS records reveal that, at age 12, Roland suffered abuse at the hands of Ms. Thomas for losing $2.50 she had given him to purchase cigarettes, corroborating Dr. Greenspan's testimony. (See Def. Ex. 4a at 60; supra at 492 n.32). Ms. Carter further corroborates Dr. Greenspan's account of his interviews, testifying that when Roland went to the store, he often returned without any change or with the wrong change. (D.E. No. 385, Tr. at 130–31 (Ms. Carter's Testimony)).
Government's Objection. Based on Roland's life history, Dr. Greenspan concluded that Roland "was a person who couldn't really function independently very well." (D.E. No. 409, Tr. at 20 (Dr. Greenspan's Testimony)). And from the record, it appears that Roland has never lived alone. (See, e.g., D.E. No. 416, Tr. at 179 (Dr. Marcopulos testifying that she did not see anything in the record to indicate that Roland ever lived alone)). But, as the Government points out, Roland "has spent an overwhelming majority of his adult life behind bars" and thus has had a limited amount of time in the community for this Court to evaluate whether his practical skills were sufficiently impaired. (D.E. No. 381, Tr. at 128). The Government's argument here is valid. See DSM–5 at 34 (focusing on adulthood for the practical domain of mild ID); see also Ryan, 813 F.3d at 1194 (discussing defendant's practical-domain deficits as both child and adult); Wilson, 170 F.Supp.3d at 390–91 (finding evidence of some deficits in the practical domain insufficient to indicate whether the deficits were significant because the defendant had "only spent a little over three years of his life outside of prison since he was 15").
According to the Government, Roland "began getting arrested when he was 12 years old. Since that time, he has spent the entirety of his life either committing crimes or serving terms of incarceration." (Gov. Post–Hearing Submission ¶¶ 20, 181 (citing Def. Ex. 9; Gov. Exs. 112–13, 126–30)). Indeed, the Government avers that "there is an argument that his 'community' is a correctional facility." (Id. ¶ 175). And, if the Court does not consider Roland's criminal adaptive functioning, "then Defendant effectively ceased to exist as a person in 1997," when he was 13 years old. (Id. ¶ 181).
Conclusion. Accordingly, while the Court finds that Roland has demonstrated deficits in the practical domain, there is insufficient credible evidence to indicate that these deficits were significant. Because Roland has already demonstrated significant deficits in the other two domains, he nevertheless satisfies the second criterion for a diagnosis of ID under the standards of both the APA and the AAIDD.
vii. Prong Two: Conclusion
The Court concludes that Roland has satisfied his burden of proving by a preponderance of the evidence that he suffers from significant limitations in at least one of the three adaptive-skill domains. See AAIDD–17 at 21; DSM–5 at 38; Moore, 137 S.Ct. at 1046. Indeed, the Court finds that Roland has significant limitations in both the conceptual and social domains. (See supra at 541-49). For the reasons above, the Court does not find Dr. Morgan's assessment of Roland's adaptive functioning to be reliably based or persuasive. To the contrary, Dr. Morgan espoused an approach to assessing adaptive functioning that is directly at odds with the clinical standards. Dr. Greenspan, on the other hand, conducted a comprehensive assessment that included reviewing Roland's historical records and interviewing 11 individuals. Dr. Greenspan's expertise in the field and his adherence to the clinical standards, together with record evidence that corroborates his interviewees' and raters' accounts of Roland's adaptive-functioning skills, confirm the validity of his findings and enhance his credibility. Finally, an independent assessment of Roland's academic, legal, social, and family history and fact-witness testimony at the hearing further corroborate Dr. Greenspan's conclusions. The Court therefore finds that credible evidence establishes by a preponderance of the evidence that Roland has significant limitations in the conceptual and social domains of adaptive functioning sufficient to satisfy Prong Two of the ID definition.
C. Prong Three: Onset During the Developmental Period
To prevail on the third prong of his ID claim, Roland must prove, by a preponderance of the evidence, that his disability originated before age 18. See AAIDD–11 at 6; DSM–5 at 33; Hall, 134 S.Ct. at 1994 (noting that the third criterion of ID is "that the onset of both factors occurred before the age of 18").
i. Definitional Standards
The final criterion for an ID diagnosis is the requirement that the condition must have arisen during the developmental period. See DSM–5 at 33 (requiring "[o]nset of intellectual and adaptive deficits during the developmental period"); AAIDD–11 at 6 ("This disability originates before 18."). The purpose of the third prong "is to distinguish ID from other forms of disability that may occur later in life." AAIDD–11 at 27. "Thus, disability does not necessarily have to have been formally identified, but it must have originated during the developmental period." Id.; see also Williams, 1 F.Supp.3d at 1148 ("This does not mean that a defendant must be diagnosed with ID before the age of eighteen, only that the disability's defining symptoms must have manifested themselves before the age of eighteen.") (emphasis in original).
ii. Evidence of Onset During the Developmental Period
Again, the Court does not attempt to provide a comprehensive summary of the voluminous record that undoubtedly includes other evidence relevant to the Court's analysis here. At the risk of redundancy, the Court notes that it relied on the following evidence (among others) in concluding that Roland satisfied his burden of proving that his condition manifested itself during the developmental period:
• Roland was exposed to a plethora of risk factors throughout his life (see supra at 494-96);
• Ms. Carter's testimony that, in his developmental years, Roland was slower than his peers, including slower to walk, talk, and read (see supra at 492-93);
• Roland was placed in special-education classes at age 9 (see supra at 492-93);
• multiple raters indicated that Roland had significant deficits in the conceptual domain at ages 9, 10, and 12 (see supra at 531, 541-43);
• Roland was labeled mentally retarded by the SSA at age 14 (see supra at 495-97);
• by age 17, Roland was behind at least two grades (see supra at 546 n.99);
• Roland's low composite score of 70 on the 2002 KBIT, at age 17 (see supra at 494);
• multiple evaluators at the Juvenile Justice Commission noted Roland's cognitive limitations when he was 17 (see supra at 494-95);
• Captain Thomas's testimony that, as a teenager, Roland did not understand social cues; was referred to as a "stunt dummy"; was blamed for things he did not do; and was slower than the other kids (see supra at 547-48);
• Roland's academic record reveals that he remained behind his peers at least until age 22 (see supra at 492-94).
iii. Prong Three: Conclusion
The final prong of the ID definition was the least contested. The real dispute between the parties was whether Roland's deficits were indicative of ID or instead attributable to some other condition or circumstance (e.g., learning disability, emotional disturbance, poor effort, neglect, truancy). (See supra at 527-28 n.74, 541-42, 545-46). Having considered all of the evidence proffered by the parties, it is abundantly clear to the Court that Roland's condition began well before he turned 18. Roland has therefore carried his burden of proving by a preponderance of the evidence that his condition manifested itself during the developmental period. Accordingly, Roland has satisfied the final element.
V. CONCLUSION
For the reasons above, the Court GRANTS Roland's motion for a pretrial determination of ID. Roland has shown by a preponderance of the evidence that he meets the legal standard for ID. Roland has demonstrated significantly subaverage intellectual functioning, along with significant deficits in adaptive functioning in two out of three adaptive-behavior domains. See Moore, 137 S.Ct. at 1045. These deficits arose during Roland's developmental period and persisted at all relevant times. Accordingly, Roland is ineligible for the death penalty under the Eighth Amendment and the FDPA. See Atkins, 536 U.S. at 311, 321, 122 S.Ct. 2242; 18 U.S.C. § 3596(c). An appropriate Order accompanies this Opinion.
On August 3, 2017, the Government moved to strike paragraphs 171, 172, 248, and 695 of Defendant's Proposed Findings of Fact and Conclusions of Law (see Def. Post–Hearing Submission). (See D.E. Nos. 438 & 441). In support of its motion, the Government argues that "Defendant's pleading references numerous articles for the first time" and those paragraphs therefore "rely on facts that are not in evidence." (Id. at 1, 3). Roland opposed the Government's motion, arguing that (with the exception of one article cited in paragraph 695) the Government was on notice of every article and book cited in the remaining paragraphs. (See D.E. No. 440 at 1). According to Roland, "each of these articles and books is in fact in the record—either as a specific exhibit, or referenced in the expert reports which themselves were entered into evidence as exhibits." (Id. at 2). As the parties can see, however, the Court's analysis nowhere relies on any of these disputed paragraphs (or any of the disputed authorities cited within them). Accordingly, the Government's motion to strike (D.E. Nos. 438 & 441) is denied as moot.