April 20, 2001
IN THE MATTER OF THE COMMITMENT OF R.S., PETITIONER-APPELLANT.
On appeal from the Superior Court of New Jersey, Law Division, Essex County, SVP-7-99.
Before Judges King, Coburn and Axelrad.
The opinion of the court was delivered by: King, P.J.A.D.
NOT FOR PUBLICATION WITHOUT THE APPROVAL OF THE APPELLATE DIVISION
Argued: March 14, 2001
This is an appeal from a decision to admit testimony into evidence about actuarial risk assessment in a civil commitment hearing under the New Jersey Sexually Violent Predator Act (SVPA or Act), N.J.S.A. 30:4-27.24 to -27.37; L. 1998, c. 71, effective date August 12, 1999; see In re Commitment of M.G., 331 N.J. Super. 365, 371-74 (App. Div. 2000), for a review of the Act. See also John Kip Cornwell, John V. Jacobi and Philip H. Witt, The New Jersey Sexually Violent Predator Act: Analysis and Recommendations for the Treatment of Sexual Offenders in New Jersey, 24 Seton Hall L. Rev. 1 (1999).
Appellant R.S. was committed to the State's Special Offenders Unit at the Northern Regional Unit (NRU) in Kearny after a non-jury hearing at which the judge heard testimony from a psychiatrist and a psychologist appearing for the State. Both recommended that R.S. be committed for treatment, care and confinement as a sexually violent predator. The State's psychologist based her recommendation, in part, on the results of actuarial assessments bearing on R.S.'s risk of recidivism.
Because R.S. objected to the use of actuarial instruments, the judge held an evidentiary hearing before reaching a decision on commitment. After receiving testimony from experts for both sides, he held that actuarial instruments were properly admissible because they helped the court and satisfied the requirements of reliability. The judge ruled that R.S. posed a threat to the community because he had a mental abnormality which predisposed him to commit acts of sexual violence. On this appeal, R.S. raises only the issue of the admissibility of the actuarial assessment instruments. We uphold their admissibility. Our de novo review of the record establishes that the State has met its burden to demonstrate the tests are reliable for use in this context as an aid in predicting recidivism.
On September 15, 1999 the Attorney General filed a petition for the civil commitment of R.S., under the recently effective SVPA. The petition was accompanied by two clinical certificates for involuntary commitment prepared by Vivian Schnaidman, M.D., and Lawrence A. Siegel, M.D., identifying R.S. as a sexually violent predator. On September 15, 1999 R.S. was temporarily committed to the NRU until a final hearing could be conducted on the issue of his continuing need for involuntary commitment as a sexually violent predator.
The commitment hearing was scheduled before Judge Philip M. Freedman on March 28, 2000. There is no right to a jury trial under the SVPA. At that time, counsel for R.S. moved to exclude any testimony concerning actuarial risk assessment instruments utilized by the State's experts. Later, in June, a five-day evidentiary hearing was held on the issue of the admissibility of actuarial instruments in the three cases now before us, R.S., W.Z. and J.P. On July 5, 2000 Judge Freedman decided that the actuarial instruments were admissible in their own right and as the basis of an expert opinion. He ruled that R.S. qualified as a "sexually violent predator" because he had been convicted of a predicate sexually-violent offense, suffered from a mental abnormality which affects his volition making him likely to engage in acts of sexual violence, and should be confined for treatment. N.J.S.A. 30:4-27.26.*fn1 Judge Freedman entered a judgment on July 6, 2000 committing R.S. to the NRU and scheduling an annual review hearing for June 14, 2001.
We first review R.S.'s prior criminal history. R.S., born December 14, 1967, has a significant history of sexual assault on prepubescent boys, under the age of thirteen. In 1989, R.S. sexually assaulted D.G., his male cousin, age nine, by digitally penetrating his anus and exposing his penis to the victim.
R.S. was arrested on February 1, 1989 in Passaic County for that offense and charged with aggravated criminal sexual contact. In a sworn statement at the time of his arrest, R.S. stated he was under the influence of alcohol when he sexually assaulted D.G. and when he was drinking he became "turned on" by the sight of young boys. On April 28, 1989 R.S. admitted his guilt to the sexually violent offense against victim D.G. On June 15, 1989 R.S. was evaluated at the Adult Diagnostic and Treatment Center (ADTC). At that time, R.S. admitted his first incident of sexually aberrant behavior occurred in 1987 when he exposed his genitals to his cousins in the basement of his home. He also acknowledged his behavior was wrong; he said when he was under the influence of alcohol, he behaved in a sexually inappropriate manner. He was diagnosed as a repetitive and compulsive sex offender eligible for sentencing to ADTC pursuant to the Sex Offender Act, N.J.S.A. 2C:47-1 to -10. On August 2, 1989 R.S. was sentenced to a five-year probationary term for the sexual assault of D.G. As a condition of his probation, mandatory substance abuse treatment and mental health treatment were ordered.
R.S. was again convicted of a sexually violent offense on July 22, 1993. He sexually assaulted four boys, between the ages of nine and twelve, on repeated occasions between September 1990 and November 1991. He committed these sexual assaults while on probation for the earlier sexual assault of D.G., age nine. R.S. lured the four young boys into his home to play Nintendo games. R.S. then played wrestling-type games with the boys during which he fondled their genitals and buttocks. He also exposed his penis to the boys and masturbated in front of them.
R.S. was arrested on November 12, 1991 and charged with ten counts of second-degree sexual assault and four counts of third-degree endangering the welfare of a child. On September 21, 1992, pursuant to a plea agreement, he pleaded guilty to four counts of second-degree sexual assault. According to the May 17, 1993 pre-sentence evaluation, he admitted to a long history of sexual attraction to young boys and his own victimization at age twelve. In describing his sexual assaults of the four boys, R.S. admitted he realized that what he was doing was wrong but was unable to stop himself; he stated, "I would say to myself, 'It's wrong.' But it didn't seem to work."
On July 22, 1993 R.S. was sentenced to a seven-year prison term with a five-year mandatory minimum. After evaluation, R.S. was ordered to serve his sex offender's sentence at ADTC. On October 15, 1993 R.S. pleaded guilty to a violation of the terms of his probation on the first sexually violent offense and was sentenced to a four-year term of incarceration, concurrent to the sentence on the later convictions.
During his incarceration at ADTC, R.S. was subject to six disciplinary charges including refusing to obey an order, threatening a staff member with bodily harm, assault, conduct which disrupts, and misusing electronic equipment. As a result of these disciplinary sanctions, on September 23, 1997, he was transferred from ADTC to East Jersey State Prison for a period of administrative segregation. On March 31, 1998 he returned to ADTC, served out his maximum sentence, and was released in September 1999.
At the SVPA commitment hearing, the State presented testimony from two experts, Dr. Stanley Kern, a psychiatrist employed by the NRU, and Dr. Jennifer Kelly, a staff psychologist at the NRU. R.S. did not present any witnesses at this final commitment hearing; however, he did present three expert witnesses at the evidentiary hearing on use of the actuarial evidence.
Kern's diagnosis was pedophilia, alcohol abuse and borderline mental retardation. R.S. was currently taking Prozac, to control compulsive behavior; Lupron, to reduce sexual desire; and Thorazine, to relieve anxiety. While Kern acknowledged that R.S. had undergone eight years of therapy, he had only recently become more involved in treatment at the NRU. R.S. still had an "impulsive control" problem and "certainly" needed treatment and confinement. Kelly testified that she prepared an evaluation of R.S. based upon a review of his treatment history, criminal record, and standardized testing, as well as her personal observations of R.S. in group sessions and a clinical interview. Kelly also administered the Minnesota Multi-Phasic Personality Inventory (MMPI-II) and scored four actuarial risk assessment tools: the Minnesota Sex Offender Screening Tool Revised (MnSOST-R), the California Actuarial Risk Assessment Tables (CARAT), the Registrant Risk Assessment Scale (RRAS), and the Rapid Risk Assessment of Sex Offender Recidivism (RRASOR). Kelly said that the results of the MMPI-II were consistent with an individual who is impulsive, easily frustrated, angry, hostile and in some ways antisocial. On the actuarial instruments, R.S. fell into the high-risk range on the MnSOST-R, the CARAT, and the RRAS, and into the moderate-risk range on the RRASOR. Kelly stated that R.S. has acknowledged that he currently has deviant sexual fantasies involving children, but he claimed that he has tried to change his fantasies of children to fantasies of adult males. However, Kelly could not be sure of the truth of his statement because R.S. was not completely forthright on the psychometric testing. She concluded in her forensic psychosexual evaluation of March 27, 2000 that R.S. is at "high risk" to reoffend and should remain at the NRU for continued sex offender and substance abuse treatment.
Kelly testified at length about the actuarial risk assessment instruments she used in her evaluation. Research discloses, she explained, that clinical judgment alone has been considered inadequate to make a determination of which sex offender is going to recidivate and which is not. Individuals working in the field of sex offender risk assessment have developed actuarial tools to aid in making predictions of future dangerousness. This is done by studying those sex offenders who recidivate to see which risk factors they have in common. Through statistical tests, the factors which repeat most often are identified and used to create the actuarial instruments.
Actuarial instruments mainly measure static factors, which Kelly explained are historical facts about the offender which do not change. Once a subject's record is reviewed and an instrument is scored, the results are then adjusted based upon the evaluator's clinical judgment of the subject's dynamic factors. Dynamic factors are factors which change over time, such as an individual's treatment progress, his attitude, and his arousal patterns. Kelly stated that dynamic factors are much more difficult to measure than static factors; they are subjective and they can vary from day to day. Kelly discussed the nature of each actuarial instrument she used, its validity and its reliability. Kelly's testimony in regard to actuarial instruments was entirely consistent with the testimony of the other State experts.
Dr. Glenn Ferguson, a psychologist employed as the clinical director at the NRU, was the first witness to testify on behalf of the State at the evidentiary admissibility hearing on the actuarial instruments relied upon in the three cases before us, R.S., W.Z., and J.P. He obtained a Ph.D. in 1997. His Ph.D. dissertation was "a validation study of the Registrant Risk Assessment Scale." Ferguson explained that the NRU uses an adjusted actuarial approach to evaluate a sexual offender's risk of recidivism. This consists of comparing actuarial instruments, psychological testing, clinical interviews and clinical observations to see when there is agreement to support a clinical diagnosis. Use of these different scales is "state-of-the-art" in the field of sex offender risk assessment because it minimizes the weaknesses inherent in any one single test.
Ferguson discussed each of the recently developed actuarial instruments used at the NRU, beginning with the CARAT. The CARAT is a purely actuarial measure developed by looking at the characteristics and personality traits of about 500 California sex offenders who had been released into the community after completing a treatment program. Comparing traits among those who recidivated, researchers derived a table very much like the actuarial tables insurance companies use to set rates. Because this is purely an actuarial measure, it does not consider dynamic factors which might contribute to an individual's recidivism. The California researchers who developed the CARAT did a validation study on the instrument which was favorable. A validation study looks at the ability of an instrument to measure what it purports to measure: in this case, to correctly classify individuals into different risk factors for recidivism.
Ferguson testified that the original MnSOST was a clinically-derived measure in which Minnesota researchers took factors which research had shown as significant indicators for sexual recidivism and scored them, based on clinical judgment, on which were most important. The researchers then refined the scale through factor analysis and validity studies to come up with a more statistically valid approach, the MnSOST-R. The MnSOST-R is empirically based but is one of the few actuarial instruments which attempts to capture dynamic factors, such as an individual's participation in treatment. The MnSOST-R has been validated and cross-validated with good results. Ferguson explained that the difference between a validation study and a cross-validation study is that a validation study is done using the same population that was used to develop the test while a cross-validation study looks at an entirely new set of individuals.
Ferguson described the RRASOR as an empirically-based instrument designed after a meta-analysis study by two prominent researchers in the field. A meta-analysis study is a study in which researchers look at several different studies at the same time (in the case of the RRASOR, about 100 studies related to sex offender recidivism) and come up with a set of factors that are significant in all of them. The RRASOR consists of only four factors, all of which are static. He said that it has been validated and possibly cross-validated as well.
Concerning the Static 99, Ferguson stated that it is an improvement of the RRASOR, just as the MnSOST-R is an improvement of the MnSOST. The Static 99 was not scored for R.S., but it has been used in other sex offender commitment hearings. At the request of the Public Defender, the trial judge here considered the admissibility of the Static 99 along with the other actuarial instruments at R.S.'s evidentiary hearing. The Static 99 was developed by a British researcher who combined the RRASOR with statistical instruments used in Britain and Canada. A strength of this instrument is its utility in predicting violent recidivism as well as sexual recidivism. Although it does not capture many dynamic factors, it does consider substance abuse. The Static 99 has been validated.
Finally, Ferguson discussed the RRAS, an instrument developed by clinicians and legal experts in New Jersey after the enactment of "Megan's Law," The Registration and Community Notification Law, N.J.S.A. 2C:7-1 to -11, as an objective way of assigning tier classifications to sex offenders prior to release into the community. Ferguson was a member of the group which did the validation study on the RRAS. The RRAS is a clinically developed scale, based upon a 1995 review of the literature. One of its strengths is the inclusion of several dynamic factors such as progress in treatment, community support, employment, and substance-abuse treatment. However, Ferguson admitted that because the RRAS is not empirically derived it is "on the lower end of the preference scale" and is not in the same league as the MnSOST-R or Static 99.
Ferguson also testified that even though these actuarial tools were designed for specific regional populations (the CARAT was developed for use in California), they are equally applicable to any sex offender population. Evidence for that statement comes from the meta-analysis where studies from all over the world were considered.
Ferguson also stated that an "overwhelmingly" large number of research studies support the use of static factors over the use of dynamic factors for making sex offender risk determinations. One great advantage to using actuarial instruments is that by assigning specific weight to specific factors they standardize clinical assessments by ensuring that different clinicians arrive at basically the same result. Ferguson explained that "inter-rater reliability" as applied to a risk assessment tool refers to its consistency: whether two different scorers will arrive at the same results for the same individual. Usually, the largest factors contributing to inconsistency are improper training and access to different information. The inter-rater reliabilities of the MnSOST-R, the RRASOR and the Static 99 are all fairly high, with the Static 99 the best.
Ferguson admitted that many of the same people who created the assessment tools did the reliability and validation studies, but explained that this was because most of the instruments have not been in use long enough for peer review or replication studies. Although there are no formal testing manuals for the instruments, there are articles, technical instructions and materials on the Internet to aid the evaluators. And, there are numerous workshops around the country which offer training from the instrument developers themselves.
When asked about the correlation coefficients for the instruments, which represent the degree of agreement between the factors being considered and recidivism, Ferguson stated that they are generally in the 0.20 to 0.30 range, with the Static 99 being the best at around 0.40. While these coefficients may not seem high to the uninformed observer, he said that in the field of medicine anything over 0.15 is statistically significant. The range of .20 to .30 is much better than random chance or guesswork.
Ferguson explained it is important to understand that actuarial instruments are not predictive with regard to any particular individual; they can only indicate within what group the individual falls. In other words, an actuarial tool can say that a particular individual has characteristics similar to other individuals in a group that recidivates 70% of the time, but it cannot say that a particular individual has a 70% chance of recidivating. For this reason, actuarial instruments are not considered true psychological tests of the person. A psychological or "psychometric" test measures a personality or cognitive construct, such as I.Q., which is a unique characteristic of an individual and provides information specific to that individual. An actuarial instrument, on the other hand, measures impersonal historical factors to reach a result not predictive for a particular individual. Therefore, Ferguson said, test development standards applicable to psychometric tests do not apply to actuarial instruments.
Dr. Dennis Doren also testified on behalf of the State. Doren is a psychologist who has been involved in sex offender treatment in Wisconsin since 1983 and in sex offender risk assessment there since 1994. He explained there are basically five types of sex offender assessment procedures. The first is unguided clinical judgment; this is simply the opinion of a psychiatrist or psychologist who has no preformed set of ideas of what factors contribute to risk. The second is guided clinical judgment where the clinician has some fixed or articulable ideas of what risk factors are important, perhaps based on experience or theory. These first two methods have been used in routine civil commitment proceedings in New Jersey. The third procedure is research guided clinical judgment in which the clinician considers factors that research has shown as important. The fourth, which is the method used by the NRU, is the clinically-adjusted actuarial assessment in which the clinician starts with a statistically-based formula and makes clinical adjustments according to the specific details of each case. Finally, there is the pure actuarial method which uses statistical formulas without any clinical adjustment.
Of these methods, Doren said, research-guided clinical judgment and clinically-adjusted actuarial assessment are the most often used in the field of sexual offender risk assessment. The difference between the two approaches is the weighting of the risk factors. In research guided clinical judgment the evaluator knows what factors are important but not how much weight to give one factor relative to the others. By using statistics, the actuarial approach standardizes how much weight is given to each factor.
Doren testified that there are currently about 150-175 experts nationwide in the field of sex offender risk assessment and most employ the clinically-adjusted actuarial assessment method. At the time of his testimony, June 16, 2000, fifteen states had sexually violent predator (SVP) laws, and only two, Texas and Massachusetts, did not use actuarial assessment tools. In July 1999, Doren surveyed the thirteen states which employ these instruments to determine which were most used for risk assessment. He discovered that the RRASOR was used by most of the evaluators in all thirteen states, the MnSOST-R was used in ten states, the CARAT was used only in California, and the RRAS was used only in New Jersey. Although the CARAT and the RRAS are not frequently used, the underlying principle of both is generally accepted.
Doren stated that the clinically-adjusted actuarial method is the most accurate method for risk assessments. Research has shown that actuarial analysis is at least as efficacious as clinical judgment and often better. This is because clinicians are often not systematic in their data gathering or in their memory. A study in Canada showed that clinicians tend to overestimate violence of all types, including sexual violence, so that unguided clinical judgment tends to come up with higher assessments of risk than does the actuarial process. By restricting and structuring clinical judgment, actuarial instruments produce more refined and accurate results.
In Doren's opinion, the order of the instruments from best researched to least researched is the Static 99, the RRASOR, the MnSOST-R, the CARAT and the RRAS. However, he also said that the instruments currently in use are not comprehensive enough in terms of the factors they consider and in terms of the type of outcome they measure. The greatest shortcoming of the instruments is that they do not adequately consider dynamic factors.
Doren also acknowledged it is a misuse of the instruments to say that a person with a certain score has a specific risk of recidivism. Rather, it is proper to say that a person with a certain score is in a group that has been shown through research to have a specific risk of recidivism. Misuse of the instruments can be avoided by reading the documents which describe how to interpret the results, receiving training from someone knowledgeable about the instruments, or consulting with another professional aware of the information. Finally, Doren agreed with Ferguson that actuarial instruments are not psychological tests and are not designed specifically for psychologists. Hence, the rules applicable to psychological testing are not relevant to the development of the instruments.
Dr. Randy Kurt Otto, a psychologist and professor at the University of South Florida, testified on behalf of R.S. at the evidentiary hearing. Otto has been performing sex offender assessments in Florida since January 1999 and does not use actuarial tools, although he is familiar with how they are used and scored.
Otto stated that "a lot of people" in the field of sex offender assessment use actuarial tools, but added that the discipline of psychology has a history of psychologists using invalid instruments. He believes that the psychologists and psychiatrists who should be involved in validating these instruments are those with expertise in psychometrics and test development in general, not clinicians.
In Otto's opinion, in order for a test to be generally accepted it must be (1) published and made available for others to review, (2) proven to be reliable, (3) validated and cross-validated, (4) accompanied by a test manual, (5) critically reviewed by an independent group, and (6) associated with a known standard error of measurement. Psychologists are obligated under their code of ethics to use only those tests which meet these standards.
Otto discussed the concepts of psychometric reliability (If the test is administered to the same person on more than one occasion, are the results consistent?), inter-rater reliability (If two different individuals administer the test, are the same results achieved?) and scale consistency (Are the items on the same scale internally consistent? Do they measure the same thing?) and stated that the actuarial tools are lacking in all these areas. According to Otto, the inter-rater reliability is unknown for the RRASOR, the Static 99, the CARAT and the RRAS, and the inter-rater reliability for the MnSOST-R only holds true if all test administrators receive training.
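The inter-rater reliability question described above (whether two different scorers reach the same result for the same individual) can be illustrated with a short sketch. It uses Cohen's kappa, a common chance-corrected agreement statistic. That choice is an assumption made for illustration; the testimony does not say which statistic was reported for any instrument. The risk-band labels and the two scorers' ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters' categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(rater_a)
    # Share of cases on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical risk bands assigned by two scorers to the same six files.
scorer_1 = ["high", "high", "moderate", "low", "high", "moderate"]
scorer_2 = ["high", "moderate", "moderate", "low", "high", "moderate"]
kappa = cohens_kappa(scorer_1, scorer_2)
print(f"kappa = {kappa:.3f}")
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance; the two hypothetical scorers here agree on five of six files.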
Otto also discussed predictive validity (How well does the test predict what it purports to predict?), construct validity (How well does the test measure a particular construct, such as intelligence, as compared to other measures of the same construct?), sensitivity (How many individuals will the test identify who have the behavior for which you are testing?), and specificity (How many individuals will the test incorrectly classify who do not have the behavior for which you are testing?). Although the MnSOST-R has been validated and cross-validated, Otto thought the results of the validation questionable because the representative sample of offenders was very small. Also, the only validity data obtained to date is for the pure use of actuarials and there has never been a validity study done on the adjusted actuarial approach in general.
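Sensitivity and specificity can be made concrete with a small sketch. It follows the usual statistical definitions: sensitivity is the share of actual reoffenders the instrument flags, and specificity is the share of non-reoffenders it correctly leaves unflagged. The share of non-reoffenders who are incorrectly flagged, which is how the parenthetical above is phrased, is the complement of specificity (the false-positive rate). The ten outcomes below are hypothetical and are not drawn from any study in the record.

```python
def sensitivity_specificity(flagged_high_risk, reoffended):
    """Return (sensitivity, specificity) from paired boolean lists."""
    tp = fp = tn = fn = 0
    for flagged, actual in zip(flagged_high_risk, reoffended):
        if flagged and actual:
            tp += 1      # flagged and did reoffend
        elif flagged and not actual:
            fp += 1      # flagged but did not reoffend
        elif not flagged and actual:
            fn += 1      # missed reoffender
        else:
            tn += 1      # correctly left unflagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one reoffender and one non-reoffender")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data: instrument flags versus observed outcomes.
flags = [True, True, True, False, False, False, False, True, False, False]
outcomes = [True, True, False, True, False, False, False, False, False, False]

sens, spec = sensitivity_specificity(flags, outcomes)
false_positive_rate = 1.0 - spec
print(f"sensitivity={sens:.2f} specificity={spec:.2f} fpr={false_positive_rate:.2f}")
```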
Otto further testified that the developers of the instruments have never reported standard error of measurement, although it can easily be calculated from the data. The informational materials available about the instruments over the Internet are not of comparable quality to a commercially-published testing manual. Articles about some of the instruments have been published in peer-reviewed journals, but in Otto's opinion none of the publications have been comprehensive.
Otto considers many of the same factors used by the instruments when he assesses a sex offender and acknowledges that the meta-analysis leading up to the RRASOR was "great work," a "wonderful" first, and a "valuable and worthwhile" attempt at developing a reliable instrument. Otto believes the data-based objectivity of the actuarial approach is "an absolute" strength. However, Otto concluded that:
I think it's a real concern here that these instruments promise something they don't deliver. And they have an incredible aura of scientific certainty and preciseness that's just not there if you peel away the second layer of the onion.
Therefore, I think psychologists do a disservice to the profession and psychiatrists, too, for that matter, when they use them and act as if there's this precision and with a scientific basis that's not really there.
Dr. Kay Jackson, a psychologist and co-director of the Metropolitan Center, a treatment center for sex offenders in New York City, also testified on behalf of R.S. at the evidentiary hearing. She uses actuarial instruments as a way of assisting her interviews with clients and in assessing their dangerousness.
Jackson discussed the concepts of reliability and validity and applied them to the most common actuarial instruments. She said no one has reported any reliability statistics regarding the RRAS or the CARAT. Thus, we do not know if they are good instruments for predicting recidivism. The RRASOR has a modest predictive validity and reliability and the inter-rater reliability scores for the MnSOST-R are "pretty good." Jackson did not feel she could comment on the Static 99. All of the actuarial instruments are based upon archival records and so it would be a simple matter to do a predictive validity study on them, she thinks.
Jackson testified that actuarial tools are not generally accepted in the field of psychology because she has not seen any publications concerning them in general psychological journals. To her, meaningful peer review must encompass practitioners from outside the narrow specialty field of sex offender assessment.
However, Jackson acknowledged that actuarial instruments can be useful:
I think that the actuarial tools provide important information for understanding what is going on with an offender and for making a clinician apprized of factors which he or she may have overlooked, and they're very important for understanding the probabilities of reoffense. But I do not think that by themselves they can single out or offer a definitive understanding of the propensity and imminent risk of any specific individual at a given point in time.
Jackson found it important to employ as many diverse sources of information regarding a patient as possible. The problem with the clinically adjusted actuarial approach is it might lead a clinician to assume that every factor determinative of reoffense has already been identified and assigned a probability, but that simply is not the case. Although the available research studies have been interesting and illuminating, they have not yet developed the degree of specificity which would allow a clinician to weigh absolutely essential dynamic factors, such as the effects of the passage of time, the development of a support system, realizing victim empathy, and the motivation underlying the offense.
Jackson testified that there is very persuasive research to suggest that clinical predictions for reoffense are no better than actuarial predictions and, in some circumstances, are even worse. Other research suggests that clinicians who use judgment informed by both research and actuarial instruments perform at least as well as the actuarial instruments and, in some circumstances, even better.
Jackson stated that the actuarial instruments can be meaningfully used by clinicians as long as they understand their limitations and do not rely upon them exclusively. The instruments are especially useful when used as a check list without employing the weights or probabilities their developers have assigned to them. Jackson agreed that some form of adjusted actuarial risk assessment can be expected to represent the highest standard of practice in the field in the coming years.
The final expert for R.S. was Dr. Frederick Berlin, an M.D. psychiatrist and Ph.D. psychologist, who is a professor at the Johns Hopkins University School of Medicine and founder of the Johns Hopkins Sexual Disorders Clinic. Berlin clarified at the outset that the actuarial instruments under discussion are not psychological tests, but are simply statistical methods used to identify whether an individual belongs to a group with certain kinds of risk factors. Berlin echoed the admonitions of all the experts that actuarials are not helpful in making statements about individuals within a group and can only be used to make statements about the likely outcome for the group as a whole.
Berlin was very critical of the RRASOR, which consists of four simple questions regarding the subject's number of prior sexual offenses, the age of the offender, whether the offenses involved incest, and the gender of the victims. In practice, the method usually comes down to just two questions because most individuals being evaluated are over age twenty-five and incestuous offenders are rarely considered for civil commitment. All the RRASOR is really saying is that offenders who have had more prior offenses and who have had male victims are more likely to reoffend, and that "there's some consensus in the literature [which] suggests that if you've had a male as opposed to female victims you may be at heightened risk." Berlin said when the RRASOR was tested on 230 sex offenders in Minnesota, it did "abysmally" in predicting who would reoffend and who would not. Interpreting the RRASOR's correlation coefficient, Berlin explained that it indicated that only 2% of what determined whether or not someone would recidivate was being measured by the instrument. "Ninety-eight percent of what determined whether a given individual would or would not recidivate had nothing to do with anything that was being measured by the RRASOR." Berlin concluded that even though there is a statistical correlation between RRASOR scores and recidivism, in practical terms, to him, it means "virtually nothing."
Concerning the CARAT, Berlin testified that he had seen no studies concerning its predictive validity. He also questioned its accuracy with regard to R.S. Berlin compared Kelly's report that the CARAT placed R.S. within the group found to recidivate at a rate of 70% within five years with a study done of 600 men with similar backgrounds being treated at the Johns Hopkins Sexual Disorders Clinic that indicated the five-year recidivism rate was 10%. Berlin concluded that, based on his clinical experience, the CARAT prediction could not be accurate.
Berlin described the RRAS as not an actuarial method at all because it is without any basis in statistical analyses. It was designed as a common sense, reasonable approach that allows offenders to be classified into tiers for community notification purposes, but there is no statistical basis for its scoring. Berlin knew of no study which indicates that the RRAS can accurately predict recidivism.
The MnSOST-R was also strongly criticized by Berlin as misleading and having little predictive value. The studies which have been published make it appear promising mostly because of comparison with the earlier MnSOST which was "horrible," in his view. Berlin discussed the Static 99 and noted that it has a correlation coefficient of 0.33, corresponding to a predictive value of about 11%. This is better than any of the other instruments to date. Although some would call this a "moderate" correlation, in real life terms it means that 89% of the determinative factors are not addressed by the instrument.
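The arithmetic behind the 11% and 89% figures can be sketched as follows. This is an illustration only, and it assumes that the "predictive value" described in the testimony is the square of the correlation coefficient (the share of variance accounted for); the function name `variance_explained` is hypothetical and does not come from the record.

```python
# Illustrative only: squares a correlation coefficient to obtain the share of
# variance it accounts for. Assumes "predictive value" means r squared.

def variance_explained(r: float) -> float:
    """Return r squared, the proportion of variance accounted for by r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    return r * r

static99_r = 0.33  # coefficient quoted in the testimony for the Static 99
explained = variance_explained(static99_r)
unexplained = 1.0 - explained

# Rounded to whole percentages, these are the 11% and 89% figures above.
print(f"explained: {explained:.1%}, unexplained: {unexplained:.1%}")
```

On this reading, a coefficient of 0.33 leaves roughly nine-tenths of the variation in outcomes unaccounted for, which is the point Berlin pressed.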
Berlin summed up his objections to the use of actuarial instruments as follows:
If people want to bring information in about prior offenses and say that there's some literature 'cause most people probably have figured out that the more offenses you have you're probably likely at recidivating. Fine. But don't bring in a test that only says something about a group as a whole, suggests that it says more about an individual than it can possibly say, and give an impression.
And this is my worry that there's more of a science to this than there really is. There isn't that much of a science when it comes to individuals.
Berlin did support the use of actuarial instruments as screening tools -- to see who should be subject to a commitment hearing, for example. In fact, he acknowledged that actuarials are used generally in the psychological community as screening tools. But he does not believe that they should be accepted as accurate statements about the likelihood of a given individual to recidivate. The actuarial instruments simply cannot make distinctions among people within the group as a whole.
Judge Freedman reviewed the applicable case law and observed that the final determination of dangerousness lies with the courts and not the expertise of psychiatrists and psychologists. He discussed the difference between "hard scientific evidence" and evidence of a psychological nature, noting that some scholars believe jurors do not view psychological evidence as very scientific or worthy of unquestioning acceptance. He also recognized many experts believe that "flexible, less stringent standards of reliability are appropriate [for psychological evidence] and that no one factor should be dispositive."
Although the judge acknowledged that unreliable psychological testimony might mislead a jury, he concluded that where the court is the trier of fact, as here, the risk of confusion from expert testimony is greatly diminished. He found that [t]here is little question in the Court's mind that the use of the risk assessment instruments here would be sanctioned by the relaxed standards discussed in State versus Harvey[, 151 N.J. 117 (1997)].
It is also this Court's opinion that on the record here, it is clearly established that these tests are admissible or can be used by an expert in creating expert testimony under the Frye [v. United States, 293 F. 1013 (D.C. Cir. 1923)] test as well.
Judge Freedman reviewed the expert testimony and found that it provided an empirical basis for the instruments and for the conclusion that they are to some degree predictive of future recidivism. He observed that both Jackson and Berlin, experts for R.S., indicated that the instruments can be useful if their limitations are understood and noted that even Otto used them for screening purposes. The judge concluded that the actuarial instruments can be relied upon in making decisions about future dangerousness. The judge also observed that Berlin had agreed with the State's experts that the actuarial instruments are not psychological tests and hence not subject to the standards for psychological testing.
The judge stated that the crucial question is to what extent the percentages that the actuarials assign with regard to recidivism can be applied to the individual sex offender. He concluded that the instruments have not yet reached the point where they can apply a particular percentage to a particular person. However, he found them nevertheless of evidential value and helpful to know that a person is within a group where 60% recidivate as opposed to a group where only 30% recidivate.
Judge Freedman concluded that the actuarial instruments meet the Frye test because they are based on valid scientific principles, there have been sufficient reliability and validity studies done to justify their use, and they are used for risk assessment purposes throughout this country.
Concerning R.S., the judge reviewed the evidence presented at the commitment hearing and found that he has a mental abnormality which predisposes him to commit acts of sexual violence. Although the judge said that in some cases it might be difficult to draw the line between an individual who poses a threat to the community and one who does not, "in this case, I don't think it's close at all. I think that clearly R.S. clearly has the propensity to commit acts of sexual violence and it is to such a degree that he undoubtedly poses a threat."
R.S. first urges that in order to be used as evidence in SVPA commitment hearings, actuarial instruments must satisfy the test for admissibility established by Frye v. United States, 293 F. 1013 (D.C. Cir. 1923). He contends that actuarial instruments do not satisfy the Frye standard because they are not generally accepted by the scientific community and their reliability has not been established by formalized scientific testing or peer review. R.S. also asserts that the prejudicial effect of the instruments far outweighs their probative value, even in this non-jury context. The State responds that these actuarial instruments are well accepted in forensic psychology as shown by the testimony of experts in the field, the widespread use of these tools in other jurisdictions, and the many professional publications concerning them.
The standard of review which applies to a court's determination of the admissibility of scientific evidence at an SVPA commitment hearing is not settled. Generally, appellate courts apply an abuse of discretion standard to the evidentiary rulings of a trial court. State v. Conklin, 54 N.J. 540, 545 (1969). However, our Supreme Court has held that when the matter involves novel scientific evidence in a criminal proceeding, "an appellate court should scrutinize the record and independently review the relevant authorities, including judicial opinions and scientific literature." State v. Harvey, 151 N.J. 117, 167 (1997), cert. denied, 528 U.S. 1085, 120 S. Ct. 811, 145 L. Ed. 2d 683 (2000) (DNA evidence). This review may even include post-hearing publications, since "general acceptance may change between the time of trial and the time of appellate review." Id. at 168.
In Harvey, the Court stated in dicta that [o]n this appeal, we do not decide whether a different standard of appellate review should apply to a trial court's decision to admit or exclude expert testimony in civil cases, where the focus is not on whether the scientific evidence is generally accepted, but rather whether it derives from a reliable methodology supported by some expert consensus. [Ibid.]
We question whether the Court has established a different standard of review for the admission of scientific evidence in all civil cases, including SVPA commitments. However, at a minimum, Harvey makes reasonably clear that when a trial court applies the Frye test to admissibility determinations, an appellate court should employ a de novo standard of review, which we apply here.
Two recent decisions of our Supreme Court, Matter of Registrant G.B., 147 N.J. 62 (1996), and Matter of Registrant C.A., 146 N.J. 71 (1996), address the use of RRAS assessments in the tier hearing process under Megan's Law. R.S. contends that the holdings in these cases should not be applicable to SVPA commitment proceedings because commitment threatens a much more substantial liberty interest than community notification, tier hearings are not governed by the rules of evidence, and the underlying statutory schemes are not comparable. The State disagrees.
In C.A., the Court held that the RRAS "is an appropriate and reliable tool" whose use is consistent with the requirements of the statutes and case law. Id. at 107. Although C.A. involved a proceeding where the rules of evidence did not apply, the Court's reasoning in C.A. is instructive as to its view of actuarial instruments in general.
After reviewing the development of the RRAS, the Court stated that
[a]lthough the weights assigned to the categories in the [RRAS] Scale have not been scientifically proven to be valid, the State has produced sufficient evidence to convince us that the factors used in the Scale are reliable predictors of recidivism and are weighted in the Scale according to their relative effectiveness as predictors. The greater weight attached to the static categories is in accord with expert opinion on criminal sexual behavior. [Id. at 105.]
Most important, the Court stated that scientific literature has shown "that the use of actuarial concrete predictors is at least as good, if not in most cases better, in terms of reliability and predictability than clinical interviews." Id. at 106. Although C.A. argued that the RRAS was unreliable, untested, and arbitrary in operation, and the Court agreed that the RRAS had not been empirically validated through scientific field studies, the Court nevertheless found that the factors which comprise the RRAS "have been shown to be the best indicators of risk of reoffense." Id. at 107. The Court further observed that "one of the great strengths of the [RRAS] is that it can provide consistent measures of risk of reoffense." Id. at 108. Finally, the Court observed that the RRAS "is not a scientific device. It is merely a useful tool to help prosecutors and courts determine whether a registrant's risk of reoffense is low, high, or moderate." Ibid.
In G.B., the Court again soundly endorsed the RRAS, holding that a registrant could not present evidence at a tier hearing to challenge its predictive validity. G.B., 147 N.J. at 85. The Court fully realized that community notification under Megan's Law implicates "significant" liberty and privacy interests that trigger the doctrines of procedural due process and fairness. Id. at 74. Yet, the Court reasoned that the RRAS "is only a tool, albeit a useful one" that serves as "a guideline for the court to follow in conjunction with other relevant and reliable evidence in reaching an ultimate determination of the risk of reoffense." Id. at 80-81; see Matter of E.I., 300 N.J. Super. 519, 527-28 (App. Div. 1997) (the court need not blindly follow the RRAS tier calculation but may place a registrant in a lower-risk tier based on the non-violent and consensual circumstances of the case -- defendant age 21; victim age 15).
The Court in G.B. analyzed the role of experts at a tier hearing under the case law pertaining to civil commitments in general, implying that the same rules apply in both types of proceedings. G.B., 147 N.J. at 86-87. The Court cited In re D.C., 146 N.J. 31 (1996), which concerned the civil commitment of a convicted and paroled sex offender, as representing "how trial courts should view their role in the presentation of expert testimony when a liberty interest is at stake." 147 N.J. at 86. Contrary to R.S.'s position that C.A. and G.B. are not applicable to SVPA hearings because the liberty interests and the evidentiary standards differ, the Court appeared to say that the same evidentiary standards for admissibility apply whenever a liberty interest is at stake. Under this reasoning, if the RRAS is "presumptively reliable," id. at 82, at a Megan's Law tier hearing, it is also presumptively reliable at an SVPA commitment hearing, especially when used only in conjunction with respectable clinical testimony, and indeed as merely ancillary thereto. The Court's reliance on the utility of RRAS was reiterated recently in Matter of Registrant J.M., __ N.J. __ (2001) (decided March 2, 2001) (slip op. at 14-18).
The approach taken by the Court in finding that the RRAS is simply one piece of the adjudicatory puzzle -- a reliable tool which aids the court in reaching the ultimate determination of dangerousness -- is directly applicable to questions of its admissibility at an SVPA commitment hearing. Indeed, while the rules of evidence do not apply to Megan's Law tier hearings, the Court seems to be saying that due process and fundamental fairness requirements supersede those rules and require that the risk of reoffense be fairly evaluated by the trial court. 147 N.J. at 74-75. The Court specifically concluded that the RRAS is a fair and reliable tool to aid in that evaluation. Since the RRAS satisfied the requirements of due process and fundamental fairness in G.B., we conclude it also satisfies these constitutional elements in the present matter. For us to find that the RRAS is not admissible at all in SVPA commitment hearings would disregard the Court's holdings in C.A. and G.B. Furthermore, because the testimony before Judge Freedman was uncontroverted that the RRAS is the least widely used and least experimentally supported of the relevant actuarial instruments, we conclude that the other risk assessment tools are admissible as well.
In C.A. and G.B., the Court took notice of the fact that the use of actuarial predictors is at least as reliable, if not more so, than clinical interviews. C.A., 146 N.J. at 105. Further, the Court held that allocating weight to risk factors in accordance with scientific literature and expertise is an acceptable method of predicting future criminal sexual behavior. Ibid. Also, the Court found it highly desirable to have a method which provides consistency in risk determinations. Id. at 108. Finally, the Court specifically stated that the RRAS is presumptively reliable in measuring a sex offender's risk of reoffense. G.B., 147 N.J. at 81-82. For these reasons, we uphold the admissibility and use of actuarial instruments at SVPA hearings as a factor in the overall prediction process under the precedent of C.A. and G.B.
We next consider the question of admissibility of expert scientific evidence under the more general standard. The admission of expert testimony is governed by N.J.R.E. 702, which provides:
If scientific, technical, or other specialized knowledge will assist the trier of fact to understand the evidence or to determine a fact in issue, a witness qualified as an expert by knowledge, skill, experience, training, or education may testify thereto in the form of an opinion or otherwise.
This rule imposes three basic requirements on the admission of expert testimony:
(1) the intended testimony must concern a subject matter that is beyond the ken of the average juror;
(2) the field testified to must be at a state of the art such that an expert's testimony could be sufficiently reliable; and
(3) the witness must have sufficient expertise to offer the intended testimony. [State v. Kelly, 97 N.J. 178, 208 (1984).]
Both parties agree that the experts presented at the hearings in this matter were properly qualified to offer testimony regarding sex offender risk assessment. Both parties also agree that the experts' specialized knowledge was beyond the ken of the average juror and helpful to the court in deciding the issue of admissibility of actuarial instruments. The sole question, then, is whether actuarial instruments as indicators of sex offender recidivism have achieved a state of the art so that an expert's testimony based in part upon them is sufficiently reliable.
Although the expert testimony at issue involves behavioral science, which is concededly subjective and less tangible than the techniques of physical science, our Court has applied the same test to its admissibility. See State v. Fortin, 162 N.J. 517, 525 (2000) (expert testimony concerning an application of behavioral science must be evaluated under the test for admission of scientific evidence); State v. Cavallo, 88 N.J. 508, 518 (1982) ("the policies for applying the test to physical techniques such as radar apply as well to psychiatric testimony").
New Jersey has long recognized that in order to be admitted into evidence, a novel scientific test must meet the standard articulated in Frye v. United States, 293 F. 1013 (D.C. Cir. 1923). State v. Doriguzzi, 334 N.J. Super. 530, 539 (App. Div. 2000).
Although Frye has been replaced in the federal court system in favor of the more lenient standards of Federal Rule of Evidence 702 as set forth in Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579, 113 S. Ct. 2786, 125 L. Ed. 2d 469 (1993), in New Jersey, with the exception of toxic tort litigation, Frye remains the standard. [Ibid.]
The test, which has been applied in criminal and civil cases alike, is whether the specific scientific community generally accepts the evidence. State v. Spann, 130 N.J. 484, 509 (1993); Windmere, Inc. v. International Ins. Co., 105 N.J. 373, 386 (1987).
A proponent of a newly-devised scientific technology can prove its general acceptance in three ways:
(1) by expert testimony as to the general acceptance, among those in the profession, of the premises on which the proffered expert witness based his or her analysis;
(2) by authoritative scientific and legal writings indicating that the scientific community accepts the premises underlying the proffered testimony; and
(3) by judicial opinions that indicate the expert's premises have gained general acceptance.
The burden to "clearly establish" each of these methods is on the proponent. [State v. Harvey, 151 N.J. at 170 (citations omitted).]
To establish that a technology is generally accepted in the profession, a party need not necessarily show there is a unanimous belief in the absolute infallibility of the techniques that underlie the scientific evidence. Windmere, 105 N.J. at 378. "The fact that a possibility of error exists does not preclude a conclusion that a scientific device is reliable." Ibid., citing, Romano v. Kimmelman, 96 N.J. 66 (1984). The burden on the proponent of the evidence is to prove that it is a "non-experimental, demonstrable technique that the relevant scientific community widely, but perhaps not unanimously, accepts as reliable." Harvey, 151 N.J. at 171.
Despite strong opposition from many mental health professionals, the United States Supreme Court has held that the testimony of psychiatrists and psychologists bearing on the future dangerousness of an individual is admissible. Barefoot v. Estelle, 463 U.S. 880, 896-99, 103 S. Ct. 3383, 3396-97, 77 L. Ed. 2d 1090, 1106-08 (1983). In reaching that conclusion, the Court quoted the following passage from Jurek v. Texas, 428 U.S. 262, 274-76, 96 S. Ct. 2950, 2957-58, 49 L. Ed. 2d 929, 940-41 (1976):
It is, of course, not easy to predict future behavior.
The fact that such a determination is difficult, however, does not mean that it cannot be made. Indeed, prediction of future criminal conduct is an essential element in many of the decisions rendered throughout our criminal justice system. . . . What is essential is that the jury have before it all possible relevant information about the individual defendant whose fate it must determine. [Barefoot, 463 U.S. at 897; 103 S. Ct. at 3396, 77 L. Ed. 2d at 1106-07.]
This reasoning was followed by the New Jersey Supreme Court in Doe v. Poritz, 142 N.J. 1, 34 (1995), when it found that there is nothing quintessentially inscrutable about a prediction of future criminal conduct. The Court held that an expert, with "a fair degree of experience with sex offenders and their characteristics, along with adequate knowledge of the research in this area," is qualified to testify concerning a sex offender's risk of reoffense at a Megan's Law tier hearing. Id. at 35.
Barefoot allowed testimony concerning future dangerousness based upon the clinical judgment of qualified mental health professionals. In this matter now before us, the experts agreed that substantial research has shown that actuarial instruments may be better predictors of future violence than clinical predictions. The New Jersey Supreme Court has found as fact that actuarial instruments are at least as reliable, if not more so, than clinical interviews. C.A., 146 N.J. at 105.
Since expert testimony concerning future dangerousness based on clinical judgment alone has been found sufficiently reliable for admission into evidence at criminal trials, we find it logical that testimony based upon a combination of clinical judgment and actuarial instruments is also reliable. Not only does actuarial evidence provide the court with additional relevant information, in the view of some, it may even provide a more reliable prediction of recidivism.
With regard to the Frye test standard, the State established that the use of actuarial instruments is generally accepted by professionals who assess sex offenders for risks of reoffense. All of the experts agreed that actuarial tools are in general use in the area of sex offender assessment. Doren's uncontradicted testimony indicated that actuarial instruments are commonly used in thirteen of the fifteen states with SVP laws. All of the experts agreed that the instruments can be useful, if properly understood and administered. They also agreed on the scientific validity of the principles underlying the instruments -- that sex offenders who recidivate share certain common factors that can be evaluated and weighed to arrive at a statistical prediction of recidivism. The main area of disagreement is whether the instruments have been sufficiently peer-reviewed and validated to be considered inherently reliable predictors of future dangerousness. The State's experts expressed the belief that while there is certainly much more work to do in this developing science, the tests are sufficiently reliable to provide clinicians with useful information concerning a sex offender's risk of reoffense. Of the three witnesses who testified on behalf of R.S., only Berlin really disagreed with this statement, contending that the instruments meant "virtually nothing." However, Berlin later partially contradicted himself by admitting that the instruments can be useful for some purposes, such as screening tools to "see who gets to a civil commitment, for example."
Judge Freedman observed that there was a general consensus among the experts that actuarial instruments are, at least to some degree, reliable in predicting future dangerousness. Recognizing that questions concerning the instruments' overall utility and reliability are more properly questions of what weight to accord them than they are questions of admissibility, the judge found that the instruments cannot assign a specific risk to a particular individual. However, Judge Freedman concluded that knowing the risk group to which an offender belongs relative to other offenders is helpful in making a future dangerousness determination.
There is no question that a substantial amount of reliability must be assured before scientific evidence may be admitted. State v. Cavallo, 88 N.J. at 518. The extensive expert testimony in this matter concerning validation studies, cross-validation studies, reliability studies, correlation coefficients, and clinically derived factors attests to such reliability in this context, where the actuarials are not used as the sole or free-standing determinants for civil commitment. They are not litmus tests. There is no requirement that the actuarial instruments be the best methods which could ever be devised to determine risk of recidivism. What is required is that they produce results which are reasonably reliable for their intended purpose. As observed, the Supreme Court, after reviewing the development of the RRAS, has concluded that it is a presumptively reliable instrument. G.B., 147 N.J. at 81-82.
We are convinced that "[w]hat constitutes reasonable reliability depends in part on the context of the proceedings involved." Cavallo, 88 N.J. at 520. Admissibility of the evidence entails a weighing of reliability against prejudice in light of the context in which the evidence is offered. Expert evidence that poses too great a danger of prejudice in some situations, and for some purposes, may be admissible in other circumstances where it will be more helpful and less prejudicial. [Ibid.]
SVPA commitment hearings are tried before a judge, not a jury. The court understands that it is the ultimate decision maker and must reach a conclusion based upon all of the relevant evidence "psychiatric or otherwise -- according each type such weight as [it] see[s] fit." State v. Fields, 77 N.J. 282, 308 (1978). An experienced judge who is well-informed as to the character of the actuarial instruments and who is accustomed to dealing with them is much less likely to be prejudiced by their admission than a one-case, fact-finding jury would be. The judge can accord the appropriate weight to actuarial assessments in any given case, or reject them. This conclusion follows directly from the Court's reasoning in Cavallo:
the expert evidence in parole determinations is offered . . . to establish whether there is substantial likelihood that the inmate will commit future crimes. . . . Unlike a jury trial, in the parole context the evidence is heard by the Parole Board, which routinely makes predictions of the sort required and which has substantial experience evaluating psychiatric testimony. Therefore, such testing can be of reasonable assistance to the factfinder there without the concomitant risk of serious confusion, prejudice and diversion of attention that is likely to result from its admission in a jury trial. [Cavallo, 88 N.J. at 525-26.]
We find R.S.'s N.J.R.E. 403 argument that the actuarials' prejudicial effect outweighs their probative value in this non-jury proceeding unconvincing.
Similarly unavailing is R.S.'s contention that the actuarial instruments cannot be accepted in the scientific community because they were not developed in accordance with the rules of ethics followed by psychologists. The fact that the development and use of the actuarial instruments does not conform with the American Psychological Association's Ethical Principles of Psychologists and Code of Conduct and the American Educational Research Association's Standards for Educational and Psychological Testing was a main thrust of Otto's testimony. Judge Freedman considered this testimony, but rejected it, agreeing with the State's experts and Berlin that the actuarials are not psychological tests and hence not subject to the association's standards for psychological testing. This conclusion is amply supported by the evidence. Actuarial instruments do not measure psychological constructs such as personality or intelligence. In fact, they do not measure any personal attributes of the particular sex offender at all. Rather, they are simply actuarial tables -- methods of organizing and interpreting a collection of historical data. The standards governing the development and use of psychological tests are not applicable here.
While "[p]roof of general acceptance within a scientific community can be elusive," Harvey, 151 N.J. at 171, the State has established that the actuarial instruments are reliable tools for help in predicting a sex offender's risk of reoffense. However, even if we had doubts that the instruments have "passe[d] from the experimental to the demonstrable stage", ibid., we think the State has demonstrated general acceptance through the other two prongs of the test set forth in Harvey.
The second aspect of the Harvey test provides that a proponent can establish general acceptance "by authoritative scientific and legal writings indicating that the scientific community accepts the premises underlying the proffered testimony." Id. at 170. At the hearing and on this appeal, the State has presented numerous articles and pamphlets concerning the development and use of actuarial assessment instruments which have appeared in such journals as Psychology, Public Policy and Law, The Journal of Psychiatry and Law, Law and Human Behavior, and The Family Law Quarterly. Although many of the articles are by the same author, Dr. R. Karl Hanson, of the Department of the Solicitor General of Canada, others are written by academics and forensic psychologists from around this country.
In a very recent article, not available to Judge Freedman, researchers report the results of a meta-analytic study comparing three actuarial instruments: the RRASOR, the SACJ-Min (Structured Anchored Clinical Judgment Minimum), and the Static 99. R. Karl Hanson and David Thornton, Improving Risk Assessments for Sex Offenders: A Comparison of Three Actuarial Scales, 24 Law & Hum. Behav. 119 (2000). The researchers found that for prediction of sex offense recidivism, the Static 99 is more accurate than either the RRASOR or the SACJ-Min. Id. at 127. Further, the study showed that all three tests exhibit similar predictive accuracy both for rapists and for child molesters. Ibid. The correlation coefficient for the Static 99 was given as 0.33, which was reported to represent moderate predictive accuracy. Id. at 129. While admitting that the correlation coefficient of 0.33 accounts for only about 10% of the variance, the authors noted that
[e]stimating absolute recidivism rates is a difficult task because many sex offenses go undetected. Observed recidivism rates (especially with short follow-up periods) are likely to substantially underestimate the actual recidivism rates. Nevertheless, Static 99 identified a substantial subsample of offenders (approximately 12%) whose observed sex offense recidivism rate was greater than 50%. At the other end, the scale identified another subsample whose observed recidivism rates was only 10% after 15 years. Differences of this magnitude should be of interest to many applied decision makers. [Id. at 130.]
Concerning the use of the Static 99 in sex offender risk assessments, the authors conclude that Static 99 does not claim to provide a comprehensive assessment for it neglects whole categories of potentially relevant variables (e.g., dynamic factors). Consequently, prudent evaluators would want to consider whether there are external factors that warrant adjusting the initial score or special features that limit the applicability of the scale (e.g., a debilitating disease or stated intentions to reoffend). Given the poor track record of clinical prediction, however, adjustments to actuarial predictions require strong justifications. In most cases, the optimal adjustment would be expected to be minor or none at all. [Id. at 132.]
Of interest, Dr. Otto, who testified at the evidentiary hearing for R.S., co-authored an article which states, "[w]hile an actuarial approach to assessment may be distasteful to the judiciary, which displays a preference for focusing on the individual case rather than class membership, this approach is the most accurate approach to psychological assessment." Randy K. Otto & James N. Butcher, Computer-Assisted Psychological Assessment in Child Custody Evaluations, 29 Fam. L. Q. 79, 87 (1995). This statement seemingly conflicts with testimony given by Otto at the hearing. R.S. contends that the article is irrelevant to the question of sex offender risk assessment and only serves to illustrate that Otto "harbors no bias against the use of actuarials in general." However, Otto's endorsement of actuarial assessment in family law custody matters does show his acceptance of the scientific principles underlying actuarial methods and his recognition that they can be useful in judicial proceedings.
In an article reporting the results of a meta-analysis of the merits of actuarial methods, two researchers from the University of Minnesota conclude that to use clinical judgment in preference to actuarial data when predicting such things as risk of recidivism "is not only unscientific and irrational, it is unethical." William M. Grove & Paul E. Meehl, Comparative Efficiency of Informal (Subjective, Impressionistic) and Formal (Mechanical, Algorithmic) Prediction Procedures: The Clinical-Statistical Controversy, 2 Psychol. Pub. Pol'y & L. 293, 320 (1996). These two researchers are ardent supporters of actuarial approaches in a wide variety of contexts. R.S. contends that Grove and Meehl provide no information concerning the reliability of the actuarial instruments challenged in this case. While it is true that the article does not discuss specific methods of sex offender risk assessment, it does strongly support the position that actuarial methods are superior to clinical judgment. In fact, it even implies that evaluations should be conducted based upon actuarial findings alone, unmodified by clinical adjustments.
Hanson, a leading authority in the field of sex offender assessment, compares the empirically-guided clinical approach to risk assessment with the pure actuarial approach and the adjusted actuarial approach and concludes:
[a]s research progresses, actuarial approaches are expected to substantially outperform the guided clinical approaches, but currently each approach has demonstrated roughly equivalent (moderate) predictive accuracy. In particular, each of these approaches can be expected to reliably identify a small subgroup of offenders with an enduring propensity to sexually reoffend. [R. Karl Hanson, What Do We Know About Sex Offender Risk Assessment? 4 Psychol. Pub. Pol'y & L. 50, 67 (1998).]
R.S. stresses that Hanson admits that even the best actuarial instruments are far from perfect and it would be imprudent for a clinician to automatically defer to them. However, Hanson has never claimed in any of his articles that actuarials are perfect, only that they are a marked improvement over unaided clinical judgment. The main point of Hanson's What Do We Know About Sex Offender Risk Assessment? is that actuarial instruments can provide a court with useful information, particularly with regard to offenders falling at either extreme of the risk scale.
R.S. submitted two articles in support of his contention that actuarial instruments are still experimental methods that are unreliable predictors of future dangerousness. In the first, the author reviews currently used assessment procedures and criticizes both the RRASOR and the MnSOST-R as being experimental procedures "that cannot support expert testimony in a legal proceeding." Terence W. Campbell, Sexual Predator Evaluations and Phrenology: Considering Issues of Evidentiary Reliability, 18 Behav. Sci. & L. 111, 123 (2000). Although Campbell's article makes numerous assertions of a legal nature, he does not support his legal conclusions with citations to relevant case law or statutes. The impression one gets from reading the article is that while the author has extensive experience testifying as an expert witness in commitment hearings, he has little experience actually developing or testing risk assessment techniques.
The second article is a well-balanced and easily understood review of current sex offender risk assessment techniques. Judith V. Becker and William D. Murphy, What We Know and Do Not Know About Assessing and Treating Sex Offenders, 4 Psychol. Pub. Pol'y & L. 116 (1998). The authors observe that "improvements in prediction have been made using actuarial methods" and "actuarial prediction continues to outperform clinical prediction." Id. at 124. The authors do note, however, that even though actuarials significantly improve prediction over pure chance, they still produce a number of false positives and negatives. Id. at 126. For that reason, despite the "significant advance," their application to present individual offenders is still problematic. Ibid.
While the number of articles presented by both parties in this matter is not vast, our Court has never required a specific number of articles to satisfy the test of general acceptance. Harvey, 151 N.J. at 174. Rather, the focus should be on "whether existing literature reveals a consensus of acceptance regarding a technology." Ibid. "Further, 'under appropriate circumstances, speeches, addresses, and other similar sources may be used to demonstrate the acceptance of a premise by the scientific community.'" Ibid., quoting State v. Kelly, 97 N.J. 178, 211 n.17 (1984).
Our Supreme Court observed, in its most recent Megan's Law registration case,
More recent studies continue to demonstrate that static factors, such as offense history, are better predictors of long-term recidivism. R. Karl Hanson & Andrew J.R. Harris, Where Should We Intervene? 27 Criminal Justice & Behavior 6 (2000) (citing R. Karl Hanson & Monique T. Bussiere, Predicting Relapse: A Meta-Analysis of Sexual Recidivism Studies, 66 Journal of Consulting & Clinical Psychology 348-362 (1998)). [Matter of Registrant J.M., __ N.J. __ (2001) (slip op. at 16).]
In sum, the State has produced articles which endorse the use of actuarial instruments to predict the future dangerousness of sex offenders. All of these articles are serious, scientific efforts to tackle the elusive problem of looking inside the human mind and emotions to predict the future. In addition, uncontested evidence at the hearing indicates that numerous workshops are held throughout the country on the subject of risk assessment each year and that a great deal of material is available over the Internet concerning the use of actuarial instruments for sex offender evaluations. This evidence, taken together, establishes that actuarial instruments are an accepted and advancing method of helping to assess the risk of recidivism among sex offenders. Thus, we find the second prong of the Harvey test has been established.
Finally, a proponent may prove the general acceptance of novel scientific technology "by judicial opinions that indicate the expert's premises have gained general acceptance." Harvey, 151 N.J. at 170. There are reported opinions from other jurisdictions in which courts have accepted the results of actuarial risk assessments into evidence. Actuarial assessments have been used at sexual predator commitment hearings in California, Washington, Wisconsin, Minnesota, Florida and Illinois. In People v. Poe, 88 Cal. Rptr. 2d 437, 440-41 (Ct. App. 1999), the court found that a RRASOR score in the high risk category adjusted with appropriate clinical factors supported a finding that the defendant was likely to engage in sexually violent behavior if released. See also People v. Otto, 95 Cal. Rptr. 2d 236, 241-42 (Ct. App.), review granted, 6 P.3d 149 (Cal. 2000) (results of RRASOR admitted into evidence at SVP trial). In Garcetti v. Superior Court, 102 Cal. Rptr. 2d 214, 239 (Ct. App.), review granted, (March 21, 2001), the court held that the Static 99, "a psychological instrument that uses an actuarial method to produce a profile of a person's likelihood of reoffense with an accuracy rate of over 70 percent, and that is supplemented or adjusted by use of clinical factors, can form the basis for an expert opinion on future dangerousness."
In In re Linehan, 557 N.W. 2d 171, 189 (Minn. 1996), vacated on other grounds, 522 U.S. 1011, 118 S. Ct. 596, 139 L. Ed. 2d 486 (1997), an individual committed under Minnesota's SVP law challenged his commitment because the State's expert had failed to use actuarial methods in his risk assessment. The civilly committed appellant argued that by failing to perform actuarial analysis, the State had ignored "state of the art" evidence and the "best available scientific knowledge and methodology." Ibid. The court rejected this argument, noting that the state's expert in fact did rely on base rate statistics in arriving at his recommendations. Ibid. The Minnesota court also found that enhanced accuracy can be achieved by combining actuarial methods with clinical judgment. Ibid.
In In re Detention of Campbell, 986 P. 2d 771, 779 (Wash. 1999), cert. denied, ___ U.S. ___, 121 S. Ct. 880, ___ L. Ed. 2d ___ (2001), the court summarily rejected a challenge to expert actuarial testimony at an SVP commitment trial, saying simply that predictions of future dangerousness are admissible and differences in opinion go to the weight of the evidence and not its admissibility. The dissent took the majority to task, however, stating that the real issue in the case was "whether the psychiatric community has accepted the reliability of either the clinical or actuarial method to predict dangerousness." Id. at 787 (Sanders, J., dissenting). After discussing the scientific literature concerning actuarial methods, Judge Sanders concluded that "[s]ince neither the clinical nor the actuarial method to predict the likelihood of reoffense has gained general acceptance in the psychiatric community, the Frye standard has not been met. To hold otherwise would be to allow preference for result to dictate the boundaries of science." Id. at 788. Actuarial instruments continue to be used at SVP hearings in Washington. See, e.g., In re Detention of Thorell, 99 Wash. App. 1041, 2000 WL 222815 (Ct. App. 2000) (unpublished opinion, although cited as 99 Wash. App. 1041, because the state reporter lists the case only as an affirmance). In Thorell, the state introduced several actuarial instruments, including the RRASOR, at an SVP hearing.
Cases which have accepted the use of actuarial instruments without comment include Pedroza v. Florida, 773 So. 2d 639 (Fla. Dist. Ct. App. 2000) (RRASOR, MnSOST-R); In re Detention of Walker, 731 N.E. 2d 994 (Ill. App. Ct. 2000) (RRASOR); and In re Commitment of Kienitz, 597 N.W. 2d 712 (Wis. 1999) (five unnamed actuarial instruments); cf. Westerheide v. State, 767 So. 2d 637, 657-60 (Fla. Dist. Ct. App. 2000), review granted, (January 23, 2001). Our research has revealed no state appellate court decision which has found actuarial instruments inadmissible at SVP proceedings. R.S. also submits two trial court orders refusing to admit the results of actuarial risk assessments into evidence. In Florida v. Klein, Case No. 05-1999-CF-08148 (Fla. Cir. Ct. 2000), the Circuit (trial) Court held, noting in its order that it disagreed with other Circuit Court rulings on the point, that the RRASOR and the MnSOST-R are not reliable predictors of risk of reoffense. Id. at 3. Although the Klein court noted that actuarial methods, in general, are based upon sound scientific principles, it concluded that the RRASOR and the MnSOST-R have not been peer reviewed, do not have instruction manuals, and are not true psychological tests. It is clear from the decision that the testimony of Dr. Berlin, who testified for R.S. at the evidentiary hearing, had a major influence on the Klein court's decision. Ibid. In In re the Detention of Harold Johnson, No. LACV038974 (Iowa Dist. Co. Ct. 2000), the court held that the RRASOR, the MnSOST, the MnSOST-R and the Static 99 are not sufficiently reliable to present to a jury. The court based this holding on the facts that actuarials are still experimental, there has been no peer review of the underlying data, the developers themselves express reservations about the accuracy of the instruments, and there is a lack of experimental data to support the instruments' predictions.
Also, the Iowa trial court believed that the results of the assessments would have an exaggerated impact on the jury. In both Klein and Johnson, the trial courts evaluated the admissibility of the actuarial instruments under the Frye standard. (Our research has revealed no subsequent history for either of these cases.) With the exception of Judge Sanders' dissent in Campbell, no appellate court has articulately considered the Frye standard when upholding the introduction of actuarial instruments at SVP hearings. The reasons given for accepting actuarials have varied from none at all to analyses of the role of experts under Barefoot v. Estelle, 463 U.S. 880, 103 S. Ct. 3383, 77 L. Ed. 2d 1090 (1983). Even though other appellate courts have not specifically and articulately approved actuarials under Frye, we find it more significant that they actually have accepted these instruments as reliable and helpful evidence in sexual predator proceedings. We conclude the third prong of the Harvey test has been met.
The State has established that actuarial instruments are generally accepted by mental health professionals who practice in the field of sex offender risk assessment; that there is support in scientific literature, at workshops and on the Internet for the use of these instruments; and that actuarial instruments have been accepted by the courts of at least six other states. We affirm Judge Freedman's conclusion that actuarial instruments satisfy the Frye test and are admissible for consideration by the State's experts in this situation.
We affirm the judgment finding by clear and convincing evidence that R.S. suffers from a personality disorder with a propensity to engage in acts of sexual violence and poses a threat to the health and safety of the community within the meaning of N.J.S.A. 30:4-27.26.