Is the science about transgenderism settled?

Transgender activists and their allies claim that it is. For example, last year the New York Times published an op-ed by Nathaniel Frank, director of the What We Know Project at Cornell University, who declared: “Our findings make it indisputable that gender transition has a positive effect on transgender well-being.”

The basis for this assertion was a literature review of fifty-two studies that “directly assessed the effect of gender transition on the mental well-being of transgender individuals.” Frank wrote that the “vast majority of the studies, 93 percent, found that gender transition improved the overall well-being of transgender subjects,” while the rest showed null or mixed results. He claimed that regret is rare, and downplayed the limitations of the research, insisting that “the quality and quantity of research on gender transition are robust, showing unmistakably that it’s highly effective.” That would seem to settle the matter.

But the research does not support these claims. Examining the fifty-two studies that are cited as overwhelming evidence for the efficacy of transition reveals a complex picture that challenges the current transgender agenda. To borrow an analogy, the What We Know Project is trying to show us a brick wall of evidence—solid scientific study stacked on solid scientific study. But a closer look reveals a lot of cardboard painted to look like bricks.

First, two of the studies have to be excluded. One is out because it is an attempt to mathematically model the cost/benefit of gender transition, rather than providing original research on the effects of transition. The second inadmissible entry is a Brazilian study that the project tries to count twice. Even careful scholars make mistakes, but this double-counting has apparently persisted unnoticed for over a year. Both Cornell and the New York Times overlooked this error, as did everyone who linked to the project as a definitive answer to debate over transition and gender dysphoria. Perhaps having a stack of studies to brandish was more important than assessing or even counting them accurately.

The Five Largest Studies

Surveying the remaining fifty papers reveals significant limitations. The most obvious problem is the low number of subjects in most of the studies. All else being equal, larger samples yield more reliable results; there are even online calculators that help researchers and pollsters determine how many subjects they need.
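To make the point about sample size concrete, here is a minimal sketch of the textbook formula that such calculators commonly use for estimating a proportion. It assumes a simple random sample, a 95 percent confidence level, and the most conservative proportion of 0.5; the function names and the sample sizes in the loop are illustrative choices, not figures drawn from any of the studies. Self-selected surveys do not meet the random-sampling assumption, so their real uncertainty is likely larger than these numbers suggest.

```python
import math

Z_95 = 1.96  # z-score for a 95 percent confidence level (an assumption)


def required_sample_size(margin_of_error, proportion=0.5, z=Z_95):
    """Subjects needed to estimate a proportion within +/- margin_of_error.

    Assumes a simple random sample from a large population.
    """
    return math.ceil(z**2 * proportion * (1 - proportion) / margin_of_error**2)


def margin_of_error(sample_size, proportion=0.5, z=Z_95):
    """Half-width of the confidence interval for a given sample size."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


if __name__ == "__main__":
    # Subjects needed for a +/- 5 point margin at 95 percent confidence.
    print("needed for +/-5 points:", required_sample_size(0.05))

    # Illustrative sample sizes only; not taken from any cited study.
    for n in (25, 100, 300):
        print(n, "subjects -> +/-", round(margin_of_error(n) * 100, 1), "points")
```

Because the margin of error shrinks only with the square root of the sample size, quadrupling the number of subjects merely halves the uncertainty.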

Of the fifty relevant papers identified by the project, only five studies (10 percent) had more than 300 subjects, while twenty-six studies (52 percent) had fewer than 100. Seventeen studies (34 percent) had fifty or fewer subjects, and five of those had a sample size of twenty-five or less. Smaller studies can still be useful, but when a paper’s findings are presented as authoritative, it matters whether it had a sample size of 2800, 280, or twenty-eight. Unfortunately, the creators of the What We Know Project made no effort to distinguish between study types and sizes. Thus, among the studies touted as providing overwhelming scientific evidence for the efficacy of transition were a qualitative study based on interviews with eighteen subjects and a study of twenty-two trans-women (who were compared to twenty-two women as controls) that examined the utility of occupational therapy in transition. The “mounds of scholarly studies” that Frank cites turn out to include a lot of academic molehills.

And it gets worse.

The five largest studies were methodologically weak. The “2014 British study” that Frank cited in his New York Times piece was “a narrative analysis of qualitative sections of a survey” from 2012 that was hosted online by SurveyMonkey and promoted throughout the UK by LGBT groups and support organizations. A “narrative analysis” of parts of a self-selecting online survey disseminated by LGBT advocates is not scientifically dispositive, even if the New York Times permitted it to be presented as such.

Another large study recruited subjects for its online survey by advertising “on online groups and discussion forums that were dedicated to FTM [female-to-male] members. . . . Upon survey completion, participants were entered into a lottery drawing for cash prizes.” An Internet survey was also used by a study that recruited subjects online and via flyers and postcards in the San Francisco area, though in that case participants only “received a discount coupon redeemable at an Internet store.” Yet another study consisted of a “1-time self-report survey” completed by a “community sample of 573 transgender women with a history of sex work” who “received financial compensation for their time.” These surveys may provide us with some data, but they are not reliable or representative scientific evidence for the efficacy of transition.

The most rigorous of the five largest studies cited by the What We Know Project was a Swedish review of fifty years of applications for sex reassignment surgery. Its relevant finding was that, “The regret rate defined as application for reversal of the legal gender status among those who were sex reassigned was 2.2 percent for the whole period 1960–2010.” Transgender activists often cite results of this sort, but they are ignoring several difficulties.

The first problem is that this methodology probably undercounts the regret rate, as its definition of regret overlooks those who were unhappy with their transition but did not apply to reverse it. It would not count those who succumbed to depression or addiction, or who lived unhappily after transition. Nor does the What We Know Project note that a related study by some of the same researchers showed a horrifyingly high rate of suicide among its post-surgery subjects: nineteen times that of the general population. Finally, this data is drawn from a population with strict pre-transition screening, and the results likely do not apply where transition is less regulated. It is dangerous to assume that the regret rate of rigorously screened Swedish adults will apply to poorly screened American adolescents.

In sum, in a set of studies notable for small sample sizes, the five largest all have significant weaknesses, especially in relying on self-selecting subjects taking one-time, online surveys.

Small Studies, Serious Flaws

Many of the smaller studies had similar problems. An Australian effort had self-selected subjects complete a short, online survey, and an American study recruited participants “at San Francisco Bay Area transgender community events, as well as outreach through transgender-focused email listservs and websites advertising the study.” A Canadian study relied on “respondent-driven sampling” and offered small incentives for completing the survey and recruiting more respondents. These self-selecting, self-reported, one-time surveys are not worthless, but they are more anecdotal than authoritative. They lack the reliability of representative, controlled scientific studies, and the What We Know Project seems to have intentionally elided the difference.

Nor are these the only studies with problems. Many of the more clinical studies that sought to measure the effects of transition on well-being not only were small, non-representative, and self-reported; they also lacked controls. Furthermore, their results were often more ambiguous than the What We Know Project acknowledged. Some, such as one Brazilian study, showed mixed results, with some areas of well-being improving after transition and others deteriorating. Others, such as one Italian study, included results that complicate the transgender narrative; this research found that trans-women reported lower “body uneasiness” when using cross-sex hormones, but also that “No significant differences were observed between” those who did and did not use cross-sex hormones “in the female-to-male (FtM) sample.” Accepting both halves of this study would discredit the project’s narrative as much as support it, so its architects chose to promote only the result they wanted.

Some research also indicated higher rates of desistence from transition than transgender advocates usually acknowledge. In one Finnish study, for example, of the eighty-eight patients who completed a survey, only thirty-two had completed transition. Most of those who had not completed transition still planned to, but six had been advised to discontinue transition, five had chosen to discontinue transition, and three reported regretting the transition steps they had undertaken. Similarly, in a Dutch study of “325 consecutive adolescent and adult applicants for sex reassignment, . . . 222 started hormone treatment, 103 did not; 188 completed and 34 dropped out of treatment.” This study found that those who completed transition reportedly functioned well and had low rates of regret. However, the average time between surgery and follow-up was less than two years, and another study found that a longer time frame showed worse outcomes. Also, the high number of applicants who did not go through with transition raises questions about how to screen those seeking transition. Even among those actively seeking transition, many have second thoughts.

Some of the more positive studies had further complicating factors. For instance, a pair of studies had to qualify their results by noting that “patients were recruited from a specialized gender unit in Italy where the care pathway provides continuous psychological support. We can’t exclude a positive effect of psychological treatment.” The design of these studies made it impossible to differentiate between the effects of therapy and transition, but the project counted them as unequivocal evidence for the benefits of transition.

More Long-Term Follow-Up Is Needed

A further difficulty with the project’s narrative is that findings associating transition with immediate improvements in perceived well-being do not necessarily contradict the accounts of many persons who later came to regret transition. Those who regret transition often report that they initially felt better; only as time passed did they realize that transition had not resolved underlying problems. Furthermore, reliance on self-reported metrics is a poor substitute for objective evaluations; it is a commonplace that those who are struggling often insist that they are just fine.

Thus, there is a need for longitudinal studies to assess the medium-to-long-term effects of transition on well-being. Unfortunately, many of the longitudinal studies touted by the What We Know Project are crippled by low response rates as well as self-reported data. For instance, a European study of patients who had undergone surgical transition had only sixty-two of 107 patients participate (58 percent), while thirty could not be reached and fifteen refused. Another European study had 201 out of 546 respond—just 37 percent. In addition to these poor response rates, some follow-up studies were also extremely small, such as a sample of twelve FtM patients (out of seventeen) who completed a survey after breast reduction surgery.

It is possible that those who cooperate with long-term studies are representative of those who cannot be found or refuse to participate. It is even possible that non-participants are doing better than those who take part in follow-up studies. But it is also possible that the pool of non-participants is where bad outcomes are most likely to be found: not just those with regrets, but also the depressed, the addicted, the homeless, and the dead. It is reasonable to suppose (and some research suggests) that those with worse outcomes would be unreachable or uncooperative. Of course, this remains speculative so long as so much data is missing. What is clear is that it is reckless to ignore the low response rate of longitudinal studies when promoting them.
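The arithmetic behind this worry can be sketched in a few lines. The response counts below are the ones reported above (62 of 107 and 201 of 546). The 5 percent rate of bad outcomes among respondents is a purely hypothetical figure chosen for illustration, and the function names are likewise illustrative. The sketch makes no assumption about the missing patients; it only computes the lowest and highest rates that would be consistent with the respondents' data.

```python
def response_rate(respondents, total):
    """Share of the original patient group that took part in follow-up."""
    return respondents / total


def outcome_bounds(respondents, total, observed_rate):
    """Lowest and highest possible rate of an outcome in the whole group.

    The low bound assumes no non-respondent had the outcome;
    the high bound assumes every non-respondent did.
    """
    observed_cases = observed_rate * respondents
    missing = total - respondents
    low = observed_cases / total
    high = (observed_cases + missing) / total
    return low, high


if __name__ == "__main__":
    HYPOTHETICAL_RATE = 0.05  # illustration only; not reported by any study

    # Response counts as reported for the two European follow-up studies.
    for respondents, total in ((62, 107), (201, 546)):
        rate = response_rate(respondents, total)
        low, high = outcome_bounds(respondents, total, HYPOTHETICAL_RATE)
        print(f"{respondents}/{total}: response rate {rate:.0%}, "
              f"possible outcome rate {low:.1%} to {high:.1%}")
```

The width of the range depends almost entirely on how many patients are missing, which is why a low response rate leaves a follow-up study unable to rule out a high rate of bad outcomes.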

Activism, Not Scholarship

The What We Know Project seeks to overwhelm with numbers: fifty scientific studies! But many of these studies have serious methodological flaws. Examining the studies shatters the illusion of overwhelming scientific evidence in favor of transition. The most obvious research difficulty is the small sample sizes. Excluding studies with fifty or fewer subjects would leave thirty-three papers; removing studies with fewer than 100 subjects would leave only twenty-four. Furthermore, the largest studies had some of the weakest methodologies.

Overall, the research is more limited than the hype surrounding it would suggest, and reveals a much more complicated reality than the simple pro-transition narrative. Even the Obama administration came to similar conclusions about the poor quality of much of the research on the efficacy of transition.

Instead of overwhelming quantitative evidence that transition is the best treatment for gender dysphoria, many of the studies provide a qualitative sample of trans-identified persons who report that transition was beneficial for them. This is important information, but it is not definitive. In general, the studies themselves faithfully reported their methodology and were often frank about their weaknesses. Rather, it was those at the What We Know Project who chose to present these studies as dispositive proof and to ignore or downplay their faults and limitations. This was activism, not scholarship.

When it comes to transgender issues, there is serious cause for concern that the scientific process is being corrupted by political and social pressure. For example, the groundbreaking qualitative study that identified the phenomenon of rapid onset gender dysphoria was subject to aggressive criticism and the author lost a consulting job even though her work was vindicated. Faced with the prospect of such backlash, some researchers may choose not to publish results that might upset transgender activists.

The identification of rapid onset gender dysphoria illustrates why research should not be subjected to such pressure, especially when considered alongside the multiple studies that noted the importance of careful screening before transition in order to minimize the risk of regret. It is disingenuous to pretend that reported regret rates from countries where transition is carefully screened can be applied to nations where it is not.

Even though many transition procedures are irreversible, American activists insist that transition is the only treatment for gender dysphoria, that it should begin immediately, even with children, and that therapeutic efforts to resolve gender dysphoria by reconciling patients to their natural, biological bodies should be outlawed. In their view, any manifestation of gender dysphoria indicates an immutable transgender identity that requires transition.

This radical agenda is simply not supported by the evidence. The studies assembled by the What We Know Project do not prove that transition is the best treatment for gender dysphoria, let alone that it should be the only permissible treatment. Rather, they show that the science is not settled. When it comes to helping those who suffer from gender dysphoria, there is much more work to be done.