A Critical Review of “Gender Mix and Team Performance: Evidence from Obstetrics”

A new study posted by the National Bureau of Economic Research (NBER) claims to find that all-female teams of obstetricians cause fewer maternal complications during delivery than all-male or mixed-gender teams. The theory is that gender norms make female obstetricians better at communicating and collaborating, which improves health outcomes.

The authors, Ambar La Forgia and Manasvini Singh, draw their conclusion from an analysis of medical records for all births delivered in Florida hospitals between 2006 and 2018. They note that birth records contain the name of the “attending” physician as well as that of the “operating” physician. In 77% of the 2.5 million records they have, the same person is listed as the attending and operating physician. But 23% of the time different names are listed.

La Forgia and Singh focus their analysis on the subset of births with two different doctors listed so that they can compare the rates of maternal complications across the gender combinations of the two doctors. They claim that the rate of maternal complications is lowest when both the lead (as they relabel “attending”) and the assisting (as they relabel “operating”) physician are female. Having both the lead and assisting obstetricians as male results in the highest rate of maternal complications. Having a male lead and a female assistant is a little better than having a female lead and a male assistant, but both produce more complications than all-female teams and fewer than all-male teams.

The authors claim that the gender composition of obstetric (OB) teams is “quasi-random,” so the results should be treated as if they were the causal findings of a randomized experiment. To buttress that claim, they observe that the recorded prior health conditions of patients do not differ across different gender compositions. They also point to the quirks of hospital scheduling and the gender of who happens to be on call as approximating a true experiment.

The study looks impressive, with over half a million observations and 74 pages filled with tables and figures to address anticipated objections. And at first blush their findings seem alarming, claiming that “severe maternal complications are 15.8% higher in male-only teams and 7.1 – 10.8% higher in mixed-gender teams compared to female-only teams.” Closer examination reveals that even if their analysis is completely correct, all-female teams of obstetricians would reduce the rate of having at least one maternal complication by only 0.18 percentage points, from an average of 2.61% for all cases to 2.43% for those served by all-female teams. In addition, their model estimating the effect of the gender composition of obstetric teams, controlling for hospital and year fixed effects, explains only 0.7% of the variance in outcomes. Since some of that variance is explained by the year or hospital in which delivery took place, gender composition accounts for only a fraction of that 0.7%. Put differently, at least 99.3% of the variance in maternal complications is not explained by gender composition. These are tiny and weak effects that are statistically significant only because the analysis examined over half a million observations.
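The gap between 2.61% and 2.43% is a difference between two rates, so it can be stated either in percentage points or as a share of the baseline. A minimal sketch of that arithmetic follows. The function and variable names are illustrative, and the relative figure it prints compares all-female teams with the overall average, which is not the same comparison as the authors' headline figures against female-only teams.

```python
# Rates quoted in the review, in percent of deliveries with at least
# one maternal complication.
overall_rate = 2.61      # average across all two-doctor cases
all_female_rate = 2.43   # cases served by all-female teams


def absolute_gap(baseline: float, comparison: float) -> float:
    """Difference between two rates, in percentage points."""
    return baseline - comparison


def relative_gap(baseline: float, comparison: float) -> float:
    """Difference between two rates, as a percent of the baseline rate."""
    return (baseline - comparison) / baseline * 100


abs_gap = absolute_gap(overall_rate, all_female_rate)
rel_gap = relative_gap(overall_rate, all_female_rate)

print(f"absolute gap: {abs_gap:.2f} percentage points")
print(f"relative gap: {rel_gap:.1f}% of the baseline rate")
```

The same underlying difference looks much larger when expressed relative to a small baseline, which is why the two framings should be kept distinct.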

But there are serious flaws in their analysis that should make us doubt even these tiny effects. The rest of this review presents concerns with the validity of their analysis and the credibility of their conclusions.

Biased Selection of Cases into the Analysis

The main difficulty with the La Forgia and Singh study is that they examine outcomes only for the minority subset of cases in which two obstetricians are listed on the birth record. Whether a birth record has one or two doctors listed is clearly not random. In fact, it is highly likely that having two doctors listed is correlated both with having a female doctor and with the rate of maternal complications, for reasons that have nothing to do with communication and collaboration within all-female teams. Consequently, the authors introduce a significant bias into their analysis. More specifically, the two-doctor cases selected into the study are more likely to be typical deliveries with low rates of maternal complications if one of those doctors is female.

Let’s walk through each step of this argument. Having two doctors on the birth record in Florida does not necessarily mean that two obstetricians were both present and working together to deliver the baby. The authors acknowledge that in some cases two doctors actively collaborate to manage a complicated delivery, but in other cases “the Lead physician monitored the patient up until delivery, but the Assisting physician performed the delivery because of a scheduling change or the Lead attending another birth.” While both scenarios involve communication and collaboration, the nature of that communication is quite different: one entails a delivery that is likely foreseen to be complicated (hence the “assist,” with both doctors present for the delivery), while the other entails a handoff (with only the “assisting” physician present for the delivery).

It is almost certainly the case that female obstetricians are more likely to be involved in the cases with two doctors that result from a shift change as opposed to an anticipated complicated delivery with one doctor supporting another. This is likely because, on average, female obstetricians tend to work fewer hours than their male counterparts, which naturally leads to more patient handoffs to another doctor for delivery.

A survey of 3,698 OBGYN doctors conducted by the American College of Obstetricians and Gynecologists found that female respondents “worked 10% fewer hours, saw 9% fewer patients, and performed 21% fewer procedures.” Another survey of 541 obstetrician-gynecologists similarly found 22.1% of women reported working 60 or more hours per week compared to 31.5% of male OBGYN doctors. Even after adjusting for age differences, this study found that “women worked 4.1 fewer hours per week than men.”

We find confirmation of this pattern within the results that La Forgia and Singh report. In Table A.1 we see that female obstetricians in their data set have 21.6% fewer deliveries per year than do male obstetricians (157.17 versus 200.49). If female obstetricians work shorter hours, on average, then they will more frequently be involved in handing off patients because of shift changes, which means that a higher percentage of cases involving female OBs will have two doctors listed on the birth record than cases involving male OBs.
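The 21.6% figure can be checked directly from the two Table A.1 averages quoted above. The short sketch below does only that arithmetic; the variable names are illustrative.

```python
# Average deliveries per year, as quoted from Table A.1.
male_per_year = 200.49
female_per_year = 157.17

# Shortfall of the female average, expressed as a fraction of the male average.
shortfall = (male_per_year - female_per_year) / male_per_year

print(f"female OBs deliver {shortfall:.1%} fewer babies per year")
```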

It is also important to note that deliveries that take longer result, on average, in a higher rate of maternal complications. As an analysis of over 50,000 deliveries in Scotland found, “As the duration of second stage of labour increased each hour, the risk of obstetric anal sphincter injuries, episiotomies and PPH [postpartum hemorrhage] increases significantly. Women were over 2 times more likely to have a forceps or caesarean birth.” This analysis confirms the common guidance found on medical web sites, like this warning from the Cleveland Clinic about the increased risks of complications: “Prolonged labor increases your chances of needing a different type of delivery. For example, your healthcare provider may need to use medical instruments, like a vacuum or forceps, to help deliver your baby. Prolonged labor also increases your chances of having a C-section.”

The association between length of labor and the rate of complications means that routine deliveries that take less time and result in fewer complications are more likely to be handled by a single doctor if that doctor is male. But those same routine deliveries without complications are more likely to have two doctors listed on the birth record if the obstetrician is female, because female obstetricians tend to work shorter hours and would be more likely to hand off those deliveries to another doctor at a shift change.

The net effect of this is more deliveries without complications involving male obstetricians are handled by a single doctor and excluded from La Forgia and Singh’s analysis. At the same time uncomplicated deliveries involving female obstetricians are more likely to be handled by two doctors and therefore included in the La Forgia and Singh analysis. Their data set ends up disproportionately stocked with routine deliveries involving female obstetricians — the exact kind of low-risk cases that are less likely to appear for male physicians because they work longer hours and have relatively fewer handoffs. Rather than finding that teams of obstetricians involving females reduce maternal complications because of their advantages in communication and collaboration, they are really just drawing conclusions from a biased data set.
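The selection mechanism described above can be illustrated with a small expected-value calculation. This is a sketch under stated assumptions, not an estimate from the Florida data: every parameter value is hypothetical, complication risk depends only on whether a delivery is routine or complicated, and the only thing that differs by doctor gender is how often routine deliveries are handed off.

```python
def two_doctor_complication_rate(
    handoff_rate: float,             # hypothetical: share of routine deliveries handed off
    share_complicated: float = 0.10, # hypothetical: share of deliveries that are complicated
    assist_rate: float = 0.50,       # hypothetical: share of complicated deliveries with an assist
    p_routine: float = 0.02,         # hypothetical: complication risk, routine delivery
    p_complicated: float = 0.08,     # hypothetical: complication risk, complicated delivery
) -> float:
    """Expected complication rate among deliveries with two doctors listed.

    Doctor gender has no effect on risk in this model. It enters only
    through handoff_rate, which controls how many routine deliveries
    end up in the two-doctor sample.
    """
    # Routine deliveries enter the sample only through shift-change handoffs.
    routine_in = (1 - share_complicated) * handoff_rate
    # Complicated deliveries enter through a planned assist, at the same rate for everyone.
    complicated_in = share_complicated * assist_rate
    total_in = routine_in + complicated_in
    return (routine_in * p_routine + complicated_in * p_complicated) / total_in


# Hypothetical handoff rates: higher for the group working shorter hours.
female_rate = two_doctor_complication_rate(handoff_rate=0.15)
male_rate = two_doctor_complication_rate(handoff_rate=0.10)

print(f"two-doctor sample, more handoffs:  {female_rate:.2%}")
print(f"two-doctor sample, fewer handoffs: {male_rate:.2%}")
```

Under these made-up numbers the group with more handoffs shows a lower complication rate in the two-doctor sample even though, by construction, the doctors are equally skilled. Setting the two handoff rates equal makes the gap disappear.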

Despite filling 74 pages with analyses to address anticipated objections, La Forgia and Singh never present the obvious one needed to address the issues raised here. They never show whether female obstetricians share the billing with another doctor on the birth record at a higher rate than do male obstetricians. That is, they never show whether the data set they examine over-represents routine deliveries for female doctors while under-representing them for male doctors.

Using the limited information they do report, I am able to estimate the over-representation of deliveries involving female OBs in their two-doctor data set. According to Table A.1, the 1,010 female OBs included in the study are involved in 369,818 deliveries, or about 28.17 each per year over the 13 years included in the study. Table A.1 also discloses the average number of births per year for doctors involved in each gender combination. Taking a weighted average for female doctors yields 157.17 deliveries per year. Dividing 28.17 by 157.17, we see that about 17.9% of the cases involving female OBs have two doctors listed on the birth record. Making the same calculations for male OBs reveals that about 16.2% of their cases have two doctors listed on the birth record. Female OBs are over-represented in the data set La Forgia and Singh examine by about 9.3 percent.
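The female side of this estimate can be reproduced from the figures quoted above. The sketch below uses only those numbers; the variable names are illustrative, and the male-side inputs from Table A.1 are not quoted in this review, so that half of the calculation is left out rather than guessed.

```python
# Inputs quoted from Table A.1 for female OBs.
female_obs = 1_010                     # female obstetricians in the study
female_two_doctor_deliveries = 369_818 # their deliveries in the two-doctor data set
study_years = 13                       # 2006 through 2018
female_all_deliveries_per_year = 157.17  # weighted average, all deliveries

# Two-doctor deliveries per female OB per year.
two_doctor_per_year = female_two_doctor_deliveries / female_obs / study_years

# Share of a female OB's yearly deliveries that list two doctors.
female_share = two_doctor_per_year / female_all_deliveries_per_year

print(f"two-doctor deliveries per female OB per year: {two_doctor_per_year:.2f}")
print(f"share of female OB deliveries with two doctors: {female_share:.1%}")
```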

Even modest biases in the data set can yield the results they report. The advantage they claim for all-female teams is quite small. According to the “main specification” in Table 3, having an all-female team reduces the rate at which there will be at least one maternal complication by 0.18 percentage points. This tiny effect is statistically significant only because they have over half a million cases in their analysis. It is also worth noting that the gender composition of the obstetric team explains less than 0.7% of the variance in outcomes. They have a weak finding that is easily distorted by the over-representation of routine deliveries involving female obstetricians in their analysis.

Incomplete Medical Record When Controlling for Risk Factors

Another concern is whether all-female teams of obstetricians tend to have patients who are at lower risk of developing complications. It is true that they have medical records that could contain information on “23 patient risk factors that are predictive of maternal complications.” And when they use that information to model the expected rate of complications, they find no difference across the four possible gender compositions of the doctor-pairs that they examine (See Figure 1, Panel A).

But as they acknowledge, “variations in coding behavior by physicians… could influence our ability to accurately assess the patient’s clinical risk.” They attempt to address this data limitation by also controlling for “the number of diagnosis codes recorded in the patient’s medical record” but that doesn’t really resolve the problem. Controlling for the number of diagnosis codes might correct for the fact that doctors inconsistently code problems that are observed, but it does not correct for problems that are unobserved and therefore entirely absent from the medical record.

If patients have not been receiving regular medical care prior to delivery, the doctors may only record the risk factors that the patient discloses at the time of admission. For example, the 23 risk factors include things like “substance abuse or smoking,” “known or suspected fetal abnormalities,” “hypertension,” “spotting complicating pregnancy,” and “excessive weight gain during pregnancy.” These are the kinds of risk factors that may not be entered into the medical record if the patient has not had these issues identified during prenatal care or fails to disclose them when admitted for delivery. The absence of information would be treated the same as not having that risk factor.

According to the March of Dimes, 17.3% of expecting women only begin to receive prenatal care in the second trimester and 7.3% first receive it later or not at all. These women are less likely to have complete medical records that would fully capture their risk factors. These same women are less likely to be active choosers of the obstetric practice they would prefer for delivery.

There are also all-female OBGYN practices that recruit patients by extolling the benefits of having female doctors. For example, a fairly typical practice in Massachusetts emphasizes on its web site: “Women-led OB/GYN practices like Essex County OB/GYN prioritize patient-centered care, focusing on the unique needs, preferences, and lifestyles of each individual. This approach means your voice matters in every decision made, and your care is tailored to suit your specific circumstances.”

More advantaged women with more complete medical records are more likely to be drawn to all-female practices like this even if only because they are more likely to be active choosers of who they want to help with their delivery. Disadvantaged women who lack prenatal care and have incomplete medical records are more likely to be assigned to whoever is on call at the hospital, including more male obstetricians. The difference in the rate of maternal complications can be driven by unobserved differences in risk factors rather than superior communication and collaboration skills among female obstetricians.

Do All-Female Teams Have Intersectional Benefits?

Further evidence to support this suspicion can be found in La Forgia and Singh’s claim that “female-only teams not only achieve the lowest complication rates for Black women, but are also the only team type to have no racial disparity in maternal outcomes.” That is, La Forgia and Singh claim that gender combinations of teams other than all-female ones not only produce higher average rates of maternal complications, but they do even worse with their black patients than with their white ones.

They have no theory to explain why female teams of obstetricians would have particular advantages with respect to black patients. The theory they are advancing is that female teams are better at collaboration and communication, but it is unclear why this should be more important in preventing complications among black patients than among non-black patients other than as a result of being better positioned to manage the additional risk factors that black patients, on average, may bring to delivery.

The patient’s race, like the presence of hypertension, fetal abnormalities, or substance abuse, is among the pre-existing qualities that women bring to delivery. La Forgia and Singh claim to have controlled for all relevant pre-existing issues to make their claim that the communication and collaboration advantages of female teams during labor are what cause lower rates of complications.

Rather than demonstrating an additional benefit of all-female teams, their finding of differential outcomes for black patients suggests that their analysis suffers from incomplete information about risk factors that are more often found among black women and are also correlated with whether patients have all-female teams. Rather than being evidence of a theoretically unlikely intersectional benefit, this result is evidence of serious omitted variable bias.

What About the Baby?

Even if the La Forgia and Singh study were correct in finding a lower rate of maternal complications for all-female doctor teams, their analysis would still be incomplete. There are often trade-offs between risking maternal complications and preventing serious complications for the newborn baby. When delivery stalls and the baby shows signs of distress, doctors have to consider interventions, including C-sections, that increase the rate of complications for the mother but may save the baby from worse complications. Recognizing that C-sections may be overused does not erase the existence of these trade-offs, nor does it mean that the optimal outcome for both mother and baby is the one that minimizes risks for the mother without considering those of the baby.

The Florida data set that La Forgia and Singh are using includes information about outcomes for the baby, but they do not report any analyses of those outcomes. Without considering how all-female teams may affect complications for the babies they deliver, it would be at least premature to conclude that they produce superior outcomes because of better communication and collaboration.

Conclusion

We have no evidence that La Forgia and Singh intended to produce a biased analysis or mislead their readers, but that is what they have in effect done. They have limited their analysis to a minority subset of cases in which two doctor names are listed on birth records without considering how those cases may be selected in a way that biases their result. They have controlled for observed medical conditions without considering how incomplete medical records likely bias their analysis. They have focused on maternal complications related to delivery without considering the possible tradeoffs between those complications and outcomes for the baby.

While these defects in their study may not be intentional, they do not occur at random. Researchers are tempted to overlook or downplay problems when they have strong ideological priors about the results they expect to find. The danger of bias resulting from ideological blinkers has historically been held in check by the enforcement of academic standards and the general adoption of skeptical norms. The process of peer review is further meant to hold ideological priors in check.

Unfortunately, NBER papers, despite being broadly respected and influential, do not go through any peer review process before being posted. And the check of peer review has weakened as more reviewers and editors share the same ideological preferences. The forum of “Researcher, Heal Thyself” is meant to address these shortcomings by offering critical reviews even if the researchers and their peers are unable to identify concerns on their own.
