Traffic Stops in Nashville

Hit-rate for Searches

Hit rates are an important way to investigate racial bias in search decisions. When a police officer decides to conduct a discretionary search, the decision rests on the officer's perceived likelihood of finding evidence of a crime. Probable cause searches, in particular, carry an evidentiary burden: the officer must meet the reasonable person standard set by the Supreme Court, which requires articulable facts and circumstances that a reasonable officer would believe indicate a crime. The hit rate, the share of searches that actually turn up evidence, shows how demanding that standard is in practice. If searches of a particular demographic group are more likely to turn up contraband, officers are applying a higher standard of evidence before searching that group. Conversely, if searches of a demographic group are less likely to turn up evidence, officers are searching that group on a lower standard of evidence.
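To make the comparison concrete, here is a minimal sketch of the hit-rate calculation. The two groups and all counts are invented purely for illustration and are not Nashville data.

```r
# Hypothetical counts, invented for illustration only (not Nashville data).
searches <- data.frame(
  group    = c("A", "B"),
  searched = c(200, 400),  # discretionary searches conducted
  hits     = c(60, 80)     # searches that turned up evidence
)

# Hit rate = share of searches that find evidence, as a percentage.
searches$hit_rate <- round(searches$hits / searches$searched * 100, 2)

print(searches)
```

In this made-up example, group B is searched twice as often but evidence is found less often per search, which is the pattern that would suggest a lower evidentiary threshold for group B.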

First, code some summary statistics

The following code takes all traffic stops from 2017, groups them by race and gender, removes the stops where gender is unknown, then calculates search and hit rates. Rates are presented as percentages.

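library(dplyr)

race.gender.stats <- coded2017 %>%
  group_by(Sex, race) %>%
  filter(Sex != "U") %>%   # drop stops where gender is unknown
  summarise(n = n(),       # number of stops in each race/gender group
            # Evidence found within each search type. These must be computed
            # before probablecause and consentsearch are overwritten below.
            pcevidence = sum(evidence[probablecause == 1], na.rm = TRUE),
            consentevidence = sum(evidence[consentsearch == 1], na.rm = TRUE),
            # Counts of searches and of searches that found something
            probablecause = sum(probablecause, na.rm = TRUE),
            consentsearch = sum(consentsearch, na.rm = TRUE),
            evidence = sum(evidence, na.rm = TRUE),
            drugs = sum(drugs, na.rm = TRUE),
            weapons = sum(weapons, na.rm = TRUE),
            allsearch = sum(search, na.rm = TRUE)) %>%
  # Search rates are per stop; hit rates are per search. Both are percentages.
  mutate(probcause.rate = round((probablecause / n) * 100, digits = 2),
         consent.rate = round((consentsearch / n) * 100, digits = 2),
         allsearch.rate = round((allsearch / n) * 100, digits = 2),
         allsearch.hitrate = round((evidence / allsearch) * 100, digits = 2),
         probcause.hitrate = round((pcevidence / probablecause) * 100, digits = 2),
         consent.hitrate = round((consentevidence / consentsearch) * 100, digits = 2),
         race.gender = paste(race, Sex, sep = ", "))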

Make some plots

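library(plotly)

# Bar chart of probable cause hit rates, one bar per race/gender group
p <- race.gender.stats %>% plot_ly(
  x = ~race.gender,
  y = ~probcause.hitrate,
  type = "bar",
  color = ~race
) %>%
  layout(title = "Hit Rate During Probable Cause Searches",
         xaxis = list(title = ""),
         yaxis = list(title = "Hit rate (%)"))
p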

It looks like police are most likely to find evidence when they search white drivers, which suggests that they apply a lower standard of evidence when deciding to search black drivers than when deciding to search white drivers.

Try a logistic regression

First with only race/gender categories

# Keep only stops that involved a probable cause search
pcdata <- subset(coded2017, probablecause == 1)
mod <- glm(evidence ~ black.male + black.female + hispanic.male + hispanic.female +
             otherrace.male + otherrace.female + white.female,
           data = pcdata, family = "binomial")
summary(mod)
## 
## Call:
## glm(formula = evidence ~ black.male + black.female + hispanic.male + 
##     hispanic.female + otherrace.male + otherrace.female + white.female, 
##     family = "binomial", data = pcdata)
## 
## Deviance Residuals: 
##    Min      1Q  Median      3Q     Max  
## -1.335  -1.159   1.028   1.196   1.626  
## 
## Coefficients:
##                  Estimate Std. Error z value Pr(>|z|)    
## (Intercept)       0.30427    0.07822   3.890 0.000100 ***
## black.male       -0.34856    0.09227  -3.778 0.000158 ***
## black.female     -0.56500    0.11684  -4.835 1.33e-06 ***
## hispanic.male    -0.10253    0.16583  -0.618 0.536383    
## hispanic.female  -1.31587    0.42021  -3.131 0.001739 ** 
## otherrace.male   -0.36881    0.36781  -1.003 0.316001    
## otherrace.female -0.70973    0.91622  -0.775 0.438555    
## white.female      0.05900    0.14188   0.416 0.677548    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 4754.3  on 3429  degrees of freedom
## Residual deviance: 4711.2  on 3422  degrees of freedom
##   (1 observation deleted due to missingness)
## AIC: 4727.2
## 
## Number of Fisher Scoring iterations: 4
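
The estimates above are on the log-odds scale, which is hard to read directly. The sketch below converts a few of them to odds ratios and implied probabilities. It assumes white male drivers are the omitted reference category, as the list of indicators implies; the numbers are copied from the output above rather than taken from the fitted object.

```r
# Estimates copied from the summary(mod) output above (log-odds scale).
b <- c(intercept    =  0.30427,
       black.male   = -0.34856,
       black.female = -0.56500,
       white.female =  0.05900)

# Odds ratios relative to the omitted reference category
# (assumed here to be white male drivers).
odds.ratios <- exp(b[-1])

# Implied probability of finding evidence in a probable cause search:
# plogis() is the inverse logit, 1 / (1 + exp(-x)).
p.reference <- plogis(b[["intercept"]])
p.groups    <- plogis(b[["intercept"]] + b[-1])

round(odds.ratios, 3)
round(c(reference = p.reference, p.groups), 3)
```

With the fitted model available, `exp(coef(mod))` gives the odds ratios for every term at once.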

But these results might be confounded by other factors. Let's control for age, the precinct the stop occurred in, and whether the stop was investigatory (i.e., the police made the stop based on suspicion rather than a traffic violation).

mod2 <- glm(evidence ~ black.male + black.female + hispanic.male + hispanic.female +
              otherrace.male + otherrace.female + white.female +
              Age.of.Suspect + precinct + invstop,
            data = pcdata, family = "binomial")

summary(mod2)
## 
## Call:
## glm(formula = evidence ~ black.male + black.female + hispanic.male + 
##     hispanic.female + otherrace.male + otherrace.female + white.female + 
##     Age.of.Suspect + precinct + invstop, family = "binomial", 
##     data = pcdata)
## 
## Deviance Residuals: 
##    Min      1Q  Median      3Q     Max  
## -1.679  -1.159   0.822   1.149   1.719  
## 
## Coefficients:
##                    Estimate Std. Error z value Pr(>|z|)    
## (Intercept)        0.358584   0.192786   1.860 0.062884 .  
## black.male        -0.436664   0.096572  -4.522 6.14e-06 ***
## black.female      -0.649243   0.121917  -5.325 1.01e-07 ***
## hispanic.male     -0.224033   0.172354  -1.300 0.193655    
## hispanic.female   -1.464218   0.423106  -3.461 0.000539 ***
## otherrace.male    -0.550659   0.374837  -1.469 0.141815    
## otherrace.female  -0.770214   0.942702  -0.817 0.413913    
## white.female       0.072878   0.145139   0.502 0.615579    
## Age.of.Suspect     0.004552   0.003543   1.285 0.198799    
## precincteast      -0.170255   0.163716  -1.040 0.298369    
## precincthermatage -0.141240   0.163645  -0.863 0.388089    
## precinctmadison   -0.134295   0.173292  -0.775 0.438360    
## precinctmidtown    0.188981   0.169454   1.115 0.264750    
## precinctnorth     -0.032506   0.171068  -0.190 0.849297    
## precinctother     -0.074706   1.014947  -0.074 0.941324    
## precinctsouth      0.031426   0.165103   0.190 0.849042    
## precinctwest      -1.014721   0.178520  -5.684 1.32e-08 ***
## invstop            0.445370   0.144543   3.081 0.002061 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 4754.3  on 3429  degrees of freedom
## Residual deviance: 4617.0  on 3412  degrees of freedom
##   (1 observation deleted due to missingness)
## AIC: 4653
## 
## Number of Fisher Scoring iterations: 4
stargazer::stargazer(mod, mod2, type = "html", style = "asr")

                         evidence
                       (1)          (2)
black.male           -0.349***    -0.437***
black.female         -0.565***    -0.649***
hispanic.male        -0.103       -0.224
hispanic.female      -1.316**     -1.464***
otherrace.male       -0.369       -0.551
otherrace.female     -0.710       -0.770
white.female          0.059        0.073
Age.of.Suspect                     0.005
precincteast                      -0.170
precincthermatage                 -0.141
precinctmadison                   -0.134
precinctmidtown                    0.189
precinctnorth                     -0.033
precinctother                     -0.075
precinctsouth                      0.031
precinctwest                      -1.015***
invstop                            0.445**
Constant              0.304***     0.359
N                     3,430        3,430
Log Likelihood       -2,355.625   -2,308.492
AIC                   4,727.249    4,652.985

*p < .05; **p < .01; ***p < .001
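
One way to check whether the added controls improve the fit is a likelihood ratio test on the two nested models. The sketch below is not part of the original analysis; it uses only the residual deviances and degrees of freedom reported in the two summaries above.

```r
# Residual deviances and degrees of freedom copied from the summaries above.
dev.mod  <- 4711.2; df.mod  <- 3422   # race/gender indicators only
dev.mod2 <- 4617.0; df.mod2 <- 3412   # plus age, precinct, and invstop

# The drop in deviance is approximately chi-squared distributed under the
# null hypothesis that the added terms have no effect.
lr.stat <- dev.mod - dev.mod2   # likelihood ratio statistic
lr.df   <- df.mod - df.mod2     # number of added parameters
p.value <- pchisq(lr.stat, df = lr.df, lower.tail = FALSE)

c(statistic = lr.stat, df = lr.df, p.value = p.value)
```

With the fitted objects available, `anova(mod, mod2, test = "Chisq")` runs the same comparison directly.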