A mathematical model of stuff I read in “The New Jim Crow”


Throughout high school, my friend David and I had a weekend tradition. We would go to Safeway, stock up on candy, then go to Mid-Town Video and rent either an anime or a trashy sci-fi. Maybe something more artsy like Forbidden Zone if we felt heady. Then we’d stay up late watching movies, taking an occasional break to draw or eat. Life was hard.
In any case, let's focus on the first step: the junk food purchase. I would have about 5 bucks to spend on candy. One of these purchases sticks in my mind to this day, forever burned in by the horrible sick feeling of hyperglycemia: the time I spent all 5 bucks on Swedish fish. The only debate was calculating the best deal per ounce. So M&M's were out, and Swedish fish were in. Now that I am older and have a frontal lobe, I find it weird that I didn't want to diversify at all, or possibly get a small amount of better candy (sorry, Swedish fish are awful). However, the reason I'm beginning this MLK Day essay with this non-story is that it serves as a model for something larger, something which might be eroding the gains made during the civil rights movement.

So, anyway, as a kid, I valued quantity over quality. I bring up this Swedish fish thing because it's a good example of "the maximum principle". Let's say, hypothetically, Safeway had 10 candies with prices p_{1},\dots,p_{10} per ounce of candy, and I have n dollars. How do I spend my money? If I pay c_{1}, \dots, c_{10} dollars for each candy, then the total weight of candy purchased is W = (c_{1} / p_{1}) + \cdots + (c_{10}/ p_{10}). How do we maximize W, subject to the constraint that I can only spend my n dollars, i.e. c_{1} + \cdots + c_{10} \leq n? The solution is that of common sense: I should spend all n dollars on the cheapest candy. That the solution is so drastic is a generic characteristic of many optimization problems (linear programs in particular). The solutions are pretty extreme, lying on the boundary of the set of possibilities.
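To make this concrete, here is a tiny Python sketch of the candy problem. The per-ounce prices are made up for illustration (not real Safeway data); the point is only that the optimal spend concentrates entirely on the cheapest candy.

```python
# Hypothetical per-ounce prices for the 10 candies (dollars/oz).
prices = [0.50, 0.35, 0.80, 0.60, 0.25, 0.90, 0.45, 0.70, 0.55, 0.65]
budget = 5.0  # my n dollars

# Maximize W = sum(c_i / p_i) subject to sum(c_i) <= budget, c_i >= 0.
# For a linear program like this, the optimum sits at a corner of the
# feasible region: put the whole budget on the cheapest candy.
cheapest = min(range(len(prices)), key=lambda i: prices[i])
spend = [0.0] * len(prices)
spend[cheapest] = budget

total_ounces = sum(c / p for c, p in zip(spend, prices))
print(cheapest, total_ounces)  # candy 4 at $0.25/oz wins: 20 oz of Swedish fish
```

Any mixed allocation strictly loses weight, which is exactly why the "common sense" solution is so extreme.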
This is the maximum principle in a nutshell. If these extreme solutions are practically ridiculous, that should tell us something about our values. Do you really want a full pound of Swedish fish?… really!? No you don't, because that's fucking disgusting. Is it possible that police are (perhaps subconsciously) trying to optimize something, and that this optimization leads to results extremely biased towards convicting black adult males?
There is another reason I bring up this example of the Swedish fish. It illustrates that optimization (possibly with respect to a stupid penalty function) is something we do instinctively. As a high school student, I didn't know anything about higher mathematics, let alone convex optimization. Spending all 5 bucks on Swedish fish was pure instinct.
Similarly, it is not outside the realm of possibility for other human entities (such as law enforcement) to inadvertently optimize certain quantities without even thinking about it or being aware of it.

Let us suppose that the justice system is incentivized to produce convictions. In fact, Chapter 2 of "The New Jim Crow" by Michelle Alexander (hereafter referred to as TNJC) suggests that this is the case, to an unreasonable degree. The police and prosecutors have finitely many resources at their disposal and must decide how to distribute them. Maximizing the number of convictions (on average) means maximizing the function

U(\alpha_B, \alpha_W) = \alpha_B P_B + \alpha_W P_W

where P_B and P_W are the probabilities that a unit of effort produces a conviction in predominantly black and predominantly white neighborhoods respectively, and where \alpha_B and \alpha_W are non-negative and satisfy \alpha_B + \alpha_W = \bar{\alpha} for some bound \bar{\alpha} between 0 and 1. The numbers \alpha_B and \alpha_W represent the amount of resources allocated towards predominantly black and predominantly white neighborhoods respectively. We should note that this construction is color blind: all these equations are symmetric with respect to swapping B and W everywhere. From the discussion earlier about Swedish fish, we know what the solution to this problem looks like. The solution is (generically) either (\alpha_B, \alpha_W) = (\bar{\alpha}, 0) or (\alpha_B, \alpha_W) = (0, \bar{\alpha}). That is to say, all effort will go towards convicting black people, or all effort will go towards convicting white people. This is communicated well throughout TNJC:

The prevalence of illegal drug activity among all racial and ethnic groups creates a situation in which, due to limited law enforcement resources and political constraints, some people are made criminals while others are not. Black people have been made criminals by the War on Drugs to a degree that dwarfs its effect on other racial and ethnic groups, especially whites. And the process of making them criminals has produced racial stigma. -Michelle Alexander, “The New Jim Crow”, Chapter 5

TNJC paints a picture of an America with a color blind justice system, which nonetheless produces racist outcomes: an incarceration rate heavily skewed towards black men.
The book illustrates how, once convicted, the rules of the game change, and it's very hard to become a "normal" citizen again. Moreover, the current system appears to amplify even small racial biases over time by incentivizing higher conviction rates, using mandatory minimum sentences, and applying disproportionate scrutiny to black people.

This can all be modeled via a simple dynamical system in 4 variables (a time-dependent Markov chain). Let us divide the population into 4 parts: black and not convicted (B_n), black and convicted (B_c), white and not convicted (W_n), and white and convicted (W_c). We will design a system which models the portion of the population in each of these states. This will be given by the tuple of non-negative numbers (p_{Bn}(k), p_{Bc}(k), p_{Wn}(k), p_{Wc}(k)), where each p(k) denotes the proportion of the population in the corresponding state at a given time k = 0,1,2,\dots.

In order to understand how this system evolves in time, we need to write a rule which tells us how the state of things at time k propagates into the future, at time k+1. The most naive model is a Markov model. This means we must estimate the proportion of flux between the various states. For example, the proportion of black non-convicts who become convicts in the next state can be given by \alpha_{B}(k) p_{Bn}(k), since \alpha_{B}(k) represents the police effort put into convicting black people, while p_{Bn}(k) represents the proportion of black people yet to be convicted. In this way, we can derive the next state of the system from its current state:

p_{Bn}(k+1) = p_{Bn}(k) - \alpha_B(k) p_{Bn}(k) + \epsilon p_{Bc}(k)
p_{Bc}(k+1) = p_{Bc}(k) + \alpha_B(k) p_{Bn}(k) - \epsilon p_{Bc}(k)
p_{Wn}(k+1) = p_{Wn}(k) - \alpha_W(k) p_{Wn}(k) + \epsilon p_{Wc}(k)
p_{Wc}(k+1) = p_{Wc}(k) + \alpha_W(k) p_{Wn}(k) - \epsilon p_{Wc}(k)

for each time k = 0,1,2,\dots. The constant \epsilon is a positive number which represents the rate at which convicts re-enter society and are considered normal (e.g. they can vote). The book suggests this is rare, which is to say that \epsilon is small. The time-varying numbers \alpha_B(k) and \alpha_W(k) represent how much effort is put forth by law enforcement to obtain convictions. These efforts are obtained by maximizing U(\alpha_B, \alpha_W) at each time k subject to the constraints \alpha_B(k) + \alpha_W(k) = \bar{\alpha} and \alpha_B(k), \alpha_W(k) \geq 0. In order to maximize U we need to estimate the probabilities of producing convictions in white or black neighborhoods. Such estimates can only be built from data which police and law enforcement have access to. As a proxy we could use the population of known white (or black) convicts as a proportion of all white (or black) people. This means using the estimates

P_B(k) = p_{Bc}(k) / (p_{Bc}(k) + p_{Bn}(k)), \quad P_W(k) = p_{Wc}(k) / (p_{Wc}(k) + p_{Wn}(k)).
In a way, this approximation is sensible because it uses the data available to law-enforcement. When a police chief responds to complaints that there is over-policing in black neighborhoods, the common justification is that black neighborhoods are where crime is. From the perspective of law enforcement, one could argue that the data supports dispatching resources to black neighborhoods. However, as is described well in TNJC, using this as a justification for over-policing black communities has the effect of producing outcomes with extreme racial bias.
The maxima of U are achieved by setting

(\alpha_B(k), \alpha_W(k)) = (\bar{\alpha}, 0) if P_B(k) > P_W(k), and (\alpha_B(k), \alpha_W(k)) = (0, \bar{\alpha}) if P_W(k) > P_B(k).
Again, this policy is color blind, in the sense that one gets the same outcome upon swapping B and W everywhere. This would be consistent with law enforcement’s insistence that they are not a racist institution, despite staggering disparities in the application of the law. More profoundly, the equations of motion are color blind too.
Now let us see what happens if we put this dynamical system on a computer. As an initial condition, let us incorporate the fact that 13.2% of America is black (U.S. Census 2014). We can then consider the case where roughly 1 in 10 people in both populations are convicts (maybe off by some itty-bitty amount). That is, we may consider

(p_{Bn}(0), p_{Bc}(0), p_{Wn}(0), p_{Wc}(0)) = (0.9 \times 0.132, \; 0.1 \times 0.132 + \delta, \; 0.9 \times 0.868, \; 0.1 \times 0.868)

where \delta = 0.00000001.
With this initial condition, we get this video:


The green represents the portion of the population consisting of black convicts. Blue is the remaining portion of the black population. Turquoise is the portion of white convicts. Finally, red is the remaining portion of the white population. Assuming that black people never transform into white people and vice versa, the sum of convicted and non-convicted black (or white) people is constant. We observe that despite having initially equal proportions of convictions, the disparity between the two populations is amplified in time, until most black people are convicted while white people normalize at a slow rate (controlled by \epsilon). At this point, some might object: "You told me your system was color blind! You lied. Why did all the black people get convicted!?"
The answer is that the system is color blind, but the initial condition determines the final state of things. If instead we placed the itty-bitty nudge \delta on the white side, i.e. used the initial condition

(p_{Bn}(0), p_{Bc}(0), p_{Wn}(0), p_{Wc}(0)) = (0.9 \times 0.132, \; 0.1 \times 0.132, \; 0.9 \times 0.868, \; 0.1 \times 0.868 + \delta),

we would get this video:

So we see that the initial state of things has a large impact that is amplified in time under our current justice system. This means that small biases which favor black incarceration are amplified in time under this color blind set of equations, until the entire black population is in prison (admittedly unrealistic, but the qualitative aspects of the model are what's really important). It is not difficult to buy the premise that bias in America has, at some point in our history, not been on the side of black people (this is an understatement if there ever was one). What I have shown is the simplest mathematical model I could come up with, inspired by the main thesis which TNJC illuminates: a color blind justice system is capable of producing racist outcomes. Our justice system is a case in point.
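For anyone who wants to poke at the model, the whole simulation fits in a few lines of Python. The values \bar{\alpha} = 0.3, \epsilon = 0.01, and the number of steps are illustrative choices of mine, not calibrated to anything (and not necessarily the values behind the videos); the qualitative behavior is the same for any small \epsilon.

```python
def step(p_Bn, p_Bc, p_Wn, p_Wc, alpha_bar=0.3, eps=0.01):
    """One tick of the color-blind conviction dynamics."""
    # Conviction-probability estimates built from the data police can see:
    P_B = p_Bc / (p_Bc + p_Bn)
    P_W = p_Wc / (p_Wc + p_Wn)
    # Maximum principle: the entire enforcement budget goes wherever the
    # estimated conviction probability is larger.
    a_B, a_W = (alpha_bar, 0.0) if P_B > P_W else (0.0, alpha_bar)
    # Markov update: convictions flow in at rate alpha * p_n,
    # re-entry into society flows back at rate eps * p_c.
    return (p_Bn - a_B * p_Bn + eps * p_Bc,
            p_Bc + a_B * p_Bn - eps * p_Bc,
            p_Wn - a_W * p_Wn + eps * p_Wc,
            p_Wc + a_W * p_Wn - eps * p_Wc)

# Initial condition: 13.2% of the population is black, roughly 1 in 10
# convicted in both groups, nudged by an itty-bitty delta on the black side.
delta = 0.00000001
state = (0.9 * 0.132, 0.1 * 0.132 + delta, 0.9 * 0.868, 0.1 * 0.868)

for _ in range(500):
    state = step(*state)

p_Bn, p_Bc, p_Wn, p_Wc = state
# The tiny delta gets amplified: nearly all black people end up convicted
# (the fraction approaches alpha_bar / (alpha_bar + eps), about 0.97),
# while the white conviction fraction decays toward zero at rate eps.
print(p_Bc / (p_Bc + p_Bn), p_Wc / (p_Wc + p_Wn))
```

Moving the delta to the white side flips the outcome, which is exactly the color-blindness of the equations: the dynamics don't care about the labels, only the initial data.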

Some takeaways

Admittedly this model is cartoonish and simple. However, if the simplest model admits these extreme properties, then it is not farfetched to assume that more sophisticated and realistic models are capable of admitting them as well. In fact, the model is probably too generous, in that it assumes a color blind police force. This is something that is explicitly challenged in TNJC (just to name one source). More importantly, the model does hone in on certain phenomena quite well. We are living in a world where people are very polarized in their opinions. People who are "pro-police" argue that enforcing the law is the job of the police, and that they are just doing what makes sense given the data available. People who question the morality of the status quo (e.g. Professor Alexander, me, #BlackLivesMatter, …) find the outcome of this logic disturbing. Being convicted carries virtually permanent repercussions (this is why \epsilon is so small in our model). You can't vote, you can't get access to publicly funded financial assistance (e.g. Medicaid), your driving privileges can be revoked, etc. These repercussions make it difficult to re-enter society, and the chance of recidivism is high. The way TNJC puts it, incarceration places individuals in a parallel universe, where the rule of law is completely outside ordinary experience.

Is there any merit to this system? After all, is it not in the best interest of society that criminals be caught? I don’t think it is in our interest when the application of law is so skewed. Simply put, white people (I really mean non-black here) do not get the same experience as black people. This confounds policy discussions. Our national discussion seems analogous to the following conversation I had with my aunt:

Me: Hey did you see that movie, “28 days later”?

My aunt:  Why would I see that?  I don’t really care for Sandra Bullock.

Me:  Uh.  Sandra Bullock is not in this movie.

My aunt:  Still, it’s a sequel to a Sandra Bullock movie.  Meh, not interested.

Me:  No trust me, Sandra Bullock has nothing to do with this.

My aunt:  I bet you she does, it’s a sequel.

Me:  Really.  "28 days later" is a sequel?  Are you sure you're not thinking of "28 weeks later"?

My aunt:  Yeah.  It’s obvious from the title.  I don’t even need to see the poster.  

[on and on]

This is not an effective conversation.   For those who are confused, see this.

This feeling of two Americas is exacerbated because the number of actual criminals (caught or not) is pretty high in any population. When politicians bring forth tough-on-crime policies, they are trying to get the vote of people who are untouched by such policies. Naively, I would have assumed such policies are applied equally to all racial groups. However, in practice such policies are consistently applied much more heavily to black people than to other races. Where a white person might get a warning or a fine, a black person is convicted. Perhaps this is optimal in the sense of boosting numbers (something incentivized in get-tough policy; again, chapter 2 of TNJC), but it is not optimal from any moral perspective worthy of synaptic activity.

The fraction of black men who are convicts (black convicts / black population) is undoubtedly higher than the corresponding proportion of non-black men. This is just a matter of counting, which leaves no room for argument. I could show off and rattle off some numbers, but to be honest my research would boil down to googling, which you can do without my help. However, that a higher proportion of black men are convicted says little about how crime-prone black men are when the asymptotic behavior of the justice system inevitably yields extreme results. This latter point is not reflected upon often enough. Perhaps this is because it's hard to articulate these dynamical notions. Maybe the notions of feedback loops and stability are not exciting unless you're an engineer or an applied mathematician. In any case, if nobody bothers to articulate these ideas, then the notion of the black criminal as the prototypical criminal will be left unchallenged. So I hope this feeble attempt is not in vain.



If you have the time to the read TNJC, you will see that the model I have presented here is too generous towards our justice system. There are numerous areas where racial bias enters the picture, and I have ignored all these factors by imposing a color blind model.
Therefore, this model might grant our justice system more color-blindness than it is worthy of. These unaccounted effects certainly do not help the situation. Nonetheless, it seems ridiculous to propose ridding people of racial bias as a real solution to our problems.
Instead, what this model suggests is that we should design a system that dampens the effects of racial bias. We will never be angels, but it's just stupid to double down on our mistakes like this. No matter how many Swedish fish I ate, they were still disgusting. Our justice system should be designed so that entities who are less than angelic can run it. We need a system that does not amplify bias.

One way to change this is to be skeptical towards legislation which claims to be tough on crime. In particular, mandatory minimums really amplify any decisions made by the justice system, including racially biased ones. This, along with various revocations (such as losing the right to financial aid) and obligations (such as having to report prior convictions), makes for a small \epsilon in our model. Secondly, if this model has any merit, it suggests that supporting efforts to diversify communities would help immensely.
Police might allocate resources to communities of color based on data and rational argument, but it is embarrassing that this is even an option. Why are we so segregated in 2016? Well, there are reasons, but that’s a whole other can of worms (google “red lining”)… Maybe the next post, but I hope to write something light and funny next time.

So, happy MLK day. The essay will end with the obligatory MLK quote. However, before that, let’s at least mention #BlackLivesMatter. This movement is disruptive. I’m sorry if you found yourself stuck in traffic or something, but every effective civil rights movement in our history has been disruptive. If you read enough history you can just count and observe that there are only a few times per lifetime where one can witness a movement clearly on the side of “progress”. To me, this is one of those points. I support #BlackLivesMatter, and I would like my friends to support them too. Here is the link to the store


where you can donate and/or buy t-shirts etc. As a side note, BLM is trying to obtain regular donations. This would make BLM much easier to operate by allowing them to draw reasonable forecasts of future budgets. Regular donations by passionate supporters are the way wonderful things happen… like HBO.

Okay…. I gotta stop writing now. Let’s end with that quote I promised:

Let us be those creative dissenters who will call our beloved nation to a higher destiny. To a new plateau of compassion, to a more noble expression of humanness.

