## Friday, February 26, 2016

### Clinton Has Many More Rich Supporters Than All Other Candidates Combined

With over 23,600 supporters giving at or above the federal individual contribution maximum of $2,700, Hillary Clinton has far more contributions from wealthy donors than any other candidate (Table 1: Huge). Not only that, but the number of huge contributions to Hillary Clinton exceeds the number to her chief rival Bernie Sanders by a factor of 60 to 1.

Table 1: This table shows the number of campaign contributions by size of the contribution.

| Candidate | Itemized ($) | Total ($) | Huge ($2700+) | Large ($1000-2699) | Med ($200-1000) | Small ($1-200) | Unitemized ($1-200)* |
|---|---|---|---|---|---|---|---|
| Clinton | 106,285,874.12 | 120,495,964 | 23,620 | 15,307 | 33,906 | 152,640 | 568,404 |
| Sanders | 26,509,365.75 | 93,883,341 | 394 | 3,442 | 29,374 | 319,374 | 2,694,959 |
| Carson | 23,995,461.23 | 57,001,577 | 1,137 | 4,722 | 24,411 | 159,917 | 1,320,245 |
| Cruz | 34,821,534.81 | 54,070,645 | 4,078 | 4,738 | 23,909 | 181,351 | 769,964 |
| Kasich | 7,459,832.14 | 8,401,505 | 1,837 | 1,327 | 1,977 | 1,524 | 37,667 |
| Rubio | 26,915,622.40 | 32,371,175 | 5,699 | 6,183 | 11,193 | 36,019 | 218,222 |
| Trump | 1,773,225.61 | 7,407,238 | 194 | 289 | 2,432 | 2,634 | 225,360 |

\* Unitemized support is assumed to be given at $25 per contribution.

Yet Clinton is not only getting more support among wealthy people than her Democratic Socialist rival. Clinton also has 77% more support among wealthy people than all other candidates combined. In complete contrast, challenger Bernie Sanders has far more support among small contributors, with nearly 5x as many unitemized small contributions (less than $200 in total) as Hillary Clinton.
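The headline ratios can be re-derived from Table 1. The following is a minimal Python sketch, not the code behind the post: the dictionary layout and function names are illustrative assumptions, and the $25 figure is the assumption stated in the table note.

```python
# Rows copied from Table 1: (itemized $, total $, huge, large, medium, small).
TABLE_1 = {
    "Clinton": (106_285_874.12, 120_495_964, 23_620, 15_307, 33_906, 152_640),
    "Sanders": (26_509_365.75, 93_883_341, 394, 3_442, 29_374, 319_374),
    "Carson": (23_995_461.23, 57_001_577, 1_137, 4_722, 24_411, 159_917),
    "Cruz": (34_821_534.81, 54_070_645, 4_078, 4_738, 23_909, 181_351),
    "Kasich": (7_459_832.14, 8_401_505, 1_837, 1_327, 1_977, 1_524),
    "Rubio": (26_915_622.40, 32_371_175, 5_699, 6_183, 11_193, 36_019),
    "Trump": (1_773_225.61, 7_407_238, 194, 289, 2_432, 2_634),
}
ASSUMED_UNITEMIZED_GIFT = 25  # dollars per unitemized contribution (table note)


def unitemized_count(candidate):
    """Estimate unitemized contributions as (total - itemized) / $25."""
    itemized, total = TABLE_1[candidate][:2]
    return round((total - itemized) / ASSUMED_UNITEMIZED_GIFT)


def huge_count(candidate):
    """Number of contributions of $2700 or more."""
    return TABLE_1[candidate][2]


# Huge contributions to every candidate other than Clinton.
others_huge = sum(huge_count(c) for c in TABLE_1 if c != "Clinton")
clinton_vs_others = huge_count("Clinton") / others_huge
clinton_vs_sanders = huge_count("Clinton") / huge_count("Sanders")
small_ratio = unitemized_count("Sanders") / unitemized_count("Clinton")

print(others_huge)                    # 13339
print(round(clinton_vs_others, 2))    # 1.77, i.e. 77% more
print(round(clinton_vs_sanders))      # 60
print(round(small_ratio, 1))          # 4.7
```

Under these assumptions the unitemized column of Table 1 is reproduced exactly, which suggests it was derived as total minus itemized funds divided by $25.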

Among Republicans, the candidate who comes closest to the grassroots support that Sanders has is Ben Carson, who has less than half as many unitemized donations as Sanders.
 Figure 1: A collection of histograms showing number of contributions grouped by size of contribution.
Figure 1 shows this information in yet another way.

The easiest and clearest way of reading this information is that Hillary is backed by big money and Sanders by small.

## Thursday, February 25, 2016

### Overwhelming Growth In National Support for Bernie Sanders Mapped

The FEC just released the most recent campaign contributor data and the results show a strong continued widespread growth in support for Bernie Sanders across the country.
 Figure 1: A map of what counties and states support Bernie Sanders relative to that of Hillary Clinton in January 2016.
As of the end of January 2016, 88% of states have more reported contributions to the Sanders campaign than to the Clinton campaign.
 Figure 2: A map of what counties and states support Bernie Sanders relative to that of Hillary Clinton in December 2015.
This is significant growth from December, when only 75% of states backed Sanders.
 Figure 3: A map of what counties and states support Bernie Sanders relative to that of Hillary Clinton in November 2015.

 Figure 4: Going back to June we can see that the vast majority of states primarily backed Hillary. Sanders was initially very poorly known outside of New England.

From the figures we can see that Nevada is the least supportive of Sanders among western states, while Iowa has been about as favorably disposed to Sanders as its surrounding states. The South continues to be the strongest region of the country supporting Hillary Clinton, while just about everywhere else is beginning to lean increasingly towards Sanders.

We should remember when looking at these maps that itemized contribution data underestimates the number of individual contributors and contributions, as only large donations need to be logged. Bernie Sanders has many more small contributors who are not individually reported than Hillary Clinton does: they constitute about 74% of his contributions, while only about 16% of Hillary Clinton's contributions are too small to report.

The net result is that the reported data vastly underestimates the number of contributors to the Sanders campaign relative to the Clinton campaign. This information also does not capture the type of contributors to each campaign. Hillary Clinton has the largest portion of her funds coming from wealthy donors of any campaign. In fact, she has more contributors giving the maximum allowable donation to her campaign than all other campaigns combined (including Republicans).

 Figure 5: Histogram of contribution size. X-axis is the size of the contribution and the y-axis is the number of contributions for that candidate.
From Figure 5, we can see that Hillary Clinton has massively more large campaign contributions than all other candidates combined, with a total of 23,620 campaign contributions of $2700 or more, compared with Sanders, Trump, Kasich, Cruz, Rubio, and Carson, who collectively have only 13,339 contributions of $2700 or more.

In contrast, Sanders has raised the majority of his campaign funds from small donors. The amount raised by Sanders from non-itemized small donors is 67 million dollars, which is 13 million dollars greater than the sum of all non-itemized contributions to Rubio, Cruz, Clinton, and Trump combined.

## Monday, February 22, 2016

### Hillary 1993: Largest Drop in Girl Names EVER; Chelsea Distant Second

Recently, I wrote a little post that got a lot of attention and some criticism: As First Lady, Popularity of Babies Named "Hillary" Dropped by an Unprecedented 90%.

The attention was likely due to the large number of people who are attempting to evaluate Hillary Clinton as a viable general election candidate. These people might rightly or wrongly assume that measures of popularity from her time as First Lady are at least informative in predicting how she will do in the general election.

The criticisms were largely based on my lack of scientific methodology and were, for the most part, well founded. The article was meant more as a statistical note than as a serious discussion. Yes, among all presidents since Nixon, the popularity of First Lady names has dropped over the term as First Lady, but none so much as the name Hillary.

However, some good potential alternative explanations are possible:
1. Are such drops typical of female names that peak in popularity in general?
2. Could Hillary Clinton have briefly led to the popularizing of the name "Hillary", which later fell off after she became First Lady?

So, I decided to come back to the data in an attempt to understand if the dramatic drop in the popularity of the name Hillary was just bad luck or likely related to her stint as First Lady, 1993-2001.
 Figure 1: The number of girls born each year which were named either Hillary, Hilary, Chelsea, Chelsey, or Kelsey. All of the names dropped in popularity following 1992.

And that is when I realized how truly unusual the drop in the name Hillary was. Looking at the top 1000 most popular female names that peaked in popularity between 1880 and 2014, the name Hillary, which peaked the year before Hillary Clinton became First Lady, experienced the single largest first-year drop in popularity of any name, in 1993, her first year as First Lady.

Not only that, but the popularity of the name Chelsea, Hillary and Bill's daughter, also took a tremendous blow during the entire stint of Hillary's term as First Lady.

Table 1: This table shows how the names Hillary and Chelsea dropped in popularity from their all-time peaks in 1992. Hillary Proportion and Chelsea Proportion are the percentage of girls given those names relative to the peak year. The Rank is how low that proportion is compared with the same proportion for the top 1000 most popular names.

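| # | Year | Hillary Proportion | Rank | Chelsea Proportion | Rank |
|---|---|---|---|---|---|
| 1 | 1992 | 100.0% | | 100.0% | |
| 2 | 1993 | 42.2% | 1 | 69.8% | 22 |
| 3 | 1994 | 16.2% | 1 | 47.7% | 8 |
| 4 | 1995 | 12.3% | 1 | 41.8% | 12 |
| 5 | 1996 | 12.4% | 2 | 36.3% | 9 |
| 6 | 1997 | 11.7% | 2 | 27.6% | 5 |
| 7 | 1998 | 9.6% | 1 | 21.8% | 4 |
| 8 | 1999 | 10.1% | 2 | 16.8% | 5 |
| 9 | 2000 | 10.0% | 2 | 14.7% | 5 |
| 10 | 2001 | 10.3% | 4 | 13.1% | 5 |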

From Table 1, we can see that the popularity of the name Hillary dropped dramatically from the 1992 peak: to 42.2% of that peak the next year, 16.2% the following year, and 12.3% the year after that. When compared with the entire top 1000 most popular female names which peaked, the name "Hillary" is ranked either 1 or 2 for the largest drop in popularity in nearly every year between 1992 and 2001. In the 5th, 6th, 8th, 9th, and 10th years other female names happened to drop temporarily below Hillary in popularity but did not stay that low for long.

It is for this reason that the name Chelsea can be ranked 8 on average yet still be considered the second largest drop of any female name in the decade after the name peaked. Looking at Table 2 we can see that the average 10 year ranking for Hillary is by far the lowest of any of the 1000 names. The name Chelsea, on the other hand, is ranked second, but it is really tied with Latoya, and these names are not far ahead of the names Aisha, Mindy, Ciara, and Jaime.
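The average ranks quoted here can be checked directly against the Rank columns of Table 1. This small Python sketch assumes the average is a plain mean of the nine post-peak yearly ranks (1993-2001), rounded to the nearest whole number; the variable names are illustrative.

```python
from statistics import mean

# Yearly ranks copied from Table 1 for 1993-2001 (the peak year 1992 has no rank).
HILLARY_RANKS = [1, 1, 1, 2, 2, 1, 2, 2, 4]
CHELSEA_RANKS = [22, 8, 12, 9, 5, 4, 5, 5, 5]

hillary_avg = mean(HILLARY_RANKS)   # 16 / 9
chelsea_avg = mean(CHELSEA_RANKS)   # 75 / 9

print(f"Hillary: {hillary_avg:.2f} -> {round(hillary_avg)}")  # Hillary: 1.78 -> 2
print(f"Chelsea: {chelsea_avg:.2f} -> {round(chelsea_avg)}")  # Chelsea: 8.33 -> 8
```

Under that assumption the rounded values are 2 and 8, so a name never needs to hold rank 1 or 2 in every single year to end up near the top of the ten-year average.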

Table 2: This table presents a ranking of the 20 largest drops in popularity over the last 130 years among the top 1000 most popular girl names. The Ave 10 Yr Rank is the rank of the name in terms of largest popularity drop over the ten years following the peak. This is the average of the yearly ranks, such as those in the Rank column of Table 1. Peak is the year in which the name peaked in popularity. Proportion 10yr is the proportion of children given that name ten years after the peak, relative to the peak year. We can see that Hillary and Chelsea are among the lowest on this indicator, though they happen to be higher than Latoya and Sheena, both of which peaked in 1984.
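| # | Ave 10 Yr Rank | Name | Peak | Proportion 10yr |
|---|---|---|---|---|
| 1 | 2 | Hillary | 1992 | 10% |
| 2 | 8 | Chelsea | 1992 | 13% |
| 3 | 8 | Latoya | 1984 | 7% |
| 4 | 10 | Aisha | 1977 | 24% |
| 5 | 11 | Mindy | 1979 | 22% |
| 6 | 12 | Ciara | 2005 | 18% |
| 7 | 12 | Jaime | 1976 | 20% |
| 8 | 16 | Jeannine | 1929 | 28% |
| 9 | 17 | Chelsey | 1992 | 17% |
| 10 | 17 | Rosalie | 1938 | 29% |
| 11 | 18 | Sheena | 1984 | 7% |
| 12 | 20 | Tracey | 1970 | 21% |
| 13 | 28 | Arielle | 1991 | 29% |
| 14 | 28 | Peggy | 1958 | 22% |
| 15 | 37 | Ariel | 1991 | 32% |
| 16 | 38 | Gale | 1957 | 19% |
| 17 | 38 | Stefanie | 1983 | 29% |
| 18 | 39 | Deana | 1970 | 31% |
| 19 | 39 | Tracy | 1970 | 29% |
| 20 | 40 | Christie | 1975 | 37% |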

These tables indicate that the names Hillary and Chelsea hit their peaks simultaneously in 1992 and then dropped dramatically in the decade that followed, more dramatically than any other of the top 1000 female names that peaked in the last 130 years (at least when using the average 10 year drop ranking). Surprisingly, Chelsey, the alternative spelling of the name Chelsea, also hit its peak in 1992 and is ranked as the 9th fastest falling name in popularity in Table 2.

So it looks pretty bad, right?

Well, maybe not. Perhaps the names Hillary and Chelsea just became temporarily very popular in 1992 because of the popularity of the First Family, then dropped off in popularity after the family lost its novelty. If this is the case then we should see tepid or non-existent growth in the popularity of the names Hillary and Chelsea in the 10 years preceding 1992, perhaps a rapid peak in popularity in 1991 and 1992, followed by a modest decline over time.
 Figure 2: This figure shows how the popularity of the names "Hillary" and "Chelsea" grew and fell relative to other female names that have peaked. The x-axis has been normalized so that 0 indicates the peak year for all female names while negative x represents years before peaking and positive, years after peaking. ALL is the average proportion to peak for ALL of the top 1000 female names. Top 100 is the average proportions for the Top 100 fastest falling names while Top 20 is the average proportions for the Top 20 fastest falling names.
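The normalization described in the Figure 2 caption can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical and the counts below are made-up numbers chosen to show the transformation, not values from the names data.

```python
from statistics import mean


def proportion_to_peak(counts_by_year):
    """Return {years_from_peak: share_of_peak_count} for one name."""
    peak_year = max(counts_by_year, key=counts_by_year.get)
    peak_count = counts_by_year[peak_year]
    return {
        year - peak_year: count / peak_count
        for year, count in sorted(counts_by_year.items())
    }


def average_curve(curves):
    """Average several normalized curves at each offset where data exists."""
    offsets = sorted(set().union(*curves))
    return {o: mean(c[o] for c in curves if o in c) for o in offsets}


# Hypothetical yearly counts for two names (illustrative values only).
name_a = {1990: 1500, 1991: 1800, 1992: 2000, 1993: 800, 1994: 400}
name_b = {1980: 100, 1981: 200, 1982: 100}

curve_a = proportion_to_peak(name_a)   # {-2: 0.75, -1: 0.9, 0: 1.0, 1: 0.4, 2: 0.2}
curve_b = proportion_to_peak(name_b)   # {-1: 0.5, 0: 1.0, 1: 0.5}
combined = average_curve([curve_a, curve_b])
print(combined)
```

Shifting every name so that its own peak sits at 0 is what lets names that peaked decades apart be averaged on one axis, as in the ALL, Top 100, and Top 20 lines.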

From Figure 2, we can see that the Top 100 and Top 20 names (those that fell the fastest, as in Table 2) did demonstrate steeper short-term rising and falling than the average for all names, and more so for the Top 20 than the Top 100. This may suggest the names Hillary and Chelsea coincidentally rose and fell in a slightly more extreme version of the pattern seen in the top falling names.

However, looking at the years preceding the peak year we can see that it is unlikely that the drop in popularity of the names Hillary and Chelsea is due to a temporary fascination with these two names. This is because there was a decade-long trend of rising popularity for the names Hillary and Chelsea, years before they would have been known as public figures. In addition, for the name Hillary in particular, the rise in popularity was less dramatic than the average for the Top 20, suggesting that, leading into the peak year, it should have lost popularity at a rate between that of the Top 20 and the Top 100.

This is not what happened! Within two years the name Hillary had fallen significantly lower in popularity than it had been 10 years prior. The name Chelsea also fell dramatically, such that by the end of the 90s it was far less popular than it was during the beginning of the 80s.

Not only that, but Hilary, the alternative spelling of the name Hillary, plateaued in 1990, holding its popularity at 96% of the peak until 1992, after which it fell dramatically to 28% of peak levels in 1993. Likewise, even the similar name Kelsey peaked in 1992 before falling to 25% of its 1992 level within a decade.

Table 3: This table lists the names which peaked in the years 1990-1994 in order of their total popularity. The column Proportion 10yr is the percentage of babies given the name 10 years after its peak, relative to the peak year.

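| # | Peak | Name | Proportion 10yr | # | Peak | Name | Proportion 10yr |
|---|---|---|---|---|---|---|---|
| 1 | 1990 | Alyson | 86% | 1 | 1992 | Carissa | 61% |
| 2 | 1990 | Blanca | 55% | 2 | 1992 | Chelsea | 13% |
| 3 | 1990 | Cassandra | 53% | 3 | 1992 | Chelsey | 17% |
| 4 | 1990 | Courtney | 43% | 4 | 1992 | Christian | 41% |
| 5 | 1990 | Cristina | 50% | 5 | 1992 | Hillary | 10% |
| 6 | 1990 | Elizabeth | 74% | 6 | 1992 | Kasey | 44% |
| 7 | 1990 | Erika | 51% | 7 | 1992 | Kelsey | 28% |
| 8 | 1990 | Hilary | 7% | 8 | 1992 | Silvia | 70% |
| 9 | 1990 | Katherine | 68% | | | | |
| 10 | 1990 | Leanna | 55% | 1 | 1993 | Alexandra | 65% |
| 11 | 1990 | Mara | 82% | 2 | 1993 | Alexandria | 63% |
| 12 | 1990 | Meagan | 41% | 3 | 1993 | Hayley | 64% |
| 13 | 1990 | Megan | 60% | 4 | 1993 | Jasmine | 73% |
| 14 | 1990 | Rachael | 55% | 5 | 1993 | Kassandra | 43% |
| 15 | 1990 | Samantha | 74% | 6 | 1993 | Katelyn | 88% |
| 16 | 1990 | Stephanie | 31% | 7 | 1993 | Kelsie | 42% |
| | | | | 8 | 1993 | Raven | 59% |
| 1 | 1991 | Ana | 91% | 9 | 1993 | Susana | 59% |
| 2 | 1991 | Ariel | 32% | 10 | 1993 | Tania | 83% |
| 3 | 1991 | Arielle | 29% | 11 | 1993 | Taylor | 54% |
| 4 | 1991 | Ashleigh | 53% | 12 | 1993 | Victoria | 76% |
| 5 | 1991 | Bianca | 51% | | | | |
| 6 | 1991 | Devon | 31% | 1 | 1994 | Alejandra | 76% |
| 7 | 1991 | Kara | 50% | 2 | 1994 | Allison | 72% |
| 8 | 1991 | Karissa | 72% | 3 | 1994 | Briana | 61% |
| 9 | 1991 | Kayla | 72% | 4 | 1994 | Larissa | 77% |
| 10 | 1991 | Kirsten | 66% | 5 | 1994 | Marina | 58% |
| 11 | 1991 | Mercedes | 63% | 6 | 1994 | Marissa | 56% |
| 12 | 1991 | Molly | 73% | 7 | 1994 | Tori | 47% |
| 13 | 1991 | Shelby | 45% | | | | |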

But what if there was just an unusual number of babies born in 1992 and perhaps that was a peak year for many baby names?

From Table 3 we can see that the exact opposite seems to be the case. The year 1992 did not have an unusually large number of peaking names but rather only 8, which was lower than the average of 11.2 for the five years between 1990 and 1994. Looking within the names that peaked in each of those years, we can also see that among all the names that peaked in a given year, the variants of Hillary or Chelsea were proportionally the least popular 10 years later. Overall, Table 3 supports the assertion that the popularity of the names Hillary and Chelsea and their variants was likely negatively affected by the Clintons' time in the White House.
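Both observations can be read straight off Table 3. A short Python sketch, with the counts and the 1992 rows copied from the table (the variable names are illustrative):

```python
from statistics import mean

# Number of names in Table 3 that peaked in each year.
PEAK_COUNTS = {1990: 16, 1991: 13, 1992: 8, 1993: 12, 1994: 7}

# Percent of peak-year popularity remaining 10 years later, names peaking in 1992.
PEAKED_1992 = {
    "Carissa": 61, "Chelsea": 13, "Chelsey": 17, "Christian": 41,
    "Hillary": 10, "Kasey": 44, "Kelsey": 28, "Silvia": 70,
}

average_peaks = mean(PEAK_COUNTS.values())
lowest_first = sorted(PEAKED_1992, key=PEAKED_1992.get)

print(average_peaks)       # 11.2
print(lowest_first[:4])    # ['Hillary', 'Chelsea', 'Chelsey', 'Kelsey']
```

The four names that retained the least of their 1992 popularity are exactly the Hillary and Chelsea variants, with the next name (Christian) well above them at 41%.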

Conclusions
From looking more closely at the names data, it seems pretty clear that the popularity of the names Hillary and Chelsea and their variants was powerfully damaged by the family's term in the highest office. Not only did the names Hillary and Chelsea decrease in popularity rapidly, but this decrease was unparalleled among the top 1000 most popular female names recorded for the last 130 years.

How much this finding should be taken into consideration when choosing a presidential candidate, I don't know.

Finally, I would like to say something personal. Whatever happened with the Clinton family happened a long time ago. I was only 10 in 1992 and Chelsea was only 12. It is really hard to imagine anything that Chelsea could have done that would have warranted the kind of public disgust that would have driven the observed unpopularity of the name Chelsea, leading it to drop at a rate only truly outmatched by her mother's unpopularity.

This strikes me as unfair. And truthfully, this whole analysis strikes me as distasteful.

I, of all people, am the last person in the world who should be criticizing people for their popularity. This is something I have struggled with as I have been the subject of organized negative attacks and public humiliation previously (see Turkopticon: Defender of Amazon's Anonymous Workforce).

Yet, these matters should be brought to the public light because they are of consideration as Hillary Clinton is seeking the nomination for the Democratic party. If there is some fundamental disgust in how a large portion of the American people see her and her daughter Chelsea then it should be brought forward sooner rather than later.

This is after all an election and good or bad an election is a popularity contest.

Source Code on GitHub

## Friday, February 19, 2016

### Big Business Backs Hillary: Small Bernie

Big business, lawyers, and the financial sector are the largest campaign backers of Hillary Clinton. Collectively they represent 35.5 million dollars donated to her campaign, 38% of total itemized funds donated to the Clinton campaign in 2015.

Bernie Sanders on the other hand is largely backed by a diverse collection of individuals: engineers, health care workers, artists, self-employed, academics, as well as to a much smaller extent business executives. Collectively these seven top donor industries only add up to 3.8 million dollars or about 20% of the itemized funds donated to the Sanders campaign in 2015.

It is worth noting that these numbers underestimate the total funds donated to each campaign, because reporting laws require only contributions adding up to more than $200 from a single donor to be reported individually. The effect of the reporting law differs dramatically between the two candidates. Because the vast majority of Hillary Clinton's funds come from large donors, she has to itemize 84% of the funds she receives. Bernie Sanders, on the other hand, is largely backed by small money and therefore must only report approximately 26% of the funds donated to his campaign. The result of this disparity in reporting is that superficially it appears that Hillary Clinton is raising vastly more money, but in reality her big donors are pretty much being matched by Sanders' numerous small donors. See: Analysis: Clinton backed by Big Money: Sanders by Small

However, this article is not discussing the size of contributions but the sectors contributing to each candidate.

 Figure 1: Proportion of itemized funds in each category.

In Figure 1, we can see that Sanders' coalition is composed of a wide variety of working class individuals as well as intellectuals and artists. Hillary Clinton's coalition is composed much more narrowly of large business, law, and finance donors.

In order to do this analysis, I drew from the individualized 2015 year-end data reported to the FEC by both campaigns. Within this data individuals report their personal occupation title. Because I wanted to look at just the sectors backing the candidates, I did not include in this analysis contributors who were unemployed, retired, or did not fit within any of these categories. This represented a significant quantity of data, with 30% of the individualized contributions for the Sanders campaign and 36% of the funds for the Clinton campaign either not categorized or missing. Likewise, for the Sanders campaign 38% of funds came from the unemployed, in contrast to only 19% for the Clinton campaign.
Those occupations failed to be classified into industries either because they did not seem relevant to the above categories or because they did not appear within the list of the top 260 occupations donating to campaigns. The classification of different occupations into different industries was also done somewhat subjectively, with a lot of personal judgements. For example, should "Public Relations Executive" be classified in the category "Public Relations" or "Business Executives"? Due to the large proportion of contributions not classified and the difficulty of classifying some occupations, I am not 100% confident that my analysis would be consistent if someone else were to do the same thing I did.

That said, the differences between the backers of the Sanders campaign and the Clinton campaign seem pretty stark. Thus I think it reasonable to conclude that Sanders seems to be largely supported by the working class and intellectuals, in contrast to Clinton, who seems to be supported by the business, legal, and finance wizards of the world.

Source Code

Related Articles:
- Analysis: Clinton backed by Big Money: Sanders by Small
- Legally Rig An Election: A Citizen's Guide to Gerrymandering
- Nevada: Sanders has 6x the Supporters as Clinton
- The Simple Reason Sanders Is Winning
- As First Lady, Popularity of Babies Named "Hillary" Dropped by an Unprecedented 90%
- Hillary Clinton's Biggest 2016 Rival: Herself
- Cause of Death: Melanin | Evaluating Death-by-Police Data
- Obama 2008 received 3x more media coverage than Sanders 2016
- The Unreported War On America's Poor
- What it means to be a US Veteran Today

## Thursday, February 18, 2016

### Legally Rig An Election: A Citizen's Guide to Gerrymandering

You are running for class president against a pimple-nosed, blond barbarian. You have given your best speech and your obnoxious opponent has given his best speech. The teacher is about to call on the class to vote! The time of reckoning is upon you.
As she is just announcing a show of hands in support of your opponent, you count in your head: three of your friends clearly in favor of you, three of your opponent's friends, and three undecided classmates in the middle, each with roughly a 50% chance of voting for you. If the vote is held now, who gets elected will be entirely up to those three undecided/independent voters.

But wait! You have an idea, and you shout out, "Wait, wait! I have a fun idea. How about we vote in three groups of three? The winner of the most groups wins the election." The teacher, unaware of your wily ways, shrugs.

You act quickly to divide the room into three groups of three.
1. All three of your opponent's friends you place in one group.
2. Two of your friends you place in another group with one undecided.
3. The remaining friend you place with the two remaining undecided.

You signal happily to the teacher; now you are ready for the votes to be cast. Remember, previously, you calculated the chance of winning as 50-50. Now, you calculate the chance of winning as 75%.

Really? Well, let's count the new probable outcomes for each voting group/district.
1. The group with all of your opponent's friends will vote for him.
2. The group with two of your friends will vote 3/3 for you 50% of the time and 2/3 for you the other 50%, which means it will go for you either way.
3. The final group, with one of your friends and two undecided, is where the action is. You know your friend will vote for you, so there are two remaining random votes. There is a 25% chance they will both vote for you, so you get 3/3 and win. There is also a 50% chance only one of them will vote for you, giving you 2/3 and a win. Thus there is only one outcome remaining in which you lose: both undecided voters vote against you. This only happens 25% of the time (50% x 50%).

From this voting system you may have noticed something.
By grouping classmates in this way, it is possible to win the class election without getting the majority of the votes. To see this, let's imagine the independent in group 2 votes against you, as does one of the independents in group 3. The total votes against you is 5 while the votes for you are 4. But because you carefully constructed the groups, you win two groups out of three and still win the election. At this point, the teacher would not be happy, but it is what she agreed to so...tough luck!

So how did this happen? There are a few ways to look at this. One is that before Gerrymandering (regrouping into districts) you had three people who could vote against you. Now, you have stuck one of those voters with two sympathetic voters, causing that vote to no longer count. All that remains is two voters who both need to vote against you in order to counteract the effect of having your friend in Group 3.

This idea of rigging a class election may seem absurd, but it is exactly the kind of thing establishment figures do when they redraw voting district lines to include or exclude groups depending upon how they affect the likely voting outcome. Redistricting, though, often involves the votes of thousands of people. The concept is the same.

I have written a small simulation showing how effective Gerrymandering can be on slightly larger scales. Within the simulation I set up 10 voting districts. In each of these districts there are 40 voters. Each district goes to whoever wins the most votes within that district. A total win goes to whoever wins the most districts.

 Figure 1: A grid display of 400 voters when there is no Gerrymandering. Each square represents a different voter. Lighter colored squares have a higher likelihood of voting for you. Darker colored squares have a higher likelihood of voting against you.

Looking at Figure 1, we see that without Gerrymandering there are naturally some districts that seem more likely to vote for us and some that seem less.
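Returning to the classroom example, its numbers can be verified by enumerating all eight ways the three undecided classmates might vote. The Python sketch below is a check written for this purpose (the function and variable names are illustrative); it assumes each undecided classmate votes for you with probability 50%, independently, as in the story.

```python
from itertools import product


def you_win(u_group2, u_group3a, u_group3b):
    """Each argument is True if that undecided classmate votes for you."""
    group1 = 0                           # three opponent friends: 0 of 3 for you
    group2 = 2 + u_group2                # two friends plus one undecided
    group3 = 1 + u_group3a + u_group3b   # one friend plus two undecided
    groups_won = sum(votes >= 2 for votes in (group1, group2, group3))
    return groups_won >= 2


# All 8 equally likely combinations of the three undecided votes.
outcomes = list(product([False, True], repeat=3))

grouped_win_prob = sum(you_win(*o) for o in outcomes) / len(outcomes)

# Without groups you hold 3 of 9 votes and need at least 2 of the 3 undecided.
plain_win_prob = sum(3 + sum(o) >= 5 for o in outcomes) / len(outcomes)

# Outcomes where you win the groups while holding a minority of all nine votes.
minority_wins = [o for o in outcomes if you_win(*o) and 3 + sum(o) < 5]

print(plain_win_prob)      # 0.5
print(grouped_win_prob)    # 0.75
print(minority_wins)       # [(False, False, True), (False, True, False)]
```

The two minority wins are exactly the scenario described above: the undecided in group 2 votes against you, and so does one of the two undecided in group 3, leaving you with 4 votes out of 9 but two groups out of three.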

The method of gerrymandering I propose in order to try to increase the likelihood of us winning is a simple method of reorganizing the vote. First we sort all of the individuals by likelihood of voting for us. Next we group those who are the least likely to vote for us into "Gerrymandered districts". The rest we distribute randomly. Of course we should not really be calling any district in particular Gerrymandered because really the whole population has been Gerrymandered.

 Figure 2: A grid display of 400 voters where the first two districts have been "Gerrymandered". Each square represents a different voter. Lighter colored squares have a higher likelihood of voting for you. Darker colored squares have a higher likelihood of voting against you.

Using this simple method of Gerrymandering for two districts, we can see extreme changes in likely voting outcomes for Districts 1 and 2 (Figure 2). What is less easy to observe but even more important is the subtle change in Districts 3 through 10. Compared with Figure 1, these districts have now lost the voters previously much more likely to vote against you. This makes each of these 8 districts slightly more likely to vote for you. The net effect is a certain loss in Districts 1 and 2 but a better than prior outcome in the remaining districts (you actually almost always win in this scenario if the popular vote is split 50-50).

In order to calculate expected outcomes, I repeat the simulation 200 different times under each voter preference scenario (lean) and gerrymandering scenario and average the number of wins across simulations.

 Figure 3: For each number of districts Gerrymandered, outcomes are simulated for 200 simulation runs. The orange line that crosses the 50% mark at 0 indicates that this is the expected outcome if there is no Gerrymandering happening.

From Figure 3 we begin to see how effective gerrymandering can be on the likelihood of winning.
By using our method to gerrymander districts we seem to gain about 4 percentage points for each district we gerrymander, up until we have four districts "gerrymandered". At the most favorable outcome, when we have gerrymandered four districts, we have effectively gained a 16 point lead against our rival. This means the popular vote could be 58-42 in favor of our rival and we could still win about half the time.

Something interesting happens at the fifth district gerrymandered. We have now gerrymandered half of the districts. At this last step we are now penalized for gerrymandering by 5 points. I do not have an easy explanation at this time.

Conclusion

In reality the practice of gerrymandering is much more complicated and subtle than this. There are typically restrictions on how you can group individuals, usually based on geography. Likewise, you don't know how individuals are going to vote, but you may have a pretty good idea how certain demographics are likely to vote. This complicates the methods. But I am sure there are "social engineering" or "vote engineering" firms out there that are able to exploit some of the strategies outlined here and other strategies in order to maximize the effect of gerrymandering.

That said, the classroom example and the simulations above capture the essence of gerrymandering. Gerrymandering is the practice of grouping voters together in such a way as to prevent those who are voting against you from having a vote. As such, gerrymandering is an enemy to democracy. It is typically used by establishment candidates to insulate themselves from challengers. This allows establishment figures to feel that their seat is safe even when they accept private or political payoffs for voting consistently in ways which are against the good of their constituency.
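For readers who want to experiment, here is a minimal Python sketch of the sort-pack-shuffle method described in this post. It is not the code linked below. Several details are assumptions made for illustration: each voter's probability of supporting you is drawn uniformly from [0, 1], ties within a district or across districts count as losses, and the "lean" scenarios are left out.

```python
import random

N_DISTRICTS, DISTRICT_SIZE = 10, 40


def make_districts(probs, n_packed, rng):
    """Pack the least favorable voters into n_packed districts; shuffle the rest."""
    ordered = sorted(probs)                  # least likely supporters first
    cut = n_packed * DISTRICT_SIZE
    packed, rest = ordered[:cut], ordered[cut:]
    rng.shuffle(rest)                        # remaining voters distributed randomly
    voters = packed + rest
    return [voters[i:i + DISTRICT_SIZE] for i in range(0, len(voters), DISTRICT_SIZE)]


def election_won(districts, rng):
    """Simulate one election; any tie counts as a loss (an assumption)."""
    districts_won = 0
    for district in districts:
        votes_for = sum(rng.random() < p for p in district)
        districts_won += votes_for > len(district) / 2
    return districts_won > len(districts) / 2


def win_rate(n_packed, n_runs=200, seed=0):
    """Share of simulated elections won when n_packed districts are packed."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_runs):
        # Assumed voter model: support probability uniform on [0, 1].
        probs = [rng.random() for _ in range(N_DISTRICTS * DISTRICT_SIZE)]
        wins += election_won(make_districts(probs, n_packed, rng), rng)
    return wins / n_runs


if __name__ == "__main__":
    for k in range(6):
        print(k, win_rate(k))
```

Because the voter model and tie rule here are guesses, the win rates it prints should not be expected to match Figure 3 point for point; the sketch is only meant to show the mechanics of packing hostile voters into a few districts.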

(Code on Github)

Related Posts:
- Nevada: Sanders has 6x the Supporters as Clinton
- The Simple Reason Sanders Is Winning
- As First Lady, Popularity of Babies Named "Hillary" Dropped by an Unprecedented 90%
- Hillary Clinton's Biggest 2016 Rival: Herself
- Cause of Death: Melanin | Evaluating Death-by-Police Data
- Obama 2008 received 3x more media coverage than Sanders 2016
- The Unreported War On America's Poor
- What it means to be a US Veteran Today

## Wednesday, February 17, 2016

### Nevada: Sanders has 6x the Supporters as Clinton

The most recent three polls coming out of Nevada have surprised many by indicating that Sanders is tied with Clinton for the primary vote in that state. This news is shocking to many because the previous five polls done in that state indicated Clinton had a commanding lead. However, those previous polls were old, with the most recent one collected in December.

When the first poll of the new year came out on February 8th, surveying a massive 1236 people in Nevada, the not-fake-news website DailyNewsBin.com published an article explaining that the poll was a fake manufactured by Republicans in an attempt to undermine Hillary Clinton's campaign. This, of course, coming from a totally legitimate website, coincidentally established in the summer of 2015, which bans commenters who don't support Hillary Clinton and almost exclusively publishes articles expressly supporting Hillary Clinton.

Fortunately for those of us supportive of the iconoclast Bernie Sanders, further surveys have confirmed the statistical tie between the candidates. Yet the idea that Sanders has equal support in Nevada should not come as a surprise to anybody who has been tracking donor information from the state as far back as December of 2015. Based on the year-end information for 2015 logged with the FEC, in Nevada Sanders had 1450 contributions compared with only 1361 for Clinton.
Not only did Sanders have more logged contributions in Nevada than Clinton for 2015, but if you look at contributions over time there has been steep growth in both the number of contributions and the number of contributors (Figures 1 and 2).

 Figure 1: Number of contributions to each Democratic campaign over time.

 Figure 2: Number of contributors to each Democratic campaign over time.

Both of these figures show the real struggle Clinton has had in picking up support from the general population relative to her populist rival Sanders. Yet these graphs vastly underestimate the number of supporters Bernie has relative to his rival. This is because the vast majority of contributions are from supporters who are giving less than the FEC-mandated reporting minimum of $200.

Sanders is backed largely by small money, with only 26% of his support itemized and reported to the FEC, while Clinton is backed by big money from a relatively small number of contributors and therefore reports 84% of her contributions. If we make the simple assumption that those contributions which are not reported look pretty much the same as those which are reported, then the picture becomes even more dramatic (Figures 3 and 4).
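One way to read this adjustment is to scale each itemized count up by the share of support that gets itemized. The Python sketch below applies that reading to the 2015 Nevada contribution totals quoted earlier; it is an assumption about the method rather than the code behind Figures 3 and 4, and it treats the 26% and 84% figures as if they applied to contribution counts.

```python
# Figures quoted in the post: itemized Nevada contributions for 2015 and the
# share of each campaign's support that is itemized.
ITEMIZED_CONTRIBUTIONS = {"Sanders": 1450, "Clinton": 1361}
ITEMIZED_SHARE = {"Sanders": 0.26, "Clinton": 0.84}


def adjusted_count(candidate):
    """Scale the itemized count up by the share of support that is itemized."""
    return ITEMIZED_CONTRIBUTIONS[candidate] / ITEMIZED_SHARE[candidate]


estimates = {name: round(adjusted_count(name)) for name in ITEMIZED_CONTRIBUTIONS}
ratio = adjusted_count("Sanders") / adjusted_count("Clinton")

print(estimates)          # {'Sanders': 5577, 'Clinton': 1620}
print(round(ratio, 2))    # 3.44
```

This uses contributions summed over the whole year, so its ratio is not the same quantity as the December contributor estimate shown in Figure 4.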

Figure 3: Number of contributions over time in Nevada adjusted for missing observations.

Figure 4: Number of contributors over time in Nevada adjusted for missing observations. By December the gap in support between the candidates is a chasm, with Sanders having an estimated 6.4 times as many supporters as Clinton in Nevada.

From Figures 3 and 4 we can see that the number of people willing to put their personal money in to back Sanders in Nevada is vastly greater than the number supporting Hillary Clinton. Not only that, but those numbers have continued to grow over time while Clinton's numbers have remained relatively stagnant.

It is unclear how the number of people willing to put their money behind a candidate relates to how the general population of voters will vote. Yet those in the Clinton campaign, including the folks at the not-fake-news website DailyNewsBin.com, must find these numbers, combined with those of the polls, at least somewhat worrying.

Code: GitHub

## Saturday, February 13, 2016

### The Simple Reason Sanders Is Winning

Sanders has way more backers across the United States (with the possible exception of the South).

Hillary Clinton might be doing well at the polls. However, the shocking fact of polling is that only 8-9% of those asked to participate in polls actually respond. Combined with most polls being given to landline owners, this means the populations being polled do not currently represent the voting population.

The result of these two factors is what we saw in the Iowa and New Hampshire primaries. That is, even the best estimates were off by a significant margin. As a result of the growing inability to execute effective polls, we must look to other sources of data. Some have looked at search results, Twitter posts, Facebook posts, etc. These sources of information seem useful, though they are difficult to interpret.

A much better indicator of campaign health, I would suggest, is the ability of a candidate to inspire a wide and diverse base of supporters. From my last post, Analysis: Clinton backed by Big Money: Sanders by Small, it is clear from the data filed with the Federal Election Commission that Sanders has a massive number of small supporters relative to Clinton's relatively small number of large supporters.

The implications of this information are initially unclear. Obviously, any candidate would want more supporters. However, distribution of supporters is important. Perhaps all of Sanders' supporters are in the North East around Vermont and New Hampshire and thus his message is not being picked up by the rest of the country.

Before jumping into the maps, let me first warn that, due to the immensity of small contributions to the Sanders campaign, we are missing contributor information for 74% of his contributions, in contrast to the Clinton campaign, for which we are only missing 14% of contributor data.

In the following maps I am counting how many contributors have contributed to each campaign in each county (the next unit smaller than a state, much like a municipality or district in other countries) of the contiguous United States (apologies Alaska, Hawaii, and our protectorates). A county is ranked from 0 to 1, with 0 meaning that all of those who contributed to a campaign contributed to the Clinton campaign and 1 meaning that all of them contributed to the Sanders campaign. Any number in between indicates the proportion of contributions going to the Sanders campaign out of the total number of contributions. Counties without recorded contributions are left out.
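A minimal sketch of this county score, assuming each itemized record can be reduced to a (county, candidate, contributor id) triple and that distinct contributors rather than individual contributions are counted. The record layout, the function name, and the sample rows are hypothetical and not taken from the FEC files.

```python
from collections import defaultdict


def sanders_share_by_county(records):
    """Return {county: share of contributors who gave to Sanders}.

    records: iterable of (county, candidate, contributor_id) tuples.
    A share of 0 means every contributor gave to Clinton and 1 means
    every contributor gave to Sanders. Counties with no records are absent.
    """
    contributors = defaultdict(lambda: {"Sanders": set(), "Clinton": set()})
    for county, candidate, contributor_id in records:
        if candidate in ("Sanders", "Clinton"):
            contributors[county][candidate].add(contributor_id)

    shares = {}
    for county, by_candidate in contributors.items():
        sanders = len(by_candidate["Sanders"])
        clinton = len(by_candidate["Clinton"])
        shares[county] = sanders / (sanders + clinton)
    return shares


# Made-up sample rows, for illustration only.
sample = [
    ("Clark, NV", "Sanders", "a"),
    ("Clark, NV", "Sanders", "a"),  # repeat gift from the same contributor
    ("Clark, NV", "Sanders", "b"),
    ("Clark, NV", "Clinton", "c"),
    ("Washoe, NV", "Clinton", "d"),
]
shares = sanders_share_by_county(sample)
sanders_led = sum(1 for share in shares.values() if share > 0.5) / len(shares)
print(shares)
print(f"Share of counties led by Sanders: {sanders_led:.0%}")
```

The last line mirrors the percentages quoted for each map: the fraction of contributing counties in which one candidate has the majority of contributors.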

In April, very few people knew who Bernie Sanders was. Hillary Clinton, however, was well known and had people across the country contributing to her. From the map we can see that in 82% of the counties that had contributors, the majority of contributors gave to Hillary.

As early as May 2015, we can see that Sanders is rapidly closing the contributor gap, knocking the share of counties in which Clinton leads from 82% down to 62%.
In June, more of the same, with Sanders gaining a significant foothold in California and New England.
We can see that even as Sanders is closing in on Hillary's lead, more of the counties in the US start participating in the process.
As early as August, we can see that the average number of contributors across counties shows only a 6 point gap between Clinton and Sanders.
And into September, Sanders has taken the lead in contributing counties across the country.
In September Sanders does not gain ground but Clinton also does not lose ground.
But going into November, whatever gains Clinton had made are lost as an increasingly large portion of the counties start contributing.
By the end of 2015, even with 74% of Bernie's contributions not being recorded compared with only 14% of Hillary's, Bernie has people committed enough to contribute to his campaign from across the US that 67% of counties favor him relative to 31% that favor Hillary.

So what? Does this really matter?

Given that for every one Bernie supporter showing up in the data, there are approximately three contributors not showing up in the data, this is a pretty huge margin of supporters willing to give up their personal resources in order to support the Sanders presidential bid.

Yet, even these numbers are only from the end of December. January was the biggest month on record so far with Bernie Sanders for the first time out-raising Hillary. Once those contributions are reported to the FEC, I am certain a much bigger chunk of the map will be blue.

Despite Bernie mobilizing contributors from all around the country, you still may be unconvinced of Bernie's significant and otherwise difficult to observe edge over Hillary.

If so, scroll through these maps one more time and get a feel for the growing numbers and diversity of supporters rallying across the country. The momentum is with Bernie Sanders. Unless something dramatic and unexpected happens, Sanders is going to continue to dominate the primaries.

## Tuesday, February 9, 2016

### Analysis: Clinton backed by Big Money: Sanders by Small

This article examines FEC data in depth and finds what most people already know. Hillary Clinton's presidential bid is financed largely through a relatively small quantity of big donors while Bernie Sanders' presidential bid is funded by numerous small donors.

In order to do our analysis, we look at four hundred thousand individualized contributions reported to the FEC at the end of 2015. These contributions are only reported for individuals who have donated $200 or more. Because many individuals give less than $200, we do not have individualized information for many of them. For Hillary Clinton, this means about 16% of contributions are not reported as individualized. For Bernie Sanders, this means 74% of contributions are not reported as individualized.

Table 1: The histogram captures the distribution of contributions. Note the x axis is scaled by log10, so the same distance exists between 1 and 10 as between 10 and 100, or 100 and 1000.
From Table 1, we can see that throughout all of 2015, Clinton has vastly more large contributors than Sanders, with over 20,000 campaign contributors giving the maximum contribution value of $2700*. Clinton also has a larger number of donors at the other large contribution values, with another 20,000 donors giving between $500 and $2700. Conversely, for the smaller value donations, Sanders has many more contributors than Clinton, with nearly 35,000 contributions at $100 compared with Clinton's 23,000. With $50 donations, Sanders also does much better, with over 40,000 contributions, double Clinton's 20,000. The difference is even more stark at ten dollars, with Sanders receiving nearly 40,000 such donations compared with Hillary's 12,000.

*There are some ways to avoid the legal contribution limits, as discussed in this NPR article.

Figure 2: A series of box-plots comparing contributions over time for 2015. The horizontal dotted line is at the individual maximum of $2700. There is also a maximum of $5000 that can be contributed to a PAC. Many of the contributions that exceed $2700 end up having part of them refunded to the contributor because they exceed the legal limit. Outliers are indicated with 'x's. Note the y axis is on the log10 scale. Looking at November, we can see two significant outliers from the "Hillary Victory Fund" at $1.6 and $1.8 million respectively. These are reported as unitemized but this seems rather unique.

From Figure 2 we can see some pretty shocking facts about the nature of her contributions early in her bid. In April and May, and almost into June, the upper quartile (top 25%) of her contributions were at or above the legal maximum. This is vastly different from Sanders, who had a handful of contributions at or above the legal maximum but nothing close to the number received by Clinton.
Overall, the difference between the two in April and May could not be more stark, with the upper quartiles for Sanders at or below the median for Clinton for nearly all of the months observed. We can also see that there is a significant amount of movement in the size of donations over time. For both Clinton and Sanders, there is a bit of a race to the bottom. This is driven somewhat by the nature of the reporting laws, as contributions are not reported until an individual has given at least $200. After that, all contributions are reported. Thus many of the smaller contributions will be reported as repeat contributors keep donating.
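To see how this reporting rule pulls small gifts into the itemized data, here is a minimal sketch of the rule as it is described above. It assumes the gift that brings a donor's running total to $200 is itself reported; the function name and the donor histories are hypothetical, and the actual FEC rules have more detail than this.

```python
def itemized_contributions(amounts, threshold=200.0):
    """Return the gifts from one donor that would show up as itemized.

    amounts: that donor's contributions in chronological order.
    Simplified rule: nothing is reported until the running total
    reaches the threshold; from then on every gift is reported.
    """
    running_total = 0.0
    reported = []
    for amount in amounts:
        running_total += amount
        if running_total >= threshold:
            reported.append(amount)
    return reported


# Hypothetical donors, for illustration only.
repeat_small_donor = [50] * 8   # eight $50 gifts
one_time_max_donor = [2700]     # a single maximum gift
occasional_donor = [25, 25]     # never reaches $200

print(itemized_contributions(repeat_small_donor))
print(itemized_contributions(one_time_max_donor))
print(itemized_contributions(occasional_donor))
```

Under this rule a repeat small donor eventually adds a stream of $50 entries to the itemized data, which is one way the typical reported contribution can fall over time even if nobody changes how much they give.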

Figure 3: Shows the distribution of contributions by candidate. This figure is the same as Figure 2 except the y-axis is not scaled by log10 and the upper limit is set at the legal maximum of $2700. Outliers above 1.5x the interquartile range have also been removed.

From Figure 3, we can see that the difference in the nature of contributions by candidate is vast, with almost all of the contributions to the Sanders campaign being less than $500. For the first two months, over half of the listed contributions for the Clinton campaign were $500 or more. Over time, the average size of contributions decreased, though much faster for the Sanders campaign.

From Figures 2 and 3 we might be concerned that Sanders' campaign is not capable of raising sufficient funds to compete with the Clinton campaign. However, this forgets that Sanders has many, many more contributors than Clinton. In order to estimate the number of contributions that are given but not itemized, I look at the unitemized total reported each quarter and assume that those contributions average $30 (probably a high estimate).

Table 1: Total Not-Itemized Contributions by quarter. # of Contributions is based on assuming each of these contributions is an estimated $30.

| Candidate | Report | $ Not-Itemized | # of Contributions |
|---|---|---|---|
| Clinton | First Report (July) | $8,098,571 | 269,952 |
| Clinton | Fall Report (October) | $5,193,811 | 173,127 |
| Clinton | Year End Report (December) | $5,707,408 | 190,247 |
| Sanders | First Report (July) | $10,465,912 | 348,864 |
| Sanders | Fall Report (October) | $20,187,064 | 672,902 |
| Sanders | Year End Report (December) | $23,421,034 | 780,701 |
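The counts in the table of not-itemized contributions can be reproduced with a few lines of arithmetic. This is a sketch under the stated $30 assumption, using the dollar totals from the table; the variable and function names are illustrative.

```python
ASSUMED_AVERAGE = 30.0  # assumed average size of a not-itemized contribution

# Not-itemized dollars per report, copied from the table.
NOT_ITEMIZED_DOLLARS = {
    "Clinton": {"July": 8_098_571, "October": 5_193_811, "December": 5_707_408},
    "Sanders": {"July": 10_465_912, "October": 20_187_064, "December": 23_421_034},
}


def estimate_count(dollars, average=ASSUMED_AVERAGE):
    """Estimate how many contributions make up a dollar total."""
    return round(dollars / average)


estimated_counts = {
    candidate: {report: estimate_count(dollars) for report, dollars in reports.items()}
    for candidate, reports in NOT_ITEMIZED_DOLLARS.items()
}
year_end_ratio = (
    estimated_counts["Sanders"]["December"] / estimated_counts["Clinton"]["December"]
)

for candidate, reports in estimated_counts.items():
    for report, count in reports.items():
        print(f"{candidate} {report}: {count:,}")
print(f"Year-end Sanders-to-Clinton ratio: {year_end_ratio:.1f}")
```

Because every total is divided by the same $30, the ratio between the candidates does not depend on that choice; a lower assumed average would raise both counts proportionally.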

From Table 1, we can see that although Clinton initially reported nearly as many small contributions as Sanders, those contributions have since fallen off, while Sanders' small contributions have increased significantly and now outpace Clinton's by four to one.

Figure 4: Total number of contributions over time and the difference between the two.

Smoothing the number of small contributors across the months of campaigning, we end up with Figure 4, in which we quickly see how vast the difference is between the number of contributors to the Sanders campaign and to the Clinton campaign.

Initially, Clinton enters the race a little earlier with a quarter million contributions. However, once Sanders enters the campaign, he quickly gains support, with his total number of contributions matching that of Clinton by June 5th and continuing to grow. By September 20th Sanders has already collected twice the number of contributions that Clinton has.

So how does this map to total contributions collected over time? We already know that Clinton has a large number of big donors on her side.
Figure 5: Total quantity of dollars contributed over time.

From Figure 5, we can see that Clinton got an early and big hand up from large money. As early as July, Clinton had raised just over 30 million more than Sanders. However, despite her continuing to receive large contributions, this difference has not increased much since: as of the end of the year, there is only a little more than a 35 million dollar difference in individual contributions.

Overall, this is an AMAZING fact. Somehow, despite having the majority of wealthy Democratic donors in her corner, Clinton has failed to out-raise Sanders since July!

Not only that, but Hillary is in a difficult position. Many of her largest donors have already maxed out their ability to legally contribute to her campaign, yet very few of Sanders' contributors have gotten close to maxing out their legal ability to contribute to campaigns. Of course, there are always the dubiously legal contributions to candidate Super-PACs made legal by the infamous "Citizens United" Supreme Court ruling.

However, as Sanders has campaigned against Super-PACs and Hillary is attempting to win over his supporters, it will certainly be interesting to see how fundraising changes moving forward as she risks being hamstrung by her narrow but affluent base.

Source Code on GitHub

