1. Who votes?

Election supplements in the Current Population Survey (CPS, US Census) provide possibly the most comprehensive micro data to examine this question.

Although these data have been available since 1964, the CPS has gathered consistent citizenship data (needed to identify eligible voters) only since 1996. The analysis below uses data from 1996-2012.

We also explore data from the American National Election Studies (ANES), which provide a longer time series (1948–2012) and a large set of covariates for exploring trends in the diversity of the electorate.

Note: For all survey-based analysis we use the sampling weights provided in the official data documentation (CPS, ANES). A literature in political science adjusts Census sampling weights to account for nonresponse and overreporting by taking actual state vote counts into account; see Hur and Achen, 2013. The analysis reported here (e.g. Appendix Table 1) matches the voting rates in official Census reports with no further adjustments.
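
To illustrate why the weights matter, here is a minimal sketch on made-up data (the `toy` data frame and its values are hypothetical, not CPS records). It compares an unweighted turnout rate with the weighted rate computed by base R's `weighted.mean`, the function used throughout the analysis below.

```r
# Hypothetical respondents: 1 = voted, 0 = did not vote.
# Each weight stands in for the number of people a respondent represents.
toy <- data.frame(
  voted_rec = c(1, 1, 0, 0),
  weight    = c(100, 100, 300, 500)
)

# Every respondent counts equally.
unweighted <- mean(toy$voted_rec) * 100

# Respondents count in proportion to their sampling weight.
weighted <- weighted.mean(toy$voted_rec, toy$weight) * 100

unweighted  # 50
weighted    # 20
```

In this toy case the non-voters carry most of the weight, so the weighted rate is far below the raw share of respondents who voted.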


Geography


Time & Geography


Let’s start with a summary across time and geography. Lower voting rates in non-presidential years and in certain regions of the country are evident.

s1 <- voter %>%
  group_by(year, state, `FIPS Code`) %>%
  summarise(
    D = round(weighted.mean(voted_rec, weight, na.rm = TRUE)*100))


fig <- (lapply(split(s1, s1$year), function(s1) {
  highchart() %>%
    hc_add_series_map(
      usgeojson, s1,
      name = "% who voted", value = "D", joinBy = c("statefips","FIPS Code"),
      borderWidth = 0.1, dataLabels = list(enabled = TRUE, format = '{point.properties.postalcode}')) %>%
    hc_title(text =  unique(s1$year)) %>%
    hc_colorAxis(dataClasses = color_classes(c(seq(40, 60, by = 10), 101),
                                             colors = c("black", c("#A020F0")))) %>%
    hc_tooltip(valueSuffix="%") %>%
    hc_legend(enabled = FALSE) %>%
    hc_add_theme(hc_theme_flatdark())
}))

hw_grid(fig, rowheight = 300, ncol = 3)  %>% browsable()

Race


Voting by Race


Notice that in 2012, for the first time, black voter turnout exceeded white turnout.

a1 <- (voter) %>%
  group_by(year,  Race) %>%
  summarise(No.of.Obs = n(),
    Percent_who_Voted = round(weighted.mean(voted_rec, weight, na.rm = TRUE)*100,1))

hchart(a1, type="line", x=year, y=Percent_who_Voted, group=Race)%>%
hc_plotOptions(line = list( dataLabels = list(enabled = TRUE), enableMouseTracking = TRUE)) %>% 
  hc_title(text = "Voting Rates by Race") %>%
  hc_subtitle(text = "Source: Current Population Survey, Election Supplement")%>%
  hc_tooltip(crosshairs= list(enabled= TRUE, color=c("#F8F8FF")),
             shared = TRUE, borderWidth = 1, headerFormat = "Voting Rates {point.year} <br>") %>%
  hc_yAxis(title = list(text = "% who Voted"),
           labels = list(format = "{value}%"))  %>%
  hc_xAxis(title="")%>%
  hc_legend(layout = "horizontal", verticalAlign = "bottom",
            floating = FALSE, align = "center")  %>%
  hc_colors(race_cols)%>%
  hc_legend(enabled = TRUE)  %>%
  hc_exporting(enabled = TRUE) %>%
  hc_add_theme(hc_theme_db())

Age


Voting by Age


Patterns by age are well known: the young vote at lower rates. What seems interesting is the sharp drop in voting rates in non-presidential elections. We pursue such gaps in the Political Mining section.

age <- (voter) %>%
  group_by(year, age_group) %>%
  summarise(
    Percent_who_Voted = round(weighted.mean(voted_rec, weight, na.rm = TRUE)*100))

hchart(age, type="line", x=year, y=Percent_who_Voted, group=age_group)%>%
  hc_title(text = "Voting Rates by Age") %>%
  hc_subtitle(text = "Source: Current Population Survey, Election Supplement")%>%
  hc_plotOptions(line = list( dataLabels = list(enabled = TRUE), enableMouseTracking = TRUE)) %>% 
  hc_tooltip(crosshairs= list(enabled= TRUE), 
             shared = TRUE, borderWidth = 1) %>%
  hc_yAxis(title = list(text = "% who Voted"),
           labels = list(format = "{value}%"))  %>%
  hc_xAxis(title="")%>%
  hc_legend(layout = "horizontal", verticalAlign = "bottom",
            floating = FALSE, align = "center")  %>%
  hc_exporting(enabled = TRUE) %>%
  hc_colors(age_cols)%>%
  hc_add_theme(hc_theme_db())

Marital


Voting by Marital Status


As with the patterns observed above, unmarried (and typically younger) people vote less frequently. Stability of marriage seems to have an impact (beyond just age, as we shall see later).

mar <- (voter) %>%
  group_by(year, marital) %>%
  summarise(
    Percent_who_Voted = round(weighted.mean(voted_rec, weight, na.rm = TRUE)*100))

hchart(mar, type="line", x=year, y=Percent_who_Voted, group=marital)%>%
  hc_title(text = "Voting Rates by Marital Status") %>%
  hc_plotOptions(line = list( dataLabels = list(enabled = TRUE), enableMouseTracking = TRUE)) %>% 
  hc_subtitle(text = "Source: Current Population Survey, Election Supplement")%>%
  hc_tooltip(crosshairs= list(enabled= TRUE), 
             shared = TRUE, borderWidth = 1) %>%
  hc_yAxis(title = list(text = "% who Voted"),
           labels = list(format = "{value}%"))  %>%
  hc_xAxis(title="")%>%
  hc_legend(layout = "horizontal", verticalAlign = "bottom",
            floating = FALSE, align = "center")  %>%
  hc_exporting(enabled = TRUE) %>%
  hc_colors(mar_cols)%>%
  hc_add_theme(hc_theme_db())

Education


Education & Voting


Socio-economic status and voting rates are highly correlated, as seen below. CPS data have good information on education (the income variable has approx. X % missing observations).

The sample here is large enough to break the analysis down by every education category. However, we pool data across the 1996-2012 election cycles (by presidential vs. non-presidential), since the sample size for the population below high school is still small.

Note that all education analysis is filtered to Age over 24.
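
The filter-then-pool step can be sketched in base R on invented data (the `toy` records, the short education labels and the `P`/`NP` codes are hypothetical stand-ins). Records are restricted to age over 24, then all years are pooled into education by election-type cells, reporting both the cell size and the weighted turnout.

```r
# Hypothetical pooled records; values are invented for illustration.
toy <- data.frame(
  age        = c(22, 30, 41, 58, 67, 35, 49, 73),
  education  = c("HS", "HS", "HS", "BA", "BA", "BA", "HS", "BA"),
  p_election = c("P", "P", "NP", "P", "NP", "NP", "NP", "P"),
  voted_rec  = c(0, 1, 0, 1, 1, 0, 1, 1),
  weight     = c(2, 1, 3, 1, 2, 2, 1, 1)
)

# Same age restriction as the education analysis.
adults <- toy[toy$age > 24, ]

# One cell per education x election-type combination, pooling all years.
cells <- split(adults, list(adults$education, adults$p_election), drop = TRUE)

pooled <- do.call(rbind, lapply(cells, function(d) {
  data.frame(
    education  = d$education[1],
    p_election = d$p_election[1],
    n_obs      = nrow(d),
    pct_voted  = round(weighted.mean(d$voted_rec, d$weight) * 100)
  )
}))
pooled
```

The `n_obs` column is the reason for pooling: a cell with only a handful of respondents in a single year gives a noisy rate, while the pooled cell has more observations behind it.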

edu <- filter (voter, age>24) %>%
  group_by(education, p_election) %>%
  summarise(
    Percent_who_Voted = round(weighted.mean(voted_rec, weight, na.rm = TRUE)*100))

edu<-hchart(edu, type="bar", x=education, y=Percent_who_Voted, group=p_election)%>%
  hc_title(text = "% who voted (1996--2012): By Education") %>%
  hc_plotOptions(bar = list( dataLabels = list(enabled = TRUE), enableMouseTracking = TRUE)) %>% 
  hc_subtitle(text = "Source: Current Population Survey, Election Supplement")%>%
  hc_tooltip(crosshairs= list(enabled= TRUE, color="#CDC673"), backgroundColor = "#f0f0f0",
             shared = TRUE, borderWidth = 1) %>%
  hc_yAxis(title = list(text = "% who Voted"),
           labels = list(format = "{value}%"), min=10)  %>%
  hc_xAxis(title="")%>%
  hc_legend(layout = "horizontal", verticalAlign = "bottom",
            floating = FALSE, align = "center")  %>%
  hc_exporting(enabled = TRUE) %>%
  hc_colors(p_cols)%>%
  hc_add_theme(hc_theme_elementary ())
hw_grid(edu, rowheight = 700)%>% browsable()

Education & Voting: Presidential Elections Only


The graphs below show voting rates in US presidential elections from 1996–2012 by education and gender. The graph on the right is for the White population only.

e12 <- filter(voter,p_election=="Presidential election" ) %>%
  group_by(education, sex) %>%
  summarise(
    Percent_who_Voted = round(weighted.mean(voted_rec, weight, na.rm = TRUE)*100))

e12a<-hchart(e12, type="bar", x=education, y=Percent_who_Voted, group=sex)%>%
  hc_title(text = "Voting Rates in Presidential Elections 1996-2012") %>%
  hc_plotOptions(bar = list( dataLabels = list(enabled = TRUE), enableMouseTracking = TRUE)) %>% 
  hc_subtitle(text = "All Voters")%>%
  hc_tooltip(crosshairs= list(enabled= TRUE, color="#CDC673"), backgroundColor = "#f0f0f0",
             shared = TRUE, borderWidth = 1) %>%
  hc_yAxis(title = list(text = "% who Voted"),
           labels = list(format = "{value}%"), min=10)  %>%
  hc_xAxis(title="")%>%
  hc_legend(layout = "horizontal", verticalAlign = "bottom",
            floating = FALSE, align = "center")  %>%
  hc_exporting(enabled = TRUE) %>%
  hc_colors(sex_cols)

e12w <- filter(voter, p_election=="Presidential election" & Race=="white") %>%
  group_by(education, sex) %>%
  summarise(
    Percent_who_Voted = round(weighted.mean(voted_rec, weight, na.rm = TRUE)*100))

e12b<-hchart(e12w, type="bar", x=education, y=Percent_who_Voted, group=sex)%>%
  hc_title(text = "Voting Rates in Presidential Elections 1996-2012") %>%
  hc_plotOptions(bar = list( dataLabels = list(enabled = TRUE), enableMouseTracking = TRUE)) %>% 
  hc_subtitle(text = "WHITE POPULATION ONLY")%>%
  hc_tooltip(crosshairs= list(enabled= TRUE, color="#CDC673"), backgroundColor = "#f0f0f0",
             shared = TRUE, borderWidth = 1) %>%
  hc_yAxis(title = list(text = "% who Voted"),
           labels = list(format = "{value}%"), min=10)  %>%
  hc_xAxis(title="")%>%
  hc_legend(layout = "horizontal", verticalAlign = "bottom",
            floating = FALSE, align = "center")  %>%
  hc_exporting(enabled = TRUE) %>%
  hc_colors(sex_cols)
  
  
hw_grid(e12a, e12b, ncol=2, rowheight = 600)%>% browsable()


2. ANES


The analysis below uses the Time Series Cumulative files from the American National Election Studies (ANES, 1948-2012). All analysis uses sampling weights to match the official voter turnout reported by ANES.

a1<-readRDS("./Data/anes.rds")

# voting over time
yr<-filter(a1,PresElection=="P")%>%
    group_by( year)%>%
    summarise(No.of.Obs = n(),Per_Voted = round(weighted.mean(Voted, Weight, na.rm = TRUE)*100) )
datatable(yr, 
              extensions = 'Buttons',  options = list(
                dom = 'Bfrtip', 
                buttons = c('copy', 'csv', 'excel', 'pdf', 'print')),
              caption = 'Table 1: Voting rates, Source: ANES ')%>%
  formatStyle('Per_Voted',  color = c("purple"),  fontWeight = 'bold')



Income


# voting by income
yr<-filter(a1,PresElection=="P")%>%
  group_by( year, Income33)%>%
  summarise(Per_Voted = round(weighted.mean(Voted, Weight, na.rm = TRUE)*100) )

yr<- na.omit(yr )
yr<-spread(yr,  Income33, Per_Voted)

v1<-highchart() %>%
  hc_chart(type = "line")%>%
  hc_xAxis(categories = yr$year,  
           plotBands = list(
             list(from = 0, to = 1, color = "rgba(0,0,100,0.1)", label = list(
               text = "Truman", verticalAlign = "bottom",
               style = list(color = "black"), textAlign = "center", rotation = 0, y = -5)), 
             list(from = 1, to = 3, color = "rgba(100, 0, 0, 0.1)", label = list(
               text = "Eisenhower", verticalAlign = "bottom",
               style = list(color = "#606060"), textAlign = "center", rotation = 0, y = -5)), 
             list(from = 3, to = 5, color = "rgba(0,0,100,0.1)", label = list(
               text = "Kennedy/LBJ", verticalAlign = "bottom",
               style = list(color = "#606060"), textAlign = "center", rotation = 0, y = -5)),
             list(from = 5, to = 7, color = "rgba(100, 0, 0, 0.1)",label = list(
               text = "Nixon/Ford", verticalAlign = "bottom",
               style = list(color = "#606060"), textAlign = "center", rotation = 0, y = -5)),
             list(from = 7, to = 8, color = "rgba(0,0,100,0.1)",label = list(
               text = "Carter", verticalAlign = "bottom",
               style = list(color = "#606060"), textAlign = "center", rotation = 0, y = -5)),
             list(from = 8, to = 11, color = "rgba(100, 0, 0, 0.1)",label = list(
               text = "Reagan/Bush1", verticalAlign = "bottom",
               style = list(color = "#606060"), textAlign = "center", rotation = 0, y = -5)),
             list(from = 11, to = 13, color = "rgba(0,0,100,0.1)",label = list(
               text = "Clinton", verticalAlign = "bottom",
               style = list(color = "#606060"), textAlign = "center", rotation = 0, y = -5)),
             list(from = 13, to = 15, color = "rgba(100, 0, 0, 0.1)",label = list(
               text = "Bush2", verticalAlign = "bottom",
               style = list(color = "#606060"), textAlign = "center", rotation = 0, y = -5)),
             list(from = 15, to = 18, color = "rgba(0,0,100,0.1)", label = list(
               text = "Obama", verticalAlign = "bottom",
               style = list(color = "#606060"), textAlign = "center", rotation = 0, y = -5)))) %>%
  
  hc_add_series(name = "Low Income", data = yr$Low33, dataLabels = list(enabled = TRUE)) %>%
  hc_add_series(name = "Mid Income", data = yr$Mid33, dataLabels = list(enabled = TRUE)) %>%
  hc_add_series(name = "High Income", data = yr$High33, dataLabels = list(enabled = TRUE)) %>%
  hc_title(text = "Voting Rates by Income") %>%
  hc_subtitle(text = "Source: American National Election Studies (ANES)")%>%
  hc_tooltip(crosshairs = list(enabled=TRUE),
             shared = TRUE, borderWidth = 1) %>%
   hc_legend(enabled = TRUE)  %>%
  hc_yAxis(title = list(text = "% who voted"),
           labels = list(format = "{value}%"))   %>%
  hc_exporting(enabled = TRUE)

v1 %>%
  hc_add_theme(hc_theme_tufte2(grid=FALSE))
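
The `spread()` call in the chunk above turns the long table (one row per year and income group) into a wide one (one column per income group), which is what lets each series be passed as `yr$Low33`, `yr$Mid33` and `yr$High33`. In current tidyr releases `spread()` is superseded by `pivot_wider()`. A minimal sketch of the same reshaping, assuming tidyr is installed and using invented turnout figures:

```r
library(tidyr)

# Hypothetical long-format summary; the percentages are made up.
long <- data.frame(
  year      = c(2008, 2008, 2008, 2012, 2012, 2012),
  Income33  = c("Low33", "Mid33", "High33", "Low33", "Mid33", "High33"),
  Per_Voted = c(60, 75, 88, 58, 74, 87)
)

# One row per year, one column per income group.
wide <- pivot_wider(long, names_from = Income33, values_from = Per_Voted)
wide

# Each column can now feed one chart series, e.g. wide$Low33.
```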



Education


yr<-filter(a1,PresElection=="P")%>%
  group_by( year, Education)%>%
  summarise(Per_Voted = round(weighted.mean(Voted, Weight, na.rm = TRUE)*100) )

ed_cols<-c("#212121", "#545454", "#B8B8B8", "#EEC900")
yr<- na.omit(yr )
yr<-spread(yr,  Education, Per_Voted)

highchart() %>%
  hc_chart(type = "spline")%>%
  hc_xAxis(categories = yr$year) %>% 
  hc_add_series(name = "0-8 grades", data = yr$`0-8 grades`, dataLabels = list(enabled = TRUE)) %>%
  hc_add_series(name = "High school", data = yr$`High school`, dataLabels = list(enabled = TRUE)) %>%
  hc_add_series(name = "Some college", data = yr$`Some college`, dataLabels = list(enabled = TRUE)) %>%
    hc_add_series(name = "College+", data = yr$`College+`, dataLabels = list(enabled = TRUE)) %>%
  hc_title(text = "Voting Rates by Education") %>%
  hc_subtitle(text = "Source: American National Election Studies (ANES)")%>%
  hc_tooltip(crosshairs = list(enabled=TRUE),
             shared = TRUE, borderWidth = 1) %>%
  hc_yAxis(title = list(text = "% who voted"),
           labels = list(format = "{value}%"))   %>%
  hc_legend(enabled = TRUE)  %>%
  hc_colors(ed_cols)%>%
  hc_exporting(enabled = TRUE)  %>%
  hc_add_theme(hc_theme_tufte2())



Race


yr<-filter(a1,PresElection=="P")%>%
  group_by( year, Race)%>%
  summarise(Per_Voted = round(weighted.mean(Voted, Weight, na.rm = TRUE)*100) )
r_cols<- c("#B23AEE", "#454545", "#878787")
yr<- na.omit(yr )
yr<-spread(yr,  Race, Per_Voted)

highchart() %>%
  hc_chart(type = "spline")%>%
  hc_xAxis(categories = yr$year) %>% 
  hc_add_series(name = "Black", data = yr$Black, dataLabels = list(enabled = TRUE)) %>%
  hc_add_series(name = "Hispanic", data = yr$Hispanic, dataLabels = list(enabled = TRUE)) %>%
  hc_add_series(name = "White", data = yr$White, dataLabels = list(enabled = TRUE)) %>%
  hc_title(text = "Voting Rates by Race") %>%
  hc_subtitle(text = "Source: American National Election Studies (ANES)")%>%
  hc_tooltip(crosshairs = list(enabled=TRUE),
             shared = TRUE, borderWidth = 1) %>%
  hc_yAxis(title = list(text = "% who voted"),
           labels = list(format = "{value}%"))   %>%
  hc_legend(enabled = TRUE)  %>%
  hc_colors(r_cols)%>%
  hc_exporting(enabled = TRUE)  %>%
  hc_add_theme(hc_theme_tufte2())



Political ID


yr<-filter(a1,PresElection=="P")%>%
  group_by( year, Political_ID)%>%
  summarise(Per_Voted = round(weighted.mean(Voted, Weight, na.rm = TRUE)*100) )
pol_cols<-c( "#FF4040", "#424242","#1C86EE" )
yr<- na.omit(yr )
yr<-spread(yr,  Political_ID, Per_Voted)

highchart() %>%
  hc_chart(type = "spline")%>%
  hc_xAxis(categories = yr$year) %>% 
  hc_add_series(name = "Republican", data = yr$`3. Republicans (including leaners)`, dataLabels = list(enabled = TRUE)) %>%
  hc_add_series(name = "Independent/Other", data = yr$`2. Independents`, dataLabels = list(enabled = TRUE)) %>%
  hc_add_series(name = "Democrat", data = yr$`1. Democrats (including leaners)`, dataLabels = list(enabled = TRUE)) %>%
  hc_title(text = "Voting Rates by Political ID") %>%
  hc_subtitle(text = "Source: American National Election Studies (ANES)")%>%
  hc_tooltip(crosshairs = list(enabled=TRUE),
             shared = TRUE, borderWidth = 1) %>%
  hc_yAxis(title = list(text = "% who voted"),
           labels = list(format = "{value}%"))   %>%
  hc_legend(enabled = TRUE)  %>%
  hc_colors(pol_cols)%>%
  hc_exporting(enabled = TRUE)  %>%
  hc_add_theme(hc_theme_tufte2())



Religion


yr<-filter(a1,PresElection=="P")%>%
  group_by( year, Religion)%>%
  summarise(Per_Voted = round(weighted.mean(Voted, Weight, na.rm = TRUE)*100) )

yr<- na.omit(yr )
yr<-spread(yr,  Religion, Per_Voted)
rel_cols<-c("#FF6347", "#474747", "#CD950C")
highchart() %>%
  hc_chart(type = "spline")%>%
  hc_xAxis(categories = yr$year) %>% 
  hc_add_series(name = "Protestant", data = yr$Protestant, dataLabels = list(enabled = TRUE)) %>%
  hc_add_series(name = "Catholic", data = yr$Catholic, dataLabels = list(enabled = TRUE)) %>%
  hc_add_series(name = "Jewish", data = yr$Jewish, dataLabels = list(enabled = TRUE)) %>%
  hc_title(text = "Voting Rates by Religion") %>%
  hc_subtitle(text = "Source: American National Election Studies (ANES)")%>%
  hc_tooltip(crosshairs = list(enabled=TRUE),
             shared = TRUE, borderWidth = 1) %>%
  hc_yAxis(title = list(text = "% who voted"),
           labels = list(format = "{value}%"))   %>%
  hc_exporting(enabled = TRUE)  %>%
  hc_legend(enabled = TRUE)  %>%
  hc_colors(rel_cols)%>%
  hc_add_theme(hc_theme_tufte2())



Ideology


a1$Ideology <- as.character(a1$Ideology)
a1$Ideology[a1$Ideology == "Moderate"] <- "Other"
a1$Ideology[a1$Ideology == "Don't know"] <- "Other"
ideo_cols<-c("#1C86EE", "#EE2C2C", "#4D4D4D")
# voting by ideology
yr<-filter(a1,PresElection=="P")%>%
  group_by( year, Ideology)%>%
  summarise(Per_Voted = round(weighted.mean(Voted, Weight, na.rm = TRUE)*100) )


yr<- na.omit(yr)
yr<-spread(yr,  Ideology, Per_Voted)


highchart() %>%
  hc_chart(type = "spline")%>%
  hc_xAxis(categories = yr$year) %>% 
  hc_add_series(name = "Liberal", data = yr$Liberal, dataLabels = list(enabled = TRUE)) %>%
  hc_add_series(name = "Conservative", data = yr$Conservative, dataLabels = list(enabled = TRUE)) %>%
  hc_add_series(name = "Other", data = yr$Other, dataLabels = list(enabled = TRUE)) %>%
  hc_title(text = "Voting Rates by Ideology") %>%
  hc_subtitle(text = "Source: American National Election Studies (ANES)")%>%
  hc_tooltip(crosshairs = list(enabled=TRUE),
             shared = TRUE, borderWidth = 1) %>%
  hc_yAxis(title = list(text = "% who voted"),
           labels = list(format = "{value}%"))   %>%
  hc_legend(enabled = TRUE)  %>%
  hc_colors(ideo_cols)%>%
  hc_exporting(enabled = TRUE)  %>%
  hc_add_theme(hc_theme_tufte2())




3. Appendix

Summary (CPS)


a1<-voter %>%
  group_by(year, p_election) %>%
  summarise(No.of.Obs = n(), 
            Percent_who_Voted = round(weighted.mean(voted_rec, weight, na.rm = TRUE)*100, 1))

datatable(a1, colnames = c('% who Voted' = 'Percent_who_Voted', 'Election Type' = 'p_election'),
          extensions = 'Buttons',  options = list(
            dom = 'Bfrtip', 
            buttons = c('copy', 'csv', 'excel', 'pdf', 'print')),
          caption = 'Appendix Table 1: Voting rates, Source: US Census ')%>%
  formatStyle('% who Voted',  color = c("#FF7F24"),  fontWeight = 'bold')




Cleaning Data


Code for data cleaning.

## Clean up CPS data
library(readr)  # provides read_csv()

cps_vote <- read_csv("cps_vote.csv")
voter <- cps_vote
## Renaming columns
colnames(voter) <-c("Year","FIPS Code", "state", "metro","weight", "age", "sex", "race" , "marital" , "hispanic" ,
                    "education", "edu", "vote", "voted")

voter$year<- as.factor(voter$Year)

## Cutting voter$age into voter$age_group
voter$age_group <- cut(voter$age, include.lowest=TRUE,  right=TRUE,
                       breaks=c(18, 25, 35, 45, 55, 65, 75, 99))
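
To make the binning explicit, here is a small sketch of what `cut()` returns with these breaks; the `ages` vector is a hypothetical test input chosen to hit the interval edges.

```r
breaks <- c(18, 25, 35, 45, 55, 65, 75, 99)
ages   <- c(17, 18, 25, 26, 65, 66, 99)

# right = TRUE closes each interval on the right; include.lowest = TRUE
# also closes the first interval on the left, so age 18 is kept.
age_group <- cut(ages, breaks = breaks, include.lowest = TRUE, right = TRUE)

data.frame(ages, age_group)
# 17 -> NA, 18 and 25 -> [18,25], 26 -> (25,35],
# 65 -> (55,65], 66 -> (65,75], 99 -> (75,99]
```

Ages outside 18 to 99 fall in no interval and become `NA`, so they drop out of any summary grouped by `age_group` unless handled separately.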


## Recoding voter$year into voter$p_election
voter$p_election <- as.character(voter$year)
voter$p_election[voter$year == "1996"] <- "P"
voter$p_election[voter$year == "1998"] <- "NP"
voter$p_election[voter$year == "2000"] <- "P"
voter$p_election[voter$year == "2002"] <- "NP"
voter$p_election[voter$year == "2004"] <- "P"
voter$p_election[voter$year == "2006"] <- "NP"
voter$p_election[voter$year == "2008"] <- "P"
voter$p_election[voter$year == "2010"] <- "NP"
voter$p_election[voter$year == "2012"] <- "P"


## Expanding the P/NP codes into full labels
voter$p_election[voter$p_election == "P"] <- "Presidential election"
voter$p_election[voter$p_election == "NP"] <- "Non Presidential"
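
As a possible shortcut to the year-by-year recode, the same labels can be derived arithmetically, since every presidential election year in 1996-2012 is divisible by 4. This is a sketch on a hypothetical `year` factor, not the code used to build the data.

```r
# Hypothetical year factor, mirroring how voter$year is stored.
year <- factor(c("1996", "1998", "2000", "2010", "2012"))

# Convert the factor labels (not the internal integer codes) to numbers.
yr_num <- as.integer(as.character(year))

# Presidential years in this range are exactly the multiples of 4.
p_election <- ifelse(yr_num %% 4 == 0,
                     "Presidential election", "Non Presidential")

data.frame(year, p_election)
```

Going through `as.character()` first matters: calling `as.integer()` directly on a factor returns the level positions (1, 2, 3, ...) rather than the years.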


voter$Race<-ifelse(voter$race=='White' & voter$hispanic=='Not Hispanic','white',
                   ifelse(voter$race=='Black/Negro' & voter$hispanic=='Not Hispanic','black',
                          ifelse(voter$hispanic!='Not Hispanic','hispanic','others')))

## Recoding voter$voted into voter$voted_rec (missing responses are counted as not voting)
voter$voted_rec <- as.character(voter$voted)
voter$voted_rec[is.na(voter$voted)] <- "0"
voter$voted_rec <- as.numeric(voter$voted_rec)
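
The choice to recode missing responses to 0 changes the denominator of the turnout rate. A minimal sketch with invented values (equal weights, two missing answers) shows the difference between dropping missing responses and counting them as non-voters:

```r
# Hypothetical responses: 1 = voted, 0 = did not vote, NA = no answer.
voted  <- c(1, 0, NA, 1, NA)
weight <- c(1, 1, 1, 1, 1)

# Option 1: drop missing responses (and their weights) from the rate.
drop_na <- weighted.mean(voted, weight, na.rm = TRUE) * 100

# Option 2 (the recode used in this section): treat missing as 0.
voted_rec <- voted
voted_rec[is.na(voted_rec)] <- 0
as_zero <- weighted.mean(voted_rec, weight) * 100

round(drop_na, 1)  # 66.7
as_zero            # 40
```

Because `voted_rec` no longer contains `NA` after the recode, the `na.rm = TRUE` argument in the summaries has no effect on it.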

## Reordering voter$education
voter$education <- factor(voter$education, levels=c("None or preschool", 
                                                    "Grades 1, 2, 3, or 4", "Grades 5 or 6", 
                                                    "Grades 7 or 8", "Grade 9", "Grade 10", "Grade 11", 
                                                    "12th grade, no diploma", "High school diploma or equivalent", "Some college but no degree", "Associate's degree, occupational/vocational program", "Associate's degree, academic program",
                                                    "Bachelor's degree", "Professional school degree","Master's degree", "Doctorate degree"))

## Recoding voter$education into voter$education_rec
voter$education_rec <- as.character(voter$education)
voter$education_rec[voter$education == "None or preschool"] <- "No High School"
voter$education_rec[voter$education == "Grades 1, 2, 3, or 4"] <- "No High School"
voter$education_rec[voter$education == "Grades 5 or 6"] <- "No High School"
voter$education_rec[voter$education == "Grades 7 or 8"] <- "No High School"
voter$education_rec[voter$education == "Grade 9"] <- "No High School"
voter$education_rec[voter$education == "Grade 10"] <- "No High School"
voter$education_rec[voter$education == "Grade 11"] <- "No High School"
voter$education_rec[voter$education == "12th grade, no diploma"] <- "No High School"
voter$education_rec[voter$education == "High school diploma or equivalent"] <- "High school"
voter$education_rec[voter$education == "Some college but no degree"] <- "Some college"
voter$education_rec[voter$education == "Associate's degree, occupational/vocational program"] <- "Some college"
voter$education_rec[voter$education == "Associate's degree, academic program"] <- "Some college"
voter$education_rec[voter$education == "Master's degree"] <- "Masters+"
voter$education_rec[voter$education == "Professional school degree"] <- "Masters+"
voter$education_rec[voter$education == "Doctorate degree"] <- "Masters+"

## Reordering voter$education_rec
voter$education_rec <- factor(voter$education_rec, levels=c("No High School", "High school", "Some college", "Bachelor's degree", "Masters+"))

# College Dummy
## Recoding voter$education_rec into voter$education_college
voter$education_college <- as.character(voter$education_rec)
voter$education_college[voter$education_rec == "No High School"] <- "No College"
voter$education_college[voter$education_rec == "High school"] <- "No College"
voter$education_college[voter$education_rec == "Some college"] <- "No College"
voter$education_college[voter$education_rec == "Bachelor's degree"] <- "College+"
voter$education_college[voter$education_rec == "Masters+"] <- "College+"

## Reordering voter$marital
voter$marital <- factor(voter$marital, levels=c("Never married/single", "Married, spouse present", "Married, spouse absent", "Separated", "Divorced", "Widowed"))

##Define red-blue-battle states, based on https://www.washingtonpost.com/politics/clinton-holds-clear-advantage-in-new-battleground-polls/2016/10/18/2885e3a0-94a6-11e6-bc79-af1cd3d2984b_story.html

## Recoding voter$state into voter$Red_Blue_Battle
voter$Red_Blue_Battle <- voter$state
voter$Red_Blue_Battle[voter$state == "Alabama"] <- "Red"
voter$Red_Blue_Battle[voter$state == "Alaska"] <- "Red"
voter$Red_Blue_Battle[voter$state == "Arizona"] <- "Battleground"
voter$Red_Blue_Battle[voter$state == "Arkansas"] <- "Red"
voter$Red_Blue_Battle[voter$state == "California"] <- "Blue"
voter$Red_Blue_Battle[voter$state == "Colorado"] <- "Battleground"
voter$Red_Blue_Battle[voter$state == "Connecticut"] <- "Blue"
voter$Red_Blue_Battle[voter$state == "Delaware"] <- "Blue"
voter$Red_Blue_Battle[voter$state == "District of Columbia"] <- "Blue"
voter$Red_Blue_Battle[voter$state == "Florida"] <- "Battleground"
voter$Red_Blue_Battle[voter$state == "Georgia"] <- "Battleground"
voter$Red_Blue_Battle[voter$state == "Hawaii"] <- "Blue"
voter$Red_Blue_Battle[voter$state == "Idaho"] <- "Red"
voter$Red_Blue_Battle[voter$state == "Illinois"] <- "Blue"
voter$Red_Blue_Battle[voter$state == "Indiana"] <- "Red"
voter$Red_Blue_Battle[voter$state == "Iowa"] <- "Battleground"
voter$Red_Blue_Battle[voter$state == "Kansas"] <- "Red"
voter$Red_Blue_Battle[voter$state == "Kentucky"] <- "Red"
voter$Red_Blue_Battle[voter$state == "Louisiana"] <- "Red"
voter$Red_Blue_Battle[voter$state == "Maine"] <- "Blue"
voter$Red_Blue_Battle[voter$state == "Maryland"] <- "Blue"
voter$Red_Blue_Battle[voter$state == "Massachusetts"] <- "Blue"
voter$Red_Blue_Battle[voter$state == "Michigan"] <- "Battleground"
voter$Red_Blue_Battle[voter$state == "Minnesota"] <- "Blue"
voter$Red_Blue_Battle[voter$state == "Mississippi"] <- "Red"
voter$Red_Blue_Battle[voter$state == "Missouri"] <- "Red"
voter$Red_Blue_Battle[voter$state == "Montana"] <- "Red"
voter$Red_Blue_Battle[voter$state == "Nebraska"] <- "Red"
voter$Red_Blue_Battle[voter$state == "Nevada"] <- "Battleground"
voter$Red_Blue_Battle[voter$state == "New Hampshire"] <- "Battleground"
voter$Red_Blue_Battle[voter$state == "New Jersey"] <- "Blue"
voter$Red_Blue_Battle[voter$state == "New Mexico"] <- "Battleground"
voter$Red_Blue_Battle[voter$state == "New York"] <- "Blue"
voter$Red_Blue_Battle[voter$state == "North Carolina"] <- "Battleground"
voter$Red_Blue_Battle[voter$state == "North Dakota"] <- "Red"
voter$Red_Blue_Battle[voter$state == "Ohio"] <- "Battleground"
voter$Red_Blue_Battle[voter$state == "Oklahoma"] <- "Red"
voter$Red_Blue_Battle[voter$state == "Oregon"] <- "Blue"
voter$Red_Blue_Battle[voter$state == "Pennsylvania"] <- "Battleground"
voter$Red_Blue_Battle[voter$state == "Rhode Island"] <- "Blue"
voter$Red_Blue_Battle[voter$state == "South Carolina"] <- "Red"
voter$Red_Blue_Battle[voter$state == "South Dakota"] <- "Red"
voter$Red_Blue_Battle[voter$state == "Tennessee"] <- "Red"
voter$Red_Blue_Battle[voter$state == "Texas"] <- "Battleground"
voter$Red_Blue_Battle[voter$state == "Utah"] <- "Red"
voter$Red_Blue_Battle[voter$state == "Vermont"] <- "Blue"
voter$Red_Blue_Battle[voter$state == "Virginia"] <- "Battleground"
voter$Red_Blue_Battle[voter$state == "Washington"] <- "Blue"
voter$Red_Blue_Battle[voter$state == "West Virginia"] <- "Red"
voter$Red_Blue_Battle[voter$state == "Wisconsin"] <- "Battleground"
voter$Red_Blue_Battle[voter$state == "Wyoming"] <- "Red"
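
A more compact way to express this kind of one-to-one recode is a named lookup vector. The sketch below uses a hypothetical three-state subset of the mapping and an invented `state` vector; it is an alternative pattern, not the code that produced the data.

```r
# Hypothetical subset of the state-to-category mapping above.
lookup <- c(Alabama = "Red", Arizona = "Battleground", California = "Blue")

# Invented input, including one name that is not in the lookup.
state <- c("California", "Alabama", "Arizona", "Alabama", "Puerto Rico")

# Index the named vector by state name; unmatched names become NA.
red_blue_battle <- unname(lookup[state])

data.frame(state, red_blue_battle)
```

One behavioural difference is worth noting: the line-by-line recode leaves any unlisted state with its own name as the category, whereas the lookup returns `NA`, which makes unmatched values easier to spot.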


# SAVE DATA HERE AND PUT ABOVE IN APPENDIX