2023-02-01 01:23:45 +00:00
{
"cells": [
{
"cell_type": "markdown",
2023-02-16 00:29:26 +00:00
"metadata": {},
2023-02-01 01:23:45 +00:00
"source": [
2023-02-16 00:29:26 +00:00
"# Nat Scott"
]
2023-02-01 01:23:45 +00:00
},
{
"cell_type": "markdown",
2023-02-16 00:29:26 +00:00
"metadata": {},
2023-02-01 01:23:45 +00:00
"source": [
2023-02-16 00:29:26 +00:00
"## Research question/interests\n",
"\n",
2023-02-16 00:53:33 +00:00
"**Is there a correlation between political alignment and living in neighbourhoods with large LGBT populations?** The obvious answer is \"yes, they are mostly going to be Democrats\", but anyone who has spent time around queer people will know that the picture is more nuanced than that, and this nuance is what we hope to capture in investigating this question.\n",
2023-02-16 00:29:26 +00:00
"\n",
"- The gaybourhoods data set does not include data on residents' political alignments; however, there is a wealth of electoral data freely available online that we intend to incorporate into this project. The primary difficulty will be developing a geographic \"compatibility layer\" between the data sets so that they can be understood in the same context. To build this, we intend to work with the OpenStreetMap API to create an additional column representing each observation's position in a more neutral way, such as its coordinates.\n",
"- Alternatively, we have considered working with an additional data set that links US zip codes to their longitude and latitude. Incorporating this data would then be as easy as merging the two tables.\n",
"\n",
"\n",
2023-02-16 00:53:33 +00:00
"**Is there a correlation between geographical strata and being LGBT?** This question is more abstract, and will serve as a preliminary exploration of the data in the hope of establishing two key details that will shape the rest of the project: how do we quantify queerness, and how do we best represent it visually?\n",
2023-02-16 00:29:26 +00:00
"\n",
"- Once again, representing this data visually will require determining the coordinates associated with each observation.\n",
"- The gaybourhoods data set defines a \"gaybourhood index\" which effectively measures how friendly a given neighbourhood is to queer people. Since this index is entirely subjective, we will need to closely evaluate its usefulness for our project and investigate different ways to quantify \"queer-friendliness\".\n",
"- Building on the last point, any choice of measurement will still be subjective, so answering this research question will mostly take the form of comparing and contrasting different measurements to see what they tell us.\n",
2023-02-16 00:31:05 +00:00
"- Visualizing this question, like many aspects of the other research questions, involves projecting the data onto a map of the United States, so the visualizations built here will motivate many of those used in other components of this project."
2023-02-16 00:29:26 +00:00
]
},
2023-03-14 01:43:11 +00:00
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Analysis Pipeline"
]
},
2023-02-16 00:53:33 +00:00
{
"cell_type": "code",
2023-03-14 01:43:11 +00:00
"execution_count": 3,
2023-03-02 04:25:34 +00:00
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import seaborn as sns"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
2023-03-14 01:43:11 +00:00
"### Loading the data"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"## counties - Relating US counties to their long/lat position on the Earth\n",
"counties = pd.read_csv(\"../data/raw/us-county-boundaries.csv\", sep=\";\")\n",
"\n",
"## pol - County-level American presidential election results, 2000-2020 (filtered to 2012 during cleaning)\n",
"pol = pd.read_csv(\"../data/raw/countypres_2000-2020.csv\")\n",
"\n",
"## gb - the gaybourhoods dataset\n",
"gb = pd.read_csv(\"../data/raw/gaybourhoods.csv\")\n",
"\n",
"## cords - mapping zip codes to long/lat coordinates\n",
"cords = pd.read_csv(\"../data/raw/zip_lat_long.csv\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Cleaning the data"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"counties = counties.rename({\n",
" \"NAME\": \"name\",\n",
" \"INTPTLAT\": \"lat\",\n",
" \"INTPTLON\": \"long\",\n",
"}, axis=\"columns\")"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"# We only want 2012--the latest election before the gb data was collected\n",
"pol = pol[pol[\"year\"] == 2012].reset_index()\n",
"\n",
"# Get rid of undesirable columns\n",
"pol = pol.drop([\n",
" \"year\", \"state\", \"county_fips\", \"office\",\n",
" \"candidate\", \"version\", \"mode\", \"index\",\n",
"], axis=\"columns\")\n",
"\n",
"# Change the column names to make them a little more friendly\n",
"pol.rename({\n",
" \"county_name\": \"county\",\n",
" \"state_po\": \"state\",\n",
" \"candidatevotes\": \"votes\",\n",
" \"totalvotes\": \"total\"\n",
"}, axis=\"columns\", inplace=True)\n",
"\n",
"# Capitalize each value (first letter uppercase, the rest lowercase)\n",
"pol[\"county\"] = pol[\"county\"].apply(lambda x: x.capitalize())\n",
"pol[\"party\"] = pol[\"party\"].apply(lambda x: x.capitalize())"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"# Let's add long/lat columns to gb\n",
"gb = gb.merge(cords, left_on=\"GEOID10\", right_on=\"ZIP\")\n",
"\n",
"# Get rid of unneeded columns\n",
"gb = gb.drop([\n",
" \"Mjoint_MF\", \"Mjoint_SS\", \"Mjoint_FF\", \"Mjoint_MM\",\n",
" \"Cns_TotHH\", \"Cns_UPSS\", \"Cns_UPFF\", \"Cns_UPMM\",\n",
" \"ParadeFlag\", \"FF_Tax\", \"FF_Cns\", \"MM_Tax\", \"MM_Cns\",\n",
" \"SS_Index_Weight\", \"Parade_Weight\", \"Bars_Weight\",\n",
" \"GEOID10\", \"ZIP\",\n",
"], axis=\"columns\")\n",
"\n",
"# There's a lot of info baked into some of these columns. Especially the composite indexes.\n",
"# We'll leave their names as is for easy reference even if they're a little ugly.\n",
"gb = gb.rename({\n",
" \"LAT\": \"lat\",\n",
" \"LNG\": \"long\",\n",
"}, axis=\"columns\")"
]
},
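{
"cell_type": "markdown",
"metadata": {},
"source": [
"The merge above is an inner join, so any `gb` row whose `GEOID10` has no matching `ZIP` is silently dropped. Below is a minimal sketch of how one might count such rows. It uses made-up toy tables; the names `toy_gb`, `toy_cords` and `checked`, and all values in them, are illustrative assumptions rather than part of the project data."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Toy stand-ins for gb and cords (invented values, for illustration only)\n",
"toy_gb = pd.DataFrame({\"GEOID10\": [90069, 94114, 99999], \"TOTINDEX\": [67.1, 61.9, 1.0]})\n",
"toy_cords = pd.DataFrame({\"ZIP\": [90069, 94114], \"LAT\": [34.09, 37.76], \"LNG\": [-118.38, -122.44]})\n",
"\n",
"# A left join with indicator=True keeps every toy_gb row and records whether it matched\n",
"checked = toy_gb.merge(toy_cords, left_on=\"GEOID10\", right_on=\"ZIP\", how=\"left\", indicator=True)\n",
"\n",
"# Rows tagged \"left_only\" have no coordinates and would vanish in an inner join\n",
"print(checked[\"_merge\"].value_counts())"
]
},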
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Process/Wrangle the data"
2023-03-02 04:25:34 +00:00
]
},
{
"cell_type": "code",
2023-03-14 01:43:11 +00:00
"execution_count": 11,
2023-03-02 04:25:34 +00:00
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>name</th>\n",
" <th>lat</th>\n",
" <th>long</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Hancock OH</td>\n",
" <td>41.000471</td>\n",
" <td>-83.666033</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Stafford VA</td>\n",
" <td>38.413261</td>\n",
" <td>-77.451334</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Webster NE</td>\n",
" <td>40.180646</td>\n",
" <td>-98.498590</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Dimmit TX</td>\n",
" <td>28.423587</td>\n",
" <td>-99.765871</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Cedar IA</td>\n",
" <td>41.772360</td>\n",
" <td>-91.132610</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" name lat long\n",
"0 Hancock OH 41.000471 -83.666033\n",
"1 Stafford VA 38.413261 -77.451334\n",
"2 Webster NE 40.180646 -98.498590\n",
"3 Dimmit TX 28.423587 -99.765871\n",
"4 Cedar IA 41.772360 -91.132610"
]
},
2023-03-14 01:43:11 +00:00
"execution_count": 11,
2023-03-02 04:25:34 +00:00
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Combine the county name with the state code\n",
"def combine_name_state(row):\n",
" row[\"name\"] = f\"{row['name']} {row['STUSAB']}\"\n",
" return row\n",
"\n",
"counties = counties.apply(combine_name_state, axis=\"columns\")\n",
"\n",
"# We don't need this column anymore\n",
"counties = counties.drop([\"STUSAB\"], axis=\"columns\")\n",
"\n",
"counties.to_csv(\"../data/processed/us-county-boundaries.csv\")\n",
"counties.head()"
]
},
{
"cell_type": "code",
2023-03-14 01:43:11 +00:00
"execution_count": 12,
2023-03-02 04:25:34 +00:00
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>party</th>\n",
" <th>votes</th>\n",
" <th>total</th>\n",
" <th>percent</th>\n",
" <th>lat</th>\n",
" <th>long</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Democrat</td>\n",
" <td>6363</td>\n",
" <td>23932</td>\n",
" <td>0.265878</td>\n",
" <td>32.532237</td>\n",
" <td>-86.646439</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Republican</td>\n",
" <td>17379</td>\n",
" <td>23932</td>\n",
" <td>0.726183</td>\n",
" <td>32.532237</td>\n",
" <td>-86.646439</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Other</td>\n",
" <td>190</td>\n",
" <td>23932</td>\n",
" <td>0.007939</td>\n",
" <td>32.532237</td>\n",
" <td>-86.646439</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Democrat</td>\n",
" <td>18424</td>\n",
" <td>85338</td>\n",
" <td>0.215894</td>\n",
" <td>30.659218</td>\n",
" <td>-87.746067</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Republican</td>\n",
" <td>66016</td>\n",
" <td>85338</td>\n",
" <td>0.773583</td>\n",
" <td>30.659218</td>\n",
" <td>-87.746067</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
2023-03-02 07:42:00 +00:00
" party votes total percent lat long\n",
"0 Democrat 6363 23932 0.265878 32.532237 -86.646439\n",
"1 Republican 17379 23932 0.726183 32.532237 -86.646439\n",
"2 Other 190 23932 0.007939 32.532237 -86.646439\n",
"3 Democrat 18424 85338 0.215894 30.659218 -87.746067\n",
"4 Republican 66016 85338 0.773583 30.659218 -87.746067"
2023-03-02 04:25:34 +00:00
]
},
2023-03-14 01:43:11 +00:00
"execution_count": 12,
2023-03-02 04:25:34 +00:00
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Combine the county name with the state code\n",
"def combine_name_state(row):\n",
" row[\"county\"] = f\"{row['county']} {row['state']}\"\n",
" return row\n",
"\n",
"pol = pol.apply(combine_name_state, axis=\"columns\")\n",
"\n",
"# Add a percent column which will be useful when graphing\n",
"pol[\"percent\"] = pol[\"votes\"] / pol[\"total\"]\n",
"\n",
"# Attach long/lat data to each row\n",
"pol = pol.merge(counties, left_on=\"county\", right_on=\"name\")\n",
"\n",
"# Now we can get rid of the state, name and county columns\n",
2023-03-02 07:42:00 +00:00
"pol = pol.drop([\"state\", \"name\", \"county\"], axis=\"columns\")\n",
2023-03-02 04:25:34 +00:00
"\n",
"pol.to_csv(\"../data/processed/election-2012.csv\", index=False)\n",
"pol.head()"
]
},
{
"cell_type": "code",
"execution_count": 87,
2023-02-16 00:53:33 +00:00
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Tax_Mjoint</th>\n",
" <th>TaxRate_SS</th>\n",
" <th>TaxRate_FF</th>\n",
" <th>TaxRate_MM</th>\n",
2023-03-02 04:25:34 +00:00
" <th>Cns_RateSS</th>\n",
" <th>Cns_RateFF</th>\n",
" <th>Cns_RateMM</th>\n",
" <th>CountBars</th>\n",
2023-02-16 00:53:33 +00:00
" <th>FF_Index</th>\n",
" <th>MM_Index</th>\n",
" <th>SS_Index</th>\n",
" <th>TOTINDEX</th>\n",
2023-03-02 04:25:34 +00:00
" <th>lat</th>\n",
" <th>long</th>\n",
2023-02-16 00:53:33 +00:00
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>2120</td>\n",
" <td>203.301887</td>\n",
" <td>28.773585</td>\n",
" <td>174.528302</td>\n",
2023-03-02 04:25:34 +00:00
" <td>77.125329</td>\n",
" <td>6.931719</td>\n",
" <td>70.193610</td>\n",
" <td>15</td>\n",
2023-02-16 00:53:33 +00:00
" <td>6.724415</td>\n",
" <td>48.288254</td>\n",
" <td>55.012669</td>\n",
" <td>67.077054</td>\n",
2023-03-02 04:25:34 +00:00
" <td>34.093828</td>\n",
" <td>-118.381697</td>\n",
2023-02-16 00:53:33 +00:00
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>5080</td>\n",
" <td>205.511811</td>\n",
" <td>33.464567</td>\n",
" <td>172.047244</td>\n",
2023-03-02 04:25:34 +00:00
" <td>88.478367</td>\n",
" <td>15.617404</td>\n",
" <td>72.860963</td>\n",
" <td>17</td>\n",
2023-02-16 00:53:33 +00:00
" <td>9.834048</td>\n",
" <td>48.578469</td>\n",
" <td>58.412517</td>\n",
" <td>61.866815</td>\n",
2023-03-02 04:25:34 +00:00
" <td>37.758057</td>\n",
" <td>-122.435410</td>\n",
2023-02-16 00:53:33 +00:00
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>5790</td>\n",
" <td>107.772021</td>\n",
" <td>16.753022</td>\n",
" <td>91.018998</td>\n",
2023-03-02 04:25:34 +00:00
" <td>46.771050</td>\n",
" <td>5.745582</td>\n",
" <td>41.025469</td>\n",
" <td>5</td>\n",
2023-02-16 00:53:33 +00:00
" <td>4.370779</td>\n",
" <td>26.360413</td>\n",
" <td>30.731192</td>\n",
" <td>37.908747</td>\n",
2023-03-02 04:25:34 +00:00
" <td>40.742039</td>\n",
" <td>-74.000620</td>\n",
2023-02-16 00:53:33 +00:00
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>3510</td>\n",
" <td>80.056980</td>\n",
" <td>21.082621</td>\n",
" <td>58.974359</td>\n",
2023-03-02 04:25:34 +00:00
" <td>31.619291</td>\n",
" <td>9.315448</td>\n",
" <td>22.303843</td>\n",
" <td>10</td>\n",
2023-02-16 00:53:33 +00:00
" <td>6.055939</td>\n",
" <td>15.939869</td>\n",
" <td>21.995808</td>\n",
" <td>37.530067</td>\n",
2023-03-02 04:25:34 +00:00
" <td>40.734012</td>\n",
" <td>-74.006746</td>\n",
2023-02-16 00:53:33 +00:00
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>2660</td>\n",
" <td>91.353383</td>\n",
" <td>12.781955</td>\n",
" <td>78.571429</td>\n",
2023-03-02 04:25:34 +00:00
" <td>21.763042</td>\n",
" <td>3.142678</td>\n",
" <td>18.620365</td>\n",
" <td>9</td>\n",
2023-02-16 00:53:33 +00:00
" <td>3.004058</td>\n",
" <td>18.280165</td>\n",
" <td>21.284224</td>\n",
" <td>35.843573</td>\n",
2023-03-02 04:25:34 +00:00
" <td>37.773134</td>\n",
" <td>-122.411167</td>\n",
2023-02-16 00:53:33 +00:00
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
2023-03-02 04:25:34 +00:00
" Tax_Mjoint TaxRate_SS TaxRate_FF TaxRate_MM Cns_RateSS Cns_RateFF \\\n",
"0 2120 203.301887 28.773585 174.528302 77.125329 6.931719 \n",
"1 5080 205.511811 33.464567 172.047244 88.478367 15.617404 \n",
"2 5790 107.772021 16.753022 91.018998 46.771050 5.745582 \n",
"3 3510 80.056980 21.082621 58.974359 31.619291 9.315448 \n",
"4 2660 91.353383 12.781955 78.571429 21.763042 3.142678 \n",
2023-02-16 00:53:33 +00:00
"\n",
2023-03-02 04:25:34 +00:00
" Cns_RateMM CountBars FF_Index MM_Index SS_Index TOTINDEX \\\n",
"0 70.193610 15 6.724415 48.288254 55.012669 67.077054 \n",
"1 72.860963 17 9.834048 48.578469 58.412517 61.866815 \n",
"2 41.025469 5 4.370779 26.360413 30.731192 37.908747 \n",
"3 22.303843 10 6.055939 15.939869 21.995808 37.530067 \n",
"4 18.620365 9 3.004058 18.280165 21.284224 35.843573 \n",
2023-02-16 00:53:33 +00:00
"\n",
2023-03-02 04:25:34 +00:00
" lat long \n",
"0 34.093828 -118.381697 \n",
"1 37.758057 -122.435410 \n",
"2 40.742039 -74.000620 \n",
"3 40.734012 -74.006746 \n",
"4 37.773134 -122.411167 "
2023-02-16 00:53:33 +00:00
]
},
2023-03-02 04:25:34 +00:00
"execution_count": 87,
2023-02-16 00:53:33 +00:00
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
2023-03-02 04:25:34 +00:00
"gb.to_csv(\"../data/processed/gaybourhoods-nat.csv\")\n",
"gb.head()"
2023-02-16 00:53:33 +00:00
]
2023-03-02 07:42:00 +00:00
},
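{
"cell_type": "markdown",
"metadata": {},
"source": [
"Both `gb` and `pol` now carry `lat`/`long` columns, which is the \"compatibility layer\" described in the research questions. One possible way to link a zip code to a county is to pick the county whose coordinates are closest by great-circle distance. The sketch below only illustrates that idea under stated assumptions: `haversine_km` and `nearest_county` are hypothetical helpers, the toy coordinates are copied from the table previews above, and the nearest point is not guaranteed to be the county a zip code actually belongs to."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"EARTH_RADIUS_KM = 6371.0  # mean Earth radius\n",
"\n",
"def haversine_km(lat1, lon1, lat2, lon2):\n",
"    \"\"\"Great-circle distance in km; inputs are in degrees and may be broadcastable arrays.\"\"\"\n",
"    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))\n",
"    a = np.sin((lat2 - lat1) / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2\n",
"    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))\n",
"\n",
"def nearest_county(zip_lat, zip_lon, county_lat, county_lon):\n",
"    \"\"\"Return, for each zip code, the index of the closest county point.\"\"\"\n",
"    # Broadcasting builds a (n_zips, n_counties) distance matrix\n",
"    dist = haversine_km(zip_lat[:, None], zip_lon[:, None], county_lat[None, :], county_lon[None, :])\n",
"    return dist.argmin(axis=1)\n",
"\n",
"# Toy data: two zip codes and three county points\n",
"toy_zip_lat = np.array([34.093828, 40.742039])\n",
"toy_zip_lon = np.array([-118.381697, -74.000620])\n",
"toy_county_lat = np.array([41.000471, 38.413261, 28.423587])\n",
"toy_county_lon = np.array([-83.666033, -77.451334, -99.765871])\n",
"\n",
"print(nearest_county(toy_zip_lat, toy_zip_lon, toy_county_lat, toy_county_lon))"
]
},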
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Exploratory Data Analysis\n",
"\n",
"In the previous section we got a pretty good idea of what the data looks like and managed to condense it to fewer key variables that will be useful when answering the research questions established above. Now, we will compare these variables, attempt to create some plots, and see if we can't uncover any interesting relationships that aren't evident by looking at the numbers alone.\n",
"\n",
"Let's start with a scatterplot of the `gb` data set. I have no idea what this is going to look like. Let's see:"
]
},
{
"cell_type": "code",
"execution_count": 109,
"metadata": {},
"outputs": [
{
"data": {
2023-03-03 09:13:54 +00:00
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAjIAAAHHCAYAAACle7JuAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/P9b71AAAACXBIWXMAAA9hAAAPYQGoP6dpAABpOklEQVR4nO3deVhUZf8G8Hv2YYAZYAYEFBEdFDeS1EoBM9c3NXPp16Ll2vamWfZWSq+5VKbtWqRppVlpq9lipWlmqWWvKe5LYu7iAsIMwzD78/uDmBhBRASGwftzXVw65zxz5jtnhpmb5zzPORIhhAARERFRAJL6uwAiIiKi6mKQISIiooDFIENEREQBi0GGiIiIAhaDDBEREQUsBhkiIiIKWAwyREREFLAYZIiIiChgMcgQERFRwGKQoQp1794d3bt393cZV5X169dDIpFg/fr1/i7FxwcffICkpCQoFAqEhYVd9v2PHDkCiUSCl19+ueaLq2dKX8PPP//c36X4mD59OiQSCXJzc/1dildtf8bU198nqnkMMgHsvffeg0QigVqtxsmTJ8ut7969O9q1a+eHyqrP4/Hg/fffx/XXX4+IiAiEhoaiZcuWGDFiBDZv3lwrj2m1WjF9+vQKP/C+++47TJ8+vVYetybVVp379+/HqFGj0KJFC7z99ttYuHBhnddwuex2O9544w2kpaUhPDwcSqUSsbGxGDhwID766CO43W5/l0j1XH15L1dm2bJlmDNnjr/LqBcYZBoAu92O2bNn1+g2f/jhB/zwww81us2qmDBhAkaOHImYmBhMnz4dL7zwAm6++WZs3rwZq1atqpXHtFqtmDFjxkWDzIwZM2rlcWtSbdW5fv16eDwezJ07F6NGjcLtt99e5zVcjnPnziE1NRUTJkxASEgIpkyZggULFuDhhx9GUVERhg0bhueff96vNVL9Vx/ey5fCIPMPub8LoCvXoUMHvP3228jIyEBsbGyNbFOpVNbIdi7HmTNnMG/ePNx3333l/vKfM2cOzp07V+c1Xe3Onj0LANU6pOQP99xzD7KysrB8+XIMGTLEZ11GRgb++OMPHDhwwE/V1T2XywWPx+PvMohqFXtkGoCnnnoKbre7Sr0yixcvRo8ePRAVFQWVSoU2bdpg/vz55dqVPX595swZyOXyCv9COXDgACQSCTIzM73LCgoK8OijjyIuLg4qlQpGoxEvvPDCJT9QDx8+DCEEUlNTy62TSCSIioryWVZQUICJEyeiWbNmUKlUaNKkCUaMGOEdB+BwODB16lR07NgROp0OwcHBSE9Px08//eTdxpEjRxAZGQkAmDFjBiQSCSQSCaZPn45Ro0bhzTff9D5+6U8pj8eDOXPmoG3btlCr1WjUqBEeeOAB5Ofn+9TZrFkzDBgwAD/88AM6dOgAtVqNNm3a4Isvvqh0f5T67LPP0LFjRwQFBcFgMODuu+/2OZR4qTovZt68eWjbti1UKhViY2Mxbtw4FBQU+NQ9bdo0AEBkZKR3v1SkqjUsXLgQLVq0gEqlQufOnbFly5Zybfbv34/bbrsNERERUKvV6NSpE77++utLPp/ffvsNq1evxv33318uxJTq1KkThg8f7r1dlfeIEALNmjXDrbfeWm57NpsNOp0ODzzwgM9yt9uNp556CtHR0QgODsbAgQNx/Pjxcve/1GsLXHwsyahRo9CsWTPv7bJjkebMmePdz3v37vW2KSgowKhRoxAWFgadTofRo0fDarX6bNflcuHZZ5/13r9Zs2Z46qmnYLfby9VwqfdQqdLXPSgoCNdddx02bNhQrg0AvPHGG2jbti00Gg3Cw8PRqVMnLFu2rMK2ZZ04cQKDBg1CcHAwoqKiMHHixArr3bBhA/7v//4PTZs2hUqlQlxcHCZOnIji4mJvm0u9l19++WV07doVer0eQUFB6NixY5XHRB08eBBDhw5FdHQ01Go1mjRpgjvvvBMmk8mn3Ycffuh9X0RERODOO+
/0ef90794d3377LY4ePeqtr+x74aojKGAtXrxYABBbtmwRY8aMEWq1Wpw8edK7/sYbbxRt27b1uU/nzp3FqFGjxGuvvSbeeOMN0adPHwFAZGZm+rS78cYbxY033ui93aNHD9GmTZtyNcyYMUPIZDJx+vRpIYQQRUVFIjk5Wej1evHUU0+Jt956S4wYMUJIJBLxyCOPVPp8Tp06JQCI/v37i6KiokrbFhYWinbt2gmZTCbuu+8+MX/+fPHss8+Kzp07i6ysLCGEEOfOnRMxMTHiscceE/PnzxcvvviiaNWqlVAoFN42FotFzJ8/XwAQgwcPFh988IH44IMPxI4dO8Svv/4qevfuLQB4l3/wwQfeGu69914hl8vFfffdJ9566y0xadIkERwcLDp37iwcDoe3XXx8vGjZsqUICwsTkydPFq+++qpo3769kEql4ocffvC2++mnnwQA8dNPP3mXlb7GnTt3Fq+99pqYPHmyCAoKEs2aNRP5+flCCHHJOisybdo0AUD06tVLvPHGG2L8+PFCJpP51L5ixQoxePBgAUDMnz/fu18qUlkNhw8fFgBESkqKMBqN4oUXXhAvvviiMBgMokmTJj77avfu3UKn04k2bdqIF154QWRmZopu3boJiUQivvjii0qfU0ZGhgAgNm7cWGm7sqryHhFCiP/+979CoVCIvLw8n/t/+umnAoD45ZdfhBD/vIbt27cXycnJ4tVXXxWTJ08WarVatGzZUlitVu99q/LaClH+d7HUyJEjRXx8vPd26X5u06aNaN68uZg9e7Z47bXXxNGjR72vd0pKihgyZIiYN2+euPfeewUA8eSTT5bbLgBx2223iTfffFOMGDFCABCDBg3yaVeV95AQQrzzzjsCgOjatat4/fXXxaOPPirCwsJE8+bNfZ7XwoULvY+7YMECMXfuXDF27FgxYcKESl9Dq9UqWrZsKdRqtXjyySfFnDlzRMeOHUVycnK536eHH35Y9OvXTzz//PNiwYIFYuzYsUImk4nbbrvN2+ZSv09NmjQRDz30kMjMzBSvvvqquO666wQAsXLlykrrtNvtIiEhQcTGxornnntOvPPOO2LGjBmic+fO4siRI952zz33nJBIJOKOO+4Q8+bNEzNmzBAGg8HnffHDDz+IDh06CIPB4K1vxYoVlT5+Q8YgE8DKBplDhw4JuVzu80tfUZAp+0Faqm/fvqJ58+Y+yy788FywYIEAIHbt2uXTrk2bNqJHjx7e288++6wIDg4Wf/75p0+7yZMnC5lMJo4dO1bpcyr90AwPDxeDBw8WL7/8sti3b1+5dlOnThUAKvxy83g8QgghXC6XsNvtPuvy8/NFo0aNxJgxY7zLzp07JwCIadOmldvWuHHjREV5f8OGDQKAWLp0qc/yVatWlVseHx8vAIjly5d7l5lMJhETEyNSUlK8yy4MMg6HQ0RFRYl27dqJ4uJib7uVK1cKAGLq1KmXrLMiZ8+eFUqlUvTp00e43W7v8szMTAFALFq0yLus9Mvq3Llzl9zuxWoo/YLV6/Xi/Pnz3uVfffWVACC++eYb77KePXuK9u3bC5vN5l3m8XhE165dRWJiYqWPXxq6CgoKfJYXFxeLc+fOeX/KhoSqvkcOHDjgDXRlDRw4UDRr1sz7nit9DRs3bizMZrO3XWngmTt3rhDi8l7byw0yWq1WnD171qdt6etY9jmV7jO9Xu+9vX37dgFA3HvvvT7tHn/8cQFArFu3TghR9fdQ6fPs0KGDz34uDS1ln9ett95a7vOqKubMmSMAiE8//dS7rKioSBiNxnJBpqLPv1mzZgmJRCKOHj3qXVbZ79OF23A4HKJdu3Y+n4MVycrKEgDEZ599dtE2R44cETKZTMycOdNn+a5du4RcLvdZ3r9/f5/X/2rGQ0sNRPPmzXHPPfdg4cKFyMnJuWi7oKAg7/9NJhNyc3Nx44034q+//irXvVnWkCFDIJfL8cknn3iX7d69G3v37sUdd9zhXfbZZ5
8hPT0d4eHhyM3N9f706tULbrcbv/zyS6XPY/HixcjMzERCQgJWrFiBxx9/HK1bt0bPnj19utyXL1+Oa665BoMHDy6
2023-03-02 07:42:00 +00:00
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"plot_naive_scatter1 = sns.scatterplot(data=gb, x=\"long\", y=\"lat\")\n",
"_ = plot_naive_scatter1.set(\n",
" xlabel=\"Longitude\",\n",
" ylabel=\"Latitude\",\n",
" title=\"Naive Scatterplot of the Gaybourhoods data set\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here the data does appear to trace the outline of the United States, probably because most of the cities the data set covers are along the coast. We can do something similar with the election data we cleaned earlier:"
]
},
{
"cell_type": "code",
"execution_count": 108,
"metadata": {},
"outputs": [
{
"data": {
2023-03-03 09:13:54 +00:00
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAlAAAAHHCAYAAABwaWYjAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/P9b71AAAACXBIWXMAAA9hAAAPYQGoP6dpAAEAAElEQVR4nOx9e3wTVfr+k0yTJm16swqCVhAqlEukX8AWpRVWQcCy2ip4Aa1isawLUhfkZnGVpYDgDRRUqixuWVgRXJAVisK6rsB6AwRBpIIK1h8oWntLmzTTyfz+SM70zGRuKSBoz/P5+JHOTGbOnExynrzv8z6vRRRFEQwMDAwMDAwMDKZhPdcDYGBgYGBgYGD4tYERKAYGBgYGBgaGCMEIFAMDAwMDAwNDhGAEioGBgYGBgYEhQjACxcDAwMDAwMAQIRiBYmBgYGBgYGCIEIxAMTAwMDAwMDBECEagGBgYGBgYGBgiBCNQDAwMDAwMDAwR4jdHoAYPHozBgwef62G0Kbz33nuwWCx47733zvVQZFi1ahXS0tJgs9mQmJh4rodz3uDee+9F586dZdssFgsef/xx6e9XX30VFosFx44d+0XHtnXrVqSnp8PhcMBisaCmpuYXvf6ZwLFjx2CxWPDUU0+d66EwRADlZ0ALjz/+OCwWy9kfkAbUPr9m0blzZ9x7773nxVhOB+Qz9uqrr/7i16ZxTggU+XJ2OBz4f//v/4XtHzx4MHr37n0ORtZ6BAIBlJWVITMzExdccAHi4uLQrVs35Ofn48MPPzwr12xsbMTjjz+uSly2bNli6svgXONsjfPw4cO499570bVrV7z88ssoLS1VPe7KK6/EZZddBr2ORgMHDkT79u3R3Nxs6tqHDh3C448/fkbJR+fOnWGxWAz/O9dfKKeDqqoq3HbbbXA6nVi2bBlWrVqF2NjYcz0sTZzrz1hVVRWefPJJXHvttbjooouQmJiIAQMGYO3atarHNzU1YcaMGejYsSOcTicyMzOxbdu2sOPeeecdFBQUoHfv3uA4TnOBPHz4MKZPn4709HTExcWhQ4cOyMnJwe7du02Nn6wD5D+Hw4Fu3bph0qRJ+OGHH0zPw28RJ06cwOOPP459+/adk+vrfcf84Q9/+MXGsWbNGixevPgXu16kiDqXF29qasITTzyB559//oyd85133jlj54oEkydPxrJly3DzzTdj7NixiIqKQkVFBcrLy9GlSxcMGDDgjF+zsbERc+bMAYCwqNuWLVuwbNmy855Ena1xvvfeewgEAliyZAlSU1M1jxs7dixmzpyJHTt24Nprrw3bf+zYMXzwwQeYNGkSoqLMfVwOHTqEOXPmYPDgwWfs19nixYvh8Xikv7ds2YJ//OMfePbZZ3HhhRdK26+55pozcr27774bd9xxB6Kjo8/I+czgk08+QX19PebOnYshQ4b8YtdtLc71Z+yDDz5AcXExbrzxRsyePRtRUVF44403cMcdd0jPII17770X69evx0MPPYQrrrgCr776Km688Ub85z//QVZWlnTcmjVrsHbtWvTt2xcdO3bUvP4rr7yCFStW4NZbb8Uf//hH1NbWYvny5RgwYAC2bt1q+j38y1/+gssvvxw+nw87d+7Eiy++iC1btuDgwYOIiYlp3eS0Al6v1/Rn/GzjxIkTmDNnDjp37oz09HTZvpdffhmBQOCsj2Ho0KHIz88P296tW7ezfm2CNWvW4ODBg3jooYdk2zt16gSv1wubzfaLjUUN5/RpSU9Px8svv4xZs2bpflAjgd1uPyPniQQ//PADXnjhBdx///1hkY7Fixfjxx9//MXH1NZx6tQpADBM3Y0ZMwazZs3CmjVrVAnUP/7xD4iiiLFjx56NYZpGbm6u7O/vv/8e//jHP5Cbm3tWQugcx4HjuDN+Xj2Yfc+A4I+HX3JxPR/Rq1cvHDlyBJ06dZK2/fGPf8SQIUOwcOFCTJ8+XYrgffzxx3jttdfw5JNP4u
GHHwYA5Ofno3fv3pg+fTr+97//SeeYP38+Xn75ZdhsNowcORIHDx5Uvf6dd96Jxx9/HC6XS9p23333oUePHnj88cdNE6gRI0agf//+AIDx48cjOTkZzzzzDN58803ceeedqq9paGg449FJh8NxRs93tvBLkYZu3brhrrvu+kWuFSlIxPJc45xqoB555BEIgoAnnnjC8NiVK1fiuuuuQ7t27RAdHY2ePXvixRdfDDuO1kD98MMPiIqKCvslBgAVFRWwWCxYunSptK2mpgYPPfQQUlJSEB0djdTUVCxcuNCQ7X/zzTcQRREDBw4M22exWNCuXTvZtpqaGvzpT39C586dER0djUsvvRT5+fn46aefAAB+vx9//vOf0a9fPyQkJCA2NhbZ2dn4z3/+I53j2LFjuOiiiwAAc+bMkcKrjz/+OO69914sW7ZMuj75jyAQCGDx4sXo1asXHA4H2rdvjwkTJqC6ulo2zs6dO2PkyJF45513JF1Kz5498c9//lN3PgjWrVuHfv36wel04sILL8Rdd90lS9kajVMLL7zwAnr16oXo6Gh07NgREydOlGllOnfujMceewwAcNFFF+lqG1JSUnDttddi/fr14Hk+bP+aNWvQtWtXZGZmAgA+/fRTjBgxAvHx8XC5XLj++utlKdpXX30Vo0ePBgD87ne/k+6JTrOWl5cjOzsbsbGxiIuLQ05ODj7//HPD+zbCm2++iZycHHTs2BHR0dHo2rUr5s6dC0EQIj6XlgbKzNi///57jBs3Dpdeeimio6PRoUMH3HzzzbopzcGDB+Oee+4BAFx11VWwWCySVoOk9Pfs2YNrr70WMTExeOSRRwAESVdBQQHat28Ph8OBPn364G9/+5vs3LQmadmyZejSpQtiYmJwww03oLKyEqIoYu7cubj00kvhdDpx88034+eff9adH7PPbmlpKbp27Yro6GhcddVV+OSTT8KOOXz4MEaNGoULLrgADocD/fv3x6ZNm3SvDwCXX365jDyRseTm5qKpqQlff/21tH39+vXgOA6FhYXSNofDgYKCAnzwwQeorKyUtnfs2NHUIt2vXz8ZeQKA5ORkZGdn44svvjB8vRauu+46AMHvVSA41y6XC1999RVuvPFGxMXFST9ozH6X7d69G8OGDcOFF14Ip9OJyy+/HPfdd5/sGLXviZ07d+Kqq66Cw+FA165dsXz5cs1x//3vf5e+7y644ALccccdsnkFWp7lQ4cO4Xe/+x1iYmJwySWXYNGiRdIx7733Hq666ioAwLhx48LS82q6o6eeegrXXHMNkpOT4XQ60a9fP6xfv15nls8ezL4nQPD7ZNCgQYiLi0N8fDyuuuoqrFmzBkBwrjZv3ozjx49Lc0DuW0sD9e6770rfT4mJibj55pvDnkWiYTt69CjuvfdeJCYmIiEhAePGjUNjY2NE93pOI1CXX3458vPz8fLLL2PmzJm6UagXX3wRvXr1wk033YSoqCj861//wh//+EcEAgFMnDhR9TXt27fHoEGD8Prrr0sLKsHatWvBcZy02DU2NmLQoEH4f//v/2HChAm47LLL8L///Q+zZs3CyZMndfOw5Ets3bp1GD16tO4vY4/HI33B3Hfffejbty9++uknbNq0Cd999x0uvPBC1NXV4ZVXXsGdd96J+++/H/X19VixYgWGDRuGjz/+GOnp6bjooovw4osv4oEHHkBeXh5uueUWAEFNT0NDA06cOIFt27Zh1apVYWOYMGECXn31VYwbNw6TJ0/GN998g6VLl+LTTz/Frl27ZF+eR44cwe23344//OEPuOeee7By5UqMHj0aW7duxdChQzXvk5z/qquuwoIFC/DDDz9gyZIl2LVrFz799FMkJiZiwoQJuuNUw+OPP445c+ZgyJAheOCBB1BRUYEXX3wRn3zyiTT2xYsXo6ysDBs2bMCLL74Il8uFK6+8UvOcY8eORWFhId5++22MHDlS2n7gwAEcPHgQf/7znwEAn3/+ObKzsx
EfH4/p06fDZrNh+fLlGDx4MP773/8iMzMT1157LSZPnoznnnsOjzzyCHr06AEA0v9XrVqFe+65B8OGDcPChQvR2Ni
2023-03-02 07:42:00 +00:00
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"plot_naive_scatter2 = sns.scatterplot(data=pol, x=\"long\", y=\"lat\")\n",
"_ = plot_naive_scatter2.set(\n",
" xlabel=\"Longitude\",\n",
" ylabel=\"Latitude\",\n",
" title=\"Naive Scatterplot of Vote Tallies from the 2012 Presidential Election\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This plot looks a little weird, which probably has to do in part with the fact that we are naively projecting a portion of a geoid onto a Cartesian plane using measurements (longitude and latitude) that are not Cartesian coordinates.\n",
"\n",
"Another issue with these plots is that the data is very messy. That is not the only reason they are not all that meaningful, but it certainly does not help.\n",
"\n",
"One thing that stands out is the group of outliers in the bottom left of the plot, which is clearly Hawaii. Hawaii is not covered by the Gaybourhoods data set at all, so we can remove these rows like so:"
]
},
{
"cell_type": "code",
"execution_count": 111,
"metadata": {},
"outputs": [
{
"data": {
2023-03-03 09:13:54 +00:00
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAlAAAAHHCAYAAABwaWYjAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/P9b71AAAACXBIWXMAAA9hAAAPYQGoP6dpAAEAAElEQVR4nOx9e3wU1fn+s7vZzW5uJEZBKDEKEcIlkgICSiJWQUCoJgWsgkQRDb8WJBYFxOCFEq6CgooKSrFQqAgtSJWL0NYL1LuCIBJFBeMXBA0Jue1mN7Pz+2P3TM6cOXPbbEjQeT4fP5Ld2bnPnOe87/M+r00URREWLFiwYMGCBQsWDMPe0jtgwYIFCxYsWLBwvsEiUBYsWLBgwYIFCyZhESgLFixYsGDBggWTsAiUBQsWLFiwYMGCSVgEyoIFCxYsWLBgwSQsAmXBggULFixYsGASFoGyYMGCBQsWLFgwCYtAWbBgwYIFCxYsmIRFoCxYsGDBggULFkziZ0egrr32Wlx77bUtvRu/KLz55puw2Wx48803W3pXZFi3bh0yMzPhdDqRnJzc0rvTanDnnXfi0ksvlX1ms9nw2GOPSX+/9NJLsNlsOHbs2Dndt507dyI7Oxtutxs2mw2VlZXndPvRwLFjx2Cz2bBkyZKW3hULJsA+A2p47LHHYLPZmn+HVMB7fo3i0ksvxZ133tkq9qUpIM/YSy+9dM63TaNFCBR5Obvdbvzf//2f4vtrr70WPXv2bIE9ixzBYBBr165F//79ccEFFyAxMRFdunRBQUEB3nvvvWbZZl1dHR577DEucdm+fbuhl0FLo7n288iRI7jzzjvRuXNnvPDCC1i1ahV3uSuuuAKXXHIJtDoaDRw4EO3atUNDQ4OhbR8+fBiPPfZYVMnHpZdeCpvNpvtfS79QmoLy8nLccsst8Hg8WLFiBdatW4f4+PiW3i1VtPQzVl5ejscffxzXXHMNLrroIiQnJ2PAgAHYuHEjd/n6+nrMnDkTHTp0gMfjQf/+/bF7927Fcm+88QYmTpyInj17wuFwqA6QR44cwYwZM5CdnY3ExES0b98eI0aMwEcffWRo/8k4QP5zu93o0qULpkyZglOnThk+Dz9HnDhxAo899hj279/fItvXesf8v//3/87ZfmzYsAHLli07Z9szi5iW3Hh9fT0WLlyIp59+OmrrfOONN6K2LjOYOnUqVqxYgZtvvhnjxo1DTEwMSktLsWPHDnTq1AkDBgyI+jbr6uowZ84cAFBE3bZv344VK1a0ehLVXPv55ptvIhgMYvny5cjIyFBdbty4cXjwwQfxzjvv4JprrlF8f+zYMbz77ruYMmUKYmKMPS6HDx/GnDlzcO2110ZtdrZs2TLU1NRIf2/fvh1///vf8eSTT+LCCy+UPr/66qujsr3x48fj1ltvRWxsbFTWZwQffvghqqurMXfuXAwePPicbTdStPQz9u6776K4uBg33ngjZs+ejZiYGPzjH//ArbfeKt2DNO68805s3rwZ9913Hy6//HK89NJLuPHGG/Hf//4XOTk50nIbNmzAxo0b0bt3b3To0EF1+y+++CJWr16NUaNG4Y9//CPOnj2LlStXYsCAAdi5c6fha/jnP/8Zl112GXw+H/bu3YvnnnsO27dvx6FDhxAXFxfZyYkAXq/X8DPe3Dhx4gTmzJmDSy+9FNnZ2bLvXnjhBQSDwWbfhyFDhqCgoEDxeZcuXZp92wQbNmzAoUOHcN9998k+T09Ph9frhdPpPGf7wkOL3i3Z2dl44YUXMGvWLM0H1QxcLldU1mMGp06dwrPPPot77rlHEelYtmwZfvzxx3O+T790nD59GgB0U3djx47FrFmzsGHDBi6B+vvf/w5RFDFu3Ljm2E3DyMvLk/39ww8/4O9//zvy8vKaJYTucDjgcDiivl4tGL1mQGjycC4H19aIHj164KuvvkJ6err02R//+EcMHjwYixYtwowZM6QI3gcffICXX3
4Zjz/+OB544AEAQEFBAXr27IkZM2bgf//7n7SO+fPn44UXXoDT6cTIkSNx6NAh7vZvu+02PPbYY0hISJA+u+uuu9CtWzc89thjhgnU8OHD0bdvXwDA3XffjdTUVDzxxBN49dVXcdttt3F/U1tbG/XopNvtjur6mgvnijR06dIFt99++znZllmQiGVLo0U1UA899BAEQcDChQt1l12zZg2uu+46tG3bFrGxsejevTuee+45xXK0BurUqVOIiYlRzMQAoLS0FDabDc8884z0WWVlJe677z6kpaUhNjYWGRkZWLRokS7b//bbbyGKIgYOHKj4zmazoW3btrLPKisr8ac//QmXXnopYmNj0bFjRxQUFOCnn34CAPj9fjzyyCPo06cP2rRpg/j4eOTm5uK///2vtI5jx47hoosuAgDMmTNHCq8+9thjuPPOO7FixQpp++Q/gmAwiGXLlqFHjx5wu91o164dJk2ahIqKCtl+XnrppRg5ciTeeOMNSZfSvXt3/POf/9Q8HwSbNm1Cnz594PF4cOGFF+L222+XpWz19lMNzz77LHr06IHY2Fh06NABkydPlmllLr30Ujz66KMAgIsuukhT25CWloZrrrkGmzdvRiAQUHy/YcMGdO7cGf379wcAfPrppxg+fDiSkpKQkJCA66+/XpaifemllzBmzBgAwG9+8xvpmOg0644dO5Cbm4v4+HgkJiZixIgR+Pzzz3WPWw+vvvoqRowYgQ4dOiA2NhadO3fG3LlzIQiC6XWpaaCM7PsPP/yACRMmoGPHjoiNjUX79u1x8803a6Y0r732Wtxxxx0AgCuvvBI2m03SapCU/scff4xrrrkGcXFxeOihhwCESNfEiRPRrl07uN1u9OrVC3/9619l66Y1SStWrECnTp0QFxeHG264AWVlZRBFEXPnzkXHjh3h8Xhw880348yZM5rnx+i9u2rVKnTu3BmxsbG48sor8eGHHyqWOXLkCEaPHo0LLrgAbrcbffv2xbZt2zS3DwCXXXaZjDyRfcnLy0N9fT2++eYb6fPNmzfD4XCgsLBQ+sztdmPixIl49913UVZWJn3eoUMHQ4N0nz59ZOQJAFJTU5Gbm4svvvhC9/dquO666wCE3qtA6FwnJCTg66+/xo033ojExERpQmP0XfbRRx9h6NChuPDCC+HxeHDZZZfhrrvuki3De0/s3bsXV155JdxuNzp37oyVK1eq7vff/vY36X13wQUX4NZbb5WdV6DxXj58+DB+85vfIC4uDr/61a+wePFiaZk333wTV155JQBgwoQJivQ8T3e0ZMkSXH311UhNTYXH40GfPn2wefNmjbPcfDB6TYDQ+2TQoEFITExEUlISrrzySmzYsAFA6Fy9/vrrOH78uHQOyHGraaD+85//SO+n5ORk3HzzzYp7kWjYjh49ijvvvBPJyclo06YNJkyYgLq6OlPH2qIRqMsuuwwFBQV44YUX8OCDD2pGoZ577jn06NEDN910E2JiYvCvf/0Lf/zjHxEMBjF58mTub9q1a4dBgwbhlVdekQZUgo0bN8LhcEiDXV1dHQYNGoT/+7//w6RJk3DJJZfgf//7H2bNmoWTJ09q5mHJS2zTpk0YM2aM5sy4pqZGesHcdddd6N27N3766Sds27YN33//PS688EJUVVXhxRdfxG233YZ77rkH1dXVWL16NYYOHYoPPvgA2dnZuOiii/Dcc8/hD3/4A/Lz8/G73/0OQEjTU1tbixMnTmD37t1Yt26dYh8mTZqEl156CRMmTMDUqVPx7bff4plnnsGnn36Kffv2yV6eX331FX7/+9/j//2//4c77rgDa9aswZgxY7Bz504MGTJE9TjJ+q+88kosWLAAp06dwvLly7Fv3z58+umnSE5OxqRJkzT3k4fHHnsMc+bMweDBg/GHP/wBpaWleO655/Dhhx9K+75s2TKsXbsWW7ZswXPPPYeEhARcccUVquscN24cCgsLsWvXLowcOVL6/ODBgzh06B
AeeeQRAMDnn3+O3NxcJCUlYcaMGXA6nVi5ciWuvfZavPXWW+jfvz+uueYaTJ06FU899RQeeughdOvWDQCk/69btw5
2023-03-02 07:42:00 +00:00
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"pol = pol[pol[\"long\"] > -140]\n",
"\n",
"plot_naive_scatter3 = sns.scatterplot(data=pol, x=\"long\", y=\"lat\")\n",
"_ = plot_naive_scatter3.set(\n",
" xlabel=\"Longitude\",\n",
" ylabel=\"Latitude\",\n",
" title=\"Naive Scatterplot of Vote Tallies from the 2012 Presidential Election\"\n",
")"
]
},
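{
"cell_type": "markdown",
"metadata": {},
"source": [
"The distortion noted earlier comes from plotting longitude and latitude as if they were x and y. One common alternative is the spherical (Web) Mercator projection, x = R * lon and y = R * ln(tan(pi/4 + lat/2)), with angles in radians. The sketch below is an illustration only: `to_web_mercator` is a hypothetical helper that is not used elsewhere in this notebook, and Mercator is just one of several projections that could be tried."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"R = 6378137.0  # WGS84 equatorial radius in metres, the sphere radius used by Web Mercator\n",
"\n",
"def to_web_mercator(lat_deg, lon_deg):\n",
"    \"\"\"Project latitude/longitude in degrees to Web Mercator x/y in metres.\"\"\"\n",
"    lat = np.radians(lat_deg)\n",
"    lon = np.radians(lon_deg)\n",
"    x = R * lon\n",
"    y = R * np.log(np.tan(np.pi / 4 + lat / 2))\n",
"    return x, y\n",
"\n",
"# Two coordinates taken from the gb preview above\n",
"x, y = to_web_mercator(np.array([34.093828, 40.742039]), np.array([-118.381697, -74.000620]))\n",
"print(x, y)"
]
},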
{
"cell_type": "code",
"execution_count": 116,
"metadata": {},
"outputs": [
{
"data": {
2023-03-03 09:13:54 +00:00
"text/plain": [
"<Figure size 600x600 with 3 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# A prettier way to represent the number of observations in a given region\n",
"# would be to use a hexbin plot:\n",
"plot_naive_hexbin = sns.jointplot(data=pol, x=\"long\", y=\"lat\", kind=\"hex\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Counties are densely concentrated along the East Coast, above Florida and below New Hampshire, while there are far fewer individual counties along the West Coast. This is all pretty self-evident."
]
},
{
"cell_type": "code",
"execution_count": 125,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Number of data points in the region of New York: 131\n"
]
},
{
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# New York is one of the cities covered by the Gaybourhoods data set.\n",
"# Let's take a closer look at it:\n",
"\n",
"gb_ny = gb[gb[\"long\"] > -73]\n",
"print(f\"Number of data points in the region of New York: {gb_ny.shape[0]}\")\n",
"plot_ny_scatter1 = sns.scatterplot(data=gb_ny, x=\"long\", y=\"lat\", hue=\"TOTINDEX\")\n",
"_ = plot_ny_scatter1.set(\n",
" xlabel=\"Longitude\",\n",
" ylabel=\"Latitude\",\n",
" title=\"Queer Neighbourhoods in New York City\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the above plot, we colour points darker the higher their \"gaybourhood index\" is. This index is a weighted composite of a few factors. The TOTINDEX boldly weighs the queerness of same-sex married couples more than that of unmarried same-sex households. While it's subjective, in my opinion, this marginalizes the experiences of queer people who have no desire to marry.\n",
"\n",
"If we wanted to look at unmarried same-sex households alone, we could use the \"Cns_RateSS\" column:"
]
},
{
"cell_type": "code",
"execution_count": 126,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"plot_ny_scatter2 = sns.scatterplot(data=gb_ny, x=\"long\", y=\"lat\", hue=\"Cns_RateSS\")\n",
"_ = plot_ny_scatter2.set(\n",
" xlabel=\"Longitude\",\n",
" ylabel=\"Latitude\",\n",
" title=\"Queer Neighbourhoods in New York City\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As predicted, the diversity of the data becomes more apparent when we don't hide it. While the difference isn't extremely substantial, it will be worthwhile to continue to analyze multiple factors when exploring this data. This is especially the case when we consider the fact that the data set is already very limited in how it fails to account for gender diversity.\n",
"\n",
"Just for fun, let's do something similar for the political dataset:"
]
},
{
"cell_type": "code",
"execution_count": 129,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"plot_democrat_scatter = sns.scatterplot(\n",
" data=pol[pol[\"party\"] == \"Democrat\"],\n",
" x=\"long\",\n",
" y=\"lat\",\n",
" hue=\"percent\"\n",
")\n",
"_ = plot_democrat_scatter.set(\n",
" xlabel=\"Longitude\",\n",
" ylabel=\"Latitude\",\n",
" title=\"Percentage of voting population who voted Democrat by county\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A better way to represent this data would be to use a heatmap, but even this scatterplot reveals something we probably already knew: Democrats tend to be concentrated around the urban parts of the United States."
]
},
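{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of how the heatmap mentioned above could be prepared, assuming `pol` keeps its `lat`, `long` and `percent` columns: bin the county coordinates into a coarse grid and average the Democratic share in each cell. The helper `binned_mean`, the one-degree cell size and the sample points are illustrative assumptions, not part of the analysis so far.\n",
"\n",
"```python\n",
"import math\n",
"from collections import defaultdict\n",
"\n",
"\n",
"def binned_mean(points, cell_deg=1.0):\n",
"    # Group (lat, long, value) triples into square cells of `cell_deg` degrees\n",
"    # and average the values that fall into each cell.\n",
"    sums = defaultdict(float)\n",
"    counts = defaultdict(int)\n",
"    for lat, lon, value in points:\n",
"        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))\n",
"        sums[key] += value\n",
"        counts[key] += 1\n",
"    return {key: sums[key] / counts[key] for key in sums}\n",
"\n",
"\n",
"# Made-up (lat, long, percent) triples standing in for rows of `pol`.\n",
"points = [\n",
"    (40.7, -73.9, 80.0),\n",
"    (40.2, -73.1, 60.0),\n",
"    (34.1, -118.2, 70.0),\n",
"]\n",
"\n",
"grid = binned_mean(points)\n",
"for cell, mean_percent in sorted(grid.items()):\n",
"    print(cell, mean_percent)\n",
"```\n",
"\n",
"Each key of the resulting dictionary is a grid cell and each value is the mean percentage in that cell; reshaped into a 2D table, this could be passed to `sns.heatmap`."
]
},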
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Renewed Analysis Plan\n",
"\n",
"The research questions posed earlier in this analysis still feel sufficiently specific: they have clear, measurable goals and leave room for additional analysis. Here are two very rough step-by-step plans for answering them:\n",
"\n",
"**Is there a correlation between political alignment & living in neighbourhoods with large quantities of LGBT people?**\n",
"1. Unify the political data with the gaybourhoods data set\n",
"    1. Establish the best way to measure the distance between a given `gb` observation and a county (Euclidean distance? Some other measurement?)\n",
"    2. Find the county that is closest to each observation by minimizing the distance measure chosen in the previous step\n",
"    3. Merge the two tables. Each `gb` observation should then include a political breakdown of the nearest county during the 2012 presidential election\n",
"2. Use this information to plot queerness by different metrics against political alignment and measure the correlation\n",
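"\n",
"Step 1 above could be sketched roughly as follows. This assumes the haversine (great-circle) distance as the measure, which is only one candidate alongside Euclidean distance, and that each county is represented by a single `lat`/`long` point. The helper names and the sample county records are hypothetical placeholders.\n",
"\n",
"```python\n",
"import math\n",
"\n",
"\n",
"def haversine_km(lat1, lon1, lat2, lon2):\n",
"    # Great-circle distance in kilometres between two (lat, long) points.\n",
"    r = 6371.0  # mean Earth radius in km\n",
"    phi1, phi2 = math.radians(lat1), math.radians(lat2)\n",
"    dphi = phi2 - phi1\n",
"    dlmb = math.radians(lon2 - lon1)\n",
"    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2\n",
"    return 2 * r * math.asin(math.sqrt(a))\n",
"\n",
"\n",
"def nearest_county(lat, lon, counties):\n",
"    # Return the county record whose point is closest to (lat, lon).\n",
"    return min(counties, key=lambda c: haversine_km(lat, lon, c['lat'], c['long']))\n",
"\n",
"\n",
"# Hypothetical county records: approximate coordinates, placeholder vote shares.\n",
"counties = [\n",
"    {'county': 'New York', 'lat': 40.78, 'long': -73.97, 'percent': 80.0},\n",
"    {'county': 'Kings', 'lat': 40.65, 'long': -73.95, 'percent': 75.0},\n",
"    {'county': 'Suffolk', 'lat': 40.94, 'long': -72.68, 'percent': 50.0},\n",
"]\n",
"\n",
"# A point in lower Manhattan, standing in for one `gb` observation.\n",
"match = nearest_county(40.73, -74.00, counties)\n",
"print(match['county'], match['percent'])\n",
"```\n",
"\n",
"Attaching `match` to each `gb` row would then amount to the merge described in the third sub-step.\n",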
"\n",
"**Is there a correlation between geographical stratums & being LGBT?**\n",
"1. Explore different metrics of queerness and analyze more qualitatively how different metrics reveal different information in different places by graphing them\n",
"2. Graph and measure the clusteredness of neighbourhoods surpassing different thresholds of queerness\n",
"3. If the previous steps show that this is a relevant line of inquiry: measure and graph the rate of change in queerness radially outward from clusters of queer neighbourhoods\n"
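,
"\n",
"For step 2, one simple (and by no means the only) measure of clusteredness is the mean nearest-neighbour distance among the neighbourhoods above a threshold. The sketch below assumes plain Euclidean distance in degrees, a rough approximation that is only reasonable within a single city; the function name, sample rows and thresholds are made up.\n",
"\n",
"```python\n",
"import math\n",
"\n",
"\n",
"def mean_nearest_neighbour(points):\n",
"    # Average distance from each point to its closest other point.\n",
"    # Smaller values mean the points are packed more tightly.\n",
"    # Needs at least two points.\n",
"    total = 0.0\n",
"    for i, p in enumerate(points):\n",
"        total += min(math.dist(p, q) for j, q in enumerate(points) if j != i)\n",
"    return total / len(points)\n",
"\n",
"\n",
"# Made-up (long, lat, index) triples standing in for rows of `gb`.\n",
"tracts = [\n",
"    (-74.00, 40.73, 30.0),\n",
"    (-74.01, 40.74, 28.0),\n",
"    (-73.95, 40.65, 5.0),\n",
"    (-73.80, 40.90, 4.0),\n",
"]\n",
"\n",
"for threshold in (0.0, 20.0):\n",
"    selected = [(lon, lat) for lon, lat, index in tracts if index > threshold]\n",
"    print(threshold, len(selected), mean_nearest_neighbour(selected))\n",
"```\n",
"\n",
"If the value shrinks as the threshold rises, the most queer neighbourhoods sit closer together than neighbourhoods in general."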
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.1"
},
"vscode": {
"interpreter": {
"hash": "b2baa059f790e7ad780c83135aaea020c73a7a7a6921010b599b8b664933698d"
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}