For my part of the analysis, I sought to answer the following two research questions. In the first, we asked if there's a correlation between geographical strata and being LGBT. More specifically, we wanted to see if the conventional wisdom that queer people tend to cluster in space is true. In the second, we wanted to know if there was a meaningful correlation between political alignment and living in a neighbourhood with a large LGBT population.
Before we could fully tackle these two questions, we needed to establish some additional metrics that would make the data more approachable. For reasons fully developed in our analysis, we decided to use the Gaybourhood dataset's same-sex index to measure the queerness of a given neighbourhood. The fact that this is a continuous variable presented some challenges, so we decided to discretize the observations by dividing the range into 7 chunks; the chunk an observation falls into is what we call its Kinsey index.
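As a rough sketch of that discretization step: the version below assumes the 7 chunks are equal-width and labelled 0 through 6, which may not match the exact binning in our analysis. The function name `kinsey_index` and the `lo`/`hi` bounds are illustrative only.

```python
def kinsey_index(same_sex_index, lo, hi, bins=7):
    """Map a continuous same-sex index onto an integer bin in 0..bins-1.

    Assumption: equal-width bins over the range [lo, hi]; the real
    analysis may have chosen its chunk boundaries differently.
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    width = (hi - lo) / bins
    raw = int((same_sex_index - lo) / width)
    # Clamp so that the maximum value (and any out-of-range value)
    # lands in a valid bin instead of spilling past the last one.
    return min(max(raw, 0), bins - 1)


# Example with a made-up range of 0 to 70:
print(kinsey_index(35, lo=0, hi=70))
```

Clamping the top edge matters here: without it, the single largest observation would end up in an eighth chunk of its own.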
For our first research question, we were interested not only in a given neighbourhood's Kinsey index, but also in the Kinsey index of any adjacent neighbourhoods. This can be well-represented visually, but we also wanted to find a way to represent it numerically. To do this, we designed an algorithm that calculates an approximate average Kinsey index of a small set of observations around the original, which we call the neighbourhood Kinsey index.
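One way such an average could be computed is sketched below. It is not our actual algorithm: it assumes each observation has planar coordinates, that the "small set" is the `k` nearest other observations by straight-line distance, and that the observation itself is left out of its own average. The name `neighbourhood_kinsey` and the tuple layout are hypothetical.

```python
import math


def neighbourhood_kinsey(observations, k=3):
    """Return, for each observation, the mean Kinsey index of its k nearest others.

    observations: list of (x, y, kinsey) tuples.
    Assumptions: Euclidean distance on (x, y), and the observation
    itself is excluded from its own average.
    """
    averages = []
    for i, (x, y, _) in enumerate(observations):
        # Distance from observation i to every other observation.
        others = [
            (math.hypot(x - ox, y - oy), other_kinsey)
            for j, (ox, oy, other_kinsey) in enumerate(observations)
            if j != i
        ]
        others.sort(key=lambda pair: pair[0])
        nearest = others[:k]
        if nearest:
            averages.append(sum(kv for _, kv in nearest) / len(nearest))
        else:
            averages.append(None)  # no neighbours to average over
    return averages


# Made-up points: a "peak" at the origin with two less queer neighbours.
sample = [(0, 0, 6), (1, 0, 4), (0, 1, 4), (10, 10, 0)]
print(neighbourhood_kinsey(sample, k=2))
```

Real geographic coordinates would call for a great-circle distance rather than `math.hypot`, and comparing every pair of observations gets slow on a large dataset, which is one reason to settle for an approximate average.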
Down in the left corner here, we have a bar graph comparing the average neighbourhood Kinsey index per Kinsey index, where a higher Kinsey index represents a neighbourhood with a larger number of gay and lesbian residents. The trend here indicates that the relationship between the two variables is roughly proportional, although understandably, the neighbourhoods forming the geographical "peak" of queerness in a given region tend to be surrounded by neighbourhoods that are less queer.
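The numbers behind a bar graph like this amount to a grouped mean. A minimal sketch, assuming two parallel lists of values (the function name `mean_by_kinsey` and the sample values are made up for illustration):

```python
from collections import defaultdict


def mean_by_kinsey(kinsey, neighbourhood_kinsey_values):
    """Average the neighbourhood Kinsey index within each Kinsey index group."""
    totals = defaultdict(lambda: [0.0, 0])  # kinsey -> [running sum, count]
    for k, n in zip(kinsey, neighbourhood_kinsey_values):
        totals[k][0] += n
        totals[k][1] += 1
    # One bar per Kinsey index, in ascending order.
    return {k: s / c for k, (s, c) in sorted(totals.items())}


# Made-up values: two observations at index 0, one at index 6.
print(mean_by_kinsey([0, 0, 6], [1.0, 2.0, 4.0]))
```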
Similarly, to the right, we have a scatterplot illustrating the neighbourhood Kinsey index versus the same-sex index. This graph represents the same idea, but unencumbered by the abstractions we've created to facilitate data processing. As such, it's clear that while the trend is still present, there's a lot of variance in the data.
Altogether, we have strong numerical evidence that queer communities tend to concentrate in space.
The third graph we have along the bottom tackles our second research question, quantitatively showing the relationship between a neighbourhood's Kinsey index and the percentage of the population who voted Democrat in the 2012 American presidential election. We see that on average, neighbourhoods that have a high Kinsey index tend to vote more Democrat.
These two phenomena can also be visualized spatially. This is where Tableau shines for the issues we're tackling with this project. Instead of using weighted KDE plots as we did in the analysis, we chose to take a more interactive approach for the dashboard. Both maps show a density map of all observations; the one on the left is filtered by minimum Kinsey index and the one on the right is filtered by the minimum percentage of the population who voted Democrat. Additionally, both maps represent the entirety of the United States, so they can be panned freely.
Here, the filters serve to illustrate the third dimension of the data two-dimensionally. To compare the peaks of queer and Democrat population density, we can adjust the filters on the respective maps.