library(readxl)
library(dplyr)
alldata <- read_excel("10.20.2025.data.team1.clean.xlsx", col_names = TRUE)
alldata[alldata == -99] <- NA
alldata[alldata == -50] <- NA
#explanation: Data collected with the Qualtrics survey platform were exported to Excel. The file was uploaded to Posit Cloud, and the values -50 and -99, which indicate that a question was not answered, were replaced with NA.
#source: The Quantitative Playbook for Public Health Research in R. (McCarty, 2025)
Broome County Safety Perceptions: A Sociodemographic Analysis
Perceived Community Safety and Trust
Community safety/lack of safety, Community trust/distrust, Racialized group, Gender, Socioeconomic status, Political beliefs
1 Introduction
Violence, or inflicted harm in the form of homicides, assaults, and other intentional injuries, remains prevalent in the US. Violent crime fell by forty-nine percent between 1993 and 2022. However, from 2020 to 2021, gun homicides alone amassed nearly one million years of life cut short and generated economic costs upwards of five hundred billion dollars annually (Miller et al., 2023). Fears of violence have resurged despite overall decreases: as of 2024, nearly sixty percent of adults believe reducing crime and violence in the US should be a top priority for the government (Gramlich, 2024). Data indicate crime has decreased overall, despite increases in murder rates in recent years. However, Americans continue to perceive rising negative effects of crime and echo calls for change. Gallup surveys also indicate over sixty percent of adults believe there is more crime nationally than there was the year before, even when national crime rates indicate otherwise (Gramlich, 2024). Evaluating perceived threat and developing interventions to address safety concerns and perceptions is vital to reducing fear. Notably, while overall rates of violence decrease, higher concentrations of violence disproportionately affect minority and lower income populations in areas of disadvantage and low cohesion (Patton et al., 2022). Younger, low income people are more likely to report being victims of violent crime, while Black Americans are identified as perceived offenders in violent incidents at about twice their share of the population (Gramlich, 2024).
While not every individual personally experiences violence during their life, the direct and indirect costs of violence accrue beyond isolated instances. Such costs include physical and mental health consequences, decreased feelings of safety, financial and resource costs of violence response efforts, and the economic burden of injury (Miller et al., 2023). Systemic inequities in housing, educational attainment, and opportunities for economic and social mobility place disadvantaged groups at greater risk for violence. These structural barriers are correlated with high poverty and with stress on publicly funded social services and welfare systems (Miller et al., 2023). The fear effect and economic burden of violence highlight the importance of a public health approach to facilitating safety and reducing harm in the US.
1.1 Knowns and Unknowns
The World Health Organization classifies violence as intentional force or power used against a person or a group that is likely to result in harm or injury or to impose limits on wellbeing (Rutherford et al., 2007). The term safety, however, is heavily debated: historically it was viewed as the absence of violence risk, and more recently it has shifted towards classification by the presence of positive values and health behaviors. The World Health Organization worked to merge these viewpoints, defining safety as limiting harm while preserving and facilitating wellbeing and health (Raheemy et al., 2025). Both socioecological and individual perspectives of violence and safety offer relevant insight into the formation of perceived threat. Individual attitudes, beliefs, and motivations influence perceived threat and safety yet remain embedded in the socioecological context of family, peers, community, and societal influences (Diclemente et al., 2019). Risk factors for violence are varied and context specific and are most prevalent in youth populations. They include low mobility and socioeconomic background, high poverty, a high risk environment, and early exposure to violence, neglect, or abuse. Protective factors that decrease the likelihood of violence and promote safety include a socially cohesive environment, high socioeconomic status, environmental values of nonaggressive behavior, and low impulsivity (Lösel & Farrington, 2012).
Prior research on safety perceptions includes cross sectional studies of risk, studies of perceived hazard across different demographics, and qualitative research on the perceived impact of violence intervention programs in specific regions (Hardiman et al., 2019). Similarly, recent literature uses multiple regression analyses to examine how trust and confidence influence perceived risk. Existing research highlights inconsistent consensus and a lack of substantiated evidence. For instance, some studies indicate perceived safety and crime predictions are correlated with sociodemographic categories, location, neighborhood stability, and social cohesion (Brisson & Roll, 2012). However, one national perception survey could not conclude whether perceived environmental health risk varied by gender and race due to mixed results (Flynn et al., 1994). Another study investigating demographics and trust in vaccine safety found sociodemographic variables were not correlated with trust (Lim & Moon, 2023). Similarly, location specific multilevel analyses from the Project on Human Development in Chicago indicated neighborhood, gender, homeownership, mobility, and socioeconomic status are not highly associated with perceived violence (Sampson et al., 1997). However, many of these studies examine variation in trust within a single demographic rather than associations between demographic categories, and they evaluate perceived violence rather than feelings of safety. Additionally, studies frequently fail to evaluate these demographic categories together and in tandem with political beliefs. Neglecting intersections between demographic categories overlooks potentially nuanced perceptions of trust and community safety that specific survey item responses could help clarify.
1.2 Research Aims
Evaluating safety perceptions in the US with the survey data collected helps address evidence and contextual gaps in existing violence prevention and safety promotion research. This exploratory study aims to uncover and examine correlations between demographic characteristics and safety perceptions, both to reaffirm existing correlations and to provide novel insight into minority perceptions of fear and safety. This report examines whether perceptions of community safety differ by racial identity, gender, social class, and/or political belief and, if so, what trends exist in perceived safety within these sociodemographics. In contrast to the null hypothesis, which assumes no correlation between the measured demographics and the level of perceived community trust and safety, this study predicts that safety perceptions differ by racialized identity, gender, social class, and political belief. For instance, an individual’s social class is expected to be most indicative of perceived feelings of being unsafe within one’s community.
2 Methods
2.1 Participants and Sampling
This study was conducted with the approval of the Institutional Review Board of a public higher education institution in New York. This research was conducted ethically to protect participants’ rights, welfare, confidentiality, and privacy. Participants were informed of the project and provided consent prior to beginning the survey. Individuals eligible to participate were 18 years of age or older and attended Binghamton University and/or resided in the surrounding Broome County area. Tabling conducted over a three week period in central common areas of the Binghamton University campus recruited students, staff, and community members regardless of background, department, or sociodemographic. Sampling was also conducted at the Binghamton University Fall Festival, the Broome County Regional Market, and the Binghamton Farmers Market to reach a greater proportion of Broome County residents and students. Data at all locations were collected via Binghamton University licensed Qualtrics surveying technology, which is accessible to students and staff for survey creation, data collection, and data analysis.
2.2 Measures
In the context of this study, community and neighborhood refer to an individual’s immediate physical, social, and/or virtual environment influencing individual perceptions and values. Violence refers to intentional action to harm, and safety refers to an absence of harm and/or presence of wellbeing. As a survey variable, trust is defined as confidence or certainty and as the belief in the ability, reliability, and willingness to maintain shared values. The value in this context is safety, the absence of harm and presence of wellbeing. Items for community safety and trust were created through synthesis of construct definitions and prior literature. These measures consist of six key items within the exposure section of the survey: (1) My neighborhood feels like a community, (2) Most people in this neighborhood are willing to help you if you need it, (3) I trust my neighbors, (4) I do not feel safe in my neighborhood, (5) I cannot rely on my neighbors for help if I need it, and (6) I do not trust my neighbors. These items were scored on a 6-point Likert scale with two additional opt-out options: Strongly disagree (1), Disagree (2), Slightly disagree (3), Slightly agree (4), Agree (5), Strongly agree (6), Don’t know (7), and Prefer not to say (8). Opt-out responses were treated as missing rather than scored. The perceived community safety and trust items were combined into a composite score by reverse scoring the three negatively worded items (4-6), adding the item scores together (1-6 per item), and dividing by the number of items in the measure (6 items). Item sums can therefore range from 6 to 36, and composite scores from 1 to 6.
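As a rough illustration of how such a composite could be computed, the sketch below uses made-up responses for two hypothetical participants. It assumes opt-out answers are already stored as NA and that the three negatively worded items are reverse scored as 7 minus the response on the 1-6 scale. It averages the available items in base R, which may treat missing values differently from the `psych::scoreItems()` call used later in this report.

```r
# Hypothetical responses for two participants on the six items (1-6 scale).
# NA marks a "Don't know" or "Prefer not to say" answer.
responses <- data.frame(
  COMM_FEEL        = c(5, 2),
  COMM_HELP        = c(6, 3),
  COMM_NEIGHBORS   = c(5, NA),
  NOTCOMM_UNSAFE   = c(2, 5),
  NOTCOMM_RELY     = c(1, 4),
  NOTCOMM_DISTRUST = c(2, 6)
)

negative_items <- c("NOTCOMM_UNSAFE", "NOTCOMM_RELY", "NOTCOMM_DISTRUST")

# Reverse score negatively worded items on a 1-6 scale: 1 <-> 6, 2 <-> 5, 3 <-> 4.
scored <- responses
scored[negative_items] <- 7 - scored[negative_items]

# Composite = mean of the available items, so it stays on the 1-6 scale.
composite <- rowMeans(scored, na.rm = TRUE)
print(composite)
```

Because the composite is a mean rather than a sum, a participant with one missing item is still scored on the same 1-6 scale as everyone else.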
The survey also measured perceived sociodemographics including racial group, social status, gender identity, and political beliefs to analyze in tandem with safety perceptions. The sociodemographic items were generated by the authors based on area of interest and prevalence in prior readings. Participants were asked to identify their racial/ethnic identity from the following 9 selections: (1) American Indian or Alaska Native, (2) Asian, (3) Black or African American, (4) Hispanic or Latine, (5) Middle Eastern or North African, (6) Native Hawaiian or Pacific Islander, (7) White, (8) Other (with an option to write in), and (9) Prefer not to say. Participants were asked which term describes their current gender identity, with the clarification that gender identity is the feeling an individual has about their gender. This measure included the following 6 options: (1) Girl or woman, (2) Boy or man, (3) Nonbinary, genderfluid, or gender queer, (4) I am not sure or questioning, (5) I don’t know what this question means, and (6) Decline to answer. The political belief measure included the following 10 options: (1) Far left/leftist, (2) Very liberal, (3) Liberal, (4) Moderate, (5) Conservative, (6) Very conservative, (7) Far-right/alt-right, (8) Other (please specify), (9) Don’t know, and (10) Prefer not to say. Social class was measured on a ladder scale from worse off (least education, money, jobs) to well off (most education, money, respected jobs). Participants were instructed to select, from 11 options (1 (lowest) to 10 (highest), plus prefer not to say), the one that best indicates the perceived standing of oneself or one’s family compared to others in the US.
2.3 Data Analysis
Multiple linear regression analysis is performed in this study to analyze the relationship between safety perception and sociodemographic variables. This analysis is completed using Posit Cloud, a platform hosting R software for statistical computing and data visualization. The data are first reviewed to determine whether the distribution of responses is normal (symmetrical and evenly distributed with no outlying scores) or not normally distributed. Responses of prefer not to say are assigned the value -99 and filtered out of the dataset used for assessing normality and for further data analysis. Similarly, responses of don’t know are assigned the value -50 and filtered out. Other invalid responses filtered out of the dataset include participants who did not consent to participating in the first survey section and participants who left half or more of the questions blank. The data were approximately normally distributed for perceived safety and community trust, and the predictors indicated a linear relationship with no severe outliers. Given this distribution, a multiple linear regression was performed with safety perceptions as the dependent variable and the demographic variables as independent predictors. Had the data for perceived safety and community trust not been normally distributed, the analysis would have required either a different regression model (Huber, robust, or quantile) or a transformation of the dependent variable before fitting the multiple linear regression.
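One way to supplement a visual check of normality is a Shapiro-Wilk test on the model residuals. The sketch below is only an illustration on simulated data: the report itself relies on histograms, and the variable names `score` and `status` are hypothetical stand-ins for a composite score and a predictor.

```r
set.seed(42)  # reproducible simulated data; not the study data

# Simulated composite scores kept within the 1-6 scale (illustrative only).
sim <- data.frame(
  score  = pmin(pmax(rnorm(60, mean = 4, sd = 0.8), 1), 6),
  status = sample(1:10, 60, replace = TRUE)
)

fit <- lm(score ~ status, data = sim)

# Shapiro-Wilk test on the residuals: a small p-value suggests non-normality.
sw <- shapiro.test(residuals(fit))
print(sw)
```

A non-significant result is consistent with approximately normal residuals but does not prove normality, especially in small samples, so a histogram or Q-Q plot (for example with `qqnorm()`) remains a useful companion check.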
3 Results
3.1 Import
Import Data and Filter Responses
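A minimal sketch of this filtering step, assuming a toy data frame in place of the Qualtrics export (the column names `item_a` and `item_b` are hypothetical). It shows how the opt-out codes -99 (Prefer not to say) and -50 (Don't know) can be replaced with NA in a single pass over all columns.

```r
# Toy stand-in for the Qualtrics export; column names are hypothetical.
toy <- data.frame(
  item_a = c(1, -99, 4, 6),
  item_b = c(-50, 2, 3, -99)
)

# Codes that mark an opted-out answer.
missing_codes <- c(-99, -50)

# Replace every opt-out code with NA, column by column.
toy_clean <- toy
toy_clean[] <- lapply(toy_clean, function(col) {
  col[col %in% missing_codes] <- NA
  col
})

print(toy_clean)
print(colSums(is.na(toy_clean)))  # count of missing answers per item
```

Listing the codes in one vector keeps the recoding rule in a single place if further opt-out codes are added later.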
3.2 Transform
Select Data
library(dplyr)
selectdata <- alldata %>%
select(POLITICAL_BELIEFS, GENDER, SOCIALSTATUS, RACIALIZED,COMM_FEEL, COMM_HELP, COMM_NEIGHBORS, NOTCOMM_UNSAFE, NOTCOMM_RELY, NOTCOMM_DISTRUST)
#explanation: The selectdata dataset isolates the variables used in the multiple linear regression. It includes the measured sociodemographics and each of the six statements (3 COMM and 3 NOTCOMM) measuring level of trust and safety with Likert scale response options.
#source: The Quantitative Playbook for Public Health Research in R. (McCarty, 2025)
Filter independent continuous variables
selectdata <- selectdata %>% filter(!is.na(POLITICAL_BELIEFS)) %>%
  mutate(
    # Labels go in a separate column so the numeric codes remain available for modeling
    POLITICAL_BELIEFS_LABEL = case_when(
      POLITICAL_BELIEFS == 1 ~ "Far left",
      POLITICAL_BELIEFS == 2 ~ "Very liberal",
      POLITICAL_BELIEFS == 3 ~ "Liberal",
      POLITICAL_BELIEFS == 4 ~ "Moderate",
      POLITICAL_BELIEFS == 5 ~ "Conservative",
      POLITICAL_BELIEFS == 6 ~ "Very conservative",
      POLITICAL_BELIEFS == 7 ~ "Far-right"
    )
  )
#explanation: Removes rows where political beliefs data is missing (NA) and adds clear labels for codes 1-7 (Far left to Far-right). The numeric codes are kept in POLITICAL_BELIEFS for MLR modeling.
#source: R for Data Science (2e), 16 Factors: https://r4ds.hadley.nz/factors.html
#Factor Social Status
selectdata <- selectdata %>% filter(!is.na(SOCIALSTATUS)) %>%
  mutate(
    # Labels go in a separate column so the numeric codes remain available for modeling
    SOCIALSTATUS_LABEL = case_when(
      SOCIALSTATUS == 1 ~ "1-Lowest Social Status",
      SOCIALSTATUS == 2 ~ "2",
      SOCIALSTATUS == 3 ~ "3",
      SOCIALSTATUS == 4 ~ "4",
      SOCIALSTATUS == 5 ~ "5-Moderate Social Status",
      SOCIALSTATUS == 6 ~ "6",
      SOCIALSTATUS == 7 ~ "7",
      SOCIALSTATUS == 8 ~ "8",
      SOCIALSTATUS == 9 ~ "9",
      SOCIALSTATUS == 10 ~ "10-Highest Social Status"
    )
  )
#explanation: Removes rows where SES data is missing (NA) and adds clear labels for codes 1-10 (low to high perceived socioeconomic status). The numeric codes are kept in SOCIALSTATUS for MLR modeling.
#source: R for Data Science (2e), 16 Factors: https://r4ds.hadley.nz/factors.html
Filter Dependent Response Variables
selectdata <- selectdata %>% filter(!is.na(COMM_FEEL)) %>%
  mutate(
    # Labels go in a separate column so the numeric codes remain available for scoring
    COMM_FEEL_LABEL = case_when(
      COMM_FEEL == 1 ~ "Strongly Disagree",
      COMM_FEEL == 2 ~ "Disagree",
      COMM_FEEL == 3 ~ "Slightly Disagree",
      COMM_FEEL == 4 ~ "Slightly Agree",
      COMM_FEEL == 5 ~ "Agree",
      COMM_FEEL == 6 ~ "Strongly Agree"
    )
  )
#explanation: Removes rows where COMM_FEEL is missing and adds labels for codes 1-6 (low to high agreement with the statement "My neighborhood feels like a community"). The numeric codes are kept for the composite scores and MLR modeling.
#source: R for Data Science (2e), 16 Factors: https://r4ds.hadley.nz/factors.html
selectdata <- selectdata %>% filter(!is.na(COMM_HELP)) %>%
  mutate(
    # Labels go in a separate column so the numeric codes remain available for scoring
    COMM_HELP_LABEL = case_when(
      COMM_HELP == 1 ~ "Strongly Disagree",
      COMM_HELP == 2 ~ "Disagree",
      COMM_HELP == 3 ~ "Slightly Disagree",
      COMM_HELP == 4 ~ "Slightly Agree",
      COMM_HELP == 5 ~ "Agree",
      COMM_HELP == 6 ~ "Strongly Agree"
    )
  )
#explanation: Removes rows where COMM_HELP is missing and adds labels for codes 1-6 (low to high agreement with the statement "Most people in this neighborhood are willing to help you if you need it"). The numeric codes are kept for the composite scores and MLR modeling.
#source: R for Data Science (2e), 16 Factors: https://r4ds.hadley.nz/factors.html
selectdata <- selectdata %>% filter(!is.na(COMM_NEIGHBORS)) %>%
  mutate(
    # Labels go in a separate column so the numeric codes remain available for scoring
    COMM_NEIGHBORS_LABEL = case_when(
      COMM_NEIGHBORS == 1 ~ "Strongly Disagree",
      COMM_NEIGHBORS == 2 ~ "Disagree",
      COMM_NEIGHBORS == 3 ~ "Slightly Disagree",
      COMM_NEIGHBORS == 4 ~ "Slightly Agree",
      COMM_NEIGHBORS == 5 ~ "Agree",
      COMM_NEIGHBORS == 6 ~ "Strongly Agree"
    )
  )
#explanation: Removes rows where COMM_NEIGHBORS is missing and adds labels for codes 1-6 (low to high agreement with the statement "I trust my neighbors"). The numeric codes are kept for the composite scores and MLR modeling.
#source: R for Data Science (2e), 16 Factors: https://r4ds.hadley.nz/factors.html
selectdata <- selectdata %>% filter(!is.na(NOTCOMM_UNSAFE)) %>%
  mutate(
    # Labels go in a separate column so the numeric codes remain available for scoring
    NOTCOMM_UNSAFE_LABEL = case_when(
      NOTCOMM_UNSAFE == 1 ~ "Strongly Disagree",
      NOTCOMM_UNSAFE == 2 ~ "Disagree",
      NOTCOMM_UNSAFE == 3 ~ "Slightly Disagree",
      NOTCOMM_UNSAFE == 4 ~ "Slightly Agree",
      NOTCOMM_UNSAFE == 5 ~ "Agree",
      NOTCOMM_UNSAFE == 6 ~ "Strongly Agree"
    )
  )
#explanation: Removes rows where NOTCOMM_UNSAFE is missing and adds labels for codes 1-6 (low to high agreement with the statement "I do not feel safe in my neighborhood"). The numeric codes are kept for the composite scores and MLR modeling.
#source: R for Data Science (2e), 16 Factors: https://r4ds.hadley.nz/factors.html
#Factor NOTCOMM_RELY
selectdata <- selectdata %>% filter(!is.na(NOTCOMM_RELY)) %>%
  mutate(
    # Labels go in a separate column so the numeric codes remain available for scoring
    NOTCOMM_RELY_LABEL = case_when(
      NOTCOMM_RELY == 1 ~ "Strongly Disagree",
      NOTCOMM_RELY == 2 ~ "Disagree",
      NOTCOMM_RELY == 3 ~ "Slightly Disagree",
      NOTCOMM_RELY == 4 ~ "Slightly Agree",
      NOTCOMM_RELY == 5 ~ "Agree",
      NOTCOMM_RELY == 6 ~ "Strongly Agree"
    )
  )
#explanation: Removes rows where NOTCOMM_RELY is missing and adds labels for codes 1-6 (low to high agreement with the statement "I cannot rely on my neighbors for help if I need it"). The numeric codes are kept for the composite scores and MLR modeling.
#source: R for Data Science (2e), 16 Factors: https://r4ds.hadley.nz/factors.html
#Factor NOTCOMM_DISTRUST
selectdata <- selectdata %>% filter(!is.na(NOTCOMM_DISTRUST)) %>%
  mutate(
    # Labels go in a separate column so the numeric codes remain available for scoring
    NOTCOMM_DISTRUST_LABEL = case_when(
      NOTCOMM_DISTRUST == 1 ~ "Strongly Disagree",
      NOTCOMM_DISTRUST == 2 ~ "Disagree",
      NOTCOMM_DISTRUST == 3 ~ "Slightly Disagree",
      NOTCOMM_DISTRUST == 4 ~ "Slightly Agree",
      NOTCOMM_DISTRUST == 5 ~ "Agree",
      NOTCOMM_DISTRUST == 6 ~ "Strongly Agree"
    )
  )
#explanation: Removes rows where NOTCOMM_DISTRUST is missing and adds labels for codes 1-6 (low to high agreement with the statement "I do not trust my neighbors"). The numeric codes are kept for the composite scores and MLR modeling.
#source: R for Data Science (2e), 16 Factors: https://r4ds.hadley.nz/factors.html
Creating Community Variables
selectdata <- selectdata %>%
rowwise() %>%
mutate(COMMUNITY = mean(c(COMM_FEEL, COMM_HELP, COMM_NEIGHBORS), na.rm = TRUE))
#explanation: Creating a new variable (COMMUNITY) by averaging each trust/safety measure (COMM_FEEL, COMM_HELP, COMM_NEIGHBORS) using a rowwise function. Creates one variable for each respondent that measures overall safety and neighborhood trust perception for MLR modeling with various sociodemographic variables.
#source: https://cran.r-project.org/web/packages/dplyr/vignettes/rowwise.html
selectdata <- selectdata %>%
rowwise() %>%
mutate(NONCOMMUNITY = mean(c(NOTCOMM_UNSAFE, NOTCOMM_RELY, NOTCOMM_DISTRUST), na.rm = TRUE))
#explanation: Creating a new variable (NONCOMMUNITY) by averaging each distrust/lack of safety measure (NOTCOMM_UNSAFE, NOTCOMM_RELY, NOTCOMM_DISTRUST) using a rowwise function. Creates one variable for each respondent that measures perceived lack of safety and neighborhood trust for MLR modeling with various sociodemographic variables.
#source: https://cran.r-project.org/web/packages/dplyr/vignettes/rowwise.html
Reverse Scoring
library(psych)
# Create keys for scoring
community_keys <- list(
COMMUNITY = c("COMM_FEEL", "COMM_HELP", "COMM_NEIGHBORS", "NOTCOMM_UNSAFE", "NOTCOMM_RELY", "NOTCOMM_DISTRUST"))
#source: The Quantitative Playbook for Public Health Research in R. (McCarty, 2025)
#explanation: Creates the community_keys object from the community and non-community measures
#Reverse scoring: a minus sign in front of the NOTCOMM items marks them for reverse scoring so that all items align with the community scale
community_keys_with_reverse <- list(
COMMUNITY = c("COMM_FEEL", "COMM_HELP", "COMM_NEIGHBORS", "-NOTCOMM_UNSAFE", "-NOTCOMM_RELY", "-NOTCOMM_DISTRUST")
)
#source: The Quantitative Playbook for Public Health Research in R. (McCarty, 2025)
#explanation: Creates the community scale by reverse coding the NOTCOMM variables so that higher scores on every item indicate stronger perceived safety and trust. Computes each respondent's average score of perceived safety and trust.
community_scores <- scoreItems(community_keys_with_reverse, selectdata)
composite_scores <- community_scores$scores
selectdata$COMMUNITY <- composite_scores[,"COMMUNITY"]
#source: The Quantitative Playbook for Public Health Research in R. (McCarty, 2025)
#explanation: Calculates the community score with reverse-coded items and stores the final composite score in selectdata
Simplifying Gender and Race Variables
library(dplyr)
#| label: Simplifying-Racialized-group-variables
selectdata <- selectdata %>%
mutate(
RACE.4 = case_when(
grepl(",", RACIALIZED) ~ "Mixed/Other",
RACIALIZED == "3" ~ "Black",
RACIALIZED == "2" ~ "Asian",
RACIALIZED == "7" ~ "White",
RACIALIZED %in% c("1", "4","5","6", "8") ~ "Mixed/Other",
TRUE ~ NA_character_)
)
#explanation: To simplify the racialized group selections, a new variable is created that assigns multiple selections and less commonly selected groups to "Mixed/Other". Highly selected groups keep their original categories (Black, Asian, White). This code is one step towards forming binary variables and labels for MLR modeling.
#source: https://fripublichealth.quarto.pub/zerosum/
selectdata <- selectdata %>%
  mutate(
    POC = case_when(
      RACE.4 == "Asian" ~ "1",
      RACE.4 == "Black" ~ "1",
      RACE.4 == "White" ~ "0",
      RACE.4 == "Mixed/Other" ~ "1",
      TRUE ~ NA_character_
    )
  )
#explanation: Adds a POC column to the dataframe. This code takes the simplified RACE.4 racialized group variable and creates a binary indicator assigning White = 0 and POC = 1 (POC including Mixed/Other, Asian, and Black from RACE.4). This code helps fulfill the binary requirement for categorical variables in MLR modeling.
#source: https://fripublichealth.quarto.pub/zerosum/
selectdata <- selectdata %>%
  mutate(
    GENDER01 = case_when(
      GENDER == "0" ~ "0",
      GENDER == "1" ~ "1",
      GENDER == "2" ~ NA_character_
    )
  )
#explanation: Creating a new variable (GENDER01) recodes the existing gender variable as Woman = 0 and Man = 1, allowing the MLR to interpret gender as a binary predictor. The remaining gender selection (Nonbinary, genderfluid, or genderqueer (2)) was set to NA for the purpose of the binary predictor because it was selected by only a single participant in surveying.
#source: https://fripublichealth.quarto.pub/zerosum/
3.3 Normality: Community Safety & Lack of Safety
Trust/Safety & Distrust/Lack of Safety variables: Histogram & Normal Distribution
library(ggplot2)
COMMUNITY_DISTPLOT <- ggplot(selectdata, aes(x = COMMUNITY)) +
geom_histogram(binwidth = 0.5, fill = "darkgreen", color = "black") +
scale_x_continuous(
breaks = 1:6,
labels = c(
"1 = Strongly Disagree",
"2 = Disagree",
"3 = Slightly Disagree",
"4 = Slightly Agree",
"5 = Agree",
"6 = Strongly Agree"
),
limits = c(0.5, 6.5) # limits span every break (1-6) so all ticks and the outermost bars appear
) +
labs(
title = "Community Safety Perception Distribution",
x = "Level of Agreement with Safety Statements",
y = "Number of Responses"
) +
theme_bw()
# Print and save to the plots folder
print(COMMUNITY_DISTPLOT)
ggsave("plots/COMMUNITY_DISTPLOT.png",
plot = COMMUNITY_DISTPLOT,
width = 12, height = 10, dpi = 300)
#explanation: Visualizing a plot for the dependent variable COMMUNITY to check for a mostly normal distribution (roughly bell shaped and no extreme outliers), which is required to continue with MLR. Relative normality indicates the variable is usable in a multiple linear regression model and does not require transformation.
#source: https://posit.cloud/learn/recipes/visualize/VisualizeA5, https://ggplot2.tidyverse.org/reference/ggtheme.html
#| fig-alt: NONCOMMUNITY variables centered distribution between 2 and 3 on the Likert response scale with a majority of scores between 1 and 3. The histogram is relatively symmetric and evenly distributed.
#Check Normal distribution for regression test
library(ggplot2)
#Histogram for NONCOMMUNITY
NONCOMMUNITY_DISTPLOT <- ggplot(selectdata, mapping = aes(x = NONCOMMUNITY)) +
geom_histogram(binwidth = .5, fill = "darkred", color = "black")+
scale_x_continuous(
breaks = 1:6,
labels = c(
"1 = Strongly Disagree",
"2 = Disagree",
"3 = Slightly Disagree",
"4 = Slightly Agree",
"5 = Agree",
"6 = Strongly Agree"
),
limits = c(0.5, 6.5) # limits span every break (1-6) so all ticks and the outermost bars appear
) +
labs(
title = "Lack of Community Safety Perception Distribution",
x = "Level of Agreement with Lack of Safety Statements",
y = "Number of Responses"
) +
theme_bw()
# Print and save to the plots folder
print(NONCOMMUNITY_DISTPLOT)
ggsave("plots/NONCOMMUNITY_DISTPLOT.png",
plot = NONCOMMUNITY_DISTPLOT,
width = 12, height = 10, dpi = 300)
#explanation: Visualizing a plot for the dependent variable NONCOMMUNITY to check for a mostly normal distribution (roughly bell shaped and no extreme outliers), which is required to continue with MLR. Relative normality indicates the variable is usable in a multiple linear regression model and does not require transformation.
#source: https://posit.cloud/learn/recipes/visualize/VisualizeA5, https://ggplot2.tidyverse.org/reference/ggtheme.html
3.5 Modeling Sociodemographics & Safety Perceptions
3.5.1 Multiple Linear Regression: Community
#Perform MLR for COMMUNITY
#Look at observations
head(selectdata)
# A tibble: 6 × 16
# Rowwise:
  POLITICAL_BELIEFS GENDER SOCIALSTATUS RACIALIZED COMM_FEEL COMM_HELP
              <dbl>  <dbl>        <dbl> <chr>          <dbl>     <dbl>
1                 4      0            6 2                  6         4
2                 4      0            3 4,7                4         5
3                 4      0            7 7                  4         4
4                 3      0            3 2,7                2         4
5                 1      0            3 3                  2         3
6                 4      1            6 4                  5         5
# ℹ 10 more variables: COMM_NEIGHBORS <dbl>, NOTCOMM_UNSAFE <dbl>,
#   NOTCOMM_RELY <dbl>, NOTCOMM_DISTRUST <dbl>, `==...` <lgl>, COMMUNITY <dbl>,
#   NONCOMMUNITY <dbl>, RACE.4 <chr>, POC <chr>, GENDER01 <chr>
#Create linear model of values in COMM ALL
COMMUNITY.lm <- lm(COMMUNITY ~ POC + GENDER01 + POLITICAL_BELIEFS + SOCIALSTATUS, data = selectdata)
summary(COMMUNITY.lm)
Call:
lm(formula = COMMUNITY ~ POC + GENDER01 + POLITICAL_BELIEFS +
    SOCIALSTATUS, data = selectdata)

Residuals:
     Min       1Q   Median       3Q      Max
-1.88876 -0.55722  0.08683  0.48973  1.71091

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
(Intercept)        3.92810    0.52230   7.521 1.33e-09 ***
POC1              -0.07738    0.23673  -0.327    0.745
GENDER011          0.19665    0.23596   0.833    0.409
POLITICAL_BELIEFS  0.06927    0.09639   0.719    0.476
SOCIALSTATUS       0.04439    0.06579   0.675    0.503
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.7643 on 47 degrees of freedom
Multiple R-squared:  0.05591, Adjusted R-squared:  -0.02443
F-statistic: 0.6959 on 4 and 47 DF,  p-value: 0.5986
#explanation: Race, Gender, Political beliefs, and Social status evaluated as independent predictors of community trust/perceived safety through a linear model. The response variable COMMUNITY is predicted from demographic and belief variables.
#source: https://www.datacamp.com/tutorial/multiple-linear-regression-r-tutorial
Data indicating community safety/trust (Figure 1.1) and distrust/lack of safety (Figure 1.2) were visualized as histograms and checked for normality before performing multiple linear regression tests. The distributions were relatively normal without extreme outliers or skew; therefore, multiple linear regressions were completed for both the community safety/trust variables and the non-community safety and trust variables (Figures 1.1 and 1.2). The regression models evaluated racialized group (POC vs White), gender (Male vs Female), social status, and political beliefs as predictors of perceived community safety and of perceived lack of safety and presence of distrust. The multiple linear regression evaluating these demographic variables as predictors of community safety revealed no statistically significant differences in community safety by any of the demographic predictors. In terms of racialized groups, people of color (POC) scored 0.08 points lower in perceived safety and community trust than white respondents; however, this difference was not statistically significant (p = 0.75). Similarly, no evidence showed gender as an influence on community trust or safety: on average, respondents identifying as male scored only 0.20 points higher on community safety scores (p = 0.41). Political beliefs were also not a significant predictor, with each one-unit increase toward conservative belief associated with a 0.07 point increase in community safety/trust (p = 0.48). Additionally, each one-unit increase in social status was associated with a 0.04 point increase in average safety score; however, with a p value of 0.50, there was no clear evidence that higher social status correlates with stronger feelings of safety.
With none of the demographic predictors being significant, the model overall explained only 5.6% of the variance in community safety scores (adjusted R-squared = -0.02; F-test p = 0.60). These results provide no evidence that racialized group, gender, political beliefs, or social status predict perceived community safety scores.
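To keep reported estimates in step with the fitted model, the values quoted in prose can be read directly from the model object. The sketch below is an illustration on simulated data, not the survey responses. The variable names mirror those in this report, but the data frame `sim` and its values are hypothetical.

```r
set.seed(1)  # simulated data for illustration; not the survey responses

n <- 40
sim <- data.frame(
  POC          = sample(c("0", "1"), n, replace = TRUE),  # character 0/1 coding
  SOCIALSTATUS = sample(1:10, n, replace = TRUE)
)
sim$COMMUNITY <- 4 + 0.05 * sim$SOCIALSTATUS + rnorm(n, sd = 0.7)

fit <- lm(COMMUNITY ~ POC + SOCIALSTATUS, data = sim)

# lm() converts the character column to a factor; the row "POC1" is the
# estimated difference between group "1" and the reference group "0".
coef_table <- coef(summary(fit))
print(round(coef_table, 3))

# Pull the values quoted in prose directly from the fitted model.
poc_estimate <- coef_table["POC1", "Estimate"]
poc_p        <- coef_table["POC1", "Pr(>|t|)"]
r_squared    <- summary(fit)$r.squared
print(c(estimate = poc_estimate, p = poc_p, r2 = r_squared))
```

This also explains the coefficient names in the model output: `POC1` and `GENDER011` are the variable name followed by the non-reference level "1".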
3.5.2 Multiple Linear Regression: Noncommunity
#Perform MLR for NONCOMMUNITY
#Look at observations
head(selectdata)
# A tibble: 6 × 16
# Rowwise:
  POLITICAL_BELIEFS GENDER SOCIALSTATUS RACIALIZED COMM_FEEL COMM_HELP
              <dbl>  <dbl>        <dbl> <chr>          <dbl>     <dbl>
1                 4      0            6 2                  6         4
2                 4      0            3 4,7                4         5
3                 4      0            7 7                  4         4
4                 3      0            3 2,7                2         4
5                 1      0            3 3                  2         3
6                 4      1            6 4                  5         5
# ℹ 10 more variables: COMM_NEIGHBORS <dbl>, NOTCOMM_UNSAFE <dbl>,
#   NOTCOMM_RELY <dbl>, NOTCOMM_DISTRUST <dbl>, `==...` <lgl>, COMMUNITY <dbl>,
#   NONCOMMUNITY <dbl>, RACE.4 <chr>, POC <chr>, GENDER01 <chr>
#Create linear model of values in NONCOMM ALL
NONCOMMUNITY.lm <- lm(NONCOMMUNITY ~ POC + GENDER01 + POLITICAL_BELIEFS + SOCIALSTATUS, data = selectdata)
summary(NONCOMMUNITY.lm)
Call:
lm(formula = NONCOMMUNITY ~ POC + GENDER01 + POLITICAL_BELIEFS +
    SOCIALSTATUS, data = selectdata)

Residuals:
     Min       1Q   Median       3Q      Max
-1.93892 -0.49784  0.02533  0.59418  1.86347

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
(Intercept)        3.10546    0.54745   5.673  8.4e-07 ***
POC1               0.13228    0.24813   0.533    0.596
GENDER011         -0.31649    0.24733  -1.280    0.207
POLITICAL_BELIEFS -0.03699    0.10104  -0.366    0.716
SOCIALSTATUS      -0.07495    0.06896  -1.087    0.283
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.8011 on 47 degrees of freedom
Multiple R-squared:  0.09796, Adjusted R-squared:  0.02119
F-statistic: 1.276 on 4 and 47 DF,  p-value: 0.2928
#explanation: Race, Gender, Political beliefs, and Social status evaluated as independent predictors of community trust/perceived safety through a linear model.
#source: https://www.datacamp.com/tutorial/multiple-linear-regression-r-tutorial
Evaluating the same demographic variables as predictors of perceived lack of safety and distrust produced similarly nonsignificant results. Being a person of color was associated with only a 0.13 point increase in distrust/lack of safety scores, which was not significant (p = 0.60). For gender, males scored on average 0.32 points lower in feelings of lack of safety than females. The direction of this estimate is consistent with females feeling slightly more unsafe in their communities, but the difference was not significant (p = 0.21). Political beliefs did not predict perceived lack of safety, as each one-unit increase in political belief corresponded to a 0.04 point decrease in distrust (p = 0.72). Similarly, each one-unit increase in social status corresponded to only a 0.075 point decrease in perceived lack of safety, and with a p value of 0.28 this slight association between higher social status and lower distrust was not significant. Overall, the model explained about 9.8% of the variance in noncommunity scores (adjusted R-squared = 0.02) and was not significant (F(4, 47) = 1.28, p = 0.29).
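Point estimates and p values such as those reported above are often easier to interpret alongside confidence intervals. The sketch below shows one way to tabulate them with base R's `confint()`. It runs on a simulated data frame named `sim`, which is hypothetical: the variable names follow the model formula in this section, but the values, the 0/1 coding, and the assumed 1 to 5 and 1 to 10 scale ranges are illustrative assumptions, so the numbers it prints are not the study's results.

```r
# Hypothetical data with the same variable names as the fitted model.
# Values and scale ranges are assumptions for illustration only.
set.seed(2)
n <- 52
sim <- data.frame(
  POC               = factor(rep(c(0, 1), length.out = n)),
  GENDER01          = factor(rep(c(0, 0, 1, 1), length.out = n)),
  POLITICAL_BELIEFS = sample(1:5, n, replace = TRUE),
  SOCIALSTATUS      = sample(1:10, n, replace = TRUE)
)
sim$NONCOMMUNITY <- rnorm(n, mean = 2.5, sd = 0.8)

# Same model formula as in the analysis above
fit <- lm(NONCOMMUNITY ~ POC + GENDER01 + POLITICAL_BELIEFS + SOCIALSTATUS,
          data = sim)

# Coefficient estimates next to their 95% confidence intervals
ci_table <- cbind(Estimate = coef(fit), confint(fit, level = 0.95))
round(ci_table, 3)
```

An interval that spans zero conveys the same conclusion as a nonsignificant p value, while also showing how large an effect the data cannot rule out.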
4 Discussion
4.1 Findings & Existing Theory
The collected data and fitted models revealed no significant association between the measured sociodemographic variables and feelings of community trust, distrust, safety, or lack of safety. These conclusions are drawn from a limited sample of randomly selected, consenting Binghamton University students and Broome County residents over the age of 18, collected over a month-long tabling period. Because the original hypotheses that sociodemographic categories predict higher or lower feelings of safety, trust, and distrust were not supported, these data support exploring alternative explanations for what shapes perceptions of safety and feelings of trust. In contrast to the lack of association found in this study, some studies highlight that safety and crime are correlated with sociodemographic categories, location, neighborhood stability, and social cohesion (Brisson & Roll, 2012). However, other studies also report inconclusive findings: a national perception survey could not clearly determine whether perceived environmental risk varied by gender and race because results were mixed (Flynn et al., 1994). Additionally, the lack of association between demographic variables and community safety beliefs aligns with other studies investigating demographics and trust using socioecological approaches. These studies include multilevel analyses from the Project on Human Development in Chicago indicating that neighborhood, gender, home ownership, mobility, and socioeconomic status are not associated with perceived violence (Sampson et al., 1997). Similarly, other ecological studies investigating demographic trust in vaccine safety found little to no correlation between sociodemographic categories and trust levels (Lim & Moon, 2023).
Although the original hypotheses were not supported, the finding that demographic variables are not significant predictors of safety beliefs, trust, perceived lack of safety, and distrust aligns with existing literature and is consistent with the multiple linear regression results.
4.2 Limitations & Implications
Limitations of this study include sampling limitations, measurement and construct inconsistencies, and the assumptions of a multiple linear regression model. The results may have been limited or skewed by the low number of individuals who completed the survey. Individuals with stronger opinions about safety and community trust may have been more likely to complete the entire survey, or may have abandoned it because of strong feelings about the phrasing of the questions. Either pattern could produce a skewed dataset that does not represent a wide range of beliefs. Additionally, responses may reflect social desirability bias, with respondents answering in ways they believed would be viewed favorably by others rather than indicating their true beliefs. The study may also have been limited by its constructs and selected measures, which may have failed to capture the key concepts or left room for participants to interpret each item differently. Finally, a multiple linear regression model assumes a linear, additive relationship between each predictor and the outcome, which limits the conclusions that can be drawn. Such a model may fail to capture the nuance that naturally exists in perceptions and beliefs, including influences outside the evaluated demographics.
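The modeling assumptions mentioned above can be probed with a few base R checks. The sketch below is one possible approach, not the analysis performed in this study: it compares the additive model with a version that adds a hypothetical `POC:GENDER01` interaction, then inspects the residuals. The data frame `sim` is simulated, and its values and scale ranges are assumptions made only so the code runs.

```r
# Hypothetical data with the same variable names as the fitted model.
# Values and scale ranges are assumptions for illustration only.
set.seed(3)
n <- 52
sim <- data.frame(
  POC               = factor(rep(c(0, 1), length.out = n)),
  GENDER01          = factor(rep(c(0, 0, 1, 1), length.out = n)),
  POLITICAL_BELIEFS = sample(1:5, n, replace = TRUE),
  SOCIALSTATUS      = sample(1:10, n, replace = TRUE)
)
sim$NONCOMMUNITY <- rnorm(n, mean = 2.5, sd = 0.8)

# Additive model (as in the analysis) and a model with one interaction term
additive <- lm(NONCOMMUNITY ~ POC + GENDER01 + POLITICAL_BELIEFS + SOCIALSTATUS,
               data = sim)
interact <- update(additive, . ~ . + POC:GENDER01)

# Nested-model F test: does the interaction improve fit beyond the additive model?
cmp <- anova(additive, interact)
cmp

# Residuals vs fitted values: curvature or funnel shapes suggest
# nonlinearity or unequal variance
plot(fitted(additive), resid(additive),
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Shapiro-Wilk test on the residuals, which is where the normality
# assumption of linear regression applies
resid_normality <- shapiro.test(resid(additive))
resid_normality$p.value
```

With a sample of this size, such tests have limited power, so a nonsignificant result would not establish that the additive linear form is correct.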
Beyond these limitations, the absence of a relationship between demographic variables and community safety beliefs, trust, and distrust indicates a need to explore what other predictors influence safety perceptions. The lack of association found in this study may imply that other factors influencing feelings of safety and trust are underrepresented or unexplored in existing research, and may signal a pivot towards relationships outside of basic demographic categories. Building on this study, an important next step is to verify that demographic variables are not significant predictors of safety and trust beliefs using a larger sample and better-validated measures. Beyond confirming the validity of these results, future research may need to evaluate more nuanced influences on safety perceptions, such as familial beliefs, media content consumed, and level of education. Social cohesion, relationships with media, attitudes towards fairness and authority, and cultural influences are further candidate variables. Drivers of apathy or a lack of considered opinion about safety and community trust may also require evaluation in order to devise measures that accurately reveal any existing differences in perceptions of safety and trust. A greater understanding of the key factors that shape safety beliefs and perceptions of trust is essential to developing targeted interventions that increase safety and trust within communities.
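The call for a larger sample can be made more concrete with a rough power calculation for the overall F test of a multiple regression. The sketch below uses Cohen's f-squared framework and base R's noncentral F distribution. The helper names `regression_power` and `required_n` are hypothetical, and the effect size of 0.15 is Cohen's conventional "medium" benchmark, chosen here as an assumption rather than estimated from this study. The sample size of 52 corresponds to the 47 residual degrees of freedom plus 5 estimated coefficients in the model above.

```r
# Power of the overall F test in a multiple regression (Cohen's f^2 framework).
# n: sample size; n_predictors: number of predictors; f2: assumed effect size.
regression_power <- function(n, n_predictors, f2, alpha = 0.05) {
  u <- n_predictors            # numerator degrees of freedom
  v <- n - n_predictors - 1    # denominator degrees of freedom
  lambda <- f2 * (u + v + 1)   # noncentrality parameter
  crit <- qf(1 - alpha, u, v)  # critical F value under the null
  pf(crit, u, v, ncp = lambda, lower.tail = FALSE)
}

# Smallest sample size that reaches the target power, found by stepping n upward
required_n <- function(n_predictors, f2, power = 0.80, alpha = 0.05) {
  n <- n_predictors + 3
  while (regression_power(n, n_predictors, f2, alpha) < power) {
    n <- n + 1
  }
  n
}

# Power at the current sample size for an assumed medium effect (f2 = 0.15)
regression_power(n = 52, n_predictors = 4, f2 = 0.15)

# Sample size needed for 80% power under the same assumption
required_n(n_predictors = 4, f2 = 0.15)
```

Smaller assumed effects require considerably larger samples, so the choice of `f2` should be justified from prior literature before such a calculation is used for planning.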
5 References
Brisson, D. & Roll, S. (2012). The effect of neighborhood on crime and safety: A review of the evidence. Journal of Evidence-Based Social Work, 9(4), 333–350. https://doi.org/10.1080/15433714.2010.525407
Broome County, NY | Data USA. (2023). Datausa.io. https://datausa.io/profile/geo/broome-county-ny
Data USA. (2022). United States | Data USA. Datausa.io. https://datausa.io/profile/geo/united-states
Diclemente, R. J., Salazar, L. F., & Crosby, R. A. (2019). Health behavior theory for public health: Principles, foundations, and applications. Jones & Bartlett Learning.
Flynn, J., Slovic, P., & Mertz, C. K. (1994). Gender, Race, and Perception of Environmental Health Risks. Risk Analysis, 14(6), 1101–1108. https://doi.org/10.1111/j.1539-6924.1994.tb00082.x
Gramlich, J. (2024, April 24). What the data says about crime in the U.S. Pew Research Center. https://www.pewresearch.org/short-reads/2024/04/24/w
Gumas, E. D., Gunja, M. Z., & Williams II, R. D. (2024). Comparing Deaths from Gun Violence in the U.S. with Other Countries. Commonwealthfund.org. https://doi.org/10.26099/1t4e-7h62
Hardiman, E. R., Jones, L. V., & Cestone, L. M. (2019). Neighborhood Perceptions of Gun Violence and Safety: Findings from a Public Health-Social Work Intervention. Social Work in Public Health, 34(6), 492–504. https://doi.org/10.1080/19371918.2019.1629144
Lim, J., & Moon, K.-K. (2023). Political Ideology and Trust in Government to Ensure Vaccine Safety: Using a U.S. Survey to Explore the Role of Political Trust. International Journal of Environmental Research and Public Health, 20(5), 4459. https://doi.org/10.3390/ijerph20054459
Lösel, F., & Farrington, D. P. (2012). Direct protective and buffering protective factors in the development of youth violence. American Journal of Preventive Medicine, 43(2), S8–S23. https://doi.org/10.1016/j.amepre.2012.04.029
Miller, G. F., Barnett, S. B. L., Florence, C. S., Harrison, K. M., Dahlberg, L. L., & Mercy, J. A. (2023). Costs of fatal and nonfatal firearm injuries in the U.S., 2019 and 2020. American Journal of Preventive Medicine, 66(2), 195–204. https://doi.org/10.1016/j.amepre.2023.09.026
Patton, D. U., Aguilar, N., Landau, A. Y., Thomas, C., Kagan, R., Ren, T., Stoneberg, E., Wang, T., Halmos, D., Saha, A., Ananthram, A., & McKeown, K. (2022). Community implications for gun violence prevention during co-occurring pandemics; a qualitative and computational analysis study. Preventive Medicine, 165, 107263. https://doi.org/10.1016/j.ypmed.2022.107263
Raheemy, Y., Sherratt, F., & Hallowell, M. R. (2025). What is safety? contemporary definitions and interpretations across North America. Safety Science, 185, 106798. https://doi.org/10.1016/j.ssci.2025.106798
Rutherford, A., Zwi, A. B., Grove, N. J., & Butchart, A. (2007). Violence: A Glossary. Journal of Epidemiology & Community Health, 61(8), 676–680. https://doi.org/10.1136/jech.2005.043711
Sampson, R. J., Raudenbush, S., & Earls, F. (1997). Neighborhoods and Violent Crime: a Multilevel Study of Collective Efficacy. Science, 277(5328), 918–924. https://doi.org/10.1126/science.277.5328.918
Vision of Humanity. (2024). Global Peace Index. Vision of Humanity. https://www.visionofhumanity.org/maps/#/