November 2023 – mahitha

In this analysis, we focused on the armed status of individuals who were shot, utilizing a dataset represented by the variable ‘data.’ To ensure the accuracy of our analysis, we removed any rows where information on the armed status was not available (NaN values).

Subsequently, we tallied the occurrences of each armed status category using the ‘value_counts()’ function, producing a count for each distinct armed status in the dataset.

The results were visualized through a bar chart, generated using the ‘matplotlib’ library, with the x-axis representing different armed statuses and the y-axis indicating the frequency of occurrences. The chart was customized with a title, labels for both axes, and proper rotation for better readability of the x-axis labels.

The chosen color scheme for the bars was derived from the ‘tab10’ colormap. The final plot provides a clear overview of the distribution of armed statuses among individuals who were shot, offering insights into the prevalence of each category.
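
The steps described above can be sketched roughly as follows. This is a minimal illustration rather than the original code: the column name `armed` and the sample rows are assumptions made up for the example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows standing in for the real dataset; 'armed' is an assumed column name.
data = pd.DataFrame(
    {"armed": ["gun", "knife", "gun", None, "unarmed", "gun", "knife"]}
)

# Drop rows with no armed status, then count each category.
armed_counts = data.dropna(subset=["armed"])["armed"].value_counts()

# One 'tab10' color per bar; the palette has 10 colors, so cycle if there are more bars.
cmap = plt.get_cmap("tab10")
colors = [cmap(i % 10) for i in range(len(armed_counts))]

fig, ax = plt.subplots(figsize=(8, 5))
ax.bar(armed_counts.index, armed_counts.values, color=colors)
ax.set_title("Armed status of individuals shot")
ax.set_xlabel("Armed status")
ax.set_ylabel("Number of individuals")
ax.tick_params(axis="x", labelrotation=45)  # rotate labels for readability
fig.tight_layout()
```

With many distinct armed-status values, it may help to plot only the most frequent ones (for example `armed_counts.head(10)`) so the labels stay legible.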

In this analysis, we examined whether individuals were fleeing from the police, using the dataset stored in the variable ‘data.’ To ensure the accuracy of our examination, we first removed any rows where information about the fleeing status was not available (NaN values).

Then, we proceeded to count the occurrences of each fleeing status category through the ‘value_counts()’ function, generating a count for each unique fleeing status in the dataset.

The findings were visually presented using a bar chart, crafted with the ‘matplotlib’ library. The x-axis of the chart illustrates different fleeing statuses, while the y-axis indicates the frequency of occurrences. To enhance readability, the chart includes a title, labels for both axes, and proper rotation of the x-axis labels.

The color palette chosen for the bars was derived from the ‘tab10’ colormap. The resulting plot offers a clear representation of the distribution of fleeing statuses among individuals involved in police shootings, shedding light on the prevalence of each category.

In this analysis, we examined whether individuals exhibited signs of mental illness, using the dataset stored in the variable ‘data.’ To ensure the reliability of our examination, we first removed any rows where information regarding signs of mental illness was not available (NaN values).

Subsequently, we tabulated the occurrences of each mental illness status category through the ‘value_counts()’ function, producing a count for each distinct status in the dataset.

The results were visually communicated via a bar chart, crafted with the ‘matplotlib’ library. The x-axis of the chart illustrates different mental illness statuses, while the y-axis denotes the frequency of occurrences. For clarity, the chart features a title, labels for both axes, and appropriate rotation of the x-axis labels.

The color scheme chosen for the bars was derived from the ‘tab10’ colormap. The resulting plot shows the distribution of mental illness statuses among individuals involved in police shootings, offering a glimpse into the prevalence of each category.

The bar plot indicates that signs of mental illness in the individual shot may be a contributing factor in some fatal police shootings. However, individuals exhibiting no signs of mental illness made up a larger share of those shot than individuals with such signs.
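
To make the comparison between the two groups concrete, the counts can be turned into shares. The sketch below assumes a boolean column named `signs_of_mental_illness` and uses made-up rows, so the numbers are illustrative only. Note that these shares describe the composition of the dataset; they do not by themselves measure how likely a person in either group is to be shot, which would require population figures.

```python
import pandas as pd

# Made-up rows; 'signs_of_mental_illness' is an assumed column name.
data = pd.DataFrame(
    {"signs_of_mental_illness": [False, False, True, False, None, True, False, False]}
)

# Drop missing values and restore a clean boolean dtype.
status = data["signs_of_mental_illness"].dropna().astype(bool)

counts = status.value_counts()                # absolute counts per category
shares = status.value_counts(normalize=True)  # fraction of non-missing rows

summary = pd.DataFrame({"count": counts, "share": shares.round(2)})
print(summary)
```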

Fatal police shootings predominantly involved men aged between 25 and 35 who were armed with either a gun or a knife and did not exhibit signs of mental illness. The majority of these individuals were not fleeing from the police, and the prevalent racial demographic was white.

Geographically, the incidents were concentrated primarily in California, with Los Angeles emerging as the top city. Other states, such as Texas and Florida, also recorded these incidents regularly, though at varying frequency. California exhibited the highest occurrence, while Texas and Florida, along with several other states, followed the trend, showing an average fluctuation of approximately -180, reaching a minimum at 250.

On a city level, Phoenix ranked second and Houston ranked third. At the state level, Texas, where Houston is located, ranked second, and Arizona, where Phoenix is situated, also appeared among the top states. This analysis sheds light on the patterns and distribution of fatal police shootings, emphasizing demographic and geographic factors.

November 9, 2023 / November 10, 2023

Washington data analysis

This report undertakes a comprehensive analysis of demographic factors, including age distribution, race, mental health conditions, gender, and other pertinent variables, for the individuals involved in police shootings. The data used for this analysis comes from the Washington Post Police Shootings Database. The primary aim of this report is to illuminate the age demographics of individuals impacted by police violence and to present the findings clearly.

The foundation of this analysis rests upon the Washington Post Police Shootings Database, encompassing data pertaining to incidents of police shootings in the United States spanning the years 2015 to 2023. To ensure the integrity of our analysis, missing age values were replaced with the dataset’s median age, and NaN and null values in the other columns were handled before analysis. These preprocessing steps prepared the dataset for subsequent visualization and analysis.

We use Python with the pandas and matplotlib libraries for dataset handling and graphical representation, respectively. The Washington Post Police Shootings Database, in CSV format, is loaded into a pandas DataFrame named ‘data.’ Subsequently, the analysis focuses on examining trends over time, particularly the number of fatal police shootings each year.

To ensure data integrity, rows with NaN values in the ‘date’ column are removed. The ‘date’ column is then converted to a datetime format, and the corresponding years are extracted and stored in a new column named ‘year.’ The number of fatal police shootings per year is computed using the groupby function, and a line plot is generated using matplotlib.

– Loading the CSV file into a pandas DataFrame
– Dropping rows with NaN values in the ‘date’ column for data integrity
– Converting ‘date’ to datetime and extracting the year
– Calculating the number of fatal police shootings per year
– Creating a line plot
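
The steps listed above might look like the following sketch. The inline CSV text is made up so that the example runs on its own; with the real file, `pd.read_csv` would be pointed at the downloaded CSV instead. The column names `date` and `year` come from the description above, everything else is an assumption.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend; drop for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Tiny made-up CSV standing in for the real file.
csv_text = """id,date
1,2015-01-02
2,2015-06-10
3,2016-03-05
4,
5,2016-11-20
6,2016-12-01
"""

# Load the CSV into a DataFrame.
data = pd.read_csv(io.StringIO(csv_text))

# Drop rows with a missing date, then parse dates and extract the year.
data = data.dropna(subset=["date"]).copy()
data["date"] = pd.to_datetime(data["date"])
data["year"] = data["date"].dt.year

# Number of shootings per year.
shootings_per_year = data.groupby("year").size()

# Line plot of the yearly counts.
fig, ax = plt.subplots(figsize=(8, 5))
ax.plot(shootings_per_year.index, shootings_per_year.values, marker="o")
ax.set_title("Fatal police shootings per year")
ax.set_xlabel("Year")
ax.set_ylabel("Number of shootings")
fig.tight_layout()
```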

Findings

The examination of the plotted graph reveals a notable trend in the number of fatal police shootings over the years. Between 2016 and 2022, there was a pronounced and consistent increase in the count of such incidents. However, a conspicuous anomaly is observed in the year 2023, where a substantial decline in the count is evident. This abrupt shift prompts a cautious interpretation, raising the possibility of missing data or inconclusive details regarding the circumstances leading to the police shootings. It is essential to consider the potential factors contributing to this unexpected decrease and exercise prudence in drawing definitive conclusions about the incidents during the year 2023.
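
One simple way to check whether the drop in the final year reflects incomplete records is to look at the first and last recorded date in each year. The sketch below uses made-up dates; if the last date in the final year falls well before December 31, that year's count is only a partial total and is not directly comparable with earlier years.

```python
import pandas as pd

# Made-up dates standing in for the parsed 'date' column.
dates = pd.to_datetime(
    pd.Series(["2022-01-15", "2022-07-04", "2022-12-30", "2023-01-10", "2023-03-22"])
)

# First date, last date, and number of records for each calendar year.
coverage = dates.groupby(dates.dt.year).agg(first="min", last="max", count="count")
print(coverage)

# Number of distinct months that have records in the final year.
last_year = coverage.index.max()
months_covered = dates[dates.dt.year == last_year].dt.month.nunique()
print(f"{last_year}: records in {months_covered} of 12 months")
```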

The following Python code examines the relationship between race and the number of individuals shot, employing the pandas and matplotlib libraries for data manipulation and visualization. Rows with NaN values in the ‘race’ column are removed to ensure data integrity. Subsequently, the count of people shot for each race is calculated, and a bar chart is generated to visually represent the distribution.

– Dropping rows with NaN values in the ‘race’ column for data integrity
– Counting the number of people shot for each race
– Creating a bar chart
– Rotating x-axis labels for readability
– Adjusting the layout to prevent overlapping

The provided Python code addresses the handling of missing or null values in the ‘age’ column by filling them with the median age. Subsequently, the code calculates the occurrences of each unique age, extracts the unique age values, and generates a scatter plot depicting the distribution of ages of individuals shot by the police.

– Filling NaN values or null values in the ‘age’ column with the median
– Counting the occurrences of each unique age
– Extracting the unique age values
– Creating a scatter plot using the unique age values and their counts
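
A rough sketch of these steps is shown below, using made-up ages. One caveat worth keeping in mind: every missing value becomes the median, so the scatter plot will show an artificial bump at the median age that grows with the number of missing values.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Made-up ages; None marks a missing value.
data = pd.DataFrame({"age": [25, 31, None, 25, 40, 31, 25, None]})

# Fill missing ages with the median of the known ages.
median_age = data["age"].median()
data["age"] = data["age"].fillna(median_age)

# Count how often each age occurs, ordered by age.
age_counts = data["age"].value_counts().sort_index()
unique_ages = age_counts.index

# Scatter plot of age against number of occurrences.
fig, ax = plt.subplots(figsize=(8, 5))
ax.scatter(unique_ages, age_counts.values)
ax.set_title("Age distribution of individuals shot by police")
ax.set_xlabel("Age")
ax.set_ylabel("Number of individuals")
fig.tight_layout()
```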

The following Python code conducts an analysis on the gender distribution of individuals involved in fatal police shootings. It utilizes the pandas library to handle the dataset and matplotlib for graphical representation. Rows with NaN values in the ‘gender’ column are dropped to ensure data integrity. The code then calculates the number of people shot for each gender and generates a bar chart to illustrate the gender distribution.

– Dropping rows with NaN values in the ‘gender’ column for data integrity
– Counting the number of people shot for each gender
– Creating a bar chart
– Rotating x-axis labels for readability
– Adjusting the layout to prevent overlapping
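
The race and gender analyses follow the same drop, count, and plot steps, so one option is to wrap them in a small helper. The function name `plot_category_counts` is hypothetical, and the rows below are made up; this is a sketch of the pattern, not the original code.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop for interactive use
import matplotlib.pyplot as plt
import pandas as pd


def plot_category_counts(df, column, title):
    """Drop missing values in `column`, count each category, and draw a bar chart."""
    counts = df.dropna(subset=[column])[column].value_counts()
    fig, ax = plt.subplots(figsize=(8, 5))
    ax.bar(counts.index.astype(str), counts.values)
    ax.set_title(title)
    ax.set_xlabel(column.capitalize())
    ax.set_ylabel("Number of people shot")
    ax.tick_params(axis="x", labelrotation=45)  # rotate labels for readability
    fig.tight_layout()                          # prevent overlapping labels
    return counts, ax


# Made-up rows standing in for the real dataset.
data = pd.DataFrame(
    {
        "gender": ["M", "M", "F", None, "M"],
        "race": ["W", "B", None, "W", "H"],
    }
)

gender_counts, gender_ax = plot_category_counts(data, "gender", "Shootings by gender")
race_counts, race_ax = plot_category_counts(data, "race", "Shootings by race")
```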

The following Python code examines fatal police shootings based on geographical locations, specifically focusing on cities and states. Utilizing the pandas library for data manipulation and matplotlib for visualization, the code groups the data by city and state, counts the number of shootings for each, and extracts the top 10 cities and states with the highest number of fatal police shootings. Bar charts are then generated to illustrate these findings.

– Grouping the data by city and counting the number of shootings for each city
– Extracting the top 10 cities with the highest number of shootings
– Creating a bar chart for the top 10 cities
– Grouping the data by state and counting the number of shootings for each state
– Extracting the top 10 states with the most shootings
– Creating a bar chart for the top 10 states
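
The grouping and ranking steps could be sketched as below, with made-up rows in place of the real data. The column names `city` and `state` are assumptions based on the description above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows standing in for the real dataset.
data = pd.DataFrame(
    {
        "city": ["Los Angeles", "Los Angeles", "Los Angeles", "Phoenix", "Phoenix", "Houston"],
        "state": ["CA", "CA", "CA", "AZ", "AZ", "TX"],
    }
)

# Count shootings per city and per state, then keep the 10 largest of each.
top_cities = data.groupby("city").size().sort_values(ascending=False).head(10)
top_states = data.groupby("state").size().sort_values(ascending=False).head(10)

# One bar chart per ranking, side by side.
fig, (ax_city, ax_state) = plt.subplots(1, 2, figsize=(12, 5))
ax_city.bar(top_cities.index, top_cities.values)
ax_city.set_title("Top 10 cities by fatal police shootings")
ax_city.set_ylabel("Number of shootings")
ax_city.tick_params(axis="x", labelrotation=45)

ax_state.bar(top_states.index, top_states.values)
ax_state.set_title("Top 10 states by fatal police shootings")
ax_state.set_ylabel("Number of shootings")
fig.tight_layout()
```

Grouping by city name alone merges cities that share a name across states; grouping by both columns (`data.groupby(["city", "state"])`) keeps them apart if that matters for the ranking.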
