Thursday, January 22, 2015

Crime and Collision in The City of Angels

For one of my UCI research papers I was exploring R's "ggmap" package for spatial visualization of Particulate Matter emissions in the Ports of Long Beach and Los Angeles area. "ggmap" is pretty easy to learn and makes beautiful maps. As a weeknight project I decided to spatially visualize Crime and Collision in Los Angeles, CA. Raw LAPD crime and collision data for 2013 is available here.

####################################################################
# Install required packages
install.packages('ggmap')
install.packages('ggplot2')

library(ggmap)
library(ggplot2)

setwd('/Users/Ankoor/Desktop/ML with R/LA Crime')

# Read Data
crime <- read.csv("LAPD_Crime_and_Collision_Raw_Data_for_2013.csv", stringsAsFactors = FALSE)

Latitude and Longitude are stored together in a single string: "(34.0496, -118.265)". To clean this column I remove "(" and ")" and then split the remaining string "34.0496, -118.265" on the comma and single space (", ") into separate Latitude and Longitude columns.

# Cleaning Location Column to extract Longitude and Latitude
crime$Location <- gsub("\\(|\\)", "", crime$Location.1)
temp_1 <- sapply(strsplit(crime$Location, ", ", fixed = TRUE), "[",1)
temp_2 <- sapply(strsplit(crime$Location, ", ", fixed = TRUE), "[",2)
crime$Lat <- as.numeric(temp_1)
crime$Long <- as.numeric(temp_2)
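
The same cleaning step can be tried on a couple of sample strings before running it on the full file. This is only a sketch: the vector `location` and its values are made up for illustration, and it splits each string once and binds both pieces into a matrix instead of calling strsplit twice.

```r
# Hypothetical sample values in the same "(lat, long)" format as the raw data
location <- c("(34.0496, -118.265)", "(33.9425, -118.408)")

# Strip the parentheses, then split once on ", "
clean <- gsub("[()]", "", location)
parts <- strsplit(clean, ", ", fixed = TRUE)

# Bind the pieces into a numeric matrix: latitude first, longitude second
coords <- do.call(rbind, lapply(parts, as.numeric))
colnames(coords) <- c("Lat", "Long")
coords
```

Any row that does not follow the expected pattern would show up as NA after as.numeric, so counting NAs in the result is a quick sanity check.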

# Keeping necessary Columns
names(crime)
keep <- c('DATE.OCC', 'TIME.OCC', 'AREA.NAME', 'Crm.Cd','Crm.Cd.Desc', 'Lat', 'Long')
crime <- crime[keep]


Date is in string format: "12/31/2013". Need to convert date from "string" to "date" format used in R and then get weekdays.

# Cleaning Date Column
crime$DATE.OCC <- as.Date(crime$DATE.OCC, "%m/%d/%Y")
crime$Day <- weekdays(crime$DATE.OCC)
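
One thing to keep in mind: weekdays() returns plain character strings in the current locale, so facets built on them are ordered alphabetically rather than Monday to Sunday. A possible way around this, sketched on made-up sample dates (the names `date_occ`, `week_levels` and `day` are assumptions, not part of the original script), is to turn the names into an ordered factor:

```r
# Hypothetical sample dates in the same "%m/%d/%Y" format as DATE.OCC
date_occ <- c("12/31/2013", "01/01/2013", "07/04/2013")
dates <- as.Date(date_occ, format = "%m/%d/%Y")

# weekdays() is locale dependent, so build the level order from a known
# Monday-to-Sunday week instead of hard-coding English day names
week_levels <- weekdays(as.Date("2013-01-07") + 0:6)  # 2013-01-07 was a Monday

day <- factor(weekdays(dates), levels = week_levels)
day
```

With a factor like this in the Day column, facet_wrap(~ Day) would lay out the panels in calendar order.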

Time data is in 24-hour format (Military time). To visualize temporal variation in crimes I decided to assign each time to one of 4 quarters of the day as follows: 0000 to 0559 hours = First Quarter, 0600 to 1159 hours = Second Quarter, 1200 to 1759 hours = Third Quarter and 1800 to 2359 hours = Fourth Quarter.

# Quarters in a day
crime$Quarter <- crime$TIME.OCC

crime$Quarter[which(crime$TIME.OCC < 600)] <- 'First'

crime$Quarter[which(crime$TIME.OCC >= 600 & crime$TIME.OCC < 1200)] <- 'Second'

crime$Quarter[which(crime$TIME.OCC >= 1200 & crime$TIME.OCC < 1800)] <- 'Third'

crime$Quarter[which(crime$TIME.OCC >= 1800)] <- 'Fourth'
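
The four assignments above put the cut points at 600, 1200 and 1800, with each lower bound included. As an alternative sketch, base R's cut() can do the same binning in one call. The vector `time_occ` below is made-up sample data, not the real TIME.OCC column.

```r
# Hypothetical sample times in 24-hour HHMM form, as in TIME.OCC
time_occ <- c(0, 559, 600, 1159, 1200, 1759, 1800, 2359)

quarter <- cut(time_occ,
               breaks = c(0, 600, 1200, 1800, 2400),
               labels = c("First", "Second", "Third", "Fourth"),
               right = FALSE,          # intervals are [lower, upper)
               include.lowest = TRUE)  # also keeps a value of exactly 2400

table(quarter)
```

cut() returns a factor whose levels follow the order of `labels`, so plots faceted on it would appear as First, Second, Third, Fourth instead of alphabetically.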


Now creating maps

# Get Longitude and Latitude
# (recent ggmap releases need a Google API key, set with register_google(), before this call)
geocode("Los Angeles")

# Get Google Map
LA = c(lon = -118.2437, lat =  34.05223)
LA.map = get_map(location = LA, zoom = 11, maptype = 'terrain')

# Plotting Crime Density Map
ggmap(LA.map, extent = "normal", maprange=FALSE) %+% crime + aes(x = Long, y = Lat) + 
        stat_density2d(aes(fill = ..level.., alpha = ..level..), size = 5, bins = 20, geom = 'polygon') + 
        scale_fill_continuous(low = 'black', high = 'red', name = "Crime\nDensity") +
        scale_alpha(range = c(0.05, 0.25), guide = FALSE) + 
        coord_map(projection = "mercator", 
                  xlim = c(attr(LA.map, "bb")$ll.lon, attr(LA.map, "bb")$ur.lon), 
                  ylim = c(attr(LA.map, "bb")$ll.lat, attr(LA.map, "bb")$ur.lat)) + 
        theme(legend.justification=c(1,0), legend.position=c(1,0),axis.title = element_blank(), text = element_text(size = 14)) 
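
For readers curious about what the density layer draws: according to the ggplot2 documentation, stat_density2d performs a 2D kernel density estimate with MASS::kde2d and then contours the result. The sketch below runs that estimate directly on made-up points (the simulated `long` and `lat` vectors are assumptions for illustration, not LAPD data).

```r
library(MASS)  # ships with standard R installations

set.seed(1)
# Made-up points scattered around downtown Los Angeles, for illustration only
long <- rnorm(500, mean = -118.2437, sd = 0.05)
lat  <- rnorm(500, mean = 34.05223, sd = 0.05)

# Estimate the density on a 100 x 100 grid covering the points
dens <- kde2d(long, lat, n = 100)

# Locate the grid cell with the highest estimated density
# (rows of dens$z follow dens$x, columns follow dens$y)
peak <- which(dens$z == max(dens$z), arr.ind = TRUE)
c(lon = dens$x[peak[1, "row"]], lat = dens$y[peak[1, "col"]])
```

The `bins` argument in the map code then controls how many contour levels are cut through a surface like `dens$z`.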
       


Vehicle Collision/Accident Maps

# Creating subset of Crime data based on Crime Code Description for Collision
collision <- subset(crime, Crm.Cd.Desc == 'TRAFFIC DR #')
names(collision)[5]<-"Collision"

# Get Stamen Map
LA.map = qmap(location = LA, zoom = 11, source = "stamen", maptype = 'toner')

# Plotting Collision Map (I used color = "#cb181d" from Color Brewer)
LA.map + geom_point(data = collision, aes(x = Long, y = Lat), size = 2, alpha = 0.1, color = "#cb181d")


# Plotting Collision Map to Visualize Weekday Variation in Collisions
LA.map + geom_point(data = collision, aes(x = Long, y = Lat), size = 2, alpha = 0.1, color = "#0c2c84") + facet_wrap(~ Day)


# Plotting Collision Map to Visualize Temporal (Quarter based) Variation in Collisions
LA.map + geom_point(data = collision, aes(x = Long, y = Lat), size = 2, alpha = 0.1, color = "#0c2c84") + facet_wrap(~ Quarter)
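
The faceted maps show where collisions cluster, but not how many fall into each panel. A simple cross-tabulation can put numbers next to the maps. This sketch uses a toy data frame with invented values standing in for `collision`; only the column names Day and Quarter come from the script above.

```r
# Toy stand-in for the collision data frame; the values are invented
toy <- data.frame(
  Day     = c("Monday", "Monday", "Friday", "Friday", "Friday", "Sunday"),
  Quarter = c("Second", "Third", "Third", "Fourth", "Fourth", "First"),
  stringsAsFactors = FALSE
)

# Cross-tabulate incidents by weekday and quarter of the day
counts <- table(toy$Day, toy$Quarter)
counts

# Same counts in long form, convenient for sorting or plotting
as.data.frame(counts, responseName = "n")
```

On the real data the same call would be table(collision$Day, collision$Quarter).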



# Plotting Collision Density Map
geocode("Hollywood") 
LA = c(lon = -118.3287, lat =  34.09281)
LA.map = get_map(location = LA, zoom = 11, maptype = 'terrain')

ggmap(LA.map, extent = "normal", maprange=FALSE) %+% collision + aes(x = Long, y = Lat) + 
        stat_density2d(aes(fill = ..level.., alpha = ..level..), size = 2, bins = 15, geom = 'polygon') + 
        scale_fill_gradient(low = "red", high = "#081d58", name = "Collision\nDensity") + 
        scale_alpha(range = c(0.05, 0.3), guide = FALSE) + 
        coord_map(projection = "mercator", 
                  xlim = c(attr(LA.map, "bb")$ll.lon, attr(LA.map, "bb")$ur.lon), 
                  ylim = c(attr(LA.map, "bb")$ll.lat, attr(LA.map, "bb")$ur.lat)) + 
        theme(legend.justification=c(1,0), legend.position=c(1,0),axis.title = element_blank(), text = element_text(size = 14))



# Creating subset of Crime data based on Crime Code Description for Violent Crimes
violent_types <- c('ROBBERY',
                   'ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT',
                   'ASSAULT WITH DEADLY WEAPON ON POLICE OFFICER',
                   'RAPE, ATTEMPTED', 'RAPE, FORCIBLE',
                   'CRIMINAL HOMICIDE', 'HOMICIDE (NON-UCR)')
violent <- subset(crime, Crm.Cd.Desc %in% violent_types)

names(violent)[5] <-"Violent"

violent$Violent <- factor(violent$Violent)


# Plotting Violent Crime Density Map
geocode("Vernon, CA")
LA = c(lon = -118.2301, lat =  34.0039)
LA.map = get_map(location = LA, zoom = 12, maptype = 'terrain')
ggmap(LA.map, extent = "normal", maprange=FALSE) %+% violent + aes(x = Long, y = Lat) + 
        stat_density2d(aes(fill = ..level.., alpha = ..level..), size = 2, bins = 10, geom = 'polygon') + 
        scale_fill_gradient(low = "black", high = "red", name = "Violent Crime\nDensity") + 
        scale_alpha(range = c(0.05, 0.3), guide = FALSE) + 
        coord_map(projection = "mercator", 
                  xlim = c(attr(LA.map, "bb")$ll.lon, attr(LA.map, "bb")$ur.lon), 
                  ylim = c(attr(LA.map, "bb")$ll.lat, attr(LA.map, "bb")$ur.lat)) + 
        theme(legend.justification=c(1,0), legend.position=c(1,0),axis.title = element_blank(), text = element_text(size = 14)) 
     

# Plotting Violent Crime Map to Visualize Weekday Variation in Violent Crimes
LA.map = qmap(location = LA, zoom = 11, source = "stamen", maptype = 'toner')
LA.map + geom_point(data = violent, aes(x = Long, y = Lat), size = 2, alpha = 0.1, color = "red") + facet_wrap(~ Day)

