The dataset I have chosen is “Crime Data from 2010 to Present.” My dataset’s ontology provides detailed information about the crimes using 26 categories. This includes the date a crime was reported, as well as the date and time it occurred. Other categories like the reporting district, the victim’s age and sex, and the weapon description are also used. The ontology allows one to pinpoint, with reasonable accuracy, where and how the crime occurred, and who was involved.
I think that this ontology makes the most sense for someone in the law enforcement industry, such as a police officer. He or she will be able to search for and zoom in on certain crime records of interest, by selected criteria such as area name or reporting district. Someone who studies crime, such as a criminologist, may also find this illuminating. He or she could mine the data to determine if there are certain trends that emerge in the incidents of crime reported in Los Angeles, such as where they occur and the type of weapons commonly used.
This dataset can tell me about temporal and spatial trends in the occurrence of crime. Temporal trends, for instance, can be derived by analyzing the times the crimes occurred, while spatial trends can be derived by analyzing the names of the areas where they occurred. The dataset can also tell me who the victims of crimes tend to be. For instance, a quick scroll through the dataset suggested that crime victims in Los Angeles are seldom young people or teenagers; they tend to be people who are already well into adulthood.
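As a rough illustration of the kind of analysis described above, here is a minimal sketch that tallies incidents by hour of day and by area name. The sample records, the field names `time_occurred` and `area_name`, and the assumption that times are stored as 24-hour "HHMM" strings are all hypothetical; the real dataset's column names and formats may differ.

```python
from collections import Counter

# Hypothetical sample records; field names and values are illustrative only
# and are not taken from the actual dataset.
records = [
    {"time_occurred": "2130", "area_name": "Central"},
    {"time_occurred": "0815", "area_name": "Hollywood"},
    {"time_occurred": "2145", "area_name": "Central"},
    {"time_occurred": "1400", "area_name": "Central"},
]

def count_by_hour(rows):
    """Temporal trend: count incidents per hour (assumes 'HHMM' strings)."""
    return Counter(int(row["time_occurred"][:2]) for row in rows)

def count_by_area(rows):
    """Spatial trend: count incidents per area name."""
    return Counter(row["area_name"] for row in rows)

by_hour = count_by_hour(records)
by_area = count_by_area(records)

# Show the busiest hour and the busiest area in this toy sample.
print(by_hour.most_common(1))
print(by_area.most_common(1))
```

With the full dataset, the same two counts could be sorted or charted to see which hours and areas stand out.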
While this dataset is rather comprehensive, I do feel that certain pieces of information have been left out. For instance, there is no information about the perpetrator(s) of the crime. One can only assume that each incident of a crime reported was committed by one person, when there may have been more than one perpetrator. Data about the age, sex and descent of the perpetrator is not included, even though such data about the victim is included. The time the crime was reported is also omitted, when its inclusion could have been helpful in enabling one to understand how fast the crime was reported after it happened.
If I were starting over, I would describe an ontology from the perspective of an ordinary resident of the City of Los Angeles. One category I would include would be “Period of day” (morning, afternoon, night) instead of “Time occurred” so residents could tell more intuitively whether crimes are more common at certain times. I would also include the speed of reporting, whether the victim was alone, and whether the perpetrator was caught, with the latter two merely requiring yes or no answers. This ontology would give residents a more on-the-ground feel for how safe Los Angeles is, and for the likelihood of justice being served should they become the victim of a crime.
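Two of the proposed categories could be derived from fields the dataset already has. The sketch below is only one possible approach: the hour cutoffs for morning, afternoon, and night are my own assumption, as is the "HHMM" time format, and the function names are hypothetical. Because the dataset records only the date reported, not the time, the speed of reporting is measured in whole days.

```python
from datetime import date

def period_of_day(time_occurred: str) -> str:
    """Map a 24-hour 'HHMM' string to a coarse period of day.

    The cutoffs (05:00, 12:00, 18:00) are assumed for illustration.
    """
    hour = int(time_occurred[:2])
    if 5 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    return "night"

def days_to_report(date_occurred: date, date_reported: date) -> int:
    """Speed of reporting: whole days between occurrence and report."""
    return (date_reported - date_occurred).days

# Example with made-up values
print(period_of_day("0815"))
print(days_to_report(date(2015, 3, 1), date(2015, 3, 4)))
```

The other two categories (whether the victim was alone, and whether the perpetrator was caught) cannot be derived this way and would need to be collected as new yes or no fields.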
Hi,
I looked at a dataset about 311 calls for non-emergency concerns (such as graffiti or reports of dangerous animals), and my observations were similar to yours. While the dataset does seem useful to those working in the industry, there are certain additional details that the general public might be interested in that were not reflected in the dataset, such as details about the perpetrator, as you mentioned. As a resident of LA, I would certainly be more interested in the kinds of information your proposed dataset would include.
I found it notable that this dataset left out race as one of the categories, simply because that is a category that is typically discussed. I think a lot of important information would be missing from this dataset, because crimes can involve so much detail that it would be hard to capture it all in a dataset. It would be interesting to see this data presented as a map, or even as a heat map, depending on the density of the data.