 




Computer Aided Tracking and Characterization of Homicides & Sexual Assaults

Posted by buffy on: Thursday 16 August 2001

Lars J. Kangas, Kristine M. Terrones, Robert D. Keppel, and Robert D. La Moria
Pacific Northwest National Laboratory, MS K7-22, PO Box 999, Richland, WA 99352
Attorney General of Washington, Criminal Division

When a serial offender strikes, it usually means that the investigation is unprecedented for that police agency. The volume of incoming leads and pieces of information in the case(s) can be overwhelming as evidenced by the thousands of leads gathered in the Ted Bundy Murders, Atlanta Child Murders, and the Green River Murders. Serial cases can be long term investigations in which the suspect remains unknown and continues to perpetrate crimes. With state and local murder investigative systems beginning to crop up, it will become important to manage that information in a timely and efficient way by developing computer programs to assist in that task. One vital function will be to compare violent crime cases from different jurisdictions so investigators can approach the investigation knowing that similar cases exist.

CATCH (Computer Aided Tracking and Characterization of Homicides) is being developed to assist crime investigations by assessing likely characteristics of unknown offenders, by relating a specific crime case to other cases, and by providing a tool for clustering similar cases that may be attributed to the same offenders. CATCH is a collection of tools that assist the crime analyst in the investigation process by providing advanced data mining and visualization capabilities. These tools include clustering maps, query tools, geographic maps, timelines, etc. Each tool is designed to give the crime analyst a different view of the case data. The clustering tools in CATCH are based on artificial neural networks (ANNs). The ANNs learn to cluster similar cases from approximately 5000 murders and 3000 sexual assaults residing in a database. The clustering algorithm is applied to parameters describing modus operandi (MO), signature characteristics of the offenders, and other parameters describing the victim and offender. The proximity of cases within a two-dimensional representation of the clusters allows the analyst to identify similar or serial murders and sexual assaults.

Keywords: Data mining, visualization, clustering, serial murders, rapes, self-organizing maps

Introduction

CATCH is being developed to provide crime analysts enhanced means for interpreting large databases of crime data. These databases store a large number of crimes, each described by a large number of details. Battelle Memorial Institute at Pacific Northwest National Laboratory developed CATCH in collaboration with the Attorney General of Washington, Criminal Division. Investigators at the Criminal Division are currently evaluating CATCH. The development of CATCH was made possible by the HITS (Homicide Investigation Tracking System) database system.
Police involved in the infamous Green River and Ted Bundy murder investigations in the State of Washington developed HITS about ten years ago to enable computer-based analysis of murders. The database now contains several thousand violent crimes, primarily from the Pacific Northwest, USA.

CATCH provides analysts tools for efficiently viewing crime details and comparing crimes against each other. An initial set of one or more crimes is selected by using point-and-click methods that generate SQL queries to retrieve the set of crimes from the database. This set of crimes is then refined with tools that tell the analyst that specific crimes do not "belong" in the set. The analyst can also add other crimes that should "belong" to the set. The crimes in a set may belong together because they appear to have been committed by the same offender. There are two versions of CATCH, one for murders and one for sexual assaults. Although the version of CATCH described here is custom configured specifically for the HITS database of violent crimes, it can be applied to other crime databases through relatively minor changes in the software.

Clustering Algorithm

CATCH uses artificial neural networks (ANNs) for analysis. The benefit of ANNs is often described by means of their information (sensor) fusion capabilities. Information fusion is the process of extracting information from several data sources in parallel. More information can frequently be gained by this approach than by processing each data source individually. Another benefit of some ANNs is their ability to extract non-linear information from data. The clustering algorithm in CATCH is based on self-organizing maps (SOMs). These networks are also called self-organizing feature maps or Kohonen networks, after their inventor, Professor Teuvo Kohonen [11, 6].
The SOMs belong to the unsupervised class of neural networks, meaning that the network is not provided any labels that describe the data vectors during the learning phase. Instead, the SOM organizes data vectors into clusters of similar data in regions on a two-dimensional map. Two dimensions provide a convenient visual representation, though this is not a requirement. The HITS Unit staff at the Attorney General of Washington, Criminal Division, use "standard" forms to record the large number of details describing each crime, which are then entered into the HITS database. CATCH processes these crime details and generates data vectors for numerical analysis. Each data vector includes more than 200 details of each crime. The SOM in CATCH has 4096 cells organized as a 64 by 64 grid (see Figure 1 below). The learning phase assigns each crime to exactly one of these cells, chosen by the clustering algorithm. Similar crimes are placed in closer proximity to each other. Identical or nearly identical crimes may be placed in the same cell. Some cells may not be assigned any crimes during the learning phase, but these cells may be assigned new crimes as they are entered into the database between retrainings of the SOM. The SOM should be retrained periodically, once a sufficient number of new cases has been added to the database, to take advantage of all the crime data available.
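
The learning phase described above can be illustrated with a minimal self-organizing map in Python. This is only a sketch under stated assumptions, not the CATCH implementation: the grid is 8 by 8 instead of 64 by 64, the data are random yes/no vectors standing in for crime details, and the learning-rate and neighborhood schedules (`lr0`, `sigma0`, linear decay, Gaussian neighborhood) are common textbook choices that the paper does not specify. The function names `train_som` and `best_matching_cell` are hypothetical.

```python
import numpy as np

def best_matching_cell(weights, x):
    """Return the (row, col) of the cell whose weight vector is closest to x."""
    dist2 = np.sum((weights - x) ** 2, axis=-1)
    row, col = np.unravel_index(np.argmin(dist2), dist2.shape)
    return int(row), int(col)

def train_som(data, rows=8, cols=8, epochs=20, lr0=0.5, sigma0=3.0, seed=0):
    """Train a small SOM; returns a weight array of shape (rows, cols, dim)."""
    rng = np.random.default_rng(seed)
    weights = rng.random((rows, cols, data.shape[1]))
    # Grid coordinates of every cell, used by the neighborhood function.
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / total
            lr = lr0 * (1.0 - frac)                  # assumed linear decay
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # assumed shrinking radius
            bmu = np.array(best_matching_cell(weights, x))
            grid_dist2 = np.sum((grid - bmu) ** 2, axis=-1)
            influence = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
            # Pull the winning cell and its neighbors toward the data vector.
            weights += lr * influence[..., None] * (x - weights)
            step += 1
    return weights

# Synthetic stand-in data: 200 "crimes", each with 12 yes/no details.
rng = np.random.default_rng(1)
crimes = (rng.random((200, 12)) < 0.3).astype(float)
som = train_som(crimes)
# After training, each crime is assigned to exactly one cell.
cells = [best_matching_cell(som, v) for v in crimes]
```

Because the assignment step is just a nearest-weight lookup, a new crime entered between retrainings can be placed on the map with `best_matching_cell` without changing the trained weights.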

Figure 1. The self-organizing map in the figure represents about 5,000 murders in the HITS database. Each of the cells in the 64 by 64 map typically contains eight or fewer crimes. The black cells contain no crimes. The lighter the cell color, the more crimes are in the cell. (The cells are colored in different shades of green in the application.) The overlaid light rectangle contains light colored cells (in yellow) that are selected into a current set of crimes being analyzed. (The example set of clustered crimes in the figure is the Green River Murders, believed to be by one serial offender.)

Database Mining

The tools in CATCH are of two types. First, there are database mining tools to give the crime analyst a better understanding of the content of the database. Second, there are tools that let the analyst retrieve and compare specific crimes. The self-organizing map is like a window into the database. Each crime in the database has a location on the SOM, and the clusters on the SOM link together similar crimes in the database. Thus, the database can be mined for related crimes through the SOM. The mining tools that use the SOM include a search tool that lets the analyst select a combination of crime details and see where on the SOM there are crimes for which these details hold true (see Figures 2 and 3 below). Another tool allows the analyst to select one crime case and see where in the SOM there are other similar crime cases, based on any combination of the details describing the crimes. While mining the database, the analyst can add to, remove from, or crop the current set of crime cases by selecting areas of cells in the SOM.
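
The search tool described above, which shows where on the SOM the selected details hold true, might be approximated as follows. This is a hedged sketch: the paper does not say how cell shading is computed, so the score used here (the average fraction of selected details present among the crimes in a cell) is an assumption, and the function name, case identifiers, and detail names are hypothetical.

```python
from collections import defaultdict

def cell_match_fraction(assignments, crime_details, selected):
    """Per SOM cell, average fraction of the selected details present in its crimes."""
    per_cell = defaultdict(list)
    for case_id, cell in assignments.items():
        present = crime_details[case_id]
        per_cell[cell].append(sum(d in present for d in selected) / len(selected))
    return {cell: sum(v) / len(v) for cell, v in per_cell.items()}

# Hypothetical data: case id -> SOM cell, and case id -> set of recorded details.
assignments = {1: (0, 0), 2: (0, 0), 3: (5, 7)}
crime_details = {1: {"detail_a", "detail_b"}, 2: {"detail_a"}, 3: set()}
shading = cell_match_fraction(assignments, crime_details, ["detail_a", "detail_b"])
```

A score of 1.0 marks a cell where every crime shows all selected details; such cells would be drawn lightest in a display like the one in Figure 3.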
Figure 2. The SOM is overlaid by boundaries around areas of common crime details. The small window shows which details are selected and the color coding of the boundaries. The user can select crimes that are in the unions of the bounded areas, shown as light colored cells in the example.
Figure 3. The depicted tool emphasizes cells containing crimes for which all selected details correctly describe the crimes. The cells in the SOM are colored lighter according to the correlation of the selected crime details, i.e., lighter cells have higher correlation with the selected crime details.

The "starmap" of crimes in CATCH is shown in Figure 4. This representation of all crimes in the database is a three-dimensional cube, where the data vectors describing the crimes have been reduced down to three eigenvalues. The cube is viewed by selecting any two of the dimensions. Although a significant amount of information is lost when high-dimensional data is reduced to a few dimensions, the visualization still conveys significant structure of the data in the database. The user can select volumes in the cube to retrieve, remove, and crop crimes to and from the current set of crimes being analyzed.
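
A reduction of this kind is commonly done with principal component analysis, and the sketch below assumes that approach; the paper does not name the exact method used in CATCH. The data are random stand-ins, and `project_to_3d` and `select_in_box` are hypothetical helper names.

```python
import numpy as np

def project_to_3d(vectors):
    """Project row vectors onto the three leading principal axes."""
    centered = vectors - vectors.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigh returns ascending order
    order = np.argsort(eigvals)[::-1][:3]        # indices of the three largest
    return centered @ eigvecs[:, order], eigvals[order]

def select_in_box(points, dims, lo, hi):
    """Indices of points inside a rectangle drawn in a 2-D view of the cube."""
    a, b = dims
    inside = (
        (points[:, a] >= lo[0]) & (points[:, a] <= hi[0])
        & (points[:, b] >= lo[1]) & (points[:, b] <= hi[1])
    )
    return np.flatnonzero(inside)

# Synthetic stand-in for the crime data vectors: 100 cases, 10 details each.
rng = np.random.default_rng(0)
vectors = rng.random((100, 10))
points, variances = project_to_3d(vectors)
# A rectangle covering the whole view selects every crime.
picked = select_in_box(points, dims=(0, 1), lo=(-np.inf, -np.inf), hi=(np.inf, np.inf))
```

The returned `variances` are the three largest eigenvalues of the covariance matrix, which indicate how much of the spread in the data each displayed dimension retains.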
Figure 4. The figure shows all the crime data vectors as points in a three-dimensional eigenspace. The cube of crimes is viewed in any two of the three dimensions. This cube of crimes gives an alternate view of the clusters and structure of the crimes in the database. Similar crimes form denser areas of "stars" in the cube. The highlighted crimes, within the overlaid rectangle, are selected into the current working set.

The geographic map in CATCH is shown in Figure 5, with crimes placed as pins at the locations where they were committed. This map also allows the user to select an area and retrieve all the crimes in that area, or to crop or remove crimes from the current working set of crimes.
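
To illustrate how an area selected on the map could be turned into a query, the sketch below builds a parameterized SQL statement from a rectangle and applies the retrieve, crop, and remove operations to a working set. Everything specific here is an assumption made for the example: the `crimes` table, its `case_id`, `latitude`, and `longitude` columns, the sample coordinates, and the `area_query` function are hypothetical and do not describe the HITS schema.

```python
import sqlite3

def area_query(lat_min, lat_max, lon_min, lon_max):
    """Build a parameterized query for crimes inside a map rectangle."""
    sql = (
        "SELECT case_id FROM crimes "
        "WHERE latitude BETWEEN ? AND ? AND longitude BETWEEN ? AND ? "
        "ORDER BY case_id"
    )
    return sql, (lat_min, lat_max, lon_min, lon_max)

# In-memory stand-in database with made-up locations.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE crimes (case_id INTEGER PRIMARY KEY, latitude REAL, longitude REAL)"
)
conn.executemany(
    "INSERT INTO crimes VALUES (?, ?, ?)",
    [(1, 47.6, -122.3), (2, 46.3, -119.3), (3, 47.5, -122.2)],
)

sql, params = area_query(47.0, 48.0, -123.0, -122.0)
selected = [row[0] for row in conn.execute(sql, params)]

working_set = {2}
retrieved = working_set | set(selected)   # retrieve: add crimes in the area
cropped = retrieved & set(selected)       # crop: keep only crimes in the area
removed = retrieved - set(selected)       # remove: drop crimes in the area
```

Passing the rectangle bounds as parameters rather than formatting them into the SQL string keeps the generated query safe regardless of where the values come from.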
Figure 5. The geographic map tool places the current set of crimes on the map as pins (see examples in the rectangle). The user can select pins to view additional information about specific crimes.

The tools described above and some additional tools, e.g., a timeline tool, allow the crime analyst to retrieve crime data from the database without having to write queries. CATCH automatically generates SQL queries to retrieve the requested information from the user’s interaction with graphical representations of the data. Thus, although CATCH allows the use of queries in a specific query tool, it has been designed so that a user does not have to work with queries when mining the database.

Database Visualization

While the data mining tools are used for rapidly focusing on a set of crimes that may be related, the data visualization tools become the priority for more in-depth analysis of crime data. Some graphical data visualization capabilities of CATCH have been partially described above with the data mining tools. This section describes a few of the tools that allow the users to view, analyze, and compare details describing different crimes. (Because CATCH processes sensitive information, it is necessary to restrict the images of these tools in the figures.) Most of the data visualization tools in CATCH show the crime details in grids that are enhanced by color and order of significance. The color enhancement in the grids is used to give the user improved perception of the data without having to focus on numerical values. Typically, grid values representing crime details are lighter in color if the crime detail has a higher numerical value or if the crime detail holds true for a specific crime. The grids can also be sorted to bring more significant details to the top of the grids. The significance of each detail is dynamically computed in the sorting algorithms.
Figures 6 and 7 below show two tools for comparing crime cases based on labels assigned to sexual offenders. These labels (Power Reassurance, Power Assertive, Anger Retaliatory, and Anger Excitation) were conceived by the FBI to describe the behavior of sexual offenders [1-5]. Dr. Robert Keppel [7-10], chief criminal investigator at the Attorney General of Washington, Criminal Division, developed a weighting scheme applied to these labels. Each detail describing a crime has associated weights that are based on how much the detail contributes to the different labels and on how rarely that detail occurs in the HITS database of crimes. The weighting scheme incorporates the expertise of the crime investigators, recognizing that some crime details are more important than others for identifying related crimes by serial offenders.
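
The general shape of such a weighting scheme can be sketched as below. The actual weights and formula are not given in the paper, so this example makes its own assumptions: rarity is measured as the negative logarithm of the fraction of cases showing a detail, a detail's weight for a label is its contribution multiplied by its rarity, and all case data, detail names, and contribution values are placeholders with no criminological meaning.

```python
import math

LABELS = ("Power Reassurance", "Power Assertive", "Anger Retaliatory", "Anger Excitation")

def rarity(detail, cases):
    """Assumed rarity measure: -log of the fraction of cases showing the detail."""
    count = sum(1 for details in cases.values() if detail in details)
    return -math.log(count / len(cases)) if count else 0.0

def label_scores(details, contribution, cases):
    """Sum contribution * rarity over a crime's details, separately for each label."""
    scores = dict.fromkeys(LABELS, 0.0)
    for detail in details:
        r = rarity(detail, cases)
        for label, c in contribution.get(detail, {}).items():
            scores[label] += c * r
    return scores

# Placeholder cases and contributions (hypothetical values for illustration only).
cases = {
    "case_1": {"detail_common", "detail_rare"},
    "case_2": {"detail_common"},
    "case_3": {"detail_common"},
    "case_4": {"detail_common"},
}
contribution = {
    "detail_common": {"Power Assertive": 1.0},
    "detail_rare": {"Anger Excitation": 0.5},
}
scores = label_scores(cases["case_1"], contribution, cases)
```

In this toy data the detail present in every case carries no weight, while the detail seen in only one of four cases dominates the score, which mirrors the idea that rare details matter more when linking crimes.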
Figure 6. The grid in the figure shows several crimes, one on each row, which have been determined by CATCH to be similar to a crime being analyzed (marked by an X in the first column). The most similar crimes in the database are retrieved and ordered by the overall weight assigned to one of the four sexual offender labels: Power Reassurance, Power Assertive, Anger Retaliatory, and Anger Excitation. The grid can be used for adding crimes to and removing crimes from the current set of crimes.

Figure 7. The tool shown in the figure allows the crime analyst to compare two crimes side by side according to the sexual offender labels: Power Reassurance, Power Assertive, Anger Retaliatory, and Anger Excitation. The figure shows the individual weights assigned to each of the details and the four labels describing each of the two crimes. The details of the two crimes in the figure are sorted to bring the significant details to the top. The two crimes compared in the figure both have "unusual ritual" and "blindfold" in common. These two crime details are relatively rare in the database and may suggest that the same offender committed both crimes.

Evaluation

CATCH was developed to identify serial offenders by recognizing that serial offenders tend to repeat certain aspects of their crimes. Because the neural network algorithm clusters similar data vectors, we expect the crimes by the same offenders to be clustered close together. The graphs in Figures 8 and 9 below summarize the distances found between every pair of crimes committed by the same known offender, for murders and sexual assaults, respectively. Distances are measured as the number of cells between two crimes on the self-organizing map. A distance of zero indicates that both crimes are in the same cell, a distance of one indicates that the two crimes are in adjacent cells, etc. The results shown in Figure 8 are based on 189 serial murders committed by 81 known offenders.
The graph shows that 50 percent of serial murders by the same offenders are within 15 cells of each other. The results shown in Figure 9 are based on 412 serial sexual assaults committed by 154 known offenders. The graph shows that 50 percent of serial sexual assaults by the same offenders are within 8 cells of each other.
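
The pairwise distance measurement behind these results can be sketched as follows. The paper does not state which grid metric was used, so the example assumes Chebyshev distance, under which diagonal neighbors count as adjacent (distance one). The offender identifiers and cell positions are invented for illustration, and the random baseline only mimics the idea of the randomly placed comparison in the figures.

```python
import random
from itertools import combinations
from statistics import median

def cell_distance(a, b):
    """Chebyshev distance between two cells: 0 = same cell, 1 = adjacent (assumed metric)."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

def same_offender_distances(cells_by_offender):
    """Distances between every pair of crimes attributed to the same offender."""
    return [
        cell_distance(a, b)
        for cells in cells_by_offender.values()
        for a, b in combinations(cells, 2)
    ]

def random_baseline(n_pairs, size=64, seed=0):
    """Distances between pairs of cells drawn uniformly at random from the map."""
    rng = random.Random(seed)
    return [
        cell_distance(
            (rng.randrange(size), rng.randrange(size)),
            (rng.randrange(size), rng.randrange(size)),
        )
        for _ in range(n_pairs)
    ]

# Invented SOM positions for crimes by two hypothetical offenders.
cells_by_offender = {
    "offender_1": [(10, 10), (10, 11), (12, 14)],
    "offender_2": [(40, 40), (40, 40)],
}
distances = same_offender_distances(cells_by_offender)
half_within = median(distances)   # half of the pairs lie within this many cells
baseline = random_baseline(1000)
```

The median of the same-offender distances corresponds to the "50 percent within n cells" figure quoted above, and comparing it with the median of the baseline shows whether linked crimes sit closer together than chance placement would put them.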
Figure 8. The solid line in the graph shows the probability of finding two murders by one serial murderer n number of cells apart. Fifty percent of the related serial murders are found within 15 cells of each other. The dashed line in comparison shows the distance between the same murders as they would appear if randomly placed into cells in the self-organizing map. The confidence is greater than 99% against these two probability distributions having the same mean (two-tailed t-test).
Figure 9. The solid line in the graph shows the probability of finding two sexual assaults by one serial rapist n number of cells apart. Fifty percent of the related serial sexual assaults are found within 8 cells of each other. The dashed line in comparison shows the distance between the same sexual assaults as they would appear if randomly placed into cells in the self-organizing map. The confidence is greater than 99% against these two probability distributions having the same mean (two-tailed t-test).

Conclusion

Crime analysts at the Attorney General of Washington, Criminal Division, are currently evaluating CATCH. Thus, a statement regarding the utility of CATCH must remain pending until the outcome of this evaluation. Preliminary evaluations suggest that the clustering algorithms and visualization tools in CATCH have the potential to add considerable value for crime analysts. A new version of CATCH is planned to incorporate additional tools that have been identified from the current research and development. The first set of tools, in version one of CATCH, concentrated on researching the value of using artificial neural networks to cluster similar cases. The new tools will give crime analysts a more complete suite; for example, a new tool will provide a more complete method for generating SQL statements from graphical representations.

ACKNOWLEDGEMENT

This work was supported by the National Institute of Justice. Pacific Northwest National Laboratory (PNNL) is a multiprogram national laboratory operated by Battelle Memorial Institute for the U.S. Department of Energy under Contract DE-AC06-76RLO 1830.

REFERENCES

[1] G. Copson, "Coals to Newcastle? Part 1: A Study of Offender Profiling: Police Research Group Special Interest Series," Paper 7, Home Office Police Department, London, 1995.
[2] G. Copson, R. Badcock, J. Boon, and P. Britton, "Articulating a Systematic Approach to Clinical Crime Profiling," Criminal Behaviour and Mental Health, 1997.
[3] J. E. Douglas, A. W. Burgess, A. C. Burgess, and R. K. Ressler, Crime Classification Manual, Lexington Books, NY, 1992.
[4] V. J. Geberth and R. N. Turco, "Antisocial Personality Disorder, Sexual Sadism, Malignant Narcissism, and Serial Murder," Journal of Forensic Sciences, Vol. 42, No. 1, pp. 49-60, 1997.
[5] V. J. Geberth, Practical Homicide Investigation: Tactics, Procedures, and Forensic Techniques, CRC Publishing, Boca Raton, Florida, Third Edition, 1996.
[6] S. Kaski, Data Exploration Using Self-Organizing Maps, Acta Polytechnica Scandinavica, Mathematics, Computing and Management in Engineering Series No. 82, Espoo, 1997.
[7] R. D. Keppel and J. P. Weis, "Time and Distance as Solvability Factors in Murder Cases," Journal of Forensic Sciences, Vol. 39, No. 2, pp. 386-401, 1994.
[8] R. D. Keppel, Signature Killers, Pocket Books, NY, 1997.
[9] R. D. Keppel, "Signature Murders: A Report of Several Related Cases," Journal of Forensic Sciences, Vol. 40, No. 4, pp. 658-662, 1995.
[10] R. D. Keppel, The Riverman: Ted Bundy and I Hunt the Green River Killer, Pocket Books, NY, 1995.
[11] T. Kohonen, Self-Organizing Maps, Springer-Verlag, Berlin Heidelberg, Second Edition, 1997.




