
FINAL YEAR PROJECT

Crime Analysis Toolkit

Muhammad Usman Khan (muahmmadusman.07259@khi.iba.edu.pk)


Saad Aslam (saadaslam.07521@khi.iba.edu.pk)
Sohaib Siddique (sohaibsidique.07532@khi.iba.edu.pk)

Introduction:
We are building a toolkit for crime analytics that automates analyses across
three dimensions of data analytics: data warehousing, business intelligence (BI)
and predictive analytics. Each analysis is run only if the input data supports
it. For example, if the data is suitable for data warehousing, then after manual
detection it is passed to an automated procedure that builds the OLAP cubes. If
the requirements are different, or the data is not suitable for this kind of
work, the toolkit falls back to business intelligence. Furthermore, if
predictions can be made from the data, predictive results are also provided when
required or requested. All of these processes are automated in the toolkit. In
short, the aim is to spare the user the tedious work of running the analytics by
hand: the user supplies the data, and the three processes run automatically to
produce the outputs.
In addition, we have yet to come across any toolkit that offers analyses across
all three of these dimensions. So far, these tasks have been done separately
through manual and complex operations, and in an extensive search on Google we
found no single application or toolkit that presents all three analyses. This
toolkit will help society first understand crime through descriptive analyses
such as warehousing and through BI outputs such as graphs and charts.
Precautionary and preventive measures can then be adopted with the help of
predictive analytics. The toolkit is meant to be easily accessible to any
business or non-business user, who can generate graphs, run the ETL and make
predictions in a few clicks.
High Level Design:
1. Rationale and sources of your project idea

Crime is something that negatively affects overall societal well-being in ways
that go beyond the residents of the community in which the crime
occurs. Members of a community may draw closer or may develop grassroots
improvement opportunities as a result of crime. While the immediate effect of
crime is usually felt by the individual upon whom the crime was committed, the
community at large is also affected by criminal activity. High crime rates may
lead to population reduction as able individuals move away to avoid
victimization. Members who remain in crime-filled areas may feel unsafe in
general, particularly if they witness crime. Additionally, crime rates create a
negative impression about a community to those who live outside it.
To prevent these situations, and to reduce crime for the well-being of our
society, we need to understand the very nature of crime: its patterns, its
history and the conditions that compel a person to commit it. To counter these
crimes we also need precautionary measures. The effort to stop and prevent
crime is ancient, and our toolkit incorporates both aspects, understanding and
prevention. It is intended to be comprehensive enough to help law enforcement
agencies, the police and even intelligence services work toward eliminating
these crimes from society.

2. Logical structure

In phase 1 we developed the standard template, which maps incoming data and
standardizes it for the whole process. In this phase we also built a master
dashboard that controls every step and gives the user comprehensive control. The
dashboard offers suggested options for the user's convenience. BI graph
generation, ETL and predictive analysis are all run from this dashboard.
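
As an illustration of how the dashboard's suggested options could be derived from the mapped data, here is a minimal sketch in Python. The attribute names and the requirements per analysis in `REQUIRED` are assumptions made for this example; the report does not list them.

```python
# Hypothetical attribute requirements per analysis (not taken from the report).
REQUIRED = {
    "data_warehousing": {"crime_type", "date", "location"},
    "business_intelligence": {"crime_type", "date"},
    "predictive_analytics": {"crime_type", "date", "location", "count"},
}


def suggest_analyses(columns):
    """Return the analyses whose required attributes are all present."""
    present = {c.strip().lower() for c in columns}
    return sorted(name for name, needed in REQUIRED.items() if needed <= present)


# Example: a dataset with three mapped attributes but no "count" column.
suggestions = suggest_analyses(["Crime_Type", "Date", "Location"])
print(suggestions)
```

With these assumed requirements, the dashboard would offer warehousing and BI for this dataset but not prediction, because the `count` attribute is missing.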

3. Background

Input data files can come in different formats, but initially we accept input
files in CSV format only; later we will try to support other formats. In FYP-1
we formed the standard template from specific common attributes related to
crime. We created two kinds of standard template: one for business intelligence
and one for predictive analysis. For automation, we initially took 10 datasets,
used 80% of them to design the standard template and kept the remaining 20% for
testing, to check whether the test data could be mapped onto the standard
template. This work was completed in phase 1.
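
The mapping check described above can be sketched as follows. This is only an illustration: the template attributes, the header aliases in `STANDARD_TEMPLATE` and the sample CSV are hypothetical, since the report does not give the actual template.

```python
import csv
import io

# Hypothetical standard template: canonical attribute -> accepted header aliases.
STANDARD_TEMPLATE = {
    "crime_type": {"crime_type", "offense", "category"},
    "date": {"date", "occurred_on", "incident_date"},
    "location": {"location", "district", "area"},
}


def map_headers(headers):
    """Map raw CSV headers to template attributes; return (mapping, unmapped)."""
    mapping, unmapped = {}, []
    for header in headers:
        key = header.strip().lower().replace(" ", "_")
        for attribute, aliases in STANDARD_TEMPLATE.items():
            if key in aliases:
                mapping[header] = attribute
                break
        else:
            # No template attribute accepts this header.
            unmapped.append(header)
    return mapping, unmapped


def fits_template(headers):
    """A dataset fits when every template attribute is covered by some header."""
    mapping, _ = map_headers(headers)
    return set(mapping.values()) == set(STANDARD_TEMPLATE)


# Made-up test dataset, standing in for one of the held-out 20%.
sample = io.StringIO("Offense,Incident Date,District,Notes\nTheft,2017-01-05,Saddar,none\n")
headers = next(csv.reader(sample))
mapping, unmapped = map_headers(headers)
print(mapping)
print(unmapped)
print(fits_template(headers))
```

A held-out dataset passes the test when `fits_template` returns true; headers left in `unmapped` are candidates for new aliases or for manual mapping by the user.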
In phase 2 the toolkit performs Extract, Transform and Load (ETL) on the
datasets. For this we have designed a missing-values dashboard, which checks
whether a dataset contains missing values and, if it does, handles them
accordingly. After missing-value handling, we focus on data type processing to
support better analysis. With the standard template and the ETL process in
place, we will try to generate all possible types of BI graphs, each with a
detailed interpretation, which is not common in dashboard graphs. Finally, an
easy-to-use master dashboard lets the user run the analytics (descriptive and
predictive) in a few clicks.
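
The missing-value check and the data type step could look roughly like the sketch below. The set of tokens treated as missing, the drop-or-fill policy and the column names are all assumptions for this example; the report only says that missing values are handled "accordingly".

```python
import csv
import io
from datetime import datetime

# Assumed spellings of a missing value; a real dataset may use others.
MISSING_TOKENS = {"", "na", "n/a", "null", "none"}


def is_missing(value):
    return value is None or value.strip().lower() in MISSING_TOKENS


def missing_report(rows, columns):
    """Count missing cells per column (the 'check' step)."""
    return {col: sum(is_missing(row.get(col)) for row in rows) for col in columns}


def clean(rows, required, defaults):
    """Drop rows missing a required field, fill the rest, then cast the date."""
    cleaned = []
    for row in rows:
        if any(is_missing(row.get(col)) for col in required):
            continue
        fixed = {
            col: (defaults.get(col, "") if is_missing(val) else val.strip())
            for col, val in row.items()
        }
        # Data type step: turn the date string into a real date object.
        fixed["date"] = datetime.strptime(fixed["date"], "%Y-%m-%d").date()
        cleaned.append(fixed)
    return cleaned


# Made-up data with one missing date and one missing location.
raw = io.StringIO(
    "crime_type,date,location\n"
    "Theft,2017-01-05,Saddar\n"
    "Robbery,,Clifton\n"
    "Theft,2017-02-11,N/A\n"
)
rows = list(csv.DictReader(raw))
report = missing_report(rows, ["crime_type", "date", "location"])
cleaned = clean(rows, required=["crime_type", "date"], defaults={"location": "Unknown"})
print(report)
print(cleaned)
```

Here a row without a date is dropped because the date is treated as required, while a missing location is filled with a default. Other policies, such as imputation, would fit the same structure.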

4. Hardware / Software tradeoffs

This toolkit consists purely of software and does not require any hardware
support at the moment. We have developed it using open source software and
platforms, which makes the system more portable and much less of a hassle to
use. The main risk is the data: inadequate, inaccurate, irrelevant or corrupt
data from the user may lead to misleading and false results.

5. Relationship with available past projects or standards, e.g. IEEE, ANSI, ISO etc.

To do a gap analysis, we created the following search queries to determine the
level and approach of the research and development done in this field:
1. “Crime analytic toolkit”
2. “Crime analysis solutions”
3. “Crime analytic software”
4. “Crime analysis toolkit”
5. “Crime prevention application”
6. “Crime forecast toolkit”
We ran these queries on both the Google search engine and Google Scholar. In
both cases, we could not find any software, application or toolkit that has
provided, or currently provides, results in all three dimensions of data
analytics. In addition, our toolkit will automate the analyses in real time and
produce the results, whether descriptive or predictive, as required or
requested. The current work in Pakistan by organizations such as the CPLC and
the police is not sufficient to find the root causes of the crimes being
committed, despite the large expenditure on defense. Our toolkit is equipped
with interactive dashboards with which the business user can make predictions,
generate graphs and carry out descriptive analysis.

Software / Hardware Design:


1. Overview

The toolkit automates analyses across three dimensions of data analytics: data
warehousing, business intelligence and predictive analytics. As described in
the Introduction, each analysis is run only if the input data supports it. Data
suited to warehousing is passed, after manual detection, to an automated
procedure that builds the OLAP cubes; otherwise the toolkit falls back to
business intelligence. Predictive results are provided when the data allows and
they are required or requested. The user supplies the data, and these processes
run automatically to produce the outputs.
2. Program Details

a. Overview

It is a simple toolkit with a dashboard that holds all the controls needed to
manipulate data and run the analytics across the three dimensions of data
analytics: data warehousing, business intelligence and predictive analytics.
After the user's data has been mapped onto the standard template, a cube is
generated to compute aggregates across the dimensions. BI graphs are generated
on user demand and help the user understand the data in depth. Predictive
analytics is also run, and its results are shown to the user, through this
dashboard.
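
To make the idea of a cube that aggregates across dimensions concrete, here is a small in-memory sketch in Python. It is not the toolkit's actual cube procedure; the dimension names and records are made up for the example, and counting is used as the only measure.

```python
from collections import Counter
from itertools import combinations


def build_cube(records, dimensions):
    """Count records for every subset of dimensions (a tiny in-memory cube).

    Keys are tuples of (dimension, value) pairs; the empty tuple is the grand total.
    """
    cube = Counter()
    for record in records:
        for size in range(len(dimensions) + 1):
            for dims in combinations(dimensions, size):
                cube[tuple((d, record[d]) for d in dims)] += 1
    return cube


# Made-up records already mapped onto a (hypothetical) standard template.
records = [
    {"crime_type": "Theft", "year": 2016, "location": "Saddar"},
    {"crime_type": "Theft", "year": 2017, "location": "Saddar"},
    {"crime_type": "Robbery", "year": 2017, "location": "Clifton"},
]
cube = build_cube(records, ["crime_type", "year", "location"])

print(cube[()])                                         # grand total
print(cube[(("crime_type", "Theft"),)])                 # roll-up by crime type
print(cube[(("crime_type", "Theft"), ("year", 2017))])  # two-dimensional cell
```

Precomputing every combination keeps lookups trivial but grows exponentially with the number of dimensions, so a real deployment would more likely rely on a database or a dedicated OLAP engine.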

b. User interface
3. Errors

We had problems connecting the development environments we have been using,
which we resolved with the help of several open source libraries and the open
source community.

a. Trials and tests

Creation of the standard template using PHP forms to collect, populate and
visualize the data.

4. Hardware Details

No hardware is used in our project for now.

Results:
1. The standard templates are created from all the datasets we have, or can
acquire in the future, to standardize all the crime datasets. This was done in
Phase-1 and is done manually by the user.
2. The forms will be automated on the front end.
3. The outcome is the master dashboards for business intelligence and
predictive analytics.

Conclusions:
Crime adversely affects the general well-being of society in ways that go beyond
the residents of the community in which the crime occurs. Members of a community
may draw closer or may create grassroots opportunities for improvement as a
consequence of crime. While the immediate impact of crime is normally felt by
the person against whom it was committed, the community at large is also
affected by criminal activity. High crime rates may lead to population decline
as able individuals move away to avoid victimization. Individuals who stay in
crime-filled areas may feel unsafe in general, especially if they witness crime.
Crime rates also create a negative impression of a community among those who
live outside it.
To prevent these situations, and to reduce crime for the well-being of our
society, we need to understand the very nature of crime: its patterns, its
history and the conditions that compel a person to commit it. To counter these
crimes we also need precautionary measures. The initiative to stop and prevent
crime has its roots in the Muslim era, and our toolkit incorporates both
aspects. It is intended to be comprehensive enough to help law enforcement
agencies, the police and even intelligence services eliminate and prevent these
crimes in society.
Appendix:

Appendix 4: Software components.

1. Master Dashboard, using PHP, MySQL, HTML 5 etc.

2. Predictive Analysis using R and Python

Appendix 5: Work distribution

• Muhammad Usman Khan (Development and research on Standard Template and
mapping the data on it.)

• Saad Aslam (Formation of Dashboard and generation of graphs and Predictive
Analytics.)

• Sohaib Siddique (Business Intelligence- graphs and connectivity, Predictive
Analytics, Report generation and FYP reports.)

Appendix 6: Project timeline


Acknowledgements:
We would like to thank our supervisor Dr. Tariq Mehmood for his incredible support and
guidance, especially with the data warehousing and predictive analytics techniques.
Secondly, we'd like to thank the IBA FCS faculty for their honest input, which has
definitely shaped how this project has turned out. Lastly, we'd like to express gratitude to
the open source community at large for the amazing ecosystem without which this
software would have been much more difficult to build.
