Washington DC Mobile Transit App


John Brewer, Eric Frohnhoefer, Weihan Yang
Abstract: This paper outlines our approach, called CrimeRank, for developing a real-time mobile transit application that incorporates crime data from the DC metropolitan area into route planning. Our approach is intended to provide the user with a variety of options to get to their destination, while providing valuable insight into the amount of crime along their route and at their destination.

Index Terms: Data Mining, Android, Crime, WMATA, Visualization, Spatial Data, Transit

I. INTRODUCTION

The growth of mobile platforms such as Android and the iPhone has led to an increase in demand for location-aware applications. In a single quarter, more than 81 million smartphones were sold [1]. The Android platform accounts for 25.5% of all smartphone sales, meaning there is a high demand for Android applications that can perform useful location-based services.

Data for developers on both the iPhone and Android platforms is widely available. For example, the API for the Washington Metropolitan Area Transit Authority (WMATA) is free for anyone to use, and provides useful spatial and temporal data for Metro riders to find their way around the DC metropolitan area. Governments and everyday citizens have also provided freely available crime data that is searchable by anyone. With widely available data and high demand for mobile devices that can accurately pinpoint a user's location, there is an opportunity for developers to create applications that provide interesting and insightful information for consumers.

Our proposed approach, called CrimeRank, is to use crime data provided to us for the DC metropolitan area, combine it with WMATA data, and perform various spatial data mining techniques to provide the user with valuable insight into the amount of crime along their route and at their destination. With this approach, we intend to provide the user with a variety of choices to get to their destination, while optimizing for different characteristics, such as shortest path, quickest route, least amount of crime, and so on.

A simple use case for CrimeRank would involve the user providing their current location and their desired destination; our processing framework would then calculate the amount of time it would take for the user to arrive at their destination. Crime details and a crime score for various stops along the route and for the final destination would be provided, so the user understands whether there are high-crime transfer points or whether their destination is in a high-crime area.

Additionally, our system architecture will utilize cloud-based resources for computation and storage. The limitations of mobile devices have driven mobile applications to use the cloud for computation. For example:

- Smartphones have low-cost, low-power CPUs that are not capable of handling large processing loads and have limited multitasking support.
- Limited battery life prevents extended computation.
- Storage is limited to several gigabytes (Micro SD cards are available up to 32 GB).
- Keyboards are small and cramped.
- Screens are extremely small.

By moving applications to the cloud, one is able to take advantage of the vast computing resources provided by vendors like Google, Amazon, and Microsoft. The smartphone will simply create a request and display the results returned from the cloud. Thus, the majority of the application's intelligence will reside in the cloud, which will reduce the amount of work done by the smartphone.

In this paper, we will provide an overview of the related work in the field of transit and crime mobile applications, then describe our proposed approach, and outline the architecture of our proposed system.

II. RELATED WORK

A plethora of mobile applications exist for consuming mass transit data in real time. In the DC area alone, there are over 15 Android applications serving up WMATA transit data. All have similar themes and revolve primarily around giving users information about arrival times for buses and trains. Some apps serve both systems and incorporate GPS for quickly locating the public transportation nearest the user.

One application allows for giving feedback to Metro, similar to other citizen reporting applications focused on problems within a city. A few apps attempt to incorporate Twitter data, but none work extensively with datasets not accessible through the WMATA API.

Figure 1: DC Metro Transit App for Android

The market for mobile crime applications is far less developed. No DC-specific Android applications exist. Most crime apps are browser-based and simply display the crime data points on a map, giving locations of incidents, then allowing the user to drill down and find out more about events in a particular area. Some applications simply allow for retrieving regular police reports for a given precinct. Probably the most similar in nature to the proposed CrimeRank application is an iPhone app named AreYouSafe: DC, reportedly due to launch soon [2]. Currently, all that they have made available is a Google Maps based heat map of the District with no further information (Figure 2). A web-based application, Stumble Safely, uses crime data to find safe walking routes home from bars in NW DC.

Figure 2: AreYouSafe: DC heat map

As technology becomes more prevalent in everyday life, so too does information about the world in which we live. Specifically, GIS and spatial databases allow for mining and exploration of datasets with regard to their location. This extra level of detail allows for more opportunity for correlation and drawing helpful conclusions. Putting the power of GIS-based applications in the hands of millions of people using mobile applications only serves to accelerate the gathering of the data and the rate at which it becomes useful to others.

III. PROPOSED APPROACH

Our proposed approach, called CrimeRank, will provide the user with a rating for the amount of crime along a route that they are planning to take. Using our mobile application, the user will enter their desired destination. The application will calculate a variety of routes from the user's current location, optimizing for a number of factors, such as least amount of crime, shortest time, shortest path, and so on. These routes can be a combination of Metrorail, Metrobus, driving, and/or walking directions.

CrimeRank will be a density-based measure that calculates the average crime density for an area. There will be four levels: Low, Average, Moderate, and High. An Average crime density implies an average amount of crime, while Low will mean one standard deviation below average. Moderate implies one standard deviation above average, while High will mean two standard deviations above the average. Based on the crime density along the various routes that our application calculates, we will output a score for each route, allowing the user to choose a route with a low crime rating, or possibly a faster route with a higher crime rating.

IV. SYSTEM ARCHITECTURE

A. Overview

To implement our proposed approach, we plan to use Amazon EC2, a cloud-based service, to host a web server and database. This will allow us to offload computation from the Android mobile device to a more powerful server. The web server, which we call the Transit Server, will host our application, which will interact with the database to retrieve and process the data for the user. The crime dataset and static WMATA data will be loaded into the database, while the real-time WMATA data can be fetched by the web server when needed. The web server can expose a REST interface for the Android phone to use to perform queries and retrieve results. The spatial processing and data mining will run in the web application, combining our crime and WMATA data and returning the results to the Android phone. Figure 3 shows the overall architecture of the system. Figure 4 shows the architecture of the web application that will be hosted on the web server.
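The four CrimeRank levels defined in Section III can be illustrated with a short sketch. It is only one possible reading of that definition: the paper does not state the exact cut-offs between levels, so the thresholds below (Low at one standard deviation or more below the mean, Moderate from one above, High from two above) are assumptions, as are the function names, the use of the mean stop density to rate a route, and the sample values. Python is used here for brevity; the proposed system itself is to be written in Java.

```python
from statistics import mean, pstdev


def crime_rank_level(density, all_densities):
    """Map one area's crime density to Low, Average, Moderate, or High.

    The level depends on how many standard deviations the density lies
    from the mean of all areas (assumed thresholds: -1, +1, +2).
    """
    mu = mean(all_densities)
    sigma = pstdev(all_densities)
    if sigma == 0:
        # Every area has the same density, so nothing stands out.
        return "Average"
    z = (density - mu) / sigma
    if z >= 2:
        return "High"
    if z >= 1:
        return "Moderate"
    if z <= -1:
        return "Low"
    return "Average"


def rate_route(stop_densities, all_densities):
    """Rate a route by the mean density of its stops (an assumption)."""
    return crime_rank_level(mean(stop_densities), all_densities)


# Hypothetical densities (crimes per unit area) around five stops.
densities = [2.0, 4.0, 4.0, 4.0, 6.0]
levels = [crime_rank_level(d, densities) for d in densities]
# levels == ["Low", "Average", "Average", "Average", "Moderate"]
```

Rating a route by the mean of its stops is the simplest aggregate; a maximum over the stops would instead flag a single high-crime transfer point, which may be closer to what a rider cares about.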

Figure 3: System Architecture
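The exchange between the phone and the Transit Server in this architecture can be sketched as two JSON payloads. This is a minimal illustration, not the project's actual interface: the field names, the stop identifiers, and both helper functions are hypothetical, and no real WMATA API call or database query is made. Python is used for brevity, although the system is planned in Java.

```python
import json


def build_route_request(origin, destination):
    """Body the phone would send: two (lat, lon) pairs as JSON."""
    return json.dumps({
        "origin": {"lat": origin[0], "lon": origin[1]},
        "destination": {"lat": destination[0], "lon": destination[1]},
    })


def build_route_response(stored_crime, realtime_arrivals):
    """Merge stored crime levels with real-time arrivals, keyed by stop.

    stored_crime:      stop id -> CrimeRank level read from the database
    realtime_arrivals: stop id -> minutes until the next arrival
    """
    stops = []
    for stop_id, crime_rank in stored_crime.items():
        stops.append({
            "stop_id": stop_id,
            "crime_rank": crime_rank,
            # A stop without real-time data is reported as null in JSON.
            "next_arrival_min": realtime_arrivals.get(stop_id),
        })
    return json.dumps({"stops": stops})


# Hypothetical example values.
request_body = build_route_request((38.90, -77.03), (38.88, -77.00))
response_body = build_route_response(
    {"stop-1": "Low", "stop-2": "High"},
    {"stop-1": 4},
)
```

Keeping the merge on the server means the phone only has to parse a single response, which matches the goal of leaving most of the work in the cloud.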

To keep the architecture uniform, we plan on writing a majority of our code in Java and using free third-party Java libraries to implement our system. We also plan to use the Android SDK and the various Google APIs for tools such as Google Maps. Table 1 outlines the third-party libraries we have chosen to implement our architecture.

Figure 4: Web Application Architecture

Table 1: Software Components

Component                     Selection
Cloud Provider                Amazon EC2
Web Container                 Tomcat
Dependency Management         Maven
Java Application Framework    Spring
Database                      Postgres and PostGIS
REST Library                  RESTEasy
Android Development           Android SDK

B. Choosing a Cloud Service

Cloud computing options can be broken down into three categories: Software as a Service (SaaS), Infrastructure as a Service (IaaS), and Platform as a Service (PaaS).

Software as a Service vendors provide fully hosted web-based applications. SaaS vendors typically charge on a per-user basis. No vendors in this category were evaluated.

Platform as a Service vendors offer a development platform for developers. Developers write their own code and deploy their application on the vendor's platform. The benefit of the PaaS model is that the developer does not have to worry about the maintenance (i.e., patching, upgrades, etc.) of the platform running the application.

Infrastructure as a Service vendors provide an enterprise-grade physical infrastructure in which a developer can deploy a virtual machine running their application. New instances can be stopped and started as demand changes, and the developer pays an hourly rate per instance. The benefit of the IaaS model is that the developer has full control over the platform, but does not need to be concerned with the physical infrastructure.

For our application we explored three cloud computing services: Google App Engine, Microsoft Azure, and Amazon EC2. These vendors represent the three leaders in cloud computing services. All three services offer a free pricing tier we can utilize to develop, test, and deploy our application. Each service was examined based on ease of setup, runtime familiarity, and geospatial support.

Google App Engine follows the PaaS model and features Java and Python runtime environments. Persistent storage of data is provided by a non-relational Datastore. App Engine was not the best option for this project due to the lack of built-in geospatial functionality in the App Engine Datastore. A number of projects, such as GeoModel, GISCloud, and GeoDatastore, aim to add geospatial support. However, their functionality is limited. Google is planning to deploy a relational Datastore based on MySQL, which does have geospatial support.

Microsoft Azure also follows the PaaS model and features a .NET runtime environment. Persistent storage of data is provided by Azure SQL, which is a highly distributed version of Microsoft's SQL Server. While Azure SQL supports the same geospatial operations as SQL Server, we ruled out Microsoft Azure because nobody on the team was familiar with the .NET runtime.

Our last option, Amazon EC2, follows the IaaS model. We selected Amazon EC2 to host our application. By using EC2 we have greater control over the environment in which our application will be run. The added flexibility comes at the cost of increased setup time. Because Amazon has such a wide user base, we were able to find a pre-built Ubuntu VM to get us started.

C. Transit Server

The Transit Server web application will be built using Spring to manage the application framework and RESTEasy to provide management of the REST interface. Maven will be used to automatically manage all third-party libraries. Our application will be deployed on a Tomcat server running in Amazon EC2.

We will use Postgres as our database and PostGIS as the spatial database extension to Postgres to allow us to store spatial objects. The database will also be installed on our Amazon EC2 instance and persisted to Amazon Elastic Block Store. Postgres and PostGIS provide Java libraries that our application can utilize to perform regular SQL queries and spatial queries. Data will be retrieved from the database using JDBC and hand-written SQL statements rather than an ORM library such as Hibernate and Hibernate Spatial. PostGIS also provides indexing on spatial objects to help speed up our queries.

The REST interface on the server side will return JSON objects to the Android device. By performing simple POSTs or PUTs of location data (for example, where the user currently is and where he or she wants to go), the Android device will be able to retrieve JSON objects detailing route information and crime information. Essentially, when the user accesses a REST service on the web server, the web server will open a connection to the database and perform the necessary queries to retrieve the desired data. If any real-time WMATA data is needed, the web server will then use the WMATA API to retrieve the data. The real-time WMATA data and the stored crime data will be merged in a post-processing step and then returned in JSON format to the user. This sequence is shown in Figure 5. A simple use case that demonstrates this sequence has already been implemented and successfully tested. The mobile device is able to connect to and retrieve JSON data from the Transit Server.

Figure 5: User query sequence diagram

D. Datasets and Data Ingest

The crime data set contained a number of issues, so the data had to be cleaned before it could be imported. We removed unrecognized characters and removed entries with invalid locations. Only data for the most recent year (2009) was imported. Using the pg_read_file() function in PostgreSQL, the crime dataset was loaded directly into the database as an XML data type. This allowed us to use XPath to parse the XML and load the data into a table.

We decided to cache the bus and metro stop locations because the dataset is relatively static. Doing so allows us to precompute our crime metric and store the results in the database. The locations of bus and metro stops were obtained through the WMATA API. The XML was loaded into the database using the pg_read_file() function, and XPath was used to load the data into a table.

TIGER/Line shape files are provided by the US Census Bureau. These files contain select geographic information such as roads, water features, and political boundaries. Using a tool called ogr2ogr, we imported the boundaries for all counties in which crime information was available. This allowed us to prune crime data containing invalid location information. The TIGER/Line data may also help in computing our CrimeRank metric.

E. Android Client

The target handset for this project is a Verizon Droid (Motorola A855). This phone runs Android v2.2.2, also known as Froyo, which corresponds to Android API Level 8. The application will aim to support older versions of Android, but the focus will be to ensure that everything runs efficiently on Froyo.

Android runs Java applications on a Java-based framework. A basic working knowledge of the Android stack was necessary to begin exploring the possibilities within the operating system. To this end, several tutorials using the Android SDK were completed, each aimed at developing paths to provide the necessary functionality for the CrimeRank application. The Android SDK was installed along with the Google Maps API Add-on. A baseline application demonstrating the rough pieces of the final application was created. This application made initial contact with the web server to perform REST operations (get, put, post). It was tested on both an emulator and the Droid.

All the necessary tools for application development were tested as part of this initial survey: debugging and logging capabilities from within Eclipse, along with emulator setup and loading code onto the device. Still remaining in the realm of basic functionality to be explored is an on-board SQLite database. This capability will most likely be used to store user favorites and their CrimeRanks, as well as other pieces of information needed for quick access.

The development of the Android application will be incremental in nature. The goal is to grow the application by working on various pieces of functionality in steps. At the end of each step, the application will be usable and capable of demonstrating the added functionality. The application will also undergo testing at the end of each increment. By doing incremental releases with testing, we hope to keep the project on track for delivery and reduce the amount of overall testing near that time. Rather than attempting to add more developers if we fall behind schedule, the plan will be to drop functionality.

The goal of the Android application development is to provide an interface for the user to interact with the CrimeRank algorithm in a meaningful way. The pages of the application should be designed so that the user experience is both simple and efficient. Basic WMATA functionality will mirror existing Metro applications to meet the expectations of current users of such applications. The CrimeRank algorithm will be on display throughout the application and uniquely integrated with the WMATA dataset.
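The on-board SQLite store of user favorites and their CrimeRanks mentioned for the Android client could take roughly the following shape. This is a sketch under stated assumptions: the table name, column names, and helper functions are hypothetical, and the stop names and levels are placeholder values rather than real results. Python's built-in sqlite3 module is used only to keep the example runnable on its own; the handset would use the platform's own SQLite API.

```python
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) the favorites cache; defaults to an in-memory DB."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS favorites ("
        " stop_id TEXT PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " crime_rank TEXT NOT NULL)"
    )
    return conn


def save_favorite(conn, stop_id, name, crime_rank):
    """Insert a favorite, or refresh its cached CrimeRank if it exists."""
    conn.execute(
        "INSERT OR REPLACE INTO favorites (stop_id, name, crime_rank)"
        " VALUES (?, ?, ?)",
        (stop_id, name, crime_rank),
    )
    conn.commit()


def load_favorites(conn):
    """Return all favorites as (stop_id, name, crime_rank), sorted by name."""
    cursor = conn.execute(
        "SELECT stop_id, name, crime_rank FROM favorites ORDER BY name"
    )
    return cursor.fetchall()


# Placeholder stops and levels, for illustration only.
conn = open_cache()
save_favorite(conn, "stop-1", "Example Stop A", "Average")
save_favorite(conn, "stop-2", "Example Stop B", "Moderate")
save_favorite(conn, "stop-1", "Example Stop A", "Low")  # refreshed score
favorites = load_favorites(conn)
```

Using the stop identifier as the primary key means a refreshed score overwrites the old row instead of piling up duplicates, so the cache stays small on a storage-limited handset.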

The first and most basic piece of functionality to be delivered is a list of the WMATA locations nearest the user. The GPS on board the phone will provide the desired location. A simple nearest-neighbor query on the Metro database should return the information. This will be displayed as a scrollable list showing the distance to each stop. The user will be able to pull up a Google Map window with all the points displayed.

The next piece of functionality will be to give the user an immediate assessment of the crime level at his or her current location. The same GPS-based location will be used to query the web server, and a custom CrimeRank score will be returned and displayed. Along with the custom score, the list of nearby metro stops will now have crime scores attached as well. This information will have been previously calculated, and the number will be provided to the user.

A route planner will also be developed. The first step will be to use the WMATA API to return possible itineraries. Again, these will be displayed as a scrollable list. Then the CrimeRank will be added to the itineraries, giving the user the ability to make tradeoffs with regard to safety and/or expediency. An effort will be made to display a chosen itinerary in Google Maps. It has yet to be determined whether visual displays of CrimeRank will be made available to the user. This is a piece of functionality that is set to be added as the schedule allows.

In parallel with the development of those pieces of functionality, the look and feel of the app will be determined. Sketches of each page of the application will be made and evaluated for usability. These sketches will then be translated into XML layouts for use in Android. The flow between pages, user menus, and application settings will all have associated graphical user interfaces that need to be designed.

V. PERFORMANCE AND EVALUATION

A. Performance

We will make metrics gathering an integral part of our application so we can measure the speed and responsiveness of our queries. Any mobile or web application needs to respond to the user in a reasonable amount of time. After the implementation is complete, we will examine the speed of our queries to see if the spatial queries, as we have implemented them, are sufficiently quick for our application. Postgres and PostGIS provide indexing capabilities for spatial objects that we can use to optimize and speed up our queries. We will add spatial indexes where necessary based on an analysis of the types of queries we typically perform.

B. User Experience

Another aspect of this project that will need to be evaluated is the design of the user interface. We will have to examine the look and feel of the interface on the Android device to determine if it is intuitive and easy to use.

VI. SCHEDULE

Table 2 outlines the schedule our team will follow.

Table 2: Project Schedule

Date    Deliverable
2/21    Project Proposal
4/4     Project Checkpoint
5/2     Project Presentation
5/9     Project Report

VII. PROGRESS REPORTS

John Brewer

Task: Set up development environment.
Comments: Eclipse, Android SDK, Google Maps API, etc. are all configured. Debugger and logger are working.
Status: COMPLETED

Task: Make REST connection.
Comments: Basic get, put, post operations tested on emulator and handset.
Status: COMPLETED

Task: Design application page layout/flow.
Comments: Sketches and flow diagrams will be developed to guide the process. These will later be translated to XML.
Status: IN WORK

Task: Closest Metro stops.
Comments: A simple interaction with the WMATA API to get started.
Status: IN WORK

Task: Incorporate CrimeRank with GPS and Metro.
Comments: Return crime scores for current location as well as desired metro stops.
Status: IN WORK

Task: Trip planner.
Comments: Return list of itineraries along with CrimeRank score to allow for tradeoffs.
Status: IN WORK

Task: Google Maps visuals.
Comments: Give the user visuals of immediate vicinity and/or chosen routes.
Status: IN WORK

Task: Testing.
Comments: Testing of each new piece of functionality, testing of the user interface, etc. will be occurring throughout the development process.
Status: ONGOING

Eric Frohnhoefer

Task: Evaluate various cloud vendors for our application.
Comments: Amazon EC2 selected.
Status: COMPLETED

Task: Deploy Ubuntu virtual machine on Amazon EC2.
Comments: Instance stored on S3. DB data files and Tomcat stored on EBS for persistence.
Status: COMPLETED

Task: Install Tomcat, PostgreSQL, and PostGIS. Provide access to group members.
Comments: Need to work with group to determine queries so that indexes can be added.
Status: COMPLETED

Task: Ingest crime data set, static metro data, and TIGER/Line data.
Comments: Only 2009 data used. Alexandria also not imported due to lack of valid dates.
Status: COMPLETED

Task: Work with group to design and evaluate CrimeRank algorithm.
Comments:
Status: IN WORK

Weihan Yang

Task: Select libraries to be used in the web application.
Comments:
Status: COMPLETED

Task: Design framework for the web application.
Comments: Spring application framework implemented, tested, and deployed to the EC2 instance.
Status: COMPLETED

Task: Design and implement an HTTP interface for the Android phone to use.
Comments: Need to work with John to figure out what queries the phone will be performing.
Status: IN WORK

Task: Implement spatial queries to allow the CrimeRank algorithm and Android device to access the needed data.
Comments: Implemented a simple query to find all crimes within some distance of each Metro station. Need to decide what other queries are necessary.
Status: IN WORK

Task: Integrate real-time WMATA data into the queries.
Comments: Have to access the WMATA API and combine with the results from the database queries before returning the results to the mobile device.
Status: IN WORK

REFERENCES

[1] Gartner Says Worldwide Mobile Phone Sales Grew 35 Percent in Third Quarter 2010; Smartphone Sales Increased 96 Percent. 2010 [cited 2/18/2011]; Available from: http://www.gartner.com/it/page.jsp?id=1466313.
[2] Are You Safe? [cited 2/20/2011]; Available from: http://areyousafedc.com/.
