
CSC8001: Data Science Project Report

For your Final Project, your task is to analyse and report on the extent and nature of the injury or violence
problems for your assigned country. The country you will report on has been assigned based on the last
digit of your Student ID number.
Country Last digit of Student ID number
Brazil 0
Mexico 1
Chile 2
Costa Rica 3
Iceland 4
Kyrgyzstan 5
Poland 6
Romania 7
Turkmenistan 8
Slovenia 9

---------------------------------------------------------------------------------------------------------------------------

“As more governments around the world come to recognize that injuries and violence can and
must be prevented, many are trying to get a better understanding of the problem in their
countries as a basis for designing, implementing and monitoring effective prevention strategies
(World Health Organization, 2015).”
World Health Organization, “Injuries and violence: the facts 2014”, p.14

Data Based Prevention and Control


Please read the Injuries and violence: the facts 2014 publication included in the Project_Files folder. The
World Health Organization’s publication, Injuries and violence: the facts 2014, outlines a four-step
Prevention and control process (Figure 1). Your Data Science Project Report will be based on the
Surveillance step of this process: Using data to understand the extent and nature of the injury and
violence problem.
Figure 1: Prevention and control

Project Report
All of your final code, analysis and discussions will be included in a single Jupyter notebook (Python 3)
with appropriate headings for each section. All plots should meet the expected standards of
visualisation and be displayed inline. Tables can be created by either formatted print statements or using
markdown cells with Markdown or HTML code. Your report may include supporting images if you feel
this is relevant. You may also source additional datasets to support your analysis and discussions, again
only if you feel it is relevant.
Your Jupyter notebook project report should include the following sections:
1. Introduction [10 marks]:
o Introduce and discuss relevant features/facts of your country. You may find the WHO’s
country pages (http://www.who.int/countries/en/) and The World Factbook
(https://www.cia.gov/library/publications/the-world-factbook/) helpful. All data
sources should be appropriately referenced.
2. Datasets [10 marks]:
a. Discuss the datasets you’ve used and any relevant details which are required to
understand how you’ve extracted your country’s data. Examples of relevant information
for the WHO database would be:
i. the WHO Mortality Database country code for your selected country,
ii. the ICD files, Years and Lists used in your analysis,
iii. a table which summarises the code and causes of death descriptions for your
country’s leading causes of death, as discussed in your report,
iv. a table which summarises which causes of death codes you classified as a death
due to injury or violence.
Any long detailed tables should be included as an Appendix at the end of your report.
3. Analysis and Discussion [70 marks]:
In this section you will analyse and discuss the extent and nature of the injury or violence
problem for your selected country. You are required to provide graphs and tables, as indicated
below, to support and illustrate your analysis and discussion. You may also include additional
graphs and tables where and if you feel they are relevant.

This section should provide analysis and discussion to address the following questions:
a. What are the current leading causes of injury deaths in your country?
i. Provide a pie chart, which displays the top 5 leading causes of deaths due to
injuries and violence based on your country’s most current year’s data. Include
the remainder of the injury and violence deaths as other.
b. Have injury deaths risen in rank over the last thirty years?
i. Provide tables which compare the top twenty causes of all deaths (not just due
to injuries and violence) over the last thirty years, in fifteen year increments
starting from the most current year’s data and going backwards. For example, if
your country has data from 1950 to 2010 you will have a table displaying the
top 20 causes of all deaths for the years 1980, 1995 and 2010.
ii. Provide a time series chart depicting how the top ten current leading causes of
injury and violence deaths have changed over the last thirty years. Your data should be
in five year increments. For example, if your country has data from 1950 to
2010, your chart will include the years: 1980, 1985, 1990, 1995, 2000, 2005 and
2010.
c. Are some groups more vulnerable to injuries and violence than others?
i. Provide a vertical bar chart which displays the death rates by cause of injury and
group for the most current year’s data. Use the top five current leading causes
of injury and violence deaths (from part b), and the following groups: youths (ages 15-29,
any gender), males (all ages) and females (all ages).
d. Does poverty increase the risk of injury?
i. Compare your country’s deaths due to injury and violence to another country in
your world region which has a different WHO income level classification. For
example, per the provided LMIC-HIC_country_grouping document, Australia’s
WHO region code is Wpr HI, indicating that Australia is in the Western Pacific
region and is considered High income. So a good comparison country for
Australia could be Vanuatu, with a WHO region of Wpr LMI.

For each of your countries, provide an appropriate chart with the top twenty
causes of all deaths (not just due to injuries and violence) based on the most
recent year’s data which is available for both countries. For example, if the most
recent year’s data for one country is 2011, and the other’s is 2009, you should
use the 2009 data for both countries.
ii. Provide a vertical bar chart displaying the percentage of all deaths due to injury
and violence for both countries.
4. Conclusion [10 marks]:
a. Summarise the main points discussed in your report, including your findings for the
leading causes of death due to injury and violence for your country.
5. References:
All sources must be referenced, but since Data Science is a broad multi-disciplinary field I leave
the choice of referencing style up to you. The style you choose must be used consistently for all
in-text references and all in-text references must be included in your list of references.
6. Appendices (optional)
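
As a minimal sketch of the bookkeeping behind the Analysis and Discussion questions in section 3, the Python helpers below show one possible way to pick the report years, collapse causes into a top five plus "Other", find the most recent year shared by two countries, and compute rates and percentages. The function names, the per-100,000 rate convention and the sample counts are illustrative assumptions; they are not part of the WHO data or a required approach.

```python
from collections import Counter


def report_years(latest_year, span=30, step=15):
    """Years stepping back from latest_year by `step`, covering `span` years."""
    return sorted(latest_year - offset for offset in range(0, span + 1, step))


def top_n_with_other(deaths_by_cause, n=5):
    """Keep the n largest causes and pool the remainder under 'Other'."""
    ranked = Counter(deaths_by_cause).most_common()
    result = dict(ranked[:n])
    remainder = sum(count for _, count in ranked[n:])
    if remainder:
        result["Other"] = remainder
    return result


def latest_common_year(years_a, years_b):
    """Most recent year for which both countries have data."""
    return max(set(years_a) & set(years_b))


def rate_per_100k(deaths, population):
    """Death rate per 100,000 population (an assumed convention)."""
    return 100_000 * deaths / population


def percent_of_all_deaths(injury_deaths, all_deaths):
    """Share of all deaths attributed to injury and violence, in percent."""
    return 100 * injury_deaths / all_deaths


# Made-up counts with placeholder cause labels, for illustration only.
sample = {"A": 50, "B": 40, "C": 30, "D": 20, "E": 10, "F": 5, "G": 3}

print(report_years(2010, step=15))  # [1980, 1995, 2010]
print(report_years(2010, step=5))   # [1980, 1985, 1990, 1995, 2000, 2005, 2010]
print(top_n_with_other(sample))     # {'A': 50, 'B': 40, 'C': 30, 'D': 20, 'E': 10, 'Other': 8}
print(latest_common_year(range(1950, 2012), range(1950, 2010)))  # 2009
```

The year lists reproduce the 1950 to 2010 examples given in question b, and the last call mirrors the 2011 versus 2009 example in question d. The resulting dictionaries and lists can then be passed to whichever plotting or table code you choose.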

Marking Criteria
Please review the marking criteria document provided.

Submission
Submit a single zip file which contains your Jupyter project notebook and all other files that are necessary
to reproduce your notebook. When I test your project notebook I will unzip your submission to a local
folder on my machine, and re-run all cells on your notebook. Please make sure that all links in your
Jupyter notebook to data files, imported code files, images, etc. are relative to the notebook.
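
One way to keep file links relative is sketched below with Python's standard `pathlib` module. It assumes the Project_Files folder sits next to the notebook inside the zip and that the notebook is run from its own folder; the `data_path` helper is a hypothetical name used only for illustration.

```python
from pathlib import Path

# Assumed layout: the data folder sits beside the notebook in the zip file.
DATA_DIR = Path("Project_Files")


def data_path(filename):
    """Build a path relative to the notebook's working directory."""
    return DATA_DIR / filename


pop_file = data_path("pop.csv")
print(pop_file.as_posix())      # Project_Files/pop.csv
print(pop_file.is_absolute())   # False
```

Because the path contains no drive letter or user folder, it resolves correctly wherever the submission is unzipped.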

WHO Mortality Database


The data for this assignment is provided by the WHO Mortality Database available at:
http://www.who.int/healthinfo/mortality_data/en/. Your analysis can be verified by using the WHO
online tool available at: http://apps.who.int/healthinfo/statistics/mortality/causeofdeath_query/
The online tool allows you to query the WHO Mortality database to:
• Extract data for causes of death by:
o country, year, sex and age with all individual causes of death
o country, year, sex and age for a few selected causes of death by coding systems
• Extract data for population data by:
o country, year, sex and age

Resources
Review the data and documents available in the provided Project_Files folder. Table 1 below describes
the contents of each file.
Table 1. Data and document files (last updated 25 November 2015)

Filename                              Description
Documentation_25nov2015.doc           Word file with information on the WHO Mortality Database,
                                      file specifications and list of causes of death.
                                      Last updated: 25 November 2014
list_ctry_years_25nov2015.xlsx        Excel file with the list of countries-years available for the
                                      mortality and population data.
                                      Last updated: 25 November 2014
country_codes.csv                     Country codes and names.
                                      Last updated: 03 November 2014
notes.csv                             Notes pertaining to data for some countries-years.
                                      Last updated: 25 November 2014
pop.csv                               Reference populations and live births (for regular users,
                                      figures are now in units).
                                      Last updated: 25 November 2014
MortIcd7.csv                          Data file containing the detailed mortality data for the
                                      seventh revision of the ICD (International Classification of
                                      Diseases).
                                      Last updated: 18 February 2004.
Morticd8.csv                          Data file containing the detailed mortality data for the eighth
                                      revision of the ICD (International Classification of Diseases).
                                      Last updated: 09 July 2012.
Morticd9.csv                          Data file containing the detailed mortality data for the ninth
                                      revision of the ICD (International Classification of Diseases).
                                      Last updated: 25 November 2015.
Morticd10_part1.csv                   Data file containing the detailed mortality data for the tenth
                                      revision of the ICD (International Classification of Diseases).
                                      Last updated: 25 November 2015.
Morticd10_part2.csv                   Data file containing the detailed mortality data for the tenth
                                      revision of the ICD (International Classification of Diseases).
                                      Last updated: 25 November 2015.
LMIC-HIC_country_grouping.pdf         Country 3 letter code, WHO region and Income regions.
                                      Last updated: May 2014
WHO-Violence_Injury_Prevention.pdf    WHO document which highlights that more than 5 million
                                      people die each year as a result of injuries, resulting from
                                      acts of violence against oneself or others, road traffic
                                      crashes, burns, drowning, falls, and poisonings, among other
                                      causes.
World Health Statistics 2016-SDG.pdf  The World Health Statistics series - WHO’s annual
                                      compilation of health statistics for its 194 Member States.
                                      World Health Statistics 2016 focuses on the proposed health
                                      and health-related Sustainable Development Goals (SDGs)
                                      and associated targets.
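
As a rough sketch of how one country's rows might be pulled from a mortality file, the example below filters an in-memory CSV with Python's standard `csv` module. The column names (`Country`, `Year`, `Cause`, `Sex`, `Deaths1`), the country codes and every value in the sample are assumptions made for illustration; check the real file specification in the Documentation file before relying on them.

```python
import csv
import io

# Tiny stand-in for a mortality file. Column names and all values are
# assumptions for illustration, not real WHO data.
SAMPLE = """Country,Year,Cause,Sex,Deaths1
9990,2010,AAA,1,12
9990,2010,BBB,2,7
9991,2010,AAA,1,30
"""


def rows_for_country(csv_file, country_code):
    """Yield the rows whose Country column equals country_code."""
    for row in csv.DictReader(csv_file):
        if row["Country"] == str(country_code):
            yield row


def total_deaths(rows, column="Deaths1"):
    """Sum a deaths column, treating empty cells as zero."""
    return sum(int(row[column] or 0) for row in rows)


rows = list(rows_for_country(io.StringIO(SAMPLE), 9990))
print(len(rows), total_deaths(rows))  # 2 19
```

With the real data, the `io.StringIO(SAMPLE)` argument would be replaced by an opened file such as `open("Project_Files/Morticd10_part1.csv", newline="")`, assuming that folder layout.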
