
Scope of epidemiology

The word 'epidemiology' derives from the Greek roots epi (upon), demos (people), and logos (study, word, or discourse). Literally, it is 'the study of that which is upon the people' or, in modern parlance, 'the study of disease in populations'. Epidemiology is the study of the distribution and determinants (factors) of health-related states or events in specified populations, and the application of this study to the control of health problems.


"Study": Epidemiology is a highly quantitative discipline based on the principles of statistics and research methodology. Study includes surveillance, observation, hypothesis testing, analytic research, and experiments.

"Distribution": Epidemiologists study the distribution of frequencies and patterns of health events within groups in a population. To do this, they use "descriptive epidemiology", which characterizes health events in terms of time (when), place (where), and persons affected (who).

1-Time distribution (of a disease):

Whether there has been an increase or decrease of the disease over time (years, months, days)?

The time distribution of a disease covers the interval between the appearance of the first case and the appearance of the last one. The peak time is the point in time at which the maximum number of cases occurs. Changes in disease frequency with time can be presented by a line graph.

[Figure: line graph of the number of cases (vertical axis) against time units, e.g. months (horizontal axis).]

Time distribution of a disease may take one of the following forms:
a) Point epidemic: a large excess of cases among people exposed to a source of infection or a chemical over a short period of time (e.g. food poisoning).
b) Secular change: disease frequency changes gradually over long periods of time (years). For example, developed countries have a secular decrease (downward secular change) in morbidity and mortality from infections, but a secular increase (upward secular change) in other health problems such as cancer and heart disease.
c) Seasonal fluctuations: e.g. an increase in gastro-enteritis among infants in summer and an increase in bronchitis in winter.
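As a rough illustration of a time distribution, the sketch below tallies cases per time unit and reports the peak time. The function name `time_distribution` and the onset data are hypothetical and chosen only for illustration; they do not come from any real outbreak.

```python
from collections import Counter


def time_distribution(onset_months):
    """Tally cases per time unit and locate the peak.

    onset_months: one integer per case (month of onset; hypothetical data).
    Returns (series, peak_month), where series lists (month, cases) for every
    month from the first case to the last, including months with no cases.
    """
    counts = Counter(onset_months)
    first, last = min(counts), max(counts)
    series = [(month, counts.get(month, 0)) for month in range(first, last + 1)]
    # The peak time is the unit with the most cases (the earliest one on ties).
    peak_month = max(series, key=lambda pair: pair[1])[0]
    return series, peak_month


onsets = [1, 2, 2, 3, 3, 3, 3, 4, 4, 6]  # illustrative, not real data
series, peak = time_distribution(onsets)
for month, cases in series:
    # A crude text version of the line graph: one star per case.
    print(f"month {month}: {'*' * cases} ({cases})")
print("peak month:", peak)
```

Plotting the `series` pairs as a line graph would give the curve of cases against time; a sharp, narrow peak would suggest a point epidemic, while a slow drift over years would suggest a secular change.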

Epidemiological transition: a shift in the pattern of diseases that occur in a given community. Such transitions are important because they tell us a great deal about the true foundations of disease, that is, what really causes it. A major epidemiological transition has occurred over the past 100 years.

2-Place distribution (of a disease):

Whether one geographical area has a higher frequency of the disease than another?
-International comparisons (between countries), e.g. malaria.
-Variations within the country, e.g. between urban and rural areas. Example: the higher frequency of lung cancer in urban areas may be due to air pollution and smoking.
-Local distribution: within a local community (e.g. a city). Example: rheumatic fever and scabies are more frequent in the poorer sectors of a city because of overcrowding and bad housing.

3-Personal characteristics:
Whether the characteristics of persons with a particular disease or condition distinguish them from those without it?

Personal characteristics include:
-Demographic characteristics: age, gender, race (a social/political classification generally based on physical characteristics), ethnicity (generally defined in relation to cultural characteristics), religion, and culture (a system of shared values, assumptions, beliefs, and norms that unite the members of the population).
-Biologic characteristics: blood levels of antibodies, chemicals, and enzymes.
-Socio-economic factors: social class, income, education, occupation, marital status.
-Personal habits: smoking, dietary pattern, physical exercise.
-Genetic characteristics: Rh.

"Determinants (factors)": epidemiologists search for causes or factors that are associated with an increased risk or probability of disease occurrence. This type of epidemiology, which tries to answer the questions of "how" and "why" the health event happened, is referred to as "analytic epidemiology". Determinants are all the physical, biological, social, cultural, and behavioral factors that influence health.

"Health-related states and events" include infectious diseases; chronic diseases; environmental problems; behavioral problems (such as use of tobacco); injuries; reactions to preventive regimens; and provision and use of health services.

"Specified populations": epidemiology deals with groups of people rather than with individuals. Specified populations are those with identifiable characteristics.

"Application to control ..": epidemiologic data steer public health decision making and aid in developing and evaluating interventions to control and prevent health problems. This is the primary function/aim of applied (field) epidemiology: to promote, protect, and restore health.

Functions (purposes) of epidemiology:

1. Identify causes and/or risk factors for diseases. To do that, epidemiology discovers the agent, host, and environmental factors that affect health, in order to provide the scientific basis for the prevention of disease and injury and the promotion of health.

2. Determine the extent of the disease in the community, and so the relative importance of causes of illness, disability, and death, in order to establish priorities for research and intervention, targeting those who have the greatest risk for ill health.

3. Study natural history (disease stages from start to end) and prognosis of disease (outcome).

4. Evaluate the effectiveness of preventive and therapeutic health programs and services in improving the health of the population.

5. Conduct surveillance of disease and injury occurrence in populations and of the levels of risk factors. Surveillance may be passive (receiving reports) or active (polling practitioners, conducting surveys).

6. Investigate outbreaks (e.g., hospital-acquired infections, disease clusters, food-borne and water-borne infections) to identify their source, and control epidemics (e.g., measles, rubella, coronary heart disease, overweight).

Characteristics of epidemiology
With so many varieties of epidemiology, it is no wonder that confusion abounds about what is and what is not epidemiology. Epidemiologic research tends to:
-be observational, rather than experimental;
-focus on free-living human populations defined by geography, worksite, institutional affiliation, occupation, migration status, health conditions, exposure history, or other characteristics, rather than on a group of highly selected individuals studied in a clinic or laboratory;
-deal with etiology and control of disease, rather than with phenomena that are not closely tied to health status;
-take a multidisciplinary, empirical approach directed at understanding or solving a problem, rather than at advancing theory within a discipline.
However, not all epidemiologic studies have these characteristics.

Key aspects of epidemiology:

A number of other fields (medicine, nursing, dentistry, pharmacy, demography, sociology, health psychology, health education, health policy, nutrition) share many common features and areas of interest with epidemiology (and with each other). Some of the key aspects of epidemiology are:
1-Epidemiology deals mostly with populations, thus involving:
-Rates and proportions
-Heterogeneity within
-Averages
-Dynamics
2-Epidemiology involves measurement.
3-Most epidemiologic studies involve comparison.
4-Epidemiology is fundamentally multidisciplinary, since it must consider statistics, biology, chemistry, physics, psychology, sociology, demography, geography, animal medicine, microbiology, computer programming, administration, toxicology, entomology, environmental science, and policy analysis.


There are three types of epidemiologic research, each conducted by persons who are trained in different ways and who use different methods:

1-Laboratory research: applies knowledge of the basic sciences to the development of procedures and strategies that enhance our understanding of patho-physiological mechanisms.
2-Epidemic investigations: deal with outbreaks of disease in specific populations, typically at the local level. The objectives are to find the agent that caused the outbreak and its modes of transmission, and to suggest appropriate control measures.
3-Population research (also called field or survey research): deals with the study of biological, environmental and behavioral determinants of diseases and their prevention.


The epidemiologic triad is a set of three interacting factors, related to host, agent and environment, that explain the disease process.


It determines:
1-Exposure of man to the risk of morbidity.
2-Characteristic features and pattern of disease.
3-Endemicity (existence and perpetuation) of a disease in the involved community.

1-HOST: The host may be human or nonhuman; here it refers specifically to the particular person or group of people susceptible to illness. Host factors include age, sex, marital status, race, specific immunity, genetic predisposition, occupation, education, religion, culture, lifestyle, health status, and behaviors. All the host factors are called intrinsic factors.

2-AGENT: The agent is the direct cause of disease, without which a specific disease cannot occur. It may be exogenous (from pollution of the environment: physical, chemical or biological) or endogenous (e.g. genetic, as in sickle cell anaemia and rhesus incompatibility). So, an agent could be:
-Biologic (microorganisms)
-Chemical (toxins, drugs, tobacco)
-Physical (fire, radiation, trauma)
-Nutritional (lack, excess or imbalance)
Idiopathic diseases: the agents of some diseases are not precisely known, e.g. cancers and essential hypertension.
Risk factors: these are not direct causal agents of disease, but they increase the susceptibility of at-risk individuals or groups, who therefore show a significantly higher incidence of that disease.

3-ENVIRONMENT: The environment is the medium (milieu) where man lives. It includes all that is external to the agent and host, and is classified into three classes:
1-The physical environment (weather, light, air quality, water, radiation, pressure [altitude], noise, chemicals).
2-The biologic environment (infectious agents of disease, reservoirs of infection, vehicles of transmission such as mosquitoes and flies, and modes of propagation such as air and water).
3-The socio-economic environment (the overall economic and political organization of the society where individuals live).
All the environmental factors are called extrinsic factors.

The human environment must be sanitary: it should fulfil standard sanitary requirements and be free of vectors of disease, rodents and pollution.


Primary sources include: medical examinations, laboratory research, clinical trials, personally conducted interviews, observations, etc.
Merits: less measurement error; better suited to the objectives of the study.
Disadvantages: costly; may not be feasible.
Secondary sources include: individual records (medical / employment) and group records (census data, vital statistics).

Group records

1. Census data (counting of population)

Basic demographic data are usually required for many epidemiological studies, as they provide the denominators necessary for the calculation of prevalence and incidence. Demographic data on family size, socio-economic status, level of educational attainment, age and sex distribution, and geographic location of the population are of value. The census, usually conducted every ten years, is the source of demographic data. Mounting a census is resource-consuming, so it cannot be conducted too often. Between census years (intercensal years), estimates of the size of the population are made, based on population birth and death rates. Incorrect estimates of population size will lead to erroneous assessments of prevalence, incidence, and health service utilization.
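The effect of the denominator can be sketched with a small calculation. The example below is only one possible approach: it assumes the intercensal population grows by natural increase alone (births minus deaths, no migration), with constant rates compounded yearly. The function names and all figures are hypothetical.

```python
def intercensal_estimate(census_population, birth_rate, death_rate, years):
    """Estimate population size some years after a census.

    birth_rate and death_rate are crude rates per 1,000 population per year.
    Assumption (not from the source text): natural increase only, no
    migration, constant rates, compounded once a year.
    """
    growth = (birth_rate - death_rate) / 1000.0
    return census_population * (1 + growth) ** years


def rate_per_1000(cases, population):
    """Cases per 1,000 population (usable for prevalence or incidence)."""
    return 1000.0 * cases / population


# Hypothetical figures for illustration only.
census_population = 100_000
estimate = intercensal_estimate(census_population, birth_rate=30, death_rate=10, years=5)
cases = 552

print("estimated population:", round(estimate))
print("prevalence with intercensal estimate:", round(rate_per_1000(cases, estimate), 2))
print("prevalence with outdated census count:", round(rate_per_1000(cases, census_population), 2))
```

With the same number of cases, the outdated census count gives a noticeably higher prevalence than the updated estimate, which is the kind of erroneous assessment the text warns about.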

DEFINITION: A census is the process of enumeration of people to obtain data about their number, age, sex distribution, education, etc. TYPES: "de facto" OR "de jure".

A "de facto" (or "present in area") census counts all persons present in the country at the time of the census.

The "de jure" census includes all persons who usually reside in the area, irrespective of where they might happen to be at the time of the census.

METHODS OF DATA COLLECTION: by either of two ways:
1-ENUMERATORS: who visit the individuals and write down the required information (in developing countries such as Egypt). With this method, errors inherent in the census depend to a considerable extent on the training of enumerators.
2-QUESTIONNAIRES: which the individuals themselves fill out and return to the source (used in developed countries such as England).

2. Vital statistics: Vital data are defined as major events in the population that are required by law to be reported to a governmental authority. The five components of vital data are births, deaths, fetal deaths, marriages, and divorces. Births and deaths are the two data sets most commonly used by public health workers.

Uses of vital data:
1-Measuring the burden of the disease. How do cause-specific mortality rates compare to each other?
2-Historical pattern of the disease. How have death rates changed over centuries?
3-Secular trends in disease. How are death rates changing now?
4-Geographic differences. How do countries differ in mortality?
5-Defining acute epidemics. Can we detect sudden changes in disease?
6-Defining new diseases. Can we detect new disease in vital data?
7-Assessing medical care. Can we compare death rates in populations receiving different kinds of medical care?
8-Environmental exposures. Can we compare death rates in regions exposed to different ecological exposures?
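The first use above, comparing cause-specific mortality rates, can be sketched as follows. This is a minimal example under stated assumptions: rates are expressed per 100,000 mid-year population, and the causes, death counts and population are invented for illustration.

```python
def cause_specific_rates(deaths_by_cause, midyear_population, per=100_000):
    """Cause-specific mortality rate for each cause, per `per` population."""
    return {
        cause: per * deaths / midyear_population
        for cause, deaths in deaths_by_cause.items()
    }


# Hypothetical deaths registered in one year in a population of 600,000.
deaths = {"cardiovascular": 450, "cancer": 300, "injury": 90}
rates = cause_specific_rates(deaths, midyear_population=600_000)

# Rank causes from the highest to the lowest rate to compare their burden.
ranking = sorted(rates, key=rates.get, reverse=True)
for cause in ranking:
    print(f"{cause}: {rates[cause]:.1f} deaths per 100,000")
```

The same function could be applied to different years or regions to look at secular trends or geographic differences, provided the death counts and population denominators are comparable.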

3. Morbidity Data:

Morbidity data are also available and provide insights into diseases which may not necessarily result in death.

a) Notifiable diseases (infectious)

In many countries, some morbidity data are collected to meet legal requirements, such as those related to infectious (communicable) diseases. In many countries the ministry of health has a statistics unit which is responsible for surveillance, monitoring the reported incidence of infectious diseases, and detecting unusual patterns or epidemics. Certain diseases are specified by law as notifiable to the relevant authority in that particular country.

Sources:
1. Local health authorities.
2. World: WHO (World Health Organization), Geneva, Switzerland.
3. Region: WHO Member States are grouped into six regions. Each region has a regional office:
-Regional Office for Africa
-Regional Office for the Americas
-Regional Office for South-East Asia
-Regional Office for Europe
-Regional Office for the Eastern Mediterranean (EMRO)
-Regional Office for the Western Pacific


Disease registers (records)
4. Disease control activities, consisting of case registers of TB and other communicable diseases, and tumor registers.
5. Records of industrial and school absenteeism, and of pre-employment and periodical physical examinations.
6. Data accumulated as a by-product of insurance and other health-related activities.
7. Data accumulated in biomedical research activities (as in surveys) and interviews of hospitalized cases.

b) Hospital-based data

Routine data from hospital records and health center medical records are additional sources of data: for example, data on hospital admissions, discharges, outpatient and primary care consultations, and registers of particular disease events such as cancer registers, in-patient registers, etc.

Hospital-based data are used in:
-Etiologic research (case-control and cohort studies and clinical trials).
-Medical care research (utilization patterns, organizational characteristics of providers associated with level of care).
-Natural history studies (rates of survival after treatment in a hospital).
-Studies of hospital-acquired (nosocomial) infections.
-Time trends in disease frequency (estimating trends over time in the surrounding community).
-Defining priorities for allocation of resources, based on patterns of mortality, on the symptomatic burden of the illness to both individuals and society (e.g. hospital days and doctor visits), and on measures of functional impairment of the individual, i.e. disability measures (restricted activity days, bed days, and work or school-loss days).


Other sources of morbidity data

Specific data on particular health problems may be collected through the health service or in special surveys:
-Data on mental health service utilization
-Nutrition in children
-Disability and need for rehabilitation services

These may be collected by governmental or non-governmental agencies. Many countries have established registers for certain diseases and conditions, for example head injuries and their causes, insulin-dependent diabetes and its distribution in the population, patients with cancer and their occupational histories, and children with congenital malformations and the drugs their mothers may have been exposed to during pregnancy.

Mortality data

Mortality data are usually based upon the completion of a standard death certificate, which records the date of death, age, date of birth and place of residence; in addition, occupation and other variables may be recorded. Death certificates are usually completed by medical practitioners where the subject and the cause of death were known to the practitioner; in other cases they may be completed by the police or other authorities. The degree of completeness and accuracy of death certification differs from country to country. Where additional socio-demographic details are recorded, these may provide valuable insight into the determinants of the cause of death. Once these certificates are recorded, they are coded by a trained person using the International Classification of Diseases (ICD). This provides a standardized system for recording the immediate cause of death as well as the underlying cause of death.

Factors affecting mortality as a source of data:
1. Causes of death may be poorly recorded or under-recorded for stigmatised diseases (e.g. AIDS) in an attempt to maintain confidentiality.
2. Accuracy of clinical diagnoses.
3. Ability of medical practitioners to complete death certificates carefully.
4. Difficulty of classification and coding of data from death certificates.


Data collection techniques

Data collection techniques allow us to systematically collect information about our objects of study (i.e. people, objects, phenomena) and about the setting in which they occur. In the collection of data we have to be systematic. If data are collected haphazardly, it will be difficult to answer our research questions in a conclusive way.

Example: During a nutrition survey three different weighing scales were used in three villages. The researcher did not record which scales were used in which village. After completion of the survey, it was discovered that the scales were not standardized and indicated different weights when weighing the same child. It was therefore impossible to conclude in which village malnutrition was most prevalent.

Various data collection techniques can be used, such as:

-Experiments/clinical trials
-Observing and recording well-defined events (e.g., counting the number of patients waiting in emergency at specified times of the day)
-Laboratory research
-Obtaining relevant data from management information systems
-Using available information or document review
-Written questionnaires
-Interviews

Using available information

Usually there is a large amount of data that has already been collected by others, although it may not necessarily have been analysed or published. Locating these sources and retrieving the information is a good starting point in any data collection effort. Analysis of health information system data, census data, unpublished reports and publications in archives and libraries, or in offices at the various levels of health and health-related services, may be a study in itself. Usually, however, it forms part of a study in which other data collection techniques are also used. Other sources of available data are newspapers and published case histories, e.g. patients suffering from serious diseases, or their relatives, telling of their experiences and how they cope.

The use of key informants is another important technique to gain access to available information. Key informants could be knowledgeable community leaders or health staff at various levels, and one or two informative members of the target group (e.g. adolescents on their sexual behaviour). They can be involved in various stages of the research, from the statement of the problem to analysis of the data and development of recommendations.

Observing

Observation is a technique that involves systematically selecting, watching and recording behaviour and characteristics of human beings, objects or phenomena. Observation of human behaviour is a much-used data collection technique. It can be undertaken in different ways:
-Participant observation: the observer takes part in the situation he or she observes, for example a doctor hospitalised with a broken hip, who now observes hospital procedures from within.
-Non-participant observation: the observer watches the situation but does not participate.

Observation can be open (shadowing, e.g. observing a health worker with his or her permission during routine activities) or concealed (mystery client, e.g. trying to obtain antibiotics without a medical prescription). These may serve different purposes. Observations of human behaviour can form part of any study, but as they are time-consuming they are most often used in small-scale studies.

Observations can also be made on objects, e.g. the presence or absence of a latrine and its state of cleanliness. Here observation may be the major research technique. If observations are made using a defined scale, they may be called measurements. Measurements usually require additional tools. For example, in nutritional surveillance we measure weight and height using weighing scales and a measuring board, and we use thermometers to measure body temperature.

Observation can give additional, more accurate information on the behaviour of people than interviews or questionnaires. It can also check the information collected through interviews, especially on sensitive topics such as alcohol or drug use or stigmatized diseases. For example, whether community members share food or drinks with patients suffering from feared diseases (leprosy, TB, AIDS) is an essential observation in a study on stigma.

Interviewing

An interview is a data collection technique that involves oral questioning of respondents, either individually or as a group. Answers to the questions posed during an interview can be recorded by writing them down (either during the interview itself or immediately after the interview), by tape-recording the responses, or by a combination of both.

Interviews can be conducted with varying degrees of flexibility. The two extremes, high and low degree of flexibility, are described below.

High degree of flexibility

When studying sensitive issues such as teenage pregnancy and abortions, the investigator may use a list of topics rather than fixed questions. These may, e.g., include how teenagers started sexual intercourse, the responsibility girls and their partners take to prevent pregnancy (if at all), and the actions they take in the event of unwanted pregnancies. The investigator should have an additional list of topics ready for when the respondent falls silent (e.g., when asked about abortion methods used, who made the decision and who paid). The sequence of topics should be determined by the flow of discussion. It is often possible to come back to a topic discussed earlier in a later stage of the interview.

The unstructured or loosely structured method of asking questions can be used for interviewing individuals as well as groups of key informants. (Focus group discussions (FGDs) are described later in this document.) A flexible method of interviewing is useful if a researcher has as yet little understanding of the problem or situation he is investigating, or if the topic is sensitive. It is frequently applied in exploratory studies. The instrument used may be called an interview guide or interview schedule.*

Low degree of flexibility

Less flexible methods of interviewing are useful when the researcher is relatively knowledgeable about expected answers or when the number of respondents being interviewed is relatively large. Then questionnaires may be used with a fixed list of questions in a standard sequence.

For example, after a number of observations on the (hygienic) behaviour of women drawing water at a well and some key informant interviews on the use and maintenance of the wells, one may conduct a larger survey on water use and satisfaction with the quantity and quality of the water.

* Though in principle one may speak of loosely structured questionnaires, in practice the term questionnaire appears to be so hooked to tools with pre-categorised answers that we have decided to use the term interview guide for loosely structured tools. However, in reality there is often a mixture of open and pre-categorised answers.

4. Administering written questionnaires

A WRITTEN QUESTIONNAIRE (also referred to as a self-administered questionnaire) is a data collection tool in which written questions are presented that are to be answered by the respondents in written form. Written questionnaires can be administered in many ways, such as:
1. Sending questionnaires by mail with clear instructions on how to answer the questions and asking for mailed responses;
2. Gathering all or part of the respondents in one place at one time, giving oral or written instructions, and letting the respondents fill out the questionnaires; or
3. Hand-delivering questionnaires to respondents and collecting them later.
The questions can be either open-ended or closed (with pre-categorised answers).

5. Focus group discussion

A focus group discussion allows a group of 8 - 12 informants to freely discuss a certain subject with the guidance of a facilitator or reporter.

6. Projective techniques

When a researcher uses projective techniques, (s)he asks an informant to react to some kind of visual or verbal stimulus. An example of a projective technique is the presentation of a hypothetical question or an incomplete sentence or case study to an informant (a story with a gap). A researcher may ask the informant to complete such sentences in writing. Such techniques can easily be combined with semi-structured interviews or written questionnaires. They are also very useful in FGDs to get people's opinions on sensitive issues.

7. Mapping and scaling

In a water supply project, mapping is invaluable. It can be used to present the placement of wells, distance of the homes from the wells, other water systems, etc. It gives researchers a good overview of the physical situation and may help to highlight relationships hitherto unrecognised.

In scaling, a researcher may, for example, present informants with certain types of herbal medicine and ask them to arrange these into piles according to their usefulness. The informants would then be asked to explain the logic of their ranking. Mapping and scaling may be used as participatory techniques in rapid appraisals or situation analyses. In a separate volume on participatory action research, more such techniques will be presented.

Interviews

In a structured interview, the researcher asks a standard set of questions and nothing more.

Face-to-face interviews have the distinct advantage of enabling the researcher to establish rapport with potential participants and therefore gain their cooperation. These interviews yield the highest response rates in survey research. They also allow the researcher to clarify ambiguous answers and, when appropriate, seek follow-up information. Their disadvantages are that they are impractical when large samples are involved, time-consuming, and expensive.

Telephone interviews are less time-consuming and less expensive, and the researcher has ready access to anyone on the planet who has a telephone. The response rate is not as high as for the face-to-face interview, but considerably higher than for the mailed questionnaire. The sample may be biased to the extent that people without phones are part of the population about whom the researcher wants to draw inferences.

Computer Assisted Personal Interviewing (CAPI) is a form of personal interviewing in which, instead of completing a paper questionnaire, the interviewer brings along a laptop or hand-held computer and enters the information directly into the database. This method saves the time involved in processing the data, and saves the interviewer from carrying around hundreds of questionnaires. However, this type of data collection can be expensive to set up and requires that interviewers have computer and typing skills.

Paper-and-pencil questionnaires can be sent to a large number of people and save the researcher time and money. People tend to be more truthful when responding to questionnaires, particularly on controversial issues, because their responses are anonymous. But they also have drawbacks: the majority of people who receive questionnaires do not return them, and those who do might not be representative of the originally selected sample.

Web-based questionnaires: a new and inevitably growing methodology is the use of Internet-based research. The respondent receives an e-mail containing an address that, when clicked, leads to a secure web site where a questionnaire can be filled in. This type of research is often quicker and less detailed. Disadvantages of this method include the exclusion of people who do not have, or cannot access, a computer. The validity of such surveys is also in question, as people might be in a hurry to complete them and so might not give accurate responses.

Questionnaires often make use of checklists and rating scales. These devices help simplify and quantify people's behaviors and attitudes. A checklist is a list of behaviors, characteristics, or other entities that the researcher is looking for. Either the researcher or the survey participant simply checks whether each item on the list is observed, present or true, or not. A rating scale is more useful when a behavior needs to be evaluated on a continuum.
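The difference between a checklist and a rating scale can be sketched in a few lines of code. Everything here is hypothetical: the checklist items, the 1 to 5 scale, and the function names are assumptions chosen only to illustrate the two devices.

```python
from statistics import mean

# Hypothetical checklist items for a household sanitation observation.
CHECKLIST_ITEMS = ["latrine present", "latrine clean", "handwashing water available"]


def score_checklist(observed_items):
    """Mark each checklist item as present (True) or absent (False)."""
    ticks = {item: item in observed_items for item in CHECKLIST_ITEMS}
    return ticks, sum(ticks.values())


def summarize_ratings(ratings, low=1, high=5):
    """Average a set of ratings given on a low..high continuum."""
    for rating in ratings:
        if not low <= rating <= high:
            raise ValueError(f"rating {rating} is outside {low}..{high}")
    return mean(ratings)


ticks, total = score_checklist({"latrine present"})
print(ticks, "items observed:", total)

# Three respondents rate their satisfaction with water quality from 1 to 5.
print("mean rating:", summarize_ratings([4, 5, 3]))
```

A checklist yields yes/no answers that can be counted, whereas a rating scale yields graded values that can be averaged or compared between groups.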

Techniques of data collection and their tools

1. Available information: checklist, data compilation form
2. Observation: eyes and other senses, pen/paper, watch, weighing scales, microscope, measuring boards, etc.
3. Interview: interview guide, checklist, questionnaire, tape recorder, video camera, computer
4. Written questionnaire: questionnaire

List of tools for data collection

1. Observation schedule (observationnaire)
2. Interview guide
3. Interview schedule
4. Mailed questionnaire
-rating scales
-checklist
-opinionnaire
-data sheets (document schedule)
-schedule for institutions
-inventories

Survey Research Meaning and Nature

A health survey is a systematic collection of factual data pertaining to health and disease in a human population within a given geographic area. Survey studies are usually used to establish facts by collecting data directly from a population or a sample. The survey is the most commonly used descriptive method in research. The researcher collects data to describe the nature of existing conditions, to identify standards against which existing conditions can be compared, or to determine the relationships that exist between specific events. Often a survey study aims to understand and explain phenomena in a natural setting, to provide information to government or other organizations, to compare different demographic groups, or to examine cause-and-effect relationships in order to make predictions. For this it generally requires responses directly from respondents in a large population. The kind of information required decides the geographical coverage of data collection and whether the survey is extensive or intensive. An extensive survey is carried out when a researcher wants to make generalizations, whereas an intensive survey is done for making estimations. Survey research demands various tools to collect data from samples, ranging from observation and interview to questionnaire.

So the kind of survey study needed for any study is based on its purpose, nature of data and population and sample of the study.


The selection of a survey method suited for a particular study is based on the purpose of the study, method of data collection and time frame. The methods are given below.


General surveys involve collecting information about populations, institutions, or phenomena without any specific objective or hypothesis. These surveys are usually undertaken by the government to provide regular data on socio-economic problems. A census is a general survey. In specific surveys, data collection is based on certain specified objectives or hypotheses. These kinds of surveys are undertaken by institutions to address their own problems, or by individuals for their academic work.


As already mentioned, a census survey is a kind of general survey. Its unique characteristic is that it collects data from all the members of the population. A sample survey is the opposite of a census survey: the data are collected from a sample drawn from the population. It can be general or specific. Sample surveys save time and money compared with a census survey, provided the sample truly represents the population.


A regular survey, as the name indicates, is conducted at regular intervals of time, whereas an ad hoc survey is conducted once only. Regular surveys give longitudinal data on core issues as well as on special issues taken up for research work. Ad hoc surveys are limited to one issue, which is dealt with at a specific point in time.


A preliminary survey is generally conducted before taking up a wide sample survey. It allows the tool to be improved on the basis of the responses obtained. In research work a pilot study is generally conducted for this purpose. The final survey is the one in which data are collected with the tool as improved after the preliminary survey.

In a longitudinal survey the phenomenon is observed, or data are collected, at different periods of time, so that changes in the phenomenon between time points are also observed. Revisiting the population, posing similar queries, and collecting data again reveals how the data change over time. For example, the health condition of the people of a particular place is observed in relation to their environment over 2 to 3 years, with data collected at different periods. The change in health condition relative to the change in environment over that period helps the researcher to see the causal relationship between the two.


In a cross-sectional survey the sample is observed at one point in time, and the data collected provide a description of the population's features. These studies focus on the relationship between different variables at a point in time, for instance the relationship between income, locality, and personal expenditure. Cross-sectional analysis relates to how variables affect each other at the same time.


In a comparative survey, the status of two variables or institutions of the same population is compared. For example, two distance education study centres in the same region are compared with respect to enrolment, achievement, and other variables.

Evaluative surveys are usually conducted to evaluate a program or scheme that has already been implemented by an organisation or government. For example, an accreditation body has allowed new institutions to be started throughout the nation. After a particular period, it wants to know the impact of these new institutions on the specific kind of education they offer, and their effectiveness with respect to the expected outcomes. The outcomes of the study help the accreditation body to formulate future policies for better output.



In a documentary survey a variety of information resources are used to answer the research question. These sources can be workshop material, books, official records, newspaper articles, handouts, institutional reports, individual experiences, etc. These surveys are used to analyse present events on the basis of the records available to the researcher.


Even though each survey type varies with respect to the mode of data collection and the tools used, the steps involved in data collection are more or less similar. The steps are:
1. Selection of the problem and defining the objective, i.e. identify the purpose, e.g. what is the prevalence of cigarette smoking in Agogo?
2. Deciding the information needed
3. Research design
4. Operationalisation of concepts and construction of measuring indexes and scales
5. Sampling
6. Construction of tools for collection of data and their pre-test
7. Field work and collection of data
8. Processing of data and tabulation
9. Analysis of data
10. Reporting

1. It gives the researcher the opportunity to see reality more closely; inferences are based not on theory or dogma but on facts.
2. It leads to greater objectivity.
3. It leads to the introduction of new theory. For example, poverty was regarded as the cause of crime for a fairly long time, but increasing crime in advanced countries has falsified this theory.
4. It helps to know the health situation or status.
5. An important aspect of the survey study is its versatility. It is the only practical way to collect many types of information from individuals, such as personal characteristics, socio-economic data, attitudes, opinions, experiences, and expectations.
6. It facilitates drawing generalisations about a population on the basis of data from a representative sample.
7. It is flexible and allows various methods of data collection.
8. It sensitizes the researcher to unanticipated or unknown problems.
9. It is useful in verifying theories.

1. It requires training for those who collect information, which demands more financial resources.
2. It is a time-consuming process if the area is large.
3. Its reliability and validity depend on the honesty and efficiency of the survey workers.
4. Surveys are mostly based on samples, so there is always a possibility of sampling error.
5. As data are collected from primary sources, feasibility depends on the willingness and cooperation of the respondents.
6. There is a possibility of response error due to respondents' untrue or misleading answers.
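The sampling error mentioned in point 4 can be made concrete with a small sketch. The Python snippet below estimates a prevalence from a sample survey and attaches an approximate 95% confidence interval using the normal (Wald) approximation. The function name `prevalence_with_ci` and the counts (120 smokers among 400 sampled adults) are hypothetical and chosen only for illustration; they are not data from any survey described here.

```python
import math


def prevalence_with_ci(cases, sample_size, z=1.96):
    """Return (prevalence, lower, upper) for a sample proportion.

    Uses the normal (Wald) approximation, which is an assumption of this
    sketch; it works reasonably for large samples and proportions that are
    not close to 0 or 1.
    """
    if sample_size <= 0 or not 0 <= cases <= sample_size:
        raise ValueError("need 0 <= cases <= sample_size and sample_size > 0")
    p = cases / sample_size
    standard_error = math.sqrt(p * (1 - p) / sample_size)
    lower = max(0.0, p - z * standard_error)
    upper = min(1.0, p + z * standard_error)
    return p, lower, upper


# Hypothetical sample survey: 120 current smokers among 400 sampled adults.
p, lo, hi = prevalence_with_ci(120, 400)
print(f"prevalence = {p:.1%}, approximate 95% CI {lo:.1%} to {hi:.1%}")
```

A larger sample narrows the interval, which is one way to see the trade-off between the cost of a survey and the size of its sampling error.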

Design of epidemiological studies

1. Observational

A) DESCRIPTIVE

-Case report- A description of a single case, typically describing the manifestations, clinical course, and prognosis of that case. Due to the wide range of natural biologic variability in these aspects, a single case report provides little empirical evidence to the clinician. Case reports do describe how others diagnosed and treated the condition and what the clinical outcome was.

-Case series- A descriptive, observational study of a series of cases, typically describing the manifestations, clinical course, and prognosis of a condition. A case series provides weak empirical evidence because of the lack of comparability, unless the findings are dramatically different from expectations. Case series are best used as a source of hypotheses for investigation by stronger study designs. Unfortunately, the case series is the most common study type in the clinical literature.

-Cross-sectional (prevalence studies)- A descriptive study of the relationship between diseases and other factors at one point in time (usually) in a defined population. Cross-sectional studies lack any information on the timing of exposure and outcome relationships and include only prevalent cases.

B) ANALYTIC

-Cross-sectional (making groups)
-Case control (retrospective)
-Cohort (prospective/follow up)
   Prospective cohort (concurrent follow up)
   Retrospective cohort (historical/non-concurrent follow up)
-Time series studies
-Group level studies, also called ecologic correlational studies, or aggregate studies

2. Experimental (intervention/clinical trials) studies

Randomized controlled trial
1. Double-blind randomized trial
2. Single-blind randomized trial
3. Non-blind trial

Nonrandomized trial (quasi experiment)


Cohort studies
A cohort is defined as a group of people with a common characteristic or experience. In a cohort study, healthy subjects are defined according to their exposure status and followed over time to determine the incidence of symptoms, disease, or death. The common characteristic for grouping subjects is their exposure level. Usually two groups are compared, an exposed and an unexposed group. The unexposed group is called the referent group or comparison group.

Cohort study is the term that is typically used to describe an epidemiologic investigation that follows groups with common characteristics. Other expressions that are used include follow-up, incidence, and longitudinal study. There are several additional terms for describing cohort studies that depend on the characteristics of the population from which the cohort is derived, whether the exposure changes over time, and whether there are losses to follow-up. The term fixed cohort is used when the cohort is formed on the basis of an irrevocable event such as undergoing a medical procedure. Thus, an individual's exposure in a fixed cohort does not change over time. The term closed cohort is used to describe a fixed cohort with no losses to follow-up. In contrast, a cohort study conducted in an open population is defined by exposures that can change over time, such as cigarette smoking. Cohort studies in open populations may also experience losses to follow-up.

Timing of Cohort Studies

Three terms are used to describe the timing of events in a cohort study: prospective, retrospective, and ambidirectional. In a prospective cohort study, participants are grouped on the basis of past or current exposure and are followed into the future in order to observe the outcomes of interest. When the study commences, the outcomes have not yet developed and the investigator must wait for them to occur. In a retrospective cohort study, both the exposures and outcomes have already occurred when the study begins.
Thus, this type of investigation studies only prior outcomes and not future ones. An ambidirectional cohort study has both prospective and retrospective components. The decision to conduct a retrospective, prospective, or ambidirectional study depends on the research question, practical constraints such as time and money, and the availability of suitable study populations and records.

Selection of the Exposed Population

The choice of the exposed group in a cohort study depends on the hypothesis being tested, the exposure frequency, and feasibility considerations such as the availability of records and ease of follow-up. Special cohorts are used to study the health effects of rare exposures such as uncommon workplace chemicals, unusual diets, and uncommon lifestyles. Special cohorts are often selected from occupational groups (such as rubber workers) or religious groups (such as Mormons) where the exposures are known to occur. General cohorts are typically assembled for common exposures such as cigarette smoking and alcohol consumption. These cohorts are often selected from professional groups such as nurses or from well-defined geographic areas in order to facilitate follow-up and accurate ascertainment of the outcomes under study.

Selection of Comparison Group

There are three sources for the comparison group in a cohort study: an internal comparison group, the general population, and a comparison cohort. An internal comparison group consists of unexposed members of the same cohort. An internal comparison group should be used whenever possible, because its characteristics will be most similar to those of the exposed group. The general population is used for comparison when it is not possible to find a comparable internal comparison group. The general population comparison is based on preexisting population data on disease incidence and mortality. A comparison cohort consists of members of another cohort. It is the least desirable option because the comparison cohort, while not exposed to the exposure under study, is often exposed to other potentially harmful substances, and so the results can be difficult to interpret.

Sources of Information

Cohort study investigators typically rely on many sources for information on exposures, outcomes, and other key variables. These include medical and employment records, interviews, direct physical examinations, laboratory tests, biological specimens, and environmental monitoring. Some of these sources are pre-existing, and others are designed specifically for the study. Because each type of source has advantages and disadvantages, investigators often use several sources to piece together all of the necessary information.
Health care records are used to describe a participant's exposure history in studies of possible adverse health effects stemming from medical procedures. The advantages of these records include low expense and a high level of accuracy and detail regarding a disease and its treatment. Their main disadvantage is that information on many other key characteristics, apart from basic demographic characteristics, is often missing. Employment records are used to identify individuals for studies of occupational exposures. Typical employment record data include job title, department of work, years of employment, and basic demographic characteristics. Like medical records, they usually lack details on exposures and other important variables.

Because existing records such as health care and employment records often have limitations, many studies are based on data collected specifically for the investigation. These include interviews, physical examinations, and laboratory tests. Interviews and self-administered questionnaires are particularly useful for obtaining information on lifestyle characteristics (such as use of cigarettes or alcohol), which are not consistently found in records.

Whatever the source of information, it is important to use comparable procedures for obtaining information on the exposed and unexposed groups. Biased results may occur if different sources and procedures are used. Thus, all resources used for one group must be used for the other. In addition, it is a good idea to mask investigators to the exposure status of a subject so that they make unbiased decisions when assessing the outcomes. Standard outcome definitions are also recommended to guarantee both accuracy and comparability.

Approaches to Follow-Up

Loss to follow-up occurs either when the participant no longer wishes to take part or when he or she cannot be located. Because high rates of follow-up are critical to the success of a cohort study, investigators have developed many methods to maximize retention and trace study participants. For prospective cohort studies, strategies include collection of information (such as full name, Social Security number, and date of birth) that helps locate participants as the study progresses. In addition, regular contact is recommended for participants in prospective studies. These contacts might involve requests for up-to-date outcome information or newsletters describing the study's progress and findings. The best strategy to use when participants do not initially respond is to send additional mailings. When participants are truly lost to follow-up, investigators employ a number of strategies.
These include sending letters to the last known address with "Address Correction Requested"; checking telephone directories, directory assistance, newly available Internet resources such as the White Pages, vital statistics records, driver's license rosters, and voter registration records; and contacting relatives, friends, and physicians identified at baseline.

Analysis

The primary objective of the analysis of cohort study data is to compare the occurrence of symptoms, disease, and death in the exposed and unexposed groups. If it is not possible to find a completely unexposed group to serve as the comparison, then the least exposed group is used. The occurrence of the outcome is usually measured using cumulative incidence or incidence rates, and the relationship between the exposure and outcome is quantified using the absolute or relative difference between the risks or rates. Analysis of a cohort study uses the ratio of either the risk or rate of disease in the exposed cohort, compared with the rate or risk in the unexposed cohort.


If follow-up times differ markedly between participants, a rate may be more appropriate. The risk ratio uses as a denominator the entire group recruited at the start of the study, while the rate ratio uses as a denominator the person-years, which takes account of losses to follow-up.

Table 1. Calculation of the rate ratio from a hypothetical cohort study of smoking and cancer of the pancreas followed for 1 year.

               Cancer of the pancreas   No disease   Total     Incidence rate
Smokers        42                       27,000       27,042    1.5/1000/yr
Non-smokers    7                        63,000       63,007    0.1/1000/yr
Total          49                       90,000       90,049

From the data in Table 1, taken from a hypothetical cohort study to investigate the association between smoking and cancer of the pancreas, the relative and attributable risk can be calculated as follows:

Rate ratio (RR) = incidence rate in exposed group (r1) / incidence rate in unexposed group (r0)
RR = 1.5 / 0.1 = 15

The relative risk of 15 indicates that the risk of cancer of the pancreas is 15 times higher among smokers than non-smokers.

Attributable risk (AR)
AR = incidence rate among exposed (r1) - incidence rate among unexposed (r0)
AR = 1.5 - 0.1 = 1.4/1,000/yr

The attributable risk of cancer of the pancreas due to smoking is 1.4 cases per 1,000 per year.

Attributable risk % is calculated as:
AR% = (r1 - r0) / r1 x 100, or equivalently (RR - 1) / RR x 100
AR% = (1.5 - 0.1) / 1.5 x 100 = 93%

This can be interpreted as: smoking accounts for 93% of all cases of cancer of the pancreas among smokers.

Standardised mortality or morbidity ratio (SMR)

Standardised mortality and morbidity ratios are another commonly used method of presenting results in a cohort study.
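Returning to the Table 1 example of smoking and cancer of the pancreas, the rate ratio, attributable risk, and attributable risk % can be recomputed with a short Python sketch. The function names are hypothetical, and the sketch assumes that every participant was followed for the full year, so that the group totals stand in for person-years. It runs the calculation twice: once with the rounded rates used in the text (1.5 and 0.1 per 1,000 per year) and once with the unrounded rates derived from the counts.

```python
def rate_per_1000(cases, population, years=1.0):
    """Incidence rate per 1,000 person-years (assumes complete follow-up)."""
    return cases / (population * years) * 1000


def cohort_measures(r1, r0):
    """Return (rate ratio, attributable risk, attributable risk %).

    r1 is the rate in the exposed group, r0 the rate in the unexposed group.
    """
    rate_ratio = r1 / r0
    attributable_risk = r1 - r0
    attributable_pct = (r1 - r0) / r1 * 100
    return rate_ratio, attributable_risk, attributable_pct


# Rounded rates exactly as quoted in the text.
rr, ar, ar_pct = cohort_measures(1.5, 0.1)
print(f"rounded:   RR = {rr:.1f}, AR = {ar:.2f}/1000/yr, AR% = {ar_pct:.1f}")

# Unrounded rates computed from the Table 1 counts.
r1_exact = rate_per_1000(42, 27_042)
r0_exact = rate_per_1000(7, 63_007)
rr_exact, ar_exact, ar_pct_exact = cohort_measures(r1_exact, r0_exact)
print(f"unrounded: RR = {rr_exact:.1f}, AR = {ar_exact:.2f}/1000/yr, "
      f"AR% = {ar_pct_exact:.1f}")
```

With the rounded rates the sketch reproduces the figures in the text (RR of 15, AR of 1.4, AR% of about 93). With the unrounded rates the rate ratio comes out closer to 14, which shows how sensitive a ratio is to rounding when the denominator is small.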

Case control studies

In the traditional view, subjects are selected on the basis of whether they have or do not have the disease. Those who have the disease are termed cases, and those who do not have the disease are termed controls. The exposure histories of cases and controls are then obtained and compared. Thus, the central feature of the traditional view is the comparison of the cases' and controls' exposure histories. This differs from the logic of experimental and cohort study designs, in which the key comparison is disease incidence between the exposed and unexposed (or least exposed) groups.

Over the last two decades, the traditional view that a case-control study is a backwards cohort study has been supplanted by a modern view that asserts that it is merely an efficient way to learn about the relationship between an exposure and disease. More specifically, a case-control study is a method of sampling a population in which researchers identify and enroll cases of disease and a sample of the source population that gave rise to the cases. The sample of the source population is known as the control group. Its purpose is to provide information on the exposure distribution in the population that produced the cases, so that the rates of disease in exposed and nonexposed groups can be compared. Thus, the key comparison in the modern view is the same as that of a cohort study.

Selection of Cases

The first step in the selection of cases for a case-control study is the formulation of a disease or case definition. A case definition is usually based on a combination of signs and symptoms, physical and pathological examinations, and results of diagnostic tests. It is best to use all available evidence to define with as much accuracy as possible the true cases of disease. Once investigators have created a case definition, they can begin case identification and enrollment.
Typical sources for identifying cases are hospital or clinic patient rosters, death certificates, special surveys, and reporting systems such as cancer or birth defects registries. Investigators consider both accuracy and efficiency in selecting a particular source for case identification. The goal is to identify as many true cases of disease as quickly and cheaply as possible.

Another important issue in selecting cases is whether they should be incident or prevalent. Researchers who study the causes of disease prefer incident cases because they are usually interested in the factors that lead to developing a disease rather than factors that affect its duration. However, sometimes epidemiologists have no choice but to rely on prevalent cases (for example, when studying the causes of insidious diseases whose exact onset is difficult to pinpoint). Studies using prevalent cases must be interpreted cautiously, because it is impossible to determine if the exposure is related to the inception of the disease, its duration, or a combination of the two.

Selection of Controls

Controls are a sample of the population that produced the cases. The guiding principle for the valid selection of controls is that they come from the same base population as the cases. If this condition is met, then a member of the control group who gets the disease under study would end up as a case in the study. This concept is known as the "would" criterion, and its fulfilment is crucial to the validity of a case-control study. Another important principle is that controls must be sampled independently of exposure status. In other words, exposed and unexposed controls should have the same probability of selection.

Epidemiologists use several sources for identifying controls in case-control studies. They may sample: (1) individuals from the general population, (2) individuals attending a hospital or clinic, (3) friends or relatives identified by the cases, or (4) individuals who have died.

Population controls are typically selected when cases are identified from a well-defined population such as residents of a geographic area. These controls are usually identified using voter registration lists, driver's license rosters, telephone directories, and random digit dialing (a method for identifying telephone subscribers living in a defined geographic area). Population controls have one principal advantage that makes them preferable to other types of controls. Because of the manner in which population controls are identified, investigators are usually assured that the controls come from the same population as the cases. Thus, investigators are usually confident that population controls are comparable to the cases with respect to demographic and other important variables.
However, population controls have several disadvantages. First, they are time consuming and expensive to identify. Second, these individuals do not have the same level of interest in participating as do cases and controls identified from other sources. Third, because they are generally healthy, their recall may be less accurate than that of cases, who are likely reviewing their history in search of a reason for their illness.

Epidemiologists usually select hospital and clinic controls when they identify cases from these health care facilities. Thus, these controls have diseases or have experienced events (such as a car accident) for which they have sought medical care. The most difficult aspect of using these types of controls is determining which diseases or events are suitable for inclusion. In this regard, investigators should follow two general principles. First, the illnesses in the control group should, on the basis of current knowledge, be unrelated to the exposure under study. For example, a case-control study of cigarette smoking and emphysema should not use lung cancer patients as controls, because lung cancer is known to be caused by smoking cigarettes.

Second, the controls' illness should have the same referral pattern to the health care facility as the cases' illness. For example, a case-control study of acute appendicitis should use patients with other acute conditions as controls. Following this principle will help ensure that the cases and controls come from the same source population.

There are several advantages to the use of hospital and clinic controls. Because they are easy to identify and have good participation rates, hospital and clinic controls are less expensive to identify than population controls. In addition, because they come from the same source population, they will have comparable characteristics to the cases. Finally, their recall of prior exposures will be similar to the cases' recall, because they are also ill. The main disadvantage of this type of control is the difficulty in determining appropriate illnesses for inclusion.

In rare circumstances, deceased and special controls are enrolled. Deceased controls are occasionally used when some or all of the cases are deceased by the time data collection begins. Researchers usually identify these controls by reviewing death records of individuals who lived in the same geographic area and died during the same time period as the cases. The main rationale for selecting dead controls is to ensure comparable data collection procedures between the two groups. For example, if researchers collect data by interview, they would conduct proxy interviews with subjects' spouses, children, relatives, or friends for both the dead cases and dead controls. However, many epidemiologists discourage the use of dead controls because these controls may not be a representative sample of the source population that produced the cases, which, by definition, consists of living people.
Furthermore, the investigator must consider the study hypothesis before deciding to use dead controls, because they are more likely than living controls to have used tobacco, alcohol, or drugs. Consequently, dead controls may not be appropriate if the study hypothesis involves one of these exposures.

In unusual circumstances, a friend, spouse, or relative (usually a sibling) is nominated by a case to serve as his or her control. These special controls are used because they are likely to share the case's socioeconomic status, race, age, educational level, and, if they are related to the case, genetic characteristics. However, cases may be unwilling or unable to nominate people to serve as their controls. In addition, biased results are possible if the study hypothesis involves a shared activity among the cases and controls.

Methods for Sampling Controls

Epidemiologists use three main strategies for sampling controls in a case-control study. Investigators can select controls from the non-cases or survivors at the end of the case diagnosis and accrual period. This method of selection, which is known as survivor sampling, is the predominant method for selecting controls in traditional case-control studies. In case-based or case-cohort sampling, investigators select controls from the population at risk at the beginning of the case diagnosis and accrual period. In risk set sampling, controls are selected from the population at risk as the cases are diagnosed.

When case-based and risk set sampling methods are used, the control group may include future cases of disease. Although this may seem incorrect, modern epidemiologic theory supports it. Recall that both diseased and nondiseased individuals contribute to the denominators of the risks and rates in cohort studies. Thus, it is reasonable for the control group to include future cases of disease because it is merely an efficient way to obtain the denominator data for the risks and rates.

Sources of Exposure Information

Case-control studies are used to investigate the risk of disease in relation to a wide variety of exposures, including those related to lifestyle, occupation, environment, genes, diet, reproduction, and the use of medications. Most exposures that are studied are complex, and so investigators must attempt to obtain sufficiently detailed information on the nature, sources, frequency, and duration of these exposures. Sources available for obtaining exposure data include in-person and telephone interviews; self-administered questionnaires; preexisting medical, pharmacy, registry, employment, insurance, birth, death, and environmental records; and biological specimens. When selecting a particular source, investigators consider its availability, its accuracy, and the logistics and cost of data collection. Accuracy is a particular concern in case-control studies because exposure data are retrospective. In fact, the relevant exposures may have occurred many years before data collection, making it difficult to gather correct information.

Analysis

As described above, controls are a sample of the population that produced the cases.
However, in most instances the sampling fraction is not known, so the investigator cannot fill in the total population in the margin of a two-by-two table or obtain the rates and risks of disease. Instead, the researcher obtains a number called an odds, which functions as a rate or risk. An odds is defined as the probability that an event will occur divided by the probability that it will not occur. In a case-control study, epidemiologists typically calculate the odds of being a case among the exposed (a/b) compared to the odds of being a case among the nonexposed (c/d). The ratio of these two odds is expressed as follows:

(a/b) / (c/d), or ad / bc

This ratio, known as the disease odds ratio, provides an estimate of the relative risk just as the incidence rate ratio and cumulative incidence ratio do. Risk or rate differences are not usually obtainable in a case-control study. However, it is possible to obtain the attributable proportion among the exposed.
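The odds ratio calculation described above can be sketched in Python. The cell labels follow the text: a and b are the exposed cases and exposed controls, c and d are the unexposed cases and unexposed controls. The counts are hypothetical. The attributable proportion among the exposed is computed here as (OR - 1) / OR, which mirrors the (RR - 1) / RR form and assumes that the odds ratio is an acceptable estimate of the relative risk; the function names are illustrative only.

```python
def odds_ratio(a, b, c, d):
    """Disease odds ratio (a/b) / (c/d) = ad / bc.

    a: exposed cases       b: exposed controls
    c: unexposed cases     d: unexposed controls
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive in this sketch")
    return (a * d) / (b * c)


def attributable_proportion_exposed(or_value):
    """(OR - 1) / OR, assuming the odds ratio approximates the relative risk."""
    return (or_value - 1) / or_value


# Hypothetical two-by-two table.
a, b, c, d = 60, 40, 30, 70
or_value = odds_ratio(a, b, c, d)
ap_exposed = attributable_proportion_exposed(or_value)
print(f"odds ratio = {or_value:.2f}")
print(f"attributable proportion among the exposed = {ap_exposed:.1%}")
```

In this made-up table the odds of being a case are 60/40 among the exposed and 30/70 among the unexposed, so the odds ratio is 3.5.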

Screening: Refers to the application of a test to people who are asymptomatic in order to classify them as likely or unlikely to have a particular disease.

The screening procedure does not diagnose illness. Those who test positive (appear likely to have the disease) are sent on for further evaluation by a subsequent diagnostic test or procedure to determine whether they do in fact have the disease (reach a final diagnosis). The people who are then found to have the disease are treated.

The concept of screening is that early detection, before the development of symptoms, will lead to a more favorable prognosis because treatment begun before the disease becomes clinically manifest will be more effective than later treatment. In that way screening reduces morbidity and mortality from the disease among the screened people (by the early treatment).

Screening has played an important role in improving public health over the years.

There are often risks or costs associated with the screening and/or consequent procedures that must be weighed against the benefits.

For a screening program to be successful: 1-Apply screening for a disease with characteristics appropriate for screening, 2-A suitable screening test must be available.


To be appropriate for screening, a disease should be:

1-Serious: life-threatening and with irreversible consequences if not treated early (e.g. breast cancer). Medical problems such as gallstones, which are usually not life-threatening and may never become symptomatic, may not be suitable for screening. This criterion relates to issues of a) cost-effectiveness (resources expended on screening must be justifiable in terms of eliminating or ameliorating adverse health consequences, i.e. screening must make better use of limited resources than competing medical activities), and b) ethics (the consequences of failing to diagnose or treat early must be severe enough to warrant undergoing the risks and discomforts of the screening procedure itself);

2-Effective treatment for the disease should be available. Treatment given before symptoms develop (in the preclinical phase) should be more beneficial (better prognosis) in terms of reducing morbidity/mortality than treatment given after they develop. E.g. cancer of the uterine cervix develops slowly, taking more than 10 years for cancer cells, which are initially confined to the outer layer of the cervix, to progress to a phase of invasiveness. During the pre-invasive stage, the cancer is usually asymptomatic but can be detected by screening using the Papanicolaou smear. It is better to begin treatment during this stage than when the cancer becomes invasive.

On the other hand, if early treatment makes no difference because prognosis is equally good or bad, whether treatment is begun before or after symptoms develop, then the application of a screening test will be neither necessary nor effective. E.g. lung cancer has a very poor prognosis regardless of when the treatment is initiated;

3-Prevalence of preclinical disease should be high among the population screened. This criterion relates to the issue of the costs of the screening program relative to the number of cases detected. The prevalence of the detectable preclinical phase of a disease, and thus the number of cases detected by screening, can be increased by screening high-risk groups, e.g. screening for bladder cancer in those with relevant occupational exposures, or screening for breast cancer in women with a family history of the disease.

The natural history of the disease to be screened must be understood.


HYPERTENSION: meets all the criteria for a disease suitable for screening. First, it is a serious disease (mortality is greater in hypertensive individuals, and the risk of death increases with higher levels of blood pressure). Second, early treatment of cases reduces the risk of subsequent morbidity and mortality from all vascular diseases combined and from stroke. Third, the prevalence of hypertension in a screened population of adults is likely to be high.

2-SCREENING TESTS: A screening test should ideally be simple, inexpensive, rapid, easy to administer, and impose minimal discomfort (painless) on the patients. Also, the results of the screening test must be valid, reliable, and reproducible.

A screening test should divide the examined people into two groups: 1) healthy and 2) with abnormality. Those with abnormality are likely to have the disease and will be subjected to further confirmatory tests. The following diagram represents this procedure:

Apparently healthy people
    Screening test
        -ve (no disease)
        +ve
            Confirmatory tests
                -ve (no disease)
                +ve (with disease)

Because a screening test is simple and rapid, we expect it to have some errors. Screening tests are therefore evaluated by measuring the reliability and validity of each test.

Reliability (or reproducibility): a reliable test gives consistent results (always the same) when it is performed more than once on the same individuals under the same conditions. An unreliable test (one that gives different results each time it is applied to the same individual under the same conditions) should never be used as a screening test, as it misclassifies people as healthy or diseased.
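One simple way to look at reliability is to apply the test twice to the same individuals and count how often the two results agree. The sketch below does this with invented results; the function name `percent_agreement` and the data are illustrative assumptions, and simple percent agreement is only one of several possible reliability measures, not one prescribed by these notes.

```python
def percent_agreement(first_run, second_run):
    """Share of individuals who get the same result on both applications."""
    if not first_run or len(first_run) != len(second_run):
        raise ValueError("both runs must cover the same, non-empty group")
    same = sum(1 for x, y in zip(first_run, second_run) if x == y)
    return same / len(first_run)


# Hypothetical results for eight individuals tested twice ("+" or "-").
run_1 = ["+", "-", "-", "+", "-", "-", "+", "-"]
run_2 = ["+", "-", "+", "+", "-", "-", "-", "-"]

print(f"Agreement between the two runs: {percent_agreement(run_1, run_2):.0%}")
```

A perfectly reliable test would show 100% agreement; the lower the figure, the more people are being classified differently on repeat testing.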

Validity (or accuracy): a valid test provides a true preliminary indication of whether an individual has the disease or not. So, validity is the ability of the test to do what it is supposed to do (categorize persons who have preclinical disease as test positive and those without preclinical disease as test negative).

We can know the validity of any test by comparing its results with the results of confirmatory tests (gold standard tests). If they agree, the test is considered "valid".

The following table shows the results of applying a screening test to a group of apparently healthy people which classified them into healthy and abnormal.

Then we applied a confirmatory test to the same group and we got the following results:

RESULTS OF A SCREENING TEST:

                                 Disease status (Dx) [truth]:
                                 results of confirmatory test
Results of screening test (T)    Positive (+)    Negative (-)    Total
Positive (+)                     a               b               a+b
Negative (-)                     c               d               c+d
Total                            a+c             b+d             a+b+c+d

a = The number of individuals for whom the screening test is positive and the individual actually has the disease (true positive).
b = The number of individuals for whom the screening test is positive BUT the individual does NOT have the disease (false positive).
c = The number of individuals for whom the screening test is negative BUT the individual does have the disease (false negative).
d = The number of individuals for whom the screening test is negative and the individual does not have the disease (true negative).

                 Condition truly present     Condition truly absent      Total
Test positive    true positive (T+|D+)       false positive (T+|D-)      total testing positive
Test negative    false negative (T-|D+)      true negative (T-|D-)       total testing negative
Total            true prevalence             1 - prevalence              population size

Validity is divided into sensitivity and specificity (So, Sensitivity and specificity are two measures of the validity of a screening test).

Sensitivity = Probability (T+ | Dx+) = a / (a+c)

Specificity = Probability (T- | Dx-) = d / (b+d)

The overall validity is calculated by the formula: (a+d) / (a+b+c+d)

A 100% valid test is one that has no false positive (cell b) or false negative (cell c) cases. That is to say its results agree 100% with the confirmatory test, so it can be considered a diagnostic test.

Sensitivity (accuracy in classification of cases): is the ability of the screening test to give positive results in diseased patients (avoid missing true cases). It is calculated by relating the true positives by the screening test to all positives by the confirmatory test: a / (a+c).

Specificity (accuracy in classification of noncases): is the ability of the screening test to give negative results in disease-free individuals (rule out all negative individuals). It is calculated by relating the true negatives by the screening test to all negatives by the confirmatory test: d / (b+d).
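The three validity measures can be written directly from the cells a, b, c and d. The following is a minimal sketch: the class name `FourfoldTable` and the counts are hypothetical, chosen only so the arithmetic is easy to follow by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FourfoldTable:
    """Cells of the screening-by-confirmatory table, named as in the text."""
    a: int  # true positives
    b: int  # false positives
    c: int  # false negatives
    d: int  # true negatives

    def sensitivity(self) -> float:
        return self.a / (self.a + self.c)

    def specificity(self) -> float:
        return self.d / (self.b + self.d)

    def overall_validity(self) -> float:
        return (self.a + self.d) / (self.a + self.b + self.c + self.d)


# Hypothetical counts (not taken from any study).
table = FourfoldTable(a=80, b=30, c=20, d=870)

print(f"Sensitivity:      {table.sensitivity():.1%}")
print(f"Specificity:      {table.specificity():.1%}")
print(f"Overall validity: {table.overall_validity():.1%}")
```

With these counts, 80 of the 100 diseased people and 870 of the 900 disease-free people are classified correctly, so 950 of the 1000 people examined are classified correctly overall.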

Overall validity depends on both sensitivity and specificity. It is desirable to have a screening test that is both highly sensitive and highly specific, but that is usually not possible, and there is generally a tradeoff between the sensitivity and specificity of a given screening test. Any decision regarding the acceptable levels of sensitivity and specificity in a given situation involves weighing the consequences of leaving cases undetected (false negatives) against those of incorrectly classifying healthy persons as diseased (false positives).
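The tradeoff is easiest to see when the screening test is a measurement with a cutoff value. The sketch below assumes such a test (a single blood pressure reading) and uses ten invented people; the readings, the cutoffs and the function name `sens_spec` are illustrative assumptions, not data from the text.

```python
# Hypothetical (systolic reading in mmHg, truly hypertensive?) pairs.
people = [
    (118, False), (124, False), (131, False), (136, False), (142, False),
    (138, True), (144, True), (151, True), (158, True), (166, True),
]


def sens_spec(cutoff):
    """Call the test positive when the reading is at or above the cutoff."""
    a = sum(1 for value, diseased in people if diseased and value >= cutoff)
    c = sum(1 for value, diseased in people if diseased and value < cutoff)
    b = sum(1 for value, diseased in people if not diseased and value >= cutoff)
    d = sum(1 for value, diseased in people if not diseased and value < cutoff)
    return a / (a + c), d / (b + d)


for cutoff in (130, 140, 150):
    sensitivity, specificity = sens_spec(cutoff)
    print(f"cutoff {cutoff}: sensitivity {sensitivity:.0%}, "
          f"specificity {specificity:.0%}")
```

Lowering the cutoff catches every case but labels more healthy people as positive; raising it does the reverse.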

Sensitivity and specificity of breast cancer screening exam.

                                                      Breast cancer
Screening test
(physical examination & mammography)    Cancer confirmed    Cancer not confirmed    Total
Positive                                132                 983                     1,115
Negative                                45                  63,650                  63,695
Total                                   177                 64,633                  64,810

Sensitivity = a / (a+c) = 132 / 177 = 0.746 X 100 = 74.6%

Specificity = d / (b+d) = 63,650 / 64,633 = 0.985 X 100 = 98.5%

Cancer confirmation is done using biopsies or aspiration of breast tissues.

In the above table, 64,810 women aged 40-64 years were screened for breast cancer.

During the first 5 years of observation, 132 breast cancers were diagnosed among the 1115 biopsies or aspirations that were recommended on the basis of the results of the screening procedures.

In addition, 45 cases of breast cancer were detected among women who screened negative but were diagnosed with the disease during the subsequent years. These women were assumed to have been false negative (i.e. they were assumed to have had the disease at the time of the screening test but were missed by the screening test).

Thus the sensitivity of mammography plus physical examination in these data would be 132/177 or 74.6%. This means that of those diagnosed with breast cancer during the study period, 74.6% tested positive on the screening procedure.


The specificity of the screening procedure would equal 63,650/64,633 or 98.5%, indicating that almost all women who did not have the disease, tested negative.


Sensitivity should be increased (to catch all cases) at the expense of specificity when the penalty associated with missing a case is high, such as: 1) when the disease is serious and definitive treatment exists, so that it can be treated effectively in its early stages (e.g. PKU), 2) when the disease can spread (syphilis and gonorrhea), or 3) when subsequent diagnostic evaluations of positive screening tests are associated with minimal cost and risk (a further series of blood pressure readings to ascertain hypertension).

Specificity should be increased relative to sensitivity when the costs or risks associated with further diagnostic techniques are substantial, so that only a small number of individuals, few of whom will turn out not to have the disease, are selected for further testing (e.g. breast cancer, for which the definitive diagnostic evaluation of a positive screening test is a biopsy). In this circumstance, it must be made quite clear to those screened that a negative screening test is not a guarantee of being disease-free, but rather that the likelihood of having the disease is low.

Phenylketonuria (PKU): is a rare disease but has very serious long term consequences if left untreated. It is a congenital metabolic disorder, in which there is an absence of phenylalanine hydroxylase activity in the liver. When a newborn with PKU ingests proteins containing the amino acid phenylalanine, accumulation of certain metabolites affects the developing brain leading to severe mental retardation. Dietary restriction of phenylalanine begun soon after birth can prevent mental retardation.

Evaluation of screening programs: Even after a disease is determined to be appropriate for screening and a valid test becomes available, it remains unclear whether a widespread screening program for that disease should be implemented.

Evaluation of a screening program involves consideration of two issues:

1-Whether it is effective in reducing morbidity and mortality.
2-Whether it is feasible. Feasibility is determined by:
- Acceptability of the program to the screenees (the test must be quickly and easily administered, with minimal discomfort),
- Cost-effectiveness (total cost and the resources expended per detected case of the disease),
- The subsequent diagnosis and treatment of individuals who test positive, and
- The yield of cases (the number of cases detected by a screening program in relation to the total number of tests performed). Yield is measured by the predictive value of the screening test.

Predictive value measures how likely it is that an individual actually has, or does not have, the disease, given the result of the screening test.

Sensitivity and specificity are, in principle, characteristics of the test itself. In practice, all sorts of factors can influence the degree of sensitivity and specificity that are achieved in a particular setting (e.g., calibration of the instruments, level of training of the reader, quality control, severity of the condition being detected, expectation of positivity). However, for any particular sensitivity and specificity, the yield of a test (accurate and inaccurate positive test results) will be determined by how widespread the condition is in the population being tested.

The typical difficulty is that, since the number of people without the condition is usually much larger than the number with the condition, even a very good test can easily yield more false positives than true ones.
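This effect can be checked with a little arithmetic. The sketch below computes the expected numbers of true and false positives for an invented scenario; the population size, prevalence, sensitivity and specificity are all hypothetical values chosen for illustration, and `expected_positives` is an illustrative helper name.

```python
def expected_positives(population, prevalence, sensitivity, specificity):
    """Expected true and false positives when everyone is screened once."""
    diseased = population * prevalence
    healthy = population - diseased
    true_positives = diseased * sensitivity
    false_positives = healthy * (1 - specificity)
    return true_positives, false_positives


# Hypothetical scenario: 100,000 people, prevalence 0.5%,
# sensitivity and specificity both 99%.
tp, fp = expected_positives(100_000, 0.005, 0.99, 0.99)

print(f"Expected true positives:  {tp:.0f}")
print(f"Expected false positives: {fp:.0f}")
```

Even with a test that is right 99% of the time in both groups, the 99,500 disease-free people produce about twice as many positive results as the 500 diseased people.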

The concept of predictive value is used to assess the performance of a test in relation to a given frequency of the condition being sought.

Predictive value of the positive test OR predictive value positive (PV+) = a / (a+b). It is the proportion of those measured sick (+ve) on the screening test who are actually sick (+ve) according to the confirmatory test.

Predictive value of the negative test OR predictive value negative (PV-) = d / (c+d). It is the proportion of those measured well (-ve) on the screening test who are actually well (-ve) according to the confirmatory test.

Predictive value is an essential measure for assessing the effectiveness of a detection procedure. The PV+ provides an estimate of the probability that someone with a positive result in fact has the condition; the PV- provides an estimate of the probability that someone with a negative result does not in fact have the condition.

A high predictive value negative is expected in screening for rare diseases, because the vast majority of those screened will be disease free.

The predictive value positive, or yield, of a screening test can be increased by increasing the prevalence of the preclinical disease. This is done by applying the screening test to individuals who are at high risk of developing the disease.
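The dependence of PV+ on prevalence can be made explicit. Writing the cells of the fourfold table in terms of prevalence, sensitivity and specificity gives PV+ = (sensitivity x prevalence) / (sensitivity x prevalence + (1 - specificity) x (1 - prevalence)). The sketch below evaluates this for a hypothetical test with 90% sensitivity and 95% specificity; these values and the three prevalences are assumptions for illustration only.

```python
def predictive_value_positive(prevalence, sensitivity, specificity):
    """PV+ = a / (a + b), with a and b expressed as proportions of the population."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Same hypothetical test applied to groups with increasing prevalence.
for prevalence in (0.001, 0.01, 0.10):
    pv = predictive_value_positive(prevalence, sensitivity=0.90, specificity=0.95)
    print(f"prevalence {prevalence:.1%}: PV+ = {pv:.1%}")
```

The test itself is unchanged in all three runs; only the group it is applied to differs, which is why targeting high-risk groups raises the yield.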

Applying these formulas to the breast cancer screening data:

               Confirmed +ve    Confirmed -ve    Total
Screen +ve     132              983              1,115
Screen -ve     45               63,650           63,695
Total          177              64,633           64,810

PV+ = 132 / 1,115 = 0.118 x 100 = 11.8%
PV- = 63,650 / 63,695 = 0.999 x 100 = 99.9%
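The two predictive values can be recomputed from the four cells of the breast cancer table. This is only a check of the arithmetic, using the cell counts given in the text and the cell names a, b, c, d defined earlier.

```python
# Cell counts from the breast cancer screening table in the text.
a = 132      # screen positive, cancer confirmed
b = 983      # screen positive, cancer not confirmed
c = 45       # screen negative, cancer confirmed
d = 63_650   # screen negative, cancer not confirmed

pv_positive = a / (a + b)
pv_negative = d / (c + d)

print(f"PV+ = {a}/{a + b} = {pv_positive:.1%}")
print(f"PV- = {d}/{c + d} = {pv_negative:.1%}")
```

Although the test is highly specific, only about one in nine women with a positive screen had confirmed cancer, because the disease is rare in the screened population.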

A study was made of clinicians' ability to diagnose streptococcal infection in 149 patients coming to the emergency department with sore throat. Doctors' clinical impressions were compared to results of throat culture for group A streptococcus. Thirty-seven patients had positive throat cultures, and 27 of these were diagnosed by doctors as having streptococcal throat. One hundred twelve patients had negative cultures, and doctors diagnosed 35 of these as having streptococcal throat.

What is the sensitivity of the doctors' clinical impression of streptococcal throat in this study?

What is the specificity?

EXERCISES

1) Calculate the sensitivity, specificity, positive predictive value and negative predictive value of the screening test:

Surgical outcome in appendicitis Yes Emergency Room Diagnosis Of appendicitis No 10 590 600 Yes 190 210 400 No Total









2) There is a test which is said to be an excellent screening test for anemia, with sensitivity and specificity each 99%. Assume that the prevalence of anemia in a population of 1000 persons is 1%.

Set up the fourfold table (like the above one) to show the results of the screening test in this population. How many false positives are there? How many false negatives? What is the predictive value of a positive test?

Tools used in epidemiological surveys
1. Scales
Survey scales are simply the formats used to collect response data.

Likert scale - The most commonly used scale is the Likert scale, named after the psychologist Rensis Likert. It measures degree of agreement (strongly agree to strongly disagree).

Semantic differential scales - also common, these require respondents to rate a statement against two adjectives with opposite meanings (good-bad, tasty-bland, easy-hard).

Self-rating scales - these are direct measures of the respondent's attitude about the survey item (describes me very well-does not describe me at all).

All these scales provide numerical data and lead to straightforward analysis. Many survey designers supplement numerical scales with open-ended questions in order to add depth to the data.

Number of scale points
Survey scales can be constructed using different numbers of points. The common scales are:
- three-point scale
- five-point scale
- seven-point scale
- ten-point scale
- eleven-point scale

1. To construct a five-point Likert scale.

Develop a series of statements that are either favorable or unfavorable toward the object or issue under investigation, such as, "Doctor variety is an important factor for me in choosing a hospital." Respondents answer on a five-point scale: (5) strongly agree (4) agree (3) neither agree nor disagree (2) disagree (1) strongly disagree.
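Once responses are collected, the five labels are converted to the scores 5 to 1 given above. The sketch below shows one way to do that. The names `LIKERT_5` and `score_item` are illustrative, and the `reverse` option (flipping the scores of unfavorable statements so that a high score always means a favorable attitude) is a common convention assumed here, not something stated in the text.

```python
# Scores for a five-point Likert item, as listed in the text.
LIKERT_5 = {
    "strongly agree": 5,
    "agree": 4,
    "neither agree nor disagree": 3,
    "disagree": 2,
    "strongly disagree": 1,
}


def score_item(responses, reverse=False):
    """Convert response labels to scores; reverse=True flips the scale."""
    scores = [LIKERT_5[r.strip().lower()] for r in responses]
    if reverse:
        scores = [6 - s for s in scores]
    return scores


# Hypothetical answers from four respondents to one statement.
answers = ["Agree", "Strongly agree", "Disagree", "Neither agree nor disagree"]
scores = score_item(answers)

print("Scores:", scores)
print("Mean score:", sum(scores) / len(scores))
```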
2. To construct a seven-point semantic differential scale.

Require respondents to rate the item of interest with respect to two adjectives with opposite meanings. Here is an example item: "I find the food at fast-food restaurants to be _____." Respondents rate the statement against the bipolar adjectives "healthy" and "unhealthy." To explore attitudes further, add additional bipolar pairs, such as "tasty or bland," "a good value or overpriced," and "natural or processed." Limit the number of items, because respondents find long semantic-differential surveys tedious.

3. To construct a seven-point self-rating scale.

Items should require respondents to answer concerning their attitude about the subject under question. Here is an example item: "How favorably or unfavorably do you feel toward capital punishment?" The respondent answers on a seven-point continuum between "very favorable" and "very unfavorable." Another variant makes a general statement, such as "I support capital punishment." The respondent answers on a continuum between "Describes Me Very Well" and "Does Not Describe Me At All." Self-rating scales are easy to construct, and items require the least amount of preparation and screening by the researcher.

Balanced and unbalanced scales

Balanced scales have equal numbers of positive and negative points: 1. completely satisfied; 2. mostly satisfied; 3. mostly dissatisfied; 4. completely dissatisfied. Although there are exceptions, a best practice with respect to balanced scales is that the use of modifiers should be symmetrical on the positive and negative ends of the scale. Unbalanced scales attempt to get greater discrimination on one side of the scale than on the other. If past experience suggests that most respondents are satisfied with a certain product or service, a researcher might want to stretch out the positive side of the scale, as in this extreme example (five positives and only two negatives): 1. completely satisfied; 2. very satisfied; 3. mostly satisfied; 4. somewhat satisfied; 5. barely satisfied; 6. mostly dissatisfied; 7. completely dissatisfied. Whether a scale should be balanced or unbalanced usually depends on whether we're measuring a unipolar or a bipolar concept of satisfaction. A unipolar satisfaction scale might range from not satisfied to completely satisfied (i.e., it doesn't measure any more extreme dissatisfaction at all). In contrast, a bipolar scale would range from completely dissatisfied to completely satisfied (i.e., it measures extremes of both satisfaction and dissatisfaction).

2. Observation

3. Questionnaire
A questionnaire is a research instrument consisting of a series of questions for the purpose of gathering information from respondents. It is also defined as a planned set of questions used to collect data.

Types of questionnaire
Questionnaires with questions that measure separate variables could, for instance, include questions on:
- preferences (e.g. type of food)
- behaviors (e.g. smoking)
- facts (e.g. gender)

Questionnaires with questions that are aggregated into either a scale or an index include, for instance, questions that measure:
- latent traits (e.g. personality traits such as extroversion)
- attitudes (e.g. towards AIDS patients)
- an index (e.g. Social Economic Status)

Question types
Usually, a questionnaire consists of a number of questions that the respondent has to answer in a set format. There are two common types of questions, open-ended and closed-ended questions. An open-ended question asks the respondent to formulate his own answer, whereas a closed-ended question asks the respondent to pick an answer from a given number of options. The response options for a closed-ended question should be exhaustive and mutually exclusive. Four types of response scales for closed-ended questions are distinguished:
1. Dichotomous, where the respondent has two options
2. Nominal-polytomous, where the respondent has more than two unordered options
3. Ordinal-polytomous, where the respondent has more than two ordered options
4. (Bounded) Continuous, where the respondent is presented with a continuous scale
A respondent's answer to an open-ended question is coded into a response scale afterwards. An example of an open-ended question is a question where the respondent has to complete a sentence (sentence completion item).
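For closed-ended questions with numeric categories (ages, years of exposure), the requirement that options be exhaustive and mutually exclusive can be checked mechanically. The sketch below assumes categories given as inclusive whole-number ranges; the function name `check_ranges` and the example categories are hypothetical.

```python
def check_ranges(ranges, lowest, highest):
    """List overlaps and gaps in inclusive integer (low, high) categories."""
    if not ranges:
        return ["no options given"]
    problems = []
    ordered = sorted(ranges)
    if ordered[0][0] > lowest:
        problems.append(f"no option below {ordered[0][0]}")
    covered_up_to = ordered[0][1]
    for low, high in ordered[1:]:
        if low <= covered_up_to:
            problems.append(f"option {low}-{high} overlaps an earlier option")
        elif low > covered_up_to + 1:
            problems.append(f"gap between {covered_up_to} and {low}")
        covered_up_to = max(covered_up_to, high)
    if covered_up_to < highest:
        problems.append(f"no option above {covered_up_to}")
    return problems


# Hypothetical age categories in whole years; 16 appears in two options.
print(check_ranges([(0, 11), (12, 16), (16, 20), (21, 120)], 0, 120))

# The same categories with the boundary fixed.
print(check_ranges([(0, 11), (12, 16), (17, 20), (21, 120)], 0, 120))
```

A respondent aged exactly 16 could tick two boxes in the first set of options, so those options are not mutually exclusive; the second set produces an empty list of problems.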


Question sequence
In general, questions should flow logically from one to the next. To achieve the best response rates, questions should flow from the least sensitive to the most sensitive, from the factual and behavioural to the attitudinal, and from the more general to the more specific.

Basic rules for questionnaire item construction

- Use statements which are interpreted in the same way by members of different subpopulations of the population of interest.
- Use statements where persons that have different opinions or traits will give different answers.
- Think of having an "open" answer category after a list of possible answers.
- Use only one aspect of the construct you are interested in per item.
- Use positive statements and avoid negatives or double negatives.
- Do not make assumptions about the respondent.
- Use clear and comprehensible wording, easily understandable for all educational levels.
- Use correct spelling, grammar and punctuation.
- Avoid items that contain more than one question per item (e.g. Do you like strawberries and potatoes?).

Questionnaire administration modes

Main modes of questionnaire administration

- Face-to-face questionnaire administration, where an interviewer presents the items orally.
- Paper-and-pencil questionnaire administration, where the items are presented on paper.
- Computerized questionnaire administration, where the items are presented on the computer.
- Adaptive computerized questionnaire administration, where a selection of items is presented on the computer and, based on the answers to those items, the computer selects the following items, optimized for the respondent's estimated ability or trait.

Example of questionnaire

ARE YOU FIT TO WORK NIGHTS?

The purpose of this questionnaire is to ensure that you are suited to working at night. All the information you provide will be kept confidential.

Type of work/duration of night work
1. Surname ...
2. First and second name/s ...
3. Sex: Male / Female
4. Date of Birth ...
5. Permanent Address ...
6. Job Title ...
7. National Insurance No ...
8. Department/Clock No ...

Do you suffer from any of the following health conditions? (Yes/No)
- Diabetes
- Heart or circulatory disorders
- Stomach or intestinal disorders
- Any condition which causes difficulties sleeping
- Chronic chest disorders, especially if night time symptoms are troublesome
- Any medical condition requiring medication to a strict timetable
- Any other health factors that might affect fitness at work

If you have answered yes to the above question you may be asked to see a doctor or nurse for further assessment.

I, the undersigned, confirm that the above is correct to the best of my knowledge.
Signed ... Date ...

Assessment (this gives an indication of whether the worker is fit to work nights or should see a doctor or nurse for a medical examination) ...
Signed ... Date ...

2. Health and smoking questionnaire

1.1. How long have you smoked regularly?

Less than 1 year / 1-3 years / 3-5 years / 5-10 years / 10-25 years / More than 25 years

1.2. At what age did you start smoking?

Under 12 / 12-16 / 16-20 / 20-29 / 30 years +

1.3. Which tobacco product do you regularly smoke?

Cigarettes / Cigars / Pipe / Other



Epidemiology of communicable and non-communicable disease

Communicable diseases, also known as infectious diseases, contagious diseases or transmissible diseases, comprise clinically evident illness (i.e., characteristic medical signs and/or symptoms of disease) resulting from the infection, presence and growth of pathogenic biological agents in an individual host organism. In certain cases, infectious diseases may be asymptomatic for much or all of their course. Infectious pathogens include some viruses, bacteria, fungi, protozoa, multicellular parasites, and aberrant proteins known as prions. These pathogens are the cause of disease epidemics, in the sense that without the pathogen, no infectious epidemic occurs.

Disease transmission
The disease transmission process has three components, i.e. source, transmission route and susceptible host. The source is the origin of the disease-causing organism. This could be an infected person, animal, place or object. The main routes of transmission are:
- Direct contact, for example sexual contact, or objects such as a coin passed from one person to another
- Vectors like mosquitoes, housefly
- Faecal-oral (ingesting contaminated food and water)
- Airborne: sneezing, coughing, talking, or even singing
- Transplacental (mother to foetus)
- Blood contact (transfusion, surgery, injection)
- Contact with animals or their products that are infected.

A susceptible host is an individual who has low resistance to a particular disease. This may be due to various factors such as:
- Lack of previous contact with the disease, hence no immune cells
- Immunosuppressive illnesses such as AIDS
- Malnutrition
- Drugs that a person may be consuming.

Classification of communicable disease

There are various ways of classifying communicable diseases, e.g. they may be classified as caused by either primary pathogens or opportunistic pathogens according to the status of host defenses. The classification below, based on routes of transmission, is the one considered best for ease of understanding.
- Contact diseases, e.g. scabies, pediculosis, fungal skin infections, trachoma, acute bacterial conjunctivitis.
- Sexually transmitted diseases and HIV/AIDS.
- Vector-borne diseases, e.g. relapsing fever, bancroftian filariasis, onchocerciasis, yellow fever, trypanosomiasis, schistosomiasis, dracunculiasis, leishmaniasis and malaria.
- Faecal-oral contamination, e.g. acute gastro-enteritis, bacillary dysentery, Campylobacter jejuni infection, giardiasis, amoebiasis, cholera, enteric fevers, food poisoning.
- Helminthic diseases, e.g. ascariasis, enterobiasis, trichuriasis, hookworm, strongyloidiasis, taeniasis, hydatidosis.
- Airborne diseases: acute respiratory infections, meningitis (bacterial and fungal), tuberculosis and leprosy.

Disease prevention