
Psychometric Test

On the whole, testing has become an essential element of life, applied in various spheres.

The main objective of testing is to provide additional information, which may be extremely beneficial and even crucial in different situations. Nowadays, tests affect the professional and school experience of individuals because education and human resource professionals often draw conclusions about specific individuals on the basis of test results. The job application process is almost always accompanied by the completion of psychometric tests of various types. Such a procedure allows human resource executives to carry out initial screening and select the most appropriate candidates for further consideration. It is especially convenient when there is a large pool of applicants and it is physically challenging, or even impossible, to screen every applicant thoroughly. Under these circumstances, testing becomes the first stage of the application process. Educational establishments of all kinds also use tests to obtain valuable information about every student and to understand each individual's situation in terms of intelligence, decision-making, personality, and so on. This helps educational specialists create a successful developmental path for every student on the basis of an individual approach. Therefore, in the educational environment, test results determine the concrete educational interventions of school professionals and influence the academic progress of every student.

There is no doubt that testing has become an important element of everyday life, widely used under different social conditions. Hence, it is essential to ensure correct testing and an adequate interpretation of testing results. Otherwise, many mistakes may happen that would have an extremely negative impact upon the tested people and their future: rejection of a potentially successful job applicant, failed certification, wrong educational intervention, loss of promising opportunities, and other negative consequences. Adverse decisions can be made on the basis of tests that lack the requisite psychometric properties. There are a number of factors which should be considered by test developers and users. To begin with, "responsibility for test use should be assumed by or delegated only to those individuals who have the training, professional credentials, and experience necessary to handle this responsibility. Any special qualifications for test administration or interpretation specified in the test manual should be met" (AERA, APA, & NCME, 1999, p. 114). A person without proper qualifications is more likely to misuse tests and draw wrong conclusions.

Test developers and users should also consider other issues during the process of test evaluation. It is reasonable to examine these factors using the example of a personality type questionnaire developed by the Team Technology company (2012, p. 1). Personality questionnaires provide valuable information about the personality of each individual involved, which helps to assess and determine further cooperation with that person. Besides, personality questionnaires are widely used for expanding self-knowledge and supporting self-correction. The test under consideration helps to identify a personality type and leadership style. Its results can be extremely helpful for understanding the peculiarities of one's personality, building a correct leadership profile, and selecting appropriate career options.
Test participants are asked to answer the test in the following way: "For each pair of statements, select the radio button nearest the statement you agree with most. For the statements where you agree or disagree with both, select a radio button near the middle" (Team Technology, 2012, p. 1). Such a model allows more nuanced answers and helps to explore one's personality more thoroughly.

There is no doubt that the reliability of the test should be the primary concern of test users. Reliability can vary with the many factors that affect how a person responds to the test, including mood, interruptions, time of day, and so on. A good test will largely cope with such factors and give relatively little variation; an unreliable test is highly sensitive to them and will give widely varying results. Generally speaking, the longer the delay between tests, the greater the likely variation, and better tests will show less retest variation over longer delays.

In my opinion, to assess the level of reliability, it is possible to apply the test-retest method, which evaluates the correlation of results from the same test given to the same participants on two occasions (Anastasi, 1988; Cronbach, 1970). For example, various questions for a personality test can be tried out with a class of students over several years, which helps the researcher determine the questions and combinations that have better reliability. Additionally, in the development of national school tests, a class of children can be given several tests intended to assess the same abilities; a week and a month later, they are given the same tests. With allowances for learning, the variation between the test and retest results is used to assess which tests have better test-retest reliability. Importantly, there should be an appropriate interval of time between administrations. The problem with test-retest is that, in some cases, people may remember their answers and intentionally give the same answers during the second testing. Hence, this method is not entirely accurate for assessing a test's reliability.

Therefore, I believe that the best method in this case is parallel test reliability, which implies creating two tests that measure the same traits in the same way (Edenborough, 1974; Helmstadter, 1996; Hayes, 1994). Parallel-forms reliability evaluates different questions and question sets that seek to assess the same construct. For example, an experimenter develops a large set of questions, splits them into two, and administers each half to a randomly selected half of a target sample. In the development of national tests, two different tests are used simultaneously in trials; the test that gives the most consistent results is adopted, whilst the other (provided it is sufficiently consistent) is kept as a backup. Hence, the test under consideration should first be given to participants. Later, it is necessary to create a fully equivalent test, give it to the same participants, and observe the differences in the obtained results. The equivalent test can be created by replacing formulations of the initial test with their equivalents. For example, it is possible to replace "I really enjoy comforting other people who feel hurt or upset" with "I really enjoy giving moral support to people who feel frustrated and discouraged." This method can help test users identify whether the results are reliable and can be used for an adequate personality assessment. Parallel test evaluation may be done in combination with other methods, such as the split-half method, which divides items measuring the same construct into two halves and administers them to the same group of people. The correlation-based computations behind these methods are sketched below.
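Since all three approaches ultimately reduce to correlating two sets of scores, a short code sketch can make the computations concrete. The following minimal Python example uses invented, illustrative numbers rather than data from the Team Technology questionnaire; it computes a test-retest coefficient as a Pearson correlation and a split-half coefficient with the Spearman-Brown correction, and parallel-forms reliability would be computed exactly like the test-retest case, only with scores from the two equivalent forms.

import numpy as np

def pearson_r(x, y):
    # Pearson correlation between two score vectors.
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])

def test_retest_reliability(scores_time1, scores_time2):
    # Test-retest (and, analogously, parallel-forms) reliability:
    # correlate the scores of the same participants across the two
    # administrations (or across the two equivalent forms).
    return pearson_r(scores_time1, scores_time2)

def split_half_reliability(item_scores):
    # Split-half reliability: correlate odd- and even-item totals,
    # then apply the Spearman-Brown correction for halving the test.
    items = np.asarray(item_scores, dtype=float)  # shape (participants, items)
    odd_total = items[:, 0::2].sum(axis=1)
    even_total = items[:, 1::2].sum(axis=1)
    r_half = pearson_r(odd_total, even_total)
    return 2 * r_half / (1 + r_half)

if __name__ == "__main__":
    # Hypothetical total scores for six participants, tested twice.
    first = [34, 28, 41, 25, 37, 30]
    second = [33, 30, 40, 27, 35, 31]
    print("Test-retest r:", round(test_retest_reliability(first, second), 2))

    # Hypothetical item-level responses: six participants, eight items rated 1-5.
    rng = np.random.default_rng(0)
    responses = rng.integers(1, 6, size=(6, 8))
    print("Split-half (Spearman-Brown):", round(split_half_reliability(responses), 2))

In practice, a dedicated statistics package would be preferable for real analyses, but the underlying arithmetic is no more complicated than this. However, reliability itself does not guarantee the credibility of test results.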
Validity is another crucial parameter that should be analyzed by test users and developers (Borsboom, Mellenbergh, & van Heerden, 2004; Braden & Niebling, 2005). Generally, validity can be defined as an overall evaluative judgment of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of interpretations and actions made on the basis of test scores or other modes of assessment (Messick, 1995). It is important to realize that validity is not a property of the test as such. Rather, it should be understood as a property of the meaning of the test scores, and these scores reflect not only the stimulus conditions but also the person taking the test and the context in which the test is taken. In essence, it is the meaning, or the interpretation, of the test results that needs to be valid.

The two main sources of invalidity are construct underrepresentation and construct-irrelevant variance (Messick, 1995). Construct underrepresentation occurs when the test does not adequately measure what it intends to measure; as Messick (1995) explains, the assessment might be too narrow and fail to incorporate crucial dimensions or aspects of a given construct. The test under consideration does not suffer from this problem because it offers statements related to leadership qualities and personality traits, which corresponds to the test's objective: measuring leadership qualities and assisting in personality assessment. Construct-irrelevant variance, by contrast, reveals itself when the assessment is too broad and includes excess reliable variance linked to other constructs, as well as response sets or guessing propensities that affect responses in a manner irrelevant to the interpreted construct. It occurs when the test demands particular skills that have nothing in common with the skills being measured (Sireci & Parker, 2006). For instance, the test under consideration demands reading skills that are not the object of measurement. A person with reading disabilities would not be able to respond to the test adequately, which would make the results invalid. Hence, it is important to know the participants and their peculiarities in order to interpret the results of the test correctly and to provide special accommodations that prevent invalidity; in fact, proper accommodations may significantly improve the validity of the results (Braden & Joyce, 2008).

Besides, test users and developers must understand that the overwhelming majority of existing tests are oriented toward an English-speaking audience, and often toward native speakers of English. A different language background may affect participants' understanding of the test. Moreover, cultural differences are also important and should be taken into account. For instance, the test under consideration asks participants to react to the following statement: "I really like building better relationships between people." However, when assessing the reaction to this statement, it is necessary to remember that in certain Eastern cultures people are perceived as serious when they are reserved, do not engage actively in relationships, and limit their social interactions. Hence, test users and developers should be especially careful when interpreting the results of participants with a different cultural and language background; a simple quantitative screen for such construct-irrelevant influences is sketched below.
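One simple, purely illustrative way to screen for such construct-irrelevant influences is to check whether questionnaire scores correlate with a measure of a skill that should be irrelevant to the construct, such as reading ability or language proficiency. The Python sketch below uses invented scores and an arbitrary flagging threshold of 0.30; neither the data nor the threshold is taken from the Team Technology questionnaire or the cited literature.

import numpy as np

def correlation(x, y):
    # Pearson correlation between two score vectors.
    return float(np.corrcoef(np.asarray(x, dtype=float),
                             np.asarray(y, dtype=float))[0, 1])

def flag_irrelevant_variance(test_scores, irrelevant_scores, threshold=0.30):
    # If personality-test scores correlate substantially with a measure of a
    # skill that should be irrelevant to the construct (here, a hypothetical
    # reading-ability test), the scores may be contaminated by
    # construct-irrelevant variance, and accommodations or item revisions
    # should be considered. The 0.30 threshold is an arbitrary illustration.
    r = correlation(test_scores, irrelevant_scores)
    return r, abs(r) >= threshold

if __name__ == "__main__":
    # Hypothetical data: leadership-style scores and reading-ability scores
    # for eight participants (invented for illustration).
    leadership = [52, 47, 61, 39, 55, 44, 58, 50]
    reading = [88, 70, 95, 62, 90, 68, 93, 80]
    r, flagged = flag_irrelevant_variance(leadership, reading)
    print("Correlation with reading ability: %.2f, flagged: %s" % (r, flagged))

A flagged correlation does not prove invalidity on its own, but it signals that the interpretation of the scores deserves closer qualitative scrutiny.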
Moreover, it is necessary to observe the behavior of every participant during testing to determine how the results should be interpreted, because behavior at the moment of testing may help explain the nature of particular responses. For example, if the test under consideration is offered to school students, it is essential for supervising educational specialists to take into account not only the particular test results but also the following key behaviors: attention to tasks, problem-solving approaches, frustration tolerance, activity level, fine and gross motor skills, effort or motivation, speech and language, attitude, self-efficacy and self-concept, response to praise, and reflectivity (Kaufman & Kaufman, 1983). If a student does not display enough attention, he or she is more likely to give inadequate responses; hence the results of such a test should not be considered credible. Otherwise, it may provoke the implementation of wrong and ineffective intervention methods and techniques.

Besides, both quantitative and qualitative interpretation can be used. The former should be restricted to cases where there is sufficient supporting evidence, since the use of fixed cut-offs with personality measures can be particularly misleading without relevant evidence of validity. Therefore, qualitative interpretation may be more appropriate in these cases. Decision rules and their rationale should be properly documented (SHL Group, 2000, p. 9); a minimal sketch of such a documented rule follows.
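As a purely illustrative sketch of what such documentation might look like, the following Python example stores a hypothetical cut-off together with its rationale and the evidence it relies on, and routes sub-threshold scores to qualitative review instead of rejecting them outright; the scale name, cut-off value, and validation study are invented and are not taken from SHL Group (2000) or the Team Technology questionnaire.

from dataclasses import dataclass

@dataclass
class DecisionRule:
    # A documented decision rule: the cut-off, its rationale, and the
    # evidence relied upon are stored together so that the interpretation
    # can be reviewed and audited later.
    scale: str
    cutoff: float
    rationale: str
    evidence: str

    def apply(self, score: float) -> str:
        # Quantitative screening only; scores below the cut-off are routed
        # to qualitative review rather than rejected automatically.
        return "proceed" if score >= self.cutoff else "refer for qualitative review"

if __name__ == "__main__":
    # All values below are invented for illustration.
    rule = DecisionRule(
        scale="Leadership orientation",
        cutoff=40.0,
        rationale="Scores below 40 showed weak criterion validity in a pilot sample",
        evidence="Hypothetical internal validation study, 2011",
    )
    for score in (55.0, 38.0):
        print(score, "->", rule.apply(score))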

However, even if all the mentioned factors are taken into account, it is still possible that the test results will not help the employer assess a job candidate adequately. In some cases, a candidate's ability to produce results does not depend upon the personality traits displayed by the test: "there are people who have the most inappropriate psychometric testing results who, nonetheless, have a proven record of results. In other words, they get the job done, regardless of their personality quirks" (Clark, 2011, p. 1). Hence, there are many factors that should be considered by test developers and users, because every single factor described above may significantly distort the results of the test and destroy the credibility of a personality assessment based on it.

On the whole, it is understandable that people are very different, and human personality and aptitudes are characterized by a variety of peculiarities. There is no doubt that it is difficult to measure certain traits and abilities completely adequately while taking into account the uniqueness of every person. Hence, it is necessary for test users to rely on other evidence as well, try to achieve a better understanding of one's personality, and avoid negative implications.

References

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.

Anastasi, A. (1988). Psychological testing (6th ed.). London: Macmillan Publishing Co.

Borsboom, D., Mellenbergh, G., & van Heerden, J. (2004). The concept of validity. Psychological Review, 111(4), 1061-1071.

Braden, J. P., & Joyce, L. B. (2008). Best practices in making assessment accommodations. In A. Thomas & J. Grimes (Eds.), Best practices in school psychology (5th ed.). Silver Spring, MD: National Association of School Psychologists.

Braden, J. P., & Niebling, B. C. (2005). Using the joint test standards to evaluate the validity evidence for intelligence tests. In D. P. Flanagan & P. L. Harrison (Eds.), Contemporary intellectual assessment: Theories, tests and issues (2nd ed., pp. 615-630). New York: Guilford.

Clark, N. (2011). Psychometric testing. Should you use it? Retrieved August 27, 2012, from http://www.performance-management-made-easy.com/psychometric-testing.html

Cronbach, L. (1970). Essentials of psychological testing. New York and London: Harper International Edition.

Edenborough, R. (1974). Using psychometrics. London: Kogan Page Ltd.

Hayes, N. (1994). Foundations of psychology. London: Routledge.

Helmstadter, G. (1996). Principles of psychological measurement. London: Methuen & Co.

Kaufman, A. S., & Kaufman, N. L. (1983). Kaufman Assessment Battery for Children (K-ABC) administration and scoring manual. Circle Pines, MN: American Guidance Service.

Messick, S. (1995). Validity of psychological assessment: Validation of inferences from persons' responses and performances as scientific inquiry into score meaning. American Psychologist, 50(9), 741-749.

SHL Group. (2000). Best practice in the management of psychometric tests. Retrieved August 27, 2012, from http://www.era.org.in/library/best20practicepsychometrics1.pdf

Sireci, S., & Parker, P. (2006). Validity on trial: Psychometric and legal conceptualizations of validity. Educational Measurement: Issues and Practice, 25(3), 27-34.

Team Technology. (2012). Personality type questionnaire. Retrieved August 27, 2012, from http://www.teamtechnology.co.uk/mmdi/questionnaire/
