- Open the dataset and report means, standard deviations, and inter-correlations for all variables in the dataset, using the table provided. Identify which of these correlations are statistically significant using the (*) symbol. Report the median and 95th percentile scores for OP1.

|Median|95th percentile score|
|---|---|
| | |
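The descriptive statistics the question asks for can be computed directly. A minimal Python sketch, using made-up OP1 scores since the dataset itself is not reproduced here:

```python
import statistics

# Hypothetical OP1 scores; the actual dataset is not shown in this document.
op1 = [3.1, 2.8, 4.0, 3.5, 2.9, 3.7, 4.2, 3.3, 2.6, 3.9]

mean = statistics.mean(op1)
sd = statistics.stdev(op1)        # sample standard deviation
median = statistics.median(op1)

def percentile(data, p):
    """p-th percentile via linear interpolation between closest ranks."""
    s = sorted(data)
    k = (len(s) - 1) * p / 100
    lo, hi = int(k), min(int(k) + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (k - lo)

print("mean:", round(mean, 2))
print("median:", round(median, 2))
print("95th percentile:", round(percentile(op1, 95), 2))
```

The same mean/SD calls, applied column by column, would fill the descriptive row of the table; percentile conventions differ slightly between packages, so a value computed this way may deviate marginally from SPSS or Excel output.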
- Consider the results of Know1 and Know2. What is the appropriate method of estimating reliability here? Report a specific value and interpret it against the appropriate standards reported in the text. What can we conclude from this version of the Job knowledge test?
|Method of estimating reliability|Reported value|
|---|---|
|Parallel-form reliability is the suitable method for estimating the reliability of Know1 and Know2, because the two forms carry equal weight in content, objectives, format, and difficulty level (Phelan, 2017). Reliability for such equivalent forms can only be estimated with the parallel-form method.|0.57, the correlation between the two tests as recorded in the correlation table.|
|Interpretation of reported value|Conclusion|
|---|---|
|Theoretically, the correlation between two parallel forms is taken as the reliability coefficient (Disha, 2016). Judged against the standards for comparable tests, a value of 0.57 is not sufficient to establish reliability (Disha, 2016): if a person took the equivalent form again, the probability of obtaining a similar score would be relatively low.|The low reliability value means this version of the Job knowledge test does not meet the standard required of equivalent forms, so it has limited applicability as an assessment tool when evaluating management trainees for hiring.|
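Parallel-form reliability is simply the Pearson correlation between the two forms. A sketch with hypothetical Know1/Know2 scores (the real columns are not shown here):

```python
import math

# Hypothetical parallel-form scores; illustrative only.
know1 = [14, 11, 16, 12, 15, 13, 17, 10]
know2 = [13, 12, 15, 11, 16, 12, 16, 11]

def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# The correlation between the two forms IS the reliability estimate.
r = pearson(know1, know2)
print("parallel-form reliability:", round(r, 2))
```

Applied to the real Know1/Know2 columns, this is the same number that appears in the correlation table (0.57 per the answer above).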
- Consider the assessment centre measures (Comm1, OP1, and Decide). Describe the pattern of inter-correlations among these variables. What does this pattern, together with the relationship between these predictor measures and the criterion measure, suggest about assessment centres as a selection tool?
|Pattern of inter-correlations among these variables|Meaning of the pattern for validity|
|---|---|
|OP1 correlates almost not at all with Comm1 (r = 0.01) and only very weakly with Decide (r = 0.05), while Comm1 and Decide correlate strongly with each other (r = 0.59). OP1 likewise shows only a low correlation with the criterion measure (Crit., r = 0.18), whereas Comm1 and Decide each correlate about 0.57 with the criterion.|These findings suggest that OP1 measures something largely unrelated to the other exercises, and its small correlation with the criterion means it lacks consistency with the other measures and cannot serve as a reliable source of evaluation. Comm1 and Decide, by contrast, correlate well with each other (0.59) and correspond well with the criterion measure, so they can be treated as dependable, consistent sources of information when assessing applicants.|
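The whole pattern of inter-correlations can be generated by correlating every pair of assessment-centre measures. A sketch with invented scores (the real data are not reproduced here):

```python
import math

# Hypothetical assessment-centre scores; illustrative only.
scores = {
    "OP1":    [3.1, 2.8, 4.0, 3.5, 2.9, 3.7],
    "Comm1":  [62, 55, 70, 48, 66, 59],
    "Decide": [7.2, 6.1, 8.0, 5.5, 7.8, 6.4],
}

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Print the upper triangle of the inter-correlation matrix.
names = list(scores)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        print(f"r({a}, {b}) = {pearson(scores[a], scores[b]):.2f}")
```

On the actual dataset this loop would reproduce the 0.01, 0.05, and 0.59 coefficients discussed above.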
- Consider a new predictor that is formed by adding standardized scores from Comm1 and Decide. Report the mean and standard deviation of this new measure. What is the validity of this new predictor? What type of validation strategy have you used here? Was it necessary to standardize these measures before adding them together? Why or why not?
|Question|Answer|
|---|---|
|New predictor (Comm1 + Decide)|18.75413462|
|Validity of new predictor|0.57|
|Validation strategy|Criterion-related validity, because the strategy demonstrates a correlation between the new predictor and the criterion performance ratings obtained from managers (Phelan, 2017).|
|Necessity of standardization|Yes. The two tests are scored on different scales, so each must be converted to a common metric (mean 0, SD 1) before adding; otherwise the test with the larger raw-score variance would dominate the composite.|
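Forming the composite works as follows: convert each test to z-scores, then add the pairs. A minimal sketch with hypothetical Comm1 and Decide scores (the real columns are not reproduced here):

```python
import statistics

# Hypothetical raw scores on two different scales; illustrative only.
comm1 = [62, 55, 70, 48, 66, 59]
decide = [7.2, 6.1, 8.0, 5.5, 7.8, 6.4]

def zscores(data):
    """Standardize to mean 0, SD 1 (sample SD)."""
    m, s = statistics.mean(data), statistics.stdev(data)
    return [(x - m) / s for x in data]

# Standardizing first puts both tests on the same metric, so neither
# dominates the sum merely because of its raw-score scale.
composite = [a + b for a, b in zip(zscores(comm1), zscores(decide))]

print("composite mean:", round(statistics.mean(composite), 3))
print("composite SD:  ", round(statistics.stdev(composite), 3))
```

Because each standardized component has mean 0 and SD 1, the composite mean is 0 by construction and its SD is √(2(1 + r)), where r is the Comm1–Decide correlation; this is also why standardizing before adding was necessary.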
- What is your overall evaluation of this selection system? Cite statistical evidence, together with relevant standards reported in the text, in support of your conclusions.
|Overall evaluation of the selection system|
|---|
|Comm1 and Decide show a strong relationship with each other and agree closely with the criterion measure. This is supported by the substantial correlation of 0.59 between the two assessment tools, the highest value in the correlation table, which meets the standards for validity. The two predictors also correlate well with the criterion measure, at 0.56 (Comm1) and 0.47 (Decide), so the two exercises can be used as reliable evaluation tools. The knowledge test forms, by contrast, recorded a reasonably good correlation of 0.57 with each other, but that figure falls short of the reliability standard expected of parallel forms, and the forms were not consistent with the other measures such as Comm1 and Decide. The low correlations show that the knowledge tests cannot be relied upon as practical assessment tools (Trochim, 2006). OP1 consistently recorded very low correlations with all other measures, including the final criterion measure, so it is not a reliable evaluation tool either. In conclusion, Comm1 and Decide have proved to be valid evaluation references that can provide consistent, reliable information when hiring trainees.|
- Concisely explain why the validity coefficients obtained in field data may differ from the values obtained in Table 8.3 in the text.
a. A person’s physical or psychological state can significantly influence the final score on an assessment test. For instance, a candidate’s level of motivation, anxiety, or fatigue at the time of testing affects the score, and such factors can shift the validity coefficient by a significant margin, so they have to be taken into consideration (Disha, 2016).
b. An individual’s performance can also be influenced by the testing environment in the field. Conditions such as humidity, room temperature, noise, and lighting can affect the final scores, as can the nature of the supervision or the behavior of the administrator. The validity of the assessment tools therefore depends significantly on environmental factors, which cannot be neglected.
c. Many knowledge tests, parallel forms for instance, can take different versions that are intended to check the same skills. It is difficult to standardize the versions perfectly, so an individual may perform better on one version than another. If parallel forms are not equivalent in all respects, field results may not tally with the values reported in the text (Disha, 2016).
Disha, M. (2016, April 2). Determining reliability of a test. Retrieved from Your Article Library.
Phelan, C. (2017). Exploring reliability in academic assessment. Retrieved from UNI: https://chfasoa.uni.edu/reliabilityandvalidity.htm
Trochim, W. (2006, June 1). Types of reliability. Retrieved from Web Center for Social Research Methods: https://www.socialresearchmethods.net/kb/reltypes.php