Analyzing Assessment Data
Assessment Description
Data from assessments can be used to determine if learners are meeting course or learning outcomes. Assessments can be utilized in many ways, such as learner practice, learner self-assessment, determining readiness, determining grades, etc. The purpose of this assignment is to analyze sample test statistics to determine if learning has taken place.
To address the questions below in this essay assignment, you will need to use the information from your textbook chapter readings and the data provided in the “Sample Item Analysis” resource.
In a 1,000-1,250 word essay, respond to the following questions:
- Explain what reliability is and whether this test is reliable based on the “Sample Item Analysis” resource. What evidence supports your answer?
- What trends are seen in the raw scores? How would an instructor use this information?
- What is the range for this sample? What information does the range provide and why is it important?
- What information does the standard error of measurement provide? Does the test have a small or large standard error of measurement? How would an instructor use this information?
- Explain the process of analyzing individual items once an instructor has analyzed basic concepts of measurement.
- If one of the questions on the exam had a p-value of .100, would it be a best practice to eliminate the item? Justify your answer.
- If one of the questions on the exam has a negative PBI for the correct option and one or more of the distractors have a positive PBI, what information does this give the instructor? How would you recommend that the instructor adjust this item?
- Based on the “NUR-648E Sample Item Analysis” resource, what steps would you take to improve learning?
Solution
Analyzing Assessment Data
- Explain what reliability is and whether this test is reliable based on the “Sample Item Analysis” resource. What evidence supports your answer?
Reliability in research and statistics refers to how consistently a test or measure produces the same result when it is repeated under the same circumstances. If the same measurement is obtained every time the test is used, the instrument is reliable and can be applied with confidence. Based on the Sample Item Analysis, this test does not appear to be reliable. The evidence is that performance differs widely from one item to the next. For instance, on one item 67 examinees answered correctly and only 1 answered incorrectly, while on another item 27 answered correctly and 41 answered incorrectly. A reported reliability coefficient such as KR-20, where available, would provide more direct evidence than item-level counts alone.
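As a rough illustration of how a reliability coefficient is obtained, the sketch below computes KR-20 for dichotomously scored items. The response matrix is hypothetical and is not taken from the Sample Item Analysis; the function name `kr20` is also an assumption made for this example, and the population variance is used here although some texts use the sample variance.

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 for rows of 0/1 item scores (one row per examinee).

    Assumes at least two items and some variation in total scores.
    """
    n_people = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    var_total = pvariance(totals)  # population variance of total scores

    # Sum of p * q over items, where p is the proportion answering correctly.
    pq_sum = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_people
        pq_sum += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - pq_sum / var_total)


# Hypothetical responses: 5 examinees, 4 items.
hypothetical = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(hypothetical), 2))  # 0.8
```

Values closer to 1 indicate more consistent measurement; what counts as acceptable depends on the stakes of the test.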
- What trends are seen in the raw scores? How would an instructor use this information?
Performance fluctuates widely: on some items nearly all students answer correctly, while on others most answer incorrectly. An instructor can use this information to improve either the instruction methods or the assessment methods, since one of the two is usually the source of the problem (McDonald, 2018; Oermann & Gaberson, 2014).
- What is the range for this sample? What information does the range provide and why is it important?
The range of the sample is found by subtracting the minimum score from the maximum score, which gives 70 - 51 = 19. The range shows the spread of the scores in the distribution. It is important because it is a simple measure of variability among the test takers, that is, how much they differ in their ability to score points. The instructor can use this descriptive statistic to improve instruction in particular areas and to concentrate on particular learners.
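The range calculation can be sketched in a few lines. Only the maximum (70) and minimum (51) come from the essay; the other scores in the list are hypothetical placeholders.

```python
def score_range(scores):
    """Range of a set of scores: maximum minus minimum."""
    return max(scores) - min(scores)


# Hypothetical scores; only 70 and 51 are taken from the essay.
scores = [70, 66, 63, 60, 58, 51]
print(score_range(scores))  # 19
```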
- What information does the standard error of measurement provide? Does the test have a small or large standard error of measurement? How would an instructor use this information?
The standard error of measurement provides information about the spread of observed test scores around a true score. It essentially indicates the spread of measurement errors and is therefore closely tied to the reliability of a test (Frey, 2018). The standard error of measurement (SEm) is the standard deviation multiplied by the square root of one minus the reliability coefficient: SEm = SD × √(1 - r). In this case, the standard deviation is 4.30, so the SEm can range from 0 (perfect reliability) to 4.30 (no reliability), depending on the reliability coefficient reported for the test. Dividing the standard deviation by the square root of the sample size (4.30 divided by 8.25, or 0.52) gives the standard error of the mean instead, which describes the precision of the class average and not of an individual score. If the test is as unreliable as the item-level evidence above suggests, its SEm will be large relative to the standard deviation. An instructor would use this information to fine-tune the test or assessment method and make it more reliable, and to read each score as a band of plausible values rather than an exact figure. This is because a perfectly reliable test gives a standard error of measurement of zero, while a completely unreliable one gives a value equal to or close to the standard deviation (Frey, 2018).
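The sketch below contrasts two quantities that are easy to confuse: the standard error of measurement, conventionally SD × √(1 - r), and the standard error of the mean, SD / √n. The standard deviation (4.30) and sample size (68) come from the essay; the reliability values in the loop are hypothetical, since no reliability coefficient is quoted here.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEm = SD * sqrt(1 - r), where r is the test's reliability coefficient."""
    return sd * math.sqrt(1.0 - reliability)


def standard_error_of_mean(sd, n):
    """Standard error of the mean = SD / sqrt(n); precision of the class average."""
    return sd / math.sqrt(n)


sd, n = 4.30, 68  # values quoted in the essay

print(round(standard_error_of_mean(sd, n), 2))  # 0.52

# Hypothetical reliability coefficients, for illustration only.
for r in (0.0, 0.5, 0.9, 1.0):
    print(r, round(standard_error_of_measurement(sd, r), 2))
```

With r = 1.0 the SEm is 0, and with r = 0.0 it equals the standard deviation, which matches the two extremes described above.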
- Explain the process of analyzing individual items once an instructor has analyzed basic concepts of measurement.
The process of individual item analysis involves examining the responses students gave to each question. The aim is to determine the quality of those items as well as that of the whole test. Once the overall test statistics have been reviewed, the instructor assesses each item that makes up the examination, typically looking at its difficulty (p-value), its discrimination (such as the point biserial index), and how its distractors performed.
- If one of the questions on the exam had a p-value of .100, would it be a best practice to eliminate the item? Justify your answer.
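In item analysis, the p-value is the item difficulty index: the proportion of examinees who answered the item correctly. A p-value of .100 means that only 10% of the examinees answered correctly, so the item is very difficult. This should not be confused with the p-value used in hypothesis testing, where a value below 0.001 shows very strong evidence against the null hypothesis and a value equal to or greater than 0.1 shows insufficient evidence for rejecting it (Yildirim, 2020). It would not be best practice to eliminate the item on the p-value alone. The instructor should first review it: check that the answer key is correct, that the wording is clear, that the content was taught, and how well the item discriminates (its PBI). Elimination is justified if that review shows the item is flawed.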
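In item analysis, an item's p-value is usually read as its difficulty index, the proportion of examinees who answered correctly. A minimal sketch, using the correct and incorrect counts quoted earlier in the essay (67 of 68 and 27 of 68); the function name is an assumption made for this example.

```python
def item_difficulty(correct, total):
    """Item difficulty index: proportion of examinees answering correctly."""
    return correct / total


# Counts quoted in the essay: 67 correct / 1 incorrect, and 27 correct / 41 incorrect.
easy_item = item_difficulty(67, 67 + 1)
hard_item = item_difficulty(27, 27 + 41)

print(round(easy_item, 3))  # 0.985
print(round(hard_item, 3))  # 0.397
```

On this scale, a p-value of .100 sits well below both examples: only one examinee in ten answered the item correctly.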
- If one of the questions on the exam has a negative PBI for the correct option and one or more of the distractors have a positive PBI, what information does this give the instructor? How would you recommend that the instructor adjust this item?
A negative point biserial index (PBI) for the correct option indicates that examinees who performed poorly on the test overall tended to answer the item correctly. A positive PBI on a distractor shows that students who performed well tended to select it (Schoening, n.d.). Taken together, a negative PBI for the correct option and a positive PBI for a distractor tell the instructor that high-performing students are getting the item wrong, whereas low-performing students are getting it right. The instructor should first check whether the item was miskeyed and then rewrite it clearly, because this pattern usually arises when the question is poorly written or cannot be understood well.
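The pattern described above can be reproduced with a small, entirely hypothetical data set. The sketch computes a point biserial correlation between choosing an option (1 or 0) and the total test score; the scores and choices are invented for illustration, and item-analysis software may use a slightly different variant (for example, excluding the item from the total).

```python
import math
from statistics import mean, pstdev


def point_biserial(selected, totals):
    """Point biserial between a 0/1 'chose this option' flag and total scores."""
    chose = [t for s, t in zip(selected, totals) if s == 1]
    did_not = [t for s, t in zip(selected, totals) if s == 0]
    p = len(chose) / len(totals)  # proportion choosing the option
    q = 1 - p
    return (mean(chose) - mean(did_not)) / pstdev(totals) * math.sqrt(p * q)


# Hypothetical total scores, highest to lowest.
totals = [70, 68, 65, 60, 55, 51]
chose_key = [0, 0, 0, 1, 1, 1]         # low scorers picked the keyed answer
chose_distractor = [1, 1, 1, 0, 0, 0]  # high scorers picked a distractor

print(point_biserial(chose_key, totals) < 0)         # True
print(point_biserial(chose_distractor, totals) > 0)  # True
```

A negative value for the keyed answer together with a positive value for a distractor is the signal that the item should be checked for miskeying or ambiguous wording.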
- Based on the “NUR-648E Sample Item Analysis” resource, what steps would you take to improve learning?
Based on the resource that has been used in this exercise (the NUR-648E Sample Item Analysis), the steps that could be taken to improve learning include the following:
- Give regular formative assessments to help determine the suitability of test items well before the summative assessment.
- Continuously refine test items based on statistics such as the point biserial index (PBI).
- Use statistics such as the standard error of measurement to gauge the spread of error in test scores. This makes it easier for instructors to identify which students are genuinely struggling and then concentrate on bringing them up to par with the others.
- Continuously check the reliability of tests so that they assess the competencies of the students or examinees consistently.
References
Frey, B. B. (2018). Standard error of measurement. The SAGE Encyclopedia of Educational Research, Measurement, and Evaluation. https://dx.doi.org/10.4135/9781506326139.n658
McDonald, M. E. (2018). The nurse educator’s guide to assessing learning outcomes (4th ed.). Jones & Bartlett Learning.
Oermann, M. H., & Gaberson, K. B. (2014). Evaluation and testing in nursing education (4th ed.). Springer Publishing Company.
Schoening, A. (n.d.). Interpreting exam performance: What do those stats mean anyway? https://my.methodistcollege.edu/ICS/icsfs/PtT-Exam_Analysis_and_Recommendations.pdf?target=8cb58375-f4a7-4a07-9ee9-4439f46f2648
Yildirim, S. (2020). P value – Explained. https://towardsdatascience.com/p-value-explained-c7f5547c0562