Using generalizability theory to investigate the reliability of peer assessment

Authors

  • Gülşen Taşdelen Teker, Sakarya University
  • Melek Gülşah Şahin, Gazi University
  • Kemal Baytemir, Amasya University

Keywords:

peer assessment, generalizability theory, reliability, peer rater

Abstract

This study sought to determine the effectiveness of peer assessment, which plays an important role in measurement and evaluation. To this end, a performance task, one of the alternative assessment techniques, was scored with a rubric prepared by the researchers. The study was designed as basic research, and the study group consisted of 41 sophomore students and their instructor. Three of the 41 students acted as raters and, together with the instructor, rated the performances of their 38 peers. The data were analyzed using a fully crossed two-facet design (s x t x r) of generalizability theory in three steps: G studies for the peer ratings and for the peer-instructor ratings, and a D study for the peer ratings. According to the G studies, the reliability coefficients obtained from the peer ratings and the peer-instructor ratings were quite high (0.86 and 0.82, respectively). According to the D study of the peer ratings, two peer raters are enough to obtain a high reliability coefficient. Based on these results, it is suggested that peer assessment, which supports students' learning and decision-making processes, be used more often in education systems.
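
The D-study logic summarized in the abstract can be illustrated with a minimal sketch. It assumes that s denotes students (the object of measurement), that t and r are random facets (r being raters; the abstract does not spell out t), and that the coefficient of interest is the standard relative G coefficient for a fully crossed two-facet design. The function names and the variance components below are hypothetical and chosen only for illustration; they are not the estimates reported in the study.

```python
from typing import Dict, Iterable


def relative_g_coefficient(var_s: float, var_st: float, var_sr: float,
                           var_str_e: float, n_t: int, n_r: int) -> float:
    """Relative G coefficient for a fully crossed s x t x r design.

    The object of measurement is s; t and r are treated as random facets.
    """
    # Relative error variance: every interaction that involves s,
    # divided by the number of conditions the score is averaged over.
    relative_error = var_st / n_t + var_sr / n_r + var_str_e / (n_t * n_r)
    return var_s / (var_s + relative_error)


def d_study(components: Dict[str, float], n_t: int,
            rater_counts: Iterable[int]) -> Dict[int, float]:
    """Project the G coefficient for alternative numbers of raters."""
    return {n_r: relative_g_coefficient(n_t=n_t, n_r=n_r, **components)
            for n_r in rater_counts}


# HYPOTHETICAL variance components, for illustration only.
# These are NOT the estimates reported in the study.
components = {"var_s": 0.50, "var_st": 0.10, "var_sr": 0.05, "var_str_e": 0.30}

projection = d_study(components, n_t=5, rater_counts=[1, 2, 3, 4])
for n_r, g in projection.items():
    print(f"raters={n_r}  G={g:.3f}")
```

With any fixed set of components, the projected coefficient rises as raters are added, but with diminishing returns. This is the kind of comparison a D study uses to decide how many raters are enough.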

Author Biographies

Gülşen Taşdelen Teker, Sakarya University

Ph.D., Sakarya University, Faculty of Education

Melek Gülşah Şahin, Gazi University

Ph.D., Gazi University, Faculty of Education

Kemal Baytemir, Amasya University

Assistant Professor, Amasya University, Faculty of Education

Published

2016-12-20

How to Cite

Taşdelen Teker, G., Şahin, M. G., & Baytemir, K. (2016). Using generalizability theory to investigate the reliability of peer assessment. Journal of Human Sciences, 13(3), 5574–5586. Retrieved from https://www.j-humansciences.com/ojs/index.php/IJHS/article/view/4155

Section

Educational Evaluation, Measurement and Research