Corpus (MEBET)

This corpus grew out of a roughly year-long collaboration with Dr. Xiaoli YU, Yağmur Su KOLSAL, Beyza DİLBAZ, and Halide İSLAMOĞLU.

First, we created the high school textbook corpus between 2019 and 2020. I then extended the corpus down to grade 5, and I am now extending it further, down to grade 2. The corpus will soon be freely available to all. It is distributed in raw .txt format, which means the texts are not processed. If you use the corpus in projects, studies, or publications, please cite the following two studies:

Gedik, Tan Arda. 2022. An Analysis of Lexicogrammatical Development in English Textbooks in Turkey: A Usage-Based Construction Grammar Approach. Explorations in English Language and Linguistics 9(1).
Gedik, Tan Arda & Yağmur Su Kolsal. 2022. A Corpus Based Approach to English University Entrance Exams and English High School Textbooks in Turkey. TAPSLA 8(1). 157–176.

The corpus mainly consists of student's books, workbooks (when available), and listening tracks (when available). Please read the documentation to find out more about the corpus. 
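
Because the files are raw, unprocessed .txt, a first step for most studies will be reading and tokenizing them. The following is a minimal Python sketch of that step, not part of the corpus documentation. The folder name `mebet_corpus`, the UTF-8 encoding, and the helper names `tokenize` and `count_tokens` are assumptions made for illustration, and the simple regular expression stands in for whatever tokenizer a given study actually needs.

```python
from collections import Counter
from pathlib import Path
import re

# Hypothetical location of the unpacked corpus; adjust to the real layout.
CORPUS_DIR = Path("mebet_corpus")

# Simple word pattern: runs of letters, optionally joined by an internal
# apostrophe or hyphen (e.g. "student's", "well-known"). Digits are dropped.
TOKEN_RE = re.compile(r"[a-z]+(?:['-][a-z]+)*")


def tokenize(text):
    """Lowercase the text and return a list of word tokens."""
    return TOKEN_RE.findall(text.lower())


def count_tokens(corpus_dir):
    """Return word frequencies over all .txt files under corpus_dir."""
    counts = Counter()
    for path in sorted(Path(corpus_dir).rglob("*.txt")):
        # Assumes UTF-8; errors="replace" avoids crashing on stray bytes.
        text = path.read_text(encoding="utf-8", errors="replace")
        counts.update(tokenize(text))
    return counts


if __name__ == "__main__":
    freqs = count_tokens(CORPUS_DIR)
    print("tokens:", sum(freqs.values()))
    print("types:", len(freqs))
    print(freqs.most_common(10))
```

Searching recursively with `rglob` means the sketch does not depend on how the grades, student's books, workbooks, and listening tracks are arranged into subfolders.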

We hope that this corpus will inspire future studies on textbooks.