The Saudi Learners of English Corpus (SLEC): Design, Research Potential and Applications
DOI: https://doi.org/10.59992/IJESA.2025.v4n4p5
Keywords: English as a foreign language, Learner corpus, Learner corpus design, Saudi learners of English
Abstract
The design and use of learner corpora is a rapidly developing branch of corpus linguistics. The chief merit of learner corpora is that they allow evidence-based methods to be used in applied linguistics. This article introduces the Saudi Learners of English Corpus (SLEC), composed of writing by undergraduate students in Saudi universities. The SLEC is presented as the first Saudi written English as a Foreign Language (EFL) corpus that will eventually be made publicly available. It comprises 175,592 words produced by 741 EFL learners in Saudi Arabia, all of whom have studied English for nine years in Saudi public schools. Their proficiency ranges from beginner to intermediate. The corpus is designed to include a variety of metadata describing features of both the texts and the learners. The article presents the contents and design criteria of the SLEC, discussing in detail the rationale for the corpus, the participants involved, the corpus size, the materials included, the method of data collection, and the corpus metadata and architecture. Pedagogical implications and potential future research are also addressed.