16th Workshop on Innovative Use of NLP for Building Educational Applications


Quick Info
Co-located with: EACL 2021
Location: Online
Deadline: Monday, January 18, 2021 11:59pm UTC-12 (anywhere on earth); extended to Monday, January 25, 2021 11:59pm UTC-12
Date: Tuesday, April 20, 2021
Organizers: Jill Burstein, Andrea Horbach, Ekaterina Kochmar, Ronja Laarmann-Quante, Claudia Leacock, Nitin Madnani, Ildikó Pilán, Helen Yannakoudakis, and Torsten Zesch
Contact: bea.nlp.workshop@gmail.com

Workshop Description

In its 16th year, the BEA Workshop continues to be a leading venue for NLP innovation for educational applications, and one of the largest one-day workshops in the ACL community. The workshop’s continuous growth illustrates an alignment between societal need and technological advances. The COVID-19 pandemic brings a necessary expansion for remote learning across the educational space from early primary education through secondary, adult education, and for workforce learning. BEA2021 will have a sub-theme and will welcome research that can support remote and hybrid learning contexts.

NLP capabilities now support an array of activities for learning domain knowledge, including writing, speaking, reading, science, and mathematics, and the related intra- (e.g., self-confidence) and inter-personal (e.g., collaboration) domains that support achievement across learning domains. Within these domains, the community continues to develop and deploy innovative NLP approaches for use in educational settings. In the writing and speech domains, automated writing evaluation (AWE) and speech scoring applications, respectively, are commercially deployed for high-stakes assessments and for formative, instructional assessments in classrooms and online learning contexts. The current educational and assessment landscape in K-12, higher education, and adult learning (in academic and workplace settings) fosters a strong interest in technologies that yield user log data and analytics that support proficiency measures for complex constructs across learning domains. For writing, there is a focus on innovation that supports writing tasks requiring source use, argumentative discourse, and factual content accuracy. For speech, there is an interest in advancing automated scoring to include the evaluation of discourse and content features in responses to spoken assessments. General advances in speech technology have renewed interest in spoken dialog and multimodal systems for instruction and assessment, for instance, for workplace interviews and simulated teaching environments. The explosive growth of mobile applications for game-based and simulation applications for instruction and assessment is another place where NLP can play a large role, especially in language learning. Due to the immediate need for greater online learning and resources, we expect that NLP technology for education will broaden its reach and expand into the essential personalized learning space with text, speech, and multimodal applications.

The 16th BEA workshop will have oral presentation sessions and a large poster session to maximize the amount of original work presented. The workshop will continue to expose the NLP community to technologies that identify novel opportunities for the use of NLP in education in English and in languages other than English. The workshop will solicit both full and short papers for either oral or poster presentation. We will solicit papers that incorporate NLP methods, including, but not limited to: automated scoring of open-ended textual and spoken responses; game-based instruction and assessment; educational data mining; intelligent tutoring; peer review; grammatical error detection; learner cognition; spoken dialog; multimodal applications; tools for teachers & test developers; and use of corpora. Research that incorporates NLP methods for use with mobile and game-based platforms will be of special interest. We also plan to hold an expert panel to address how NLP technology can support urgent issues in education brought on by the COVID-19 pandemic.

Suggested topics are listed below. We also welcome and urge submission of work in any of the suggested topics that addresses personalized learning. Papers with research that specifically addresses the urgent needs in education due to the pandemic, but also makes sense in post-pandemic educational contexts, are especially welcome.

Automated scoring/evaluation for written student responses

  • Content analysis for scoring/assessment
  • Grammatical error detection and correction
  • Argumentation, discourse, sentiment, stylistic analysis, & non-literal language
  • Plagiarism detection
  • Non-traditional genres (beyond essays)
  • Interest, motivation, and values in writing tasks

Intelligent Tutoring (IT), Collaborative Learning Environments

  • Educational Data Mining: Collection of user log data from educational applications
  • Game-based learning
  • Multimodal communication (including dialog systems) between students and computers
  • Knowledge representation & concept visualization in learning systems

Learner cognition

  • Assessment of learners’ language and cognitive skill levels
  • Systems that detect and adapt to learners’ cognitive or emotional states
  • Tools for learners with special needs

Use of corpora in educational tools

  • Data mining of learner and other corpora for tool building
  • Annotation standards and schemas / annotator agreement

Tools and applications for classroom teachers and/or test developers

  • NLP tools for second and foreign language learners
  • Semantic-based access to instructional materials to identify appropriate texts
  • Tools that automatically generate test questions & tools for curriculum development
  • Processing of and access to lecture materials across topics and genres
  • Adaptation of instructional text to individual learners’ grade levels


  1. Note that BEA 2021 will be fully virtual just like EACL 2021.
  2. All times are in 24-hour format, CET.
April 20, 2021 (All times are in CET; For EDT, subtract 6 hours)
10:10–10:20 Opening Remarks
10:20–11:00 Paper Session 1
10:20–10:40 Parsing Argumentative Structure in English-as-Foreign-Language Essays. Jan Wira Gotama Putra, Simone Teufel, and Takenobu Tokunaga.
10:40–11:00 Employing distributional semantics to organize task-focused vocabulary learning. Haemanth Santhi Ponnusamy and Detmar Meurers.
11:00–11:30 Break
11:30–12:10 Paper Session 2
11:30–11:50 Negation Scope Resolution for Chinese as a Second Language. Mengyu Zhang, Weiqi Wang, Shuqiao Sun, and Weiwei Sun.
11:50–12:10 On the Application of Transformers for Estimating the Difficulty of Multiple-choice Questions from Text. Luca Benedetto, Giovanni Aradelli, Paolo Cremonesi, Andrea Cappelli, Andrea Giussani, and Roberto Turrin.
12:10–13:30 Poster Session 1
  Character Set Construction for Chinese Language Learning. Chak Yan Yeung and John Lee.
  Training and Domain Adaptation for Supervised Text Segmentation. Goran Glavaš, Ananya Ganesh, and Swapna Somasundaran.
  Data Strategies for Low-Resource Grammatical Error Correction. Simon Flachs, Felix Stahlberg, and Shankar Kumar.
  Towards a Data Analytics Pipeline for the Visualisation of Complexity Metrics in L2 writings. Thomas Gaillat, Anas Knefati, and Antoine Lafontaine.
  Estonian as a Second Language Teacher’s Tools. Tiiu Üksik, Jelena Kallas, Kristina Koppel, Katrin Tsepelina, and Raili Pool.
  Assessing Grammatical Correctness in Language Learning. Anisia Katinskaia and Roman Yangarber.
  Automated Classification of Written Proficiency Levels on the CEFR-Scale through Complexity Contours and RNNs. Elma Kerz, Daniel Wiechmann, Yu Qiao, Emma Tseng, and Marcus Ströbel.
  Using Linguistic Features to Predict the Response Process Complexity Associated with Answering Clinical MCQs. Victoria Yaneva, Daniel Jurich, Le An Ha, and Peter Baldwin.
13:30–15:00 Break
15:00–15:40 Paper Session 3
15:00–15:20 Synthetic Data Generation for Grammatical Error Correction with Tagged Corruption Models. Felix Stahlberg and Shankar Kumar.
15:20–15:40 "Sharks are not the threat humans are": Argument Component Segmentation in School Student Essays. Tariq Alhindi and Debanjan Ghosh.
15:40–16:50 Poster Session 2
  Text Simplification by Tagging. Kostiantyn Omelianchuk, Vipul Raheja, and Oleksandr Skurzhanskyi.
  Broad Linguistic Complexity Analysis for Greek Readability Classification. Savvas Chatzipanagiotidis, Maria Giagkou, and Detmar Meurers.
  Identifying Negative Language Transfer in Learner Errors Using POS Information. Leticia Farias Wanderley and Carrie Demmans Epp.
  Document-level Grammatical Error Correction. Zheng Yuan and Christopher Bryant.
  Essay Quality Signals as Weak Supervision for Source-based Essay Scoring. Haoran Zhang and Diane Litman.
  Automatically Generating Cause-and-Effect Questions from Passages. Katherine Stasaski, Manav Rathod, Tony Tu, Yunfang Xiao, and Marti A. Hearst.
  Interventions Recommendation: Professionals' Observations Analysis in Special Needs Education. Javier Muñoz and Felipe Bravo-Marquez.
  C-Test Collector: A Proficiency Testing Application to Collect Training Data for C-Tests. Christian Haring, Rene Lehmann, Andrea Horbach, and Torsten Zesch.
  Virtual Pre-Service Teacher Assessment and Feedback via Conversational Agents. Debajyoti Datta, Maria Phillips, James P. Bywater, Jennifer Chiu, Ginger S. Watson, Laura Barnes, and Donald Brown.
16:50–17:50 Discussion Panel
17:50–18:00 Closing Remarks


In 2021, we will be hosting a virtual 1-hour panel on New Challenges for Educational Technology in the Time of the Pandemic that will cover such questions as (1) What areas / subjects / aspects of education will be most affected by the pandemic? (2) How can AI / educational technology help tackle these issues? (3) What long-term consequences will the current situation have on education? We will have 4 panelists:

  1. Prof. Dr Gaëlle Molinari (Faculty of Psychology, Swiss Distance University Institute)
  2. Prof. Carolyn Rosé (Language Technologies Institute and HCI Institute, Carnegie Mellon University)
  3. Prof. Diane Litman (Learning Research & Development Center and Department of Computer Science, University of Pittsburgh)
  4. Burr Settles (Director of Research, Duolingo)

Important Dates

Note: these dates are still preliminary and may change. All deadlines are 11:59pm UTC-12 (anywhere on earth).

  • Submission Deadline: Monday, January 25, 2021 (extended from Monday, January 18, 2021)
  • Notification of Acceptance: Thursday, February 25, 2021 (moved from Thursday, February 18, 2021)
  • Camera-ready Papers Due: Thursday, March 4, 2021
  • Workshop: Tuesday, April 20, 2021

Attending the Workshop

The BEA 2021 workshop is co-located with the 2021 Conference of the European Chapter of the Association for Computational Linguistics or EACL 2021. To attend the workshop, you need to register on the conference website and choose the BEA workshop on the registration form.

EACL 2021 workshop poster sessions will be hosted by Virtual Chair and will take place on the gather.town platform. The poster sessions for BEA will take place in Room 3 on the EACL 2021 gather.town. For more details, refer to the attendee guide.

The Virtual Chair help desk staff will be monitoring the support inbox during the event, so any technical or troubleshooting questions from attendees should be sent to help+eacl-2021@virtualchair.net instead of the workshop organizers to ensure the fastest possible response.

Submission Information

We will be using the EACL Submission Guidelines for the BEA Workshop this year. Authors are invited to submit a full paper of up to eight (8) pages of content, plus unlimited references; final versions of long papers will be given one additional page of content (up to 9 pages) so that reviewers’ comments can be taken into account. We also invite short papers of up to four (4) pages of content, plus unlimited references. Upon acceptance, short papers will be given five (5) content pages in the proceedings. Authors are encouraged to use this additional page to address reviewers’ comments in their final versions. Papers which describe systems are also invited to give a demo of their system. If you would like to present a demo in addition to presenting the paper, please make sure to select either “full paper + demo” or “short paper + demo” under “Submission Category” in the START submission page.

Previously published papers cannot be accepted. The submissions will be reviewed by the program committee. As reviewing will be blind, please ensure that papers are anonymous. Self-references that reveal the author’s identity, e.g., “We previously showed (Smith, 1991) …”, should be avoided. Instead, use citations such as “Smith previously showed (Smith, 1991) …”.

We have also included conflict of interest in the submission form. You should mark all potential reviewers who have been authors on the paper, are from the same research group or institution, or who have seen versions of this paper or discussed it with you.

We will be using the START conference system to manage submissions: https://www.softconf.com/eacl2021/bea2021/

Double Submission Policy

We will follow the official ACL double-submission policy. Specifically:

Papers being submitted both to BEA and another conference or workshop must:

  • Note on the title page the other conference or workshop to which they are being submitted.
  • State on the title page that if the authors choose to present their paper at BEA (assuming it was accepted), then the paper will be withdrawn from other conferences and workshops.

Organizing Committee

Program Committee

  • Tazin Afrin, University of Pittsburgh
  • David Alfter, University of Gothenburg
  • Jason Angel, Instituto Politécnico Nacional
  • Piper Armstrong, Brigham Young University
  • Timo Baumann, Universität Hamburg
  • Lee Becker, Pearson
  • Kay Berkling, DHBW Cooperative State University Karlsruhe
  • Chris Bryant, University of Cambridge
  • Guanliang Chen, Monash University
  • Mei-Hua Chen, Department of Foreign Languages and Literature
  • Leshem Choshen, Hebrew University of Jerusalem
  • Mark Core, University of Southern California
  • Scott Crossley, Georgia State University
  • Orphée De Clercq, LT3, Ghent University
  • Kordula De Kuthy, Tübingen University
  • Carrie Demmans Epp, University of Alberta
  • Ann Devitt, Trinity College, Dublin
  • Mariano Felice, University of Cambridge
  • Michael Flor, Educational Testing Service
  • Thomas François, Université catholique de Louvain
  • Jennifer-Carmen Frey, Eurac Research
  • Michael Gamon, Microsoft Research
  • Cyril Goutte, National Research Council Canada
  • Masato Hagiwara, Octanove Labs
  • Jiangang Hao, Educational Testing Service
  • Marti Hearst, University of California, Berkeley
  • Trude Heift, Simon Fraser University
  • Heiko Holz, LEAD Graduate School & Research Network
  • Chung-Chi Huang, Frostburg State University
  • Yi-Ting Huang, Academia Sinica
  • Radu Tudor Ionescu, University of Bucharest
  • Elma Kerz, RWTH Aachen University
  • Fazel Keshtkar, St. John’s University
  • Mamoru Komachi, Tokyo Metropolitan University
  • Ji-Ung Lee, UKP Lab, TU Darmstadt
  • Diane Litman, University of Pittsburgh
  • Zitao Liu, TAL Education Group
  • Peter Ljunglöf, University of Gothenburg; Chalmers University of Technology
  • Anastassia Loukina, Educational Testing Service
  • Fabiana MacMillan, Rosetta Stone
  • Lieve Macken, Ghent University
  • Montse Maritxalar, University of the Basque Country
  • James Martin, University of Colorado Boulder
  • Sandeep Mathias, IIT Bombay
  • Ditty Mathew, IIT Madras
  • Stephen Mayhew, Duolingo
  • Julie Medero, Harvey Mudd College
  • Detmar Meurers, University of Tübingen
  • Natawut Monaikul, University of Illinois at Chicago
  • Farah Nadeem, University of Washington
  • Diane Napolitano, Associated Press
  • Hwee Tou Ng, National University of Singapore
  • Huy Nguyen, LingoChamp
  • Ulrike Pado, HFT Stuttgart
  • Long Qin, Singsound Inc
  • Mengyang Qiu, University at Buffalo
  • Marti Quixal, University of Tübingen
  • Taraka Rama, University of North Texas
  • Lakshmi Ramachandran, A9 (Amazon)
  • Hanumant Redkar, IIT Bombay
  • Robert Reynolds, Brigham Young University
  • Brian Riordan, Educational Testing Service
  • Alla Rozovskaya, City University of New York
  • Helmer Strik, Radboud University Nijmegen
  • Jan Švec, University of West Bohemia
  • Anaïs Tack, UCLouvain & KU Leuven
  • Yuen-Hsien Tseng, National Taiwan University
  • Shalaka Vaidya, Research assistant
  • Sowmya Vajjala, National Research Council, Canada
  • Giulia Venturi, Institute for Computational Linguistics
  • Tatiana Vodolazova, University of Alicante
  • Elena Volodina, University of Gothenburg, Sweden
  • Shuting Wang, Facebook
  • Zarah Weiss, University of Tübingen
  • Michael White, The Ohio State University; Facebook AI
  • Alistair Willis, Open University, UK
  • Victoria Yaneva, NBME; University of Wolverhampton
  • Seid Muhie Yimam, University of Hamburg
  • Mingzhi Yu, University of Pittsburgh
  • Zheng Yuan, University of Cambridge
  • Klaus Zechner, Educational Testing Service
  • Fabian Zehner, DIPF, Leibniz Institute for Research and Information in Education