Educational NLP at LREC 2022

Proceedings: http://www.lrec-conf.org/proceedings/lrec2022/index.html

Topics: automated item generation (AIG), argument-mining, complexity, corpora, dialogue, discourse, feedback, grammatical error correction (GEC), grammatical error detection (GED), language-learning, programming, readability, speech, text-simplification, vocabulary-acquisition, writing-assistants

  • Generating Questions from Wikidata Triples. Kelvin Han, Thiago Castro Ferreira and Claire Gardent [paper] AIG
  • CEPOC: The Cambridge Exams Publishing Open Cloze dataset. Mariano Felice, Shiva Taslimipoor, Øistein E. Andersen and Paula Buttery [paper] AIG corpora
  • JADE: Corpus for Japanese Definition Modelling. Han Huang, Tomoyuki Kajiwara and Yuki Arase [paper] AIG corpora
  • Argument Similarity Assessment in German for Intelligent Tutoring: Crowdsourced Dataset and First Experiments. Xiaoyu Bai and Manfred Stede [paper] argument-mining corpora
  • Subjective Text Complexity Assessment for German. Laura Seiffe, Fares Kallel, Sebastian Möller, Babak Naderi and Roland Roller [paper] complexity
  • CTAP for Chinese: A Linguistic Complexity Feature Automatic Calculation Platform. Yue Cui, Junhui Zhu, Liner Yang, Xuezhi Fang, Xiaobin Chen, Yujie Wang and Erhong Yang [paper] complexity
  • EXPRES Corpus for A Field-specific Automated Exploratory Study of L2 English Expert Scientific Writing. Ana-Maria Bucur, Madalina Chitez, Valentina Muresan, Andreea Dinca and Roxana Rogobete [paper] complexity corpora
  • Morphological Complexity of Children Narratives in Eight Languages. Gordana Hržica, Chaya Liebeskind, Kristina Š. Despot, Olga Dontcheva-Navratilova, Laura Kamandulytė-Merfeldienė, Sara Košutar, Matea Kramarić and Giedrė Valūnaitė Oleškevičienė [paper] complexity language-learning
  • MOTIF: Contextualized Images for Complex Words to Improve Human Reading. Xintong Wang, Florian Schneider, Özge Alacam, Prateek Chaudhury and Chris Biemann [paper] complexity readability
  • RoomReader: A Multimodal Corpus of Online Multiparty Conversational Interactions. Justine Reverdy, Sam O’Connor Russell, Louise Duquenne, Diego Garaialde, Benjamin R. Cowan and Naomi Harte [paper] corpora dialogue
  • The TalkMoves Dataset: K-12 Mathematics Lesson Transcripts Annotated for Teacher and Student Discursive Moves. Abhijit Suresh, Jennifer Jacobs, Charis Harty, Margaret Perkoff, James H. Martin and Tamara Sumner [paper] corpora dialogue
  • Dataset and Baseline for Automatic Student Feedback Analysis. Missaka Herath, Kushan Chamindu, Hashan Maduwantha and Surangika Ranathunga [paper] corpora feedback
  • LeSpell - A Multi-Lingual Benchmark Corpus of Spelling Errors to Develop Spellchecking Methods for Learner Language. Marie Bexte, Ronja Laarmann-Quante, Andrea Horbach and Torsten Zesch [paper] corpora GEC
  • Construction of a Quality Estimation Dataset for Automatic Evaluation of Japanese Grammatical Error Correction. Daisuke Suzuki, Yujin Takahashi, Ikumi Yamashita, Taichi Aida, Tosho Hirasawa, Michitaka Nakatsuji, Masato Mita and Mamoru Komachi [paper] corpora GEC
  • ProQE: Proficiency-wise Quality Estimation dataset for Grammatical Error Correction. Yujin Takahashi, Masahiro Kaneko, Masato Mita and Mamoru Komachi [paper] corpora GEC
  • The Tembusu Treebank: An English Learner Treebank. Luís Morgado da Costa, Francis Bond and Roger V. P. Winder [paper] corpora GED
  • MuLVE, A Multi-Language Vocabulary Evaluation Data Set. Anik Jacobsen, Salar Mohtaj and Sebastian Möller [paper] corpora language-learning
  • LaVA – Latvian Language Learner corpus. Roberts Darģis, Ilze Auziņa, Inga Kaija, Kristīne Levāne-Petrova and Kristīne Pokratniece [paper] corpora language-learning
  • Semi-automatically Annotated Learner Corpus for Russian. Anisia Katinskaia, Maria Lebedeva, Jue Hou and Roman Yangarber [paper] corpora language-learning
  • LeConTra: A Learner Corpus of English-to-Dutch News Translation. Bram Vanroy and Lieve Macken [paper] corpora language-learning
  • Representing the Toddler Lexicon: Do the Corpus and Semantics Matter? Jennifer Weber and Eliana Colunga [paper] corpora language-learning
  • ChiSense-12: An English Sense-Annotated Child-Directed Speech Corpus. Francesco Cabiddu, Lewis Bott, Gary Jones and Chiara Gambi [paper] corpora language-learning
  • Predicting the Proficiency Level of Nonnative Hebrew Authors. Isabelle Nguyen and Shuly Wintner [paper] corpora language-learning
  • The Hebrew Essay Corpus. Chen Gafni, Anat Prior and Shuly Wintner [paper] corpora language-learning
  • Image Description Dataset for Language Learners. Kento Tanaka, Taichi Nishimura, Hiroaki Nanjo, Keisuke Shirai, Hirotaka Kameko and Masatake Dantsuji [paper] corpora language-learning
  • Dataset of Student Solutions to Algorithm and Data Structure Programming Assignments. Fynn Petersen-Frey, Marcus Soll, Louis Kobras, Melf Johannsen, Peter Kling and Chris Biemann [paper] corpora programming
  • NyLLex: A Novel Resource of Swedish Words Annotated with Reading Proficiency Level. Daniel Holmer and Evelina Rennes [paper] corpora readability
  • The Copenhagen Corpus of Eye Tracking Recordings from Natural Reading of Danish Texts. Nora Hollenstein, Maria Barrett and Marina Björnsdóttir [paper] corpora readability
  • Reading Time and Vocabulary Rating in the Japanese Language: Large-Scale Japanese Reading Time Data Collection Using Crowdsourcing. Masayuki Asahara [paper] corpora readability
  • Building Large-Scale Japanese Pronunciation-Annotated Corpora for Reading Heteronymous Logograms. Fumikazu Sato, Naoki Yoshinaga and Masaru Kitsuregawa [paper] corpora speech
  • Klexikon: A German Dataset for Joint Summarization and Simplification. Dennis Aumiller and Michael Gertz [paper] corpora text-simplification
  • Simple TICO-19: A Dataset for Joint Translation and Simplification of COVID-19 Texts. Matthew Shardlow and Fernando Alva-Manchego [paper] corpora text-simplification
  • ALEXSIS: A Dataset for Lexical Simplification in Spanish. Daniel Ferrés and Horacio Saggion [paper] corpora text-simplification
  • CWID-hi: A Dataset for Complex Word Identification in Hindi Text. Gayatri Venugopal, Dhanya Pramod and Ravi Shekhar [paper] corpora text-simplification
  • TallVocabL2Fi: A Tall Dataset of 15 Finnish L2 Learners’ Vocabulary. Frankie Robertson, Li-Hsin Chang and Sini Söyrinki [paper] corpora vocabulary-acquisition
  • Automating Idea Unit Segmentation and Alignment for Assessing Reading Comprehension via Summary Protocol Analysis. Marcello Gecchele, Hiroaki Yamada, Takenobu Tokunaga, Yasuyo Sawaki and Mika Ishizuka [paper] discourse readability
  • Misspelling Semantics in Thai. Pakawat Nakwijit and Matthew Purver [paper] GEC
  • English Language Spelling Correction as an Information Retrieval Task Using Wikipedia Search Statistics. Kyle Goslin and Markus Hofmann [paper] GEC
  • On the Robustness of Cognate Generation Models. Winston Wu and David Yarowsky [paper] GEC
  • Developing a Spell and Grammar Checker for Icelandic using an Error Corpus. Hulda Óladóttir, Þórunn Arnardóttir, Anton Ingason and Vilhjálmur Þorsteinsson [paper] GEC
  • Enriching Grammatical Error Correction Resources for Modern Greek. Katerina Korre and John Pavlopoulos [paper] GEC
  • Automatic Classification of Russian Learner Errors. Alla Rozovskaya [paper] GEC
  • COSMOS: Experimental and Comparative Studies of Concept Representations in Schoolchildren. Jeanne Villaneau and Farida SAID [paper] language-learning
  • Perceived Text Quality and Readability in Extractive and Abstractive Summaries. Julius Monsen and Evelina Rennes [paper] readability
  • FABRA: French Aggregator-Based Readability Assessment toolkit. Rodrigo Wilkens, David Alfter, Xiaoou Wang, Alice Pintard, Anaïs Tack, Kevin P. Yancey and Thomas François [paper] readability
  • AiRO - an Interactive Learning Tool for Children at Risk of Dyslexia. Peter Juel Henrichsen and Stine Fuglsang Engmose [paper] readability
  • Trends, Limitations and Open Challenges in Automatic Readability Assessment Research. Sowmya Vajjala [paper] readability
  • 2nd Workshop on Tools and Resources for People with REAding DIfficulties (READI) [proceedings] readability text-simplification
  • MUSS: Multilingual Unsupervised Sentence Simplification by Mining Paraphrases. Louis Martin, Angela Fan, Éric de la Clergerie, Antoine Bordes and Benoît Sagot [paper] text-simplification
  • HECTOR: A Hybrid TExt SimplifiCation TOol for Raw Texts in French. Amalia Todirascu, Rodrigo Wilkens, Eva Rolin, Thomas François, Delphine Bernhard and Núria Gala [paper] text-simplification
  • One Document, Many Revisions: A Dataset for Classification and Description of Edit Intents. Dheeraj Rajagopal, Xuchao Zhang, Michael Gamon, Sujay Kumar Jauhar, Diyi Yang and Eduard Hovy [paper] writing-assistants