BEA Newsletter #8

Hi all, hope you are having a nice start to the new year. The 8th BEA newsletter contains the following:

  • BEA10 Announcements
  • Upcoming EduNLP Conferences
  • Job Openings (via Hwee Tou Ng)
  • Recent EduNLP publications
  • Resources

I’d like to thank our BEA Newsletter volunteers: Ekaterina Kochmar, Ildiko Pilan, Sowmya V. B. and Helen Yannakoudakis for once again massively assisting in the writing of this newsletter. Thanks!

As always, if you know of any corpora, resources, tools, pubs, conferences, job postings, etc. that would be good to have on the newsletter, please let us know and they’ll go in the next one.

Finally, the organizers of this past month's NLP4CALL Workshop put out a survey to get a handle on the research interests and goals people have for the field. If you have a moment, please fill it out.

Joel & BEA Friends

BEA10 Announcements

We’re getting excited for the 10th edition of the BEA! We are one of only a handful of workshops in the *CL universe that have made it to a tenth edition, which is a wonderful achievement. On top of that, we had a record number of submissions to BEA10, and the workshop looks like it might be one of the largest in its history. We’re looking forward to seeing you in Denver on Thursday, June 04. Some notes for people attending and presenting:

  • You can find the workshop schedule (and thus list of accepted papers) here.

  • We will be having our world-famous post-workshop dinner following the workshop that night. Please consider sticking around Denver another day to attend!

  • For presenters, we will be following the NAACL2015 instructions, so please check them out. The only differences are that 1) short paper oral presentations get 20min (15min content, 5min Q&A), and 2) the presentations will not be video recorded. Instructions are here.

  • We are fortunate to have many sponsors this year. Their support goes towards our workshop t-shirt (free with registration) and defrays the cost of dinner for students. The gold level sponsors are Appen, McGraw-Hill Education/CTB, Educational Testing Service, Grammarly, Turnitin Lightside Labs, Pacific Metrics and Pearson. American Institutes for Research is a silver level sponsor!

  • Yes, we have a whole new design for the BEA T-shirts. You will get a shirt when you go up to register. Go early to get the size you want!

Upcoming EduNLP Conferences and Workshops

There are several EduNLP events in 2015! We’ve broken things into ones where the deadlines have passed but the event is still upcoming, and those in which the deadlines are still open.

Open Deadlines (sorted by conference deadline)

  • Workshop on Arabic Natural Language Processing (WANLP 2015), including a Shared Task on Automatic Arabic Error Correction: http://www.arabic-nlp.net/ (paper submission deadline: May 14; date: July 30; location: Beijing, China)

  • SLaTE Workshop on L1 Teaching, Learning and Technology: https://sites.google.com/site/l1teachingandtechnology/ (extended deadline: May 25; date: September 03; location: Leipzig, Germany)

  • 6th Workshop on Speech and Language Processing for Assistive Technologies (SLPAT): http://www.slpat.org/slpat2015/ (deadline: June 08; date: September 11; location: Dresden, Germany)

  • Workshop on Vision and Language Integration 2015 (VL’15): https://sites.google.com/site/vl15workshop/ (submission deadline: June 28; dates: September 18-19; location: Lisbon, Portugal)

Past Deadlines (sorted by conference date)

Job Openings

The NUS Natural Language Processing Group is looking for suitable candidates to fill the following research positions:

Research Fellows

The initial appointment is for a period of one year, with possible extension subject to research funding availability. The candidates will join at the rank of Research Fellow, at a starting monthly salary of around S$5,500, depending on qualifications and experience. Candidates with expertise in one of the following areas are encouraged to apply:

(a) Grammatical error correction: Detection and correction of grammatical errors of second language learners of English.

(b) Statistical machine translation: Translation of human languages, with focus on translation between Chinese and English.

A candidate should have a PhD in computer science specializing in natural language processing or a related discipline, with prior research experience and publications, preferably in the focus area of research. Strong programming skills and good command of English (both spoken and written) are required. Expertise in machine learning is valuable.

Research Assistants

Multiple positions are available in the research areas of grammatical error correction and statistical machine translation. The initial appointment is for a period of one year, with possible extension subject to research funding availability. The candidates will join at the rank of Research Assistant, at a starting monthly salary of around S$3,400, depending on qualifications and experience. A candidate should possess a good honors degree or equivalent in computer science or a related discipline. Strong programming skills and good command of English (both spoken and written) are required. Candidates interested in pursuing research leading to a PhD degree in computer science at NUS with specialization in natural language processing are preferred.

Candidates interested in any of the above positions should apply with a cover letter, CV, academic transcripts, and the names and email addresses of 3 referees.

Please email the application materials to Professor Hwee Tou Ng (Email: nght AT comp DOT nus DOT edu DOT sg).

Professor Hwee Tou Ng
Department of Computer Science
National University of Singapore
Home Page: http://www.comp.nus.edu.sg/~nght

Recent Publications in EduNLP

Journals and Tech Reports (found via Google Scholar)

  • Designing a Reading Material Recommendation System for EFL Learners. Chin-Hwa Kuo and Chen-Chung Chi. Journal of Applied Science and Engineering, Vol. 17, No. 4, pp. 371-382 (2014) [Link]

  • Evaluation Methods for Intelligent Tutoring Systems Revisited. Jim Greer, Mary Mark. International Journal of Artificial Intelligence in Education, April 2015 http://link.springer.com/article/10.1007/s40593-015-0043-2

  • Analyzing and Comparing Reading Stimulus Materials Across the TOEFL® Family of Assessments. Jing Chen and Kathleen M. Sheehan. ETS Research Report Series http://onlinelibrary.wiley.com/doi/10.1002/ets2.12055/full

  • Different topics, different discourse: Relationships among writing topic, measures of syntactic complexity, and judgments of writing quality. Weiwei Yang, Xiaofei Lu, Sara Cushing Weigle. Journal of Second Language Writing, Vol 28, 2015. http://www.sciencedirect.com/science/article/pii/S1060374315000053

  • An Investigation of Native and Nonnative English Speakers’ Levels of Written Syntactic Complexity in Asynchronous Online Discussions. Rae L. Mancilla, Nihat Polat and Ahmet O. Akcay. Applied Linguistics (2015). http://applij.oxfordjournals.org/content/early/2015/04/15/applin.amv012.short

NAACL 2015

  • Predicting the Difficulty of Language Proficiency Tests. Lisa Beinborn, Torsten Zesch and Iryna Gurevych. Transactions of the Association for Computational Linguistics, vol. 2, pp. 517–529. https://tacl2013.cs.columbia.edu/ojs/index.php/tacl/article/view/414/88

  • Towards a standard evaluation method for grammatical error detection and correction. Mariano Felice and Ted Briscoe. http://www.cl.cam.ac.uk/~mf501/pub/docs/2015-naacl.pdf

  • Aligning Sentences from Standard Wikipedia to Simple Wikipedia. William Hwang, Hannaneh Hajishirzi, Mari Ostendorf and Wei Wu. http://ssli.ee.washington.edu/~hannaneh/papers/simplification.pdf

  • Semantic Grounding in Dialogue for Complex Problem Solving. Xiaolong Li and Kristy Boyer. http://research.csc.ncsu.edu/learndialogue/pdf/Li%20Boyer%20NAACL%202015_Camera%20Ready_v8.pdf

  • Large-Scale Native Language Identification with Cross-Corpus Evaluation. Shervin Malmasi and Mark Dras.

  • Continuous Space Representations of Linguistic Typology and their Application to Phylogenetic Inference. Yugo Murawaki.

  • Building a State-of-the-Art Grammatical Error Correction System. Alla Rozovskaya and Dan Roth. Transactions of the Association for Computational Linguistics, 2 (2014) pp. 419–434. http://www.aclweb.org/anthology/Q14-1033

  • Effective Feature Integration for Automated Short Answer Scoring. Keisuke Sakaguchi, Michael Heilman and Nitin Madnani.

  • A Neural Network Approach to Context-Sensitive Generation of Conversational Responses. Alessandro Sordoni, Michel Galley, Michael Auli, Chris Brockett, Yangfeng Ji, Margaret Mitchell, Jian-Yun Nie, Jianfeng Gao and Bill Dolan.

  • Constraint-Based Models of Lexical Borrowing. Yulia Tsvetkov, Waleed Ammar, Chris Dyer. http://www.cs.cmu.edu/~ytsvetko/papers/loanwords.pdf

ACL 2015

  • Automatic Spontaneous Speech Grading: A Novel Feature Derivation Technique Using the Crowd. Vinay Shashidhar, Nishant Pandey and Varun Aggarwal.

  • How Far Are We from Fully Automatic High Quality Grammatical Error Correction? Christopher Bryant and Hwee Tou Ng.

  • Modeling Argument Strength in Student Essays. Isaac Persing and Vincent Ng.

  • Efficient Disfluency Detection with Transition-Based Parsing. Shuangzhi Wu, Dongdong Zhang, Ming Zhou and Tiejun Zhao.

  • Matrix Factorization with Knowledge Graph Propagation for Unsupervised Spoken Language Understanding. Yun-Nung Chen, William Yang Wang, Anatole Gershman and Alexander Rudnicky.

  • A Computationally Efficient Algorithm for Learning Topical Collocation Models. Zhendong Zhao, Lan Du, Benjamin Börschinger, John K Pate and Mark Johnson.

  • A Generalisation of Lexical Functions for Composition in Distributional Semantics. Antoine Bride, Tim Van de Cruys and Nicholas Asher.

  • Adding Semantics to Data-Driven Paraphrasing. Ellie Pavlick, Johan Bos, Malvina Nissim, Charley Beller, Benjamin Van Durme and Chris Callison-Burch.

  • Improving Evaluation of Machine Translation Quality Estimation. Yvette Graham.

  • Learning Answer-Entailing Structures for Machine Comprehension. Mrinmaya Sachan, Kumar Dubey, Matthew Richardson and Eric Xing.

  • Vector-Space Calculation of Semantic Surprisal for Predicting Word Pronunciation Duration. Asad Sayeed, Stefan Fischer and Vera Demberg.

  • A Strategic Reasoning Model for Generating Alternative Answers. Jon Stevens, Anton Benz, Sebastian Reusse and Ralf Klabunde.

  • A Unified Kernel Approach for Learning Typed Sentence Rewritings. Martin Gleize and Brigitte Grau.

CICLing 2015

  • Feature Analysis for Native Language Identification. Nisioi, S.

  • Question Analysis for a Closed Domain Question Answering System. Derici, C., Çelik, K., Kutbay, E., Aydın, Y., Güngör, T., Özgür, A., & Kartal, G.

  • A Readable Read: Automatic Assessment of Language Learning Materials based on Linguistic Complexity. Ildikó Pilán, Sowmya Vajjala and Elena Volodina.

  • What do our children read about? Affect analysis of school texts in Chile. Claudia Martinez, Alejandra Segura, Christian Vidal-Castro, Jorge Fernandez and Clemente Rubio.

NLP4CALL Workshop

The complete proceedings of the workshop are available online.

The workshop organizers also prepared a small pre-workshop survey that anyone interested in ICALL is welcome to fill in before or after the event.

List of papers

  • Misspellings in Responses to Listening Comprehension Questions and Their Phonetic Normalization to Account for Teachers’ Scores. Heike Da Silva Cardoso and Magdalena Wolska.

  • Taking the Danish Speech Trainer from CALL to ICALL. Peter Juel Henrichsen.

  • Using Shallow Syntactic Features to Measure Influences of L1 and Proficiency Level in EFL Writings. Andrea Horbach, Jonathan Poitz and Alexis Palmer.

  • Semi-automated typical error annotation for learner English essays: integrating frameworks. Andrey Kutuzov and Elizaveta Kuzmenko.

  • Short Answer Grading: When Sorting Helps and When it Doesn’t. Ulrike Pado and Cornelia Kiefer.

  • Oahpa! Õpi! Opiq! Developing free online programs for learning Estonian and Võro. Heli Uibo, Jaak Pruulmann-Vengerfeldt, Jack Rueter and Sulev Iva.

Resources

A Python implementation of the I-measure, a metric for evaluating grammatical error correction systems, is available. The metric is presented in “Towards a standard evaluation method for grammatical error detection and correction” by Mariano Felice and Ted Briscoe.