Despite a growing body of work exploring the use of construction grammar for natural language understanding, progress remains difficult to evaluate, both for individual systems and for the field as a whole.

One way to spur practical progress is to create a shared benchmark measuring the phenomena targeted by construction grammar. This short survey seeks input on questions relevant to developing such a benchmark:
  • What phenomena of human language understanding should be included?
  • What kinds of evaluation metrics are suitable for measuring these phenomena?
  • What applications are relevant to evaluating performance on these phenomena?
  • What common annotation standards, if any, might be adopted as part of a benchmark?
Your input is much appreciated!

This survey was designed in connection with the AAAI 2017 Spring Symposium on Computational Construction Grammar and Natural Language Understanding; its results will inform planning for the Wednesday morning session (3/27/17). We welcome responses from symposium participants as well as from others interested in the construction grammar community.