Lecture 16 Coreference Resolution

#course #natural language processing #co-reference resolution #cs224n

Definitions

  • Coreference: mentions that refer to the same entity
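  • Anaphora: an expression (the anaphor) whose interpretation depends on an earlier expression (the antecedent); in cataphora the antecedent comes later instead
    • Barack Obama said he will sign the bill. Here "he" gets its meaning from the antecedent "Barack Obama" rather than naming the entity directly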
    • We went to see a concert last night. The tickets were really expensive. (Bridging anaphora: "the tickets" is interpreted through "a concert" but does not corefer with it)

Applications

  • Machine Translation: especially for languages that drop pronouns
  • Dialogue Systems

Pipeline:

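  • Mention Detection: collect candidate mentions, e.g. pronouns (from a POS tagger), named entities (from an NER system), and noun phrases (from a parser)
  • Coreference Resolution: cluster the detected mentions so that each cluster refers to one entity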
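
The mention detection step can be sketched as a union of spans from the three sources. This is a minimal illustration, not a real detector: the function name `candidate_mentions` is hypothetical, the POS tags, NER spans, and NP spans are assumed to come from upstream tools and are hand-written here, and the pronoun tags follow the Penn Treebank convention (`PRP`, `PRP$`).

```python
# Penn Treebank tags for personal and possessive pronouns.
PRONOUN_TAGS = {"PRP", "PRP$"}


def candidate_mentions(tokens, pos_tags, ner_spans, np_spans):
    """Return sorted candidate mentions as (start, end, text), end exclusive.

    ner_spans and np_spans are (start, end) token offsets that are assumed
    to be produced by an NER system and a parser, respectively.
    """
    spans = set()
    # Every pronoun token is a one-token candidate mention.
    for i, tag in enumerate(pos_tags):
        if tag in PRONOUN_TAGS:
            spans.add((i, i + 1))
    # Named entities and noun phrases are added as they are; the set
    # removes spans proposed by more than one source.
    spans.update(ner_spans)
    spans.update(np_spans)
    return [(s, e, " ".join(tokens[s:e])) for s, e in sorted(spans)]


# Hand-annotated example sentence (annotations are illustrative).
tokens = ["Barack", "Obama", "said", "he", "will", "sign", "the", "bill", "."]
pos_tags = ["NNP", "NNP", "VBD", "PRP", "MD", "VB", "DT", "NN", "."]
ner_spans = [(0, 2)]
np_spans = [(0, 2), (3, 4), (6, 8)]

mentions = candidate_mentions(tokens, pos_tags, ner_spans, np_spans)
for start, end, text in mentions:
    print(start, end, text)
```

Taking the union over-generates on purpose: spurious candidates can still be left unlinked by the resolution step.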

Model Architectures

  • Rule-based
  • Mention Pair
  • Mention Ranking: for each mention, rank all candidate antecedents plus a dummy NA mention that stands for "no antecedent"
  • End-to-end Neural Model:
    • word and character embeddings
    • Bi-LSTM
    • span detection + attention
    • calculate a coreference score for each span pair \(i\) and \(j\)
  • Clustering: agglomerative clustering
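
The mention-ranking idea can be sketched as a softmax over a mention's candidate antecedents plus the dummy mention. This is only an illustration under stated assumptions: the pairwise scores are made-up numbers standing in for a learned scorer, `rank_antecedents` and `DUMMY` are hypothetical names, and fixing the dummy's score at 0 is one common convention rather than something these notes specify.

```python
import math

DUMMY = "<NA>"  # hypothetical label for the "no antecedent" option


def rank_antecedents(scores):
    """Pick the most probable antecedent for one mention.

    scores maps each candidate antecedent to a raw pairwise score.
    The dummy mention is added with a fixed score of 0 (an assumption).
    Returns (best_candidate, probabilities).
    """
    all_scores = {DUMMY: 0.0, **scores}
    # Subtract the maximum before exponentiating for numerical stability.
    max_score = max(all_scores.values())
    exps = {c: math.exp(s - max_score) for c, s in all_scores.items()}
    total = sum(exps.values())
    probs = {c: e / total for c, e in exps.items()}
    best = max(probs, key=probs.get)
    return best, probs


# Made-up scores for the mention "he" in the example sentence.
best, probs = rank_antecedents({"Barack Obama": 2.0, "the bill": -1.0})
print(best)
```

When every real candidate scores below the dummy, the mention is treated as starting a new entity, which is how a ranking model avoids forcing a link.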

Features used in those models

  • Statistical features
  • Embeddings

Evaluation

  • B-cubed: for each mention, compute precision and recall from the overlap between its predicted cluster and its gold cluster, then average over all mentions
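
A minimal sketch of the B-cubed computation, assuming the predicted and gold clusterings cover exactly the same set of mentions (singletons included). The function name `b_cubed` and the toy clusters are illustrative; a real evaluation would normally use an established scorer.

```python
def b_cubed(predicted, gold):
    """Return (precision, recall, f1) for two clusterings.

    predicted and gold are lists of sets of mention ids, and both are
    assumed to contain every mention exactly once.
    """
    pred_of = {m: cluster for cluster in predicted for m in cluster}
    gold_of = {m: cluster for cluster in gold for m in cluster}
    mentions = list(gold_of)

    precision = 0.0
    recall = 0.0
    for m in mentions:
        overlap = len(pred_of[m] & gold_of[m])
        precision += overlap / len(pred_of[m])  # share of predicted cluster that is correct
        recall += overlap / len(gold_of[m])     # share of gold cluster that was recovered
    precision /= len(mentions)
    recall /= len(mentions)

    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


gold = [{"a", "b", "c"}, {"d", "e"}]
predicted = [{"a", "b"}, {"c", "d", "e"}]
p, r, f1 = b_cubed(predicted, gold)
print(round(p, 4), round(r, 4), round(f1, 4))  # 0.7333 0.7333 0.7333
```

In this toy case the misplaced mention "c" lowers precision for everything in its predicted cluster and lowers recall for everything in its gold cluster, so both averages come out to 11/15.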