Blog Post #5
A lot of thinking and coding happened this week, so let's get into it...

two_models

This is a branch that incorporates an off-the-shelf Coreference Resolution model during pre-processing and post-processing of the data. I'll go through what this branch of code does and then explain why I made these decisions.

The Coreference Resolution model that I'm using is from this repo, which uses spaCy and neural networks to identify pronouns and the noun each pronoun refers to. I run the data used to train the Pointer-Generator model through this model; specifically, I run the reference summary (the human-written summary of the given document).

By running the training data through the Coreference Resolution model, I can use spaCy to identify clusters within each document (an individual data example). Each cluster is essentially a dictionary that maps a noun to the pronouns and nouns that refer to it (e.g...