Posts

Final Post

Introduction This project explores the research done on automatic abstractive text summarization and looks for ways to improve an existing model. I chose this topic because of my interest in cutting down the time spent reading materials that can be dry and dull on occasion. With the rise in popularity of Neural Networks and my previous background in Machine Learning, I wanted to study and learn more about the development of automatic text summarization tools. While there are two fields of automatic text summarization (extractive and abstractive), I focus on abstractive approaches because of their higher difficulty and their significance in creating a human-like summarization tool. Here is a general overview: Existing Work Automatic text summarization is considered a sequence-to-sequence (seq2seq) problem; that is, a prediction problem that takes a sequence as input and produces a sequence as output. For seq2seq problems, Recurrent Neural Networks...

Blog Post #6

The last blog post ended a bit abruptly, so let me recap and add to some things discussed there. The two_models branch utilizes two models: an off-the-shelf Coreference Resolution model and the Pointer-Generator model. The Coreference Resolution model is used during pre-processing and post-processing of the data. Before the training data is fed into the Pointer-Generator model to train the automatic summarization model, the data goes through the Coreference Resolution model, which replaces all the pronouns with the noun each pronoun refers to (e.g. the text "Wayne saw himself." becomes "Wayne saw Wayne.") and saves the noun together with its pronouns (e.g. { Wayne : [ he, his, him, student ]}) in local storage for later use. The pre-processed data is then used to train the summarization model through the Pointer-Generator model. When the model spits out a summary given an input, it looks for the s...
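As a minimal sketch of the pre-processing step described above: given coref clusters that have already been extracted, replace each coreferent mention with the cluster's main noun and save the mapping for post-processing. The names `resolve_pronouns` and `coref_clusters.json` are my own illustrations, not identifiers from the branch.

```python
import json

def resolve_pronouns(tokens, clusters):
    """Replace each coreferent mention with its cluster's main noun.

    `clusters` maps a main noun to the mentions (pronouns/aliases) that
    refer to it, e.g. {"Wayne": ["himself", "he", "his", "him"]}.
    """
    mention_to_main = {m.lower(): main
                       for main, mentions in clusters.items()
                       for m in mentions}
    return [mention_to_main.get(tok.lower(), tok) for tok in tokens]

clusters = {"Wayne": ["himself", "he", "his", "him"]}
tokens = "Wayne saw himself .".split()
print(" ".join(resolve_pronouns(tokens, clusters)))  # Wayne saw Wayne .

# Persist the clusters in local storage for the post-processing pass
# (hypothetical file name).
with open("coref_clusters.json", "w") as f:
    json.dump(clusters, f)
```

The saved JSON is what the post-processing step would read back when restoring pronouns in the generated summary.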

Blog Post #5

A lot of thought and coding happened this week, so let's get into it... two_models This is a branch that incorporates an off-the-shelf Coreference Resolution model during pre-processing and post-processing of the data. I'll go through what this branch of code does and then explain why I made these decisions. The Coreference Resolution model that I'm using is from this repo, which utilizes spaCy and Neural Networks to identify the pronouns and the noun each pronoun refers to. Using this model, I run all the data that the Pointer-Generator model trains on; specifically, I run the reference summary (the summary of the given document written by a human). By running the training data through the Coreference Resolution model, I can utilize spaCy to identify clusters within the document (an individual data point). These clusters are essentially a dictionary of a noun and the pronouns/nouns that refer to it ( e.g...
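To illustrate the cluster dictionary mentioned above, here is a hedged sketch that flattens cluster objects into the {noun: mentions} shape. The `Cluster` namedtuple merely stands in for the real cluster objects a spaCy-based coreference model produces, and `clusters_to_dict` is a hypothetical helper, not code from the repo.

```python
from collections import namedtuple

# Stand-in for the cluster objects a spaCy-based coref model returns:
# each cluster has a main (canonical) noun and a list of mentions.
Cluster = namedtuple("Cluster", ["main", "mentions"])

def clusters_to_dict(clusters):
    """Flatten coref clusters into {main noun: [referring mentions]}."""
    return {str(c.main): [str(m) for m in c.mentions if str(m) != str(c.main)]
            for c in clusters}

clusters = [Cluster(main="Wayne", mentions=["Wayne", "he", "his", "him"])]
print(clusters_to_dict(clusters))  # {'Wayne': ['he', 'his', 'him']}
```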

Blog Post #4

From the feedback I've received about my previous blogs, I can see how confusing the posts might have been and how unrelated they may have seemed to my project. So let me take a few moments to address these issues. Regarding the topics discussed in the blog post titled Update 9/23, I talked about some of the common problems many automatic text summarization tools face. There has been a lot of research on the problems that were raised, and I decided to use Google/Stanford's model, which incorporates some of the solutions discussed in that post. These problems are considered generally solved by many of today's models, including Google's. Blog Post #3 outlines the general features implemented in the model I chose to work with. I also talk about some of the problems I faced in running the model on my own system to make sure it works and that I can use it further for my comps project. I also briefly mentioned a problem that the model still faces: reference reso...

Blog Post #3

I realize that I have not yet explained why I chose to do automatic text summarization. To keep things short, I've never really enjoyed reading for academic purposes. I like reading to learn about topics I am interested in (i.e. Computer Science) and reading fiction with no academic value, but when it comes to reading for classes I am taking just for the credits, I find it hard to stay fully engaged in the reading material. From talking to fellow peers, I realized that this was the general consensus. With the rise of Machine Learning and Deep Learning, I figured it would be interesting to look into developing an automatic abstractive text summarization tool for students such as myself. Picking up from the last blog post... a lot has happened. I decided to explore Google/Stanford's model, which utilizes a pointer mechanism and a coverage mechanism to account for some of the problems that the general models for text summariza...
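For reference, the pointer and coverage ideas mentioned above are, as I understand them from See et al.'s pointer-generator paper, combined roughly as follows. This is a sketch of the published equations, not code or notation from the repo itself.

```latex
% Generation probability: a soft switch between generating from the
% vocabulary and copying from the source via the attention weights a^t,
% computed from the context vector h_t^*, decoder state s_t, and input x_t.
p_{\mathrm{gen}} = \sigma\!\left(w_{h^*}^{\top} h_t^{*} + w_s^{\top} s_t + w_x^{\top} x_t + b_{\mathrm{ptr}}\right)

% Final distribution over the extended vocabulary: generate from the
% vocabulary with probability p_gen, otherwise copy a source word.
P(w) = p_{\mathrm{gen}}\, P_{\mathrm{vocab}}(w) + \left(1 - p_{\mathrm{gen}}\right) \sum_{i\,:\,w_i = w} a_i^{t}

% Coverage: the running sum of past attention, with a loss term that
% penalizes re-attending to already-covered source positions (repetition).
c^{t} = \sum_{t'=0}^{t-1} a^{t'}, \qquad
\mathrm{covloss}_t = \sum_i \min\!\left(a_i^{t}, c_i^{t}\right)
```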

Update 9/23

After talking with Professor Li, I was able to narrow down my project into something more feasible. I decided to throw out the implementation part of the project completely and focus more on the model aspect. Also, instead of creating my own model, I decided, with Professor Li's help, to use one of the working models out there, explore some of the problems that model faces, and look for possible solutions. This reduces the scope to a project I can actually finish in a semester, and I was glad to receive some guidance on how to do so. On to the topic of what I've done with the project... Over the summer, I conducted some research and learned as much as possible about automatic text summarization. Automatic text summarization is considered a sequence-to-sequence prediction problem (seq2seq), which means that it's a prediction problem that takes a sequence as input and requires another sequence as output. A model structure that most pe...
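As a rough schematic of the seq2seq structure described above (not a trained network, and not code from any model discussed here): an encoder folds the input sequence into a state, and a decoder emits output tokens one at a time until it produces a stop token. The toy "model" below just copies the first two words so the loop is runnable.

```python
# Schematic encoder-decoder loop for a seq2seq problem. All names here
# are illustrative; a real model would use learned RNN/Transformer weights.
STOP = "<stop>"

def encode(tokens):
    # A real encoder would fold each token into a hidden state vector;
    # here the "state" is simply the token list itself.
    return list(tokens)

def decode_step(state, step):
    # A real decoder predicts a distribution over the vocabulary at each
    # step; this toy one copies the next input token, then stops.
    return state[step] if step < 2 else STOP

def summarize(tokens, max_len=10):
    state = encode(tokens)
    out = []
    for step in range(max_len):
        tok = decode_step(state, step)
        if tok == STOP:
            break
        out.append(tok)
    return out

print(summarize("the cat sat on the mat".split()))  # ['the', 'cat']
```

The key property the sketch shows is that input and output lengths are decoupled: the decoder, not the input, decides when the output sequence ends.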

Senior CS Project Introduction

A lot of thought has been put into what I want to work on for my senior year Computer Science Comprehensive Project. There is a variety of topics I am interested in within Computer Science, spanning both software and hardware, but I decided to work on creating an automatic text summarization tool. Initially, when I was planning out exactly what I was going to do in creating this tool, I incorporated everything that I wanted to do. In creating my own model from scratch, I wanted to compare the three top models out there, created by Google/Stanford, Facebook, and IBM. I would study these models and see their strengths, weaknesses, and methods of overcoming problems introduced with their models. For example, when many of these models encounter a word that is outside the given vocabulary, they tend to replace the unknown word with the tag <unk>. While this makes sense, when these tags come up in the summary produced by the model, it can be quite co...
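The <unk> replacement described above can be sketched in a few lines; `to_ids` is a hypothetical helper for illustration, not code from any of the three models.

```python
def to_ids(tokens, vocab, unk="<unk>"):
    """Keep in-vocabulary tokens as-is and replace any out-of-vocabulary
    word with the <unk> tag, as described above."""
    return [tok if tok in vocab else unk for tok in tokens]

vocab = {"the", "court", "ruled"}
print(to_ids("the judge ruled".split(), vocab))  # ['the', '<unk>', 'ruled']
```

This is exactly why <unk> tags leak into generated summaries: once a rare word is mapped away before training, the decoder has no way to produce it again, which is the problem pointer-style copying mechanisms set out to fix.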