Natural language processing technologies for sentiment analysis: the words we used to describe death

Author: Zhiwen Guo

Guo, Zhiwen, 2020 Natural language processing technologies for sentiment analysis: the words we used to describe death, Flinders University, College of Science and Engineering

Terms of Use: This electronic version is (or will be) made publicly available by Flinders University in accordance with its open access policy for student theses. Copyright in this thesis remains with the author. You may use this material for uses permitted under the Copyright Act 1968. If you are the owner of any included third party copyright material and/or you believe that any material has been made available without permission of the copyright owner please contact copyright@flinders.edu.au with the details.

Abstract

As populations age, the global climate changes, pandemics spread, and chronic diseases increasingly cause deaths, a rising death rate may be unavoidable. However, death and dying are taboo in contemporary western societies. As a result, people may be ill-equipped to face death or dying when they are exposed to it. For this reason, a study on death and dying called Dying2Learn (Tieman et al. 2018) was proposed. The principal purpose of the Dying2Learn research is to look more deeply at the public’s views on death and dying in order to improve people’s death competence (the capabilities and attitudes needed to deal with death). Dying2Learn is a survey-based study that applies linguistic sentiment methodology to activities held in Massive Open Online Courses (MOOCs). These courses are designed to encourage the public to discuss dying and death, and many participants have commented on these topics. All of the discussions have been collected as study data. However, because the Dying2Learn MOOCs run year after year, a large amount of text-based data has accumulated, which makes analysis a heavy workload for researchers, let alone maintaining consistency across analyses. Efficient and accurate solutions for analysing these data are therefore needed.

Against this background, the present research focused on investigating different natural language processing technologies and developing an efficient and accurate solution for the Dying2Learn research. After reviewing baseline algorithms for sentiment analysis, such as the Naive Bayes classifier, support vector machine (SVM) classifiers and lexicon-based algorithms, we explored a widely used approach to sentiment analysis: the combination of Global Vectors for Word Representation (GloVe; Pennington et al. 2014) with a long short-term memory (LSTM; Hochreiter & Schmidhuber 1997) network. We also proposed two novel methods that make use of GloVe, the Universal Sentence Encoder (Cer et al. 2018) and the affective lexicon of Warriner et al. (2013).

The main contribution of the present research is a novel solution for the Dying2Learn research. The proposed method can be used as a black box that requires only the Dying2Learn data as input; it automatically analyses all the data and outputs the results. The proposed method therefore meets the objective of this study: to develop an efficient and accurate solution.

Keywords: Natural Language Processing, NLP, sentiment analysis, Machine Learning, ML, Deep Learning, Transformer, GloVe, Word embeddings

Subject: Computer Science thesis

Thesis type: Masters
Completed: 2020
School: College of Science and Engineering
Supervisor: Trent Lewis