Machine Learning and Multivariate Analysis for Chemical Determination of Suspicious Objects by Laser-Induced Breakdown and Raman Spectroscopies

Author: Nathan Garner

Garner, Nathan, 2023, Machine Learning and Multivariate Analysis for Chemical Determination of Suspicious Objects by Laser-Induced Breakdown and Raman Spectroscopies, Flinders University, College of Science and Engineering

Terms of Use: This electronic version is (or will be) made publicly available by Flinders University in accordance with its open access policy for student theses. Copyright in this thesis remains with the author. You may use this material for uses permitted under the Copyright Act 1968. If you are the owner of any included third party copyright material and/or you believe that any material has been made available without permission of the copyright owner please contact copyright@flinders.edu.au with the details.

Abstract

Hidden explosives are a common tactic in modern warfare and have been one of the largest sources of casualties in the most recent wars in which Australia has been involved. There is a need to determine whether an object is an explosive without approaching it. Laser-Induced Breakdown Spectroscopy (LIBS) uses an optical spectrometer to collect atomic data, and Raman Spectroscopy (RS) uses an infra-red spectrometer to collect molecular-level data about the composition of the target. Even separately, LIBS and RS data can be difficult to interpret quickly by hand. When reading both types of data, either combined or in quick succession, a human under pressure is likely to make mistakes. Machine learning (ML) can interpret subtleties within the data from these two systems quickly and accurately, without the risk of pressure-induced human error. Different forms of ML have different advantages and use cases. Several ML techniques were considered, including Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), and forms of Artificial Neural Networks (ANNs). The most robust was found to be a form of KNN, which produced high accuracies across a range of conditions, including high noise and fewer data points. The other models lost accuracy much more quickly: LDA and ANNs in high-noise conditions, and ANNs also when given few data points. However, neither LIBS nor RS alone can identify every possible type of sample, so the two must be combined in a meaningful way. Different methods for data fusion were considered, including low-level concatenation, low-level addition, and high-level fusion of predictions. These methods proved nearly as effective as one another at classifying spectra within their dataset, but mid-level fusion via Principal Component Analysis (PCA) was found to be the most effective.
All fusion methods were able to produce 100% accurate classification; however, the mid-level fusion method did so using significantly less data than the other methods, allowing the fastest processing with minimal computing resources.
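
As an illustration of two of the fusion strategies named above, the following is a minimal sketch only, not the thesis implementation. It contrasts low-level concatenation with high-level fusion of predictions using a plain KNN vote. The function names, the toy "spectra", and the class labels "A" and "B" are all hypothetical; real LIBS and Raman spectra would have far more channels and would normally be normalised before fusion. The mid-level PCA approach favoured in the abstract is not shown here.

```python
import math
from collections import Counter

def knn_votes(train_x, train_y, query, k=3):
    """Count the class labels among the k training points nearest to query."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    return Counter(train_y[i] for i in order[:k])

def predict_low_level(libs, raman, labels, libs_q, raman_q, k=3):
    """Low-level fusion: concatenate both spectra, then classify once."""
    fused = [a + b for a, b in zip(libs, raman)]
    votes = knn_votes(fused, labels, libs_q + raman_q, k)
    return votes.most_common(1)[0][0]

def predict_high_level(libs, raman, labels, libs_q, raman_q, k=3):
    """High-level fusion: classify each technique separately, pool the votes."""
    votes = (knn_votes(libs, labels, libs_q, k)
             + knn_votes(raman, labels, raman_q, k))
    return votes.most_common(1)[0][0]

# Hypothetical toy data: 3-channel "LIBS" and 2-channel "Raman" intensities
# for six training samples from two made-up classes.
libs = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1], [1.1, 0.0, 0.1],
        [0.1, 1.0, 0.9], [0.0, 0.9, 1.0], [0.2, 1.1, 0.8]]
raman = [[0.0, 1.0], [0.1, 0.9], [0.0, 1.1],
         [1.0, 0.0], [0.9, 0.1], [1.1, 0.1]]
labels = ["A", "A", "A", "B", "B", "B"]

# An unknown sample measured by both techniques.
libs_query, raman_query = [0.95, 0.1, 0.05], [0.05, 1.0]
low = predict_low_level(libs, raman, labels, libs_query, raman_query)
high = predict_high_level(libs, raman, labels, libs_query, raman_query)
```

In this sketch the low-level variant runs one classifier on a longer feature vector, while the high-level variant runs one classifier per technique and combines their neighbour votes. Pooled votes can tie when the two techniques disagree, so a practical system would need an explicit tie-breaking rule.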

Keywords: Machine learning, data fusion, LIBS, Raman, KNN, ANN, LDA

Subject: Engineering thesis

Thesis type: Masters
Completed: 2023
School: College of Science and Engineering
Supervisor: Jamie Quinton