Forensically enhanced digital preservation

Author: Timothy Hart

Hart, Timothy, 2022 Forensically enhanced digital preservation, Flinders University, College of Science and Engineering

Terms of Use: This electronic version is (or will be) made publicly available by Flinders University in accordance with its open access policy for student theses. Copyright in this thesis remains with the author. You may use this material for uses permitted under the Copyright Act 1968. If you are the owner of any included third party copyright material and/or you believe that any material has been made available without permission of the copyright owner please contact copyright@flinders.edu.au with the details.

Abstract

Digital preservation and digital forensics are two fields with differing goals whose pathways are similar and often converge. Neither field necessarily acknowledges the other, yet they are closely aligned. Digital preservation has much to gain from digital forensics, and digital forensics could in turn gain in documentation and perspective through collaboration.

One of the key differences is long-term preservation: preserved material is stored and maintained long after it has been processed, whereas forensic evidence is gathered and used to prosecute, with no further regard for it once that is done. The efforts that go into ensuring the preservation of digital objects are where the similarities between the two fields end. As a result, digital forensic software is tailored to the specifics of its own field, such as modern devices, specific data, and criminal prosecution. Perspective and purpose are important factors, as they determine how the software is perceived and documented. This affects the adaptability of digital forensic software for memory institutions (galleries, libraries, archives, museums): at face value it does not cater to their needs, despite being beneficial.

In this thesis, the benefits of using digital forensic software for born-digital preservation are explored, as well as the risk to collections should data remain unprocessed via the suggested methods.

Hidden data may already exist within storage collections, yet to be discovered and impossible to discover without the use of digital forensic software. These data, rightly named “sensitive data”, have many implications. Sensitive data, whilst key to criminal investigations, are also paramount to digital preservation, as they can reveal significant amounts of new information.

Australian law is explored regarding the risk of sensitive data discovery and the actions that follow. The threats posed by sensitive data are discussed with consideration of the potential legal implications that arise from its discovery. This includes examples of current and future threats that may reside in stored data that have not been processed assiduously using digital forensic software.

Policies and procedures regarding Aboriginal and Torres Strait Islander people and their information are explored and compared against standard policies. The strict and careful policies developed for our Indigenous people can positively influence the standard privacy policies within institutions implementing or advancing sensitive data discovery.

The scope of this study has been narrowed down to Australian institutions, targeting State and National libraries whilst also considering archives, galleries, and museums, as these are the influential institutions. Australian institutions were investigated through publicly available information and through direct communication, including a questionnaire distributed to willing participants.

Collection institutions from the United States of America were investigated to form a comparison and to establish potential tools and methods that could be adopted within Australian institutions. The data on the U.S. institutions were derived from publicly available information and from other studies. The main source of data was workflows, as these gave a visual representation of the processes and tools used within collection institutions, revealing if and where digital forensics was being utilised.

It was evident that the major collection institutions of Australia were performing digital preservation at various maturity levels. Intake requirements and dedicated preservation procedures varied, as did the influence of digital forensic tools and methods.

Some of the participants in the study identified the need to improve their workflows, whereas others had low demand and therefore did not see the need to make any changes. It was determined that where digital forensics was being utilised, it was not used to its full potential, and in most cases it was missing completely. By analysing collection institutions and the benefits of digital forensics, the objective is to increase awareness and provide workflow improvements that enable sensitive data discovery and the handling of any surrounding issues that may arise.

Maturity levels for digital preservation in Australian institutions have been identified from the feedback provided via the questionnaire and from data gathered from public sources. This information, compared with other institutions and with maturity level modelling, allowed the establishment of an average baseline for the maturity of digital preservation requirements and performance.

Digital forensic tools and methods have been analysed to determine the data gathering capabilities of digital forensic software and the relevance to digital preservation. The benefits of digital forensics within digital preservation workflows and the impact of sensitive data within collection institutions form the contributions of this study.

Experiments have been conducted in real-world scenarios using donated material (hard drives), yielding a large volume of data that ranged widely in severity. The potential for sensitive data discovery was revealed, as well as the ability to derive information about the users of the physical media.

Issues regarding digital preservation workflows have been identified. Many workflows are missing core processes that are required to handle sensitive data. This may be the result of a lack of transparency, where sensitive data discovery is being performed to some extent but is undocumented, or of the process being missing entirely.

Through reviewing and analysing workflows, good practices were also identified, including exemplary workflow designs that help determine how digital preservation workflows can be improved.

Amendments to workflows that address sensitive data discovery are presented, enhancing digital preservation workflows with digital forensic tools and methods. The aim is not only to improve practice within existing institutions, but also to better enable peer-to-peer learning and collaboration.

With the implementation of digital forensics within mature and influential collection institutions, other institutions that may be in their infancy or slowly developing their procedures will have guidance. This can be achieved with transparent workflows that accurately visualise the forensic processes, addressing all outcomes and decision-making, and documenting the tools used and any implementation requirements.

Keywords: digital preservation, digital forensics, metadata, sensitive data, private data, privacy, collection institution, library, workflow

Subject: Computer Science thesis

Thesis type: Doctor of Philosophy
Completed: 2022
School: College of Science and Engineering
Supervisor: Denise de Vries