ForTrace Workshop @ DFRWS APAC 2024

17 September 2024

During an investigation, digital forensic examiners typically face a wide variety of investigation objectives. Up-to-date data sets are increasingly in demand to support the education of digital forensics practitioners, to speed up the development and validation of forensic tools and software, and to support forensic research. However, creating data sets manually is complex, tedious and time-consuming, which increases the need for automated solutions.

This workshop will first introduce the architecture and features of the open-source data synthesis framework ForTrace. It will then demonstrate how ForTrace can be used to simultaneously generate persistent, volatile and network traces to create forensically relevant data sets that can be used, e.g., for tool testing or digital forensic training. It will also show how to log all relevant interactions during the synthesis process, how to provide the corresponding ground truth data for the simulated forensic images, and how to use that ground truth to validate that the expected traces were actually generated.
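The ground truth validation described above can be illustrated with a minimal Python sketch. It does not use the ForTrace API; the `Trace` type, the `validate` function and the example artifact names are hypothetical and only show the general idea of comparing logged ground truth against traces actually found in the generated images.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Trace:
    """Hypothetical trace record; not a ForTrace data structure."""
    kind: str      # "persistent", "volatile" or "network"
    artifact: str  # free-form identifier of the expected artifact


def validate(ground_truth, observed):
    """Compare logged ground truth against traces found by an examiner or tool."""
    expected = set(ground_truth)
    found = set(observed)
    return {
        "confirmed": sorted(expected & found),   # logged and found
        "missing": sorted(expected - found),     # logged but not found
        "unexpected": sorted(found - expected),  # found but never logged
    }


# Assumed example data for illustration only.
ground_truth = [
    Trace("persistent", "file:C:/Users/alice/Downloads/invoice.exe"),
    Trace("network", "dns:example.test"),
    Trace("volatile", "process:invoice.exe"),
]
observed = [
    Trace("persistent", "file:C:/Users/alice/Downloads/invoice.exe"),
    Trace("network", "dns:example.test"),
    Trace("persistent", "file:C:/Windows/Temp/agent.log"),
]

report = validate(ground_truth, observed)
```

In this sketch, entries under "missing" would indicate traces that the synthesis run logged but that could not be recovered, while entries under "unexpected" may point to side effects of the synthesis process itself.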

The generation of various forensically relevant and increasingly complex scenarios (i.e., starting with simple and progressing to more difficult hands-on exercises) will be discussed by walking through the detailed implementation, configuration and evaluation of these scenarios within the ForTrace data synthesis framework. The demonstrations include, for example, the creation of forensic artifacts for a typical malware/ransomware infection involving a client and a server system, as well as the generation of important forensic artifacts on Windows and (optionally) Linux systems. These artifacts cover, for example, typical operating system artifacts, other persistent traces (e.g., in the file system), volatile traces in memory and traces in the network.

Finally, the workshop will discuss other important problems that typically occur during data synthesis and complicate both the automated synthesis process and the use of the resulting data sets (e.g., additional traces caused by the data synthesis process itself), together with solutions such as how to remove these traces.
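One simple way to handle traces caused by the synthesis process itself is to separate them from scenario traces by pattern matching. The following Python sketch assumes that such traces can be recognized by path patterns; the patterns, paths and the function name `split_synthesis_noise` are hypothetical and do not reflect how ForTrace addresses this problem.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns for artifacts left behind by a synthesis agent.
SYNTHESIS_NOISE_PATTERNS = ["*/synthesis_agent/*", "*.synthesis.log"]


def split_synthesis_noise(paths, patterns=SYNTHESIS_NOISE_PATTERNS):
    """Split paths into scenario traces and traces attributed to the synthesis process."""
    kept, noise = [], []
    for path in paths:
        if any(fnmatchcase(path, pattern) for pattern in patterns):
            noise.append(path)
        else:
            kept.append(path)
    return kept, noise


# Assumed example paths for illustration only.
paths = [
    "C:/Users/alice/Documents/report.docx",
    "C:/tools/synthesis_agent/agent.py",
    "C:/temp/run.synthesis.log",
]
kept, noise = split_synthesis_noise(paths)
```

Reporting the separated traces rather than silently deleting them keeps the data set documentation transparent about what the synthesis process added.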