Machine Learning and Cyber Threat Intelligence and Analytics
Item: Walk This Way: Footwear Recognition Using Images & Neural Networks (2022-01-04). Footwear prints are among the most commonly recovered forms of evidence in criminal investigations. They can be used to establish a suspect's identity and to link related crimes. Current footwear recognition techniques are slow to process because the shoe print layout is extracted with methods such as plaster casting, gel lifting, and 3D imaging. These traditional techniques are prone to human error and consume valuable investigative time, which is a problem for timely investigations. With 3D imaging in particular, footwear prints can be blurred or partially missing, which makes fully automated recognition and comparison inaccurate. Hence, this research investigates a footwear recognition model based on RGB camera images of the shoe print taken directly at the investigation site, reducing the time and cost of the investigative process. First, the model extracts the layout information of the evidence shoe print using established image processing techniques. The layout information is then passed to a hierarchical network of neural networks; each layer of this network processes and recognizes footwear features to eliminate and narrow down the possible matches until the final result is returned to the investigator.
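The abstract does not give the exact image-processing pipeline or network layout, so the following is only a minimal pure-NumPy sketch of the two ideas it describes: extracting a layout (edge) map from a grayscale print image, and hierarchically eliminating candidate shoe models feature by feature. The feature names (`length`, `tread`) and tolerances are illustrative assumptions, and a simple rule-based filter stands in for the paper's neural layers.

```python
import numpy as np

def sobel_edges(gray):
    """Extract an edge map from a grayscale shoe-print image (values in [0, 1]).

    A Sobel filter is a standard stand-in here; the paper's actual
    layout-extraction method is not specified in the abstract.
    """
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    padded = np.pad(gray, 1, mode="edge")
    h, w = gray.shape
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + 3, j:j + 3]
            gx[i, j] = np.sum(patch * kx)
            gy[i, j] = np.sum(patch * ky)
    return np.hypot(gx, gy)  # gradient magnitude = edge strength

def hierarchical_match(features, database, stages):
    """Narrow down candidate shoe models stage by stage.

    `stages` is a list of (feature_name, tolerance) pairs; each stage
    eliminates candidates whose stored feature differs by more than the
    tolerance, mimicking the layer-by-layer elimination in the abstract.
    """
    candidates = list(database.keys())
    for name, tol in stages:
        candidates = [c for c in candidates
                      if abs(database[c][name] - features[name]) <= tol]
        if len(candidates) <= 1:
            break  # match found (or no match) -- stop early
    return candidates
```

A usage example under these assumptions: `hierarchical_match({"length": 29.5, "tread": 0.75}, db, [("length", 1.0), ("tread", 0.1)])` first drops candidates with the wrong outsole length, then compares tread density on the survivors.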
Item: Universal Spam Detection using Transfer Learning of BERT Model (2022-01-04). Previous machine learning and deep learning approaches were limited to a single dataset of spam emails/texts, wasting valuable resources on individually trained models. This research applies efficient classification of ham or spam emails in real-time scenarios. Transformer-based deep learning models have become important for training on text data through their self-attention mechanisms. This manuscript demonstrates a novel universal spam detection model that uses Google's pre-trained Bidirectional Encoder Representations from Transformers (BERT) base uncased model with multiple spam datasets. Models were first trained individually on the Enron, SpamAssassin, LingSpam, and SpamText message classification datasets. The combined model was then fine-tuned with hyperparameters drawn from each individual model. When each model was evaluated on its corresponding dataset, the F1-score was at 0.9 for this architecture. The "universal model", trained on all four datasets with the leveraged hyperparameters, reached an overall accuracy of 97% with an F1-score of 0.96 combined across all four datasets.
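As a companion to the reported metrics, here is a minimal sketch (assuming the usual 1 = spam / 0 = ham label convention, which the abstract does not state) of the two bookkeeping steps the universal model implies: pooling several labelled corpora into one training set, and computing the spam-class F1-score. The BERT fine-tuning itself is omitted; in practice it would be done with a pre-trained `bert-base-uncased` checkpoint.

```python
def combine_datasets(*datasets):
    """Merge several labelled spam corpora (lists of (text, label) pairs)
    into one training pool, as the universal model is trained on all
    four datasets at once."""
    pool = []
    for ds in datasets:
        pool.extend(ds)
    return pool

def f1_score(y_true, y_pred, positive=1):
    """F1 for the spam class: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)   # of everything flagged spam, how much was spam
    recall = tp / (tp + fn)      # of all real spam, how much was caught
    return 2 * precision * recall / (precision + recall)
```

Reporting the per-class F1 alongside accuracy, as the abstract does, matters because spam corpora are often imbalanced: a model that labels everything "ham" can score high accuracy while its spam F1 collapses to zero.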