Dr. ElMouatez Karbab is a postdoc at Concordia University. His research focuses on applying machine learning techniques to malware fingerprinting and to mobile and IoT security. He is a research scientist at the National Cyber Forensic and Training Alliance (NCFTA) of Canada, an international organization that focuses on the investigation of cyber-crimes. He served as an associate researcher at the Research Centre for Scientific and Technical Information (CERIST), Algeria, where he worked on international projects in collaboration with the University of Cape Town, South Africa, and the Heudiasyc Lab, France. ElMouatez has published many peer-reviewed research articles in international journals and conferences on malware fingerprinting using machine learning techniques, cyber security, and embedded systems.
Specialized in Cyber Security and Machine Learning.
Adviser: Dr. Debbabi
Thesis Title: "Resilient and Scalable Android Fingerprinting and Detection"
Specialized in Mobile Computing.
Adviser: Dr. Djenouri
Thesis Title: "Automatic Car Park Management with Integrated Radio Frequency Identification and Wireless Sensor Network"
Awards: Top Selected Students.
Specialized in Advanced Information Systems.
Adviser: Dr. Aliouat
Thesis Title: "Improvement and Implementation of Data-Centric and Hierarchical Wireless Sensor Network Routing Protocols in TinyOS"
Awards: Top three students for four years.
In this research, we propose MalDy, a portable (plug and play) malware detection and family threat attribution framework using supervised machine learning techniques. The key idea behind MalDy's portability is to model behavioral reports as sequences of words and to apply advanced natural language processing (NLP) and machine learning (ML) techniques that automatically engineer relevant security features, so that malware is detected and attributed without investigator intervention. More precisely, we propose to use the bag-of-words (BoW) NLP model to represent the behavioral reports. Afterward, we build ML ensembles on top of the BoW features. We extensively evaluate MalDy on various datasets from different platforms (Android and Win32) and execution environments. The evaluation shows the effectiveness and the portability of MalDy across the spectrum of analyses and settings.
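As a rough illustration of the bag-of-words step only (a minimal sketch, not MalDy's actual implementation), the snippet below turns two made-up behavioral reports into word-count vectors. The tokenizer, the function names, and the sample reports are all assumptions introduced for this example; the ML ensembles described above would be trained on vectors of this kind.

```python
import re
from collections import Counter


def tokenize(report):
    """Split a behavioral report into lowercase word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9_.]+", report.lower())


def build_vocabulary(reports):
    """Map every distinct token in the corpus to a fixed column index."""
    tokens = sorted({tok for report in reports for tok in tokenize(report)})
    return {tok: i for i, tok in enumerate(tokens)}


def bow_vector(report, vocab):
    """Count how often each vocabulary word occurs in one report."""
    counts = Counter(tokenize(report))
    vector = [0] * len(vocab)
    for tok, n in counts.items():
        if tok in vocab:  # words unseen at vocabulary-building time are ignored
            vector[vocab[tok]] = n
    return vector


# Hypothetical behavioral reports, flattened to plain text.
reports = [
    "open file read file connect socket send sms",
    "open file read file close file",
]
vocab = build_vocabulary(reports)
vectors = [bow_vector(r, vocab) for r in reports]
```

Because the vectors depend only on the words in a report, the same code applies to reports from any platform or sandbox, which is the sense in which a BoW representation supports portability.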
In this research, we propose MalDozer, an automatic Android malware detection and family attribution framework that relies on sequence classification using deep learning techniques. Starting from the raw sequence of the app's API method calls, MalDozer automatically extracts and learns malicious and benign patterns from actual samples to detect Android malware. MalDozer can serve as a ubiquitous malware detection system that is deployed not only on servers, but also on mobile and even IoT devices. We evaluate MalDozer on multiple Android malware datasets ranging from 1K to 33K malware apps, and 38K benign apps. The results show that MalDozer can correctly detect malware and attribute it to its actual family with an F1-Score of 96%-99% and a false positive rate of 0.06%-2%, under all tested datasets and settings.
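Before a sequence classifier can consume raw API method calls, the calls are typically converted into fixed-length integer sequences. The sketch below shows one common way to do this; it is an assumption for illustration, not a description of MalDozer's actual preprocessing. The API names, the `PAD`/`UNK` convention, and the function names are hypothetical.

```python
PAD, UNK = 0, 1  # reserved ids: padding and calls unseen during training


def build_index(sequences):
    """Assign an integer id (starting at 2) to each distinct API call."""
    index = {}
    for seq in sequences:
        for call in seq:
            if call not in index:
                index[call] = len(index) + 2
    return index


def encode(seq, index, length):
    """Map calls to ids, then truncate or right-pad to a fixed length."""
    ids = [index.get(call, UNK) for call in seq][:length]
    return ids + [PAD] * (length - len(ids))


# Hypothetical API call sequences extracted from two apps.
train = [
    ["getDeviceId", "openConnection", "sendTextMessage"],
    ["openConnection", "getInputStream"],
]
index = build_index(train)
encoded = encode(["openConnection", "exec", "getDeviceId"], index, 5)
```

The resulting integer sequences could then be fed to an embedding layer followed by a neural sequence classifier in a deep learning library of choice.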
In this paper, we present ToGather, an automatic investigation framework that takes Android malware samples as input and produces situational awareness about the malicious cyber infrastructure of these samples' families. ToGather leverages state-of-the-art graph theory techniques to generate actionable and granular intelligence to mitigate the threat posed by the malicious Internet activity of Android malware apps. We evaluate ToGather on real malware samples from various Android families, and the obtained results are very promising.
In this research, we propose DySign, a novel technique for fingerprinting Android malware's dynamic behaviors. This is achieved through the generation of a digest from the dynamic analysis of a malware sample on existing known malware. It is important to mention that: (i) DySign fingerprints are approximations of the behaviors observed during dynamic analysis, so as to achieve resiliency to small changes in the behaviors of future malware variants; (ii) fingerprint computation is agnostic to the analyzed malware sample or family. DySign leverages state-of-the-art Natural Language Processing (NLP) techniques to generate the aforementioned fingerprints, which are then leveraged to build an enhanced Android malware detection system with family attribution.
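One way to obtain an approximate, fixed-size fingerprint that tolerates small behavioral changes is to hash word n-grams of a dynamic analysis report into a count vector and compare fingerprints by cosine similarity. The sketch below illustrates that general idea under stated assumptions; it is not DySign's actual algorithm, and the fingerprint size, the n-gram length, the function names, and the sample reports are hypothetical.

```python
import hashlib
import math


def ngrams(tokens, n=2):
    """Return the word n-grams of a token list."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def fingerprint(report, size=64, n=2):
    """Hash each n-gram into one of `size` buckets and count the hits."""
    vector = [0] * size
    for gram in ngrams(report.split(), n):
        digest = hashlib.sha256(gram.encode("utf-8")).digest()
        vector[int.from_bytes(digest[:4], "big") % size] += 1
    return vector


def cosine(a, b):
    """Cosine similarity between two fingerprints (0.0 if either is empty)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical reports: a known sample and a variant with extra behavior.
known = "open file read file send sms connect socket"
variant = "open file read file send sms connect socket close socket"
fp_known, fp_variant = fingerprint(known), fingerprint(variant)
similarity = cosine(fp_known, fp_variant)
```

Since the variant shares most n-grams with the known sample, most buckets of the two fingerprints coincide, so a similarity threshold can match variants without requiring an exact digest match.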
In this research, we propose the Cypider framework, a set of techniques and tools that aim to perform systematic detection of mobile malware by building an efficient and scalable similarity network infrastructure of malicious apps. Our detection method is based on a novel concept, namely the malicious community, in which we consider, for a given family, the instances that share common features. Under this concept, we assume that multiple similar Android apps with different authors are most likely to be malicious. Cypider leverages this assumption for the detection of variants of known malware families and zero-day malware. It is important to mention that Cypider does not rely on signature-based or learning-based patterns. Instead, it applies community detection algorithms to the similarity network, which extract sub-graphs considered suspicious and most likely to be malicious communities. Furthermore, we propose a novel fingerprinting technique, namely the community fingerprint, based on a learning model for each malicious community. Cypider shows excellent results by detecting about 50% of the malware dataset in one detection iteration. Besides, the preliminary results of the community fingerprint are promising, as we achieved a detection rate of 87%.
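The similarity-network idea can be sketched in a few lines. In this minimal example, apps are linked when the Jaccard similarity of their feature sets reaches a threshold, and connected components are used as a deliberately simple stand-in for the community detection algorithms mentioned above. The feature sets, the threshold, and the function names are assumptions made for illustration and do not reflect Cypider's actual features or algorithms.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two feature sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_network(apps, threshold):
    """Link every pair of apps whose similarity reaches the threshold."""
    graph = {name: set() for name in apps}
    for u, v in combinations(apps, 2):
        if jaccard(apps[u], apps[v]) >= threshold:
            graph[u].add(v)
            graph[v].add(u)
    return graph


def communities(graph, min_size=2):
    """Connected components with at least `min_size` apps (simple stand-in)."""
    seen, result = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        if len(component) >= min_size:
            result.append(component)
    return result


# Hypothetical apps described by the permissions they request.
apps = {
    "app_a": {"SEND_SMS", "READ_CONTACTS", "INTERNET"},
    "app_b": {"SEND_SMS", "READ_CONTACTS", "INTERNET", "CAMERA"},
    "app_c": {"SEND_SMS", "READ_CONTACTS"},
    "app_d": {"BLUETOOTH", "NFC"},
}
suspicious = communities(similarity_network(apps, threshold=0.6))
```

Here `app_a`, `app_b`, and `app_c` end up in one suspicious community, while `app_d` stays isolated and is not reported. A real deployment would replace the connected-components step with a proper community detection algorithm and use a scalable similarity search instead of comparing all pairs.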
In this research, we propose a novel and comprehensive fingerprinting approach for Android application packages (APKs). The proposed fingerprint captures not only the binary features of the APK file, but also the underlying structure of the app. Furthermore, we leverage this fingerprinting technique to build ROAR, an automatic system for Android malware detection and family attribution. Our experiments show that the proposed fingerprint and the ROAR system achieve a precision of 95%.