
ElMouatez Billah Karbab

PhD Candidate at Concordia University

About Me

ElMouatez Karbab is a PhD candidate at Concordia University. His research focuses on applying machine learning techniques to malware fingerprinting and to mobile and IoT security. He is a research scientist at the National Cyber-Forensics and Training Alliance (NCFTA) of Canada, an international organization that focuses on the investigation of cyber-crimes, where he also serves as a data scientist and cyber-security specialist. Previously, he was an associate researcher at the Research Centre for Scientific and Technical Information (CERIST), Algeria, where he worked on international projects in collaboration with the University of Cape Town, South Africa, and the Heudiasyc Lab, France. ElMouatez has published many peer-reviewed research articles in international journals and conferences on malware fingerprinting using machine learning techniques, cyber security, and embedded systems.

Experiences

NCFTA Canada
ncfta.ca
2014 - current
Data Scientist & Cyber-Security Specialist
I am a researcher at the National Cyber-Forensics and Training Alliance (NCFTA Canada).
  • Investigating large-scale cyber threats.
  • Designing, developing, and maintaining big data and machine learning systems for cyber-security threat mitigation.
  • Maintaining near-real-time front-end dashboards for security events.
  • Unix system administration.
Concordia University
concordia.ca
2014 - current
Research Assistant
I focus on binary and malware fingerprinting using machine learning techniques. I also participate in several projects that involve Concordia University, NCFTA Canada, and other academic, industrial, and government partners.
Concordia University
concordia.ca
2016 - current
Teaching Assistant
I taught several tutorials and labs for graduate and undergraduate courses.
CERIST Research Center
cerist.dz
2011 - 2014
Research Assistant & Embedded System Developer
I was a research assistant at CERIST Research Center. I participated in national and international projects.
  • Research & Development.
  • Embedded systems development.
  • Middleware development.
  • Electronic design.
Mobilis Telecom
mobilis.dz
2010 - 2011
Software Engineer
I developed the management software for one department.
  • Java developer.
  • Software designer.
  • Database administrator.

Education

Publications

Languages

English

French

Arabic

Hobbies

Electronics
Graphical Design
Chess

Contact

Address
Sir George Williams Campus
EV9.173, 1455 De Maisonneuve Blvd. W.
Montreal, Quebec, Canada
H3G 1M8
Mail
mouatez _at_ karbab.net
e_karbab _at_ encs.concordia.ca
mtz _at_ ncfta.ca

Skills

Programming languages
Python
Go
Java
Bash
Rust
C/C++
JavaScript
R
MATLAB
Machine Learning Frameworks
scikit-learn
TensorFlow
PyTorch
Spark ML
Data Store and Processing
Elastic Stack
MongoDB
Pandas
SQL
PySpark
MySQL
Embedded System and IoT Development
Arduino
Energia
PlatformIO
NuttX OS
RIOT OS
Build and Continuous Integration
Git
Make
Gradle
Maven
Jenkins
Cloud Computing and Virtualization
Docker
Vagrant
QEMU
Xen
KVM

Projects

Collaboration between Concordia University, CANARIE, and several Canadian academic institutions to strengthen the overall state of network security at connected institutions within Canada's National Research and Education Network (NREN).

I built a machine learning system that detects malicious activity in the network traffic of academic institutions. The system is online (private access) and serves over ten academic institutions.

Telephony Abuse Mitigation
Collaboration between Concordia University and the CRTC on a research project for telephony abuse mitigation.

I developed multiple statistical analyses to detect and cluster campaigns of telephony abusers from call detail records (CDR) and users' complaints.

CyberFusion Log Analysis Project
Collaboration between Concordia University, Carleton University, and industrial partners to conduct research and development on operating system logs and network traffic to detect and classify abnormal and malicious behaviors.
  • I developed a malware detection system for ClearOS Linux system logs based on deep learning techniques.
  • I designed, developed, and deployed an IoT malware detection system for IoT device logs based on deep learning techniques.
  • I built an IoT intrusion detection system (IDS) to detect malicious traffic and devices in IoT networks. The system has been deployed in an IoT testbed at the Concordia University security lab.

Road Traffic Management using Sensor networks
Collaboration between CERIST (Algeria) Research Center and Heudiasyc (France) Laboratory.

I participated in the development of a smart prototype for road traffic management, covering hardware and embedded system development, using a wireless sensor network (WSN) and ultrasonic sensors.

Smart car parking using wireless sensor network
Collaboration between CERIST (Algeria) Research Center and University of Cape Town (South Africa).

I designed, developed, and deployed a prototype of a smart car parking manager, covering hardware and embedded system development, using a wireless sensor network (WSN), radio frequency identification (RFID), ultrasonic sensors, and image processing.

Awards and Honors

Golden Key International Honour Society
Top 15% Canadian Graduate Students.
Concordia University Scholarship
Concordia University International Tuition Fee Remission Awards
USTHB University Top Selected Master Students (Competition-Based)
Top master students (6th out of ~400 candidates) of the Informatics Institute, USTHB University, Algiers, Algeria.
Saad Dahlab University Top Selected Master Students (Competition-Based)
Top master students (3rd out of ~200 candidates) of the Informatics Institute, Saad Dahlab University, Blida, Algeria.
Bachelor Top Students
I ranked in the top three for four successive years (2007, 2008, 2009, 2010) in the Informatics Institute of Ferhat Abbas University, Setif, Algeria.
DFRWS Scholarship (2018 and 2019)
Digital Forensics Research Conference Google and Facebook Scholarship.

Research



MalDy: Portable and Data-Driven Malware Detection using Natural Language Processing and Machine Learning Techniques on Behavioral Analysis Reports [PDF]

In this research, we propose MalDy, a portable (plug-and-play) malware detection and family threat attribution framework using supervised machine learning techniques. The key idea behind MalDy's portability is to model behavioral reports as sequences of words and to apply advanced natural language processing (NLP) and machine learning (ML) techniques to automatically engineer relevant security features, so that malware is detected and attributed without investigator intervention. More precisely, we use a bag-of-words (BoW) NLP model to represent the behavioral reports and then build ML ensembles on top of the BoW features. We extensively evaluate MalDy on various datasets from different platforms (Android and Win32) and execution environments. The evaluation shows the effectiveness and portability of MalDy across the spectrum of analyses and settings.
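The bag-of-words idea can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not MalDy's actual implementation: the report strings, the labels, the `CentroidVoter` class, and the three similarity measures are hypothetical, and a toy majority vote over nearest-centroid classifiers stands in for the ML ensembles described above.

```python
from collections import Counter
import math

def bag_of_words(report):
    """Turn a behavioral report into word counts (word order is discarded)."""
    return Counter(report.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def jaccard(a, b):
    union = a.keys() | b.keys()
    return len(a.keys() & b.keys()) / len(union) if union else 0.0

def overlap(a, b):
    # Counter "&" keeps the minimum count of each shared word.
    return sum((a & b).values())

class CentroidVoter:
    """Nearest-centroid classifier: one summed BoW vector per label."""
    def __init__(self, similarity):
        self.similarity = similarity
        self.centroids = {}

    def fit(self, reports, labels):
        self.centroids = {}
        for report, label in zip(reports, labels):
            self.centroids.setdefault(label, Counter()).update(bag_of_words(report))
        return self

    def predict(self, report):
        bow = bag_of_words(report)
        return max(self.centroids, key=lambda label: self.similarity(bow, self.centroids[label]))

def ensemble_predict(voters, report):
    """Majority vote over the individual classifiers."""
    votes = Counter(voter.predict(report) for voter in voters)
    return votes.most_common(1)[0][0]

# Hypothetical reports: each is just the words emitted by a sandbox run.
reports = [
    "createfile writefile regsetvalue connect send",
    "connect send recv regsetvalue createprocess",
    "createfile readfile closefile",
    "readfile closefile getsystemtime",
]
labels = ["malware", "malware", "benign", "benign"]

voters = [CentroidVoter(sim).fit(reports, labels) for sim in (cosine, jaccard, overlap)]
print(ensemble_predict(voters, "connect send regsetvalue writefile"))
```

Because the features are plain words, the same code applies unchanged to reports from any platform or sandbox, which is the portability argument in a nutshell.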



Android Malware Detection using Deep Learning on API Method Sequences [PDF]

In this research, we propose MalDozer, an automatic Android malware detection and family attribution framework that relies on sequence classification using deep learning techniques. Starting from the raw sequence of the app's API method calls, MalDozer automatically extracts and learns the malicious and benign patterns from actual samples to detect Android malware. MalDozer can serve as a ubiquitous malware detection system that is deployed not only on servers, but also on mobile and even IoT devices. We evaluate MalDozer on multiple Android malware datasets ranging from 1K to 33K malware apps, and 38K benign apps. The results show that MalDozer can correctly detect malware and attribute samples to their actual families with an F1-score of 96%-99% and a false positive rate of 0.06%-2%, under all tested datasets and settings.
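A sequence classifier needs its input as fixed-length numeric sequences. The sketch below shows only that input-preparation step, under stated assumptions: the padding and unknown identifiers, the helper names `build_vocabulary` and `encode`, and the example API calls are hypothetical, and the neural network itself is omitted. It is not taken from the MalDozer code.

```python
PAD_ID = 0      # assumed filler for sequences shorter than the target length
UNKNOWN_ID = 1  # assumed identifier for API methods unseen during training

def build_vocabulary(sequences):
    """Assign each distinct API method an integer identifier, starting at 2."""
    vocab = {}
    for sequence in sequences:
        for method in sequence:
            vocab.setdefault(method, len(vocab) + 2)
    return vocab

def encode(sequence, vocab, length):
    """Map API method names to identifiers, then truncate or pad to `length`."""
    ids = [vocab.get(method, UNKNOWN_ID) for method in sequence][:length]
    return ids + [PAD_ID] * (length - len(ids))

# Hypothetical API call sequences extracted from two training apps.
training = [
    ["TelephonyManager.getDeviceId", "SmsManager.sendTextMessage"],
    ["SmsManager.sendTextMessage", "URL.openConnection"],
]
vocab = build_vocabulary(training)

new_app = ["SmsManager.sendTextMessage", "Runtime.exec", "TelephonyManager.getDeviceId"]
print(encode(new_app, vocab, 5))
```

The resulting integer sequences are the kind of input an embedding layer followed by a sequence model would consume.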



Automatic Investigation Framework for Android Malware Cyber-Infrastructures [PDF]

In this paper, we present ToGather, an automatic investigation framework that takes Android malware samples as input and produces situational awareness about the malicious cyber-infrastructure of the families these samples belong to. ToGather leverages state-of-the-art graph theory techniques to generate actionable and granular intelligence for mitigating the threat posed by the malicious Internet activity of Android malware apps. We evaluate ToGather on real malware samples from various Android families, and the obtained results are very promising.




Dynamic Fingerprinting for the Automatic Detection of Android Malware using Natural Language Processing [PDF]

In this research, we propose DySign, a novel technique for fingerprinting the dynamic behaviors of Android malware. This is achieved by generating a digest from the dynamic analysis of known malware samples. It is important to mention that: (i) DySign fingerprints are approximations of the behaviors observed during dynamic analysis, which makes them resilient to small changes in the behaviors of future malware variants; and (ii) fingerprint computation is agnostic to the analyzed malware sample or family. DySign leverages state-of-the-art natural language processing (NLP) techniques to generate these fingerprints, which are then used to build an enhanced Android malware detection system with family attribution.
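One possible way to obtain an approximate, fixed-size digest from a dynamic analysis report is feature hashing over its tokens. The sketch below is an assumption chosen for illustration and is not necessarily the scheme DySign uses; the function names, the vector size, and the report strings are hypothetical.

```python
import hashlib
import math

def hashed_fingerprint(report, size=256):
    """Hash each token of the report into one of `size` counters."""
    vector = [0] * size
    for token in report.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % size
        vector[index] += 1
    return vector

def fingerprint_similarity(a, b):
    """Cosine similarity between two fingerprints of equal size."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical behavior traces: a known sample, a slightly changed variant,
# and an unrelated app.
known = hashed_fingerprint("open_socket send_sms read_contacts write_file load_dex exec_shell")
variant = hashed_fingerprint("open_socket send_sms read_contacts write_file load_dex get_location")
unrelated = hashed_fingerprint("draw_canvas play_sound read_settings vibrate")

print(fingerprint_similarity(known, variant))
print(fingerprint_similarity(known, unrelated))
```

The fingerprint has the same size whatever the sample or family, and a variant that changes only a few behaviors still lands close to the original, which mirrors properties (i) and (ii) above.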



Building Graph-Based Cyber-defense Infrastructure for Android Malware Detection [PDF]

In this research, we propose the Cypider framework, a set of techniques and tools for the systematic detection of mobile malware by building an efficient and scalable similarity network infrastructure of malicious apps. Our detection method is based on a novel concept, namely the malicious community, in which we consider, for a given family, the instances that share common features. Under this concept, we assume that multiple similar Android apps with different authors are most likely to be malicious. Cypider leverages this assumption to detect variants of known malware families as well as zero-day malware. It is important to mention that Cypider does not rely on signature-based or learning-based patterns. Instead, it applies community detection algorithms to the similarity network, which extracts sub-graphs considered as suspicious and most likely malicious communities. Furthermore, we propose a novel fingerprinting technique, namely the community fingerprint, based on a learning model for each malicious community. Cypider shows excellent results by detecting about 50% of the malware dataset in one detection iteration. In addition, the preliminary results of the community fingerprint are promising, with a detection rate of 87%.
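The similarity-network idea can be sketched in a few lines. The example below rests on several assumptions that are not from the paper: apps are described by hypothetical feature sets, similarity is Jaccard with an arbitrary threshold, and connected components stand in for a real community detection algorithm. The helper names and the sample data are illustrative only.

```python
from itertools import combinations

def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similarity_network(apps, threshold):
    """Link every pair of apps whose feature sets are similar enough."""
    graph = {name: set() for name in apps}
    for x, y in combinations(apps, 2):
        if jaccard(apps[x], apps[y]) >= threshold:
            graph[x].add(y)
            graph[y].add(x)
    return graph

def communities(graph, min_size=2):
    """Connected components, used here as a simple stand-in for community detection."""
    seen, result = set(), []
    for start in graph:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        if len(component) >= min_size:
            result.append(component)
    return result

def suspicious_communities(found, authors):
    """Keep communities whose similar apps come from more than one author."""
    return [c for c in found if len({authors[app] for app in c}) > 1]

# Hypothetical apps with their extracted features and signing authors.
apps = {
    "app_a": {"SEND_SMS", "READ_CONTACTS", "INTERNET", "lib_x"},
    "app_b": {"SEND_SMS", "READ_CONTACTS", "INTERNET", "lib_y"},
    "app_c": {"SEND_SMS", "READ_CONTACTS", "INTERNET", "lib_x", "lib_y"},
    "app_d": {"CAMERA", "INTERNET"},
    "app_e": {"CAMERA", "INTERNET", "FLASHLIGHT"},
}
authors = {"app_a": "dev1", "app_b": "dev2", "app_c": "dev3", "app_d": "dev4", "app_e": "dev4"}

found = communities(similarity_network(apps, threshold=0.5))
print(suspicious_communities(found, authors))
```

In this toy data, the two camera apps are similar but share one author, so only the three SMS-related apps from different authors are flagged.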



Fingerprinting Android packaging: Generating DNAs for malware detection [PDF]

In this research, we propose a novel and comprehensive fingerprinting approach for Android packages (APKs). The proposed fingerprint captures not only the binary features of the APK file, but also the underlying structure of the app. Furthermore, we leverage this fingerprinting technique to build ROAR, an automatic system for Android malware detection and family attribution. Our experiments show that the proposed fingerprint and the ROAR system achieve a precision of 95%.



Talks