This project uses machine learning to detect malicious files by analyzing features extracted from Portable Executable (PE) files. It combines feature extraction, entropy analysis, and classification to decide whether an executable file is legitimate or malicious.
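Entropy analysis of executables is commonly done by computing the Shannon entropy of a file's (or a section's) bytes; packed or encrypted data tends to score close to the 8 bits-per-byte maximum. The following is a minimal sketch under that assumption. The function name `shannon_entropy` is illustrative and is not taken from this project's code.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    # Probability of each byte value that actually occurs in the data.
    probabilities = (count / total for count in Counter(data).values())
    return sum(-p * math.log2(p) for p in probabilities)


# A run of identical bytes carries no uncertainty.
low = shannon_entropy(b"\x00" * 1024)
# Every byte value appearing equally often reaches the maximum.
high = shannon_entropy(bytes(range(256)) * 4)
print(low, high)
```

Here `low` evaluates to 0.0 and `high` to 8.0, the two ends of the scale.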
The project involves two main steps:
- Model Development: A machine learning model is trained on features extracted from PE files to classify them as either legitimate or malicious. The model achieves an accuracy of 99.3%.
- Automated Malware Classification Tool: A script built on the pefile library extracts features from a given PE file and classifies it as legitimate or malicious using the pre-trained model.
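The second step can be sketched roughly as follows. This is not the project's actual `malware.py`: the feature names and their order, the model file name `classifier.pkl`, the scikit-learn-style `predict` call, and the label encoding (1 = legitimate) are all assumptions made for illustration. The pefile and joblib imports are kept inside the functions that need them so the helper functions can be used without either package installed.

```python
# Illustrative sketch only: the feature set, its order, the model path, and
# the label encoding are assumptions, not the project's actual choices.
FEATURE_ORDER = [
    "NumberOfSections",
    "SizeOfCode",
    "AddressOfEntryPoint",
    "MeanSectionEntropy",
]


def extract_features(path):
    """Read a few header fields and section entropies from a PE file."""
    import pefile  # third-party package

    pe = pefile.PE(path)
    try:
        entropies = [section.get_entropy() for section in pe.sections]
        return {
            "NumberOfSections": pe.FILE_HEADER.NumberOfSections,
            "SizeOfCode": pe.OPTIONAL_HEADER.SizeOfCode,
            "AddressOfEntryPoint": pe.OPTIONAL_HEADER.AddressOfEntryPoint,
            "MeanSectionEntropy": (
                sum(entropies) / len(entropies) if entropies else 0.0
            ),
        }
    finally:
        pe.close()


def to_vector(features):
    """Arrange the feature dict in the column order the model expects."""
    return [features[name] for name in FEATURE_ORDER]


def classify(path, model_path="classifier.pkl"):
    """Load a saved classifier and label a single PE file."""
    import joblib  # third-party package

    model = joblib.load(model_path)
    # Assumes a scikit-learn-style estimator with a predict() method.
    prediction = model.predict([to_vector(extract_features(path))])[0]
    # Assumed encoding: 1 = legitimate, anything else = malicious.
    return "legitimate" if prediction == 1 else "malicious"
```

Under these assumptions, `classify("test.exe")` would return either `"legitimate"` or `"malicious"`. Keeping the feature order in one list means the extraction code and the trained model cannot silently disagree about column positions.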
- pefile: Python library used for extracting features from PE files.
- Python: Programming language used to implement the model and classification tool.
- Machine Learning: Used to train the classifier and to classify PE files.
- Joblib: Used for loading and saving the trained classifier model.
Before running the project, make sure you have Python 3.x installed on your system.
1. Clone this repository to your local machine:
git clone https://github.com/yourusername/malware-detection.git
cd malware-detection
2. Usage: To classify a given PE file as legitimate or malicious, use the following script:
python malware.py <path_to_PE_file>
3. Example:
python malware.py test.exe
This project shows how machine learning can be applied to malware detection through static analysis of PE files. The implemented model achieves 99.3% accuracy at distinguishing legitimate from malicious executables.
By automating the feature extraction process and integrating it with a trained classifier, this tool enhances cybersecurity defenses, enabling real-time malware detection. Future improvements could include incorporating dynamic analysis, expanding the dataset, and testing with different machine learning algorithms to further improve detection accuracy.