SQL-for-ML-Network-Intrusion-Detection

This project demonstrates how to use SQL for data preprocessing and feature engineering to train a machine learning model for network intrusion detection.

Data

The data used in this project is simulated network traffic from an enterprise network. It includes features such as source and destination IP addresses, ports, protocol, and timestamps. The data can be downloaded from: https://unsw-my.sharepoint.com/:f:/g/personal/z5025758_ad_unsw_edu_au/EnuQZZn3XuNBjgfcUu4DIVMBLCHyoLHqOswirpOQifr1ag?e=gKWkLS

SQL Scripts

The "sql" folder contains the following SQL scripts:

data_cleaning.sql: Cleans the raw data and handles missing values.
feature_engineering.sql: Creates new features from the raw data, including aggregations and time-series features.
data_export.sql: Exports the processed data in CSV format for model training.
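
As a rough illustration of the kind of work the cleaning and feature-engineering scripts do, the sketch below runs both steps through Python's built-in sqlite3 module. The table names (raw_traffic, clean_traffic), the column names, and the sample rows are assumptions made for this example and are not taken from the project's scripts.

```python
import sqlite3

# Hypothetical schema and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE raw_traffic (
        src_ip   TEXT,
        dst_ip   TEXT,
        dst_port INTEGER,
        protocol TEXT,
        ts       INTEGER  -- Unix timestamp in seconds
    )
""")
rows = [
    ("10.0.0.1", "10.0.0.9", 80, "tcp", 1000),
    ("10.0.0.1", "10.0.0.9", 443, "TCP", 1010),
    ("10.0.0.1", "10.0.0.8", 22, "tcp", 1075),
    ("10.0.0.2", "10.0.0.9", 53, None, 1020),
    ("10.0.0.2", "10.0.0.9", None, "udp", 1030),
]
conn.executemany("INSERT INTO raw_traffic VALUES (?, ?, ?, ?, ?)", rows)

# Cleaning (assumed rules): drop rows without a destination port,
# lower-case the protocol, and replace a missing protocol with 'unknown'.
conn.execute("""
    CREATE TABLE clean_traffic AS
    SELECT src_ip,
           dst_ip,
           dst_port,
           COALESCE(LOWER(protocol), 'unknown') AS protocol,
           ts
    FROM raw_traffic
    WHERE dst_port IS NOT NULL
""")

# Feature engineering: aggregate per source IP and per 60-second bucket.
features = conn.execute("""
    SELECT src_ip,
           ts / 60                  AS minute_bucket,
           COUNT(*)                 AS conn_count,
           COUNT(DISTINCT dst_port) AS distinct_ports,
           COUNT(DISTINCT dst_ip)   AS distinct_dsts
    FROM clean_traffic
    GROUP BY src_ip, ts / 60
    ORDER BY src_ip, minute_bucket
""").fetchall()

for row in features:
    print(row)
```

Counts of distinct ports and destinations per source within a short time window are only one possible choice of aggregate; the actual features in feature_engineering.sql may differ.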

Python Code

The "python" folder contains Python code for training a Random Forest model using scikit-learn.
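
A minimal sketch of training a Random Forest with scikit-learn is shown here for orientation. It is not the project's training code: the two feature columns (conn_count, distinct_ports), the synthetic data, and the hyperparameters are all assumptions chosen so the example runs without the exported CSV.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the exported feature table (hypothetical columns:
# conn_count, distinct_ports). Label 1 marks "malicious", 0 marks "benign".
rng = np.random.default_rng(0)
benign = rng.normal(loc=[5, 2], scale=1.0, size=(200, 2))
malicious = rng.normal(loc=[40, 25], scale=1.0, size=(200, 2))
X = np.vstack([benign, malicious])
y = np.array([0] * 200 + [1] * 200)

# Hold out 25% of the rows for evaluation, keeping the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit the forest and score it on the held-out rows.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"held-out accuracy: {accuracy:.3f}")
```

With real data, the synthetic arrays would be replaced by the feature columns and labels read from the exported CSV file.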

How to Run

1. Load the raw data into a SQL database (e.g., PostgreSQL, MySQL).
2. Run the SQL scripts in the following order: data_cleaning.sql, feature_engineering.sql, data_export.sql.
3. Use the exported CSV file to train the machine learning model using the Python code.
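
The steps above can be sketched end to end with Python's standard library, using an in-memory SQLite database in place of PostgreSQL or MySQL. The script bodies below are hypothetical stand-ins (the real scripts' contents are not reproduced here), and the raw_traffic schema, the label column, and the sample rows are assumptions. In a real database the export step might instead use a server-side feature such as PostgreSQL's COPY ... TO or MySQL's SELECT ... INTO OUTFILE.

```python
import csv
import io
import sqlite3

# Hypothetical stand-ins for the first two scripts, listed in run order.
PIPELINE = [
    ("data_cleaning.sql", """
        CREATE TABLE clean_traffic AS
        SELECT * FROM raw_traffic WHERE protocol IS NOT NULL;
    """),
    ("feature_engineering.sql", """
        CREATE TABLE features AS
        SELECT src_ip, COUNT(*) AS conn_count, MAX(label) AS label
        FROM clean_traffic
        GROUP BY src_ip;
    """),
]

# Hypothetical stand-in for data_export.sql: the query whose rows become the CSV.
EXPORT_QUERY = "SELECT src_ip, conn_count, label FROM features ORDER BY src_ip"


def run_pipeline(conn):
    """Run the scripts in order and return the exported rows as CSV text."""
    for _name, script in PIPELINE:
        conn.executescript(script)
    cur = conn.execute(EXPORT_QUERY)
    header = [col[0] for col in cur.description]
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(cur)
    return buf.getvalue()


# Step 1: load (made-up) raw data into the database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_traffic (src_ip TEXT, protocol TEXT, label INTEGER)")
conn.executemany(
    "INSERT INTO raw_traffic VALUES (?, ?, ?)",
    [
        ("10.0.0.1", "tcp", 0),
        ("10.0.0.1", "tcp", 0),
        ("10.0.0.2", "udp", 1),
        ("10.0.0.3", None, 0),
    ],
)

# Steps 2 and 3: run the scripts in order, then hand the CSV to the training code.
csv_text = run_pipeline(conn)
print(csv_text)
```

Keeping the scripts in an ordered list makes the dependency explicit: the feature step reads the table that the cleaning step creates, so running them out of order would fail.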

Results

The features engineered in this project improved the Random Forest model's accuracy in identifying malicious network activity by 12%.
