Internship Tasks & Outline

Daily Task Overview

  • Cleaned and formatted NetFlow V1-based network traffic dataset (NF-UNSW-NB15)
  • Dropped irrelevant features (e.g., IP addresses) to prevent data leakage
  • Engineered new features to improve classification performance
  • Built and trained baseline Random Forest models using Python and scikit-learn
  • Evaluated the models using precision, recall, F1-score, and ROC-AUC
  • Addressed class imbalance through downsampling of benign traffic
  • Performed cross-validation to test model stability and detect overfitting
  • Visualized feature importances and model metrics using Matplotlib and Seaborn
  • Created flow-based models using reconstructed IP address groupings
  • Designed layout and structure of an educational prototype web interface
  • Delivered weekly presentations outlining progress, challenges, and upcoming goals
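
The modelling tasks above (dropping IP addresses, downsampling benign traffic, training a Random Forest, cross-validating, and scoring with precision, recall, F1, and ROC-AUC) can be sketched roughly as follows. This is a minimal illustration, not the code used during the internship: it runs on synthetic data, and the column names (`IPV4_SRC_ADDR`, `IPV4_DST_ADDR`, `IN_BYTES`, `OUT_BYTES`, `IN_PKTS`, `Label`), the helper functions, the 1:1 benign-to-attack ratio, and all hyperparameters are assumptions chosen for the example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_recall_fscore_support, roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split


def make_synthetic_flows(n_rows=2000, attack_rate=0.05, seed=0):
    """Hypothetical stand-in for the real dataset; column names are assumed."""
    rng = np.random.default_rng(seed)
    label = (rng.random(n_rows) < attack_rate).astype(int)
    return pd.DataFrame({
        "IPV4_SRC_ADDR": [f"10.0.0.{i % 250}" for i in range(n_rows)],
        "IPV4_DST_ADDR": [f"10.0.1.{i % 250}" for i in range(n_rows)],
        "IN_BYTES": rng.normal(500, 100, n_rows) + 300 * label,
        "OUT_BYTES": rng.normal(400, 100, n_rows) + 200 * label,
        "IN_PKTS": rng.integers(1, 50, n_rows),
        "Label": label,  # 1 = attack, 0 = benign
    })


def drop_identifier_columns(df, columns=("IPV4_SRC_ADDR", "IPV4_DST_ADDR")):
    """Remove host identifiers so the model cannot memorise specific IPs."""
    return df.drop(columns=[c for c in columns if c in df.columns])


def downsample_benign(df, label_col="Label", ratio=1.0, seed=0):
    """Keep every attack row and at most `ratio` benign rows per attack row."""
    attacks = df[df[label_col] == 1]
    benign = df[df[label_col] == 0]
    n_keep = min(len(benign), int(len(attacks) * ratio))
    kept = benign.sample(n=n_keep, random_state=seed)
    mixed = pd.concat([attacks, kept]).sample(frac=1.0, random_state=seed)
    return mixed.reset_index(drop=True)


flows = drop_identifier_columns(make_synthetic_flows())

# Split first, then balance only the training part, so the held-out
# test set keeps the original class distribution.
train, test = train_test_split(
    flows, test_size=0.3, stratify=flows["Label"], random_state=0
)
balanced = downsample_benign(train)
X_train, y_train = balanced.drop(columns="Label"), balanced["Label"]
X_test, y_test = test.drop(columns="Label"), test["Label"]

model = RandomForestClassifier(n_estimators=100, random_state=0)

# Stratified 5-fold cross-validation on the balanced training data.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
cv_f1 = cross_val_score(model, X_train, y_train, cv=cv, scoring="f1")

model.fit(X_train, y_train)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_test, model.predict(X_test), average="binary", zero_division=0
)
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
importances = pd.Series(model.feature_importances_, index=X_train.columns)

print(f"CV F1: {cv_f1.mean():.3f} +/- {cv_f1.std():.3f}")
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f} auc={auc:.3f}")
print(importances.sort_values(ascending=False))
```

Balancing only the training split is a design choice in this sketch: it keeps the test metrics representative of the imbalanced traffic the model would see in practice. The `importances` series could be passed to a Matplotlib or Seaborn bar plot to produce the kind of feature-importance chart mentioned in the list.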