Network Traffic Classification (TC), which identifies the application responsible for generating a given traffic flow, is essential for managing modern networks because it enables effective resource allocation and service differentiation. However, TC faces growing challenges due to encryption, traffic diversity, inference-efficiency constraints, and the scarcity of labeled data. While machine learning offers a promising approach by learning patterns from limited information, it remains vulnerable to distribution shifts and is constrained by the limited availability of public annotated datasets.
This dissertation tackles these challenges through three main contributions. First, we benchmark 18 hand-crafted augmentations for supervised TC and integrate them into contrastive learning to handle label-scarce scenarios, demonstrating improved performance and generalization. Second, to deepen our understanding of state-of-the-art generative models, we study diffusion models in the machine learning domain, focusing on the conditional text-to-image generation task. We propose novel mutual information estimators using pretrained diffusion and rectified flow models, and apply them in self-supervised fine-tuning to enhance text-to-image alignment without external models or annotations. Third, we advance generative modeling for network traffic by building a standardized benchmarking framework that covers datasets, preprocessing, baselines, and evaluation, and by developing a diffusion model for packet series that outperforms existing generative models in fidelity and downstream utility.
By addressing these areas, this dissertation advances TC under data scarcity, improves alignment in conditional generative models, and provides benchmarks for generative modeling of traffic, laying a foundation for more robust and adaptable machine learning systems in networking contexts.