IPTA 2024, 13th International Conference on Image Processing Theory, Tools and Applications, 14-17 October 2024, Rabat, Morocco
      
  In event-based vision, action recognition remains a significant challenge that pushes the boundaries of advanced computational models. This paper compares three cutting-edge architectures (Spiking Neural Networks, Graph Convolutional Neural Networks, and Video Transformer-based Networks) to determine their effectiveness in this domain. Our study extends beyond accuracy, focusing on the nature of each model's errors and on its complexity. We observed that these models, while achieving comparable accuracies, tend to make different types of mistakes. Capitalizing on this complementary error phenomenon, we leverage their strengths by proposing an ensemble learning strategy, which improves overall performance. Moreover, we further investigated the Video Transformer model, retraining it on subsets of data that the Spiking Neural Network misclassified. This resulted in higher accuracy than training on the subsets the Spiking Neural Network classified correctly, highlighting the Transformer’s ability to learn differently and complement the Spiking Network’s weaknesses. Our findings challenge the sole focus on accuracy as a measure of model efficacy, emphasizing the significance of error analysis. This research provides a roadmap for model selection in event-based vision tasks and introduces innovative ways to integrate these models.
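The complementary-error idea can be illustrated with a minimal sketch. The abstract does not specify how the ensemble combines its members, so the majority vote below is an assumption chosen for simplicity, not the paper's strategy. The function names (`error_set`, `error_overlap`, `majority_vote`) and the toy predictions are hypothetical and contain no results from the paper.

```python
from collections import Counter

def error_set(preds, labels):
    """Indices of the samples a model misclassified."""
    return {i for i, (p, y) in enumerate(zip(preds, labels)) if p != y}

def error_overlap(preds_a, preds_b, labels):
    """Jaccard overlap of two models' error sets (1.0 means identical mistakes)."""
    ea, eb = error_set(preds_a, labels), error_set(preds_b, labels)
    union = ea | eb
    return len(ea & eb) / len(union) if union else 1.0

def majority_vote(*model_preds):
    """Per-sample majority vote; ties go to the earliest-listed model."""
    voted = []
    for votes in zip(*model_preds):
        counts = Counter(votes)
        best = max(counts.values())
        voted.append(next(v for v in votes if counts[v] == best))
    return voted

def accuracy(preds, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

# Toy predictions (invented for illustration): each model is wrong on
# two of six samples, but never on the same samples as another model.
labels = [0, 1, 2, 1, 0, 2]
snn = [0, 1, 2, 1, 1, 0]          # hypothetical spiking network, errors at 4 and 5
gcn = [0, 1, 0, 0, 0, 2]          # hypothetical graph network, errors at 2 and 3
transformer = [1, 2, 2, 1, 0, 2]  # hypothetical video transformer, errors at 0 and 1

ensemble = majority_vote(snn, gcn, transformer)
print(accuracy(snn, labels), error_overlap(snn, gcn, labels), accuracy(ensemble, labels))
```

In this toy case the three models have equal accuracy and disjoint error sets, so the vote corrects every individual mistake. With real models the error sets usually overlap in part, and the gain from voting shrinks as the overlap grows.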
Type:
        Conference
      City:
        Rabat
      Date:
        2024-10-14
      Department:
        Digital Security
      Eurecom Ref:
        7874
      Copyright:
        © 2024 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.