Dual Siamese Channel Attention Networks for Visual Object Tracking
Abstract
Siamese-network-based trackers have achieved remarkable performance in visual object tracking. The target position is determined from the similarity map produced by cross-correlating the features generated by the template branch and the search branch. Interaction between the two branches is essential for high-performance tracking, yet previous works neglect it because the features of the two branches are computed separately. In this paper, we propose Dual Siamese Channel Attention Networks, referred to as SiamDCA, which exploit channel attention to further improve tracking robustness. First, a convolutional version of Squeeze-and-Excitation Networks (CSENet) is embedded in the backbone to explicitly model interdependencies between channels and adaptively recalibrate channel-wise feature responses. Second, we propose a novel Global Channel Enhancement (GCE) module, which captures the attention weight of each channel in the template branch and uses these weights to normalize the channel characteristics in the search branch. Experiments on the OTB2015, VOT2016, and UAV123 benchmarks show that our algorithm performs competitively against other state-of-the-art trackers.
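To make the pipeline described above concrete, the following is a minimal NumPy sketch of template-guided channel reweighting followed by cross-correlation. It is an illustration under stated assumptions, not the SiamDCA implementation: the function names (`channel_weights`, `cross_correlate`, `track_step`) are hypothetical, and the attention weights are taken to be a sigmoid over globally average-pooled template features, with no learned layers, since the abstract does not specify the internals of CSENet or GCE.

```python
import numpy as np


def channel_weights(template_feat):
    """Return one weight per channel from template features of shape (C, H, W).

    Assumption: global average pooling followed by a sigmoid stands in for
    the learned channel attention; a real module would have trainable layers.
    """
    squeezed = template_feat.mean(axis=(1, 2))   # shape (C,)
    return 1.0 / (1.0 + np.exp(-squeezed))       # values in (0, 1)


def cross_correlate(template_feat, search_feat):
    """Slide the template over the search features, summing over all channels."""
    _, th, tw = template_feat.shape
    _, sh, sw = search_feat.shape
    response = np.zeros((sh - th + 1, sw - tw + 1))
    for i in range(response.shape[0]):
        for j in range(response.shape[1]):
            window = search_feat[:, i:i + th, j:j + tw]
            response[i, j] = np.sum(window * template_feat)
    return response


def track_step(template_feat, search_feat):
    """Reweight search channels with template-derived weights, then correlate."""
    weights = channel_weights(template_feat)
    reweighted = search_feat * weights[:, None, None]  # broadcast over H and W
    response = cross_correlate(template_feat, reweighted)
    peak = np.unravel_index(np.argmax(response), response.shape)
    return response, peak
```

For a template of shape (C, 3, 3) and a search region of shape (C, 8, 8), `track_step` returns a 6x6 similarity map and the row and column of its maximum, which a tracker would map back to a position in the search image.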