Abstract: Audio-visual event (AVE) localization aims to localize the temporal boundaries of events that contain both visual and audio content, and to identify the event categories in unconstrained videos.
Abstract: Place recognition plays an important role in multi-robot collaborative perception, such as aerial-ground search and rescue, where it allows the robots to identify places that they have already visited. Recently, ...