2023.10.2

Escaping local minima in deep reinforcement learning for video summarization

State-of-the-art deep neural methods for unsupervised video summarization mostly fall under the adversarial reconstruction framework, which employs a Generative Adversarial Network (GAN) structure and Long Short-Term Memory (LSTM) auto-encoders during training. The typical result is a selector LSTM that sequentially receives video frame representations and outputs corresponding scalar importance factors, which are then used to select key-frames. This basic approach has been augmented with an additional Deep Reinforcement Learning (DRL) agent, trained using the Discriminator’s output as a reward, which learns to optimize the selector’s outputs. However, local minima are a well-known problem in DRL. This paper therefore presents a novel regularizer for escaping local loss minima, in order to improve unsupervised key-frame extraction. It is an additive loss term, employed during a second training phase, that rewards the difference between the neural agent’s parameters and those of a previously found good solution. It thus encourages the training process to explore the parameter space more aggressively in order to discover a better local loss minimum. Evaluation on two public datasets shows considerable performance gains over both the baseline and the state of the art.
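The idea of an additive term that rewards distance from an earlier solution can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: the Euclidean distance as the measure of parameter difference, the hypothetical `weight` coefficient, and the function names `parameter_distance` and `regularized_loss`. The paper's actual regularizer may differ.

```python
import math


def parameter_distance(params, anchor):
    """Euclidean distance between two flat parameter vectors (assumed metric)."""
    if len(params) != len(anchor):
        raise ValueError("parameter vectors must have the same length")
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(params, anchor)))


def regularized_loss(base_loss, params, anchor, weight=0.1):
    """Base loss plus an additive term that rewards distance from the anchor.

    `anchor` stands for the parameters of a previously found good solution.
    The distance is subtracted, so moving away from the anchor lowers the
    total loss. `weight` is a hypothetical coefficient, not from the source.
    """
    return base_loss - weight * parameter_distance(params, anchor)


# Parameters found in a first training phase (illustrative values only).
anchor = [0.5, -1.0, 2.0]

# Two candidates with the same base loss: one at the anchor, one far from it.
near = [0.5, -1.0, 2.0]
far = [3.5, 3.0, 2.0]

loss_near = regularized_loss(1.0, near, anchor)
loss_far = regularized_loss(1.0, far, anchor)
print(loss_near, loss_far)
```

With equal base losses, the candidate farther from the anchor receives the lower total loss, which is the pressure that would push a second training phase away from the previously found minimum. In practice the weight would have to be bounded or scheduled so that the distance term does not dominate the base loss.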
