On 10-15 January 2021, AI4Media will organise a workshop at ICPR 2020 on “Multi-Modal Deep Learning: Challenges and Applications”.
Deep learning is now recognized as one of the key software engines driving the new industrial revolution. The majority of current deep learning research has been dedicated to single-modal data processing, with deep learning-based visual recognition and speech recognition as prominent examples. Although significant progress has been made, single-modal data is often insufficient to derive accurate and robust deep models in many applications. Our digital world is by nature multi-modal, combining different modalities of data such as text, audio, images, animations, videos, and interactive content. Multi-modal content is the most common form of information representation and delivery: posts about trending social events, for example, typically combine textual descriptions, images, and videos, and medical diagnosis often relies on the joint use of medical imaging and textual reports. Humans naturally draw on multiple modalities to make accurate perceptions and decisions. Multi-modal deep learning, which learns from information presented in multiple modalities and makes predictions based on multi-modal input, is therefore much in demand.
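To make the idea concrete, the sketch below shows one common multi-modal design, late fusion, in which each modality is encoded separately and the embeddings are concatenated before prediction. This is purely illustrative and not part of the workshop material; the model, the feature dimensions (e.g., 2048-d image features, 768-d text features), and the class count are hypothetical assumptions.

```python
import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    """Illustrative late-fusion model: each modality is encoded
    separately, then the embeddings are concatenated and classified
    jointly. Dimensions are hypothetical placeholders."""

    def __init__(self, image_dim=2048, text_dim=768, hidden_dim=256, num_classes=10):
        super().__init__()
        # Modality-specific encoders (in practice these would sit on top of
        # pretrained CNN / language-model features)
        self.image_encoder = nn.Sequential(nn.Linear(image_dim, hidden_dim), nn.ReLU())
        self.text_encoder = nn.Sequential(nn.Linear(text_dim, hidden_dim), nn.ReLU())
        # Joint head operating on the fused (concatenated) representation
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, image_feats, text_feats):
        fused = torch.cat(
            [self.image_encoder(image_feats), self.text_encoder(text_feats)],
            dim=-1,
        )
        return self.classifier(fused)

# Toy usage: a batch of 4 samples with precomputed per-modality features
model = LateFusionClassifier()
logits = model(torch.randn(4, 2048), torch.randn(4, 768))
print(logits.shape)  # torch.Size([4, 10])
```

Late fusion is only one point in the design space; the workshop topics below also cover earlier or more deeply intertwined integration of modalities.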
This workshop calls for scientific works that illustrate the most recent progress in multi-modal deep learning, in particular multi-modal data capture, integration, modeling, understanding, and analysis, and how to leverage multi-modal data to derive accurate and robust AI models in a wide range of applications. This is a timely topic given the rapid development of deep learning technologies and their remarkable applications in many fields. The workshop will serve as a forum bringing together active researchers and practitioners to share their recent advances in this exciting area. In particular, we solicit original, high-quality contributions that (1) present state-of-the-art theories and novel application scenarios related to multi-modal deep learning; (2) survey the recent progress in this area; and (3) develop benchmark datasets and evaluations. We welcome researchers from various communities (e.g., visual computing, machine learning, multimedia analysis, distributed and cloud computing) to submit their novel results.
More information at:
https://medical-and-multimedia-lab.github.io/MMDLCA2020/