Overview

Multimodal cognitive computing is an emerging research topic in artificial intelligence, driven by the explosion of large-scale multimodal data. It aims to break the traditional boundary between neuroscience and computer science in the field of multi-modality, paving the way for machines with multi-sensory abilities analogous to the human brain. For example, sight and hearing, two of the most important senses for human perception, yield slightly discrepant signals, yet the resulting percept is unified through multi-sensory integration. Moreover, when multiple senses provide input, humans typically perform integrative perception by jointly processing the modalities, which helps them accomplish tasks more accurately and efficiently. By contrast, research over the past decades has mostly concentrated on unimodal perception and achieved considerable progress in individual modalities, such as visual object detection, acoustic speech recognition, natural language dialogue, and tactile object grasping, while paying little attention to multi-modality. Hence, to approach the human case, machines are expected to acquire this computational multi-modality ability, termed multimodal cognitive computing.

A vivid example of the human multi-modality ability comes from a poem: "It could be heard that the night view is fragrant and warm, while observed that the string sound is bitter and cold." In reality, "night", "fragrance", and "warmth" cannot literally be heard, and "string sound", "bitterness", and "coldness" cannot literally be seen; yet the cross-modal fusion of the senses yields new perception and expression. This typical multi-sensory ability of human beings is known as synesthesia, a form of joint sensing such as seeing colors when listening to music, or experiencing tastes when hearing or speaking words. For images, videos, audio, and text, traditional data processing mainly operates at the levels of signal, feature, and semantics, namely perception and feeling, while overlooking synesthesia and the connections among modalities. Given this evident gap in multi-modality processing, this special issue intends to explore the cognitive mechanisms of multimodal fusion and to focus on multimodal cognitive computing, including key scientific issues concerning perceptual reasoning, semantic association, and collaborative learning, together with their applications.

Achieving an effective multi-modality ability is challenging for computational models, as it is an interdisciplinary research and application field that draws on methods from psychology, biology, signal processing, physics, information theory, mathematics, and statistics. In turn, the development of multimodal cognitive computing will cross-fertilize the research areas with which it interacts. Recently, the rapid development of multimodal machine learning techniques and the availability of large-scale, low-cost multimodal data have offered promising opportunities, yet many open problems remain to be defined and addressed. This special issue tackles these problems from the perspectives of both academia and industry, and focuses on new foundations and technologies intrinsic to multimodal cognitive computing.

Scope

This special issue aims to promote cutting-edge research on multimodal cognitive computing, from the neuroscience of humans to computational models for machines, and to offer a timely collection of works that benefit researchers and practitioners, including but not limited to the acquisition of different modalities (e.g., vision, olfaction, tactility), the modeling of and interaction among multimodal data, and open problems of interpretability, fairness, and trustworthiness. It serves as a forum for researchers worldwide to discuss their work and recent advances in this field. We welcome high-quality original submissions addressing both novel theoretical and practical aspects of multimodal cognitive computing, and we especially encourage technical reviews and discussions of existing and potential research topics. All submitted papers will be peer-reviewed and selected on the basis of both their quality and their relevance to the theme of this special issue.

Topics of interest include (but are not limited to)

Paper submission and review

Important Dates

Guest Editors

Prof. Dr. Xuelong Li
Professor, Northwestern Polytechnical University, China
Email: li@nwpu.edu.cn
Prof. Dr. Di Hu
Assistant Professor (tenure-track), Renmin University of China
Email: dihu@ruc.edu.cn