Big Data technology has been one of the key engines driving the new industrial revolution. However, the majority of current Big Data research efforts have been devoted to single-modal data analysis, and processing each modality in isolation leaves a substantial performance gap compared to joint multimodal analysis. Although significant progress has been made, single-modal data is often insufficient to derive accurate and robust models in many applications. Multimodality is the most general form of information representation and delivery in the real world, and multimodal data analytics often outperforms its single-modal counterpart on many real-world problems. With the rapid development of Big Data technology and its remarkable applications across many fields, multimodal Big Data is a timely topic. This workshop aims to generate momentum around this topic of growing interest and to encourage interdisciplinary interaction and collaboration among the Natural Language Processing (NLP), computer vision, audio processing, machine learning, multimedia, robotics, Human-Computer Interaction (HCI), cloud computing, Internet of Things (IoT), and geospatial computing communities. It serves as a forum that brings together active researchers and practitioners from academia and industry to share their recent advances in this promising area.

MMAI 2023 Final Accepted Papers & Program Schedule

Dec. 16, 2023, Italy (GMT+1)


Virtually: Please join the IEEE Big Data Workshop - MMAI 2023 using the link that was sent to you.

Physically: Hilton Sorrento Palace, Conference Room - Nettuno 3

Type | Pages | Paper Title | Author(s)
Session I (Italy Time 10:30-12:30)
Opening Remarks
Full | 10 | LoRA-like Calibration for Multimodal Deception Detection using ATSFace Data | Shun-Wen Hsiao and Cheng-Yuan Sun
Full | 10 | Multimodal Large Language Models: A Survey | Jiayang Wu, Wensheng Gan, Zefeng Chen, Shicheng Wan, and Philip S. Yu
Short | 6 | Semantic Prompt Based Multi-Scale Transformer for Few-Shot Classification | Hongwu Liu, Shouhong Wan, Peiquan Jin, and Xin Wang
Short | 8 | Impact of Mixed Multimodalities and Size Dependence on Performance of Object Detection on Multimodal Satellite Imagery | Yuri Gordienko, Nikita Gordienko, Oleksandr Rokovyi, Oleg Alienin, Andrii Polukhin, and Sergii Stirenko
Extended Abstract | 5 | Neural Crystals | Sofia Karamintziou, Thanassis Mavropoulos, Dimos Ntioudis, Georgios Meditskos, Stefanos Vrochidis, and Ioannis (Yiannis) Kompatsiaris
Poster | 4 | CLIP-PubOp: A CLIP-based Multimodal Representation Fusion Method for Public Opinion | Zhibo Wang, Yi Guo, and Jiaojiao Fu
Poster | 4 | Audio-visual Neural Face Generation with Emotional Stimuli | Juheon Hwang and Jiwoo Kang
Lunch Break
Session II (Italy Time 14:00-16:00)
Short | 8 | De-SaTE: Denoising Self-attention Transformer Encoders for Li-ion Battery Health Prognostics | Gaurav Shinde, Rohan Mohapatra, Pooja Krishan, and Saptarshi Sengupta
Short | 6 | Multimodal One-class Learning for Malicious Online Content Detection | Roberto Corizzo, Nora Lewis, Lucas P. Damasceno, Allison Shafer, Charles C. Cavalcante, and Zois Boukouvalas
Full | 10 | Exploiting Multiple Sequence Lengths in Fast End to End Training for Image Captioning | Jia Cheng Hu, Roberto Cavicchioli, and Alessandro Capotondi
Short | 8 | Evaluating CLIP: Understanding on Relationships in a Blocks World | Kairui Zhang and Martha Lewis
Short | 8 | New Finger Photo Databases with Presentation Attacks and Demographics | Anudeep Vurity and Emanuela Marasco
Short | 8 | Understanding the Language of ADHD and Autism Communities on Social Media | Niloofar Kalantari, Amirreza Payandeh, Marcos Zampieri, and Vivian Genaro Motti
Poster | 4 | Gender Classification Accuracy via Two-Dimensional Body Joints using Convolutional Neural Networks | Cheng-En Sung and Nada Attar
Coffee Break
Session III (Italy Time 16:00-18:00)
Extended Abstract | 5 | Late Fusion-based Distributed Multimodal Learning | Flavio Giobergia and Elena Baralis
Poster | 4 | A Supervised Autoencoder for Human Activity Recognition with Inertial Sensors | Jaehyeok An, Younghoon Kwon, and Yoon-Sik Cho
Extended Abstract | 5 | Using the CARLA Simulator to Train A Deep Q Self-Driving Car to Control a Real-World Counterpart on A College Campus | Joseph May, Khem Poudel, Samir Poudel, Sammi Hamdan, and Jorge Vargas
Full | 10 | Enhancing Scientific Image Classification through Multimodal Learning: Insights from Chest X-Ray and Atomic Force Microscopy Datasets | David Meshnick, Nahal Shahini, Debargha Ganguly, Yinghui Wu, Roger French, and Vipin Chaudhary
Short | 6 | Character-based Outfit Generation with Vision-augmented Style Extraction via LLMs | Najmeh Forouzandehmehr, Yijie Cao, Nikhil Thakurdesai, Ramin Giahi, Luyi Ma, Nima Farrokhsiar, Jianpeng Xu, Evren Korpeoglu, and Kannan Achan
Short | 8 | Predicting Potential School Shooters from Social Media Posts | Alana Cedeno, Rachel Liang, and Sheikh Rabiul Islam
Closing Remarks

*The program schedule is subject to change.