Multimodal learning analytics (MMLA), which has become increasingly popular, can help provide an accurate understanding of learning processes. However, it is still unclear how multimodal data is integrated into MMLA. By following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, this paper systematically surveys 346 articles on MMLA published during the past three years. For this purpose, we first present a conceptual model for reviewing these articles from three dimensions: data types, learning indicators, and data fusion. Based on this model, we then answer the following questions: 1. What types of data and learning indicators are used in MMLA, together with their relationships; and 2. What are the classifications of the data fusion methods in MMLA. Finally, we point out the key stages in data fusion and the future research direction in MMLA. Our main findings from this review are (a) The data in MMLA are classified into digital data, physical data, physiological data, psychometric data, and environment data; (b) The learning indicators are behavior, cognition, emotion, collaboration, and engagement; (c) The relationships between multimodal data and learning indicators are one-to-one, one-to-any, and many-to-one. The complex relationships between multimodal data and learning indicators are the key for data fusion; (d) The main data fusion methods in MMLA are many-to-one, many-to-many and multiple validations among multimodal data; and (e) Multimodal data fusion can be characterized by the multimodality of data, multi-dimension of indicators, and diversity of methods.