Music has long been present in human life, creating social connectedness, changing people's minds and moods, and producing different reactions in the human body. These reactions depend on personal experiences, body awareness, and physical stimuli (auditory, visual, or tactile). For instance, when people listen to a song, they may unconsciously move their feet in synchrony with the rhythm. This synchronization between body and stimuli is fundamental to acquiring musical abilities, whether rhythmic or melodic. Nowadays, technology supports the acquisition of music skills through the interaction between users and devices, opening the possibility of enhancing music engagement as well as cognitive, motor, affective, and social skills. Today, many people have their first contact with music through technologies such as video games (e.g., Guitar Hero, Rock Band). One advantage of video games as a learning tool is the freedom developers have to add, modify, or suppress certain stimuli. A video game can therefore create different interactions, movements, and sounds in which the music guides the player to take a specific action. Understanding the relationship between music and video games will help in designing effective and efficient games for music learning. However, it is less clear what effects the interactivity, movement, and sounds of video games have on music perception. There is thus a disconnect between video game design and embodied music cognition: the cognitive, sensorimotor, and social dispositions and capabilities of human beings are sometimes not considered. The goal of this thesis, and its leading research questions, is to explore how the different stimuli emitted by a music video game change players' performance, and how the game elements of a video game can improve music learning and the user experience.
This thesis seeks to understand the bodily reactions of players while playing music video games with different stimuli, and to analyze game elements and mechanics for music learning. Two music video games are presented: one based on rhythm and one based on pitch recognition. Two case studies use the rhythm game to analyze players' reaction times and user experience with auditory, visual, or tactile stimuli. For the pitch game, another case study compares the user experience of players using a video game with that of players using a web application. The results show that players' performance depends on the game elements rather than on the stimulus modality; nevertheless, auditory stimuli enhance performance more than the other modalities. Moreover, a possible learning effect was observed after some trials (lives), and a fatigue effect was found during gameplay. Adequate game elements and mechanics can engage users in the activity at hand. In conclusion, players' perception in video games focuses mainly on what the game shows visually; however, the other stimuli also impact performance. The design of music game-based learning must find a balance between game elements and mechanics on the one hand and music perception on the other, so that players can acquire music skills during gameplay.