Hazim Adfarid bin Haji Maz Adanan

Automated Lip-Sync Framework for Video Game / Hazim Adfarid bin Haji Maz Adanan. - Bandar Seri Begawan : Universiti Teknologi Brunei, ©2022. - xii, 85 pages : illustrations ; 30 cm

Submitted in fulfilment of the requirements for the degree of Master of Science

Abstract

Lip-syncing in 3D animation is the process of matching the shape of a character's lips to the character's speech. In general, this is a tedious process that requires a great deal of time. In video game development involving 3D characters, with storylines and narratives in which many characters talk during a scene or gameplay, lip-syncing each character for every line of speech takes too much time. To solve this problem, game developers often avoid lip-syncing altogether or create an algorithm that automates the process. While such automated lip-sync algorithms work as intended, the result is often a character that looks robotic and not quite believable. By believable, we mean that the 3D character should look like an actual animated character with convincing lip-syncing motion rather than a puppet flapping its jaw. As game developers telling a story that involves talking characters, we want our audience, the players, to feel immersed in the video game, and one of the key factors is making the characters look believable when they talk.

In this thesis, we propose an automated lip-sync framework for the Unreal Engine 4 game engine. The framework automates the process of lip-syncing from speech signals while providing decent lip-syncing animation quality. It is constructed using Blueprint scripting within Unreal Engine 4 and requires only that the animator provide a 3D character, the corresponding viseme animation files, and an audio voice file; the framework then animates the lip shapes to match the given audio speech in real time. The method does not use machine learning, which keeps it lightweight and suited to video game environments. The framework is designed to be modular: it can be reused and duplicated across multiple characters, and it supports audio input with diverse speech patterns, including different dynamic ranges, pitches, tempos, and languages. Furthermore, since the framework is built within Unreal Engine 4 itself, it requires no external software to work; this is convenient for video game developers who use Unreal Engine 4, the particular engine for which the framework was made. This thesis covers how our automated lip-sync framework was built from the ground up and what components were needed. It is also written to be informative for future researchers who wish to pursue video game development that involves lip-syncing animation for 3D characters.
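The thesis implements the pipeline with Unreal Engine 4 Blueprints, but the core idea it describes, driving viseme animation from the incoming audio signal in real time without machine learning, can be sketched in plain C++. The function names, gain, and smoothing constant below are illustrative assumptions, not details from the thesis; the sketch reduces the problem to a single "jaw open" weight derived from frame loudness.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// RMS amplitude of one frame of PCM samples in [-1, 1].
double frameRms(const std::vector<double>& frame) {
    if (frame.empty()) return 0.0;
    double sum = 0.0;
    for (double s : frame) sum += s * s;
    return std::sqrt(sum / static_cast<double>(frame.size()));
}

// Map the frame's RMS to a viseme weight in [0, 1], with exponential
// smoothing so the mouth does not flicker between frames.
// `gain` and `alpha` are illustrative tuning constants.
double visemeWeight(double rms, double prevWeight,
                    double gain = 4.0, double alpha = 0.3) {
    double target = std::min(1.0, rms * gain);
    return prevWeight + alpha * (target - prevWeight);
}
```

A full framework would distinguish several visemes (for example by comparing energy in different frequency bands), but the same pattern applies: extract a lightweight feature per audio frame each tick, then blend the matching viseme pose on the character.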

Includes bibliographical references




Computer graphics--Technological innovations
Computer animation
Video games--Design and construction