NeuroSync Open Source Audio2Face Blendshape Transformer Model
Info Sheet
23/02/2025 - Model Update
45 additional epochs trained on the open source dataset. Half-precision inference has been added to the local API; it can be disabled in the config to run full precision.
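As a minimal sketch of what a half-precision toggle like this can look like (the flag name, stand-in model, and shapes below are illustrative, not the local API's actual config keys or code):

```python
import torch
import torch.nn as nn

USE_HALF_PRECISION = True  # illustrative config flag; the local API's actual key may differ

device = "cuda" if torch.cuda.is_available() else "cpu"

# Stand-in model: any nn.Module mapping audio feature frames to blendshape frames.
model = nn.Linear(256, 61).to(device).eval()

# Dummy batch of audio feature frames: (batch, frames, feature_dim).
audio_features = torch.randn(1, 128, 256, device=device)

# Half precision only makes sense on the GPU; fall back to full precision otherwise.
if USE_HALF_PRECISION and device == "cuda":
    model = model.half()
    audio_features = audio_features.half()

with torch.no_grad():
    blendshapes = model(audio_features)

print(blendshapes.shape, blendshapes.dtype)
```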
Latest Updates
21/02/2025 Scaling UP! | New 228m parameter model + config added
A milestone has been reached: previous research has brought us to the point where scaling the model up is now possible, with much faster training and better quality overall.
Going from 4 layers and 4 heads to 8 layers and 16 heads means updating both your code and your model. Please ensure you have the latest versions of the API and Player, as the new model requires some architectural changes.
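Roughly, the change amounts to the following (a sketch only; the key names here are illustrative, not the repo's exact config fields):

```python
# Field names are illustrative; match them to the repo's model.py / config.
OLD_MODEL = {"num_layers": 4, "num_heads": 4}    # previous architecture
NEW_MODEL = {"num_layers": 8, "num_heads": 16}   # ~228m-parameter model (21/02/2025)

# A checkpoint saved with the new shape will not load into code built for the old
# shape, which is why both the API and the Player need the updated model code.
```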
Enjoy!
19/02/2025 Trainer updates
- Trainer: Use NeuroSync Trainer Lite for training and fine-tuning.
- Simplified Loss: Removed the second-order smoothness loss (the code is left in if you want to research the differences; mostly it just squeezes the end result, producing choppy animation without smoothing).
- Mixed Precision: Lower memory usage and faster training.
- Data Augmentation: Interpolate a slow set and a fast set from your data to help with fine-detail reproduction. This uses a lot of memory, so take care; generally just adding the fast set is best, since adding the slow set over-saturates the data with slow and noisy samples (more work to do here). See the sketch below.
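A rough illustration of the slow/fast interpolation idea, assuming the audio features and blendshape frames are aligned (T, D) NumPy arrays; the resampling factors and function names are illustrative, not the trainer's exact implementation:

```python
import numpy as np

def time_resample(frames: np.ndarray, factor: float) -> np.ndarray:
    """Linearly interpolate a (T, D) sequence to int(T * factor) frames."""
    t_old = np.linspace(0.0, 1.0, num=frames.shape[0])
    t_new = np.linspace(0.0, 1.0, num=int(frames.shape[0] * factor))
    return np.stack(
        [np.interp(t_new, t_old, frames[:, d]) for d in range(frames.shape[1])],
        axis=1,
    )

# Aligned dummy pair: audio feature frames and blendshape frames.
audio = np.random.rand(200, 256)
faces = np.random.rand(200, 61)

# "Fast" set (shorter / quicker motion) -- generally the more useful addition.
audio_fast, faces_fast = time_resample(audio, 0.75), time_resample(faces, 0.75)

# "Slow" set (longer / slower motion) -- adding too much of this over-saturates the data.
audio_slow, faces_slow = time_resample(audio, 1.5), time_resample(faces, 1.5)
```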
17/02/2025 - Player LLM Streaming + Chunking Updates
- Talk to a language model with audio and face animation response
- Player: Download the NeuroSync Player.
16/02/2025 - Demo Unreal Project Build
- Demo Build: Download the demo build to test NeuroSync with an Unreal Project.
- Player: Download the NeuroSync Player.
15/02/2025 - RoPE & Global/Local Positional Encoding Update
- Update: Improved model performance.
- Action: Update your code and model accordingly. Bools are available for research: either encoding can easily be turned on or off when training to see the differences; just make sure the settings in the local API model.py match those you trained with.
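For background, this is roughly what a rotary positional embedding (RoPE) does to a query or key tensor — a generic GPT-NeoX-style sketch, not the exact code in model.py:

```python
import torch

def rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Rotary positional embedding over (batch, seq, dim) with an even dim."""
    _, seq_len, dim = x.shape
    half = dim // 2
    freqs = base ** (-torch.arange(half, dtype=torch.float32) / half)
    angles = torch.arange(seq_len, dtype=torch.float32)[:, None] * freqs[None, :]
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., :half], x[..., half:]
    # Rotate each (x1, x2) pair by a position-dependent angle.
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)

queries = torch.randn(2, 16, 64)   # (batch, seq, head_dim)
print(rope(queries).shape)         # torch.Size([2, 16, 64])
```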
11/02/2025 - Open Source Dataset Released
- Dataset: Download the dataset to train your own model.
- Trainer: Use NeuroSync Trainer Lite for training and fine-tuning.
08/02/2025 - Player and License Updates v0.02
- Player Update: Now includes blink animations from the default animation, better thread management, and no playback stutter.
- Action: Update your Python files and model.pth to the new v0.02 versions.
25/11/2024 - CSV and Emotion Dimensions Update
- Update: Added the correct timecode format and an option to remove emotion dimensions in the CSV generator of the NeuroSync Player (Unreal Engine LiveLink).
- Note: Set emotion dimensions to false in utils/csv to include only the first 61 dimensions for LiveLink.
Model Overview
The NeuroSync audio-to-face blendshape transformer seq2seq model converts sequences of audio features into corresponding facial blendshape coefficients, enabling real-time character animation. It integrates seamlessly with Unreal Engine via LiveLink.
Features
- Audio-to-Face Transformation: Converts raw audio features into facial blendshape coefficients.
- Transformer Seq2Seq Architecture: Utilizes encoder-decoder layers to capture complex dependencies between audio and facial expressions.
- Unreal Engine Integration (LiveLink): Stream facial blendshapes in real time with the NeuroSync Player.
Usage
Local API
Set up your local API using the NeuroSync Local API repository to process audio files and stream generated blendshapes.
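A minimal sketch of calling a locally hosted endpoint with an audio file and reading back blendshape frames; the port, route, and JSON keys here are assumptions, so check the NeuroSync Local API repository for the actual interface:

```python
import requests

# Assumed endpoint and payload format -- verify against the Local API repository.
URL = "http://127.0.0.1:5000/audio_to_blendshapes"

with open("speech.wav", "rb") as f:
    audio_bytes = f.read()

response = requests.post(URL, data=audio_bytes)
response.raise_for_status()

frames = response.json()["blendshapes"]  # assumed key: list of per-frame coefficients
print(f"received {len(frames)} frames of {len(frames[0])} coefficients each")
```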
Non-Local API (Alpha Access)
If you prefer not to host the model locally, apply for the NeuroSync Alpha API at neurosync.info for direct integration with the NeuroSync Player.
Model Architecture
- Encoder: Processes audio features with a transformer encoder using positional encodings.
- Decoder: Uses cross-attention in a transformer decoder to generate blendshape coefficients.
- Output: Produces 61 blendshape coefficients (with some exclusions for LiveLink).
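A compressed structural outline of that encoder-decoder shape using PyTorch's built-in transformer (positional encodings/RoPE are omitted, and the dimensions and layer counts here are placeholders, not the exact values in model.py):

```python
import torch
import torch.nn as nn

class AudioToBlendshapes(nn.Module):
    def __init__(self, audio_dim=256, d_model=512, heads=16, layers=8, out_dim=61):
        super().__init__()
        self.audio_proj = nn.Linear(audio_dim, d_model)   # encoder input projection
        self.face_proj = nn.Linear(out_dim, d_model)      # decoder input projection
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=heads,
            num_encoder_layers=layers, num_decoder_layers=layers,
            batch_first=True,
        )
        self.head = nn.Linear(d_model, out_dim)           # blendshape coefficients

    def forward(self, audio_feats, prev_faces):
        # Encoder consumes audio features; decoder cross-attends to the encoded
        # audio and emits one set of blendshape coefficients per frame.
        return self.head(self.transformer(self.audio_proj(audio_feats),
                                          self.face_proj(prev_faces)))

model = AudioToBlendshapes()
audio = torch.randn(1, 128, 256)  # (batch, frames, audio feature dim)
faces = torch.zeros(1, 128, 61)   # decoder input, e.g. shifted previous frames
print(model(audio, faces).shape)  # torch.Size([1, 128, 61])
```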
Blendshape Coefficients
- Included: Eye movements (e.g., EyeBlinkLeft, EyeSquintRight), jaw movements (e.g., JawOpen, JawRight), mouth movements (e.g., MouthSmileLeft, MouthPucker), brow movements (e.g., BrowInnerUp, BrowDownLeft), and cheek/nose movements (e.g., CheekPuff, NoseSneerRight).
- Note: Coefficients 62–68 (related to emotional states) should be ignored or used for additive sliders, since they are not streamed into LiveLink.
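For example, if a generated frame carries all 68 values, only the first 61 are forwarded to LiveLink (a minimal sketch; it assumes the emotion values sit at the end of each frame, as described above):

```python
LIVELINK_DIMS = 61  # coefficients 62-68 carry emotional state and are not streamed

def split_frame(frame):
    """Return (livelink_coefficients, emotion_values) for one generated frame."""
    return frame[:LIVELINK_DIMS], frame[LIVELINK_DIMS:]

frame = [0.0] * 68                   # dummy frame containing all 68 values
livelink, emotions = split_frame(frame)
print(len(livelink), len(emotions))  # 61 7
```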
Community & Resources
Live Demo
YouTube Channel
For tutorials, updates, and more, visit our YouTube channel.
NeuroSync License
This software uses a dual-license model:
1. Free License (MIT License)
For individuals and businesses earning under $1M per year:
MIT License
Copyright (c) 2025 NeuroSync
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
2. Commercial License (For Businesses Earning $1M+ Per Year)
Businesses or organizations with annual revenue of $1,000,000 or more must obtain a commercial license to use this software.
- To acquire a commercial license, please contact us.
Compliance
By using this software, you agree to these licensing terms. If your business exceeds the revenue threshold, you must transition to a commercial license or cease using the software.
© 2025 NeuroSync
Support
For any questions or further support, please feel free to contribute to the repository or raise an issue.