Hi LiveAvatar team, thanks again for the great work! I am trying to reimplement LiveAvatar training and have a question about the text prompt details. Do you use the same data processing as Wan2.2 S2V, i.e., an MLLM to extract very detailed text prompts for the training data? And do you also tune the text embedding and text cross-attention layers during training and distillation?