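天气预报怎么说 (What does the weather forecast say?) 你好, 很高兴认识你 (Hello, nice to meet you.)
我明天过来接你 (I will come to pick you up tomorrow.) 为我们的合作成功干杯 (A toast to the success of our cooperation.)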
The MCCS dataset is the first large-scale Mandarin Chinese Cued Speech dataset. It covers 23 major categories of scenarios (e.g., communication, transportation and shopping) and 72 subcategories (e.g., meeting, dating and introduction). It was recorded by four skilled native Mandarin Chinese Cued Speech cuers using the cameras on mobile phones. The Cued Speech videos are recorded at 30 fps with a resolution of 1280x720. We provide the raw Cued Speech videos, a text file (with 1000 sentences) and the corresponding annotations, which come in two kinds: continuous video annotations made with ELAN, and discrete audio annotations made with Praat.
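As a rough illustration of how the ELAN video annotations could be lined up with the 30 fps videos, the sketch below reads time-aligned annotations from an ELAN `.eaf` document (an XML format) and converts their millisecond timestamps to frame indices. It is only a sketch under assumptions: the tier name `hand`, the label `shape-1` and the inline sample document are hypothetical and do not come from the dataset, whose actual tier layout may differ.

```python
import xml.etree.ElementTree as ET

FPS = 30  # frame rate stated in the dataset description


def ms_to_frame(ms, fps=FPS):
    """Map a timestamp in milliseconds to a zero-based frame index."""
    return int(ms * fps // 1000)


def read_eaf(xml_text):
    """Return (tier_id, start_ms, end_ms, value) for each time-aligned annotation."""
    root = ET.fromstring(xml_text)
    # ELAN stores times once in TIME_ORDER; annotations refer to them by ID.
    slots = {
        s.get("TIME_SLOT_ID"): int(s.get("TIME_VALUE"))
        for s in root.iter("TIME_SLOT")
        if s.get("TIME_VALUE") is not None
    }
    rows = []
    for tier in root.iter("TIER"):
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots[ann.get("TIME_SLOT_REF1")]
            end = slots[ann.get("TIME_SLOT_REF2")]
            value = ann.findtext("ANNOTATION_VALUE", default="")
            rows.append((tier.get("TIER_ID"), start, end, value))
    return rows


# Hypothetical minimal EAF document; tier name and label are made up.
SAMPLE_EAF = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="500"/>
  </TIME_ORDER>
  <TIER TIER_ID="hand">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>shape-1</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""

for tier_id, start, end, value in read_eaf(SAMPLE_EAF):
    print(tier_id, ms_to_frame(start), ms_to_frame(end), value)
```

To use it on a real file, the file contents would be read with `open(path, encoding="utf-8").read()` and passed to `read_eaf`. Time slots without a `TIME_VALUE` are skipped here, so an annotation referring to one would raise a `KeyError`.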
The MCCSD contains 1000 Mandarin Chinese Cued Speech sentences. We currently provide it only to universities and research institutions for research purposes. Please complete the following steps to obtain the dataset:
If you are interested in our CS generation work, check out the following links for more details: GitHub
If you use this MCCS dataset for your research, please consider citing the following papers:
If you have any questions about the dataset and our research works, please feel free to contact us:
Prof. Li Liu avrillliu@hkust-gz.edu.cn
Wentao Lei wentaolei@hkust-gz.edu.cn
Feel free to visit the homepage of Prof. Liu for more details about our group and research topics.
Special thanks to the Tencent Charity Venture Capital Program for its support.