Publications
Check the full publications here
2023
- [arXiv] Prompting Multilingual Large Language Models to Generate Code-Mixed Texts: The Case of South East Asian Languages. 2023
2022
- [arXiv] The Decades Progress on Code-Switching Research in NLP: A Systematic Survey on Trends and Challenges. 2022
- [arXiv] NusaCrowd: Open Source Initiative for Indonesian NLP Resources. arXiv preprint arXiv:2212.09648, 2022
- [arXiv] BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting. arXiv preprint arXiv:2212.09535, 2022
- [arXiv] BLOOM: A 176B-Parameter Open-Access Multilingual Language Model. arXiv preprint arXiv:2211.05100, 2022
- [AACL] Cross-lingual Few-Shot Learning on Unseen Languages. In Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing, 2022
- [SumEval] IndoRobusta: Towards Robustness Against Diverse Code-Mixed Indonesian Local Languages. In Proceedings of the First Workshop on Scaling Up Multilingual Evaluation, 2022
- [arXiv] Transfer Learning Application of Self-supervised Learning in ARPES. arXiv preprint arXiv:2208.10893, 2022
- [arXiv] NusaCrowd: A Call for Open and Reproducible NLP Research in Indonesian Languages. arXiv preprint arXiv:2207.10524, 2022
- [EMNLP Demo] GEMv2: Multilingual NLG Benchmarking in a Single Line of Code. arXiv preprint arXiv:2206.11249, 2022
- [Accepted at TMLR] Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models. 2022
- Accepted at EACL
- [DialDoc] Retrieval-Free Knowledge-Grounded Dialogue Response Generation with Adapters. Accepted at DialDoc 2022. Best Student Paper at DialDoc 2022
- [ACL] One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia. Accepted at ACL 2022
- [LREC] CI-AVSR: A Cantonese Audio-Visual Speech Dataset for In-car Command Recognition. Accepted at LREC 2022
- [LREC] ASCEND: A Spontaneous Chinese-English Dataset for Code-switching in Multi-turn Conversation. Accepted at LREC 2022
2021
- [arXiv] NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation. arXiv preprint arXiv:2112.02721, 2021
- arXiv
- [arXiv] Greenformer: Factorization toolkit for efficient deep neural networks. arXiv preprint arXiv:2109.06762, 2021
- [DialDoc21] CAiRE in DialDoc21: Data Augmentation for Information-Seeking Dialogue System. DialDoc21, 2021. Third Place in the Shared Task
- [Interspeech] Adapt-and-Adjust: Overcoming the Long-Tail Problem of Multilingual Speech Recognition. INTERSPEECH 2021
- SIGDIAL
- [RepL4NLP] Exploring Fine-tuning Techniques for Pre-trained Cross-lingual Models via Continual Learning. RepL4NLP 2021
- arXiv
2020
- AACL-IJCNLP
- [arXiv] Emograph: Capturing emotion correlations using graph networks. arXiv preprint arXiv:2008.09378, 2020
2019
- [MRQA] Generalizing Question Answering System with Pre-trained Language Model Fine-tuning. In EMNLP 2019 MRQA Workshop, 2019
- [FinNLP] Learning to learn sales prediction with social media sentiment. In Proceedings of the First Workshop on Financial Technology and Natural Language Processing, 2019
2018
- [arXiv] Towards end-to-end automatic code-switching speech recognition. arXiv preprint arXiv:1810.12620, 2018
- [arXiv] Learn to code-switch: Data augmentation using copy mechanism on language modeling. arXiv preprint arXiv:1810.10254, 2018
- [ICASSP] End-to-End Dynamic Query Memory Network for Entity-Value Independent Task-oriented Dialog. In 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018
2017
- [DSTC6] End-to-end recurrent entity network for entity-value independent goal-oriented dialog learning. Wu, Chien-Sheng, et al. In Dialog System Technology Challenges Workshop, DSTC6, 2017
- Interspeech