Publications
Publications are listed in reverse chronological order.
A complete list of publications can be found on my Google Scholar profile.
2026
- [ICLR] mR3: Multilingual Rubric-Agnostic Reward Reasoning Models. arXiv preprint arXiv:2510.01146, 2026
- [Nature]
- [arXiv] CommonLID: Re-evaluating State-of-the-Art Language Identification Performance on Web Data. arXiv preprint arXiv:2601.18026, 2026
- [arXiv] PingPong: A Natural Benchmark for Multi-Turn Code-Switching Dialogues. arXiv preprint arXiv:2601.17277, 2026
- [arXiv] Routing with Generated Data: Annotation-Free LLM Skill Estimation and Expert Selection. arXiv preprint arXiv:2601.09692, 2026
- [arXiv] Can Large Language Models Understand, Reason About, and Generate Code-Switched Text? arXiv preprint arXiv:2601.07153, 2026
2025
- [arXiv] M4-RAG: A Massive-Scale Multilingual Multi-Cultural Multimodal RAG. arXiv preprint arXiv:2512.05959, 2025
- [AACL-IJCNLP] Indopref: A multi-domain pairwise preference dataset for Indonesian. In Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2025
- [arXiv]
- [arXiv] Leveraging Parameter Space Symmetries for Reasoning Skill Transfer in LLMs. arXiv preprint arXiv:2511.10850, 2025
- [arXiv] Optimizing Reasoning Efficiency through Prompt Difficulty Prediction. arXiv preprint arXiv:2511.03808, 2025
- [WMT] Smol: Professionally translated parallel data for 115 under-represented languages. In Proceedings of the Tenth Conference on Machine Translation, 2025
- [MRL] ENTROPY2VEC: Crosslingual Language Modeling Entropy as End-to-End Learnable Language Representations. In Proceedings of the 5th Workshop on Multilingual Representation Learning (MRL 2025), 2025
- [arXiv] SEADialogues: A Multilingual Culturally Grounded Multi-turn Dialogue Dataset on Southeast Asian Languages. arXiv preprint arXiv:2508.07069, 2025
- [ACL] Crowdsource, crawl, or generate? Creating SEA-VL, a multicultural vision-language dataset for Southeast Asia. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
- [MRL]
- [arXiv] Datasheets Aren’t Enough: DataRubrics for Automated Quality Metrics and Accountability. arXiv preprint arXiv:2506.01789, 2025
- [NeurIPS] T1: A Tool-Oriented Conversational Dataset for Multi-Turn Agentic Planning. arXiv preprint arXiv:2505.16986, 2025
- [arXiv]
- [arXiv] Behind Maya: Building a Multilingual Vision Language Model. arXiv preprint arXiv:2505.08910, 2025
- [arXiv]
- [JAIR] Preference Tuning with Human Feedback on Language, Speech, and Vision Tasks: A Survey. Journal of Artificial Intelligence Research, 2025
- [MRL] What Causes Knowledge Loss in Multilingual Language Models? arXiv preprint arXiv:2504.20356, 2025
- [NAACL Findings] ProxyLM: Predicting language model performance on multilingual tasks via proxy models. In Findings of the Association for Computational Linguistics: NAACL 2025, 2025
- [arXiv] Fine-tuning diffusion generative models via rich preference optimization. arXiv preprint arXiv:2503.11720, 2025
- [ACL] Do Language Models Understand Honorific Systems in Javanese? arXiv preprint arXiv:2502.20864, 2025
- [arXiv] Textgames: Learning to self-play text-based puzzle games via language model reasoning. arXiv preprint arXiv:2502.18431, 2025
- [ICLR] MMTEB: Massive Multilingual Text Embedding Benchmark. In The Thirteenth International Conference on Learning Representations, 2025
- [arXiv]
- [COLING] Towards efficient and robust VQA-NLE data generation with large vision-language models. In Proceedings of the 31st International Conference on Computational Linguistics, 2025
2024
- [arXiv] Maya: An Instruction Finetuned Multilingual Multimodal Model. arXiv preprint arXiv:2412.07112, 2024
- [arXiv] A Multi-Agent Dual Dialogue System to Support Mental Health Care Providers. arXiv preprint arXiv:2411.18429, 2024
- [arXiv] An AI-Assisted Multi-Agent Dual Dialogue System to Support Mental Health Care Providers. arXiv preprint arXiv:2411.18429, 2024
- [EMNLP] Academics Can Contribute to Domain-Specialized Language Models. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
- [EMNLP] Re-Evaluating Evaluation for Multilingual Summarization. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
- [WMT] MetaMetrics-MT: Tuning Meta-Metrics for Machine Translation via Human Preference Calibration. In Proceedings of the Ninth Conference on Machine Translation, Nov 2024
- [arXiv] Linguistics Theory Meets LLM: Code-Switched Text Generation via Equivalence Constrained Large Language Models. arXiv preprint arXiv:2410.22660, Nov 2024
- [arXiv] WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines. arXiv preprint arXiv:2410.12705, Nov 2024
- [arXiv] RainbowPO: A Unified Framework for Combining Improvements in Preference Optimization. arXiv preprint arXiv:2410.04203, Nov 2024
- [arXiv] MetaMetrics: Calibrating Metrics For Generation Tasks Using Human Preferences. arXiv preprint arXiv:2410.02381, Nov 2024
- [COLING] Towards Efficient and Robust VQA-NLE Data Generation with Large Vision-Language Models. arXiv preprint arXiv:2409.14785, Nov 2024
- [arXiv] Preference Tuning with Human Feedback on Language, Speech, and Vision Tasks: A Survey. arXiv preprint arXiv:2409.11564, Nov 2024
- [EMNLP] SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages. arXiv preprint arXiv:2406.10118, Nov 2024
- [arXiv] ProxyLM: Predicting Language Model Performance on Multilingual Tasks via Proxy Models. arXiv preprint arXiv:2406.09334, Nov 2024
- [EMNLP Findings] MINERS: Multilingual Language Models as Semantic Retrievers. arXiv preprint arXiv:2406.07424, Nov 2024
- [arXiv] Lessons from the Trenches on Reproducible Evaluation of Language Models. arXiv preprint arXiv:2405.14782, Nov 2024
- [ACL Findings] SemRel2024: A Collection of Semantic Textual Relatedness Datasets for 14 Languages. arXiv preprint arXiv:2402.08638, Nov 2024
- [ACL] Cendol: Open Instruction-tuned Generative Large Language Models for Indonesian Languages. arXiv preprint arXiv:2404.06138, Nov 2024
- [EMNLP Findings] LinguAlchemy: Fusing Typological and Geographical Elements for Unseen Language Generalization. arXiv preprint arXiv:2401.06034, Nov 2024
2023
- [arXiv] BLOOM: A 176B-parameter open-access multilingual language model. arXiv preprint arXiv:2211.05100, Nov 2023
- [SEALP] IndoToD: A Multi-Domain Indonesian Benchmark For End-to-End Task-Oriented Dialogue Systems. In Proceedings of the First Workshop in South East Asian Language Processing, Nov 2023
- [AACL] Efficient Zero-Shot Cross-lingual Inference via Retrieval. In Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (Volume 2: Short Papers), Nov 2023
- [AACL] NusaWrites: Constructing High-Quality Corpora for Underrepresented and Extremely Low-Resource Languages. In Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), Nov 2023
- [Machine Learning] Transfer learning application of self-supervised learning in ARPES. Machine Learning: Science and Technology, Nov 2023
- [arXiv] Multilingual Few-Shot Learning via Language Model Retrieval. arXiv preprint arXiv:2306.10964, Nov 2023
- [EMNLP] GlobalBench: A Benchmark for Global Progress in Natural Language Processing. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Nov 2023
- [EMNLP] Multilingual Large Language Models Are Not (Yet) Code-Switchers. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Nov 2023
- [CALCS] Prompting multilingual large language models to generate code-mixed texts: The case of South East Asian languages. In Proceedings of the 6th Workshop on Computational Approaches to Linguistic Code-Switching, Nov 2023
- [ACL Findings] NusaCrowd: Open source initiative for Indonesian NLP resources. In Findings of the Association for Computational Linguistics: ACL 2023, Nov 2023
- [ACL Findings] Multi-lingual and Multi-cultural Figurative Language Understanding. In Findings of the Association for Computational Linguistics: ACL 2023, Nov 2023
- [ACL Findings] Overcoming Catastrophic Forgetting in Massively Multilingual Continual Learning. In Findings of the Association for Computational Linguistics: ACL 2023, Nov 2023
- [ACL] On “Scientific Debt” in NLP: A Case for More Rigour in Language Model Pre-Training Research. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Nov 2023
- [ICAICTA] Implementing Quantization to Indonesian BERT Language Model. In 2023 10th International Conference on Advanced Informatics: Concept, Theory and Application (ICAICTA), Nov 2023
- [EACL] Towards a Unified Multi-Domain Multilingual Named Entity Recognition Model. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, Nov 2023
- [ACL Findings] The Decades Progress on Code-Switching Research in NLP: A Systematic Survey on Trends and Challenges. In Findings of the Association for Computational Linguistics: ACL 2023, Nov 2023
2022
- [SumEval] IndoRobusta: Towards Robustness Against Diverse Code-Mixed Indonesian Local Languages. SumEval 2022, Nov 2022
- [ACL] BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting. arXiv preprint arXiv:2212.09535, Nov 2022
- [AACL] Cross-lingual Few-Shot Learning on Unseen Languages. In Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing, Nov 2022
- [SumEval] IndoRobusta: Towards Robustness Against Diverse Code-Mixed Indonesian Local Languages. In Proceedings of the First Workshop on Scaling Up Multilingual Evaluation, Nov 2022
- [arXiv] Transfer Learning Application of Self-supervised Learning in ARPES. arXiv preprint arXiv:2208.10893, Nov 2022
- [arXiv] NusaCrowd: A Call for Open and Reproducible NLP Research in Indonesian Languages. arXiv preprint arXiv:2207.10524, Nov 2022
- [EMNLP Demo] GEMv2: Multilingual NLG Benchmarking in a Single Line of Code. arXiv preprint arXiv:2206.11249, Nov 2022
- [Accepted at TMLR] Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models. Nov 2022
- [ACL] One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia. Accepted at ACL, Nov 2022
- [LREC] CI-AVSR: A Cantonese Audio-Visual Speech Dataset for In-car Command Recognition. Accepted at LREC, Nov 2022
- [LREC] ASCEND: A Spontaneous Chinese-English Dataset for Code-switching in Multi-turn Conversation. Accepted at LREC, Nov 2022
2021
- [arXiv] NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation. arXiv preprint arXiv:2112.02721, Nov 2021
- [arXiv] Few-Shot Bot: Prompt-Based Learning for Dialogue Systems. arXiv preprint arXiv:2110.08118, Nov 2021
- [arXiv] Greenformer: Factorization toolkit for efficient deep neural networks. arXiv preprint arXiv:2109.06762, Nov 2021
- [Interspeech] Adapt-and-Adjust: Overcoming the Long-Tail Problem of Multilingual Speech Recognition. INTERSPEECH, Aug 2021
- [SIGDIAL]
- [RepL4NLP] Exploring Fine-tuning Techniques for Pre-trained Cross-lingual Models via Continual Learning. RepL4NLP, Aug 2021
- [arXiv]
2020
- [AACL-IJCNLP] IndoNLU: Benchmark and Resources for Evaluating Indonesian Natural Language Understanding. In Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing, Aug 2020
- [arXiv] Emograph: Capturing emotion correlations using graph networks. arXiv preprint arXiv:2008.09378, Aug 2020
2019
- [MRQA] Generalizing Question Answering System with Pre-trained Language Model Fine-tuning. In EMNLP 2019 MRQA Workshop, Aug 2019
- [FinNLP] Learning to learn sales prediction with social media sentiment. In Proceedings of the First Workshop on Financial Technology and Natural Language Processing, Aug 2019
2018
- [arXiv] Towards end-to-end automatic code-switching speech recognition. arXiv preprint arXiv:1810.12620, Aug 2018
- [arXiv] Learn to code-switch: Data augmentation using copy mechanism on language modeling. arXiv preprint arXiv:1810.10254, Aug 2018
- [ICASSP] End-to-End Dynamic Query Memory Network for Entity-Value Independent Task-oriented Dialog. In 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Aug 2018
2017
- [DSTC6] End-to-end recurrent entity network for entity-value independent goal-oriented dialog learning. In Dialog System Technology Challenges Workshop, DSTC6, Aug 2017
- [Interspeech]