Publications
-
ARISE: Agentic Rubric-Guided Iterative Survey Engine for Automated Scholarly Paper Generation
Zi Wang, Xingqiao Wang, Sangah Lee, and Xiaowei Xu (2026)
To appear at HAXD 2026
-
Do Korean-Adapted LLMs Think in Korean? Analyzing Latent Language and the Preservation of Korean-Specific Knowledge
Sangah Lee (2025)
Language and Information, Vol.29, No.3, pp.229-256.
-
Nunchi-Bench: Benchmarking Language Models on Cultural Reasoning with a Focus on Korean Superstition
Kyuhee Kim and Sangah Lee (2025)
Findings of the Association for Computational Linguistics: ACL 2025
-
KoBALT: Korean Benchmark For Advanced Linguistic Tasks
Hyopil Shin, Sangah Lee, Dongjun Jang, Wooseok Song, Jaeyoon Kim, Chaeyoung Oh, Hyemi Jo, Youngchae Ahn, Sihyun Oh, Hyohyeong Chang, Sunkyoung Kim, and Jinsik Lee (2025)
arXiv
-
A Short Note on the Structural Priming in LLM: Focusing on Dative Constructions in Korean
Semoon Hoe and Sangah Lee (2024)
Language and Information, Vol.28, No.3, pp.111-142 (In Korean).
-
ManWav: The First Manchu ASR Model
Jean Seo, Minha Kang, Sungjoo Byun, and Sangah Lee (2024)
Proceedings of the Third Workshop on NLP Applications to Field Linguistics
-
Large Language Models Show Human-Like Abstract Thinking Patterns: A Construal-Level Perspective
Seung Joo Yoo and Sangah Lee (2024)
Proceedings of the Annual Meeting of the Cognitive Science Society
-
ManNER & ManPOS: Pioneering NLP for Endangered Manchu Language
Sangah Lee, Sungjoo Byun, Jean Seo, and Minha Kang (2024)
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
-
KoCoNovel: Annotated Dataset of Character Coreference in Korean Novels
Kyuhee Kim, Surin Lee, and Sangah Lee (2024)
arXiv
-
K-Act2Emo: Korean Commonsense Knowledge Graph for Indirect Emotional Expression
Kyuhee Kim, Surin Lee, and Sangah Lee (2024)
arXiv
-
DaG LLM ver 1.0: Pioneering Instruction-Tuned Language Modeling For Korean NLP
Dongjun Jang, Sangah Lee, Sungjoo Byun, Jinwoong Kim, Jean Seo, Minseok Kim, Soyeon Kim, Chaeyoung Oh, Jaeyoon Kim, Hyemi Jo, and Hyopil Shin (2023)
arXiv
-
Mergen: The First Manchu-Korean Machine Translation Model Trained on Augmented Data
Jean Seo, Sungjoo Byun, Minha Kang, and Sangah Lee (2023)
3rd Multilingual Representation Learning (MRL) Workshop
-
Studies on Clauses in Computational Linguistics Focused on Korean Corpora
Sangah Lee (2023)
Journal of Korean Linguistics, No.107, pp. 445-468 (In Korean).
-
Contract Eligibility Verification Enhanced by Keyword and Contextual Embeddings
Sangah Lee, Seokgi Kim, Eunjin Kim, Minji Kang, and Hyopil Shin (2022)
KIISE Vol.49, No.10, pp.848-858 (In Korean).
-
The Korean Morphologically Tight-Fitting Tokenizer for Noisy User-Generated Texts
Sangah Lee and Hyopil Shin (2021)
2021 The 7th Workshop on Noisy User-Generated Text (W-NUT)
-
Combining Sentiment-Combined Model with Pre-Trained BERT Models for Sentiment Analysis
Sangah Lee and Hyopil Shin (2021)
KIISE Vol.49, No.10, pp.848-858 (In Korean).
-
Argument Facet Detection in Online Debates Based on Attention Weights and Clustering with Combined Similarity Matrices
Sangah Lee and Hyopil Shin (2021)
Korean Journal of Linguistics, Vol.46, No.1, pp.107-134.
-
KR-BERT: A Small-Scale Korean-Specific BERT Language Model
Sangah Lee, Hansol Jang, Yunmee Baik, Suzi Park, and Hyopil Shin (2020)
Journal of KIISE, Vol.47, No.7, pp.682-692.
-
An Analysis of Linear Argumentation Structure of Korean Debate Texts Using Sequential Modeling and Linguistic Features
Sangah Lee and Hyopil Shin (2021)
Journal of KIISE, Vol.45, No.12, pp.1292-1301 (In Korean).
-
Stance Classification of Online Debate Texts based on Discourse Relations
Sangah Lee and Hyopil Shin (2021)
Language Research, Vol.52, No.3, pp.511-532 (In Korean).
Presentations
-
Cultural Assessment of Korean Language Generation in Large Language Models: Limitations of Machine-Translated Corpora
Sangah Lee (2024)
The 2024 Linguistic Society of Korea Winter Conference
-
The Phonological Constraints on Korean Lexical Subclasses
Nayoung Park and Sangah Lee (2023)
The 9th International Conference on Phonology and Morphology (ICPM9)
-
Computational Linguistics and the Study of Korean Syntax
Sangah Lee (2022)
The Society of Korean Linguistics
-
A Method of Infusing Additional Features into Pre-Trained BERT Models for Sentiment Analysis
Sangah Lee and Hyopil Shin (2020)
Korea Software Congress 2020
-
The Occurrence and Evolution of Feminist Twitterians
Sangah Lee and Suzi Park (2018)
The Discourse and Cognitive Linguistics Society of Korea
-
Automatic Prediction of ‘Anti-Search Variants’ of Twitter based on Word Embeddings and Phonetic Similarity
Sangah Lee (2017)
The 29th Annual Conference on Human & Cognitive Language Technology
-
The POS Elderly: Semi-Automatic Annotation Tool for Historical Korean
Migyeong Kim, Suzi Park, and Sangah Lee (2016)
The 28th Annual Conference on Human & Cognitive Language Technology
-
An Automatic Classification of Discourse Relations in the Arguing Structure of Korean Texts
Sangah Lee and Hyopil Shin (2015)
The 27th Annual Conference on Human & Cognitive Language Technology