Large language models (LLMs) are typically pre-trained on general-purpose corpora and later refined through instruction tuning and domain-specific fine-tuning to adapt them to specialized domains or tasks (domain adaptation). While LLM benchmarking often focuses on downstream task performance, how domain adaptation reshapes internal model representations remains underexplored. In this work, we analyze shifts in activation space across base, instruction-tuned, and domain-aligned models from multiple LLM families in the legal and medical domains. We quantify the impact of domain alignment on internal representations by measuring Euclidean distances between the convex hull centroids of the first two principal components of the activations and by evaluating activation-space distribution shifts with DeepScan. Our findings reveal consistent activation shifts from base to instruction-tuned to domain-aligned models, although the degree and significance of the change vary by model. In most cases, domain-adapted models exhibit larger shifts in activation space than other models from the same family. Separability in activation space is generally high, especially in the Legal domain and, to a lesser extent, in the Health domain. Our preliminary results highlight activation-space analysis as a complementary perspective to traditional evaluation methods that focus on model outputs.
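For concreteness, the centroid-distance measurement could be implemented roughly as sketched below. This is a minimal illustration, not the paper's code: it assumes a shared two-component PCA fit on the pooled activations of both models, and it approximates each convex hull centroid as the mean of the hull's vertices; the function names (`hull_centroid`, `centroid_shift`) and the synthetic activation arrays are hypothetical. The DeepScan-based distribution-shift analysis is not reproduced here.

```python
import numpy as np
from sklearn.decomposition import PCA
from scipy.spatial import ConvexHull


def hull_centroid(points_2d: np.ndarray) -> np.ndarray:
    """Centroid of the convex hull of 2-D points, approximated as the
    mean of the hull vertices (a simplification of the exact polygon
    centroid)."""
    hull = ConvexHull(points_2d)
    return points_2d[hull.vertices].mean(axis=0)


def centroid_shift(acts_a: np.ndarray, acts_b: np.ndarray) -> float:
    """Euclidean distance between the hull centroids of two activation
    sets, projected into a shared 2-D PCA space fit on the pooled data.

    Each input is an (n_samples, hidden_dim) array of hidden states
    collected from one model on the same prompt set.
    """
    pca = PCA(n_components=2).fit(np.vstack([acts_a, acts_b]))
    return float(np.linalg.norm(
        hull_centroid(pca.transform(acts_a)) - hull_centroid(pca.transform(acts_b))
    ))


# Synthetic stand-ins for real hidden states (e.g. last-layer activations
# of a base model and its domain-adapted counterpart).
rng = np.random.default_rng(0)
base_acts = rng.normal(size=(500, 1024))
adapted_acts = base_acts + rng.normal(loc=0.3, scale=0.05, size=(500, 1024))
print(f"centroid shift: {centroid_shift(base_acts, adapted_acts):.3f}")
```

Fitting the PCA jointly on both models' activations keeps the two point clouds in a common coordinate frame, so the centroid distance reflects a genuine representational shift rather than a difference between two independently chosen projections; whether the paper fits the projection jointly or per model is an assumption of this sketch.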