
LLM Learning That Combines Diversity and Task Specialization: TCIA Mechanism and Experimental Results
3 main points
✔️ TCIA is an instruction augmentation framework that combines diversity with task conformance
✔️ It decomposes instructions into base queries and constraints and generates diverse instructions via breadth-first search
✔️ Experiments show an average performance improvement of 8.7%, outperforming GPT-4o in some cases
TCIA: A Task-Centric Instruction Augmentation Method for Instruction Finetuning
written by Simin Ma, Shujian Liu, Jun Tan, Yebowen Hu, Song Wang, Sathish Reddy Indurthi, Sanqiang Zhao, Liwei Wu, Jianbing Han, Kaiqiang Song
(Submitted on 28 Aug 2025)
Comments: Published on arXiv.
Subjects: Artificial Intelligence (cs.AI)
The images used in this article are from the paper, the introductory slides, or were created based on them.
Summary
This paper proposes TCIA (Task-Centric Instruction Augmentation), a task-centric instruction data augmentation method for LLM fine-tuning that is geared toward real-world applications.
Conventional methods try to ensure diversity through self-generated instruction data augmentation, but they suffer from repetitive instructions and "task drift," a deviation from the target task.
In the real world, there are many situations where performance specialized for a specific task is required rather than a general-purpose model, so a mechanism that maintains task conformance as well as diversity is essential.
TCIA decomposes a natural-language instruction into a combination of a "base query" and "constraints," and broadly expands the instruction by manipulating those constraints.
Experiments show that TCIA achieves an average performance improvement of 8.7% on practical tasks such as meeting summarization, exceeding GPT-4o in some cases.
In this way, TCIA provides a new framework for LLM tuning that is robust in realistic applications.
Proposed Methodology
TCIA is a systematic instruction expansion framework consisting of six steps.
First, the semantic structure of instructions is clarified by decomposing natural language instructions into "base queries" and "constraints".
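To make this decomposition concrete, here is a minimal Python sketch of the structure it might produce; the class, field names, and example values are illustrative assumptions, not taken from the paper.

```python
from dataclasses import dataclass

@dataclass
class DecomposedInstruction:
    """A natural-language instruction split into its task core and its requirements."""
    base_query: str         # the underlying task, e.g. a summarization request
    constraints: list[str]  # formatting, length, style requirements, etc.

# Hypothetical decomposition of one instruction:
instr = DecomposedInstruction(
    base_query="Summarize the attached meeting transcript",
    constraints=["format the output as bullet points",
                 "keep the summary under 150 words"],
)
```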
Next, a diverse database of constraints constructed from public datasets (e.g., Tulu-3) is used to enable retrieval of constraints related to similar tasks.
Then, using breadth-first search (BFS), operations such as "add," "delete," and "replace" are repeated to generate a diverse and task-compatible set of constraints.
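As a rough illustration of this step, the sketch below runs a breadth-first search over constraint sets, mutating each node with add/delete/replace operations and drawing replacements from a pool of retrieved constraints. The mutation policy and parameters are simplifications assumed here; the actual method would additionally verify task conformance against the base query.

```python
from collections import deque
import random

def bfs_expand(seed: list[str], pool: list[str],
               max_depth: int = 3, per_node: int = 4) -> list[list[str]]:
    """Breadth-first expansion of a constraint set via add/delete/replace.

    seed: constraints of the original instruction
    pool: task-related constraints retrieved from the database
    Returns diverse constraint sets, each a variant of the original.
    """
    queue = deque([(tuple(seed), 0)])
    seen = {tuple(seed)}
    results = []
    while queue:
        constraints, depth = queue.popleft()
        results.append(list(constraints))
        if depth == max_depth:
            continue
        for _ in range(per_node):  # branch into several variants per node
            c = list(constraints)
            op = random.choice(["add", "delete", "replace"])
            if op == "add" and pool:
                c.append(random.choice(pool))
            elif op == "delete" and len(c) > 1:
                c.pop(random.randrange(len(c)))
            elif op == "replace" and c and pool:
                c[random.randrange(len(c))] = random.choice(pool)
            key = tuple(c)
            if key not in seen:
                seen.add(key)
                queue.append((key, depth + 1))
    return results
```

Bounding the search depth and changing one constraint at a time keeps each variant close to the original task.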
The generated constraint sets are converted back into natural-language instructions, whose quality is then assured by checking for missing constraints and resolving inconsistencies.
Finally, responses are generated with multiple LLMs and screened by an LLM judge along five dimensions (quality, usefulness, accuracy, consistency, etc.), so that only the best instruction-response pairs are retained.
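A minimal sketch of this selection stage, assuming hypothetical `generate` and `judge` callables that wrap the candidate LLMs and the LLM judge:

```python
def select_best_pair(instruction: str, generators, judge):
    """Generate candidate responses with several LLMs and keep the highest-scoring one.

    generators: callables mapping an instruction to a response (one per LLM)
    judge: callable returning a dict of scores over the five evaluation
           dimensions (quality, usefulness, accuracy, consistency, etc.)
    """
    best_response, best_total = None, float("-inf")
    for generate in generators:
        response = generate(instruction)
        scores = judge(instruction, response)  # e.g. {"quality": 0.9, ...}
        total = sum(scores.values())           # simple aggregate; the paper's exact rule is assumed
        if total > best_total:
            best_response, best_total = response, total
    return instruction, best_response
```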
The result is a large training dataset that remains faithful to the task while maintaining diversity, enabling efficient and realistic fine-tuning.
Experiments
The authors test the effectiveness of TCIA at both the instruction and model levels.
First, by comparing TCIA with conventional methods such as WizardLM, the authors show that TCIA maintains a high level of task conformance while preserving instructional diversity.
For example, even after three expansions, TCIA maintained a task conformance rate of almost 100% and outperformed WizardLM in the diversity metric.
Next, the authors fine-tuned Llama-3.1-8B on four practical tasks, such as meeting summarization and information extraction, and observed an average performance improvement of 8.7%.
It is particularly noteworthy that the resulting models outperformed GPT-4o.
In addition, experiments on adaptation to new constraints confirmed that models trained with TCIA flexibly handle unseen requirements, such as switching from bulleted to numbered lists or new output-length restrictions.
Furthermore, the models maintained good scores on public benchmarks such as MMLU-Pro and GPQA, demonstrating both task-specific and general-purpose performance.