Journal article
Vol. 17, No. Special issue: ISDS (2025), pages 106-116

Translating natural language into SQL is essential for intuitive database access, yet open-source small language models (SLMs) still lag behind larger systems when faced with complex schemas and tight context windows. This paper introduces a two-phase workflow designed to enhance the Text-to-SQL capabilities of SLMs. Phase 1 (offline) transforms the database schema into a graph, partitions it with Louvain community detection, and enriches each component in a cluster with metadata, relationships, and sample rows. Phase 2 (at runtime) selects the relevant tables, generates an SQL query, and iteratively refines it through an execution-driven feedback loop until the query executes successfully. Evaluated on the Spider test set, the pipeline raises Qwen-2.5-Coder-14B to 86.2% Execution Accuracy (EX). This surpasses the model's zero-shot baseline, outperforms all contemporary approaches that combine SLMs with in-context learning (ICL), and narrows the gap to GPT-4-based systems, all while running on consumer-grade hardware. Ablation studies confirm that both schema enrichment and self-correction contribute significantly to the improvement. The study concludes that this workflow provides a practical methodology for deploying resource-efficient open-source SLMs in Text-to-SQL applications and mitigates the challenges posed by complex schemas and limited context. An open-source implementation is released to support further research.
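
The execution-driven feedback loop of Phase 2 can be illustrated with a minimal sketch. The abstract does not give the implementation details, so everything here is an assumption made for illustration: the use of SQLite, the function name `refine_until_executable`, the `generate_sql` callback signature, the `max_rounds` limit, and the stub generator that stands in for the SLM.

```python
import sqlite3
from typing import Callable, List, Optional, Tuple

# Hypothetical generator signature: (question, previous_sql, error) -> new SQL text.
Generator = Callable[[str, Optional[str], Optional[str]], str]


def refine_until_executable(
    conn: sqlite3.Connection,
    question: str,
    generate_sql: Generator,
    max_rounds: int = 3,  # assumed limit; the abstract does not state one
) -> Tuple[Optional[str], List[str]]:
    """Generate SQL, execute it, and feed any database error back to the generator."""
    previous_sql: Optional[str] = None
    error: Optional[str] = None
    errors: List[str] = []
    for _ in range(max_rounds):
        sql = generate_sql(question, previous_sql, error)
        try:
            conn.execute(sql).fetchall()
            return sql, errors  # the query ran, so stop refining
        except sqlite3.Error as exc:
            # Keep the failed query and the error text for the next attempt.
            previous_sql, error = sql, str(exc)
            errors.append(error)
    return None, errors  # no executable query within the round limit


def stub_generator(question, previous_sql, error):
    """Hypothetical stand-in for the SLM: misspells a column, then corrects it."""
    if error is None:
        return "SELECT nme FROM singer"
    return "SELECT name FROM singer"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO singer (name) VALUES ('Ana')")

final_sql, errors = refine_until_executable(conn, "List singer names", stub_generator)
print(final_sql)
print(errors)
```

In a real deployment the stub would be replaced by a call to the language model, with the failed query and the error message included in the prompt. Because the loop executes generated SQL directly, it is safer to run it against a read-only connection or a copy of the database.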

 

