Empowering Large Language Models:
Tool Learning for Real-World Interaction

SIGIR 2024 Tutorial, Washington, D.C.

July 14, 2024

1 The Chinese University of Hong Kong, 2 Tsinghua University, 3 UC Santa Cruz, 4 Renmin University of China, 5 The University of Edinburgh

Summary

Since the advent of large language models (LLMs), tool learning has remained a very active field for solving practical tasks, including but not limited to information retrieval. This half-day tutorial introduces the basic concepts of the field and gives an overview of recent advances, along with several applications. Specifically, we start with the foundational components and architecture of tool learning (i.e., cognitive tools and physical tools). We then categorize existing studies into tool-augmented learning and tool-oriented learning, and introduce various learning methods that give LLMs this capability. Furthermore, we present several cases showing when, what, and how to use tools in different applications. We end with open challenges and several potential research directions for future studies. We believe this tutorial is suited both to researchers at different stages (introductory, intermediate, and advanced) and to industry practitioners who are interested in LLMs and tool learning.

To the best of our knowledge, this is the first tutorial on LLM-based tool learning. More details can be found in the original proposal.

Schedule

Time | Section | Presenter
13:30–13:45 | Section 1: Introduction | Diji Yang
13:45–14:30 | Section 2: Foundations of Tool Learning | Hongru Wang
 | Section 2.1: Definition and Scope of Tools |
 | Section 2.2: Components and Architecture of Tool Learning |
14:30–15:00 | Section 3: Tool Learning based on LLMs | Yujia Qin
 | Section 3.1: Tool-oriented Learning |
 | Section 3.2: Tool-augmented Learning |
 | Section 3.3: "Learning" of Tool Learning |
15:00–15:30 | Coffee break |
15:30–16:00 | Section 4: Application of Tool Learning | Diji Yang
 | Section 4.1: Tool Creation, Selection and Utilization |
 | Section 4.2: Tool Learning in Information Retrieval |
 | Section 4.3: Tool Learning in Embodied Environment |
16:00–16:45 | Section 5: Advanced Topics and Future Directions | Hongru Wang
 | Section 5.1: Multi-modal and Multi-agent Tool Learning |
 | Section 5.2: Safe, Trustworthy and Personalized Tool Learning |
 | Section 5.3: Emerging Trends and Future Opportunities |
16:45–17:00 | Section 6: Summary and Outlook | Hongru Wang

Resources

BibTeX

@inproceedings{toolmeetllm,
        author = {Wang, Hongru and Qin, Yujia and Lin, Yankai and Pan, Jeff Z. and Wong, Kam-Fai},
        title = {Empowering Large Language Models: Tool Learning for Real-World Interaction},
        year = {2024},
        isbn = {9798400704314},
        publisher = {Association for Computing Machinery},
        address = {New York, NY, USA},
        url = {https://doi.org/10.1145/3626772.3661381},
        doi = {10.1145/3626772.3661381},
        abstract = {Since the advent of large language models (LLMs), the field of tool learning has remained very active in solving various tasks in practice, including but not limited to information retrieval. This half-day tutorial provides basic concepts of this field and an overview of recent advancements with several applications. In specific, we start with some foundational components and architecture of tool learning (i.e., cognitive tool and physical tool), and then we categorize existing studies in this field into tool-augmented learning and tool-oriented learning, and introduce various learning methods to empower LLMs this kind of capability. Furthermore, we provide several cases about when, what, and how to use tools in different applications. We end with some open challenges and several potential research directions for future studies. We believe this tutorial is suited for both researchers at different stages (introductory, intermediate, and advanced) and industry practitioners who are interested in LLMs and tool learning.},
        booktitle = {Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval},
        pages = {2983--2986},
        numpages = {4},
        keywords = {language agents, large language models, tool learning},
        location = {Washington DC, USA},
        series = {SIGIR '24}
        }