Aligning Graph Neural Networks and Large Language Models for Learning on Text-Attributed Graphs

Many graphs in real-world applications are text-attributed graphs (TAGs). Examples include a water distribution network over junctions with descriptions of the condition of its sensors and pipes, a follower graph over the users of a social media platform together with their posts, and a citation network over academic articles together with their text. This project focuses on combining graph neural networks (GNNs) and large language models (LLMs) to extract information from both the text modality and the graph modality, with the aim of facilitating learning on TAGs in engineering applications.

Keywords: Graph Neural Networks, Large Language Models, Text-Attributed Graphs.

Description
Over the last few years, large language models (LLMs) have advanced remarkably and reshaped the field of natural language processing. Although LLMs have focused primarily on text data, there is growing interest in extending their multi-modal capabilities so that they can handle other data types, including images, videos, and graphs. Over the past decade, graph neural networks (GNNs) have been widely applied to graph-structured data, such as social networks and citation networks. Many of these graphs also carry text attributes, for example the descriptions of the condition of sensors and pipes in a water distribution network. It is therefore natural to consider combining GNNs with LLMs for solving tasks on text-attributed graphs (TAGs). Doing so combines the strength of LLMs in textual understanding with the strength of GNNs in capturing structural relationships, which should lead to more comprehensive and powerful learning on TAGs.

The project will include the following tasks:
1. Literature review: Conduct a comprehensive review of existing literature on combining GNNs and LLMs for solving TAG tasks, especially on GNN-LLM alignment.
2. Implementation of state-of-the-art models: Implement state-of-the-art approaches for aligning GNNs and LLMs on TAG tasks, such as GRENADE [1], G2P2 [2], and ConGraT [3].
3. GNN-LLM alignment method development: Investigate the weaknesses of existing approaches and develop a new effective method for aligning GNNs and LLMs for TAG tasks.
4. Experimental evaluation: Compare the proposed method with existing state-of-the-art approaches on standard TAG benchmarks [5].
5. Case study: Apply the proposed method to a real-world engineering application, such as a water distribution network or a power distribution network.
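
As a rough illustration of what "GNN-LLM alignment" in task 3 can mean, one common family of objectives is contrastive: the graph-side embedding and the text-side embedding of the same node are pulled together, and embeddings of different nodes are pushed apart. The sketch below is an assumption made for illustration only. It is not necessarily the objective used by the methods cited above, nor the method this project will develop. The function names and the temperature value are hypothetical, and the code is written in plain Python so that it runs without dependencies; in practice the embeddings would come from a GNN and an LLM, and the loss would be computed in PyTorch.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def cross_entropy_diag(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the row maximum for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)

def contrastive_alignment_loss(graph_emb, text_emb, temperature=0.1):
    """Symmetric contrastive loss between paired graph and text embeddings.

    graph_emb[i] and text_emb[i] are assumed to describe the same node.
    The temperature of 0.1 is an arbitrary illustrative choice.
    """
    g = [normalize(v) for v in graph_emb]
    t = [normalize(v) for v in text_emb]
    # logits[i][j]: scaled cosine similarity between node i (graph side)
    # and node j (text side).
    logits = [[sum(a * b for a, b in zip(gi, tj)) / temperature for tj in t]
              for gi in g]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-graph direction
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits_t))

# Toy check: correctly paired embeddings give a lower loss than mismatched ones.
graph_emb = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
aligned = contrastive_alignment_loss(graph_emb, graph_emb)
shuffled = contrastive_alignment_loss(graph_emb, graph_emb[1:] + graph_emb[:1])
print(aligned < shuffled)
```

The loss is averaged over both directions (graph-to-text and text-to-graph) so that neither modality is treated as the sole anchor.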

Requirements:
1. Motivation to work at the intersection of GNNs and LLMs and on their application to engineering problems.
2. Experience in Python and PyTorch.
3. Familiarity with GNNs or LLMs.
Goal
The student will develop a method to effectively integrate the information from the graph modality with the information from the text modality by aligning GNNs and LLMs. The proposed model will be compared with existing state-of-the-art methods on TAG tasks and applied to address an engineering problem.
Contact Details
zepeng.zhang@epfl.ch

Calendar

Earliest start	2024-09-09
Latest end	2025-02-01

Location

ENAC - Civil Engineering Section (EPFL)

Labels

Semester Project
Master Thesis

Topics

Information, Computing and Communication Sciences