Graph Language Models
The Graph Language Model (GLM) integrates language models and graph neural networks: it inherits a pretrained LM's grasp of concepts and relations, outperforms existing models in relation classification, and processes both text and structured graph data.
The paper "Graph Language Models" by Moritz Plenz and Anette Frank presents a new type of language model that integrates the capabilities of traditional language models (LMs) and graph neural networks (GNNs). While LMs are widely used in natural language processing (NLP), their interaction with structured knowledge graphs (KGs) remains an area of active research. Current methods either linearize graphs for embedding with LMs, which leads to a loss of structural information, or utilize GNNs, which struggle to represent text features effectively. The proposed Graph Language Model (GLM) addresses these limitations by initializing its parameters from a pretrained LM, enhancing its understanding of graph concepts and relationships. The architecture of the GLM is designed to incorporate graph biases, facilitating effective knowledge distribution. This allows the GLM to process both graphs and text, as well as combinations of the two. Empirical evaluations demonstrate that GLM embeddings outperform both LM and GNN baselines in relation classification tasks, showcasing their versatility in supervised and zero-shot settings. The findings suggest that GLMs could significantly advance the integration of structured knowledge into NLP applications.
- The Graph Language Model (GLM) combines strengths of language models and graph neural networks.
- GLM parameters are initialized from pretrained language models to improve understanding of graph concepts.
- The GLM architecture incorporates graph biases to facilitate effective knowledge distribution (illustrated in the sketch after this list).
- Empirical evaluations show GLM embeddings outperform existing LM and GNN baselines.
- GLMs can process both text and structured graph data effectively.
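As a rough illustration of two of the bullets above (initializing from a pretrained LM and incorporating graph biases), here is a minimal sketch in which self-attention scores are biased by the graph distance between tokens. The bias scheme, the toy path graph, and the randomly generated weights are illustrative assumptions, not the authors' implementation; the paper's actual architecture should be taken from the original.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def graph_biased_attention(x, w_q, w_k, w_v, graph_dist, bias_per_hop=-1.0):
    """Single-head self-attention with an additive bias that decays with
    graph distance between tokens (a stand-in for the paper's graph biases).

    x           : (n_tokens, d_model) token embeddings
    w_q/w_k/w_v : (d_model, d_head) projections; in a GLM-style setup these
                  would be copied from a pretrained LM checkpoint
    graph_dist  : (n_tokens, n_tokens) shortest-path distances in the input
                  graph, np.inf for disconnected pairs
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    # Tokens far apart in the graph attend to each other less;
    # disconnected pairs are masked out entirely.
    bias = np.where(np.isinf(graph_dist), -1e9, bias_per_hop * graph_dist)
    return softmax(scores + bias) @ v

# Toy example: 4 tokens forming a path graph 0-1-2-3.
rng = np.random.default_rng(0)
d_model, d_head, n = 16, 8, 4
x = rng.normal(size=(n, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_head)) for _ in range(3))
dist = np.array([[0, 1, 2, 3],
                 [1, 0, 1, 2],
                 [2, 1, 0, 1],
                 [3, 2, 1, 0]], dtype=float)
print(graph_biased_attention(x, w_q, w_k, w_v, dist).shape)  # (4, 8)
```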
Related
Basically, language models are already graph neural networks. To understand why, go back to Word2Vec/GloVe: word embedding distances represent the co-occurrence frequency of words in a sentence.
Note how this is the same as a graph embedding problem: words are nodes, and the edge weight is co-occurrence frequency. You embed the graph nodes. In fact, this is stated in formal math in the GloVe paper.
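To make that concrete, here is a minimal sketch: build a word co-occurrence graph from a toy corpus, then factorize its log-weighted adjacency matrix to get node (word) embeddings. GloVe fits essentially this matrix with a weighted least-squares objective rather than an SVD; the corpus, window size, and embedding dimension below are made up for illustration.

```python
import numpy as np
from collections import Counter

# Toy corpus; in the graph view, words are nodes and co-occurrence counts
# within a window are edge weights.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat chased the dog",
]
window = 3

vocab = sorted({w for s in corpus for w in s.split()})
idx = {w: i for i, w in enumerate(vocab)}
cooc = Counter()
for sentence in corpus:
    words = sentence.split()
    for i, w in enumerate(words):
        for j in range(i + 1, min(i + window, len(words))):
            a, b = idx[w], idx[words[j]]
            cooc[(a, b)] += 1
            cooc[(b, a)] += 1

# Weighted adjacency matrix of the co-occurrence graph.
n = len(vocab)
A = np.zeros((n, n))
for (a, b), c in cooc.items():
    A[a, b] = c

# Embed the graph nodes: factorize log co-occurrence counts with a truncated
# SVD (GloVe instead fits this with a weighted least-squares loss).
U, S, _ = np.linalg.svd(np.log1p(A))
dim = 2
embeddings = U[:, :dim] * S[:dim]

for w in ("cat", "dog", "mat"):
    print(w, np.round(embeddings[idx[w]], 2))
```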
The LLM architecture is basically doing the same thing, except the graph encodes conditional occurrence given the preceding context words.
This setup makes for a graph with a truly astronomical number of nodes (word|context) and edges. This huge graph exists only in the land of abstract math, but it also shows why LLMs require so many parameters to perform well.
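A quick sketch of why that graph blows up: counting distinct context nodes and (context, next-word) edges in even a toy token sequence shows the growth as the context length increases. With a real vocabulary and long contexts, the counts become astronomical.

```python
from collections import Counter

# Count distinct context nodes and (context, next-word) edges in a toy
# sequence for increasing context lengths. The sequence is made up for
# illustration; real corpora and vocabularies make these counts explode.
tokens = "the cat sat on the mat and the dog sat on the rug".split()

for ctx_len in (1, 2, 3, 4):
    contexts = Counter(
        tuple(tokens[i:i + ctx_len]) for i in range(len(tokens) - ctx_len)
    )
    edges = {
        (tuple(tokens[i:i + ctx_len]), tokens[i + ctx_len])
        for i in range(len(tokens) - ctx_len)
    }
    print(f"context length {ctx_len}: "
          f"{len(contexts)} context nodes, {len(edges)} edges")
```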
In any case, 4 years on, I'm still pretty lukewarm on the current gen of graph neural network architectures.
Case in point: the OP paper is pretty much the classic ML paper-mill setup of "take some existing algorithm, add some stuff on top, spend a ton of time on hyperparameter search for your algo, and show it beats some 2-year-old baseline".
It is like the difference between concrete and abstract syntax. LLMs frequently generate code that won't compile, since they predict tokens not AST nodes. They are underconstrained for the task.
How to address this? You can train a single model to handle both representations, as the authors did, or you can enforce constraints while decoding.
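Here is a minimal sketch of the second option, constrained decoding: at each step, tokens that would break the grammar are masked out before sampling. The toy vocabulary, the balanced-expression grammar, and the random stand-in for LM logits are all assumptions for illustration; real systems apply the same masking to an actual model's logits using a parser or grammar automaton.

```python
import numpy as np

VOCAB = ["(", ")", "x", "+", "<eos>"]

def is_valid_prefix(tokens):
    """Check that tokens form a valid prefix of expressions like (x+x),
    using a tiny hand-rolled state machine: parentheses must balance and
    operators must sit between terms."""
    depth, expect_term = 0, True
    for t in tokens:
        if t == "(":
            if not expect_term:
                return False
            depth += 1
        elif t == "x":
            if not expect_term:
                return False
            expect_term = False
        elif t == "+":
            if expect_term:
                return False
            expect_term = True
        elif t == ")":
            if expect_term or depth == 0:
                return False
            depth -= 1
        elif t == "<eos>":
            return not expect_term and depth == 0
    return True

def constrained_decode(max_len=12, seed=0):
    rng = np.random.default_rng(seed)
    out = []
    for _ in range(max_len):
        logits = rng.normal(size=len(VOCAB))   # stand-in for LM logits
        for i, tok in enumerate(VOCAB):        # mask grammar-invalid tokens
            if not is_valid_prefix(out + [tok]):
                logits[i] = -np.inf
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        tok = rng.choice(VOCAB, p=probs)
        if tok == "<eos>":
            break
        out.append(tok)
    return "".join(out)

print(constrained_decode())
```

Because only grammar-valid continuations ever receive probability mass, the sampled output never contains a syntax error, though it can of course still be semantically wrong.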