GraphNoC: GNNs for FPGA NoC Summarized
Posted on
(Disclaimer: I am taking ECE493, which requires a discussion of this paper. This blog post on its own is not submitted to the course, but is understandably linked to what I did choose to submit.)
Introduction
GraphNoC: Graph Neural Networks for Application-Specific FPGA NoC Performance Prediction is a 2024 paper by Gurshaant Malik and Dr. Nachiket Kapre. Notably, it won Best-Paper Award at https://fpt2024.org in Sydney, NSW, AU.
It describes a paradigm by which GNNs are used to take a design for an FPGA NoC and then effectively 'simulate' the latency that a given NoC design will have.
Before we move on, I will summarize and define key terms.
NoC: Networks on Chip
Semiconductors have come a long way from being the focal points upon which a circuit routed signals. Now, having increased density chip density, there is clear motivation to connect IP blocks and route data between them in an efficient way.
IP blocks are defined as independent units of configurable logic that an integrated circuit or (in this case) an FPGA designer can use for some functionality within their overall design.
For those who are familiar with software, the clearest connection here is a 'library', since IP blocks often have verified behaviour and an interface.
An interface is effectively a 'contract' that states that one can expect certain behaviour if they follow certain rules. Given that IP blocks incur a high level of expertise to design and are often sold for some fee, their interfaces are frequently specified in a more comprehensive way than the average software library.
GNNs
Graph Neural Networks are a special form of the classic neural network that you may have worked with before. In Multilayer Feedforward Networks are Universal Approximators by Hornik, Stinchcombe, and White, we see that "feedforward networks are capable of arbitrarily accurate approximation to any real-valued continuous function over a compact set". We should note though that capability does not imply suitability.
Some problems, namely network problems, lend themselves to being represented in the form of a graph. Using a classic neural network for a problem represented in form of a graph will naturally lead to loss of 'dimensionality', since the classic neural network can only ingest 'flat' tensors.
Message Passing
A key distinction between a classic neural net and graph neural net is message passing. This is a procedure by which a given node aggregates messages from its neighbours. The locality of a given node informs what message it will get from its neighbours.
In GRAPH LEARNING AT SCALE: CHARACTERIZING AND OPTIMIZING PRE-PROPAGATION GNNS, Yue, Deng, and Zhang define message passing as being defined by an aggregation function $f_k$ that a node $v$ uses to collect and "aggregate" embeddings from its one-hop neighbours.
The follow equation explains it more thoroughly:
$$ h_v^{(k)} = l_k \left( h_v^{(k-1)}, f_k\left(\left\{(h_u^{(k-1)}, e_{vu}) \mid \forall u \in N(v)\right\}\right)\right) $$
This equation statees that the new embedding for node $v$ in layer $k$ is the transformation function ($l_k$) applied on the previous layer's node embedding for $v$ and the aggregation function $f_k$ applied on the set of pairs of the neighbouring nodes to $v$ in the $k-1$th layer and the edge attribute between them.
Node that $N(v)$ is the set of one-hop neighbours from node $v$.
Motivation
Now that some of the underlying pieces are clear, let's consider GraphNoC's raison d'ĂȘtre: the time it takes to simulate a given NoC design for an FPGA.
RTL (register-transfer level) simulations of FPGA NoCs take lots of time to run. According to GraphNoC, they are usually on the order of tens of minutes. As anyone who's ever baked, written a big C++ (or Rust!) codebase, or designed a NoC themselves knows, a feedback cycle above a minute is pretty much soul crushing.
The next logical choice here would be to opt for analytical models, which are in effect a mathematical model of the NoC by virtue of knowing its architecture. Examples of this include HopliteRT, which is another work by Dr. Kapre. One predicament with these models is that they tend to be bad at generalizing across different NoC architectures and specific application workflows.
GraphNoC also considers standard artifical neural nets, which tend to be faster than even analytical models (of course after training). A major benefit for ANNs is that they can learn specific details about a given NoC without being explicitly taught.
The major limiting factor for them, however, is the inherent topology of artificial neural nets, which is biased towards 'highly regular' and 'symmetric' NoC topologies such as meshes or tori.
Thus, GraphNoC comes to the conclusion that the most reasonable option would be graph neural networks, since they have approximately the same inference time, are compatible with the topologies that ANNs cannot manage with, and are feasible to use with different NoC architectures.
As is shown above, the loss (a measure of how accurately the prediction is emperically found to be) plateaus for ANNs whereas GNNs are sufficient to reduce it further.