Description
Features are dropped by to_networkx(), and the resulting networkx graph is incompatible with torch_geometric's from_networkx()
I'm trying to use the devmap dataset graphs with a torch_geometric model.
I started with the provided script to generate the graphs from the devmap dataset here
To Reproduce
Then, I read each of the graph files and try to process them as follows:
import pathlib
import programl as pg
from programl.util.py import pbutil
from programl.proto import program_graph_pb2
from torch_geometric.utils.convert import from_networkx
filepath = 'path/to/pbfile.pb'
# Load graph from pb file
graph = pbutil.FromFile(pathlib.Path(filepath), program_graph_pb2.ProgramGraph())
# convert ProgramGraph to networkx graph
nx_graph = pg.to_networkx(graph)
# ISSUE: print networkx graph, now all features such as wgsize, transfer_bytes,... are gone
print(list(nx_graph.nodes(data=True)))
# ANOTHER ISSUE: cannot read in torch geometric
pyg_graph = from_networkx(nx_graph)
The first error I get is at the invocation of from_networkx():
ValueError: Not all nodes contain the same attributes
Looking at the nodes, I found that the first node has different attributes from the others, so I deleted it to make the conversion work:
nx_graph.remove_node(0)
After that, the next error I get is:
RuntimeError: Could not infer dtype of dict
This happens when torch_geometric.utils.convert.from_networkx() tries to process the values for the key features in the networkx representation. I traced it back and found that the key blocks is processed just fine because it is a list of integers. However, it fails when processing features, which is a dict of dicts {'full_text': str, 'function': int, 'text': int, 'type': int}
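Both failures come down to what the per-node attribute dicts contain. A minimal sketch for inspecting that is shown here, assuming only that `nx_graph.nodes(data=True)` yields `(node, attrs)` pairs. The helper name `summarize_node_attrs` and the sample data are hypothetical and are not real ProGraML output.

```python
from collections import Counter


def summarize_node_attrs(nodes_with_data):
    """Count distinct attribute key sets and collect value types per key.

    `nodes_with_data` is any iterable of (node, attrs_dict) pairs, for
    example list(nx_graph.nodes(data=True)).
    """
    key_sets = Counter()
    value_types = {}
    for _, attrs in nodes_with_data:
        # Nodes with the same sorted key tuple share the same attribute schema.
        key_sets[tuple(sorted(attrs))] += 1
        for key, value in attrs.items():
            value_types.setdefault(key, set()).add(type(value).__name__)
    return key_sets, value_types


# Made-up stand-in for list(nx_graph.nodes(data=True)).
example_nodes = [
    (0, {"type": 0, "text": "root"}),
    (1, {"type": 0, "text": "a", "function": 0, "features": {"full_text": "a"}}),
    (2, {"type": 0, "text": "b", "function": 0, "features": {"full_text": "b"}}),
]

key_sets, value_types = summarize_node_attrs(example_nodes)
for keys, count in key_sets.items():
    print(count, "node(s) with keys", keys)
print({key: sorted(types) for key, types in value_types.items()})
```

More than one entry in `key_sets` corresponds to the "Not all nodes contain the same attributes" error, and a `dict` entry in `value_types` corresponds to the "Could not infer dtype of dict" error.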
Is there any workaround for these two issues?
(1) The features being dropped when using to_networkx()
(2) The torch_geometric from_networkx() routine failing on the graph returned by to_networkx(ProgramGraph)
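For issue (2), one possible stopgap is to normalize the node attributes before conversion, so that every node has the same keys and only numeric values. The sketch below is an assumption-laden illustration, not a ProGraML or torch_geometric API: the helper `keep_numeric_attrs`, the choice of keys, and the default value are all hypothetical. It discards non-numeric attributes such as text, and it does nothing about issue (1).

```python
def keep_numeric_attrs(nodes_with_data, keys, default=0):
    """Restrict every attrs dict, in place, to `keys` with numeric values.

    `nodes_with_data` is any iterable of (node, attrs_dict) pairs.
    Missing keys and non-numeric values are replaced by `default`.
    """
    for _, attrs in nodes_with_data:
        kept = {}
        for key in keys:
            value = attrs.get(key, default)
            kept[key] = value if isinstance(value, (int, float)) else default
        # Mutate the existing dict so the owning graph sees the change.
        attrs.clear()
        attrs.update(kept)


# Made-up stand-in for nx_graph.nodes(data=True).
example_nodes = [
    (0, {"type": 0, "text": "root"}),
    (1, {"type": 1, "function": 2, "features": {"full_text": "a"}}),
]
keep_numeric_attrs(example_nodes, keys=["type", "function"])
print(example_nodes)
```

With a real graph this would be called as `keep_numeric_attrs(nx_graph.nodes(data=True), keys=[...])` before `from_networkx(nx_graph)`, instead of removing node 0. Edge and graph-level attributes may need similar treatment, which this sketch does not cover.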
Thanks!
Environment
- ProGraML version : 0.3.2
- How you installed ProGraML (source, pip): pip
- OS: Ubuntu 20.04.3
- Python version: 3.9.7