Features dropped when using to_networkx(), and the resulting networkx graph is incompatible with torch_geometric from_networkx() #196

@khaled3ttia

I'm trying to use the devmap dataset graphs with a torch geometric model.
I started with the provided script to generate the graphs from the devmap dataset here.

To Reproduce

I then read each graph file and try to process it as follows:

import pathlib
import programl as pg
from programl.util.py import pbutil
from programl.proto import program_graph_pb2
from torch_geometric.utils.convert import from_networkx

filepath = 'path/to/pbfile.pb'

# Load graph from pb file
graph = pbutil.FromFile(pathlib.Path(filepath), program_graph_pb2.ProgramGraph())

# convert ProgramGraph to networkx graph
nx_graph = pg.to_networkx(graph)

# ISSUE: printing the networkx graph shows that all features such as wgsize, transfer_bytes,... are gone
print(list(nx_graph.nodes(data=True)))

# ANOTHER ISSUE: cannot read in torch geometric
pyg_graph = from_networkx(nx_graph)

The first error occurs at the call to from_networkx():
ValueError: Not all nodes contain the same attributes

Looking at the nodes, I found that the first node has different attributes from the others, so I deleted it to get past the error:
nx_graph.remove_node(0)
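
To confirm which nodes actually differ before removing anything, one option is to group the nodes by the set of attribute names they carry. This is only a minimal sketch: the helper name group_nodes_by_attribute_keys is hypothetical, and example_nodes is made-up data standing in for list(nx_graph.nodes(data=True)), not real ProGraML output.

```python
from collections import defaultdict


def group_nodes_by_attribute_keys(nodes_with_data):
    """Group node ids by the sorted tuple of attribute names they carry.

    `nodes_with_data` is an iterable of (node, attrs_dict) pairs, which is
    the shape produced by networkx's `graph.nodes(data=True)`.
    """
    groups = defaultdict(list)
    for node, attrs in nodes_with_data:
        groups[tuple(sorted(attrs))].append(node)
    return dict(groups)


# Made-up stand-in for list(nx_graph.nodes(data=True)); attribute names and
# values here are illustrative assumptions only.
example_nodes = [
    (0, {"type": 0, "text": "root"}),
    (1, {"type": 0, "text": "alloca", "block": 0}),
    (2, {"type": 1, "text": "var", "block": 0}),
]

groups = group_nodes_by_attribute_keys(example_nodes)
for keys, nodes in groups.items():
    print(keys, "->", nodes)
```

More than one group in the result means the nodes do not share one attribute set, which is the condition the ValueError above complains about. On the real graph the call would be group_nodes_by_attribute_keys(nx_graph.nodes(data=True)).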

After that, the next error I get is:
RuntimeError: Could not infer dtype of dict

This happens when torch_geometric.utils.convert.from_networkx() tries to process the values for the key features in the networkx representation. I traced it back and found that the key blocks is processed just fine because it is a list of integers. However, it fails when processing features, which is a dict of dicts {'full_text': str , 'function': int, 'text': int, 'type': int}.
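
One possible workaround for the dict-valued attribute is to flatten nested dicts into top-level keys and keep only numeric values before converting. The sketch below is an assumption on my part, not something ProGraML or torch_geometric provides: flatten_numeric_attrs is a hypothetical helper, and example_attrs is invented data shaped like the description above.

```python
from numbers import Number


def flatten_numeric_attrs(attrs, prefix="", sep="_"):
    """Return a flat copy of `attrs` that contains only numeric data.

    Nested dicts are expanded into keys such as "features_type". Numbers and
    lists/tuples of numbers are kept; everything else (for example strings)
    is dropped, since it cannot be turned into a numeric tensor.
    """
    flat = {}
    for key, value in attrs.items():
        name = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten_numeric_attrs(value, prefix=name, sep=sep))
        elif isinstance(value, Number):
            flat[name] = value
        elif isinstance(value, (list, tuple)) and all(
            isinstance(v, Number) for v in value
        ):
            flat[name] = list(value)
    return flat


# Invented node attributes for illustration; not actual ProGraML output.
example_attrs = {
    "block": 3,
    "features": {
        "full_text": "%5 = add i32 %3, %4",
        "function": 0,
        "text": 5,
        "type": 1,
    },
}

flat = flatten_numeric_attrs(example_attrs)
print(flat)
```

On a real graph, each node's attribute dict from nx_graph.nodes(data=True) could be cleared and updated with the flattened version. Every node would still need to end up with the same keys, and any string data dropped here would have to be encoded separately if the model needs it.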

Is there a workaround for these two issues?
(1) The features being dropped when using to_networkx()
(2) The torch_geometric from_networkx() routine failing on the graph produced by to_networkx(ProgramGraph)

Thanks!

Environment

  • ProGraML version : 0.3.2
  • How you installed ProGraML (source, pip): pip
  • OS: Ubuntu 20.04.3
  • Python version: 3.9.7

Labels: Bug (Something isn't working)