METHOD FOR PROCESSING GRAPH AND SYSTEM THEREFOR
Provided are a method for processing a graph and system therefor. The method according to some embodiments may include calculating spectral information associated with a target graph, and generating an embedding representation of the target graph based on information on nodes constituting the target graph and the spectral information.
This application claims priority from Korean Patent Application No. 10-2022-0153494 filed on Nov. 16, 2022 in the Korean Intellectual Property Office, and all the benefits accruing therefrom under 35 U.S.C. 119, the contents of which are herein incorporated by reference in their entirety.
BACKGROUND

1. Technical Field

The disclosure relates to a method for processing a graph and a system therefor, and more particularly, to a method for generating an embedding representation of a graph and performing various types of tasks using the generated embedding representation, and a system therefor.
2. Description of the Related Art

Graph embedding means transforming a given graph into a vector or matrix representation in an embedding space. Recently, research on how to embed a graph using a neural network has been actively conducted, and a neural network that handles a graph is referred to as a graph neural network (GNN).
Meanwhile, most graph embedding techniques proposed so far generate an embedding representation of a given graph using a GNN that aggregates information of neighboring nodes (i.e., passes messages between nodes). However, the proposed techniques have an obvious limitation in that the structural characteristics (information) of the graph are not reflected in the generated embedding representation. Accordingly, there is a problem in that performance is significantly degraded for tasks in which the structural characteristics of the graph are important.
SUMMARY

Aspects of the disclosure provide a method capable of accurately generating an embedding representation in which structural characteristics (information) of a graph are reflected, and a system for performing the method.
Aspects of the disclosure also provide a method capable of improving performance of various graph tasks (e.g., an isomorphic graph determination task) and a system for performing the method.
Aspects of the disclosure also provide a method capable of accurately extracting structural characteristics (information) from a given graph and a system for performing the method.
Aspects of the disclosure also provide a method capable of accurately aggregating different graph information (e.g., node information and spectral information) without loss and a system for performing the method.
However, aspects of the disclosure are not restricted to those set forth herein. The above and other aspects of the disclosure will become more apparent to one of ordinary skill in the art to which the disclosure pertains by referencing the detailed description of the disclosure given below.
According to some embodiments of the disclosure, there is provided a method for processing a graph performed by at least one computing device. The method may include calculating spectral information associated with a target graph, and generating an embedding representation of the target graph based on information on nodes constituting the target graph and the spectral information.
In some embodiments, the calculating of the spectral information may include deriving an ego-graph for at least some of the nodes from the target graph, and calculating spectral information of the ego-graph.
In some embodiments, the spectral information of the ego-graph may include at least one of an eigenvalue and an eigenvector of the ego-graph.
In some embodiments, the spectral information of the ego-graph may include angle information of the ego-graph.
In some embodiments, the calculating of the spectral information may include deriving an ego-graph for at least some of the nodes from the target graph, deriving a sub-graph from which an ego node is removed from the ego-graph, and calculating spectral information of the derived sub-graph.
In some embodiments, the generating of the embedding representation of the target graph may include performing repetitive node label updating according to a Weisfeiler-Lehman (WL) algorithm based on the information on the nodes and the spectral information, and generating the embedding representation based on final labels of the nodes obtained through the node label updating.
In some embodiments, the target graph may include a first graph and a second graph, and the method may further include determining whether the first graph and the second graph are isomorphic graphs by comparing an embedding representation of the first graph and an embedding representation of the second graph.
In some embodiments, the generating of the embedding representation of the target graph may include aggregating the information of the nodes and the spectral information, and generating the embedding representation by inputting the aggregated information into a graph neural network (GNN).
In some embodiments, the spectral information may be information in a form of a multi-set comprising a plurality of spectral elements, and the aggregating of the information of the nodes and the spectral information may include transforming the spectral information through a neural network module that transforms data in a form of multi-set into data in a form of vector or matrix, and aggregating the transformed spectral information and the information on the nodes.
In some embodiments, the aggregating of the information of the nodes and the spectral information may include changing a value of any one of the information of the nodes and the spectral information by reflecting a specific value to any one of the information of the nodes and the spectral information, and aggregating any changed information and other information.
In some embodiments, the specific value may be an irrational number.
In some embodiments, the specific value may be a value based on a learnable parameter, and the method may further include predicting a label for a predetermined task based on the generated embedding representation, and updating the value of the learnable parameter based on a difference between the predicted label and a correct label.
In some embodiments, the reflecting of the specific value may be performed based on a multiplication operation, and the aggregating of any changed information and other information may be performed based on an addition operation.
In some embodiments, the aggregating may be performed based on a concatenation operation.
In some embodiments, the GNN may be a neural network configured to generate an embedding representation of a graph by aggregating information of neighboring nodes.
In some embodiments, the generating of the embedding representation of the target graph may include generating an intermediate embedding representation of the target graph by inputting the information of the nodes into a graph neural network (GNN), and generating the embedding representation by aggregating the intermediate embedding representation and the spectral information.
According to other embodiments of the disclosure, there is provided a system for processing a graph. The system may include one or more processors, and a memory configured to store one or more instructions, wherein the one or more processors, by executing the stored one or more instructions, perform: calculating spectral information associated with a target graph, and generating an embedding representation of the target graph based on information on nodes constituting the target graph and the spectral information.
According to yet other embodiments of the disclosure, there is provided a computer program coupled to a computing device and stored in a computer-readable recording medium to execute operations including calculating spectral information associated with a target graph, and generating an embedding representation of the target graph based on information on nodes constituting the target graph and the spectral information.
According to yet other embodiments of the disclosure, there is provided a non-transitory computer-readable recording medium storing a computer program executable by at least one processor to perform: calculating spectral information associated with a target graph; and generating an embedding representation of the target graph based on information on nodes constituting the target graph and the spectral information.
The above and other aspects and features of the disclosure will become more apparent by describing in detail example embodiments thereof with reference to the attached drawings, in which:
Hereinafter, example embodiments of the disclosure will be described with reference to the attached drawings. Advantages and features of the disclosure and methods of accomplishing the same may be understood more readily by reference to the following detailed description of example embodiments and the accompanying drawings. The disclosure may, however, be embodied in many different forms and should not be construed as being limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete and will fully convey the concept of the disclosure to those skilled in the art, and the disclosure will only be defined by the appended claims and their equivalents.
In adding reference numerals to the components of each drawing, it should be noted that the same reference numerals are assigned to the same components as much as possible even though they are shown in different drawings. In addition, in describing the disclosure, when it is determined that the detailed description of the related well-known configuration or function may obscure the gist of the disclosure, the detailed description thereof will be omitted.
Unless otherwise defined, all terms used in the present specification (including technical and scientific terms) may be used in a sense that may be commonly understood by those skilled in the art. In addition, the terms defined in the commonly used dictionaries are not ideally or excessively interpreted unless they are specifically defined clearly. The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. In this specification, the singular also includes the plural unless specifically stated otherwise in the phrase.
In addition, in describing the components of this disclosure, terms such as first, second, A, B, (a), and (b) may be used. These terms are only for distinguishing one component from other components, and the nature or order of the components is not limited by the terms. If a component is described as being “connected,” “coupled” or “contacted” to another component, that component may be directly connected to or contacted with that other component, but it should be understood that yet another component may also be “connected,” “coupled” or “contacted” between the two components.
Hereinafter, various example embodiments of the disclosure will be described in detail with reference to the accompanying drawings.
As illustrated in
The embedding representation 14 may mean a representation of a vector or matrix form for the graph 13. In this case, the matrix (e.g., a three-dimensional matrix) may encompass the concept of tensor. The embedding representation may be used interchangeably with terms such as “embedding vector”, “embedding matrix”, “embedding code”, “latent representation”, and “graph representation” depending on the cases (or forms).
Specifically, the processing system 10 may generate an embedding representation 14 of a graph 13 using a graph neural network (GNN) 11 and/or a modified Weisfeiler-Lehman (WL) algorithm 12. In this case, the processing system 10 may generate the embedded representation of the graph 13 by calculating spectral information associated with the graph 13 and further using the calculated spectral information. In this way, the embedding representation 14 to which structural characteristics (information) of the graph 13 are reflected may be generated. This will be described in detail with reference to
The spectral information may include, for example, eigenvalue, eigenvector, and/or angle information. However, the scope of the disclosure is not limited thereto. For example, the spectral information may further include various characteristic information of a graph that may be derived from the illustrated information. The angle information of a graph may be calculated based on Equation 1 below, for example. Equation 1 calculates angle information “α” based on an adjacency matrix “A(G)” of a graph G having “n” nodes and “m” distinct eigenvalues “μ” (μ1 > . . . > μm). In Equation 1, “Di” is a diagonal matrix associated with an eigenvalue “μi”, that is, a matrix whose (j,j) diagonal element is “1” when the (j,j) element of matrix “D” is equal to “μi”, and “0” otherwise. In addition, “ej” means the j-th standard basis vector, and “αij” means the angle information for “μi” and the node “vj” (that is, the magnitude of the vector obtained by multiplying Pi and ej).
spectral decomposition of A(G) = UDU^T
Pi = UDiU^T (if D(j,j) = μi, Di(j,j) = 1; else Di(j,j) = 0)
αij = ∥Piej∥, where ej ∈ ℝ^n [Equation 1]
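As a hedged illustration of Equation 1, the angle information can be computed with NumPy's eigendecomposition; the 4-node path graph and all variable names below are our own example, not part of the disclosure:

```python
# Compute angle information alpha_ij = ||P_i e_j|| from the adjacency
# matrix of a small example graph (a 4-node path).
import numpy as np

A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

eigvals, U = np.linalg.eigh(A)             # spectral decomposition A = U D U^T
distinct = sorted(set(np.round(eigvals, 8)), reverse=True)  # mu_1 > ... > mu_m

n = A.shape[0]
alpha = np.zeros((len(distinct), n))
for i, mu in enumerate(distinct):
    mask = np.isclose(eigvals, mu)          # D_i: 1 on diagonal where D(j,j) == mu_i
    P = U @ np.diag(mask.astype(float)) @ U.T   # projector onto the mu_i eigenspace
    for j in range(n):
        e_j = np.eye(n)[:, j]               # j-th standard basis vector
        alpha[i, j] = np.linalg.norm(P @ e_j)

# Projections onto the orthogonal eigenspaces decompose each e_j completely,
# so every column of alpha has unit Euclidean norm.
col_norms = np.linalg.norm(alpha, axis=0)
```

Note that the column-norm property above is a convenient sanity check on any implementation of Equation 1.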
Those skilled in the art will already know the meaning and calculation method of the exemplified spectral information (e.g., eigenvalue, eigenvector, angle information). Therefore, a more detailed description thereof will be omitted; for further details of the exemplified spectral information, reference is made to spectral graph theory.
In addition, the processing system 10 may perform various graph tasks (e.g., classification tasks, regression tasks) by using the embedding representation 14. For example, as illustrated in
The processing system 10 may be implemented with at least one computing device. For example, all functions of the processing system 10 may also be implemented in one computing device, a first function of the processing system 10 may also be implemented in a first computing device, and a second function thereof may also be implemented in a second computing device. Alternatively, a specific function of the processing system 10 may also be implemented in a plurality of computing devices.
The computing device may include any device having a computing function, and reference for an example of such a device is made to
The above has schematically described the operation of processing system 10 in accordance with some example embodiments of the disclosure with reference to
In addition, hereinafter, for clarity of the disclosure, description will be continued while changing reference numerals such as GNN (11 in
As illustrated in
The reason for calculating the spectral information is that the structural characteristics (information) of the graph are implied in the spectral information. In other words, spectral information such as eigenvalues may be calculated from the adjacency matrix of the graph, and the adjacency matrix contains the structural information of the graph as it is. Therefore, the spectral information calculated from the adjacency matrix also includes the structural characteristics (information) of the corresponding graph.
Meanwhile, a specific method of calculating spectral information in step S31 may vary depending on the example embodiment. For example, the processing system 10 may extract the spectral information from an ego-graph derived from the target graph and from a sub-graph (hereinafter referred to as a “no-ego sub-graph”) derived by removing the ego node from the ego-graph. For better understanding, the reason for extracting the spectral information from the illustrated graphs will be further described with reference to
First, the operation method and problems of the existing WL algorithm will be described with reference to
A basic operation of the WL algorithm is to update each node's label (i.e., node embedding representation) by iteratively aggregating labels (e.g., feature information) of neighboring nodes. For example, as illustrated in
The problem with the above-described method is that connectivity information (see edge 42) between the neighboring nodes is lost during the aggregation (or embedding) process, because aggregation concentrates only on the information of the neighboring nodes themselves. Since the connectivity information between the nodes is directly related to the structural characteristics of the graph, the structural characteristics (information) of the graph may not be properly included in the embedding representation of the graph generated through the existing WL algorithm.
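The label-update rule and its blind spot described above can be sketched as follows (a toy example of ours, not the disclosure's code):

```python
# A minimal sketch of one WL label-refinement pass: each node's new label
# encodes its own label together with the multiset of its neighbors' labels.
def wl_step(adj, labels):
    """adj: {node: set of neighbor nodes}; labels: {node: hashable label}."""
    new_labels = {}
    for v, nbrs in adj.items():
        signature = (labels[v], tuple(sorted(labels[u] for u in nbrs)))
        new_labels[v] = hash(signature)  # stand-in for an injective relabeling
    return new_labels

# Toy 4-cycle: every node starts identical and stays identical after the
# update, because connectivity among the neighbors themselves (the edge-42
# situation in the text) is never consulted.
adj = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
labels = wl_step(adj, {v: 0 for v in adj})
```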
Based on the above, the reason for extracting the spectral information from the ego-graph and the no-ego sub-graph will be described.
As illustrated in
Hereinafter, various example embodiments of calculating the spectral information will be described.
In some example embodiments, an ego-graph for at least one node may be derived from the target graph, and spectral information may be calculated from the derived ego-graph. For example, the processing system 10 may derive an ego-graph for each node from the target graph (i.e., derive a graph with each node as an ego-node), and may calculate spectral information based on an adjacency matrix of the ego-graph for each node.
Alternatively, the spectral information of the ego-graph may include eigenvalue and angle information of the ego-graph. The eigenvalue and angle information of a specific graph imply the eigenvalues of all sub-graphs obtained by removing one node from the graph. Therefore, if these two pieces of information are calculated as the spectral information, there is an advantage in that there is no need to calculate the eigenvalues of the no-ego sub-graph (that is, an effect of reducing computing cost). Further referring to
In some other example embodiments, an ego-graph for at least one node may be derived from the target graph, and a no-ego sub-graph may be derived from the ego-graph. In addition, the spectral information may be calculated from the no-ego sub-graph. For example, the processing system 10 may derive the no-ego sub-graph for each node by deriving an ego-graph for each node from the target graph and removing the ego node from each such ego-graph. In addition, the processing system 10 may calculate the spectral information based on the adjacency matrix of the no-ego sub-graph for each node.
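The ego-graph and no-ego sub-graph derivation described above might be sketched as follows; the example graph, the function name, and the choice of a 1-hop neighborhood are illustrative assumptions:

```python
# Build the 1-hop ego-graph of a node, then drop the ego node to obtain
# the "no-ego sub-graph"; spectral information is then read off its
# adjacency matrix.
import numpy as np

def ego_and_no_ego(A, v):
    """A: (n, n) adjacency matrix; v: index of the ego node."""
    nodes = [v] + [u for u in range(A.shape[0]) if A[v, u] != 0]
    ego = A[np.ix_(nodes, nodes)]   # size (deg(v)+1) x (deg(v)+1)
    no_ego = ego[1:, 1:]            # remove the ego node (row/column 0)
    return ego, no_ego

# Toy triangle-plus-pendant graph, made up for illustration.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

ego, no_ego = ego_and_no_ego(A, 0)
eigvals = np.linalg.eigvalsh(no_ego)   # spectral info of the no-ego sub-graph
```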
In some still other example embodiments, spectral information associated with the target graph may be calculated based on various combinations of the above-described example embodiments. For example, the processing system 10 may calculate eigenvalues of the ego-graph, eigenvalues of the no-ego sub-graph, angle information of the ego-graph, and angle information of the no-ego sub-graph as the spectral information.
The description will be provided with reference to
In step S32, an embedding representation of the target graph may be generated based on the spectral information and the node information. Here, the node information may mean, for example, feature information (e.g., feature matrix) of a node, but the scope of the disclosure is not limited thereto. In this step S32, a specific method of generating the embedding representation of the target graph may vary depending on the example embodiment.
In some example embodiments, the spectral information may be information in a multi-set form including a plurality of spectral elements. In this case, the processing system 10 may transform the spectral information into vector or matrix-type information through a neural network module, and may generate an embedding representation of the target graph based on the transformed information and the node information. The present example embodiment will be further described later with reference to
In some other example embodiments, an embedding representation of the target graph may be generated using a GNN. For example, the processing system 10 may generate an embedding representation of the target graph by inputting information in which the node information and the spectral information are aggregated to the GNN. Alternatively, the processing system 10 may generate an embedding representation of the target graph by aggregating the spectral information with an intermediate embedding representation obtained by inputting the node information into the GNN. The present example embodiment will be described in more detail later with reference to
In some still other example embodiments, an embedding representation of the target graph may be generated using a modified WL algorithm. For example, the processing system 10 may generate an embedding representation of the target graph by performing a modified WL algorithm based on the node information and the spectral information. The present example embodiment will be described in more detail later with reference to
In some still other example embodiments, an embedding representation may also be generated based on various combinations of the above-described example embodiments. For example, the processing system 10 may generate a first embedding representation using a GNN, generate a second embedding representation using a modified WL algorithm, and generate a final embedding representation of the target graph by aggregating two embedding representations.
In step S33, a predetermined task may be performed using the embedding representation of the target graph. Here, the task to be performed may be a classification task (e.g., an isomorphic determination task) or a regression task. A specific type of tasks to be performed may vary, and any task may be performed.
In some example embodiments, processing system 10 may perform an isomorphic determination task. That is, the processing system 10 may determine whether a first graph and a second graph are isomorphic graphs by generating the embedding representation of the first graph and the embedding representation of the second graph as described above and comparing the two embedding representations. The present example will be described later with further reference to the descriptions of
On the other hand, if the task is performed for embedding learning (e.g., embedding learning for GNN) (i.e., the learning step), the processing system 10 may update an embedding-related module and a value of a learnable parameter based on a difference between a result of performing the task and a correct answer. As such a process is repeatedly performed for a plurality of graphs, the embedding-related module (e.g., GNN, etc.) may be equipped with embedding capabilities for graphs. This will be described with further reference to the description of
So far, the graph processing method according to some example embodiments of the disclosure has been described with reference to
In addition, the spectral information may be calculated from the ego-graph for nodes constituting the target graph. Since the spectral information of the ego-graph includes abundant structural characteristics (information) of the target graph, the embedding representation to which the structural characteristics (information) of the target graph are reflected may be more accurately generated when the corresponding spectral information is used.
In addition, the spectral information may be calculated from the no-ego sub-graph. Since the spectral information of the no-ego sub-graph also includes abundant structural characteristics (information) of the target graph, the embedding representation to which the structural characteristics (information) of the target graph are reflected may be more accurately generated when the corresponding spectral information is used.
In addition, the eigenvalue and the angle information may be calculated as the spectral information from the ego-graph for the nodes constituting the target graph. Since the eigenvalue and angle information imply the eigenvalues of all sub-graphs from which one node is removed, there is no need to separately calculate the eigenvalues of the no-ego sub-graph when the eigenvalue and angle information are used. Accordingly, the computing cost required to calculate the eigenvalues of the no-ego sub-graph may be reduced, and the embedding representation of the target graph may be accurately generated.
Hereinafter, a method of calculating spectral information according to some example embodiments of the disclosure will be described with reference to
As illustrated in
For reference, since the number of nodes of each ego-graph may vary, the adjacency matrices of the ego-graphs may also have different sizes; an ego-graph for an ego node i has deg(i)+1 nodes, where deg(i) is the degree of the ego node (note that the size of the adjacency matrix is (deg(i)+1) × (deg(i)+1)). In addition, in
Next, the processing system 10 may calculate spectral information 74 of the ego-graph for each node. For example, the processing system 10 may calculate eigenvalue and angle information of the corresponding ego-graph from the adjacency matrix 73 of the ego-graph for each node. However, the scope of the disclosure is not limited thereto.
Next, processing system 10 may transform the spectral information 74 into an appropriate form. Since such transformation is performed to aggregate the spectral information with node information, the processing system 10 may transform the spectral information 74 to be suitable for the form of the node information (see 75). If the transformation of spectral information is not required, the transformation process may be omitted.
In some example embodiments, the spectral information may be calculated in a multi-set form. For example, the processing system 10 may calculate as many eigenvalues and angle information values of the corresponding ego-graph as the number of nodes of the ego-graph. In this case, information in the multi-set form having pairs (tuples) each including an eigenvalue and angle information as spectral elements may be calculated. Since those skilled in the art will already know that a plurality of eigenvalues and angle information values are calculated from one graph, a description thereof will be omitted. In the present example embodiment, the processing system 10 may transform the spectral information into an appropriate form (e.g., vector or matrix) using a neural network module. Hereinafter, the transformation process will be further described with reference to
As illustrated in
Then, the processing system 10 may transform the spectral information 81 of the multi-set form through a neural network module configured to perform a first transformation operation (see 82), an addition operation, and a second transformation operation (see 85).
More specifically, the processing system 10 may perform the first transformation operation on each of the spectral elements constituting the multi-set 81 through a first MLP 82 (see 83), and may aggregate (integrate) the transformed spectral elements into a vector (or matrix) form through the addition operation (see 84).
Next, the processing system 10 may perform the second transformation operation on the result of aggregation 84 through a second MLP 85. As a result of the second transformation operation, spectral information 86 of the vector (or matrix) form may be generated.
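The per-element transform, addition, and second transform described above can be sketched in the style of a Deep Sets module; the layer sizes, random weights, and (eigenvalue, angle) pairs below are illustrative assumptions, not trained parameters from the disclosure:

```python
# Permutation-invariant multiset-to-vector transform: per-element MLP (phi),
# sum over elements, then a second MLP (rho).
import numpy as np

rng = np.random.default_rng(0)

def mlp(dims):
    """Return a tiny ReLU MLP as a list of (W, b) pairs."""
    return [(rng.standard_normal((dims[i], dims[i + 1])) * 0.1,
             np.zeros(dims[i + 1])) for i in range(len(dims) - 1)]

def forward(layers, x):
    for i, (W, b) in enumerate(layers):
        x = x @ W + b
        if i < len(layers) - 1:
            x = np.maximum(x, 0.0)   # ReLU on hidden layers
    return x

phi = mlp([2, 16, 16])               # first transform, applied per element
rho = mlp([16, 16, 8])               # second transform, applied after the sum

def multiset_to_vector(elements):
    """elements: (k, 2) array of (eigenvalue, angle) spectral elements."""
    summed = forward(phi, elements).sum(axis=0)   # addition => order invariance
    return forward(rho, summed)

pairs = np.array([[1.7, 0.5], [-0.3, 0.8], [0.1, 0.2]])
out = multiset_to_vector(pairs)
out_permuted = multiset_to_vector(pairs[[2, 0, 1]])   # same multiset, new order
```

Because the only cross-element operation is the addition, reordering the spectral elements leaves the output unchanged, which is exactly the permutation-invariance property the text relies on.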
Since those skilled in the art will already know that a transformation module (function) having permutation invariant characteristics may be implemented through the addition operation and the two transformation operations, and the information of multi-set form may be transformed into the vector (or matrix) form through this, further description of
So far, the method of calculating the spectral information according to some example embodiments of the disclosure has been described with reference to
Hereinafter, various example embodiments of the disclosure related to graph embedding will be described with reference to
First, a GNN-based graph embedding method according to some example embodiments of the disclosure will be described with reference to
As illustrated in
Specifically, the processing system 10 may calculate spectral information 94 associated with the target graph 91 based on an adjacency matrix 93 of the target graph 91.
Next, the processing system 10 may aggregate the node information 92 and the spectral information 94 of the target graph 91.
The aggregation method may be, for example, concatenation, multiplication, addition, elementwise product, neural network (e.g., multilayer perceptron) based aggregation, or a combination thereof. However, the scope of the disclosure is not limited thereto.
In some example embodiments, the node information 92 and the spectral information 94 may be aggregated through a concatenation operation. In this case, the two pieces of information may be accurately aggregated without loss of information.
In some other example embodiments, after changing a value by reflecting a specific value to at least one of the node information 92 and the spectral information 94, the two pieces of information 92 and 94 may be aggregated. For example, as illustrated in
In the previous example embodiments, a specific method of deriving/generating a specific value may be designed in various ways.
For example, the specific value may be a preset value, such as a value based on a kind of hyperparameter (e.g., “ε”) preset by a user. As a more specific example, the specific value may be set to an irrational number. This is because, when one (or both) of the two pieces of information is multiplied by an irrational number, it is possible to effectively prevent the values of the two pieces of information from canceling each other out during the aggregation (e.g., addition operation) process. For reference, it has been mathematically proven by the inventors of the disclosure that information loss may be prevented by aggregating two pieces of information using an irrational-number-based multiplication operation and an addition operation.
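As a sketch of this aggregation variant, one of the two inputs is scaled by an irrational-based value before the addition; the concrete input arrays, the treatment of ε as an offset, and the choice of √2 are our assumptions for illustration:

```python
# Scale the node information by (epsilon + sqrt(2)) -- an irrational-based
# specific value, as the text suggests -- then aggregate by addition.
import math
import numpy as np

def aggregate(node_info, spectral_info, epsilon=0.0):
    scale = epsilon + math.sqrt(2)   # irrational offset discourages cancellation
    return scale * node_info + spectral_info

node_info = np.array([1.0, -2.0, 0.5])       # toy node features
spectral_info = np.array([0.3, 0.3, 0.3])    # toy spectral features
agg = aggregate(node_info, spectral_info)
```

In a learnable variant, `epsilon` would be a trainable parameter updated from the task loss, as described in the following paragraph.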
As another example, the specific value may be a value derived based on a learnable parameter (e.g., when the learnable parameter is “ε”, the specific value may be a value of ε itself, ε+irrational number, etc.). In this case, the processing system 10 may also update the value of the learnable parameter during the process of proceeding with embedding learning. By doing so, an optimal value capable of preventing loss of information (or expressive power) during the aggregation process may be naturally and accurately derived. This will be described with further reference to the description of
The description will be provided with reference to
The processing system 10 may aggregate the node information 92 and the spectral information 94 and input the aggregated information 95 to the GNN 96. In addition, the processing system 10 may generate a final embedding representation 98 by performing a pooling operation (e.g., addition, etc.) on an embedding representation 97 output from the GNN 96. In some cases, the pooling operation may be omitted. Those skilled in the art will already know the pooling operations used in the GNN field, and thus a detailed description thereof will be omitted.
For reference, the pooling operation may also be performed by a pooling module positioned inside the GNN 96. In addition, in the art, the “pooling module” may be used interchangeably with terms such as a “pooling layer”, a “readout layer”, and a “readout module”.
The GNN 96 may be, for example, a model that performs embedding by aggregating neighboring node information (i.e., a message passing method between nodes), but the scope of the disclosure is not limited thereto. An example of such a model is the graph isomorphism network (GIN). The GIN is a GNN that implements the node label update operation (i.e., hash operation) of the WL algorithm with a multilayer perceptron, and is known as a model with the same expressive power as the WL algorithm. Those skilled in the art will already know the structure and operating principle of the GIN, and thus a detailed description thereof will be omitted.
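For reference, the GIN node update mentioned above has the well-known form h′v = MLP((1+ε)·hv + Σu∈N(v) hu); the sketch below substitutes an identity map for the trained MLP so the arithmetic is easy to follow (the graph and features are our toy example):

```python
# One GIN-style update: scale each node's own feature by (1 + eps), add the
# sum of its neighbors' features (A @ H), then apply the MLP.
import numpy as np

def gin_update(A, H, eps=0.0, mlp=lambda x: x):
    """A: (n, n) adjacency matrix; H: (n, d) node feature matrix."""
    return mlp((1.0 + eps) * H + A @ H)   # A @ H sums neighbor features

A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)    # 3-node path graph
H = np.eye(3)                             # one-hot initial features
H1 = gin_update(A, H)
```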
Meanwhile, although not illustrated in
Meanwhile,
So far, the example embodiments of the GNN-based graph embedding method have been described with reference to
Hereinafter, a modified WL algorithm-based graph embedding method according to some example embodiments of the disclosure will be described with reference to
As illustrated in
In the present example embodiment, the processing system 10 may generate an embedding representation (e.g., 134) of a target graph (e.g., 131) in a manner similar to the WL algorithm. For example, the processing system 10 may generate the embedding representations (e.g., 134 and 138) of the target graphs (e.g., 131 and 135) by repeatedly updating the label of each node based on node information (e.g., 132 and 136) and spectral information (e.g., 133 and 137) of the target graphs (e.g., 131 and 135). For example, the processing system 10 may generate the embedding representation (e.g., 134) of the target graph (e.g., 131) based on final labels obtained through the repeated updating. The embedding representation (e.g., 134) may be a set of the final labels of the nodes, and may be generated by appropriately aggregating the final labels of each node.
For reference, the label of the node may be used interchangeably with terms such as “color information”, “feature information”, and “signature” in the relevant technical field.
In addition, the spectral information (e.g., 133) may or may not be reflected in the initial label of each node. Further, the spectral information (e.g., 133) may be used in every step of updating the node labels, or only in some of those steps.
When the embedding representation (e.g., 134) of the graph (e.g., 131) is generated, the processing system 10 may perform a predetermined task using the corresponding embedding representation (e.g., 134). For example, as illustrated, the processing system 10 may determine whether the two graphs are isomorphic by comparing the embedding representation 134 of the first graph 131 and the embedding representation 138 of the second graph 135. If the embedding representation 134 of the first graph 131 is the same as the embedding representation 138 of the second graph 135, the processing system 10 may determine that the two graphs 131 and 135 are isomorphic. Otherwise, the processing system 10 may determine that the two graphs 131 and 135 are not isomorphic.
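A minimal sketch of this WL-style refinement with spectral information folded into the labels might look as follows. Everything here is illustrative: Python's built-in `hash` stands in for the label hashing, the per-node spectral values are dummy scalars, and the final embedding is taken as the multiset (Counter) of final labels:

```python
from collections import Counter

def wl_labels(adj, init_labels, spectral, rounds=3):
    """WL-style refinement: each node's label is repeatedly re-hashed
    from (own label, sorted multiset of neighbor labels, spectral info).
    adj: dict node -> list of neighbors."""
    labels = {v: hash((init_labels[v], spectral[v])) for v in adj}
    for _ in range(rounds):
        labels = {
            v: hash((labels[v],
                     tuple(sorted(labels[u] for u in adj[v])),
                     spectral[v]))
            for v in adj
        }
    return Counter(labels.values())  # multiset of final labels

# Two identical triangles get equal embeddings; a 3-node path does not.
tri = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
path = {0: [1], 1: [0, 2], 2: [1]}
emb1 = wl_labels(tri, {v: "a" for v in tri}, {v: 1.0 for v in tri})
emb2 = wl_labels(tri, {v: "a" for v in tri}, {v: 1.0 for v in tri})
emb3 = wl_labels(path, {v: "a" for v in path}, {v: 1.0 for v in path})
print(emb1 == emb2, emb1 == emb3)  # True False
```

Equal embeddings are consistent with isomorphism, while differing embeddings prove non-isomorphism, matching the comparison logic described above.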
So far, the modified WL algorithm-based graph embedding method according to some example embodiments of the disclosure has been described with reference to
Hereinafter, the results of experiments performed by the inventors of the disclosure will be briefly described.
In order to check the performance of the above-described graph processing (or embedding) method (hereinafter referred to as the “proposed method”), the present inventors conducted an experiment using graphs that the existing WL algorithm may not distinguish. Specifically,
Referring to Table 1, it may be confirmed that the embedding representations of the graph pairs illustrated in
So far, the performance experiment results for the method proposed by the present inventors have been briefly described. Hereinafter, an example computing device 170 capable of implementing the processing system 10 described above will be described with reference to
As illustrated in
The processor 171 may control an overall operation of each component of the computing device 170. The processor 171 may be configured to include at least one of a central processing unit (CPU), a micro processor unit (MPU), a micro controller unit (MCU), a graphic processing unit (GPU), or any type of processor well known in the art. In addition, the processor 171 may perform a calculation on at least one application or program for executing the operation/method according to various example embodiments of the disclosure. The computing device 170 may include one or more processors.
Next, the memory 172 stores various data, commands, and/or information. The memory 172 may load the computer program 176 from the storage 175 to execute the operations/methods according to various example embodiments of the disclosure. The memory 172 may be implemented as a volatile memory such as RAM, but the technical scope of the disclosure is not limited thereto.
Next, the bus 173 may provide a communications function between the components of the computing device 170. The bus 173 may be implemented as various types of buses, such as an address bus, a data bus, and a control bus.
Next, the communication interface 174 supports wired/wireless Internet communications of the computing device 170. In addition, the communication interface 174 may also support various communication methods other than Internet communications. To this end, the communication interface 174 may include a communication module well known in the art of the disclosure.
Next, the storage 175 may non-temporarily store one or more computer programs 176. The storage 175 may include a non-volatile memory such as a read only memory (ROM), an erasable programmable ROM (EPROM), an electrically erasable programmable ROM (EEPROM), or a flash memory, a hard disk, a removable disk, or any form of computer-readable recording medium well known in the art to which the disclosure pertains.
Next, the computer program 176 may include one or more instructions that, when loaded into the memory 172, cause the processor 171 to perform the operations/methods according to various example embodiments of the disclosure. That is, the processor 171 may perform the operations/methods according to various example embodiments of the disclosure by executing the loaded instructions.
For example, the computer program 176 may include one or more instructions for causing the processor to perform an operation of calculating spectral information associated with a target graph and an operation of generating an embedding representation of the target graph based on information on nodes constituting the target graph and the spectral information. In this case, the processing system 10 according to some example embodiments of the disclosure may be implemented through the computing device 170.
Meanwhile, in some example embodiments, the computing device 170 illustrated in
Hereinabove, the example computing device 170 capable of implementing the processing system 10 according to some example embodiments of the disclosure has been described with reference to
Hereinabove, various example embodiments of the disclosure and effects according to the example embodiments have been described with reference to
According to some example embodiments of the disclosure, an embedding representation of a target graph may be generated further based on spectral information (e.g., eigenvalues, eigenvectors, angle information, etc.) associated with the target graph. Accordingly, the embedding representation to which structural characteristics (information) of the target graph are reflected may be accurately generated. The embedding representation generated in this way may have stronger expressive power than an embedding representation generated through a graph neural network (GNN) (e.g., GIN) using the existing Weisfeiler-Lehman (WL) algorithm or neighbor node information aggregation method, and may improve the performance of various graph tasks. For example, the performance of a task of determining isomorphism of graphs may be greatly improved.
In addition, spectral information may be calculated from an ego-graph for nodes constituting the target graph. Since the spectral information of the ego-graph includes abundant structural characteristics (information) of the target graph, the embedding representation to which the structural characteristics (information) of the target graph are reflected may be more accurately generated when the corresponding spectral information is used.
In addition, the spectral information may be calculated from a sub-graph (hereinafter, referred to as a “no-ego sub-graph”) derived by removing an ego node from the ego-graph. Since the spectral information of the no-ego sub-graph also includes abundant structural characteristics (information) of the target graph, the embedding representation to which the structural characteristics (information) of the target graph are reflected may be more accurately generated when the corresponding spectral information is used.
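For illustration, extracting the ego-graph of a node, deriving the no-ego sub-graph, and computing the eigenvalues of each could be sketched with numpy as follows (the helper names are hypothetical, and the ego-graph is taken as the node plus its direct neighbors):

```python
import numpy as np

def ego_graph_adjacency(A, v):
    """Adjacency of the ego-graph of node v: v plus its direct neighbors,
    with the ego node placed first."""
    nodes = [v] + [u for u in range(len(A)) if A[v, u]]
    return A[np.ix_(nodes, nodes)]

def spectral_info(A, v):
    """Eigenvalues of node v's ego-graph and of the no-ego sub-graph
    (the ego-graph with the ego node v removed)."""
    ego_A = ego_graph_adjacency(A, v)
    no_ego_A = ego_A[1:, 1:]  # drop the ego node's row and column
    return np.linalg.eigvalsh(ego_A), np.linalg.eigvalsh(no_ego_A)

# 4-cycle: the ego-graph of node 0 is the path 1 - 0 - 3.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
ego_eigs, no_ego_eigs = spectral_info(A, 0)
print(np.round(ego_eigs, 3))  # 3-node path spectrum: -1.414, 0, 1.414
print(no_ego_eigs)            # two isolated nodes: 0, 0
```

Both spectra capture local structure around node 0 that a plain degree count would miss, which is why they can enrich the node information fed into the embedding.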
In addition, in an ego-graph for nodes constituting the target graph, eigenvalue and angle information may be calculated as the spectral information. Since the eigenvalue and angle information imply the eigenvalues of all sub-graphs from which one node is removed, there is no need to separately calculate the eigenvalues of the no-ego sub-graph when the eigenvalue and angle information are used. Accordingly, the computing cost required to calculate the eigenvalues of the no-ego sub-graph may be reduced.
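The fact that eigenvalues and angles determine the spectra of all one-vertex-deleted sub-graphs follows from a known identity of spectral graph theory: φ_{G−j}(x) = φ_G(x) · Σ_i α_{ij}² / (x − μ_i), where μ_i are the eigenvalues and α_{ij} the graph angles. The following numerical sketch assumes simple (non-repeated) eigenvalues, so the angles reduce to absolute eigenvector entries; the function names are hypothetical:

```python
import numpy as np

def eigen_and_angles(A):
    """Eigenvalues of A and the graph 'angles'. For simple eigenvalues,
    angle[i, j] = |j-th entry of the i-th unit eigenvector|."""
    mus, U = np.linalg.eigh(A)
    return mus, np.abs(U).T  # angles[i, j] = |U[j, i]|

def vertex_deleted_char_poly(mus, angles, j, x):
    """phi_{G-j}(x) = phi_G(x) * sum_i angles[i, j]^2 / (x - mu_i):
    the characteristic polynomial of G with vertex j deleted, obtained
    from eigenvalues and angles alone."""
    phi_G = np.prod(x - mus)
    return phi_G * np.sum(angles[:, j] ** 2 / (x - mus))

# 3-node path; delete the center node (j = 1) -> two isolated nodes,
# whose characteristic polynomial is x^2.
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
mus, angles = eigen_and_angles(A)
x = 2.5
deleted = np.delete(np.delete(A, 1, 0), 1, 1)
phi_direct = np.prod(x - np.linalg.eigvalsh(deleted))   # x^2 at 2.5
phi_from_angles = vertex_deleted_char_poly(mus, angles, 1, x)
print(np.isclose(phi_direct, phi_from_angles))  # True
```

The agreement illustrates why, once eigenvalues and angles of the ego-graph are known, no separate eigendecomposition of the no-ego sub-graph is needed.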
In addition, the embedding representation to which the structural characteristics (information) of the target graph are reflected may be accurately generated by aggregating spectral information associated with the target graph and node information (e.g., feature matrix) of the target graph and inputting the aggregated information into the GNN.
In addition, after a specific value is reflected to the spectral information (e.g., multiplication by an irrational number), the corresponding information may be aggregated with the node information of the target graph. In this case, since the specific value prevents the values of the two pieces of information from offsetting (canceling) each other during the aggregation process, loss of information (or expressive power) during the aggregation process may be prevented. For example, by multiplying the spectral information by the irrational number, it is possible to effectively prevent the values of the two pieces of information from being offset during an aggregation (e.g., addition) process.
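A toy numerical sketch of why an irrational multiplier helps (illustrative only; scalar values stand in for the actual node and spectral features): with a rational scale, two different (node, spectral) pairs can sum to the same value, while an irrational scale keeps rational-valued pairs distinguishable.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)  # irrational multiplier

def aggregate(node_val, spectral_val, scale=SQRT2):
    """Scale the spectral value by an irrational number before adding it
    to the node value, so distinct rational (node, spectral) pairs
    cannot collapse to the same sum."""
    return node_val + scale * spectral_val

# With a rational scale of 2, the pairs (3, 1) and (1, 2) collide:
collide = (3 + 2 * 1 == 1 + 2 * 2)               # both equal 5
# With the irrational scale, the same pairs stay distinct:
distinct = (aggregate(3.0, 1.0) != aggregate(1.0, 2.0))
print(collide, distinct)  # True True
```

This is the sense in which the multiplier prevents the two pieces of information from offsetting each other under additive aggregation.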
In addition, the specific value may be derived based on a learnable parameter. In this case, as embedding learning progresses, an optimal value capable of preventing loss of information (or expressive power) during the aggregation process may be naturally and accurately derived.
In addition, by performing the modified WL algorithm based on the spectral information associated with the target graph and the node information of the target graph, the embedding representation to which the structural characteristics (information) of the target graph are reflected may be accurately generated, and the performance of the isomorphism determination task may be further improved.
Effects according to the technical idea of the disclosure are not limited to the effects mentioned above, and other effects that are not mentioned may be obviously understood by those skilled in the art from the following description.
The technical features of the disclosure described so far may be embodied as computer readable codes on a computer readable medium. The computer readable medium may be, for example, a removable recording medium (CD, DVD, Blu-ray disc, USB storage device, removable hard disk) or a fixed recording medium (ROM, RAM, computer equipped hard disk). The computer program recorded on the computer readable medium may be transmitted to another computing device via a network such as the Internet and installed in the other computing device, thereby being used in the other computing device.
Although operations are shown in a specific order in the drawings, it should not be understood that the operations must be performed in that specific or sequential order, or that all of the illustrated operations must be performed, to achieve desirable results. In certain situations, multitasking and parallel processing may be advantageous. Likewise, the separation of various configurations in the above-described embodiments should not be understood as necessarily required, and it should be understood that the described program components and systems may generally be integrated together into a single software product or packaged into multiple software products.
In concluding the detailed description, those skilled in the art will appreciate that many variations and modifications may be made to the example embodiments without substantially departing from the principles of the disclosure. Therefore, the disclosed example embodiments of the disclosure are used in a generic and descriptive sense only and not for purposes of limitation.
Claims
1. A method for processing a graph performed by at least one computing device, the method comprising:
- calculating spectral information associated with a target graph; and
- generating an embedding representation of the target graph based on information on nodes constituting the target graph and the spectral information.
2. The method of claim 1, wherein the calculating of the spectral information comprises:
- deriving an ego-graph for at least some of the nodes from the target graph; and
- calculating spectral information of the ego-graph.
3. The method of claim 2, wherein the spectral information of the ego-graph comprises at least one of an eigenvalue and an eigenvector of the ego-graph.
4. The method of claim 2, wherein the spectral information of the ego-graph comprises angle information of the ego-graph.
5. The method of claim 1, wherein the calculating of the spectral information comprises:
- deriving an ego-graph for at least some of the nodes from the target graph;
- deriving a sub-graph from which an ego node is removed from the ego-graph; and
- calculating spectral information of the derived sub-graph.
6. The method of claim 1, wherein the generating of the embedding representation of the target graph comprises:
- performing repetitive node label updating according to Weisfeiler-Lehman (WL) algorithm based on the information on the nodes and the spectral information; and
- generating the embedding representation based on a final label of the nodes obtained through the node label updating.
7. The method of claim 6, wherein the target graph comprises a first graph and a second graph, and
- the method further comprises:
- determining whether the first graph and the second graph are isomorphic graphs by comparing an embedding representation of the first graph and an embedding representation of the second graph.
8. The method of claim 1, wherein the generating of the embedding representation of the target graph comprises:
- aggregating the information of the nodes and the spectral information; and
- generating the embedding representation by inputting the aggregated information into a graph neural network (GNN).
9. The method of claim 8, wherein the spectral information is information in a form of a multi-set comprising a plurality of spectral elements, and
- the aggregating of the information of the nodes and the spectral information comprises:
- transforming the spectral information through a neural network module that transforms data in a form of multi-set into data in a form of vector or matrix; and
- aggregating the transformed spectral information and the information on the nodes.
10. The method of claim 8, wherein the aggregating of the information of the nodes and the spectral information comprises:
- changing a value of any one of the information of the nodes and the spectral information by reflecting a specific value to any one of the information of the nodes and the spectral information; and
- aggregating any changed information and other information.
11. The method of claim 10, wherein the specific value is an irrational number.
12. The method of claim 10, wherein the specific value is a value based on a learnable parameter, and
- the method further comprises:
- predicting a label for a predetermined task based on the generated embedding representation; and
- updating the value of the learnable parameter based on a difference between the predicted label and a correct label.
13. The method of claim 10, wherein the reflecting of the specific value is performed based on a multiplication operation, and
- the aggregating of any changed information and other information is performed based on an addition operation.
14. The method of claim 8, wherein the aggregating is performed based on a concatenation operation.
15. The method of claim 8, wherein the GNN is a neural network configured to generate an embedding representation of a graph by aggregating information of neighboring nodes.
16. The method of claim 1, wherein the generating of the embedding representation of the target graph comprises:
- generating an intermediate embedding representation of the target graph by inputting the information of the nodes into a graph neural network (GNN); and
- generating the embedding representation by aggregating the intermediate embedding representation and the spectral information.
17. A system for processing a graph, the system comprising:
- one or more processors; and
- a memory configured to store one or more instructions,
- wherein the one or more processors, by executing the stored one or more instructions, perform:
- calculating spectral information associated with a target graph; and
- generating an embedding representation of the target graph based on information on nodes constituting the target graph and the spectral information.
18. A non-transitory computer readable recording medium storing a computer program executable by at least one processor to perform:
- calculating spectral information associated with a target graph; and
- generating an embedding representation of the target graph based on information on nodes constituting the target graph and the spectral information.
Type: Application
Filed: Nov 13, 2023
Publication Date: May 16, 2024
Applicant: Samsung SDS Co., Ltd. (Seoul)
Inventors: Jae Sun SHIN (Seoul), Min Young Lee (Seoul)
Application Number: 18/507,766