Please provide community similarity algorithms #244

johnlinp · 2023-01-27T21:29:48Z

Is your feature request related to a problem? Please describe.
I have a graph of social media data. I used community detection algorithms (e.g. Louvain) to detect different sets of communities based on different properties, such as location and timestamp. As a result, I have one set of communities detected from the location of the data and another set detected from the timestamp of the data.

My next step is to compare the similarity between these sets of communities. I have seen that measures such as the Rand Index can do this. Can GDS provide such algorithms? Thank you.

Describe the solution you would like
I would like GDS to provide community similarity algorithms, e.g. Rand Index.

Describe alternatives you have considered
If GDS doesn't provide it, I'll have to implement it on my own.

johnlinp · 2023-02-10T00:33:47Z

If anyone needs a simple Rand Index implementation, here it is.

Assume that we are analyzing a set of social media posts (:Post). We have run Louvain community detection twice, based on 2 different attributes, and stored the results as community_1_id and community_2_id on the nodes. The Rand Index between these 2 community sets can be calculated as follows:

// a: pairs placed in the same community by both detections
CALL {
  MATCH (n:Post)
  MATCH (m:Post)
  WHERE id(n) < id(m)
  AND n.community_1_id = m.community_1_id
  AND n.community_2_id = m.community_2_id
  RETURN count(*) AS a
}
// b: pairs placed in different communities by both detections
CALL {
  MATCH (n:Post)
  MATCH (m:Post)
  WHERE id(n) < id(m)
  AND n.community_1_id <> m.community_1_id
  AND n.community_2_id <> m.community_2_id
  RETURN count(*) AS b
}
// c: pairs together in detection 1 but separated in detection 2
CALL {
  MATCH (n:Post)
  MATCH (m:Post)
  WHERE id(n) < id(m)
  AND n.community_1_id = m.community_1_id
  AND n.community_2_id <> m.community_2_id
  RETURN count(*) AS c
}
// d: pairs separated in detection 1 but together in detection 2
CALL {
  MATCH (n:Post)
  MATCH (m:Post)
  WHERE id(n) < id(m)
  AND n.community_1_id <> m.community_1_id
  AND n.community_2_id = m.community_2_id
  RETURN count(*) AS d
}
// Rand Index: share of pairs on which the two detections agree
RETURN 1.0 * (a + b) / (a + b + c + d) AS rand_index;
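
To sanity-check the result outside the database, the same counts can be derived from a contingency table instead of enumerating every pair of posts. Below is a minimal Python sketch, assuming the two community ids of each post have already been exported into two equal-length lists in the same post order. The names `rand_index`, `labels_1`, and `labels_2` are hypothetical and not part of GDS.

```python
from collections import Counter
from math import comb


def rand_index(labels_1, labels_2):
    """Rand Index between two partitions given as equal-length label lists.

    labels_1[i] and labels_2[i] are assumed to be the community ids of the
    same item (e.g. community_1_id and community_2_id of one post).
    """
    if len(labels_1) != len(labels_2):
        raise ValueError("both labelings must cover the same items")
    total_pairs = comb(len(labels_1), 2)
    if total_pairs == 0:
        raise ValueError("need at least two items")

    # a: pairs that share a community in both partitions
    a = sum(comb(k, 2) for k in Counter(zip(labels_1, labels_2)).values())
    # a + c: pairs that share a community in partition 1
    same_1 = sum(comb(k, 2) for k in Counter(labels_1).values())
    # a + d: pairs that share a community in partition 2
    same_2 = sum(comb(k, 2) for k in Counter(labels_2).values())
    # b: pairs separated in both partitions (inclusion-exclusion)
    b = total_pairs - same_1 - same_2 + a

    return (a + b) / total_pairs


if __name__ == "__main__":
    # Hypothetical toy data: 4 posts, the last one moved to its own community.
    print(rand_index([0, 0, 1, 1], [0, 0, 1, 2]))
```

Here `a` and `b` have the same meaning as in the query above. The sketch only looks at each item once to build the counters, so it avoids the pairwise comparison, at the cost of exporting the labels first.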

gminneci · 2023-02-10T17:03:03Z

Hi @johnlinp! I am a product manager at Neo4j. Thank you for this feature request. We are looking at this type of feature as 'subgraph similarity', but don't have an implementation plan just yet. Great to see that you have an implementation already. How is it working for you? Are there any specific limitations in what you are trying to achieve that you'd like to mention?

johnlinp added the "feature request" label (A suggestion for a new feature) on Jan 27, 2023
