Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add bidirectional and multi-source BFS algorithms #97

Open
szarnyasg opened this issue Jun 22, 2020 · 2 comments
Open

Add bidirectional and multi-source BFS algorithms #97

szarnyasg opened this issue Jun 22, 2020 · 2 comments
Labels
enhancement New feature or request

Comments

@szarnyasg
Copy link
Contributor

I'd like to add bidirectional and MSBFS algorithms. Some design considerations for these:

  • For MSBFS, we can either use boolean matrices or bitwise operations. The latter as described in a VLDB paper, which states that compared to a textbook BFS (T-BFS) and a direction-optimizing BFS (DO-BFS), it provides an ~80x speedup

    As an example, T-BFS and DO-BFS take 135 minutes and 22 minutes, respectively, to process the LDBC graph with 1 million vertices, while MS-BFS takes only 1.75 minutes. MS-BFS shows excellent scalability for all presented graphs

  • For both bidirectional BSF and MSBFS, it makes sense to experiment with push/pull.

  • Note however that for MSBFS, the experiments in the VLDB paper put the performance gains of push/pull to max. 30%:

    Our experiments show that with both this hybrid approach and ANP, MS-BFS can significantly reduce the number of random accesses to visit and visitNext, improving the performance by up to 30%

  • Typically, the MSBFS algorithm needs to perform some operation when it finds a node. To incorporate this in our algorithm, it probably makes sense to add a function pointer argument to its method so that users can pass the desired operations.

@szarnyasg szarnyasg self-assigned this Jun 22, 2020
@szarnyasg szarnyasg added the enhancement New feature or request label Jun 22, 2020
@DrTimothyAldenDavis
Copy link
Member

Using bitwise operators will probably be the fastest method. From the paper, it looks like there is a useful bitwise binary operator that could be added as a built-in: bitdiff(x,y) which computes x & ~y. For boolean, this operator already exists with the name GrB_GT_BOOL, computing x > y, which is the same as (x && !y). It also looks like this method could use a popcount unary function as well.

Regarding a function pointer for doing some work when a node is found: Is this a function pointer you would add to LAGraph, so that after each level of the BFS, LAGraph could call the function? Or would this be implemented as a unary operator passed to GrB_apply? (or a binary operator and a scalar)? Or perhaps a callback to a higher-level function, where LAGraph would call a user function once, with a list (say in a GrB_Vector, or boolean/bitwise/etc) of nodes seen at a given level, and then the user function would do whatever it wants?

@szarnyasg
Copy link
Contributor Author

Thanks, I wasn't aware of GrB_GT_BOOL. For the boolean case, it's probably best to use a complemented mask by computing Next<!Seen> = Frontier ANY.PAIR A. For the bitwise case, we currently use apply with BNOT (suggested by @marci543):

LAGr_apply(Next, Next, GrB_BAND_UINT64, GrB_BNOT_UINT64, Seen, NULL)

Regarding a function pointer for doing some work when a node is found: Is this a function pointer you would add to LAGraph, so that after each level of the BFS, LAGraph could call the function? [...] Or perhaps a callback to a higher-level function, where LAGraph would call a user function once, with a list (say in a GrB_Vector, or boolean/bitwise/etc) of nodes seen at a given level, and then the user function would do whatever it wants?

In our current problem, where we implement exact closeness centrality value, whenever a row is reached by any of the BFS traversals (i.e. sum popcount(A[i, :]) > 0 is satisfied), its s(i) value should be increased by the level * sum popcount(A[i, :]).
To achieve maximum efficiency in this case, it would be best to call a user function once, pass it the current state of the matrix and vector s, and let it compute s(i) += level * sum popcount(A[i, :]) on vector s.

@szarnyasg szarnyasg removed their assignment Mar 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants