Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Points to analysis overlapping access #278

Merged
merged 30 commits into from
Dec 6, 2024

Conversation

sadrabt
Copy link
Contributor

@sadrabt sadrabt commented Nov 28, 2024

Overlapping accesses functionality for DSA.
Addresses overlapping accesses to Stack and Global Memory Locations through Addresses, Input/Output Parameters and Indirect Accesses

@l-kent
Copy link
Contributor

l-kent commented Nov 28, 2024

This seems to have fixed the issues (I'm surprised it was that simple) I was having yesterday but I'm going to test it with some more complex cases to make sure there aren't still any underlying problems.

@sadrabt
Copy link
Contributor Author

sadrabt commented Nov 28, 2024

This seems to have fixed the issues (I'm surprised it was that simple) I was having yesterday but I'm going to test it with some more complex cases to make sure there aren't still any underlying problems.

Did your issue's program include function pointers?
The subsequential change is that instead of initialising nodes for function memory locations with a cell of size 0 at offset 0, now DSA initialises a node with a cell of the size of the memory location at offset 0. When the function locations are loaded, they would be merged into the differences between offsets of their pointers in the SSA variable node. For example, in jumptable/clang since function cell sizes were 0 the two function memory cells wouldn't be merged unifying their nodes once the. Now that their sizes are 20 they would be merged at offsets 0 and 8 unifying their nodes.
The other global/stack/heap have their cell sizes initialised to 0 since DSA learns their sizes through accesses but the function locations are never read/written to.

@l-kent
Copy link
Contributor

l-kent commented Nov 28, 2024

Yes, it seemed to be that incorrect sizes were causing some weird problems.

@l-kent
Copy link
Contributor

l-kent commented Nov 29, 2024

I've added a test which demonstrates the remaining issue - if two pointers are loaded together into a single 128-bit register, then it is not correct to merge their pointees as if they are contiguous - the cells should always be collapsed.

@l-kent
Copy link
Contributor

l-kent commented Dec 2, 2024

You've misunderstood the problem, the "overlapping access" test was fine before.

@l-kent
Copy link
Contributor

l-kent commented Dec 2, 2024

Although I guess it doesn't make a difference right now since it would never be possible for the cells to point to different things

@sadrabt
Copy link
Contributor Author

sadrabt commented Dec 2, 2024

I agree it doesn't make a lot of difference with the current approach, it's simpler and definitely sound to grow the size of the pointer cell to equal the size of the access, it will then merge with it's adjacent cells that are also getting accessed which then in turn also unifies their pointees.

Keeping the multiple pointer cells getting dereferenced as separate is trickier. Currently the pointer cells are kept separate and their pointees are merged and also collapsed.

I'm fine with either specially if we want to be able to get more precision out of overlapping accesses, we will need to consider the pointer cells separate, so it's good to keep them that way and pick up on any issues if they still exist

@l-kent
Copy link
Contributor

l-kent commented Dec 2, 2024

For now it's we just get clearer results if we do merge & collapse the cells like that I think, but it's good that you've figured out how to make the more precise version work because we'll likely want to build on that in the future. Sorry for the confusion.

@l-kent
Copy link
Contributor

l-kent commented Dec 4, 2024

I tried to merge this in but I can't merge this in without breaking things so I will leave it for another time.

@sadrabt
Copy link
Contributor Author

sadrabt commented Dec 4, 2024

will have a look after the workshop

# Conflicts:
#	src/main/scala/analysis/data_structure_analysis/Graph.scala
#	src/main/scala/analysis/data_structure_analysis/LocalPhase.scala
#	src/main/scala/analysis/data_structure_analysis/SymbolicAddressAnalysis.scala
#	src/main/scala/analysis/data_structure_analysis/Utility.scala
@sadrabt
Copy link
Contributor Author

sadrabt commented Dec 4, 2024

I'm not sure about why some examples (secret_write and basic_function_call_caller for gtirb and basic_lock_read as well as the other two for bap ) fail verification now

Comment on lines 313 to 348
val index: Expr = unwrapPaddingAndSlicing(ind)
assert(size % 8 == 0)
val byteSize = size / 8
val global = isGlobal(index, n, byteSize)
val stack = isStack(index, n)
val indexCell = graph.adjust(graph.accessIndexToSlice(store))
assert(!indexCell.node.get.flags.merged) // check index cell is a placeholder
val addressPointee: Cell = if (global.isDefined) {
graph.mergeCells(graph.find(global.get), indexCell)
graph.adjust(graph.find(global.get).getPointee)
} else if (stack.isDefined) {
graph.mergeCells(graph.find(stack.get), indexCell)
graph.adjust(graph.find(stack.get).getPointee)
} else {
index match {
case BinaryExpr(op, arg1: Variable, arg2) if op.equals(BVADD) =>
evaluateExpression(arg2, constProp(n)) match {
case Some(v) =>
// assert(varToSym(n).contains(arg1))
val offset = v.value
visitPointerArithmeticOperation(n, indexCell, arg1, byteSize, false, offset)
visitPointerArithmeticOperation(n, Node(Some(graph)).cells(0), arg1, byteSize, true, offset)
case None =>
// assert(varToSym(n).contains(arg1))
// collapse the results
// visitPointerArithmeticOperation(n, DSN(Some(graph)).cells(0), arg1, byteSize, true, 0, true)
unsupportedPointerArithmeticOperation(n, index, Node(Some(graph)).cells(0))
}
case arg: Variable =>
// assert(varToSym(n).contains(arg))
visitPointerArithmeticOperation(n, indexCell, arg, byteSize, false)
visitPointerArithmeticOperation(n, Node(Some(graph)).cells(0), arg, byteSize, true)
case _ =>
???
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests fail because you didn't keep the changed structure here which was important for correctly identifying the access indices, but when I've tried to keep this changed structure, then the DataStructureTests start failing and I'm not sure why that is.

graph.mergeCells(graph.adjust(slice), c)
}

if global.isDefined then
graph.mergeCells(graph.find(global.get), indexCell)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this not doing the same exact merges with indexCell that the other structure is doing? I think the only thing that is changes is that first we merge all the possible valueCells with each other and then merging them with the addressPointee instead of getting the address pointee and then merging it with value cells

Copy link
Contributor

@l-kent l-kent Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really remember why I figured out that change in order was necessary but it made a clear difference.

https://github.com/UQ-PAC/BASIL/tree/points_to_analysis_overlapping_access_merge_test

This branch has been merged to maintain that structure and with that all the memory region tests pass but it makes some of the tests in DataStructureAnalysisTest fail - can you have a look at that?

Copy link
Contributor Author

@sadrabt sadrabt Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please take a look at this branch. I have reordered the merging operations to match the old way and now all the DSAMemoryREgionSystemTestsBAP pass but some of the gtirb ones throw an error where accessesEndian is empty

val body: BExpr = accessesEndian.tail.foldLeft(accessesEndian.head) { (concat: BExpr, next: MapAccess) =>

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah thanks, great work. I should be able to fix that, I ran into a similar bug before.

Copy link
Contributor Author

@sadrabt sadrabt Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are memory sections getting created with size 0 because the largestaccessed size of them are 0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay I think the problem was that if a literal is assigned that DSA knows is not an address to a global, it ignored it but now it grows the memory region it was assigned to that size

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should grow the cell but it doesn't, I think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know why it worked for the adt version without that tho

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should grow the cell but it doesn't, I think?

Yeah this def was the case

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems to have fixed it, great! I don't know why there was a difference either, I'll give it a closer look next week to see if there's anything obvious there.

@l-kent l-kent merged commit 8c49e69 into main Dec 6, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants