
memory management: Add virtual heap memory allocators #7276

Conversation

@dabekjakub (Member) commented Mar 14, 2023

Add virtual heap allocators that allocate on the proper heaps and map physical memory where necessary.
Add a free function that frees the virtual heap memory and unmaps physical memory when possible.

Some relevant links to understand context
https://thesofproject.github.io/latest/architectures/firmware/sof-zephyr/rtos_layer/memory_management/heap_sharing.html
https://thesofproject.github.io/latest/architectures/firmware/sof-zephyr/rtos_layer/memory_management/index.html

https://github.com/orgs/thesofproject/discussions/7514

I attached a presentation that outlines the intent of this change, created at the start of this commit's lifecycle.
virtual heaps.pdf

Summary of the potential benefits this API introduces:

  • Avoids wasting a full cache line, which currently happens with almost all single-core allocations
  • Improves cache-coherency handling by redirecting traffic from separate cores to separate heaps
  • Memory banks are powered up and down dynamically, which saves power when done efficiently
  • Allows easy mapping of user-space heap allocations
  • Allows an opportunistic "paging file" allocator to be created
  • Virtual memory aggregation across cores - can ease debugging compared to using zones and a single system heap
  • Flexibility for clients when defining zones
  • It is aligned to better serve future platform memory architectures

@marcinszkudlinski (Contributor)

Tip: add a pre-commit hook to your local git repo and run checkpatch at every commit. Sometimes it's annoying, but it helps:

      #!/bin/sh
      set -e
      exec git diff --cached | (...your patch...)/scripts/checkpatch.pl -

@dabekjakub (Member Author)

I knew there were checkpatch issues and I intended to fix them before converting this to a full PR. I made this a draft so we can discuss the architectural approach, not the cosmetics.

@marcinszkudlinski (Contributor) left a comment

More review is coming. This one is a "classic"; later I'll focus on the main code.

Anyway - please use checkpatch even in draft PRs, and please add comments.
It makes a review - any review, even of a draft - easier and faster.

@dabekjakub dabekjakub force-pushed the jdabek/memory_managment/heap_allocators branch from d3d92e8 to bcbcba0 Compare March 21, 2023 16:28
@lyakh (Collaborator) commented Mar 22, 2023

Looks like not all comments have been addressed yet.

@dabekjakub dabekjakub force-pushed the jdabek/memory_managment/heap_allocators branch 4 times, most recently from 803c646 to 67134e3 Compare March 22, 2023 19:16
@dabekjakub dabekjakub force-pushed the jdabek/memory_managment/heap_allocators branch from 17b94f3 to c26d7fe Compare July 3, 2023 10:18
@dabekjakub dabekjakub force-pushed the jdabek/memory_managment/heap_allocators branch 4 times, most recently from cd3c39b to 2d09086 Compare July 7, 2023 12:21
@dabekjakub (Member Author) commented Jul 7, 2023

Tested with asserts on. @lyakh, I want to talk with @lgirdwood offline to be sure about the fix for multi-threading, but it is good to test if you want to do so.

@dabekjakub dabekjakub force-pushed the jdabek/memory_managment/heap_allocators branch 2 times, most recently from 45074a9 to 46e7938 Compare July 25, 2023 17:05
@lyakh (Collaborator) commented Jul 26, 2023

Tested with asserts on. @lyakh, I want to talk with @lgirdwood offline to be sure about the fix for multi-threading, but it is good to test if you want to do so.

so have you cleaned up all the asserts? None are triggering any more?

@dabekjakub (Member Author)

@lyakh I ran my basic scenario: create heap / alloc / alloc / free / free / free heap. It worked with asserts on.

@lyakh (Collaborator) left a comment

Still not a complete review - just your changes since the previous version, but I haven't finished reviewing the previous version, so I'll have to do that. But please also fix the gcc compilation breakage, e.g. https://github.com/thesofproject/sof/actions/runs/5659675012/job/15333903191?pr=7276

case MEM_REG_ATTR_SHARED_HEAP:
vm_heaps[i].memory_caps = SOF_MEM_CAPS_RAM | SOF_MEM_CAPS_HP;
struct vmh_heap *new_heap =
rzalloc(SOF_MEM_ZONE_RUNTIME_SHARED, 0, SOF_MEM_CAPS_RAM, sizeof(*new_heap));
Collaborator:

This is now an "uncached allocation" - is this on purpose? All the allocations are now uncached; I thought an important aspect of this allocator was to use cached access? Or are you converting addresses somewhere? Haven't checked yet.

Member Author:

Yes, it was needed to change the addresses of the spin locks in the bitarrays to uncached, and I decided not to split the allocation into parts of the struct. The struct data we are talking about here should not generate big overhead, and the changes only apply to the information about the heap and its allocations. The buffer itself is still provided by Zephyr defines; the pointers allocated from this buffer (using this API) and the structure that stores heap info are separate.

*/
size_t bitarray_size = SOF_DIV_ROUND_UP(
SOF_DIV_ROUND_UP((uint64_t)cfg->block_bundles_table[i].number_of_blocks, 8),
sizeof(uint32_t));
Collaborator:

Can we just make it

	size_t bitarray_size = SOF_DIV_ROUND_UP(cfg->block_bundles_table[i].number_of_blocks, 8 * sizeof(uint32_t));

because if .number_of_blocks + 31 > 0xffffffff, I don't think this will work anyway.

Member Author:

I would like to stay as true as possible to the size calculation that already exists in Zephyr. It is complicated as is; I would rather not transform it further, since keeping the original form makes it easier for people to cross-reference with the sys_bitarray implementation.

return NULL;
/* Only operations on the same core are allowed */
if (heap->core_id != cpu_get_id())
return NULL;
Collaborator:

please, also check thread-locality and thread context

Member Author:

Disagree with handling anything about threads here. That belongs in the management code that handles the threads.

Collaborator:

Sorry, I don't understand what you mean - what code that handles the threads? Is this code handling multiple cores? This is a function with limitations: it should run only on a specific core and only in a specific thread context, so it should check all of those conditions.

Member Author:

As far as I understand, this should not be handling threads in any capacity; the management code does that. @mmaka1, didn't we agree on this in the syncs?

*/
block_size =
1 << heap->physical_blocks_allocators[mem_block_iterator]->blk_sz_shift;
block_count = SOF_DIV_ROUND_UP((uint64_t)alloc_size, (uint64_t)block_size);
Collaborator:

please, check the definition of SOF_DIV_ROUND_UP() - it's doing a division, the only way it can overflow (which is presumably what you're protecting against by typecasting to 64 bits), is from the addition operation. And we really shouldn't be allocating anything where alloc_size + block_size - 1 > 0xffffffff

Member Author:

So what really is your point here? Do you want to remove the protection? I know we will not be allocating sizes like that, but who knows what we will use it for later. I talked it over with @tlissows and we came to the conclusion that if it does not hurt anyone and makes this code safer, it should be there.

Collaborator:

I wouldn't use 64-bit types. Type-casts are generally a rather crude instrument and should be avoided where possible, and here I don't see a need for them, and it causes a significant overhead - a 64-bit division.

Member Author:

And I want it to be safe this way. The overhead seems negligible. And casts are tools to be used.

@lyakh (Collaborator) commented Jul 27, 2023

This version passed the previous version of my basic test in #7688 - that's already good. I then went ahead and added a call to vmh_alloc() to it, and it caused an exception. The exception happens inside vmh_alloc(). I added some debugging there (see the patch attached)
04.gitdiff.txt
and got this output:

[    0.524288] <err> vmh: vmh_alloc: error -12 allocating 26 of 3264 of size 64
[    0.524311] <err> vmh: vmh_alloc: error -12 allocating 13 of 1632 of size 128
[    0.524316] <err> vmh: vmh_alloc: error -12 allocating 7 of 816 of size 256
[    0.524323] <err> vmh: vmh_alloc: error -12 allocating 4 of 408 of size 512
[    0.524330] <err> vmh: vmh_alloc: error -12 allocating 2 of 204 of size 1024
[    0.524336] <err> vmh: vmh_alloc: NULL allocator #5
[    0.524341] <inf> boot_test: sof_boot_test: test 0xa007b9d0 returned -12

i.e. the allocator fails to allocate memory in the heap's 5 blocks - which is wrong, because there seems to be plenty of free memory in all of them (this is the first allocation attempt). Having failed on those 5 blocks, it attempts to use a 6th block, which is apparently not present, so it hits a NULL dereference.

@dabekjakub dabekjakub force-pushed the jdabek/memory_managment/heap_allocators branch from 46e7938 to 8d27ebf Compare July 27, 2023 09:31
@lyakh (Collaborator) commented Jul 27, 2023

The new version fixed the crash. Now the new version of #7688 produces these errors:

[    0.523508] <wrn> vmhtest: vmh_test_single: Ignoring failure to allocate 1616 in non-contiguous mode
[    0.524056] <err> vmhtest: vmh_test_single: Free error -14
[    0.524065] <err> vmhtest: vmh_test_single: Free heap error -90
[    0.524071] <err> vmhtest: vmh_test: Contiguous test error -14
[    0.524078] <inf> boot_test: sof_boot_test: test 0xa007bad0 returned -14

i.e. vmh_free() in line 57 of vmh.c returns an error.

*/
struct sys_bitarray *allocation_sizes_bitarray =
rzalloc(SOF_MEM_ZONE_RUNTIME_SHARED,
0, SOF_MEM_CAPS_RAM, sizeof(sys_bitarray_t));
Collaborator:

We have been discussing ways to reduce the number of heap allocations in SOF, and here is an example of a single object's creation using multiple such allocations. Maybe we could calculate the total size and allocate the whole memory in one go. Can be a follow-up optimisation.

Member Author:

Yeah, I was exploring that idea myself. Agreed, as a follow-up.

}
rfree(allocator);
}
rfree(new_heap);
Collaborator:

So for each new heap we do 5 * block-count + 1 heap allocations... That's really too many.

Member Author:

Agreed that this could be followed up later.

* First: 1111111100000000 allocate another block 1111111100000000
* Second:1110000000000000 and another two blocks 1110001000000000
* We are still able to pinpoint allocations size
*/
Collaborator:

in principle we need 3 values, i.e. 2 bits per block: free, used-single-or-last, used-linked-to-next. So in theory we'd need an array of 2-bit values, but that would be difficult to work with

Member Author:

You have two bits per block now: one bit comes from the mem_block bitarray, which records whether the block is allocated, and the second bitarray is stored in the heap struct and records whether the block is linked to the next one. Aggregating both arrays gives you all the information you need. I think I explained it clearly in the in-code comments.

Collaborator:

yes, sorry, it's me who didn't explain it clearly in my comment. I just meant that those two bitarrays actually represent an array of 2-bit fields, just in a slightly cumbersome and non-obvious way.

Member Author:

So what is the point of this comment? Do you want any changes here? I do not understand.

*/
phys_aligned_ptr = ALIGN_DOWN((uintptr_t)ptr, CONFIG_MM_DRV_PAGE_SIZE);
phys_aligned_alloc_end = ALIGN_UP((uintptr_t)ptr
+ alloc_size, CONFIG_MM_DRV_PAGE_SIZE);
Collaborator:

same line, please

Collaborator:

If I'm right about ptr about needing an array instead of it, then this is wrong too, and the following code as well. I'll stop here.

Member Author:

Putting it on the same line produces warnings in checkpatch.

Member Author:

Did I miss some comment here? I do not understand what you should be right about.

continue;
/* Try span alloc on first available mem_block for non span
* check if block size is sufficient.
*/
Collaborator:

what are use-cases when we must have contiguous physical pages? DMA? Or would DMA also use virtual addresses and thus only care about virtual contiguity?

Member Author:

This has nothing to do with contiguous physical pages.
This first checks whether we have virtual blocks of sufficient size by allocating on mem_blocks. If the allocation was successful, we allocate physical pages where needed.
And the answer is: we do not need physical contiguity, and I do not force anything like that here.

Collaborator:

Ah, OK, so we just map them into a contiguous virtual address region, but the physical pages can be discontiguous. Maybe it would be good to clarify that somewhere in a comment.

Member Author:

When this gets in, one of the next steps will be documenting it. I think adding anything about span physical mapping would only cause confusion here, since there is a clear distinction in the wording: allocation vs. mapping.

CONFIG_MM_DRV_PAGE_SIZE, SYS_MM_MEM_PERM_RW) != 0) {
ptrs_to_map[i] = (uintptr_t)NULL;
goto fail;
}
Collaborator:

Can we make a function vmh_is_page_mapped() and just call it for each physical page? Eventually, for contiguous allocations, you could check only the first and the last page, but that can be optimised later.

Member Author:

I would refactor it after Zephyr implements physical memory counting - that will simplify this implementation and make it much shorter.

@dabekjakub dabekjakub force-pushed the jdabek/memory_managment/heap_allocators branch from 8d27ebf to 5ed4d5c Compare July 28, 2023 00:38
Add virtual heap allocators that allocate on proper heaps and
map physical memory where necessary.
Add a free function that frees up the virtual heap memory and
unmaps physical memory when possible.
The virtual heap allocator allows using virtual memory as a base
for allocating and booking memory. Physical memory banks
will be mapped when needed, allowing for greater flexibility
with mapping.

Signed-off-by: Jakub Dabek <[email protected]>
@dabekjakub dabekjakub force-pushed the jdabek/memory_managment/heap_allocators branch from 5ed4d5c to 02076a4 Compare July 30, 2023 07:32
@dabekjakub (Member Author)

@lyakh I was unable to reproduce the issue you reported with my latest changes. I haven't explicitly fixed any bugs, but changes were made.

@lgirdwood (Member)

@lyakh I was unable to reproduce the issue you reported with my latest changes. I haven't explicitly fixed any bugs, but changes were made.

ok, we can extend the UTs after merge and as clients are added.

@lgirdwood (Member)

SOFCI TEST

@mwasko mwasko merged commit 45d5693 into thesofproject:main Aug 1, 2023