Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added interop_iolist_fold #687

Merged
merged 1 commit into from
Aug 8, 2023
Merged

Conversation

fadushin
Copy link
Collaborator

This PR adds a function that supports traversal of values (integers and binaries) in an IO list.

These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).

SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later

@fadushin fadushin force-pushed the interop-iolist-additions branch 2 times, most recently from 083676e to 94ed2f2 Compare July 18, 2023 02:00
@fadushin fadushin requested a review from pguyot July 19, 2023 12:17
src/libAtomVM/interop.c Outdated Show resolved Hide resolved
src/libAtomVM/interop.c Show resolved Hide resolved
src/libAtomVM/interop.h Show resolved Hide resolved
Copy link
Collaborator

@bettio bettio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to play a little bit with inline and some other attributes to make sure performance are not reduced by function calls.

@pguyot
Copy link
Collaborator

pguyot commented Jul 19, 2023

I would like to play a little bit with inline and some other attributes to make sure performance are not reduced by function calls.

#define would work as well

@bettio
Copy link
Collaborator

bettio commented Jul 19, 2023

I did a first investigation:

static inline InteropFunctionResult size_fold_fun(term t, void *accum)

and

inline InteropFunctionResult interop_iolist_fold(term t, interop_iolist_fold_fun fold_fun, void *accum)

on x86_64 allow gcc to perform a fair level of optimization, on xtensa doesn't look yet nearly the same, anyway I would love to do some additional investigation, but definitely using inline would be a good first step.

Copy link
Collaborator

@bettio bettio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would slightly defer merging this PR after some investigation, minor changes and tricks to allow compiler optimizing it back to something comparable to previous version.
Right now without some changes it would have a significant penalty due to function calls / branching.

@fadushin
Copy link
Collaborator Author

Can you quantify the difference in performance with a test? And for the majority of cases, is it even measurable? Or are these a priori analyses based on generated code? It might be hard to make assessments about performance without actual measurements, e.g., due to processor pipelining and other tricks.

@fadushin
Copy link
Collaborator Author

fadushin commented Jul 20, 2023

The problem with in-lining is code bloat. We are already pushing the boundaries on some platforms, especially smaller microcontrollers.

src/libAtomVM/interop.c Outdated Show resolved Hide resolved
src/libAtomVM/interop.c Outdated Show resolved Hide resolved
@bettio
Copy link
Collaborator

bettio commented Jul 20, 2023

The problem with in-lining is code bloat. We are already pushing the boundaries on some platforms, especially smaller microcontrollers.

If we do inline it doesn't get worse than now since those functions are already duplicated, however by using inline I think the compiler is allowed to keep them not inlined when using -Os.

@pguyot
Copy link
Collaborator

pguyot commented Jul 21, 2023

For what is worth, emscripten builds require -O3 -sINLINING_LIMIT to run in the browser.

@fadushin
Copy link
Collaborator Author

fadushin commented Jul 21, 2023

If we do inline it doesn't get worse than now since those functions are already duplicated, however by using inline I think the compiler is allowed to keep them not inlined when using -Os.

True, but bear in mind that in the crypto branch, on ESP32 (due to the mbedtls API) we need to call this function for each crypto algorithm (for hashing, but maybe later for encryption as well). There are currently 6 hashing algorithms supported in that branch (md5, sha1, sha224, sha256, sha384, and sha512)

src/libAtomVM/interop.h Outdated Show resolved Hide resolved
@bettio
Copy link
Collaborator

bettio commented Jul 27, 2023

I used this version for testing purposes.

2-inlines-fold.txt

@fadushin fadushin force-pushed the interop-iolist-additions branch 2 times, most recently from 40b4dd8 to 6df6561 Compare July 28, 2023 12:53
@bettio bettio added this to the v0.6 milestone Jul 28, 2023
@fadushin fadushin requested a review from bettio July 31, 2023 14:51
@bettio bettio merged commit 6dd47da into atomvm:master Aug 8, 2023
80 checks passed
@fadushin fadushin deleted the interop-iolist-additions branch October 30, 2023 17:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants