Pseudo-random generator prevents lineapy from capturing all relevant code #885

Open
VolodymyrOrlov opened this issue Jun 6, 2023 · 2 comments
Labels
bug Something isn't working

@VolodymyrOrlov

Python version: 3.8
LineaPy version: 0.2.3

Your code:
What code did you try to run with lineapy?

# Cell 1
import lineapy
import random

my_init_value = 3
my_var = None
if random.random() <= 0.5:
    my_var = my_init_value
else:
    my_var = 1
print(my_var)
# Cell 2
lineapy.save(my_var, "my_var")
# Cell 3
print(lineapy.get("my_var").get_code())

Issue:
What went wrong when trying to run this code?
The last cell prints code that has been captured by lineapy:

import random

if random.random() <= 0.5:
    my_var = my_init_value
else:
    my_var = 1

This code is not self-sufficient: my_init_value is used but never defined. What happens if my_init_value is a complex function or an important hyperparameter that has a great influence on the model's results? I would expect the captured code to include its definition:

import random

my_init_value = 3
my_var = None
if random.random() <= 0.5:
    my_var = my_init_value
else:
    my_var = 1
VolodymyrOrlov added the bug label Jun 6, 2023
@dorx
Contributor

dorx commented Jul 30, 2023

Thanks for filing the bug, @VolodymyrOrlov! Our support for control flow is experimental. We will look into this issue.

@aayan636
Contributor

Hi @VolodymyrOrlov, the issue you are facing is related to LineaPy's support for control flow structures, which, as @dorx mentioned, is experimental at this stage.
A bit of background: LineaPy relies on dynamic analysis of your program to figure out the dependencies between different lines of code. That is, LineaPy executes your program and analyzes the interactions between objects to create the Linea Graph, which is then processed to generate the cleaned-up version of the original code. In your example, the condition of the if statement is random.random() <= 0.5, which can be either true or false depending on the value random.random() returns at runtime. When the condition evaluates to false, the else branch is taken and my_var's final value does not depend on my_init_value, so no dependency on the my_init_value = 3 line is recorded in that run. For programs with control flow statements, LineaPy's slicing is an overapproximation: it includes every line of the if/else block, which is why the whole block appears in the captured code even though only one branch ran.

In your example, if you run the code again until the true branch is taken, you will see that the my_init_value = 3 line is included.
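
As a rough illustration of why the recorded dependencies differ from run to run, here is a minimal sketch of value-level dependency tracking. It is not LineaPy's implementation: the Traced class and the run_snippet function are hypothetical names invented for this example, and the coin argument stands in for random.random() so that each branch can be forced deterministically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Traced:
    """A value plus the source lines it was observed to depend on at runtime."""
    value: object
    sources: frozenset = frozenset()


def run_snippet(coin: float) -> Traced:
    """Run the snippet from the issue with a fixed value in place of random.random()."""
    # The assignment is tagged with its own source line.
    my_init_value = Traced(3, frozenset({"my_init_value = 3"}))
    if coin <= 0.5:
        # True branch: my_var carries the dependency on my_init_value.
        my_var = my_init_value
    else:
        # False branch: my_var is a fresh literal with no recorded sources.
        my_var = Traced(1)
    return my_var


if __name__ == "__main__":
    for coin in (0.9, 0.1):
        result = run_snippet(coin)
        print(coin, result.value, sorted(result.sources))
```

With coin = 0.9 the else branch runs and my_var carries no recorded sources. With coin = 0.1 the true branch runs and the my_init_value = 3 line shows up as a dependency. A tracker that only sees one execution has no evidence of the dependency from the branch that did not run.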
