Un-match with the examples result #58

Open
zhengpeirong opened this issue Jun 18, 2024 · 0 comments

Dear author,

I am using a Raspberry Pi 4B, and here are my results:

  1. sgemm.py
pi@node01:~/py-videocore6 $ PYTHONPATH=sandbox/ python3 examples/sgemm.py
==== sgemm example (1024x1024 times 1024x1024) ====
numpy: 0.08994 sec, 23.91 Gflop/s
QPU:   0.5661 sec, 3.799 Gflop/s
Minimum absolute error: 0.0
Maximum absolute error: 0.0003814697265625
Minimum relative error: 0.0
Maximum relative error: 0.13134673237800598
  2. pctr_gpu_clock.py
pi@node01:~/py-videocore6 $ sudo PYTHONPATH=sandbox/ python3 examples/pctr_gpu_clock.py
==== QPU clock measurement with performance counters ====
500.08264399999996 MHz
  3. memset.py
pi@node01:~/py-videocore6 $ PYTHONPATH=sandbox/ python3 examples/memset.py
==== memset example (64.0 MiB) ====
Preparing for buffers...
Traceback (most recent call last):
  File "/home/pi/py-videocore6/examples/memset.py", line 148, in <module>
    main()
  File "/home/pi/py-videocore6/examples/memset.py", line 143, in main
    memset(fill=0x5a5a5a5a, length=16 * 1024 * 1024)
  File "/home/pi/py-videocore6/examples/memset.py", line 119, in memset
    X.fill(~fill)
OverflowError: Python integer -1515870811 out of bounds for uint32
  4. scopy.py
pi@node01:~/py-videocore6 $ PYTHONPATH=sandbox/ python3 examples/scopy.py
==== scopy example (16.0 Mi elements) ====
Preparing for buffers...
Traceback (most recent call last):
  File "/home/pi/py-videocore6/examples/scopy.py", line 201, in <module>
    main()
  File "/home/pi/py-videocore6/examples/scopy.py", line 196, in main
    scopy(length=16 * 1024 * 1024)
  File "/home/pi/py-videocore6/examples/scopy.py", line 181, in scopy
    unif[-1] = 4 * (-len(unif) + 3)
    ~~~~^^^^
OverflowError: Python integer -8 out of bounds for uint32
  5. summation.py
pi@node01:~/py-videocore6 $ PYTHONPATH=sandbox/ python3 examples/summation.py 
==== summaton example (32.0 Mi elements) ====
Preparing for buffers...
Traceback (most recent call last):
  File "/home/pi/py-videocore6/examples/summation.py", line 189, in <module>
    main()
  File "/home/pi/py-videocore6/examples/summation.py", line 184, in main
    summation(length=32 * 1024 * 1024)
  File "/home/pi/py-videocore6/examples/summation.py", line 169, in summation
    unif[-1] = 4 * (-len(unif) + 3)
    ~~~~^^^^
OverflowError: Python integer -12 out of bounds for uint32

In summary, only the QPU clock measurement (about 500 MHz) matches the tutorial. The sgemm example is much slower on the QPU (0.5661 sec) than with numpy (0.08994 sec), and the memset, scopy, and summation examples all fail with an OverflowError before they run.
Could you please share your thoughts on a possible solution?
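
A possible cause of the three OverflowError tracebacks, offered as a guess rather than a confirmed diagnosis: all three lines store a negative Python integer into a uint32 array, and recent NumPy releases (2.x, as far as I know) raise on that instead of silently wrapping. I have not checked which NumPy version is installed here, so that part is an assumption. The sketch below reproduces the two failing expressions with the values from the tracebacks and masks them to 32 bits first. The helper `as_uint32` is hypothetical and not part of py-videocore6.

```python
import numpy as np


def as_uint32(value: int) -> int:
    """Wrap a Python int into the unsigned 32-bit range (two's complement).

    Hypothetical helper for illustration; not part of py-videocore6.
    """
    return value & 0xFFFFFFFF


# memset.py: X.fill(~fill) with fill=0x5a5a5a5a.
# ~fill is the negative Python int -1515870811, as in the traceback.
fill = 0x5A5A5A5A
X = np.zeros(4, dtype=np.uint32)
X.fill(as_uint32(~fill))

# scopy.py: unif[-1] = 4 * (-len(unif) + 3).
# The traceback reports -8, which implies len(unif) == 5 (assumed here).
unif = np.zeros(5, dtype=np.uint32)
unif[-1] = as_uint32(4 * (-len(unif) + 3))

print(hex(int(X[0])), hex(int(unif[-1])))
```

If this is indeed the cause, the same masking would apply to the identical line in summation.py. Pinning an older NumPy might also avoid the error, but I have not tried either change against the actual examples.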
