help with histogram split #1

Open
slowkow opened this issue Jul 26, 2020 · 5 comments

Comments

@slowkow
Contributor

slowkow commented Jul 26, 2020

Hi Jon,

Thanks for the great code and YouTube video! I decided to try it out on my own data, and I am a bit confused by the result.

I would expect that GHT() should return 2.6 or so, but I don't understand the output. Could I please ask if you might be able to help me understand how to use this function and how to interpret the output? Is there a way to call GHT() to get the threshold of 2.6?

Here is my data: n.txt

In [65]: d = np.genfromtxt('n.txt')

In [66]: d
Out[66]:
array([2.98497713, 3.53869938, 3.21138755, ..., 3.29136885, 3.02036128,
       3.03059972])

In [67]: GHT(d)
Out[67]:
(27376.0,
 array([334782.46328283, 334583.29822872, 334583.01013452, ...,
        334581.54735645, 334582.1549913 , 334785.61749101]))

In [74]: import matplotlib.pyplot as plt

In [77]: plt.hist(d, bins = 100); plt.show()

[image: histogram of d, 100 bins]

@slowkow
Contributor Author

slowkow commented Jul 26, 2020

Is this how we're supposed to run the function?

In [90]: d = np.sort(d)

In [91]: res = GHT(d)

In [96]: d[int(res[0])]
Out[96]: 3.45438746714696

Does this mean that GHT() is suggesting to place a threshold at 3.45?

@jonbarron
Owner

jonbarron commented Jul 26, 2020 via email

@jonbarron
Owner

jonbarron commented Jul 26, 2020 via email

@slowkow
Contributor Author

slowkow commented Jul 26, 2020

Thanks!

For anyone visiting this issue, here is the code that gave the threshold I expected. The d array holds the data points from n.txt.

In [17]: print(GHT(np.ones_like(d), np.sort(d), 100000, .1, 0)[0])
Out[17]: 2.62013605497376
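
The call above passes one unit count per sorted sample. Assuming, as that call suggests, that the first two positional arguments of GHT() are counts and their sorted locations, a binned histogram should be an alternative input. Below is a minimal sketch of preparing both forms with NumPy. The helper name `samples_to_histogram` and the synthetic data are illustrative stand-ins, not part of the repository or n.txt, and whether the same parameter values suit binned input has not been checked here.

```python
import numpy as np

def samples_to_histogram(samples, bins=100):
    """Bin raw samples into (counts, bin_centers), each of length `bins`."""
    counts, edges = np.histogram(samples, bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])  # midpoint of each bin, ascending
    return counts.astype(float), centers

# Synthetic bimodal data standing in for n.txt (NOT the real data).
rng = np.random.default_rng(0)
samples = np.concatenate([rng.normal(2.0, 0.2, 500),
                          rng.normal(3.2, 0.2, 1500)])

# Binned form: one count per bin, located at the bin center.
counts, centers = samples_to_histogram(samples, bins=100)

# Unbinned form, as in the call above: one unit count per sorted sample.
unit_counts, sorted_samples = np.ones_like(samples), np.sort(samples)

# Either pair would then go in as the first two arguments, e.g.
#   GHT(counts, centers, 100000, .1, 0)[0]
# (assumes the same positional order as the call above)
```

In both forms the locations are in ascending order and the counts sum to the number of samples, which is the difference from calling GHT(d) on the raw values alone.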

Could I ask if there's any intuitive way to describe the parameters nu and tau?

@jonbarron
Owner

jonbarron commented Jul 26, 2020 via email
