Allow Categorical to have different bounds than 1, ncategories? #449

jw3126 · 2016-01-21T18:10:44Z

How about adding an additional field to the Categorical type, with default value 1? What I have in mind is a field Categorical.min such that

minimum(c) = c.min
maximum(c) = c.min + ncategories(c) - 1
samples are drawn from [minimum(c):maximum(c)] and not [1:ncategories(c)]
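
To make the proposal concrete, here is a minimal sketch in plain Julia. It does not touch Distributions.jl: `ShiftedCategorical`, `pmf`, and `draw` are hypothetical names invented for illustration, and the probability vector is simply assumed to sum to 1.

```julia
# Hypothetical sketch, not the Distributions.jl type: a categorical
# distribution whose support starts at `min` instead of 1.
struct ShiftedCategorical
    p::Vector{Float64}   # probability vector, assumed to sum to 1
    min::Int             # smallest value in the support
end

# One-argument constructor: keeps the usual support 1:ncategories.
ShiftedCategorical(p::Vector{Float64}) = ShiftedCategorical(p, 1)

ncategories(c::ShiftedCategorical) = length(c.p)
Base.minimum(c::ShiftedCategorical) = c.min
Base.maximum(c::ShiftedCategorical) = c.min + ncategories(c) - 1

# Probability mass at x; zero outside the support.
function pmf(c::ShiftedCategorical, x::Integer)
    return minimum(c) <= x <= maximum(c) ? c.p[x - c.min + 1] : 0.0
end

# Draw one sample by walking the cumulative probabilities.
function draw(c::ShiftedCategorical)
    u = rand()
    acc = 0.0
    for (i, q) in enumerate(c.p)
        acc += q
        u <= acc && return c.min + i - 1
    end
    return maximum(c)  # fallback if rounding leaves sum(p) slightly below 1
end

c = ShiftedCategorical([0.2, 0.3, 0.5], -1)  # support is -1:1
```

With this, `minimum(c)` is -1, `maximum(c)` is 1, and `draw(c)` only ever returns values in that range.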

There would probably be several benefits; let me just describe the one that motivates me.
What I want to do is create new distributions out of old ones. For example:

The distribution of the minimum or maximum of n draws from the same or several distributions
The distribution of the sum/difference of draws from two or more distributions
etc.

In some cases (e.g. adding two binomials with the same p) one can do so analytically, but more often than not I end up with a distribution which has no better description than a value range and a probability vector.
But this distribution cannot be a Categorical, because it can assume 0 or even negative values!
I feel it would be awkward to introduce a new type for this kind of thing and would love to use Categorical instead.
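
As an illustration of how such a "value range plus probability vector" arises, the sketch below convolves two independent integer-valued distributions. The function name `sum_distribution` and the (smallest value, probability vector) representation are assumptions made up for this example, not anything in Distributions.jl.

```julia
# Sum of two independent integer-valued distributions, each given as
# (smallest value, probability vector). Returns the same representation.
function sum_distribution(min1::Int, p1::Vector{Float64},
                          min2::Int, p2::Vector{Float64})
    out = zeros(length(p1) + length(p2) - 1)
    # Discrete convolution: mass at offset (i - 1) + (j - 1) gets p1[i] * p2[j].
    for i in eachindex(p1), j in eachindex(p2)
        out[i + j - 1] += p1[i] * p2[j]
    end
    return min1 + min2, out
end

# Difference of two fair coins on {0, 1}: negate the second one to {-1, 0}.
lo, pv = sum_distribution(0, [0.5, 0.5], -1, [0.5, 0.5])
# The support is lo:(lo + length(pv) - 1), which here includes a negative value.
```

The result has smallest value -1, so it cannot be stored in a Categorical whose support is fixed to start at 1.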

johnmyleswhite · 2016-01-21T18:11:37Z

This is pretty much exactly the use case for location-scale families, which people are already working on.

jw3126 · 2016-01-22T18:34:54Z

Ah thanks, I see. For what I want to do, I would prefer a discrete version of UnivariateLocationScaleFamily. E.g. if the math preserves discreteness (sum, max, ...), the code should too. Also, iterating constructions is much cleaner if the type does not forget discreteness along the way.
Would it be reasonable to also have a discrete UnivariateLocationScaleFamily or does such a thing even already exist somewhere?

johnmyleswhite · 2016-01-22T18:35:44Z

Yes, it's totally reasonable. I think one of the essential things we have to get right is making location and scale families respect discreteness when appropriate.

jw3126 · 2016-01-23T09:17:46Z

So we should actually have two types

DiscreteUnivariateLocationScaleFamily{T <: UnivariateDistribution} <: DiscreteUnivariateDistribution
ContinuousUnivariateLocationScaleFamily{T <: UnivariateDistribution} <: ContinuousUnivariateDistribution

(Maybe the first type is parametrized by discrete T only.) And respect discreteness when it can be proven by type reasoning? E.g.

Binomial + 1 discrete (because discrete + int is always discrete)
Binomial + 1.0 continuous (because discrete + float might be continuous)
Normal * 0 continuous (because continuous * int might be continuous)
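
The three rules above could be expressed through dispatch roughly as follows. This is only a sketch: `Support`, `Discrete`, `Continuous`, and `result_support` are hypothetical stand-ins defined here, not the actual Distributions.jl types or functions.

```julia
# Hypothetical stand-ins for the support tags; not the Distributions.jl types.
abstract type Support end
struct Discrete <: Support end
struct Continuous <: Support end

# Support of `d + c` or `d * c`, decided from the argument types alone.
result_support(::Discrete, ::Integer) = Discrete()    # discrete with an int stays discrete
result_support(::Discrete, ::Real) = Continuous()     # any other real: discreteness is not provable from types
result_support(::Continuous, ::Real) = Continuous()   # stays continuous, even for Normal * 0

binomial_plus_one   = result_support(Discrete(), 1)
binomial_plus_float = result_support(Discrete(), 1.0)
normal_times_zero   = result_support(Continuous(), 0)
```

The `Integer` method is more specific than the `Real` one, so Julia picks it for `Binomial + 1` and falls back to the `Real` method for `Binomial + 1.0`.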

adityam · 2016-04-27T02:47:48Z

Another option is to pass two parameters to Categorical: probabilities and values (and have a constructor that takes only one parameter, probabilities, and initializes values to [1:length(probabilities)]). For example:

d = Categorical([0.2, 0.3, 0.5], [-5,0,5])

will generate -5 with probability 0.2, 0 with probability 0.3, and 5 with probability 0.5.
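
A rough sketch of what that two-parameter form might look like, in plain Julia. `ValuedCategorical` and `draw_value` are hypothetical names for illustration only; this is not how Categorical is defined in Distributions.jl, and the probabilities are assumed to sum to 1.

```julia
# Hypothetical sketch: a categorical distribution over arbitrary values.
struct ValuedCategorical{T}
    p::Vector{Float64}   # probabilities, assumed to sum to 1
    values::Vector{T}    # values[i] is drawn with probability p[i]
end

# One-argument constructor: values default to 1:length(p).
ValuedCategorical(p::Vector{Float64}) = ValuedCategorical(p, collect(1:length(p)))

# Draw one value by walking the cumulative probabilities.
function draw_value(d::ValuedCategorical)
    u = rand()
    acc = 0.0
    for (q, v) in zip(d.p, d.values)
        acc += q
        u <= acc && return v
    end
    return last(d.values)  # fallback if rounding leaves sum(p) slightly below 1
end

d = ValuedCategorical([0.2, 0.3, 0.5], [-5, 0, 5])
x = draw_value(d)  # one of -5, 0, 5
```

Because the value type is a parameter, the same sketch also accepts non-numeric values, although moments such as the mean would then be undefined.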

cstjean · 2016-11-01T03:40:00Z

d = Categorical([0.2, 0.3, 0.5], [-5,0,5])

+1; I've got an application that would work great with Categorical([0.2, 0.3, 0.5], ["apple", "orange", "kiwi"]). Would such a PR be considered? Obviously, the mean, variance, etc. are not computable on such non-numerical distributions. This would solve #147 if I understand correctly.

andreasnoack · 2017-07-12T13:33:38Z

Related to #634

matbesancon · 2019-05-23T07:20:28Z

I would consider this closed with #634 then

matbesancon closed this as completed May 23, 2019
