Manage co-occurence of events #142

Open
turpaultn opened this issue Dec 18, 2020 · 2 comments

Comments

@turpaultn

turpaultn commented Dec 18, 2020

Would it be possible to manage the co-occurrence of events?

The approach I used to generate the DESED dataset relies on the "p" parameter of np.random.choice to set the probabilities ("probas"). It is quite simple: everything depends only on the first event sampled, which determines the co_occur_params dictionary to use (each dictionary is specific to one event class):

import numpy as np

# Note: _check_random_state is assumed to come from the surrounding codebase
# (a helper that turns an int seed into a np.random.RandomState object).

def choose_cooccurence_class(co_occur_params, random_state=None):
    """ Choose another class given a dictionary of parameters (from an already specified class).
    Args:
        co_occur_params: dict, defines the co-occurrence parameters of the classes
            Example of a co_occur_params dictionary::
                {
                  "max_events": 13,
                  "classes": [
                    "Alarm_bell_ringing",
                    "Dog",
                  ],
                  "probas": [
                    70,
                    30
                  ]
                }
            classes and probas map to each other by position
        random_state: int, or RandomState object
    Returns:
        str, the class name.
    """
    # np.random.choice requires p to sum to 1, so normalize weights such as 70/30.
    probas = np.asarray(co_occur_params['probas'], dtype=float)
    probas = probas / probas.sum()
    if random_state is not None:
        random_state = _check_random_state(random_state)
        chosen_class = random_state.choice(co_occur_params['classes'], p=probas)
    else:
        chosen_class = np.random.choice(co_occur_params['classes'], p=probas)
    return chosen_class

(max_events is used to draw a random number of events for the soundscape, again depending on the class of the first event sampled. That is not very good, but it is easy to implement and at least class-dependent.)

This is very simplistic code.
But a goal could be better co-occurrence sampling (n-grams, or other ideas inspired by text generation with language models, I guess?). What do you think?
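
To make the n-gram idea a bit more concrete, here is a minimal sketch of a bigram (first-order Markov) sampler, where each event class is drawn conditioned on the previous one. The TRANSITIONS table, its numbers, and the function name sample_event_sequence are assumptions made up for illustration; they are not part of the DESED generation code.

```python
import numpy as np

# Hypothetical transition table: P(next class | previous class).
# Class names follow the example above; the probabilities are made up.
TRANSITIONS = {
    "Alarm_bell_ringing": {"Alarm_bell_ringing": 0.7, "Dog": 0.3},
    "Dog": {"Alarm_bell_ringing": 0.2, "Dog": 0.8},
}


def sample_event_sequence(first_class, n_events, transitions, seed=None):
    """Sample a list of class labels, each conditioned on the previous one.

    The returned list always starts with first_class, so it holds at least
    one label even if n_events is smaller than 1.
    """
    rng = np.random.RandomState(seed)
    sequence = [first_class]
    while len(sequence) < n_events:
        row = transitions[sequence[-1]]
        classes = list(row.keys())
        probas = np.array(list(row.values()), dtype=float)
        probas /= probas.sum()  # guard against rows that do not sum to 1
        sequence.append(str(rng.choice(classes, p=probas)))
    return sequence


if __name__ == "__main__":
    print(sample_event_sequence("Dog", 5, TRANSITIONS, seed=0))
```

A higher-order n-gram would condition on the last n-1 classes instead of only the last one, at the cost of a much larger table to estimate.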

@justinsalamon
Owner

Cheers @turpaultn !

We could definitely add support for non-uniform discrete sampling, e.g. via a new choose_weighted distribution tuple.
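
As a rough sketch of what sampling from such a tuple might look like, assuming a hypothetical layout of ('choose_weighted', items, weights): the tuple layout, the helper name sample_choose_weighted, and the validation rules below are illustrative assumptions, not an existing API.

```python
import numpy as np


def sample_choose_weighted(dist_tuple, random_state=None):
    """Draw one item from a hypothetical ('choose_weighted', items, weights) tuple."""
    name, items, weights = dist_tuple
    if name != "choose_weighted":
        raise ValueError("expected a 'choose_weighted' tuple, got %r" % (name,))
    if len(items) != len(weights):
        raise ValueError("items and weights must have the same length")

    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or weights.sum() <= 0:
        raise ValueError("weights must be non-negative and not all zero")

    # Accept an existing RandomState, an int seed, or None.
    if isinstance(random_state, np.random.RandomState):
        rng = random_state
    else:
        rng = np.random.RandomState(random_state)

    # Sample an index rather than the item itself so items keep their type.
    index = rng.choice(len(items), p=weights / weights.sum())
    return items[index]


if __name__ == "__main__":
    dist = ("choose_weighted", ["Alarm_bell_ringing", "Dog"], [0.7, 0.3])
    print(sample_choose_weighted(dist, random_state=0))
```

Normalizing the weights inside the sampler means callers could pass either probabilities (0.7/0.3) or raw weights (70/30).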

IIUC in the example above you're providing the probability for each event being chosen, and then choosing one of these events, but that's not the same as co-occurrence probabilities, right? That is, it's different to say

  1. Choose between alarm/dog with prob .7/.3
  2. Give me a soundscape where alarm and dog co-occur with probability X.

My understanding from today's meeting was that the team is interested in the latter, but maybe I misunderstood?
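
One way to see the difference between the two statements is a small simulation. In the sketch below, every event is drawn independently with prob .7/.3 (statement 1), and the co-occurrence probability (statement 2) is then measured rather than specified. The function names and the soundscape-as-set-of-labels representation are assumptions for illustration only.

```python
import numpy as np


def cooccurrence_probability(soundscapes, class_a, class_b):
    """Fraction of soundscapes that contain both class_a and class_b."""
    hits = sum(1 for labels in soundscapes if class_a in labels and class_b in labels)
    return hits / len(soundscapes)


def sample_soundscapes_by_weighted_choice(n_soundscapes, n_events, classes, probas, seed=None):
    """Build soundscapes (sets of labels) by drawing each event independently."""
    rng = np.random.RandomState(seed)
    return [
        set(rng.choice(classes, size=n_events, p=probas).tolist())
        for _ in range(n_soundscapes)
    ]


if __name__ == "__main__":
    soundscapes = sample_soundscapes_by_weighted_choice(
        20000, 2, ["alarm", "dog"], [0.7, 0.3], seed=0
    )
    # With two independent draws, alarm and dog co-occur when the draws differ:
    # 2 * 0.7 * 0.3 = 0.42. The estimate should be close to that value.
    print(cooccurrence_probability(soundscapes, "alarm", "dog"))
```

Here the co-occurrence probability is a by-product of the per-event weights and the number of events; asking for a soundscape where alarm and dog co-occur with probability X would instead require controlling that quantity directly.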

Regardless, it looks like we'd need something like choose_weighted to support Gibbs or related types of sampling methods?

@turpaultn
Author

Cool !

Well, I understand it's not clear, since I only posted this small piece of code.

But the algorithm is like this:

  • Sample an event from the 10 classes (uniformly)
  • Take the co-occurrence dictionary above for that event (e.g. the probability of alarm/dog when "cat" is the first event)
  • Sample an event using this dictionary (for example, a vacuum_cleaner has no chance of being picked; only dog and alarm remain)
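
The three steps above can be sketched roughly as follows. The CO_OCCUR_PARAMS table (three classes instead of ten, with made-up weights) and the function name sample_event_pair are assumptions for illustration, not the actual DESED configuration.

```python
import numpy as np

# Hypothetical per-class co-occurrence parameters (weights are made up).
CO_OCCUR_PARAMS = {
    "Cat": {"classes": ["Alarm_bell_ringing", "Dog"], "probas": [70, 30]},
    "Dog": {"classes": ["Dog", "Cat"], "probas": [60, 40]},
    "Alarm_bell_ringing": {"classes": ["Alarm_bell_ringing", "Dog"], "probas": [90, 10]},
}


def sample_event_pair(co_occur_params_by_class, seed=None):
    """Sample a first event uniformly, then a second one conditioned on it."""
    rng = np.random.RandomState(seed)

    # Step 1: sample the first event uniformly over all classes.
    first = str(rng.choice(sorted(co_occur_params_by_class)))

    # Step 2: look up the co-occurrence dictionary of that first event.
    params = co_occur_params_by_class[first]

    # Step 3: sample the second event from that dictionary only.
    probas = np.asarray(params["probas"], dtype=float)
    second = str(rng.choice(params["classes"], p=probas / probas.sum()))
    return first, second


if __name__ == "__main__":
    print(sample_event_pair(CO_OCCUR_PARAMS, seed=0))
```

Classes missing from a dictionary's "classes" list simply cannot be drawn as the second event, which is how the vacuum_cleaner case is excluded.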

The idea was that if an alarm ("beep") appears, for example, there is a good chance you will hear another one.
As I said, it is simple, but at least it gave us a class balance closer to the real set without spending too much time.

> Regardless, it looks like we'd need something like choose_weighted to support Gibbs or related types of sampling methods?

I agree.
