Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OLDA's "Audio file too short!" warning is misleading #72

Open
andimarafioti opened this issue Oct 10, 2017 · 4 comments
Open

OLDA's "Audio file too short!" warning is misleading #72

andimarafioti opened this issue Oct 10, 2017 · 4 comments
Assignees
Labels

Comments

@andimarafioti
Copy link
Contributor

andimarafioti commented Oct 10, 2017

In OLDA's segmenter.py there's a warning for short files (both in 277-290 and 340-345):

        try:
            # Load and apply transform
            W = load_transform(self.config["transform"])
            F = W.dot(F)

            # Get Segments
            kmin, kmax = get_num_segs(dur)
            est_idxs = get_segments(F, kmin=kmin, kmax=kmax)
        except:
            # The audio file is too short, only beginning and end
            logging.warning("Audio file too short! "
                            "Only start and end boundaries.")
            est_idxs = [0, F.shape[1] - 1]

and

        try:
            # Load and apply transform
            W = load_transform(self.config["transform"])
            F = W.dot(F)

            # Get Segments
            kmin, kmax = get_num_segs(dur)

            # Run algorithm layer by layer
            est_idxs = []
            est_labels = []
            for k in range(kmin, kmax):
                S, cost = get_k_segments(F, k)
                est_idxs.append(S)
                est_labels.append(np.ones(len(S) - 1) * -1)

                # Make sure that the first and last boundaries are included
                assert est_idxs[-1][0] == 0 and \
                    est_idxs[-1][-1] == F.shape[1] - 1, "Layer %d does not " \
                    "start or end in the right frame(s)." % k

                # Post process layer
                est_idxs[-1], est_labels[-1] = \
                        self._postprocess(est_idxs[-1], est_labels[-1])
        except:
            # The audio file is too short, only beginning and end
            logging.warning("Audio file too short! "
                            "Only start and end boundaries.")
            est_idxs = [np.array([0, F.shape[1] - 1])]
            est_labels = [np.ones(1) * -1]

I found at the moment there is an issue between librosa and sklearn that makes the olda algorithm fail (should be fixed soon, though) and this logging to warn about something completely unrelated to the actual problem. Maybe someone familiarized with the olda algorithm could estimate how large the file should be for it to work?

@andimarafioti andimarafioti changed the title OLDA "Audio file too short!" warning is misleading OLDA's "Audio file too short!" warning is misleading Oct 10, 2017
@urinieto urinieto added the bug label Oct 11, 2017
@urinieto
Copy link
Owner

Yeah, this should be fixed as it is terrible coding practice to not catch specific exceptions. Will work on this in the next release. Thanks for pointing it out.

@urinieto urinieto self-assigned this Oct 11, 2017
@andimarafioti
Copy link
Contributor Author

Great! I just wanted to point out where this problem with the olda algorithm is coming from in case anyone else runs into it.
About the exception, it's not that bad. But if you're going to work on it I suggest not only catching specific exception but also doing smaller try blocks and never ever inside a for loop. There's even an assertion there that would be caught by the try block and you will get a "file too short" message instead of the intended "Layer %d does not start or end in the right frame(s).".

@PaulMcInnis
Copy link

Scikit-learn's 0.19.1 release has now fixed this issue, looks like all that needs to be done now is to update the setup.py 👍

@urinieto
Copy link
Owner

urinieto commented Dec 7, 2017

Awesome, thanks for sharing this, @PaulMcInnis. Will leave this open such that I remember to remove the general exception catching in the next release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants