Skip to content

Added partial visualisation for Audio Datasets under tfds #1683

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 42 commits into
base: master
Choose a base branch
from

Conversation

harshitadd
Copy link

Added partial visualization support for audio datasets under tfds: Tested on LJspeech, Groove
Issue Link: 1528

Procedure:

  1. Pick random samples from argument dataset
  2. Trim the picked audio samples to include only the first 6 seconds of the audio
  3. Use IPython to display the trimmed audio sample
  4. Plot the corresponding audio waveforms.

Notebook demonstration: Link

@googlebot googlebot added the cla: yes Author has signed CLA label Mar 20, 2020
@tfds-bot tfds-bot added the community:please_review Community - We need your help to review this PR. label Mar 20, 2020
Copy link
Contributor

@Eshan-Agarwal Eshan-Agarwal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@harshitadd Please have a look on some of the changes I requested if I am mistaken anywhere let me know.

Comment on lines 20 to 22
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove them as we drop use of python 2, so we don't need these codes

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Internally, we are still supporting Python2, so it shouldn't be removed yet.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Conchylicultor thanks for correction

Comment on lines 165 to 166


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@harshitadd Please remove one line and check your code style by running ./oss_scripts/lint.sh tensorflow_datasets/image

Comment on lines 75 to 79
if not image_keys:
raise ValueError(
"Visualisation not supported for dataset `{}`. Was not able to "
"auto-infer image.".format(ds_info.name))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why removing these, I think we should print a message if provided data is video or text. Maybe you can add param like audio key and use if not image_keys and not audio_keys then again check for if not image_keys so print this message else print shows error like visualization is supports only for audio and mage dataset

@@ -66,17 +70,13 @@ def show_examples(ds_info, ds, rows=3, cols=3, plot_scale=3., image_key=None):
plt = lazy_imports_lib.lazy_imports.matplotlib.pyplot

if not image_key:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also use if not image_key and not audio_key after adding audio key param and generate audio_keys using features_lib.Audio so that if any other dataset like video or text is passed it can print error

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this - feature_lib.Audio was returning None for both the tested datasets. Since that instantiation check was not working - I had to resort to this implementation. []Here) is the commit with that code - Perhaps you could point out the mistake?

Copy link
Contributor

@Eshan-Agarwal Eshan-Agarwal Mar 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't know why its not working , dataset you are trying for audio_keys have how many number of channels ? , Audio feature supports only 1-D values see

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just rechecked - I think I was missing something trying it earlier - the instantiation check is working (at least with the feature_lib.Audio) so I will add that. My bad.
Thanks for the code style check reminder - have done that as well and will make the required changes! 👍

@@ -85,46 +85,82 @@ def show_examples(ds_info, ds, rows=3, cols=3, plot_scale=3., image_key=None):

image_key = image_keys[0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think check first like if image_keys then only set image_key as if data is not image dataset image_key set to empty

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refer to comment 78 and code 79 - The check to run this code has been added ( only when the image instances are populated )

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohh sorry its my mistake

if not label_key:
logging.info("Was not able to auto-infer label.")

num_examples = rows * cols
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a check here if image_key so that below code runs only if data is image dataset

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refer to code line 78 - This is all under if image keys: indicating that the code runs only when the image keys have been successfully generated.

return fig

# if not image item instances
if not image_key:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if dataset is like video dataset then it tries to run this command also so change it to if audio_key after passing audio_key param

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed - As mentioned - The audio key instantiation check wasn't working ( commit linked above ) - I am working on resolving that, which should allow introducing video/text/audio based instance checks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes please

@tfds-bot tfds-bot added author:please_respond Author - please respond to the recent comments. tfds:is_reviewing TFDS team: PTAL and removed community:please_review Community - We need your help to review this PR. author:please_respond Author - please respond to the recent comments. labels Mar 21, 2020
@harshitadd
Copy link
Author

harshitadd commented Mar 27, 2020

@Conchylicultor - I tried refactoring the code to plug it in the visualizer codebase but the audio_key extraction seems to be returning a null as a result of which the match() method fails to execute.
I have removed the write on disk dependency ( audio now displays directly from the np array ) and added the missing imports.

Copy link
Member

@Conchylicultor Conchylicultor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To check if a dataset has an audio feature, you can see on our catalog: For instance https://www.tensorflow.org/datasets/catalog/crema_d

It seems Groove is using tfds.features.Tensor instead of tfds.features.Audio, which sounds like a bug to me, we should upgrade groove to use audio feature. I'll open a bug for this:
https://www.tensorflow.org/datasets/catalog/groove#groovefull-16000hz

from tensorflow_datasets.core import features as features_lib
from tensorflow_datasets.core import lazy_imports_lib
from tensorflow_datasets.core.visualization import visualizer
plt = lazy_imports_lib.lazy_imports.matplotlib.pyplot
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will raise ImportError for users not having installed matplotlib.
The goal of lazy import is to avoid non-essencial dependencies by importing within specific function instead of in global scope.

Copy link
Author

@harshitadd harshitadd Apr 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To check if a dataset has an audio feature, you can see on our catalog: For instance https://www.tensorflow.org/datasets/catalog/crema_d

It seems Groove is using tfds.features.Tensor instead of tfds.features.Audio, which sounds like a bug to me, we should upgrade groove to use audio feature. I'll open a bug for this:
https://www.tensorflow.org/datasets/catalog/groove#groovefull-16000hz

Edit - Thank you for your comments - A bug for the Groove 'Tensor' attribute will be helpful - #1741 : Just realized that you have generated the issue

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bug has been fixed so it should works if you are rebasing from master.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does, Thanks! The auto inferring works now ( Link . I have tested it with Groove, crema_d, ljspeech.

from tensorflow_datasets.core.visualization import visualizer
plt = lazy_imports_lib.lazy_imports.matplotlib.pyplot

class AudioGridVisualizer(visualizer.Visualizer):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you format your code with https://www.tensorflow.org/datasets/add_dataset#5_check_your_code_style

(add docstring, new line before method declaration, correct docstring...)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do!



import random
import IPython.display
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will crash for all user not running in a non-ipython environement. This should not be imported in the global scope

@@ -0,0 +1,68 @@
""" Audio Visualization
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a test in show_example_test.py to make sure this works ? You can use https://docs.python.org/3/library/unittest.mock.html to make sure the AudioVisualizer is chosen.

Copy link
Author

@harshitadd harshitadd Apr 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a mock patch for the AudioVisualizer class and get the following type error :
TypeError: test_show_examples() takes 2 positional arguments but 3 were given


Ran 2 tests in 0.557s

FAILED (errors=1, skipped=1)

):
"""Display the dataset.

Args:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arguments are not matching

image_key: `string`, name of the feature that contains the image. If not
set, the system will try to auto-detect it.
"""
key = audio_keys[0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is it declared ?

@tfds-bot tfds-bot added author:please_respond Author - please respond to the recent comments. and removed tfds:is_reviewing TFDS team: PTAL labels Mar 27, 2020
image_key: `string`, name of the feature that contains the image. If not
set, the system will try to auto-detect it.
"""
import random
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Random is a standard Python module, so installed in every users, so it is safe to keep it in the global scope

@tfds-bot tfds-bot added tfds:is_reviewing TFDS team: PTAL author:please_respond Author - please respond to the recent comments. and removed author:please_respond Author - please respond to the recent comments. tfds:is_reviewing TFDS team: PTAL labels Apr 1, 2020
@tfds-bot tfds-bot added tfds:is_reviewing TFDS team: PTAL and removed author:please_respond Author - please respond to the recent comments. labels Apr 3, 2020
Copy link
Member

@Conchylicultor Conchylicultor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update. This looks better.

key = audio_keys[0]
audio_samples = []

samplerate = 16000
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now you can use ds_info.features[key].sample_rate when defined (and use default value if sample_rate is None)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

audio = audio[t1:t2]
ipd.display(ipd.Audio(audio, rate=samplerate))
plt = lazy_imports_lib.lazy_imports.matplotlib.pyplot
fig, a = plt.subplots(2, 2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make the code more generic, you could refactor the code to use rows and cols, similarly to https://github.com/tensorflow/datasets/blob/b5d7ec65b84aa95e7f5e78f01f7698958498f65c/tensorflow_datasets/core/visualization/image_visualizer.py

And ideally, you could try to reuse the make_grid function

def _make_grid(plot_single_ex_fn, ds, rows, cols, plot_scale):

Or make a similar util function.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All Right! I made some edits accordingly

ds,
rows=2,
cols=2,
plot_scale=3.,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of the arguments here are unused.

@tfds-bot tfds-bot added author:please_respond Author - please respond to the recent comments. and removed tfds:is_reviewing TFDS team: PTAL labels Apr 3, 2020
@mock.patch('matplotlib.pyplot.figure')
def test_show_examples(self, mock_fig):
@mock.patch('audio_visualizer.AudioGridVisualizer')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal is to test AudioGridVisualizer to make sure it don't crash so it shouldn't be patched. Only the low level functions, like IPython.display might be patched.

@tfds-bot tfds-bot added tfds:is_reviewing TFDS team: PTAL and removed author:please_respond Author - please respond to the recent comments. labels Apr 8, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes Author has signed CLA tfds:is_reviewing TFDS team: PTAL
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants