-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Added partial visualisation for Audio Datasets under tfds #1683
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@harshitadd Please have a look on some of the changes I requested if I am mistaken anywhere let me know.
from __future__ import absolute_import | ||
from __future__ import division | ||
from __future__ import print_function |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove them as we drop use of python 2, so we don't need these codes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Internally, we are still supporting Python2, so it shouldn't be removed yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Conchylicultor thanks for correction
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@harshitadd Please remove one line and check your code style by running ./oss_scripts/lint.sh tensorflow_datasets/image
if not image_keys: | ||
raise ValueError( | ||
"Visualisation not supported for dataset `{}`. Was not able to " | ||
"auto-infer image.".format(ds_info.name)) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why removing these, I think we should print a message if provided data is video or text. Maybe you can add param like audio key and use if not image_keys and not audio_keys
then again check for if not image_keys
so print this message else print shows error like visualization is supports only for audio and mage dataset
@@ -66,17 +70,13 @@ def show_examples(ds_info, ds, rows=3, cols=3, plot_scale=3., image_key=None): | |||
plt = lazy_imports_lib.lazy_imports.matplotlib.pyplot | |||
|
|||
if not image_key: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also use if not image_key and not audio_key
after adding audio key param and generate audio_keys using features_lib.Audio
so that if any other dataset like video or text is passed it can print error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried this - feature_lib.Audio was returning None for both the tested datasets. Since that instantiation check was not working - I had to resort to this implementation. []Here) is the commit with that code - Perhaps you could point out the mistake?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't know why its not working , dataset you are trying for audio_keys have how many number of channels ? , Audio feature supports only 1-D values see
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just rechecked - I think I was missing something trying it earlier - the instantiation check is working (at least with the feature_lib.Audio) so I will add that. My bad.
Thanks for the code style check reminder - have done that as well and will make the required changes! 👍
@@ -85,46 +85,82 @@ def show_examples(ds_info, ds, rows=3, cols=3, plot_scale=3., image_key=None): | |||
|
|||
image_key = image_keys[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think check first like if image_keys
then only set image_key
as if data is not image dataset image_key
set to empty
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refer to comment 78 and code 79 - The check to run this code has been added ( only when the image instances are populated )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ohh sorry its my mistake
if not label_key: | ||
logging.info("Was not able to auto-infer label.") | ||
|
||
num_examples = rows * cols |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a check here if image_key
so that below code runs only if data is image dataset
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refer to code line 78 - This is all under if image keys: indicating that the code runs only when the image keys have been successfully generated.
return fig | ||
|
||
# if not image item instances | ||
if not image_key: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if dataset is like video dataset then it tries to run this command also so change it to if audio_key
after passing audio_key param
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed - As mentioned - The audio key instantiation check wasn't working ( commit linked above ) - I am working on resolving that, which should allow introducing video/text/audio based instance checks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes please
@Conchylicultor - I tried refactoring the code to plug it in the visualizer codebase but the audio_key extraction seems to be returning a null as a result of which the match() method fails to execute. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To check if a dataset has an audio feature, you can see on our catalog: For instance https://www.tensorflow.org/datasets/catalog/crema_d
It seems Groove is using tfds.features.Tensor
instead of tfds.features.Audio
, which sounds like a bug to me, we should upgrade groove to use audio feature. I'll open a bug for this:
https://www.tensorflow.org/datasets/catalog/groove#groovefull-16000hz
from tensorflow_datasets.core import features as features_lib | ||
from tensorflow_datasets.core import lazy_imports_lib | ||
from tensorflow_datasets.core.visualization import visualizer | ||
plt = lazy_imports_lib.lazy_imports.matplotlib.pyplot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will raise ImportError for users not having installed matplotlib
.
The goal of lazy import is to avoid non-essencial dependencies by importing within specific function instead of in global scope.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To check if a dataset has an audio feature, you can see on our catalog: For instance https://www.tensorflow.org/datasets/catalog/crema_d
It seems Groove is using
tfds.features.Tensor
instead oftfds.features.Audio
, which sounds like a bug to me, we should upgrade groove to use audio feature. I'll open a bug for this:
https://www.tensorflow.org/datasets/catalog/groove#groovefull-16000hz
Edit - Thank you for your comments - A bug for the Groove 'Tensor' attribute will be helpful - #1741 : Just realized that you have generated the issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The bug has been fixed so it should works if you are rebasing from master.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does, Thanks! The auto inferring works now ( Link . I have tested it with Groove, crema_d, ljspeech.
from tensorflow_datasets.core.visualization import visualizer | ||
plt = lazy_imports_lib.lazy_imports.matplotlib.pyplot | ||
|
||
class AudioGridVisualizer(visualizer.Visualizer): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you format your code with https://www.tensorflow.org/datasets/add_dataset#5_check_your_code_style
(add docstring, new line before method declaration, correct docstring...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do!
|
||
|
||
import random | ||
import IPython.display |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will crash for all user not running in a non-ipython environement. This should not be imported in the global scope
@@ -0,0 +1,68 @@ | |||
""" Audio Visualization |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a test in show_example_test.py
to make sure this works ? You can use https://docs.python.org/3/library/unittest.mock.html to make sure the AudioVisualizer is chosen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a mock patch for the AudioVisualizer class and get the following type error :
TypeError: test_show_examples() takes 2 positional arguments but 3 were given
Ran 2 tests in 0.557s
FAILED (errors=1, skipped=1)
): | ||
"""Display the dataset. | ||
|
||
Args: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Arguments are not matching
image_key: `string`, name of the feature that contains the image. If not | ||
set, the system will try to auto-detect it. | ||
""" | ||
key = audio_keys[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is it declared ?
image_key: `string`, name of the feature that contains the image. If not | ||
set, the system will try to auto-detect it. | ||
""" | ||
import random |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Random is a standard Python module, so installed in every users, so it is safe to keep it in the global scope
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update. This looks better.
key = audio_keys[0] | ||
audio_samples = [] | ||
|
||
samplerate = 16000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now you can use ds_info.features[key].sample_rate
when defined (and use default value if sample_rate is None
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
audio = audio[t1:t2] | ||
ipd.display(ipd.Audio(audio, rate=samplerate)) | ||
plt = lazy_imports_lib.lazy_imports.matplotlib.pyplot | ||
fig, a = plt.subplots(2, 2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To make the code more generic, you could refactor the code to use rows
and cols
, similarly to https://github.com/tensorflow/datasets/blob/b5d7ec65b84aa95e7f5e78f01f7698958498f65c/tensorflow_datasets/core/visualization/image_visualizer.py
And ideally, you could try to reuse the make_grid
function
def _make_grid(plot_single_ex_fn, ds, rows, cols, plot_scale): |
Or make a similar util function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All Right! I made some edits accordingly
ds, | ||
rows=2, | ||
cols=2, | ||
plot_scale=3., |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some of the arguments here are unused.
@mock.patch('matplotlib.pyplot.figure') | ||
def test_show_examples(self, mock_fig): | ||
@mock.patch('audio_visualizer.AudioGridVisualizer') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The goal is to test AudioGridVisualizer
to make sure it don't crash so it shouldn't be patched. Only the low level functions, like IPython.display
might be patched.
Added partial visualization support for audio datasets under tfds: Tested on LJspeech, Groove
Issue Link: 1528
Procedure:
Notebook demonstration: Link