Added partial visualisation for Audio Datasets under tfds #1683

Open
wants to merge 42 commits into
base: master
Changes from 36 commits
Commits
313b571
Checking for audio key generation
harshitadd Mar 18, 2020
cb417fc
1
harshitadd Mar 18, 2020
5489a5f
2
harshitadd Mar 18, 2020
f70cf46
3
harshitadd Mar 18, 2020
ebf78db
5
harshitadd Mar 18, 2020
8075721
Update visualization.py
harshitadd Mar 18, 2020
8fc8ddf
6
harshitadd Mar 18, 2020
02bc9e2
7
harshitadd Mar 18, 2020
5242458
8
harshitadd Mar 18, 2020
f801b41
9
harshitadd Mar 18, 2020
04af5b2
10
harshitadd Mar 18, 2020
d84455d
10
harshitadd Mar 18, 2020
581620d
Update visualization.py
harshitadd Mar 18, 2020
8791219
1.1
harshitadd Mar 20, 2020
17b9188
1.2
harshitadd Mar 20, 2020
a5dc660
1.2
harshitadd Mar 20, 2020
df53df0
adding null check on image key
harshitadd Mar 20, 2020
15550dc
unexpected indent
harshitadd Mar 20, 2020
cfb259b
audio key instancing not working
harshitadd Mar 20, 2020
f01622e
imported some headers
harshitadd Mar 20, 2020
0378ca4
taking the 1st 20 ms of the audio
harshitadd Mar 20, 2020
e95ded9
Update visualization.py
harshitadd Mar 20, 2020
42df323
Update visualization.py
harshitadd Mar 20, 2020
2539974
Proper formatting and instantiation type check
harshitadd Mar 22, 2020
ad62afe
Syncing with master
harshitadd Mar 27, 2020
ede24ba
Adding audio_visualizer defination
harshitadd Mar 27, 2020
10e96bb
Added AudioGridVisualizer
harshitadd Mar 27, 2020
bea3e4e
Removed write to disk dependency
harshitadd Mar 27, 2020
46c6260
using extract_keys to extract keys
harshitadd Mar 27, 2020
314ee0c
bug: audio_keys returning NULL
harshitadd Mar 27, 2020
344f8d3
_
harshitadd Mar 27, 2020
8df4e9d
Updated init.py to include AudioGridVisualizer
harshitadd Mar 27, 2020
1e12f8d
_
harshitadd Mar 27, 2020
bb07fa2
Added missing imports
harshitadd Mar 27, 2020
817dcfa
Fixed formatting
harshitadd Mar 27, 2020
7af71b3
Fixed formatting
harshitadd Mar 27, 2020
26fd653
Local scope, declaration of audio_keys
harshitadd Mar 28, 2020
3e99ff0
Merge branch 'master' of https://github.com/tensorflow/datasets
harshitadd Apr 3, 2020
ac73594
Class docstrings and code style
harshitadd Apr 3, 2020
b5d7ec6
Adding mock patch for AudioVisualizer
harshitadd Apr 3, 2020
cc487e9
Supporting generic inputs
harshitadd Apr 8, 2020
801987f
Removed unused imports
harshitadd Apr 8, 2020
3 changes: 2 additions & 1 deletion tensorflow_datasets/core/visualization/__init__.py
@@ -14,14 +14,15 @@
# limitations under the License.

"""Visualizer utils."""

from tensorflow_datasets.core.visualization.audio_visualizer import AudioGridVisualizer
from tensorflow_datasets.core.visualization.image_visualizer import ImageGridVisualizer
from tensorflow_datasets.core.visualization.show_examples import show_examples
from tensorflow_datasets.core.visualization.visualizer import Visualizer


__all__ = [
    "ImageGridVisualizer",
    "AudioGridVisualizer",
    "show_examples",
    "Visualizer",
]
68 changes: 68 additions & 0 deletions tensorflow_datasets/core/visualization/audio_visualizer.py
@@ -0,0 +1,68 @@
""" Audio Visualization
Member

Could you add a test in show_example_test.py to make sure this works? You can use https://docs.python.org/3/library/unittest.mock.html to make sure the AudioVisualizer is chosen.

Author
harshitadd Apr 3, 2020

I added a mock patch for the AudioVisualizer class and got the following TypeError:
TypeError: test_show_examples() takes 2 positional arguments but 3 were given


Ran 2 tests in 0.557s

FAILED (errors=1, skipped=1)
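
The error above is consistent with how `unittest.mock.patch` behaves as a decorator: each patch passes its mock to the test as an extra positional argument, so the test signature needs one parameter per patch. A minimal sketch, using standard-library targets (`os.getcwd`, `os.listdir`) as stand-ins rather than the actual tfds test:

```python
import os
import unittest
from unittest import mock


class PatchSignatureTest(unittest.TestCase):

  # Decorators are applied bottom-up, so the mock for os.getcwd is passed
  # first and the mock for os.listdir second.
  @mock.patch("os.listdir")
  @mock.patch("os.getcwd")
  def test_two_patches(self, mock_getcwd, mock_listdir):
    mock_getcwd.return_value = "/fake"
    mock_listdir.return_value = ["a.wav"]
    self.assertEqual(os.getcwd(), "/fake")
    self.assertEqual(os.listdir("."), ["a.wav"])
    mock_getcwd.assert_called_once_with()
    mock_listdir.assert_called_once_with(".")


def run_tests():
  """Runs the test case and returns the unittest result object."""
  suite = unittest.defaultTestLoader.loadTestsFromTestCase(PatchSignatureTest)
  return unittest.TextTestRunner(verbosity=0).run(suite)
```

Dropping either `mock_getcwd` or `mock_listdir` from the signature reproduces a "takes N positional arguments but N+1 were given" error of the kind reported here.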

"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function


import random
import IPython.display
Member

This will crash for all users not running in an IPython environment. It should not be imported in the global scope.


from absl import logging

from tensorflow_datasets.core import dataset_utils
from tensorflow_datasets.core import features as features_lib
from tensorflow_datasets.core import lazy_imports_lib
from tensorflow_datasets.core.visualization import visualizer
plt = lazy_imports_lib.lazy_imports.matplotlib.pyplot
Member

This will raise ImportError for users who have not installed matplotlib.
The goal of lazy imports is to avoid non-essential dependencies by importing within the specific function instead of in the global scope.
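
The point can be illustrated with a toy lazy-import object: attribute access is what triggers the import, so binding `plt` at module level performs the import as soon as the module is loaded. The `LazyImports` class below is a simplified stand-in written for this example, not the tfds `lazy_imports_lib` implementation, and it uses the standard-library `json` module in place of matplotlib:

```python
import importlib


class LazyImports(object):
  """Toy stand-in: each property imports its module when accessed."""

  def __init__(self):
    self.import_log = []  # Records which modules were actually imported.

  def _load(self, name):
    self.import_log.append(name)
    return importlib.import_module(name)

  @property
  def json(self):  # Standard-library module used in place of matplotlib.
    return self._load("json")


lazy_imports = LazyImports()


def encode(value):
  # Accessing the property inside the function defers the import until
  # the function is called, which is the pattern the reviewer asks for.
  return lazy_imports.json.dumps(value)
```

Nothing is imported until `encode` is called; a module-level `json = lazy_imports.json` would have triggered the import immediately.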

Author
harshitadd Apr 1, 2020

To check whether a dataset has an audio feature, you can look at our catalog, for instance https://www.tensorflow.org/datasets/catalog/crema_d

It seems Groove is using tfds.features.Tensor instead of tfds.features.Audio, which sounds like a bug to me. We should upgrade Groove to use the Audio feature. I'll open a bug for this:
https://www.tensorflow.org/datasets/catalog/groove#groovefull-16000hz

Edit: Thank you for your comments. A bug for the Groove 'Tensor' attribute will be helpful. I just realized that you have already opened the issue: #1741

Member

The bug has been fixed, so it should work if you rebase from master.

Author

It does, thanks! The auto-inferring works now (Link). I have tested it with Groove, crema_d, and ljspeech.


class AudioGridVisualizer(visualizer.Visualizer):
Member

Could you format your code following https://www.tensorflow.org/datasets/add_dataset#5_check_your_code_style?

(Add a class docstring, a new line before each method declaration, correct the docstrings...)

Author

Will do!

  def match(self, ds_info):
    audio_keys = visualizer.extract_keys(ds_info.features, features_lib.Audio)
    return len(audio_keys) > 0

  def show(
      self,
      ds_info,
      ds,
  ):
    """Display the dataset.

    Args:
Member

The documented arguments do not match the method signature.

      ds_info: `tfds.core.DatasetInfo` object of the dataset to visualize.
      ds: `tf.data.Dataset`. The tf.data.Dataset object to visualize. Examples
        should not be batched. Examples will be consumed in order until
        (rows * cols) are read or the dataset is consumed.
      rows: `int`, number of rows of the display grid.
      cols: `int`, number of columns of the display grid.
      plot_scale: `float`, controls the plot size of the images. Keep this
        value around 3 to get a good plot. High and low values may cause
        the labels to get overlapped.
      image_key: `string`, name of the feature that contains the image. If not
        set, the system will try to auto-detect it.
    """
    key = audio_keys[0]
Member

Where is `audio_keys` declared?

    audio_samples = []

    samplerate = 16000
Member

Now you can use ds_info.features[key].sample_rate when it is defined (and use a default value if sample_rate is None).

Author

Added
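
The fallback suggested above could look like the following sketch. It assumes only that the feature object exposes a `sample_rate` attribute that may be None, as the reviewer describes; `resolve_sample_rate` is a hypothetical helper, and `SimpleNamespace` objects stand in for `ds_info.features[key]`:

```python
import types

DEFAULT_SAMPLE_RATE = 16000  # Value hard-coded in the diff above.


def resolve_sample_rate(audio_feature, default=DEFAULT_SAMPLE_RATE):
  """Returns the feature's sample rate, or `default` when it is None."""
  rate = getattr(audio_feature, "sample_rate", None)
  return default if rate is None else rate


# Stand-ins for `ds_info.features[key]`, with and without a sample rate.
with_rate = types.SimpleNamespace(sample_rate=22050)
without_rate = types.SimpleNamespace(sample_rate=None)
```

Comparing against None explicitly, rather than writing `rate or default`, keeps the intent clear and avoids treating other falsy values as missing.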

    for features in ds:
      audio_samples.append(features[key].numpy())
    to_gen = []
    for _ in range(4):
      # randint is inclusive on both ends, so the upper bound is len - 1.
      value = random.randint(0, len(audio_samples) - 1)
      to_gen.append(audio_samples[value])

    t1 = 0
    t2 = 100 * 1000
    for audio in to_gen:
      newAudio = audio[t1:t2]
      IPython.display.Audio(newAudio, rate=samplerate)

    fig, a = plt.subplots(2, 2)
Member

To make the code more generic, you could refactor the code to use rows and cols, similarly to https://github.com/tensorflow/datasets/blob/b5d7ec65b84aa95e7f5e78f01f7698958498f65c/tensorflow_datasets/core/visualization/image_visualizer.py

Ideally, you could try to reuse the _make_grid function:

def _make_grid(plot_single_ex_fn, ds, rows, cols, plot_scale):

Or make a similar util function.

Author

All right! I made some edits accordingly.
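
As a rough illustration of the rows/cols refactoring discussed here, the hard-coded 2 x 2 indexing can be replaced by a helper that consumes at most rows * cols examples and computes each grid position. `take_grid_examples` is a hypothetical utility written for this sketch; it is not the `_make_grid` function from image_visualizer.py:

```python
import itertools


def take_grid_examples(examples, rows, cols):
  """Yields (row, col, example) for at most rows * cols examples, in order."""
  for index, example in enumerate(itertools.islice(examples, rows * cols)):
    # divmod maps a flat index to its row-major grid position.
    row, col = divmod(index, cols)
    yield row, col, example


# Ten dummy examples placed on a 2 x 2 grid: only the first four are used.
positions = list(take_grid_examples(range(10), rows=2, cols=2))
```

Each `(row, col, example)` triple could then be plotted on the matching subplot axis, so the same code handles any grid size and datasets with fewer examples than grid cells.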


    a[0][0].plot(to_gen[0])
    a[0][1].plot(to_gen[1])
    a[1][0].plot(to_gen[2])
    a[1][1].plot(to_gen[3])
    plt.show()
    return fig
2 changes: 2 additions & 0 deletions tensorflow_datasets/core/visualization/show_examples.py
@@ -21,10 +21,12 @@
from __future__ import division
from __future__ import print_function

from tensorflow_datasets.core.visualization import audio_visualizer
from tensorflow_datasets.core.visualization import image_visualizer

_ALL_VISUALIZERS = [
    image_visualizer.ImageGridVisualizer(),
    audio_visualizer.AudioGridVisualizer(),
]
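
How `show_examples` uses this list is not visible in the part of the diff shown here. A plausible sketch, assuming it dispatches to the first visualizer whose `match` returns True; the classes and the dict-based `ds_info` below are stand-ins invented for the example, not the tfds ones:

```python
class FakeImageVisualizer(object):
  """Stand-in for ImageGridVisualizer."""

  def match(self, ds_info):
    return "image" in ds_info["feature_types"]

  def show(self, ds_info, ds):
    return "image grid"


class FakeAudioVisualizer(object):
  """Stand-in for AudioGridVisualizer."""

  def match(self, ds_info):
    return "audio" in ds_info["feature_types"]

  def show(self, ds_info, ds):
    return "audio plot"


_ALL_VISUALIZERS = [FakeImageVisualizer(), FakeAudioVisualizer()]


def show_examples(ds_info, ds):
  """Dispatches to the first matching visualizer (assumed behaviour)."""
  for candidate in _ALL_VISUALIZERS:
    if candidate.match(ds_info):
      return candidate.show(ds_info, ds)
  raise ValueError("No visualizer supports this dataset.")
```

Under this assumption the order of the list matters: a dataset with both image and audio features would be handled by the image visualizer.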

