Using Obspy to load data into PH5
NOTE: The utilities described on this page are deprecated and cannot be used to build a valid PH5. DO NOT USE.
PH5 natively supports common data loggers in use by the IRIS PASSCAL facility. These include RefTek 125A and 130, Q330, Geodes, and Fairfield 1-channel and 3-channel nodal instruments. As PH5 has grown, it has become important to allow community members to easily import their own data into PH5.
To accomplish this, two new packages were developed: metadatatoph5 and obspytoph5. Together, these two packages allow for easy ingestion of new data types.
The easiest way to ingest metadata into PH5 for a custom data source is to use StationXML. Creating a StationXML file allows you to load all the necessary metadata and response information into PH5 at one time.
A few things to note. First, PH5 works based on datalogger serial number, so your StationXML must include a datalogger serial number in order to load waveform data into PH5. Second, PH5 fundamentally works at the channel level, so it is advisable to give latitude, longitude, and elevation for each channel in your StationXML. If channel locations are not present, metadatatoph5 will use the station latitude, longitude, and elevation for the associated channels.
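The two requirements above can be checked before running metadatatoph5. The following is a minimal sketch using only the Python standard library. The helper `check_stationxml` and the trimmed sample document are hypothetical and are not part of PH5. The sample is cut down for illustration and is not schema-valid StationXML.

```python
import xml.etree.ElementTree as ET

# FDSN StationXML namespace, as declared on the root element
NS = {"sx": "http://www.fdsn.org/xml/station/1"}


def check_stationxml(xml_text):
    """Return a list of problems found in a StationXML string.

    Hypothetical helper for illustration; not part of PH5.
    """
    problems = []
    root = ET.fromstring(xml_text)
    for station in root.iterfind("sx:Network/sx:Station", NS):
        sta = station.get("code")
        for channel in station.iterfind("sx:Channel", NS):
            name = "%s.%s" % (sta, channel.get("code"))
            # a datalogger serial number is needed to load waveforms
            serial = channel.findtext(
                "sx:DataLogger/sx:SerialNumber", default="", namespaces=NS)
            if not serial.strip():
                problems.append(name + ": missing datalogger serial number")
            # channel-level coordinates are advisable
            for tag in ("Latitude", "Longitude", "Elevation"):
                if channel.find("sx:" + tag, NS) is None:
                    problems.append(
                        name + ": no channel " + tag
                        + ", station value would be used")
    return problems


# trimmed sample: DPZ is complete, DP1 lacks serial number and coordinates
SAMPLE = """<FDSNStationXML xmlns="http://www.fdsn.org/xml/station/1" schemaVersion="1.0">
  <Network code="XX">
    <Station code="12345">
      <Latitude>0.0</Latitude>
      <Longitude>0.0</Longitude>
      <Elevation>0.0</Elevation>
      <Channel code="DPZ" locationCode="01">
        <Latitude>34.054673</Latitude>
        <Longitude>-106.906169</Longitude>
        <Elevation>1400.0</Elevation>
        <DataLogger><SerialNumber>11111</SerialNumber></DataLogger>
      </Channel>
      <Channel code="DP1" locationCode="01">
        <Depth>0.0</Depth>
      </Channel>
    </Station>
  </Network>
</FDSNStationXML>"""

problems = check_stationxml(SAMPLE)
for problem in problems:
    print(problem)
```

An empty list means every channel carries a datalogger serial number and its own coordinates. It does not mean the file is valid StationXML.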
PASSCAL provides a tool called Nexus that can be used to create and edit stationXML. Nexus is also capable of grabbing RESP files from the IRIS Nominal Response Library and calculating responses if your instrumentation is in the NRL.
Once you have a valid StationXML file describing the data you would like to load, it can be loaded using metadatatoph5 from the command line:
metadatatoph5 -n master.ph5 -f <file-name>
If you do not have a PH5 archive master.ph5 yet, this program will create one for you.
As an example, we will load a simple StationXML file describing a single station and channel into PH5. This PH5 archive, loaded with metadata, will be used later in the tutorial to load custom waveform data described by this StationXML.
<?xml version='1.0' encoding='UTF-8'?>
<FDSNStationXML xmlns="http://www.fdsn.org/xml/station/1" schemaVersion="1.0">
<Source/>
<Module>Nexus.2018.142.beta</Module>
<ModuleURI>www.passcal.nmt.edu</ModuleURI>
<Created>2019-03-14T17:17:10.202178Z</Created>
<Network code="XX" endDate="2019-03-14T23:59:00.000000Z" startDate="2019-03-14T00:00:00.000000Z">
<Station code="12345" endDate="2019-03-14T23:59:00.000000Z" startDate="2019-03-14T00:00:00.000000Z">
<Latitude unit="DEGREES">0.0</Latitude>
<Longitude unit="DEGREES">0.0</Longitude>
<Elevation unit="METERS">0.0</Elevation>
<Site>
<Name></Name>
</Site>
<CreationDate>2019-03-14T17:17:48.633005Z</CreationDate>
<Channel code="DPZ" endDate="2019-03-14T05:01:00.000000Z" locationCode="01" startDate="2019-03-14T05:00:00.000000Z">
<Comment>
<Value></Value>
</Comment>
<Latitude unit="DEGREES">34.054673</Latitude>
<Longitude unit="DEGREES">-106.906169</Longitude>
<Elevation unit="METERS">1400.0</Elevation>
<Depth unit="METERS">0.0</Depth>
<Azimuth unit="DEGREES">0.0</Azimuth>
<Dip unit="DEGREES">90.0</Dip>
<SampleRate>250.0</SampleRate>
<Sensor>
<Description></Description>
</Sensor>
<DataLogger>
<SerialNumber>11111</SerialNumber>
</DataLogger>
</Channel>
</Station>
</Network>
</FDSNStationXML>
With this file saved to disk as example.xml, we load it with the following command:
metadatatoph5 -n master.ph5 -f example.xml
We should get an output like this:
metadatatoph5 -n master.ph5 -f example.xml
[2019-03-14 17:24:57,730] - ph5.utilities.metadatatoph5 - WARNING: ./master.ph5 not found. Creating...
[2019-03-14 17:24:57,766] - ph5.utilities.initialize_ph5 - INFO: Creating temporary receiver_t.tmp kef file using default values.
[2019-03-14 17:24:57,770] - ph5.utilities.initialize_ph5 - INFO: Loading Experiment_g/Receivers_g/Receiver_t using receiver_t.tmp.
[2019-03-14 17:24:57,773] - ph5.utilities.metadatatoph5 - INFO: Removing temporary receiver_t.tmp kef file.
[2019-03-14 17:24:57,779] - ph5.utilities.metadatatoph5 - INFO: Done... Created new PH5 file ./master.ph5.
[2019-03-14 17:24:57,816] - ph5.utilities.metadatatoph5 - INFO: File example.xml is STATIONXML...
[2019-03-14 17:24:57,816] - ph5.utilities.metadatatoph5 - INFO: *****************
[2019-03-14 17:24:57,818] - ph5.utilities.metadatatoph5 - INFO: Found station 12345
[2019-03-14 17:24:57,820] - ph5.utilities.metadatatoph5 - INFO: Found channel DPZ
[2019-03-14 17:24:57,821] - ph5.utilities.metadatatoph5 - INFO: Loaded channel DPZ
[2019-03-14 17:24:57,821] - ph5.utilities.metadatatoph5 - INFO: Loaded Station 12345
[2019-03-14 17:24:57,822] - ph5.utilities.metadatatoph5 - INFO: ******************
Our station and channel metadata is now loaded. To verify we can run: tabletokef -n master.ph5 -A 1
Giving us an output similar to:
tabletokef -n master.ph5 -A 1
#
# Thu Mar 14 17:27:20 2019 ph5 version: 4.1.2
#
# Table row 1
/Experiment_g/Sorts_g/Array_t_001
id_s=12345
location/X/value_d=-106.906169
location/X/units_s=degrees
location/Y/value_d=34.054673
location/Y/units_s=degrees
location/Z/value_d=1400.0
location/Z/units_s=m
location/coordinate_system_s=
location/projection_s=
location/ellipsoid_s=
location/description_s=
deploy_time/ascii_s=2019-03-14T05:00:00
deploy_time/epoch_l=1552539600
deploy_time/micro_seconds_i=0
deploy_time/type_s=BOTH
pickup_time/ascii_s=2019-03-14T05:01:00
pickup_time/epoch_l=1552539660
pickup_time/micro_seconds_i=0
pickup_time/type_s=BOTH
das/serial_number_s=11111
das/model_s=None
das/manufacturer_s=None
das/notes_s=None
sensor/serial_number_s=None
sensor/model_s=None
sensor/manufacturer_s=None
sensor/notes_s=None
description_s=
seed_band_code_s=D
sample_rate_i=250
sample_rate_multiplier_i=1
seed_instrument_code_s=P
seed_orientation_code_s=Z
seed_location_code_s=01
seed_station_name_s=12345
channel_number_i=3
receiver_table_n_i=0
response_table_n_i=0
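To check these values programmatically rather than by eye, the kef-style text can be parsed into a dictionary. This is a rough sketch, assuming the layout shown above: `#` comment lines, a line starting with `/` naming the table, then `key=value` lines. The helper `parse_kef` is hypothetical and is not a PH5 API. The sample is a shortened copy of the output above, and the comparison assumes the ASCII time is UTC.

```python
import calendar
from datetime import datetime


def parse_kef(text):
    """Parse kef-style text into a list of (table_path, {key: value}) rows.

    Hypothetical helper for illustration; not part of PH5.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("/"):
            rows.append((line, {}))  # a new table row begins
        elif "=" in line and rows:
            key, value = line.split("=", 1)
            rows[-1][1][key] = value
    return rows


# shortened copy of the tabletokef output shown above
SAMPLE = """\
#
# Table row 1
/Experiment_g/Sorts_g/Array_t_001
id_s=12345
deploy_time/ascii_s=2019-03-14T05:00:00
deploy_time/epoch_l=1552539600
das/serial_number_s=11111
seed_location_code_s=01
"""

table, row = parse_kef(SAMPLE)[0]

# the ASCII deploy time, read as UTC, should agree with the epoch field
deploy = datetime.strptime(row["deploy_time/ascii_s"], "%Y-%m-%dT%H:%M:%S")
epoch_from_ascii = calendar.timegm(deploy.timetuple())
print(table, row["das/serial_number_s"],
      epoch_from_ascii == int(row["deploy_time/epoch_l"]))
```

All values are kept as strings, so numeric fields such as `deploy_time/epoch_l` need an explicit conversion before comparison.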
metadatatoph5 can also be used as a module in your own applications to load an ObsPy Inventory object into PH5. An example of how to do this will be provided here soon.
Using the obspytoph5 package in your own software allows you to load ObsPy streams into PH5, provided the metadata already exists in PH5. The simple example below creates an ObsPy trace from a text file and loads it into PH5. In this example our data is just a text file, data.txt, of random data with one sample per line.
from ph5.utilities import obspytoph5
from ph5.core import experiment
import numpy as np
from obspy.core import Trace, UTCDateTime
# first let's read our data file into a numpy array
# (one integer sample per line)
with open('data.txt') as fh:
    data = [int(line.strip()) for line in fh]
data = np.array(data, dtype='int32')
# now create an obspy trace
# we could load the stationxml and get the metadata from there
# but that's a different tutorial
# for now let's do it by hand
trace = Trace()
trace.stats.network = 'XX'
trace.stats.station = '12345'
trace.stats.location = '01'
trace.stats.channel = 'DPZ'
trace.stats.starttime = UTCDateTime('2019-03-14T05:00:00.000000Z')
trace.stats.sampling_rate = 250.0
trace.data = data
# now we have our data let's set up PH5
# open ph5 for editing
path = '.'
ph5_object = experiment.ExperimentGroup(
    nickname='master.ph5',
    currentpath=path)
ph5_object.ph5open(True)
ph5_object.initgroup()
# let's create an obspytoph5 instance
# we want a single mini file and start at mini file 1
obs = obspytoph5.ObspytoPH5(
    ph5_object,
    path,
    num_mini=1,
    first_mini=1)
# turn on verbose logging so we can see more info
obs.verbose = True
# we give it our trace and should get a message
# back saying done, as well as an index table to be loaded
message, index_t = obs.toph5((trace, 'Trace'))
# now load our index table
for entry in index_t:
    ph5_object.ph5_g_receivers.populateIndex_t(entry)
# the last thing we need to do after loading
# all our data is to update external references
# this takes all the mini files and adds their
# references to the master so we can find the data
obs.update_external_references(index_t)
# be nice and close the file
ph5_object.ph5close()
Running this code we should see an output similar to:
python custom_data_example.py
[2019-03-14 19:30:21,790] - ph5.utilities.obspytoph5 - INFO: Processing 1 traces in stream for XX.12345.01.DPZ | 2019-03-14T05:00:00.000000Z - 2019-03-14T05:00:00.996000Z | 250.0 Hz, 250 samples
[2019-03-14 19:30:21,790] - ph5.utilities.obspytoph5 - INFO: Processing trace DPZ in 12345
[2019-03-14 19:30:21,822] - ph5.utilities.obspytoph5 - INFO: Finished processing XX.12345.01.DPZ | 2019-03-14T05:00:00.000000Z - 2019-03-14T05:00:00.996000Z | 250.0 Hz, 250 samples
[2019-03-14 19:30:21,836] - ph5.utilities.obspytoph5 - INFO: updating external references
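The sample count and end time in this log follow directly from the input: 250 samples at 250 Hz span 249 sample intervals, so the last sample falls 0.996 s after the start time. The sketch below writes a data.txt of the kind the script reads and reproduces that arithmetic. The value range and the fixed seed are arbitrary choices for illustration. Note that it overwrites any existing data.txt in the current directory.

```python
import random
from datetime import datetime, timedelta

SAMPLING_RATE = 250.0  # Hz, matches the SampleRate in the StationXML
NPTS = 250             # one second of data, as reported in the log

# write one integer sample per line, the layout the script reads
random.seed(0)  # fixed seed so the file is reproducible
samples = [random.randint(-1000, 1000) for _ in range(NPTS)]
with open("data.txt", "w") as fh:
    for sample in samples:
        fh.write("%d\n" % sample)

# the last sample falls (npts - 1) sample intervals after the first
start = datetime(2019, 3, 14, 5, 0, 0)
end = start + timedelta(seconds=(NPTS - 1) / SAMPLING_RATE)
print(end.isoformat())  # 2019-03-14T05:00:00.996000
```

The channel in the example StationXML is open for one minute (05:00:00 to 05:01:00), so this one-second trace fits inside the deployment window.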
We should now have two .ph5 files: master.ph5 and miniPH5_00001.ph5. The file master.ph5 contains the bulk of our metadata, while miniPH5_00001.ph5 contains our waveform data.
The last thing we need to do to make this PH5 archive work is to generate the overall experiment metadata. This can be done using the command experiment_t_gen. The only required information to fill out here is Net code and experiment_id. The network code is a SEED network code, and the experiment_id is an IRIS assembled data set ID in the form dd-dddd, where the first two digits are the year and the last four digits are the experiment number for the year. Future versions of metadatatoph5 and obspytoph5 will use the network information in the StationXML.
Save this file, then load it into PH5 with the following command:
keftoph5 -n master.ph5 -k <file_name_here>
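The dd-dddd form of the experiment_id described above is easy to get wrong by hand, so it can be worth checking before loading the file. This is a small sketch. The function `parse_experiment_id` is hypothetical and the ID `19-0042` is a made-up value for illustration, not a real assembled data set.

```python
import re

# two-digit year, a hyphen, then a four-digit experiment number
EXPERIMENT_ID_RE = re.compile(r"(\d{2})-(\d{4})")


def parse_experiment_id(experiment_id):
    """Split a dd-dddd experiment_id into (year, experiment_number).

    Hypothetical helper for illustration; not part of PH5.
    """
    match = EXPERIMENT_ID_RE.fullmatch(experiment_id)
    if match is None:
        raise ValueError("expected dd-dddd, got %r" % experiment_id)
    return int(match.group(1)), int(match.group(2))


print(parse_experiment_id("19-0042"))  # (19, 42)
```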
You should now have a PH5 archive containing your data. Let's output the data as miniSEED to check:
ph5toms -n master.ph5 -o miniseed/
ls miniseed/
XX.12345.01.DPZ.2019-03-14T050000.000000.ms