Add parsing for non 1-minute data to UO SRML parser #711
Changes from 4 commits
af07bd1
9d7f657
e8944f5
a1ca00b
c32c8b3
55be463
da2ce6b
de734c0
3e220ea
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -42,11 +42,11 @@ def read_srml(filename): | |
|
||
Notes | ||
----- | ||
The time index is shifted back one minute to account for 2400 hours, | ||
and to avoid time parsing errors on leap years. The returned data | ||
values should be understood to occur during the interval from the | ||
time of the row until the time of the next row. This is consistent | ||
with pandas' default labeling behavior. | ||
The time index is shifted back by one interval to account for the | ||
daily endtime of 2400, and to avoid time parsing errors on leap | ||
years. The returned data values should be understood to occur | ||
during the interval from the time of the row until the time of the | ||
next row. This is consistent with pandas' default labeling behavior. | ||
|
||
See SRML's `Archival Files`_ page for more information. | ||
|
||
|
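To illustrate why the Notes mention the daily end time of 2400, here is a minimal sketch that is not part of the PR. It uses the standard library's `strptime` as a stand-in for the parser's `pd.to_datetime` call, and `parses_as_time` is a hypothetical helper introduced only for this example.

```python
from datetime import datetime


def parses_as_time(hhmm):
    """Return True if a zero-padded HHMM string is a valid clock time."""
    try:
        datetime.strptime(hhmm, "%H%M")
    except ValueError:
        return False
    return True


# The last row of each day in the raw file is stamped 2400, which is not a
# valid hour/minute pair for strptime's %H%M format.
end_of_day_label = "2400"
# Shifting a one-minute file back by one interval turns that row into 2359.
shifted_label = "2359"

print(parses_as_time(end_of_day_label), parses_as_time(shifted_label))
```

After the shift, each row is labeled by the start of its interval, which matches the labeling described in the docstring.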
@@ -134,11 +134,17 @@ def format_index(df): | |
year = int(df.columns[1]) | ||
df_doy = df[df.columns[0]] | ||
# Times are expressed as integers from 1-2400, we convert to 0-2359 by | ||
# subtracting one and then correcting the minutes at each former hour. | |
df_time = df[df.columns[1]] - 1 | ||
fifty_nines = df_time % 100 == 99 | ||
times = df_time.where(~fifty_nines, df_time - 40) | ||
|
||
# subtracting the length of one interval and then correcting the times | |
Review comment: Add a comment like "e.g. the first two rows of hourly data are 100, 200, so interval length = 100".
||
# at each former hour. interval_length is determined by taking the | ||
# difference of the first two rows of the time column. | ||
interval_length = int(df[df.columns[1]][:2].diff()[1]) | ||
Review comment: This line would be easier to read as, e.g.,
||
df_time = df[df.columns[1]] - interval_length | ||
if interval_length == 100: | ||
# Hourly files do not require fixing the former hour timestamps. | ||
times = df_time | ||
else: | ||
old_hours = df_time % 100 == (100 - interval_length) | ||
Review comment: This needs a few comments.
||
times = df_time.where(~old_hours, df_time - 40) | ||
times = times.apply(lambda x: '{:04.0f}'.format(x)) | ||
wholmgren marked this conversation as resolved.
|
||
doy = df_doy.apply(lambda x: '{:03.0f}'.format(x)) | ||
dts = pd.to_datetime(str(year) + '-' + doy + '-' + times, | ||
|
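The arithmetic in this hunk can be traced with a small standalone sketch. It is not the PR's code: it works on plain integers rather than a pandas Series, and the function name `shift_to_interval_start` is an assumption made for illustration. It also shows why the hourly case needs its own branch.

```python
def shift_to_interval_start(end_times, interval_length):
    """Convert end-of-interval HHMM integers (1-2400) to start-of-interval
    zero-padded HHMM strings (0000-2359).

    interval_length uses the file's own units: 1, 5 or 15 for minute
    files and 100 for hourly files (the first two rows are 100, 200).
    """
    labels = []
    for end_time in end_times:
        shifted = end_time - interval_length
        # A value that was on the hour (e.g. 100 - 5 = 95) now has its last
        # two digits equal to 100 - interval_length. Removing 40 maps it back
        # into the 0-59 minute range (95 -> 55).
        # For hourly data 100 - interval_length is 0, which every shifted
        # value would match, so the correction is skipped entirely.
        if interval_length != 100 and shifted % 100 == 100 - interval_length:
            shifted -= 40
        labels.append("{:04d}".format(shifted))
    return labels


print(shift_to_interval_start([5, 100, 2400], 5))
print(shift_to_interval_start([100, 200, 2400], 100))
```

The same function covers 1-minute, 5-minute, 15-minute and hourly inputs, because only the values that were on the hour before the shift need the 40-unit correction.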
@@ -161,14 +167,30 @@ def read_srml_month_from_solardat(station, year, month, filetype='PO'): | |
month: int | ||
Month to request data for. | ||
filetype: string | ||
SRML file type to gather. 'RO' and 'PO' are the | ||
only minute resolution files. | ||
SRML file type to gather. See notes for explanation. | ||
|
||
Returns | ||
------- | ||
data: pd.DataFrame | ||
One month of data from SRML. | ||
|
||
Notes | ||
----- | ||
File types designate the time interval of a file and if it contains | ||
raw or processed data. For instance, `RO` designates raw, one minute | ||
data and `PO` designates processed one minute data. The availability | ||
of file types varies between sites. Below is a table of file types | ||
and their time intervals. See [1] for site information. | ||
|
||
============= ============ ================== | ||
time interval raw filetype processed filetype | ||
============= ============ ================== | ||
1 minute RO PO | ||
5 minute RF PF | ||
15 minute RQ PQ | ||
hourly RH PH | ||
============= ============ ================== | ||
|
||
References | ||
---------- | ||
[1] University of Oregon Solar Radiation Measurement Laboratory | ||
|
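The file type table in the Notes can be expressed as a small lookup. This is a sketch for illustration only: `FILETYPE_INTERVAL_MINUTES` and `describe_filetype` are hypothetical names that do not appear in the PR, and the values are transcribed from the table above.

```python
# Interval length in minutes for each SRML file type listed in the table.
FILETYPE_INTERVAL_MINUTES = {
    "RO": 1, "PO": 1,      # 1 minute
    "RF": 5, "PF": 5,      # 5 minute
    "RQ": 15, "PQ": 15,    # 15 minute
    "RH": 60, "PH": 60,    # hourly
}


def describe_filetype(filetype):
    """Return (is_processed, interval_minutes) for a two-letter file type."""
    if filetype not in FILETYPE_INTERVAL_MINUTES:
        raise ValueError("unknown SRML file type: {!r}".format(filetype))
    # The first letter distinguishes raw ('R') from processed ('P') files.
    is_processed = filetype[0] == "P"
    return is_processed, FILETYPE_INTERVAL_MINUTES[filetype]


print(describe_filetype("PO"))
print(describe_filetype("RH"))
```

As the docstring notes, not every site provides every file type, so a lookup like this only describes the naming scheme, not availability.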
Review comment: are labeled by the left endpoint of interval, and should be understood...