Skip to content

Commit 138ed0e

Browse files
authored
Merge pull request #870 from edx/robrap/code-owner-monitoring-phase-2
feat: phase 2 of code owner monitoring rollout
2 parents b02eb4c + a96686d commit 138ed0e

15 files changed

+289
-369
lines changed

CHANGELOG.rst

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -14,6 +14,19 @@ Change Log
1414
Unreleased
1515
~~~~~~~~~~
1616

17+
[6.1.0] - 2024-12-10
18+
~~~~~~~~~~~~~~~~~~~~
19+
Changed
20+
_______
21+
* Completes code owner monitoring updates, which drops owner theme and finalizes the code owner span tags. See doc and ADR updates for more details.
22+
23+
* The code_owner_theme_2 tag was dropped altogether.
24+
* The temporary suffix (_2) was removed from other span tags.
25+
* The code_owner (formerly code_owner_2) tag no longer includes the theme name.
26+
* The new name for the django setting is CODE_OWNER_TO_PATH_MAPPINGS (formerly CODE_OWNER_MAPPINGS).
27+
* The django setting CODE_OWNER_THEMES was dropped.
28+
* Updates the generate_code_owner_mappings.py script accordingly.
29+
1730
[6.0.0] - 2024-12-05
1831
~~~~~~~~~~~~~~~~~~~~
1932
Removed

edx_arch_experiments/__init__.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,4 +2,4 @@
22
A plugin to include applications under development by the architecture team at 2U.
33
"""
44

5-
__version__ = '6.0.0'
5+
__version__ = '6.1.0'
Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
Datadog Monitoring
22
###################
33

4-
When installed in the LMS as a plugin app, the ``datadog_monitoring`` app adds additional monitoring.
4+
When installed in the LMS as a plugin app, the ``datadog_monitoring`` app adds additional 2U-specific monitoring.
55

6-
This is where our code_owner_2 monitoring code lives, for example.
6+
This is where our code owner monitoring code lives, for example.

edx_arch_experiments/datadog_monitoring/code_owner/datadog.py

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
"""
22
Datadog span processor for celery span code owners.
33
"""
4-
from .utils import set_code_owner_attribute_from_module
4+
from .utils import set_code_owner_span_tags_from_module
55

66

77
class CeleryCodeOwnerSpanProcessor:
@@ -17,7 +17,7 @@ def on_span_start(self, span):
1717
# We can use this for celery spans, because the resource name is more predictable
1818
# and available from the start. For django requests, we'll instead continue to use
1919
# django middleware for setting code owner.
20-
set_code_owner_attribute_from_module(span.resource)
20+
set_code_owner_span_tags_from_module(span.resource)
2121

2222
def on_span_finish(self, span):
2323
pass
Lines changed: 42 additions & 141 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
"""
2-
Utilities for monitoring code_owner_2
2+
Utilities for monitoring code_owner
33
"""
44
import logging
55

@@ -19,9 +19,6 @@ def get_code_owner_from_module(module):
1919
this lookup would match on 'openedx.features.discounts' before
2020
'openedx.features', because the former is more specific.
2121
22-
See how to:
23-
https://github.com/openedx/edx-django-utils/blob/master/edx_django_utils/monitoring/docs/how_tos/add_code_owner_custom_attribute_to_an_ida.rst
24-
2522
"""
2623
if not module:
2724
return None
@@ -54,7 +51,7 @@ def is_code_owner_mappings_configured():
5451

5552
def get_code_owner_mappings():
5653
"""
57-
Returns the contents of the CODE_OWNER_MAPPINGS Django Setting, processed
54+
Returns the contents of the CODE_OWNER_TO_PATH_MAPPINGS Django Setting, processed
5855
for efficient lookup by path.
5956
6057
Returns:
@@ -81,12 +78,12 @@ def get_code_owner_mappings():
8178
# processed map. Worst case, it is processed more than once at start-up.
8279
path_to_code_owner_mapping = {}
8380

84-
# .. setting_name: CODE_OWNER_MAPPINGS
81+
# .. setting_name: CODE_OWNER_TO_PATH_MAPPINGS
8582
# .. setting_default: None
8683
# .. setting_description: Used for monitoring and reporting of ownership. Use a
8784
# dict with keys of code owner name and value as a list of dotted path
8885
# module names owned by the code owner.
89-
code_owner_mappings = getattr(settings, 'CODE_OWNER_MAPPINGS', None)
86+
code_owner_mappings = getattr(settings, 'CODE_OWNER_TO_PATH_MAPPINGS', None)
9087
if code_owner_mappings is None:
9188
return None
9289

@@ -97,72 +94,73 @@ def get_code_owner_mappings():
9794
path_to_code_owner_mapping[path] = code_owner
9895
except TypeError as e:
9996
log.exception(
100-
'Error processing CODE_OWNER_MAPPINGS. {}'.format(e) # pylint: disable=logging-format-interpolation
97+
'Error processing CODE_OWNER_TO_PATH_MAPPINGS. {}'.format(e) # pylint: disable=logging-format-interpolation
10198
)
10299
raise e
103100

104101
_PATH_TO_CODE_OWNER_MAPPINGS = path_to_code_owner_mapping
105102
return _PATH_TO_CODE_OWNER_MAPPINGS
106103

107104

108-
def set_code_owner_attribute_from_module(module):
105+
def set_code_owner_span_tags_from_module(module):
109106
"""
110-
Updates the code_owner_2 and code_owner_2_module custom attributes.
111-
112-
Celery tasks or other non-web functions do not use middleware, so we need
113-
an alternative way to set the code_owner_2 custom attribute.
114-
115-
Note: These settings will be overridden by the CodeOwnerMonitoringMiddleware.
116-
This method can't be used to override web functions at this time.
107+
Updates the code_owner and code_owner_module custom span tags.
117108
118109
Usage::
119110
120-
set_code_owner_2_attribute_from_module(__name__)
111+
set_code_owner_span_tags_from_module(__name__)
121112
122113
"""
123-
set_custom_attribute('code_owner_2_module', module)
114+
# .. custom_attribute_name: code_owner_module
115+
# .. custom_attribute_description: The module used to determine the code_owner. This can
116+
# be useful for debugging issues for missing code owner span tags.
117+
set_custom_attribute('code_owner_module', module)
124118
code_owner = get_code_owner_from_module(module)
125119

126120
if code_owner:
127-
set_code_owner_custom_attributes(code_owner)
121+
set_code_owner_custom_span_tags(code_owner)
128122

129123

130-
def set_code_owner_custom_attributes(code_owner):
124+
def set_code_owner_custom_span_tags(code_owner):
131125
"""
132-
Sets custom metrics for code_owner_2, code_owner_2_theme, and code_owner_2_squad
126+
Sets custom span tags for code_owner, and code_owner_squad
133127
"""
134128
if not code_owner: # pragma: no cover
135129
return
136-
set_custom_attribute('code_owner_2', code_owner)
137-
theme = _get_theme_from_code_owner(code_owner)
138-
if theme:
139-
set_custom_attribute('code_owner_2_theme', theme)
140-
squad = _get_squad_from_code_owner(code_owner)
141-
if squad:
142-
set_custom_attribute('code_owner_2_squad', squad)
143130

131+
# .. custom_attribute_name: code_owner
132+
# .. custom_attribute_description: The squad owner name for the tagged span.
133+
set_custom_attribute('code_owner', code_owner)
134+
# .. custom_attribute_name: code_owner_squad
135+
# .. custom_attribute_description: Deprecated code_owner_squad is now redundant
136+
# to the code_owner span tag.
137+
set_custom_attribute('code_owner_squad', code_owner)
138+
139+
# .. custom_attribute_name: code_owner_plugin
140+
# .. custom_attribute_description: This is a temporary span tag to roll out the
141+
# the switch from edx-django-utils to this plugin. If this span tag is True,
142+
# the plugin has added the above custom span tags (possibly in addition to
143+
# edx-django-utils). If the code_owner_theme span tag is also seen, then
144+
# edx-django-utils is also adding these span tags. If not, only the plugin is
145+
# creating these tags.
146+
set_custom_attribute('code_owner_plugin', True)
144147

145-
def set_code_owner_attribute(request):
148+
149+
def set_code_owner_span_tags_from_request(request):
146150
"""
147-
Sets the code_owner_2 custom attribute for the request.
151+
Sets the code_owner custom span tag for the request.
148152
"""
149-
code_owner = None
150153
module = _get_module_from_request(request)
151154
if module:
152-
code_owner = get_code_owner_from_module(module)
153-
154-
if code_owner:
155-
set_code_owner_custom_attributes(code_owner)
155+
set_code_owner_span_tags_from_module(module)
156156

157157

158158
def _get_module_from_request(request):
159159
"""
160160
Get the module from the request path or the current transaction.
161161
162162
Side-effects:
163-
Sets code_owner_2_module custom attribute, used to determine code_owner_2.
164-
If module was not found, may set code_owner_2_path_error custom attribute
165-
if applicable.
163+
Sets code_owner_path_error custom span tag if applicable.
166164
167165
Returns:
168166
str: module name or None if not found
@@ -172,14 +170,14 @@ def _get_module_from_request(request):
172170
return None
173171

174172
module, path_error = _get_module_from_request_path(request)
175-
if module:
176-
set_custom_attribute('code_owner_2_module', module)
177-
return module
178173

179-
# monitor errors if module was not found
180174
if path_error:
181-
set_custom_attribute('code_owner_2_path_error', path_error)
182-
return None
175+
# .. custom_attribute_name: code_owner_path_error
176+
# .. custom_attribute_description: Error details if the module can't be found. This can
177+
# be useful for debugging issues for missing code owner span tags.
178+
set_custom_attribute('code_owner_path_error', path_error)
179+
180+
return module
183181

184182

185183
def _get_module_from_request_path(request):
@@ -204,100 +202,3 @@ def clear_cached_mappings():
204202
"""
205203
global _PATH_TO_CODE_OWNER_MAPPINGS
206204
_PATH_TO_CODE_OWNER_MAPPINGS = None
207-
global _CODE_OWNER_TO_THEME_AND_SQUAD_MAPPINGS
208-
_CODE_OWNER_TO_THEME_AND_SQUAD_MAPPINGS = None
209-
210-
211-
# Cached lookup table for code owner theme and squad given a code owner.
212-
# - Although code owner is "theme-squad", a hyphen may also be in the theme or squad name, so this ensures we get both
213-
# correctly from config.
214-
# Do not access this directly, but instead use get_code_owner_theme_squad_mappings.
215-
_CODE_OWNER_TO_THEME_AND_SQUAD_MAPPINGS = None
216-
217-
218-
def get_code_owner_theme_squad_mappings():
219-
"""
220-
Returns the contents of the CODE_OWNER_THEMES Django Setting, processed
221-
for efficient lookup by path.
222-
223-
Returns:
224-
(dict): dict mapping code owners to a dict containing the squad and theme, or
225-
an empty dict if there are no configured mappings.
226-
227-
Example return value::
228-
229-
{
230-
'theme-x-team-red': {
231-
'theme': 'theme-x',
232-
'squad': 'team-red',
233-
},
234-
'theme-x-team-blue': {
235-
'theme': 'theme-x',
236-
'squad': 'team-blue',
237-
},
238-
}
239-
240-
"""
241-
global _CODE_OWNER_TO_THEME_AND_SQUAD_MAPPINGS
242-
243-
# Return cached processed mappings if already processed
244-
if _CODE_OWNER_TO_THEME_AND_SQUAD_MAPPINGS is not None:
245-
return _CODE_OWNER_TO_THEME_AND_SQUAD_MAPPINGS
246-
247-
# Uses temporary variable to build mappings to avoid multi-threading issue with a partially
248-
# processed map. Worst case, it is processed more than once at start-up.
249-
code_owner_to_theme_and_squad_mapping = {}
250-
251-
# .. setting_name: CODE_OWNER_THEMES
252-
# .. setting_default: None
253-
# .. setting_description: Used for monitoring and reporting of ownership. Use a
254-
# dict with keys of code owner themes and values as a list of code owner names
255-
# including theme and squad, separated with a hyphen.
256-
code_owner_themes = getattr(settings, 'CODE_OWNER_THEMES', {})
257-
258-
try:
259-
for theme in code_owner_themes:
260-
code_owner_list = code_owner_themes[theme]
261-
for code_owner in code_owner_list:
262-
squad = code_owner.split(theme + '-', 1)[1]
263-
code_owner_details = {
264-
'theme': theme,
265-
'squad': squad,
266-
}
267-
code_owner_to_theme_and_squad_mapping[code_owner] = code_owner_details
268-
except TypeError as e:
269-
log.exception(
270-
'Error processing CODE_OWNER_THEMES setting. {}'.format(e) # pylint: disable=logging-format-interpolation
271-
)
272-
raise e
273-
274-
_CODE_OWNER_TO_THEME_AND_SQUAD_MAPPINGS = code_owner_to_theme_and_squad_mapping
275-
return _CODE_OWNER_TO_THEME_AND_SQUAD_MAPPINGS
276-
277-
278-
def _get_theme_from_code_owner(code_owner):
279-
"""
280-
Returns theme for a code_owner (e.g. 'theme-my-squad' => 'theme')
281-
"""
282-
mappings = get_code_owner_theme_squad_mappings()
283-
if mappings is None: # pragma: no cover
284-
return None
285-
286-
if code_owner in mappings:
287-
return mappings[code_owner]['theme']
288-
289-
return None
290-
291-
292-
def _get_squad_from_code_owner(code_owner):
293-
"""
294-
Returns squad for a code_owner (e.g. 'theme-my-squad' => 'my-squad')
295-
"""
296-
mappings = get_code_owner_theme_squad_mappings()
297-
if mappings is None: # pragma: no cover
298-
return None
299-
300-
if code_owner in mappings:
301-
return mappings[code_owner]['squad']
302-
303-
return None
Lines changed: 54 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,54 @@
1+
Monitoring by Code Owner
2+
************************
3+
4+
Status
5+
======
6+
7+
Accepted
8+
9+
Context
10+
=======
11+
12+
It is difficult for different teams to have team-based on-calls rotations, alerting and monitoring for various parts of the edx-platform (specifically LMS).
13+
14+
* The original decision to add code_owner custom span tags (custom attributes) was documented in `edx-platform in 0001-monitoring-by-code-owner`_.
15+
* The decision to move the code for reuse across IDAs was documented in `edx-django-utils in 0001-monitoring-by-code-owner.rst`_.
16+
* The decision for how to implement code owner details for celery tasks was documented in `0003-code-owner-for-celery-tasks_, and was limited by New Relic's instrumentation.
17+
* The decision to break up the ``code_owner`` custom span tag (custom attribute) into ``code_owner_squad`` and ``code_owner_theme`` tags was documented in `0004-code-owner-theme-and-squad`_.
18+
19+
Some changes or clarifications since this time:
20+
21+
* It turns out that this functionality is only really useful for the edx-platform LMS. Our other services (IDAs) are small enough to keep to a single owner, or to solve monitoring issues in other ways.
22+
* It is likely that the ``code_owner`` code is only really needed by 2U.
23+
* 2U wants to drop owner themes from its code_owner custom span tag.
24+
* 2U has switched to Datadog, which has slightly different capabilities from New Relic.
25+
26+
* Note that Datadog has custom span tags, where New Relic has custom attributes to refer to its tagging capabilities.
27+
28+
.. _edx-platform in 0001-monitoring-by-code-owner: https://github.com/openedx/edx-platform/blob/f29e418264f374099930a5b1f5b8345c569892e9/lms/djangoapps/monitoring/docs/decisions/0001-monitoring-by-code-owner.rst
29+
.. _edx-django-utils in 0001-monitoring-by-code-owner.rst: https://github.com/openedx/edx-django-utils/blob/a1a1ec95d7c1d4767deb578748153c99c9562a04/edx_django_utils/monitoring/docs/decisions/0001-monitoring-by-code-owner.rst
30+
.. _0003-code-owner-for-celery-tasks: https://github.com/openedx/edx-django-utils/blob/a1a1ec95d7c1d4767deb578748153c99c9562a04/edx_django_utils/monitoring/docs/decisions/0003-code-owner-for-celery-tasks.rst
31+
.. _0004-code-owner-theme-and-squad: https://github.com/openedx/edx-django-utils/blob/a1a1ec95d7c1d4767deb578748153c99c9562a04/edx_django_utils/monitoring/docs/decisions/0004-code-owner-theme-and-squad.rst
32+
33+
Decision
34+
========
35+
36+
2U has moved its code owner monitoring implementation to the datadog_monitoring plugin.
37+
38+
* The owner theme name has been dropped from the ``code_owner`` custom span tag value in this new implementation.
39+
* The ``code_owner_theme`` span tag has been dropped altogether from this new implementation.
40+
* The now deprecated ``code_owner_squad`` span tag, which is redundant with the updated ``code_owner`` tag, will continue to be supported for backward compatibility.
41+
* A Datadog span processor was used to add the code owner span tags for celery tasks, so there is no longer a need for a special decorator on each celery task.
42+
* A new capability added to edx-django-utils to add `monitoring signals for plugins`_ is used to monitor Django requests.
43+
44+
Also, a new search script was implemented in this repository: `search_datadog.rst`_.
45+
46+
.. _monitoring signals for plugins: https://github.com/openedx/edx-django-utils/pull/467
47+
.. _search_datadog.rst: https://github.com/edx/edx-arch-experiments/blob/main/edx_arch_experiments/datadog_monitoring/scripts/datadog_search.py
48+
49+
Consequences
50+
============
51+
52+
- In addition to having greater flexibility to update these custom tags as-needed for 2U, we can also DEPR the code owner functionality in the Open edX codebase, where it is not likely to be needed.
53+
- Spreadsheet changes will no longer be required when a squad moves from one part of the organization to another (e.g. changes themes).
54+
- However, without including themes, it may take additional time to learn about a squad's place in the organization when seen in the code_owner span tag. For example, it will not be as immediately clear when dealing with an enterprise squad, unless you are familiar with all of the squad names.

0 commit comments

Comments
 (0)