Standardize Python Log Exporter JSON output to match canonical schema #715
Miqueasher wants to merge 10 commits into main
Review Comments
Thanks for the work on standardizing the console exporter output! Two items need to be fixed:
1. LogExportResult import missing fallback
The base class import has a try/except for LogRecordExporter vs LogExporter, but the return type import doesn't:
    from opentelemetry.sdk._logs.export import LogExportResult  # no fallback
Older SDK versions export LogRecordExportResult, not LogExportResult. This will crash at import time on those versions. Should add:
    try:
        from opentelemetry.sdk._logs.export import LogExportResult
    except ImportError:
        from opentelemetry.sdk._logs.export import LogRecordExportResult as LogExportResult
Same issue in the test file.
2. Boolean attribute values: "True" vs "true"
    attrs[k] = str(v)  # Python str(True) → "True"
Java's String.valueOf(true) produces "true". Since the goal is matching the canonical schema across all languages, this is a mismatch. The test asserts "True", which confirms the divergence.
Suggest: str(v).lower() if isinstance(v, bool) else str(v), or a more explicit type coercion.
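A minimal sketch of the more explicit coercion suggested above. The helper name attr_to_string is hypothetical and is not taken from the PR; it only illustrates checking for bool before falling back to str().

```python
def attr_to_string(value):
    """Render an attribute value as a string, lowercasing only real booleans."""
    # bool must be handled explicitly: str(True) is "True", not "true".
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


# Example attributes (made up for illustration).
attrs = {"retry": True, "count": 3, "name": "True"}
coerced = {k: attr_to_string(v) for k, v in attrs.items()}
print(coerced)
```

Only bool instances are lowercased, so a string attribute whose value happens to be "True" is passed through unchanged.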
    # ReadableLogRecord has to_json() directly; LogData has it on .log_record
    obj = data if hasattr(data, "to_json") else data.log_record
    formatted_json = obj.to_json()
    return re.sub(r"\s*([{}[\]:,])\s*", r"\1", formatted_json)
Low priority (pre-existing): fallback regex can corrupt JSON string values. The regex operates on raw JSON text without distinguishing structural characters from those inside string values. A log body containing colons, brackets, or commas would have surrounding whitespace stripped. Since this preserves pre-existing behavior and is now only in the fallback path, this is low priority but worth noting for future cleanup.
Consider using the JSON formatter here: https://github.com/open-telemetry/opentelemetry-python/blob/main/opentelemetry-sdk/src/opentelemetry/sdk/_logs/_internal/__init__.py#L214
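To illustrate the concern about the regex, here is a small self-contained sketch. The record contents are made up, and json.dumps(..., indent=4) stands in for the upstream to_json() output; the alternative shown (re-serializing with compact separators) is one possible cleanup, not necessarily what the PR or the linked formatter does.

```python
import json
import re

# Made-up record whose body contains structural characters with spaces.
record = {"body": "status : ok , items [ 1 ]", "severity": "INFO"}
pretty = json.dumps(record, indent=4)  # stand-in for the to_json() output

# Regex from the fallback path: it strips whitespace around structural
# characters, including the same characters inside string values.
via_regex = re.sub(r"\s*([{}[\]:,])\s*", r"\1", pretty)

# Parsing and re-serializing with compact separators leaves strings intact.
via_json = json.dumps(json.loads(pretty), separators=(",", ":"))

print(via_regex)
print(via_json)
```

The regex result is still valid JSON here, but the body string has lost its internal spacing, whereas the re-serialized version keeps it.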
The ADOT Lambda layers support OTEL_LOGS_EXPORTER=console, which writes
log records as compact JSON to stdout. Lambda's FluxPump reads stdout and
forwards to PLE. Previously, each language (Java, Python, JS, .NET) produced
different JSON output because Python and JS delegated to their upstream OTel
SDK's serialization methods, which each made independent formatting choices
(snake_case vs camelCase field names, 0x-prefixed trace IDs, different
timestamp formats, missing fields).
This PR aligns the Python exporter output with the canonical schema defined
by the Java implementation, ensuring consistent JSON structure across all
ADOT language SDKs.
Rewrites the CompactConsoleLogRecordExporter to build JSON directly from LogRecord fields instead of delegating to the upstream SDK's to_json() serialization, which produced a different schema than the Java reference.
Adds a try/except fallback to the original upstream SDK format if the new serialization fails, to avoid breaking existing customer infrastructure.
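The build-directly-then-fall-back pattern described above might look roughly like the sketch below. Everything in it is an assumption for illustration: FakeRecord is a stand-in rather than the SDK's record class, and the field names and trace ID formatting are placeholders, not the canonical schema.

```python
import json
from dataclasses import dataclass, field


@dataclass
class FakeRecord:
    """Hypothetical stand-in for an SDK log record (not the real class)."""
    body: object
    trace_id: int
    attributes: dict = field(default_factory=dict)

    def to_json(self):
        # Stand-in for the upstream serializer used as the fallback format.
        data = {"body": str(self.body), "trace_id": hex(self.trace_id)}
        return json.dumps(data, indent=4)


def serialize(record):
    """Build compact JSON directly; fall back to to_json() on failure."""
    try:
        # Placeholder field names, NOT the canonical schema.
        payload = {
            "body": record.body,
            "traceId": format(record.trace_id, "032x"),
            "attributes": dict(record.attributes),
        }
        return json.dumps(payload, separators=(",", ":"))
    except Exception:
        # Keep emitting the previous format rather than dropping the record.
        return json.dumps(json.loads(record.to_json()), separators=(",", ":"))


ok = serialize(FakeRecord("hello", 0xABC, {"k": "v"}))
fallback = serialize(FakeRecord({1, 2}, 0xABC))  # a set is not JSON-serializable
print(ok)
print(fallback)
```

In the second call the direct path raises TypeError, so the output comes from the stand-in to_json() instead.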
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.