@duncanista duncanista commented Aug 26, 2024

What?

  • Adds Universal Instrumentation to provide support for .NET, Go, Java, and Ruby runtimes.
  • Adds a few more enhanced metrics.

Motivation

We need it.

How?

Multiple PRs.

Notes

Read all the other PRs

duncanista and others added 27 commits November 15, 2024 14:52
* remove `hello_agent.rs`

in favor of a later agent

* create the `LifecycleListener`

agent in charge of listening to lambda-library/tracer events, moved the `hello_agent` handler here

* fmt
* decouple `hyper` from `trace_processor`

* add `handle_traces`

* fix tests

* removed unused import

* move `invocation_context` to `invocation::context` module

also added some more fields and refactored it

* add `new` and `get_sender_copy` to `trace_agent`

* add `get_canonical_resource_name` to `tags_provider`

* add `get_function_name` to `lambda::tags`

* add `MS_TO_NS` constant

* add `invocation::processor`

* update use of `invocation::context`

* make `lifecycle::listener` use `invocation::processor`

* use `invocation::processor` in `main.rs`

* move `MS_TO_NS` to `invocation::processor`

* remove unnecessary constant

* add `Box::new` back to `trace_agent`

* add some comments

* add unit tests for `context.rs`

* use `on_invocation_start`

* rename `lambda_library_detected` to `tracer_detected`

* fmt

* remove `current_request_id`

I think we don't need it

* add comment

* fmt
* add `thiserror` and `lazy_static`

* add Span/Trace `context`

* update `mod.rs`

* add `propagation` module

* add `propagation::Error`

* add interface for `carrier` and `HashMap` implementation

* add `text_map_propagator`

added `Datadog` and `Tracecontext` implementations

* update `LICENSE-3rdparty.yml`
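
The carrier interface and its `HashMap` implementation mentioned in the commits above could look roughly like the sketch below. The trait names `Extractor` and `Injector`, their method signatures, and the choice to lowercase keys are assumptions for illustration, not the PR's actual code.

```rust
use std::collections::HashMap;

/// Read side of a carrier (hypothetical trait; the PR's real interface may differ).
pub trait Extractor {
    fn get(&self, key: &str) -> Option<&str>;
    fn keys(&self) -> Vec<&str>;
}

/// Write side of a carrier (hypothetical trait).
pub trait Injector {
    fn set(&mut self, key: &str, value: String);
}

impl Extractor for HashMap<String, String> {
    fn get(&self, key: &str) -> Option<&str> {
        // Assumption: keys are stored lowercased so lookups are case-insensitive.
        let lowered = key.to_lowercase();
        HashMap::get(self, lowered.as_str()).map(String::as_str)
    }

    fn keys(&self) -> Vec<&str> {
        HashMap::keys(self).map(String::as_str).collect()
    }
}

impl Injector for HashMap<String, String> {
    fn set(&mut self, key: &str, value: String) {
        // Lowercase on insert to match the lookup above.
        self.insert(key.to_lowercase(), value);
    }
}
```

A propagator written against these traits can then read from or write to any carrier type, with a plain `HashMap<String, String>` as the simplest one.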
…TTP spans (#405)

* add `Trigger` trait for inferred spans

* add `ApiGatewayHttpEvent` trigger

* add `SpanInferrer`

* make `invocation::processor` use `SpanInferrer`

* send `aws_config` to `invocation::processor`

* use incoming payload for `invocation::processor` for span inferring

* add `api_gateway_http_event.json` for testing

* add `api_gateway_proxy_event.json` for testing

* fix: Convert tag hashmap to sorted vector of tags

* fix: fmt

---------

Co-authored-by: AJ Stuyvenberg <astuyve@gmail.com>
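
The "Convert tag hashmap to sorted vector of tags" fix above is worth a small illustration: a Rust `HashMap` iterates in an unspecified order, so serializing tags straight from the map gives unstable output. The sketch below assumes a `key:value` tag format and a hypothetical function name; the PR's actual helper is not shown in this conversation.

```rust
use std::collections::HashMap;

/// Turn a tag map into a deterministic, sorted list of `key:value` strings.
/// (Hypothetical helper; the name and output format are assumptions.)
pub fn tags_to_sorted_vec(tags: &HashMap<String, String>) -> Vec<String> {
    let mut out: Vec<String> = tags
        .iter()
        .map(|(key, value)| format!("{key}:{value}"))
        .collect();
    // Sorting removes the dependence on HashMap iteration order.
    out.sort();
    out
}
```

With a sorted vector, unit tests can compare against a fixed expected list instead of checking membership tag by tag.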
* feat: support APIGW v1

* feat: Tests for unparameterized payload working

* feat: parameterized test

* fix: specs

* fix: unwrap_or_default, route has no http verb but is parameterized.

* fix: lint

* fix: Remove debugs, consolidate import

* fix: oneline
* add `trace_propagation_style.rs`

* add Trace Propagation to `config.rs`

also updated unit tests; as we have custom behavior, we should check only the fields we care about in the tests

* add `links` to `SpanContext`

* add composite propagator

also known as our internal http propagator, but in reality, http doesn't make any sense to me; it's just a composite propagator which we use based on our configuration

* update `TextMapPropagator`s to comply with interface

also updated the naming

* fmt

* add unit testing for `config.rs`

* add `PartialEq` to `SpanContext`

* correct logic from `text_map_propagator.rs`

logic was wrong in some parts, this was discovered through unit tests

* add unit tests for `DatadogCompositePropagator`

also corrected some logic
* headers `HeaderMap` to `HashMap`

* add `Send` to propagators traits

* add `serde_json::Value` extractor + injector

* add `get_carrier` to `Trigger` trait

* add `get_carrier` method to current inferred spans

* update `span_inferrer.rs` to use `get_carrier` methods for distributed tracing

* add `headers_to_map` function

* reparent spans

I suspect there might be something wrong here; the code in Go is quite convoluted

* make some variables public

* fix to return early on `extract_span_context`

* fix how 128 bit is handled

also updated some variable names

* update comment
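
The commit above does not say what the 128-bit fix was. As background, Datadog-style propagation carries the lower 64 bits of a trace ID in `x-datadog-trace-id` and the upper 64 bits as 16 lowercase hex characters in the `_dd.p.tid` propagation tag. The sketch below shows that split and join under those assumptions; the function names are hypothetical and this is not the PR's code.

```rust
/// Split a 128-bit trace ID into its lower 64 bits and, when non-zero,
/// the upper 64 bits as a 16-character lowercase hex string.
/// (Hypothetical helper for illustration.)
pub fn split_trace_id(trace_id: u128) -> (u64, Option<String>) {
    let low = trace_id as u64; // truncation keeps the lower 64 bits
    let high = (trace_id >> 64) as u64;
    let tid = if high == 0 {
        None
    } else {
        Some(format!("{high:016x}"))
    };
    (low, tid)
}

/// Rebuild the 128-bit trace ID; returns `None` if the hex part is malformed.
pub fn join_trace_id(low: u64, tid: Option<&str>) -> Option<u128> {
    let high = match tid {
        Some(hex) => u64::from_str_radix(hex, 16).ok()?,
        None => 0,
    };
    Some((u128::from(high) << 64) | u128::from(low))
}
```

Keeping the two halves separate lets a 64-bit-only tracer keep working with the lower half while newer tracers recover the full ID.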
* send network enhanced metrics

* naming fixes

* reformatting reading data from proc
* use `get_tags` from `Trigger` trait

* remove unneeded comment

* add trigger tags to invocation span
* create context on invoke event

* update tests

* clippy fixes

* remove `allow(clippy::ptr_arg)`

---------

Co-authored-by: jordan gonzález <30836115+duncanista@users.noreply.github.com>
* send cpu metrics

* clippy fixes

* fixes

* set utilization metrics before flushing & format fixes

* added comment to explain utilization metrics calculation timing

* use nix instead of libc to get system clock

* update LICENSE-3rdparty.yml

* added comments to explain calculations

* clippy
* wip: sqs

* feat: sqs tests

* invert duration check

* remove duration set

* fmt and add `test_get_arn`

* remove unneeded reference

* remove unneeded comments

* add `get_carrier` implementation for `SqsRecord`

* add trace context to `sqs_event.json`

* fix: resource_names is not needed

* fix: don't deserialize body

* avoid `use super::...`

* fix unit tests

* set carrier and trigger tags

* remove duplicate tag

* fmt

* pass headers to `on_invocation_end`

* infer first, then extract

or else there's nothing to extract; also reset values for the next infer, no need to keep state after we complete

* reset values on every infer

* move some constants

* add missing trigger tags

* missed one case

---------

Co-authored-by: AJ Stuyvenberg <astuyve@gmail.com>
* filter http, tcp and local spans

* fix original condition

* update comments

* reuse constants and match style

* Update bottlecap/src/traces/mod.rs

* Update bottlecap/src/traces/mod.rs

* Update bottlecap/src/traces/mod.rs

* Update bottlecap/src/traces/mod.rs

---------

Co-authored-by: jordan gonzález <30836115+duncanista@users.noreply.github.com>
* wip: sqs

* feat: sqs tests

* invert duration check

* remove duration set

* fmt and add `test_get_arn`

* remove unneeded reference

* remove unneeded comments

* add `get_carrier` implementation for `SqsRecord`

* add trace context to `sqs_event.json`

* fix: resource_names is not needed

* fix: don't deserialize body

* avoid `use super::...`

* fix unit tests

* set carrier and trigger tags

* remove duplicate tag

* fmt

* pass headers to `on_invocation_end`

* infer first, then extract

or else there's nothing to extract; also reset values for the next infer, no need to keep state after we complete

* reset values on every infer

* add `sns_event.rs`

* add `sns_event*.json` payloads

* add `base64_to_string` method

and also move some variables

* surrender resource

* use `SnsRecord` for inferred spans

* move some constants

* add missing trigger tags

* missed one case

* update unit tests

* update `tt` to `t.get_tags()`

* fmt

* typo

---------

Co-authored-by: AJ Stuyvenberg <astuyve@gmail.com>
…spans (#440)

* wip: sqs

* feat: sqs tests

* invert duration check

* remove duration set

* fmt and add `test_get_arn`

* remove unneeded reference

* remove unneeded comments

* add `get_carrier` implementation for `SqsRecord`

* add trace context to `sqs_event.json`

* fix: resource_names is not needed

* fix: don't deserialize body

* avoid `use super::...`

* fix unit tests

* set carrier and trigger tags

* remove duplicate tag

* fmt

* pass headers to `on_invocation_end`

* infer first, then extract

or else there's nothing to extract; also reset values for the next infer, no need to keep state after we complete

* reset values on every infer

* add `sns_event.rs`

* add `sns_event*.json` payloads

* add `base64_to_string` method

and also move some variables

* surrender resource

* use `SnsRecord` for inferred spans

* move some constants

* add missing trigger tags

* missed one case

* update unit tests

* update `tt` to `t.get_tags()`

* fmt

* typo

* update tags

* SQS event can contain SNS carrier

* make some `Trigger` methods to be `Sized`

* add `sns_sqs_event.json`

also update path

* account for wrapped inferred span in processor

* simplify code in `span_inferrer.rs`

* remove duplicated condition

---------

Co-authored-by: AJ Stuyvenberg <astuyve@gmail.com>
* add `S3Event`

* add `s3_event.json`

* add `S3Record` into `span_inferrer.rs`
* add `S_TO_NS`

* add `DynamoDbEvent`

* use `DynamoDbEvent` in `SpanInferrer`

* update to parse `approximate_creation_date_time` as `f64`
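
The DynamoDB commits above add `S_TO_NS` and parse `approximate_creation_date_time` as `f64`, which suggests a seconds-to-nanoseconds conversion for the span start time. A minimal sketch of that arithmetic follows; the constant's type, the function name, and the `i64` result are assumptions rather than the PR's definitions.

```rust
/// Seconds-to-nanoseconds factor (assumed here to be an `f64`).
pub const S_TO_NS: f64 = 1_000_000_000.0;

/// Convert a timestamp in (possibly fractional) seconds to whole nanoseconds.
/// (Hypothetical helper; the float-to-int cast truncates toward zero.)
pub fn seconds_to_ns(seconds: f64) -> i64 {
    (seconds * S_TO_NS) as i64
}
```

Parsing the field as `f64` keeps any fractional seconds in the event payload, which an integer parse would reject or drop.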
* add eventbridge event

* fix test path

* add comments with code ref and fix metadata api-gateway

* fix error message

* clean import

* make build faster using host network

* fix conflicts and tests

* fix test conflicts

* resolve merge conflicts

* minor changes

* add missing unit test

* update events for testing

* account for millisecond resolution and resource name

* fix unit tests

* remove `network` tag for runners

---------

Co-authored-by: jordan gonzález <30836115+duncanista@users.noreply.github.com>
…sor` (#446)

* move `EnhancedMetrics` to live in `InvocationProcessor`

* rename field to `enhanced_metrics_enabled`
* move `base64_to_string` to `lifecycle::invocation` module

* set error on span from headers

checks the headers to identify errors that should be attached to the invocation span and the inferred span

* increment metrics on error

* fmt

* remove a todo
* add kinesis

* Update bottlecap/src/lifecycle/invocation/triggers/kinesis_event.rs

Co-authored-by: jordan gonzález <30836115+duncanista@users.noreply.github.com>

* group up import and address comments

* fix timestamp, it's in seconds

* fix clippy

* deserialized carrier

* remove manual deref and resourcename meta tag since it is not used

---------

Co-authored-by: jordan gonzález <30836115+duncanista@users.noreply.github.com>
* generate tmp enhanced metrics

* fix channel stop signal

* use tokio async task instead of thread

* statfs fix

* fixes

* remove unused import

* rename tmp_chan to tmp_chan_tx
alexgallotta and others added 8 commits November 15, 2024 14:52
* add step functions events payloads

* make some methods public

* add `StepFunctionEvent`

* adapt `SpanInferrer` for generated `SpanContext`

* adapt `InvocationProcessor` for generated `SpanContext`

* resolve merge conflicts

* resolve clippy issues

* add allow clippy

* do not serialize the `entered_time`

* set `None` for inferred span when `generated_span_context` exists

* tidy code for last trace context update

* fix unit test
* add `LambdaFunctionUrlEvent`

* fmt

* update span inferrer
* add `tag_span_from_value`

* add `capture_lambda_payload` config

* add unit testing for `tag_span_from_value`

* update listener `end_invocation_handler`

parsing should not be handled here

* add capture lambda payload feature

also parse body properly, and handle `statusCode`
#453)

* add fd and threads enhanced metrics

* clippy fixes

* fixes

* rename var
* add some helper functions to `invocation::lifecycle` mod

* create cold start span on processor

* move `generate_span_id` to parent module

* send `platform_init_start` data to processor

* send `PlatformInitStart` to main bus

* update cold start `parent_id`

* fix start time of cold start span

* enhanced metrics now have a `dynamic_value_tags` for tags which we have to calculate at points in time

* `AwsConfig` now has a `sandbox_init_time` value

* add `is_empty` to `ContextBuffer`

* calculate init tags on invoke

also add a method to reset processor invocation state

* restart init tags on set

* set tags properly for proactive init

* fix unit test

* remove debug line

* make sure `cold_start` tag is only set in one place
* add some helper functions to `invocation::lifecycle` mod

* create cold start span on processor

* move `generate_span_id` to parent module

* send `platform_init_start` data to processor

* send `PlatformInitStart` to main bus

* update cold start `parent_id`

* fix start time of cold start span

* enhanced metrics now have a `dynamic_value_tags` for tags which we have to calculate at points in time

* `AwsConfig` now has a `sandbox_init_time` value

* add `is_empty` to `ContextBuffer`

* calculate init tags on invoke

also add a method to reset processor invocation state

* restart init tags on set

* set tags properly for proactive init

* fix unit test

* remove debug line

* make sure `cold_start` tag is only set in one place

* add service mapping config serializer

* add `service_mapping.rs`

* add `ServiceNameResolver` interface

for service mapping

* implement interface in every trigger

* send `service_mapping` lookup table to span enricher

* create `SpanInferrer` with `service_mapping` config

* fmt
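
The service mapping commits above describe a config serializer that produces a lookup table, plus a resolver each trigger implements. A rough sketch of the idea, assuming the comma-separated `key:value` form used by `DD_SERVICE_MAPPING`, is below. The function names and the skip-invalid-entries behavior are assumptions, and `lambda_sqs` in the usage note is only an illustrative key.

```rust
use std::collections::HashMap;

/// Parse `"from:to,from2:to2"` into a lookup table, skipping malformed pairs.
/// (Hypothetical helper; the PR's deserializer may behave differently.)
pub fn parse_service_mapping(raw: &str) -> HashMap<String, String> {
    raw.split(',')
        .filter_map(|pair| {
            let (from, to) = pair.split_once(':')?;
            let (from, to) = (from.trim(), to.trim());
            if from.is_empty() || to.is_empty() {
                return None;
            }
            Some((from.to_string(), to.to_string()))
        })
        .collect()
}

/// Return the mapped service name, or the default when no mapping exists.
pub fn resolve_service_name<'a>(
    mapping: &'a HashMap<String, String>,
    default: &'a str,
) -> &'a str {
    mapping.get(default).map_or(default, String::as_str)
}
```

For example, with the mapping `lambda_sqs:my-queue`, an inferred span whose default service is `lambda_sqs` would be reported as `my-queue`, while unmapped services keep their default names.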
@duncanista duncanista force-pushed the jordan.gonzalez/bottlecap/universal-instrumentation branch from dca2dd1 to b52e738 Compare November 15, 2024 19:55
duncanista and others added 6 commits November 15, 2024 15:31
* add aws trace header for java and sqs

* fix priority sampling

* remove clippy warnings

* fix: do not skip inferred spans with aws headers

* make clippy happy

* add comment for 64 bits trace id

* fix clippy warnings

* fix import and tests

* format

* remove dead code
@duncanista duncanista marked this pull request as ready for review November 19, 2024 20:08
@duncanista duncanista requested a review from a team as a code owner November 19, 2024 20:08
@duncanista duncanista merged commit 320e100 into main Nov 19, 2024
24 checks passed
@duncanista duncanista deleted the jordan.gonzalez/bottlecap/universal-instrumentation branch November 19, 2024 20:23
5 participants