Tags: DataDog/dd-trace-dotnet

v3.29.0

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
[OTEL] Vendoring OtlpGrpcExportClient And Enabling OTLP Metrics gRPC Tests (#7666)

## Summary of changes

Added gRPC protocol support to the OTLP metrics exporter by vendoring
OpenTelemetry's gRPC transport client. The exporter now supports the
`grpc` protocol, which is also now the default protocol value.

## Reason for change

Customers need gRPC support for OTLP metrics export to maintain
compatibility with OpenTelemetry ecosystem standards and their existing
infrastructure that may require or prefer gRPC transport.

## Implementation details

**Vendored dependencies:**
- Vendored the gRPC transport client (`OtlpGrpcExportClient` and related
  helpers) from `OpenTelemetry.Exporter.OpenTelemetryProtocol` v1.13.1
  (the latest version at the time of writing, with the included fixes)
- Created `OpenTelemetryStubs.cs` with minimal stub implementations to
  avoid vendoring the entire OpenTelemetry SDK:
  - `Guard` and `UriExtensions` from `OpenTelemetry.Internal` (their
    deletion must be discarded after running the vendoring command)
  - `OtlpExporterOptions` (minimal configuration class)
  - `OtlpExportProtocol` enum (kept in the stub due to namespace issues
    with the original being in the parent namespace
    `OpenTelemetry.Exporter`)
  - `OpenTelemetryProtocolExporterEventSource` (stub EventSource for
    logging)

**Core changes:**
- Updated `OtlpExporter.cs` to instantiate and use the vendored
`OtlpGrpcExportClient` when protocol is `Grpc`
- Added 5-byte gRPC message frame prefix (compression flag + message
length) to protobuf payloads for gRPC transport
- Modified `OtlpMetricsSerializer.cs` to optionally reserve bytes at the
start of the buffer for the gRPC frame header
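
As an illustration of the framing described above, here is a minimal Python sketch (not the tracer's C# code). It relies on the standard gRPC length-prefixed message layout: a 1-byte compression flag followed by a 4-byte big-endian message length. The function names and the stand-in payload are hypothetical and chosen only for this example.

```python
import struct

GRPC_HEADER_SIZE = 5  # 1-byte compression flag + 4-byte big-endian length


def frame_grpc_message(payload: bytes, compressed: bool = False) -> bytes:
    """Prefix a serialized protobuf message with the gRPC frame header."""
    return struct.pack(">BI", 1 if compressed else 0, len(payload)) + payload


def serialize_with_reserved_header(write_body) -> bytearray:
    """Reserve the header bytes up front, write the body, then backfill the header."""
    buffer = bytearray(GRPC_HEADER_SIZE)  # placeholder for the frame header
    write_body(buffer)                    # the serializer appends the protobuf bytes
    body_length = len(buffer) - GRPC_HEADER_SIZE
    struct.pack_into(">BI", buffer, 0, 0, body_length)  # uncompressed flag + length
    return buffer


payload = b"\x0a\x03abc"  # stand-in bytes, not a real OTLP metrics message
framed = frame_grpc_message(payload)
reserved = serialize_with_reserved_header(lambda buf: buf.extend(payload))
```

Reserving the header at the start of the buffer avoids copying the serialized payload into a second, larger buffer just to prepend five bytes.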

**Vendoring infrastructure:**
- Added `AddOpenTelemetryUsings` transform to inject common `using`
directives needed by vendored OTel files
- Configured exclusions to vendor only the gRPC transport client, not
the full OTLP exporter

## Test coverage

- Updated `OpenTelemetrySdkTests.SubmitsOtlpMetrics` to test both
`http/protobuf` and `grpc` protocols
- gRPC tests use the `dd-apm-test-agent` container
- All tests validate metrics are correctly exported and received by the
respective agents

## DLL File Size Difference

After building both the `master` branch and this branch locally and
comparing file sizes, I got:

| Target Framework | Master (bytes) | Feature (bytes) | Difference (bytes) | Difference (%) | Impact |
|------------------|----------------|-----------------|--------------------|----------------|--------|
| net6.0 | 8,252,416 | 8,270,336 | **+17,920** | +0.22% | +17.5 KB |
| netcoreapp3.1 | 8,173,056 | 8,190,976 | **+17,920** | +0.22% | +17.5 KB |
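
The difference columns follow directly from the byte counts. A small Python sketch of that arithmetic, where the helper name `size_delta` is hypothetical and 1 KB is taken as 1,024 bytes:

```python
def size_delta(master_bytes: int, feature_bytes: int):
    """Return (difference in bytes, difference in percent, difference in KB)."""
    diff = feature_bytes - master_bytes
    percent = round(diff / master_bytes * 100, 2)
    return diff, percent, diff / 1024


net6 = size_delta(8_252_416, 8_270_336)
netcoreapp31 = size_delta(8_173_056, 8_190_976)
```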

## Other Details
[APMAPI-1679](https://datadoghq.atlassian.net/browse/APMAPI-1679)

v3.28.0

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Fix for sending duplicate logs when using Agentless Logging in Azure Function host (#7383)

## Summary of changes

Disables agentless logging when we detect that we are instrumenting an
Azure Functions host process. Previously, the host process would
duplicate every log sent from the worker process.

This bug fix can be reverted by setting the new configuration key
`DD_LOGS_DIRECT_SUBMISSION_AZURE_FUNCTIONS_HOST_ENABLED` to `true`.

## Reason for change

If customers are running with the isolated Azure Functions model, we
instrument two applications: the function host process and the worker
process. If they have direct log submission enabled, the function host
ends up duplicating the logs from the worker process, which results in
us shipping two nearly identical logs.

This behavior isn't ideal, as the duplicate log is unlikely to be
valuable, so we've disabled agentless logging in the Azure Functions
host process.

## Implementation details

Added `IsRunningInAzureFunctionsHost()` to `EnvironmentHelpers.cs` which
allows for a rough detection of whether we are running on the function
host using the following logic:

- Is `FUNCTIONS_WORKER_RUNTIME` present AND set to `dotnet-isolated`?
- Are both `--functions-worker-id` and `--workerId` absent from the
command line?

If both are true, we treat the process as the function host; otherwise
we are likely the worker process.
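
The two checks above can be sketched as follows. This is an illustrative Python version, not the C# in `EnvironmentHelpers.cs`: the function signatures, the exact-match comparison against `dotnet-isolated`, and the way the `true` override is parsed are all assumptions made for the example.

```python
def is_running_in_azure_functions_host(env: dict, command_line: str) -> bool:
    """Rough host detection: isolated worker runtime configured, no worker arguments."""
    if env.get("FUNCTIONS_WORKER_RUNTIME") != "dotnet-isolated":
        return False
    has_worker_args = (
        "--functions-worker-id" in command_line or "--workerId" in command_line
    )
    return not has_worker_args


def should_enable_direct_log_submission(env: dict, command_line: str) -> bool:
    """Hypothetical helper: skip the host process unless the override key is set."""
    override = env.get("DD_LOGS_DIRECT_SUBMISSION_AZURE_FUNCTIONS_HOST_ENABLED", "false")
    if override.lower() == "true":
        return True
    return not is_running_in_azure_functions_host(env, command_line)


env = {"FUNCTIONS_WORKER_RUNTIME": "dotnet-isolated"}
host_detected = is_running_in_azure_functions_host(env, "func.exe host start")
worker_detected = is_running_in_azure_functions_host(
    env, "Samples.dll --workerId abc --functions-worker-id abc"
)
```

Here `host_detected` is `True` and `worker_detected` is `False`, because only the second command line carries the worker arguments.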

I wasn't able to find a more robust way of checking, but in the various
log output that I looked at, `--functions-worker-id` and `--workerId`
always seemed to be passed to the worker process by the function host.

```
[2025-10-03T16:01:42.901Z] Reading functions metadata (Worker)
[2025-10-03T16:01:47.176Z] {
[2025-10-03T16:01:47.177Z]   "ProcessId": 71080,
[2025-10-03T16:01:47.178Z]   "RuntimeIdentifier": "win-x64",
[2025-10-03T16:01:47.179Z]   "WorkerVersion": "2.0.0.0",
[2025-10-03T16:01:47.180Z]   "ProductVersion": "2.0.0\u002Bd8b5fe998a8c92819b8ee41d2569d2525413e9c5",
[2025-10-03T16:01:47.181Z]   "FrameworkDescription": ".NET 9.0.9",
[2025-10-03T16:01:47.182Z]   "OSDescription": "Microsoft Windows 10.0.26100",
[2025-10-03T16:01:47.183Z]   "OSArchitecture": "X64",
[2025-10-03T16:01:47.184Z]   "CommandLine": "C:\\Users\\steven.bouwkamp\\source\\repos\\dd-trace-dotnet\\artifacts\\bin\\Samples.AzureFunctions.V4Isolated.AspNetCore\\debug_net9.0\\Samples.AzureFunctions.V4Isolated.AspNetCore.dll --host 127.0.0.1 --port 65401 --workerId e94d23fd-cd3c-4780-a3e3-4980d7b0f644 --requestId 6dba68ac-1954-466a-aeb4-9570cc9b12c2 --grpcMaxMessageLength 2147483647 --functions-uri http://127.0.0.1:65401/ --functions-worker-id e94d23fd-cd3c-4780-a3e3-4980d7b0f644 --functions-request-id 6dba68ac-1954-466a-aeb4-9570cc9b12c2 --functions-grpc-max-message-length 2147483647"
[2025-10-03T16:01:47.185Z] }
```

- Added `DD_LOGS_DIRECT_SUBMISSION_AZURE_FUNCTIONS_HOST_ENABLED`, which
defaults to `false` so that the duplicate logs are not sent.

## Test coverage

- Added a new test project and test for having host logs enabled /
disabled.
- I added a new test project instead of re-using an existing one
because, when I re-ran the Function application in our tests multiple
times, `func.exe` would fail to obtain a lock and would need to wait
some period of time to recover after each subsequent run of the same
application. I think this is because we end the `func.exe` process with
a `Kill()`. Making a new project wasn't ideal but was a quick and simple
workaround.

## Other details
Fixes SLES-2364

<!--  ⚠️ Note:

Where possible, please obtain 2 approvals prior to merging. Unless
CODEOWNERS specifies otherwise, for external teams it is typically best
to have one review from a team member, and one review from apm-dotnet.
Trivial changes do not require 2 reviews.

MergeQueue is NOT enabled in this repository. If you have write access
to the repo, the PR has 1-2 approvals (see above), and all of the
required checks have passed, you can use the Squash and Merge button to
merge the PR. If you don't have write access, or you need help, reach
out in the #apm-dotnet channel in Slack.
-->

v3.27.0

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Fix the `verify_app_trimming_changes_are_persisted` job (#7542)

## Summary of changes

Fixes the `verify_app_trimming_changes_are_persisted` job

## Reason for change

The trimming job is currently trying to build the native code, but
[we're getting this
error](https://github.com/DataDog/dd-trace-dotnet/actions/runs/17909695789/job/50921858072?pr=7287):

```
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\MSBuild\Microsoft\VC\v170\Microsoft.Cpp.WindowsSDK.targets(46,5):
error MSB8036: The Windows SDK version 10.0.19041.0 was not found.
Install the required version of Windows SDK or change the SDK version in the project property
pages or by right-clicking the solution and selecting "Retarget solution".
```

We only need to build the managed code to regenerate the trimming file,
so this PR switches the job to build only that.

## Implementation details

`BuildTracerHome` -> `BuildManagedTracerHome`

## Test coverage

If this PR passes, we're good

## Other details

Blocking CI in general

v3.26.3

Partially verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
We cannot verify signatures from co-authors, and some of the co-authors attributed to this commit require their commits to be signed.
Hotfix v3.26.3 - Update CI with Windows signing remediations (#7527)

## Summary of changes

This cherry-picks the commits related to resolving the issues that we
had with not correctly signing Windows artifacts.

## Reason for change

The remediations were only on `master`, so the hotfix build was still
not being signed. This resolves that.

## Implementation details

`git cherry-pick` the three commits

## Test coverage

If the build works and says that everything gets correctly signed again
then 👍

## Other details
<!-- Fixes #{issue} -->



---------

Co-authored-by: Andrew Lock <andrew.lock@datadoghq.com>
Co-authored-by: Zach Montoya <zach.montoya@datadoghq.com>
Co-authored-by: NachoEchevarria <53266532+NachoEchevarria@users.noreply.github.com>
Co-authored-by: Lucas Pimentel <lucas.pimentel@datadoghq.com>

v3.26.2

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Maximo/hotfix patch 3.26.2 (#7518)

## Summary of changes
Trying to release fixes for telemetry and Statsd:

- #7503
- #7507
- #7512

---------

Co-authored-by: Ganesh Jangir <ganesh.jangir@datadoghq.com>
Co-authored-by: Tony Redondo <tony.redondo@datadoghq.com>

v3.26.1

[Version Bump] 3.26.1

v3.26.0

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Fix GRPC IAST tests (#7485)

## Summary of changes

In a previous PR, we updated the [GRPC
sample](https://github.com/DataDog/dd-trace-dotnet/pull/7457/files#diff-a584760853efe2efa5b346a11c7a95486f1c9aa7700bcdfb97729eba34e23135).

This has affected the IAST tests that use it. Since the code ownership
was not set to ASM, the previous PR passed.
This PR:
* Updates the snapshots of the IAST tests. The location of the
vulnerability has changed (the line number changed, which changes the
hash of the vulnerability). This particular sample has debug
information, so line numbers are taken into account.
* Changes the ownership of the sample to include security.
* Deletes debug info in the test.

## Reason for change

## Implementation details

## Test coverage

## Other details
<!-- Fixes #{issue} -->



v3.25.0

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
[Tracer] fix: Re-use runtime metrics writer resources to limit memory growth (#7434)

## Summary of changes
Updates `TracerManagerFactory.CreateTracerManager` to pass and re-use
the previous `RuntimeMetricsWriter` if runtime metrics are enabled for
the new TracerManager. We still create a new `IDogStatsd` client every
time, so any updated Agent settings are adopted by the DogStatsd client,
but this could be further optimized at a later time to re-use the client
if none of its configuration (e.g. host/port/tags) has changed.

## Reason for change
We've observed a scenario where the number of `RuntimeEventListener`
instances continues to grow, consuming more and more memory. This
happens whenever new Dynamic Configuration settings are received by the
tracer and runtime metrics are enabled. This PR resolves the issue.

## Implementation details
When creating the new `TracerManager`, pass in the previous
`RuntimeMetricsWriter` instance and only update the `IDogStatsd` object
with new settings. This makes sure that we maintain only one
`RuntimeMetricsWriter` instance while getting up-to-date DogStatsD
settings throughout the application lifetime.
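
The re-use pattern can be modelled with a small Python sketch. The class names mirror the ones mentioned above, but everything else is an assumption made for illustration: the `update_statsd` method, the dictionary-based settings and client, and the factory signature are hypothetical and do not reflect the tracer's actual C# API.

```python
class RuntimeEventListener:
    """Stand-in for the listener whose instance count grew without the fix."""
    instances = 0

    def __init__(self):
        RuntimeEventListener.instances += 1


class RuntimeMetricsWriter:
    def __init__(self, statsd):
        self.statsd = statsd
        self.listener = RuntimeEventListener()  # one listener per writer

    def update_statsd(self, statsd):  # hypothetical method name
        self.statsd = statsd


class TracerManager:
    def __init__(self, settings, runtime_metrics):
        self.settings = settings
        self.runtime_metrics = runtime_metrics


def create_tracer_manager(settings, previous=None):
    if not settings.get("runtime_metrics_enabled", False):
        return TracerManager(settings, None)
    # A new client is still built every time so updated Agent settings are adopted.
    statsd = {"host": settings["agent_host"], "port": settings["statsd_port"]}
    writer = previous.runtime_metrics if previous is not None else None
    if writer is None:
        writer = RuntimeMetricsWriter(statsd)  # first manager: create the writer
    else:
        writer.update_statsd(statsd)           # later managers: re-use it
    return TracerManager(settings, writer)


first = create_tracer_manager(
    {"runtime_metrics_enabled": True, "agent_host": "localhost", "statsd_port": 8125}
)
second = create_tracer_manager(
    {"runtime_metrics_enabled": True, "agent_host": "agent", "statsd_port": 8125},
    previous=first,
)
```

After both calls, `RuntimeEventListener.instances` is still 1, while the shared writer holds the client built from the newer settings.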

## Test coverage
Adds a small unit test to confirm that the previous
`RuntimeMetricsWriter` is re-used. Additionally, local testing was done
to confirm that the number of
`Datadog.Trace.RuntimeMetrics.RuntimeEventListener` objects does not
grow when Dynamic Configuration is updated.

### Without the fix
After some number of Dynamic Configuration settings were made in the
Datadog UI, a dump was taken with the following analysis:
```
> dumpheap -type Datadog.Trace.RuntimeMetrics.RuntimeEventListener
         Address               MT           Size
    0159c3c7fe88     7ffe4e801810             88
    0159c442ed28     7ffe4e801810             88
    0159c44eac60     7ffe4e801810             88
    0159c4500f00     7ffe4e801810             88
    0159c454ea68     7ffe4e801810             88

Statistics:
          MT Count TotalSize Class Name
7ffe4e801810     5       440 Datadog.Trace.RuntimeMetrics.RuntimeEventListener
Total 5 objects, 440 bytes
```

Then after one additional Dynamic Configuration update was made in the
Datadog UI, a dump was taken with the following analysis:
```
> dumpheap -type Datadog.Trace.RuntimeMetrics.RuntimeEventListener
         Address               MT           Size
    0159c3c7fe88     7ffe4e801810             88
    0159c442ed28     7ffe4e801810             88
    0159c44eac60     7ffe4e801810             88
    0159c4500f00     7ffe4e801810             88
    0159c454ea68     7ffe4e801810             88
    0159c61fef48     7ffe4e801810             88

Statistics:
          MT Count TotalSize Class Name
7ffe4e801810     6       528 Datadog.Trace.RuntimeMetrics.RuntimeEventListener
Total 6 objects, 528 bytes
```

### With the fix
After some number of Dynamic Configuration settings were made in the
Datadog UI, a dump was taken with the following analysis:
```
> dumpheap -type Datadog.Trace.RuntimeMetrics.RuntimeEventListener
         Address               MT           Size
    01527107faf0     7ffe4e7acc00             88

Statistics:
          MT Count TotalSize Class Name
7ffe4e7acc00     1        88 Datadog.Trace.RuntimeMetrics.RuntimeEventListener
Total 1 objects, 88 bytes
```

Then after one additional Dynamic Configuration update was made in the
Datadog UI, a dump was taken with the following analysis:
```
> dumpheap -type Datadog.Trace.RuntimeMetrics.RuntimeEventListener
         Address               MT           Size
    01527107faf0     7ffe4e7acc00             88

Statistics:
          MT Count TotalSize Class Name
7ffe4e7acc00     1        88 Datadog.Trace.RuntimeMetrics.RuntimeEventListener
Total 1 objects, 88 bytes
```

## Other details
<!-- Fixes #{issue} -->



---------

Co-authored-by: Andrew Lock <andrew.lock@datadoghq.com>

v3.24.1

Verified

This commit was signed with the committer’s verified signature.
andrewlock Andrew Lock
Fix Windows SSI release

Make sure we publish the telemetry_forwarder.exe as part of the publish step, and pull this when creating the release

v3.24.0

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Removes exception throwing during shutdown of dynamic instrumentation (#7375)

## Summary of changes

Removes exception throwing in "success" path, even when DI is disabled

## Reason for change

#7304 introduced a bunch of changes, but the use of
`CancellationTokenSource` resulted in exceptions being thrown in the
"happy" shutdown path, which can cause crashes in buggy versions of .NET
(i.e. all of them, currently)

## Implementation details

Replace usages of `CancellationTokenSource` with
`TaskCompletionSource<bool>`
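
The general idea, signalling shutdown by completing an awaitable rather than by cancelling a wait, can be shown with a loose Python `asyncio` analogy. This is not the .NET code: `asyncio` tasks and futures only approximate `CancellationTokenSource` and `TaskCompletionSource<bool>`, and the function names and counters are hypothetical.

```python
import asyncio


async def run_until_cancelled(stats):
    """Cancellation-style shutdown: the wait is interrupted by an exception."""
    try:
        await asyncio.sleep(3600)
    except asyncio.CancelledError:
        stats["exceptions"] += 1  # raised even though shutdown was intentional


async def run_until_completed(stop, stats):
    """Completion-style shutdown: the wait returns when the future completes."""
    await stop
    stats["clean_exits"] += 1


async def main():
    stats = {"exceptions": 0, "clean_exits": 0}

    task = asyncio.create_task(run_until_cancelled(stats))
    await asyncio.sleep(0)  # let the task start waiting
    task.cancel()
    await task

    stop = asyncio.get_running_loop().create_future()
    task = asyncio.create_task(run_until_completed(stop, stats))
    await asyncio.sleep(0)  # let the task start waiting
    stop.set_result(True)
    await task
    return stats


stats = asyncio.run(main())
```

Both workers stop, but only the cancellation-style one goes through an exception on the way out, which is the kind of "happy path" exception this change is meant to avoid.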

## Test coverage

Covered by existing tests. I checked the execution tests to make sure
the exception count has gone back down.

## Other details

Discovered some additional issues that need to be addressed I think.
Most importantly, `SafeDisposal` looks like a band-aid due to unclear
lifetime management. We should refactor the code to not require it i.e.
`Disposing` types should be safe and should not throw (regardless of
whether we catch the exception).

Additionally, this looks like it does quite a lot of work even when DI
is disabled. I would suggest we refactor it so that it doesn't do a
bunch of work in cases where it's never enabled.