Skip to content

Conversation

@houseme
Copy link
Contributor

@houseme houseme commented Nov 7, 2025

Type of Change

  • New Feature
  • Bug Fix
  • Documentation
  • Performance Improvement
  • Test/CI
  • Refactor
  • Other:

Related Issues

#796 #802

Summary of Changes

This PR refactors the telemetry system in crates/obs/src/telemetry.rs and related utilities in crates/utils/src/envs.rs to improve modularity, fix bugs, and enhance functionality. The changes address three primary logging modes with prioritized rules, ensure proper resource lifecycle management, and add support for HTTP-based observability with sub-endpoints.

Key Changes:

  1. Environment Functions Ordering (crates/utils/src/envs.rs):

    • Reordered functions by data type size (8-bit → 16-bit → 32-bit → 64-bit → usize → String → bool).
    • Prioritized signed types before unsigned within the same bit size.
    • Added missing optional variants (e.g., get_env_opt_u16, get_env_opt_u32) for consistency.
  2. Telemetry Initialization (crates/obs/src/telemetry.rs):

    • Implemented three prioritized rules:
      • Rule 1 (Default): Stdout logging at error level with span tracing (unique identifiers via with_current_span and with_span_list).
      • Rule 2 (File Logging): Rolling file logs with size-based rotation and configurable retention, triggered by explicit log directory setting.
      • Rule 3 (Observability): HTTP-based OTLP export with sub-endpoints (trace, metric, log) falling back to a unified endpoint if not specified.
    • Fixed stdout logging bug where logs were not written post-initialization by retaining WorkerGuard in OtelGuard to keep async writer threads alive.
    • Enhanced observability with HTTP protocol, Zstd compression, and proper sampler configuration.
    • Updated OtelGuard to manage tracing_guard for stdout and flexi_logger_handles for file logging, ensuring RAII-compliant shutdown.
  3. Configuration and Dependencies (crates/obs/src/config.rs and Cargo.toml):

    • Refined OtelConfig extraction to handle sub-endpoints correctly.
    • Ensured dependencies like opentelemetry-otlp support HTTP features.

Motivation:

  • Addresses issues with stdout logging dropping after init, improving reliability.
  • Provides flexible observability setup for production environments.
  • Maintains backward compatibility while adding new features.

Testing:

  • Verified stdout logging retains output across spans.
  • Tested file rotation with size thresholds and retention limits.
  • Confirmed HTTP observability exports to sub-endpoints or unified fallback.
  • All existing tests pass; added unit tests for new functions.

Breaking Changes:

  • None; changes are additive and maintain existing API.

Related Issues:

  • Fixes logging drop issue in stdout mode.
  • Enhances support for multi-endpoint observability.

Checklist

  • I have read and followed the CONTRIBUTING.md guidelines
  • Passed make pre-commit
  • Added/updated necessary tests
  • Documentation updated (if needed)
  • CI/CD passed (if applicable)

Impact

  • Breaking change (compatibility)
  • Requires doc/config/deployment update
  • Other impact:

Additional Notes


Thank you for your contribution! Please ensure your PR follows the community standards (CODE_OF_CONDUCT.md) and sign the CLA if this is your first contribution.

- Reorder functions in envs.rs by type size (8-bit to 64-bit, signed before unsigned) and add missing variants like get_env_opt_u16.
- Optimize init_telemetry to support three modes: stdout logging (default error level with span tracing), file rolling logs (size-based with retention), and HTTP-based observability with sub-endpoints (trace, metric, log) falling back to unified endpoint.
- Fix stdout logging issue by retaining WorkerGuard in OtelGuard to prevent premature release of async writer threads.
- Enhance observability mode with HTTP protocol, compression, and proper resource management.
- Update OtelGuard to include tracing_guard for stdout and flexi_logger_handles for file logging.
- Improve error handling and configuration extraction in OtelConfig.
@houseme houseme requested a review from Copilot November 7, 2025 06:02
@github-actions
Copy link

github-actions bot commented Nov 7, 2025

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

OpenSSF Scorecard

Scorecard details
PackageVersionScoreDetails
cargo/aws-credential-types 1.2.9 UnknownUnknown
cargo/aws-runtime 1.5.14 UnknownUnknown
cargo/aws-sdk-s3 1.112.0 UnknownUnknown
cargo/aws-sdk-sso 1.89.0 UnknownUnknown
cargo/aws-sdk-ssooidc 1.91.0 UnknownUnknown
cargo/aws-sdk-sts 1.91.0 UnknownUnknown
cargo/aws-smithy-checksums 0.63.11 UnknownUnknown
cargo/convert_case 0.9.0 🟢 3
Details
CheckScoreReason
Token-Permissions⚠️ -1No tokens found
Packaging⚠️ -1packaging workflow not detected
Binary-Artifacts🟢 10no binaries found in the repo
Pinned-Dependencies⚠️ -1no dependencies found
Maintained⚠️ 00 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0
Code-Review⚠️ 0Found 1/30 approved changesets -- score normalized to 0
Dangerous-Workflow⚠️ -1no workflows found
CII-Best-Practices⚠️ 0no effort to earn an OpenSSF best practices badge detected
Security-Policy⚠️ 0security policy file not detected
Vulnerabilities🟢 100 existing vulnerabilities detected
Fuzzing⚠️ 0project is not fuzzed
License🟢 10license file detected
Signed-Releases⚠️ -1no releases found
Branch-Protection⚠️ 0branch protection not enabled on development/release branches
SAST⚠️ 0SAST tool is not run on all commits -- score normalized to 0
cargo/crc-fast 1.6.0 UnknownUnknown
cargo/crypto-common 0.2.0-rc.5 🟢 4.9
Details
CheckScoreReason
Security-Policy🟢 10security policy file detected
Code-Review⚠️ 2Found 5/24 approved changesets -- score normalized to 2
CII-Best-Practices⚠️ 0no effort to earn an OpenSSF best practices badge detected
Maintained🟢 1030 commit(s) and 22 issue activity found in the last 90 days -- score normalized to 10
Binary-Artifacts🟢 10no binaries found in the repo
Packaging⚠️ -1packaging workflow not detected
Token-Permissions⚠️ 0detected GitHub workflow tokens with excessive permissions
Dangerous-Workflow🟢 10no dangerous workflow patterns detected
License⚠️ 0license file not detected
Fuzzing⚠️ 0project is not fuzzed
Signed-Releases⚠️ -1no releases found
Pinned-Dependencies⚠️ 0dependency not pinned by hash detected -- score normalized to 0
Vulnerabilities🟢 100 existing vulnerabilities detected
Branch-Protection⚠️ 0branch protection not enabled on development/release branches
SAST⚠️ 0SAST tool is not run on all commits -- score normalized to 0
cargo/erased-serde 0.4.9 🟢 5.5
Details
CheckScoreReason
Dangerous-Workflow🟢 10no dangerous workflow patterns detected
Binary-Artifacts🟢 10no binaries found in the repo
Maintained🟢 1014 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 10
Code-Review⚠️ 0Found 0/27 approved changesets -- score normalized to 0
CII-Best-Practices⚠️ 0no effort to earn an OpenSSF best practices badge detected
Packaging⚠️ -1packaging workflow not detected
Token-Permissions🟢 10GitHub workflow tokens follow principle of least privilege
Pinned-Dependencies⚠️ 0dependency not pinned by hash detected -- score normalized to 0
Vulnerabilities🟢 100 existing vulnerabilities detected
Fuzzing⚠️ 0project is not fuzzed
License🟢 10license file detected
Signed-Releases⚠️ -1no releases found
Branch-Protection⚠️ 0branch protection not enabled on development/release branches
Security-Policy🟢 3security policy file detected
SAST⚠️ 0SAST tool is not run on all commits -- score normalized to 0
cargo/jsonwebtoken 10.2.0 🟢 4.9
Details
CheckScoreReason
Code-Review🟢 6Found 17/25 approved changesets -- score normalized to 6
Packaging⚠️ -1packaging workflow not detected
Dangerous-Workflow🟢 10no dangerous workflow patterns detected
Maintained🟢 1022 commit(s) and 14 issue activity found in the last 90 days -- score normalized to 10
Binary-Artifacts🟢 10no binaries found in the repo
Token-Permissions⚠️ 0detected GitHub workflow tokens with excessive permissions
Pinned-Dependencies⚠️ 0dependency not pinned by hash detected -- score normalized to 0
CII-Best-Practices⚠️ 0no effort to earn an OpenSSF best practices badge detected
Vulnerabilities🟢 100 existing vulnerabilities detected
Security-Policy⚠️ 0security policy file not detected
Fuzzing⚠️ 0project is not fuzzed
License🟢 10license file detected
Signed-Releases⚠️ -1no releases found
Branch-Protection⚠️ 0branch protection not enabled on development/release branches
SAST⚠️ 0SAST tool is not run on all commits -- score normalized to 0
cargo/quote 1.0.42 🟢 5.7
Details
CheckScoreReason
Code-Review⚠️ 2Found 4/20 approved changesets -- score normalized to 2
Dangerous-Workflow🟢 10no dangerous workflow patterns detected
Maintained🟢 1030 commit(s) and 1 issue activity found in the last 90 days -- score normalized to 10
Binary-Artifacts🟢 10no binaries found in the repo
Packaging⚠️ -1packaging workflow not detected
CII-Best-Practices⚠️ 0no effort to earn an OpenSSF best practices badge detected
Pinned-Dependencies⚠️ 0dependency not pinned by hash detected -- score normalized to 0
Token-Permissions🟢 10GitHub workflow tokens follow principle of least privilege
Vulnerabilities🟢 100 existing vulnerabilities detected
Fuzzing⚠️ 0project is not fuzzed
License🟢 10license file detected
Signed-Releases⚠️ -1no releases found
Branch-Protection⚠️ 0branch protection not enabled on development/release branches
Security-Policy🟢 3security policy file detected
SAST⚠️ 0SAST tool is not run on all commits -- score normalized to 0
cargo/rmcp 0.8.5 UnknownUnknown
cargo/rmcp-macros 0.8.5 UnknownUnknown
cargo/schemars 1.1.0 🟢 4.5
Details
CheckScoreReason
Maintained🟢 104 commit(s) and 8 issue activity found in the last 90 days -- score normalized to 10
Packaging⚠️ -1packaging workflow not detected
Dangerous-Workflow🟢 10no dangerous workflow patterns detected
Token-Permissions⚠️ 0detected GitHub workflow tokens with excessive permissions
Code-Review⚠️ 1Found 3/30 approved changesets -- score normalized to 1
Binary-Artifacts🟢 10no binaries found in the repo
CII-Best-Practices⚠️ 0no effort to earn an OpenSSF best practices badge detected
Security-Policy⚠️ 0security policy file not detected
License🟢 10license file detected
Fuzzing⚠️ 0project is not fuzzed
Signed-Releases⚠️ -1no releases found
Pinned-Dependencies⚠️ 0dependency not pinned by hash detected -- score normalized to 0
Vulnerabilities🟢 100 existing vulnerabilities detected
Branch-Protection⚠️ 0branch protection not enabled on development/release branches
SAST⚠️ 0SAST tool is not run on all commits -- score normalized to 0
cargo/schemars_derive 1.1.0 🟢 4.5
Details
CheckScoreReason
Maintained🟢 104 commit(s) and 8 issue activity found in the last 90 days -- score normalized to 10
Packaging⚠️ -1packaging workflow not detected
Dangerous-Workflow🟢 10no dangerous workflow patterns detected
Token-Permissions⚠️ 0detected GitHub workflow tokens with excessive permissions
Code-Review⚠️ 1Found 3/30 approved changesets -- score normalized to 1
Binary-Artifacts🟢 10no binaries found in the repo
CII-Best-Practices⚠️ 0no effort to earn an OpenSSF best practices badge detected
Security-Policy⚠️ 0security policy file not detected
License🟢 10license file detected
Fuzzing⚠️ 0project is not fuzzed
Signed-Releases⚠️ -1no releases found
Pinned-Dependencies⚠️ 0dependency not pinned by hash detected -- score normalized to 0
Vulnerabilities🟢 100 existing vulnerabilities detected
Branch-Protection⚠️ 0branch protection not enabled on development/release branches
SAST⚠️ 0SAST tool is not run on all commits -- score normalized to 0
cargo/syn 2.0.109 🟢 6.4
Details
CheckScoreReason
Maintained🟢 1030 commit(s) and 22 issue activity found in the last 90 days -- score normalized to 10
Code-Review🟢 3Found 6/19 approved changesets -- score normalized to 3
Token-Permissions🟢 10GitHub workflow tokens follow principle of least privilege
Dangerous-Workflow🟢 10no dangerous workflow patterns detected
CII-Best-Practices⚠️ 0no effort to earn an OpenSSF best practices badge detected
Packaging⚠️ -1packaging workflow not detected
Binary-Artifacts🟢 10no binaries found in the repo
Pinned-Dependencies⚠️ 0dependency not pinned by hash detected -- score normalized to 0
Vulnerabilities🟢 100 existing vulnerabilities detected
License🟢 10license file detected
Fuzzing🟢 10project is fuzzed
Signed-Releases⚠️ -1no releases found
Security-Policy🟢 3security policy file detected
Branch-Protection⚠️ 0branch protection not enabled on development/release branches
SAST⚠️ 0SAST tool is not run on all commits -- score normalized to 0
cargo/wildmatch 2.6.0 🟢 3.7
Details
CheckScoreReason
Code-Review⚠️ 1Found 4/22 approved changesets -- score normalized to 1
Packaging⚠️ -1packaging workflow not detected
Binary-Artifacts🟢 10no binaries found in the repo
Maintained⚠️ 23 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 2
Token-Permissions⚠️ 0detected GitHub workflow tokens with excessive permissions
Dangerous-Workflow🟢 10no dangerous workflow patterns detected
Pinned-Dependencies⚠️ 0dependency not pinned by hash detected -- score normalized to 0
CII-Best-Practices⚠️ 0no effort to earn an OpenSSF best practices badge detected
Vulnerabilities🟢 100 existing vulnerabilities detected
Fuzzing⚠️ 0project is not fuzzed
Security-Policy⚠️ 0security policy file not detected
License🟢 10license file detected
Signed-Releases⚠️ -1no releases found
Branch-Protection⚠️ 0branch protection not enabled on development/release branches
SAST⚠️ 0SAST tool is not run on all commits -- score normalized to 0

Scanned Files

  • Cargo.lock

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors the observability and telemetry infrastructure to support HTTP-based OpenTelemetry Protocol (OTLP) exporters with separate endpoint configuration for traces, metrics, and logs. The changes include reorganizing environment variable utilities, simplifying telemetry initialization logic, and updating Docker configurations for the observability stack.

Key Changes:

  • Migration from gRPC to HTTP-based OTLP exporters with Zstandard compression
  • Separate endpoint configuration support for traces, metrics, and logs
  • Refactored environment variable utilities with expanded type support (moved from sys/envs to root-level envs module)
  • Simplified telemetry initialization with three distinct modes: observability (HTTP export), file logging, and stdout logging
  • Enhanced audit system with OpenTelemetry metrics instrumentation

Reviewed Changes

Copilot reviewed 44 out of 45 changed files in this pull request and generated 18 comments.

Show a summary per file
File Description
scripts/run.sh Updated OTLP endpoint configuration examples with separate trace/metric/log endpoints and HTTP port (4318)
crates/utils/src/envs.rs New comprehensive environment variable utility functions supporting multiple numeric types and Option types
crates/utils/src/sys/mod.rs Removed envs module (relocated to root)
crates/utils/src/lib.rs Updated exports to use root-level envs module instead of sys/envs
crates/obs/src/telemetry.rs Major refactoring: split init into three modes (HTTP observability, file logging, stdout), added HTTP OTLP exporters with compression
crates/obs/src/config.rs Added trace/metric/log endpoint fields, simplified env var parsing using new utility functions
crates/obs/src/system/mod.rs Updated to use configurable system metrics interval from environment variables
crates/obs/src/metrics/*.rs Updated import paths from crate::metrics:: to crate::
crates/config/src/observability/mod.rs Added new endpoint environment variables and metrics configuration constants
crates/audit/src/observability.rs Added OpenTelemetry metrics instrumentation with counter/gauge/histogram support
Cargo.toml Changed opentelemetry-otlp features from gRPC to HTTP with zstd compression
.docker/observability/*.yml Updated Docker images to latest versions and enhanced Prometheus/Grafana configurations with OTLP support

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@houseme houseme requested a review from Copilot November 7, 2025 07:18
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 44 out of 45 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@houseme houseme requested a review from Copilot November 7, 2025 07:34
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 44 out of 45 changed files in this pull request and generated 11 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@houseme houseme requested a review from Copilot November 7, 2025 10:14
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 46 out of 47 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@houseme houseme merged commit 29056a7 into main Nov 7, 2025
16 checks passed
@houseme houseme deleted the feature/improve-audit-1106 branch November 7, 2025 12:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant