Add runtime evidence mode to agt verify #969

Merged
imran-siddique merged 8 commits into microsoft:main from electricwolfemarshmallowhypertext:wolfe/agt-verify-evidence
Apr 16, 2026

@electricwolfemarshmallowhypertext
Contributor

Adds an optional runtime evidence path to agt verify.

Current behavior stays unchanged:

  • agt verify
  • agt verify --badge
  • agt --json verify

New behavior:

  • agt verify --evidence ./agt-evidence.json
  • agt verify --evidence ./agt-evidence.json --strict

This lets the verifier check a deployment evidence manifest for:

  • loaded policy files
  • deny rule or deny-by-default semantics
  • registered tools
  • audit sink configuration
  • identity state
  • package/version manifest

--strict exits non-zero when runtime evidence is missing or weak.
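
As a rough illustration of the --strict behavior described above, the sketch below maps evidence checks to an exit code. The manifest key names, the EvidenceCheck type, and both helper functions are assumptions made for this example, not the PR's implementation. It also assumes that weak evidence is reported but does not fail the command when --strict is absent.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical manifest keys, loosely following the categories listed above.
REQUIRED_KEYS = [
    "policy_files_loaded",
    "registered_tools",
    "audit_sink",
    "identity",
    "packages",
]


@dataclass
class EvidenceCheck:
    """Result of one illustrative evidence check."""
    name: str
    passed: bool


def check_manifest(manifest: dict) -> list[EvidenceCheck]:
    # A key counts as present only if it exists and is non-empty.
    return [EvidenceCheck(key, bool(manifest.get(key))) for key in REQUIRED_KEYS]


def exit_code(checks: list[EvidenceCheck], strict: bool) -> int:
    # Assumption: only strict mode turns failed checks into a non-zero exit.
    failed = [check for check in checks if not check.passed]
    return 1 if (strict and failed) else 0
```

With this shape, a CI job could opt in to failing builds by adding --strict while local runs stay informational.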

Tests:

  • 116 passed
  • pytest packages/agent-compliance/tests/test_integrity_and_verify.py packages/agent-compliance/tests/test_agt_cli.py -q

@github-actions github-actions Bot added the documentation (Improvements or additions to documentation) and tests labels Apr 14, 2026
@github-actions

Welcome to the Agent Governance Toolkit! Thanks for your first pull request.
Please ensure tests pass, code follows style (ruff check), and you have signed the CLA.
See our Contributing Guide.

@github-actions

github-actions Bot commented Apr 14, 2026

🤖 AI Agent: docs-sync-checker — Issues Found

📝 Documentation Sync Report

Issues Found

  • verify_evidence() in packages/agent-compliance/src/agent_compliance/verify.py — missing docstring
  • verify() in packages/agent-compliance/src/agent_compliance/verify.py — docstring does not reflect new verify_evidence() behavior
  • ⚠️ packages/agent-compliance/README.md — new --evidence and --strict flags are mentioned, but their detailed behavior (e.g., validation criteria) is not explained.
  • ⚠️ docs/tutorials/18-compliance-verification.md — tutorial mentions evidence mode but lacks examples of expected evidence manifest structure.
  • ✅ CHANGELOG.md — entry for new --evidence and --strict flags is present.
  • ⚠️ Example code in packages/agent-compliance/tests/ — no test or example demonstrating verify_evidence() usage with a sample evidence manifest.
  • ❌ Type hints for verify_evidence() — missing type annotations for parameters and return values.

Suggestions

  • 💡 Add docstring for verify_evidence(evidence_path: str, strict: bool) -> GovernanceAttestation explaining parameters, return values, and exceptions.
  • 💡 Update verify() docstring to mention its interaction with verify_evidence() when evidence is provided.
  • 💡 Expand README.md to include a subsection detailing the structure of the runtime evidence manifest and the implications of --strict.
  • 💡 Add examples of evidence manifest validation in docs/tutorials/18-compliance-verification.md.
  • 💡 Include a test case in packages/agent-compliance/tests/test_agt_cli.py demonstrating verify_evidence() usage with a sample evidence manifest.
  • 💡 Add type hints for verify_evidence() parameters and return values.

Summary

The new runtime evidence mode introduces significant functionality that requires better documentation and examples to ensure users understand its usage and implications. Addressing the missing docstrings, type hints, and examples will improve clarity and usability.

@github-actions github-actions Bot added the size/XL (Extra large PR, 500+ lines) label Apr 14, 2026
@github-actions

github-actions Bot commented Apr 14, 2026

🤖 AI Agent: breaking-change-detector — Summary

🔍 API Compatibility Report

Summary

The pull request adds new functionality (runtime evidence mode) to the agt verify command. The change is purely additive: there are no breaking changes to existing APIs, function signatures, or exported modules, so it is backward-compatible.

Findings

| Severity | Package | Change | Impact |
|---|---|---|---|
| 🔵 | agent-compliance | Added verify_evidence() method to GovernanceVerifier | New public API, not breaking |
| 🔵 | agent-compliance | Added --evidence and --strict flags to agt verify CLI | New CLI options, not breaking |

Migration Guide

  • For existing users: No migration is required. Existing functionality remains unchanged.
  • For new users: To use the runtime evidence mode, pass the --evidence flag with the path to the evidence manifest and optionally use --strict for stricter validation.

Conclusion

✅ No breaking changes detected. The pull request is safe for downstream users.

@github-actions

🤖 AI Agent: contributor-guide — Welcome 🎉

Welcome 🎉

Hi there, and welcome to the microsoft/agent-governance-toolkit project! Thank you so much for taking the time to contribute. We’re thrilled to have you here, and we appreciate your effort in improving the project. Your contribution is what makes open source so amazing!


Review of Your Pull Request

First off, great job on this PR! 🎉 You've clearly put a lot of thought into it, and I can see the effort you've made to maintain consistency with the existing codebase. Here's what I loved about your contribution:

  1. Clear Description: Your PR description is detailed and well-structured. It provides a clear explanation of the changes, the new behavior, and the testing you've done. This makes it much easier for reviewers to understand the scope of your work.
  2. Thorough Documentation: You've updated the documentation (18-compliance-verification.md) to reflect the new --evidence and --strict options. This is a fantastic practice and ensures that users can easily understand and use the new feature.
  3. Testing: Including the test results and specifying the test files you ran is a great touch. It shows that you've taken the time to verify your changes.

Suggestions for Improvement

While this is a strong contribution, I have a few suggestions to make it even better:

1. Code Style and Linting

  • We use ruff for linting in this project, specifically with the E, F, and W error codes enabled. Could you please run ruff on your changes and address any issues it flags? This will ensure consistency with the project's coding standards.

2. Tests

  • While you've mentioned running tests, it would be great to see specific test cases for the new --evidence and --strict options. Could you add these tests under packages/agent-compliance/tests/? This will ensure the new functionality is robust and doesn't break in future updates.

3. Conventional Commits

  • We follow the Conventional Commits standard for commit messages. For example, your commit message could be updated to something like:
    feat(verify): add runtime evidence mode and strict flag
    
    If you have multiple commits, consider squashing them into one with a clear message.

4. Security Considerations

  • Since this PR introduces functionality that processes runtime evidence manifests, it's important to ensure that the input is properly validated and sanitized. I noticed you’re using yaml.safe_load for parsing the evidence file, which is great! However, please confirm that the file is validated against a schema or that its contents are checked for expected structure and values to avoid potential security risks.

Helpful Resources

Here are some resources to help you with the suggestions above:

  1. CONTRIBUTING.md: This document outlines the contribution guidelines for this project, including coding standards and best practices.
  2. QUICKSTART.md: This guide can help you set up your development environment and get started with the project.
  3. Ruff Documentation: https://beta.ruff.rs/docs/
  4. Conventional Commits: https://www.conventionalcommits.org/

Next Steps

  1. Address the suggestions mentioned above:

    • Run ruff and fix any linting issues.
    • Add test cases for the new --evidence and --strict options.
    • Ensure your commit messages follow the Conventional Commits format.
    • Double-check the security of the runtime evidence manifest processing.
  2. Once you've made the updates, push your changes to this branch. The CI/CD pipeline will automatically re-run, and we'll review your changes again.

  3. If you have any questions or need help, feel free to ask! We're here to support you.


Thank you again for your contribution. We’re excited to see this feature come to life! 🚀

@github-actions github-actions Bot left a comment

🤖 AI Agent: code-reviewer

Review Summary

The pull request introduces a new "runtime evidence mode" to the agt verify command, allowing users to validate runtime evidence manifests for compliance. This feature is a significant addition to the agent-governance-toolkit and aligns well with its goal of providing robust governance and compliance tools for AI agents. The changes include updates to documentation, CLI commands, and the GovernanceVerifier class.

Below is a detailed review of the changes, focusing on the specified areas of concern.


🔴 CRITICAL Issues

  1. Unvalidated Input in Evidence Mode:

    • The verify_evidence method in GovernanceVerifier does not appear to validate the structure or content of the evidence manifest (agt-evidence.json) against a schema or expected format. This could lead to:
      • Security risks: Malformed or malicious evidence files could bypass governance checks.
      • False negatives: Missing or weak governance setups might not be detected.
    • Recommendation: Use a schema validation library like Pydantic or jsonschema to validate the evidence manifest before processing it.
  2. Insufficient Defense Against Sandbox Escape:

    • The ALLOWED_MODULE_PREFIXES list is used to restrict dynamic imports, but there is no explicit validation of the evidence manifest to ensure it does not reference unauthorized modules or components.
    • Recommendation: Add explicit checks to ensure that all module/component references in the evidence manifest conform to the ALLOWED_MODULE_PREFIXES.
  3. Potential RCE via YAML Parsing:

    • The yaml.safe_load function is not used when parsing the evidence manifest. Using yaml.load (if present) can lead to remote code execution if the YAML file contains malicious payloads.
    • Recommendation: Always use yaml.safe_load for parsing YAML files.
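
The schema-validation recommendation in item 1 could look roughly like the following. This is a minimal sketch that uses only the standard library instead of Pydantic or jsonschema. The expected field names and types are assumptions for illustration and are not taken from the PR.

```python
from __future__ import annotations

from typing import Any

# Hypothetical expected shape of an evidence manifest.
EXPECTED_FIELDS: dict[str, type] = {
    "policy_files_loaded": list,
    "registered_tools": list,
    "audit_sink": dict,
}


def validate_manifest(manifest: Any) -> list[str]:
    """Return human-readable problems; an empty list means structurally valid."""
    if not isinstance(manifest, dict):
        return ["manifest must be a JSON object"]
    problems: list[str] = []
    for field, expected in EXPECTED_FIELDS.items():
        if field not in manifest:
            problems.append(f"missing field: {field}")
        elif not isinstance(manifest[field], expected):
            problems.append(f"{field} must be of type {expected.__name__}")
    # Unknown keys are reported rather than silently ignored.
    unknown = sorted(set(manifest) - set(EXPECTED_FIELDS))
    problems.extend(f"unexpected field: {name}" for name in unknown)
    return problems
```

Collecting every problem before failing, rather than raising on the first one, tends to give users a more useful error report.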

🟡 WARNING: Potential Breaking Changes

  1. New CLI Flags:

    • The addition of --evidence and --strict flags to the agt verify command changes the CLI interface. While this is a backward-compatible change, it may still require updates to existing automation scripts or documentation.
    • Recommendation: Clearly communicate this change in the release notes and provide examples of how to use the new flags.
  2. Behavior Change with --strict:

    • The --strict flag introduces a new behavior where the command exits with a non-zero status if the evidence is deemed weak or missing. This could break existing workflows that assume a zero exit code for successful verification.
    • Recommendation: Document this behavior explicitly and consider adding a warning in the CLI output when --strict is used.

💡 Suggestions for Improvement

  1. Thread Safety in Evidence Validation:

    • If the verify_evidence method is intended to be used in a concurrent environment, ensure that it is thread-safe. For example, avoid modifying shared state or using non-thread-safe data structures.
  2. Enhanced Logging:

    • Add more detailed logging for evidence validation steps, especially when --strict is enabled. This will help users debug issues with their evidence manifests.
  3. Backward Compatibility Tests:

    • Add tests to ensure that the existing agt verify functionality (without --evidence) continues to work as expected. This will help catch any regressions introduced by the new feature.
  4. Documentation Clarity:

    • The updated documentation is clear but could benefit from a dedicated section on "Evidence Mode" with a detailed example of a valid evidence manifest. This will help users understand how to use the new feature effectively.
  5. Type Annotations:

    • Ensure that all new methods, especially verify_evidence, have complete and accurate type annotations. This will improve code readability and help with static analysis.
  6. Test Coverage:

    • While the PR mentions that 116 tests passed, it is not clear if new tests were added specifically for the verify_evidence functionality. Ensure that edge cases, such as malformed evidence files, are thoroughly tested.

Summary of Actionable Feedback

Critical

  • 🔴 Validate the evidence manifest against a schema to prevent security risks and false negatives.
  • 🔴 Ensure that module/component references in the evidence manifest conform to ALLOWED_MODULE_PREFIXES.
  • 🔴 Use yaml.safe_load for parsing YAML files to prevent remote code execution.

Warnings

  • 🟡 Communicate the addition of new CLI flags (--evidence, --strict) in the release notes.
  • 🟡 Document the behavior change introduced by the --strict flag.

Suggestions

  • 💡 Ensure thread safety in the verify_evidence method.
  • 💡 Add detailed logging for evidence validation steps.
  • 💡 Add backward compatibility tests for the existing agt verify functionality.
  • 💡 Provide a detailed example of a valid evidence manifest in the documentation.
  • 💡 Add complete type annotations for all new methods.
  • 💡 Add comprehensive tests for edge cases in evidence validation.

Conclusion

The new runtime evidence mode is a valuable addition to the agent-governance-toolkit, but it introduces potential security risks and breaking changes that need to be addressed. By implementing the recommendations above, the feature can be made more robust, secure, and user-friendly.

@github-actions

github-actions Bot commented Apr 14, 2026

🤖 AI Agent: security-scanner — Security Review of Pull Request: Add runtime evidence mode to `agt verify`

Security Review of Pull Request: Add runtime evidence mode to agt verify

This pull request introduces a new runtime evidence mode to the agt verify command, allowing verification of deployment evidence manifests. The feature includes a --strict option to enforce stricter validation. Below is a security analysis of the changes.


Findings

1. Prompt Injection Defense Bypass

  • Risk: No direct prompt injection risks were observed in this PR, as the changes primarily involve CLI and evidence file processing. However, the evidence manifest (agt-evidence.json) could potentially include maliciously crafted data.
  • Attack Vector: If the evidence manifest is not sanitized or validated properly, it could lead to injection vulnerabilities in downstream systems that consume the evidence.
  • Rating: 🔵 LOW
  • Recommendation: Ensure strict validation of the evidence manifest's structure and content. Use a schema validation library (e.g., jsonschema) to enforce constraints on the evidence file.

2. Policy Engine Circumvention

  • Risk: The --evidence and --strict options introduce new paths for policy validation. If the verify_evidence method does not enforce the same rigor as the standard verify method, it could allow circumvention of governance policies.
  • Attack Vector: A malicious actor could craft an evidence file that bypasses certain checks, especially if the strict mode is not enabled or implemented inconsistently.
  • Rating: 🟠 HIGH
  • Recommendation:
    • Ensure verify_evidence performs all critical checks performed by verify.
    • Add unit tests to confirm that --evidence and --strict modes enforce deny-by-default semantics and validate all required governance controls.

3. Trust Chain Weaknesses

  • Risk: The runtime evidence manifest could be tampered with if not cryptographically signed or verified.
  • Attack Vector: An attacker could modify the evidence file to misrepresent the state of the deployment, leading to false attestations.
  • Rating: 🔴 CRITICAL
  • Recommendation:
    • Require evidence manifests to be signed using a trusted key.
    • Implement signature verification as part of the verify_evidence method.
    • Reject unsigned or tampered evidence files.
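
One way the signature check recommended above could be sketched is shown below. It uses HMAC-SHA256 with a shared secret purely as a stand-in; a real deployment would more likely use asymmetric signatures and proper key management. The function names are hypothetical and nothing here reflects the PR's code.

```python
import hashlib
import hmac


def sign_manifest(manifest_bytes: bytes, key: bytes) -> str:
    """Compute a hex HMAC-SHA256 tag over the raw manifest bytes."""
    return hmac.new(key, manifest_bytes, hashlib.sha256).hexdigest()


def verify_manifest_signature(manifest_bytes: bytes, signature_hex: str, key: bytes) -> bool:
    """Return True only if the supplied tag matches the manifest contents."""
    expected = sign_manifest(manifest_bytes, key)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected.encode(), signature_hex.encode())
```

The tag is computed over the raw bytes, so any re-serialization of the manifest between signing and verification would invalidate it.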

4. Credential Exposure

  • Risk: No direct credential exposure was observed in this PR. However, if the evidence manifest contains sensitive information (e.g., identity state, audit sink configuration), it could be inadvertently exposed in logs or error messages.
  • Attack Vector: Debugging or error handling could leak sensitive data from the evidence manifest.
  • Rating: 🟡 MEDIUM
  • Recommendation:
    • Scrub sensitive fields from logs and error messages.
    • Add tests to ensure sensitive data is not exposed in any output.
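
A minimal sketch of the scrubbing step recommended above, assuming sensitive values can be recognized by key name. The SENSITIVE_KEYS set and the redact helper are hypothetical and would need to match the real manifest fields.

```python
from typing import Any

# Hypothetical key names treated as sensitive.
SENSITIVE_KEYS = {"token", "secret", "password", "api_key", "connection_string"}


def redact(value: Any) -> Any:
    """Return a copy of value with sensitive mapping entries masked."""
    if isinstance(value, dict):
        return {
            key: "***" if str(key).lower() in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value
```

Logging redact(manifest) instead of the manifest itself keeps the original data untouched for verification.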

5. Sandbox Escape

  • Risk: No sandboxing mechanisms were directly affected by this PR.
  • Rating: 🔵 LOW

6. Deserialization Attacks

  • Risk: The evidence manifest is loaded as JSON or YAML. If YAML is used, it could be vulnerable to deserialization attacks.
  • Attack Vector: A malicious YAML file could exploit unsafe deserialization to execute arbitrary code.
  • Rating: 🔴 CRITICAL
  • Recommendation:
    • Use a safe YAML parser (e.g., PyYAML's yaml.safe_load or ruamel.yaml's YAML(typ='safe')).
    • Avoid using yaml.load or any equivalent unsafe deserialization methods.

7. Race Conditions

  • Risk: No race conditions were observed in the changes.
  • Rating: 🔵 LOW

8. Supply Chain

  • Risk: No new dependencies were introduced in this PR, so no immediate supply chain risks were identified.
  • Rating: 🔵 LOW

Summary of Findings

| Category | Rating | Details |
|---|---|---|
| Prompt Injection Defense | 🔵 LOW | Ensure evidence manifests are validated against a strict schema. |
| Policy Engine Circumvention | 🟠 HIGH | Ensure verify_evidence enforces all critical governance checks. |
| Trust Chain Weaknesses | 🔴 CRITICAL | Require cryptographic signatures for evidence manifests. |
| Credential Exposure | 🟡 MEDIUM | Scrub sensitive data from logs and error messages. |
| Sandbox Escape | 🔵 LOW | No sandboxing mechanisms were affected. |
| Deserialization Attacks | 🔴 CRITICAL | Use safe YAML/JSON parsers to prevent code execution from malicious files. |
| Race Conditions | 🔵 LOW | No race conditions were observed. |
| Supply Chain | 🔵 LOW | No new dependencies were introduced. |

Recommendations

  1. Evidence Validation:

    • Use a schema validation library to enforce strict constraints on the evidence manifest.
    • Reject files that do not conform to the expected schema.
  2. Cryptographic Signatures:

    • Implement signature verification for evidence manifests.
    • Reject unsigned or tampered files.
  3. Safe Deserialization:

    • Use safe_load for YAML parsing.
    • Avoid any unsafe deserialization methods.
  4. Testing:

    • Add unit tests to ensure --evidence and --strict modes enforce all governance policies.
    • Test for edge cases, such as malformed evidence files or missing fields.
  5. Logging and Error Handling:

    • Scrub sensitive data from logs and error messages.
    • Add tests to confirm that sensitive fields are not exposed.

Conclusion

This PR introduces useful functionality for runtime evidence validation but also opens up critical attack vectors. Addressing the identified issues—particularly deserialization safety and trust chain weaknesses—will be essential to ensure the security of the agt verify command.

@github-actions

github-actions Bot commented Apr 14, 2026

🤖 AI Agent: test-generator — `agt.py`

🧪 Test Coverage Analysis

agt.py

  • ✅ Existing coverage: The existing tests cover the basic functionality of the agt verify command, including the --badge and --json options.
  • ❌ Missing coverage: There is no coverage for the new --evidence and --strict options. Specifically, edge cases related to the handling of runtime evidence, such as missing or weak evidence, are not tested.
  • 💡 Suggested test cases:
    1. test_verify_with_valid_evidence — Test the agt verify --evidence ./valid-evidence.json command to ensure it correctly processes valid runtime evidence.
    2. test_verify_with_missing_evidence — Test the agt verify --evidence ./missing-evidence.json command to check that it fails gracefully when the evidence file does not exist.
    3. test_verify_with_weak_evidence — Test the agt verify --evidence ./weak-evidence.json --strict command to verify that it exits with a non-zero status when the evidence is deemed weak.
    4. test_verify_with_conflicting_policies — Test how the verifier handles conflicting policies in the evidence file to ensure it correctly identifies and reports the conflict.

verify.py

  • ✅ Existing coverage: The existing tests cover the basic functionality of the GovernanceVerifier class and its verify() method.
  • ❌ Missing coverage: There is no coverage for the new verify_evidence() method, which processes the runtime evidence manifest. Edge cases such as expired certificates, revoked trust, and handling of malformed evidence files are not tested.
  • 💡 Suggested test cases:
    1. test_verify_evidence_with_expired_certificates — Test the verify_evidence() method with evidence that includes expired certificates to ensure it correctly identifies and handles this scenario.
    2. test_verify_evidence_with_revoked_trust — Test the verify_evidence() method with evidence that indicates revoked trust to verify that it fails as expected.
    3. test_verify_evidence_with_malformed_json — Test the verify_evidence() method with a malformed JSON evidence file to ensure it raises appropriate validation errors.
    4. test_verify_evidence_with_oversized_payload — Test the verify_evidence() method with an oversized evidence payload to check for proper handling and error reporting.

By implementing these suggested test cases, the coverage for the new functionality introduced in the agt verify command and the GovernanceVerifier class will be significantly improved, ensuring robustness against various edge cases.

@electricwolfemarshmallowhypertext
Contributor Author

@microsoft-github-policy-service agree

@github-actions github-actions Bot left a comment

🤖 AI Agent: code-reviewer

Review Summary

This pull request introduces a new feature to the agt verify command, allowing it to validate runtime evidence manifests using the --evidence and --strict flags. The feature is well-documented, and the implementation appears to be well-integrated into the existing codebase. However, there are several areas where improvements can be made to ensure better security, maintainability, and robustness.


🔴 CRITICAL

  1. Lack of Input Validation for Evidence File:

    • The --evidence flag accepts a file path, but there is no validation of the file's contents. If the file is malformed, contains malicious payloads, or is tampered with, it could lead to undefined behavior or security vulnerabilities.
    • Recommendation: Use a schema validation library like pydantic or jsonschema to validate the structure and content of the evidence file. Ensure that the file is parsed safely and that any unexpected fields or invalid data are handled gracefully.
  2. Potential Path Traversal Vulnerability:

    • The evidence_path parameter is directly passed to open() without sanitization. This could allow for path traversal attacks if the input is not properly validated.
    • Recommendation: Use os.path.abspath() to resolve the absolute path and ensure it is within a trusted directory. Additionally, consider using a library like pathlib for safer path handling.
  3. Insufficient Error Handling for Evidence Parsing:

    • The verify_evidence method does not include robust error handling for scenarios where the evidence file is missing required fields, contains invalid data types, or is otherwise corrupted.
    • Recommendation: Implement detailed error handling for all potential failure scenarios when parsing and validating the evidence file. Provide meaningful error messages to the user.
  4. Lack of Cryptographic Integrity Verification for Evidence File:

    • The evidence file is not verified for integrity or authenticity. This could allow an attacker to tamper with the evidence file and bypass compliance checks.
    • Recommendation: Require a cryptographic signature for the evidence file and verify it before processing. This ensures that the evidence file has not been tampered with and comes from a trusted source.
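
The path-handling recommendation in item 2 could be sketched with pathlib as follows. The resolve_within helper and its error message are hypothetical, and Path.is_relative_to requires Python 3.9 or later.

```python
from pathlib import Path


def resolve_within(base_dir: Path, candidate: str) -> Path:
    """Resolve candidate against base_dir and reject anything outside it."""
    base = base_dir.resolve()
    resolved = (base / candidate).resolve()
    # resolve() collapses ".." segments, so the containment check sees the real target.
    if not resolved.is_relative_to(base):
        raise ValueError(f"path escapes trusted directory: {candidate}")
    return resolved
```

Resolving both sides before comparing matters: comparing unresolved paths would let "../" segments and symlinks slip past the check.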

🟡 WARNING

  1. Potential Breaking Change in verify Command:

    • The verify command now includes additional options (--evidence and --strict). While the default behavior remains unchanged, users relying on custom scripts or automation may need to update their workflows to account for the new options.
    • Recommendation: Clearly document this change in the release notes and provide examples of how to use the new options. Consider adding a deprecation warning for any old behavior that may be removed in the future.
  2. Backward Compatibility of GovernanceVerifier:

    • The GovernanceVerifier class now includes a new method verify_evidence. While this does not directly change the existing verify method, any downstream consumers of this class may need to update their code to handle the new functionality.
    • Recommendation: Ensure that the new method is thoroughly documented and that its usage is clearly explained in the release notes.

💡 SUGGESTIONS

  1. Thread Safety:

    • The GovernanceVerifier class does not appear to be explicitly thread-safe. If this class is used in a concurrent environment, there could be race conditions or data corruption.
    • Recommendation: Add thread-safety mechanisms (e.g., locks) if this class is expected to be used in multi-threaded or multi-process environments.
  2. Logging for Evidence Validation:

    • The verify_evidence method does not include detailed logging for the validation process. This could make it difficult to debug issues or understand why a particular evidence file failed validation.
    • Recommendation: Add logging at key points in the evidence validation process to provide visibility into what checks are being performed and their outcomes.
  3. Unit Tests for Edge Cases:

    • While the PR mentions that 116 tests passed, it is unclear if edge cases for the new --evidence and --strict flags have been thoroughly tested.
    • Recommendation: Add unit tests for the following scenarios:
      • Evidence file is missing required fields.
      • Evidence file contains invalid data types.
      • Evidence file is empty or malformed.
      • Evidence file is tampered with (if cryptographic verification is implemented).
      • Evidence file is valid but governance setup is weak (e.g., missing deny rules or audit sinks).
  4. Documentation Improvements:

    • The documentation for the --evidence and --strict flags is clear but could benefit from additional examples, such as:
      • A sample agt-evidence.json file.
      • Expected output for both valid and invalid evidence files.
    • Recommendation: Add these examples to the documentation to help users understand how to use the new feature effectively.
  5. Refactor CLI Code:

    • The CLI code in agt.py has several sections with inconsistent formatting and redundant comments (e.g., "Helpers," "Error handling"). Additionally, some functions (e.g., _handle_error) could be refactored for better readability.
    • Recommendation: Clean up the CLI code by removing unnecessary comments, consolidating related sections, and ensuring consistent formatting.
  6. Rich Output for Evidence Validation:

    • The CLI output for evidence validation could benefit from richer formatting (e.g., tables or color-coded results) to improve readability.
    • Recommendation: Use the rich library to format the output for evidence validation, similar to how it is used in the doctor command.

Conclusion

The new runtime evidence mode is a valuable addition to the agt verify command, enhancing its ability to validate live deployments. However, there are critical security issues that need to be addressed, particularly around input validation, path traversal, and cryptographic integrity verification. Additionally, there are opportunities to improve documentation, testing, and code quality. Addressing these issues will ensure the feature is robust, secure, and user-friendly.

@electricwolfemarshmallowhypertext
Contributor Author

Added follow-up docs based on review signal:

  • documented agt verify --evidence and --strict in README quickstart
  • added CHANGELOG entry under Unreleased

Also confirmed evidence YAML parsing uses yaml.safe_load / yaml.safe_load_all.

Member

@imran-siddique imran-siddique left a comment

Thanks for the contribution — the runtime evidence feature is valuable and the implementation is well-structured. A few items to address before merge:

🔴 Blocking: Path traversal in `_resolve_reported_paths()`
The evidence manifest declares `policy_files_loaded` paths that are resolved via `(base_dir / candidate).resolve()` but never validated to stay within `base_dir`. An attacker-crafted manifest can read arbitrary files. Please add a boundary check:
```python
candidate = (base_dir / candidate).resolve()
if not candidate.is_relative_to(base_dir):
    raise ValueError(f'Policy path escapes evidence directory: {value}')
```

🟡 No file size limit in `_load_policy_documents()`
Could OOM in CI with very large files. Add a size check (e.g., 10 MB) before reading.

🟡 Missing trailing newlines
All 4 modified .py files end without a newline (W292).

🟡 MSRC Case 112362 comment deleted
The security comment on `ALLOWED_MODULE_PREFIXES` documenting the MSRC case was removed. Please restore it; it's load-bearing security context.

Otherwise the scope is clean, `yaml.safe_load` usage is correct, `encoding='utf-8'` is consistent, and license headers are present. Happy to re-review once the path traversal fix is in.
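
For the file size item above, a check along these lines might be enough. It is a sketch under assumptions: read_policy_text and MAX_POLICY_BYTES are hypothetical names, and the 10 MB limit is just the example figure from the review.

```python
from pathlib import Path

MAX_POLICY_BYTES = 10 * 1024 * 1024  # 10 MB, the example figure from the review


def read_policy_text(path: Path, max_bytes: int = MAX_POLICY_BYTES) -> str:
    """Read a policy file as UTF-8, refusing files larger than max_bytes."""
    size = path.stat().st_size
    if size > max_bytes:
        raise ValueError(f"policy file too large: {size} bytes (limit {max_bytes})")
    return path.read_text(encoding="utf-8")
```

Checking st_size before reading means an oversized file is rejected without being loaded into memory.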

@electricwolfemarshmallowhypertext
Contributor Author

electricwolfemarshmallowhypertext commented Apr 15, 2026

Hello Imran,

Addressed review items:

  • blocked policy path traversal outside the evidence directory
  • added a policy file size limit before loading YAML
  • restored trailing newlines on modified Python files
  • restored the MSRC security comment on the module allowlist

Re-ran:

  • pytest packages/agent-compliance/tests/test_integrity_and_verify.py packages/agent-compliance/tests/test_agt_cli.py -q
  • 118 passed

Note: I didn’t modify .github/workflows/weekly-security-audit.yml; this looks like an unrelated workflow parsing error (Unrecognized named-value: 'pattern') rather than an issue in the agt verify changes.

@github-actions github-actions Bot left a comment

🤖 AI Agent: code-reviewer

Review of Pull Request: Add runtime evidence mode to agt verify


🔴 CRITICAL: Security Issues

  1. Unvalidated Input in Evidence Mode

    • The --evidence flag allows the user to specify a file path for the runtime evidence manifest. However, there is no validation or sanitization of the contents of the file before processing. If the file contains malicious or malformed data, it could lead to undefined behavior or security vulnerabilities.
    • Recommendation: Use a strict schema validation (e.g., Pydantic models) to validate the structure and content of the evidence manifest before processing it.
  2. Weak Error Handling in Evidence Parsing

    • The verify_evidence method does not appear to handle parsing errors robustly. If the evidence file is corrupted or contains invalid JSON/YAML, the program may crash or behave unpredictably.
    • Recommendation: Add explicit error handling for JSON/YAML parsing errors and provide meaningful error messages to the user.
  3. Potential Sandbox Escape via Evidence File

    • If the evidence file is processed without proper validation, it could potentially be used as a vector for a sandbox escape, especially if the file contains unexpected or malicious payloads.
    • Recommendation: Ensure that the evidence file is processed in a secure, sandboxed environment. Avoid executing any code or commands embedded in the file.

🟡 WARNING: Breaking Changes

  1. Behavior Change with --strict Flag

    • The introduction of the --strict flag changes the behavior of the agt verify command by causing it to exit with a non-zero status if runtime evidence is weak or missing. This could break existing CI/CD pipelines that rely on the previous behavior.
    • Recommendation: Clearly document this change in the release notes and consider making the --strict flag opt-in rather than default behavior.
  2. Backward Compatibility of GovernanceVerifier

    • The GovernanceVerifier class now includes a new verify_evidence method. While this does not break existing functionality, it introduces new behavior that could lead to unexpected results if users are not aware of the changes.
    • Recommendation: Ensure that the new method is well-documented and backward-compatible with existing usage patterns.

💡 Suggestions for Improvement

  1. Type Safety with Pydantic Models

    • The verify_evidence method should use Pydantic models to validate the structure and content of the evidence manifest. This would ensure type safety and reduce the risk of runtime errors.
    • Example:
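      import json
      from typing import Dict, List

      from pydantic import BaseModel, ValidationError

      class EvidenceManifest(BaseModel):
          policy_files: List[str]
          deny_rules: bool
          registered_tools: List[str]
          audit_sink: str
          identity_state: str
          package_versions: Dict[str, str]

      # Intended as a method on GovernanceVerifier; the return type is quoted
      # because GovernanceAttestation is defined elsewhere.
      def verify_evidence(self, evidence_path: str, strict: bool) -> "GovernanceAttestation":
          try:
              with open(evidence_path, 'r', encoding='utf-8') as f:
                  data = json.load(f)
              evidence = EvidenceManifest(**data)
          except (OSError, json.JSONDecodeError, TypeError, ValidationError) as e:
              # Unreadable file, malformed JSON, non-object JSON, or schema mismatch.
              raise ValueError(f"Invalid evidence file: {e}") from e
          # Verification would continue here using the validated `evidence`.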
  2. Thread Safety in Evidence Mode

    • If the GovernanceVerifier class is used in a multi-threaded context, ensure that the verify_evidence method is thread-safe. For example, avoid using mutable global variables or shared state.
    • Recommendation: Add thread-safety tests to verify the behavior of the verify_evidence method in concurrent environments.
  3. OWASP Agentic Top 10 Compliance

    • The runtime evidence manifest should explicitly include checks for OWASP Agentic Top 10 compliance, such as secure storage of credentials, proper use of cryptographic operations, and prevention of unauthorized access.
    • Recommendation: Extend the evidence validation logic to include checks for these compliance requirements.
  4. Improved Documentation

    • The documentation for the new --evidence and --strict flags is clear but could benefit from additional examples and edge case scenarios. For instance, provide examples of valid and invalid evidence manifests and explain how the --strict flag affects the verification process.
  5. Test Coverage

    • While the PR mentions that 116 tests passed, it is unclear if these tests cover all edge cases for the new functionality. Specifically, tests should cover:
      • Invalid evidence file paths.
      • Malformed or incomplete evidence manifests.
      • Behavior of the --strict flag with weak evidence.
      • Concurrent execution of verify_evidence.
  6. Error Messages

    • Improve the clarity and user-friendliness of error messages, especially for issues related to the evidence file. For example, instead of a generic "Invalid evidence file" message, specify the exact issue (e.g., "Missing required field: policy_files").
  7. Logging Enhancements

    • Add logging for key steps in the verify_evidence method, such as when the evidence file is loaded, validated, and processed. This will help with debugging and auditing.
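
To make suggestion 6 concrete, here is a minimal standard-library sketch that reports one specific message per problem instead of a generic failure. The function name describe_manifest_problems is hypothetical, and the field names simply mirror the illustrative model in suggestion 1; the real manifest schema is not shown in this thread.

```python
import json

# Field names copied from the illustrative model above; treat them as
# assumptions rather than the actual agt-evidence.json schema.
REQUIRED_FIELDS = {
    "policy_files": list,
    "deny_rules": bool,
    "registered_tools": list,
    "audit_sink": str,
    "identity_state": str,
    "package_versions": dict,
}


def describe_manifest_problems(raw_text: str) -> list[str]:
    """Return one specific message per problem; an empty list means none found."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        return [f"Evidence file is not valid JSON: {exc.msg} (line {exc.lineno})"]
    if not isinstance(data, dict):
        return ["Evidence manifest must be a JSON object"]

    problems = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in data:
            problems.append(f"Missing required field: {name}")
        elif not isinstance(data[name], expected):
            problems.append(
                f"Field {name} must be {expected.__name__}, "
                f"got {type(data[name]).__name__}"
            )
    return problems
```

Collecting every problem before returning lets a user fix the manifest in one pass rather than rerunning the command after each error.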

Summary of Changes

  • Added --evidence and --strict flags to the agt verify command.
  • Introduced verify_evidence method in the GovernanceVerifier class to validate runtime evidence manifests.
  • Updated documentation and tutorials to reflect the new functionality.
  • Modified CLI error handling and output formatting for better user experience.

Final Recommendations

  • Address the critical security issues related to unvalidated input and potential sandbox escape.
  • Add robust error handling and type validation for the evidence manifest.
  • Ensure backward compatibility and clearly document the behavior changes introduced by the --strict flag.
  • Enhance test coverage to include edge cases and concurrent execution scenarios.
  • Improve logging and error messages for better debugging and user experience.

Once these issues are addressed, the PR will be in a much stronger position for approval. Let me know if you need further clarification or assistance!

imran-siddique force-pushed the wolfe/agt-verify-evidence branch from 583675c to e00b7e5 on April 16, 2026 01:14

@github-actions github-actions Bot left a comment

🤖 AI Agent: code-reviewer

Pull Request Review: Add runtime evidence mode to agt verify


🔴 CRITICAL: Security Concerns

  1. Runtime Evidence Validation Weakness

    • The verify_evidence method in GovernanceVerifier does not appear to validate the cryptographic integrity of the runtime evidence manifest (agt-evidence.json). Without verifying the authenticity and integrity of the evidence file (e.g., via a signature or hash), attackers could tamper with the evidence to bypass governance checks.
    • Actionable Fix: Add cryptographic validation (e.g., signed evidence manifests with public key verification) to ensure the evidence file is trustworthy.
  2. Sandbox Escape Risk

    • The runtime evidence manifest includes sensitive configuration details such as "identity state" and "audit sink configuration." If these are improperly sanitized or exposed, they could leak sensitive information or enable sandbox escape vectors.
    • Actionable Fix: Ensure that all sensitive fields in the evidence manifest are sanitized before processing or outputting them. Add strict validation rules for these fields.
  3. Deny Rule Semantics

    • The implementation does not enforce deny-by-default semantics in verify_evidence. If the runtime evidence lacks explicit deny rules, the system should fail securely.
    • Actionable Fix: Add a mandatory check for deny-by-default semantics in strict mode, and fail verification if they are absent.
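
The integrity check in item 1 can be illustrated with a small sketch. The review asks for signed manifests with public key verification; Python's standard library has no asymmetric signatures, so this sketch substitutes HMAC-SHA256 with a shared secret as a stand-in. Both function names are hypothetical, and nothing in this thread indicates the PR implements such a check.

```python
import hashlib
import hmac


def sign_manifest(manifest_bytes: bytes, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag computed over the exact manifest bytes."""
    return hmac.new(key, manifest_bytes, hashlib.sha256).hexdigest()


def manifest_is_authentic(manifest_bytes: bytes, tag_hex: str, key: bytes) -> bool:
    """Recompute the tag and compare in constant time.

    Any change to the manifest bytes or the key produces a different tag.
    """
    expected = sign_manifest(manifest_bytes, key)
    return hmac.compare_digest(expected, tag_hex)
```

The tag is computed over the raw file bytes rather than the parsed JSON, so the check can run before any parsing takes place. A shared secret only shows that the manifest came from a holder of the key; the public key scheme the review asks for would let verifiers check a manifest without being able to forge one.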

🟡 WARNING: Potential Breaking Changes

  1. CLI Behavior Change

    • Adding --evidence and --strict options changes the behavior of agt verify. While backward compatibility is maintained for existing commands, users relying on default behavior may encounter unexpected results if they unintentionally pass evidence files.
    • Actionable Fix: Update documentation and release notes to clearly communicate the new behavior and ensure users are aware of the implications of using --strict.
  2. Exit Code Changes

    • The --strict flag introduces stricter exit code semantics. CI/CD pipelines relying on agt verify may fail if runtime evidence is weak or missing.
    • Actionable Fix: Provide clear migration guidance for users upgrading to this version, including how to handle stricter exit codes.
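
The exit code concern can be pictured with a short sketch. The PR description states only that --strict exits non-zero when runtime evidence is missing or weak; the specific value 1, the assumption that non-strict runs exit 0 despite findings, and the function name are all illustrative.

```python
def exit_code(findings: list[str], strict: bool) -> int:
    """Map evidence findings to a process exit code.

    Assumption: without --strict, findings are reported but do not fail the
    run; with --strict, any finding yields 1.
    """
    if strict and findings:
        return 1
    return 0
```

Under this reading, a pipeline that never passes --strict keeps its existing exit status, which is the backward-compatibility property the warning is concerned with.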

💡 Suggestions for Improvement

  1. Thread Safety

    • The verify_evidence method processes runtime evidence but does not explicitly account for concurrent execution. If multiple agents are verified simultaneously, race conditions could occur.
    • Suggestion: Use thread-safe mechanisms (e.g., locks or atomic operations) when processing shared resources like evidence files.
  2. Type Safety and Validation

    • The runtime evidence manifest is parsed without strict type validation. Consider using Pydantic models to enforce schema validation for the evidence file.
    • Suggestion: Define a Pydantic model for the evidence manifest and validate it before processing.
  3. OWASP Agentic Top 10 Compliance

    • While the PR mentions OWASP ASI 2026 compliance, runtime evidence validation should explicitly check for compliance with OWASP Agentic Top 10 controls (e.g., secure identity management, audit logging).
    • Suggestion: Add explicit checks for OWASP Agentic Top 10 compliance in the verify_evidence method.
  4. Documentation

    • The documentation updates are helpful but could benefit from more detailed examples of runtime evidence manifest structure and strict mode behavior.
    • Suggestion: Add sample evidence files and strict mode failure scenarios to the documentation.

Summary of Actions

  • 🔴 Critical Fixes

    • Add cryptographic validation for runtime evidence manifests.
    • Sanitize sensitive fields in the evidence manifest.
    • Enforce deny-by-default semantics in strict mode.
  • 🟡 Warnings

    • Update documentation and release notes for CLI behavior changes.
    • Provide migration guidance for stricter exit codes.
  • 💡 Suggestions

    • Implement thread-safe mechanisms for concurrent execution.
    • Use Pydantic models for type validation.
    • Add explicit OWASP Agentic Top 10 compliance checks.
    • Enhance documentation with examples and failure scenarios.

Recommendation

Do not merge until 🔴 CRITICAL issues are resolved. Address 🟡 WARNING items before release to avoid breaking changes for users. Implement 💡 suggestions to improve robustness and usability.

Member

@imran-siddique imran-siddique left a comment

Good feature addition — well-scoped and backward-compatible.

@imran-siddique imran-siddique merged commit 51ed650 into microsoft:main Apr 16, 2026
9 of 10 checks passed
imran-siddique added a commit to imran-siddique/agent-governance-toolkit that referenced this pull request Apr 16, 2026
* fix(pipeline): run NuGet ESRP signing on Windows agent (microsoft#1022)

The EsrpCodeSigning@5 task constructs internal paths (batchSignPolicyFile,
ciPolicyFile) using Windows-style backslashes. Running on ubuntu-latest
produced garbled mixed paths like '/home/vsts/work/1/s/src\myapp\'.

Changes:
- Add per-job pool override: PublishNuGet runs on windows-latest
- Convert FolderPath and all shell commands to Windows paths
- Replace bash scripts with PowerShell for the Windows agent
- PyPI and npm stages remain on ubuntu-latest (unchanged)
- Add comment to delete orphaned ESRP_DOMAIN_TENANT_ID ADO variable

Co-authored-by: Copilot <[email protected]>

* docs: reland empty-merge changes from PRs microsoft#1017 and microsoft#1020 (microsoft#1125)

PRs microsoft#1017 and microsoft#1020 were squash-merged as empty commits (0 file
changes). This commit re-applies the intended documentation updates.

From PR microsoft#1017 (critic gaps):
- LIMITATIONS.md: add sections 7 (knowledge governance gap), 8
  (credential persistence gap), 9 (initialization bypass risk)
- LIMITATIONS.md: add knowledge governance and enforcement infra
  rows to 'What AGT Is Not' table
- THREAT_MODEL.md: add knowledge flow and credential persistence
  to residual risks, add configuration bypass vectors table,
  remove stale '10/10' qualifier

From PR microsoft#1020 (SOC2 resolved gaps):
- soc2-mapping.md: mark kill switch as resolved (saga handoff
  implemented in kill_switch.py:69-178)
- soc2-mapping.md: mark DeltaEngine verify_chain() as resolved
  (SHA-256 chain verification in delta.py:67-127)
- soc2-mapping.md: add Resolved section to gaps summary, update
  Processing Integrity to 2 of 4 defects (was 3 of 4)

Co-authored-by: Copilot <[email protected]>

* feat(dotnet): add MCP security namespace — completes cross-language MCP parity (microsoft#1021)

* fix(ci): add path filters and concurrency; announce v3.1.0 release

CI optimization:
- Add paths-ignore for docs to 5 code-only workflows
- Add paths filter to Link Check (only run on docs changes)
- Add concurrency groups to 7 heavy workflows
- Docs-only PRs drop from ~14 checks to ~4

README:
- Add v3.1.0 release announcement callout
- Add PyPI version badge
- Update tutorial count to 31



* docs: update SOC2 mapping for resolved kill switch and DeltaEngine gaps

- Kill switch is no longer placeholder: now implements saga handoff
  with handoff_success_count tracking (kill_switch.py:69-178)
- DeltaEngine verify_chain() is no longer a stub: now performs SHA-256
  chain verification (delta.py:67-127)
- Move both from Critical/High gaps to new 'Resolved' section
- Update Processing Integrity coverage (2 of 4 defects, not 3 of 4)
- Update evidence table with current line ranges



* feat(dotnet): add MCP security namespace with scanner, gateway, redactor, and sanitizer

Add AgentGovernance.Mcp namespace implementing full MCP security parity with
TypeScript and Rust SDKs:

- McpSecurityScanner: tool poisoning, typosquatting, hidden instructions,
  rug pull, schema abuse, cross-server attack, and description injection detection
- McpCredentialRedactor: regex-based redaction of API keys, bearer tokens,
  connection strings, and secret assignments
- McpResponseSanitizer: response scanning for prompt injection tags,
  imperative phrasing, credential leakage, and exfiltration URLs
- McpGateway: policy enforcement pipeline with deny/allow lists, payload
  sanitization, rate limiting, and human approval gates

Includes 46 xUnit tests covering all threat categories. Updates
SDK-FEATURE-MATRIX.md to flip .NET MCP Security from — to ✅.



---------

Co-authored-by: Copilot <[email protected]>

* docs: address external critic gaps (microsoft#1025)

* feat(dotnet): add kill switch and lifecycle management to .NET SDK (#5)

- Add KillSwitch with arm/disarm, event history, and subscriber notifications
- Add LifecycleManager with 8-state machine and validated transitions
- Add 26 xUnit tests
- Update README

Co-authored-by: Copilot <[email protected]>

* feat(rust): add execution rings and lifecycle management to Rust SDK (#6)

* feat(dotnet): add kill switch and lifecycle management to .NET SDK

- Add KillSwitch with arm/disarm, event history, and subscriber notifications
- Add LifecycleManager with 8-state machine and validated transitions
- Add comprehensive xUnit tests for both components (26 tests)
- Update .NET SDK README with usage documentation

Co-authored-by: Copilot <[email protected]>

* feat(rust): add execution rings and lifecycle management to Rust SDK

Add two new modules to the agentmesh Rust crate:

- rings.rs: Four-level execution privilege ring model (Admin/Standard/
  Restricted/Sandboxed) with per-agent assignment and per-ring action
  permissions, ported from the Python hypervisor enforcer.

- lifecycle.rs: Eight-state agent lifecycle manager (Provisioning through
  Decommissioned) with validated state transitions and event history,
  matching the lifecycle model used across other SDK languages.

Both modules include comprehensive unit tests and are re-exported from
the crate root. README updated with API tables and usage examples.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* feat(go): add MCP security, execution rings, and lifecycle management to Go SDK (#7)

* feat(openshell): add governance skill package and runnable example (microsoft#942)

Co-authored-by: Copilot <[email protected]>

* feat(go): add MCP security, execution rings, and lifecycle management to Go SDK

- mcp.go: MCP security scanner detecting tool poisoning, typosquatting,
  hidden instructions (zero-width chars, homoglyphs), and rug pulls
- rings.go: Execution privilege ring model (Admin/Standard/Restricted/Sandboxed)
  with default-deny access control
- lifecycle.go: Eight-state agent lifecycle manager with validated transitions
- Full test coverage for all three modules
- Updated README with API docs and examples

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* docs: sync audit redaction status and framing with current code (#8)

* feat(openshell): add governance skill package and runnable example (microsoft#942)

Co-authored-by: Copilot <[email protected]>

* feat(typescript): add MCP security scanner and lifecycle management to TS SDK (microsoft#947)

Co-authored-by: Copilot <[email protected]>

* docs: update SDK feature matrix after parity pass (microsoft#950)

Reflects new capabilities added in PRs microsoft#947 (TS), .NET, Rust, Go:
- TypeScript: MCP security scanner + lifecycle management (was 5/14, now 7/14)
- .NET: Kill switch + lifecycle management (was 8/14, now 10/14)
- Rust: Execution rings + lifecycle management (was 6/14, now 8/14)
- Go: MCP security + rings + lifecycle (was 4/14, now 7/14)

All SDKs now have lifecycle management. Core governance (policy, identity,
trust, audit) + lifecycle = 5 primitives shared across all 5 languages.

Co-authored-by: Copilot <[email protected]>

* docs: add LIMITATIONS.md - honest design boundaries and layered defense (microsoft#953)

Addresses valid external critique of AGT's architectural blind spots:

1. Action vs Intent: AGT governs individual actions, not reasoning or
   action sequences. Documents the compound-action gap explicitly and
   recommends content policies + model safety layers.

2. Audit logs record attempts, not outcomes: Documents that post-action
   state verification is the user's responsibility today, with hooks planned.

3. Performance honesty: README now notes that <0.1ms is policy-eval only;
   distributed mesh adds 5-50ms. Full breakdown in LIMITATIONS.md.

4. Complexity spectrum: Documents the minimal path (just PolicyEvaluator,
   no mesh/crypto) vs full enterprise stack.

5. Vendor independence: Documents zero cloud dependencies in core,
   standard formats for all state, migration path.

6. Recommended layered defense architecture diagram showing AGT as one
   layer alongside model safety, application logic, and infrastructure.

Co-authored-by: Copilot <[email protected]>

* fix(docs): rewrite OpenClaw sidecar deployment with working K8s manifests (microsoft#954)

Closes microsoft#952

Co-authored-by: Copilot <[email protected]>

* feat: reversibility checker, trust calibration guide, escalation tests (microsoft#955)

ReversibilityChecker with 4 levels and compensation plans. Trust score calibration guide with weights, decay, thresholds. 19 tests.

Co-authored-by: Copilot <[email protected]>

* feat: AGT Lite — zero-config governance in 3 lines + fix broken quickstart (microsoft#956)

agent_os.lite: govern() factory, sub-ms enforcement, 16 tests. Fixed quickstart that called nonexistent add_rules().

Co-authored-by: Copilot <[email protected]>

* fix: bump all runtime versions to 3.1.0 and fix CI lint/test failures (microsoft#957)

- Bump __version__ in 29 Python __init__.py files from 3.0.2 to 3.1.0
- Bump version= in 6 setup.py files from 3.0.2 to 3.1.0
- Bump meter version strings in _mcp_metrics.py
- Bump 9 package.json files from 3.0.2 to 3.1.0
- Bump .NET csproj Version from 3.0.2 to 3.1.0
- Bump Rust workspace Cargo.toml from 3.0.2 to 3.1.0
- Create Go sdk doc.go with version marker 3.1.0
- Fix ruff W292 (missing newline at EOF) in data_classification.py
- Fix CLI init regex to allow dots in agent names (test_init_special_characters)

Co-authored-by: Copilot <[email protected]>

* fix(openclaw): critical honesty pass — document what works vs what's planned (microsoft#958)

Co-authored-by: Copilot <[email protected]>

* fix(ci): fix Rust crate packaging - use workspace root with -p agentmesh (microsoft#959)

* fix(openclaw): critical honesty pass — document what works vs what's planned

Server (__main__.py):
- Add --host/--port argparse + env var support (was hardcoded 127.0.0.1:8080)

Dockerfile.sidecar:
- Copy modules/ directory (was missing, causing build failure)
- Use 0.0.0.0 for container binding (127.0.0.1 is wrong inside containers)
- Remove phantom port 9091 (no separate metrics listener exists)

openclaw-sidecar.md — full honesty rewrite:
- Add status banner: transparent interception is NOT yet implemented
- Document actual sidecar API endpoints (health, detect/injection, execute, metrics)
- Fix Docker Compose to use Dockerfile.sidecar (was using wrong Dockerfile)
- Remove GOVERNANCE_PROXY claim (OpenClaw doesn't natively read this)
- Replace fictional SLO/Grafana sections with real /api/v1/metrics docs
- Add Roadmap section listing what's planned vs shipped

openshell.md:
- Remove references to non-existent shell scripts
- Fix python -m agentmesh.server to python -m agent_os.server
- Add note that sidecar doesn't transparently intercept (must call API)
- Replace pip install agentmesh-platform with Python skill library usage

Co-authored-by: Copilot <[email protected]>

* fix(ci): fix Rust crate packaging — use workspace root with -p agentmesh

cargo package in a workspace writes .crate files to the workspace root's
target/package/, not the individual crate's directory. The pipeline was
running from the crate subdirectory and couldn't find the output.

Fix: change workingDirectory from packages/agent-mesh/sdks/rust/agentmesh
to packages/agent-mesh/sdks/rust (workspace root) and add -p agentmesh
to all cargo commands to target the specific crate.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* docs(adr): ADR 0005 — Liveness attestation extension for TrustHandshake (microsoft#948)

Proposes liveness attestation as opt-in gate for TrustHandshake. Addresses ghost-agent and ungraceful-handoff gaps from microsoft#772.

Co-authored-by: kevinkaylie <[email protected]>

* blog: MCP Security — Why Your AI Agent Tool Calls Need a Firewall (microsoft#899)

Co-authored-by: aymenhmaidiwastaken <[email protected]>

* feat: add LotL prevention policy for security measures (microsoft#949)

YAML policy template for Living-off-the-Land detection and prevention.

* feat(examples): add ATR community security rules for PolicyEvaluator (microsoft#908)

15 curated ATR detection rules + sync script. Closes microsoft#901.

* fix(docs): correct npm package name and stale version refs across 21 files (microsoft#960)

- Fix @agentmesh/sdk → @microsoft/agentmesh-sdk in 13 markdown files
  (README, QUICKSTART, tutorials, SDK docs, i18n, changelog)
- Fix broken demo path in agent-os README (agent-os/demo.py → demo/maf_governance_demo.py)
- Remove stale v1.0.0 labels from extension status table
- Bump AGT Version refs 3.0.2 → 3.1.0 in case study templates and
  ATF conformance assessment

Co-authored-by: Copilot <[email protected]>

* fix(ci): use ESRP Release for NuGet signing (microsoft#961)

Co-authored-by: Copilot <[email protected]>

* fix(ci): correct ESRP NuGet contenttype casing (microsoft#962)

* fix(ci): add missing packages to ESRP pipeline and fix Go version tag

Three gaps found during publish verification:

1. PyPI: add agentmesh-marketplace (8th package, was missing from matrix)
2. Rust: build+publish both workspace crates (agentmesh + agentmesh-mcp)
   - Changed from single-crate to workspace build (--workspace)
   - Package loop builds both .crate files
   - Renamed artifact from 'rust-agentmesh' to 'rust-crates'
3. Go: add 'v' prefix to version in doc.go (3.1.0 → v3.1.0)
   - Go module tags require semver with v prefix
   - Pipeline grep expects '// Version: v...' format

Co-authored-by: Copilot <[email protected]>

* fix(ci): correct ESRP NuGet contenttype casing — 'NuGet' not 'Nuget'

ESRP Release rejected 'Nuget' with: 'The value provided for
ReleaseContentType property is invalid.' ErrorCode 2254.

ESRP content types are case-sensitive. Fix: 'Nuget' -> 'NuGet'.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* fix(ci): add missing packages to ESRP pipeline and fix Go version tag (microsoft#963)

* fix(ci): use EsrpCodeSigning + dotnet push for NuGet (microsoft#965)

EsrpRelease@11 does not support NuGet as a contenttype — it's for
PyPI/npm/Maven/crates.io package distribution. NuGet packages must be
signed with EsrpCodeSigning@5 first, then pushed with dotnet nuget push.

New flow:
1. EsrpCodeSigning@5 with NuGetSign + NuGetVerify operations (CP-401405)
2. dotnet nuget push with the signed .nupkg to nuget.org

This matches the standard Microsoft NuGet ESRP signing pattern used by
azure-sdk, dotnet runtime, and other Microsoft OSS projects.

Co-authored-by: Copilot <[email protected]>

* fix(security): upgrade axios to 1.15.0 - CVE-2026-40175, CVE-2025-62718 (microsoft#966)

Critical S360 action items for SFI-ES5.2 1ES Open Source Vulnerabilities.

CVE-2026-40175 (CVSS 9.9): Unrestricted Cloud Metadata Exfiltration
via Header Injection Chain — prototype pollution gadget enables CRLF
injection in HTTP headers, bypassing AWS IMDSv2 session tokens.

CVE-2025-62718: NO_PROXY Bypass via Hostname Normalization — trailing
dots and IPv6 literals skip NO_PROXY matching, enabling SSRF through
attacker-controlled proxy.

Upgraded in 3 packages:
- extensions/copilot: 1.14.0 → 1.15.0
- extensions/cursor:  1.13.5 → 1.15.0
- agent-os-vscode:    1.13.6 → 1.15.0

Co-authored-by: Copilot <[email protected]>

* fix(ci): resolve ESRP_DOMAIN_TENANT_ID cyclical reference (microsoft#967)

The ADO variable ESRP_DOMAIN_TENANT_ID had a cyclical self-reference,
preventing ESRP authentication across ALL publishing stages (PyPI, npm,
NuGet, crates.io).

Fix: Define MICROSOFT_TENANT_ID as a pipeline-level variable with the
well-known Microsoft corporate tenant ID (72f988bf-..., same default
used by ESRP Release action.yml). This is a public value, not a secret.

Also: NuGet publishing requires Microsoft as co-owner of the package
on NuGet.org. See https://aka.ms/Microsoft-NuGet-Compliance

Co-authored-by: Copilot <[email protected]>

* docs: sync audit redaction status and framing with current code

- Update SOC2 mapping to reflect CredentialRedactor now redacts
  credential-like secrets before audit persistence (API keys, tokens,
  JWTs, connection strings, etc.). Remaining gap: non-credential PII
  (email, phone, addresses) not yet redacted in audit entries.
- Replace 'kernel-level enforcement' with 'policy-layer enforcement'
  in README, OWASP compliance, and architecture overview to match the
  existing 'application-level governance' framing in README Security
  section and LIMITATIONS.md.
- Qualify 10/10 OWASP coverage claim in COMPARISON.md with footnote
  clarifying this means mitigation components exist per risk category,
  not full elimination.
- Update owasp-llm-top10-mapping.md LLM06 row for credential redaction.

Addresses doc/code inconsistencies identified in external review.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>
Co-authored-by: kevinkaylie <[email protected]>
Co-authored-by: Aymen Hmaidi <[email protected]>
Co-authored-by: harshnair75567-cloud <[email protected]>
Co-authored-by: Adamthereal <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>
Co-authored-by: kevinkaylie <[email protected]>
Co-authored-by: Aymen Hmaidi <[email protected]>
Co-authored-by: harshnair75567-cloud <[email protected]>
Co-authored-by: Adamthereal <[email protected]>

* fix(lint): resolve agent-mesh lint errors in eu_ai_act.py (microsoft#1028)

- Remove unused variable profiling_override (F841)
- Remove f-string without placeholders (F541)
- Fix whitespace in docstrings (W293)

Co-authored-by: Copilot <[email protected]>

* fix(ci): add path filters and concurrency; announce v3.1.0 release (microsoft#1039)

CI optimization:
- Add paths-ignore for docs to 5 code-only workflows
- Add paths filter to Link Check (only run on docs changes)
- Add concurrency groups to 7 heavy workflows
- Docs-only PRs drop from ~14 checks to ~4

README:
- Add v3.1.0 release announcement callout
- Add PyPI version badge
- Update tutorial count to 31

Co-authored-by: Copilot <[email protected]>

* docs: add ADOPTERS.md and make deployment guides multi-cloud (microsoft#1040)

- New ADOPTERS.md following Backstage/Flatcar pattern with Production,
  Evaluation, and Academic tables + instructions for adding your org
- Rewrite docs/deployment/README.md from Azure-only to multi-cloud:
  Azure (AKS, Foundry, Container Apps), AWS (ECS/Fargate), GCP (GKE),
  Docker Compose, self-hosted. Updated architecture diagram to show
  cloud-agnostic deployment patterns.
- Fix broken AWS/GCP links (pointed to non-existent paths)
- README now links to 'Deployment Guides' (multi-cloud) instead of
  'Azure Deployment'
- README Contributing section invites adopters to add their org

Co-authored-by: Copilot <[email protected]>

* feat: add AGT Lite — zero-config governance in 3 lines, fix broken quickstart (microsoft#1044)

Addresses the #1 developer experience criticism: AGT is too complex to start.

New: agent_os.lite — lightweight governance module
- govern() factory: one line to create a governance gate
- check(action): one line to enforce — raises GovernanceViolation or returns True
- check.is_allowed(action): non-raising bool version
- Allow lists, deny lists, regex patterns, content filtering, rate limiting
- Built-in audit trail and stats
- Sub-millisecond evaluation (0.003ms avg, 1000 evals in <100ms)
- Zero dependencies beyond stdlib (re, time, datetime)
- 16 tests passing

Fix: govern_in_60_seconds.py quickstart
- BROKEN: was calling PolicyEvaluator.add_rules() which does not exist
- FIXED: now uses agent_os.lite.govern() which actually works
- Verified end-to-end: script runs and produces correct output

The lite module is for developers who just want basic governance
without learning PolicyEvaluator, YAML, OPA/Rego, trust mesh, etc.
Upgrade to the full stack when you need it.

Co-authored-by: Copilot <[email protected]>

* feat(ci): enhance weekly security audit with 7 new scan jobs (microsoft#1051)

Add comprehensive security checks based on issues found during
the MSRC-111178 security audit and ongoing post-merge reviews:

- Workflow security regression (MSRC-111178 pull_request_target check)
- Expression injection scan (github.event.* in run: blocks)
- Docker security (root containers, wildcard CORS, hardcoded passwords,
  0.0.0.0 bindings)
- XSS and unsafe DOM (innerHTML, eval, yaml.load, shell=True)
- Action SHA pinning compliance
- Version pinning (pyproject.toml upper bounds, Docker :latest tags,
  license field format)
- Dependency confusion with --strict mode (pyproject.toml + package.json)
- Retention days updated to 180 (EU AI Act Art. 26(6))

Co-authored-by: Copilot <[email protected]>

* fix(ci): fix OpenShell integration CI — spelling, link check, policy validation (microsoft#1057)

- Add OpenShell/NVIDIA terms to cspell dictionary (Landlock, seccomp, syscall, etc.)
- Fix broken link: openclaw-skill -> openshell-skill in docs/integrations/openshell.md
- Fix policy validation: replace starts_with (invalid) with matches + regex

Co-authored-by: Copilot <[email protected]>

* feat: add reversibility checker, trust calibration guide, and escalation/reversibility tests (microsoft#1061)

Addresses critical review feedback:

1. Rollback/reversibility (agent_os.reversibility)
   - ReversibilityChecker: pre-execution assessment of action reversibility
   - 4 levels: fully_reversible, partially_reversible, irreversible, unknown
   - CompensatingAction: structured undo plans for each action type
   - Built-in rules for 12 common actions (write, deploy, delete, email, etc.)
   - block_irreversible mode for strict environments

2. Trust score calibration guide (docs/security/trust-score-calibration.md)
   - Score component weights (compliance 35%, task 25%, behavior 25%, identity 15%)
   - Decay functions with tier floors
   - Initial score assignments by agent origin
   - Threshold recommendations (conservative/moderate/permissive)
   - Anti-gaming measures and operational playbook

3. Tests: 19 passing (10 escalation + 9 reversibility)
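
As a worked illustration of the component weights listed in item 2, a weighted sum might look like the sketch below. The weights come from the summary above; the function name, the dictionary keys, and the assumption that each component is normalised to [0, 1] are hypothetical and may not match the calibration guide or the shipped code.

```python
# Weights as listed in the calibration guide summary (they sum to 1.0).
WEIGHTS = {
    "compliance": 0.35,
    "task": 0.25,
    "behavior": 0.25,
    "identity": 0.15,
}


def composite_trust_score(components):
    """Weighted sum of component scores, each assumed to lie in [0, 1]."""
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = components[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
        total += weight * value
    return total


score = composite_trust_score(
    {"compliance": 0.8, "task": 0.6, "behavior": 1.0, "identity": 0.5}
)
# 0.35*0.8 + 0.25*0.6 + 0.25*1.0 + 0.15*0.5 = 0.755 (up to float rounding)
```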

Co-authored-by: Copilot <[email protected]>

* feat: deployment runtime (Docker/AKS) and shared trust core types (microsoft#1062)

agent-runtime: Evolve from thin re-export shim to deployment runtime
- DockerDeployer: container deployment with security hardening
  (cap-drop ALL, no-new-privileges, read-only rootfs)
- KubernetesDeployer: AKS pod deployment with governance sidecars
  (runAsNonRoot, seccompProfile, resource limits)
- GovernanceConfig: policy/trust/audit config injected as env vars
- DeploymentTarget protocol for extensibility (ADC, nono, etc.)
- 24 tests (all subprocess calls mocked)

agent-mesh: Extract shared trust types into agentmesh.trust_types
- TrustScore, AgentProfile, TrustRecord, TrustTracker
- Canonical implementations replacing ~800 lines of duplicated code
  across 6+ integration packages
- 25 tests covering clamping, scoring, history, capabilities

Co-authored-by: Copilot <[email protected]>

* feat(dotnet): add kill switch and lifecycle management to .NET SDK (microsoft#1065)

- Add KillSwitch with arm/disarm, event history, and subscriber notifications
- Add LifecycleManager with 8-state machine and validated transitions
- Add comprehensive xUnit tests for both components (26 tests)
- Update .NET SDK README with usage documentation

Co-authored-by: Copilot <[email protected]>

* feat(go): add MCP security, execution rings, and lifecycle management to Go SDK (microsoft#1066)

- mcp.go: MCP security scanner detecting tool poisoning, typosquatting,
  hidden instructions (zero-width chars, homoglyphs), and rug pulls
- rings.go: Execution privilege ring model (Admin/Standard/Restricted/Sandboxed)
  with default-deny access control
- lifecycle.go: Eight-state agent lifecycle manager with validated transitions
- Full test coverage for all three modules
- Updated README with API docs and examples
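
The ring model with default-deny access control can be pictured with a small sketch, written in Python here rather than Go. The four ring names are taken from the commit message; the `RingEnforcer` class, the action names, and the per-ring permission sets are hypothetical and only illustrate the default-deny idea.

```python
from enum import Enum


class Ring(Enum):
    ADMIN = "admin"
    STANDARD = "standard"
    RESTRICTED = "restricted"
    SANDBOXED = "sandboxed"


# Hypothetical permission sets; the real SDKs define their own.
ALLOWED_ACTIONS = {
    Ring.ADMIN: {"read", "write", "deploy"},
    Ring.STANDARD: {"read", "write"},
    Ring.RESTRICTED: {"read"},
    Ring.SANDBOXED: set(),
}


class RingEnforcer:
    """Per-agent ring assignment with default-deny checks."""

    def __init__(self):
        self._assignments = {}

    def assign(self, agent_id, ring):
        self._assignments[agent_id] = ring

    def is_allowed(self, agent_id, action):
        ring = self._assignments.get(agent_id)
        if ring is None:
            # Default deny: an agent with no ring gets no permissions.
            return False
        # Default deny: an action not listed for the ring is rejected.
        return action in ALLOWED_ACTIONS.get(ring, set())


enforcer = RingEnforcer()
enforcer.assign("agent-a", Ring.RESTRICTED)
enforcer.is_allowed("agent-a", "read")    # True
enforcer.is_allowed("agent-a", "deploy")  # False
enforcer.is_allowed("agent-b", "read")    # False, agent-b was never assigned
```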

Co-authored-by: Copilot <[email protected]>

* feat(rust): add execution rings and lifecycle management to Rust SDK (microsoft#1067)

* feat(dotnet): add kill switch and lifecycle management to .NET SDK

- Add KillSwitch with arm/disarm, event history, and subscriber notifications
- Add LifecycleManager with 8-state machine and validated transitions
- Add comprehensive xUnit tests for both components (26 tests)
- Update .NET SDK README with usage documentation

Co-authored-by: Copilot <[email protected]>

* feat(rust): add execution rings and lifecycle management to Rust SDK

Add two new modules to the agentmesh Rust crate:

- rings.rs: Four-level execution privilege ring model (Admin/Standard/
  Restricted/Sandboxed) with per-agent assignment and per-ring action
  permissions, ported from the Python hypervisor enforcer.

- lifecycle.rs: Eight-state agent lifecycle manager (Provisioning through
  Decommissioned) with validated state transitions and event history,
  matching the lifecycle model used across other SDK languages.

Both modules include comprehensive unit tests and are re-exported from
the crate root. README updated with API tables and usage examples.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* fix: align lotl_prevention_policy.yaml with PolicyDocument schema

The policy file used an incompatible schema format (id, parameter,
regex_match, effect) instead of the expected PolicyDocument fields
(name, condition.field, operator, action). This caused the
validate-policies CI check to fail for all PRs.

Changes:
- id → name
- condition.parameter → condition.field
- operator: regex_match → operator: matches
- action at rule level (shell_exec/file_read) → action: deny
- effect: DENY → removed (redundant with action: deny)
- Added version, name, description, disclaimer at top level

Co-authored-by: Copilot <[email protected]>

* fix: resolve .NET ESRP signing issues blocking NuGet publish

GitHub Actions (publish.yml):
- Fix broken if-guards on signing steps: env.ESRP_AAD_ID was set in
  step-level env (invisible to if-expressions). Replace with job-level
  ESRP_CONFIGURED env derived from secrets.
- Add missing ESRP_CERT_IDENTIFIER to signing step env blocks.
- Gate the publish step on ESRP_CONFIGURED so unsigned packages are
  never pushed to NuGet.org under the Microsoft.* prefix.
- Make stub signing steps fail-fast (exit 1) instead of silently
  succeeding, preventing unsigned packages from reaching NuGet push.

ADO Pipeline (esrp-publish.yml):
- Add UseDotNet@2 task to Publish_NuGet stage so dotnet nuget push
  has a guaranteed SDK version on the Windows agent.

Co-authored-by: Copilot <[email protected]>

* fix(docs): fix OpenClaw sidecar demo and add limitations callout (microsoft#1163)

The docker-compose example in openclaw-sidecar.md was illustrative only
and did not work — it referenced a non-existent OpenClaw image and lacked
healthchecks. Users were hitting this and getting confused.

Changes:
- Add working demo at demo/openclaw-governed/ with docker-compose.yaml
  that builds and runs the governance sidecar from source
- Replace the inline docker-compose in the doc with a link to the demo
  plus a clearly-labeled reference template for custom deployments
- Add prominent WARNING callout listing known limitations (no native
  OpenClaw integration, no published images, explicit API required)
- Remove stale orphaned curl snippet after the docker-compose block
- Add healthcheck to docker-compose governance-sidecar service
- Fix OpenClaw image reference from ghcr.io/openclaw/openclaw:latest
  to a placeholder users must replace with their own image

Co-authored-by: Copilot <[email protected]>

* fix(docs): fix OpenClaw sidecar demo and add limitations callout (microsoft#1164)

The docker-compose example in openclaw-sidecar.md was illustrative only
and did not work — it referenced a non-existent OpenClaw image and lacked
healthchecks. Users were hitting this and getting confused.

Changes:
- Add working demo at demo/openclaw-governed/ with docker-compose.yaml
  that builds and runs the governance sidecar from source
- Replace the inline docker-compose in the doc with a link to the demo
  plus a clearly-labeled reference template for custom deployments
- Add prominent WARNING callout listing known limitations (no native
  OpenClaw integration, no published images, explicit API required)
- Remove stale orphaned curl snippet after the docker-compose block
- Add healthcheck to docker-compose governance-sidecar service
- Fix OpenClaw image reference from ghcr.io/openclaw/openclaw:latest
  to a placeholder users must replace with their own image

Co-authored-by: Copilot <[email protected]>

* fix(ci): use PME tenant ID for ESRP cert signing

The ESRP signing cert lives in the PME (Partner Managed Engineering)
tenant (975f013f), not the Microsoft corporate tenant (72f988bf).
Using the wrong tenant ID causes ESRP signing to fail when looking
up the cert.

Co-authored-by: Copilot <[email protected]>

* docs: Add Scaling AI Agents article to COMMUNITY.md (microsoft#857)

Co-authored-by: deepsearch <[email protected]>

* Add runtime evidence mode to agt verify (microsoft#969)

* Track agt verify evidence plan

* Add runtime evidence mode to agt verify

* Add runtime evidence verifier tests

* Add CLI tests for agt verify evidence mode

* Document evidence mode for compliance verification

* Remove local implementation notes

* Document agt verify evidence mode

* Harden evidence path handling in verify

---------

Co-authored-by: T. Smith <[email protected]>

* docs: add Entra Agent ID bridge tutorial with R&R matrix and DID fix

- Add Tutorial 31: Bridging AGT Identity with Microsoft Entra Agent ID
  - Detailed roles & responsibilities between AGT and Entra/Agent365
  - Architecture diagram showing the identity bridge
  - Step-by-step: DID creation, Entra binding, AKS workload identity,
    token validation, lifecycle sync, access verification
  - Known gaps and limitations table
  - Platform independence note (AWS, GCP, Okta patterns)
- Fix DID prefix in .NET MCP gateway tests (did:agentmesh → did:mesh
  for consistency with Python reference implementation and .NET SDK)
- Update tutorials README with Enterprise Identity section

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>
Co-authored-by: kevinkaylie <[email protected]>
Co-authored-by: Aymen Hmaidi <[email protected]>
Co-authored-by: harshnair75567-cloud <[email protected]>
Co-authored-by: Adamthereal <[email protected]>
Co-authored-by: Jack Batzner <[email protected]>
Co-authored-by: lawcontinue <[email protected]>
Co-authored-by: deepsearch <[email protected]>
Co-authored-by: ewmh <[email protected]>
Co-authored-by: T. Smith <[email protected]>
imran-siddique added a commit that referenced this pull request Apr 16, 2026
…#1166)

* feat(dotnet): add MCP security namespace — completes cross-language MCP parity

* fix(ci): add path filters and concurrency; announce v3.1.0 release

CI optimization:
- Add paths-ignore for docs to 5 code-only workflows
- Add paths filter to Link Check (only run on docs changes)
- Add concurrency groups to 7 heavy workflows
- Docs-only PRs drop from ~14 checks to ~4

README:
- Add v3.1.0 release announcement callout
- Add PyPI version badge
- Update tutorial count to 31

Co-authored-by: Copilot <[email protected]>

* docs: update SOC2 mapping for resolved kill switch and DeltaEngine gaps

- Kill switch is no longer placeholder: now implements saga handoff
  with handoff_success_count tracking (kill_switch.py:69-178)
- DeltaEngine verify_chain() is no longer a stub: now performs SHA-256
  chain verification (delta.py:67-127)
- Move both from Critical/High gaps to new 'Resolved' section
- Update Processing Integrity coverage (2 of 4 defects, not 3 of 4)
- Update evidence table with current line ranges
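
For readers unfamiliar with SHA-256 chain verification, the general technique can be sketched as below. This is a generic hash-chain example, not the DeltaEngine code in delta.py: the entry layout (`payload`, `prev_hash`, `hash`), the genesis value, and the function names are assumptions made for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def entry_hash(prev_hash, payload):
    """SHA-256 over the previous hash plus a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def build_chain(payloads):
    """Link each payload to the hash of the entry before it."""
    chain, prev = [], GENESIS
    for payload in payloads:
        digest = entry_hash(prev, payload)
        chain.append({"payload": payload, "prev_hash": prev, "hash": digest})
        prev = digest
    return chain


def verify_hash_chain(chain):
    """Return True only if every link and every stored hash checks out."""
    prev = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev:
            return False
        if entry_hash(prev, entry["payload"]) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True


chain = build_chain([{"op": "set", "key": "a"}, {"op": "del", "key": "a"}])
assert verify_hash_chain(chain)
chain[0]["payload"]["key"] = "b"  # tampering breaks verification
assert not verify_hash_chain(chain)
```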

Co-authored-by: Copilot <[email protected]>

* feat(dotnet): add MCP security namespace with scanner, gateway, redactor, and sanitizer

Add AgentGovernance.Mcp namespace implementing full MCP security parity with
TypeScript and Rust SDKs:

- McpSecurityScanner: tool poisoning, typosquatting, hidden instructions,
  rug pull, schema abuse, cross-server attack, and description injection detection
- McpCredentialRedactor: regex-based redaction of API keys, bearer tokens,
  connection strings, and secret assignments
- McpResponseSanitizer: response scanning for prompt injection tags,
  imperative phrasing, credential leakage, and exfiltration URLs
- McpGateway: policy enforcement pipeline with deny/allow lists, payload
  sanitization, rate limiting, and human approval gates
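
Regex-based credential redaction of the kind listed for McpCredentialRedactor can be illustrated with a short sketch, written in Python rather than C#. The two patterns and the `[REDACTED]` marker are assumptions for illustration; the actual .NET component's patterns and output format are not shown in this commit message.

```python
import re

# Illustrative patterns only; a real redactor would cover many more formats.
_PATTERNS = [
    # Bearer tokens, e.g. "Authorization: Bearer abc.def"
    (re.compile(r"(?i)\bbearer\s+[A-Za-z0-9._~+/-]+=*"), "Bearer [REDACTED]"),
    # Secret assignments, e.g. "api_key=..." or "password: ..."
    (
        re.compile(r"(?i)\b(api[_-]?key|secret|password|token)(\s*[:=]\s*)(\S+)"),
        r"\1\2[REDACTED]",
    ),
]


def redact(text):
    """Replace credential-like substrings with a fixed marker."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return text


redact("Authorization: Bearer abc.def-123")  # "Authorization: Bearer [REDACTED]"
redact("api_key=sk-12345 ok")                # "api_key=[REDACTED] ok"
```

The key name and separator are kept in the output so that a redacted log line still shows which field held a secret.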

Includes 46 xUnit tests covering all threat categories. Updates
SDK-FEATURE-MATRIX.md to flip .NET MCP Security from — to ✅.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* docs: add Entra Agent ID bridge tutorial (Tutorial 31) (#10)

* fix(pipeline): run NuGet ESRP signing on Windows agent (#1022)

The EsrpCodeSigning@5 task constructs internal paths (batchSignPolicyFile,
ciPolicyFile) using Windows-style backslashes. Running on ubuntu-latest
produced garbled mixed paths like '/home/vsts/work/1/s/src\myapp\'.

Changes:
- Add per-job pool override: PublishNuGet runs on windows-latest
- Convert FolderPath and all shell commands to Windows paths
- Replace bash scripts with PowerShell for the Windows agent
- PyPI and npm stages remain on ubuntu-latest (unchanged)
- Add comment to delete orphaned ESRP_DOMAIN_TENANT_ID ADO variable

Co-authored-by: Copilot <[email protected]>

* docs: reland empty-merge changes from PRs #1017 and #1020 (#1125)

PRs #1017 and #1020 were squash-merged as empty commits (0 file
changes). This commit re-applies the intended documentation updates.

From PR #1017 (critic gaps):
- LIMITATIONS.md: add sections 7 (knowledge governance gap), 8
  (credential persistence gap), 9 (initialization bypass risk)
- LIMITATIONS.md: add knowledge governance and enforcement infra
  rows to 'What AGT Is Not' table
- THREAT_MODEL.md: add knowledge flow and credential persistence
  to residual risks, add configuration bypass vectors table,
  remove stale '10/10' qualifier

From PR #1020 (SOC2 resolved gaps):
- soc2-mapping.md: mark kill switch as resolved (saga handoff
  implemented in kill_switch.py:69-178)
- soc2-mapping.md: mark DeltaEngine verify_chain() as resolved
  (SHA-256 chain verification in delta.py:67-127)
- soc2-mapping.md: add Resolved section to gaps summary, update
  Processing Integrity to 2 of 4 defects (was 3 of 4)

Co-authored-by: Copilot <[email protected]>

* feat(dotnet): add MCP security namespace — completes cross-language MCP parity (#1021)

* fix(ci): add path filters and concurrency; announce v3.1.0 release

CI optimization:
- Add paths-ignore for docs to 5 code-only workflows
- Add paths filter to Link Check (only run on docs changes)
- Add concurrency groups to 7 heavy workflows
- Docs-only PRs drop from ~14 checks to ~4

README:
- Add v3.1.0 release announcement callout
- Add PyPI version badge
- Update tutorial count to 31



* docs: update SOC2 mapping for resolved kill switch and DeltaEngine gaps

- Kill switch is no longer placeholder: now implements saga handoff
  with handoff_success_count tracking (kill_switch.py:69-178)
- DeltaEngine verify_chain() is no longer a stub: now performs SHA-256
  chain verification (delta.py:67-127)
- Move both from Critical/High gaps to new 'Resolved' section
- Update Processing Integrity coverage (2 of 4 defects, not 3 of 4)
- Update evidence table with current line ranges



* feat(dotnet): add MCP security namespace with scanner, gateway, redactor, and sanitizer

Add AgentGovernance.Mcp namespace implementing full MCP security parity with
TypeScript and Rust SDKs:

- McpSecurityScanner: tool poisoning, typosquatting, hidden instructions,
  rug pull, schema abuse, cross-server attack, and description injection detection
- McpCredentialRedactor: regex-based redaction of API keys, bearer tokens,
  connection strings, and secret assignments
- McpResponseSanitizer: response scanning for prompt injection tags,
  imperative phrasing, credential leakage, and exfiltration URLs
- McpGateway: policy enforcement pipeline with deny/allow lists, payload
  sanitization, rate limiting, and human approval gates

Includes 46 xUnit tests covering all threat categories. Updates
SDK-FEATURE-MATRIX.md to flip .NET MCP Security from — to ✅.



---------

Co-authored-by: Copilot <[email protected]>

* docs: address external critic gaps (#1025)

* feat(dotnet): add kill switch and lifecycle management to .NET SDK (#5)

- Add KillSwitch with arm/disarm, event history, and subscriber notifications
- Add LifecycleManager with 8-state machine and validated transitions
- Add 26 xUnit tests
- Update README

Co-authored-by: Copilot <[email protected]>

* feat(rust): add execution rings and lifecycle management to Rust SDK (#6)

* feat(dotnet): add kill switch and lifecycle management to .NET SDK

- Add KillSwitch with arm/disarm, event history, and subscriber notifications
- Add LifecycleManager with 8-state machine and validated transitions
- Add comprehensive xUnit tests for both components (26 tests)
- Update .NET SDK README with usage documentation

Co-authored-by: Copilot <[email protected]>

* feat(rust): add execution rings and lifecycle management to Rust SDK

Add two new modules to the agentmesh Rust crate:

- rings.rs: Four-level execution privilege ring model (Admin/Standard/
  Restricted/Sandboxed) with per-agent assignment and per-ring action
  permissions, ported from the Python hypervisor enforcer.

- lifecycle.rs: Eight-state agent lifecycle manager (Provisioning through
  Decommissioned) with validated state transitions and event history,
  matching the lifecycle model used across other SDK languages.

Both modules include comprehensive unit tests and are re-exported from
the crate root. README updated with API tables and usage examples.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* feat(go): add MCP security, execution rings, and lifecycle management to Go SDK (#7)

* feat(openshell): add governance skill package and runnable example (#942)

Co-authored-by: Copilot <[email protected]>

* feat(go): add MCP security, execution rings, and lifecycle management to Go SDK

- mcp.go: MCP security scanner detecting tool poisoning, typosquatting,
  hidden instructions (zero-width chars, homoglyphs), and rug pulls
- rings.go: Execution privilege ring model (Admin/Standard/Restricted/Sandboxed)
  with default-deny access control
- lifecycle.go: Eight-state agent lifecycle manager with validated transitions
- Full test coverage for all three modules
- Updated README with API docs and examples

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* docs: sync audit redaction status and framing with current code (#8)

* feat(openshell): add governance skill package and runnable example (#942)

Co-authored-by: Copilot <[email protected]>

* feat(typescript): add MCP security scanner and lifecycle management to TS SDK (#947)

Co-authored-by: Copilot <[email protected]>

* docs: update SDK feature matrix after parity pass (#950)

Reflects new capabilities added in PRs #947 (TS), .NET, Rust, Go:
- TypeScript: MCP security scanner + lifecycle management (was 5/14, now 7/14)
- .NET: Kill switch + lifecycle management (was 8/14, now 10/14)
- Rust: Execution rings + lifecycle management (was 6/14, now 8/14)
- Go: MCP security + rings + lifecycle (was 4/14, now 7/14)

All SDKs now have lifecycle management. Core governance (policy, identity,
trust, audit) + lifecycle = 5 primitives shared across all 5 languages.

Co-authored-by: Copilot <[email protected]>

* docs: add LIMITATIONS.md - honest design boundaries and layered defense (#953)

Addresses valid external critique of AGT's architectural blind spots:

1. Action vs Intent: AGT governs individual actions, not reasoning or
   action sequences. Documents the compound-action gap explicitly and
   recommends content policies + model safety layers.

2. Audit logs record attempts, not outcomes: Documents that post-action
   state verification is the user's responsibility today, with hooks planned.

3. Performance honesty: README now notes that <0.1ms is policy-eval only;
   distributed mesh adds 5-50ms. Full breakdown in LIMITATIONS.md.

4. Complexity spectrum: Documents the minimal path (just PolicyEvaluator,
   no mesh/crypto) vs full enterprise stack.

5. Vendor independence: Documents zero cloud dependencies in core,
   standard formats for all state, migration path.

6. Recommended layered defense architecture diagram showing AGT as one
   layer alongside model safety, application logic, and infrastructure.

Co-authored-by: Copilot <[email protected]>

* fix(docs): rewrite OpenClaw sidecar deployment with working K8s manifests (#954)

Closes #952

Co-authored-by: Copilot <[email protected]>

* feat: reversibility checker, trust calibration guide, escalation tests (#955)

ReversibilityChecker with 4 levels and compensation plans. Trust score calibration guide with weights, decay, thresholds. 19 tests.

Co-authored-by: Copilot <[email protected]>

* feat: AGT Lite — zero-config governance in 3 lines + fix broken quickstart (#956)

agent_os.lite: govern() factory, sub-ms enforcement, 16 tests. Fixed quickstart that called nonexistent add_rules().

Co-authored-by: Copilot <[email protected]>

* fix: bump all runtime versions to 3.1.0 and fix CI lint/test failures (#957)

- Bump __version__ in 29 Python __init__.py files from 3.0.2 to 3.1.0
- Bump version= in 6 setup.py files from 3.0.2 to 3.1.0
- Bump meter version strings in _mcp_metrics.py
- Bump 9 package.json files from 3.0.2 to 3.1.0
- Bump .NET csproj Version from 3.0.2 to 3.1.0
- Bump Rust workspace Cargo.toml from 3.0.2 to 3.1.0
- Create Go sdk doc.go with version marker 3.1.0
- Fix ruff W292 (missing newline at EOF) in data_classification.py
- Fix CLI init regex to allow dots in agent names (test_init_special_characters)

Co-authored-by: Copilot <[email protected]>

* fix(openclaw): critical honesty pass — document what works vs what's planned (#958)

Co-authored-by: Copilot <[email protected]>

* fix(ci): fix Rust crate packaging - use workspace root with -p agentmesh (#959)

* fix(openclaw): critical honesty pass — document what works vs what's planned

Server (__main__.py):
- Add --host/--port argparse + env var support (was hardcoded 127.0.0.1:8080)

Dockerfile.sidecar:
- Copy modules/ directory (was missing, causing build failure)
- Use 0.0.0.0 for container binding (127.0.0.1 is wrong inside containers)
- Remove phantom port 9091 (no separate metrics listener exists)

openclaw-sidecar.md — full honesty rewrite:
- Add status banner: transparent interception is NOT yet implemented
- Document actual sidecar API endpoints (health, detect/injection, execute, metrics)
- Fix Docker Compose to use Dockerfile.sidecar (was using wrong Dockerfile)
- Remove GOVERNANCE_PROXY claim (OpenClaw doesn't natively read this)
- Replace fictional SLO/Grafana sections with real /api/v1/metrics docs
- Add Roadmap section listing what's planned vs shipped

openshell.md:
- Remove references to non-existent shell scripts
- Fix python -m agentmesh.server to python -m agent_os.server
- Add note that sidecar doesn't transparently intercept (must call API)
- Replace pip install agentmesh-platform with Python skill library usage

Co-authored-by: Copilot <[email protected]>

* fix(ci): fix Rust crate packaging — use workspace root with -p agentmesh

cargo package in a workspace writes .crate files to the workspace root's
target/package/, not the individual crate's directory. The pipeline was
running from the crate subdirectory and couldn't find the output.

Fix: change workingDirectory from packages/agent-mesh/sdks/rust/agentmesh
to packages/agent-mesh/sdks/rust (workspace root) and add -p agentmesh
to all cargo commands to target the specific crate.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* docs(adr): ADR 0005 — Liveness attestation extension for TrustHandshake (#948)

Proposes liveness attestation as an opt-in gate for TrustHandshake. Addresses the ghost-agent and ungraceful-handoff gaps from #772.

Co-authored-by: kevinkaylie <[email protected]>

* blog: MCP Security — Why Your AI Agent Tool Calls Need a Firewall (#899)

Co-authored-by: aymenhmaidiwastaken <[email protected]>

* feat: add LotL prevention policy for security measures (#949)

YAML policy template for Living-off-the-Land detection and prevention.

* feat(examples): add ATR community security rules for PolicyEvaluator (#908)

15 curated ATR detection rules + sync script. Closes #901.

* fix(docs): correct npm package name and stale version refs across 21 files (#960)

- Fix @agentmesh/sdk → @microsoft/agentmesh-sdk in 13 markdown files
  (README, QUICKSTART, tutorials, SDK docs, i18n, changelog)
- Fix broken demo path in agent-os README (agent-os/demo.py → demo/maf_governance_demo.py)
- Remove stale v1.0.0 labels from extension status table
- Bump AGT Version refs 3.0.2 → 3.1.0 in case study templates and
  ATF conformance assessment

Co-authored-by: Copilot <[email protected]>

* fix(ci): use ESRP Release for NuGet signing (#961)

Co-authored-by: Copilot <[email protected]>

* fix(ci): correct ESRP NuGet contenttype casing (#962)

* fix(ci): add missing packages to ESRP pipeline and fix Go version tag

Three gaps found during publish verification:

1. PyPI: add agentmesh-marketplace (8th package, was missing from matrix)
2. Rust: build+publish both workspace crates (agentmesh + agentmesh-mcp)
   - Changed from single-crate to workspace build (--workspace)
   - Package loop builds both .crate files
   - Renamed artifact from 'rust-agentmesh' to 'rust-crates'
3. Go: add 'v' prefix to version in doc.go (3.1.0 → v3.1.0)
   - Go module tags require semver with v prefix
   - Pipeline grep expects '// Version: v...' format
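
The version-marker format mentioned in item 3 could be checked with something like the sketch below. The exact grep pattern used by the pipeline is not shown in this commit message, so the regular expression and the function name here are assumptions that simply mirror the '// Version: v...' shape.

```python
import re

# Assumed shape: a comment line "// Version: vMAJOR.MINOR.PATCH".
VERSION_LINE = re.compile(r"^// Version: (v\d+\.\d+\.\d+)$", re.MULTILINE)


def extract_go_version(doc_go_source):
    """Return the v-prefixed version from doc.go text, or None if absent."""
    match = VERSION_LINE.search(doc_go_source)
    return match.group(1) if match else None


extract_go_version("// Package sdk\n// Version: v3.1.0\npackage sdk\n")  # "v3.1.0"
extract_go_version("// Version: 3.1.0\n")  # None, the v prefix is missing
```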

Co-authored-by: Copilot <[email protected]>

* fix(ci): correct ESRP NuGet contenttype casing — 'NuGet' not 'Nuget'

ESRP Release rejected 'Nuget' with: 'The value provided for
ReleaseContentType property is invalid.' ErrorCode 2254.

ESRP content types are case-sensitive. Fix: 'Nuget' -> 'NuGet'.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* fix(ci): add missing packages to ESRP pipeline and fix Go version tag (#963)

* fix(ci): add missing packages to ESRP pipeline and fix Go version tag

Three gaps found during publish verification:

1. PyPI: add agentmesh-marketplace (8th package, was missing from matrix)
2. Rust: build+publish both workspace crates (agentmesh + agentmesh-mcp)
   - Changed from single-crate to workspace build (--workspace)
   - Package loop builds both .crate files
   - Renamed artifact from 'rust-agentmesh' to 'rust-crates'
3. Go: add 'v' prefix to version in doc.go (3.1.0 → v3.1.0)
   - Go module tags require semver with v prefix
   - Pipeline grep expects '// Version: v...' format

Co-authored-by: Copilot <[email protected]>

* fix(ci): correct ESRP NuGet contenttype casing — 'NuGet' not 'Nuget'

ESRP Release rejected 'Nuget' with: 'The value provided for
ReleaseContentType property is invalid.' ErrorCode 2254.

ESRP content types are case-sensitive. Fix: 'Nuget' -> 'NuGet'.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* fix(ci): use EsrpCodeSigning + dotnet push for NuGet (#965)

EsrpRelease@11 does not support NuGet as a contenttype — it's for
PyPI/npm/Maven/crates.io package distribution. NuGet packages must be
signed with EsrpCodeSigning@5 first, then pushed with dotnet nuget push.

New flow:
1. EsrpCodeSigning@5 with NuGetSign + NuGetVerify operations (CP-401405)
2. dotnet nuget push with the signed .nupkg to nuget.org

This matches the standard Microsoft NuGet ESRP signing pattern used by
azure-sdk, dotnet runtime, and other Microsoft OSS projects.

Co-authored-by: Copilot <[email protected]>

* fix(security): upgrade axios to 1.15.0 - CVE-2026-40175, CVE-2025-62718 (#966)

Critical S360 action items for SFI-ES5.2 1ES Open Source Vulnerabilities.

CVE-2026-40175 (CVSS 9.9): Unrestricted Cloud Metadata Exfiltration
via Header Injection Chain — prototype pollution gadget enables CRLF
injection in HTTP headers, bypassing AWS IMDSv2 session tokens.

CVE-2025-62718: NO_PROXY Bypass via Hostname Normalization — trailing
dots and IPv6 literals skip NO_PROXY matching, enabling SSRF through
attacker-controlled proxy.
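
The class of bug described here can be illustrated with a small sketch of hostname normalisation before NO_PROXY matching. This is not the axios fix: the function names and the matching rules (exact match, suffix match, `*` wildcard) are assumptions chosen to show why trailing dots and bracketed IPv6 literals need normalising first.

```python
def normalize_host(host):
    """Lowercase, strip IPv6 brackets, and drop trailing dots."""
    host = host.strip().lower()
    if host.startswith("[") and host.endswith("]"):
        host = host[1:-1]
    return host.rstrip(".")


def bypasses_proxy(host, no_proxy):
    """True if host matches an entry in a comma-separated NO_PROXY value."""
    target = normalize_host(host)
    for raw in no_proxy.split(","):
        entry = normalize_host(raw)
        if not entry:
            continue
        if entry == "*":
            return True
        suffix = entry.lstrip(".")
        # Match the domain itself or any subdomain of it.
        if target == suffix or target.endswith("." + suffix):
            return True
    return False


bypasses_proxy("internal.example.com.", "internal.example.com")  # True
bypasses_proxy("[::1]", "localhost,::1")                         # True
bypasses_proxy("notexample.com", "example.com")                  # False
```

Without the normalisation step, "internal.example.com." would not compare equal to "internal.example.com", so the request would be routed through the proxy even though the operator excluded that host.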

Upgraded in 3 packages:
- extensions/copilot: 1.14.0 → 1.15.0
- extensions/cursor:  1.13.5 → 1.15.0
- agent-os-vscode:    1.13.6 → 1.15.0

Co-authored-by: Copilot <[email protected]>

* fix(ci): resolve ESRP_DOMAIN_TENANT_ID cyclical reference (#967)

The ADO variable ESRP_DOMAIN_TENANT_ID had a cyclical self-reference,
preventing ESRP authentication across ALL publishing stages (PyPI, npm,
NuGet, crates.io).

Fix: Define MICROSOFT_TENANT_ID as a pipeline-level variable with the
well-known Microsoft corporate tenant ID (72f988bf-..., same default
used by ESRP Release action.yml). This is a public value, not a secret.

Also: NuGet publishing requires Microsoft as co-owner of the package
on NuGet.org. See https://aka.ms/Microsoft-NuGet-Compliance

Co-authored-by: Copilot <[email protected]>

* docs: sync audit redaction status and framing with current code

- Update SOC2 mapping to reflect CredentialRedactor now redacts
  credential-like secrets before audit persistence (API keys, tokens,
  JWTs, connection strings, etc.). Remaining gap: non-credential PII
  (email, phone, addresses) not yet redacted in audit entries.
- Replace 'kernel-level enforcement' with 'policy-layer enforcement'
  in README, OWASP compliance, and architecture overview to match the
  existing 'application-level governance' framing in README Security
  section and LIMITATIONS.md.
- Qualify 10/10 OWASP coverage claim in COMPARISON.md with footnote
  clarifying this means mitigation components exist per risk category,
  not full elimination.
- Update owasp-llm-top10-mapping.md LLM06 row for credential redaction.

Addresses doc/code inconsistencies identified in external review.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>
Co-authored-by: kevinkaylie <[email protected]>
Co-authored-by: Aymen Hmaidi <[email protected]>
Co-authored-by: harshnair75567-cloud <[email protected]>
Co-authored-by: Adamthereal <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>
Co-authored-by: kevinkaylie <[email protected]>
Co-authored-by: Aymen Hmaidi <[email protected]>
Co-authored-by: harshnair75567-cloud <[email protected]>
Co-authored-by: Adamthereal <[email protected]>

* fix(lint): resolve agent-mesh lint errors in eu_ai_act.py (#1028)

- Remove unused variable profiling_override (F841)
- Remove f-string without placeholders (F541)
- Fix whitespace in docstrings (W293)

Co-authored-by: Copilot <[email protected]>

* fix(ci): add path filters and concurrency; announce v3.1.0 release (#1039)

CI optimization:
- Add paths-ignore for docs to 5 code-only workflows
- Add paths filter to Link Check (only run on docs changes)
- Add concurrency groups to 7 heavy workflows
- Docs-only PRs drop from ~14 checks to ~4

README:
- Add v3.1.0 release announcement callout
- Add PyPI version badge
- Update tutorial count to 31

Co-authored-by: Copilot <[email protected]>

* docs: add ADOPTERS.md and make deployment guides multi-cloud (#1040)

- New ADOPTERS.md following Backstage/Flatcar pattern with Production,
  Evaluation, and Academic tables + instructions for adding your org
- Rewrite docs/deployment/README.md from Azure-only to multi-cloud:
  Azure (AKS, Foundry, Container Apps), AWS (ECS/Fargate), GCP (GKE),
  Docker Compose, self-hosted. Updated architecture diagram to show
  cloud-agnostic deployment patterns.
- Fix broken AWS/GCP links (pointed to non-existent paths)
- README now links to 'Deployment Guides' (multi-cloud) instead of
  'Azure Deployment'
- README Contributing section invites adopters to add their org

Co-authored-by: Copilot <[email protected]>

* feat: add AGT Lite — zero-config governance in 3 lines, fix broken quickstart (#1044)

Addresses the #1 developer experience criticism: AGT is too complex to start.

New: agent_os.lite — lightweight governance module
- govern() factory: one line to create a governance gate
- check(action): one line to enforce — raises GovernanceViolation or returns True
- check.is_allowed(action): non-raising bool version
- Allow lists, deny lists, regex patterns, content filtering, rate limiting
- Built-in audit trail and stats
- Sub-millisecond evaluation (0.003ms avg, 1000 evals in <100ms)
- Zero dependencies beyond stdlib (re, time, datetime)
- 16 tests passing
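
As a rough illustration of the factory-plus-callable shape listed above, the sketch below builds a gate from allow lists, deny lists, and regex patterns. It is not the agent_os.lite source. Only `govern()`, `check(action)`, `check.is_allowed(action)`, and `GovernanceViolation` are named in this commit message; the `allow`, `deny`, and `deny_patterns` keywords, the `_Gate` class, and the audit record layout are assumptions made for the example.

```python
import re
from datetime import datetime, timezone


class GovernanceViolation(Exception):
    """Raised when the gate rejects an action."""


class _Gate:
    # Hypothetical helper class for this sketch; not taken from agent_os.lite.
    def __init__(self, allow=None, deny=None, deny_patterns=None):
        # allow=None means "no allow list": anything not denied passes.
        self._allow = set(allow) if allow is not None else None
        self._deny = set(deny or ())
        self._patterns = [re.compile(p) for p in (deny_patterns or ())]
        self.audit = []  # one record per evaluation

    def is_allowed(self, action):
        """Non-raising check: deny list first, then patterns, then allow list."""
        if action in self._deny:
            allowed = False
        elif any(p.search(action) for p in self._patterns):
            allowed = False
        elif self._allow is not None:
            allowed = action in self._allow
        else:
            allowed = True
        self.audit.append({
            "action": action,
            "allowed": allowed,
            "at": datetime.now(timezone.utc).isoformat(),
        })
        return allowed

    def __call__(self, action):
        """Raising check: returns True or raises GovernanceViolation."""
        if not self.is_allowed(action):
            raise GovernanceViolation(f"action blocked: {action}")
        return True


def govern(allow=None, deny=None, deny_patterns=None):
    """Factory returning a callable gate (assumed signature)."""
    return _Gate(allow=allow, deny=deny, deny_patterns=deny_patterns)
```

With this shape, `check = govern(allow=["read_file"])` followed by `check("read_file")` is the whole integration; deny entries take precedence over the allow list in this sketch, which is a design choice of the example rather than a documented behavior.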

Fix: govern_in_60_seconds.py quickstart
- BROKEN: was calling PolicyEvaluator.add_rules() which does not exist
- FIXED: now uses agent_os.lite.govern() which actually works
- Verified end-to-end: script runs and produces correct output

The lite module is for developers who just want basic governance
without learning PolicyEvaluator, YAML, OPA/Rego, trust mesh, etc.
Upgrade to the full stack when you need it.

Co-authored-by: Copilot <[email protected]>

* feat(ci): enhance weekly security audit with 7 new scan jobs (#1051)

Add comprehensive security checks based on issues found during
the MSRC-111178 security audit and ongoing post-merge reviews:

- Workflow security regression (MSRC-111178 pull_request_target check)
- Expression injection scan (github.event.* in run: blocks)
- Docker security (root containers, wildcard CORS, hardcoded passwords,
  0.0.0.0 bindings)
- XSS and unsafe DOM (innerHTML, eval, yaml.load, shell=True)
- Action SHA pinning compliance
- Version pinning (pyproject.toml upper bounds, Docker :latest tags,
  license field format)
- Dependency confusion with --strict mode (pyproject.toml + package.json)
- Retention days updated to 180 (EU AI Act Art. 26(6))

Co-authored-by: Copilot <[email protected]>

* fix(ci): fix OpenShell integration CI — spelling, link check, policy validation (#1057)

- Add OpenShell/NVIDIA terms to cspell dictionary (Landlock, seccomp, syscall, etc.)
- Fix broken link: openclaw-skill -> openshell-skill in docs/integrations/openshell.md
- Fix policy validation: replace starts_with (invalid) with matches + regex

Co-authored-by: Copilot <[email protected]>

* feat: add reversibility checker, trust calibration guide, and escalation/reversibility tests (#1061)

Addresses critical review feedback:

1. Rollback/reversibility (agent_os.reversibility)
   - ReversibilityChecker: pre-execution assessment of action reversibility
   - 4 levels: fully_reversible, partially_reversible, irreversible, unknown
   - CompensatingAction: structured undo plans for each action type
   - Built-in rules for 12 common actions (write, deploy, delete, email, etc.)
   - block_irreversible mode for strict environments

2. Trust score calibration guide (docs/security/trust-score-calibration.md)
   - Score component weights (compliance 35%, task 25%, behavior 25%, identity 15%)
   - Decay functions with tier floors
   - Initial score assignments by agent origin
   - Threshold recommendations (conservative/moderate/permissive)
   - Anti-gaming measures and operational playbook

3. Tests: 19 passing (10 escalation + 9 reversibility)
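
The component weights listed under item 2 can be read as a weighted sum. The sketch below assumes each component is scored on a 0.0 to 1.0 scale and that the components combine linearly; the commit message gives only the percentages, so the scale, the clamping, and the function name are assumptions for illustration.

```python
# Weights as stated in the commit message (they sum to 100%).
WEIGHTS = {
    "compliance": 0.35,
    "task": 0.25,
    "behavior": 0.25,
    "identity": 0.15,
}


def composite_trust_score(components):
    """Combine per-component scores (assumed range 0.0-1.0) into one score."""
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    total = 0.0
    for name, weight in WEIGHTS.items():
        # Clamp out-of-range inputs so one bad value cannot exceed its weight.
        value = min(max(float(components[name]), 0.0), 1.0)
        total += weight * value
    return total
```

For example, component scores of 0.8 (compliance), 0.6 (task), 1.0 (behavior), and 0.5 (identity) give 0.28 + 0.15 + 0.25 + 0.075 = 0.755 under these assumptions.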

Co-authored-by: Copilot <[email protected]>

* feat: deployment runtime (Docker/AKS) and shared trust core types (#1062)

agent-runtime: Evolve from thin re-export shim to deployment runtime
- DockerDeployer: container deployment with security hardening
  (cap-drop ALL, no-new-privileges, read-only rootfs)
- KubernetesDeployer: AKS pod deployment with governance sidecars
  (runAsNonRoot, seccompProfile, resource limits)
- GovernanceConfig: policy/trust/audit config injected as env vars
- DeploymentTarget protocol for extensibility (ADC, nono, etc.)
- 24 tests (all subprocess calls mocked)

agent-mesh: Extract shared trust types into agentmesh.trust_types
- TrustScore, AgentProfile, TrustRecord, TrustTracker
- Canonical implementations replacing ~800 lines of duplicated code
  across 6+ integration packages
- 25 tests covering clamping, scoring, history, capabilities

Co-authored-by: Copilot <[email protected]>

* feat(dotnet): add kill switch and lifecycle management to .NET SDK (#1065)

- Add KillSwitch with arm/disarm, event history, and subscriber notifications
- Add LifecycleManager with 8-state machine and validated transitions
- Add comprehensive xUnit tests for both components (26 tests)
- Update .NET SDK README with usage documentation
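
The arm/disarm, event history, and subscriber pattern described above can be sketched in a few lines. This is a Python illustration of the pattern only, not the C# `KillSwitch` API added by this commit; the `KillSwitchEvent` fields, the `subscribe` method, and the no-op behavior on repeated calls are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class KillSwitchEvent:
    # Hypothetical event record for this sketch.
    armed: bool
    reason: str
    at: datetime


class KillSwitch:
    def __init__(self) -> None:
        self._armed = False
        self._history: List[KillSwitchEvent] = []
        self._subscribers: List[Callable[[KillSwitchEvent], None]] = []

    @property
    def is_armed(self) -> bool:
        return self._armed

    @property
    def history(self) -> Tuple[KillSwitchEvent, ...]:
        # Return a tuple so callers cannot mutate the recorded history.
        return tuple(self._history)

    def subscribe(self, callback: Callable[[KillSwitchEvent], None]) -> None:
        self._subscribers.append(callback)

    def arm(self, reason: str) -> None:
        self._transition(True, reason)

    def disarm(self, reason: str) -> None:
        self._transition(False, reason)

    def _transition(self, armed: bool, reason: str) -> None:
        if armed == self._armed:
            return  # already in the requested state: no event, no notification
        self._armed = armed
        event = KillSwitchEvent(armed, reason, datetime.now(timezone.utc))
        self._history.append(event)
        for callback in list(self._subscribers):
            callback(event)
```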

Co-authored-by: Copilot <[email protected]>

* feat(go): add MCP security, execution rings, and lifecycle management to Go SDK (#1066)

- mcp.go: MCP security scanner detecting tool poisoning, typosquatting,
  hidden instructions (zero-width chars, homoglyphs), and rug pulls
- rings.go: Execution privilege ring model (Admin/Standard/Restricted/Sandboxed)
  with default-deny access control
- lifecycle.go: Eight-state agent lifecycle manager with validated transitions
- Full test coverage for all three modules
- Updated README with API docs and examples
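
A minimal sketch of the four-ring model with default-deny access control follows, written in Python rather than Go. The ring names come from the commit message; the numeric ordering, the `RingEnforcer` class, and the action names in the usage are hypothetical.

```python
from enum import IntEnum


class Ring(IntEnum):
    # Names from the commit message; the numeric values are an assumption.
    ADMIN = 0
    STANDARD = 1
    RESTRICTED = 2
    SANDBOXED = 3


class RingEnforcer:
    """Per-agent ring assignment with per-ring action permissions."""

    def __init__(self, permissions):
        # permissions: mapping of Ring -> iterable of allowed action names.
        self._permissions = {
            ring: frozenset(actions) for ring, actions in permissions.items()
        }
        self._assignments = {}

    def assign(self, agent_id, ring):
        self._assignments[agent_id] = Ring(ring)

    def is_allowed(self, agent_id, action):
        ring = self._assignments.get(agent_id)
        if ring is None:
            return False  # default deny: unassigned agents get nothing
        # Default deny again: a ring with no configured actions allows none.
        return action in self._permissions.get(ring, frozenset())
```

Default deny here means that both an unknown agent and an unconfigured ring resolve to "not allowed", so forgetting a configuration entry fails closed.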

Co-authored-by: Copilot <[email protected]>

* feat(rust): add execution rings and lifecycle management to Rust SDK (#1067)

* feat(dotnet): add kill switch and lifecycle management to .NET SDK

- Add KillSwitch with arm/disarm, event history, and subscriber notifications
- Add LifecycleManager with 8-state machine and validated transitions
- Add comprehensive xUnit tests for both components (26 tests)
- Update .NET SDK README with usage documentation

Co-authored-by: Copilot <[email protected]>

* feat(rust): add execution rings and lifecycle management to Rust SDK

Add two new modules to the agentmesh Rust crate:

- rings.rs: Four-level execution privilege ring model (Admin/Standard/
  Restricted/Sandboxed) with per-agent assignment and per-ring action
  permissions, ported from the Python hypervisor enforcer.

- lifecycle.rs: Eight-state agent lifecycle manager (Provisioning through
  Decommissioned) with validated state transitions and event history,
  matching the lifecycle model used across other SDK languages.

Both modules include comprehensive unit tests and are re-exported from
the crate root. README updated with API tables and usage examples.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* fix: align lotl_prevention_policy.yaml with PolicyDocument schema

The policy file used an incompatible schema format (id, parameter,
regex_match, effect) instead of the expected PolicyDocument fields
(name, condition.field, operator, action). This caused the
validate-policies CI check to fail for all PRs.

Changes:
- id → name
- condition.parameter → condition.field
- operator: regex_match → operator: matches
- action at rule level (shell_exec/file_read) → action: deny
- effect: DENY → removed (redundant with action: deny)
- Added version, name, description, disclaimer at top level
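
The field renames above can be expressed as a small migration helper. This is a sketch only: it assumes rules are plain dictionaries, that `operator` sits inside `condition`, and that the pattern is stored under a `value` key, none of which the commit message states. The sample rule in the usage note is invented.

```python
def migrate_rule(old):
    """Convert one old-format rule dict to the PolicyDocument-style fields."""
    condition = dict(old.get("condition", {}))
    if "parameter" in condition:
        # condition.parameter -> condition.field
        condition["field"] = condition.pop("parameter")
    if condition.get("operator") == "regex_match":
        # operator: regex_match -> operator: matches
        condition["operator"] = "matches"

    # Carry over unrelated keys; drop the ones that are renamed or removed.
    dropped = ("id", "effect", "action", "condition")
    new = {key: value for key, value in old.items() if key not in dropped}
    new["name"] = old["id"]          # id -> name
    new["condition"] = condition
    new["action"] = "deny"           # rule-level action becomes the decision
    return new
```

Given `{"id": "block-certutil", "action": "shell_exec", "effect": "DENY", "condition": {"parameter": "command", "operator": "regex_match", "value": "certutil"}}`, the helper returns a rule with `name`, `condition.field`, `operator: matches`, and `action: deny`, and no `effect` key.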

Co-authored-by: Copilot <[email protected]>

* fix: resolve .NET ESRP signing issues blocking NuGet publish

GitHub Actions (publish.yml):
- Fix broken if-guards on signing steps: env.ESRP_AAD_ID was set in
  step-level env (invisible to if-expressions). Replace with job-level
  ESRP_CONFIGURED env derived from secrets.
- Add missing ESRP_CERT_IDENTIFIER to signing step env blocks.
- Gate the publish step on ESRP_CONFIGURED so unsigned packages are
  never pushed to NuGet.org under the Microsoft.* prefix.
- Make stub signing steps fail-fast (exit 1) instead of silently
  succeeding, preventing unsigned packages from reaching NuGet push.

ADO Pipeline (esrp-publish.yml):
- Add UseDotNet@2 task to Publish_NuGet stage so dotnet nuget push
  has a guaranteed SDK version on the Windows agent.

Co-authored-by: Copilot <[email protected]>

* fix(docs): fix OpenClaw sidecar demo and add limitations callout (#1163)

The docker-compose example in openclaw-sidecar.md was illustrative only
and did not work — it referenced a non-existent OpenClaw image and lacked
healthchecks. Users were hitting this and getting confused.

Changes:
- Add working demo at demo/openclaw-governed/ with docker-compose.yaml
  that builds and runs the governance sidecar from source
- Replace the inline docker-compose in the doc with a link to the demo
  plus a clearly-labeled reference template for custom deployments
- Add prominent WARNING callout listing known limitations (no native
  OpenClaw integration, no published images, explicit API required)
- Remove stale orphaned curl snippet after the docker-compose block
- Add healthcheck to docker-compose governance-sidecar service
- Fix OpenClaw image reference from ghcr.io/openclaw/openclaw:latest
  to a placeholder users must replace with their own image

Co-authored-by: Copilot <[email protected]>

* fix(docs): fix OpenClaw sidecar demo and add limitations callout (#1164)

The docker-compose example in openclaw-sidecar.md was illustrative only
and did not work — it referenced a non-existent OpenClaw image and lacked
healthchecks. Users were hitting this and getting confused.

Changes:
- Add working demo at demo/openclaw-governed/ with docker-compose.yaml
  that builds and runs the governance sidecar from source
- Replace the inline docker-compose in the doc with a link to the demo
  plus a clearly-labeled reference template for custom deployments
- Add prominent WARNING callout listing known limitations (no native
  OpenClaw integration, no published images, explicit API required)
- Remove stale orphaned curl snippet after the docker-compose block
- Add healthcheck to docker-compose governance-sidecar service
- Fix OpenClaw image reference from ghcr.io/openclaw/openclaw:latest
  to a placeholder users must replace with their own image

Co-authored-by: Copilot <[email protected]>

* fix(ci): use PME tenant ID for ESRP cert signing

The ESRP signing cert lives in the PME (Partner Managed Engineering)
tenant (975f013f), not the Microsoft corporate tenant (72f988bf).
Using the wrong tenant ID causes ESRP signing to fail when looking
up the cert.

Co-authored-by: Copilot <[email protected]>

* docs: Add Scaling AI Agents article to COMMUNITY.md (#857)

Co-authored-by: deepsearch <[email protected]>

* Add runtime evidence mode to agt verify (#969)

* Track agt verify evidence plan

* Add runtime evidence mode to agt verify

* Add runtime evidence verifier tests

* Add CLI tests for agt verify evidence mode

* Document evidence mode for compliance verification

* Remove local implementation notes

* Document agt verify evidence mode

* Harden evidence path handling in verify

---------

Co-authored-by: T. Smith <[email protected]>

* docs: add Entra Agent ID bridge tutorial with R&R matrix and DID fix

- Add Tutorial 31: Bridging AGT Identity with Microsoft Entra Agent ID
  - Detailed roles & responsibilities between AGT and Entra/Agent365
  - Architecture diagram showing the identity bridge
  - Step-by-step: DID creation, Entra binding, AKS workload identity,
    token validation, lifecycle sync, access verification
  - Known gaps and limitations table
  - Platform independence note (AWS, GCP, Okta patterns)
- Fix DID prefix in .NET MCP gateway tests (did:agentmesh → did:mesh
  for consistency with Python reference implementation and .NET SDK)
- Update tutorials README with Enterprise Identity section

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>
Co-authored-by: kevinkaylie <[email protected]>
Co-authored-by: Aymen Hmaidi <[email protected]>
Co-authored-by: harshnair75567-cloud <[email protected]>
Co-authored-by: Adamthereal <[email protected]>
Co-authored-by: Jack Batzner <[email protected]>
Co-authored-by: lawcontinue <[email protected]>
Co-authored-by: deepsearch <[email protected]>
Co-authored-by: ewmh <[email protected]>
Co-authored-by: T. Smith <[email protected]>

imran-siddique added a commit that referenced this pull request Apr 16, 2026
…) + critic gap docs (#1170)

* feat(dotnet): add MCP security namespace — completes cross-language MCP parity

* fix(ci): add path filters and concurrency; announce v3.1.0 release

CI optimization:
- Add paths-ignore for docs to 5 code-only workflows
- Add paths filter to Link Check (only run on docs changes)
- Add concurrency groups to 7 heavy workflows
- Docs-only PRs drop from ~14 checks to ~4

README:
- Add v3.1.0 release announcement callout
- Add PyPI version badge
- Update tutorial count to 31

Co-authored-by: Copilot <[email protected]>

* docs: update SOC2 mapping for resolved kill switch and DeltaEngine gaps

- Kill switch is no longer placeholder: now implements saga handoff
  with handoff_success_count tracking (kill_switch.py:69-178)
- DeltaEngine verify_chain() is no longer a stub: now performs SHA-256
  chain verification (delta.py:67-127)
- Move both from Critical/High gaps to new 'Resolved' section
- Update Processing Integrity coverage (2 of 4 defects, not 3 of 4)
- Update evidence table with current line ranges
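
For readers unfamiliar with hash-chain verification, the sketch below shows the general technique with SHA-256. It is not the DeltaEngine code in delta.py: the entry layout (`payload`, `prev_hash`, `hash`), the all-zero genesis value, and the JSON canonicalization are assumptions chosen for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first entry


def entry_hash(prev_hash, payload):
    """Hash the previous hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    data = prev_hash.encode("ascii") + body.encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def append_entry(chain, payload):
    """Append a payload, linking it to the hash of the previous entry."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "payload": payload,
        "prev_hash": prev,
        "hash": entry_hash(prev, payload),
    })


def verify_chain(chain):
    """Return True only if every link and every stored hash recomputes."""
    prev = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev:
            return False  # link broken: entry removed, reordered, or inserted
        if entry["hash"] != entry_hash(prev, entry["payload"]):
            return False  # payload or hash was altered after the fact
        prev = entry["hash"]
    return True
```

Because each hash covers the previous one, changing any earlier payload invalidates that entry and breaks every later link unless all subsequent hashes are recomputed.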

Co-authored-by: Copilot <[email protected]>

* feat(dotnet): add MCP security namespace with scanner, gateway, redactor, and sanitizer

Add AgentGovernance.Mcp namespace implementing full MCP security parity with
TypeScript and Rust SDKs:

- McpSecurityScanner: tool poisoning, typosquatting, hidden instructions,
  rug pull, schema abuse, cross-server attack, and description injection detection
- McpCredentialRedactor: regex-based redaction of API keys, bearer tokens,
  connection strings, and secret assignments
- McpResponseSanitizer: response scanning for prompt injection tags,
  imperative phrasing, credential leakage, and exfiltration URLs
- McpGateway: policy enforcement pipeline with deny/allow lists, payload
  sanitization, rate limiting, and human approval gates

Includes 46 xUnit tests covering all threat categories. Updates
SDK-FEATURE-MATRIX.md to flip .NET MCP Security from — to ✅.
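
Regex-based credential redaction of the kind attributed to McpCredentialRedactor can be sketched as follows. This is a Python illustration, not the .NET implementation: the three patterns, the labels, and the `redact` function are assumptions, and real secret formats vary far more than these expressions cover.

```python
import re

# Illustrative patterns only; each pairs a label with a compiled regex.
_PATTERNS = [
    # "Bearer <token>" as found in Authorization headers.
    ("bearer_token", re.compile(r"(?i)\bbearer\s+[a-z0-9._~+/=-]{8,}")),
    # key=value or key: value assignments with secret-looking key names.
    ("secret_assignment",
     re.compile(r"(?i)\b(?:api[_-]?key|secret|password|token)\s*[:=]\s*\S+")),
    # scheme://user:password@host style connection strings.
    ("connection_string",
     re.compile(r"(?i)\b[a-z][a-z0-9+.-]*://[^\s:/@]+:[^\s@]+@\S+")),
]


def redact(text):
    """Return (redacted_text, labels_found) for the given string."""
    found = []
    for label, pattern in _PATTERNS:
        text, count = pattern.subn(f"[REDACTED:{label}]", text)
        if count:
            found.append(label)
    return text, found
```

Returning the list of matched labels alongside the redacted text lets a caller log that a credential was present without logging the credential itself.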

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* docs: add Entra Agent ID bridge tutorial (Tutorial 31) (#10)

* fix(pipeline): run NuGet ESRP signing on Windows agent (#1022)

The EsrpCodeSigning@5 task constructs internal paths (batchSignPolicyFile,
ciPolicyFile) using Windows-style backslashes. Running on ubuntu-latest
produced garbled mixed paths like '/home/vsts/work/1/s/src\myapp\'.

Changes:
- Add per-job pool override: PublishNuGet runs on windows-latest
- Convert FolderPath and all shell commands to Windows paths
- Replace bash scripts with PowerShell for the Windows agent
- PyPI and npm stages remain on ubuntu-latest (unchanged)
- Add comment to delete orphaned ESRP_DOMAIN_TENANT_ID ADO variable

Co-authored-by: Copilot <[email protected]>

* docs: reland empty-merge changes from PRs #1017 and #1020 (#1125)

PRs #1017 and #1020 were squash-merged as empty commits (0 file
changes). This commit re-applies the intended documentation updates.

From PR #1017 (critic gaps):
- LIMITATIONS.md: add sections 7 (knowledge governance gap), 8
  (credential persistence gap), 9 (initialization bypass risk)
- LIMITATIONS.md: add knowledge governance and enforcement infra
  rows to 'What AGT Is Not' table
- THREAT_MODEL.md: add knowledge flow and credential persistence
  to residual risks, add configuration bypass vectors table,
  remove stale '10/10' qualifier

From PR #1020 (SOC2 resolved gaps):
- soc2-mapping.md: mark kill switch as resolved (saga handoff
  implemented in kill_switch.py:69-178)
- soc2-mapping.md: mark DeltaEngine verify_chain() as resolved
  (SHA-256 chain verification in delta.py:67-127)
- soc2-mapping.md: add Resolved section to gaps summary, update
  Processing Integrity to 2 of 4 defects (was 3 of 4)

Co-authored-by: Copilot <[email protected]>

* feat(dotnet): add MCP security namespace — completes cross-language MCP parity (#1021)

* fix(ci): add path filters and concurrency; announce v3.1.0 release

CI optimization:
- Add paths-ignore for docs to 5 code-only workflows
- Add paths filter to Link Check (only run on docs changes)
- Add concurrency groups to 7 heavy workflows
- Docs-only PRs drop from ~14 checks to ~4

README:
- Add v3.1.0 release announcement callout
- Add PyPI version badge
- Update tutorial count to 31

* docs: update SOC2 mapping for resolved kill switch and DeltaEngine gaps

- Kill switch is no longer placeholder: now implements saga handoff
  with handoff_success_count tracking (kill_switch.py:69-178)
- DeltaEngine verify_chain() is no longer a stub: now performs SHA-256
  chain verification (delta.py:67-127)
- Move both from Critical/High gaps to new 'Resolved' section
- Update Processing Integrity coverage (2 of 4 defects, not 3 of 4)
- Update evidence table with current line ranges

* feat(dotnet): add MCP security namespace with scanner, gateway, redactor, and sanitizer

Add AgentGovernance.Mcp namespace implementing full MCP security parity with
TypeScript and Rust SDKs:

- McpSecurityScanner: tool poisoning, typosquatting, hidden instructions,
  rug pull, schema abuse, cross-server attack, and description injection detection
- McpCredentialRedactor: regex-based redaction of API keys, bearer tokens,
  connection strings, and secret assignments
- McpResponseSanitizer: response scanning for prompt injection tags,
  imperative phrasing, credential leakage, and exfiltration URLs
- McpGateway: policy enforcement pipeline with deny/allow lists, payload
  sanitization, rate limiting, and human approval gates

Includes 46 xUnit tests covering all threat categories. Updates
SDK-FEATURE-MATRIX.md to flip .NET MCP Security from — to ✅.

---------

Co-authored-by: Copilot <[email protected]>

* docs: address external critic gaps (#1025)

* feat(dotnet): add kill switch and lifecycle management to .NET SDK (#5)

- Add KillSwitch with arm/disarm, event history, and subscriber notifications
- Add LifecycleManager with 8-state machine and validated transitions
- Add 26 xUnit tests
- Update README

Co-authored-by: Copilot <[email protected]>

* feat(rust): add execution rings and lifecycle management to Rust SDK (#6)

* feat(dotnet): add kill switch and lifecycle management to .NET SDK

- Add KillSwitch with arm/disarm, event history, and subscriber notifications
- Add LifecycleManager with 8-state machine and validated transitions
- Add comprehensive xUnit tests for both components (26 tests)
- Update .NET SDK README with usage documentation

Co-authored-by: Copilot <[email protected]>

* feat(rust): add execution rings and lifecycle management to Rust SDK

Add two new modules to the agentmesh Rust crate:

- rings.rs: Four-level execution privilege ring model (Admin/Standard/
  Restricted/Sandboxed) with per-agent assignment and per-ring action
  permissions, ported from the Python hypervisor enforcer.

- lifecycle.rs: Eight-state agent lifecycle manager (Provisioning through
  Decommissioned) with validated state transitions and event history,
  matching the lifecycle model used across other SDK languages.

Both modules include comprehensive unit tests and are re-exported from
the crate root. README updated with API tables and usage examples.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* feat(go): add MCP security, execution rings, and lifecycle management to Go SDK (#7)

* feat(openshell): add governance skill package and runnable example (#942)

Co-authored-by: Copilot <[email protected]>

* feat(go): add MCP security, execution rings, and lifecycle management to Go SDK

- mcp.go: MCP security scanner detecting tool poisoning, typosquatting,
  hidden instructions (zero-width chars, homoglyphs), and rug pulls
- rings.go: Execution privilege ring model (Admin/Standard/Restricted/Sandboxed)
  with default-deny access control
- lifecycle.go: Eight-state agent lifecycle manager with validated transitions
- Full test coverage for all three modules
- Updated README with API docs and examples

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* docs: sync audit redaction status and framing with current code (#8)

* feat(openshell): add governance skill package and runnable example (#942)

Co-authored-by: Copilot <[email protected]>

* feat(typescript): add MCP security scanner and lifecycle management to TS SDK (#947)

Co-authored-by: Copilot <[email protected]>

* docs: update SDK feature matrix after parity pass (#950)

Reflects new capabilities added in PRs #947 (TS), .NET, Rust, Go:
- TypeScript: MCP security scanner + lifecycle management (was 5/14, now 7/14)
- .NET: Kill switch + lifecycle management (was 8/14, now 10/14)
- Rust: Execution rings + lifecycle management (was 6/14, now 8/14)
- Go: MCP security + rings + lifecycle (was 4/14, now 7/14)

All SDKs now have lifecycle management. Core governance (policy, identity,
trust, audit) + lifecycle = 5 primitives shared across all 5 languages.

Co-authored-by: Copilot <[email protected]>

* docs: add LIMITATIONS.md - honest design boundaries and layered defense (#953)

Addresses valid external critique of AGT's architectural blind spots:

1. Action vs Intent: AGT governs individual actions, not reasoning or
   action sequences. Documents the compound-action gap explicitly and
   recommends content policies + model safety layers.

2. Audit logs record attempts, not outcomes: Documents that post-action
   state verification is the user's responsibility today, with hooks planned.

3. Performance honesty: README now notes that <0.1ms is policy-eval only;
   distributed mesh adds 5-50ms. Full breakdown in LIMITATIONS.md.

4. Complexity spectrum: Documents the minimal path (just PolicyEvaluator,
   no mesh/crypto) vs full enterprise stack.

5. Vendor independence: Documents zero cloud dependencies in core,
   standard formats for all state, migration path.

6. Recommended layered defense architecture diagram showing AGT as one
   layer alongside model safety, application logic, and infrastructure.

Co-authored-by: Copilot <[email protected]>

* fix(docs): rewrite OpenClaw sidecar deployment with working K8s manifests (#954)

Closes #952

Co-authored-by: Copilot <[email protected]>

* feat: reversibility checker, trust calibration guide, escalation tests (#955)

ReversibilityChecker with 4 levels and compensation plans. Trust score calibration guide with weights, decay, thresholds. 19 tests.

Co-authored-by: Copilot <[email protected]>

* feat: AGT Lite — zero-config governance in 3 lines + fix broken quickstart (#956)

agent_os.lite: govern() factory, sub-ms enforcement, 16 tests. Fixed quickstart that called nonexistent add_rules().

Co-authored-by: Copilot <[email protected]>

* fix: bump all runtime versions to 3.1.0 and fix CI lint/test failures (#957)

- Bump __version__ in 29 Python __init__.py files from 3.0.2 to 3.1.0
- Bump version= in 6 setup.py files from 3.0.2 to 3.1.0
- Bump meter version strings in _mcp_metrics.py
- Bump 9 package.json files from 3.0.2 to 3.1.0
- Bump .NET csproj Version from 3.0.2 to 3.1.0
- Bump Rust workspace Cargo.toml from 3.0.2 to 3.1.0
- Create Go sdk doc.go with version marker 3.1.0
- Fix ruff W292 (missing newline at EOF) in data_classification.py
- Fix CLI init regex to allow dots in agent names (test_init_special_characters)

Co-authored-by: Copilot <[email protected]>

* fix(openclaw): critical honesty pass — document what works vs what's planned (#958)

Co-authored-by: Copilot <[email protected]>

* fix(ci): fix Rust crate packaging - use workspace root with -p agentmesh (#959)

* fix(openclaw): critical honesty pass — document what works vs what's planned

Server (__main__.py):
- Add --host/--port argparse + env var support (was hardcoded 127.0.0.1:8080)

Dockerfile.sidecar:
- Copy modules/ directory (was missing, causing build failure)
- Use 0.0.0.0 for container binding (127.0.0.1 is wrong inside containers)
- Remove phantom port 9091 (no separate metrics listener exists)

openclaw-sidecar.md — full honesty rewrite:
- Add status banner: transparent interception is NOT yet implemented
- Document actual sidecar API endpoints (health, detect/injection, execute, metrics)
- Fix Docker Compose to use Dockerfile.sidecar (was using wrong Dockerfile)
- Remove GOVERNANCE_PROXY claim (OpenClaw doesn't natively read this)
- Replace fictional SLO/Grafana sections with real /api/v1/metrics docs
- Add Roadmap section listing what's planned vs shipped

openshell.md:
- Remove references to non-existent shell scripts
- Fix python -m agentmesh.server to python -m agent_os.server
- Add note that sidecar doesn't transparently intercept (must call API)
- Replace pip install agentmesh-platform with Python skill library usage

Co-authored-by: Copilot <[email protected]>

* fix(ci): fix Rust crate packaging — use workspace root with -p agentmesh

cargo package in a workspace writes .crate files to the workspace root's
target/package/, not the individual crate's directory. The pipeline was
running from the crate subdirectory and couldn't find the output.

Fix: change workingDirectory from packages/agent-mesh/sdks/rust/agentmesh
to packages/agent-mesh/sdks/rust (workspace root) and add -p agentmesh
to all cargo commands to target the specific crate.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* docs(adr): ADR 0005 — Liveness attestation extension for TrustHandshake (#948)

Proposes liveness attestation as opt-in gate for TrustHandshake. Addresses ghost-agent and ungraceful-handoff gaps from #772.

Co-authored-by: kevinkaylie <[email protected]>

* blog: MCP Security — Why Your AI Agent Tool Calls Need a Firewall (#899)

Co-authored-by: aymenhmaidiwastaken <[email protected]>

* feat: add LotL prevention policy for security measures (#949)

YAML policy template for Living-off-the-Land detection and prevention.

* feat(examples): add ATR community security rules for PolicyEvaluator (#908)

15 curated ATR detection rules + sync script. Closes #901.

* fix(docs): correct npm package name and stale version refs across 21 files (#960)

- Fix @agentmesh/sdk → @microsoft/agentmesh-sdk in 13 markdown files
  (README, QUICKSTART, tutorials, SDK docs, i18n, changelog)
- Fix broken demo path in agent-os README (agent-os/demo.py → demo/maf_governance_demo.py)
- Remove stale v1.0.0 labels from extension status table
- Bump AGT Version refs 3.0.2 → 3.1.0 in case study templates and
  ATF conformance assessment

Co-authored-by: Copilot <[email protected]>

* fix(ci): use ESRP Release for NuGet signing (#961)

Co-authored-by: Copilot <[email protected]>

* fix(ci): correct ESRP NuGet contenttype casing (#962)

* fix(ci): add missing packages to ESRP pipeline and fix Go version tag

Three gaps found during publish verification:

1. PyPI: add agentmesh-marketplace (8th package, was missing from matrix)
2. Rust: build+publish both workspace crates (agentmesh + agentmesh-mcp)
   - Changed from single-crate to workspace build (--workspace)
   - Package loop builds both .crate files
   - Renamed artifact from 'rust-agentmesh' to 'rust-crates'
3. Go: add 'v' prefix to version in doc.go (3.1.0 → v3.1.0)
   - Go module tags require semver with v prefix
   - Pipeline grep expects '// Version: v...' format

Co-authored-by: Copilot <[email protected]>

* fix(ci): correct ESRP NuGet contenttype casing — 'NuGet' not 'Nuget'

ESRP Release rejected 'Nuget' with: 'The value provided for
ReleaseContentType property is invalid.' ErrorCode 2254.

ESRP content types are case-sensitive. Fix: 'Nuget' -> 'NuGet'.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* fix(ci): add missing packages to ESRP pipeline and fix Go version tag (#963)

* fix(ci): add missing packages to ESRP pipeline and fix Go version tag

Three gaps found during publish verification:

1. PyPI: add agentmesh-marketplace (8th package, was missing from matrix)
2. Rust: build+publish both workspace crates (agentmesh + agentmesh-mcp)
   - Changed from single-crate to workspace build (--workspace)
   - Package loop builds both .crate files
   - Renamed artifact from 'rust-agentmesh' to 'rust-crates'
3. Go: add 'v' prefix to version in doc.go (3.1.0 → v3.1.0)
   - Go module tags require semver with v prefix
   - Pipeline grep expects '// Version: v...' format

Co-authored-by: Copilot <[email protected]>

* fix(ci): correct ESRP NuGet contenttype casing — 'NuGet' not 'Nuget'

ESRP Release rejected 'Nuget' with: 'The value provided for
ReleaseContentType property is invalid.' ErrorCode 2254.

ESRP content types are case-sensitive. Fix: 'Nuget' -> 'NuGet'.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* fix(ci): use EsrpCodeSigning + dotnet push for NuGet (#965)

EsrpRelease@11 does not support NuGet as a contenttype — it's for
PyPI/npm/Maven/crates.io package distribution. NuGet packages must be
signed with EsrpCodeSigning@5 first, then pushed with dotnet nuget push.

New flow:
1. EsrpCodeSigning@5 with NuGetSign + NuGetVerify operations (CP-401405)
2. dotnet nuget push with the signed .nupkg to nuget.org

This matches the standard Microsoft NuGet ESRP signing pattern used by
azure-sdk, dotnet runtime, and other Microsoft OSS projects.

Co-authored-by: Copilot <[email protected]>

* fix(security): upgrade axios to 1.15.0 - CVE-2026-40175, CVE-2025-62718 (#966)

Critical S360 action items for SFI-ES5.2 1ES Open Source Vulnerabilities.

CVE-2026-40175 (CVSS 9.9): Unrestricted Cloud Metadata Exfiltration
via Header Injection Chain — prototype pollution gadget enables CRLF
injection in HTTP headers, bypassing AWS IMDSv2 session tokens.

CVE-2025-62718: NO_PROXY Bypass via Hostname Normalization — trailing
dots and IPv6 literals skip NO_PROXY matching, enabling SSRF through
attacker-controlled proxy.

Upgraded in 3 packages:
- extensions/copilot: 1.14.0 → 1.15.0
- extensions/cursor:  1.13.5 → 1.15.0
- agent-os-vscode:    1.13.6 → 1.15.0

Co-authored-by: Copilot <[email protected]>

* fix(ci): resolve ESRP_DOMAIN_TENANT_ID cyclical reference (#967)

The ADO variable ESRP_DOMAIN_TENANT_ID had a cyclical self-reference,
preventing ESRP authentication across ALL publishing stages (PyPI, npm,
NuGet, crates.io).

Fix: Define MICROSOFT_TENANT_ID as a pipeline-level variable with the
well-known Microsoft corporate tenant ID (72f988bf-..., same default
used by ESRP Release action.yml). This is a public value, not a secret.

Also: NuGet publishing requires Microsoft as co-owner of the package
on NuGet.org. See https://aka.ms/Microsoft-NuGet-Compliance

Co-authored-by: Copilot <[email protected]>

* docs: sync audit redaction status and framing with current code

- Update SOC2 mapping to reflect CredentialRedactor now redacts
  credential-like secrets before audit persistence (API keys, tokens,
  JWTs, connection strings, etc.). Remaining gap: non-credential PII
  (email, phone, addresses) not yet redacted in audit entries.
- Replace 'kernel-level enforcement' with 'policy-layer enforcement'
  in README, OWASP compliance, and architecture overview to match the
  existing 'application-level governance' framing in README Security
  section and LIMITATIONS.md.
- Qualify 10/10 OWASP coverage claim in COMPARISON.md with footnote
  clarifying this means mitigation components exist per risk category,
  not full elimination.
- Update owasp-llm-top10-mapping.md LLM06 row for credential redaction.

Addresses doc/code inconsistencies identified in external review.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>
Co-authored-by: kevinkaylie <[email protected]>
Co-authored-by: Aymen Hmaidi <[email protected]>
Co-authored-by: harshnair75567-cloud <[email protected]>
Co-authored-by: Adamthereal <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>
Co-authored-by: kevinkaylie <[email protected]>
Co-authored-by: Aymen Hmaidi <[email protected]>
Co-authored-by: harshnair75567-cloud <[email protected]>
Co-authored-by: Adamthereal <[email protected]>

* fix(lint): resolve agent-mesh lint errors in eu_ai_act.py (#1028)

- Remove unused variable profiling_override (F841)
- Remove f-string without placeholders (F541)
- Fix whitespace in docstrings (W293)

Co-authored-by: Copilot <[email protected]>

* fix(ci): add path filters and concurrency; announce v3.1.0 release (#1039)

CI optimization:
- Add paths-ignore for docs to 5 code-only workflows
- Add paths filter to Link Check (only run on docs changes)
- Add concurrency groups to 7 heavy workflows
- Docs-only PRs drop from ~14 checks to ~4

README:
- Add v3.1.0 release announcement callout
- Add PyPI version badge
- Update tutorial count to 31

Co-authored-by: Copilot <[email protected]>

* docs: add ADOPTERS.md and make deployment guides multi-cloud (#1040)

- New ADOPTERS.md following Backstage/Flatcar pattern with Production,
  Evaluation, and Academic tables + instructions for adding your org
- Rewrite docs/deployment/README.md from Azure-only to multi-cloud:
  Azure (AKS, Foundry, Container Apps), AWS (ECS/Fargate), GCP (GKE),
  Docker Compose, self-hosted. Updated architecture diagram to show
  cloud-agnostic deployment patterns.
- Fix broken AWS/GCP links (pointed to non-existent paths)
- README now links to 'Deployment Guides' (multi-cloud) instead of
  'Azure Deployment'
- README Contributing section invites adopters to add their org

Co-authored-by: Copilot <[email protected]>

* feat: add AGT Lite — zero-config governance in 3 lines, fix broken quickstart (#1044)

Addresses the #1 developer experience criticism: AGT is too complex to start.

New: agent_os.lite — lightweight governance module
- govern() factory: one line to create a governance gate
- check(action): one line to enforce — raises GovernanceViolation or returns True
- check.is_allowed(action): non-raising bool version
- Allow lists, deny lists, regex patterns, content filtering, rate limiting
- Built-in audit trail and stats
- Sub-millisecond evaluation (0.003ms avg, 1000 evals in <100ms)
- Zero dependencies beyond stdlib (re, time, datetime)
- 16 tests passing

Fix: govern_in_60_seconds.py quickstart
- BROKEN: was calling PolicyEvaluator.add_rules() which does not exist
- FIXED: now uses agent_os.lite.govern() which actually works
- Verified end-to-end: script runs and produces correct output

The lite module is for developers who just want basic governance
without learning PolicyEvaluator, YAML, OPA/Rego, trust mesh, etc.
Upgrade to the full stack when you need it.
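
The gate pattern described above (a factory that returns a callable with a non-raising variant attached) can be sketched as follows. This is a standalone illustration, not the `agent_os.lite` source; the `allow` and `deny` keyword names are assumptions, and regex patterns, content filtering, rate limiting, and audit are omitted.

```python
from typing import Callable, Iterable, Optional


class GovernanceViolation(Exception):
    """Raised when an action is rejected by the gate."""


def govern(allow: Optional[Iterable[str]] = None,
           deny: Iterable[str] = ()) -> Callable[[str], bool]:
    """Build a governance gate; parameter names here are hypothetical."""
    allowed = None if allow is None else set(allow)
    denied = set(deny)

    def is_allowed(action: str) -> bool:
        # Deny entries win over allow entries.
        if action in denied:
            return False
        # With no allow list, anything not denied passes.
        return allowed is None or action in allowed

    def check(action: str) -> bool:
        if not is_allowed(action):
            raise GovernanceViolation(f"action not permitted: {action}")
        return True

    # Expose the non-raising variant as an attribute of the callable.
    check.is_allowed = is_allowed
    return check


check = govern(allow=["read_file", "web_search"], deny=["delete_file"])
check("read_file")                 # returns True
check.is_allowed("delete_file")    # returns False, does not raise
```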

Co-authored-by: Copilot <[email protected]>

* feat(ci): enhance weekly security audit with 7 new scan jobs (#1051)

Add comprehensive security checks based on issues found during
the MSRC-111178 security audit and ongoing post-merge reviews:

- Workflow security regression (MSRC-111178 pull_request_target check)
- Expression injection scan (github.event.* in run: blocks)
- Docker security (root containers, wildcard CORS, hardcoded passwords,
  0.0.0.0 bindings)
- XSS and unsafe DOM (innerHTML, eval, yaml.load, shell=True)
- Action SHA pinning compliance
- Version pinning (pyproject.toml upper bounds, Docker :latest tags,
  license field format)
- Dependency confusion with --strict mode (pyproject.toml + package.json)
- Retention days updated to 180 (EU AI Act Art. 26(6))
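
The expression injection scan in the list above can be approximated with a line-based check. This is a rough sketch, not the workflow's actual scan job; `find_run_injections` is a hypothetical name, and the indentation tracking is a heuristic rather than a YAML parser.

```python
import re

# An expression that interpolates event payload data, e.g. ${{ github.event.issue.title }}.
EVENT_EXPR = re.compile(r"\$\{\{[^}]*\bgithub\.event\.")
# A "run:" key, optionally at the start of a list item ("- run:").
RUN_KEY = re.compile(r"^\s*(?:-\s+)?(run):")


def find_run_injections(workflow_text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs where a run: block interpolates github.event.*."""
    findings = []
    run_col = None  # column of the active "run" key, or None outside a run block
    for number, line in enumerate(workflow_text.splitlines(), start=1):
        match = RUN_KEY.match(line)
        if match:
            run_col = match.start(1)
        elif run_col is not None and line.strip():
            indent = len(line) - len(line.lstrip())
            # A non-blank line indented no deeper than the key ends the block.
            if indent <= run_col:
                run_col = None
        if run_col is not None and EVENT_EXPR.search(line):
            findings.append((number, line.strip()))
    return findings
```

Expressions placed in an `env:` mapping are not flagged, since passing event data through an environment variable is the usual mitigation.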

Co-authored-by: Copilot <[email protected]>

* fix(ci): fix OpenShell integration CI — spelling, link check, policy validation (#1057)

- Add OpenShell/NVIDIA terms to cspell dictionary (Landlock, seccomp, syscall, etc.)
- Fix broken link: openclaw-skill -> openshell-skill in docs/integrations/openshell.md
- Fix policy validation: replace starts_with (invalid) with matches + regex
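
The last fix above relies on the fact that a prefix test can be written as an anchored regex. A minimal sketch of that equivalence, using Python's `re` module rather than the policy engine itself (`prefix_to_pattern` is a hypothetical helper):

```python
import re


def prefix_to_pattern(prefix: str) -> str:
    """Build a regex that matches only strings beginning with the literal prefix."""
    # re.escape keeps characters such as "." or "*" from acting as metacharacters.
    return "^" + re.escape(prefix)


pattern = prefix_to_pattern("rm -rf")
assert re.search(pattern, "rm -rf /tmp/build")
assert not re.search(pattern, "echo rm -rf /tmp/build")
```

The explicit `^` anchor keeps the behavior the same whether the engine's `matches` operator searches anywhere in the string or only from the start.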

Co-authored-by: Copilot <[email protected]>

* feat: add reversibility checker, trust calibration guide, and escalation/reversibility tests (#1061)

Addresses critical review feedback:

1. Rollback/reversibility (agent_os.reversibility)
   - ReversibilityChecker: pre-execution assessment of action reversibility
   - 4 levels: fully_reversible, partially_reversible, irreversible, unknown
   - CompensatingAction: structured undo plans for each action type
   - Built-in rules for 12 common actions (write, deploy, delete, email, etc.)
   - block_irreversible mode for strict environments

2. Trust score calibration guide (docs/security/trust-score-calibration.md)
   - Score component weights (compliance 35%, task 25%, behavior 25%, identity 15%)
   - Decay functions with tier floors
   - Initial score assignments by agent origin
   - Threshold recommendations (conservative/moderate/permissive)
   - Anti-gaming measures and operational playbook

3. Tests: 19 passing (10 escalation + 9 reversibility)
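
The component weights listed in item 2 imply a weighted sum. A minimal sketch under that assumption follows; the guide may combine components differently, `composite_trust_score` is a hypothetical name, and decay and tier floors are not modeled.

```python
# Weights as stated in the calibration guide summary above.
WEIGHTS = {"compliance": 0.35, "task": 0.25, "behavior": 0.25, "identity": 0.15}


def composite_trust_score(components: dict[str, float]) -> float:
    """Combine per-component scores (all on the same scale) into one score."""
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


score = composite_trust_score(
    {"compliance": 0.9, "task": 0.8, "behavior": 0.6, "identity": 1.0}
)
print(round(score, 3))  # 0.815
```

Because the weights sum to 1, the result stays on the same scale as the inputs.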

Co-authored-by: Copilot <[email protected]>

* feat: deployment runtime (Docker/AKS) and shared trust core types (#1062)

agent-runtime: Evolve from thin re-export shim to deployment runtime
- DockerDeployer: container deployment with security hardening
  (cap-drop ALL, no-new-privileges, read-only rootfs)
- KubernetesDeployer: AKS pod deployment with governance sidecars
  (runAsNonRoot, seccompProfile, resource limits)
- GovernanceConfig: policy/trust/audit config injected as env vars
- DeploymentTarget protocol for extensibility (ADC, nono, etc.)
- 24 tests (all subprocess calls mocked)

agent-mesh: Extract shared trust types into agentmesh.trust_types
- TrustScore, AgentProfile, TrustRecord, TrustTracker
- Canonical implementations replacing ~800 lines of duplicated code
  across 6+ integration packages
- 25 tests covering clamping, scoring, history, capabilities

Co-authored-by: Copilot <[email protected]>

* feat(dotnet): add kill switch and lifecycle management to .NET SDK (#1065)

- Add KillSwitch with arm/disarm, event history, and subscriber notifications
- Add LifecycleManager with 8-state machine and validated transitions
- Add comprehensive xUnit tests for both components (26 tests)
- Update .NET SDK README with usage documentation

Co-authored-by: Copilot <[email protected]>

* feat(go): add MCP security, execution rings, and lifecycle management to Go SDK (#1066)

- mcp.go: MCP security scanner detecting tool poisoning, typosquatting,
  hidden instructions (zero-width chars, homoglyphs), and rug pulls
- rings.go: Execution privilege ring model (Admin/Standard/Restricted/Sandboxed)
  with default-deny access control
- lifecycle.go: Eight-state agent lifecycle manager with validated transitions
- Full test coverage for all three modules
- Updated README with API docs and examples
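
The default-deny ring model described for rings.go can be sketched in Python as follows. The ring names come from the commit message; the per-ring action sets and the `RingEnforcer` class are hypothetical and do not mirror the Go API.

```python
from enum import Enum


class Ring(Enum):
    ADMIN = "admin"
    STANDARD = "standard"
    RESTRICTED = "restricted"
    SANDBOXED = "sandboxed"


# Hypothetical action sets, chosen only for illustration.
RING_PERMISSIONS = {
    Ring.ADMIN: {"read", "write", "deploy"},
    Ring.STANDARD: {"read", "write"},
    Ring.RESTRICTED: {"read"},
    Ring.SANDBOXED: set(),
}


class RingEnforcer:
    """Tracks per-agent ring assignments and answers access checks."""

    def __init__(self) -> None:
        self._assignments: dict[str, Ring] = {}

    def assign(self, agent_id: str, ring: Ring) -> None:
        self._assignments[agent_id] = ring

    def is_allowed(self, agent_id: str, action: str) -> bool:
        ring = self._assignments.get(agent_id)
        if ring is None:
            # Default deny: an agent with no assignment gets no access.
            return False
        return action in RING_PERMISSIONS.get(ring, set())


enforcer = RingEnforcer()
enforcer.assign("agent-a", Ring.RESTRICTED)
enforcer.is_allowed("agent-a", "read")    # True
enforcer.is_allowed("agent-a", "write")   # False
enforcer.is_allowed("unknown", "read")    # False
```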

Co-authored-by: Copilot <[email protected]>

* feat(rust): add execution rings and lifecycle management to Rust SDK (#1067)

* feat(dotnet): add kill switch and lifecycle management to .NET SDK

- Add KillSwitch with arm/disarm, event history, and subscriber notifications
- Add LifecycleManager with 8-state machine and validated transitions
- Add comprehensive xUnit tests for both components (26 tests)
- Update .NET SDK README with usage documentation

Co-authored-by: Copilot <[email protected]>

* feat(rust): add execution rings and lifecycle management to Rust SDK

Add two new modules to the agentmesh Rust crate:

- rings.rs: Four-level execution privilege ring model (Admin/Standard/
  Restricted/Sandboxed) with per-agent assignment and per-ring action
  permissions, ported from the Python hypervisor enforcer.

- lifecycle.rs: Eight-state agent lifecycle manager (Provisioning through
  Decommissioned) with validated state transitions and event history,
  matching the lifecycle model used across other SDK languages.

Both modules include comprehensive unit tests and are re-exported from
the crate root. README updated with API tables and usage examples.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* fix: align lotl_prevention_policy.yaml with PolicyDocument schema

The policy file used an incompatible schema format (id, parameter,
regex_match, effect) instead of the expected PolicyDocument fields
(name, condition.field, operator, action). This caused the
validate-policies CI check to fail for all PRs.

Changes:
- id → name
- condition.parameter → condition.field
- operator: regex_match → operator: matches
- action at rule level (shell_exec/file_read) → action: deny
- effect: DENY → removed (redundant with action: deny)
- Added version, name, description, disclaimer at top level

Co-authored-by: Copilot <[email protected]>

* fix: resolve .NET ESRP signing issues blocking NuGet publish

GitHub Actions (publish.yml):
- Fix broken if-guards on signing steps: env.ESRP_AAD_ID was set in
  step-level env (invisible to if-expressions). Replace with job-level
  ESRP_CONFIGURED env derived from secrets.
- Add missing ESRP_CERT_IDENTIFIER to signing step env blocks.
- Gate the publish step on ESRP_CONFIGURED so unsigned packages are
  never pushed to NuGet.org under the Microsoft.* prefix.
- Make stub signing steps fail-fast (exit 1) instead of silently
  succeeding, preventing unsigned packages from reaching NuGet push.

ADO Pipeline (esrp-publish.yml):
- Add UseDotNet@2 task to Publish_NuGet stage so dotnet nuget push
  has a guaranteed SDK version on the Windows agent.

Co-authored-by: Copilot <[email protected]>

* fix(docs): fix OpenClaw sidecar demo and add limitations callout (#1163)

The docker-compose example in openclaw-sidecar.md was illustrative only
and did not work — it referenced a non-existent OpenClaw image and lacked
healthchecks. Users were hitting this and getting confused.

Changes:
- Add working demo at demo/openclaw-governed/ with docker-compose.yaml
  that builds and runs the governance sidecar from source
- Replace the inline docker-compose in the doc with a link to the demo
  plus a clearly-labeled reference template for custom deployments
- Add prominent WARNING callout listing known limitations (no native
  OpenClaw integration, no published images, explicit API required)
- Remove stale orphaned curl snippet after the docker-compose block
- Add healthcheck to docker-compose governance-sidecar service
- Fix OpenClaw image reference from ghcr.io/openclaw/openclaw:latest
  to a placeholder users must replace with their own image

Co-authored-by: Copilot <[email protected]>

* fix(docs): fix OpenClaw sidecar demo and add limitations callout (#1164)

The docker-compose example in openclaw-sidecar.md was illustrative only
and did not work — it referenced a non-existent OpenClaw image and lacked
healthchecks. Users were hitting this and getting confused.

Changes:
- Add working demo at demo/openclaw-governed/ with docker-compose.yaml
  that builds and runs the governance sidecar from source
- Replace the inline docker-compose in the doc with a link to the demo
  plus a clearly-labeled reference template for custom deployments
- Add prominent WARNING callout listing known limitations (no native
  OpenClaw integration, no published images, explicit API required)
- Remove stale orphaned curl snippet after the docker-compose block
- Add healthcheck to docker-compose governance-sidecar service
- Fix OpenClaw image reference from ghcr.io/openclaw/openclaw:latest
  to a placeholder users must replace with their own image

Co-authored-by: Copilot <[email protected]>

* fix(ci): use PME tenant ID for ESRP cert signing

The ESRP signing cert lives in the PME (Partner Managed Engineering)
tenant (975f013f), not the Microsoft corporate tenant (72f988bf).
Using the wrong tenant ID causes ESRP signing to fail when looking
up the cert.

Co-authored-by: Copilot <[email protected]>

* docs: Add Scaling AI Agents article to COMMUNITY.md (#857)

Co-authored-by: deepsearch <[email protected]>

* Add runtime evidence mode to agt verify (#969)

* Track agt verify evidence plan

* Add runtime evidence mode to agt verify

* Add runtime evidence verifier tests

* Add CLI tests for agt verify evidence mode

* Document evidence mode for compliance verification

* Remove local implementation notes

* Document agt verify evidence mode

* Harden evidence path handling in verify

---------

Co-authored-by: T. Smith <[email protected]>

* docs: add Entra Agent ID bridge tutorial with R&R matrix and DID fix

- Add Tutorial 31: Bridging AGT Identity with Microsoft Entra Agent ID
  - Detailed roles & responsibilities between AGT and Entra/Agent365
  - Architecture diagram showing the identity bridge
  - Step-by-step: DID creation, Entra binding, AKS workload identity,
    token validation, lifecycle sync, access verification
  - Known gaps and limitations table
  - Platform independence note (AWS, GCP, Okta patterns)
- Fix DID prefix in .NET MCP gateway tests (did:agentmesh → did:mesh
  for consistency with Python reference implementation and .NET SDK)
- Update tutorials README with Enterprise Identity section

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>
Co-authored-by: kevinkaylie <[email protected]>
Co-authored-by: Aymen Hmaidi <[email protected]>
Co-authored-by: harshnair75567-cloud <[email protected]>
Co-authored-by: Adamthereal <[email protected]>
Co-authored-by: Jack Batzner <[email protected]>
Co-authored-by: lawcontinue <[email protected]>
Co-authored-by: deepsearch <[email protected]>
Co-authored-by: ewmh <[email protected]>
Co-authored-by: T. Smith <[email protected]>

* docs: address external critic gaps in limitations and threat model (#11)

Add three new sections to LIMITATIONS.md addressing gaps identified in
public criticism and external security analysis:

- §10 Physical AI and Embodied Agent Governance: documents that AGT
  governs software agents not physical actuators, with mitigations
- §11 Streaming Data and Real-Time Assurance: documents that AGT
  evaluates per-action not continuously over data streams
- §12 DID Method Inconsistency Across SDKs: documents the did:mesh
  vs did:agentmesh split with migration plan for v4.0

Update THREAT_MODEL.md residual risks to reference all three new
limitation sections.

Co-authored-by: Copilot <[email protected]>

* fix!: standardize DID method to did:agentmesh across all SDKs (#12)

* fix!: standardize DID method to did:agentmesh across all SDKs

BREAKING CHANGE: All agent DIDs now use the did:agentmesh: prefix.
The legacy did:mesh: prefix used by Python and .NET has been migrated
to match the did:agentmesh: convention already used by TypeScript,
Rust, and Go SDKs.

Changes:
- Python: agent_id.py, delegation.py, entra.py, all integrations
- .NET: AgentIdentity.cs, Jwk.cs, GovernanceKernel.cs, all tests
- Docs: README, tutorials, identity docs, FAQ, compliance docs
- Tests: all test fixtures updated across Python, .NET, TS, VSCode
- Version bump: 3.1.0 → 3.2.0 (.NET, Python agent-mesh, TypeScript)

Migration: replace did:mesh: with did:agentmesh: in your policies,
identity registries, and agent configurations.
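
The migration step above amounts to a prefix rewrite. A minimal sketch for applying it to policy or configuration text (`migrate_dids` is a hypothetical helper, not part of the SDKs):

```python
import re

# Matches only the legacy prefix; "did:agentmesh:" does not contain "did:mesh:".
LEGACY_DID = re.compile(r"\bdid:mesh:")


def migrate_dids(text: str) -> str:
    """Rewrite legacy did:mesh: identifiers to the did:agentmesh: prefix."""
    return LEGACY_DID.sub("did:agentmesh:", text)


print(migrate_dids('agent: "did:mesh:abc123"'))  # agent: "did:agentmesh:abc123"
```

The rewrite is idempotent, so running it twice over the same file leaves already migrated identifiers unchanged.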

Co-authored-by: Copilot <[email protected]>

* docs: add Q11-Q13 to FAQ — AGT scope, Agent 365, and DLP comparison

Adds three new customer Q&As:
- Q11: Is AGT for Foundry agents or any agent type? (any)
- Q12: Relationship between AGT and Agent 365 (different layers)
- Q13: How is AGT different from DLP/communication compliance
  (content vs action governance)

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>
Co-authored-by: kevinkaylie <[email protected]>
Co-authored-by: Aymen Hmaidi <[email protected]>
Co-authored-by: harshnair75567-cloud <[email protected]>
Co-authored-by: Adamthereal <[email protected]>
Co-authored-by: Jack Batzner <[email protected]>
Co-authored-by: lawcontinue <[email protected]>
Co-authored-by: deepsearch <[email protected]>
Co-authored-by: ewmh <[email protected]>
Co-authored-by: T. Smith <[email protected]>
imran-siddique added a commit that referenced this pull request Apr 17, 2026
…1192)

* feat(dotnet): add MCP security namespace — completes cross-language MCP parity

* fix(ci): add path filters and concurrency; announce v3.1.0 release

CI optimization:
- Add paths-ignore for docs to 5 code-only workflows
- Add paths filter to Link Check (only run on docs changes)
- Add concurrency groups to 7 heavy workflows
- Docs-only PRs drop from ~14 checks to ~4

README:
- Add v3.1.0 release announcement callout
- Add PyPI version badge
- Update tutorial count to 31

Co-authored-by: Copilot <[email protected]>

* docs: update SOC2 mapping for resolved kill switch and DeltaEngine gaps

- Kill switch is no longer placeholder: now implements saga handoff
  with handoff_success_count tracking (kill_switch.py:69-178)
- DeltaEngine verify_chain() is no longer a stub: now performs SHA-256
  chain verification (delta.py:67-127)
- Move both from Critical/High gaps to new 'Resolved' section
- Update Processing Integrity coverage (2 of 4 defects, not 3 of 4)
- Update evidence table with current line ranges
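
For readers unfamiliar with SHA-256 chain verification, the general technique can be sketched as follows. This is not the code in delta.py; the entry layout (`prev_hash`, `payload`, `hash`), the genesis value, and the function names are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous hash together with a canonical encoding of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(chain: list, payload: dict) -> None:
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"prev_hash": prev, "payload": payload,
                  "hash": entry_hash(prev, payload)})


def verify_hash_chain(chain: list) -> bool:
    """Recompute every link; any edited payload or reordered entry fails."""
    prev = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True
```

Each entry commits to its predecessor, so changing one record invalidates it and every later link unless all subsequent hashes are recomputed.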

Co-authored-by: Copilot <[email protected]>

* feat(dotnet): add MCP security namespace with scanner, gateway, redactor, and sanitizer

Add AgentGovernance.Mcp namespace implementing full MCP security parity with
TypeScript and Rust SDKs:

- McpSecurityScanner: tool poisoning, typosquatting, hidden instructions,
  rug pull, schema abuse, cross-server attack, and description injection detection
- McpCredentialRedactor: regex-based redaction of API keys, bearer tokens,
  connection strings, and secret assignments
- McpResponseSanitizer: response scanning for prompt injection tags,
  imperative phrasing, credential leakage, and exfiltration URLs
- McpGateway: policy enforcement pipeline with deny/allow lists, payload
  sanitization, rate limiting, and human approval gates

Includes 46 xUnit tests covering all threat categories. Updates
SDK-FEATURE-MATRIX.md to flip .NET MCP Security from — to ✅.
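
Regex-based credential redaction of the kind attributed to McpCredentialRedactor can be sketched in Python as follows. The patterns and the `redact` function are illustrative assumptions and are far narrower than a production rule set; they do not reproduce the .NET implementation.

```python
import re

# Hypothetical patterns covering two of the categories named above.
PATTERNS = [
    ("bearer_token", re.compile(r"(?i)\bbearer\s+[A-Za-z0-9._~+/-]+=*")),
    ("secret_assignment",
     re.compile(r"(?i)\b(?:api[_-]?key|secret|password|token)\s*[=:]\s*\S+")),
]


def redact(text: str) -> str:
    """Replace credential-like substrings with a labeled placeholder."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[REDACTED:{label}]", text)
    return text


print(redact("Authorization: Bearer abc.def.ghi"))
# Authorization: [REDACTED:bearer_token]
```

Labeling the placeholder with the matched category keeps redacted output useful for debugging without exposing the secret itself.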

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* docs: add Entra Agent ID bridge tutorial (Tutorial 31) (#10)

* fix(pipeline): run NuGet ESRP signing on Windows agent (#1022)

The EsrpCodeSigning@5 task constructs internal paths (batchSignPolicyFile,
ciPolicyFile) using Windows-style backslashes. Running on ubuntu-latest
produced garbled mixed paths like '/home/vsts/work/1/s/src\myapp\'.

Changes:
- Add per-job pool override: PublishNuGet runs on windows-latest
- Convert FolderPath and all shell commands to Windows paths
- Replace bash scripts with PowerShell for the Windows agent
- PyPI and npm stages remain on ubuntu-latest (unchanged)
- Add comment to delete orphaned ESRP_DOMAIN_TENANT_ID ADO variable

Co-authored-by: Copilot <[email protected]>

* docs: reland empty-merge changes from PRs #1017 and #1020 (#1125)

PRs #1017 and #1020 were squash-merged as empty commits (0 file
changes). This commit re-applies the intended documentation updates.

From PR #1017 (critic gaps):
- LIMITATIONS.md: add sections 7 (knowledge governance gap), 8
  (credential persistence gap), 9 (initialization bypass risk)
- LIMITATIONS.md: add knowledge governance and enforcement infra
  rows to 'What AGT Is Not' table
- THREAT_MODEL.md: add knowledge flow and credential persistence
  to residual risks, add configuration bypass vectors table,
  remove stale '10/10' qualifier

From PR #1020 (SOC2 resolved gaps):
- soc2-mapping.md: mark kill switch as resolved (saga handoff
  implemented in kill_switch.py:69-178)
- soc2-mapping.md: mark DeltaEngine verify_chain() as resolved
  (SHA-256 chain verification in delta.py:67-127)
- soc2-mapping.md: add Resolved section to gaps summary, update
  Processing Integrity to 2 of 4 defects (was 3 of 4)

Co-authored-by: Copilot <[email protected]>

* feat(dotnet): add MCP security namespace — completes cross-language MCP parity (#1021)

* fix(ci): add path filters and concurrency; announce v3.1.0 release

CI optimization:
- Add paths-ignore for docs to 5 code-only workflows
- Add paths filter to Link Check (only run on docs changes)
- Add concurrency groups to 7 heavy workflows
- Docs-only PRs drop from ~14 checks to ~4

README:
- Add v3.1.0 release announcement callout
- Add PyPI version badge
- Update tutorial count to 31

* docs: update SOC2 mapping for resolved kill switch and DeltaEngine gaps

- Kill switch is no longer placeholder: now implements saga handoff
  with handoff_success_count tracking (kill_switch.py:69-178)
- DeltaEngine verify_chain() is no longer a stub: now performs SHA-256
  chain verification (delta.py:67-127)
- Move both from Critical/High gaps to new 'Resolved' section
- Update Processing Integrity coverage (2 of 4 defects, not 3 of 4)
- Update evidence table with current line ranges

* feat(dotnet): add MCP security namespace with scanner, gateway, redactor, and sanitizer

Add AgentGovernance.Mcp namespace implementing full MCP security parity with
TypeScript and Rust SDKs:

- McpSecurityScanner: tool poisoning, typosquatting, hidden instructions,
  rug pull, schema abuse, cross-server attack, and description injection detection
- McpCredentialRedactor: regex-based redaction of API keys, bearer tokens,
  connection strings, and secret assignments
- McpResponseSanitizer: response scanning for prompt injection tags,
  imperative phrasing, credential leakage, and exfiltration URLs
- McpGateway: policy enforcement pipeline with deny/allow lists, payload
  sanitization, rate limiting, and human approval gates

Includes 46 xUnit tests covering all threat categories. Updates
SDK-FEATURE-MATRIX.md to flip .NET MCP Security from — to ✅.

---------

Co-authored-by: Copilot <[email protected]>

* docs: address external critic gaps (#1025)

* feat(dotnet): add kill switch and lifecycle management to .NET SDK (#5)

- Add KillSwitch with arm/disarm, event history, and subscriber notifications
- Add LifecycleManager with 8-state machine and validated transitions
- Add 26 xUnit tests
- Update README

Co-authored-by: Copilot <[email protected]>

* feat(rust): add execution rings and lifecycle management to Rust SDK (#6)

* feat(dotnet): add kill switch and lifecycle management to .NET SDK

- Add KillSwitch with arm/disarm, event history, and subscriber notifications
- Add LifecycleManager with 8-state machine and validated transitions
- Add comprehensive xUnit tests for both components (26 tests)
- Update .NET SDK README with usage documentation

Co-authored-by: Copilot <[email protected]>

* feat(rust): add execution rings and lifecycle management to Rust SDK

Add two new modules to the agentmesh Rust crate:

- rings.rs: Four-level execution privilege ring model (Admin/Standard/
  Restricted/Sandboxed) with per-agent assignment and per-ring action
  permissions, ported from the Python hypervisor enforcer.

- lifecycle.rs: Eight-state agent lifecycle manager (Provisioning through
  Decommissioned) with validated state transitions and event history,
  matching the lifecycle model used across other SDK languages.

Both modules include comprehensive unit tests and are re-exported from
the crate root. README updated with API tables and usage examples.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* feat(go): add MCP security, execution rings, and lifecycle management to Go SDK (#7)

* feat(openshell): add governance skill package and runnable example (#942)

Co-authored-by: Copilot <[email protected]>

* feat(go): add MCP security, execution rings, and lifecycle management to Go SDK

- mcp.go: MCP security scanner detecting tool poisoning, typosquatting,
  hidden instructions (zero-width chars, homoglyphs), and rug pulls
- rings.go: Execution privilege ring model (Admin/Standard/Restricted/Sandboxed)
  with default-deny access control
- lifecycle.go: Eight-state agent lifecycle manager with validated transitions
- Full test coverage for all three modules
- Updated README with API docs and examples

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* docs: sync audit redaction status and framing with current code (#8)

* feat(openshell): add governance skill package and runnable example (#942)

Co-authored-by: Copilot <[email protected]>

* feat(typescript): add MCP security scanner and lifecycle management to TS SDK (#947)

Co-authored-by: Copilot <[email protected]>

* docs: update SDK feature matrix after parity pass (#950)

Reflects new capabilities added in PRs #947 (TS), .NET, Rust, Go:
- TypeScript: MCP security scanner + lifecycle management (was 5/14, now 7/14)
- .NET: Kill switch + lifecycle management (was 8/14, now 10/14)
- Rust: Execution rings + lifecycle management (was 6/14, now 8/14)
- Go: MCP security + rings + lifecycle (was 4/14, now 7/14)

All SDKs now have lifecycle management. Core governance (policy, identity,
trust, audit) + lifecycle = 5 primitives shared across all 5 languages.

Co-authored-by: Copilot <[email protected]>

* docs: add LIMITATIONS.md - honest design boundaries and layered defense (#953)

Addresses valid external critique of AGT's architectural blind spots:

1. Action vs Intent: AGT governs individual actions, not reasoning or
   action sequences. Documents the compound-action gap explicitly and
   recommends content policies + model safety layers.

2. Audit logs record attempts, not outcomes: Documents that post-action
   state verification is the user's responsibility today, with hooks planned.

3. Performance honesty: README now notes that <0.1ms is policy-eval only;
   distributed mesh adds 5-50ms. Full breakdown in LIMITATIONS.md.

4. Complexity spectrum: Documents the minimal path (just PolicyEvaluator,
   no mesh/crypto) vs full enterprise stack.

5. Vendor independence: Documents zero cloud dependencies in core,
   standard formats for all state, migration path.

6. Recommended layered defense architecture diagram showing AGT as one
   layer alongside model safety, application logic, and infrastructure.

Co-authored-by: Copilot <[email protected]>

* fix(docs): rewrite OpenClaw sidecar deployment with working K8s manifests (#954)

Closes #952

Co-authored-by: Copilot <[email protected]>

* feat: reversibility checker, trust calibration guide, escalation tests (#955)

ReversibilityChecker with 4 levels and compensation plans. Trust score calibration guide with weights, decay, thresholds. 19 tests.

Co-authored-by: Copilot <[email protected]>

* feat: AGT Lite — zero-config governance in 3 lines + fix broken quickstart (#956)

agent_os.lite: govern() factory, sub-ms enforcement, 16 tests. Fixed quickstart that called nonexistent add_rules().

Co-authored-by: Copilot <[email protected]>

* fix: bump all runtime versions to 3.1.0 and fix CI lint/test failures (#957)

- Bump __version__ in 29 Python __init__.py files from 3.0.2 to 3.1.0
- Bump version= in 6 setup.py files from 3.0.2 to 3.1.0
- Bump meter version strings in _mcp_metrics.py
- Bump 9 package.json files from 3.0.2 to 3.1.0
- Bump .NET csproj Version from 3.0.2 to 3.1.0
- Bump Rust workspace Cargo.toml from 3.0.2 to 3.1.0
- Create Go sdk doc.go with version marker 3.1.0
- Fix ruff W292 (missing newline at EOF) in data_classification.py
- Fix CLI init regex to allow dots in agent names (test_init_special_characters)

Co-authored-by: Copilot <[email protected]>

* fix(openclaw): critical honesty pass — document what works vs what's planned (#958)

Co-authored-by: Copilot <[email protected]>

* fix(ci): fix Rust crate packaging - use workspace root with -p agentmesh (#959)

* fix(openclaw): critical honesty pass — document what works vs what's planned

Server (__main__.py):
- Add --host/--port argparse + env var support (was hardcoded 127.0.0.1:8080)

Dockerfile.sidecar:
- Copy modules/ directory (was missing, causing build failure)
- Use 0.0.0.0 for container binding (127.0.0.1 is wrong inside containers)
- Remove phantom port 9091 (no separate metrics listener exists)

openclaw-sidecar.md — full honesty rewrite:
- Add status banner: transparent interception is NOT yet implemented
- Document actual sidecar API endpoints (health, detect/injection, execute, metrics)
- Fix Docker Compose to use Dockerfile.sidecar (was using wrong Dockerfile)
- Remove GOVERNANCE_PROXY claim (OpenClaw doesn't natively read this)
- Replace fictional SLO/Grafana sections with real /api/v1/metrics docs
- Add Roadmap section listing what's planned vs shipped

openshell.md:
- Remove references to non-existent shell scripts
- Fix python -m agentmesh.server to python -m agent_os.server
- Add note that sidecar doesn't transparently intercept (must call API)
- Replace pip install agentmesh-platform with Python skill library usage

Co-authored-by: Copilot <[email protected]>

* fix(ci): fix Rust crate packaging — use workspace root with -p agentmesh

cargo package in a workspace writes .crate files to the workspace root's
target/package/, not the individual crate's directory. The pipeline was
running from the crate subdirectory and couldn't find the output.

Fix: change workingDirectory from packages/agent-mesh/sdks/rust/agentmesh
to packages/agent-mesh/sdks/rust (workspace root) and add -p agentmesh
to all cargo commands to target the specific crate.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* docs(adr): ADR 0005 — Liveness attestation extension for TrustHandshake (#948)

Proposes liveness attestation as opt-in gate for TrustHandshake. Addresses ghost-agent and ungraceful-handoff gaps from #772.

Co-authored-by: kevinkaylie <[email protected]>

* blog: MCP Security — Why Your AI Agent Tool Calls Need a Firewall (#899)

Co-authored-by: aymenhmaidiwastaken <[email protected]>

* feat: add LotL prevention policy for security measures (#949)

YAML policy template for Living-off-the-Land detection and prevention.

* feat(examples): add ATR community security rules for PolicyEvaluator (#908)

15 curated ATR detection rules + sync script. Closes #901.

* fix(docs): correct npm package name and stale version refs across 21 files (#960)

- Fix @agentmesh/sdk → @microsoft/agentmesh-sdk in 13 markdown files
  (README, QUICKSTART, tutorials, SDK docs, i18n, changelog)
- Fix broken demo path in agent-os README (agent-os/demo.py → demo/maf_governance_demo.py)
- Remove stale v1.0.0 labels from extension status table
- Bump AGT Version refs 3.0.2 → 3.1.0 in case study templates and
  ATF conformance assessment

Co-authored-by: Copilot <[email protected]>

* fix(ci): use ESRP Release for NuGet signing (#961)

Co-authored-by: Copilot <[email protected]>

* fix(ci): correct ESRP NuGet contenttype casing (#962)

* fix(ci): add missing packages to ESRP pipeline and fix Go version tag

Three gaps found during publish verification:

1. PyPI: add agentmesh-marketplace (8th package, was missing from matrix)
2. Rust: build+publish both workspace crates (agentmesh + agentmesh-mcp)
   - Changed from single-crate to workspace build (--workspace)
   - Package loop builds both .crate files
   - Renamed artifact from 'rust-agentmesh' to 'rust-crates'
3. Go: add 'v' prefix to version in doc.go (3.1.0 → v3.1.0)
   - Go module tags require semver with v prefix
   - Pipeline grep expects '// Version: v...' format

Co-authored-by: Copilot <[email protected]>

* fix(ci): correct ESRP NuGet contenttype casing — 'NuGet' not 'Nuget'

ESRP Release rejected 'Nuget' with: 'The value provided for
ReleaseContentType property is invalid.' ErrorCode 2254.

ESRP content types are case-sensitive. Fix: 'Nuget' -> 'NuGet'.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* fix(ci): add missing packages to ESRP pipeline and fix Go version tag (#963)

* fix(ci): add missing packages to ESRP pipeline and fix Go version tag

Three gaps found during publish verification:

1. PyPI: add agentmesh-marketplace (8th package, was missing from matrix)
2. Rust: build+publish both workspace crates (agentmesh + agentmesh-mcp)
   - Changed from single-crate to workspace build (--workspace)
   - Package loop builds both .crate files
   - Renamed artifact from 'rust-agentmesh' to 'rust-crates'
3. Go: add 'v' prefix to version in doc.go (3.1.0 → v3.1.0)
   - Go module tags require semver with v prefix
   - Pipeline grep expects '// Version: v...' format

Co-authored-by: Copilot <[email protected]>

* fix(ci): correct ESRP NuGet contenttype casing — 'NuGet' not 'Nuget'

ESRP Release rejected 'Nuget' with: 'The value provided for
ReleaseContentType property is invalid.' ErrorCode 2254.

ESRP content types are case-sensitive. Fix: 'Nuget' -> 'NuGet'.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* fix(ci): use EsrpCodeSigning + dotnet push for NuGet (#965)

EsrpRelease@11 does not support NuGet as a contenttype — it's for
PyPI/npm/Maven/crates.io package distribution. NuGet packages must be
signed with EsrpCodeSigning@5 first, then pushed with dotnet nuget push.

New flow:
1. EsrpCodeSigning@5 with NuGetSign + NuGetVerify operations (CP-401405)
2. dotnet nuget push with the signed .nupkg to nuget.org

This matches the standard Microsoft NuGet ESRP signing pattern used by
azure-sdk, dotnet runtime, and other Microsoft OSS projects.

Co-authored-by: Copilot <[email protected]>

* fix(security): upgrade axios to 1.15.0 - CVE-2026-40175, CVE-2025-62718 (#966)

Critical S360 action items for SFI-ES5.2 1ES Open Source Vulnerabilities.

CVE-2026-40175 (CVSS 9.9): Unrestricted Cloud Metadata Exfiltration
via Header Injection Chain — prototype pollution gadget enables CRLF
injection in HTTP headers, bypassing AWS IMDSv2 session tokens.

CVE-2025-62718: NO_PROXY Bypass via Hostname Normalization — trailing
dots and IPv6 literals skip NO_PROXY matching, enabling SSRF through
attacker-controlled proxy.

Upgraded in 3 packages:
- extensions/copilot: 1.14.0 → 1.15.0
- extensions/cursor:  1.13.5 → 1.15.0
- agent-os-vscode:    1.13.6 → 1.15.0

Co-authored-by: Copilot <[email protected]>

* fix(ci): resolve ESRP_DOMAIN_TENANT_ID cyclical reference (#967)

The ADO variable ESRP_DOMAIN_TENANT_ID had a cyclical self-reference,
preventing ESRP authentication across ALL publishing stages (PyPI, npm,
NuGet, crates.io).

Fix: Define MICROSOFT_TENANT_ID as a pipeline-level variable with the
well-known Microsoft corporate tenant ID (72f988bf-..., same default
used by ESRP Release action.yml). This is a public value, not a secret.

Also: NuGet publishing requires Microsoft as co-owner of the package
on NuGet.org. See https://aka.ms/Microsoft-NuGet-Compliance

Co-authored-by: Copilot <[email protected]>

* docs: sync audit redaction status and framing with current code

- Update SOC2 mapping to reflect CredentialRedactor now redacts
  credential-like secrets before audit persistence (API keys, tokens,
  JWTs, connection strings, etc.). Remaining gap: non-credential PII
  (email, phone, addresses) not yet redacted in audit entries.
- Replace 'kernel-level enforcement' with 'policy-layer enforcement'
  in README, OWASP compliance, and architecture overview to match the
  existing 'application-level governance' framing in README Security
  section and LIMITATIONS.md.
- Qualify 10/10 OWASP coverage claim in COMPARISON.md with footnote
  clarifying this means mitigation components exist per risk category,
  not full elimination.
- Update owasp-llm-top10-mapping.md LLM06 row for credential redaction.

Addresses doc/code inconsistencies identified in external review.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>
Co-authored-by: kevinkaylie <[email protected]>
Co-authored-by: Aymen Hmaidi <[email protected]>
Co-authored-by: harshnair75567-cloud <[email protected]>
Co-authored-by: Adamthereal <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>
Co-authored-by: kevinkaylie <[email protected]>
Co-authored-by: Aymen Hmaidi <[email protected]>
Co-authored-by: harshnair75567-cloud <[email protected]>
Co-authored-by: Adamthereal <[email protected]>

* fix(lint): resolve agent-mesh lint errors in eu_ai_act.py (#1028)

- Remove unused variable profiling_override (F841)
- Remove f-string without placeholders (F541)
- Fix whitespace in docstrings (W293)

Co-authored-by: Copilot <[email protected]>

* fix(ci): add path filters and concurrency; announce v3.1.0 release (#1039)

CI optimization:
- Add paths-ignore for docs to 5 code-only workflows
- Add paths filter to Link Check (only run on docs changes)
- Add concurrency groups to 7 heavy workflows
- Docs-only PRs drop from ~14 checks to ~4

README:
- Add v3.1.0 release announcement callout
- Add PyPI version badge
- Update tutorial count to 31

Co-authored-by: Copilot <[email protected]>

* docs: add ADOPTERS.md and make deployment guides multi-cloud (#1040)

- New ADOPTERS.md following Backstage/Flatcar pattern with Production,
  Evaluation, and Academic tables + instructions for adding your org
- Rewrite docs/deployment/README.md from Azure-only to multi-cloud:
  Azure (AKS, Foundry, Container Apps), AWS (ECS/Fargate), GCP (GKE),
  Docker Compose, self-hosted. Updated architecture diagram to show
  cloud-agnostic deployment patterns.
- Fix broken AWS/GCP links (pointed to non-existent paths)
- README now links to 'Deployment Guides' (multi-cloud) instead of
  'Azure Deployment'
- README Contributing section invites adopters to add their org

Co-authored-by: Copilot <[email protected]>

* feat: add AGT Lite — zero-config governance in 3 lines, fix broken quickstart (#1044)

Addresses the #1 developer experience criticism: AGT is too complex to start.

New: agent_os.lite — lightweight governance module
- govern() factory: one line to create a governance gate
- check(action): one line to enforce — raises GovernanceViolation or returns True
- check.is_allowed(action): non-raising bool version
- Allow lists, deny lists, regex patterns, content filtering, rate limiting
- Built-in audit trail and stats
- Sub-millisecond evaluation (0.003ms avg, 1000 evals in <100ms)
- Zero dependencies beyond stdlib (re, time, datetime)
- 16 tests passing

Fix: govern_in_60_seconds.py quickstart
- BROKEN: was calling PolicyEvaluator.add_rules() which does not exist
- FIXED: now uses agent_os.lite.govern() which actually works
- Verified end-to-end: script runs and produces correct output

The lite module is for developers who just want basic governance
without learning PolicyEvaluator, YAML, OPA/Rego, trust mesh, etc.
Upgrade to the full stack when you need it.

Co-authored-by: Copilot <[email protected]>

* feat(ci): enhance weekly security audit with 7 new scan jobs (#1051)

Add comprehensive security checks based on issues found during
the MSRC-111178 security audit and ongoing post-merge reviews:

- Workflow security regression (MSRC-111178 pull_request_target check)
- Expression injection scan (github.event.* in run: blocks)
- Docker security (root containers, wildcard CORS, hardcoded passwords,
  0.0.0.0 bindings)
- XSS and unsafe DOM (innerHTML, eval, yaml.load, shell=True)
- Action SHA pinning compliance
- Version pinning (pyproject.toml upper bounds, Docker :latest tags,
  license field format)
- Dependency confusion with --strict mode (pyproject.toml + package.json)
- Retention days updated to 180 (EU AI Act Art. 26(6))

Co-authored-by: Copilot <[email protected]>

* fix(ci): fix OpenShell integration CI — spelling, link check, policy validation (#1057)

- Add OpenShell/NVIDIA terms to cspell dictionary (Landlock, seccomp, syscall, etc.)
- Fix broken link: openclaw-skill -> openshell-skill in docs/integrations/openshell.md
- Fix policy validation: replace starts_with (invalid) with matches + regex

Co-authored-by: Copilot <[email protected]>

* feat: add reversibility checker, trust calibration guide, and escalation/reversibility tests (#1061)

Addresses critical review feedback:

1. Rollback/reversibility (agent_os.reversibility)
   - ReversibilityChecker: pre-execution assessment of action reversibility
   - 4 levels: fully_reversible, partially_reversible, irreversible, unknown
   - CompensatingAction: structured undo plans for each action type
   - Built-in rules for 12 common actions (write, deploy, delete, email, etc.)
   - block_irreversible mode for strict environments

2. Trust score calibration guide (docs/security/trust-score-calibration.md)
   - Score component weights (compliance 35%, task 25%, behavior 25%, identity 15%)
   - Decay functions with tier floors
   - Initial score assignments by agent origin
   - Threshold recommendations (conservative/moderate/permissive)
   - Anti-gaming measures and operational playbook

3. Tests: 19 passing (10 escalation + 9 reversibility)

Co-authored-by: Copilot <[email protected]>
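
The component weights listed in item 2 can be read as a weighted sum. The sketch below is illustrative only: it assumes each component score is normalized to the range 0.0 to 1.0, and the function name composite_trust_score is hypothetical, not the toolkit's API. Decay functions and tier floors are not modeled.

```python
# Weights taken from the calibration guide summary above.
WEIGHTS = {"compliance": 0.35, "task": 0.25, "behavior": 0.25, "identity": 0.15}


def composite_trust_score(components: dict) -> float:
    """Weighted sum of component scores, each assumed to lie in [0.0, 1.0]."""
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0.0 <= components[name] <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0")
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


if __name__ == "__main__":
    example = {"compliance": 0.9, "task": 0.8, "behavior": 0.7, "identity": 1.0}
    print(round(composite_trust_score(example), 4))
```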

* feat: deployment runtime (Docker/AKS) and shared trust core types (#1062)

agent-runtime: Evolve from thin re-export shim to deployment runtime
- DockerDeployer: container deployment with security hardening
  (cap-drop ALL, no-new-privileges, read-only rootfs)
- KubernetesDeployer: AKS pod deployment with governance sidecars
  (runAsNonRoot, seccompProfile, resource limits)
- GovernanceConfig: policy/trust/audit config injected as env vars
- DeploymentTarget protocol for extensibility (ADC, nono, etc.)
- 24 tests (all subprocess calls mocked)

agent-mesh: Extract shared trust types into agentmesh.trust_types
- TrustScore, AgentProfile, TrustRecord, TrustTracker
- Canonical implementations replacing ~800 lines of duplicated code
  across 6+ integration packages
- 25 tests covering clamping, scoring, history, capabilities

Co-authored-by: Copilot <[email protected]>

* feat(dotnet): add kill switch and lifecycle management to .NET SDK (#1065)

- Add KillSwitch with arm/disarm, event history, and subscriber notifications
- Add LifecycleManager with 8-state machine and validated transitions
- Add comprehensive xUnit tests for both components (26 tests)
- Update .NET SDK README with usage documentation

Co-authored-by: Copilot <[email protected]>

* feat(go): add MCP security, execution rings, and lifecycle management to Go SDK (#1066)

- mcp.go: MCP security scanner detecting tool poisoning, typosquatting,
  hidden instructions (zero-width chars, homoglyphs), and rug pulls
- rings.go: Execution privilege ring model (Admin/Standard/Restricted/Sandboxed)
  with default-deny access control
- lifecycle.go: Eight-state agent lifecycle manager with validated transitions
- Full test coverage for all three modules
- Updated README with API docs and examples

Co-authored-by: Copilot <[email protected]>
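
One of the hidden-instruction checks named above, zero-width characters, can be sketched in a few lines. This is an illustration in Python rather than the Go implementation in mcp.go; the character set and the helper name find_zero_width are assumptions, and homoglyph detection is not covered.

```python
# A small, non-exhaustive set of invisible code points (assumed for illustration).
ZERO_WIDTH = {
    "\u200b": "ZERO WIDTH SPACE",
    "\u200c": "ZERO WIDTH NON-JOINER",
    "\u200d": "ZERO WIDTH JOINER",
    "\u2060": "WORD JOINER",
    "\ufeff": "ZERO WIDTH NO-BREAK SPACE",
}


def find_zero_width(text: str) -> list:
    """Return (index, name) pairs for each zero-width character in text."""
    return [(i, ZERO_WIDTH[ch]) for i, ch in enumerate(text) if ch in ZERO_WIDTH]


if __name__ == "__main__":
    description = "Reads a file\u200b and returns its contents"
    print(find_zero_width(description))
```

A scanner could flag any tool description for which this list is non-empty.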

* feat(rust): add execution rings and lifecycle management to Rust SDK (#1067)

* feat(dotnet): add kill switch and lifecycle management to .NET SDK

- Add KillSwitch with arm/disarm, event history, and subscriber notifications
- Add LifecycleManager with 8-state machine and validated transitions
- Add comprehensive xUnit tests for both components (26 tests)
- Update .NET SDK README with usage documentation

Co-authored-by: Copilot <[email protected]>

* feat(rust): add execution rings and lifecycle management to Rust SDK

Add two new modules to the agentmesh Rust crate:

- rings.rs: Four-level execution privilege ring model (Admin/Standard/
  Restricted/Sandboxed) with per-agent assignment and per-ring action
  permissions, ported from the Python hypervisor enforcer.

- lifecycle.rs: Eight-state agent lifecycle manager (Provisioning through
  Decommissioned) with validated state transitions and event history,
  matching the lifecycle model used across other SDK languages.

Both modules include comprehensive unit tests and are re-exported from
the crate root. README updated with API tables and usage examples.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* fix: align lotl_prevention_policy.yaml with PolicyDocument schema

The policy file used an incompatible schema format (id, parameter,
regex_match, effect) instead of the expected PolicyDocument fields
(name, condition.field, operator, action). This caused the
validate-policies CI check to fail for all PRs.

Changes:
- id → name
- condition.parameter → condition.field
- operator: regex_match → operator: matches
- action at rule level (shell_exec/file_read) → action: deny
- effect: DENY → removed (redundant with action: deny)
- Added version, name, description, disclaimer at top level

Co-authored-by: Copilot <[email protected]>
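
The field renames listed above can be expressed as a small conversion function. This is a sketch under stated assumptions: the rule is a plain dict, the operator sits inside the condition mapping, and the helper name migrate_rule is hypothetical. It is not the PolicyDocument loader, and the top-level additions (version, name, description, disclaimer) are not modeled.

```python
# Operator rename from the change list above.
OPERATOR_RENAMES = {"regex_match": "matches"}


def migrate_rule(legacy: dict) -> dict:
    """Convert one legacy-format rule into the field names described above."""
    if str(legacy.get("effect", "")).upper() != "DENY":
        raise ValueError("only rules with effect: DENY are handled in this sketch")
    condition = dict(legacy.get("condition", {}))  # copy; leave the input untouched
    if "parameter" in condition:
        condition["field"] = condition.pop("parameter")
    operator = condition.get("operator")
    if operator in OPERATOR_RENAMES:
        condition["operator"] = OPERATOR_RENAMES[operator]
    # The old rule-level action (e.g. shell_exec) and the effect key are dropped;
    # the new action carries the decision.
    return {"name": legacy["id"], "condition": condition, "action": "deny"}


if __name__ == "__main__":
    old_rule = {
        "id": "block-example",
        "action": "shell_exec",
        "effect": "DENY",
        "condition": {"parameter": "command", "operator": "regex_match", "value": "example"},
    }
    print(migrate_rule(old_rule))
```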

* fix: resolve .NET ESRP signing issues blocking NuGet publish

GitHub Actions (publish.yml):
- Fix broken if-guards on signing steps: env.ESRP_AAD_ID was set in
  step-level env (invisible to if-expressions). Replace with job-level
  ESRP_CONFIGURED env derived from secrets.
- Add missing ESRP_CERT_IDENTIFIER to signing step env blocks.
- Gate the publish step on ESRP_CONFIGURED so unsigned packages are
  never pushed to NuGet.org under the Microsoft.* prefix.
- Make stub signing steps fail-fast (exit 1) instead of silently
  succeeding, preventing unsigned packages from reaching NuGet push.

ADO Pipeline (esrp-publish.yml):
- Add UseDotNet@2 task to Publish_NuGet stage so dotnet nuget push
  has a guaranteed SDK version on the Windows agent.

Co-authored-by: Copilot <[email protected]>

* fix(docs): fix OpenClaw sidecar demo and add limitations callout (#1163)

The docker-compose example in openclaw-sidecar.md was illustrative only
and did not work — it referenced a non-existent OpenClaw image and lacked
healthchecks. Users were hitting this and getting confused.

Changes:
- Add working demo at demo/openclaw-governed/ with docker-compose.yaml
  that builds and runs the governance sidecar from source
- Replace the inline docker-compose in the doc with a link to the demo
  plus a clearly-labeled reference template for custom deployments
- Add prominent WARNING callout listing known limitations (no native
  OpenClaw integration, no published images, explicit API required)
- Remove stale orphaned curl snippet after the docker-compose block
- Add healthcheck to docker-compose governance-sidecar service
- Fix OpenClaw image reference from ghcr.io/openclaw/openclaw:latest
  to a placeholder users must replace with their own image

Co-authored-by: Copilot <[email protected]>

* fix(docs): fix OpenClaw sidecar demo and add limitations callout (#1164)

The docker-compose example in openclaw-sidecar.md was illustrative only
and did not work — it referenced a non-existent OpenClaw image and lacked
healthchecks. Users were hitting this and getting confused.

Changes:
- Add working demo at demo/openclaw-governed/ with docker-compose.yaml
  that builds and runs the governance sidecar from source
- Replace the inline docker-compose in the doc with a link to the demo
  plus a clearly-labeled reference template for custom deployments
- Add prominent WARNING callout listing known limitations (no native
  OpenClaw integration, no published images, explicit API required)
- Remove stale orphaned curl snippet after the docker-compose block
- Add healthcheck to docker-compose governance-sidecar service
- Fix OpenClaw image reference from ghcr.io/openclaw/openclaw:latest
  to a placeholder users must replace with their own image

Co-authored-by: Copilot <[email protected]>

* fix(ci): use PME tenant ID for ESRP cert signing

The ESRP signing cert lives in the PME (Partner Managed Engineering)
tenant (975f013f), not the Microsoft corporate tenant (72f988bf).
Using the wrong tenant ID causes ESRP signing to fail when looking
up the cert.

Co-authored-by: Copilot <[email protected]>

* docs: Add Scaling AI Agents article to COMMUNITY.md (#857)

Co-authored-by: deepsearch <[email protected]>

* Add runtime evidence mode to agt verify (#969)

* Track agt verify evidence plan

* Add runtime evidence mode to agt verify

* Add runtime evidence verifier tests

* Add CLI tests for agt verify evidence mode

* Document evidence mode for compliance verification

* Remove local implementation notes

* Document agt verify evidence mode

* Harden evidence path handling in verify

---------

Co-authored-by: T. Smith <[email protected]>
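
The strict evidence check described in this PR can be pictured roughly as follows. This is a hedged sketch, not the verifier in verify.py: the manifest key names below are hypothetical stand-ins for the six evidence areas listed in the PR description, and the real schema may differ.

```python
# Hypothetical manifest keys, one per evidence area named in the PR description.
EVIDENCE_SECTIONS = (
    "policy_files",      # loaded policy files
    "deny_semantics",    # deny rule or deny-by-default semantics
    "registered_tools",  # registered tools
    "audit_sink",        # audit sink configuration
    "identity",          # identity state
    "packages",          # package/version manifest
)


def find_evidence_gaps(manifest: dict) -> list:
    """Return the sections that are absent or empty in the manifest."""
    return [key for key in EVIDENCE_SECTIONS if not manifest.get(key)]


def exit_code(manifest: dict, strict: bool) -> int:
    """Non-zero only when strict mode is on and evidence is missing."""
    return 1 if strict and find_evidence_gaps(manifest) else 0


if __name__ == "__main__":
    partial = {"policy_files": ["policies/default.yaml"], "audit_sink": {"type": "file"}}
    print(find_evidence_gaps(partial))
    print(exit_code(partial, strict=True))
```

Without strict mode the same gaps would be reported but would not change the exit status in this sketch.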

* docs: add Entra Agent ID bridge tutorial with R&R matrix and DID fix

- Add Tutorial 31: Bridging AGT Identity with Microsoft Entra Agent ID
  - Detailed roles & responsibilities between AGT and Entra/Agent365
  - Architecture diagram showing the identity bridge
  - Step-by-step: DID creation, Entra binding, AKS workload identity,
    token validation, lifecycle sync, access verification
  - Known gaps and limitations table
  - Platform independence note (AWS, GCP, Okta patterns)
- Fix DID prefix in .NET MCP gateway tests (did:agentmesh → did:mesh
  for consistency with Python reference implementation and .NET SDK)
- Update tutorials README with Enterprise Identity section

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>
Co-authored-by: kevinkaylie <[email protected]>
Co-authored-by: Aymen Hmaidi <[email protected]>
Co-authored-by: harshnair75567-cloud <[email protected]>
Co-authored-by: Adamthereal <[email protected]>
Co-authored-by: Jack Batzner <[email protected]>
Co-authored-by: lawcontinue <[email protected]>
Co-authored-by: deepsearch <[email protected]>
Co-authored-by: ewmh <[email protected]>
Co-authored-by: T. Smith <[email protected]>

* docs: address external critic gaps in limitations and threat model (#11)

Add three new sections to LIMITATIONS.md addressing gaps identified in
public criticism and external security analysis:

- §10 Physical AI and Embodied Agent Governance: documents that AGT
  governs software agents not physical actuators, with mitigations
- §11 Streaming Data and Real-Time Assurance: documents that AGT
  evaluates per-action not continuously over data streams
- §12 DID Method Inconsistency Across SDKs: documents the did:mesh
  vs did:agentmesh split with migration plan for v4.0

Update THREAT_MODEL.md residual risks to reference all three new
limitation sections.

Co-authored-by: Copilot <[email protected]>

* fix!: standardize DID method to did:agentmesh across all SDKs (#12)

* fix!: standardize DID method to did:agentmesh across all SDKs

BREAKING CHANGE: All agent DIDs now use the did:agentmesh: prefix.
The legacy did:mesh: prefix used by Python and .NET has been migrated
to match the did:agentmesh: convention already used by TypeScript,
Rust, and Go SDKs.

Changes:
- Python: agent_id.py, delegation.py, entra.py, all integrations
- .NET: AgentIdentity.cs, Jwk.cs, GovernanceKernel.cs, all tests
- Docs: README, tutorials, identity docs, FAQ, compliance docs
- Tests: all test fixtures updated across Python, .NET, TS, VSCode
- Version bump: 3.1.0 → 3.2.0 (.NET, Python agent-mesh, TypeScript)

Migration: replace did:mesh: with did:agentmesh: in your policies,
identity registries, and agent configurations.

Co-authored-by: Copilot <[email protected]>
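
For the migration step above, a prefix rewrite is enough for individual identifiers. A minimal sketch, assuming DIDs are handled as plain strings; the helper name migrate_did is hypothetical and not part of any SDK.

```python
LEGACY_PREFIX = "did:mesh:"
CURRENT_PREFIX = "did:agentmesh:"


def migrate_did(did: str) -> str:
    """Rewrite a legacy did:mesh: identifier; leave everything else unchanged."""
    if did.startswith(LEGACY_PREFIX):
        return CURRENT_PREFIX + did[len(LEGACY_PREFIX):]
    return did


if __name__ == "__main__":
    for value in ("did:mesh:agent-1", "did:agentmesh:agent-2", "did:web:example.com"):
        print(value, "->", migrate_did(value))
```

The function is idempotent, so it is safe to run over registries that are already partly migrated.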

* docs: add Q11-Q13 to FAQ — AGT scope, Agent 365, and DLP comparison

Adds three new customer Q&As:
- Q11: Is AGT for Foundry agents or any agent type? (any)
- Q12: Relationship between AGT and Agent 365 (different layers)
- Q13: How is AGT different from DLP/communication compliance
  (content vs action governance)

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* fix(ci): fix lint errors and remove pi-mono breaking dep scan

Two CI failures on main:

1. lint (agent-compliance): W293/W292 trailing whitespace and missing
   newlines in agt.py and verify.py — fixed.

2. dependency-scan: pi-mono-agentmesh references unregistered npm
   packages — removed entire pi-mono integration that was merged
   from draft PR #970 without proper review.

Co-authored-by: Copilot <[email protected]>

* feat: add AgentMesh component container images and GHCR publishing

Add FastAPI server entrypoints for all four AgentMesh components:
- trust-engine (port 8443): Agent identity verification, IATP handshakes
- policy-server (port 8444): Governance policy evaluation from YAML/JSON
- audit-collector (port 8445): Merkle-chained audit logging with persistence
- api-gateway (port 8446): Reverse proxy with per-agent rate limiting

Infrastructure:
- Single Dockerfile with COMPONENT build arg (non-root, tini, health checks)
- GitHub Actions workflow for GHCR publishing (multi-arch amd64/arm64)
- Helm chart updated to reference ghcr.io/microsoft/agentmesh/* images
- 28 integration tests covering all server endpoints

Resolves the missing container images that blocked full AgentMesh cluster
deployment (images were referenced in Helm chart but never built).

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>
Co-authored-by: kevinkaylie <[email protected]>
Co-authored-by: Aymen Hmaidi <[email protected]>
Co-authored-by: harshnair75567-cloud <[email protected]>
Co-authored-by: Adamthereal <[email protected]>
Co-authored-by: Jack Batzner <[email protected]>
Co-authored-by: lawcontinue <[email protected]>
Co-authored-by: deepsearch <[email protected]>
Co-authored-by: ewmh <[email protected]>
Co-authored-by: T. Smith <[email protected]>
imran-siddique added a commit that referenced this pull request Apr 20, 2026
* feat(dotnet): add MCP security namespace — completes cross-language MCP parity

* fix(ci): add path filters and concurrency; announce v3.1.0 release

CI optimization:
- Add paths-ignore for docs to 5 code-only workflows
- Add paths filter to Link Check (only run on docs changes)
- Add concurrency groups to 7 heavy workflows
- Docs-only PRs drop from ~14 checks to ~4

README:
- Add v3.1.0 release announcement callout
- Add PyPI version badge
- Update tutorial count to 31

Co-authored-by: Copilot <[email protected]>

* docs: update SOC2 mapping for resolved kill switch and DeltaEngine gaps

- Kill switch is no longer placeholder: now implements saga handoff
  with handoff_success_count tracking (kill_switch.py:69-178)
- DeltaEngine verify_chain() is no longer a stub: now performs SHA-256
  chain verification (delta.py:67-127)
- Move both from Critical/High gaps to new 'Resolved' section
- Update Processing Integrity coverage (2 of 4 defects, not 3 of 4)
- Update evidence table with current line ranges

Co-authored-by: Copilot <[email protected]>
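
For readers unfamiliar with SHA-256 chain verification, the general technique looks like the sketch below. This is a generic illustration, not the DeltaEngine code in delta.py: the entry layout, the genesis value, and the function names are all assumptions.

```python
import hashlib

GENESIS = "0" * 64  # assumed placeholder for the first entry's predecessor


def entry_hash(prev_hash: str, payload: str) -> str:
    """Hash an entry together with the hash of its predecessor."""
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def build_chain(payloads: list) -> list:
    """Link payloads so that each entry commits to everything before it."""
    chain, prev = [], GENESIS
    for payload in payloads:
        digest = entry_hash(prev, payload)
        chain.append({"payload": payload, "prev_hash": prev, "hash": digest})
        prev = digest
    return chain


def verify_hash_chain(chain: list) -> bool:
    """Recompute every hash and check each link to its predecessor."""
    prev = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True


if __name__ == "__main__":
    chain = build_chain(["create", "update", "delete"])
    print(verify_hash_chain(chain))
    chain[1]["payload"] = "tampered"
    print(verify_hash_chain(chain))
```

Changing any payload invalidates that entry's hash and, through the prev_hash links, every entry after it.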

* feat(dotnet): add MCP security namespace with scanner, gateway, redactor, and sanitizer

Add AgentGovernance.Mcp namespace implementing full MCP security parity with
TypeScript and Rust SDKs:

- McpSecurityScanner: tool poisoning, typosquatting, hidden instructions,
  rug pull, schema abuse, cross-server attack, and description injection detection
- McpCredentialRedactor: regex-based redaction of API keys, bearer tokens,
  connection strings, and secret assignments
- McpResponseSanitizer: response scanning for prompt injection tags,
  imperative phrasing, credential leakage, and exfiltration URLs
- McpGateway: policy enforcement pipeline with deny/allow lists, payload
  sanitization, rate limiting, and human approval gates

Includes 46 xUnit tests covering all threat categories. Updates
SDK-FEATURE-MATRIX.md to flip .NET MCP Security from — to ✅.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* docs: add Entra Agent ID bridge tutorial (Tutorial 31) (#10)

* fix(pipeline): run NuGet ESRP signing on Windows agent (#1022)

The EsrpCodeSigning@5 task constructs internal paths (batchSignPolicyFile,
ciPolicyFile) using Windows-style backslashes. Running on ubuntu-latest
produced garbled mixed paths like '/home/vsts/work/1/s/src\myapp\'.

Changes:
- Add per-job pool override: PublishNuGet runs on windows-latest
- Convert FolderPath and all shell commands to Windows paths
- Replace bash scripts with PowerShell for the Windows agent
- PyPI and npm stages remain on ubuntu-latest (unchanged)
- Add comment to delete orphaned ESRP_DOMAIN_TENANT_ID ADO variable

Co-authored-by: Copilot <[email protected]>

* docs: reland empty-merge changes from PRs #1017 and #1020 (#1125)

PRs #1017 and #1020 were squash-merged as empty commits (0 file
changes). This commit re-applies the intended documentation updates.

From PR #1017 (critic gaps):
- LIMITATIONS.md: add sections 7 (knowledge governance gap), 8
  (credential persistence gap), 9 (initialization bypass risk)
- LIMITATIONS.md: add knowledge governance and enforcement infra
  rows to 'What AGT Is Not' table
- THREAT_MODEL.md: add knowledge flow and credential persistence
  to residual risks, add configuration bypass vectors table,
  remove stale '10/10' qualifier

From PR #1020 (SOC2 resolved gaps):
- soc2-mapping.md: mark kill switch as resolved (saga handoff
  implemented in kill_switch.py:69-178)
- soc2-mapping.md: mark DeltaEngine verify_chain() as resolved
  (SHA-256 chain verification in delta.py:67-127)
- soc2-mapping.md: add Resolved section to gaps summary, update
  Processing Integrity to 2 of 4 defects (was 3 of 4)

Co-authored-by: Copilot <[email protected]>

* feat(dotnet): add MCP security namespace — completes cross-language MCP parity (#1021)

* fix(ci): add path filters and concurrency; announce v3.1.0 release

CI optimization:
- Add paths-ignore for docs to 5 code-only workflows
- Add paths filter to Link Check (only run on docs changes)
- Add concurrency groups to 7 heavy workflows
- Docs-only PRs drop from ~14 checks to ~4

README:
- Add v3.1.0 release announcement callout
- Add PyPI version badge
- Update tutorial count to 31



* docs: update SOC2 mapping for resolved kill switch and DeltaEngine gaps

- Kill switch is no longer placeholder: now implements saga handoff
  with handoff_success_count tracking (kill_switch.py:69-178)
- DeltaEngine verify_chain() is no longer a stub: now performs SHA-256
  chain verification (delta.py:67-127)
- Move both from Critical/High gaps to new 'Resolved' section
- Update Processing Integrity coverage (2 of 4 defects, not 3 of 4)
- Update evidence table with current line ranges



* feat(dotnet): add MCP security namespace with scanner, gateway, redactor, and sanitizer

Add AgentGovernance.Mcp namespace implementing full MCP security parity with
TypeScript and Rust SDKs:

- McpSecurityScanner: tool poisoning, typosquatting, hidden instructions,
  rug pull, schema abuse, cross-server attack, and description injection detection
- McpCredentialRedactor: regex-based redaction of API keys, bearer tokens,
  connection strings, and secret assignments
- McpResponseSanitizer: response scanning for prompt injection tags,
  imperative phrasing, credential leakage, and exfiltration URLs
- McpGateway: policy enforcement pipeline with deny/allow lists, payload
  sanitization, rate limiting, and human approval gates

Includes 46 xUnit tests covering all threat categories. Updates
SDK-FEATURE-MATRIX.md to flip .NET MCP Security from — to ✅.



---------

Co-authored-by: Copilot <[email protected]>

* docs: address external critic gaps (#1025)

* feat(dotnet): add kill switch and lifecycle management to .NET SDK (#5)

- Add KillSwitch with arm/disarm, event history, and subscriber notifications
- Add LifecycleManager with 8-state machine and validated transitions
- Add 26 xUnit tests
- Update README

Co-authored-by: Copilot <[email protected]>

* feat(rust): add execution rings and lifecycle management to Rust SDK (#6)

* feat(dotnet): add kill switch and lifecycle management to .NET SDK

- Add KillSwitch with arm/disarm, event history, and subscriber notifications
- Add LifecycleManager with 8-state machine and validated transitions
- Add comprehensive xUnit tests for both components (26 tests)
- Update .NET SDK README with usage documentation

Co-authored-by: Copilot <[email protected]>

* feat(rust): add execution rings and lifecycle management to Rust SDK

Add two new modules to the agentmesh Rust crate:

- rings.rs: Four-level execution privilege ring model (Admin/Standard/
  Restricted/Sandboxed) with per-agent assignment and per-ring action
  permissions, ported from the Python hypervisor enforcer.

- lifecycle.rs: Eight-state agent lifecycle manager (Provisioning through
  Decommissioned) with validated state transitions and event history,
  matching the lifecycle model used across other SDK languages.

Both modules include comprehensive unit tests and are re-exported from
the crate root. README updated with API tables and usage examples.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* feat(go): add MCP security, execution rings, and lifecycle management to Go SDK (#7)

* feat(openshell): add governance skill package and runnable example (#942)

Co-authored-by: Copilot <[email protected]>

* feat(go): add MCP security, execution rings, and lifecycle management to Go SDK

- mcp.go: MCP security scanner detecting tool poisoning, typosquatting,
  hidden instructions (zero-width chars, homoglyphs), and rug pulls
- rings.go: Execution privilege ring model (Admin/Standard/Restricted/Sandboxed)
  with default-deny access control
- lifecycle.go: Eight-state agent lifecycle manager with validated transitions
- Full test coverage for all three modules
- Updated README with API docs and examples

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* docs: sync audit redaction status and framing with current code (#8)

* feat(openshell): add governance skill package and runnable example (#942)

Co-authored-by: Copilot <[email protected]>

* feat(typescript): add MCP security scanner and lifecycle management to TS SDK (#947)

Co-authored-by: Copilot <[email protected]>

* docs: update SDK feature matrix after parity pass (#950)

Reflects new capabilities added in PRs #947 (TS), .NET, Rust, Go:
- TypeScript: MCP security scanner + lifecycle management (was 5/14, now 7/14)
- .NET: Kill switch + lifecycle management (was 8/14, now 10/14)
- Rust: Execution rings + lifecycle management (was 6/14, now 8/14)
- Go: MCP security + rings + lifecycle (was 4/14, now 7/14)

All SDKs now have lifecycle management. Core governance (policy, identity,
trust, audit) + lifecycle = 5 primitives shared across all 5 languages.

Co-authored-by: Copilot <[email protected]>

* docs: add LIMITATIONS.md - honest design boundaries and layered defense (#953)

Addresses valid external critique of AGT's architectural blind spots:

1. Action vs Intent: AGT governs individual actions, not reasoning or
   action sequences. Documents the compound-action gap explicitly and
   recommends content policies + model safety layers.

2. Audit logs record attempts, not outcomes: Documents that post-action
   state verification is the user's responsibility today, with hooks planned.

3. Performance honesty: README now notes that <0.1ms is policy-eval only;
   distributed mesh adds 5-50ms. Full breakdown in LIMITATIONS.md.

4. Complexity spectrum: Documents the minimal path (just PolicyEvaluator,
   no mesh/crypto) vs full enterprise stack.

5. Vendor independence: Documents zero cloud dependencies in core,
   standard formats for all state, migration path.

6. Recommended layered defense architecture diagram showing AGT as one
   layer alongside model safety, application logic, and infrastructure.

Co-authored-by: Copilot <[email protected]>

* fix(docs): rewrite OpenClaw sidecar deployment with working K8s manifests (#954)

Closes #952

Co-authored-by: Copilot <[email protected]>

* feat: reversibility checker, trust calibration guide, escalation tests (#955)

ReversibilityChecker with 4 levels and compensation plans. Trust score calibration guide with weights, decay, thresholds. 19 tests.

Co-authored-by: Copilot <[email protected]>

* feat: AGT Lite — zero-config governance in 3 lines + fix broken quickstart (#956)

agent_os.lite: govern() factory, sub-ms enforcement, 16 tests. Fixed quickstart that called nonexistent add_rules().

Co-authored-by: Copilot <[email protected]>

* fix: bump all runtime versions to 3.1.0 and fix CI lint/test failures (#957)

- Bump __version__ in 29 Python __init__.py files from 3.0.2 to 3.1.0
- Bump version= in 6 setup.py files from 3.0.2 to 3.1.0
- Bump meter version strings in _mcp_metrics.py
- Bump 9 package.json files from 3.0.2 to 3.1.0
- Bump .NET csproj Version from 3.0.2 to 3.1.0
- Bump Rust workspace Cargo.toml from 3.0.2 to 3.1.0
- Create Go sdk doc.go with version marker 3.1.0
- Fix ruff W292 (missing newline at EOF) in data_classification.py
- Fix CLI init regex to allow dots in agent names (test_init_special_characters)

Co-authored-by: Copilot <[email protected]>

* fix(openclaw): critical honesty pass — document what works vs what's planned (#958)

Co-authored-by: Copilot <[email protected]>

* fix(ci): fix Rust crate packaging - use workspace root with -p agentmesh (#959)

* fix(openclaw): critical honesty pass — document what works vs what's planned

Server (__main__.py):
- Add --host/--port argparse + env var support (was hardcoded 127.0.0.1:8080)
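
A minimal sketch of the flag-plus-environment resolution described above, assuming the precedence is CLI flag, then environment variable, then the old hardcoded default. The variable names `AGT_SERVER_HOST` and `AGT_SERVER_PORT` and the helper `parse_bind_args` are hypothetical; the commit does not name the actual ones.

```python
import argparse
import os


def parse_bind_args(argv, environ=None):
    """Resolve host/port from CLI flags, then env vars, then defaults."""
    env = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(description="Sidecar server (illustrative)")
    # Hypothetical env var names; the real server may use different ones.
    parser.add_argument("--host", default=env.get("AGT_SERVER_HOST", "127.0.0.1"))
    parser.add_argument(
        "--port", type=int, default=int(env.get("AGT_SERVER_PORT", "8080"))
    )
    return parser.parse_args(argv)


# The flag wins for the port; the env var supplies the host.
args = parse_bind_args(["--port", "9000"], environ={"AGT_SERVER_HOST": "0.0.0.0"})
print(args.host, args.port)
```

Passing `argv` and `environ` explicitly keeps the function testable without touching the real process environment.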

Dockerfile.sidecar:
- Copy modules/ directory (was missing, causing build failure)
- Use 0.0.0.0 for container binding (127.0.0.1 is wrong inside containers)
- Remove phantom port 9091 (no separate metrics listener exists)

openclaw-sidecar.md — full honesty rewrite:
- Add status banner: transparent interception is NOT yet implemented
- Document actual sidecar API endpoints (health, detect/injection, execute, metrics)
- Fix Docker Compose to use Dockerfile.sidecar (was using wrong Dockerfile)
- Remove GOVERNANCE_PROXY claim (OpenClaw doesn't natively read this)
- Replace fictional SLO/Grafana sections with real /api/v1/metrics docs
- Add Roadmap section listing what's planned vs shipped

openshell.md:
- Remove references to non-existent shell scripts
- Fix python -m agentmesh.server to python -m agent_os.server
- Add note that sidecar doesn't transparently intercept (must call API)
- Replace pip install agentmesh-platform with Python skill library usage

Co-authored-by: Copilot <[email protected]>

* fix(ci): fix Rust crate packaging — use workspace root with -p agentmesh

cargo package in a workspace writes .crate files to the workspace root's
target/package/, not the individual crate's directory. The pipeline was
running from the crate subdirectory and couldn't find the output.

Fix: change workingDirectory from packages/agent-mesh/sdks/rust/agentmesh
to packages/agent-mesh/sdks/rust (workspace root) and add -p agentmesh
to all cargo commands to target the specific crate.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* docs(adr): ADR 0005 — Liveness attestation extension for TrustHandshake (#948)

Proposes liveness attestation as opt-in gate for TrustHandshake. Addresses ghost-agent and ungraceful-handoff gaps from #772.

Co-authored-by: kevinkaylie <[email protected]>

* blog: MCP Security — Why Your AI Agent Tool Calls Need a Firewall (#899)

Co-authored-by: aymenhmaidiwastaken <[email protected]>

* feat: add LotL prevention policy for security measures (#949)

YAML policy template for Living-off-the-Land detection and prevention.

* feat(examples): add ATR community security rules for PolicyEvaluator (#908)

15 curated ATR detection rules + sync script. Closes #901.

* fix(docs): correct npm package name and stale version refs across 21 files (#960)

- Fix @agentmesh/sdk → @microsoft/agentmesh-sdk in 13 markdown files
  (README, QUICKSTART, tutorials, SDK docs, i18n, changelog)
- Fix broken demo path in agent-os README (agent-os/demo.py → demo/maf_governance_demo.py)
- Remove stale v1.0.0 labels from extension status table
- Bump AGT Version refs 3.0.2 → 3.1.0 in case study templates and
  ATF conformance assessment

Co-authored-by: Copilot <[email protected]>

* fix(ci): use ESRP Release for NuGet signing (#961)

Co-authored-by: Copilot <[email protected]>

* fix(ci): correct ESRP NuGet contenttype casing (#962)

* fix(ci): add missing packages to ESRP pipeline and fix Go version tag

Three gaps found during publish verification:

1. PyPI: add agentmesh-marketplace (8th package, was missing from matrix)
2. Rust: build+publish both workspace crates (agentmesh + agentmesh-mcp)
   - Changed from single-crate to workspace build (--workspace)
   - Package loop builds both .crate files
   - Renamed artifact from 'rust-agentmesh' to 'rust-crates'
3. Go: add 'v' prefix to version in doc.go (3.1.0 → v3.1.0)
   - Go module tags require semver with v prefix
   - Pipeline grep expects '// Version: v...' format
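
As an illustration of the marker format the pipeline looks for, the sketch below extracts a `v`-prefixed version from the contents of a `doc.go` file. The regular expression and the function name `extract_go_version` are assumptions written for this example, not the pipeline's actual grep.

```python
import re

# Matches a marker line such as "// Version: v3.1.0" (format from the commit text).
VERSION_MARKER = re.compile(r"^// Version: (v\d+\.\d+\.\d+)\s*$", re.MULTILINE)


def extract_go_version(doc_go_source):
    """Return the v-prefixed version from doc.go contents, or None if absent."""
    match = VERSION_MARKER.search(doc_go_source)
    return match.group(1) if match else None


# The package name below is a placeholder for illustration.
print(extract_go_version("// Version: v3.1.0\npackage agentmesh\n"))
print(extract_go_version("// Version: 3.1.0\npackage agentmesh\n"))
```

The second call returns `None`, which is the failure mode the `v` prefix fix addresses.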

Co-authored-by: Copilot <[email protected]>

* fix(ci): correct ESRP NuGet contenttype casing — 'NuGet' not 'Nuget'

ESRP Release rejected 'Nuget' with: 'The value provided for
ReleaseContentType property is invalid.' ErrorCode 2254.

ESRP content types are case-sensitive. Fix: 'Nuget' -> 'NuGet'.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* fix(ci): add missing packages to ESRP pipeline and fix Go version tag (#963)

* fix(ci): add missing packages to ESRP pipeline and fix Go version tag

Three gaps found during publish verification:

1. PyPI: add agentmesh-marketplace (8th package, was missing from matrix)
2. Rust: build+publish both workspace crates (agentmesh + agentmesh-mcp)
   - Changed from single-crate to workspace build (--workspace)
   - Package loop builds both .crate files
   - Renamed artifact from 'rust-agentmesh' to 'rust-crates'
3. Go: add 'v' prefix to version in doc.go (3.1.0 → v3.1.0)
   - Go module tags require semver with v prefix
   - Pipeline grep expects '// Version: v...' format

Co-authored-by: Copilot <[email protected]>

* fix(ci): correct ESRP NuGet contenttype casing — 'NuGet' not 'Nuget'

ESRP Release rejected 'Nuget' with: 'The value provided for
ReleaseContentType property is invalid.' ErrorCode 2254.

ESRP content types are case-sensitive. Fix: 'Nuget' -> 'NuGet'.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* fix(ci): use EsrpCodeSigning + dotnet push for NuGet (#965)

EsrpRelease@11 does not support NuGet as a contenttype — it's for
PyPI/npm/Maven/crates.io package distribution. NuGet packages must be
signed with EsrpCodeSigning@5 first, then pushed with dotnet nuget push.

New flow:
1. EsrpCodeSigning@5 with NuGetSign + NuGetVerify operations (CP-401405)
2. dotnet nuget push with the signed .nupkg to nuget.org

This matches the standard Microsoft NuGet ESRP signing pattern used by
azure-sdk, dotnet runtime, and other Microsoft OSS projects.

Co-authored-by: Copilot <[email protected]>

* fix(security): upgrade axios to 1.15.0 - CVE-2026-40175, CVE-2025-62718 (#966)

Critical S360 action items for SFI-ES5.2 1ES Open Source Vulnerabilities.

CVE-2026-40175 (CVSS 9.9): Unrestricted Cloud Metadata Exfiltration
via Header Injection Chain — prototype pollution gadget enables CRLF
injection in HTTP headers, bypassing AWS IMDSv2 session tokens.

CVE-2025-62718: NO_PROXY Bypass via Hostname Normalization — trailing
dots and IPv6 literals skip NO_PROXY matching, enabling SSRF through
attacker-controlled proxy.
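
To illustrate the class of bug (this is not axios code), the Python sketch below compares NO_PROXY matching with and without hostname normalization. The helper names and the matching rules are assumptions chosen for the example.

```python
def normalize_host(host):
    """Lowercase, strip IPv6 brackets, and drop trailing dots."""
    host = host.strip().lower()
    if host.startswith("[") and host.endswith("]"):
        host = host[1:-1]
    return host.rstrip(".")


def bypasses_proxy(host, no_proxy, normalize=True):
    """Return True when host matches an entry in a comma-separated NO_PROXY list."""
    candidate = normalize_host(host) if normalize else host
    for entry in no_proxy.split(","):
        entry = entry.strip()
        if not entry:
            continue
        entry = normalize_host(entry) if normalize else entry
        bare = entry.lstrip(".")
        if candidate == bare or candidate.endswith("." + bare):
            return True
    return False


# Without normalization, a trailing dot makes the host miss the NO_PROXY entry,
# so the request would be sent through the configured proxy.
print(bypasses_proxy("internal.example.com.", "internal.example.com", normalize=False))
print(bypasses_proxy("internal.example.com.", "internal.example.com", normalize=True))
```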

Upgraded in 3 packages:
- extensions/copilot: 1.14.0 → 1.15.0
- extensions/cursor:  1.13.5 → 1.15.0
- agent-os-vscode:    1.13.6 → 1.15.0

Co-authored-by: Copilot <[email protected]>

* fix(ci): resolve ESRP_DOMAIN_TENANT_ID cyclical reference (#967)

The ADO variable ESRP_DOMAIN_TENANT_ID had a cyclical self-reference,
preventing ESRP authentication across ALL publishing stages (PyPI, npm,
NuGet, crates.io).

Fix: Define MICROSOFT_TENANT_ID as a pipeline-level variable with the
well-known Microsoft corporate tenant ID (72f988bf-..., same default
used by ESRP Release action.yml). This is a public value, not a secret.

Also: NuGet publishing requires Microsoft as co-owner of the package
on NuGet.org. See https://aka.ms/Microsoft-NuGet-Compliance

Co-authored-by: Copilot <[email protected]>

* docs: sync audit redaction status and framing with current code

- Update SOC2 mapping to reflect CredentialRedactor now redacts
  credential-like secrets before audit persistence (API keys, tokens,
  JWTs, connection strings, etc.). Remaining gap: non-credential PII
  (email, phone, addresses) not yet redacted in audit entries.
- Replace 'kernel-level enforcement' with 'policy-layer enforcement'
  in README, OWASP compliance, and architecture overview to match the
  existing 'application-level governance' framing in README Security
  section and LIMITATIONS.md.
- Qualify 10/10 OWASP coverage claim in COMPARISON.md with footnote
  clarifying this means mitigation components exist per risk category,
  not full elimination.
- Update owasp-llm-top10-mapping.md LLM06 row for credential redaction.

Addresses doc/code inconsistencies identified in external review.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>
Co-authored-by: kevinkaylie <[email protected]>
Co-authored-by: Aymen Hmaidi <[email protected]>
Co-authored-by: harshnair75567-cloud <[email protected]>
Co-authored-by: Adamthereal <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>
Co-authored-by: kevinkaylie <[email protected]>
Co-authored-by: Aymen Hmaidi <[email protected]>
Co-authored-by: harshnair75567-cloud <[email protected]>
Co-authored-by: Adamthereal <[email protected]>

* fix(lint): resolve agent-mesh lint errors in eu_ai_act.py (#1028)

- Remove unused variable profiling_override (F841)
- Remove f-string without placeholders (F541)
- Fix whitespace in docstrings (W293)

Co-authored-by: Copilot <[email protected]>

* fix(ci): add path filters and concurrency; announce v3.1.0 release (#1039)

CI optimization:
- Add paths-ignore for docs to 5 code-only workflows
- Add paths filter to Link Check (only run on docs changes)
- Add concurrency groups to 7 heavy workflows
- Docs-only PRs drop from ~14 checks to ~4

README:
- Add v3.1.0 release announcement callout
- Add PyPI version badge
- Update tutorial count to 31

Co-authored-by: Copilot <[email protected]>

* docs: add ADOPTERS.md and make deployment guides multi-cloud (#1040)

- New ADOPTERS.md following Backstage/Flatcar pattern with Production,
  Evaluation, and Academic tables + instructions for adding your org
- Rewrite docs/deployment/README.md from Azure-only to multi-cloud:
  Azure (AKS, Foundry, Container Apps), AWS (ECS/Fargate), GCP (GKE),
  Docker Compose, self-hosted. Updated architecture diagram to show
  cloud-agnostic deployment patterns.
- Fix broken AWS/GCP links (pointed to non-existent paths)
- README now links to 'Deployment Guides' (multi-cloud) instead of
  'Azure Deployment'
- README Contributing section invites adopters to add their org

Co-authored-by: Copilot <[email protected]>

* feat: add AGT Lite — zero-config governance in 3 lines, fix broken quickstart (#1044)

Addresses the #1 developer experience criticism: AGT is too complex to start.

New: agent_os.lite — lightweight governance module
- govern() factory: one line to create a governance gate
- check(action): one line to enforce — raises GovernanceViolation or returns True
- check.is_allowed(action): non-raising bool version
- Allow lists, deny lists, regex patterns, content filtering, rate limiting
- Built-in audit trail and stats
- Sub-millisecond evaluation (0.003ms avg, 1000 evals in <100ms)
- Zero dependencies beyond stdlib (re, time, datetime)
- 16 tests passing
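
The call shape described in this list can be sketched with a self-contained stand-in. This is not the `agent_os.lite` implementation: the names `govern`, `check`, `is_allowed`, and `GovernanceViolation` come from the commit text, while the keyword arguments (`allow`, `deny`, `deny_patterns`) and the `_Gate` class are hypothetical.

```python
import re


class GovernanceViolation(Exception):
    """Raised when an action is blocked (name from the commit text)."""


class _Gate:
    def __init__(self, allow=None, deny=None, deny_patterns=None):
        self._allow = set(allow) if allow is not None else None
        self._deny = set(deny or ())
        self._patterns = [re.compile(p) for p in (deny_patterns or ())]
        self.audit = []  # (action, allowed) tuples, oldest first

    def is_allowed(self, action):
        """Non-raising check: deny list and patterns win over the allow list."""
        allowed = (
            action not in self._deny
            and not any(p.search(action) for p in self._patterns)
            and (self._allow is None or action in self._allow)
        )
        self.audit.append((action, allowed))
        return allowed

    def __call__(self, action):
        """Raising check: returns True or raises GovernanceViolation."""
        if not self.is_allowed(action):
            raise GovernanceViolation(f"action blocked: {action}")
        return True


def govern(**rules):
    """Hypothetical factory signature; the real keyword arguments may differ."""
    return _Gate(**rules)


check = govern(allow=["read_file", "web_search"], deny_patterns=[r"^delete_"])
print(check("read_file"))
print(check.is_allowed("delete_db"))
```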

Fix: govern_in_60_seconds.py quickstart
- BROKEN: was calling PolicyEvaluator.add_rules() which does not exist
- FIXED: now uses agent_os.lite.govern() which actually works
- Verified end-to-end: script runs and produces correct output

The lite module is for developers who just want basic governance
without learning PolicyEvaluator, YAML, OPA/Rego, trust mesh, etc.
Upgrade to the full stack when you need it.

Co-authored-by: Copilot <[email protected]>

* feat(ci): enhance weekly security audit with 7 new scan jobs (#1051)

Add comprehensive security checks based on issues found during
the MSRC-111178 security audit and ongoing post-merge reviews:

- Workflow security regression (MSRC-111178 pull_request_target check)
- Expression injection scan (github.event.* in run: blocks)
- Docker security (root containers, wildcard CORS, hardcoded passwords,
  0.0.0.0 bindings)
- XSS and unsafe DOM (innerHTML, eval, yaml.load, shell=True)
- Action SHA pinning compliance
- Version pinning (pyproject.toml upper bounds, Docker :latest tags,
  license field format)
- Dependency confusion with --strict mode (pyproject.toml + package.json)
- Retention days updated to 180 (EU AI Act Art. 26(6))

Co-authored-by: Copilot <[email protected]>

* fix(ci): fix OpenShell integration CI — spelling, link check, policy validation (#1057)

- Add OpenShell/NVIDIA terms to cspell dictionary (Landlock, seccomp, syscall, etc.)
- Fix broken link: openclaw-skill -> openshell-skill in docs/integrations/openshell.md
- Fix policy validation: replace starts_with (invalid) with matches + regex

Co-authored-by: Copilot <[email protected]>

* feat: add reversibility checker, trust calibration guide, and escalation/reversibility tests (#1061)

Addresses critical review feedback:

1. Rollback/reversibility (agent_os.reversibility)
   - ReversibilityChecker: pre-execution assessment of action reversibility
   - 4 levels: fully_reversible, partially_reversible, irreversible, unknown
   - CompensatingAction: structured undo plans for each action type
   - Built-in rules for 12 common actions (write, deploy, delete, email, etc.)
   - block_irreversible mode for strict environments

2. Trust score calibration guide (docs/security/trust-score-calibration.md)
   - Score component weights (compliance 35%, task 25%, behavior 25%, identity 15%)
   - Decay functions with tier floors
   - Initial score assignments by agent origin
   - Threshold recommendations (conservative/moderate/permissive)
   - Anti-gaming measures and operational playbook

3. Tests: 19 passing (10 escalation + 9 reversibility)
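
As a worked example of the component weights in item 2, the sketch below combines four component scores into one value. It assumes a plain weighted sum over scores in the range 0 to 1; the calibration guide may use a different scale or add decay on top, and `composite_trust_score` is a name invented for this illustration.

```python
# Component weights as listed in the commit message.
WEIGHTS = {"compliance": 0.35, "task": 0.25, "behavior": 0.25, "identity": 0.15}


def composite_trust_score(components):
    """Weighted sum of component scores; each is assumed to lie in [0, 1]."""
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


score = composite_trust_score(
    {"compliance": 0.9, "task": 0.8, "behavior": 0.7, "identity": 1.0}
)
# 0.35*0.9 + 0.25*0.8 + 0.25*0.7 + 0.15*1.0 = 0.315 + 0.2 + 0.175 + 0.15
print(round(score, 3))
```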

Co-authored-by: Copilot <[email protected]>

* feat: deployment runtime (Docker/AKS) and shared trust core types (#1062)

agent-runtime: Evolve from thin re-export shim to deployment runtime
- DockerDeployer: container deployment with security hardening
  (cap-drop ALL, no-new-privileges, read-only rootfs)
- KubernetesDeployer: AKS pod deployment with governance sidecars
  (runAsNonRoot, seccompProfile, resource limits)
- GovernanceConfig: policy/trust/audit config injected as env vars
- DeploymentTarget protocol for extensibility (ADC, nono, etc.)
- 24 tests (all subprocess calls mocked)

agent-mesh: Extract shared trust types into agentmesh.trust_types
- TrustScore, AgentProfile, TrustRecord, TrustTracker
- Canonical implementations replacing ~800 lines of duplicated code
  across 6+ integration packages
- 25 tests covering clamping, scoring, history, capabilities

Co-authored-by: Copilot <[email protected]>

* feat(dotnet): add kill switch and lifecycle management to .NET SDK (#1065)

- Add KillSwitch with arm/disarm, event history, and subscriber notifications
- Add LifecycleManager with 8-state machine and validated transitions
- Add comprehensive xUnit tests for both components (26 tests)
- Update .NET SDK README with usage documentation

Co-authored-by: Copilot <[email protected]>

* feat(go): add MCP security, execution rings, and lifecycle management to Go SDK (#1066)

- mcp.go: MCP security scanner detecting tool poisoning, typosquatting,
  hidden instructions (zero-width chars, homoglyphs), and rug pulls
- rings.go: Execution privilege ring model (Admin/Standard/Restricted/Sandboxed)
  with default-deny access control
- lifecycle.go: Eight-state agent lifecycle manager with validated transitions
- Full test coverage for all three modules
- Updated README with API docs and examples

Co-authored-by: Copilot <[email protected]>

* feat(rust): add execution rings and lifecycle management to Rust SDK (#1067)

* feat(dotnet): add kill switch and lifecycle management to .NET SDK

- Add KillSwitch with arm/disarm, event history, and subscriber notifications
- Add LifecycleManager with 8-state machine and validated transitions
- Add comprehensive xUnit tests for both components (26 tests)
- Update .NET SDK README with usage documentation

Co-authored-by: Copilot <[email protected]>

* feat(rust): add execution rings and lifecycle management to Rust SDK

Add two new modules to the agentmesh Rust crate:

- rings.rs: Four-level execution privilege ring model (Admin/Standard/
  Restricted/Sandboxed) with per-agent assignment and per-ring action
  permissions, ported from the Python hypervisor enforcer.

- lifecycle.rs: Eight-state agent lifecycle manager (Provisioning through
  Decommissioned) with validated state transitions and event history,
  matching the lifecycle model used across other SDK languages.

Both modules include comprehensive unit tests and are re-exported from
the crate root. README updated with API tables and usage examples.
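
The ring model can be sketched in Python as follows. The four ring names come from the commit text, and the default-deny behavior follows the Go SDK description above; the numeric ordering, the `RingEnforcer` class, and its method names are assumptions for illustration and are not the Rust crate's API.

```python
from enum import IntEnum


class Ring(IntEnum):
    """Ring names from the commit text; the numeric ordering is an assumption."""
    ADMIN = 0
    STANDARD = 1
    RESTRICTED = 2
    SANDBOXED = 3


class RingEnforcer:
    def __init__(self, permissions):
        # permissions: mapping of Ring -> iterable of allowed action names
        self._permissions = {ring: set(acts) for ring, acts in permissions.items()}
        self._assignments = {}

    def assign(self, agent_id, ring):
        """Record the ring an agent runs in."""
        self._assignments[agent_id] = ring

    def is_allowed(self, agent_id, action):
        """Default deny: unassigned agents and unlisted actions are rejected."""
        ring = self._assignments.get(agent_id)
        if ring is None:
            return False
        return action in self._permissions.get(ring, set())


enforcer = RingEnforcer({
    Ring.ADMIN: {"read", "write", "deploy"},
    Ring.SANDBOXED: {"read"},
})
enforcer.assign("agent-a", Ring.SANDBOXED)
print(enforcer.is_allowed("agent-a", "read"))
print(enforcer.is_allowed("agent-a", "deploy"))
```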

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* fix: align lotl_prevention_policy.yaml with PolicyDocument schema

The policy file used an incompatible schema format (id, parameter,
regex_match, effect) instead of the expected PolicyDocument fields
(name, condition.field, operator, action). This caused the
validate-policies CI check to fail for all PRs.

Changes:
- id → name
- condition.parameter → condition.field
- operator: regex_match → operator: matches
- action at rule level (shell_exec/file_read) → action: deny
- effect: DENY → removed (redundant with action: deny)
- Added version, name, description, disclaimer at top level
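
The field renames above can be expressed as a small migration helper. This is a sketch under assumptions: rules are treated as plain dictionaries, the `operator` key is handled whether it sits inside `condition` or at rule level, and the `value` key and rule name in the example are hypothetical.

```python
LEGACY_OPERATORS = {"regex_match": "matches"}


def migrate_rule(old_rule):
    """Convert one legacy rule dict to the PolicyDocument-style field names."""
    condition = dict(old_rule.get("condition", {}))
    if "parameter" in condition:
        condition["field"] = condition.pop("parameter")
    if "operator" in condition:
        op = condition["operator"]
        condition["operator"] = LEGACY_OPERATORS.get(op, op)

    # Carry over keys that are not renamed or dropped.
    dropped = {"id", "effect", "condition", "action"}
    new_rule = {k: v for k, v in old_rule.items() if k not in dropped}
    if "operator" in new_rule:
        op = new_rule["operator"]
        new_rule["operator"] = LEGACY_OPERATORS.get(op, op)

    new_rule["name"] = old_rule["id"]
    new_rule["condition"] = condition
    # Per the commit, the rule-level action becomes "deny" and "effect" is removed.
    new_rule["action"] = "deny"
    return new_rule


legacy = {
    "id": "block-certutil",
    "condition": {"parameter": "command", "operator": "regex_match", "value": "certutil"},
    "action": "shell_exec",
    "effect": "DENY",
}
print(migrate_rule(legacy))
```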

Co-authored-by: Copilot <[email protected]>

* fix: resolve .NET ESRP signing issues blocking NuGet publish

GitHub Actions (publish.yml):
- Fix broken if-guards on signing steps: env.ESRP_AAD_ID was set in
  step-level env (invisible to if-expressions). Replace with job-level
  ESRP_CONFIGURED env derived from secrets.
- Add missing ESRP_CERT_IDENTIFIER to signing step env blocks.
- Gate the publish step on ESRP_CONFIGURED so unsigned packages are
  never pushed to NuGet.org under the Microsoft.* prefix.
- Make stub signing steps fail-fast (exit 1) instead of silently
  succeeding, preventing unsigned packages from reaching NuGet push.

ADO Pipeline (esrp-publish.yml):
- Add UseDotNet@2 task to Publish_NuGet stage so dotnet nuget push
  has a guaranteed SDK version on the Windows agent.

Co-authored-by: Copilot <[email protected]>

* fix(docs): fix OpenClaw sidecar demo and add limitations callout (#1163)

The docker-compose example in openclaw-sidecar.md was illustrative only
and did not work — it referenced a non-existent OpenClaw image and lacked
healthchecks. Users were hitting this and getting confused.

Changes:
- Add working demo at demo/openclaw-governed/ with docker-compose.yaml
  that builds and runs the governance sidecar from source
- Replace the inline docker-compose in the doc with a link to the demo
  plus a clearly-labeled reference template for custom deployments
- Add prominent WARNING callout listing known limitations (no native
  OpenClaw integration, no published images, explicit API required)
- Remove stale orphaned curl snippet after the docker-compose block
- Add healthcheck to docker-compose governance-sidecar service
- Fix OpenClaw image reference from ghcr.io/openclaw/openclaw:latest
  to a placeholder users must replace with their own image

Co-authored-by: Copilot <[email protected]>

* fix(docs): fix OpenClaw sidecar demo and add limitations callout (#1164)

The docker-compose example in openclaw-sidecar.md was illustrative only
and did not work — it referenced a non-existent OpenClaw image and lacked
healthchecks. Users were hitting this and getting confused.

Changes:
- Add working demo at demo/openclaw-governed/ with docker-compose.yaml
  that builds and runs the governance sidecar from source
- Replace the inline docker-compose in the doc with a link to the demo
  plus a clearly-labeled reference template for custom deployments
- Add prominent WARNING callout listing known limitations (no native
  OpenClaw integration, no published images, explicit API required)
- Remove stale orphaned curl snippet after the docker-compose block
- Add healthcheck to docker-compose governance-sidecar service
- Fix OpenClaw image reference from ghcr.io/openclaw/openclaw:latest
  to a placeholder users must replace with their own image

Co-authored-by: Copilot <[email protected]>

* fix(ci): use PME tenant ID for ESRP cert signing

The ESRP signing cert lives in the PME (Partner Managed Engineering)
tenant (975f013f), not the Microsoft corporate tenant (72f988bf).
Using the wrong tenant ID causes ESRP signing to fail when looking
up the cert.

Co-authored-by: Copilot <[email protected]>

* docs: Add Scaling AI Agents article to COMMUNITY.md (#857)

Co-authored-by: deepsearch <[email protected]>

* Add runtime evidence mode to agt verify (#969)

* Track agt verify evidence plan

* Add runtime evidence mode to agt verify

* Add runtime evidence verifier tests

* Add CLI tests for agt verify evidence mode

* Document evidence mode for compliance verification

* Remove local implementation notes

* Document agt verify evidence mode

* Harden evidence path handling in verify
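
A rough sketch of what an evidence check in the spirit of this PR might look like. The manifest key names, `check_evidence`, and `exit_code` are all hypothetical: the PR description lists the categories that are checked but not the manifest schema, and the zero exit code in non-strict mode is an assumption.

```python
import json

# Hypothetical manifest keys, one per check listed in the PR description.
REQUIRED_SECTIONS = (
    "policy_files",
    "deny_semantics",
    "registered_tools",
    "audit_sink",
    "identity",
    "packages",
)


def check_evidence(manifest):
    """Return findings for missing or empty sections; empty list means none."""
    return [
        f"missing or empty: {key}"
        for key in REQUIRED_SECTIONS
        if not manifest.get(key)
    ]


def exit_code(findings, strict):
    """Non-zero only when strict mode is on and there are findings."""
    return 1 if (strict and findings) else 0


manifest = json.loads('{"policy_files": ["policies/default.yaml"], "audit_sink": {}}')
findings = check_evidence(manifest)
print(findings)
print(exit_code(findings, strict=True))
```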

---------

Co-authored-by: T. Smith <[email protected]>

* docs: add Entra Agent ID bridge tutorial with R&R matrix and DID fix

- Add Tutorial 31: Bridging AGT Identity with Microsoft Entra Agent ID
  - Detailed roles & responsibilities between AGT and Entra/Agent365
  - Architecture diagram showing the identity bridge
  - Step-by-step: DID creation, Entra binding, AKS workload identity,
    token validation, lifecycle sync, access verification
  - Known gaps and limitations table
  - Platform independence note (AWS, GCP, Okta patterns)
- Fix DID prefix in .NET MCP gateway tests (did:agentmesh → did:mesh
  for consistency with Python reference implementation and .NET SDK)
- Update tutorials README with Enterprise Identity section

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>
Co-authored-by: kevinkaylie <[email protected]>
Co-authored-by: Aymen Hmaidi <[email protected]>
Co-authored-by: harshnair75567-cloud <[email protected]>
Co-authored-by: Adamthereal <[email protected]>
Co-authored-by: Jack Batzner <[email protected]>
Co-authored-by: lawcontinue <[email protected]>
Co-authored-by: deepsearch <[email protected]>
Co-authored-by: ewmh <[email protected]>
Co-authored-by: T. Smith <[email protected]>

* docs: address external critic gaps in limitations and threat model (#11)

Add three new sections to LIMITATIONS.md addressing gaps identified in
public criticism and external security analysis:

- §10 Physical AI and Embodied Agent Governance: documents that AGT
  governs software agents not physical actuators, with mitigations
- §11 Streaming Data and Real-Time Assurance: documents that AGT
  evaluates per-action not continuously over data streams
- §12 DID Method Inconsistency Across SDKs: documents the did:mesh
  vs did:agentmesh split with migration plan for v4.0

Update THREAT_MODEL.md residual risks to reference all three new
limitation sections.

Co-authored-by: Copilot <[email protected]>

* fix!: standardize DID method to did:agentmesh across all SDKs (#12)

* fix!: standardize DID method to did:agentmesh across all SDKs

BREAKING CHANGE: All agent DIDs now use the did:agentmesh: prefix.
The legacy did:mesh: prefix used by Python and .NET has been migrated
to match the did:agentmesh: convention already used by TypeScript,
Rust, and Go SDKs.

Changes:
- Python: agent_id.py, delegation.py, entra.py, all integrations
- .NET: AgentIdentity.cs, Jwk.cs, GovernanceKernel.cs, all tests
- Docs: README, tutorials, identity docs, FAQ, compliance docs
- Tests: all test fixtures updated across Python, .NET, TS, VSCode
- Version bump: 3.1.0 → 3.2.0 (.NET, Python agent-mesh, TypeScript)

Migration: replace did:mesh: with did:agentmesh: in your policies,
identity registries, and agent configurations.
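
The migration step can be scripted. The sketch below rewrites the legacy prefix in arbitrary text and leaves identifiers that already use `did:agentmesh:` untouched; the function name `migrate_dids` and the sample identifiers are made up for this example.

```python
import re

# Match the legacy prefix only when it is not preceded by a word character or colon.
LEGACY_DID = re.compile(r"(?<![\w:])did:mesh:")


def migrate_dids(text):
    """Rewrite legacy did:mesh: identifiers to the did:agentmesh: prefix."""
    return LEGACY_DID.sub("did:agentmesh:", text)


policy = 'allow_from: ["did:mesh:agent-001", "did:agentmesh:agent-002"]'
print(migrate_dids(policy))
```

Running the function twice gives the same result as running it once, so it is safe to apply to files that are already partly migrated.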

Co-authored-by: Copilot <[email protected]>

* docs: add Q11-Q13 to FAQ — AGT scope, Agent 365, and DLP comparison

Adds three new customer Q&As:
- Q11: Is AGT for Foundry agents or any agent type? (any)
- Q12: Relationship between AGT and Agent 365 (different layers)
- Q13: How is AGT different from DLP/communication compliance
  (content vs action governance)

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* fix(security): address all 14 open code scanning alerts (#13)

* fix: address 6 Dependabot security vulnerabilities

- python-multipart 0.0.22 → 0.0.26 (DoS via large preamble/epilogue)
- pytest 8.4.1 → 9.0.3 (tmpdir handling vulnerability)
- langchain-core 1.2.11 → 1.2.28 (SSRF, path traversal, f-string validation)
- langchain-core >=0.2.0,<1.0 → >=1.2.28 in langchain-agentmesh pyproject.toml
- tsup 8.0.0 → 8.5.1 (DOM clobbering vulnerability)
- rand 0.8.5: dismissed #176 as inaccurate (vuln affects rand::rng() 0.9.x API only)

Fixes Dependabot alerts: #177, #175, #166, #164, #157, #156
Dismissed: #176 (not applicable to rand 0.8.x)

Co-authored-by: Copilot <[email protected]>

* fix(security): address all 14 open code scanning alerts

Scorecard HIGH:
- publish-containers.yml: scope packages:write to job level (#316)

Scorecard MEDIUM (pinned dependencies):
- docs.yml: pin 4 GitHub Actions by SHA hash (#311-314)
- docs.yml: use requirements.txt for pip install (#315)
- agent-mesh Dockerfile: pin python:3.11-slim by SHA (#317,#318)
- agent-os Dockerfile.sidecar: pin python:3.14-slim by SHA (#295,#296)
- dashboard Dockerfile: pin python:3.12-slim by SHA (#291,#293)

CodeQL:
- test_time_decay.py: timedelta(days=365) -> 366 for leap safety (#289,#290)
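
One way to see why 365 days is not leap-safe: in a leap year, adding 365 days to January 1 does not reach the next calendar year. The snippet is a standalone illustration and does not reproduce the actual test.

```python
from datetime import date, timedelta

start = date(2024, 1, 1)  # 2024 is a leap year, so it has 366 days

print(start + timedelta(days=365))  # 2024-12-31
print(start + timedelta(days=366))  # 2025-01-01
```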

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>
Co-authored-by: kevinkaylie <[email protected]>
Co-authored-by: Aymen Hmaidi <[email protected]>
Co-authored-by: harshnair75567-cloud <[email protected]>
Co-authored-by: Adamthereal <[email protected]>
Co-authored-by: Jack Batzner <[email protected]>
Co-authored-by: lawcontinue <[email protected]>
Co-authored-by: deepsearch <[email protected]>
Co-authored-by: ewmh <[email protected]>
Co-authored-by: T. Smith <[email protected]>

Labels: documentation (Improvements or additions to documentation), size/XL (Extra large PR, 500+ lines), tests
