Source Files contain Unicode Text #2706

@perhapsmaple

Describe your environment

Branch: main (commit 25738f3)

Steps to reproduce

find . -type f -name "*.h" -exec file {} \; | grep UTF-8
find . -type f -name "*.cc" -exec file {} \; | grep UTF-8
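
The `file` output does not say which bytes are responsible, so a small script can help separate files that only carry a BOM from files that contain other non-ASCII bytes. The following is a minimal sketch in Python, not part of the repository: the names `classify` and `scan` are hypothetical, and it assumes that only `.h` and `.cc` files are of interest, as in the commands above.

```python
from pathlib import Path

# Byte sequence of the UTF-8 byte order mark.
UTF8_BOM = b"\xef\xbb\xbf"


def classify(data: bytes) -> str:
    """Return 'bom', 'non-ascii', or 'ascii' for a file's raw bytes."""
    if data.startswith(UTF8_BOM):
        return "bom"
    if not data.isascii():  # bytes.isascii() requires Python 3.7+
        return "non-ascii"
    return "ascii"


def scan(root: str, suffixes=(".h", ".cc")):
    """Yield (path, kind) for each source file under root that is not plain ASCII."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            kind = classify(path.read_bytes())
            if kind != "ascii":
                yield path, kind


if __name__ == "__main__":
    # Assumption: run from the repository root.
    for path, kind in scan("."):
        print(f"{path}: {kind}")
```

A file that starts with a BOM is reported as `bom` even if it also contains other non-ASCII bytes further down, so such files may need a second look after the BOM is removed.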

What is the expected behavior?
Source files are expected to be ASCII-encoded, except where Unicode characters are required for tests.

What is the actual behavior?

harish@Harishs-MacBook-Air opentelemetry-cpp % find . -type f -name "*.h" -exec file {} \; | grep UTF-8  
./api/include/opentelemetry/context/context.h: C++ source text, Unicode text, UTF-8 (with BOM) text
./api/include/opentelemetry/context/runtime_context.h: C++ source text, Unicode text, UTF-8 (with BOM) text
./api/include/opentelemetry/baggage/baggage.h: C++ source text, Unicode text, UTF-8 (with BOM) text

harish@Harishs-MacBook-Air opentelemetry-cpp % find . -type f -name "*.cc" -exec file {} \; | grep UTF-8
./ext/test/http/url_parser_test.cc: c program text, Unicode text, UTF-8 text
./sdk/test/metrics/instrument_metadata_validator_test.cc: c program text, Unicode text, UTF-8 text
./opentracing-shim/src/span_shim.cc: C++ source text, Unicode text, UTF-8 text

Additional context
The headers listed above all begin with a byte order mark (BOM), and span_shim.cc has a Unicode character in a comment. I currently use an in-house build system built on flex, which has trouble parsing UTF-8 encoded files. I think we should convert all source files to ASCII unless non-ASCII characters are required. I would be happy to contribute a PR if needed.
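
For the BOM case, the fix amounts to dropping the first three bytes of each affected header. Here is a minimal sketch in Python, under the assumption that the files are otherwise valid and should be rewritten in place. The helper names `strip_bom` and `strip_bom_in_file` are hypothetical. It does not touch other non-ASCII characters, such as the one in the span_shim.cc comment, which would still need a manual replacement.

```python
from pathlib import Path

# Byte sequence of the UTF-8 byte order mark.
UTF8_BOM = b"\xef\xbb\xbf"


def strip_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM; other bytes are left unchanged."""
    if data.startswith(UTF8_BOM):
        return data[len(UTF8_BOM):]
    return data


def strip_bom_in_file(path) -> bool:
    """Rewrite the file in place if it starts with a BOM. Return True if it changed."""
    p = Path(path)
    data = p.read_bytes()
    cleaned = strip_bom(data)
    if cleaned == data:
        return False
    p.write_bytes(cleaned)
    return True
```

Working on raw bytes rather than decoded text avoids any newline translation, so the only difference in the rewritten file should be the missing BOM.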

Labels: bug, triage/accepted
