Skip to content

Rename db.name to db.namespace and generalize it#911

Merged
lmolkova merged 17 commits intoopen-telemetry:mainfrom
lmolkova:db-collection-namespace
Apr 24, 2024
Merged

Rename db.name to db.namespace and generalize it#911
lmolkova merged 17 commits intoopen-telemetry:mainfrom
lmolkova:db-collection-namespace

Conversation

@lmolkova
Copy link
Copy Markdown
Member

@lmolkova lmolkova commented Apr 8, 2024

Changes

Deprecates db.name and db.redis.database_index in favor of db.collection.namespace
Removes db.mssql.instance_name (includes it into db.collection.namespace).

Open question:
where does the schema belong?

  • within name: namespace = mydb, name = dbo.mytable , span name is SELECT mydb.dbo.mytable
    • If schema is not set in the query, then we won't have it (namespace = mydb and name = mytable and span name is SELECT mydb.mytable
    • But it's a best practice to qualify table name with schema
  • within namespace: namespace = mydb.dbo, name = dbo.mytable, span name is SELECT mydb.dbo.dbo.mytable
    • if schema is provided as a part of table name (best practice), we have duplication

The schema (at least sometimes) requires actual database call: postgres jdbc driver, SQL Server JDBC driver.
It's not a connection property per se, but a database server configuration, i.e. can be modified elsewhere and can't be cached on the client.

So in this PR schema stays inside table name (if users provided it in the query) and is not captured inside the database name.

Merge requirement checklist

@lmolkova lmolkova requested review from a team April 8, 2024 23:21
Comment thread docs/attributes-registry/db.md
Comment thread docs/database/hbase.md Outdated
Comment thread docs/database/hbase.md Outdated
Comment thread docs/database/mongodb.md
Comment thread docs/database/redis.md Outdated
@Oberon00
Copy link
Copy Markdown
Member

Oberon00 commented Apr 9, 2024

Was there any discussion before this PR in some SIG meeting or similar? Is there any issue associated with this PR?

@lmolkova
Copy link
Copy Markdown
Member Author

lmolkova commented Apr 9, 2024

Was there any discussion before this PR in some SIG meeting or similar? Is there any issue associated with this PR?

yes, this is the design approach discussed within DB semconv WG: https://docs.google.com/document/d/1zTi_Z4WBisytPnXTpwtnGOK-4Gv5oSTuMHvIQ8--5hg

It partially addresses #749, clarifies parts of #718, #714

@trask
Copy link
Copy Markdown
Member

trask commented Apr 10, 2024

if schema is provided as a part of table name (best practice), we have duplication

wdyt of just SELECT {db.collection.name} for the span name? I'm not sure we need db.collection.namespace in the span name, sort of like HTTP spans include the route but not the server name. you can always group span names further by db.collection.namespace if that's desired

@trask
Copy link
Copy Markdown
Member

trask commented Apr 10, 2024

where does the schema belong?

For SQL at least:

Comment thread docs/database/database-spans.md Outdated
Comment thread docs/attributes-registry/db.md
@lmolkova lmolkova force-pushed the db-collection-namespace branch from a68803e to a8e64c3 Compare April 11, 2024 00:33
@gregkalapos
Copy link
Copy Markdown
Member

gregkalapos commented Apr 12, 2024

As discussed in the last WG call, looked into what we can collect in .NET.

The DbConnection ADO interface has following relevant methods:

  • DbConnection.DataSource: name of the database server
  • DbConnection.Database: name of the database
  • DbConnection.GetSchema(string): returns a DataTable which contains metadata info about the objects in the db including schema and collection. Here is a super useful article on this.

I ran following code snippet with different DBs and libraries:

static void DisplayConnectionData(DbConnection connection)
{
    Console.WriteLine($"DataSource: {connection.DataSource}, Database: {connection.Database}");
    Console.WriteLine("============================");

    DataTable table = connection.GetSchema("Tables");
    foreach (System.Data.DataRow row in table.Rows)
    {
        foreach (System.Data.DataColumn col in table.Columns)
        {
            Console.WriteLine("{0} = {1}", col.ColumnName, row[col]);
        }
        Console.WriteLine("============================");
    }
}

Above code queries the Tables schema collection (so it only returns infos about tables). Here is a list about other collections (e.g. Procedures could be used for stored procedures).

MS SQL Server with System.Data.SqlClient

Output:

DataSource: 127.0.0.1,50312, Database: master
============================
TABLE_CATALOG = master
TABLE_SCHEMA = dbo
TABLE_NAME = spt_fallback_db
TABLE_TYPE = BASE TABLE
============================
....similar info for all tables...

MySQL with MySqlConnector

Output:

DataSource: 127.0.0.1, Database: test
============================
TABLE_CATALOG = def
TABLE_SCHEMA = information_schema
TABLE_NAME = CHARACTER_SETS
TABLE_TYPE = SYSTEM VIEW
ENGINE =
VERSION = 10
ROW_FORMAT =
TABLE_ROWS = 0
AVG_ROW_LENGTH = 0
DATA_LENGTH = 0
MAX_DATA_LENGTH = 0
INDEX_LENGTH = 0
DATA_FREE = 0
AUTO_INCREMENT =
CREATE_TIME = 12.04.2024 14:18:11
UPDATE_TIME =
CHECK_TIME =
TABLE_COLLATION =
CHECKSUM =
CREATE_OPTIONS =
TABLE_COMMENT =
============================
....similar info for all tables...

PostgreSql with Npgsql

Output:

DataSource: tcp://127.0.0.1:54251, Database: postgres
============================
table_catalog = postgres
table_schema = public
table_name = foo
table_type = BASE TABLE
============================

So, overall it seems, at least in ADO.NET we can get the schema. Issue I see with this is that this seems to query the database - not only that, but connection.GetSchema("Tables") returns the schema info for all tables, which can be very high overhead. Plus it's always a point in time info - e.g. if a table is created after the instrumentation calls connection.GetSchema("Tables"), then connection.GetSchema("Tables") needs to be called again to get the info for that table.

Comment thread docs/attributes-registry/db.md Outdated
Comment thread model/trace/database.yaml Outdated
Comment thread model/registry/db.yaml Outdated
Comment thread model/registry/db.yaml Outdated
Copy link
Copy Markdown
Member

@trask trask left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Comment thread docs/database/database-spans.md Outdated
@trask trask changed the title Rename db.name to db.collection.namespace Rename db.name to db.namespace Apr 18, 2024
@trask trask changed the title Rename db.name to db.namespace Rename and generalize db.name to db.namespace Apr 18, 2024
@trask trask changed the title Rename and generalize db.name to db.namespace Rename db.name to db.namespace and generalize it Apr 18, 2024
Comment thread model/trace/database.yaml Outdated
@lmolkova lmolkova force-pushed the db-collection-namespace branch 2 times, most recently from 207d213 to a308a49 Compare April 19, 2024 18:49
@lmolkova lmolkova force-pushed the db-collection-namespace branch 2 times, most recently from 4cd93aa to 55697b5 Compare April 23, 2024 21:28
@lmolkova lmolkova force-pushed the db-collection-namespace branch from 5a87e3f to 15a0841 Compare April 24, 2024 17:03
Comment thread model/registry/db.yaml

Semantic conventions for individual database systems SHOULD document what `db.namespace`
means in the context of that system.
examples: [ 'customers', 'test.users' ]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These sound like table (collection) names rather than namespace (database) names

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

9 participants