The current database semconv states the following:
Should be collected by default only if there is sanitization that excludes sensitive information.
source: semantic-conventions/model/trace/database.yaml, line 211 in 064fe4e
In JS, most of our instrumentations solve this by taking advantage of the parameterized API provided by SQL clients. For example, the user may call sql.query('SELECT * FROM mydb WHERE userid = ?', userId) or similar. In this case we collect the string 'SELECT * FROM mydb WHERE userid = ?'. This presents the following problems:
- It is still possible for the string in a parameterized query to contain some non-parameterized values, which may contain sensitive data. Without parsing the string, we have no way of knowing.
- If the query is not parameterized, we collect it directly and it may still contain sensitive data.
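To make the two problems concrete, here is a minimal sketch. `captureStatement` is a hypothetical stand-in for what an instrumentation records, not the actual code of any instrumentation, and the `ssn` and `email` columns are made up for illustration.

```javascript
// Hypothetical stand-in for what an instrumentation records as the statement:
// the query text exactly as passed to the client, with bound values dropped.
function captureStatement(text, _values) {
  return text;
}

// Fully parameterized: the bound value never appears in the captured text.
const safe = captureStatement('SELECT * FROM mydb WHERE userid = ?', [42]);

// Parameterized, but a literal is still embedded in the query text.
const mixed = captureStatement(
  "SELECT * FROM mydb WHERE userid = ? AND ssn = '123-45-6789'",
  [42]
);

// Not parameterized at all: the value is concatenated into the query text.
const email = 'alice@example.com';
const raw = captureStatement(
  "SELECT * FROM mydb WHERE email = '" + email + "'"
);

console.log(safe);
console.log(mixed);
console.log(raw);
```

Only the first captured string is free of user data. Without parsing the text, the instrumentation cannot tell the three cases apart.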
At least some SIGs, such as Java, have handled this by parsing the SQL and removing values, but in JS this is difficult because of bundle size restrictions and the lack of good parsers. I suspect this may also affect other SIGs. What is the recommendation of the Semantic Conventions group?
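For discussion, here is a rough sketch of what value removal could look like in JS without a full parser. It is an assumption on my part, not what the Java instrumentation does and not a proposed implementation. `sanitizeStatement` is a hypothetical helper that only handles single-quoted strings and plain numeric literals; it ignores comments, dollar-quoted strings, and dialect-specific escapes.

```javascript
// Rough sketch only: replaces quoted strings and numeric literals with '?'.
// This is NOT a SQL parser and will miss or mangle some dialect-specific syntax.
function sanitizeStatement(sql) {
  return sql
    // Single-quoted strings, treating '' as an escaped quote inside the string.
    .replace(/'(?:[^']|'')*'/g, '?')
    // Numeric literals that are not part of an identifier (col1)
    // or a positional placeholder ($1).
    .replace(/(?<![\w$])\d+(?:\.\d+)?\b/g, '?');
}

const sanitized = sanitizeStatement(
  "SELECT * FROM mydb WHERE userid = 42 AND name = 'O''Brien'"
);
console.log(sanitized);
```

A regex pass like this keeps the bundle small, but it cannot guarantee that every sensitive value is removed, which is the core of the question.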