@robertlaurin
Contributor

😭

ArgumentError: invalid byte sequence in UTF-8
gems/opentelemetry-instrumentation-mysql2-0.21.0/lib/opentelemetry/instrumentation/mysql2/patches/client.rb:76 gsub	
gems/opentelemetry-instrumentation-mysql2-0.21.0/lib/opentelemetry/instrumentation/mysql2/patches/client.rb:76 obfuscate_sql	
gems/opentelemetry-instrumentation-mysql2-0.21.0/lib/opentelemetry/instrumentation/mysql2/patches/client.rb:59 query
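
For anyone wanting to reproduce the trace above outside the gem, here is a minimal sketch. It assumes the failure comes from running `gsub` with a regex over a string that is tagged UTF-8 but contains an invalid byte; `REPRO_LITERAL_RE` is a hypothetical stand-in, not the instrumentation's generated regex.

```ruby
# A string tagged as UTF-8 that ends in a stray 0xFF byte, so it is not valid UTF-8.
repro_sql = "SELECT * FROM users WHERE email = 'test@test.com\xFF'"

# Hypothetical stand-in for the generated obfuscation regex.
REPRO_LITERAL_RE = /'[^']*'|\d+/

# Capture the error instead of letting it propagate, so it can be inspected.
repro_error = begin
  repro_sql.gsub(REPRO_LITERAL_RE, '?')
  nil
rescue ArgumentError => e
  e
end

puts repro_sql.valid_encoding?
puts repro_error.class
puts repro_error.message
```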

@robertlaurin robertlaurin force-pushed the fix-obfuscation-encoding-error branch from 1806119 to c98338c Compare October 24, 2022 20:33
@robertlaurin robertlaurin force-pushed the fix-obfuscation-encoding-error branch from c98338c to cbcad76 Compare October 24, 2022 20:41
Contributor

@plantfansam plantfansam left a comment

👍

obfuscated
end
rescue StandardError
'OpenTelemetry error: failed to obfuscate sql'
Contributor

Is it possible to make a more specific error message here?

Contributor Author

I wasn't sure how specific to make it; this is what will get plunked into the db.statement attribute field. I'm being cautious about capturing any information about the statement that failed to be obfuscated.

What did you have in mind? Like the error class or something?

Contributor

Ah, I didn't realize that this would get put into the db.statement field, which is my bad for not looking closely enough. I think this is fine. 👍

Contributor

Probably not something you need to address here, but it'd be useful to use a common prefix for all the "error" type substitutions (this and the two above) - it'd make it easier for Observability teams to monitor this behaviour. Metrics might be nice as well, but a competent engineer can build metrics from the db.statement in a span processor or in the collector.
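
A rough sketch of what the common-prefix idea could look like, in case it helps a follow-up. Everything here is hypothetical: the `ObfuscationFallbacks` module, its constant names, and the "too large" wording are illustrative and not part of the gem. The point is only that a shared prefix lets a span processor or collector rule detect every substitution with one check.

```ruby
# Hypothetical module: none of these names exist in the instrumentation.
module ObfuscationFallbacks
  PREFIX = 'OpenTelemetry error:'
  FAILED = "#{PREFIX} failed to obfuscate sql"
  TOO_LARGE = "#{PREFIX} sql query too large to obfuscate"

  # True when a db.statement value is one of the fallback substitutions.
  def self.fallback?(statement)
    statement.is_a?(String) && statement.start_with?(PREFIX)
  end
end

# Stand-in for db.statement values seen by a processor.
statements = [
  'SELECT * FROM users WHERE id = ?',
  ObfuscationFallbacks::FAILED,
  ObfuscationFallbacks::TOO_LARGE
]

fallback_count = statements.count { |s| ObfuscationFallbacks.fallback?(s) }
puts fallback_count
```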

@robertlaurin
Contributor Author

Y'all need to stop approving my failing PR lol

  1) Failure:
OpenTelemetry::Instrumentation::Mysql2::Instrumentation::tracing::when enable_sql_obfuscation is enabled#test_0002_encodes invalid byte sequences for db.statement [/home/runner/work/opentelemetry-ruby-contrib/opentelemetry-ruby-contrib/instrumentation/mysql2/test/opentelemetry/instrumentation/mysql2/instrumentation_test.rb:222]:
Expected: "select"
  Actual: true

@robertlaurin
Contributor Author

Can anyone guess why the span.name is true in my failing test? 😭

def extract_statement_type(sql)
  QUERY_NAME_RE.match(sql) { |match| match[1].downcase } unless sql.nil?
rescue StandardError => e
  OpenTelemetry.logger.debug("Error extracting sql statement type: #{e.message}")
end

@robertlaurin
Contributor Author

I mean technically true is not nil.

def database_span_name(sql)
  # Setting span name to the SQL query without obfuscation would
  # result in PII + cardinality issues.
  # First attempt to infer the statement type then fallback to
  # current Otel approach {database.component_name}.{database_instance_name}
  # https://github.com/open-telemetry/opentelemetry-python/blob/39fa078312e6f41c403aa8cad1868264011f7546/ext/opentelemetry-ext-dbapi/tests/test_dbapi_integration.py#L53
  # This creates span names like mysql.default, mysql.replica, postgresql.staging etc.
  statement_type = extract_statement_type(sql)
  return statement_type unless statement_type.nil?

  # fallback
  database_name ? "mysql.#{database_name}" : 'mysql'
end
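
To spell out the two snippets above: a `rescue` clause returns the value of its last expression, and a logger's `debug` call returns `true`, so the method hands back `true` and the `nil?` guard never reaches the fallback. Below is a standalone sketch using the stdlib `Logger` in place of `OpenTelemetry.logger`. `STATEMENT_TYPE_RE` and both method names are hypothetical stand-ins, not the gem's code.

```ruby
require 'logger'

# Stdlib logger standing in for OpenTelemetry.logger; nil discards output.
SKETCH_LOGGER = Logger.new(nil)

# Hypothetical stand-in for QUERY_NAME_RE.
STATEMENT_TYPE_RE = /\A\s*(select|insert|update|delete)\b/i

# Mirrors the original shape: the rescue clause ends with the logger call,
# so the method returns the logger's return value (true).
def statement_type_implicit(sql)
  STATEMENT_TYPE_RE.match(sql) { |m| m[1].downcase } unless sql.nil?
rescue StandardError => e
  SKETCH_LOGGER.debug("Error extracting sql statement type: #{e.message}")
end

# Same method with an explicit nil as the rescue clause's last expression.
def statement_type_explicit(sql)
  STATEMENT_TYPE_RE.match(sql) { |m| m[1].downcase } unless sql.nil?
rescue StandardError => e
  SKETCH_LOGGER.debug("Error extracting sql statement type: #{e.message}")
  nil
end

# Tagged UTF-8 but contains an invalid byte, so regex matching raises.
broken_sql = "SELECT * FROM users WHERE email = 'test@test.com\xFF'"

p statement_type_implicit(broken_sql)
p statement_type_explicit(broken_sql)
p statement_type_explicit('SELECT 1')
```

With the explicit `nil`, a caller that checks `statement_type.nil?` falls through to its fallback name as intended.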

@robertlaurin robertlaurin force-pushed the fix-obfuscation-encoding-error branch from 8f213a1 to bab4c08 Compare October 25, 2022 16:19
QUERY_NAME_RE.match(sql) { |match| match[1].downcase } unless sql.nil?
rescue StandardError => e
OpenTelemetry.logger.debug("Error extracting sql statement type: #{e.message}")
nil
Contributor Author

This error is covered by my new test.

'SQL query too large to remove sensitive data ...'
else
obfuscated = sql.gsub(generated_mysql_regex, '?')
obfuscated = OpenTelemetry::Common::Utilities.utf8_encode(sql, binary: true)
Contributor Author

We're already setting binary to true in the redis and dalli instrumentations:

OpenTelemetry::Common::Utilities.utf8_encode(serialized_commands, binary: true)

command = OpenTelemetry::Common::Utilities.utf8_encode(command, binary: true, placeholder: placeholder)

So I think it makes sense here.
https://github.com/open-telemetry/opentelemetry-ruby/blob/18bfd391f2bda2c958d5d6935886c8cba61414dd/common/lib/opentelemetry/common/utilities.rb#L40-L63
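
For readers without the gem at hand, here is a stdlib-only sketch of the encode-then-obfuscate ordering. It assumes that re-encoding from binary to UTF-8 with invalid and undefined bytes dropped is a fair approximation of what `utf8_encode(..., binary: true)` does; the linked source is the authority. `SKETCH_SQL_LITERAL_RE` and `scrub_then_obfuscate` are hypothetical names, and the regex is far simpler than the generated one.

```ruby
# Hypothetical, simplified stand-in for the generated obfuscation regex.
SKETCH_SQL_LITERAL_RE = /'[^']*'|\b\d+\b/

# Re-encode first so the regex never sees an invalid byte sequence,
# then replace literals with a placeholder.
def scrub_then_obfuscate(sql)
  cleaned = sql.to_s.encode('UTF-8', 'binary', invalid: :replace, undef: :replace, replace: '')
  cleaned.gsub(SKETCH_SQL_LITERAL_RE, '?')
end

broken_statement = "SELECT * FROM users WHERE id = 1 AND email = 'test@test.com\xFF'"
scrubbed = scrub_then_obfuscate(broken_statement)

puts scrubbed
# prints: SELECT * FROM users WHERE id = ? AND email = ?
```

One trade-off of this approach: treating the input as binary drops every byte above 0x7F, so valid multibyte characters are removed along with the invalid ones.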

@ericmustin
Contributor

Y'all need to stop approving my failing PR lol

@robertlaurin this is merely a reflection of how high your trust battery charge is with me

@ericmustin ericmustin self-requested a review October 25, 2022 17:53
@arielvalentin
Contributor

Are other SQL instrumentations broken in the same way?

@robertlaurin
Contributor Author

Are other SQL instrumentations broken in the same way?

It looks like Trilogy and PG may be vulnerable to the same error. I'll get this out and see how it does in production before porting it over to the other two.

@robertlaurin robertlaurin merged commit ed4eec3 into main Oct 26, 2022