Support Jaeger tracer #1030#1147
Support Jaeger tracer #1030#1147bwplotka merged 40 commits intothanos-io:masterfrom monitoring-tools:feature/jaeger-tracing
Conversation
…tracing # Conflicts: # pkg/tracing/gct.go # pkg/tracing/provider/stackdriver/tracer_test.go # pkg/tracing/tracing.go
bwplotka
left a comment
There was a problem hiding this comment.
Thanks for this! It looks generally good, but I would really vote for sticking to the same configuration method we did for object storage.
Essentially factory like this: https://github.com/improbable-eng/thanos/blob/2c4f64b1b96907295a7f8e99d8fd64697f0eb12a/pkg/objstore/client/factory.go#L38
And using pathOrContent flag (: What do you think? (:
We are really grateful that you contribute back with the awesome addition like this 👍 Let us know what other awesome things you have in your fork 💪 (:
docs/components/bucket.md
Outdated
| GCP project to send Google Cloud Trace tracings to. | ||
| If empty, tracing will be disabled. | ||
| --gcloudtrace.sample-factor=1 | ||
| --stackdriver.sample-factor=1 |
There was a problem hiding this comment.
What do you think about having flags totally provider agnostic, same way we do for objconfig? Benefits would be great as everytime you change or add thousands of customization flags to one provider the flag help stays clean relevant for all users?
Even though we only agreed ont jaeger and stackdriver for now. What do you think?
There was a problem hiding this comment.
I would stick to objstore config like configuration, mainly for consistency. Shouldn't be much more code
There was a problem hiding this comment.
Also it would be great to move most of configuration to one file to have prometheus-like configuration approach. Its convenient to have all flags in one place.
There was a problem hiding this comment.
I think it's a bigger decision. We chose not to do it as:
- We have many smaller components instead of big single one like Prometheus
- Most of options are not dynamic. Adding this dynamicity is enourmous amount of work
- Objstore is separate to tracing. And it could be multiple of objstore configs as well, so it's hard to unifiy.
There was a problem hiding this comment.
But someday (: Also, we are fans of flags (: for readability
There was a problem hiding this comment.
Okay, got it, thanks! :)
pkg/tracing/provider/factory.go
Outdated
| return f | ||
| } | ||
|
|
||
| // Create creates the tracer in appliance with a tracerType |
pkg/tracing/provider/factory.go
Outdated
|
|
||
| // Create creates the tracer in appliance with a tracerType | ||
| func (f *Factory) Create(ctx context.Context, logger log.Logger, serviceName string) (opentracing.Tracer, io.Closer) { | ||
| var tracer opentracing.Tracer |
There was a problem hiding this comment.
I would suggest this to make it bit more elegant
var (
)
There was a problem hiding this comment.
Also TBH I would just name return arguments (:
pkg/tracing/provider/factory.go
Outdated
| var err error | ||
| factory, ok := f.factories[*f.tracingType]; | ||
| if !ok { | ||
| level.Info(logger).Log("msg", "Invalid tracer type. Tracing will be disabled.", "err", err) |
There was a problem hiding this comment.
I am not fan of this. I would prefer error handling done outside of factory I think.
Reason is, that you should fail fast - if user actually provide some tracing configuration - if it's wrong you want to fail Thanos process immdiately (since it's on start - it's ok)
Allowing silent errors (only log lines and Info) is not enough - it will be missed and create annoyance. Can we fix this? (:
| ) | ||
|
|
||
| // Factory implements tracing.Factory for Jaeger Tracer. | ||
| type Factory struct { |
There was a problem hiding this comment.
| type Factory struct { | |
| type Factory struct {} |
|
|
||
| // Create implements tracing.Factory | ||
| func (f *Factory) Create(ctx context.Context, logger log.Logger, serviceName string) (opentracing.Tracer, io.Closer, error) { | ||
| cfg, err := config.FromEnv() |
There was a problem hiding this comment.
Hmmm this is again - breaking consistency with other things we have. Having verbose config instead like objstore config would allow dynamic docs like here: https://thanos.io/storage.md/#aws-s3-configuration that is auto generated on every commit.
This way we are sure that all fields are known to the users for exact version of Thanos binary (: I know it's additional work, but IMO it would be awesome to allow specfying this not from env but from config (which is can be passed directly by flags as well). Additionally grabing from env might be possible as well
| JaegerDebugHeader: tracing.ForceTracingBaggageKey, | ||
| } | ||
| cfg.Headers.ApplyDefaults() | ||
| if serviceName != "" { |
There was a problem hiding this comment.
we would not need this if we would explicitly parse config from flags (:
| logger: logger, | ||
| } | ||
| jMetricsFactory := prometheus.New() | ||
| tracer, closer, err := cfg.NewTracer( |
There was a problem hiding this comment.
just return cfg.NewTracer(...) would be enough here?
| return tracer, closer, nil | ||
| } | ||
|
|
||
| // RegisterKingpinFlags implements tracing.Factory |
| @@ -0,0 +1,112 @@ | |||
| package stackdriver | |||
There was a problem hiding this comment.
just curious, any change here or just move?
There was a problem hiding this comment.
Also can we look on pjg/objstore package? Can we stick to same approach and just skip this provider directory? What do you think? (: I think both makes sense, just let's choose one
|
@bwplotka Hello =) I sent this pull-request to discuss configurations and flags. I agree with configuration via the same configuration method we did for object storage, but I want to save jaeger.FromEnv configuration. We deploy prometheus with sidecar via prometheus-operator and configuration via ENV is most convenient. |
|
@IKSIN this makes sense, and I am sure we can do proper consistent config from flag thing AND support doing it from env as well (: |
|
@bwplotka Hello) Method to configure tracing was changed as you want (jaeger config from file not implemented yet). |
bwplotka
left a comment
There was a problem hiding this comment.
Some comments, generally that's what we want ❤️ Thanks for the good work!
cmd/thanos/flags.go
Outdated
| } | ||
|
|
||
|
|
||
| func regCommonTracingFlags(cmd *kingpin.Application, suffix string, required bool, extraDesc ...string) *pathOrContent { |
There was a problem hiding this comment.
let's remove suffix and require: YAGNI - we don't need that now (: same with extraDesc
cmd/thanos/main.go
Outdated
| String() | ||
| gcloudTraceSampleFactor := app.Flag("gcloudtrace.sample-factor", "How often we send traces (1/<sample-factor>). If 0 no trace will be sent periodically, unless forced by baggage item. See `pkg/tracing/tracing.go` for details."). | ||
| Default("1").Uint64() | ||
| //tracingFactory := provider.NewFactory(app) |
There was a problem hiding this comment.
Let's remove commented code from PR if it's rdy for review (:
cmd/thanos/main.go
Outdated
| var confContentYaml []byte | ||
| confContentYaml, err = tracingConfig.Content() | ||
| //tracer, closer = tracingFactory.Create(ctx, logger, *debugName) | ||
| tracer, closer, err = provider.NewTracer(ctx, logger, confContentYaml) |
There was a problem hiding this comment.
We need to manually make sure closer is closed on error cases of this function
pkg/tracing/provider/factory.go
Outdated
| level.Info(logger).Log("msg", "loading tracing configuration") | ||
| tracingConf := &TracingConfig{} | ||
| if err := yaml.UnmarshalStrict(confContentYaml, tracingConf); err != nil { | ||
| return &opentracing.NoopTracer{}, ioutil.NopCloser(nil), errors.Wrap(err, "parsing config YAML file") |
There was a problem hiding this comment.
| return &opentracing.NoopTracer{}, ioutil.NopCloser(nil), errors.Wrap(err, "parsing config YAML file") | |
| return &opentracing.NoopTracer{}, ioutil.NopCloser(nil), errors.Wrap(err, "parsing config tracing YAML") |
| } | ||
|
|
||
| func parseConfig(conf []byte) (Config, error) { | ||
| config := Config{} |
There was a problem hiding this comment.
Too shallow function IMO - it does not hide much. What do you think about inlining this?
| } | ||
|
|
||
| func NewTracerWithConfig(ctx context.Context, logger log.Logger, conf Config) (opentracing.Tracer, io.Closer, error) { | ||
| cfg, err := config.FromEnv() |
There was a problem hiding this comment.
We should grab from env only if config is empty? So Flags > Environment? That avoids confusion
| logger: logger, | ||
| } | ||
| jMetricsFactory := prometheus.New() | ||
| tracer, closer, err := cfg.NewTracer( |
There was a problem hiding this comment.
literally return cfg.NewTracer(.... is enough here (:
| if err != nil { | ||
| return nil, nil, err | ||
| } | ||
| println(config.ProjectId) |
| }, func(error) { | ||
| if err := closeFn(); err != nil { | ||
| level.Warn(logger).Log("msg", "closing tracer failed", "err", err) | ||
| if closer != nil { |
There was a problem hiding this comment.
closer is never nil in this case, right?
bwplotka
left a comment
There was a problem hiding this comment.
It LGTM! Thanks for this! We need changelog entry, but we can add it ourselve as well.
pkg/tracing/jaeger/jaeger.go
Outdated
| ) | ||
| if conf != nil { | ||
| level.Info(logger).Log("msg", "loading Jaeger tracing configuration from YAML") | ||
| cfg, err = FromYaml(conf) |
There was a problem hiding this comment.
| cfg, err = FromYaml(conf) | |
| cfg, err = ParseConfigFromYaml(conf) |
pkg/tracing/jaeger/config_yaml.go
Outdated
| } | ||
|
|
||
| if e := conf.RPCMetrics; e != "" { | ||
| if value, err := strconv.ParseBool(e); err == nil { |
There was a problem hiding this comment.
Couldn't we make a field bool and get rid of parsing?
There was a problem hiding this comment.
Good point, missed that - We can enforce YAML unmarshalling to parse that for us.
pkg/tracing/jaeger/config_yaml.go
Outdated
| } | ||
|
|
||
| if e := os.Getenv(conf.Disabled); e != "" { | ||
| if value, err := strconv.ParseBool(e); err == nil { |
pkg/tracing/jaeger/config_yaml.go
Outdated
| } | ||
|
|
||
| if e := cfg.SamplerParam; e != "" { | ||
| if value, err := strconv.ParseFloat(e, 64); err == nil { |
There was a problem hiding this comment.
Shouldn't Config.SamplerParam be float64 in struct?
pkg/tracing/jaeger/config_yaml.go
Outdated
| } | ||
|
|
||
| if e := cfg.SamplerMaxOperations; e != "" { | ||
| if value, err := strconv.ParseInt(e, 10, 0); err == nil { |
pkg/tracing/jaeger/config_yaml.go
Outdated
| } | ||
|
|
||
| if e := cfg.SamplerRefreshInterval; e != "" { | ||
| if value, err := time.ParseDuration(e); err == nil { |
pkg/tracing/jaeger/config_yaml.go
Outdated
| } | ||
|
|
||
| if e := conf.RPCMetrics; e != "" { | ||
| if value, err := strconv.ParseBool(e); err == nil { |
There was a problem hiding this comment.
Good point, missed that - We can enforce YAML unmarshalling to parse that for us.
povilasv
left a comment
There was a problem hiding this comment.
I think we should simplify Config struct, instead of having all the fields as string and parsing that, let's just let the yaml unmarshaller to do the work.
# Conflicts: # go.mod # go.sum
# Conflicts: # go.mod # go.sum
|
@bwplotka Done. |
bwplotka
left a comment
There was a problem hiding this comment.
Thanks for awesome work on this. LGTM
There are some minior comment nits to be strict... ((:
pkg/tracing/jaeger/config_yaml.go
Outdated
| Tags string `yaml:"tags"` | ||
| SamplerType string `yaml:"sampler_type"` | ||
| SamplerParam float64 `yaml:"sampler_param"` | ||
| SamplerManagerHostPort string `yaml:"sampler_manager_host_port"` |
There was a problem hiding this comment.
Some docs on this structure would be nice at some point in next PR (:
pkg/tracing/jaeger/jaeger.go
Outdated
| ) | ||
|
|
||
|
|
||
| // Tracer extends opentracing.Tracer |
There was a problem hiding this comment.
missing trailig period and all below. (:
| "gopkg.in/yaml.v2" | ||
| ) | ||
|
|
||
| // Config - YAML configuration |
There was a problem hiding this comment.
missing trailing period on comment.
| SampleFactor uint64 `yaml:"sample_factor"` | ||
| } | ||
|
|
||
| // NewTracer create tracer from YAML |
|
Refactored ) |
|
Awesome! Thank you for awesome job on this! we have Jaeger support now 🎉 |
|
Great job, guys! |
|
Yes, we can definitely do better in terms of spans. Also we can stream
better in Querier proxy.go code, so we can tell more on what is the slowest
StoreAPI etc. (: If you can put a github issue if there is not any on
particular parts that's needs attention, someone can grab it as well (:
…On Tue, 4 Jun 2019 at 19:22, Dmitry Ulyanov ***@***.***> wrote:
Great job, guys!
And thanks a lot for thorough review. We'll add more spans to code cuz
currently there are a lot of blind spots. I think we'll share this code
soon too :)
—
You are receiving this because you modified the open/close state.
Reply to this email directly, view it on GitHub
<#1147>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABVA3O7ZYJJYFZVWAPKT7ITPY2XGVANCNFSM4HNECJQQ>
.
|
| return nil, nil, err | ||
| } | ||
|
|
||
| r, err := gcloudtracer.NewRecorder( |
There was a problem hiding this comment.
not sure how the tests passed her as gcloudtracer is not defined anywhere and when I run make locally it fails there.
There was a problem hiding this comment.
did you download the required go mods?
There was a problem hiding this comment.
yep I found the culprit the module was corrupted during the download.
it was missleading that the module name is gcloudtracer and the import path is github.com/lovoo/gcloud-opentracing, strange why they did it this way.
* Add jaeger tracing feature. * Remove comments * Remove comments * Refactoring tracing * Implementing jaeger logger. * Add jaeger force tracing header. * Use debugName for tracing service-name * RemoveFactory config * Formatting fix; Use io.Closer intead func() error * Rename gcloud => stackdriver * Refactoring google tracing testing. * Delete comments. * Rename gcloud flags => stackdriver. * Update tracing docs. * Fix ..traceType to ..tracerType * Refactoring NoopTracer; Comments for exported functions. * Remove noop tracer. Fix docs. * Remove noop tracer. Fix docs. * Config tracing same as objstore. * Configure jaeger tracing from YAML. Some small fixes. * Fix errcheck * Add X-Thanos-Trace-Id HTTP header for simplified search traces * Cleanup * make docs * Add store addr to tracing tags. * Tracing refactoring. * Fix noop tracing closer. * Add few tracing spans. * Pass prometheus registry to jaeger. * go.mod * Refactoring * Resolve go mod conflicts * Remove comments. * PR refactoring for review. * Format files.
Changes
Verification
pkg/tracing/tracing_test.gomoved topkg/tracing/provider/stackdriver/tracer_test.go