Skip to content

Conversation

@bsorbo
Copy link
Collaborator

@bsorbo bsorbo commented Jun 25, 2025

This is a successor to #628 - in this PR, we add a general-purpose mechanism for GraphQL extensions (such as the query registry) to log additional data within the ElasticGraphQueryExecutorQueryDuration message, and then utilize that within the query registry gem to log whether a given query was registered or not.

How tested?

  • Added/ran unit and acceptance tests.
  • Clean CI build?

@CLAassistant
Copy link

CLAassistant commented Jun 25, 2025

CLA assistant check
All committers have signed the CLA.

@bsorbo bsorbo force-pushed the bsorbo/registered_query_logging branch from 2d5db86 to d2c4156 Compare June 25, 2025 21:03
Copy link
Collaborator

@myronmarston myronmarston left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work @bsorbo / goose :). Left some feedback.

end

# Allows extensions to add custom data that will be included in the query duration log
def set_extension_data(key, value)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def set_extension_data(key, value)
def []=(key, value)

It's more idiomatic in Ruby to model setters as assignment rather than set_xyz methods, and the []= method is a special method name that lets you use array/hash element assignment syntax:

tracker["key"] = "value"

That said: part of me wonders if this should just be a merge! method:

def merge_extension_data!(other_extension_data)
  mutex.synchronize do
    extension_data.merge!(other_extension_data)
  end
end

What's nice about this is that it lets you merge in as many bits of extension data as you want in one call (which only grabs the mutex once) rather than requiring a separate call per entry.

I don't know which approach I prefer, though.

Comment on lines +120 to +123
}

# Merge in any extension-provided data
log_data.merge!(query_tracker.extension_data)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
}
# Merge in any extension-provided data
log_data.merge!(query_tracker.extension_data)
}.merge(query_tracker.extension_data)

/nit As a general rule we favor a functional, "avoid mutation when possible" approach, so we tend to use Hash#merge (which returns a new hash, leaving the old one unmodified) rather than merge!.

attr_accessor datastore_query_server_duration_ms: ::Integer
attr_accessor datastore_query_client_duration_ms: ::Integer
attr_accessor queried_shard_count: ::Integer
attr_accessor extension_data: ::Hash[::String, ::String]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
attr_accessor extension_data: ::Hash[::String, ::String]
attr_accessor extension_data: ::Hash[::String, untyped]

I don't think we should limit the values to strings. The logged data is just meant to be valid JSON. It's reasonable to put a boolean or integer in this data.

(And if you accept this suggestion, please apply it to the other spots below).

expect(tracker.extension_data).to eq("key" => "new_value")
end

it "is thread-safe" do
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is clever, but honestly we don't bother testing the thread safety of any of the other QueryDetailsTracker methods, and I don't think we need to here. This kind of test is going to be more expensive than the others (it spins up 100 threads and waits on them!) and it's inherently non-deterministic (if we don't use a mutex, the test may still pass...) so I don't think it clears the bar of being valuable enough relative to the cost of keeping this test around.

Mind dropping it?

"another_field" => "another_value"
)
end
end
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 good test!

end

def build_and_validate_query(query_string, client:, variables: {}, operation_name: nil, context: {})
def build_and_validate_query_with_registration_status(query_string, client:, variables: {}, operation_name: nil, context: {})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/nit I think I prefer keeping the old name:

  • build_and_validate_query_with_registration_status is pretty long
  • It's not like the old name stated what the return value was, so I don't think we need to call it out here. (Otherwise, to be consistent, should this be build_and_validate_query_with_query_errors_and_registration_status?)
  • I've usually seen _with used in method names for cases where the "with" refers to an input argument or a change in behavior of the method rather than naming part of the return data structure.

# Check if the query matches a registered query
query_is_registered = client_data.canonical_query_strings.include?(
ClientData.canonical_query_string_from(query, schema_element_names: schema.element_names)
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels like a bit of a smell to call a method on client_data and to pass it an argument that you get from ClientData--that suggests to me that maybe there's a missing method on client_data here? Something like:

query_is_registered = client_data.registered?(query)


if !client_data.canonical_query_strings.include?(ClientData.canonical_query_string_from(query, schema_element_names: schema.element_names))
return [query, [client_data.unregistered_query_error_for(query, client)]]
if !query_is_registered
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if !query_is_registered
unless query_is_registered

/nit unless is slightly more idiomatic as it's nice to avoid negations when feasible...

end

# Same as build_and_validate_query but also returns whether the query was registered
def build_and_validate_query_with_registration_status(query_string, client:, variables: {}, operation_name: nil, context: {})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to have both build_and_validate_query_with_registration_status and build_and_validate_query? Or can we just have a single build_and_validate_query method that always includes the registration status in its return tuple?

}.to log_warning a_string_including("is unregistered")
end

it "logs query_registered=true when a query is registered for the client" do
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
it "logs query_registered=true when a query is registered for the client" do
it "logs `query_registered: true` when a query is registered for the client" do

/nit I think it's slightly more readable this way--and ultimately it's logged as JSON which uses key: value syntax instead of key=value.

(And if you accept this suggestion, please apply it below...)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants