Update cloud instance type history ingestion process. #2160
Merged
eiffel777 merged 8 commits into ubccr:main on Mar 3, 2026
…hp class now that we have access to window functions
… that is no longer needed
jpwhite4 reviewed on Mar 2, 2026
  "disk_gb": "staging.disk_gb",
  "start_time": "staging.start_time",
- "end_time": -1
+ "end_time": "CASE WHEN LEAD(staging.start_time) OVER (PARTITION BY staging.resource_id, staging.instance_type ORDER BY staging.start_time) IS NOT NULL THEN LEAD(staging.start_time) OVER (PARTITION BY staging.resource_id, staging.instance_type ORDER BY staging.start_time) - 0.000001 ELSE UNIX_TIMESTAMP(DATE_ADD(TIMESTAMP(CURDATE()), INTERVAL '23:59:59' HOUR_SECOND)) END"
Member
What is the data type of end_time? The - 0.000001 seems like an unusual offset to subtract, since the ELSE branch is a UNIX_TIMESTAMP(), which is going to be to the nearest INT.
Member
What are the semantics of end_time? Is it the end of a closed or an open interval?
Contributor
Author
@jpwhite4 The data type for end_time is decimal(16,6), hence the 6 decimal places. This is because the event time in the cloud log files is recorded to 6 decimal places. Also, end_time is the end of a closed interval.
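To illustrate the closed-interval semantics described in this reply, here is a minimal Python sketch, not the PR's code. It assumes a single partition (one resource_id and instance_type pair), and the function name `closed_end_times`, the sample timestamps, and the fallback value are all hypothetical. Each end_time is one microsecond before the next start_time, so adjacent intervals neither overlap nor leave a gap at six-decimal resolution.

```python
from decimal import Decimal

# One microsecond: the smallest step a decimal(16,6) column can represent.
EPSILON = Decimal("0.000001")


def closed_end_times(start_times, fallback_end):
    """Pair each start time with an inclusive (closed) end time.

    The end is the next start minus one microsecond. The last row has no
    successor, so it gets the caller-supplied fallback (the PR uses the end
    of the current day for this case).
    """
    starts = sorted(Decimal(s) for s in start_times)
    intervals = []
    for i, start in enumerate(starts):
        if i + 1 < len(starts):
            end = starts[i + 1] - EPSILON
        else:
            end = Decimal(fallback_end)
        intervals.append((start, end))
    return intervals


# Hypothetical sample values, passed as strings to avoid float rounding.
intervals = closed_end_times(
    ["1700000000.250000", "1700003600.500000"],
    fallback_end="1700092799",
)
for start, end in intervals:
    print(start, end)
```

Using Decimal rather than float keeps the microsecond subtraction exact, which mirrors why a fixed-point column type suits these timestamps.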
…dmod into remove-groupby-cloud-staging
jpwhite4 approved these changes on Mar 3, 2026
aaronweeden pushed a commit to aaronweeden/xdmod that referenced this pull request on Mar 30, 2026
…ging Update cloud instance type history ingestion process.
aaronweeden added a commit that referenced this pull request on Mar 31, 2026
* Merge pull request #2072 from eiffel777/add-memory-instance-state-machine
  Add memory to cloud instance type ingestor sorting to prevent unique key errors
* Merge pull request #2084 from eiffel777/add-straight-join-metrics-explorer
  Adding STRAIGHT_JOIN to metrics explorer/usage tab queries to improve performance
* Merge pull request #2160 from eiffel777/remove-groupby-cloud-staging
  Update cloud instance type history ingestion process.
---------
Co-authored-by: Greg Dean <gmdean@buffalo.edu>
This PR does two things.
A couple of notes.
* instance_type_change_flag.json - uses LAG to set a 1 or 0 denoting whether that row is where a change in configuration occurred
* instance_type_config_group.json - uses the is_change column from modw_cloud.instance_type_change_flag and a window function to mark each distinct configuration of an instance type
* instance_type_grouped.json - groups all instance types by config group to get the MIN start time for each configuration. Uses MAX on the display and description columns to comply with the ONLY_FULL_GROUP_BY mode. Since these columns should have the same value within each group, MAX should always return the correct value.
* instance_type_staging.json - sets the end time for each instance type configuration using LEAD.
The CloudInstanceTypeStateIngestor ingestor and its associated test were deleted because they are no longer needed.
Tests performed
Tested in docker
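The four staging actions described in the notes above follow a common "gaps and islands" pattern. The sketch below is an illustration only and does not reproduce the contents of the JSON action files: it runs against an in-memory SQLite database rather than the XDMoD database, and the `raw` table, the `num_cores` column, the `group_id` column, and the sample rows are all hypothetical. It assumes a Python build whose bundled SQLite supports window functions (SQLite 3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical input: one row per observed instance type event.
conn.execute(
    "CREATE TABLE raw (instance_type TEXT, num_cores INTEGER, start_time INTEGER)"
)
conn.executemany(
    "INSERT INTO raw VALUES (?, ?, ?)",
    [
        ("m1.small", 1, 100),
        ("m1.small", 1, 200),  # same configuration as the previous row
        ("m1.small", 2, 300),  # configuration changed
        ("m1.small", 2, 400),
        ("m1.small", 1, 500),  # changed back: still a new configuration period
    ],
)

QUERY = """
WITH change_flag AS (
    -- Step 1: LAG flags rows where the configuration differs from the prior row.
    SELECT instance_type, num_cores, start_time,
           CASE
               WHEN LAG(num_cores) OVER (
                        PARTITION BY instance_type ORDER BY start_time
                    ) IS NULL THEN 1
               WHEN LAG(num_cores) OVER (
                        PARTITION BY instance_type ORDER BY start_time
                    ) <> num_cores THEN 1
               ELSE 0
           END AS is_change
    FROM raw
),
config_group AS (
    -- Step 2: a running SUM of the flag numbers each configuration period.
    SELECT instance_type, num_cores, start_time,
           SUM(is_change) OVER (
               PARTITION BY instance_type ORDER BY start_time
           ) AS group_id
    FROM change_flag
),
grouped AS (
    -- Step 3: collapse each period to its earliest start time. MAX on the
    -- non-grouped column is safe because it is constant within a group.
    SELECT instance_type, group_id,
           MAX(num_cores) AS num_cores,
           MIN(start_time) AS start_time
    FROM config_group
    GROUP BY instance_type, group_id
)
-- Step 4: LEAD exposes the next period's start, from which an end time follows.
SELECT instance_type, num_cores, start_time,
       LEAD(start_time) OVER (
           PARTITION BY instance_type ORDER BY start_time
       ) AS next_start
FROM grouped
ORDER BY instance_type, start_time
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The final row has no successor, so `next_start` is NULL there; that is the case where the PR falls back to the end of the current day.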
Checklist: