feat(optimizer): [0/N] Optimizer Data Model #527
mkuchenbecker wants to merge 3 commits into linkedin:main
Introduces the optimizer service module with:
- MySQL/H2 schema for table_operations, table_stats, table_stats_history, and table_operations_history
- JPA entities with JSON column support (vladmihalcea hibernate-types)
- All model/DTO/enum types: OperationType, OperationStatus, TableStats, CompleteOperationRequest, JobResult, OperationMetrics, etc.
- JPA AttributeConverters for JobResult and OperationMetrics JSON columns
- MapStruct mapper (OptimizerMapper) for entity→DTO conversion
- Spring Boot application shell and build wiring (settings.gradle, build.gradle dockerPrereqs)

No repositories, controllers, or service layer yet; those follow in subsequent PRs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove OperationMetrics class and converter; stats are read directly from table_stats instead of duplicating into operations
- Remove orphanFilesDeleted/orphanBytesDeleted from history entity, DTO, and schema; operation-specific data belongs in the result JSON
- Add addedSizeBytes to CommitDelta for tracking write volume
- Fix OperationType javadoc to describe current state, not roadmap
- Fix TableOperationsHistoryRow javadoc: written on operation complete, not by Spark app directly
- Add field comments to all DTOs and request objects

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
These fields never belonged in the data model — remove them at the source rather than adding then deleting in a later PR. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

    /** Terminal states for a completed Spark maintenance job. */
    public enum OperationHistoryStatus {
      SUCCESS,

We should keep the existing statuses, such as canceled and queued. These are valid statuses, because jobs sometimes cannot be submitted due to GGW/Yarn issues, etc.
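
To make the suggestion concrete, here is a minimal sketch of a history status enum that keeps non-success states alongside SUCCESS. Only SUCCESS appears in the diff above; FAILED, CANCELED, QUEUED, and the isTerminal helper are assumptions for illustration, not the PR's actual values.

```java
/**
 * Hypothetical sketch: only SUCCESS is taken from the PR diff. The other
 * values and the terminal flag are illustrative assumptions.
 */
enum OperationHistoryStatus {
  SUCCESS(true),
  FAILED(true),    // assumed value
  CANCELED(true),  // assumed value, named in the review comment
  QUEUED(false);   // assumed value: accepted but not yet submitted to the cluster

  private final boolean terminal;

  OperationHistoryStatus(boolean terminal) {
    this.terminal = terminal;
  }

  /** True when no further state transitions are expected. */
  boolean isTerminal() {
    return terminal;
  }
}
```

Carrying a terminal flag on each value would let the history table record jobs that never reached the cluster without changing what "completed" means for the rest.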
    private String jobId;

    /** Reserved for future per-operation metadata; currently unused. */
    private String metrics;

Can we use a class here instead, to capture more info? Or do we plan to store a JSON string?
    /** Same UUID as the originating {@code table_operations.id}. Set by the caller; not generated. */
    @Id
    @Column(name = "id", nullable = false, length = 36)
    private String id;

Looks like this is a UUID generated as part of job submission?
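
The javadoc above says the history row reuses the originating table_operations.id rather than generating its own. A minimal sketch of that flow, assuming the ID is created with java.util.UUID when the operation is first created (the class and method names here are hypothetical and do not come from the PR):

```java
import java.util.UUID;

/** Hypothetical sketch; class and method names are illustrative, not from the PR. */
final class OperationIdSketch {

  private OperationIdSketch() {}

  /** Generates the operation ID once, when the operation is created. */
  static String newOperationId() {
    // The canonical UUID string form is 36 characters, which matches
    // length = 36 on the id column.
    return UUID.randomUUID().toString();
  }

  /** The history row copies the caller-supplied ID instead of generating a new one. */
  static String historyIdFor(String operationId) {
    // Throws IllegalArgumentException when the input is not in UUID form.
    UUID.fromString(operationId);
    return operationId;
  }
}
```

With this shape, the same string identifies the live operation and its history row, so the two tables can be joined on id.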
    private String tableUuid;

    @Column(name = "database_name", nullable = false, length = 255)
    private String databaseName;

This seems to be 128 characters long in the current prod schema.
    @Column(name = "database_name", nullable = false, length = 255)
    private String databaseName;

    @Column(name = "table_name", nullable = false, length = 255)

The table name is also 128 characters long, but we can double-check.
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    @Column(name = "id", nullable = false)
    private Long id;

Is this an auto-increment ID or the primary key?
    @Column(name = "table_uuid", nullable = false, length = 36)
    private String tableUuid;

    @Column(name = "database_id", nullable = false, length = 255)

Can we use only database_name for consistency?
    -- Optimizer Service Schema
    -- Compatible with MySQL (production) and H2 in MySQL mode (tests).
    CREATE TABLE IF NOT EXISTS table_operations (
      id VARCHAR(36) NOT NULL,

Can we consider adding indexes for these tables too?

    /** When the operation completed, as recorded by the complete endpoint. */
    @Column(name = "submitted_at", nullable = false)
    private Instant submittedAt;

Should this be completionTime instead?
    @Builder(toBuilder = true)
    @NoArgsConstructor
    @AllArgsConstructor
    public static class CommitDelta {

Does this also require @JsonIgnoreProperties? It could provide forward compatibility, or act as a safeguard during upgrades if new fields are added.

Optimizer Stack
Summary
PR 0 of N in the optimizer stack.
Overall Project
Service Design doc.
Introduces the MySQL data model for the optimizer service module.
Changes
Testing Done
This PR contains only the data model (entities, DTOs, converters). Repository tests follow in PR 1. Verified:
- ./gradlew :services:optimizer:compileJava passes
- ./gradlew compileJava (full project) passes with no regressions

Additional Information