[New Integration] Greenhouse ATS audit logs #17347
base: main
Conversation
Add a new integration to collect audit logs from the Greenhouse Applicant Tracking System (ATS) via the Greenhouse Audit Log API.

Features:
- CEL input with a two-step JWT authentication flow
- Cursor-based pagination with Pit-Id and Search-After headers
- Time-based filtering for incremental data collection
- Full ECS mapping including user, event, and source fields
- GeoIP enrichment for source IP addresses
- Pipeline and system tests with a mock service

Dashboards:
- Audit Logs Overview: event counts, timeline, user activity, geographic distribution, event types, request types, target types, performer types
- Data Changes: focused view of create/update/delete operations with field-level change tracking and a user activity breakdown

Co-authored-by: Cursor <[email protected]>
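As a rough illustration of the cursor-based pagination described above, the CEL input could attach the cursor values as request headers along the following lines. This is a minimal sketch, not the PR's actual template: the `/audit_log` path and the `cursor.pit_id` / `cursor.search_after` state field names are assumptions.

```yaml
program: |
  get_request(state.url + "/audit_log")  // endpoint path is an assumption
    .with({
      "Header": {
        "Authorization": ["Bearer " + state.token],
        // pagination cursor returned by the previous page
        "Pit-Id": [state.cursor.pit_id],
        "Search-After": [state.cursor.search_after]
      }
    })
    .do_request()
```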
Vale Linting Results

Summary: 1 warning found
| File | Line | Rule | Message |
|---|---|---|---|
| packages/greenhouse/docs/README.md | 124 | Elastic.Latinisms | Latin terms and abbreviations are a common source of confusion. Use 'using' instead of 'via'. |
The Vale linter checks documentation changes against the Elastic Docs style guide.
To use Vale locally or report issues, refer to Elastic style guide for Vale.
Force-pushed from f9f4fe9 to 9611d58
Migrate from Audit Log V2 (deprecated August 2026) to Harvest API V3 with OAuth 2.0 Client Credentials authentication.

Changes:
- Replace harvest_api_key with client_id, client_secret, and user_id
- Update token endpoint from /auth/jwt_access_token to /token
- Use OAuth 2.0 Client Credentials grant_type with sub parameter
- Update token response parsing for expires_at field
- Update mock service and system tests for V3 flow
- Update documentation with V3 setup instructions

Co-authored-by: Cursor <[email protected]>
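For orientation, the token exchange this commit describes could look roughly like the sketch below in the CEL input. This is not the PR's actual template: the form-encoded request body (via `format_query`), the `state.url` base variable, and the `access_token` response field name are assumptions beyond what the commit message states; only the `/token` path, the `client_credentials` grant with `sub`, and the `expires_at` field come from the commit.

```yaml
program: |
  post_request(
    state.url + "/token",  // V3 token endpoint (was /auth/jwt_access_token in V2)
    "application/x-www-form-urlencoded",
    {
      "grant_type": ["client_credentials"],
      "client_id": [state.client_id],
      "client_secret": [state.client_secret],
      "sub": [state.user_id]  // Harvest V3 ties the grant to a user via sub
    }.format_query()
  ).do_request().as(resp,
    bytes(resp.Body).decode_json().as(body, {
      "token": body.access_token,
      "token_expires": body.expires_at  // V3 returns expires_at rather than an expiry interval
    })
  )
```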
Add the docs README.md template with {{fields}} and {{event}}
placeholders that get expanded during package build. This matches
the structure used by other integrations. The template includes
additional sections for rate limiting, data retention, and
troubleshooting.
Co-authored-by: Cursor <[email protected]>
The ecs.yml file is only needed for backward compatibility with Kibana versions prior to 8.14. Since this integration requires Kibana 8.18+, the explicit ECS field declarations are not needed. ECS fields are automatically recognized in newer Kibana versions.

Co-authored-by: Cursor <[email protected]>
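For context, the kind of explicit declaration an ecs.yml carries is shown below. This is an illustrative sketch only; the specific fields are examples, not a listing of what this package declared.

```yaml
# fields/ecs.yml — explicit ECS references, only needed for Kibana < 8.14
- external: ecs
  name: source.ip
- external: ecs
  name: user.id
- external: ecs
  name: event.action
```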
💚 Build Succeeded
ShourieG left a comment
🤖 AI-Generated Review | Elastic Integration PR Review Bot
⚠️ This is an automated review generated by an AI assistant. Please verify all suggestions before applying changes. This review does not represent a human reviewer's opinion.
PR Review | elastic/integrations #17347
Field Mapping
✅ Reviewed - No actionable issues found.
Pipeline
Data Stream: audit (package: greenhouse)
File: packages/greenhouse/data_stream/audit/elasticsearch/ingest_pipeline/default.yml
Issue 1: Date Processor Missing Error Handling
Severity: 🟡 Medium
Location: packages/greenhouse/data_stream/audit/elasticsearch/ingest_pipeline/default.yml lines 37-43
Problem: The date processor lacks both a tag field and an on_failure handler. Date parsing can fail with malformed timestamps, and without error handling, the entire document ingestion will fail.
Recommendation:
```yaml
- date:
    field: greenhouse.audit.event_time
    target_field: "@timestamp"
    formats:
      - "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"
      - "yyyy-MM-dd'T'HH:mm:ss.SSSXXX"
      - ISO8601
    tag: parse_timestamp
    on_failure:
      - append:
          field: error.message
          value: 'Failed to parse timestamp from greenhouse.audit.event_time: {{{ _ingest.on_failure_message }}}'
```

Issue 2: Convert Processors Missing on_failure Handlers
Severity: 🟡 Medium
Location: packages/greenhouse/data_stream/audit/elasticsearch/ingest_pipeline/default.yml lines 106, 121, 154, 161
Problem: Multiple convert processors have ignore_missing: true but lack on_failure handlers. Conversions can fail when data is in an unexpected format (e.g., invalid IP address, non-numeric string for ID), causing document ingestion to fail.
Recommendation:
```yaml
- convert:
    field: greenhouse.audit.performer.id
    type: string
    target_field: user.id
    ignore_missing: true
    tag: convert_performer_id
    on_failure:
      - append:
          field: error.message
          value: 'Failed to convert performer.id to string: {{{ _ingest.on_failure_message }}}'
```

Apply similar fixes to:
- Line 121: `greenhouse.audit.performer.ip_address` → `source.ip` (type: ip), sketched below
- Line 154: `greenhouse.audit.organization_id` → `organization.id` (type: string)
- Line 161: `greenhouse.audit.event.target_id` (type: string)
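For instance, the first of these conversions could follow the same pattern (the tag name here is an assumption):

```yaml
- convert:
    field: greenhouse.audit.performer.ip_address
    target_field: source.ip
    type: ip
    ignore_missing: true
    tag: convert_performer_ip
    on_failure:
      - append:
          field: error.message
          value: 'Failed to convert performer.ip_address to IP: {{{ _ingest.on_failure_message }}}'
```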
Issue 3: Global on_failure Handler Missing Processor Tag
Severity: 🟡 Medium
Location: packages/greenhouse/data_stream/audit/elasticsearch/ingest_pipeline/default.yml line 232
Problem: The global error handler doesn't include _ingest.on_failure_processor_tag, making it harder to identify which processor failed during troubleshooting.
Recommendation:
```yaml
- append:
    field: error.message
    value: 'Processor {{{ _ingest.on_failure_processor_tag }}} failed: {{{ _ingest.on_failure_message }}}'
```

Issue 4: JSON Processor Missing Tag
Severity: 🔵 Low
Location: packages/greenhouse/data_stream/audit/elasticsearch/ingest_pipeline/default.yml line 17
Problem: The json processor should have a tag field for error identification and traceability.
Recommendation:
```yaml
- json:
    field: event.original
    target_field: json
    tag: parse_json
```

Issue 5: GeoIP Processors Missing Tags
Severity: 🔵 Low
Location: packages/greenhouse/data_stream/audit/elasticsearch/ingest_pipeline/default.yml lines 132, 136
Problem: The geoip processors lack tag fields for better traceability during debugging.
Recommendation:
```yaml
- geoip:
    field: source.ip
    target_field: source.geo
    ignore_missing: true
    tag: geoip_source_geo
- geoip:
    database_file: GeoLite2-ASN.mmdb
    field: source.ip
    target_field: source.as
    properties:
      - asn
      - organization_name
    ignore_missing: true
    tag: geoip_source_asn
```

💡 Suggestions
- Consider adding an `event.created` field to track when Greenhouse created the audit log record (if the API provides this field)
- Consider adding an `event.outcome` field to distinguish successful operations from failed attempts (if the API provides status information); a possible shape for both suggestions is sketched after this list
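A possible shape for both suggestions, assuming hypothetical source fields (`greenhouse.audit.created_at` and `greenhouse.audit.event.status` are illustrative names, not confirmed Greenhouse API fields):

```yaml
- set:
    field: event.created
    copy_from: greenhouse.audit.created_at  # hypothetical field carrying the record creation time
    ignore_empty_value: true
- set:
    field: event.outcome
    value: success
    if: ctx.greenhouse?.audit?.event?.status == 'success'  # hypothetical status field
```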
Input Configuration
Data Stream: audit (package: greenhouse)
File: packages/greenhouse/data_stream/audit/agent/stream/cel.yml.hbs
Issue 1: Missing state.with() wrapper - cursor and token state not preserved on errors
Severity: 🟠 High
Location: packages/greenhouse/data_stream/audit/agent/stream/cel.yml.hbs line 35
Problem: The entire CEL program is not wrapped in state.with(), which means cursor and token state will be lost when errors occur. Without state.with(), only explicitly returned fields are preserved. The token acquisition error branch (lines 56-68) doesn't preserve any state, and error branches in the main API request don't preserve cursor state. This can lead to data loss or duplicate ingestion after errors.
Recommendation:
```yaml
program: |
  state.with(
    // Check if token needs refresh (expired or not present)
    (
      !has(state.token) || state.token == "" ||
      !has(state.token_expires) || timestamp(state.token_expires) < now
    ).as(need_token,
      // ... rest of the program
    )
  )
```
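To make the effect concrete, here is a minimal illustration (not the PR's actual program) of an error branch inside state.with(): because only the returned keys are merged into state, omitting cursor and token from the failure map leaves their previous values intact. The endpoint reuses the `/token` path from the migration commit; the request body and response field names are simplified assumptions.

```yaml
program: |
  state.with(
    request("POST", state.url + "/token").do_request().as(resp,
      (resp.StatusCode != 200) ?
        {
          // failure: report the error, but return no "cursor" or "token" keys,
          // so state.with() keeps their previous values for the next run
          "events": [{"error": {"message": "token request failed: " + string(resp.StatusCode)}}],
          "want_more": false
        }
      :
        {
          "token": bytes(resp.Body).decode_json().access_token,
          "want_more": true
        }
    )
  )
```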
Issue 2: CEL program not formatted according to celfmt standards
Severity: 🔵 Low
Location: packages/greenhouse/data_stream/audit/agent/stream/cel.yml.hbs line 35
Problem: CEL program formatting does not match celfmt standards. While syntactically valid, the formatting differs from canonical format, affecting readability and consistency.
Recommendation:
Run celfmt on the CEL program to apply standard formatting for improved readability and consistency.
Transform
✅ No transform files in this PR.
Summary
| Severity | Count |
|---|---|
| 🔴 Critical | 0 |
| 🟠 High | 1 |
| 🟡 Medium | 3 |
| 🔵 Low | 3 |
Total Actionable Items: 7
```yaml
      ignore_missing: true

  # Timestamp
  - date:
```
🤖 AI Review: 🟡 Date Processor Missing Error Handling — The date processor lacks both a tag field and an on_failure handler. Date parsing can fail with malformed timestamps, and without error handling, the entire document ingestion will fail.
Recommendation:
```yaml
- date:
    field: greenhouse.audit.event_time
    target_field: "@timestamp"
    formats:
      - "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"
      - "yyyy-MM-dd'T'HH:mm:ss.SSSXXX"
      - ISO8601
    tag: parse_timestamp
    on_failure:
      - append:
          field: error.message
          value: 'Failed to parse timestamp from greenhouse.audit.event_time: {{{ _ingest.on_failure_message }}}'
```

🤖 AI-generated comment by Elastic Integration PR Review Bot
```yaml
      if: ctx.event?.category == null

  # User fields from performer
  - convert:
```
🤖 AI Review: 🟡 Convert Processors Missing on_failure Handlers — Multiple convert processors have ignore_missing: true but lack on_failure handlers. Conversions can fail when data is in an unexpected format (e.g., invalid IP address, non-numeric string for ID), causing document ingestion to fail.
Recommendation:
```yaml
- convert:
    field: greenhouse.audit.performer.id
    type: string
    target_field: user.id
    ignore_missing: true
    tag: convert_performer_id
    on_failure:
      - append:
          field: error.message
          value: 'Failed to convert performer.id to string: {{{ _ingest.on_failure_message }}}'
```

Apply similar fixes to:
- Line 121: `greenhouse.audit.performer.ip_address` → `source.ip` (type: ip)
- Line 154: `greenhouse.audit.organization_id` → `organization.id` (type: string)
- Line 161: `greenhouse.audit.event.target_id` (type: string)
🤖 AI-generated comment by Elastic Integration PR Review Bot
```yaml
      allow_duplicates: false
  - append:
      field: error.message
      value: '{{{ _ingest.on_failure_message }}}'
```
🤖 AI Review: 🟡 Global on_failure Handler Missing Processor Tag — The global error handler doesn't include _ingest.on_failure_processor_tag, making it harder to identify which processor failed during troubleshooting.
Recommendation:
```yaml
- append:
    field: error.message
    value: 'Processor {{{ _ingest.on_failure_processor_tag }}} failed: {{{ _ingest.on_failure_message }}}'
```

🤖 AI-generated comment by Elastic Integration PR Review Bot
```yaml
      ignore_missing: true
      if: ctx.event?.original != null
      description: 'The `message` field is no longer required if the document has an `event.original` field.'
  - json:
```
🤖 AI Review: 🟡 JSON Processor Missing Tag — The json processor should have a tag field for error identification and traceability.
Recommendation:
```yaml
- json:
    field: event.original
    target_field: json
    tag: parse_json
```

🤖 AI-generated comment by Elastic Integration PR Review Bot
```yaml
            value: 'Failed to convert performer.ip_address to IP: {{{ _ingest.on_failure_message }}}'

  # GeoIP enrichment
  - geoip:
```
🤖 AI Review: 💡 GeoIP Processors Missing Tags — The geoip processors lack tag fields for better traceability during debugging.
Recommendation:
```yaml
- geoip:
    field: source.ip
    target_field: source.geo
    ignore_missing: true
    tag: geoip_source_geo
- geoip:
    database_file: GeoLite2-ASN.mmdb
    field: source.ip
    target_field: source.as
    properties:
      - asn
      - organization_name
    ignore_missing: true
    tag: geoip_source_asn
```

💡 Suggestions
- Consider adding an `event.created` field to track when Greenhouse created the audit log record (if the API provides this field)
- Consider adding an `event.outcome` field to distinguish successful operations from failed attempts (if the API provides status information)
🤖 AI-generated comment by Elastic Integration PR Review Bot
```yaml
  fields:
    - client_secret
    - token
program: |
```
🤖 AI Review: 🔴 Missing state.with() wrapper - cursor and token state not preserved on errors — The entire CEL program is not wrapped in state.with(), which means cursor and token state will be lost when errors occur. Without state.with(), only explicitly returned fields are preserved. The token acquisition error branch (lines 56-68) doesn't preserve any state, and error branches in the main API request don't preserve cursor state. This can lead to data loss or duplicate ingestion after errors.
Recommendation:
```yaml
program: |
  state.with(
    // Check if token needs refresh (expired or not present)
    (
      !has(state.token) || state.token == "" ||
      !has(state.token_expires) || timestamp(state.token_expires) < now
    ).as(need_token,
      // ... rest of the program
    )
  )
```

🤖 AI-generated comment by Elastic Integration PR Review Bot
Summary
This PR adds a new integration to collect audit logs from Greenhouse Applicant Tracking System (ATS) via the Greenhouse Audit Log API.
Closes #16983
Features
- Cursor-based pagination with `Pit-Id` and `Search-After` headers

Data Stream
- `data_change_create` - Record creation events
- `data_change_update` - Record modification events
- `data_change_destroy` - Record deletion events
- `harvest_access` - API access events
- `action` - User action events

Dashboards
- [Greenhouse] Audit Logs Overview
- [Greenhouse] Data Changes
Testing
Checklist
- `changelog.yml`

Made with Cursor