
Populate contributor profile fields from GitHub user API (#3694) #3695

Merged
MoralCode merged 3 commits into chaoss:main from
antcybersec:fix-contributor-profile-fields
Feb 18, 2026

Conversation

@antcybersec
Contributor

@antcybersec antcybersec commented Feb 14, 2026

  • Fix NameError when email is missing: set canonical_email from email (both can be None) so contributor insert never fails on missing email.
  • Use .get() for all optional profile fields (email, name, company, location, created_at, updated_at, and all gh_* URL fields) so every available value from the user endpoint is stored in the contributors table.
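The two fixes above can be illustrated with a minimal Python sketch. It assumes the GitHub user payload has already been parsed into a dict. The function name `build_contributor_row` and most of the output key names are hypothetical and may not match the actual columns in Augur's contributors table.

```python
from typing import Any, Dict, Optional


def build_contributor_row(contributor: Dict[str, Any]) -> Dict[str, Optional[Any]]:
    """Map a GitHub user payload to a flat row (hypothetical helper).

    Optional fields that are absent from the payload become None.
    """
    # .get() returns None instead of raising KeyError when a key is absent.
    email = contributor.get("email")

    # Bind canonical_email unconditionally so the name always exists,
    # even when the user has no public email. Both values may be None.
    canonical_email = email

    return {
        # Key names below are illustrative, not Augur's real column names.
        "email": email,
        "canonical_email": canonical_email,
        "name": contributor.get("name"),
        "company": contributor.get("company"),
        "location": contributor.get("location"),
        "created_at": contributor.get("created_at"),
        "updated_at": contributor.get("updated_at"),
        "gh_url": contributor.get("url"),
        "gh_html_url": contributor.get("html_url"),
        "gh_node_id": contributor.get("node_id"),
    }


# A payload for a user with no public email and no company set.
row = build_contributor_row(
    {"login": "octocat", "node_id": "MDQ6VXNlcjE=", "email": None}
)
print(row["canonical_email"], row["gh_node_id"])
```

Because every optional field goes through `.get()`, a sparse profile yields a row with `None` values rather than an exception, which the database layer can store as NULL.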

Description

  • Please include a summary of the change.

This PR fixes #3694

Notes for Reviewers

Signed commits

  • Yes, I signed my commits.

Member

@sgoggins sgoggins left a comment

Do you have specific logs without this fix to illustrate it fixes the bug? Have you tested this? Really appreciate you diving in on this one, it's just more efficient to ask this question than to run it myself. 🫠

#session.logger.info(f"Contributor: {cntrb} \n")
batch_insert_contributors(logger, [cntrb])

except Exception as e:
Member

@MoralCode: Is this how we are propagating errors now? I'm asking because I know it's how we did that originally, and I know @ABrain7710 had at one point advised a different strategy. Time for a style guide.

Contributor Author

I kept the existing pattern in this file (log then raise e) and didn’t change the error handling. Happy to follow whatever strategy you and @ABrain7710 prefer for propagating errors and to update this PR if you decide on a style.
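For readers unfamiliar with the pattern under discussion, here is a minimal sketch of "log then raise e". The function `insert_with_logging` and its `insert_fn` parameter are hypothetical stand-ins, not Augur's actual code. In the diff above, `batch_insert_contributors` plays the role that `insert_fn` plays here.

```python
import logging
from typing import Any, Callable, Dict, List

logger = logging.getLogger("contributor_sketch")


def insert_with_logging(
    row: Dict[str, Any],
    insert_fn: Callable[[List[Dict[str, Any]]], None],
) -> None:
    """Call insert_fn with a one-row batch; log and re-raise on failure."""
    try:
        insert_fn([row])
    except Exception as e:
        # Record the failure where it happened, then let the caller decide
        # how to handle it by propagating the same exception object.
        logger.error(f"Failed to insert contributor: {e}")
        raise e


def failing_insert(rows: List[Dict[str, Any]]) -> None:
    # Simulated failure standing in for a database error.
    raise ValueError("db unavailable")


try:
    insert_with_logging({"login": "octocat"}, failing_insert)
except ValueError as err:
    print(f"caller saw: {err}")
```

A bare `raise` inside the `except` block is a common alternative that re-raises the active exception. Which form the project prefers is the open style question in this thread.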

Member

I don't think we've edited this file in a spell, so I'm more questioning if there's a new style we are using that we should implement while we are here. So, that's not at all a critique of what you did. Following what's already in the file is exactly what I would have done. I'm more asking other maintainers so we take the opportunity of this PR to pay down tech debt. This might be how we are doing it now... I've been on Augur for too long to be sure without asking Andrew and Adrian.

Contributor

I don't have a style guide currently available that I know about. I think that would make a good complement to #3662. I don't think we should enforce style guidelines that are undocumented, as that isn't fair to new contributors.

- Fix NameError when email is missing: set canonical_email from email (both
  can be None) so contributor insert never fails on missing email.
- Use .get() for all optional profile fields (email, name, company, location,
  created_at, updated_at, and all gh_* URL fields) so every available value
  from the user endpoint is stored in the contributors table.

Signed-off-by: antcybersec <anant1234466@gmail.com>
…#3694)

Signed-off-by: antcybersec <anant1234466@gmail.com>
@MoralCode
Contributor

@antcybersec are you on Slack?
I'd like to chat about this in more detail over on the CHAOSS Slack in the #wg-augur-8knot channel. I think it's a good change, but I want to make sure everyone is on the same page.

Contributor

@MoralCode MoralCode left a comment

This looks good to me. Remove the two new script files and I'll change this to approved.

@MoralCode MoralCode added the pending changes label (PRs that have review comments that have been added but haven't been addressed in a while) Feb 16, 2026
@MoralCode MoralCode added this to the v0.93.0 milestone Feb 16, 2026
Signed-off-by: antcybersec <anant1234466@gmail.com>
Contributor

@MoralCode MoralCode left a comment

Approved. I haven't personally tested it, but the code looks fine at least.

@MoralCode MoralCode added the testing label (Related to Augur's testing suite) and removed the pending changes label (PRs that have review comments that have been added but haven't been addressed in a while) Feb 17, 2026
@iGufrankhan
Contributor

@MoralCode

I checked out the PR branch locally and tested ingestion on psf/requests.
Contributor resolution completed successfully, contributors without a public email were inserted with canonical_email set to NULL, and I did not observe any NameError or crashes during ingestion.

image

Everything is running well.

Next, I'll check the logs from a run without the fix to verify whether the error occurs there.

@MoralCode MoralCode requested a review from shlokgilda February 18, 2026 19:56
Collaborator

@shlokgilda shlokgilda left a comment

Solid fix. Left a minor inline comment, nothing blocking. Looks good to merge once that's considered.

#"data_source": session.data_source
"gh_url": contributor.get('url'),
"gh_html_url": contributor.get('html_url'),
"gh_node_id": contributor.get('node_id'),
Collaborator

The old code had a comment here noting this field is used for duplicate checking. Might be worth keeping that context since it's not obvious from the field name alone.

Suggested change
- "gh_node_id": contributor.get('node_id'),
+ "gh_node_id": contributor.get('node_id'), # Used for duplicate checking

Contributor

This is documenting a use case in another part of Augur's code, rather than something from GitHub, and is minor enough that I'm inclined to just let it go. If it's being used, I'm sure a Ctrl-F for it in the Augur code will help locate the code using it.

@MoralCode MoralCode merged commit 4025bba into chaoss:main Feb 18, 2026
14 checks passed

Labels

testing Related to Augur's testing suite

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Contributor user information from profile - emails as well as other available info

5 participants
