Add code and review guidelines to CONTRIBUTING.md and AGENTS.md#964
Add code and review guidelines to CONTRIBUTING.md and AGENTS.md#964
Conversation
Document the team's coding standards and review conventions derived from reviewing PR history. Key additions: responsibility isolation principles, polymorphism over isinstance chains, config validation, testing practices, naming conventions, and review comment labeling. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
IDK if this is detail we want to include, but I was gathering this information for my own benefit and found it pretty helpful. We can discuss whether to merge or not. |
CONTRIBUTING.md
Outdated
| - Pass composed objects rather than their parts (e.g. `stepper.loss_scaling` | ||
| instead of two separate arguments). |
There was a problem hiding this comment.
Prefer to keep this example abstract in case the code changes, so this doesn't go out-of-date, or maybe remove the example altogether.
| - **Test helpers over copy-paste.** Create helper functions to build common | ||
| test fixtures. If the same setup appears 3+ times, extract a helper. |
There was a problem hiding this comment.
Also worth noting we want to avoid using pytest fixtures, use helpers instead.
There was a problem hiding this comment.
Agreed that fixtures can quickly become unwieldy, but can also be very useful e.g. for sharing scope across tests.
There was a problem hiding this comment.
Updated (see extension of bullet).
CONTRIBUTING.md
Outdated
| - **Test behavior, not implementation.** If a test re-implements the logic | ||
| it's testing, it isn't actually verifying anything. |
There was a problem hiding this comment.
Also important to have tests that cover important user-story-level details, rather than a lot of tests covering subjective API details, since those make it harder to change those details later and should be used in moderation. Not to say we can't have both, though.
There was a problem hiding this comment.
Yeah, I like tests for lower level operations with tricky logic, but once those are set they kind of feel like dead code compared to the user-story-level details of using the higher-level objects to perform ops.
CONTRIBUTING.md
Outdated
| communication; use "localize" when each rank computes its own local view. | ||
| - Config class names: append `Config` to the name of the thing being built | ||
| (e.g. `TrainStepperConfig` builds `TrainStepper`). | ||
| - Prefer descriptive names over abbreviations (`noise_distribution` not |
There was a problem hiding this comment.
Worth noting descriptions should for the most part only include information that's available in the context of its scope, for example we would avoid calling a function normalize_loss(loss: Tensor) if its internals have nothing to do with loss and could normalize anything - instead it would be normalize(x: Tensor).
CONTRIBUTING.md
Outdated
| about what level of abstraction that change could be handled in. Choose the | ||
| option that splits the concern across the fewest levels of abstraction. When a | ||
| decision changes in the future, as little code as possible should need to be | ||
| touched. |
There was a problem hiding this comment.
Worth noting this does not mean you should be changing as few lines as possible when putting in the feature. You might need to modify APIs at several levels to isolate the feature into one level of abstraction.
AGENTS.md
Outdated
|
|
||
| - `if/raise` instead of `assert` in production code. | ||
| - Context managers for cleanup (timers, distributed contexts). | ||
| - Pass composed objects, not their parts. |
There was a problem hiding this comment.
Maybe add the specification "if multiple attributes would be use within the function". Something to highlight that I shouldn't pass an object just to grab a single value from it.
AGENTS.md
Outdated
| ### Config design | ||
|
|
||
| - Validate in `__post_init__`, not at runtime. | ||
| - No redundant options; remove unused fields. |
There was a problem hiding this comment.
Do we want to have any guidance on backwards compatiblity for configuration parameters of training/eval/inference? It slightly conflicts if we remove top-level configuration fields instead of deprecation warnings. Not that this is a frequent occurence.
CONTRIBUTING.md
Outdated
| - **Test behavior, not implementation.** If a test re-implements the logic | ||
| it's testing, it isn't actually verifying anything. |
There was a problem hiding this comment.
Yeah, I like tests for lower level operations with tricky logic, but once those are set they kind of feel like dead code compared to the user-story-level details of using the higher-level objects to perform ops.
| issue first to discuss it with us**. This helps ensure the feature aligns with | ||
| the project's direction and prevents wasted effort. | ||
|
|
||
| ## Code Guidelines |
There was a problem hiding this comment.
Like @brianhenn mentioned, this feels overly verbose for CONTRIBUTING.md since the wall of text may dissuade humans from reading it! I like the sort of FAQ type high-level stuff under the first abstraction section. I'd vote for keeping that and the testing section since those would be most broadly relevant.
There was a problem hiding this comment.
I think I agree. As a first pass, I'll keep those, but move everything else (including the part about refactoring using facades) to AGENTS.md and add a reference to it from here.
It's hard because I feel like this is text all humans should read >.< but I do agree with you.
| - Backwards compatibility for checkpoint loading is critical; use deprecation | ||
| warnings for config removal. | ||
|
|
||
| ### Testing |
There was a problem hiding this comment.
I think we should indicate a preference for fast-running and parsimonious tests.
There was a problem hiding this comment.
I'm removing this for now in light of @frodre 's article, because I've so far found Claude is very good at writing fast-running tests already.
jpdunc23
left a comment
There was a problem hiding this comment.
I'm curious to know if the changes to AGENTS.md were motivated by specific behavior you observed working with Claude and whether you've see much difference with the new context.
As a general rule, I think we should try to keep AGENTS.md as lightweight as possible to avoid filling up context with things that may not be relevant or may even distract from the task at hand. This includes avoiding duplicated info between AGENTS.md and other docs like README.md and CONTRIBUTING.md which agents are likely to read, and also motivates keeping those docs succinct while maintaining human readability.
Much of the context of how we want code written is embedded in the code base itself, so IMO it isn't necessary to include too many specifics here. However, if agents seem to repeatedly make the same mistake (even in new sessions), then that's probably a good indicator of something specific that would be helpful to add to AGENTS.md.
| - **Test helpers over copy-paste.** Create helper functions to build common | ||
| test fixtures. If the same setup appears 3+ times, extract a helper. |
There was a problem hiding this comment.
Agreed that fixtures can quickly become unwieldy, but can also be very useful e.g. for sharing scope across tests.
| issue first to discuss it with us**. This helps ensure the feature aligns with | ||
| the project's direction and prevents wasted effort. | ||
|
|
||
| ## Code Guidelines |
There was a problem hiding this comment.
For much of the general advice below, rather than commit to developing and maintaining our own set of detailed coding guidelines and design principles, I think it would be better to simply include links to external style guides and guidelines that match our expectations for the quality of contributions from sources that we know aren't going anywhere.
This way the contributing guidelines focus on things that are specific to fme.
There was a problem hiding this comment.
Do you have examples of external style guides to include? These came specifically from looking at our code reviews, rather than an external reference, so I don't have one (except for conventional comments).
I do feel like while some of these are general guidelines, they are ones that are specifically relevant to our code. It's hard to use a lot of CS-focused style guidelines because the nature of our code (our being the end user and developer) means a lot of the rules can be bent or broken. I find it hard in using them to pick out the "right" ones.
For example, isolating responsibilities to one abstraction level is specifically relevant to us because the freedom to make changes to our code in the future is high-value (because our requirements constantly change).
AGENTS.md
Outdated
| ### Review comment conventions | ||
|
|
||
| One guide our team finds helpful is [Conventional Comments](https://conventionalcomments.org/). | ||
| Label every comment with a severity prefix: | ||
| - **Issue**: must fix before merge. | ||
| - **Suggestion (optional)**: non-blocking improvement. | ||
| - **Question**: seeking clarification. | ||
| - **nit**: minor style; does not need re-review. |
There was a problem hiding this comment.
This duplicates similar content added to CONTRIBUTING.md
I've also seen references to a research suggesting that machine generated context files or overly verbose/convoluted context files end up hurting agent performance rather than helping it. I've seen suggestion that targeted instructions / tool specifications usually lead to better outcomes, but also you actually have to evaluate whether it's helping. Iike @jpdunc23 said, it would be useful to know what of the updates were borne out of agent mistakes |
- Remove concrete stepper.loss_scaling example, keep guidance abstract - Add note about preferring helpers over pytest fixtures - Add guidance on user-story-level tests vs API-level tests - Add scope-context naming guidance (names reflect present scope) - Clarify that isolating responsibilities != minimizing lines changed - Clarify "pass composed objects" applies when multiple attributes used - Add preference for fast-running, parsimonious tests Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CONTRIBUTING.md was too verbose per reviewer feedback. Condensed it to keep only the Design and Testing sections (most broadly relevant), linked to Google Python Style Guide for general advice, and pointed readers to AGENTS.md for detailed naming, config, and code org guidance. Moved the following into AGENTS.md: - Naming scope-context example (normalize vs normalize_loss) - Pytest fixtures guidance (prefer helpers) - User-story-level testing preference - Code consolidation and cleanup guidelines Removed review comment conventions from AGENTS.md (duplicated CONTRIBUTING.md content). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The design guidelines are already in CONTRIBUTING.md. Replace with a pointer so agents read them from the single source of truth. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Ready for another iteration if anyone has feedback. |
|
I agree with @brianhenn that there is a lot of nice content here that would be good to preserve somewhere. One possibility is to add a new page to the docs which we can link to from |
|
I'm going to do another pass of the AGENTS.md based on the research @frodre linked. Basically, I'll treat added constraints as further constraints the agents will need to follow, which might make their job harder. I'll focus on adding constraints that actually are things we want to specify (e.g. we want to avoid large numbers of tests of low-level APIs), as well as lines that specify tooling that is available if the agent so chooses to use it (like our regression tools in Another option is to hold off entirely on the updates to AGENTS.md and just merge the CONTRIBUTING.md updates for now, continuing to work on a PR for the AGENTS changes. One thing I do want to add but will hold off on right now is to tell agents to handle larger tasks by 1) planning out the full change, 2) splitting the plan into a series of unit-tested commits of individual changes/features that are easier to review, and 3) implementing those commits one by one. I have been doing this manually in each new session I make, and it's been extremely helpful in having Claude do larger PRs. Without this instruction, Claude has a tendency to tackle the entire change in one mega-commit, which I find I can't review well. Part of me wonders if some things like this should go in CLAUDE.md instead of AGENTS.md, I don't know how Cursor makes use of that file. |
|
|
||
| To use the PR review rules, configure the GitHub MCP server in Cursor's "Tools & MCP" settings. You will need a read-only personal access token with the following permissions: Pull requests, Issues, Contents, Metadata. | ||
| To use the PR review rules, configure the GitHub MCP server in Cursor's "Tools & MCP" settings. | ||
| You will need a read-only personal access token with the following permissions: Pull requests, Issues, Contents, Metadata. |
There was a problem hiding this comment.
Updated the file to follow "one sentence per line" formatting, which it already mostly followed.
Document the team's coding standards and review conventions derived from reviewing PR history.
Key additions: responsibility isolation principles, polymorphism over isinstance chains, config validation, testing practices, naming conventions, and review comment labeling.