Conversation
Files changed: 4 modified, 0 added, 0 deleted.
Co-authored-by: Gregory P. Smith <[email protected]>
Co-authored-by: Donghee Na <[email protected]>
Co-authored-by: devdanzin <[email protected]>
Force-pushed d2de5dc to 726ec3c.
Co-authored-by: Jacob Coffee <[email protected]>
…delines. Add the Guidelines to the contributing table.
savannahostrowski left a comment
Thank you for doing this, @Mariatta!
My comments are mainly about extending the guidance to cover issues as well. While AI tooling can be great at surfacing real bugs and security issues, I think it's still important that those filing issues understand the problem themselves so we can keep discussions focused and productive.
> Considerations for success
> ==========================
>
> Authors must review the work done by AI tooling in detail to ensure it actually makes sense before proposing it as a PR.

Suggested change:
Authors must review the work done by AI tooling in detail to ensure it actually makes sense before proposing it as a PR or filing it as an issue.
> We expect PR authors to be able to explain their proposed changes in their own words.

Suggested change:
We expect PR authors and those filing issues to be able to explain their proposed changes in their own words.
> Disclosure of the use of AI tools in the PR description is appreciated, while not required. Be prepared to explain how
> the tool was used and what changes it made.

Suggested change (join the wrapped lines):
Disclosure of the use of AI tools in the PR description is appreciated, while not required. Be prepared to explain how the tool was used and what changes it made.
Looks like some funky line breaking?
I had it break after 120 characters.
But now that I read the devguide's Rst markup doc, seems like we're supposed to break at 80 characters.
https://devguide.python.org/documentation/markup/#use-of-whitespace
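As an aside, the 80-column reflow that the linked devguide page asks for can be illustrated with just the standard library. This is only a hedged sketch using one of the PR's long sentences; it is not part of the PR or of any CPython tooling:

```python
import textwrap

# One of the guideline sentences discussed above, reflowed to the
# devguide's 80-character limit for reStructuredText sources.
paragraph = (
    "Disclosure of the use of AI tools in the PR description is appreciated, "
    "while not required. Be prepared to explain how the tool was used and "
    "what changes it made."
)

wrapped = textwrap.fill(paragraph, width=80)
print(wrapped)

# Every emitted line fits within 80 columns, and no words were lost.
assert all(len(line) <= 80 for line in wrapped.splitlines())
assert wrapped.split() == paragraph.split()
```

`textwrap.fill` only breaks at whitespace by default, so rejoining the words recovers the original sentence exactly.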
> the responsibility of the contributor. We value good code, concise accurate documentation, and avoiding unneeded code
> churn. Discretion, good judgment, and critical thinking are the foundation of all good contributions, regardless of the
> tools used in their creation.
> Generative AI tools are evolving rapidly, and their work can be helpful. As with using any tool, the resulting
It wasn't done before in this file for some reason, but could we please wrap lines?
I was going to say the opposite :)
The rewrap makes it hard to review what has changed. Please can we keep a minimal diff for now, and only rewrap just before merge?
Just before merge sounds good to me :-)
> Sometimes AI assisted tools make failing unit tests pass by altering or bypassing the tests rather than addressing the
> underlying problem in the code. Such changes do not represent a real fix and are not acceptable.
I'd like to see this worded in more general terms rather than using such a specific example (older models did this a lot more than 2026's). What this is really getting at is that we want people to be cautious about reward hacking rather than addressing the actual underlying problem in a backwards compatible manner.
maybe something along the lines of:
"Some models have had a tendency of reward hacking by making incorrect changes to fix their limited context view of the problem at hand rather than focusing on what is correct. Including altering or bypassing existing tests. Such changes do not represent a real fix and are not acceptable."
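The failure mode being discussed here, a tool making the test suite pass instead of fixing the bug, can be made concrete with a toy sketch. The function and test below are hypothetical illustrations, not code from the PR:

```python
def mean(values):
    # Buggy: raises ZeroDivisionError for an empty sequence.
    return sum(values) / len(values)


def mean_fixed(values):
    # A real fix addresses the behavior itself, without touching the test.
    if not values:
        raise ValueError("mean() requires at least one value")
    return sum(values) / len(values)


def empty_sequence_test_passes(fn):
    # The pre-existing test expectation: an empty sequence raises ValueError.
    try:
        fn([])
    except ValueError:
        return True
    except ZeroDivisionError:
        return False
    return False


# The buggy version fails the test; the real fix passes it. A reward-hacking
# "fix" would instead rewrite empty_sequence_test_passes (e.g. to also accept
# ZeroDivisionError) so the suite goes green while the underlying problem
# remains. Such a change is not a real fix.
```

Here the test encodes an API contract; weakening it silently changes behavior for every caller, which is exactly the backwards-compatibility concern raised below.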
I think this can be generalized beyond AI tools to humans as well.
"Some AI tools may provide responses to a user's prompt that diverge from recommended practices since the AI tool may not have been trained on the full context of the problem and recommended practices. Sometimes, due to limited context, the tool will alter or bypass existing tests. Such changes do not offer a real fix and are not acceptable."
I'd avoid using the word "trained", as that has a specific meaning in the AI field and isn't really the reason here. Focusing on "context" is good, as that's the important and widely known term used in AI.
Just "... the AI tool may not have the full context of the problem and recommended practices. You need to provide it that.".
I've never really liked the "Sometimes, due to limited context, the tool will alter or bypass existing tests." example as it is dated for anyone using the latest models (not everyone is... an entirely different access problem that thus makes general purpose vague docs like this hard). But it felt like we should keep some form of an example undesirable behavior from an insufficiently guided model in here in order to make the more important "Such changes do not offer a real fix and are not acceptable." be tied to a concrete example. So absent clearly better ideas, and knowing some users will be using lesser models, it still fits.
> - Consider whether the change is necessary
> - Make minimal, focused changes
> - Follow existing coding style and patterns
> - Write tests that exercise the change
Should we add another bullet point along the lines of:
" - Keep backwards compatibility with prior releases in mind. Existing tests may be ensuring specific API behaviors are maintained."
perhaps a follow-up paragraph after this list:
"Pay close attention to your AI's testing behavior. Have conversations with your AI model about the appropriateness of changes given these principles before you propose them."
I would rather that we not personify the tools @gpshead. Perhaps:
"Pay close attention to an AI tool's recommendations for testing changes. Provide input about Python's testing principles before requests to the AI tool's model. Always review the AI tool's output before opening a pull request or issue."
I would like text added to emphasize the dangers of AI assistants including work derived from training data, potentially violating the originals' copyrights and/or licensing terms. Core devs don't need that pointed out, but we have contributors of many backgrounds and experience levels. They're responsible for ensuring they have the legal right to grant the PSF permission to re-license their contributions, but explicit is better than implicit. Let's not assume "everyone knows" - everyone doesn't.
Absolutely not. Such words are reactionary, made-up, non-specific dangers with nothing concrete to back them up. Thus they have no place in the Python devguide or policies, because they are not actionable. Contributor guidelines, the CLA, and license terms have already long covered this from a policy point of view.
There are many examples from researchers of AI assistants duplicating training data verbatim, without attribution, blatantly violating copyright. How much more specific could it be? Newer users in particular are easily bamboozled by this, unaware of the issues, and seduced by the supremely confident tone AI assistants adopt. I'm concerned about them and the project. The CLA doesn't even explicitly ask contributors to attest they have a legal right to license their contributions - that's all hiding behind the single word of legalese, "valid". We haven't "long covered" this, because the intensified dangers of AI-produced code are a new development.

A few years back, a new contributor opened a PR with code copied verbatim from glibc. How did we catch it? Dead easy: a comment in the code plainly said what followed was copied from glibc. They simply didn't know any better at the time. BTW, they went on to become a core dev. And they knew they were copying. How much more likely is someone to unwittingly contribute work that was copied by their AI assistant? How would they know? How would we?

I'm not claiming we can "fix this". We can't. But we can - and IMO should - alert contributors that the risks of contributing derivative works are surely intensified by the use of AI assistants. Not to dissuade them, but to help inform their decisions. Not a change in policy, but pro-active education. You may as well argue that all cautions about AI-produced code are redundant. For example, why encourage people to "Keep backwards compatibility with prior releases in mind"? That's always been policy too.
BTW, Copilot assures me that provenance issues are the greatest danger projects face from use of AI tools. Being silent about that seems quite ill-advised. But it also tells me that few cases of AI-enabled copyright/licensing violations get any publicity. Organizations want to keep them quiet, and contributors who unwittingly submit tainted code are hardly likely to publicize it either. The chardet case is wildly atypical in every respect.
Tim, I realize my response came off harsh. Sorry! I do care about this, I just want to keep this doc focused on actionable guidance for CPython contributions rather than general AI education, which it'll always be behind on.

The reason I proposed a backwards-compat reminder but am pushing back on this one: backwards compatibility is something the core team actively evaluates on most every PR. Contributors often get it wrong, and it's a concrete thing they can guide their model to keep in mind. It's a problem we actually see, so nudging AI-using contributors to be proactive about it could have a clear payoff.

A provenance warning doesn't have the same shape. We don't have a pattern of AI-laundered copyrighted code showing up in CPython PRs, and even if a contributor reads the warning and takes it seriously, what are they supposed to do? There's no reasonable verification step we can ask of them, and none we can perform either. A caution with no corresponding action just creates unease, and I don't think that earns space in these guidelines.

The licensing obligation itself is real and already lives in the CLA. I'm also not well placed to debate licensing specifics in a public thread, so I'll leave that side of it alone. If the concern is that newer contributors don't understand what they're agreeing to in the CLA, that's worth raising with the PSF as a CLA question rather than something we patch with an AI-specific note here.

My backwards-compat suggestion doesn't have to go in either, FWIW; that's Mariatta's and the docs reviewers' call. But I'd be sad to see a provenance warning land, as I think it'd detract from what's otherwise shaping up to be a refreshingly practical AI guidelines doc. For this PR I suggest we proceed with the other reviews and table the provenance discussion. It isn't something this PR can resolve.
Ya, I don't have easy answers here. It's just weird to me that our "AI policy" would take pains to point out potential problems in the context of AI that actually apply to all contributions, regardless of source. Whether it's the need for good tests, minimal disruption, or backward compatibility. Yet omit the one area (provenance) in which AI assistants are known to intensify risk, and the area in which multiple lawsuits are currently active. Not yet involving the PSF, but the outrageous chardet case is adjacent to the Python ecosystem. Just a matter of time.

In the absence of will to address these issues directly, as best we can (yes, contributors are responsible, but no, we have no concrete suggestions for how they can be sure they have the legal right to license their contributions - which has always been true, but "explicit is better than implicit"), I think it better to say less rather than more. Point to what we expect of all contributions, adding no more here than that those criteria also apply to work of AI origin, in which cases the risks may be especially high.

While it's likely too on-target to fit with the devguide's style 😉, I like what Copilot suggested to me: "think of an AI assistant as a junior colleague with a photographic memory but no common sense" 😄
IMO, the devguide should, in general, provide commentary/implications/warnings/rules of thumb for more succinctly/formally/technically worded documents. This is the place. |
willingc left a comment
Thank you @Mariatta for these updates. I have one strong ask for any documentation about generative AI, LLMs, or other AI tools: I want to avoid personifying the tools or equating their ability and judgement with a human's. Like all of our development tools, these tools, while capable in some situations, still require guidance and verification by humans.
> Generative AI tools are evolving rapidly, and their work can be helpful. As with using any tool, the resulting

Suggested change:
Generative AI tools can produce output quickly. As with using any tool, the resulting
I don't think we should personify the tools any longer. I also want to be more objective about the results.
Perhaps just simplify to:
"AI tools can produce results quickly. As with using any ..."
FWIW, "their work" was not intended as personification; it's an extremely common phrase for how we refer to any machines, or really any objects serving any purpose, in English, good, bad, or indifferent.
We should remove the word "Generative" to simplify this further. "produce output" reads as rather a diss on AI models and agentic harnesses that belittles their (non-personified) capabilities, and would make for a somewhat dated-feeling policy. "results" is more what people are looking for and encompasses all sorts of actions taken.
I don't like saying "produce output" for the same reason that "generative" doesn't quite have the right ring to it. Internally, what an agentic AI does is technically "output", in the sense of output tokens turning into tool calls that iteratively converge on the goals we asked for. But AI users see the end result rather than how it happened inside.
"produce output quickly" is pretty different from the original intent of this sentence, which to me is talking about the rate of change of tools themselves
The speed of generation is not really relevant in my mind; what matters is the acknowledgement that our guidelines may not contain everything that is most helpful to the reader in whatever latest state of the world they find themselves in.
> and avoiding unneeded code churn. Discretion, good judgment, and critical thinking are the foundation of all good

Suggested change:
and well scoped PRs without unneeded code churn. Discretion, good judgment, and critical thinking are the foundation of all good
@tim-one @gpshead You make excellent points. There is much gray area and uncharted waters around privacy, security, and copyright. I would like this section to state strongly that LLMs/AI tools are software development tools, not humans. An individual human is responsible for PR and issue submission, as well as for the quality and security of any code submitted. While AI tools may be used to assist the individual, the individual is still responsible for their submissions.