Formulate a framework for responsible LLM usage when coding, plus practical recommendations#21

Open
spwoodcock wants to merge 23 commits into main from docs/llm-usage-guide

Conversation

@spwoodcock
Member

@spwoodcock spwoodcock commented Feb 17, 2026

Fixes #18

The problem

  • At HOT, as with most other open-source communities, we need to provide some guidance around the usage of LLMs and agentic coding.
  • This discussion is open to anyone from the public to contribute to, if they have points to make that we may have missed (or have suggestions for better remediation strategies 🙏)

This PR

  • After internal deliberation, external consultation, and summarisation of the resulting themes, this PR hopes to publish an acceptable set of guidelines to navigate the challenge.
  • Suggested reading order:
    1. using-llms-responsibily.md: a description of the ongoing problem and its various framings.
    2. ai-assisted-coding-guide.md: practical guide for how to approach agentic coding.
    3. managing-ai-contributions.md: practical guide for maintainers accepting AI-assisted code contributions.

Please review, comment, and contradict whatever is written here as needed.
This is supposed to be a collaborative learning exercise to work on these difficult challenges together.

Disclaimer: Yes, the documents were initially synthesised from the linked references using Claude Opus 4.6 - text summarisation is where LLMs shine, after all. The content has since been reviewed and edited by me to give us a starting point, and I added a few additional perspectives from our ongoing calls with partner organisations.

Notes

Based on documents:
https://docs.google.com/document/d/1F9C1aaE2CW9JmEmJlOCkuc9Lr_M-YybXXVyHlqDe_GY
https://docs.google.com/document/d/1M85SirgyyQrS33r4l4ta6JDWJg9OgO0BIVKBSoogVZE
https://docs.google.com/document/d/1uMT9EMd50NUCwRj5CTJg2ShQRWbxcI-U3g5oFFS6oMA

@spwoodcock
Member Author

cc @AbdelrahmanKatkat @Claurt07

Member

@dakotabenjamin dakotabenjamin left a comment


Thorough and well-researched. Many problems are identified, but the mitigations look weak in some places, and may fail to address the problem. I've noted a couple in my in-line comments.

We could also add some processes around the following to support some of the guidelines:

  • measuring the impact of AI usage on maintainers (time spent vs. contributions?)
  • for maintainers, checklist or clear factors for when to reject contributions

@spwoodcock
Member Author

Thanks for taking the time to review @dakotabenjamin!

I really appreciate your input as someone who cares about this a lot, and you make some very valid points - I'll try to address them all 😄


Page content to come.

We plan to research and recommend the best open models to use, as alternatives to proprietary services.
Member


This could also be stated generically in the document above - e.g. "Try to prefer open-source models over proprietary ones."

Member Author


It would be nice to provide some guidance though! It's easy to provide guidelines, but then someone gets stuck on what actual tools / models to use. We can probably provide some decent options after a little research

Contributor


@spwoodcock, my understanding is that this would be a recommendation only? I think that, just as our internal team is experimenting, other contributors might be as well.

Member Author


Yeah for sure, simply a recommendation. Users are free to use whatever models they like.

If one day there is an org / model whose methods we really disagree with, we could possibly maintain a 'banned models' list.

Member Author

@spwoodcock spwoodcock left a comment


I made a few fixes / updates 👍

@kshitijrajsharma
Member

kshitijrajsharma commented Feb 17, 2026

@spwoodcock Do you think we should also mention code licensing? If AI is being used to write code from some other repo or elsewhere, contributors should try to check for a compatible license. I know licensing with AI agents is a murky topic right now - it's very difficult to verify - but I feel we should at least make authors aware!

And also about code documentation:

Maybe we can add something like this:

Try to document why changes were made rather than what every line does (which should be self-explanatory - AI tends to do the latter).

@spwoodcock
Member Author

@spwoodcock Do you think we should also mention code licensing? If AI is being used to write code from some other repo or elsewhere, contributors should try to check for a compatible license. I know licensing with AI agents is a murky topic right now - it's very difficult to verify - but I feel we should at least make authors aware!

And also about code documentation:

Maybe we can add something like this:

Try to document why changes were made rather than what every line does (which should be self-explanatory - AI tends to do the latter).

Thanks for all the comments @kshitijrajsharma! I addressed your points and updated with suggestions 😃

For the point about licensing, there is a comment about this in using-llms-responsibility.md which says:

"The LLVM Project's AI policy states it clearly: using AI tools to regenerate copyrighted material does not remove the copyright, and contributors remain responsible for ensuring nothing infringing enters their work [4]. The risk includes inadvertently incorporating copyrighted code or text into publicly released outputs."

Do you think that is enough to cover it, or we should be more explicit somewhere, perhaps in ai-assisted-coding-guide.md?

@spwoodcock
Member Author

spwoodcock commented Feb 18, 2026

Thanks @mjvanderveen! (your input really helped to craft this)

Things remaining to do:

  • Guidance on methods available for AI-assisted coding, e.g. pair programming vs agentic mode.
  • Rewording of the importance of using AI. It's not mandatory; we just need to have a framework in place.

Now the question of the hour: how do we identify LLM generated code?

I made a start on this here, but would love some input from anyone on the heuristics they have encountered to help do this, and potentially any tools they have tested to automate it 🙏

Member

@dakotabenjamin dakotabenjamin left a comment


Would like to see more input from members of the team before giving explicit approval, but to me this is a great addition to the documentation. Once merged I'll also add it to HOT AI policy as reference.

@LeenDhondt
Contributor

LeenDhondt commented Feb 23, 2026

I think you did great research, Sam, and provided a clear framework. You also incorporated most feedback, both from internal users and from our discussions with partners.

I do suggest we first run it by our full tech team meeting before merging.

@spwoodcock
Member Author

Adding an idea I read here that I really like:

Good use for AI / LLMs:

  • Splitting PRs into smaller chunks: Sometimes PRs (particularly those produced by LLMs) can be far too large to properly review. AI can suggest a logical division of code into separate PRs / commits, allowing for easier human review.

@spwoodcock
Member Author

spwoodcock commented Feb 24, 2026

Remaining things to update:

  • Guidance on how to measure the impact of AI usage on maintainers (time spent vs. contributions?)
  • Checklist for maintainers or clear factors for when to reject contributions
  • Guidance on which AI tools to use (related to the 'open models' page that was added).
  • Better remediation strategies for environmental impacts
  • Assess whether the legal position for usage of LLMs is untenable and shuts down this whole exercise (or at least part of it - we still need guidance for how to handle external AI-assisted contributions either way).

Once complete, we will discuss in our next team meeting, then merge once happy with it 👍

@smathermather

"The LLVM Project's AI policy states it clearly: using AI tools to regenerate copyrighted material does not remove the copyright, and contributors remain responsible for ensuring nothing infringing enters their work [4]. The risk includes inadvertently incorporating copyrighted code or text into publicly released outputs."

Do you think that is enough to cover it, or we should be more explicit somewhere, perhaps in ai-assisted-coding-guide.md?

As I understand it, there are two axes of copyright risk with AI contributions. One is the regeneration of copyrighted material and the other is the copyright-ability of the AI portions of the contribution. I link to the US legal side of this concern, though as I understand it, the EU side is similar.

In short, disclosure is critical for the author asserting the copyright on contributions that include AI written bits.

@spwoodcock
Member Author

spwoodcock commented Feb 24, 2026

As I understand it, there are two axes of copyright risk with AI contributions. One is the regeneration of copyrighted material and the other is the copyright-ability of the AI portions of the contribution. I link to the US legal side of this concern, though as I understand it, the EU side is similar.

In short, disclosure is critical for the author asserting the copyright on contributions that include AI written bits.

Thats a really valuable document, thanks for sharing @smathermather ❤️

It's also one of the toughest points to assess, as there are no real remediation strategies possible.

Either we (1) get swept along with the crowd in hiding from it, (2) view it as an acceptable risk and an ethical negative, weighed up against the ethical positives of the work we do, or (3) refuse to engage due to the concerns.

Tough call that we need to discuss more and work out a way forward for!

Open to any input or suggestions from people!


Also, it's been noted that the remediation strategies on the environmental front are a bit weak, and I agree.

Orgs could possibly do some back of the envelope calcs for how much usage there might be from our own team, approximate the kWh usage, then donate this amount to effective charities and orgs in this space?

This is obviously not an acceptable strategy for the whole world to engage in, but let's be real: most orgs don't care and aren't going to saturate the funding and effectiveness of these charities.

I would promote https://www.effectiveenvironmentalism.org/climate-charities

Again, far from perfect, but it would go some way to acknowledging the problem and attempting to solve it in a roundabout way.

Note

I'm commenting entirely on my own behalf, and don't represent the views of HOT. I haven't sought approval to see if remediative donations are an option.


One excellent, well-tested PR is worth more than ten AI-generated patches that each require maintainer effort to evaluate. Quality over quantity. Always.

### Prefer Existing Libraries
Member Author


This section is mentioned above too. It could possibly be removed, or perhaps it's worth reiterating an important point.

- [ ] Error handling does not leak sensitive information
- [ ] No unnecessary permissions or access scopes
- [ ] SQL queries are parameterised (no string concatenation)
- [ ] File paths are sanitised against traversal attacks
Member Author


Probably worth clarifying / linking somewhere for how to best do this
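Two of the checklist items above (parameterised SQL and path-traversal sanitisation) can be illustrated with a minimal Python sketch. The function names, table schema, and paths here are hypothetical, not from any HOT codebase:

```python
# Illustrative sketches of two checklist items: parameterised SQL
# queries and path-traversal sanitisation. All names are hypothetical.
import sqlite3
from pathlib import Path

def find_user(conn: sqlite3.Connection, username: str):
    # Parameterised query: the driver escapes `username`, so input like
    # "x' OR '1'='1" cannot alter the structure of the SQL statement.
    cur = conn.execute("SELECT id FROM users WHERE name = ?", (username,))
    return cur.fetchone()

def safe_join(base_dir: str, user_path: str) -> Path:
    # Resolve the combined path and verify it stays inside base_dir,
    # rejecting traversal attempts such as "../../etc/passwd".
    base = Path(base_dir).resolve()
    target = (base / user_path).resolve()
    if not target.is_relative_to(base):  # Python 3.9+
        raise ValueError(f"path escapes {base_dir!r}: {user_path!r}")
    return target
```

The key property in both cases is the same: untrusted input is treated as data, never as structure (SQL text or a filesystem location outside the allowed root).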


This file is read by AI coding agents (Copilot, Claude Code, Cursor, etc.) when they work on your codebase. It tells the AI what your standards are, what's off-limits, and how to behave. Think of it as onboarding instructions - but for machines.

**What to include:**
Member Author


Mention:

  • MADR, link to section below
  • Tech decisions or paths already explored and discounted - do not attempt these approaches
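As a purely hypothetical sketch (every file name, path, and rule below is illustrative, not actual HOT policy), such a machine-onboarding file might look like:

```markdown
# AI Agent Instructions (illustrative example)

## Standards
- Follow the existing code style; run the linter before proposing changes.
- Document *why* a change is made, not what each line does.

## Off-limits
- Do not modify architectural decision records (MADR) under `docs/decisions/`.
- Do not re-attempt approaches already explored and discounted in the ADRs.

## Behaviour
- Prefer small, reviewable changes over large refactors.
- Flag any code adapted from external sources so licensing can be checked.
```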

AI tools must not be used to fix issues labelled `good first issue`.
These exist for human learning.

For full policy details, see: https://docs.hotosm.org/ai-assisted-coding
Member Author


Change to a relative link

- **Code quality**: SonarQube Cloud is free for open-source projects, assisting with code quality and security compliance.
- **Dependency checking**: OWASP [DependencyCheck](https://github.com/dependency-check/DependencyCheck) or [OSV Scanner](https://github.com/google/osv-scanner) can be used to ensure dependencies are updated to avoid the latest security vulnerabilities. It's also recommended to use [Renovate bot](https://github.com/renovatebot/renovate) to regularly update dependencies.
- **Secrets scanning**: [GitLeaks](https://github.com/gitleaks/gitleaks) can be integrated as a pre-commit hook or CI action to prevent accidental commits of org secrets.
- **Licensing and copyright**: [ScanCode Toolkit](https://github.com/aboutcode-org/scancode-toolkit) can be used to scan for copyright breaches in your code and non-compliance with license requirements.
Member Author


Let's test this one out & see how it performs!
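Incidentally, for the secrets-scanning item in the excerpt above: GitLeaks ships a pre-commit hook, so a minimal `.pre-commit-config.yaml` could look like this (the `rev` pin is illustrative - check the project for the current release):

```yaml
repos:
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.0   # illustrative pin; use the latest tagged release
    hooks:
      - id: gitleaks
```

Run `pre-commit install` once per clone so the hook fires on every commit.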

**Key points for reviewers:**

- If a PR is marked AI-assisted, ask "why this approach?" - the answer tells you if the contributor understands the code.
- Watch for: verbose AI-style PR descriptions, generic variable names, unnecessary complexity, dependencies that seem unrelated.
Member Author


Simply link to the section above instead of listing out the same signs of AI contribution


## Introduction

AI coding tools have moved from novelty to daily workflow in under two years. Andrej Karpathy coined the term "vibe coding" in early 2025 - describing developers who prompt AI, accept all suggestions, and barely read the output. By early 2026, he had already moved on, calling the practice outdated and advocating instead for "agentic engineering": careful, supervised AI-assisted development with full human oversight [1]. While early-2025 AI models were shown in some cases to have a net negative impact on developer productivity [21], models have improved significantly by early 2026, alongside growing efforts within open-source communities to establish appropriate governance and usage policies.
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be too soon to have an authoritative source or paper on this, but once there is, it should be added!

Evidence is primarily anecdotal for now, speaking with devs in different orgs, observing the need for communities to catch up to the pace of model development & implement policies.

Sure there is some hype as well, but where there is smoke, there is generally fire too (even if it's just smouldering embers for now...).

Watch this space for some actual hard stats


### 1.4 Labour and Exploitation

The refinement of AI models often relies on low-paid human labour for data labelling and content moderation, frequently in low- and middle-income economies. The training data itself was often collected without consent from its creators. Using these tools means participating in a supply chain with unresolved ethical questions about consent, compensation, and intellectual property [20].
Member Author


Would be appreciated if someone has the time to research this one a little deeper. It's hard not to be implicated here, so we need to ensure the risks aren't too great.

Despite that, it's not the highest concern on the list for me personally. There are so many industries and practices globally that have a terrible human rights record, and I would argue that long hours curating training data is low on the list of moral injustices out there (we need to put this in perspective of the potential good derived from the tools we work on). But again, this hunch needs to be proven by hard data before it can be substantiated fully.

**Mitigation approaches:**

- Produce open-source software that partners can adopt freely.
- Advocate for and invest in open-source models that can run locally.
Member Author


Related to the open models guidance page to be completed.

But we should also provide guidance on how to set up and use these tools in an easy way.

If there are obvious usability gaps (ideally identified through discussion with less tech-literate community members - those dabbling with code solutions who didn't previously work as software devs), we should definitely try to fill them!

I considered a wrapper of sorts to simply run Ollama. But honestly Ollama is pretty simple as it is, as attested to by @emi420. As mentioned, we should seek to identify pain points, and help in the best way we can


AI tools are demonstrably helpful when assisting someone who already understands the codebase and the broader technical landscape, but they are far less reliable as a substitute for that understanding.

**Guidance on appropriate AI use:**
Member Author


Remove this and perhaps defer to the section in doc 2 that has more detail.

@smathermather

Some of this starts to get addressed above, but I sent this via a side channel and Sam suggested that posting it in the issue is fine:

This is an opus. I appreciate both the thoroughness regarding the problem space, specific challenges, and possible remediation(s) but also the state of the art for responses across projects that have addressed LLM contributions explicitly in their covenants.

Overall, the biggest challenge I see is the remediation question: specifically the challenge of copyright (legal challenge); for labor violations that underpin or are related to those copyright challenges (ethical challenge); for jurisdictional challenges associated with concepts of fair use (possible legal challenges outside US legal frameworks); the existence of untainted, truly open models with known corpus (legal, ethical, and digital sovereignty challenge); and the lack of any clear accounting / signal for decision making on the above with regard to environmental impacts.

These documents serve as a great framing and direction for use of transformer models that are built with consent, documentation of corpus, known licensing and labor practices, as well as resource use. IMO, any substantive ethical use of LLMs in dev work requires a list and possibly the development of such models, and constraint to allowed models with known provenance.

@spwoodcock
Member Author

spwoodcock commented Feb 25, 2026

Getting there bit by bit!

As promised, I did some 'back-of-the-envelope' calcs to determine a reasonable emissions offset donation for our LLM usage. It uses many assumptions and fudge factors, but overall suggests that energy usage at best could be reasonably low, and at worst (as with large models, regular heavy refactoring) is still manageable for now, although definitely not sustainable into the future:

https://github.com/hotosm/docs/blob/docs/llm-usage-guide/docs/ai-guide/using-lmms-responsibly.md#appendix-a-methodology-for-estimating-llm-energy--co-emissions-and-donation-proxy

The overall summary is that I think HOT should donate ~$300 to effective climate policy and advocacy charities (again, this is my opinion and I have not run it by anyone else yet...).
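This is not the actual methodology from the linked appendix, but the shape of such a back-of-the-envelope estimate can be sketched in a few lines. Every constant below is an assumed fudge factor for illustration only, not a measured figure:

```python
# Minimal sketch of a back-of-the-envelope LLM emissions estimate and
# donation proxy. Every constant is an illustrative assumption, NOT a
# measured value and NOT the figures from the linked appendix.

WH_PER_REQUEST = 10.0            # assumed energy per agentic coding request (Wh)
REQUESTS_PER_DEV_PER_DAY = 100   # assumed heavy-usage rate
DEVS = 20                        # assumed team size
WORKDAYS_PER_YEAR = 220
KG_CO2_PER_KWH = 0.4             # assumed grid carbon intensity
USD_PER_TONNE_CO2 = 100.0        # assumed donation proxy rate per tonne

kwh_per_year = (WH_PER_REQUEST * REQUESTS_PER_DEV_PER_DAY
                * DEVS * WORKDAYS_PER_YEAR) / 1000
tonnes_co2 = kwh_per_year * KG_CO2_PER_KWH / 1000
donation_usd = tonnes_co2 * USD_PER_TONNE_CO2

print(f"~{kwh_per_year:.0f} kWh/yr ≈ {tonnes_co2:.2f} t CO2 "
      f"→ suggested donation ${donation_usd:.0f}")
```

The point of writing it down like this is that each fudge factor is explicit and can be challenged or replaced independently as better data arrives.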


The last and most difficult thing to address is the legal / ethical concerns raised by @smathermather about copyright infringement and lack of truly 'open training data' models.

We need to decide if:

  1. The potential infringement is an acceptable risk, provided we take a firm stance on "You must disclose AI usage".
  2. We can't accept the legal risk or ethical implications, and decide to ban LLM usage outright.

I'll leave this question in the open for a bit, probably until next week, to allow anyone to comment - would really love some additional feedback here, as I'm a single fallible human being that has many blind spots and is certainly prone to errors in judgement 😄


Comments from @LeenDhondt below:

@spwoodcock , option 2 is for me not an option. We need to learn how to navigate this new disruption in our space, not avoid it
@spwoodcock, on donating offset emissions, this is something we need to bring up at Org level (as same accounts for other areas of work at HOT)

@spwoodcock
Member Author

Thanks @LeenDhondt - agree with need to navigate & not avoid 👍

To put the energy issue in context, I added this:
#21 (comment)

The energy usage for our team is a drop in the ocean compared to travel emissions. I still think donations would be great, considering the distributed team and need to meet, but not as initially thought for LLM usage.

@spwoodcock
Member Author

Also, to comment on the copyright issue.

To summarise, US courts + Copyright Office take the position that purely AI-generated outputs are not copyrightable, unless they have an element of human authorship (arrangement, substantive edits, creative transformation, etc).

For software we care about
(1) whether training on copyrighted code is lawful, and
(2) whether model outputs reproduce copyrighted code.

In Cory Doctorow's blog post (from the FSF), he argues that model training is not infringement. Training involves copying for analysis and extracting statistical patterns, and copyright law has historically permitted analysis of copyrighted works. Expanding copyright to prohibit training would benefit large rights-holders more than individual creators. Cory is mostly discussing artistic work here, but the argument applies equally to software when copyrighted code was used for training.

The first question remains legally unsettled, so this framework can't really comment. For the second point, we can meaningfully address it through the suggested mitigation methods: keep a human in the loop, require contributor disclosure, keep PRs small, and try our best to detect AI-generated content. All AI-generated content should be treated as untrusted and not committed blindly.


I would recommend reading the linked blog - it has some nice points. They are also articulated in this video shared by Shoaib (for those who prefer that format):

  • AI bubble: what is fueling it, how it will burst, and the reality of what we are left with: some useful tools for summarisation and coding assistance (akin to a coding plugin, rather than a replacement for software engineers).
  • The reverse-centaur model, which replaces human work while leaving humans responsible for oversight and liability, should be avoided at all costs. If a manager suggests something stupid to a dev, they get told as much. If a manager asks an AI tool, the crappy idea will just get implemented without thought about long-term goals, adjacent software, etc. Without human judgement and review, we get fragile systems and accumulated tech debt.

@emi420
Collaborator

emi420 commented Mar 2, 2026

Sharing what I’m experimenting on, in case you want to also experiment yourself.

Taking into account the recent events of mega-giant AI companies discussing whether their products will be used for autonomous killing systems and mass surveillance, and also in line with reducing dependency on paid and closed-source software, I've started to do more experiments with open and local models and tools.

Currently I’m testing OpenCode (which is similar to Claude Code) connected to a local Ollama instance serving the new qwen3.5 model.

I’m also testing a new strategy, maybe some of you are already doing this. Instead of “chatting” with the model directly, I do the following:

  • Create a document with a precise description of the issue I want to solve and the idea of the possible solution (ex: 001-frontend-config.md)
  • Write a prompt requesting the writing of a new document with a plan of actions to solve the issue (ex: 001-frontend-config-PLAN.md)
  • Read the generated plan, make/request adjustments if necessary or ask/write an alternative plan (ex: 001-frontend-config-PLAN-B.md)
  • Write a prompt requesting the execution of the plan (ie: code changes)
  • Review and test the code

The results look very promising with Qwen3.5. In theory this model offers better performance than Sonnet 4.5 (from Anthropic), but it's open (open weights) and runs locally.

Also, while following this methodology takes some time, and in some cases it's easier to just write the code, it provides more control and better quality. I'll share this in a doc when I have the time.

Note: running qwen3.5:27b on my system (chip: Apple M2 Max, memory: 32 GB) feels quite slow but it works; qwen3:8b works really fast. I have to test the new versions of qwen3.5 that are available in Ollama starting today.

@smathermather

In Cory Doctorow's blog post (from the FSF), he argues that model training is not infringement. Training involves copying for analysis and extracting statistical patterns, and copyright law has historically permitted analysis of copyrighted works. Expanding copyright to prohibit training would benefit large rights-holders more than individual creators. Cory is mostly discussing artistic work here, but the argument applies equally to software when copyrighted code was used for training.

I actually know very little about how this analysis applies outside the US, but it is important to note that it applies specifically to US copyright law. Doctorow is partially right: under US law, training is (likely) not infringement. But fair use doesn't apply outside the US. I would be interested if anyone has an inkling of how e.g. EU infringement questions are likely to play out.

  • The reverse-centaur model, which replaces human work while leaving humans responsible for oversight and liability, should be avoided at all costs. If a manager suggests something stupid to a dev, they get told as much. If a manager asks an AI tool, the crappy idea will just get implemented without thought about long-term goals, adjacent software, etc. Without human judgement and review, we get fragile systems and accumulated tech debt.

Yes, I almost highlighted the reverse centaur in reference to this portion of the docs, though I think this accountability is important in the context of contributors, since with a FOSS project it's the only safe default.

But for an org / corporate body, reverse centaur / responsibility laundering of LLMs is a concern, especially if / when LLM use is required or expected.

I’ve started to do more experiments with open and local models and tools.

Very interesting. I'm looking forward to the open models list being populated.


Development

Successfully merging this pull request may close these issues.

Add guidelines on the use of AI tools for coding

7 participants