@mihow
Collaborator

@mihow mihow commented Jan 31, 2026

Summary

  • Adds TaxaListQuerySet.get_or_create_for_project() method to scope TaxaList lookups by project
  • Updates all callers (pipeline.py, import_taxa, update_taxa) to use the new method
  • Handles existing duplicates gracefully by returning the oldest matching list

Problem

The TaxaList.name field has no unique constraint, but several call sites used get_or_create(name=...), which raises MultipleObjectsReturned as soon as more than one row in the database shares the same name.
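
The failure mode can be reproduced without a database. The sketch below is an in-memory stand-in, not Django itself: the exception class and the get_or_create helper are hypothetical and only mirror the documented contract, where the underlying get() raises when more than one row matches.

```python
class MultipleObjectsReturned(Exception):
    """Stand-in for Django's exception of the same name (illustration only)."""


def get_or_create(rows, name):
    """Mimic a name-only get_or_create on a plain list of dicts.

    Hypothetical helper: returns (row, created) and raises when the
    name is ambiguous, which is what happens once duplicates exist.
    """
    matches = [row for row in rows if row["name"] == name]
    if len(matches) > 1:
        # A lookup by name alone cannot pick between duplicates.
        raise MultipleObjectsReturned(f"{len(matches)} rows named {name!r}")
    if matches:
        return matches[0], False
    row = {"name": name}
    rows.append(row)
    return row, True
```

With a single row per name the helper behaves as expected. Once two rows share a name, every later call for that name raises instead of returning one of them.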

Solution

The new get_or_create_for_project(name, project=None) method:

  • For project=None (global lists): finds lists with no project associations
  • For project=X: finds lists associated with that specific project
  • Returns the oldest matching list if duplicates exist (prevents crashes)
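
The lookup rules above can be illustrated in plain Python. This is a minimal in-memory sketch, not the PR's implementation: TaxaListStore, project_ids, and the use of the lowest pk as "oldest" are assumptions made for illustration. The actual method lives on TaxaListQuerySet and queries the database.

```python
from dataclasses import dataclass, field
from itertools import count

_next_pk = count(1)  # assumption: a lower pk means an older row


@dataclass
class TaxaList:
    """Illustrative stand-in for the model, holding only what the lookup needs."""
    name: str
    project_ids: set = field(default_factory=set)
    pk: int = field(default_factory=lambda: next(_next_pk))


class TaxaListStore:
    """Hypothetical in-memory replacement for the queryset."""

    def __init__(self):
        self.rows = []

    def get_or_create_for_project(self, name, project=None):
        if project is None:
            # Global scope: only lists with no project associations.
            matches = [r for r in self.rows
                       if r.name == name and not r.project_ids]
        else:
            # Project scope: only lists associated with this project.
            matches = [r for r in self.rows
                       if r.name == name and project in r.project_ids]
        if matches:
            # Duplicates are tolerated: return the oldest instead of raising.
            return min(matches, key=lambda r: r.pk), False
        new_list = TaxaList(name=name,
                            project_ids=set() if project is None else {project})
        self.rows.append(new_list)
        return new_list, True
```

Like get_or_create, the method returns a (list, created) tuple, so callers that unpack the result do not need to change shape. A global list and a project-scoped list with the same name are treated as different lists.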

Follow-up needed

  • Data migration to merge existing duplicates on the live database (will be added after reviewing the duplicates)
  • Add --project parameter support to management commands

Test plan

  • Verify tests pass
  • Test locally that import_taxa/update_taxa commands work correctly
  • Deploy to staging and verify no more MultipleObjectsReturned errors

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Refactor
    • Enhanced taxa list management system to support both global and project-specific scoping, improving the flexibility of taxonomic data organization across the application.

…ate names

The TaxaList model allows multiple lists with the same name, but several
places in the codebase use get_or_create(name=...) which fails with
MultipleObjectsReturned when duplicates exist.

This adds a new get_or_create_for_project() method that:
- Scopes lookups to a specific project (or global lists if project=None)
- Handles existing duplicates gracefully by returning the oldest one
- Updates all callers (pipeline.py, import_taxa, update_taxa) to use it

Also adds TODO comments to management commands about adding --project
parameter support in the future.

Co-Authored-By: Claude <[email protected]>
@netlify

netlify bot commented Jan 31, 2026

Deploy Preview for antenna-ssec canceled.

Name Link
🔨 Latest commit a0b1f60
🔍 Latest deploy log https://app.netlify.com/projects/antenna-ssec/deploys/697d5f5611d4f30008ba30db

@netlify

netlify bot commented Jan 31, 2026

Deploy Preview for antenna-preview canceled.

Name Link
🔨 Latest commit a0b1f60
🔍 Latest deploy log https://app.netlify.com/projects/antenna-preview/deploys/697d5f5625f3af0009deb2b0

@coderabbitai
Contributor

coderabbitai bot commented Jan 31, 2026

Walkthrough

The PR refactors taxa list creation across the codebase by introducing a new get_or_create_for_project method on the TaxaList manager that supports both project-scoped and globally-scoped taxa lists, replacing direct get_or_create calls in management commands and the ML pipeline.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Core Model Updates: ami/main/models.py | Introduces TaxaListQuerySet with get_or_create_for_project method to handle project-scoped and global taxa list creation/retrieval, with logic to filter by project presence and handle duplicate resolution. Adds TaxaListManager and updates TaxaList model with new manager. |
| Management Commands: ami/main/management/commands/import_taxa.py, ami/main/management/commands/update_taxa.py | Updates both commands to use the new get_or_create_for_project method with project=None for global taxa list creation, and adds TODO comments for future --project parameter support. |
| ML Pipeline: ami/ml/models/pipeline.py | Replaces get_or_create call with get_or_create_for_project(project=None) to create global algorithm taxa lists via the new project-aware method. |

Sequence Diagram

sequenceDiagram
    participant Cmd as Command
    participant QS as TaxaListQuerySet
    participant DB as Database

    Cmd->>QS: get_or_create_for_project(name, project=None)
    alt project is None (Global List)
        QS->>DB: Filter lists with no associated projects
        DB-->>QS: Matching global list (if exists)
        alt List exists
            QS-->>Cmd: Return (existing_list, False)
        else List not found
            QS->>DB: Create new global TaxaList
            DB-->>QS: New TaxaList instance
            QS-->>Cmd: Return (new_list, True)
        end
    else project provided (Project-scoped List)
        QS->>DB: Filter lists for specific project
        DB-->>QS: Matching project-scoped list (if exists)
        alt List exists
            QS-->>Cmd: Return (existing_list, False)
        else List not found
            QS->>DB: Create new TaxaList and associate with project
            DB-->>QS: New TaxaList instance
            QS-->>Cmd: Return (new_list, True)
        end
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 A QuerySet so clever, with scopes both near and far,
Global lists and project lists, aligned like morning star!
Each command now calls forth with project=None in hand,
The Manager hops forward to help taxa lists expand! 🌿

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 40.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title directly and specifically addresses the main problem: duplicate TaxaList names causing MultipleObjectsReturned errors, which matches the core issue being fixed. |
| Description check | ✅ Passed | The PR description covers the summary, problem statement, solution details, and test plan, but is missing some required template sections like List of Changes, Related Issues, Deployment Notes, and Checklist. |
