Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
26 commits
Select commit Hold shift + click to select a range
f763652
feat: Add GraphQL API support for batched operations
arnavk23 Jan 18, 2026
0b6b73e
docs: Update documentation for GraphQL implementation
arnavk23 Jan 18, 2026
d9fa61a
test: Add comprehensive tests for GraphQL client
arnavk23 Jan 18, 2026
21188a2
Merge branch 'master' into feature/graphql-api
arnavk23 Jan 18, 2026
9a35e8f
fix: Update poetry.lock for GraphQL dependencies
arnavk23 Jan 18, 2026
89c77c9
fix: Disable GraphQL client in test environments
arnavk23 Jan 18, 2026
7bffdad
fix: Use consistent annotated tag handling in GraphQL
arnavk23 Jan 18, 2026
180bde4
Update tagbot/action/graphql.py
arnavk23 Jan 18, 2026
c6c56c5
Fix annotated tag handling in GraphQL client
arnavk23 Jan 18, 2026
cdfd30a
Apply suggestions from code review
arnavk23 Jan 18, 2026
328aa33
poetry
arnavk23 Jan 18, 2026
a120c1b
black
arnavk23 Jan 18, 2026
6eec807
flake8
arnavk23 Jan 18, 2026
a5966c2
Merge branch 'master' into pr/488
IanButterworth Feb 7, 2026
5ebd5cc
Update poetry.lock
IanButterworth Feb 7, 2026
3079a59
claude review
arnavk23 Feb 11, 2026
1e9f393
Apply suggestions from code review
arnavk23 Feb 11, 2026
899c6e5
additions
arnavk23 Feb 11, 2026
4c05fc9
flake8
arnavk23 Feb 11, 2026
d60c618
Apply suggestions from code review
arnavk23 Feb 13, 2026
ba0ccfe
copilot changes
arnavk23 Feb 13, 2026
e7125b7
format
arnavk23 Feb 13, 2026
1b33005
Merge branch 'master' into feature/graphql-api
arnavk23 Feb 18, 2026
3ed9148
Merge branch 'master' into feature/graphql-api
arnavk23 Feb 18, 2026
82340cb
Update project version and dependencies in pyproject.toml
arnavk23 Apr 18, 2026
8b88c18
Update poetry.lock
arnavk23 Apr 18, 2026
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 11 additions & 2 deletions DEVGUIDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,7 @@ tagbot/
│ ├── changelog.py # Release notes generation (Jinja2)
│ ├── git.py # Git command wrapper
│ ├── gitlab.py # GitLab API wrapper (optional)
│ ├── graphql.py # GraphQL client for batched API operations
│ └── repo.py # Core logic: version discovery, release creation
├── local/
│ └── __main__.py # CLI entrypoint
Expand Down Expand Up @@ -99,6 +100,11 @@ tagbot/
- Extracts custom notes from registry PR (`<!-- BEGIN RELEASE NOTES -->`)
- Renders Jinja2 template

**`GraphQLClient` (graphql.py)** - Batched API operations:
- `query()` - Low-level GraphQL query helper
- `fetch_tags_and_releases()` - Single query for tags + releases
- Provides 2x+ performance improvement over sequential REST calls

### Special Features

**Subpackages**: For monorepos with `subdir` input:
Expand All @@ -123,10 +129,13 @@ Performance: 600+ versions in ~4 seconds via aggressive caching.

| Cache | Purpose | Built By |
|-------|---------|----------|
| `__existing_tags_cache` | Skip existing tags | Single API call to `get_git_matching_refs("tags/")` |
| `__existing_tags_cache` | Skip existing tags | GraphQL query or `get_git_matching_refs("tags/")` |
| `__releases_cache` | Cached releases | Fetched alongside tags via GraphQL |
| `__tree_to_commit_cache` | Tree SHA → commit | Single `git log --all --format=%H %T` |
| `__registry_prs_cache` | Fallback commit lookup | Fetch up to 300 merged PRs |
| `__commit_datetimes` | "Latest" determination | Lazily built |
| `__commit_datetimes` | "Latest" determination | Single `git log --all --format=%H %aI` |

**GraphQL Optimization**: When available, `_build_tags_cache()` uses a single GraphQL query to fetch both tags and releases simultaneously, reducing API calls by 50% compared to separate REST calls.

**Pattern for new caches**:
```python
Expand Down
15 changes: 9 additions & 6 deletions IMPROVEMENTS.md
Original file line number Diff line number Diff line change
Expand Up @@ -48,16 +48,19 @@ The `Changelog._issues_and_pulls()` method now uses the GitHub search API to fil
---

### 1.4 Use GraphQL API for Batched Operations
**Status**: Not implemented
**Status**: ✅ Implemented
**Impact**: High
**Effort**: High

Many operations make multiple REST API calls that could be consolidated using GitHub's GraphQL API. A single GraphQL query could fetch:
The current GraphQL integration consolidates some operations that previously required multiple REST API calls. The primary GraphQL query currently fetches:
- All tags
- All releases
- Multiple commits' metadata
- Issues/PRs in a date range
**Implementation**: Created `graphql.py` module with a `GraphQLClient` class used for GraphQL-based batch operations, including:
- `fetch_tags_and_releases()` - Single query to get tags + releases (replaces 2 separate REST calls)

Additional helpers may be added over time as the GraphQL integration is expanded.

The implementation uses GraphQL as the primary method with graceful fallback to REST API on errors where applicable.
**Example**: Fetching tags and releases in one query:
```graphql
query {
Expand All @@ -72,7 +75,7 @@ query {
}
```

**Tradeoff**: Would require adding `gql` dependency and significant refactoring.
**Benefit**: Reduces API calls and improves performance. For repositories with many tags/releases, this can cut API calls by 50% or more.

---

Expand Down Expand Up @@ -279,7 +282,7 @@ Current Dockerfile uses `python:3.12-slim`. Could reduce further with:
| 1.1 | Git log primary lookup | High | Low | ✅ Done |
| 1.2 | Changelog API optimization | High | Medium | ✅ Done |
| 1.3 | Batch commit datetime lookups | Medium-High | Low | ✅ Done |
| 1.4 | GraphQL API | High | High | Not started |
| 1.4 | GraphQL API | High | High | ✅ Done |
| 2.1 | Split repo.py | Medium | Medium | Not started |
| 2.2 | Use tomllib | Low | Low | Not started |
| 2.3 | Structured logging | Medium | Medium | Not started |
Expand Down
663 changes: 351 additions & 312 deletions poetry.lock

Large diffs are not rendered by default.

75 changes: 43 additions & 32 deletions pyproject.toml
Original file line number Diff line number Diff line change
@@ -1,49 +1,60 @@
[tool.poetry]
[project]
name = "tagbot"
version = "1.24.4"
version = "1.25.7"
description = "Creates tags, releases, and changelogs for your Julia packages when they're registered"
authors = ["Chris de Graaf <[email protected]>"]
authors = [{name = "Chris de Graaf", email = "[email protected]"}]
license = "MIT"
requires-python = ">=3.12"
dynamic = ["dependencies"]

[project.optional-dependencies]
gitlab = ["python-gitlab>=8.2.0,<9"]
ssh = ["pexpect>=4.8.0,<5"]
gpg = ["python-gnupg>=0.5.6,<1"]
reporting = ["docker>=7.1.0,<8", "requests>=2.28.0,<3"]
web = ["Flask==3.1.3", "werkzeug==3.1.7", "pylev>=1.3.0,<2"]
local = ["click>=8,<9", "pyyaml>=6,<7"]
all = ["pexpect>=4.8.0,<5", "python-gnupg>=0.5.6,<1", "docker>=7.1.0,<8", "requests>=2.28.0,<3", "Flask==3.1.3", "werkzeug==3.1.7", "pylev>=1.3.0,<2", "click>=8,<9", "pyyaml>=6,<7", "python-gitlab>=8.2.0,<9"]

[tool.poetry]
requires-plugins = {poetry-plugin-export = ">=1.8"}

[tool.poetry.dependencies]
python = "^3.12"
Flask = "3.1.2"
Jinja2 = "^3"
PyGithub = "^2.7.0"
click = "^8"
docker = "^7.1.0"
pexpect = "^4.8.0"
pylev = "^1.3.0"
python-gnupg = "^0.5.6"
pyyaml = "^6"
PyGithub = "^2.9.0"
semver = "^3.0.4"
toml = "^0.10.0"
MarkupSafe = "3.0.3"
itsdangerous = "2.2.0"
werkzeug = "3.1.5"
types-requests = "^2.32.4"
types-toml = "^0.10.8"
types-PyYAML = "6.0.12.20250915"
setuptools = "^81.0.0"
wheel = "^0.46.3"
python-gitlab = { version = "^8.0.0", optional = true }

[tool.poetry.extras]
gitlab = ["python-gitlab"]

[tool.poetry.requires-plugins]
poetry-plugin-export = ">=1.8"
# Optional: only needed when using SSH key passwords
pexpect = { version = "^4.8.0", optional = true }
# Optional: only needed when using GPG signing
python-gnupg = { version = "^0.5.6", optional = true }
# Optional: only needed for error reporting to julia-tagbot.com
docker = { version = "^7.1.0", optional = true }
requests = { version = "^2.33.1", optional = true }
# Optional: only needed for the web service
Flask = { version = "3.1.3", optional = true }
werkzeug = { version = "3.1.7", optional = true }
pylev = { version = "^1.3.0", optional = true }
# Optional: only needed for the local CLI
click = { version = "^8", optional = true }
pyyaml = { version = "^6", optional = true }
# Optional: only needed for GitLab repos
python-gitlab = { version = "^8.2.0", optional = true }

[tool.poetry.group.dev.dependencies]
black = "^26.1"
boto3 = "^1.42.44"
black = "^26.3"
boto3 = "^1.42.83"
flake8 = "^7"
mypy = "^1.19"
pytest-cov = "^7.0.0"
mypy = "^1.20"
pytest-cov = "^7.1.0"
types-requests = "^2.33.0"
types-toml = "^0.10.8"
types-PyYAML = "6.0.12.20250915"

[tool.black]
line-length = 88

[build-system]
requires = ["poetry>=0.12"]
build-backend = "poetry.masonry.api"
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
183 changes: 183 additions & 0 deletions tagbot/action/graphql.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,183 @@
"""GraphQL query utilities for GitHub API batching.

This module provides optimized GraphQL queries to replace multiple REST API calls
with single batched requests.
"""

from typing import Any, Dict, List, Optional, Tuple
from github import Github, GithubException

from .. import logger


class GraphQLTruncationError(Exception):
"""Raised when GraphQL query results are truncated due to pagination limits."""

pass


class GraphQLClient:
"""Client for executing GraphQL queries against GitHub API."""

def __init__(self, github_client: Github) -> None:
"""Initialize GraphQL client with GitHub connection.

Args:
github_client: Authenticated PyGithub client instance.
"""
self._github = github_client
# Access the requester attribute (it's private but we need it)
self._requester = github_client._Github__requester # type: ignore

def query(self, query_str: str, variables: Optional[Dict[str, Any]] = None) -> Any:
"""Execute a GraphQL query.

Args:
query_str: GraphQL query string.
variables: Optional variables dict for the query.

Returns:
Query result data.

Raises:
GithubException: If query fails.
"""
payload: Dict[str, Any] = {"query": query_str}
if variables:
payload["variables"] = variables

_headers, data = self._requester.requestJsonAndCheck(
"POST", "/graphql", input=payload
)

if "errors" in data:
error_messages = [e.get("message", str(e)) for e in data["errors"]]
raise GithubException(
400, {"message": f"GraphQL errors: {'; '.join(error_messages)}"}, {}
)

return data.get("data", {})

def fetch_tags_and_releases(
self, owner: str, name: str, max_items: int = 100
) -> Tuple[Dict[str, str], List[Dict[str, Any]]]:
"""Fetch all tags and releases in a single query.

This replaces separate calls to get_git_matching_refs("tags/")
and get_releases().

Args:
owner: Repository owner.
name: Repository name.
max_items: Maximum number of items to fetch per type (default 100).

Returns:
Tuple of (tags_dict, releases_list) where:
- tags_dict maps tag names to commit SHAs
- releases_list contains release metadata dicts
"""
query = """
query($owner: String!, $name: String!, $maxItems: Int!) {
repository(owner: $owner, name: $name) {
refs(
refPrefix: "refs/tags/",
first: $maxItems,
orderBy: {field: TAG_COMMIT_DATE, direction: DESC}
) {
pageInfo {
hasNextPage
endCursor
}
nodes {
name
target {
oid
... on Commit {
oid
}
... on Tag {
target {
oid
}
}
}
}
}
releases(first: $maxItems, orderBy: {field: CREATED_AT, direction: DESC}) {
pageInfo {
hasNextPage
endCursor
}
nodes {
tagName
createdAt
tagCommit {
oid
}
isDraft
isPrerelease
}
}
}
}
"""

variables = {"owner": owner, "name": name, "maxItems": max_items}
logger.debug(f"Fetching tags and releases via GraphQL for {owner}/{name}")

result = self.query(query, variables)
repo_data = result.get("repository", {})

# Process tags
tags_dict: Dict[str, str] = {}
refs_data = repo_data.get("refs", {})
for node in refs_data.get("nodes", []):
if not node:
# Skip None or falsy entries that may appear in GraphQL connections
continue
tag_name = node.get("name")
if not tag_name:
# Skip nodes without a tag name to avoid KeyError and invalid data
continue
target = node.get("target") or {}

# Handle both direct commits and annotated tags
# Annotated tags have a nested target structure, lightweight tags don't
nested_target = target.get("target")
if nested_target:
# Annotated tag - resolve to underlying commit SHA
# GraphQL returns nested target: target.target.oid is the commit
commit_sha = nested_target.get("oid")
if commit_sha:
tags_dict[tag_name] = commit_sha
else:
# Lightweight tag - direct commit reference
commit_sha = target.get("oid")
if commit_sha:
tags_dict[tag_name] = commit_sha
Comment thread
arnavk23 marked this conversation as resolved.
Comment thread
arnavk23 marked this conversation as resolved.

# Process releases
releases_list: List[Dict[str, Any]] = []
releases_data = repo_data.get("releases", {})
for node in releases_data.get("nodes", []):
if node: # Skip None entries
releases_list.append(node)

# Check for pagination - raise exception if data is truncated
if refs_data.get("pageInfo", {}).get("hasNextPage"):
raise GraphQLTruncationError(
f"Repository has more than {max_items} tags, "
"GraphQL cannot fetch all data. Falling back to REST API."
)

if releases_data.get("pageInfo", {}).get("hasNextPage"):
raise GraphQLTruncationError(
f"Repository has more than {max_items} releases, "
"GraphQL cannot fetch all data. Falling back to REST API."
)
Comment thread
arnavk23 marked this conversation as resolved.

logger.debug(
f"GraphQL fetched {len(tags_dict)} tags and {len(releases_list)} releases"
)

return tags_dict, releases_list
Loading