Add design fingerprinting engine with semantic embeddings #4

Merged
nahiyankhan merged 2 commits into main from design-fingerprint
Apr 6, 2026
@nahiyankhan

Summary

  • Introduces `ghost profile` and `ghost compare` commands that fingerprint a project's design language (palette, spacing, typography, surfaces, architecture) without requiring a registry
  • Adds an extractor plugin system (Tailwind MVP + CSS fallback) for framework-agnostic material gathering
  • Integrates optional LLM interpretation (Anthropic/OpenAI) for rich fingerprint generation and optional semantic embeddings (OpenAI/Voyage) for similarity clustering
  • Includes a deterministic 64-dim embedding vector as a fallback, plus deterministic registry fingerprinting (no LLM needed for shadcn registries)
  • Existing `ghost scan` behavior is fully preserved; `designSystems` is now optional in config

What changed

  • Fingerprint engine (`fingerprint/`): profile, compare, describe, embed, and registry-based fingerprinting
  • Extractors (`extractors/`): pluggable Tailwind and CSS extractors for gathering design materials
  • LLM layer (`llm/`): Anthropic and OpenAI providers with structured prompts for fingerprint generation and summarization
  • CLI (`ghost-cli`): new `profile` and `compare` subcommands
  • Types/config: expanded `GhostConfig` and new fingerprint types
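
To make the extractor idea concrete, here is a minimal sketch of how a pluggable extractor with a CSS fallback could look. Everything in it is an assumption for illustration: the `Extractor` and `DesignMaterial` shapes, the `pickExtractor` helper, and the regex-based scanning are hypothetical and are not taken from the code in this PR.

```typescript
// Hypothetical shapes; the PR's real extractor types are not shown in this description.
type ProjectFiles = Record<string, string>; // path -> file contents

interface DesignMaterial {
  source: string;    // name of the extractor that produced the material
  colors: string[];  // unique hex colors found
  spacing: string[]; // unique px/rem/em lengths found
}

interface Extractor {
  name: string;
  detect(files: ProjectFiles): boolean;
  extract(files: ProjectFiles): DesignMaterial;
}

const HEX_COLOR = /#[0-9a-fA-F]{3,8}\b/g;
const LENGTH = /\b\d+(?:\.\d+)?(?:px|rem|em)\b/g;

// Return every distinct match of a global pattern, in first-seen order.
function uniqueMatches(text: string, pattern: RegExp): string[] {
  const found: string[] = text.match(pattern) || [];
  return found.filter((value, index) => found.indexOf(value) === index);
}

// Concatenate the contents of all files whose path matches the pattern.
function readMatching(files: ProjectFiles, pathPattern: RegExp): string {
  return Object.keys(files)
    .filter((path) => pathPattern.test(path))
    .map((path) => files[path])
    .join("\n");
}

function makeExtractor(name: string, pathPattern: RegExp): Extractor {
  return {
    name,
    detect: (files) => Object.keys(files).some((path) => pathPattern.test(path)),
    extract: (files) => {
      const text = readMatching(files, pathPattern);
      return {
        source: name,
        colors: uniqueMatches(text, HEX_COLOR),
        spacing: uniqueMatches(text, LENGTH),
      };
    },
  };
}

const tailwindExtractor = makeExtractor("tailwind", /tailwind\.config\.(js|ts|cjs|mjs)$/);
const cssExtractor = makeExtractor("css", /\.css$/);

// Try extractors in priority order; plain CSS is the last-resort fallback.
function pickExtractor(files: ProjectFiles): Extractor {
  for (const extractor of [tailwindExtractor, cssExtractor]) {
    if (extractor.detect(files)) return extractor;
  }
  return cssExtractor;
}
```

The ordered list keeps the fallback explicit: a more specific extractor can be added ahead of `cssExtractor` without touching the callers.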

Test plan

  • Run `ghost profile` on a Tailwind project and verify fingerprint output
  • Run `ghost profile` on a plain CSS project (fallback extractor)
  • Run `ghost compare` between two fingerprints and verify distance scores
  • Run `ghost scan` to confirm existing behavior is unaffected
  • Test with and without LLM API keys to verify that the deterministic fallback works
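
For reviewers who want a feel for what "distance scores" might mean when checking `ghost compare`, the sketch below computes a per-dimension distance over the five dimensions named in the summary. The metric is an assumption: this description does not say how the PR measures distance, so Jaccard distance over token sets is swapped in purely for illustration, and the `Fingerprint` shape and `compareFingerprints` name are hypothetical.

```typescript
// The five dimensions come from the PR summary; everything else is illustrative.
const DIMENSIONS = ["palette", "spacing", "typography", "surfaces", "architecture"] as const;
type Dimension = (typeof DIMENSIONS)[number];

// Hypothetical simplification: each dimension is a bag of string tokens.
type Fingerprint = Record<Dimension, string[]>;

// Jaccard distance: 0 for identical sets, 1 for disjoint sets.
function jaccardDistance(a: string[], b: string[]): number {
  const setA = new Set(a);
  const setB = new Set(b);
  if (setA.size === 0 && setB.size === 0) return 0; // two empty sets count as identical
  let shared = 0;
  setA.forEach((token) => {
    if (setB.has(token)) shared += 1;
  });
  const union = setA.size + setB.size - shared;
  return 1 - shared / union;
}

interface Comparison {
  perDimension: Record<Dimension, number>;
  overall: number; // unweighted mean of the per-dimension distances
}

function compareFingerprints(a: Fingerprint, b: Fingerprint): Comparison {
  const perDimension = {} as Record<Dimension, number>;
  let total = 0;
  for (const dimension of DIMENSIONS) {
    const distance = jaccardDistance(a[dimension], b[dimension]);
    perDimension[dimension] = distance;
    total += distance;
  }
  return { perDimension, overall: total / DIMENSIONS.length };
}
```

Whatever the real metric is, useful sanity checks are the same: comparing a fingerprint with itself should give zero in every dimension, and the comparison should be symmetric.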

🤖 Generated with Claude Code

nahiyankhan and others added 2 commits April 5, 2026 23:02
…terpretation

Evolves Ghost from registry-only audit to a design language fingerprinting system.
Projects can now be profiled without a registry, fingerprints compared for distance,
and design languages clustered across repos via embedding vectors.

New capabilities:
- `ghost profile` generates a DesignFingerprint (palette, spacing, typography, surfaces, architecture)
- `ghost compare` computes per-dimension distance between two fingerprints
- Extractor plugin system (Tailwind MVP + CSS fallback) for framework-agnostic material gathering
- LLM interpretation layer (Anthropic/OpenAI as optional peer deps) for rich fingerprint generation
- Deterministic 64-dim embedding vector for similarity/clustering
- Deterministic registry fingerprinting (no LLM needed for shadcn registries)

Existing `ghost scan` is fully preserved; `designSystems` is now optional in config.
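
As a rough illustration of what a deterministic 64-dim vector buys for similarity and clustering, the sketch below builds one with feature hashing and compares vectors by cosine similarity. This is not the hand-engineered feature vector the commit refers to: feature hashing is a substituted technique, and `embedTokens`, `cosineSimilarity`, and the token inputs are hypothetical names chosen for the example.

```typescript
const VECTOR_SIZE = 64; // matches the 64-dim size mentioned in the commit message

// FNV-1a style 32-bit hash over UTF-16 code units; stable across runs and machines.
function fnv1a(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Feature hashing: each token increments one of 64 buckets, then the vector is L2-normalized.
function embedTokens(tokens: string[]): number[] {
  const vector: number[] = new Array<number>(VECTOR_SIZE).fill(0);
  for (const token of tokens) {
    vector[fnv1a(token) % VECTOR_SIZE] += 1;
  }
  const norm = Math.sqrt(vector.reduce((sum, value) => sum + value * value, 0));
  return norm === 0 ? vector : vector.map((value) => value / norm);
}

// Cosine similarity; defined as 0 when either vector has zero length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Because no model or network call is involved, the same input tokens always yield the same vector, which is the property the test plan's "deterministic fallback" check relies on.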

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

Replace the hand-engineered 64-dim feature vector with optional
model-generated embeddings. The fingerprint is rendered as natural
language and sent to a text embedding model, producing vectors that
capture semantic similarity across different CSS methodologies.

New optional config field `embedding` with provider/model/apiKey.
Falls back to env vars (OPENAI_API_KEY, VOYAGE_API_KEY) and to the
deterministic vector when no embedding config is present.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
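
The fallback order described in the second commit can be sketched as a small resolver. This is one reading of the commit message rather than the PR's code: it assumes that a missing `apiKey` falls back to the provider's environment variable, that a missing `embedding` block selects the deterministic vector, and that a configured provider with no key anywhere also degrades to the deterministic vector instead of raising an error. The `resolveEmbedding` name and the type shapes are hypothetical; only the `provider`/`model`/`apiKey` fields and the two variable names come from the commit message.

```typescript
type EmbeddingProvider = "openai" | "voyage";

// Field names follow the commit message; optionality of model and apiKey is an assumption.
interface EmbeddingConfig {
  provider: EmbeddingProvider;
  model?: string;
  apiKey?: string;
}

// The environment is passed in explicitly so the function stays pure and easy to test.
type Env = Record<string, string | undefined>;

type ResolvedEmbedding =
  | { kind: "model"; provider: EmbeddingProvider; model: string | undefined; apiKey: string }
  | { kind: "deterministic" };

const ENV_KEYS: Record<EmbeddingProvider, string> = {
  openai: "OPENAI_API_KEY",
  voyage: "VOYAGE_API_KEY",
};

function resolveEmbedding(config: EmbeddingConfig | undefined, env: Env): ResolvedEmbedding {
  // Assumption: no embedding config means the deterministic vector is used.
  if (!config) return { kind: "deterministic" };

  // An explicit apiKey wins; otherwise fall back to the provider's env var.
  const apiKey = config.apiKey || env[ENV_KEYS[config.provider]];

  // Assumption: with no key available, degrade to the deterministic vector.
  if (!apiKey) return { kind: "deterministic" };

  return { kind: "model", provider: config.provider, model: config.model, apiKey };
}
```

In a CLI this would typically be called with the parsed config and the process environment, for example `resolveEmbedding(config.embedding, process.env)`, where `config.embedding` is the new optional field.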
@nahiyankhan nahiyankhan merged commit 9adcfaf into main Apr 6, 2026
6 checks passed