docs: new blog entry on a graphql embedding model #2439
@ThoreKoritzius is attempting to deploy a commit to the The GraphQL Foundation Team on Vercel. A member of the Team first needs to authorize it.
martinbonnin left a comment
One minor comment but I love it! Thanks for working on this!
## Why general-purpose embedders struggle with schemas

Schemas reuse field names everywhere. Dozens of types carry a `description`. Many carry an `author`, a `state`, a `createdAt`, a `priceCents`. Knowing the field name is rarely enough. You have to know *whose* field it is.
As someone who eats and sleeps GraphQL, `description` immediately made me think of GraphQL descriptions, which isn't what is meant here. Could you use another example? Or make it really clear that we're talking about a field named `description` and not the description of a field in the GraphQL document?
Yes, maybe you could provide a snippet of the schema, including a few types and their fields, for the example in this blog post: "What's the nightly rate for this room?"
## A small, focused fine-tune

A natural experiment is to fine-tune a general embedder on this specific task: mapping a natural-language question to the `Type.field` coordinate that answers it, with an emphasis on disambiguating between same-named fields on different owner types. The artifact discussed in this post is [`Qwen3-Embedding-0.6B-GraphQL`](https://huggingface.co/xthor/Qwen3-Embedding-0.6B-GraphQL), an open-source ([Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)) fine-tune of [`Qwen3-Embedding-0.6B`](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B). It's an early prototype shared here as a reference point for schema-aware retrieval, and the methodology generalizes to any base embedder.
Very cool to provide concrete data and example 🤩
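For readers of this thread, the task in the quoted paragraph (ranking `Type.field` coordinates against a natural-language question) can be sketched roughly as below. This is a minimal illustration, not the post's method: the four coordinates and their text renderings are hypothetical, and `embed` is a toy bag-of-words stand-in for a real embedding model. In practice the same ranking loop would use vectors from an actual embedder.

```python
import math
import re
from collections import Counter

# Hypothetical schema coordinates, each rendered together with its owner type.
# None of these come from the blog post; they only illustrate the idea.
COORDINATES = {
    "Room.priceCents": "Room priceCents nightly rate for a hotel room",
    "Product.priceCents": "Product priceCents price of a product in the shop",
    "Room.description": "Room description text describing a hotel room",
    "Product.description": "Product description text describing a product",
}


def embed(text):
    """Toy stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(question, k=2):
    """Return the k coordinates whose rendered text is most similar to the question."""
    query = embed(question)
    scored = [(cosine(query, embed(doc)), coord) for coord, doc in COORDINATES.items()]
    return [coord for _, coord in sorted(scored, reverse=True)[:k]]


if __name__ == "__main__":
    print(rank("What's the nightly rate for this room?"))
```

Because each coordinate is rendered with its owner type, the two `priceCents` fields get different vectors even though the field name is identical. A lexical toy like this only works when the question shares words with the rendering, which is the gap a trained embedder is meant to close.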