Skip to content

Paralellize and cache the creation of FileDescriptorProto#694

Open
ckoutsidi-b wants to merge 4 commits intomainfrom
christina-fdp-fds
Open

Paralellize and cache the creation of FileDescriptorProto#694
ckoutsidi-b wants to merge 4 commits intomainfrom
christina-fdp-fds

Conversation

@ckoutsidi-b
Copy link
Copy Markdown

This commit adds two new types of queries, FDP and FDS.
FDP will turn the given IR files into FileDescriptorProto and FDS takes the results of a resolved Link query and feeds the resulting IRs into FDS, creating a FileDescriptorSet.

This is meant to parallelize and cache the conversion of the IRs into FileDescriptorProto.

This commit adds two new types of queries. FDP will turn the given IR files
into FileDescriptorProto and FDS takes the results of a resolved Link query
and feeds it into FDS.

This is meant to parallelize and cache the conversion of the IRs into
FileDescriptorProto.
@ckoutsidi-b ckoutsidi-b requested a review from mcy April 2, 2026 13:07
@CLAassistant
Copy link
Copy Markdown

CLAassistant commented Apr 2, 2026

CLA assistant check
All committers have signed the CLA.

@ckoutsidi-b ckoutsidi-b requested a review from doriable April 2, 2026 15:58
type FDP struct {
*ir.File // Must be comparable.
*ir.Session
Options *[]fdp.DescriptorOption // Must be comparable
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there's a subtle problem here, that I'm not immediately sure how to address.

Comparisons for *[]T are on an address basis, which means that creating the same queries.FDP with the same array of options in different places will result in unequal queries even if the slices are the same. It is very easy for people to make the mistake of passing a &[]T{...} to this field, breaking caching.

I think the cleanest solution for this is something like the following. First, we create a new struct like the following in fdp:

type Options struct {
  includeSourceCodeInfo bool
  generateExtraOptionLocations bool
}

Then, we embed Options into generator, and use that as the location for those option values.

The additional missing piece is a way to do excludes, because funcs are not comparable... one way this could be done is to add this to fdp:

type Excluder struct {
  Exclude(*ir.File) bool
}

Unlike funcs, interfaces are comparable if their concrete value is comparable. This results in a more tortured API.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on the options thing, unfortunately...

For excludes, we don't necessarily need to address that in either FDP or FDS queries, right? In the former case, it's the fdp of a single *ir.File and in the latter case, it's the fds of a source.Workspace, with the expectation that any exclusions are done on the configuration of the source.Workspace?

Copy link
Copy Markdown
Author

@ckoutsidi-b ckoutsidi-b Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now IsDescriptorProto is being called on ir.File in order to decide whether a file is excluded. In order to avoid having to pass Excluder to FDS, I guess I could do a second call to Link on the caller of FDS before the call to FDS itself, and use that to filter entries from source.Workspace, but it feels kinda convoluted.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see, sorry, I misread before I commented, I was thinking of something else... ^^" Yeah, actually, in that case, we would need an Excluder interface with comparable impls... :<

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I posted another update. My understanding is that two instances of the new type IRExcluder (you can find it in descriptor.go) are always equal, while if I made another type (eg: type A struct{}), its instances would never be equal to any instances of IRExcluder, even if both are treated under the Excluder interface. I hope my understanding is correct.

// The File field must not be edited between different FDP queries!
type FDP struct {
*ir.File // Must be comparable.
*ir.Session
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably don't need *ir.Session here, I think...

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, I had a different implementation previously but the current one doesn't use it.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense if the implementation is:

type FDP struct {
  path string
  *ir.Session
  options //options stuff
 // etc. ...
}

but then FDS can't call it using the Link query, I guess ^^" It's kind of a tough API to pin down...

Previously the FDS and FDP queries used a pointer to a slice for the options.
This was comparable on the value of the pointer, rather than the values within the slice.
In addition, DescriptorOption itself is not comparable as it is a function type.

We solve this issue by replacing said slice with a new option struct that is comparable on its values, and with an interface for the decidability of exclution of IR files.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants