Skip to content

Emit config diagnostics#2289

Open
sgavriil01 wants to merge 4 commits intofacebook:mainfrom
sgavriil01:emit-config-diagnostics
Open

Emit config diagnostics#2289
sgavriil01 wants to merge 4 commits intofacebook:mainfrom
sgavriil01:emit-config-diagnostics

Conversation

@sgavriil01
Copy link
Contributor

Summary

This PR adds LSP diagnostics for config file (pyrefly.toml/pyproject.toml) parsing errors. When a config file has invalid syntax, diagnostics now appear in VS Code's Problems panel and clear automatically when the file is fixed.

Changes:

  • Modified ConfigError to include file path information
  • Fixed bug in workspace.rs where config errors were being discarded
  • Added publish_config_diagnostics() to emit LSP diagnostics for config files
  • Added caching to ensure diagnostics persist across publish cycles

Fixes #2078

Test Plan

  • Added integration test test_config_file_diagnostics() that creates invalid TOML and verifies proper handling
  • Manually tested by creating syntax errors in pyrefly.toml - diagnostics appear in Problems panel with correct line/column positions
  • Verified diagnostics clear when config is fixed
  • Ran ./test.py --no-test --no-conformance - formatting and linting pass

@meta-cla meta-cla bot added the cla signed label Feb 2, 2026
Emit diagnostics for config file parse errors (facebook#2078)

Config errors (pyrefly.toml/pyproject.toml) now appear as LSP diagnostics
in the Problems panel and clear when fixed.
@sgavriil01 sgavriil01 force-pushed the emit-config-diagnostics branch from c487ef4 to 5393991 Compare February 2, 2026 14:55
@github-actions

This comment has been minimized.

@connernilsen connernilsen self-assigned this Feb 4, 2026
@connernilsen connernilsen self-requested a review February 4, 2026 19:50
@connernilsen connernilsen added configuration language-server Issues specific to our IDE integration rather than type checking and removed needs-triage labels Feb 6, 2026
@connernilsen
Copy link
Contributor

Hey @sgavriil01, thanks for making this! It seems like something we definitely need. I'll review early next week

Copy link
Contributor

@connernilsen connernilsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, thanks for doing this! It's looking great so far and I love the motivation. I left a few comments, mostly small things, but I do think there needs to be a different approach for how we figure out which files to update for, which I left in the else branch when we're getting the diagnostics to publish.

Let me know if you have any questions or disagree, I'm totally down to discuss things or give more thoughts if you want :)

Comment on lines +44 to +57
pub fn error(msg: anyhow::Error) -> Self {
Self {
severity: Severity::Error,
msg,
msg: format!("{:#}", msg),
file_path: None,
}
}

pub fn warn(msg: anyhow::Error) -> Self {
Self {
severity: Severity::Warn,
msg,
msg: format!("{:#}", msg),
file_path: None,
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts on instead making this take an optional error? Then when creating a config error it will be more apparent that it can be tied to a config at a path? I could go either way on this/am open to discussion here though :)

/// Extract line and column numbers from TOML error messages
/// Returns (line, column) with 1-based indexing, defaults to (1, 1)
fn extract_line_col_from_toml_error(error_msg: &str) -> (usize, usize) {
let line = error_msg
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of trying to parse a position out of an error message, can we instead keep this information as part of the error we construct? TOML deserialization should have spans that will tell you this

// If we got fresh errors, deduplicate and update the cache
if !config_errors.is_empty() {
// Deduplicate by (file_path, message) since the same error can be added multiple times
let mut seen = std::collections::HashSet::new();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we use a SmallSet here instead? They're more efficient, especially when working with small cardinalities or if you aren't removing things from the set

});
*self.cached_config_errors.lock() = config_errors.clone();
} else {
// If get_config_errors() returned empty (consumed by mem::take),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those errors might be empty because all errors have been fixed. If that's the case, will we not end up clearing diagnostics here?

What we might need to do instead to make sure this is working is to keep a map in the config loader of all config paths to errors that have been loaded since the last time they were taken. Then we can use that to determine if a config has been changed since the last time we published errors. If it's not in the map, then we don't update diagnostics from the previous result. If it is, then completely overwrite with new errors.

uri,
Vec::new(),
None,
DiagnosticSource::DidClose,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to use DidClose here unless we're sure it's from a close operation.

Instead, this function (publish_config_diagnostics) is only called from publish_for_handles which takes a DiagnosticSource as an arg. I think we should reuse that.

uri,
diagnostics,
None,
DiagnosticSource::CommittingTransaction,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here as above

…efactor ConfigError API

- Replace std::collections::HashSet with SmallSet for deduplication
- Add source: DiagnosticSource parameter to publish_config_diagnostics
- Consolidate ConfigError::error/warn methods to accept Option<PathBuf>
- Update all callsites to use the new unified API
- Add span field to ConfigError for line/column information
- Extract span from toml::de::Error in from_file()
- Update config_error_to_diagnostic to use ConfigError.span()
- Remove extract_line_col_from_toml_error() regex function
- Add error_with_span() constructor for TOML errors
@sgavriil01
Copy link
Contributor Author

Thanks for the review! I've addressed the first 4 points:

  1. Using SmallSet for deduplication
  2. Passing DiagnosticSource as a parameter instead of hardcoding it
  3. Simplified the ConfigError API
  4. Using TOML's built-in error information instead of regex parsing

For point 5 about the caching strategy - could you clarify the intended behavior? Should the ConfigFinder track which config files were recently loaded and only publish diagnostics for those, rather than maintaining a separate cache in the Server?

@github-actions
Copy link

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had recent activity for more than 2 weeks.

If you are still working on this this pull request, please add a comment or push new commits to keep it active. Otherwise, please unassign yourself and allow someone else to take over.

Thank you for your contributions!

@github-actions github-actions bot added the stale label Feb 26, 2026
@connernilsen
Copy link
Contributor

Hey @sgavriil01, sorry for the long delay. I've had to shift my focus internally for a bit to focus on a bug that's been causing a lot of problems. I think it should be resolved soon, so I'll come back and check out your changes here hopefully this week :)

@github-actions github-actions bot removed the stale label Mar 11, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla signed configuration language-server Issues specific to our IDE integration rather than type checking

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Make some sort of UI signal in VS Code if we fail to parse pyproject.toml

2 participants