Update dotstrings parser to handle entries with multi-line values #12
fla-t wants to merge 4 commits into microsoft:master
The dotstrings parser has been updated to handle entries of the form "key" = "value"; whose value spans multiple lines. Previously, the entry pattern only matched entries written on a single line.
  comment = re.compile(r"(\'(?:[^\'\\]|\\[\s\S])*\')|//.*|/\*(?:[^*]|\*(?!/))*\*/", re.MULTILINE)
  whitespace = re.compile(r"\s*", re.MULTILINE)
- entry = re.compile(r'"(.*)"\s*=\s*"(.*)";')
+ entry = re.compile(r'"([^"]*?)"\s*=\s*"((?:[^";]|"(?!\s*;))*?)";', re.DOTALL)
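A minimal sketch of what the change is meant to achieve, using the two entry patterns from the diff. The sample entry is hypothetical, and `fullmatch` is used here only for illustration; how the parser actually applies the pattern is not shown in this conversation.

```python
import re

# The old and new entry patterns, copied from the diff.
OLD_ENTRY = re.compile(r'"(.*)"\s*=\s*"(.*)";')
NEW_ENTRY = re.compile(r'"([^"]*?)"\s*=\s*"((?:[^";]|"(?!\s*;))*?)";', re.DOTALL)

# Hypothetical entry whose value spans two lines.
SAMPLE = '"greeting" = "Hello,\nWorld";'

# "." does not match a newline by default, so the old pattern cannot span lines.
old_match = OLD_ENTRY.fullmatch(SAMPLE)

# The new pattern uses negated character classes, which do match newlines.
new_match = NEW_ENTRY.fullmatch(SAMPLE)

print(old_match)
print(new_match.groups())
```

Under these assumptions the old pattern finds no match, while the new one captures the key and the two-line value.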
Unfortunately this will break on strings that contain quotes. For example, NSLocalizedString("Hello \"World\"", "Some Comment") gets turned into "Hello \"World\"" = "Hello \"World\"";, which the new pattern no longer matches.
I think the second change on this line will also prevent the value from containing a ;.
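Both concerns can be checked directly against the new pattern. This is only a sketch: the semicolon sample is hypothetical, and `fullmatch` is an assumption about how an entry would be tested.

```python
import re

# The new entry pattern from the diff.
NEW_ENTRY = re.compile(r'"([^"]*?)"\s*=\s*"((?:[^";]|"(?!\s*;))*?)";', re.DOTALL)

# Entry with escaped quotes in the key and the value (the reviewer's example).
ESCAPED = r'"Hello \"World\"" = "Hello \"World\"";'

# Hypothetical entry with a semicolon inside the value.
SEMICOLON = '"key" = "a; b";'

# The key group [^"]*? stops at the first quote, escaped or not.
escaped_match = NEW_ENTRY.fullmatch(ESCAPED)

# The value group excludes ";" entirely, so the value cannot contain one.
semicolon_match = NEW_ENTRY.fullmatch(SEMICOLON)

print(escaped_match, semicolon_match)
```

Neither entry matches as a whole, which is consistent with both points raised above.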
The first change is there so that the first match group doesn't end up matching the whole string (including the "value" part).
But yes, it won't work when the "key" part of the string contains any quotes beyond the two that surround the key itself. This was intentional, because I didn't think there would be any "key" with more than two quotes.
If we really want to cater for keys that have quotes inside them, we can replicate the regex that we have for the value part, but without the semicolon:
entry = re.compile(r'"((?:[^";]|"(?!\s*;))*?)"\s*=\s*"((?:[^";]|"(?!\s*;))*?)";')
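A quick check of the pattern proposed above, followed by one possible alternative. `ESCAPE_AWARE` is not part of this PR; it is a hypothetical sketch that treats a backslash plus the following character as a single unit, assuming dotstrings use backslash escapes inside quoted strings.

```python
import re

# The pattern proposed in the comment above, copied as written.
PROPOSED = re.compile(r'"((?:[^";]|"(?!\s*;))*?)"\s*=\s*"((?:[^";]|"(?!\s*;))*?)";')

# Hypothetical alternative: a string body is any run of non-quote,
# non-backslash characters or backslash-escaped characters.
ESCAPE_AWARE = re.compile(
    r'"((?:[^"\\]|\\.)*)"\s*=\s*"((?:[^"\\]|\\.)*)"\s*;', re.DOTALL
)

ESCAPED = r'"Hello \"World\"" = "Hello \"World\"";'
SEMICOLON = '"key" = "a; b";'  # hypothetical sample

proposed_escaped = PROPOSED.fullmatch(ESCAPED)      # quoted key is accepted
proposed_semicolon = PROPOSED.fullmatch(SEMICOLON)  # ";" in the value is still rejected

aware_escaped = ESCAPE_AWARE.fullmatch(ESCAPED)
aware_semicolon = ESCAPE_AWARE.fullmatch(SEMICOLON)

print(proposed_escaped.groups(), proposed_semicolon)
print(aware_escaped.groups(), aware_semicolon.groups())
```

The proposed pattern accepts the escaped-quote example but still rejects a semicolon inside the value. The escape-aware sketch accepts both, at the cost of requiring every inner quote to be escaped.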
The second change looks for the closing "; to find where the value part of the string ends.
Can you change the test file I have here to contain edge cases?
Also, there is a line that I hate every time I look at it: comment = comment[::-1].replace("/*", "", 1)[::-1] Or maybe I am missing something? :)
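Read on its own, that expression appears to remove the last "*/" from the comment: reversing the string turns "*/" into "/*", the first occurrence of that is removed, and the result is reversed back. A sketch of an equivalent using `str.rpartition`, with hypothetical function names and sample input, since the surrounding parser code is not shown here:

```python
def strip_last_close_reversed(comment: str) -> str:
    # The expression quoted above: reverse, drop the first "/*", reverse back.
    return comment[::-1].replace("/*", "", 1)[::-1]


def strip_last_close_rpartition(comment: str) -> str:
    # Split around the last "*/" and rejoin without it.
    # If "*/" is absent, rpartition returns ("", "", comment), so the
    # input comes back unchanged.
    head, _sep, tail = comment.rpartition("*/")
    return head + tail


sample = "/* Some comment */"
print(repr(strip_last_close_reversed(sample)))
print(repr(strip_last_close_rpartition(sample)))
```

Both functions return the same result for the samples tried here, so the `rpartition` form may be an easier-to-read replacement if that is indeed the intent of the line.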