West Midlands | 26 March SDC | Iswat Bello | Sprint 2 | Improve code with precomputing #203
Iswanna wants to merge 3 commits
- Implement alphabetical sorting to bring strings with common prefixes together
- Replace the nested O(N^2) loop with a single pass over neighboring strings
- Maintain legacy variable names and the 'find_common_prefix' helper function
- Add early return for lists with fewer than two strings

- Replace O(N) string scanning with O(1) set lookup
- Transition overall algorithm complexity from O(N^2) to O(N)
- Maintain legacy loop structure and helper functions for minimal code changes
- Add documentation explaining the pre-computing strategy

- Explain the O(N^2) to O(N log N) improvement via pre-sorting in common_prefix
- Detail the O(N^2) to O(N) transition using set lookups in count_letters
- Summarize the Space-vs-Time trade-offs applied to both algorithms
- Note the preservation of legacy variable names and helper functions
nedssoft left a comment
Iswat — two really nicely-reasoned optimisations here, and again your CHANGES_MADE.md does a lovely job of explaining the why, not just the what. The standout for me is the common-prefix insight: spotting that once the list is sorted, the strings sharing the longest prefix must be neighbours, so a single adjacent-pair pass replaces the whole nested loop. That's a genuinely clever result and it's mathematically sound. 👏 And the set(s) trick in count_letters — turning an O(N) not in s scan into an O(1) lookup while leaving the rest of the logic untouched — is exactly the right instinct.
One optional question inline — about a side-effect, not correctness (all your tests pass and both results are right). You've clearly met the task, so I'm marking this Complete. Really strong work across this sprint. 🎉
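The adjacent-pair idea praised above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `find_common_prefix` is named in the PR, but its body here and the outer function name `find_longest_common_prefix` are assumptions made for the example. The sketch works on a sorted copy so the caller's list is left alone.

```python
from typing import List


def find_common_prefix(left: str, right: str) -> str:
    """Return the longest prefix shared by two strings (assumed helper body)."""
    length = 0
    for a, b in zip(left, right):
        if a != b:
            break
        length += 1
    return left[:length]


def find_longest_common_prefix(strings: List[str]) -> str:
    """Longest prefix shared by any two strings (hypothetical function name)."""
    # Early return: fewer than two strings means there is no pair to compare.
    if len(strings) < 2:
        return ""

    # Pre-compute: after sorting, the pair sharing the longest prefix is adjacent.
    ordered = sorted(strings)

    longest = ""
    # Single pass over neighbouring pairs instead of a nested loop over all pairs.
    for previous, current in zip(ordered, ordered[1:]):
        common = find_common_prefix(previous, current)
        if len(common) > len(longest):
            longest = common
    return longest
```

The sort costs O(N log N) comparisons and the pass over neighbours is linear in the number of strings, which is where the improvement over comparing every pair comes from.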
        return ""

    # Pre-compute (Sort) - This is the optimization!
    strings.sort()
This sorting step is the heart of your optimisation and it's spot-on. One subtle thing worth noticing, though: strings.sort() sorts the list in place — it reorders the very list the caller handed you. The original function only ever read the list, so whoever called it could rely on their list staying in its original order afterwards; with .sort(), it silently comes back rearranged.
For a test that just checks the return value it makes no difference — but imagine a caller who does something else with their list right after calling you. Python gives you a close cousin of .sort() that returns a new sorted list instead of changing the original in place — do you know which one? What would swapping to it cost you here, and what would it protect the caller from?
Your result is correct — this is just about being a considerate houseguest with data you don't own.
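For anyone following along, the difference between the two built-ins can be seen in a few lines. This is a small standalone illustration with made-up list contents, not code from the PR.

```python
# sorted() builds and returns a new list; the list passed in is not changed.
names = ["pear", "apple", "fig"]
ordered = sorted(names)
print(names)    # the caller's original order is preserved
print(ordered)  # a separate, sorted copy

# list.sort() reorders the existing list in place and returns None.
caller_list = ["pear", "apple", "fig"]
result = caller_list.sort()
print(caller_list)  # the same list object, now rearranged
print(result)       # None
```

The trade-off is one extra list of N references for the copy, in exchange for the caller's list keeping its original order.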
Learners, PR Template
Self checklist
Changelist
In this PR, I have implemented performance optimisations for the common_prefix and count_letters tasks by applying pre-computing strategies. These changes successfully resolve $O(N^2)$ bottlenecks, allowing the functions to process millions of items in milliseconds.

I have provided a detailed technical breakdown of the complexity changes and architectural decisions in the CHANGES_MADE.md file.

Key Changes
1. Longest Common Prefix (Task 5)
- […] strings.sort().
2. Count Letters (Task 6)
- […] (set(s)) to handle character lookups.
- […] (not in s) with a constant-time […]
Legacy Preservation
- […] (longest, only_upper, string) and re-use original helper functions (find_common_prefix, is_upper_case).
Testing Done
- test_really_long_list (1,000,000 strings) and test_long_string (10,000,000 characters) now finish nearly instantly, whereas the original code would hang indefinitely.
Learning Reflection
These tasks highlighted the Space-vs-Time trade-off. By using a small amount of extra memory to store a sorted list or a set, we drastically reduced the CPU time required, which is essential for building scalable software.
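The set-based trade-off described here might look something like the sketch below. The names `only_upper` and `is_upper_case` come from the PR description, but the function body and its exact behaviour (counting distinct upper-case letters whose lower-case form never appears in the string) are assumptions made for illustration; the real task code may differ.

```python
def is_upper_case(letter: str) -> bool:
    """Assumed helper: True when the character is an upper-case letter."""
    return letter.isupper()


def count_letters(s: str) -> int:
    """Count distinct upper-case letters with no lower-case twin in s (assumed behaviour)."""
    # Pre-compute once: building the set is a single O(N) pass and costs
    # extra memory proportional to the number of distinct characters.
    seen = set(s)

    only_upper = set()
    for letter in s:
        # Membership in a set is O(1) on average, whereas `not in s` on the
        # string itself rescans up to N characters on every iteration.
        if is_upper_case(letter) and letter.lower() not in seen:
            only_upper.add(letter)
    return len(only_upper)
```

With the lookup inside the loop reduced to constant time on average, the whole function is a linear pass rather than a scan nested inside a scan.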