Word Break - Backtracking / Aggregation and Memoization

https://algo.monster/problems/word_break

small variations in the memo-ized solution are timing out such as when using an int array as a substitute for a boolean array.

Could you please specify the time complexities with and without memoization and explain how we get those? Thanks

Why we need “ok = ok || dfs(…)”? Why can’t we just use “ok = dfs(…)” ?

cant we just break out when we find a ‘true’? why go through the whole backtracking tree?

memoization to cache the branches we have already seen!
Those branches are different because the root is different, so how come we memorize them?
branch a from root b is different from branch a from root ab

Result is different when ok is true and dfs(...) is false.

Is this solution easier? I only save false results, so any time I got something from “memo” I return false.
static boolean bcktr(String s, List words, String path, List memo) {
if (path.equals(s)) return true;
if (s.startsWith(path)) {
for(String word : words) {
String newPath = path + word;
if(memo.contains(newPath)) return false;
if (bcktr(s, words, newPath, memo)) {
return true;
} else memo.add(newPath);
}
}
return false;
}

You can do something like this instead:

public static boolean wordBreak(String s, List<String> words) {
    return dfs(s, words, 0, new Boolean[s.length()]);
}

public static boolean dfs(String s, List<String> words, int idx, Boolean[] dp) {
    if(idx == s.length()) return true;
    else if(dp[idx] != null) return dp[idx];
    
    boolean ok = false;
    for(String word : words) {
     if(s.substring(idx).startsWith(word) && dfs(s, words, idx + word.length(), dp)){
         ok = true;
     }
    }
    return dp[idx] = ok;
}

i had broken memo code and it passed these tests. The ok = False outside of for loop is important
Here is a test to watch out for (it should return True)
abcd
a abc b d

function wordBreak(s, words) {
return backtrack(s, words, [], new Set());
}
|
const backtrack = (s, words, path, memo) => {
const pathCompare = path.join(‘’);

// went beyond length. False path. Add to memo so we don't repeat.
if(pathCompare.length > s.length) {
    memo.add(pathCompare);
    return false;
}

// path is same length, maybe we found
if(pathCompare.length === s.length) {
    if(pathCompare === s) return true;   // found!
    else {
        memo.add(pathCompare); // not found. Add to memo.
        return false;
    }
}

for(const word of words) { // try each word
    path.push(word);

    if(memo.has(path.join(''))) return false;  // if we've seen this path, then stop trying
    
    if(backtrack(s, words, path, memo)) return true;

    path.pop();
}

return false;

}

I wish the solutions came with better explanations. The memoization of a simple index really threw me for a while. Here’s the notes I hacked out for myself after staring at it for an embarrassingly long time. Dunno if this will help anyone.

        // If we are able to match words all the way up to i, but then fail to find
        // any solutions from there with our words, then we will *always* fail to find
        // a match if we arrive at this length again (because all words can be reused
        // as often as you want, and we've already tried everything from this point).
        //
        // If we back up to a previous position, it's possible we will find a match that
        // places us at a new length i, from which one of our words might be able to match
        // to the end.

I also don’t think there is a need to use memoization. In the case of 'aab' and ['a', 'aa', 'b'], we will never have to look at the branch 'aa' -> 'b' because a solution will be found earlier than that in the branch 'a'->'a'->'b'. Consequently, the function can return True right away.

Thanks for sharing this. I was also looking for this same explanation

The diagram is incorrect. The states are the letters left to be matched. Since b is a given word, if you remove b from aab, you get aa, which is a valid state and can in fact be matched by another of the given words, namely aa. Or a and then a.

from typing import List

def word_break(s: str, words: List[str]) → bool:
memo={}
return fun(s,words,memo)
def fun(word,words,memo):
if word in memo:
return memo[word]
if len(word)==0:
return True

for x in words:
    if word.startswith(x):
        if fun(word[len(x):],words,memo):
            memo[word]=True
            return True
    
        
        
memo[word]=False        
return False

We remove from the left. The aa + b case is considered when we remove aa first (the second checkmark in the diagram).

Is memoisation useful in this case only when the string cannot be formed with the words provided? in which case we need to traverse the entire tree to find out

In the case when the string can be formed with the given words, we are always moving forward (either eliminating words at each step, or finding the right word, and moving forward). once we find the right match, we return True and exit, no need to backtrack and check other branches (for which memoisation would have come in handy) ?

The current solution can’t pass leetcode tests.

My Solution, which doesn’t require memorization, passes without timing out. Instead of passing new index, I am simply passing the smaller string. What am I missing here ?

public static boolean dfs(String s, List<String> words){
    if(s.isEmpty()) return true;

    for(String word : words){
        if(s.startsWith(word)){
            String nextString = s.substring(word.length());
            return dfs(nextString, words);
        }
    }

    return false;
}