Skip to content

Latest commit

 

History

History
261 lines (225 loc) · 12.2 KB

076.MinimumWindowSubstring.md

File metadata and controls

261 lines (225 loc) · 12.2 KB

tags: Hash Table, Two Pointers, String

#[LeetCode 76] Minimum Window Substring Given a string S and a string T, find the minimum window in S which will contain all the characters in T in complexity O(n).

For example, S = "ADOBECODEBANC" T = "ABC" Minimum window is "BANC".

Note:
If there is no such window in S that covers all characters in T, return the empty string "".

If there are multiple such windows, you are guaranteed that there will always be only one unique minimum window in S.

Diffculty
Hard

Similar Problems
[LeetCode ] Substring with Concatenation of All Words Hard [LeetCode ] Minimum Size Subarray Sum Medium [LeetCode ] Sliding Window Maximum Hard

Analysis

分析: 由于大小写字母的ASCII码不大于128,因此开辟两个数组存储信息。 require 数组存储T字符串每个字符出现次数。例如: require['A']=5 表示T中A出现5次。 IsInPattern 数组存储S字符串在[begin,end]窗口内每个字符出现次数。

算法核心思想如下: 在保证[begin, end]窗口包含T中所有字符的条件下,延伸end,收缩begin。 进行一次扫描后,记录符合条件的最小窗口(end - begin + 1)表示的字符串。 有个问题:怎样才知道[begin,end]窗口包含了T中所有字符? 我使用count记录剩余“有效”字符数,当count达到0时,即可说明[begin,end]包含了T。 注意:“有效”的意思是指,当前延伸得到的S[end]字符,使得[begin,end]更进一步包含T,而不是重复劳动。 比如说,T="a", [begin, end]已经包含"a",再延伸得到"aa",只是无效操作,并没有使得[begin,end]更接近T, 有效字符数仍为1.

双指针,动态维护一个区间。尾指针不断往后扫,当扫到有一个窗口包含了所有T的字符后, 然后再收缩头指针,直到不能再收缩为止。最后记录所有可能的情况中窗口最小的

// https://leetcode.com/discuss/10337/accepted-o-n-solution // Accepted O(n) solution

// 循环内是通过判断窗口中已经有T中字符的个数来进行 add or delete // 12ms class Solution { public: string minWindow(string S, string T) { int n1 = S.length(), n2 = T.length(); if (n1 == 0 || n2 == 0) return "";

    int count = T.size();
    int require[128] = {0};
    bool IsInPattern[128] = {false};

    for (int i = 0; i < count; ++i){
        require[T[i]]++;
        IsInPattern[T[i]] = true;
    }
    // use i to loop the String, j denotes the start index
    // initially j is 0
    int i = -1, j = 0;
    int minIdx = 0, minLen = INT_MAX;
    while (i < n1){ // && j < n1
        if (count > 0){ // count > 0 means we still need to find
            ++i;
            --require[S[i]]; // 遍历到s[i], 先把require[s[i]]减一

            // 注意: 起初,所有不在pattern里面的require都为0,--之后就变成负了
            // 当在pattern里面的某个字符被找到后,其require也为0
            // 只要S[i] 在pattern里面,就说明找到一个
            // 如果依然require[S[i]] >= 0 的话,就说明--require[S[i]]之前是有需求的
            // 这时候就可以--count了
            if (IsInPattern[S[i]] && require[S[i]] >= 0) // 如果s[i]在字典里,require[s[i]] >= 0表示还有再需要的
                --count;
        }
        else{ // count == 0, 表明全找到了
            int currentLen = i - j + 1;
            if (currentLen < minLen){
                minLen = currentLen;
                minIdx = j;
            }

            require[S[j]]++; // 去掉头部
            if (IsInPattern[S[j]] && require[S[j]] > 0) // shrink
                count++;
            j++;
        }
    }
    if (minLen == INT_MAX) return "";
    return S.substr(minIdx, minLen);
}

};

思路: 可以利用两个指针扫描(两个指针分别为start,i),以S = “e b a d b a c c b”(忽略空格),T = “abc”为例:

初始化 start = i = 0 i 逐渐往后扫描S直到窗口S[start…i]包含所有T的字符,此时i = 6(字符c的位置) 缩减窗口:此时我们注意到窗口并不是最小的,需要调整 start 来缩减窗口。缩减规则为:如果S[start]不在T中 或者 S[start]在T中但是删除后窗口依然可以包含T中的所有字符,那么start = start+1, 直到不满足上述两个缩减规则。缩减后i即为窗口的起始位置,此例中从e开始窗口中要依次删掉e、b、a、d,start最后等于4 ,那么得到一个窗口大小为6-4+1 = 3 start = start+1(此时窗口肯定不会包含所有的T中的字符),跳转到步骤2继续寻找下一个窗口。本例中还以找到一个窗口start = 5,i = 8,比上个窗口大,因此最终的最小窗口是S[4…6] 具体实现时,要用哈希表来映射T中字符以便在O(1)时间内判断一个字符是否在T中,由于是字符缘故,可以用数组简单的来实现;还需要一个哈希表来记录扫描时T中的某个字符在S中出现的次数,也可以用数组实现

// 16ms class Solution { public: string minWindow(string S, string T) { int n1 = S.size(), n2 = T.size(); int srcCnt[256] = {0}; // T中每个字母的个数 int foundCnt[256] = {0}; // 当前找到T中每个字母的个数 for(int i = 0; i < n2; i++) srcCnt[T[i]]++; int hasFound = 0; // 已经找到的字母数目 int winStart = -1, winEnd = n1; // 最终的窗口的左右边界 //两个指针start和i一起扫描 for(int i = 0, start = 0; i < n1; i++) if(srcCnt[S[i]] != 0){ // char in dict foundCnt[S[i]]++; if(foundCnt[S[i]] <= srcCnt[S[i]]) ++hasFound; if(hasFound == n2){ // 找到了一个满足的窗口 // 如果窗口最左端元素不在dict中或者找到的它的数量大于在dict中的数量 // 则直接把其去掉,并且把找到的它的数量减1 while(srcCnt[S[start]] == 0 || foundCnt[S[start]] > srcCnt[S[start]]){ if(srcCnt[S[start]] != 0) foundCnt[S[start]]--; start++; } if(winEnd - winStart > i - start){ winStart = start; winEnd = i; }

                // 这个时候去掉的窗口首元素是在dict中的, 并且找到的数量也等于dict中的数量
                foundCnt[S[start]]--;
                start++;
                hasFound--;
            }
        }
    return winStart != -1 ? S.substr(winStart, winEnd - winStart + 1) : "";
}

};

小小改进:注意到上述步骤3中缩减窗口时start要跳过不在T中的字符,如果S = “eeeeeeeeebadbaccb”(S的前面有大量的字符e不在T中),T = “abc”, 这个跳转会很费时,如果可以在第2步i扫描的过程中保存好T中字符在S中出现的位置,那么我们在缩减窗口时就不需要跳过例子中大量的e, 只需要跳过b、a这些在T中存在但是不影响窗口的字符。这个可以用辅助队列来实现。

// 20 ms class Solution { public: string minWindow(string S, string T) { int lens = S.size(), lent = T.size(); queue Q; int srcCnt[256] = {0}; // T中每个字母的个数 int foundCnt[256] = {0}; // 当前找到T中每个字母的个数 for(int i = 0; i < lent; i++) srcCnt[T[i]]++; int hasFound = 0; // 已经找到的字母数目 int winStart = -1, winEnd = lens; // 窗口的左右边界 for(int i = 0; i < lens; i++) if(srcCnt[S[i]] != 0){ Q.push(i); foundCnt[S[i]]++; if(foundCnt[S[i]] <= srcCnt[S[i]])hasFound++; if(hasFound == lent){ // 找到了一个满足的窗口 int k; do{ //缩减窗口到最小 k = Q.front(); Q.pop(); foundCnt[S[k]]--; } while(srcCnt[S[k]] <= foundCnt[S[k]]); if(winEnd - winStart > i - k){ winStart = k; winEnd = i; } hasFound--; } } return winStart != -1 ? S.substr(winStart, winEnd - winStart +1) : ""; } };

// http://articles.leetcode.com/2010/11/finding-minimum-window-in-s-which.html Best solution: The best solution, is in fact simpler. This best approach is suggested by someone who called stormrage . Notice how complicated the above solution is. It uses a hash table, a queue, and a sorted map. During an interview, the problems tend to be short and the solution usually can be coded in about 50 lines of code. So be sure that you say out loud what you are thinking and keep communication opened with the interviewer. Check if your approach is unnecessary complex, he/she might be able to give you guidance. The last thing you want to do is to get stuck in a corner and keep silent. To help illustrate this approach, I use a different example: S = “acbbaca” and T = “aba“. The idea is mainly based on the help of two pointers (begin and end position of the window) and two tables (needToFind and hasFound) while traversing S. needToFind stores the total count of a character in T and hasFound stores the total count of a character met so far. We also use a count variable to store the total characters in T that’s met so far (not counting characters where hasFound[x] exceeds needToFind[x]). When count equals T‘s length, we know a valid window is found. Each time we advance the end pointer (pointing to an element x), we increment hasFound[x] by one. We also increment count by one if hasFound[x] is less than or equal to needToFind[x]. Why? When the constraint is met (that is, count equals to T‘s size), we immediately advance begin pointer as far right as possible while maintaining the constraint. How do we check if it is maintaining the constraint? Assume that begin points to an element x, we check if hasFound[x] is greater than needToFind[x]. If it is, we can decrement hasFound[x] by one and advancing begin pointer without breaking the constraint. On the other hand, if it is not, we stop immediately as advancing begin pointer breaks the window constraint. Finally, we check if the minimum window length is less than the current minimum. Update the current minimum if a new minimum is found. Essentially, the algorithm finds the first window that satisfies the constraint, then continue maintaining the constraint throughout.

// Returns false if no valid window is found. Else returns // true and updates minWindowBegin and minWindowEnd with the // starting and ending position of the minimum window. bool minWindow(const char* S, const char *T, int &minWindowBegin, int &minWindowEnd) { int sLen = strlen(S); int tLen = strlen(T); int needToFind[256] = {0};

for (int i = 0; i < tLen; i++) needToFind[T[i]]++;

int hasFound[256] = {0}; int minWindowLen = INT_MAX; int count = 0; for (int begin = 0, end = 0; end < sLen; end++) { // skip characters not in T if (needToFind[S[end]] == 0) continue; hasFound[S[end]]++; if (hasFound[S[end]] <= needToFind[S[end]]) count++;

// if window constraint is satisfied
if (count == tLen) {
  // advance begin index as far right as possible,
  // stop when advancing breaks window constraint.
  while (needToFind[S[begin]] == 0 ||
        hasFound[S[begin]] > needToFind[S[begin]]) {
    if (hasFound[S[begin]] > needToFind[S[begin]])
      hasFound[S[begin]]--;
    begin++;
  }

  // update minWindow if a minimum length is met
  int windowLen = end - begin + 1;
  if (windowLen < minWindowLen) {
    minWindowBegin = begin;
    minWindowEnd = end;
    minWindowLen = windowLen;
  } // end if
} // end if

} // end for

return (count == tLen) ? true : false; }