分类
C++ Develop 算法

LeetCode题目:Scramble String,三维动态规划

一开始拿到这个题的时候没什么想法,浆糊了之后立马百度之,才有了思路。
简单的说,就是s1和s2是scramble的话,那么必然存在一个在s1上的长度l1,将s1分成s11和s12两段,同样有s21和s22。
那么要么s11和s21是scramble的并且s12和s22是scramble的;
要么s11和s22是scramble的并且s12和s21是scramble的。
先用递归写了一个算法,但是大集合不过,然后用三维动态规划才搞定。



题目描述Scramble String
Given a string s1, we may represent it as a binary tree by partitioning it to two non-empty substrings recursively.
Below is one possible representation of s1 = “great”:

    great
   /    \
  gr    eat
 / \    /  \
g   r  e   at
           / \
          a   t

To scramble the string, we may choose any non-leaf node and swap its two children.
For example, if we choose the node “gr” and swap its two children, it produces a scrambled string “rgeat”.

    rgeat
   /    \
  rg    eat
 / \    /  \
r   g  e   at
           / \
          a   t

We say that “rgeat” is a scrambled string of “great”.
Similarly, if we continue to swap the children of nodes “eat” and “at”, it produces a scrambled string “rgtae”.

    rgtae
   /    \
  rg    tae
 / \    /  \
r   g  ta  e
       / \
      t   a

We say that “rgtae” is a scrambled string of “great”.
Given two strings s1 and s2 of the same length, determine if s2 is a scrambled string of s1.



代码:三维动态规划,48ms过大集合

    bool isScrambleDP(string s1, string s2) {
        if(s1.size() != s2.size()) return false;
        if(s1.size() == 0) return true;
        if(s1 == s2) return true;
        bool ***iss = NULL;//iss[len][startIndexAtS1][startIndexAtS2]
        iss = new bool**[s1.size()];
        for(int len = 1;len <= s1.size(); ++len) {
            //size at this level
            int levelSize = s1.size() - len + 1;
            int levelIndex = len - 1;
            iss[levelIndex] = new bool*[levelSize];
            for(int indexS1 = 0;indexS1 < levelSize; ++ indexS1) {
                iss[levelIndex][indexS1] = new bool[levelSize];
                for(int is2 = 0; is2 < levelSize; ++is2) {
                    if(len == 1) {
                        iss[levelIndex][indexS1][is2] = (s1[indexS1] == s2[is2]);
                    } else {
                        iss[levelIndex][indexS1][is2] = false;
                        for(int seglen1 = 1; seglen1 < len; ++seglen1) {
                            int seglen2 = len - seglen1;
                            int sli1 = seglen1 - 1;
                            int sli2 = seglen2 - 1;
                            if(iss[sli1][indexS1][is2] && iss[sli2][indexS1 + seglen1][is2 + seglen1]) {
                                iss[levelIndex][indexS1][is2] = true;
                                break;
                            }
                            if(iss[sli1][indexS1][is2 + seglen2] && iss[sli2][indexS1 + seglen1][is2]) {
                                iss[levelIndex][indexS1][is2] = true;
                                break;
                            }
                        }
                    }
                }
            }
        }
        return iss[s1.size() - 1][0][0];
    }



递归代码,小集合12ms过,大集合过不了,因为时间复杂度是O(3n)

class Solution {
public:
    bool isScramble(string s1, string s2) {
        // Start typing your C/C++ solution below
        // DO NOT write int main() function
        if(s1.size() != s2.size()) return false;
        if(s1 == s2) return true;
        for(int isep = 1; isep < s1.size(); ++ isep) {
            //seporate s1 as [0,isep - 1],[isep, s1.size() - 1]
            string seg11 = s1.substr(0,isep);
            string seg12 = s1.substr(isep);
            {//see if forward order is ok
                string seg21 = s2.substr(0,isep);
                string seg22 = s2.substr(isep);
                if(isScramble(seg11,seg21) && isScramble(seg12,seg22)) return true;
            }
            {//see if reverse order is ok
                string seg21 = s2.substr(s2.size() - isep);
                string seg22 = s2.substr(0,s2.size() - isep);
                if(isScramble(seg11,seg21) && isScramble(seg12,seg22)) return true;
            }
        }
        return false;
    }
};



Recursive code rewrite at 2013-2-5, 48ms pass large set

class Solution {
public:
    bool isScramble(string s1, string s2) {
        // Start typing your C/C++ solution below
        // DO NOT write int main() function
        if(s1.size() != s2.size()) return false;
        if(s1 == s2) return true;
        string ts1 = s1,ts2 = s2;
        sort(ts1.begin(), ts1.end());
        sort(ts2.begin(), ts2.end());
        if(ts1 != ts2) return false;
        for(int isep = 1; isep < s1.size(); ++ isep) {
            //seporate s1 as [0,isep - 1],[isep, s1.size() - 1]
            string seg11 = s1.substr(0,isep);
            string seg12 = s1.substr(isep);
            {//see if forward order is ok
                string seg21 = s2.substr(0,isep);
                string seg22 = s2.substr(isep);
                if(isScramble(seg11,seg21) && isScramble(seg12,seg22)) return true;
            }
            {//see if reverse order is ok
                string seg21 = s2.substr(s2.size() - isep);
                string seg22 = s2.substr(0,s2.size() - isep);
                if(isScramble(seg11,seg21) && isScramble(seg12,seg22)) return true;
            }
        }
        return false;
    }
};

“LeetCode题目:Scramble String,三维动态规划”上的23条回复

[…] 这道题定义了一种爬行字符串,就是说假如把一个字符串当做一个二叉树的根,然后它的非空子字符串是它的子节点,然后交换某个子字符串的两个子节点,重新爬行回去形成一个新的字符串,这个新字符串和原来的字符串互为爬行字符串。这道题可以用递归Recursion或是动态规划Dynamic Programming来做,我们先来看递归的解法,参见网友uniEagle的博客,简单的说,就是s1和s2是scramble的话,那么必然存在一个在s1上的长度l1,将s1分成s11和s12两段,同样有s21和s22.那么要么s11和s21是scramble的并且s12和s22是scramble的;要么s11和s22是scramble的并且s12和s21是scramble的。就拿题目中的例子 rgeat 和 great 来说,rgeat 可分成 rg 和 eat 两段, great 可分成 gr 和 eat 两段,rg 和 gr 是scrambled的, eat 和 eat 当然是scrambled。根据这点,我们可以写出代码如下: […]

[…] 这道题定义了一种爬行字符串,就是说假如把一个字符串当做一个二叉树的根,然后它的非空子字符串是它的子节点,然后交换某个子字符串的两个子节点,重新爬行回去形成一个新的字符串,这个新字符串和原来的字符串互为爬行字符串。这道题可以用递归Recursion或是动态规划Dynamic Programming来做,我们先来看递归的解法,参见网友uniEagle的博客,简单的说,就是s1和s2是scramble的话,那么必然存在一个在s1上的长度l1,将s1分成s11和s12两段,同样有s21和s22.那么要么s11和s21是scramble的并且s12和s22是scramble的;要么s11和s22是scramble的并且s12和s21是scramble的。就拿题目中的例子 rgeat 和 great 来说,rgeat 可分成 rg 和 eat 两段, great 可分成 gr 和 eat 两段,rg 和 gr 是scrambled的, eat 和 eat 当然是scrambled。根据这点,我们可以写出代码如下: […]

看了你的不少文章,感觉收获良多!只是有点小问题想请教:按照我的理解,那个递归算法在最差情况下应该是O(3^n),而非O(n^2)。理由是:假设函数运行时间为f(n),那么由于在每次函数调用中都要考虑1~n之间的所有长度,并且正反都要检查,所以有
f(n) = 2[f(1) + f(n-1)] +2[f(2) + f(n-2)] … + 2[f(n/2) + f(n/2+1)]. 易推得f(n+1) = 3(fn), 故f(n) = O(3^n)。当然这是最差情况下的时间复杂度。那么你提到的O(n^2),是否是通过其他数学方法得到的更tight的上限?欢迎探讨!

抱歉,我刚才又想了一下。我之前的推算过程可能不太对。因为循环是从1到s1.size()-1, 总共进行了n – 1次。每次循环有4次函数调用:第一次循环的时间是2[f(1) + f(n-1)], 第二次循环是2[f(2) + f(n-2)] , … 第n/2次循环是2[f(n/2) + f(n/2)] (我之前的推理到此就结束了,因为忘了后面还有n/2次循环)… 第n-1次循环是2[f(n-1) + f(1)] (和第一次循环的时间是一样的)。所以应该是 f(n) = 4[f(1) + f(2) +f(3) … + f(n-1)] (总共4(n-1)项),即O(5^n). 虽然似乎没有完全满足这种情况的例子,因为在if语句里如果第一个条件不满足后一个条件也不会被执行了;如果第一条if语句成功返回true第二条if语句也不会执行。但如果用我能想到的最耗时的例子:s1 = “a…a…a”(2N+1个a), s2 = “a…b…a” (N个a,1个b,N个a)。大概只能有1/3~1/2之间的函数调用不会被执行,总时间还是5^n级别。所以,我认为这段代码的最差时间复杂度其实应该是O(5^n)…… 当然这属于细节问题了,我觉得咱的重点还是在强调”未优化的递归算法是指数时间复杂度,所以过不了大数据集”。

f(n-1) = 4[f(1) + … + f(n-2)] ——— (1)
=> f(n) = 4[f(1) + f(2) +f(3) … + f(n-1)] ——- (2)

(1) – (2) => f(n) – f(n-1) = 4f(n-1)
=> f(n) = 5f(n-1)
=> f(n) = [5 ^ n] * f(0)
=> O(f(n)) = O(5^n) //因为f(0)是常量

谷歌最近的一个面试题:
一点头绪都没有,请大家帮帮忙。

Scrambled Texts

Imagine that I have a script that applies a permutation序列 to the characters of a string. Of course,
the permutation is different if the strings are different length, but for strings of the same length,
it always applies the same permutation, chosen at random from among all n! possible permutations.
For example, let’s say that given the 15-character input ABCDEFGHIJKLMNO, it outputs CMLNFAOHDIJKBGE.
Then for any other 15-character string, it would apply the same permutation, so that MYLANGUAGEMODEL
would be scrambled to LDOEGMLAAGEMYUN.

The following are the results of submitting ten different 128-character strings to this script.
Since they are the same length, they have all had the same permutation applied to them.
The original strings in each case are ordinary English text with all spaces, numbers, punctuation标点符号,
and other non-alphabetic characters removed. The strings are not especially interesting.
They were copied or transcribed改编 from obscure sources, and have all been changed slightly from the original
(without intentionally introducing spelling or grammatical errors). Your goal is to recover the original strings
in any way you see fit.

1. IATSNEDBOONANYNOOOTSPROOLEDCCUODYOUFBOAHORRFWWKTHJEHDEENHUTISYNOCTEAFAOFLCSAONRLOIEUCHNATNULAAYDSUDMEIMATADUIMSYRHVOADETMMJTPIIN
2. EOHYEYRHBOESOCTWUSSEVMHYLIDTHNOLEUSLEDAIRLOLEAEWAETEELKNBSEHTSNWNBNNETNMEELEMNHHRTTRIYIARMENCELBGSCIETEOEIEALESRAICETSLSHAELHIEL
3. FSTEDTHGDESTPSSORRINFHNCCEAEEEEUEIARGREATEAHTTIEAEOSITNDHNCEINLUODTNHSXERASTIHITTSESRINTTAWEATAVDOAOOSILIEEHHCPYTNICMTTIITUEVTHO
4. MCDAIERIBROOEDGEUDUIARRLEEENSPEEEIERENCFOETRDVRATSETRSCINLNMNEOENATOOLYBNURLAOOTSEAMEEAAEASECCEKRMCNNECRMSFPELSRRIEXTDHCSRNUDPRA
5. AFTTEEGDOCISGEAHHRKNIEVTNNASEHOHBTTSNOEHAKTOADTHAETSHDIMEICMULOERHEEINUEOAINOTCSETNDSIOEBMVAKGWYYADADNOWTTUNTNMEEBNDAYADRSANSHRN
6. EIERTDMHNMONAENIADLUDLGSALNISIHGLCTBARRNLEEHAREAPONEMTNWBSWEOLWYNHITOGOYEOEETLETBTDLEAHAOWOSTTEIOONRLOSETRINHAHHHWILSMPRYTIEWTSE
7. HLABRSEOIAERRVDTNYAGAROWTILWNOGNAIINURGBROTHGETNDHIWODHINFIGONORRTFOOEHAAGWRESLRYSFENAREIEHLEOENUNRESONRTTWRAANISHTCANITPWMWWTVD
8. STLIKNRTETDEEEIEHFTEPBCTHEAIALIWNTOROTRSIHHUIEAININNRRSEOTNNNETOODNHETSGPLTEXUIHILTIDAUSFHNTOTREITEGMHANSAHGGRRYUIEBMADFRCSEDDTB
9. ILHHHRTDFPBEGOREEIXLWKTWCEDEIABTSGOOANIUFRDHAOFDSNCNHDTWVNTLERLNERASRAAUBVSSEEUPCSTENYDMELOWFTEEBAUEOISEAJOIDIOEEMNPOLGAANIOSTPT
10.ALTANSGHIHRIESSUIONEINKIEGEMYMPTRELTOUBMEFLRHUNBUISOABTDWSSODEEAFDIAIULUDEANSEGYYANPOPREOWOCAOEARVARNHRSLELORWRYTDYTNFFKTYWUTAAO

You should respond to this email within the next few days with the following:

1, Your best guess as to the original strings (if you have one)
2, A description of how you approached the problem, and
3, A description of how much time you spent on this problem, including dead ends, preparatory work, and so on.

It is our expectation that you will spend not more than about 8 hours on this problem. We’d normally like
to see a response within two days of when you read this. If you have scheduling constraints that would make it
inconvenient to start working on this right away, just let us know whether to expect a response and when.
Please note that this is a difficult problem, and it’s possible to impress us without solving it completely,
so don’t be afraid to submit your best work at the end of the time period.

这确实很难想象。
看起来需要一个英文单词的字典dict,做一个搜索:

    foreach(permutation) do
        thisPermutaionIsRight = true;
        foreach(givenString) do
            originString = reverse permutation for givenString
            if (haveUnrecognizedWord in originString according to Dict)
               //try next permutation
               thisPermutaionIsRight = false;
               break;
            end
        end
        if thisPermutaionIsRight do
            the permutation found.
            break;
        end
    end

就像这样,这感觉是在解密呢。

十分感谢,借鉴了你的代码。另外,稍微修改一下细节,performance稍微有点提高,参见这里
建议
1. 查是否有相同的字符时候可以用bucket sort,这一步复杂度从O(n log n)降到O(n).
2. 做了一个helper,递归的时候传引用,避免每次复制字符串。

十分感谢你的代码。

不是三维吧?是四维吧
还有递归也不仅是n^2吧
递归应该还可以加个memorize的hash, 应该比sort要好, 避免重复判断

三维是可以做的,我的代码中声明为:
bool ***iss = NULL;//iss[len][startIndexAtS1][startIndexAtS2]
是三维,当然可以用四维来做:
·s1上的起点,s1上的终点,s2上的起点,s2上的终点,这四维。

你是对的,递归的算法,从代码上来看是超过O(n^2)的。

这个题也是可以用递归过的。楼主在枚举切分位置isep后,不要直接就递归调用,先对s11,s21排序,若排序后两者相等,才有必要进一步递归。同样,s11和s22也应该先排序比较看是不是可行。我用递归做的是40ms过大数据集。

发表评论

电子邮件地址不会被公开。 必填项已用*标注