Group Anagrams - Getting Started / Warm Up

“Each group of anagrams should be in alphabetical order. The output should be in alphabetical order by the first elements of each group of anagrams.”

For both solutions in Java and Python, the driver code that isn’t part of the solution diff is doing the heavy lifting of sorting the inner lists followed by the outer list of lists. For the sake of completion, shouldn’t that be part of the code required to solve this problem?

Can actually do this in O(n) time if the language you do it in has a hashable Dict or Set (like Python’s frozenset) and a strong hash function. Iterate each string. For each string iterate each letter. Tally the occurrence of each letter to generate a “Thumbprint” for the string. Use the “Thumbprint” as the key in a dict for the outer loop. The value corresponding to the “Thumbprint” is the list of Anagrams that they want you to return. At the end of the loop, just dump the values in the dict.

Reply to myself, think the actual time is O(n * m) since the max string length can vary separately than the length of the list of strings. This is still less than O(n * m log m)

d = {“”.join(sorted(strs[_])):[] for _ in range(len(strs))}
for index, value in enumerate(strs):
d[“”.join(sorted(value))].append(value)
return list(d.values())

How to calculate time complexity in this kind of questions?

wish there was solutions in multiple languages

C++ doesn’t compile unless the const is removed from line 35.
I was going insane assuming it was something I did, but could not see what. Cleared my code and ran the base code, compile error. Very frustrating.

Hi, thanks for the suggestion. we are adding solutions in other language :slight_smile:

For Python solution: I see the driver code does the sorting (which I think should have been my task). But returning the dictionary values, can we guarantee that the order of insertion of these key-value pair will be maintained? Theoretically, dictionary is an unordered collection, insertion order is not maintained, no?

You can do O(n) if you bucket sort.
Letters are a limited range. ASCII code of lowerCase letters goes from 97 - 122. So you can create an array of 25 buckets. Then iterate each word, grab first letter, take note of index and add to list of buckets.
You can do the same for each individual word

here is my Javascript code:

const groupAnagrams = (strs) => {
    // sort main array by first letter
    const sortedWords = sortByFirstLetter(strs);
    
    // build anagram key array
    const anagramKeys = getAnagramKeys(sortedWords);
    
    // build anagram map
    const anagramMap = getAnagramMap(anagramKeys);
        
    // populate anagram map (group words by anagram) and get number of groups
    const groups = populateAnagramMap(sortedWords, anagramKeys, anagramMap);
    
    // get results array
    const results = getResults(anagramMap, groups);
    
    return results;
}

const getResults = (anagramMap, groups) => {
    const results = new Array(groups + 1).fill('');
    
    anagramMap.forEach((groupObject) => {
        const index = groupObject.get('group');
        const words = groupObject.get('words');
        results[index] = [...words];
    });
    return results;
};

const populateAnagramMap = (sortedWords, anagramKeys, anagramMap) => {
    let groups = -1;
    for(let i = 0; i < sortedWords.length; i++) {
        const word = sortedWords[i];
        const anagramKey = anagramKeys[i];
        const groupObject = anagramMap.get(anagramKey);
        groupObject.get('words').push(word);
        if(groupObject.get('group') === - 1) {
            groups += 1;
            groupObject.set('group', groups);
        }
    }
    
    return groups;
};

const getAnagramMap = (anagramKeys) => {
    const anagramMap = new Map();
    for(const key of anagramKeys) {
        if(anagramMap.has(key)) continue;
        const groupObject = new Map();
        groupObject.set('group', -1);
        groupObject.set('words', []);

        anagramMap.set(key, groupObject);
    }
    
    return anagramMap;
};

const getAnagramKeys = (words) => {
    const result = [];
    for(const word of words) {
        const anagramKey = getAnagramKey(word);
        result.push(anagramKey);
    }
    
    return result;
};

const getAnagramKey = (word) => { // tea ['t', 'e', 'a'] --> ['a', 'e' ,'t' ] join
    const arrWord = word.split('');
    const sortedWord = sortByFirstLetter(arrWord);
    return sortedWord.join('');
};

const sortByFirstLetter = (words) => {
    const buckets = new Array('z'.charCodeAt(0) - 'a'.charCodeAt(0))
        .fill('')
        .map(bucket => []);
    const offset = 'a'.charCodeAt(0);
    
    words.forEach((word, index) => {
        const firstLetterCode = word.charCodeAt(0);
        buckets[firstLetterCode - offset].push(index);
    });
    
    const result = [];
    for(const bucket of buckets) {
        if(bucket.length > 0) {
            for(const index of bucket) {
                result.push(words[index]);
            }
        }
    }
    
    return result;
};

The C# boilerplate doesn’t compile. And it looks like the C# version is less than 7.0, because it doesn’t appear to support Tuples.

It’d be great to have these problems linked to LeetCode (test cases). For example, my code runs and passes all the test cases here but the same code (correctly) gets Time Limit Exception in Leetcode. Until that happened, I simply thought my code was just fine and it was just another way of solving the problem. Which was not. Even the solution code here didn’t trigger any doubts for my code.