Substring with Concatenation of All Words - Java Solution

1. Introduction

This blog post focuses on solving the 'Substring with Concatenation of All Words' problem, a complex string manipulation challenge. It combines elements of string searching, hashing, and combinatorics. The task is to find the starting indices of all substrings in a given string 's' that are a concatenation of all possible permutations of an array of words.

Problem

Given a string 's' and an array of strings 'words', where all strings in 'words' are of the same length, we need to find all starting indices of substrings in 's' that are a concatenation of any permutation of 'words'. The result should include the starting indices of all such concatenated substrings in any order.

2. Solution Steps

1. Calculate the length of each word and the total length of all words combined.

2. Use a hashmap to store the frequency of each word in the 'words' array.

3. Iterate through 's' with a window of the combined length of all words.

4. Inside each window, split the substring into words of equal length and use a hashmap to store their frequencies.

5. Compare the two hashmaps; if they match, add the starting index of the window to the result list.

6. Return the list of starting indices.

3. Code Program

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SubstringWithConcatenation {
    public static void main(String[] args) {
        String s = "barfoothefoobarman";
        String[] words = {"foo", "bar"};
        System.out.println(findSubstring(s, words)); // Test the function
    }

    // Function to find starting indices of all concatenated substrings
    public static List<Integer> findSubstring(String s, String[] words) {
        List<Integer> result = new ArrayList<>();
        Map<String, Integer> wordCount = new HashMap<>();
        int wordLen = words[0].length();
        int allWordsLen = wordLen * words.length;

        // Count the frequency of each word in the 'words' array
        for (String word : words) {
            wordCount.put(word, wordCount.getOrDefault(word, 0) + 1);
        }

        // Iterate through 's' with a window of the combined length of all words
        for (int i = 0; i <= s.length() - allWordsLen; i++) {
            Map<String, Integer> seen = new HashMap<>();
            int j = 0;
            // Check each word in the current window
            while (j < words.length) {
                String word = s.substring(i + j * wordLen, i + (j + 1) * wordLen);
                seen.put(word, seen.getOrDefault(word, 0) + 1);

                // Break if the word is not found or its frequency is too high
                if (!wordCount.containsKey(word) || seen.get(word) > wordCount.get(word)) break;

                j++;
            }
            // If all words are found, add the start index to the result
            if (j == words.length) result.add(i);
        }

        return result;
    }
}

Output:

[0, 9]

Explanation:

1. findSubstring: This function identifies starting indices of concatenated substrings in 's' that match any permutation of 'words'.

2. It uses a hashmap to count the frequency of each word in 'words'.

3. The function then iterates through 's' with a window equal to the combined length of all words.

4. Within each window, it checks if the substring can be divided into words found in 'words', keeping track of their frequencies.

5. If the frequencies in the current window match those in 'words', the starting index of the window is added to the result.

6. The function returns a list of starting indices where the concatenated substrings are found.


Comments