Longest Substring with At Most Two Distinct Characters - C Solution

1. Introduction

This blog post delves into a fascinating string processing challenge in C programming: finding the longest substring that contains at most two distinct characters. This problem is a great exercise in understanding window-sliding techniques and hash table implementations, which are essential for efficient string manipulation and pattern recognition algorithms.

Problem

The task is to write a C program that finds the length of the longest substring containing at most two distinct characters from a given string 's'. The challenge involves efficiently scanning the string while keeping track of character frequencies and substring boundaries.

2. Solution Steps

1. Use a sliding window approach to scan through the string.

2. Keep track of the frequency of each character within the window using a hash table.

3. Expand the window while there are no more than two distinct characters.

4. Shrink the window from the left when a third distinct character is added.

5. Update the maximum length whenever a valid window is found.

3. Code Program

#include <stdio.h>
#include <string.h>

#define MAX_CHARS 256 // Maximum number of ASCII characters

// Function to find the length of the longest substring with at most two distinct characters
int lengthOfLongestSubstringTwoDistinct(char *s) {
    int n = strlen(s);
    if (n < 3) return n; // If string is too short, return its length

    int freq[MAX_CHARS] = {0}; // Frequency table for each character
    int start = 0, end = 0; // Start and end of the sliding window
    int maxLen = 2; // Maximum length of substring
    int distinctCount = 0; // Count of distinct characters in the current window

    while (end < n) {
        if (freq[s[end]] == 0) distinctCount++; // New distinct character
        freq[s[end]]++; // Increment frequency of current character
        end++; // Expand the window

        while (distinctCount > 2) { // More than two distinct characters
            freq[s[start]]--; // Decrement frequency of character at start
            if (freq[s[start]] == 0) distinctCount--; // One less distinct character
            start++; // Shrink the window
        }

        maxLen = (end - start > maxLen) ? end - start : maxLen; // Update maximum length
    }

    return maxLen; // Return the maximum length found
}

int main() {
    char s[] = "eceba"; // Example string
    int result = lengthOfLongestSubstringTwoDistinct(s); // Call the function
    printf("Output: %d\n", result); // Print the result

    return 0;
}

Output:

3

Explanation:

The program uses a sliding window approach to find the longest substring with at most two distinct characters in the string "eceba". It maintains a frequency table to track the number of distinct characters in the current window. The window is expanded until a third distinct character is encountered. At that point, the window is shrunk from the left until there are only two distinct characters left. The length of each valid window is compared with the maximum length found so far, and the maximum is updated accordingly. In this example, the longest such substring is "ece", with a length of 3, hence the output is 3.


Comments