Skip to content

Leetcode 271 - Encode and Decode Strings

Difficulty: medium

Table of contents

Open Table of contents

Problem

Design an algorithm to encode a list of strings to a string. The encoded string is then sent over the network and is decoded back to the original list of strings.

Example 1:

Input: dummy_input = ["Hello","World"]
Output: ["Hello","World"]

Example 2:

Input: dummy_input = [""]
Output: [""]

Constraints:

Follow up: Could you write a generalized algorithm to work on any possible set of characters?

Explanation/Approach

The challenge is to encode and decode strings in a way that we can uniquely identify each string after transmission. To handle any possible set of characters, including delimiters, we can use a length-prefixed encoding.

Length-Prefixed Encoding:

This approach handles any set of characters, including those that could be confused with delimiters in other encoding schemes.

Solution: Length-Prefixed Encoding

class Codec:
    def encode(self, strs):
        """Encodes a list of strings to a single string."""
        encoded = []
        for s in strs:
            encoded.append(str(len(s)) + "#" + s)
        return ''.join(encoded)

    def decode(self, s):
        """Decodes a single string to a list of strings."""
        strs = []
        i = 0
        while i < len(s):
            j = i
            while s[j] != '#':
                j += 1
            length = int(s[i:j])
            strs.append(s[j + 1: j + 1 + length])
            i = j + 1 + length
        return strs

Time and Memory Complexity

The time complexity for both encoding and decoding is O(n), where n is the total number of characters in all strings. The space complexity is also O(n) as we need to store the encoded string or the list of strings.

Explanation

This solution uses length-prefixing to encode each string, ensuring that even strings containing the delimiter (#) can be correctly encoded and decoded. During decoding, the algorithm locates each #, determines the length of the next string, and extracts it accordingly. This method is robust for any set of ASCII characters.

Visualization Process:

Encoding:
Input: ["Hello", "World"]
Encoded: "5#Hello5#World"

Decoding:
Encoded: "5#Hello5#World"
Output: ["Hello", "World"]

Each string is prefixed with its length and a # delimiter. During decoding, these prefixes are used to accurately extract each original string, ensuring that the decoded list of strings matches the original list.