Bytes vs. Runes

Shiraz Khan

Senior Software Engineer

26 January 2025 / 5 min read

Bytes and runes can be confusing when you are starting out with Go, especially if you come from a programming language that has no such type aliases.

You already encountered them when learning about Numbers and Strings in Go, but let’s have a deeper look and see the differences:

Byte

Definition

Declaration

var b byte = 65 // 65 is the ASCII code for 'A'

Underlying Representation

Usage

Key Points

Rune

Definition

Declaration

var r rune = 'A' // 'A' is a Unicode code point with value 65
var r2 rune = '世' // '世' is a Unicode code point with value 19990

Underlying Representation

Usage

Key Points

Differences Between Byte and Rune

| Feature | Byte (byte) | Rune (rune) |
| --- | --- | --- |
| Type Alias | Alias for uint8 | Alias for int32 |
| Size | 8 bits (1 byte) | 32 bits (4 bytes) |
| Range | 0 to 255 | 0 to 0x10FFFF (Unicode range) |
| Purpose | Represents ASCII characters or raw data | Represents Unicode code points |
| Character Handling | Limited to ASCII | Supports all Unicode characters |
| Memory Usage | More memory-efficient for ASCII | Requires more memory for Unicode |
| String Iteration | Treats strings as a sequence of bytes | Treats strings as a sequence of runes |
| Common Use Cases | Binary data, ASCII strings | Multi-language text, emojis, symbols |
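
To make the size and memory rows concrete, here is a minimal sketch. The helper name storageBytes is hypothetical (it is not part of any library); it compares how many bytes a piece of text occupies as a UTF-8 string with how many it would occupy as a []rune.

```go
package main

import (
	"fmt"
	"unicode/utf8"
	"unsafe"
)

// storageBytes is a hypothetical helper: it returns how many bytes s occupies
// as a UTF-8 string and how many it would occupy if stored as a []rune.
func storageBytes(s string) (asUTF8, asRunes int) {
	runeSize := int(unsafe.Sizeof(rune(0))) // a rune is an int32, so 4 bytes
	return len(s), runeSize * utf8.RuneCountInString(s)
}

func main() {
	fmt.Println(unsafe.Sizeof(byte(0)), unsafe.Sizeof(rune(0))) // 1 4
	fmt.Printf("%#x\n", utf8.MaxRune)                           // 0x10ffff
	fmt.Println(storageBytes("Hello"))                          // 5 20
	fmt.Println(storageBytes("世界"))                             // 6 8
}
```

For ASCII text the []rune form is four times larger, while for text made of three-byte characters the gap is much smaller.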

Practical implications

String Representation

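In Go, a string is essentially a read-only sequence of bytes, conventionally holding UTF-8 text. Indexing a string (str[i]) and calling len(str) both work on those bytes. However, when you iterate over a string with for ... range, Go decodes the UTF-8 on the fly and yields one rune per iteration, so multi-byte Unicode characters are handled correctly.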
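
As a rough illustration of that difference, the sketch below counts bytes and runes in the same string and prints the byte offset at which each rune starts. The function name byteAndRuneCounts is made up for this example.

```go
package main

import "fmt"

// byteAndRuneCounts is a hypothetical helper that returns how many bytes
// and how many runes s contains.
func byteAndRuneCounts(s string) (nBytes, nRunes int) {
	nBytes = len(s) // len counts bytes, not characters
	for range s {   // range decodes one UTF-8 sequence per iteration
		nRunes++
	}
	return nBytes, nRunes
}

func main() {
	s := "héllo"             // 'é' takes 2 bytes in UTF-8
	fmt.Printf("%T\n", s[1]) // uint8: indexing yields a single byte

	for i, r := range s {
		// i is the byte offset where the rune starts, r is the decoded rune.
		fmt.Printf("offset %d: %c (%T)\n", i, r, r)
	}

	nBytes, nRunes := byteAndRuneCounts(s)
	fmt.Println(nBytes, nRunes) // 6 5
}
```

Note that the offsets printed by the loop skip from 1 to 3, because 'é' occupies two bytes.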

Conversion Between Byte and Rune

You can convert between byte and rune, but be cautious:
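
A small sketch of what that caution means in practice: converting a byte to a rune always preserves the value, while converting a rune to a byte silently keeps only the low 8 bits. The guard function fitsInByte is a hypothetical helper written for this example.

```go
package main

import "fmt"

// fitsInByte is a hypothetical guard: it reports whether r can be converted
// to a byte without losing information.
func fitsInByte(r rune) bool {
	return r >= 0 && r <= 0xFF
}

func main() {
	var b byte = 'A'
	r := rune(b)   // widening: the value is always preserved
	fmt.Println(r) // 65

	wide := '世'          // rune with value 19990 (0x4E16)
	narrow := byte(wide) // narrowing: only the low 8 bits (0x16) survive
	fmt.Println(narrow)  // 22

	fmt.Println(fitsInByte(wide), fitsInByte('A')) // false true
}
```

Also keep in mind that a byte taken from the middle of a multi-byte UTF-8 sequence is not a character on its own, so rune(str[i]) only gives a meaningful result for ASCII text.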

Example: Byte vs Rune in String Iteration

str := "Hello, 世界"

// Iterating as bytes (may break multi-byte characters)
for i := 0; i < len(str); i++ {
    fmt.Printf("%c ", str[i]) // May print garbage for multi-byte characters
}

// Iterating as runes (correctly handles Unicode)
for _, r := range str {
    fmt.Printf("%c ", r) // Prints each character correctly
}

You need multiple bytes to represent what a single rune can hold. UTF-8 encodes each code point in one to four bytes: ASCII characters take one byte, while characters such as '世' or '😊' take three or four. A rune is a 32-bit value, so it can hold any Unicode code point in one piece, even when that code point requires several bytes in its UTF-8 representation.

Example:

str := "😊"
fmt.Println(len(str))         // Output: 4 (bytes)
fmt.Println(len([]rune(str))) // Output: 1 (rune)

This demonstrates that the emoji '😊' is represented by 4 bytes but only 1 rune.
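
The same idea can be extended to whole strings with the standard unicode/utf8 package. The following is a minimal sketch (encodedLengths is a name chosen for this example) that reports how many UTF-8 bytes each rune in a string needs.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// encodedLengths is a hypothetical helper that returns the number of UTF-8
// bytes used by each rune in s.
func encodedLengths(s string) []int {
	lengths := make([]int, 0, utf8.RuneCountInString(s))
	for _, r := range s {
		lengths = append(lengths, utf8.RuneLen(r))
	}
	return lengths
}

func main() {
	fmt.Println(encodedLengths("Aé世😊"))        // [1 2 3 4]
	fmt.Println(utf8.RuneCountInString("😊")) // 1
}
```

utf8.RuneCountInString gives the same count as len([]rune(str)) without building a rune slice first.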

When to use byte & rune vs. uint8 & int32

The choice between using byte and rune versus uint8 and int32 in Go depends on the semantic meaning you want to convey in your code. While byte and rune are aliases for uint8 and int32, respectively, they are used in different contexts to make your code more readable and expressive.

| Context | Use byte or rune | Use uint8 or int32 |
| --- | --- | --- |
| Text/Character Handling | Use byte for ASCII, rune for Unicode | Not appropriate |
| Binary Data | Use byte for raw binary data | Not appropriate |
| Numeric Data | Not appropriate | Use uint8 or int32 for pure numbers |
| Semantic Clarity | Use byte/rune for text/binary contexts | Use uint8/int32 for numeric contexts |
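
Because these are aliases, the compiler treats byte and uint8 (and rune and int32) as the same type, so the choice is purely about what the name tells the reader. The sketch below is one way to show this; checksum is a hypothetical function invented for the example, not a real checksum algorithm you should rely on.

```go
package main

import "fmt"

// checksum is a hypothetical example function. Its input is raw data, so the
// parameter is spelled []byte; its result is a plain number, so it is uint8.
func checksum(data []byte) uint8 {
	var sum uint8
	for _, b := range data {
		sum += b // no conversion needed: byte and uint8 are the same type
	}
	return sum // wraps around on overflow, like any uint8 arithmetic
}

func main() {
	var n uint8 = 200
	var b byte = n             // compiles without a conversion
	var r rune = int32(0x4E16) // likewise for rune and int32
	fmt.Println(b, string(r))  // 200 世

	fmt.Println(checksum([]byte("Go"))) // 182
}
```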