Strings, bytes, runes and characters in Go
Last Updated: 05 Feb, 2025
In Go, strings are sequences of bytes, not characters. Understanding bytes, runes, and encoding is crucial for handling text correctly. This article explores their differences and key concepts that every developer should know.
1. String
A string in Go is essentially a read-only slice of bytes. Strings are immutable: once a string is created, its bytes cannot be changed in place. Operations that appear to modify a string, such as concatenation or replacement, actually create a new string and leave the original untouched.
Here’s an example of a string:
package main

import "fmt"

func main() {
	var str = "Hello, World!"
	fmt.Println(str) // Output: Hello, World!
}
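To make the immutability point concrete, here is a minimal sketch. The helper name replaceFirstByte is hypothetical and exists only for this illustration; it shows that changing a string means copying its bytes, editing the copy, and building a new string.

```go
package main

import "fmt"

// replaceFirstByte returns a copy of s whose first byte is set to b.
// Hypothetical helper for illustration; it assumes byte-level edits are intended.
func replaceFirstByte(s string, b byte) string {
	if len(s) == 0 {
		return s
	}
	buf := []byte(s) // copies the string's bytes into a new, mutable slice
	buf[0] = b
	return string(buf) // copies again into a new string
}

func main() {
	original := "Hello, World!"
	// original[0] = 'J' // would not compile: cannot assign to original[0]
	changed := replaceFirstByte(original, 'J')
	fmt.Println(original) // Hello, World!
	fmt.Println(changed)  // Jello, World!
}
```

The original string is unchanged after the call, which is what immutability means in practice.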
String Literals vs. Byte Slices
Go source code is UTF-8, so a string literal (enclosed in double quotes) holds UTF-8 encoded text unless it contains byte-level escapes such as \xff. A byte slice, by contrast, is just a collection of arbitrary bytes, which could represent text in any encoding scheme, not necessarily UTF-8.
Example:
// String literal
str := "Hello, World!" // This is a UTF-8 encoded string
// Byte slice
bytes := []byte{72, 101, 108, 108, 111, 44, 32, 87, 111, 114, 108, 100, 33} // Same content, raw bytes
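The difference can be checked at run time. The following sketch uses utf8.Valid and utf8.RuneCount from the standard unicode/utf8 package; the helper name describe and the sample byte values are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe reports whether b is valid UTF-8 and how many runes it decodes to.
// Hypothetical helper; invalid bytes are counted as one rune each.
func describe(b []byte) (valid bool, runes int) {
	return utf8.Valid(b), utf8.RuneCount(b)
}

func main() {
	text := []byte{72, 101, 108, 108, 111, 44, 32, 87, 111, 114, 108, 100, 33}
	raw := []byte{0xff, 0xfe, 0x48} // 0xff and 0xfe never occur in valid UTF-8

	fmt.Println(string(text) == "Hello, World!") // true
	fmt.Println(describe(text))                  // true 13
	fmt.Println(describe(raw))                   // false 3
}
```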
Understanding UTF-8 and String Encoding
Because Go string literals are UTF-8 encoded, each character occupies one or more bytes, depending on its Unicode code point.
UTF-8 Encoding
UTF-8 is a variable-length character encoding that uses one to four bytes for each character. Characters from the ASCII set (U+0000 to U+007F) use a single byte, while characters from other scripts like Chinese or emojis can require multiple bytes.
For example, the character A (U+0041) is represented as the single byte 0x41 in UTF-8. However, a character like ⌘ (U+2318) takes three bytes (e2 8c 98).
package main

import "fmt"

func main() {
	str := "⌘" // Unicode character U+2318 (Place of Interest)
	fmt.Println(len(str)) // Output: 3 because '⌘' is 3 bytes in UTF-8
}
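The one-to-four-byte range can be seen by encoding one sample code point per width. This is a small sketch using utf8.EncodeRune from the standard library; the helper name encode and the choice of sample characters are assumptions for illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// encode returns the UTF-8 bytes for a single code point.
// Hypothetical helper built on utf8.EncodeRune.
func encode(r rune) []byte {
	buf := make([]byte, utf8.UTFMax) // UTFMax is 4, the longest possible encoding
	n := utf8.EncodeRune(buf, r)
	return buf[:n]
}

func main() {
	// One sample per encoded width, written as escapes to avoid ambiguity.
	samples := []rune{'A', '\u00e9', '\u2318', '\U0001F600'}
	for _, r := range samples {
		b := encode(r)
		fmt.Printf("%U takes %d byte(s): % x\n", r, len(b), b)
	}
	// U+0041 takes 1 byte(s): 41
	// U+00E9 takes 2 byte(s): c3 a9
	// U+2318 takes 3 byte(s): e2 8c 98
	// U+1F600 takes 4 byte(s): f0 9f 98 80
}
```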
Why Indexing a String in Go Doesn’t Return a Character
Go strings are slices of bytes, meaning that when you index a string, you get the individual byte values, not the characters. This can be confusing because, in many programming languages, strings are treated as sequences of characters. In Go, however, a character could span more than one byte, as seen in UTF-8 encoded strings.
package main

import "fmt"

func main() {
	str := "⌘"
	// %c treats the byte 0xE2 as the code point U+00E2, so it prints â, not ⌘.
	fmt.Printf("Character at position 0: %c\n", str[0]) // Output: Character at position 0: â
	fmt.Printf("Character at position 0 (byte value): %d\n", str[0]) // Output: Character at position 0 (byte value): 226
}
Here, we see that str[0] returns the first byte (226), but that byte alone doesn't represent the character ⌘, which is a three-byte sequence.
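When the first character of a string is what you actually want, one option is utf8.DecodeRuneInString from the standard library, which decodes a whole UTF-8 sequence and reports its width. The sketch below wraps it in a hypothetical helper named firstRune.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstRune decodes the first UTF-8 sequence in s and returns the rune
// together with its width in bytes. Hypothetical helper for illustration.
func firstRune(s string) (rune, int) {
	return utf8.DecodeRuneInString(s)
}

func main() {
	str := "⌘ key"
	r, size := firstRune(str)
	fmt.Printf("%c occupies %d bytes\n", r, size) // ⌘ occupies 3 bytes
	fmt.Printf("rest: %q\n", str[size:])          // rest: " key"
}
```

Slicing at the returned width keeps the remainder on a character boundary, which plain byte indexing does not guarantee.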
2. Runes
Go introduces the rune type to represent Unicode code points. A rune is an alias for the int32 type, and it holds a single code point, regardless of how many bytes that code point takes in UTF-8 encoding.
Rune and Code Point
In the context of Unicode, a code point is a unique identifier for each character. A rune in Go is a 32-bit integer that represents a Unicode code point. For instance, the ⌘ symbol has a Unicode code point of U+2318, which is represented as a rune in Go.
Example:
package main

import "fmt"

func main() {
	// Declare a rune (character constant)
	var r rune = '⌘'

	// Print the rune value and its Unicode code point
	fmt.Printf("Rune value: %c\n", r) // Output: Rune value: ⌘
	fmt.Printf("Unicode code point: U+%04X\n", r) // Output: Unicode code point: U+2318
}
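Because rune is only an alias for int32, the two types are interchangeable without a conversion. The short sketch below illustrates this; the function name asInt32 is hypothetical.

```go
package main

import "fmt"

// asInt32 returns r unchanged. It compiles without a conversion only
// because rune is an alias for int32. Hypothetical helper for illustration.
func asInt32(r rune) int32 {
	return r
}

func main() {
	r := '⌘' // a rune literal; r has type rune

	fmt.Println(asInt32(r))     // 8984, the decimal form of 0x2318
	fmt.Println(string(r))      // ⌘
	fmt.Println(len(string(r))) // 3, the UTF-8 encoding is three bytes long
}
```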
For-Range Loop with Runes
Go has built-in support for iterating over strings using the for range loop, which decodes the UTF-8 bytes and yields one rune per iteration, so multi-byte characters are handled correctly.
package main

import "fmt"

func main() {
	str := "日本語" // Japanese characters

	// Using for-range to loop over runes
	for i, runeValue := range str {
		fmt.Printf("Rune %c at byte position %d\n", runeValue, i)
	}
}
Output:
Rune 日 at byte position 0
Rune 本 at byte position 3
Rune 語 at byte position 6
In this example, for range iterates over the string, decoding the UTF-8 bytes into the correct Unicode code points (runes) and reporting the byte index at which each one starts.
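Since a string may hold arbitrary bytes, it helps to know what for range does with bytes that are not valid UTF-8: each such byte is reported as the replacement character U+FFFD and the loop advances by one byte. The sketch below shows this; the helper name decodeAll and the sample input are assumptions for illustration.

```go
package main

import "fmt"

// decodeAll collects the runes that a for-range loop yields for s.
// Hypothetical helper for illustration.
func decodeAll(s string) []rune {
	var out []rune
	for _, r := range s {
		out = append(out, r)
	}
	return out
}

func main() {
	s := "a\xffb" // \xff is not valid UTF-8

	for i, r := range s {
		fmt.Printf("%d: %U\n", i, r)
	}
	// 0: U+0061
	// 1: U+FFFD
	// 2: U+0062

	fmt.Println(len(decodeAll(s))) // 3
}
```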
Bytes, Runes, and Characters in Go
What’s the Difference Between Bytes, Runes, and Characters?
- Bytes: A byte is 8 bits of data. In a UTF-8 string, a byte is either a complete ASCII character or one part of a multi-byte character.
- Runes: A rune is an alias for int32 and represents a single Unicode code point. It's used in Go to handle characters that may span more than one byte in UTF-8.
- Characters: While we often think of characters as being individual letters or symbols, the concept is fuzzy in computing because what a reader sees as one character can be composed of one or more code points (like accented characters).
In Go:
- A string holds arbitrary bytes, and indexing into it retrieves bytes, not individual characters.
- A rune holds a single Unicode code point, which usually, but not always, corresponds to one character.
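The accented-character case mentioned above can be demonstrated directly. In this sketch, the same visible letter é is written once as a single precomposed code point and once as a base letter plus a combining accent; the helper name sizes is hypothetical.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// sizes returns the byte length and the rune count of s.
// Hypothetical helper for illustration.
func sizes(s string) (byteLen, runeCount int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	precomposed := "\u00e9" // é as one code point
	combined := "e\u0301"   // e followed by U+0301 COMBINING ACUTE ACCENT

	fmt.Println(sizes(precomposed))      // 2 1
	fmt.Println(sizes(combined))         // 3 2
	fmt.Println(precomposed == combined) // false
}
```

Both strings render as the same character, yet they differ in bytes, in rune count, and under ==, so counting runes is not the same as counting characters.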
Practical Example: Converting Between Runes, Bytes, and Strings
Here’s a practical example where we convert between runes, bytes, and strings in Go:
package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	// Original string
	str := "Hello, 世界" // "Hello, World" in English and Chinese characters

	// Convert string to byte slice
	bytes := []byte(str)
	fmt.Printf("Byte slice: %x\n", bytes)

	// Iterate over the string using a for-range loop
	fmt.Println("Iterating over string (runes):")
	for i, runeValue := range str {
		fmt.Printf("Rune: %c, at byte position %d\n", runeValue, i)
	}

	// Convert string to rune slice
	runes := []rune(str)
	fmt.Printf("Rune slice: %v\n", runes)

	// Convert rune back to string
	backToString := string(runes)
	fmt.Printf("Converted back to string: %s\n", backToString)

	// Find the length of the string and the number of runes
	fmt.Printf("String length (in bytes): %d\n", len(str))
	fmt.Printf("Number of runes: %d\n", utf8.RuneCountInString(str))
}
Output:
Byte slice: 48656c6c6f2c20e4b896e7958c
Iterating over string (runes):
Rune: H, at byte position 0
Rune: e, at byte position 1
Rune: l, at byte position 2
Rune: l, at byte position 3
Rune: o, at byte p...
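One place where the rune-slice conversion pays off is shortening text. The sketch below, with a hypothetical helper named truncateRunes, keeps at most n code points, whereas slicing by byte index can cut a multi-byte character in half.

```go
package main

import "fmt"

// truncateRunes returns at most the first n runes of s.
// Hypothetical helper for illustration; it counts code points, not
// user-perceived characters.
func truncateRunes(s string, n int) string {
	if n <= 0 {
		return ""
	}
	runes := []rune(s)
	if len(runes) <= n {
		return s
	}
	return string(runes[:n])
}

func main() {
	str := "Hello, 世界"

	fmt.Println(truncateRunes(str, 8)) // Hello, 世
	fmt.Printf("%q\n", str[:8])        // "Hello, \xe4"
}
```

The byte slice str[:8] ends with the lone byte 0xe4, the first third of 世, which is no longer valid UTF-8.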
In Go, understanding strings, bytes, and runes is crucial for handling text, especially in multilingual applications. Strings store arbitrary bytes, while runes represent Unicode code points. Byte-level indexing is enough for pure ASCII text, but runes are essential for any text that contains multi-byte UTF-8 characters. Using Go’s built-in types and standard library, you can efficiently convert and manipulate text while ensuring accuracy across different languages.