Work with Strings Effectively
Problem
Strings in Go are immutable UTF-8 byte sequences, creating performance problems with naive concatenation and complexity with Unicode handling. String iteration over bytes instead of characters leads to bugs with multi-byte characters. Understanding runes, bytes, and efficient string building is essential.
This guide shows effective string manipulation in Go.
Strings and Runes
Understanding String Internals
// ✅ Strings are UTF-8 encoded bytes
s := "Hello, 世界"
// Length in bytes (not characters!)
fmt.Println(len(s)) // 13 bytes
// ✅ Rune = single Unicode code point (int32)
for i, r := range s {
fmt.Printf("%d: %c (U+%04X)\n", i, r, r)
}
// Output:
// 0: H (U+0048)
// 1: e (U+0065)
// ...
// 7: 世 (U+4E16) <- 3 bytes, index jumps by 3
// 10: 界 (U+754C)
// ❌ Indexing gives bytes, not characters
fmt.Printf("%c\n", s[0]) // H (works for ASCII)
fmt.Printf("%c\n", s[7]) // � (garbage - middle of multi-byte char)
// ✅ Convert to []rune for character indexing
runes := []rune(s)
fmt.Println(len(runes)) // 9 characters
fmt.Printf("%c\n", runes[7]) // 世 (correct)Why it matters: Understanding bytes vs runes prevents bugs with Unicode text. Iterate with range to get runes, use []rune conversion for character-level operations.
String Iteration
// ✅ Iterate over runes (characters)
func CountVowels(s string) int {
count := 0
for _, r := range s {
if isVowel(r) {
count++
}
}
return count
}
func isVowel(r rune) bool {
vowels := "aeiouAEIOU"
return strings.ContainsRune(vowels, r)
}
// ❌ Iterate over bytes (wrong for Unicode)
func CountVowelsBad(s string) int {
count := 0
for i := 0; i < len(s); i++ {
// s[i] is a byte, not a rune
if isVowelByte(s[i]) { // Fails for non-ASCII vowels
count++
}
}
return count
}Common String Operations
Searching and Testing
import "strings"
s := "Hello, World!"
// ✅ Check prefix/suffix
hasHello := strings.HasPrefix(s, "Hello") // true
hasExclaim := strings.HasSuffix(s, "!") // true
// ✅ Contains
contains := strings.Contains(s, "World") // true
// ✅ Count occurrences
count := strings.Count("banana", "an") // 2
// ✅ Index of substring
index := strings.Index(s, "World") // 7
lastIndex := strings.LastIndex(s, "o") // 8
// ✅ Case-insensitive contains
containsLower := strings.Contains(
strings.ToLower(s),
strings.ToLower("WORLD"),
) // true
// ✅ Or use EqualFold for comparison
equal := strings.EqualFold("Hello", "HELLO") // trueModifying Strings
s := " Hello, World! "
// ✅ Trim whitespace
trimmed := strings.TrimSpace(s) // "Hello, World!"
// ✅ Trim specific characters
trimmed = strings.Trim("!!!Hello!!!", "!") // "Hello"
// ✅ Replace
replaced := strings.Replace(s, "World", "Go", 1) // " Hello, Go! "
replaceAll := strings.ReplaceAll(s, "l", "L") // " HeLLo, WorLd! "
// ✅ Case conversion
upper := strings.ToUpper(s) // " HELLO, WORLD! "
lower := strings.ToLower(s) // " hello, world! "
title := strings.Title("hello world") // "Hello World"
// ✅ Repeat
repeated := strings.Repeat("Go", 3) // "GoGoGo"Splitting and Joining
// ✅ Split by delimiter
csv := "Alice,Bob,Charlie"
names := strings.Split(csv, ",") // []string{"Alice", "Bob", "Charlie"}
// ✅ Split with limit
parts := strings.SplitN("a:b:c:d", ":", 2) // []string{"a", "b:c:d"}
// ✅ Split on whitespace
words := strings.Fields(" hello world ") // []string{"hello", "world"}
// ✅ Join strings
joined := strings.Join(names, " & ") // "Alice & Bob & Charlie"
// ✅ Split lines
lines := strings.Split("line1\nline2\nline3", "\n")Efficient String Building
strings.Builder
// ❌ Inefficient concatenation (creates new string each time)
func BuildStringBad(items []string) string {
result := ""
for _, item := range items {
result += item + "," // Creates new string every iteration
}
return result
}
// ✅ Use strings.Builder for efficient building
func BuildString(items []string) string {
var builder strings.Builder
for i, item := range items {
builder.WriteString(item)
if i < len(items)-1 {
builder.WriteString(",")
}
}
return builder.String()
}
// ✅ Pre-allocate capacity if known
func BuildStringWithCapacity(items []string) string {
// Estimate: avg 10 chars per item + commas
capacity := len(items)*10 + len(items) - 1
var builder strings.Builder
builder.Grow(capacity) // Pre-allocate
for i, item := range items {
builder.WriteString(item)
if i < len(items)-1 {
builder.WriteByte(',')
}
}
return builder.String()
}Performance comparison:
// Benchmark results (1000 items):
// Concatenation: 5000000 ns/op 500000 allocs
// strings.Builder: 50000 ns/op 1 allocsBuilder Methods
var builder strings.Builder
// ✅ Write string
builder.WriteString("Hello")
// ✅ Write single byte
builder.WriteByte(' ')
// ✅ Write single rune
builder.WriteRune('世')
// ✅ Get result
result := builder.String()
// ✅ Get length
length := builder.Len()
// ✅ Reset for reuse
builder.Reset()String Formatting
fmt.Sprintf
// ✅ Format string
name := "Alice"
age := 30
formatted := fmt.Sprintf("Name: %s, Age: %d", name, age)
// "Name: Alice, Age: 30"
// ✅ Padding and alignment
fmt.Sprintf("%-10s | %5d", "Alice", 30) // "Alice | 30"
fmt.Sprintf("%10s | %05d", "Bob", 5) // " Bob | 00005"
// ✅ Float precision
fmt.Sprintf("%.2f", 123.456) // "123.46"
fmt.Sprintf("%10.2f", 123.456) // " 123.46"
// ✅ Hexadecimal
fmt.Sprintf("%x", 255) // "ff"
fmt.Sprintf("%X", 255) // "FF"
fmt.Sprintf("%#x", 255) // "0xff"
// ✅ Binary
fmt.Sprintf("%b", 5) // "101"
fmt.Sprintf("%08b", 5) // "00000101"
// ✅ Struct formatting
type User struct {
Name string
Age int
}
user := User{"Alice", 30}
fmt.Sprintf("%+v", user) // "{Name:Alice Age:30}"
fmt.Sprintf("%#v", user) // "main.User{Name:\"Alice\", Age:30}"String Templates
import "text/template"
// ✅ Simple template
func RenderTemplate(name string, age int) (string, error) {
tmpl := template.Must(template.New("user").Parse(
"Hello {{.Name}}, you are {{.Age}} years old.",
))
var buf bytes.Buffer
if err := tmpl.Execute(&buf, struct {
Name string
Age int
}{name, age}); err != nil {
return "", err
}
return buf.String(), nil
}
// ✅ Template with conditionals
const tmplText = `
{{if .Premium}}
Premium User: {{.Name}}
{{else}}
Regular User: {{.Name}}
{{end}}
`
func RenderUserType(name string, premium bool) (string, error) {
tmpl := template.Must(template.New("user").Parse(tmplText))
var buf bytes.Buffer
err := tmpl.Execute(&buf, struct {
Name string
Premium bool
}{name, premium})
return buf.String(), err
}Regular Expressions
Pattern Matching
import "regexp"
// ✅ Compile once, reuse many times
var emailRegex = regexp.MustCompile(`^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$`)
func IsValidEmail(email string) bool {
return emailRegex.MatchString(email)
}
// ✅ Find matches
func ExtractEmails(text string) []string {
return emailRegex.FindAllString(text, -1)
}
// ✅ Replace with regex
func MaskPhoneNumbers(text string) string {
phoneRegex := regexp.MustCompile(`\d{3}-\d{3}-\d{4}`)
return phoneRegex.ReplaceAllString(text, "XXX-XXX-XXXX")
}
// ✅ Capture groups
func ParseDate(dateStr string) (year, month, day string, ok bool) {
dateRegex := regexp.MustCompile(`^(\d{4})-(\d{2})-(\d{2})$`)
matches := dateRegex.FindStringSubmatch(dateStr)
if len(matches) != 4 {
return "", "", "", false
}
return matches[1], matches[2], matches[3], true
}
// Usage
year, month, day, ok := ParseDate("2025-12-17")
if ok {
fmt.Printf("Year: %s, Month: %s, Day: %s\n", year, month, day)
}Common Patterns
// ✅ Email validation
emailRegex := regexp.MustCompile(`^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$`)
// ✅ Phone number (US format)
phoneRegex := regexp.MustCompile(`^\d{3}-\d{3}-\d{4}$`)
// ✅ URL validation
urlRegex := regexp.MustCompile(`^https?://[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}`)
// ✅ Extract numbers
numberRegex := regexp.MustCompile(`\d+`)
numbers := numberRegex.FindAllString("Order 123 contains 45 items", -1)
// []string{"123", "45"}
// ✅ Match word boundaries
wordRegex := regexp.MustCompile(`\btest\b`)
wordRegex.MatchString("testing") // false
wordRegex.MatchString("test") // true
wordRegex.MatchString("test123") // trueUnicode and Encoding
UTF-8 Handling
import "unicode/utf8"
s := "Hello, 世界"
// ✅ Count runes correctly
runeCount := utf8.RuneCountInString(s) // 9
// ✅ Validate UTF-8
valid := utf8.ValidString(s) // true
invalid := utf8.ValidString("\xFF\xFE") // false
// ✅ Iterate with rune index
for i, w := 0, 0; i < len(s); i += w {
r, width := utf8.DecodeRuneInString(s[i:])
fmt.Printf("%c ", r)
w = width
}Case Folding
import "golang.org/x/text/cases"
import "golang.org/x/text/language"
// ✅ Locale-aware case conversion
caser := cases.Title(language.English)
titled := caser.String("hello world") // "Hello World"
// Turkish has different i/İ rules
turkishCaser := cases.Upper(language.Turkish)
upper := turkishCaser.String("istanbul") // "İSTANBUL"Performance Tips
String Interning
// ❌ Each string literal creates new string
func ProcessRecords(records []Record) {
for _, r := range records {
if r.Status == "active" { // New string each iteration
process(r)
}
}
}
// ✅ Reuse string constants
const StatusActive = "active"
func ProcessRecords(records []Record) {
for _, r := range records {
if r.Status == StatusActive { // Same string instance
process(r)
}
}
}Avoiding Unnecessary Conversions
// ❌ Unnecessary conversion
func CountLines(data []byte) int {
text := string(data) // Allocates new string
return strings.Count(text, "\n")
}
// ✅ Work with bytes directly
func CountLines(data []byte) int {
return bytes.Count(data, []byte("\n"))
}
// ❌ Convert back and forth
func ProcessBytes(data []byte) []byte {
s := string(data)
s = strings.ToUpper(s)
return []byte(s)
}
// ✅ Use bytes package
func ProcessBytes(data []byte) []byte {
return bytes.ToUpper(data)
}Summary
String handling in Go requires understanding the distinction between bytes and runes. Strings store UTF-8 encoded bytes, len() returns byte count not character count. Use range loops to iterate over runes, convert to []rune for character indexing.
strings.Builder provides efficient string construction. Each string concatenation creates a new string due to immutability. Builder maintains a mutable buffer, building the final string once. Pre-allocate capacity with Grow() when size is known.
Common string operations through strings package include searching (Contains, HasPrefix, HasSuffix), splitting (Split, Fields), joining (Join), and modification (Replace, TrimSpace, ToUpper). All operations return new strings, original strings never change.
fmt.Sprintf formats strings with type-safe placeholders. Use padding and alignment for tables, precision for floats, and %+v or %#v for debugging structs. Template package handles complex string generation with conditionals and loops.
Regular expressions with regexp package enable pattern matching and extraction. Compile patterns once with regexp.MustCompile, reuse for multiple operations. Use FindStringSubmatch for capture groups, ReplaceAllString for substitutions.
Unicode handling requires utf8 package for correct rune counting and validation. Use RuneCountInString for character length, ValidString for UTF-8 validation. Locale-aware case operations need golang.org/x/text for languages with special casing rules.
Performance optimization avoids unnecessary conversions between strings and bytes. bytes package provides byte-oriented versions of strings functions. Use string constants to benefit from interning, reducing memory allocation.
String immutability makes them safe for concurrent use but expensive to modify repeatedly. Choose strings.Builder for construction, string operations for transformation, regular expressions for pattern matching, and bytes package when working with []byte data.