Work with Strings Effectively
Problem
Strings are immutable in Java, meaning every modification creates a new object. Naive string concatenation in loops creates thousands of temporary objects, causing performance problems. Poor understanding of string operations leads to inefficient code and bugs with character encoding.
This guide shows effective string manipulation patterns in Java.
String Immutability
Understanding Immutability
// ✅ Strings are immutable - cannot be changed
String original = "Hello";
String modified = original.toUpperCase(); // Creates NEW string
System.out.println(original); // Still "Hello"
System.out.println(modified); // "HELLO"
// ❌ Common misconception
String text = "Hello";
text.toUpperCase(); // Creates new string, but doesn't assign it
System.out.println(text); // Still "Hello" - unchanged!
// ✅ Assign result to use modified string
String text = "Hello";
text = text.toUpperCase(); // Assign result
System.out.println(text); // "HELLO"Why immutability matters: Immutable strings are thread-safe, can be safely shared, and enable string interning for memory optimization. Understanding immutability prevents bugs from expecting strings to change in-place.
StringBuilder vs StringBuffer
When to Use StringBuilder
// ❌ String concatenation in loop - creates N temporary objects
public String buildCsv(List<String> values) {
String result = "";
for (String value : values) {
result += value + ","; // Creates new string each iteration!
}
return result;
}
// For 1000 items, creates 1000 temporary strings
// ✅ StringBuilder - mutable, efficient for concatenation
public String buildCsv(List<String> values) {
StringBuilder sb = new StringBuilder();
for (String value : values) {
sb.append(value).append(","); // Modifies same object
}
return sb.toString();
}
// Creates only one StringBuilder and one final String
// ✅ Common StringBuilder patterns
StringBuilder sb = new StringBuilder();
// Append various types
sb.append("User: ");
sb.append(userId); // int
sb.append(", Balance: ");
sb.append(balance); // BigDecimal
sb.append(", Active: ");
sb.append(isActive); // boolean
// Method chaining
sb.append("Name: ")
.append(name)
.append(", Age: ")
.append(age);
// Insert at position
sb.insert(0, "Prefix: ");
// Replace range
sb.replace(0, 7, "New Prefix");
// Delete range
sb.delete(sb.length() - 1, sb.length()); // Remove last character
String result = sb.toString();StringBuilder vs StringBuffer
// ✅ StringBuilder - NOT thread-safe, faster
StringBuilder sb = new StringBuilder();
sb.append("text"); // Use in single-threaded code
// ✅ StringBuffer - thread-safe, slower
StringBuffer buffer = new StringBuffer();
buffer.append("text"); // Use only when thread-safety needed
// Usually: Use StringBuilder (99% of cases)
// StringBuffer rarely needed - synchronization overhead highCommon String Operations
Searching and Testing
String text = "Hello, World!";
// Contains substring
boolean hasWorld = text.contains("World"); // true
boolean hasworld = text.contains("world"); // false (case-sensitive)
// Starts/ends with
boolean startsHello = text.startsWith("Hello"); // true
boolean endsExclaim = text.endsWith("!"); // true
// Index of substring
int worldIndex = text.indexOf("World"); // 7
int notFound = text.indexOf("Java"); // -1
int lastComma = text.lastIndexOf(","); // 5
// Character at position
char firstChar = text.charAt(0); // 'H'
// Empty/blank checks
boolean isEmpty = text.isEmpty(); // false
boolean isBlank = text.isBlank(); // false (Java 11+)
boolean blankCheck = " ".isBlank(); // true
// Equals
boolean equals = text.equals("Hello, World!"); // true
boolean equalsIgnore = text.equalsIgnoreCase("HELLO, WORLD!"); // trueModifying Strings
String text = " Hello, World! ";
// Trim whitespace
String trimmed = text.trim(); // "Hello, World!"
String stripped = text.strip(); // "Hello, World!" (Java 11+, Unicode-aware)
// Replace
String replaced = text.replace("World", "Java"); // " Hello, Java! "
String replaceFirst = text.replaceFirst("l", "L"); // " HeLlo, World! "
String replaceAll = text.replaceAll("l", "L"); // " HeLLo, WorLd! " (regex)
// Case conversion
String upper = text.toUpperCase(); // " HELLO, WORLD! "
String lower = text.toLowerCase(); // " hello, world! "
// Substring
String hello = text.substring(2, 7); // "Hello"
String world = text.substring(9, 14); // "World"
// Repeat (Java 11+)
String repeated = "abc".repeat(3); // "abcabcabc"Splitting and Joining
// Split on delimiter
String csv = "Alice,Bob,Charlie";
String[] names = csv.split(","); // ["Alice", "Bob", "Charlie"]
// Split with regex
String text = "one two three"; // Multiple spaces
String[] words = text.split("\\s+"); // ["one", "two", "three"]
// Split with limit
String data = "a:b:c:d";
String[] parts = data.split(":", 2); // ["a", "b:c:d"] (max 2 parts)
// Join strings (Java 8+)
List<String> items = List.of("Apple", "Banana", "Cherry");
String joined = String.join(", ", items); // "Apple, Banana, Cherry"
// Join with stream
String result = items.stream()
.collect(Collectors.joining(", ", "[", "]")); // "[Apple, Banana, Cherry]"String Formatting
String.format and printf
// ✅ String.format - returns formatted string
String name = "Alice";
int age = 30;
double balance = 1234.56;
String formatted = String.format("Name: %s, Age: %d, Balance: $%.2f",
name, age, balance);
// "Name: Alice, Age: 30, Balance: $1234.56"
// Format specifiers:
// %s - String
// %d - decimal integer
// %f - floating point
// %.2f - float with 2 decimal places
// %n - platform-independent newline
// ✅ printf - prints formatted string
System.out.printf("User %s has balance $%.2f%n", name, balance);
// ✅ Alignment and padding
System.out.printf("%-10s | %5d%n", "Alice", 30); // "Alice | 30"
System.out.printf("%-10s | %5d%n", "Bob", 25); // "Bob | 25"
// %-10s = left-aligned, 10 characters wide
// %5d = right-aligned, 5 characters wide
// ✅ Thousands separator
System.out.printf("Amount: %,d%n", 1000000); // "Amount: 1,000,000"
// ✅ Hex, octal formatting
System.out.printf("Hex: %x, Octal: %o%n", 255, 255); // "Hex: ff, Octal: 377"Text Blocks (Java 15+)
// ❌ Old multi-line strings - hard to read
String json = "{\n" +
" \"name\": \"Alice\",\n" +
" \"age\": 30,\n" +
" \"active\": true\n" +
"}";
// ✅ Text blocks - preserves formatting
String json = """
{
"name": "Alice",
"age": 30,
"active": true
}
""";
// ✅ SQL queries
String query = """
SELECT u.id, u.name, o.total
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE o.status = 'COMPLETED'
ORDER BY o.total DESC
""";
// ✅ HTML templates
String html = """
<html>
<body>
<h1>Welcome, %s!</h1>
<p>Your balance is $%.2f</p>
</body>
</html>
""".formatted(name, balance);Regular Expressions
Pattern and Matcher
import java.util.regex.Pattern;
import java.util.regex.Matcher;
// ✅ Email validation
String emailRegex = "^[A-Za-z0-9+_.-]+@[A-Za-z0-9.-]+$";
Pattern emailPattern = Pattern.compile(emailRegex);
boolean isValid = emailPattern.matcher("alice@example.com").matches(); // true
boolean isInvalid = emailPattern.matcher("invalid.email").matches(); // false
// ✅ Extract matches
String text = "Contact us at support@example.com or sales@example.com";
Pattern pattern = Pattern.compile("[A-Za-z0-9+_.-]+@[A-Za-z0-9.-]+");
Matcher matcher = pattern.matcher(text);
List<String> emails = new ArrayList<>();
while (matcher.find()) {
emails.add(matcher.group());
}
// emails = ["support@example.com", "sales@example.com"]
// ✅ Replace with regex
String text = "Error: Failed at step 123";
String cleaned = text.replaceAll("\\d+", "X");
// "Error: Failed at step X"
// ✅ Split with regex
String data = "one two three four";
String[] words = data.split("\\s+");
// ["one", "two", "three", "four"]
// ✅ Capturing groups
String date = "2025-12-17";
Pattern datePattern = Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})");
Matcher dateMatcher = datePattern.matcher(date);
if (dateMatcher.matches()) {
String year = dateMatcher.group(1); // "2025"
String month = dateMatcher.group(2); // "12"
String day = dateMatcher.group(3); // "17"
}Common Regex Patterns
// Phone number (US format)
String phoneRegex = "^\\d{3}-\\d{3}-\\d{4}$";
// Matches: 123-456-7890
// URL validation
String urlRegex = "^https?://[A-Za-z0-9.-]+\\.[A-Za-z]{2,}";
// Matches: http://example.com, https://api.example.org
// Integer validation
String intRegex = "^-?\\d+$";
// Matches: 123, -456, 0
// Decimal validation
String decimalRegex = "^-?\\d+\\.\\d+$";
// Matches: 123.45, -0.99
// Hex color code
String hexRegex = "^#[0-9A-Fa-f]{6}$";
// Matches: #FF5733, #00ff00
// Extract all numbers
String extractNumbers = "Order 123 contains 45 items for $67.89";
Pattern numberPattern = Pattern.compile("\\d+");
Matcher numberMatcher = numberPattern.matcher(extractNumbers);
while (numberMatcher.find()) {
System.out.println(numberMatcher.group()); // "123", "45", "67", "89"
}Unicode and Encoding
Character Encoding
// ✅ Specify encoding when reading/writing files
String content = Files.readString(path, StandardCharsets.UTF_8);
Files.writeString(path, content, StandardCharsets.UTF_8);
// ✅ Convert string to bytes with encoding
String text = "Hello, 世界";
byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
byte[] isoBytes = text.getBytes(StandardCharsets.ISO_8859_1);
// ✅ Convert bytes to string with encoding
String decoded = new String(utf8Bytes, StandardCharsets.UTF_8);
// ❌ Platform default encoding - not portable
byte[] platformBytes = text.getBytes(); // BAD: Uses platform default
String platformString = new String(platformBytes); // BAD: Uses platform defaultWorking with Unicode
// ✅ Unicode characters
String emoji = "Hello 👋";
String chinese = "你好";
String arabic = "مرحبا";
// Length counts UTF-16 code units, not characters
System.out.println(emoji.length()); // 8 (not 7!)
// "👋" is 2 code units in UTF-16
// ✅ Code point iteration for correct character handling
String text = "Hello 👋 World";
text.codePoints().forEach(cp -> {
System.out.println(new String(Character.toChars(cp)));
});
// Prints each character correctly, including emoji
// ✅ Count actual characters
int characterCount = text.codePointCount(0, text.length());String Performance Tips
// ✅ String interning for memory savings
String s1 = "hello";
String s2 = "hello";
System.out.println(s1 == s2); // true (same object due to interning)
String s3 = new String("hello");
System.out.println(s1 == s3); // false (different objects)
String s4 = s3.intern();
System.out.println(s1 == s4); // true (interned = same object as s1)
// ✅ When to use intern()
// - Limited set of strings used repeatedly
// - Long-lived application with many duplicate strings
// - Be cautious - intern pool can grow large
// ❌ Don't intern user input or unbounded strings
// String userInput = request.getParameter("data").intern(); // BAD!
// ✅ StringBuilder capacity for large strings
StringBuilder sb = new StringBuilder(1000); // Pre-allocate capacity
// Avoids multiple internal array resizes
// ✅ String.join vs StringBuilder for known collection
List<String> items = Arrays.asList("a", "b", "c");
// Prefer String.join for clarity
String result = String.join(", ", items);
// StringBuilder only if you're doing complex building
StringBuilder sb = new StringBuilder();
for (String item : items) {
if (complexCondition(item)) {
sb.append(item).append(", ");
}
}Summary
String immutability in Java means every string operation creates a new string object. Understanding this prevents bugs from expecting strings to change in-place. Always assign the result of string operations to capture the modified string.
StringBuilder provides mutable string building for efficient concatenation in loops or complex string construction. Each append() modifies the same StringBuilder object instead of creating new strings. Use StringBuilder for building strings, StringBuffer only when thread-safety is required.
Common string operations include contains(), startsWith(), endsWith() for searching, trim() and strip() for whitespace removal, replace() for substitutions, and split()/join() for breaking apart and combining strings. All operations return new strings due to immutability.
String formatting with String.format() and printf() creates formatted output using format specifiers. Text blocks (Java 15+) simplify multi-line strings like JSON, SQL queries, or HTML templates, preserving formatting and eliminating concatenation.
Regular expressions through Pattern and Matcher enable sophisticated text matching and extraction. Compile patterns once and reuse for performance. Use capturing groups to extract specific parts of matched text. Common patterns handle email validation, phone numbers, URLs, and data extraction.
Unicode handling requires understanding code points vs code units. String.length() counts UTF-16 code units, not characters. Use codePoints() for correct character iteration when handling emoji or non-BMP characters. Always specify character encoding explicitly when converting between strings and bytes.
Performance optimization includes StringBuilder capacity pre-allocation for large strings, string interning for memory savings with frequently repeated strings, and choosing String.join() over StringBuilder for simple collection joining. Avoid interning user input or unbounded strings as the intern pool can grow without limit.