Performance
Need to optimize slow Elixir code? This guide teaches systematic profiling and optimization techniques using benchmarking, profiling tools, and BEAM VM insights to identify and eliminate bottlenecks in production applications.
Prerequisites
- Intermediate Elixir knowledge
- Understanding of BEAM VM basics
- Completed Intermediate Tutorial
- Basic understanding of processes and concurrency
Problem
Performance issues in Elixir applications can stem from multiple sources: inefficient algorithms, process bottlenecks, memory pressure, or improper use of concurrency primitives. Without systematic profiling, optimization efforts waste time on non-critical code paths.
Challenges:
- Identifying actual bottlenecks vs. perceived slow code
- Measuring performance impact of optimizations
- Balancing readability with performance
- Understanding BEAM VM performance characteristics
- Profiling in production without impacting users
Solution Overview
Use a systematic approach: measure first, optimize second, verify third. Elixir provides excellent profiling tools built on Erlang’s battle-tested infrastructure.
Key Tools:
- Benchee: Micro-benchmarking for comparing implementations
- :fprof: Function-level profiling with call counts and timing
- :eprof: Time-based profiling focusing on function execution time
- Observer: Visual system inspection (processes, memory, schedulers)
- :recon: Production-safe debugging and profiling
- Telemetry: Runtime metrics collection
Detailed Implementation
1. Benchmarking with Benchee
Benchee provides statistical benchmarking with warmup, multiple runs, and comparison reports.
Basic Benchmark
defp deps do
[{:benchee, "~> 1.0", only: :dev}]
end
Benchee.run(%{
"Enum.map" => fn -> Enum.map(1..1000, &(&1 * 2)) end,
"for comprehension" => fn -> for x <- 1..1000, do: x * 2 end,
"list comprehension" => fn -> Enum.to_list(for x <- 1..1000, do: x * 2) end
})
Output:
Name ips average deviation median 99th %
Enum.map 11.53 K 86.71 μs ±18.45% 82.80 μs 141.20 μs
for comprehension 11.26 K 88.81 μs ±17.23% 85.40 μs 143.60 μs
list comprehension 11.21 K 89.21 μs ±16.89% 86.10 μs 144.20 μs
Comparison:
Enum.map 11.53 K
for comprehension 11.26 K - 1.02x slower
list comprehension 11.21 K - 1.03x slower
Advanced Benchmark with Inputs
Benchee.run(
%{
"String.split" => fn input -> String.split(input, ",") end,
"Regex.split" => fn input -> Regex.split(~r/,/, input) end,
"Binary pattern match" => fn input -> split_csv(input) end
},
inputs: %{
"Small (10 items)" => String.duplicate("a,", 10),
"Medium (100 items)" => String.duplicate("a,", 100),
"Large (1000 items)" => String.duplicate("a,", 1000)
},
formatters: [
Benchee.Formatters.Console,
{Benchee.Formatters.HTML, file: "benchmark_results.html"}
],
warmup: 2,
time: 5
)
defp split_csv(input) do
split_csv(input, [], [])
end
defp split_csv("," <> rest, current, acc) do
split_csv(rest, [], [Enum.reverse(current) |> IO.iodata_to_binary() | acc])
end
defp split_csv(<<char, rest::binary>>, current, acc) do
split_csv(rest, [char | current], acc)
end
defp split_csv("", current, acc) do
Enum.reverse([Enum.reverse(current) |> IO.iodata_to_binary() | acc])
end
2. Profiling with :fprof
:fprof provides detailed function-level profiling including call counts and time spent.
Basic Profiling
:fprof.apply(&MyModule.slow_function/0, [])
:fprof.profile()
:fprof.analyse()
Output Analysis:
%% Analysis results:
{ analysis_options,
[{callers, true},
{sort, acc},
{totals, false},
{details, true}]}.
% CNT ACC OWN
[{ totals, 849, undefined, 100.123}].
% CNT ACC OWN
[{ "<0.80.0>", 849, undefined, 100.123}].
{[{undefined, 0, 0.000, 0.000}],
{ {fprof,call,1}, 1, 0.012, 0.001},
[{{MyModule,slow_function,0}, 1, 100.111, 0.001},
{suspend, 1, 0.000, 0.000}]}.
{[{{fprof,call,1}, 1, 100.111, 0.001}],
{ {MyModule,slow_function,0}, 1, 100.111, 0.001},
[{{MyModule,process_data,1}, 100, 95.234, 45.123},
{{MyModule,transform,1}, 100, 4.876, 3.456}]}.
Interpretation:
- CNT: Call count
- ACC: Accumulated time, including called functions
- OWN: Time spent in the function itself, excluding callees
Profiling Specific Code Block
:fprof.trace(:start)
result = Enum.map(1..10000, fn x ->
  expensive_computation(x)
end)
:fprof.trace(:stop)
:fprof.profile()
:fprof.analyse(dest: ~c"fprof_results.txt")
3. Time Profiling with :eprof
:eprof focuses on execution time per function, simpler than :fprof.
:eprof.start()
:eprof.start_profiling([self()])
MyModule.slow_operation()
:eprof.stop_profiling()
:eprof.analyze()
:eprof.stop()
Output:
FUNCTION CALLS % TIME [uS / CALLS]
MyModule.transform/1 1000 45.23 123456 [ 123]
MyModule.validate/1 1000 30.12 82345 [ 82]
Enum.map/2 1 15.67 42890 [42890]
4. Visual Profiling with Observer
Observer provides real-time system visualization.
:observer.start()
Key Tabs:
- System: Overall BEAM VM stats (processes, memory, schedulers)
- Load Charts: CPU, memory, process count over time
- Applications: Supervision tree visualization
- Processes: Process table with memory, reductions, message queue
- Table Viewer: ETS/DETS table inspection
- Trace Overview: Message tracing between processes
Identifying Process Bottlenecks
In Processes tab, sort by:
- Message Queue Len: Processes with backed-up messages
- Reductions: CPU-intensive processes
- Memory: Memory-hungry processes
- Current Function: What each process is doing
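The same ranking can be done without Observer; a sketch using only Process.list/0 and Process.info/2 (the requested keys and the top-5 cutoff are arbitrary):

```elixir
# Rank live processes by message-queue backlog, Observer-style, from code
top_by_queue =
  Process.list()
  |> Enum.map(fn pid ->
    # Process.info/2 returns nil if the process exited in the meantime
    case Process.info(pid, [:message_queue_len, :registered_name, :current_function]) do
      nil -> nil
      info -> {pid, info}
    end
  end)
  |> Enum.reject(&is_nil/1)
  |> Enum.sort_by(fn {_pid, info} -> -info[:message_queue_len] end)
  |> Enum.take(5)

Enum.each(top_by_queue, fn {pid, info} ->
  IO.inspect({pid, info[:registered_name], info[:message_queue_len]})
end)
```

This is essentially what :recon.proc_count(:message_queue_len, 5) does, with extra care taken to be production-safe.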
5. Production Profiling with :recon
:recon provides production-safe profiling without impacting performance.
:recon.proc_count(:memory, 10)
:recon.proc_count(:reductions, 10)
:recon.proc_count(:message_queue_len, 10)
:recon.info(pid(0, 123, 0)) # pid/3 is an IEx helper
Memory Analysis
:recon_alloc.memory(:allocated)
:recon_alloc.memory(:used)
:erlang.memory(:ets) # total ETS memory, in bytes
:recon.bin_leak(10)
6. Telemetry for Runtime Metrics
Collect metrics during normal operation.
defmodule MyApp.Telemetry do
  require Logger

  def setup do
    :telemetry.attach_many(
      "my-app-metrics",
      [
        [:my_app, :request, :start],
        [:my_app, :request, :stop],
        [:my_app, :db, :query]
      ],
      &__MODULE__.handle_event/4,
      nil
    )
  end

  def handle_event([:my_app, :request, :stop], measurements, metadata, _config) do
    # Telemetry durations arrive in native time units; convert once
    duration_ms = System.convert_time_unit(measurements.duration, :native, :millisecond)

    # Log slow requests
    if duration_ms > 1_000 do
      Logger.warning("Slow request", duration_ms: duration_ms, path: metadata.path)
    end

    # Re-emit for downstream metric reporters
    :telemetry.execute([:http, :request, :duration], measurements, %{path: metadata.path})
  end

  # attach_many requires the handler to accept every attached event
  def handle_event(_event, _measurements, _metadata, _config), do: :ok
end
How It Works
BEAM VM Performance Characteristics
Understanding BEAM performance model is crucial:
- Process Scheduling: Reduction budget per timeslice (roughly 4,000 reductions on modern OTP)
- Message Passing: Messages are copied between process heaps; binaries larger than 64 bytes are shared by reference
- Garbage Collection: Per-process GC, not stop-the-world
- ETS: Shared-memory tables with highly concurrent reads
- Binary Memory: Large binaries are reference-counted and shared across processes
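The reduction accounting is directly observable from any process; a small sketch (the workload is arbitrary):

```elixir
# Every BEAM operation costs "reductions"; the scheduler preempts a process
# once its budget for the timeslice is spent. Process.info exposes the counter.
{:reductions, r0} = Process.info(self(), :reductions)
Enum.sum(1..100_000)
{:reductions, r1} = Process.info(self(), :reductions)
IO.puts("workload consumed ~#{r1 - r0} reductions")
```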
Performance Measurement Accuracy
Benchee Methodology:
- Warmup: Lets JIT-compiled code and caches settle before measurement begins
- Multiple Runs: Statistical validity (mean, median, standard deviation)
- Garbage Collection: Runs GC between scenarios to limit cross-contamination
- Sequential Scenarios: Each scenario is measured in its own phase, so scenarios do not interfere
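What that methodology boils down to can be sketched without Benchee (workload and run counts are arbitrary):

```elixir
work = fn -> Enum.sort(Enum.shuffle(1..1_000)) end

# Warmup: run the code unmeasured so caches and code paths settle
for _ <- 1..100, do: work.()

# Measured runs: collect wall-clock durations in microseconds
times = for _ <- 1..1_000, do: elem(:timer.tc(work), 0)

sorted = Enum.sort(times)
median = Enum.at(sorted, div(length(sorted), 2))
mean = Enum.sum(times) / length(times)
IO.puts("median #{median}µs, mean #{Float.round(mean, 1)}µs")
```

Benchee adds proper statistics, outlier handling, and reporting on top of this core loop.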
Profiling Overhead:
- :fprof: High overhead (10-100x slowdown), detailed results
- :eprof: Medium overhead (2-10x slowdown), simpler output
- :recon: Low overhead, production-safe
Variations
1. Continuous Benchmarking
defmodule MyBenchmark do
def run do
Benchee.run(
%{
"current" => fn -> current_implementation() end,
"optimized" => fn -> optimized_implementation() end
},
save: [path: "benchmark_results.benchee", tag: "v1.2.0"],
load: "benchmark_results.benchee"
)
end
end
MyBenchmark.run()
2. Profiling Specific Processes
pid = Process.whereis(MyApp.Worker)
:eprof.start()
:eprof.start_profiling([pid])
Process.sleep(10_000) # let the worker handle real load while profiled
:eprof.stop_profiling()
:eprof.analyze()
3. Custom Telemetry Metrics
defmodule MyApp.Metrics do
import Telemetry.Metrics
def metrics do
[
summary("my_app.request.duration",
unit: {:native, :millisecond},
tags: [:path, :method]
),
counter("my_app.request.count",
tags: [:status]
),
distribution("my_app.db.query.duration",
unit: {:native, :millisecond},
reporter_options: [buckets: [10, 50, 100, 500, 1000]]
)
]
end
end
Pitfalls and Best Practices
Common Mistakes
1. Premature Optimization
Bad:
def process(_items) do
  # Complex, unreadable micro-optimization that ignores the input entirely
  :ets.foldl(fn {k, v}, acc ->
    process_item(k, v, acc)
  end, [], :my_table)
end
Good:
def process(items) do
Enum.map(items, &process_item/1)
end
2. Benchmarking in Development Environment
Bad:
mix run benchmark.exs
Good:
MIX_ENV=prod mix run benchmark.exs
3. Ignoring Warmup
Bad:
Benchee.run(%{
"function" => fn -> expensive_operation() end
}, warmup: 0) # First runs will be slower
Good:
Benchee.run(%{
"function" => fn -> expensive_operation() end
}, warmup: 2) # Give the JIT and caches time to settle
4. Profiling with Too Much Data
:fprof generates massive output for long-running operations.
Good:
:fprof.apply(&MyModule.process_batch/1, [Enum.take(large_dataset, 100)])
Optimization Strategies
1. Use Streams for Large Datasets
Bad:
File.read!("large.csv")
|> String.split("\n")
|> Enum.map(&parse_line/1)
|> Enum.filter(&valid?/1)
Good:
File.stream!("large.csv")
|> Stream.map(&String.trim_trailing/1) # File.stream! lines keep their newline
|> Stream.map(&parse_line/1)
|> Stream.filter(&valid?/1)
|> Enum.to_list()
2. Leverage ETS for Shared State
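Both get_config/1 variants in this section assume a :config ETS table already exists. A minimal, hypothetical setup (table name and contents are illustrative; read_concurrency: true optimizes the read-mostly case):

```elixir
# Create a named, publicly readable table once (typically at application start)
:ets.new(:config, [:named_table, :set, :public, read_concurrency: true])
:ets.insert(:config, {:timeout_ms, 5_000})

# Any process can now read directly, without a GenServer round trip
[{:timeout_ms, 5_000}] = :ets.lookup(:config, :timeout_ms)
```

Note that a named ETS table dies with its owner process, so in a real application the table is usually created by a long-lived supervised process.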
Bad:
def get_config(key) do
  GenServer.call(ConfigServer, {:get, key})
end
Good:
def get_config(key) do
  case :ets.lookup(:config, key) do
    [{^key, value}] -> value
    [] -> nil
  end
end
3. Batch Database Operations
Bad:
Enum.each(user_ids, fn id ->
  Repo.get(User, id) |> update_user()
end)
Good:
Repo.all(from u in User, where: u.id in ^user_ids)
|> Enum.each(&update_user/1)
Related Resources
- Advanced Tutorial - BEAM VM internals
- Best Practices - Performance patterns
- Cookbook - Optimization recipes
- Caching Guide - ETS and caching strategies