Performance

Need to optimize slow Elixir code? This guide teaches systematic profiling and optimization techniques using benchmarking, profiling tools, and BEAM VM insights to identify and eliminate bottlenecks in production applications.

Prerequisites

  • Intermediate Elixir knowledge
  • Understanding of BEAM VM basics
  • Completed Intermediate Tutorial
  • Basic understanding of processes and concurrency

Problem

Performance issues in Elixir applications can stem from multiple sources: inefficient algorithms, process bottlenecks, memory pressure, or improper use of concurrency primitives. Without systematic profiling, optimization efforts waste time on non-critical code paths.

Challenges:

  • Identifying actual bottlenecks vs. perceived slow code
  • Measuring performance impact of optimizations
  • Balancing readability with performance
  • Understanding BEAM VM performance characteristics
  • Profiling in production without impacting users

Solution Overview

Use a systematic approach: measure first, optimize second, verify third. Elixir provides excellent profiling tools built on Erlang’s battle-tested infrastructure.

Key Tools:

  • Benchee: Micro-benchmarking for comparing implementations
  • :fprof: Function-level profiling with call counts and timing
  • :eprof: Time-based profiling focusing on function execution time
  • Observer: Visual system inspection (processes, memory, schedulers)
  • :recon: Production-safe debugging and profiling
  • Telemetry: Runtime metrics collection

Detailed Implementation

1. Benchmarking with Benchee

Benchee provides statistical benchmarking with warmup, multiple runs, and comparison reports.

Basic Benchmark

# mix.exs
defp deps do
  [{:benchee, "~> 1.0", only: :dev}]
end

Benchee.run(%{
  "Enum.map" => fn -> Enum.map(1..1000, &(&1 * 2)) end,
  "for comprehension" => fn -> for x <- 1..1000, do: x * 2 end,
  "Stream.map" => fn -> 1..1000 |> Stream.map(&(&1 * 2)) |> Enum.to_list() end
})

Output:

Name                         ips        average  deviation         median         99th %
Enum.map                  11.53 K       86.71 μs    ±18.45%       82.80 μs      141.20 μs
for comprehension         11.26 K       88.81 μs    ±17.23%       85.40 μs      143.60 μs
Stream.map                11.21 K       89.21 μs    ±16.89%       86.10 μs      144.20 μs

Comparison:
Enum.map                  11.53 K
for comprehension         11.26 K - 1.02x slower
Stream.map                11.21 K - 1.03x slower

Advanced Benchmark with Inputs

defmodule CSVSplitter do
  # Hand-rolled splitter used in the benchmark below. The entry point is
  # public so the benchmark functions can call it; the recursive helpers
  # stay private.
  def split_csv(input), do: split_csv(input, [], [])

  defp split_csv("," <> rest, current, acc) do
    split_csv(rest, [], [IO.iodata_to_binary(Enum.reverse(current)) | acc])
  end

  defp split_csv(<<char, rest::binary>>, current, acc) do
    split_csv(rest, [char | current], acc)
  end

  defp split_csv("", current, acc) do
    Enum.reverse([IO.iodata_to_binary(Enum.reverse(current)) | acc])
  end
end

Benchee.run(
  %{
    "String.split" => fn input -> String.split(input, ",") end,
    "Regex.split" => fn input -> Regex.split(~r/,/, input) end,
    "Binary pattern match" => fn input -> CSVSplitter.split_csv(input) end
  },
  inputs: %{
    "Small (10 items)" => String.duplicate("a,", 10),
    "Medium (100 items)" => String.duplicate("a,", 100),
    "Large (1000 items)" => String.duplicate("a,", 1000)
  },
  formatters: [
    Benchee.Formatters.Console,
    # The HTML formatter lives in the separate benchee_html package
    {Benchee.Formatters.HTML, file: "benchmark_results.html"}
  ],
  warmup: 2,
  time: 5
)

2. Profiling with :fprof

:fprof provides detailed function-level profiling including call counts and time spent.

Basic Profiling

# Trace the call (fprof writes a trace file as a side effect)
:fprof.apply(&MyModule.slow_function/0, [])

# Turn the trace into profiling data, then print the analysis
:fprof.profile()
:fprof.analyse()

Output Analysis:

%% Analysis results:
{  analysis_options,
 [{callers, true},
  {sort, acc},
  {totals, false},
  {details, true}]}.

%                                               CNT       ACC       OWN
[{ totals,                                      849,  undefined,  100.123}].

%                                               CNT       ACC       OWN
[{ "<0.80.0>",                                  849,  undefined,  100.123}].

{[{undefined,                                     0,    0.000,    0.000}],
 { {fprof,call,1},                                1,    0.012,    0.001},
 [{{MyModule,slow_function,0},                    1,  100.111,    0.001},
  {suspend,                                       1,    0.000,    0.000}]}.

{[{{fprof,call,1},                                1,  100.111,    0.001}],
 { {MyModule,slow_function,0},                    1,  100.111,    0.001},
 [{{MyModule,process_data,1},                   100,   95.234,   45.123},
  {{MyModule,transform,1},                      100,    4.876,    3.456}]}.

Interpretation:

  • CNT: Call count
  • ACC: Accumulated time in milliseconds, including called functions
  • OWN: Time spent in the function itself

In the sample above, MyModule.process_data/1 accounts for 95.234 ms of the 100.111 ms total, so it is the place to optimize.

Profiling Specific Code Block

# :fprof.start/0 only starts the server; trace/1 starts and stops tracing
:fprof.trace(:start)

result = Enum.map(1..10000, fn x ->
  expensive_computation(x)
end)

:fprof.trace(:stop)
:fprof.profile()
:fprof.analyse(dest: ~c"fprof_results.txt")

3. Time Profiling with :eprof

:eprof measures execution time per function and produces simpler output, with less overhead, than :fprof.

:eprof.start()
:eprof.start_profiling([self()])  # profile the calling process

MyModule.slow_operation()

:eprof.stop_profiling()
:eprof.analyze()  # prints time per function for the profiled processes
:eprof.stop()

Output:

FUNCTION                      CALLS      %     TIME  [uS / CALLS]
MyModule.transform/1           1000   45.23   123456  [  123]
MyModule.validate/1            1000   30.12    82345  [   82]
Enum.map/2                        1   15.67    42890  [42890]

4. Visual Profiling with Observer

Observer provides real-time system visualization.

# Requires the :observer, :wx, and :runtime_tools applications
:observer.start()

Key Tabs:

  1. System: Overall BEAM VM stats (processes, memory, schedulers)
  2. Load Charts: CPU, memory, process count over time
  3. Applications: Supervision tree visualization
  4. Processes: Process table with memory, reductions, message queue
  5. Table Viewer: ETS/DETS table inspection
  6. Trace Overview: Message tracing between processes

Identifying Process Bottlenecks

In the Processes tab, sort by any of the following (a programmatic equivalent follows this list):

  • Message Queue Len: Processes with backed-up messages
  • Reductions: CPU-intensive processes
  • Memory: Memory-hungry processes
  • Current Function: What each process is doing
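When the Observer GUI is not available (for example, on a stripped-down remote node), a rough sketch of the same ranking using only the standard library:

# Top 10 processes by mailbox backlog; Process.info/2 returns nil for
# processes that died between list and inspection
Process.list()
|> Enum.map(fn pid ->
  {pid, Process.info(pid, [:message_queue_len, :reductions, :memory])}
end)
|> Enum.reject(fn {_pid, info} -> is_nil(info) end)
|> Enum.sort_by(fn {_pid, info} -> -info[:message_queue_len] end)
|> Enum.take(10)

:recon.proc_count/2, covered next, wraps this pattern in a production-hardened form.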

5. Production Profiling with :recon

:recon provides introspection and profiling that is safe to run in production, with low, bounded overhead.

# Top 10 processes by memory
:recon.proc_count(:memory, 10)

# Top 10 by reductions (a rough proxy for CPU work)
:recon.proc_count(:reductions, 10)

# Top 10 by message queue length
:recon.proc_count(:message_queue_len, 10)

# Detailed, safe info on a single process (pid/3 is an IEx helper)
:recon.info(pid(0, 123, 0))
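proc_count/2 is a point-in-time snapshot. For CPU questions, sampling over an interval is usually more telling; recon's proc_window/3 takes an attribute, a top-N count, and a window in milliseconds:

# Top 10 processes by reductions accumulated over a 5-second window
:recon.proc_window(:reductions, 10, 5_000)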

Memory Analysis

# Memory as seen by the allocators vs. memory actually in use
:recon_alloc.memory(:allocated)
:recon_alloc.memory(:used)

# Total ETS memory (standard library call, not recon)
:erlang.memory(:ets)

# Garbage-collect processes and rank the 10 that released the most
# binary memory (useful for spotting refc binary leaks)
:recon.bin_leak(10)

6. Telemetry for Runtime Metrics

Collect metrics during normal operation.

defmodule MyApp.Telemetry do
  def setup do
    :telemetry.attach_many(
      "my-app-metrics",
      [
        [:my_app, :request, :start],
        [:my_app, :request, :stop],
        [:my_app, :db, :query]
      ],
      &handle_event/4,
      nil
    )
  end

  defp handle_event([:my_app, :request, :stop], measurements, metadata, _config) do
    duration_ms = System.convert_time_unit(measurements.duration, :native, :millisecond)

    # Log slow requests (durations arrive in native time units)
    if duration_ms > 1_000 do
      require Logger
      Logger.warning("Slow request", duration_ms: duration_ms, path: metadata.path)
    end

    # Forward the measurement to your metrics pipeline here
    # (see the Telemetry.Metrics example in the Variations section)
  end

  # Catch-all clause: a handler that raises (including a clause error on
  # the other attached events) gets detached by :telemetry
  defp handle_event(_event, _measurements, _metadata, _config), do: :ok
end
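For reference, a sketch of the emitting side using the guide's event names: :telemetry.span/3 emits the matching :start and :stop events and reports duration in native time units (handle_request/0 is a hypothetical stand-in for real work):

:telemetry.span([:my_app, :request], %{path: "/users"}, fn ->
  result = handle_request()  # hypothetical work
  {result, %{path: "/users"}}
end)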

How It Works

BEAM VM Performance Characteristics

Understanding the BEAM performance model is crucial; a short inspection sketch follows the list:

  1. Process Scheduling: Preemptive, based on a per-slice reduction budget (roughly 2,000-4,000 reductions, depending on OTP version)
  2. Message Passing: Copy semantics for most terms; large binaries are passed by reference
  3. Garbage Collection: Per-process GC, not stop-the-world
  4. ETS: Shared in-memory tables; reads can run concurrently (tune with read_concurrency)
  5. Binary Memory: Binaries over 64 bytes are reference-counted and shared across processes
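Several of these characteristics can be observed directly; a minimal sketch using only the standard library:

# Per-process stats: reductions (CPU work), memory, and mailbox length
Process.info(self(), [:reductions, :memory, :message_queue_len])

# Refc binaries (> 64 bytes) are shared: sending a 1 MB binary adds
# almost nothing to the receiver's process memory
big = :binary.copy("x", 1_000_000)
{:ok, pid} = Task.start(fn -> Process.sleep(:infinity) end)

before_send = Process.info(pid, :memory)
send(pid, big)
after_send = Process.info(pid, :memory)
IO.inspect({before_send, after_send})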

Performance Measurement Accuracy

Benchee Methodology (the sketch after this list shows the corresponding options):

  1. Warmup: Exercises the code before measuring so JIT compilation and caches have settled
  2. Multiple Runs: Statistical validity (mean, median, deviation)
  3. Garbage Collection: Runs GC between scenarios
  4. Sequential Execution: Scenarios run one at a time (parallel: 1 by default), so they do not contend for schedulers
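All of these map to Benchee options; a minimal sketch (memory_time is off by default):

Benchee.run(
  %{"sum" => fn -> Enum.sum(1..10_000) end},
  warmup: 2,        # seconds of warmup, not recorded
  time: 5,          # seconds of recorded runtime measurements
  memory_time: 2    # seconds of memory measurements
)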

Profiling Overhead:

  • :fprof: High overhead (10-100x slowdown), detailed results
  • :eprof: Medium overhead (2-10x slowdown), simpler output
  • :recon: Low overhead, production-safe

Variations

1. Continuous Benchmarking

defmodule MyBenchmark do
  def run do
    Benchee.run(
      %{
        # Placeholders: the two implementations under comparison
        "current" => fn -> current_implementation() end,
        "optimized" => fn -> optimized_implementation() end
      },
      save: [path: "benchmark_results.benchee", tag: "v1.2.0"],
      load: "benchmark_results.benchee"
    )
  end
end

MyBenchmark.run()
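Run this once on the baseline to save results under tag "v1.2.0", then bump the tag and run again after optimizing; Benchee loads the saved scenarios and prints them alongside the new runs for a direct before/after comparison.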

2. Profiling Specific Processes

pid = Process.whereis(MyApp.Worker)

:eprof.start()
:eprof.start_profiling([pid])

# Let the worker handle real load for a while before stopping
Process.sleep(10_000)

:eprof.stop_profiling()
:eprof.analyze()
:eprof.stop()

3. Custom Telemetry Metrics

defmodule MyApp.Metrics do
  use Telemetry.Metrics

  def metrics do
    [
      summary("my_app.request.duration",
        unit: {:native, :millisecond},
        tags: [:path, :method]
      ),
      counter("my_app.request.count",
        tags: [:status]
      ),
      distribution("my_app.db.query.duration",
        unit: {:native, :millisecond},
        reporter_options: [buckets: [10, 50, 100, 500, 1000]]
      )
    ]
  end
end
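Metric definitions take effect only when handed to a reporter. A minimal sketch using the console reporter that ships with Telemetry.Metrics (where it sits in your supervision tree is an assumption about your app):

children = [
  # Prints each metric to stdout as events arrive; swap for a StatsD or
  # Prometheus reporter in production
  {Telemetry.Metrics.ConsoleReporter, metrics: MyApp.Metrics.metrics()}
]

Supervisor.start_link(children, strategy: :one_for_one)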

Pitfalls and Best Practices

Common Mistakes

1. Premature Optimization

Bad:

def process(_items) do
  # Complex, unreadable optimization
  :ets.foldl(fn {k, v}, acc ->
    process_item(k, v, acc)
  end, [], :my_table)
end

Good:

def process(items) do
  Enum.map(items, &process_item/1)
end

2. Benchmarking in Development Environment

Bad:

mix run benchmark.exs

Good:

MIX_ENV=prod mix run benchmark.exs

3. Ignoring Warmup

Bad:

Benchee.run(%{
  "function" => fn -> expensive_operation() end
}, warmup: 0)  # First runs will be slower

Good:

Benchee.run(%{
  "function" => fn -> expensive_operation() end
}, warmup: 2)  # Give JIT time to optimize

4. Profiling with Too Much Data

:fprof generates massive output for long-running operations. Profile a representative sample instead.

Good:

:fprof.apply(&MyModule.process_batch/1, [Enum.take(large_dataset, 100)])

Optimization Strategies

1. Use Streams for Large Datasets

Eager (loads the whole file into memory):

File.read!("large.csv")
|> String.split("\n")
|> Enum.map(&parse_line/1)
|> Enum.filter(&valid?/1)

Lazy (processes one line at a time, constant memory):

File.stream!("large.csv")
|> Stream.map(&parse_line/1)
|> Stream.filter(&valid?/1)
|> Enum.to_list()

2. Leverage ETS for Shared State

Bottleneck (every read is serialized through one GenServer process):

def get_config(key) do
  GenServer.call(ConfigServer, {:get, key})
end

Concurrent (reads go straight to the ETS table):

def get_config(key) do
  case :ets.lookup(:config, key) do
    [{^key, value}] -> value
    [] -> nil
  end
end
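This assumes a :config table already exists. A sketch of creating and seeding it at startup (the owning process and the sample key are illustrative):

# Create the table once from a long-lived process so it isn't destroyed
# when its owner exits; read_concurrency optimizes the lookup path
:ets.new(:config, [:named_table, :set, :public, read_concurrency: true])
:ets.insert(:config, {:pool_size, 10})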

3. Batch Database Operations

N+1 queries (one round trip per user):

Enum.each(user_ids, fn id ->
  Repo.get(User, id) |> update_user()
end)

Single query (from/2 requires import Ecto.Query):

Repo.all(from u in User, where: u.id in ^user_ids, preload: :posts)
|> Enum.each(&update_user/1)
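If the update itself is uniform, the write side can also collapse into a single query; a sketch assuming a hypothetical boolean active column:

# One UPDATE statement for all matching rows (active is illustrative)
from(u in User, where: u.id in ^user_ids)
|> Repo.update_all(set: [active: true])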
