Benchmarks and numerical validation

This page records a local benchmark run comparing explicit D-vine operations in VineCopulas.jl against rvinecopulib. The goal is not to claim broad speed superiority. The goal is to make the current performance profile reproducible and to separate numerical correctness from implementation speed.

The local logs used for this page report Julia 1.12.0 on aarch64-apple-darwin14. Exact hardware was not recorded by the benchmark scripts, so the numbers should be read as a reproducible local reference, not as portable machine-independent constants.

Reproducing the benchmark

From the package root, instantiate the benchmark environment:

```bash
julia --project=benchmarks -e 'using Pkg; Pkg.develop(path=pwd()); Pkg.instantiate()'
```

Install the R dependencies once:

```r
install.packages(c("rvinecopulib", "bench"))
```

Run the ordinary benchmark battery:

```bash
bash benchmarks/run_main.sh
```

The ordinary route excludes Student-t copulas so that the main table reflects the families whose current implementations are suitable for routine performance comparisons. Student-t is studied separately:

```bash
bash benchmarks/tcopula_study/run_t_study.sh
```

The benchmark scripts generate raw outputs under benchmarks/results/, benchmarks/reference/, and benchmarks/logs/. These directories are ignored by git. The summarized tables shipped with the documentation are generated from those outputs.

Main D-vine setup

The ordinary benchmark uses D-vines with n = 10000 evaluation points and three scenarios:

| Scenario | Meaning |
| --- | --- |
| p=5, trunc=4 | full five-dimensional D-vine |
| p=10, trunc=2 | ten-dimensional truncated D-vine |
| p=20, trunc=2 | twenty-dimensional truncated D-vine |
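For orientation, the size of each scenario can be expressed as a pair-copula count: tree t of a vine on p variables has p − t edges, so a vine truncated after `trunc` trees contains the sum of p − t over t = 1, …, trunc. The sketch below only does this arithmetic; the helper name `n_pair_copulas` is hypothetical and is not part of VineCopulas.jl.

```julia
# Number of pair copulas in a vine on p variables truncated after `trunc` trees.
# Tree t contributes p - t edges. `n_pair_copulas` is an illustrative helper,
# not a VineCopulas.jl function.
function n_pair_copulas(p::Integer, trunc::Integer)
    1 <= trunc <= p - 1 || throw(ArgumentError("need 1 <= trunc <= p - 1"))
    return sum(p - t for t in 1:trunc)
end

# The three scenarios of the ordinary benchmark.
for (p, trunc) in [(5, 4), (10, 2), (20, 2)]
    println("p=", p, ", trunc=", trunc, ": ", n_pair_copulas(p, trunc), " pair copulas")
end
```

The three scenarios contain 10, 17, and 37 pair copulas, so the pair count, rather than p alone, is the more natural size measure when reading the timing tables.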

For this run, Julia used SAMPLES=5 and R used ITERATIONS=5; CDF comparisons used CDF_POINTS=10 and CDF_N=5000.
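As a rough illustration of what a median over five samples means, the sketch below times a stand-in workload with `@elapsed` from Base and takes the median with the `Statistics` standard library. The timing harness used by the benchmark scripts is not shown on this page, so this is an assumption about the general approach, not a copy of it. `clayton_logpdf` and `median_time_ms` are hypothetical helpers, and the workload is a closed-form bivariate Clayton log-density rather than a D-vine evaluation.

```julia
using Statistics  # median

# Stand-in workload: closed-form bivariate Clayton log-density (theta > 0).
# Illustrative helper only, not the package's implementation.
function clayton_logpdf(u, v, theta)
    s = u^(-theta) + v^(-theta) - 1
    return log1p(theta) - (theta + 1) * (log(u) + log(v)) - (2 + 1 / theta) * log(s)
end

# Time `f()` `samples` times after one warm-up call; return the median in ms.
function median_time_ms(f; samples = 5)
    f()  # warm-up so that compilation is not measured
    times = [1000 * @elapsed(f()) for _ in 1:samples]
    return median(times)
end

n = 10_000                               # same n as the ordinary benchmark
u = range(0.01, 0.99; length = n)
v = reverse(u)
workload() = sum(clayton_logpdf.(u, v, 2.0))

println("median over 5 samples: ", round(median_time_ms(workload); digits = 3), " ms")
```

With only five samples the median is less sensitive to a single slow run than the mean, which is one reason to report it for short benchmark batteries.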

Log-density performance

The table reports median vectorized log-density time. The speed ratio is rvinecopulib median / Julia median; values above 1 mean Julia was faster in this local run.

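| Family | p | trunc | Julia median | rvinecopulib median | Julia speed ratio |
| --- | --- | --- | --- | --- | --- |
| clayton | 5 | 4 | 22.1 ms | 18.8 ms | 0.85× |
| clayton | 10 | 2 | 25.4 ms | 27.8 ms | 1.09× |
| clayton | 20 | 2 | 55.7 ms | 61.9 ms | 1.11× |
| frank | 5 | 4 | 20.9 ms | 15.6 ms | 0.74× |
| frank | 10 | 2 | 24.0 ms | 21.8 ms | 0.91× |
| frank | 20 | 2 | 52.5 ms | 48.8 ms | 0.93× |
| gaussian | 5 | 4 | 14.1 ms | 22.2 ms | 1.58× |
| gaussian | 10 | 2 | 16.9 ms | 32.6 ms | 1.93× |
| gaussian | 20 | 2 | 38.1 ms | 72.5 ms | 1.90× |
| gumbel | 5 | 4 | 30.6 ms | 30.3 ms | 0.99× |
| gumbel | 10 | 2 | 38.6 ms | 45.3 ms | 1.17× |
| gumbel | 20 | 2 | 84.6 ms | 101.4 ms | 1.20× |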
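The ratio column can be re-derived from the two median columns. The sketch below copies the medians from the table and applies the convention stated above (rvinecopulib median divided by Julia median); the names `rows` and `speed_ratio` are illustrative only.

```julia
# (family, p, trunc, Julia median ms, rvinecopulib median ms), copied from the table.
rows = [
    ("clayton",  5,  4, 22.1,  18.8),
    ("clayton",  10, 2, 25.4,  27.8),
    ("clayton",  20, 2, 55.7,  61.9),
    ("frank",    5,  4, 20.9,  15.6),
    ("frank",    10, 2, 24.0,  21.8),
    ("frank",    20, 2, 52.5,  48.8),
    ("gaussian", 5,  4, 14.1,  22.2),
    ("gaussian", 10, 2, 16.9,  32.6),
    ("gaussian", 20, 2, 38.1,  72.5),
    ("gumbel",   5,  4, 30.6,  30.3),
    ("gumbel",   10, 2, 38.6,  45.3),
    ("gumbel",   20, 2, 84.6,  101.4),
]

# Ratio convention from the text: rvinecopulib median / Julia median.
speed_ratio(julia_ms, r_ms) = r_ms / julia_ms

for (family, p, trunc, julia_ms, r_ms) in rows
    ratio = round(speed_ratio(julia_ms, r_ms); digits = 2)
    winner = ratio > 1 ? "Julia faster" : "rvinecopulib faster"
    println(rpad(family, 9), "p=", lpad(p, 2), " trunc=", trunc, "  ", ratio, "x  ", winner)
end
```

Because the medians shown are rounded to 0.1 ms, a recomputed ratio can differ from the published column in the last digit (the frank and gaussian p=5 rows are examples), presumably because the published ratios were computed from unrounded medians.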

Summary:

| Family | Log-density speed ratio range | Interpretation |
| --- | --- | --- |
| gaussian | 1.58×–1.93× | Julia is consistently faster in these scenarios. |
| clayton | 0.85×–1.11× | rvinecopulib is faster at p=5; Julia is slightly faster for p=10 and p=20. |
| gumbel | 0.99×–1.20× | near parity at p=5; Julia is faster for p=10 and p=20. |
| frank | 0.74×–0.93× | rvinecopulib is still faster for log-density. |

These results are consistent with the implementation strategy: the Gaussian path has a direct closed-form bivariate density and conditional primitives; Clayton and Gumbel are competitive in larger truncated D-vines; Frank is correct but still has room for log-density optimization.

Student-t study

Student-t copulas are validated numerically but remain outside the ordinary benchmark route. The current implementation uses direct bivariate t-copula formulas but still depends heavily on scalar Student-t quantile and CDF evaluations. In this local run, that cost dominates runtime and allocation counts.

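| p | trunc | Julia median | rvinecopulib median | rvinecopulib faster by | Julia memory | Julia allocations |
| --- | --- | --- | --- | --- | --- | --- |
| 2 | 1 | 38.8 ms | 2.2 ms | 17.6× | 17.7 MiB | 119,257 |
| 5 | 4 | 1192.6 ms | 55.9 ms | 21.3× | 519.3 MiB | 3,560,275 |
| 20 | 2 | 3059.0 ms | 182.0 ms | 16.8× | 1355.4 MiB | 9,307,335 |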
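Two derived quantities help when reading this table: the slowdown factor (Julia median divided by rvinecopulib median, the inverse of the ratio convention used in the main table) and the average size of one allocation. The sketch below recomputes both from the table values; `slowdown` and `bytes_per_alloc` are illustrative helper names.

```julia
# (p, trunc, Julia ms, rvinecopulib ms, Julia MiB, Julia allocations), from the table.
t_rows = [
    (2,  1, 38.8,   2.2,   17.7,   119_257),
    (5,  4, 1192.6, 55.9,  519.3,  3_560_275),
    (20, 2, 3059.0, 182.0, 1355.4, 9_307_335),
]

slowdown(julia_ms, r_ms) = julia_ms / r_ms            # "rvinecopulib faster by"
bytes_per_alloc(mib, allocs) = mib * 2^20 / allocs    # average allocation size in bytes

for (p, trunc, julia_ms, r_ms, mib, allocs) in t_rows
    println("p=", p, " trunc=", trunc,
            "  slowdown=", round(slowdown(julia_ms, r_ms); digits = 1), "x",
            "  avg bytes/allocation=", round(bytes_per_alloc(mib, allocs); digits = 1))
end
```

The average allocation comes out at roughly 150 to 160 bytes in all three scenarios, which is consistent with many small temporary objects rather than a few large buffers.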

TCopula is not mathematically wrong: its log-density values agree closely with rvinecopulib. The issue is the performance of the scalar Student-t numerical primitives in the current implementation.

Numerical validation

The following table reports log-density agreement against rvinecopulib reference values.

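| Family | p | trunc | max abs. | mean abs. | max rel. |
| --- | --- | --- | --- | --- | --- |
| clayton | 5 | 4 | 1.22e-13 | 8.37e-15 | 1.83e-11 |
| clayton | 10 | 2 | 8.17e-14 | 8.63e-15 | 3.18e-12 |
| clayton | 20 | 2 | 5.12e-13 | 1.62e-14 | 1.22e-12 |
| frank | 5 | 4 | 3.94e-12 | 6.26e-15 | 5.45e-11 |
| frank | 10 | 2 | 3.87e-11 | 1.15e-14 | 4.94e-10 |
| frank | 20 | 2 | 2.12e-10 | 3.98e-14 | 3.98e-10 |
| gaussian | 5 | 4 | 3.89e-12 | 4.65e-15 | 1.53e-10 |
| gaussian | 10 | 2 | 2.29e-12 | 6.24e-15 | 3.93e-10 |
| gaussian | 20 | 2 | 1.41e-11 | 1.22e-14 | 2.68e-11 |
| gumbel | 5 | 4 | 4.64e-10 | 6.55e-14 | 1.14e-10 |
| gumbel | 10 | 2 | 2.74e-10 | 4.29e-14 | 2.27e-11 |
| gumbel | 20 | 2 | 2.43e-10 | 5.41e-14 | 1.95e-11 |
| t | 2 | 1 | 5.17e-13 | 1.79e-15 | 8.76e-12 |
| t | 5 | 4 | 1.30e-10 | 2.58e-14 | 3.09e-10 |
| t | 20 | 2 | 3.39e-11 | 4.20e-14 | 8.02e-09 |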
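The three error columns can be computed along the following lines. This is a minimal sketch: the benchmark scripts' exact definition of the relative error is not given on this page, so normalizing by the absolute reference value with a small floor is an assumption, and `agreement_metrics` is a hypothetical helper name.

```julia
# Agreement metrics between computed values and reference values.
# Relative error is normalized by |reference| with a small floor to avoid
# division by zero; both the floor and this definition are assumptions.
function agreement_metrics(ours::AbstractVector, ref::AbstractVector; rel_floor = eps(Float64))
    length(ours) == length(ref) || throw(DimensionMismatch("vectors differ in length"))
    abs_err = abs.(ours .- ref)
    rel_err = abs_err ./ max.(abs.(ref), rel_floor)
    return (max_abs = maximum(abs_err),
            mean_abs = sum(abs_err) / length(abs_err),
            max_rel = maximum(rel_err))
end

# Toy data standing in for log-density vectors from the two libraries.
ref  = [-1.25, 0.5, 2.0]
ours = ref .+ [1e-13, -2e-13, 0.0]
m = agreement_metrics(ours, ref)
println(m)
```

The maximum relative error can exceed the maximum absolute error by orders of magnitude when some reference log-densities are close to zero, which is one plausible reason the two columns are not ordered the same way across rows.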

Across the tested families, log-density agreement with rvinecopulib is tight: maximum absolute differences stay below 5e-10, mean absolute differences below 1e-13, and maximum relative differences below 1e-8. The larger direct Rosenblatt differences against rvinecopulib depend on convention and variable order; they should not be read as density errors. Internal consistency is the relevant transform check:

| Check | Worst max abs. over uploaded benchmark results | Worst mean abs. |
| --- | --- | --- |
| inverse_rosenblatt(rosenblatt(U)) ≈ U | 9.06e-11 | 2.03e-15 |
| rosenblatt(inverse_rosenblatt(Z)) ≈ Z | 1.29e-10 | 2.87e-15 |
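The round-trip idea can be illustrated in two dimensions with a Clayton copula, where the conditional distribution (h-function) and its inverse have closed forms. This sketch does not call VineCopulas.jl: `h`, `hinv`, `rosenblatt2`, `inverse_rosenblatt2`, and `worst_roundtrip_error` are hypothetical stand-ins for the package's transforms, and the conditioning order (first variable conditions the second) is an assumption chosen for the example.

```julia
# Clayton h-function: conditional CDF of u given v, for theta > 0.
h(u, v, theta) = v^(-theta - 1) * (u^(-theta) + v^(-theta) - 1)^(-1 - 1 / theta)

# Closed-form inverse of h in its first argument.
hinv(w, v, theta) = ((w * v^(theta + 1))^(-theta / (1 + theta)) + 1 - v^(-theta))^(-1 / theta)

# Two-dimensional Rosenblatt transform and its inverse (illustrative helpers).
rosenblatt2(u1, u2, theta) = (u1, h(u2, u1, theta))
inverse_rosenblatt2(z1, z2, theta) = (z1, hinv(z2, z1, theta))

# Largest absolute deviation of inverse(forward(U)) from U over a grid.
function worst_roundtrip_error(theta, grid)
    worst = 0.0
    for u1 in grid, u2 in grid
        z1, z2 = rosenblatt2(u1, u2, theta)
        v1, v2 = inverse_rosenblatt2(z1, z2, theta)
        worst = max(worst, abs(v1 - u1), abs(v2 - u2))
    end
    return worst
end

println("worst round-trip error: ", worst_roundtrip_error(2.0, 0.05:0.05:0.95))
```

Because both directions use the same convention, the round-trip error reflects only floating-point effects. That is why this check is a cleaner correctness signal than a direct comparison against another library's transform output.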

The numerical CDF is approximate for general vines. In this run, the largest reported CDF QMC absolute difference was 8.60e-03. CDF comparisons should therefore be interpreted with Monte Carlo / quasi-Monte Carlo tolerance, not as exact identities.
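To give a sense of scale for that tolerance, the sketch below estimates a copula CDF value by plain Monte Carlo for the independence copula, where the exact value is the product of the coordinates. Plain Monte Carlo is swapped in here for simplicity; the benchmark reports quasi-Monte Carlo differences, whose error behavior differs. The function name `mc_cdf_independence`, the evaluation point, and the seed are all illustrative choices.

```julia
using Random

# Plain Monte Carlo estimate of C(u) = P(U1 <= u1, ..., Ud <= ud) for the
# independence copula, whose exact value is prod(u). This is a simplification:
# the benchmark itself uses quasi-Monte Carlo on general vines.
function mc_cdf_independence(u::AbstractVector, n::Integer; rng = Random.default_rng())
    hits = 0
    for _ in 1:n
        # Each coordinate gets a fresh independent uniform draw.
        hits += all(rand(rng) <= uj for uj in u)
    end
    return hits / n
end

u = [0.8, 0.7, 0.9]
n = 5000                                  # same order as CDF_N in this run
exact = prod(u)
est = mc_cdf_independence(u, n; rng = MersenneTwister(2024))
se = sqrt(exact * (1 - exact) / n)        # binomial standard error of the estimate

println("exact=", exact, "  estimate=", est,
        "  abs diff=", abs(est - exact), "  standard error=", se)
```

For a probability near 0.5 and n = 5000, the plain Monte Carlo standard error is about 7e-3, the same order as the largest difference reported above.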

Interpretation

  • GaussianCopula is currently the strongest performance case: Julia is faster than rvinecopulib for vectorized log-density in all three tested scenarios.

  • ClaytonCopula and GumbelCopula are competitive for log-density, especially in the p=10 and p=20 truncated scenarios.

  • FrankCopula is validated but remains somewhat slower in log-density.

  • TCopula is correct and validated, but performance-limited by Student-t CDF/quantile evaluations; it is documented as a separate optimization target.