tools/perf
Neil Horman 58b7b94ae5 Convert all perf tools to not use ossl_time_divide
It looses information as its doing integer division

Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/tools/pull/184)
2024-05-02 15:51:43 +02:00
perflib Add a performance test for PEM_read_bio_PrivateKey() 2023-07-14 10:29:17 +01:00
Makefile add basic rw lock performance test 2024-04-24 19:17:46 +02:00
README Fix up stylistic errors 2024-04-24 19:18:43 +02:00
handshake.c Convert all perf tools to not use ossl_time_divide 2024-05-02 15:51:43 +02:00
newrawkey.c Convert all perf tools to not use ossl_time_divide 2024-05-02 15:51:43 +02:00
pemread.c Convert all perf tools to not use ossl_time_divide 2024-05-02 15:51:43 +02:00
providerdoall.c Convert all perf tools to not use ossl_time_divide 2024-05-02 15:51:43 +02:00
randbytes.c Convert all perf tools to not use ossl_time_divide 2024-05-02 15:51:43 +02:00
rsasign.c Convert all perf tools to not use ossl_time_divide 2024-05-02 15:51:43 +02:00
rwlocks.c Fix up stylistic errors 2024-04-24 19:18:43 +02:00
sslnew.c Convert all perf tools to not use ossl_time_divide 2024-05-02 15:51:43 +02:00
x509storeissuer.c make x509storeissuer do timekeeping like the other tests 2024-05-02 15:51:43 +02:00

README

Performance testing tools
=========================

This directory holds tools for carrying out performance tests on OpenSSL.

The various performance test applications are held within this directory, and
various helper files are held in perflib.

The performance test applications are intended to be linked against the
supported OpenSSL version that is to be tested (e.g. 3.1, 3.0 or 1.1.1).
Typically we would expect the apps to be built multiple times (once for each
target OpenSSL version to be tested).

To build the tests we assume that the target OpenSSL has already been built.
Two environment variables are required:

TARGET_OSSL_INCLUDE_PATH: Points to a directory where the OpenSSL include files
are held (typically "include" under the build directory).

TARGET_OSSL_LIBRARY_PATH: Points to a directory where libcrypto.so and libssl.so
are contained.

To build:

export TARGET_OSSL_INCLUDE_PATH=/path/to/openssl/include
export TARGET_OSSL_LIBRARY_PATH=/path/to/openssl
make

The performance testing apps must be run ensuring that libcrypto.so and
libssl.so are on the library path.

For example:

LD_LIBRARY_PATH=/path/to/openssl ./randbytes 10

Each performance testing app takes different parameters. They are described
individually below. All performance testing apps take the "--terse" option,
which has the effect of printing just the bare performance numbers without any
labels.

randbytes
---------

The randbytes test makes 10000 calls to the RAND_bytes() function divided
evenly among multiple threads. The number of threads to use is provided as
an argument and the test reports the average time taken to execute a block of
1000 RAND_bytes() calls.
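
The arithmetic behind these numbers can be sketched in C. This is only a
minimal sketch, not the code in randbytes.c: the helper names
calls_per_thread() and avg_us_per_block() are hypothetical, and we assume that
the reported figure is the summed per-thread elapsed time scaled to a block of
1000 calls. The average is computed in floating point so that nothing is lost
to integer division.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: number of calls each thread makes when total_calls
 * is split evenly. Any remainder is dropped in this sketch.
 */
static size_t calls_per_thread(size_t total_calls, size_t threads)
{
    assert(threads > 0);
    return total_calls / threads;
}

/*
 * Hypothetical helper: average time in microseconds for one block of
 * block_size calls, given the elapsed time summed over all threads.
 * Floating point is used to avoid truncation.
 */
static double avg_us_per_block(double total_us, size_t total_calls,
                               size_t block_size)
{
    assert(total_calls > 0 && block_size > 0);
    return total_us * (double)block_size / (double)total_calls;
}
```

For example, with 4 threads each thread would make 2500 calls, and a summed
elapsed time of 50000us over 10000 calls gives 5000us per block of 1000 calls.
When the thread count does not divide the total evenly this sketch drops the
remainder; the real test may handle that case differently.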

handshake
---------

Performs a combined in-memory client and server handshake. In total 100000
handshakes are performed, divided evenly among the threads. It takes two
arguments:

certsdir - A directory where 2 files exist (servercert.pem and serverkey.pem) for
the server certificate and key. The test/certs directory of the main OpenSSL
source repository contains such files for all supported branches.

threadcount - The number of threads to perform handshakes on in the test.

The output is two values: the average time taken for a single handshake in
microseconds (us), and the average number of simultaneous handshakes per second
performed over the course of the test.
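
The two output values can be illustrated with a short C sketch. It is not taken
from handshake.c: the function names are hypothetical, and we assume that the
first value is the per-thread elapsed time summed over all threads and divided
by the number of handshakes, and that the second is the number of handshakes
divided by the wall-clock duration of the whole run.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: average time per handshake in microseconds, taken
 * as the sum of each thread's elapsed time over the total handshake count.
 */
static double avg_handshake_us(const double *thread_us, size_t threads,
                               size_t total_handshakes)
{
    double sum = 0.0;
    size_t i;

    assert(total_handshakes > 0);
    for (i = 0; i < threads; i++)
        sum += thread_us[i];
    return sum / (double)total_handshakes;
}

/*
 * Hypothetical helper: handshakes completed per second across all threads,
 * given the wall-clock duration of the run in microseconds.
 */
static double handshakes_per_second(size_t total_handshakes, double wall_us)
{
    assert(wall_us > 0.0);
    return (double)total_handshakes * 1e6 / wall_us;
}
```

Under these assumptions, two threads that spend 1s and 3s on 100000 handshakes
in total give an average of 40us per handshake, and a run that lasts 2s of
wall-clock time gives 50000 handshakes per second.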

sslnew
------

The sslnew test repeatedly constructs a new SSL object, associates it with a
newly constructed read BIO and a newly constructed write BIO, and finally frees
them again. It does 100000 repetitions divided evenly among the threads.
The number of threads to use is provided as an argument and the test
reports the average time taken to execute a block of 1000 construction/free
calls.

newrawkey
---------

The newrawkey test repeatedly calls the EVP_PKEY_new_raw_public_key_ex()
function. It does 100000 repetitions divided evenly among the threads. The
number of threads to use is provided as an argument and the test reports the
average time taken to execute a block of 1000 EVP_PKEY_new_raw_public_key_ex()
calls.

Note that this test does not support OpenSSL 1.1.1.

rsasign
-------

The rsasign test repeatedly calls the EVP_PKEY_sign_init()/EVP_PKEY_sign()
functions, using a 512-bit RSA key. It does 100000 repetitions divided evenly
among the threads. The number of threads to use is provided as an argument and
the test reports the average time taken to execute a block of 1000
EVP_PKEY_sign_init()/EVP_PKEY_sign() calls.

x509storeissuer
---------------

The x509storeissuer test repeatedly calls the X509_STORE_CTX_get1_issuer()
function in a loop (this function is used in certificate chain building as part
of a verify operation). The test assumes that the default certificates
directory exists but is empty. For a default configuration this is
"/usr/local/ssl/certs". The number of threads to use is provided as an argument
and the test reports the average time taken to execute a block of 1000
X509_STORE_CTX_get1_issuer() calls.

providerdoall
-------------

The providerdoall test repeatedly calls the OSSL_PROVIDER_do_all() function.
It does 100000 repetitions divided evenly among the threads. The number of
threads to use is provided as an argument and the test reports the average time
taken to execute a block of 1000 OSSL_PROVIDER_do_all() calls.

pemread
-------

The pemread test repeatedly calls the PEM_read_bio_PrivateKey() function on
a memory BIO with a private RSA key. It does 100000 repetitions divided evenly
among the threads. The number of threads to use is provided as an argument and
the test reports the average time taken to execute a block of 1000
PEM_read_bio_PrivateKey() calls.

rwlocks
-------

The rwlocks test creates the number of threads specified on the command line,
splitting them evenly between read and write functions (though this is
adjustable via the LOCK_WRITERS environment variable). Threads then iteratively
acquire a shared rwlock to read or update some shared data. The number of read
and write lock/unlock pairs is reported as a performance measurement.