Commit Graph

589 Commits

Author SHA1 Message Date
Theo Buehler 25391acc14 Unable to run asm code on OpenBSD (amd64)
In order to get asm code running on OpenBSD we must place
all constants into .rodata sections.

davidben@ also pointed out we need to adjust `x86_64-xlate.pl` perlasm
script to adjust read-olny sections for various flavors (OSes). Those
changes were cherry-picked from boringssl.

closes #23312

Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/23997)
2024-04-17 09:38:06 +02:00
Tom Cosgrove 88c74fe05b aarch64: fix BTI in bsaes assembly code
Change-Id: I63f0fb2af5eb9cea515dec96485325f8efd50511

Reviewed-by: Shane Lontis <shane.lontis@oracle.com>
Reviewed-by: Kurt Roeckx <kurt@roeckx.be>
(Merged from https://github.com/openssl/openssl/pull/23982)
2024-04-10 09:20:12 +02:00
Richard Levitte b646179229 Copyright year updates
Reviewed-by: Neil Horman <nhorman@openssl.org>
Release: yes
(cherry picked from commit 0ce7d1f355)

Reviewed-by: Hugo Landau <hlandau@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/24034)
2024-04-09 13:43:26 +02:00
Richard Levitte ed0f79c7ae Fix a few incorrect paths in some build.info files
The following files referred to ../liblegacy.a when they should have
referred to ../../liblegacy.a.  This cause the creation of a mysterious
directory 'crypto/providers', and because of an increased strictness
with regards to where directories are created, configuration failure
on some platforms.

Fixes #23436

Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
(Merged from https://github.com/openssl/openssl/pull/23452)

(cherry picked from commit 667b45454a)
2024-02-02 14:12:49 +01:00
Tomas Mraz 493ad484e9 Disable build of HWAES on PPC Macs
Fixes #22818

Reviewed-by: Neil Horman <nhorman@openssl.org>
Reviewed-by: Todd Short <todd.short@me.com>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/22860)
2024-01-11 11:08:31 +01:00
fangming.fang 1d1ca79fe3 Preserve callee-saved registers in aarch64 AES-CTR code
The AES-CTR assembly code uses v8-v15 registers, they are
callee-saved registers, they must be preserved before the
use and restored after the use.

Change-Id: If9192d1f0f3cea7295f4b0d72ace88e6e8067493

Reviewed-by: Shane Lontis <shane.lontis@oracle.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/23233)
2024-01-10 09:52:15 +01:00
Max Bachmann 84356a02fe remove duplicated typedef for u64
This typedef is already created in aes_local.h as `typedef uint64_t u64;`.

Reviewed-by: Shane Lontis <shane.lontis@oracle.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/22969)
2023-12-12 20:01:34 +01:00
fisher.yu cc82b09cbd Optimize AES-CTR for ARM Neoverse V1 and V2.
Unroll AES-CTR loops to a maximum 12 blocks for ARM Neoverse V1 and
    V2, to fully utilize their AES pipeline resources.

    Improvement on ARM Neoverse V1.

    Package Size(Bytes)	16	32	64	128	256	1024
    Improvement(%)	3.93	-0.45	11.30	4.31	12.48	37.66
    Package Size(Bytes)	1500	8192	16384	61440	65536
    Improvement(%)	37.16	38.90	39.89	40.55	40.41

Change-Id: Ifb8fad9af22476259b9ba75132bc3d8010a7fdbd

Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/22733)
2023-11-29 18:10:31 +01:00
Phoebe Chen 751a22194e riscv: Provide vector crypto implementation of AES-ECB mode.
This patch provides stream and multi-block implementations for
AES-128-ECB, AES-192-ECB, and AES-256-ECB to accelerate AES-ECB.
Also, refactor functions to share the same variable
declaration in aes-riscv64-zvkned.pl.

Signed-off-by: Phoebe Chen <phoebe.chen@sifive.com>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
2023-10-26 15:55:50 +01:00
Jerry Shih 3e56c0efe7 riscv: Provide vector crypto implementation of AES-128/256-XTS mode.
To accelerate the performance of the AES-XTS mode, in this patch, we
have the specialized multi-block implementation for AES-128-XTS and
AES-256-XTS.

Signed-off-by: Jerry Shih <jerry.shih@sifive.com>
Signed-off-by: Phoebe Chen <phoebe.chen@sifive.com>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
2023-10-26 15:55:50 +01:00
Phoebe Chen 18ed3a58b0 riscv: Provide vector crypto implementation of AES-CTR mode.
Support zvbb-zvkned based rvv AES-128/192/256-CTR encryption.

Signed-off-by: Phoebe Chen <phoebe.chen@sifive.com>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
2023-10-26 15:55:50 +01:00
Phoebe Chen 5e16a6276b riscv: Provide vector crypto implementation of AES-CBC mode.
To accelerate the performance of the AES-128/192/256-CBC block cipher
encryption, we used the vaesz, vaesem and vaesef instructions, which
implement a single round of AES encryption.

Similarly, to optimize the performance of AES-128/192/256-CBC block
cipher decryption, we have utilized the vaesz, vaesdm, and vaesdf
instructions, which facilitate a single round of AES decryption.

Furthermore, we optimize the key and initialization vector (IV) step by
keeping the rounding key in vector registers.

Signed-off-by: Phoebe Chen <phoebe.chen@sifive.com>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
2023-10-26 15:55:50 +01:00
Phoebe Chen d26d01e5ec riscv: Further optimization for single block aes-zvkned decryption.
Interleave key loading and aes decrypt computing for single block aes.

Signed-off-by: Phoebe Chen <phoebe.chen@sifive.com>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
2023-10-26 15:55:50 +01:00
Phoebe Chen 42f1122848 riscv: Further optimization for single block aes-zvkned encryption.
Interleave key loading and aes encrypt computing for single block aes.

Signed-off-by: Phoebe Chen <phoebe.chen@sifive.com>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
2023-10-26 15:55:50 +01:00
Ard Biesheuvel 94474e02fa riscv: Implement AES-192
Even though the RISC-V vector instructions only support AES-128 and
AES-256 for key generation, the round instructions themselves can
easily be used to implement AES-192 too - we just need to fallback to
the generic key generation routines in this case.

Note that the vector instructions use the encryption key schedule (but
in reverse order) so we need to generate the encryption key schedule
even when doing decryption using the vector instructions.

Signed-off-by: Ard Biesheuvel <ardb@google.com>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
2023-10-26 15:55:49 +01:00
Christoph Müllner f6631e38f9 riscv: AES: Provide a Zvkned-based implementation
The upcoming RISC-V vector crypto extensions provide
the Zvkned extension, that provides a AES-specific instructions.
This patch provides an implementation that utilizes this
extension if available.

Tested on QEMU and no regressions observed.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
2023-10-26 15:55:49 +01:00
Danny Tsen 3d3a7ecd1a Improve performance for 6x unrolling with vpermxor instruction
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21812)
2023-10-02 14:00:23 +02:00
Matt Caswell da1c088f59 Copyright year updates
Reviewed-by: Richard Levitte <levitte@openssl.org>
Release: yes
2023-09-07 09:59:15 +01:00
zhuchen 780ce3849f Fixed incorrect usage of vshuf.b instruction
In the definition of the latest revised LoongArch64 vector instruction manual,
it is clearly pointed out that the undefined upper three bits of each byte in
the control register of the vshuf.b instruction should not be used, otherwise
uncertain results may be obtained. Therefore, it is necessary to correct the
use of the vshuf.b instruction in the existing vpaes-loongarch64.pl code to
avoid erroneous calculation results in future LoongArch64 processors.

Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21530)
2023-08-01 19:42:58 +02:00
Dimitri Papadopoulos a024ab984e Fix typos found by codespell
Reviewed-by: Hugo Landau <hlandau@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21467)
2023-07-18 18:54:45 +10:00
Heiko Stuebner 3e76b38852 riscv: Clarify dual-licensing wording for GCM and AES
The original text for the Apache + BSD dual licensing for riscv GCM and AES
perlasm was taken from other openSSL users like crypto/crypto/LPdir_unix.c .

Though Eric pointed out that the dual-licensing text could be read in a
way negating the second license [0] and suggested to clarify the text
even more.

So do this here for all of the GCM, AES and shared riscv.pm .

We already had the agreement of all involved developers for the actual
dual licensing in [0] and [1], so this is only a better clarification
for this.

[0] https://github.com/openssl/openssl/pull/20649#issuecomment-1589558790
[1] https://github.com/openssl/openssl/pull/21018

Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>

Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21357)
2023-07-06 12:53:27 +10:00
Heiko Stuebner 6181a33367 riscv: aes: dual-license under Apache + 2-clause BSD
To allow re-use of the already reviewed openSSL crypto code for RISC-V in
other projects - like the Linux kernel, add a second license (2-clause BSD)
to the 32+64bit aes implementations using the Zkn extension.

Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Todd Short <todd.short@me.com>
(Merged from https://github.com/openssl/openssl/pull/21018)
2023-06-11 01:30:14 -04:00
Liu-ErMeng 4df13d1054 fix aes-xts bug on aarch64 big-endian env.
Signed-off-by: Liu-ErMeng <liuermeng2@huawei.com>

Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/20797)
2023-04-28 09:19:49 +02:00
Tomas Mraz 72dfe46550 aesv8-armx.pl: Avoid buffer overrread in AES-XTS decryption
Original author: Nevine Ebeid (Amazon)
Fixes: CVE-2023-1255

The buffer overread happens on decrypts of 4 mod 5 sizes.
Unless the memory just after the buffer is unmapped this is harmless.

Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
(Merged from https://github.com/openssl/openssl/pull/20759)
2023-04-20 17:48:16 +02:00
Pauli c879f8ac56 Fix copyright disclaimer.
The mention of the GPL shouldn't have been there.

Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/20517)
2023-03-16 10:17:58 +01:00
Christoph Müllner c8a641c39f riscv: aes: Move reusable Perl code into Perl module
Move helper functions and instruction encoding functions
into a riscv.pm Perl module to avoid pointless code duplication.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/20078)
2023-03-16 13:12:19 +11:00
Kornel Dulęba 27093ba733 aes/asm/bsaes-armv7.pl: Replace adrl with add
"adrl" is a pseudo-instruction used to calculate an address relative
to PC. It's not recognized by clang resulting in a compilation error.
I've stumbled upon it when trying to integrate the bsaes-armv7 assmebly
logic into FreeBSD kernel, which uses clang as it's default compiler.
Note that this affect the build only if BSAES_ASM_EXTENDED_KEY is
defined, which is not the default option in OpenSSL.

The solution here is to replace it with an add instruction.
This mimics what has already been done in !BSAES_ASM_EXTENDED_KEY logic.
Because of that I've marked this as trivial CLA.

CLA: trivial
Signed-off-by: Kornel Dulęba <mindal@semihalf.com>

Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/20458)
2023-03-15 08:22:51 +11:00
zhuchen ef917549f5 Add vpaes-loongarch64.pl module.
Add 128 bit lsx vector expansion optimization code of Loongarch64 architecture
to AES. The test result on the 3A5000 improves performance by about 40%~50%.

Signed-off-by: zhuchen <zhuchen@loongson.cn>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/19364)
2022-10-12 18:02:12 +11:00
Hongren (Zenithal) Zheng b733ce73a4 add build support for riscv32 aes zkn
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18308)
2022-09-05 10:20:30 +10:00
Hongren (Zenithal) Zheng b1b889d1b3 Add AES implementation in riscv32 zkn asm
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18308)
2022-09-05 10:20:30 +10:00
Tom Cosgrove 1efd8533e1 Fix aarch64 signed bit shift issue found by UBSAN
Also fix conditional branch out of range when using sanitisers.

Fixes #18813

Signed-off-by: Tom Cosgrove <tom.cosgrove@arm.com>

Change-Id: Ic543885091ed3ef2ddcbe21de0a4ac0bca1e2494

Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18816)
2022-07-19 12:14:33 +02:00
Bernd Edlinger 65523758e5 Fix reported performance degradation on aarch64
This restores the implementation prior to
commit 2621751 ("aes/asm/aesv8-armx.pl: avoid 32-bit lane assignment in CTR mode")
for 64bit targets only, since it is reportedly 2-17% slower,
and the silicon errata only affects 32bit targets.
Only for 32bit targets the new algorithm is used.

Fixes #18445

Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18581)
2022-07-08 11:07:12 +02:00
Alex Chernyakhovsky 6ebf6d5159 Fix AES OCB encrypt/decrypt for x86 AES-NI
aesni_ocb_encrypt and aesni_ocb_decrypt operate by having a fast-path
that performs operations on 6 16-byte blocks concurrently (the
"grandloop") and then proceeds to handle the "short" tail (which can
be anywhere from 0 to 5 blocks) that remain.

As part of initialization, the assembly initializes $len to the true
length, less 96 bytes and converts it to a pointer so that the $inp
can be compared to it. Each iteration of "grandloop" checks to see if
there's a full 96-byte chunk to process, and if so, continues. Once
this has been exhausted, it falls through to "short", which handles
the remaining zero to five blocks.

Unfortunately, the jump at the end of "grandloop" had a fencepost
error, doing a `jb` ("jump below") rather than `jbe` (jump below or
equal). This should be `jbe`, as $inp is pointing to the *end* of the
chunk currently being handled. If $inp == $len, that means that
there's a whole 96-byte chunk waiting to be handled. If $inp > $len,
then there's 5 or fewer 16-byte blocks left to be handled, and the
fall-through is intended.

The net effect of `jb` instead of `jbe` is that the last 16-byte block
of the last 96-byte chunk was completely omitted. The contents of
`out` in this position were never written to. Additionally, since
those bytes were never processed, the authentication tag generated is
also incorrect.

The same fencepost error, and identical logic, exists in both
aesni_ocb_encrypt and aesni_ocb_decrypt.

This addresses CVE-2022-2097.

Co-authored-by: Alejandro Sedeño <asedeno@google.com>
Co-authored-by: David Benjamin <davidben@google.com>

Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
2022-07-05 10:10:24 +02:00
Hongren (Zenithal) Zheng 9912c38ed6 add build support for riscv64 aes zkn
Signed-off-by: Hongren (Zenithal) Zheng <i@zenithal.me>

Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18197)
2022-06-10 11:45:41 +02:00
Hongren (Zenithal) Zheng 608cadfbdb Add AES implementation in riscv64 zkn asm
Signed-off-by: Hongren (Zenithal) Zheng <i@zenithal.me>

Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18197)
2022-06-10 11:45:41 +02:00
Sebastian Andrzej Siewior 9968c77539 Rename x86-32 assembly files from .s to .S.
Rename x86-32 assembly files from .s to .S. While processing the .S file
gcc will use the pre-processor whic will evaluate macros and ifdef. This
is turn will be used to enable the endbr32 opcode based on the __CET__
define.

Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18353)
2022-05-24 13:16:06 +10:00
Henry Brausen b3504b600c Add AES implementation in generic riscv64 asm
This implementation is based on the four-table approach, along the same
lines as the non-constant-time implementation in aes_core.c The
implementation is in perlasm.

Utility functions are defined to automatically stack/unstack registers
as needed for prologues and epilogues. See riscv-elf-psabi-doc at
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/ for ABI details.

Reviewed-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Signed-off-by: Henry Brausen <henry.brausen@vrull.eu>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17640)
2022-05-19 16:32:49 +10:00
Matt Caswell fecb3aae22 Update copyright year
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Release: yes
2022-05-03 13:34:51 +01:00
Tom Cosgrove 5adddcd962 Fix gcc 6.3 builds of aarch64 BSAES
gcc6.3 doesn't seem to support the register aliases fp and lr for x29 and x30,
so use the x names.

Fixes #18114

Change-Id: I077edda42af4c7cdb7b24f28ac82d1603f550108

Reviewed-by: Shane Lontis <shane.lontis@oracle.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18127)
2022-04-21 10:37:00 +02:00
Sebastian Pop 9c140a3366 disable 5x interleave on buffers shorter than 512 bytes: 3% speedup on Graviton2
d6e4287c97 introduced 5x interleaving as an
optimization for ThunderX2, and that leads to some performance degradation on
when encoding short buffers.  We found this performance degradation by measuring
the performance of nginx on Ubuntu 20.04 that comes with OpenSSL 1.1.1f and
Ubuntu 22.04 with OpenSSL 3.0.1.

This patch limits the 5x interleave to buffers larger than 512 bytes.
On Graviton2 we see the following performance with this patch:

$ openssl speed -evp aes-128-gcm -bytes 128

AES-128-GCM   64 bytes     79 bytes     80 bytes     128 bytes    256 bytes    511 bytes    512 bytes    1024 bytes
master        1062564.71k  775113.11k   1069959.33k  1411716.28k  1653114.86k  1585981.16k  1973683.03k  2203214.08k
master+patch  1062729.28k  771915.11k   1103883.42k  1458665.43k  1708701.20k  1647060.84k  1975571.80k  2204038.42k
diff          0%           0%           3%           3%           3%           4%           0%           0%
revert d6e428 1055290.03k  773448.92k   1117411.97k  1441478.57k  1695698.52k  1634598.04k  1981851.65k  2196680.36k

CLA: trivial

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17984)
2022-03-31 16:28:42 +11:00
Ben Avison 2bd5cde57e Remove further uses of __ARMEL__ in AArch64 assembly
The sweep of the source tree in #17373 missed the BSAES assembly due its
PR #14592 having been temporarily backed out at the time.

This constitutes a partial fix for #17958 - covers cases except when
configured with -DOPENSSL_AES_CONST_TIME.

Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17988)
2022-03-30 17:13:56 +02:00
Tom Cosgrove a35c3a9f5b Use Perl to generate bsaes-armv8.S
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/14592)
2022-03-09 17:50:03 +01:00
Ben Avison 82551af514 ARM assembly pack: translate bit-sliced AES implementation to AArch64
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/14592)
2022-03-09 17:50:03 +01:00
Pauli e180bf641e aes: make the no-asm constant time code path not the default
After OMC and OTC discussions, the 95% performance loss resulting from
the constant time code was deemed excessive for something outside of
our security policy.

The option to use the constant time code exists as it was in OpenSSL 1.1.1.

Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17600)
2022-01-31 11:39:00 +11:00
David Benjamin 40c24d74de Don't use __ARMEL__/__ARMEB__ in aarch64 assembly
GCC's __ARMEL__ and __ARMEB__ defines denote little- and big-endian arm,
respectively. They are not defined on aarch64, which instead use
__AARCH64EL__ and __AARCH64EB__.

However, OpenSSL's assembly originally used the 32-bit defines on both
platforms and even define __ARMEL__ and __ARMEB__ in arm_arch.h. This is
less portable and can even interfere with other headers, which use
__ARMEL__ to detect little-endian arm.

Over time, the aarch64 assembly has switched to the correct defines,
such as in 32bbb62ea6. This commit
finishes the job: poly1305-armv8.pl needed a fix and the dual-arch
armx.pl files get one more transform to convert from 32-bit to 64-bit.

(There is an even more official endianness detector, __ARM_BIG_ENDIAN in
the Arm C Language Extensions. But I've stuck with the GCC ones here as
that would be a larger change.)

Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
(Merged from https://github.com/openssl/openssl/pull/17373)
2022-01-09 07:40:44 +01:00
Dimitris Apostolou e304aa87b3 Fix typos
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17392)
2022-01-05 12:37:20 +01:00
x2018 1287dabd0b fix some code with obvious wrong coding style
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/16918)
2021-10-28 13:10:46 +10:00
Tomas Mraz d92c696d82 Add missing define to enable AES-NI usage on x86 platform
Fixes #16858

Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/16866)
2021-10-21 18:23:46 +02:00
Russ Butler 19e277dd19 aarch64: support BTI and pointer authentication in assembly
This change adds optional support for
- Armv8.3-A Pointer Authentication (PAuth) and
- Armv8.5-A Branch Target Identification (BTI)
features to the perl scripts.

Both features can be enabled with additional compiler flags.
Unless any of these are enabled explicitly there is no code change at
all.

The extensions are briefly described below. Please read the appropriate
chapters of the Arm Architecture Reference Manual for the complete
specification.

Scope
-----

This change only affects generated assembly code.

Armv8.3-A Pointer Authentication
--------------------------------

Pointer Authentication extension supports the authentication of the
contents of registers before they are used for indirect branching
or load.

PAuth provides a probabilistic method to detect corruption of register
values. PAuth signing instructions generate a Pointer Authentication
Code (PAC) based on the value of a register, a seed and a key.
The generated PAC is inserted into the original value in the register.
A PAuth authentication instruction recomputes the PAC, and if it matches
the PAC in the register, restores its original value. In case of a
mismatch, an architecturally unmapped address is generated instead.

With PAuth, mitigation against ROP (Return-oriented Programming) attacks
can be implemented. This is achieved by signing the contents of the
link-register (LR) before it is pushed to stack. Once LR is popped,
it is authenticated. This way a stack corruption which overwrites the
LR on the stack is detectable.

The PAuth extension adds several new instructions, some of which are not
recognized by older hardware. To support a single codebase for both pre
Armv8.3-A targets and newer ones, only NOP-space instructions are added
by this patch. These instructions are treated as NOPs on hardware
which does not support Armv8.3-A. Furthermore, this patch only considers
cases where LR is saved to the stack and then restored before branching
to its content. There are cases in the code where LR is pushed to stack
but it is not used later. We do not address these cases as they are not
affected by PAuth.

There are two keys available to sign an instruction address: A and B.
PACIASP and PACIBSP only differ in the used keys: A and B, respectively.
The keys are typically managed by the operating system.

To enable generating code for PAuth compile with
-mbranch-protection=<mode>:

- standard or pac-ret: add PACIASP and AUTIASP, also enables BTI
  (read below)
- pac-ret+b-key: add PACIBSP and AUTIBSP

Armv8.5-A Branch Target Identification
--------------------------------------

Branch Target Identification features some new instructions which
protect the execution of instructions on guarded pages which are not
intended branch targets.

If Armv8.5-A is supported by the hardware, execution of an instruction
changes the value of PSTATE.BTYPE field. If an indirect branch
lands on a guarded page the target instruction must be one of the
BTI <jc> flavors, or in case of a direct call or jump it can be any
other instruction. If the target instruction is not compatible with the
value of PSTATE.BTYPE a Branch Target Exception is generated.

In short, indirect jumps are compatible with BTI <j> and <jc> while
indirect calls are compatible with BTI <c> and <jc>. Please refer to the
specification for the details.

Armv8.3-A PACIASP and PACIBSP are implicit branch target
identification instructions which are equivalent with BTI c or BTI jc
depending on system register configuration.

BTI is used to mitigate JOP (Jump-oriented Programming) attacks by
limiting the set of instructions which can be jumped to.

BTI requires active linker support to mark the pages with BTI-enabled
code as guarded. For ELF64 files BTI compatibility is recorded in the
.note.gnu.property section. For a shared object or static binary it is
required that all linked units support BTI. This means that even a
single assembly file without the required note section turns-off BTI
for the whole binary or shared object.

The new BTI instructions are treated as NOPs on hardware which does
not support Armv8.5-A or on pages which are not guarded.

To insert this new and optional instruction compile with
-mbranch-protection=standard (also enables PAuth) or +bti.

When targeting a guarded page from a non-guarded page, weaker
compatibility restrictions apply to maintain compatibility between
legacy and new code. For detailed rules please refer to the Arm ARM.

Compiler support
----------------

Compiler support requires understanding '-mbranch-protection=<mode>'
and emitting the appropriate feature macros (__ARM_FEATURE_BTI_DEFAULT
and __ARM_FEATURE_PAC_DEFAULT). The current state is the following:

-------------------------------------------------------
| Compiler | -mbranch-protection | Feature macros     |
+----------+---------------------+--------------------+
| clang    | 9.0.0               | 11.0.0             |
+----------+---------------------+--------------------+
| gcc      | 9                   | expected in 10.1+  |
-------------------------------------------------------

Available Platforms
------------------

Arm Fast Model and QEMU support both extensions.

https://developer.arm.com/tools-and-software/simulation-models/fast-models
https://www.qemu.org/

Implementation Notes
--------------------

This change adds BTI landing pads even to assembly functions which are
likely to be directly called only. In these cases, landing pads might
be superfluous depending on what code the linker generates.
Code size and performance impact for these cases would be negligible.

Interaction with C code
-----------------------

Pointer Authentication is a per-frame protection while Branch Target
Identification can be turned on and off only for all code pages of a
whole shared object or static binary. Because of these properties if
C/C++ code is compiled without any of the above features but assembly
files support any of them unconditionally there is no incompatibility
between the two.

Useful Links
------------

To fully understand the details of both PAuth and BTI it is advised to
read the related chapters of the Arm Architecture Reference Manual
(Arm ARM):
https://developer.arm.com/documentation/ddi0487/latest/

Additional materials:

"Providing protection for complex software"
https://developer.arm.com/architectures/learn-the-architecture/providing-protection-for-complex-software

Arm Compiler Reference Guide Version 6.14: -mbranch-protection
https://developer.arm.com/documentation/101754/0614/armclang-Reference/armclang-Command-line-Options/-mbranch-protection?lang=en

Arm C Language Extensions (ACLE)
https://developer.arm.com/docs/101028/latest

Addional Notes
--------------

This patch is a copy of the work done by Tamas Petz in boringssl. It
contains the changes from the following commits:

aarch64: support BTI and pointer authentication in assembly
    Change-Id: I4335f92e2ccc8e209c7d68a0a79f1acdf3aeb791
    URL: https://boringssl-review.googlesource.com/c/boringssl/+/42084
aarch64: Improve conditional compilation
    Change-Id: I14902a64e5f403c2b6a117bc9f5fb1a4f4611ebf
    URL: https://boringssl-review.googlesource.com/c/boringssl/+/43524
aarch64: Fix name of gnu property note section
    Change-Id: I6c432d1c852129e9c273f6469a8b60e3983671ec
    URL: https://boringssl-review.googlesource.com/c/boringssl/+/44024

Change-Id: I2d95ebc5e4aeb5610d3b226f9754ee80cf74a9af

Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/16674)
2021-10-01 09:35:38 +02:00
Matt Caswell 54b4053130 Update copyright year
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/16176)
2021-07-29 15:41:35 +01:00