    x86inc: AVX-512 support · f7197f68
    Henrik Gramner authored
    AVX-512 consists of a plethora of different extensions, but in order to keep
    things a bit more manageable we group together the following extensions
    under a single baseline cpu flag which should cover SKL-X and future CPUs:
     * AVX-512 Foundation (F)
     * AVX-512 Conflict Detection Instructions (CD)
     * AVX-512 Byte and Word Instructions (BW)
     * AVX-512 Doubleword and Quadword Instructions (DQ)
     * AVX-512 Vector Length Extensions (VL)
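
As a rough illustration of what a single baseline flag implies, the check below treats AVX-512 as available only when all five extensions are reported together. This is a minimal sketch, not the detection code from this commit: the function and macro names are hypothetical, the caller is assumed to have already read `CPUID.(EAX=7,ECX=0):EBX` and, after confirming OSXSAVE, `XCR0` via `xgetbv`. The OS state check is not mentioned in the commit message; it is included here on the assumption that a real detection routine would need it.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.(EAX=7,ECX=0):EBX feature bits for the five grouped extensions. */
#define CPUID7_EBX_AVX512F  (UINT32_C(1) << 16)
#define CPUID7_EBX_AVX512DQ (UINT32_C(1) << 17)
#define CPUID7_EBX_AVX512CD (UINT32_C(1) << 28)
#define CPUID7_EBX_AVX512BW (UINT32_C(1) << 30)
#define CPUID7_EBX_AVX512VL (UINT32_C(1) << 31)

/* Hypothetical name: all five bits must be set for the baseline flag. */
#define AVX512_BASELINE_MASK                                      \
    (CPUID7_EBX_AVX512F | CPUID7_EBX_AVX512DQ | CPUID7_EBX_AVX512CD | \
     CPUID7_EBX_AVX512BW | CPUID7_EBX_AVX512VL)

/* XCR0 state bits the OS must enable: SSE (1), AVX (2), opmask (5),
 * ZMM_Hi256 (6) and Hi16_ZMM (7). */
#define XCR0_AVX512_STATE UINT64_C(0xE6)

/* Returns true only if every grouped extension is present and the OS
 * saves and restores the full AVX-512 register state. */
static bool has_avx512_baseline(uint32_t cpuid7_ebx, uint64_t xcr0)
{
    return (cpuid7_ebx & AVX512_BASELINE_MASK) == AVX512_BASELINE_MASK &&
           (xcr0 & XCR0_AVX512_STATE) == XCR0_AVX512_STATE;
}
```

A CPU that reports F and CD but lacks BW, DQ or VL fails this check and would not get the baseline flag under this model.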
    
    On x86-64, AVX-512 provides 16 additional vector registers (16-31). Prefer
    those over the existing ones, since that lets us avoid `vzeroupper` unless
    more than 16 vector registers are required. They are also volatile on
    Windows, so existing xmm register contents only need to be saved and
    restored when more than 22 vector registers are required (the 16 new
    registers plus the volatile xmm0-xmm5).
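
The two thresholds above can be written as a small model. This is only a sketch of the arithmetic in this message, with hypothetical names; it does not reproduce how the x86inc macros allocate or permute registers.

```c
#include <stdbool.h>

enum {
    AVX512_HIGH_REGS       = 16, /* registers 16-31, new with AVX-512 on x86-64 */
    WIN64_VOLATILE_LOW_XMM = 6   /* xmm0-xmm5 are volatile in the Windows x64 ABI */
};

/* vzeroupper is only needed once the function spills past the 16 new
 * registers into the legacy ones. */
static bool needs_vzeroupper(int num_vregs)
{
    return num_vregs > AVX512_HIGH_REGS;
}

/* On Windows, callee-saved xmm registers only have to be saved once the
 * 16 new registers plus the 6 volatile legacy ones are exhausted. */
static bool win64_needs_xmm_save(int num_vregs)
{
    return num_vregs > AVX512_HIGH_REGS + WIN64_VOLATILE_LOW_XMM;
}
```

Under this model a function using 20 vector registers needs `vzeroupper` but no xmm save on Windows, while one using 23 needs both.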
    
    Big thanks to Intel for their support.