Commit 6b039003 authored by Christophe Gisquet, committed by Diego Biurrun

x86 dsputil: provide SSE2/SSSE3 versions of bswap_buf


While pshufb allows emulating bswap on XMM registers with SSSE3, SSE2
needs more shuffling. Alignment is critical, so separate code paths are
provided for aligned and unaligned buffers.
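
To illustrate the two approaches described above, here is a minimal sketch in C intrinsics. It is not the assembly from this commit: the function names are hypothetical, the `target("ssse3")` attribute assumes GCC or Clang, and only unaligned loads are used for simplicity. The SSE2 variant swaps the 16-bit halves of each 32-bit word with pshuflw/pshufhw and then swaps the bytes inside each half with shifts and an OR; the SSSE3 variant does the whole permutation with a single pshufb.

```c
#include <stddef.h>
#include <stdint.h>
#include <emmintrin.h> /* SSE2 intrinsics */
#include <tmmintrin.h> /* SSSE3: _mm_shuffle_epi8 (pshufb) */

/* Scalar reference: reverse the byte order of each 32-bit word. */
static void bswap_buf_c(uint32_t *dst, const uint32_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint32_t v = src[i];
        dst[i] = (v >> 24) | ((v >> 8) & 0x0000ff00u) |
                 ((v << 8) & 0x00ff0000u) | (v << 24);
    }
}

/* SSE2 sketch: no byte shuffle is available, so two steps are needed. */
static void bswap_buf_sse2(uint32_t *dst, const uint32_t *src, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128i v = _mm_loadu_si128((const __m128i *)(src + i));
        /* 0xB1 = 10110001b: swap the two 16-bit halves of each dword. */
        v = _mm_shufflelo_epi16(v, 0xB1);
        v = _mm_shufflehi_epi16(v, 0xB1);
        /* Swap the two bytes inside every 16-bit lane. */
        v = _mm_or_si128(_mm_slli_epi16(v, 8), _mm_srli_epi16(v, 8));
        _mm_storeu_si128((__m128i *)(dst + i), v);
    }
    bswap_buf_c(dst + i, src + i, n - i); /* scalar tail */
}

/* SSSE3 sketch: one pshufb per vector with a byte-reversal mask. */
__attribute__((target("ssse3")))
static void bswap_buf_ssse3(uint32_t *dst, const uint32_t *src, size_t n)
{
    /* Result byte k takes source byte mask[k]; arguments run from byte 15 down to byte 0. */
    const __m128i mask = _mm_set_epi8(12, 13, 14, 15, 8, 9, 10, 11,
                                      4, 5, 6, 7, 0, 1, 2, 3);
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128i v = _mm_loadu_si128((const __m128i *)(src + i));
        _mm_storeu_si128((__m128i *)(dst + i), _mm_shuffle_epi8(v, mask));
    }
    bswap_buf_c(dst + i, src + i, n - i); /* scalar tail */
}
```

An aligned variant would replace `_mm_loadu_si128` with `_mm_load_si128`, which requires a 16-byte-aligned source pointer; a caller would therefore check `((uintptr_t)src & 15) == 0` before choosing that path.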

For the huffyuv sequence "angels_480-huffyuvcompress.avi":
C (using bswap instruction): ~ 55k cycles
SSE2:                        ~ 40k cycles
SSSE3 using unaligned loads: ~ 35k cycles
SSSE3 using aligned loads:   ~ 30k cycles

Signed-off-by: Diego Biurrun <diego@biurrun.de>
parent a8462023