Commits · 8fac0d640341f1b26fed8c9af4c31cb9485d5fd5 · libremedia / Tethys / FFmpeg

Oct 02, 2015

x86/audio_convert: fix clobbering of xmm registers · acdd6725

James Almer authored 9 years ago


Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>

acdd6725

Aug 03, 2015

x86: move XOP emulation code back to x86inc · 5750d6c5

James Almer authored 9 years ago


Only two functions that use xop multiply-accumulate instructions where the
first operand is the same as the fourth actually took advantage of the macros.

This further reduces differences with x264's x86inc.

Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>

5750d6c5

Jul 26, 2015
- swresample/x86: add missing colon to labels · f37a5dcb
  James Almer authored 9 years ago
  
  Silences warnings with Nasm
  Signed-off-by: James Almer <jamrial@gmail.com>
  f37a5dcb
May 31, 2015

x86: check for AV_CPU_FLAG_AVXSLOW where useful · c16e99e3

James Almer authored 9 years ago


Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

c16e99e3

Feb 20, 2015
- swresample: add av_cold to init functions · c0e3b461
  Michael Niedermayer authored 10 years ago
  
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
  c0e3b461
Feb 15, 2015
- x86/swr: make pack_8ch functions work with compilers without aligned stack · f7ed997a
  James Almer authored 10 years ago
  
  Signed-off-by: James Almer <jamrial@gmail.com>
  f7ed997a
Feb 09, 2015
- swresample/x86/rematrix_init: Check av_malloc* return codes, forward errors · b74ecb82
  Michael Niedermayer authored 10 years ago
  
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
  b74ecb82
- swresample/x86/rematrix_init: Use av_mallocz_array() · 48ffaaaa
  Michael Niedermayer authored 10 years ago
  
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
  48ffaaaa
Jan 12, 2015

x86/swr: add SSE/AVX unpack_6ch functions · 59ac93f6

James Almer authored 10 years ago


int32/float only

Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>

59ac93f6

Jan 11, 2015
- x86/swr: load constants outside the loop in pack_6ch functions · 6abf00d6
  James Almer authored 10 years ago
  
  Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
  Signed-off-by: James Almer <jamrial@gmail.com>
  6abf00d6
Dec 31, 2014

x86/swr: disable pack_8ch functions on msvc/icl x86_32 · 975ff6a3

James Almer authored 10 years ago

Until a proper fix is committed.

Signed-off-by: James Almer <jamrial@gmail.com>

975ff6a3

x86/swr: add missing alignment check to pack_6ch functions · 5f14f9e9

James Almer authored 10 years ago

Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>

5f14f9e9

x86/swr: add SSE2/AVX pack_8ch functions · 37b35feb

James Almer authored 10 years ago


Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>

37b35feb

Nov 07, 2014

x86/swr: add ff_float_to_int32_a_avx2 · edff061f

James Almer authored 10 years ago


13797 decicycles in ff_float_to_int32_a_sse2, 32768 runs, 0 skips
8603 decicycles in ff_float_to_int32_a_avx2, 32766 runs, 2 skips

Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>

edff061f

Nov 06, 2014

x86/swr: replace sse4 instructions in pack_6ch with sse ones · b385c4c6

James Almer authored 10 years ago


There's no benefit from using blendps here except on CPUs with AVX, where
it's faster than shufps according to Intel's documentation.
As such, rename the sse4 functions to sse/sse2 and use shufps instead.

Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>

b385c4c6

Jul 04, 2014

x86/swr: use lavu helper macros to check CPU extensions · 9937362c

James Almer authored 10 years ago


Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

9937362c

x86/swr: split audioconvert and rematrix DSP into separate files · 8279a152

James Almer authored 10 years ago


Also rename resample_x86_dsp.c to resample_init.c

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

8279a152

Jul 03, 2014

swr: initialize only the necessary resample dsp functions · 857cd1f3

James Almer authored 10 years ago


Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

857cd1f3

Jul 02, 2014

swr: rename swresample_dsp init functions to swri_resample_dsp · b5f0eac0

James Almer authored 10 years ago


The swresample_ prefix is not for internal functions

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

b5f0eac0

Jul 01, 2014

x86/swr: add ff_resample_{common, linear}_int16_xop · c45b7f0d

James Almer authored 10 years ago


Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

c45b7f0d

x86/swr: add ff_resample_{common, linear}_float_fma · 1a69224f

James Almer authored 10 years ago


Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

1a69224f

x86/swr: convert resample_{common, linear}_double_sse2 to yasm · dd2c9034

James Almer authored 10 years ago


Signed-off-by: James Almer <jamrial@gmail.com>

312531 -> 311528 decicycles

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

dd2c9034

Jun 30, 2014
- swr: convert resample_common/linear_int16_mmx2/sse2 to yasm. · 847bb638
  Ronald S. Bultje authored 10 years ago
  
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
  847bb638
Jun 28, 2014

swr: rewrite resample_common/linear_float_sse/avx in yasm. · faa1471f

Ronald S. Bultje authored 10 years ago


Linear interpolation goes from 63 (llvm) or 58 (gcc) to 48 (yasm)
cycles/sample on 64bit, or from 66 (llvm/gcc) to 52 (yasm)
cycles/sample on 32bit. Non-linear goes from 43 (llvm) or 38 (gcc) to
32 (yasm) cycles/sample on 64bit, or from 46 (llvm) or 44 (gcc) to
38 (yasm) cycles/sample on 32bit (all testing on OSX 10.9.2, llvm
5.1 and gcc 4.8/9).

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

faa1471f

Jun 15, 2014
- swr: compile mmx2 s16p functions only on x86-32. · 083cd3d1
  Ronald S. Bultje authored 10 years ago
  
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
  083cd3d1
Jun 14, 2014

swr: add prototypes for resample dsp functions · 7f4dfbd0

James Almer authored 10 years ago


Should fix compilation failures with MSVC and any other compiler
without inline asm support.

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

7f4dfbd0

swr: remove obsolete function prototypes. · ada8f9c0

Ronald S. Bultje authored 10 years ago

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

ada8f9c0

swr: split out DSP functions. · 7128a35f

Ronald S. Bultje authored 10 years ago

DSP bits of swri_resample go into their own mini-DSP functions; DSP
init goes from a per-call branch in multiple_resample to a proper
DSP init routine; x86 bits go into x86/; swri_resample() moves out of
resample_template.c into resample.c because it's independent of DSP
code or sample type; multiple_resample() is simplified.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

7128a35f
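
The change described in 7128a35f, replacing a per-call branch with a one-time DSP init routine, can be illustrated with a minimal C sketch. Every name here (`ResampleDSP`, `resample_dsp_init`, `dot_c`, `dot_unrolled`, `CPU_FLAG_FAST`) is hypothetical and chosen for illustration; none of them is the actual swresample code, and the "fast" variant is only an unrolled C stand-in for an optimized routine.

```c
#include <stddef.h>

/* Hypothetical function-pointer type for a small DSP kernel. */
typedef float (*dot_fn)(const float *src, const float *filter, size_t len);

/* Hypothetical DSP context: the kernel is chosen once, at init time. */
typedef struct ResampleDSP {
    dot_fn dot;
} ResampleDSP;

/* Plain C reference kernel. */
static float dot_c(const float *src, const float *filter, size_t len)
{
    float acc = 0.0f;
    size_t i;
    for (i = 0; i < len; i++)
        acc += src[i] * filter[i];
    return acc;
}

/* Stand-in for an optimized kernel (assumption: 2x unrolled C, not SIMD). */
static float dot_unrolled(const float *src, const float *filter, size_t len)
{
    float acc0 = 0.0f, acc1 = 0.0f;
    size_t i;
    for (i = 0; i + 1 < len; i += 2) {
        acc0 += src[i]     * filter[i];
        acc1 += src[i + 1] * filter[i + 1];
    }
    if (i < len)                      /* odd tail element */
        acc0 += src[i] * filter[i];
    return acc0 + acc1;
}

/* Hypothetical CPU capability flag. */
enum { CPU_FLAG_FAST = 1 };

/* Init routine: the capability check runs here once,
 * so the per-sample path is a plain indirect call with no branch. */
static void resample_dsp_init(ResampleDSP *dsp, int cpu_flags)
{
    dsp->dot = dot_c;
    if (cpu_flags & CPU_FLAG_FAST)
        dsp->dot = dot_unrolled;
}
```

A caller would run `resample_dsp_init(&dsp, flags)` once during setup and then call `dsp.dot(src, filter, len)` in the hot loop, which keeps the capability check out of the per-call path.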

May 16, 2014

swresample: add swri_resample_float_avx · a9bf713d

James Almer authored 10 years ago


Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

a9bf713d

May 07, 2014
- inline asm: fix arrays as named constraints. · 1898c2f4
  Matt Oliver authored 10 years ago
  
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
  1898c2f4
May 06, 2014

swresample/resample: add missing xmm clobbers · 4cdea929

James Almer authored 10 years ago


Might fix fate-swr on ICL

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

4cdea929

Apr 25, 2014

swresample: add swri_resample_double_sse2 · cdac3ab5

James Almer authored 10 years ago


Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

cdac3ab5

Mar 24, 2014

swresample/resample: sse float linear interpolation · 63dbba65

James Almer authored 11 years ago


About two times faster

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

63dbba65

swresample/resample: mmx2/sse2 int16 linear interpolation · fa25c4c4

James Almer authored 11 years ago


About three times faster

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

fa25c4c4

Mar 20, 2014

swresample: add swri_resample_float_sse · 32291ba6

James Almer authored 11 years ago


At least two times faster than the C version.

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

32291ba6

Mar 18, 2014

Automatically change MANGLE() into named inline asm operands when direct... · 82367475

Matt Oliver authored 11 years ago

Automatically change MANGLE() into named inline asm operands when direct symbol references in inline asm are not supported.

This is part of the patch set for Intel C inline asm support on Windows.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

82367475

swresample: change COMMON_CORE_INT16 asm from SSSE3 to SSE2 · 7c8bf09e

James Almer authored 11 years ago


pshuf+paddd is slightly faster than phaddd.
The real gain is in pre-ssse3 processors like AMD K8 and K10, which get
a big boost in performance compared to the mmxext version

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

7c8bf09e
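
The pshufd+paddd reduction mentioned in 7c8bf09e can be modelled with SSE2 intrinsics. This is only a sketch of the general shuffle-and-add technique for summing four 32-bit lanes without the SSSE3 `phaddd` instruction; the commit itself changes assembly, and the function name `hsum_epi32` and the scalar fallback are assumptions added for illustration.

```c
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* Hypothetical helper: horizontal sum of four int32 lanes. */
static int32_t hsum_epi32(const int32_t v[4])
{
#if defined(__SSE2__)
    __m128i x = _mm_loadu_si128((const __m128i *)v);
    /* pshufd: swap the two 64-bit halves, lanes become [2,3,0,1] */
    __m128i t = _mm_shuffle_epi32(x, _MM_SHUFFLE(1, 0, 3, 2));
    /* paddd: lanes become [0+2, 1+3, 2+0, 3+1] */
    x = _mm_add_epi32(x, t);
    /* pshufd: swap adjacent lanes, lanes become [1,0,3,2] of the sums */
    t = _mm_shuffle_epi32(x, _MM_SHUFFLE(2, 3, 0, 1));
    /* paddd: every lane now holds the full sum */
    x = _mm_add_epi32(x, t);
    return _mm_cvtsi128_si32(x);      /* extract lane 0 */
#else
    /* Scalar fallback with wrapping arithmetic, for non-SSE2 builds. */
    return (int32_t)((uint32_t)v[0] + (uint32_t)v[1] +
                     (uint32_t)v[2] + (uint32_t)v[3]);
#endif
}
```

Because only SSE2 instructions are involved, the same sequence also runs on pre-SSSE3 processors, which is the case the commit message highlights.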

Jan 18, 2014
- swresample: Add arm&x86 clobber tests · 3dd04cbc
  Martin Storsjö authored 11 years ago
  
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
  3dd04cbc
Dec 31, 2013

Avoid using empty macro arguments. · cbeaf678

Reimar Döffinger authored 11 years ago


These are not supported by all compilers (gcc 2.95 but also older SPARC
compilers, see gcc bug #33304 for example), and there is no real need for them.
One use of this feature remains in libavdevice/v4l2.c which can't be
replaced quite as easily.

Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>

cbeaf678
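
One way to avoid empty macro arguments, as cbeaf678 describes, is to make every argument a real token. The sketch below is a generic illustration under that assumption; the macro `DEFINE_SUM` and the generated functions are hypothetical and are not taken from the FFmpeg sources.

```c
#include <stddef.h>

/* Pattern to avoid (shown only as a comment): a macro that builds the
 * function name from a suffix and is sometimes invoked with an empty one,
 *     DEFINE_SUM_SUFFIX(, int)
 * Empty arguments are rejected by some older compilers.
 *
 * Alternative used here: pass the complete function name, so no
 * argument is ever empty. */
#define DEFINE_SUM(name, type)                          \
    static type name(const type *v, size_t n)           \
    {                                                   \
        type acc = 0;                                   \
        size_t i;                                       \
        for (i = 0; i < n; i++)                         \
            acc += v[i];                                \
        return acc;                                     \
    }

/* Each instantiation names its function explicitly. */
DEFINE_SUM(sum_int, int)
DEFINE_SUM(sum_float, float)
```

The cost is slightly more typing at each instantiation; the benefit is that the macro never depends on how a given preprocessor treats an empty argument.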

Oct 08, 2013
- x86: Fix compilation with nasm on PPC & OS/2 · ad75d2b5
  Ronald S. Bultje authored 11 years ago
  
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
  ad75d2b5