av1_convolve_horiz_rs_sse4.c |
|
9922 |
av1_convolve_scale_sse4.c |
|
21556 |
av1_inv_txfm_avx2.c |
|
89957 |
av1_inv_txfm_avx2.h |
|
2616 |
av1_inv_txfm_ssse3.c |
|
113960 |
av1_inv_txfm_ssse3.h |
|
8965 |
av1_txfm_sse2.h |
|
13143 |
av1_txfm_sse4.c |
|
892 |
av1_txfm_sse4.h |
|
2393 |
cdef_block_avx2.c |
partial A is a 16-bit vector of the form:
[x8 - - x1 | x16 - - x9] and partial B has the form:
[0 y1 - y7 | 0 y9 - y15].
This function computes (x1^2+y1^2)*C1 + (x2^2+y2^2)*C2 + ...
(x7^2+y7^2)*C7 + (x8^2+0^2)*C8 on each 128-bit lane. Here the C1..C8 constants
are in const1 and const2. |
15640 |
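The per-lane sum described for cdef_block_avx2.c can be modelled in scalar C. This is a minimal sketch, not the AVX2 code itself: the function name `fold_mul_and_sum_scalar`, the array layout (x1..x8 in `x[0..7]`, y1..y7 in `y[0..6]`, C1..C8 in `c[0..7]`) and the 32-bit accumulator are assumptions made for illustration.

```c
#include <stdint.h>

/* Scalar model of one 128-bit lane (hypothetical helper, not from the source).
 * x[0..7] holds x1..x8, y[0..6] holds y1..y7, c[0..7] holds C1..C8.
 * Returns (x1^2+y1^2)*C1 + ... + (x7^2+y7^2)*C7 + (x8^2+0^2)*C8. */
static int32_t fold_mul_and_sum_scalar(const int16_t x[8], const int16_t y[7],
                                       const int32_t c[8]) {
  int32_t sum = 0;
  for (int i = 0; i < 7; ++i) {
    /* Widen before multiplying so the squares are computed in 32 bits. */
    const int32_t xi = x[i];
    const int32_t yi = y[i];
    sum += (xi * xi + yi * yi) * c[i];
  }
  /* The eighth term has no y partner: partial B carries a 0 in that slot. */
  const int32_t x8 = x[7];
  sum += x8 * x8 * c[7];
  return sum;
}
```

A scalar reference of this kind is mainly useful for checking a vectorised version against known inputs.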
cdef_block_sse4.c |
|
1673 |
cdef_block_ssse3.c |
|
2095 |
cfl_avx2.c |
4x4 |
21372 |
cfl_simd.h |
|
14837 |
cfl_sse2.c |
|
3573 |
cfl_ssse3.c |
Adds 4 pixels (in a 2x2 grid) and multiplies the sum by 2, resulting in a more
precise version of a box filter 4:2:0 pixel subsampling in Q3.
The CfL prediction buffer is always of size CFL_BUF_SQUARE. However, the
active area is specified using width and height.
Note: We don't need to worry about going over the active area, as long as we
stay inside the CfL prediction buffer.
|
16986 |
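The 4:2:0 subsampling described for cfl_ssse3.c can be sketched in scalar C. The average of four pixels in Q3 is (sum / 4) * 8, which equals sum * 2, so no precision is lost to the division. The function name `subsample_420_q3`, its parameters and the explicit output stride are assumptions for illustration. The sketch takes `width` and `height` in luma pixels and assumes both are even.

```c
#include <stdint.h>

/* Scalar sketch (hypothetical helper, not from the source).
 * Each output sample is the sum of a 2x2 block of luma pixels times 2,
 * i.e. the block average expressed in Q3 fixed point. */
static void subsample_420_q3(const uint8_t *luma, int luma_stride,
                             uint16_t *out_q3, int out_stride,
                             int width, int height) {
  for (int j = 0; j < height; j += 2) {
    const uint8_t *top = luma + j * luma_stride;
    const uint8_t *bot = top + luma_stride;
    for (int i = 0; i < width; i += 2) {
      /* At most 4 * 255 * 2 = 2040, so the result fits in 16 bits. */
      const int sum = top[i] + top[i + 1] + bot[i] + bot[i + 1];
      out_q3[(j >> 1) * out_stride + (i >> 1)] = (uint16_t)(sum * 2);
    }
  }
}
```

Only the active area given by `width` and `height` is written; the rest of the output buffer is left untouched.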
convolve_2d_avx2.c |
|
6517 |
convolve_2d_sse2.c |
Horizontal filter |
23807 |
convolve_avx2.c |
rounding code |
39711 |
convolve_sse2.c |
[4] |
19767 |
filterintra_sse4.c |
arbitrary pack arg |
15287 |
highbd_convolve_2d_avx2.c |
Horizontal filter |
8312 |
highbd_convolve_2d_sse4.c |
|
18237 |
highbd_convolve_2d_ssse3.c |
Horizontal filter |
17061 |
highbd_inv_txfm_avx2.c |
|
169251 |
highbd_inv_txfm_sse4.c |
SSE4.1 |
223876 |
highbd_jnt_convolve_avx2.c |
|
36207 |
highbd_jnt_convolve_sse4.c |
Vertical filter |
16618 |
highbd_txfm_utility_sse4.h |
SSE4.1 |
5597 |
highbd_warp_affine_avx2.c |
|
29103 |
highbd_warp_plane_sse4.c |
|
27683 |
highbd_wiener_convolve_avx2.c |
Horizontal filter |
11602 |
highbd_wiener_convolve_ssse3.c |
Horizontal filter |
8618 |
intra_edge_sse4.c |
|
11606 |
jnt_convolve_avx2.c |
|
53957 |
jnt_convolve_sse2.c |
|
15331 |
jnt_convolve_ssse3.c |
Horizontal filter |
9983 |
reconinter_avx2.c |
|
27617 |
reconinter_sse4.c |
SSE4.1 |
6233 |
reconinter_ssse3.c |
|
4755 |
resize_avx2.c |
g0... g15 | i0... i15 |
36682 |
resize_sse2.c |
ah0 ah1 ... ah7 |
15713 |
resize_ssse3.c |
|
39070 |
selfguided_avx2.c |
|
28680 |
selfguided_sse4.c |
|
26475 |
warp_plane_avx2.c |
|
53826 |
warp_plane_sse4.c |
This is a modified version of 'av1_warped_filter' from warped_motion.c:
Each coefficient is stored in 8 bits instead of 16 bits.
The coefficients are rearranged in the column order 0, 2, 4, 6, 1, 3, 5, 7.
This is done in order to avoid overflow: since the tap with the largest
coefficient could be any of taps 2, 3, 4 or 5, we can't use the summation
order ((0 + 1) + (4 + 5)) + ((2 + 3) + (6 + 7)) used in the regular
convolve functions.
Instead, we use the summation order
((0 + 2) + (4 + 6)) + ((1 + 3) + (5 + 7)).
The rearrangement of coefficients in this table is so that we can get the
coefficients into the correct order more quickly.
|
41999 |
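The overflow argument in the warp_plane_sse4.c note can be illustrated with a scalar sketch that compares the worst-case partial sums of the two groupings. It assumes unsigned 8-bit pixels and signed 16-bit intermediates; the helper names and the example filter used below are hypothetical and are not taken from the coefficient table.

```c
#include <stdint.h>

/* Worst-case positive value of sum(pixel[k] * coeff[k]) over the taps listed
 * in idx[0..n-1], for unsigned 8-bit pixels: the maximum is reached with
 * pixel = 255 wherever the coefficient is positive and 0 elsewhere. */
static int32_t worst_case_sum(const int8_t coeff[8], const int *idx, int n) {
  int32_t sum = 0;
  for (int k = 0; k < n; ++k) {
    if (coeff[idx[k]] > 0) sum += 255 * (int32_t)coeff[idx[k]];
  }
  return sum;
}

/* Largest worst-case partial sum among the two four-tap groups. */
static int32_t max_group_sum(const int8_t coeff[8], const int groups[2][4]) {
  const int32_t a = worst_case_sum(coeff, groups[0], 4);
  const int32_t b = worst_case_sum(coeff, groups[1], 4);
  return a > b ? a : b;
}

/* ((0 + 1) + (4 + 5)) + ((2 + 3) + (6 + 7)) */
static const int kRegularOrder[2][4] = { { 0, 1, 4, 5 }, { 2, 3, 6, 7 } };
/* ((0 + 2) + (4 + 6)) + ((1 + 3) + (5 + 7)) */
static const int kWarpOrder[2][4] = { { 0, 2, 4, 6 }, { 1, 3, 5, 7 } };
```

For an illustrative filter such as { 0, 0, 60, 70, -2, 0, 0, 0 }, the regular grouping puts taps 2 and 3 in the same partial sum, which can reach 130 * 255 = 33150 and exceeds INT16_MAX (32767). The warp grouping separates them, so the largest partial sum is 70 * 255 = 17850.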
wiener_convolve_avx2.c |
|
9956 |
wiener_convolve_sse2.c |
Horizontal filter |
8848 |