bitscan.h |
Find the first (least significant) bit set in a word. Bit positions are
1-based: the least significant bit is position 1.
Return 0 if no bits are set.
|
7502 |
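A minimal sketch of the convention described for bitscan.h, assuming a 32-bit word. The name `sketch_ffs32` is hypothetical and is not the header's API; a real implementation would more likely use a compiler builtin than a loop.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns the 1-based position of the least
 * significant set bit, or 0 when no bit is set. */
static int sketch_ffs32(uint32_t word)
{
   if (word == 0)
      return 0;

   int pos = 1;
   while ((word & 1u) == 0) {
      word >>= 1;
      pos++;
   }

   /* A nonzero 32-bit word always has its first set bit in 1..32. */
   assert(pos >= 1 && pos <= 32);
   return pos;
}
```

For example, `sketch_ffs32(8)` yields 4, because bit 3 (zero-based) is position 4 in the 1-based scheme.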
bitset.h |
\file bitset.h
\brief Bitset of arbitrary size definitions.
\author Michal Krol
|
12089 |
blob.c |
Ensure that \blob will be able to fit an additional object of size
\additional. The growing (if any) will occur by doubling the existing
allocation.
|
9149 |
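The grow-by-doubling behaviour can be sketched as follows. The struct layout, the initial size of 64 bytes, and the name `sketch_grow_to_fit` are assumptions made for illustration; they are not taken from blob.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a blob: a byte buffer with size and capacity. */
struct sketch_blob {
   uint8_t *data;
   size_t allocated;   /* capacity in bytes */
   size_t size;        /* bytes written so far */
};

#define SKETCH_INITIAL_SIZE 64   /* assumed starting capacity */

/* Ensure room for `additional` more bytes, doubling the capacity as needed. */
static bool sketch_grow_to_fit(struct sketch_blob *blob, size_t additional)
{
   assert(blob->size <= blob->allocated);

   if (additional <= blob->allocated - blob->size)
      return true;                    /* already fits, no growth */
   if (additional > SIZE_MAX - blob->size)
      return false;                   /* size + additional would overflow */

   size_t needed = blob->size + additional;
   size_t to_allocate = blob->allocated ? blob->allocated : SKETCH_INITIAL_SIZE;
   while (to_allocate < needed) {
      if (to_allocate > SIZE_MAX / 2) {
         to_allocate = needed;        /* cannot double without overflow */
         break;
      }
      to_allocate *= 2;
   }

   uint8_t *new_data = realloc(blob->data, to_allocate);
   if (new_data == NULL)
      return false;

   blob->data = new_data;
   blob->allocated = to_allocate;
   return true;
}
```

Because the capacity doubles, writing n bytes in small pieces triggers only on the order of log n reallocations.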
blob.h |
The blob functions implement a simple, low-level API for serializing and
deserializing.
All objects written to a blob are serialized directly, without any
additional meta-data to describe the data written. It is therefore the
caller's responsibility to ensure that the data can be read back later,
either by knowing exactly what data is expected or by writing sufficient
meta-data to the blob to describe what has been written.
A blob is efficient in that it dynamically grows by doubling in size, so
allocation costs are logarithmic.
|
12688 |
compiler.h |
\file compiler.h
Compiler-related stuff.
|
2348 |
crc32.c |
@file
CRC32 implementation.
@author Jose Fonseca
|
5471 |
crc32.h |
@file
CRC32 function.
@author Jose Fonseca <jfonseca@vmware.com>
|
1642 |
debug.c |
Reads an environment variable and interprets its value as a boolean.
Recognizes 0/false/no and 1/true/yes. Other values result in the default value.
|
3286 |
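The parsing rule above might look like the following sketch. It assumes exact, case-sensitive matches and uses hypothetical names (`sketch_parse_bool`, `sketch_env_var_as_boolean`); the real function in debug.c may differ in both respects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Interpret a string as a boolean; unknown or missing values yield dfault. */
static bool sketch_parse_bool(const char *str, bool dfault)
{
   if (str == NULL)
      return dfault;
   if (!strcmp(str, "1") || !strcmp(str, "true") || !strcmp(str, "yes"))
      return true;
   if (!strcmp(str, "0") || !strcmp(str, "false") || !strcmp(str, "no"))
      return false;
   return dfault;
}

/* Read the environment variable `name` and interpret it as a boolean. */
static bool sketch_env_var_as_boolean(const char *name, bool dfault)
{
   assert(name != NULL);
   return sketch_parse_bool(getenv(name), dfault);
}
```

Splitting the string parsing from the `getenv()` call keeps the rule testable without touching the process environment.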
debug.h |
extern C |
1747 |
detect_os.h |
Copyright 2008 VMware, Inc. |
2518 |
disk_cache.c |
Number of bits to mask off from a cache key to get an index. |
36724 |
disk_cache.h |
Size of cache keys in bytes. |
9119 |
fast_urem_by_const.h |
Code for fast 32-bit unsigned remainder, based off of "Faster Remainder by
Direct Computation: Applications to Compilers and Software Libraries,"
available at https://arxiv.org/pdf/1902.01961.pdf.
util_fast_urem32(n, d, REMAINDER_MAGIC(d)) returns the same thing as
n % d for any unsigned n and nonzero d. However, it compiles down to only a
few multiplications, so it should be faster than plain uint32_t modulo if
the same divisor is used many times.
|
2903 |
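A minimal sketch of the direct-remainder technique from the cited paper, written without 128-bit integers. The names `sketch_remainder_magic`, `sketch_mul32by64_hi` and `sketch_fast_urem32` are hypothetical and only mirror the idea; they are not the header's own definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Magic constant ceil(2^64 / d). For d == 1 this wraps to 0, which still
 * produces the correct remainder (always 0). */
static uint64_t sketch_remainder_magic(uint32_t d)
{
   assert(d != 0);
   return UINT64_MAX / d + 1;
}

/* High 32 bits of the 96-bit product a * b, i.e. (a * b) >> 64. */
static uint32_t sketch_mul32by64_hi(uint32_t a, uint64_t b)
{
   uint64_t lo = (uint64_t)a * (uint32_t)b;   /* a * low half of b  */
   uint64_t hi = (uint64_t)a * (b >> 32);     /* a * high half of b */
   return (uint32_t)((hi + (lo >> 32)) >> 32);
}

/* n % d computed with two multiplications once the magic is known. */
static uint32_t sketch_fast_urem32(uint32_t n, uint32_t d, uint64_t magic)
{
   uint64_t lowbits = magic * n;              /* fractional part of n / d */
   uint32_t result = sketch_mul32by64_hi(d, lowbits);

   /* Debug-only cross-check against the plain operator. */
   assert(result == n % d);
   return result;
}
```

The magic value would normally be computed once per divisor and reused, which is where the saving over a hardware divide comes from.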
fnv1a.h |
Quick FNV-1a hash implementation based on:
http://www.isthe.com/chongo/tech/comp/fnv/
FNV-1a may not be the best hash out there -- Jenkins's lookup3 is supposed
to be quite good, and it probably beats FNV. But FNV has the advantage
that it involves almost no code. For an improvement on both, see Paul
Hsieh's http://www.azillionmonkeys.com/qed/hash.html
|
2084 |
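To illustrate how little code FNV-1a needs, here is a sketch of the 32-bit variant using the standard published offset basis and prime. The function and macro names are hypothetical, not the ones declared in fnv1a.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_FNV32_OFFSET_BASIS 2166136261u
#define SKETCH_FNV32_PRIME        16777619u

/* 32-bit FNV-1a: XOR each byte into the hash, then multiply by the prime. */
static uint32_t sketch_fnv1a32(const void *data, size_t size)
{
   const uint8_t *bytes = data;
   assert(bytes != NULL || size == 0);

   uint32_t hash = SKETCH_FNV32_OFFSET_BASIS;
   for (size_t i = 0; i < size; i++) {
      hash ^= bytes[i];
      hash *= SKETCH_FNV32_PRIME;
   }
   return hash;
}
```

Hashing an empty buffer returns the offset basis unchanged, which is a quick sanity check for any implementation.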
format |
|
|
futex.h |
FUTEX_WAIT_BITSET with FUTEX_BITSET_MATCH_ANY is equivalent to
FUTEX_WAIT, except that it treats the timeout as absolute. |
3565 |
half_float.c |
Convert a 4-byte float to a 2-byte half float.
Not all float32 values can be represented exactly as a float16 value. We
round such intermediate float32 values to the nearest float16. When the
float32 lies exactly between two float16 values, we round to the one with
an even mantissa.
This rounding behavior has several benefits:
- It has no sign bias.
- It reproduces the behavior of real hardware: opcode F32TO16 in Intel's
GPU ISA.
- By reproducing the behavior of the GPU (at least on Intel hardware),
compile-time evaluation of constant packHalf2x16 GLSL expressions will
result in the same value as if the expression were executed on the GPU.
|
6407 |
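The round-to-nearest, ties-to-even rule can be sketched with integer bit manipulation as below. This is an illustrative conversion written from the IEEE 754 binary16 and binary32 layouts, with a hypothetical name; it is not the code in half_float.c and it assumes `float` is a 32-bit IEEE 754 type.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Convert float32 to float16 bits, rounding to nearest with ties to even. */
static uint16_t sketch_float_to_half_rne(float f)
{
   uint32_t x;
   assert(sizeof(f) == sizeof(x));
   memcpy(&x, &f, sizeof(x));

   uint32_t sign = (x >> 16) & 0x8000u;
   uint32_t mag = x & 0x7fffffffu;
   uint32_t h, rem, half;

   if (mag > 0x7f800000u)                 /* NaN: keep it a quiet NaN */
      return (uint16_t)(sign | 0x7e00u);
   if (mag >= 0x477ff000u)                /* >= 65520 rounds to infinity */
      return (uint16_t)(sign | 0x7c00u);
   if (mag <= 0x33000000u)                /* <= 2^-25 rounds to zero */
      return (uint16_t)sign;

   if (mag < 0x38800000u) {               /* result is a float16 subnormal */
      uint32_t shift = 126u - (mag >> 23);           /* 14..24 */
      uint32_t m = (mag & 0x7fffffu) | 0x800000u;    /* implicit leading 1 */
      h = m >> shift;
      rem = m & ((1u << shift) - 1u);
      half = 1u << (shift - 1u);
   } else {                               /* normal: rebias exponent 127 -> 15 */
      h = (mag - 0x38000000u) >> 13;
      rem = mag & 0x1fffu;
      half = 0x1000u;
   }

   /* Round up above the midpoint, or at the midpoint when h is odd. */
   if (rem > half || (rem == half && (h & 1u)))
      h++;
   return (uint16_t)(sign | h);
}
```

A carry out of the mantissa during rounding spills into the exponent field, which is the desired result (for example, the largest subnormal rounds up to the smallest normal).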
half_float.h |
_mesa_float_to_float16_rtz is no more than a wrapper around its counterpart
in softfloat.h. Still, the softfloat.h conversion API is meant to be kept
private. In other words, use only the API published here instead of
calling the softfloat.h one directly.
|
2606 |
hash_table.c |
Implements an open-addressing, linear-reprobing hash table.
For more information, see:
http://cgit.freedesktop.org/~anholt/hash_table/tree/README
|
24204 |
hash_table.h |
This foreach function is safe against deletion (which just replaces
an entry's data with the deleted marker), but not against insertion
(which may rehash the table, making entry a dangling pointer).
|
6384 |
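Why deletion is safe during iteration can be seen in a small sketch of an open-addressing table that marks removed entries with a deleted marker. Everything here (fixed size of 8, linear probing, integer keys, all `sketch_` names) is an assumption for illustration and not the hash_table.h API; in particular this sketch never rehashes, so it does not show the unsafe insertion case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_TABLE_SIZE 8u   /* fixed: this sketch never rehashes */

enum sketch_state { SKETCH_FREE = 0, SKETCH_USED, SKETCH_DELETED };

struct sketch_entry { enum sketch_state state; uint32_t key; int value; };
struct sketch_table { struct sketch_entry entries[SKETCH_TABLE_SIZE]; };

static struct sketch_entry *sketch_search(struct sketch_table *t, uint32_t key)
{
   for (uint32_t i = 0; i < SKETCH_TABLE_SIZE; i++) {
      struct sketch_entry *e = &t->entries[(key + i) % SKETCH_TABLE_SIZE];
      if (e->state == SKETCH_FREE)
         return NULL;                 /* end of the probe chain */
      if (e->state == SKETCH_USED && e->key == key)
         return e;                    /* deleted markers are skipped */
   }
   return NULL;
}

static struct sketch_entry *sketch_insert(struct sketch_table *t,
                                          uint32_t key, int value)
{
   struct sketch_entry *slot = NULL;
   for (uint32_t i = 0; i < SKETCH_TABLE_SIZE; i++) {
      struct sketch_entry *e = &t->entries[(key + i) % SKETCH_TABLE_SIZE];
      if (e->state == SKETCH_USED) {
         if (e->key == key) { e->value = value; return e; }
         continue;
      }
      if (slot == NULL)
         slot = e;                    /* first reusable slot */
      if (e->state == SKETCH_FREE)
         break;
   }
   if (slot != NULL) {
      slot->state = SKETCH_USED;
      slot->key = key;
      slot->value = value;
   }
   return slot;                       /* NULL when the table is full */
}

/* Next used entry after e, or the first one when e is NULL. */
static struct sketch_entry *sketch_next(struct sketch_table *t,
                                        struct sketch_entry *e)
{
   assert(t != NULL);
   for (e = e ? e + 1 : t->entries; e != t->entries + SKETCH_TABLE_SIZE; e++)
      if (e->state == SKETCH_USED)
         return e;
   return NULL;
}

/* Deleting while iterating: entries stay in place, only their state changes. */
static int sketch_remove_odd_values(struct sketch_table *t)
{
   int removed = 0;
   for (struct sketch_entry *e = sketch_next(t, NULL); e; e = sketch_next(t, e)) {
      if (e->value % 2 != 0) {
         e->state = SKETCH_DELETED;
         removed++;
      }
   }
   return removed;
}
```

Because a removed slot keeps its position and later lookups probe past it, the iteration cursor stays valid. An insertion that reallocated the entry array would invalidate that cursor.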
list.h |
\file
List macros heavily inspired by the Linux kernel
list handling. No list looping yet.
Not thread-safe, so common operations need to
be protected using an external mutex.
|
8578 |
macros.h |
Compute the size of an array |
10425 |
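The usual idiom for computing an array's element count is sketched below. `SKETCH_ARRAY_SIZE` is a hypothetical name standing in for whatever macros.h defines, and it only works on true arrays, not on pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements in a true array (not a pointer). */
#define SKETCH_ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

static const int sketch_primes[] = { 2, 3, 5, 7, 11 };

/* Sum the table without hard-coding its length. */
static int sketch_sum_primes(void)
{
   assert(SKETCH_ARRAY_SIZE(sketch_primes) > 0);

   int sum = 0;
   for (size_t i = 0; i < SKETCH_ARRAY_SIZE(sketch_primes); i++)
      sum += sketch_primes[i];
   return sum;
}
```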
mesa-sha1.c |
|
1905 |
mesa-sha1.h |
extern C |
1878 |
mesa-sha1_test.c |
|
2246 |
os_memory.h |
OS memory management abstractions
|
1938 |
os_memory_aligned.h |
Memory alignment wrappers.
|
3383 |
os_memory_stdc.h |
OS memory management abstractions for the standard C library.
|
2088 |
os_misc.c |
If the GALLIUM_LOG_FILE environment variable is set to a valid filename,
write all messages to that file.
|
4816 |
os_misc.h |
Miscellaneous OS services.
|
2517 |
os_time.h |
@file
OS independent time-manipulation functions.
@author Jose Fonseca <jfonseca@vmware.com>
|
3666 |
ralloc.c |
Some versions of MinGW are missing _vscprintf's declaration, although they
still provide the symbol in the import library. |
21049 |
ralloc.h |
\def ralloc(ctx, type)
Allocate a new object chained off of the given context.
This is equivalent to:
\code
((type *) ralloc_size(ctx, sizeof(type)))
\endcode
|
22211 |
rounding.h |
The C standard library has functions round()/rint()/nearbyint() that round
their arguments according to the rounding mode set in the floating-point
control register. While there are trunc()/ceil()/floor() functions that do
a specific operation without modifying the rounding mode, there is no
roundeven() in any version of C.
Technical Specification 18661 (ISO/IEC TS 18661-1:2014) adds roundeven(),
but it's unfortunately not implemented by glibc.
This implementation differs in that it does not raise the inexact exception.
We use rint() to implement these functions, with the assumption that the
floating-point rounding mode has not been changed from the default Round
to Nearest.
|
4172 |
set.c |
From Knuth -- a good choice for hash/rehash values is p, p-2 where
p and p-2 are both prime. These tables are sized to have an extra 10%
free to avoid exponential performance degradation as the hash table fills.
|
16561 |
set.h |
This foreach function is safe against deletion, but not against
insertion (which may rehash the set, making entry a dangling
pointer).
|
4046 |
sha1 |
|
|
simple_mtx.h |
mtx_t - Fast, simple mutex
While modern pthread mutexes are very fast (implemented using futex), they
still incur a call to an external DSO and overhead of the generality and
features of pthread mutexes. Most mutexes in mesa only need lock/unlock,
and the idea here is that we can inline the atomic operation and make the
fast case just two instructions. Mutexes are subtle and finicky to
implement, so we carefully copy the implementation from Ulrich Drepper's
well-written and well-reviewed paper:
"Futexes Are Tricky"
http://www.akkadia.org/drepper/futex.pdf
We implement "mutex3", which gives us a mutex that has no syscalls on
uncontended lock or unlock. Further, the uncontended lock boils down to a
locked cmpxchg and an untaken branch; the uncontended unlock is just a
locked decr and an untaken branch. We use __builtin_expect() to indicate
that contention is unlikely so that gcc will put the contention code out of
the main code flow.
A fast mutex supports only lock/unlock; it can't be recursive or used with
condition variables.
|
3921 |
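The "mutex3" scheme from the cited paper can be sketched with C11 atomics and the raw Linux futex syscall. This is an illustration under several assumptions: it is Linux-only, it assumes an `_Atomic uint32_t` has the same representation as the 32-bit word the kernel expects, and all `sketch_` names are hypothetical rather than the simple_mtx.h API.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

/* val: 0 = unlocked, 1 = locked, 2 = locked and possibly contended. */
struct sketch_mtx { _Atomic uint32_t val; };

static void sketch_futex_wait(_Atomic uint32_t *addr, uint32_t expected)
{
   syscall(SYS_futex, addr, FUTEX_WAIT, expected, NULL, NULL, 0);
}

static void sketch_futex_wake(_Atomic uint32_t *addr, int count)
{
   syscall(SYS_futex, addr, FUTEX_WAKE, count, NULL, NULL, 0);
}

static void sketch_mtx_lock(struct sketch_mtx *m)
{
   uint32_t c = 0;

   /* Fast path: one compare-and-swap 0 -> 1, no syscall. */
   if (atomic_compare_exchange_strong(&m->val, &c, 1))
      return;

   /* Slow path: c holds the observed value. Mark the lock contended and
    * sleep until we are the one who changes it from 0 to 2. */
   if (c != 2)
      c = atomic_exchange(&m->val, 2);
   while (c != 0) {
      sketch_futex_wait(&m->val, 2);
      c = atomic_exchange(&m->val, 2);
   }
}

static void sketch_mtx_unlock(struct sketch_mtx *m)
{
   /* Fast path: one atomic decrement 1 -> 0, no syscall. */
   uint32_t c = atomic_fetch_sub(&m->val, 1);
   assert(c != 0);   /* unlocking an unlocked mutex is a caller bug */

   if (c != 1) {
      /* Value was 2: there may be waiters, so reset and wake one. */
      atomic_store(&m->val, 0);
      sketch_futex_wake(&m->val, 1);
   }
}
```

In the uncontended case neither function reaches the futex calls, which is the property the description relies on.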
softfloat.c |
\brief Shifts 'a' right by the number of bits given in 'dist', which must be in
the range 1 to 63. If any nonzero bits are shifted off, they are "jammed"
into the least-significant bit of the shifted value by setting the
least-significant bit to 1. This shifted-and-jammed value is returned.
From softfloat_shortShiftRightJam64()
|
44815 |
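The shift-and-jam operation described above fits in a few lines. The function name below is hypothetical; the logic follows the description directly, with the stated precondition that `dist` lies in 1 to 63.

```c
#include <assert.h>
#include <stdint.h>

/* Shift a right by dist bits; if any nonzero bits fall off, set the
 * least-significant bit of the result (the "sticky" or "jam" bit). */
static uint64_t sketch_short_shift_right_jam64(uint64_t a, uint8_t dist)
{
   assert(dist >= 1 && dist <= 63);

   /* The bits that would be shifted off, moved to the top of the word. */
   uint64_t lost = a << (64 - dist);
   return (a >> dist) | (uint64_t)(lost != 0);
}
```

Jamming preserves the fact that the discarded part was nonzero, which later rounding steps need in order to tell an exact value from an inexact one.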
softfloat.h |
extern C |
2510 |
string_buffer.c |
Too small, double until we can fit the new string |
4311 |
string_buffer.h |
extern "C" |
3016 |
strndup.h |
_WIN32 |
1649 |
strtod.c |
Wrapper around strtod which uses the "C" locale so the decimal
point is always '.'
|
2220 |
strtod.h |
|
1473 |
u_atomic.h |
Many similar implementations exist. See for example libwsbm
or the linux kernel include/atomic.h
No copyright claimed on this file.
|
11009 |
u_debug.c |
CHAR_BIT |
10543 |
u_debug.h |
@file
Cross-platform debugging helpers.
For now it just has assert and printf replacements, but it might be extended
with stack trace reports and more advanced logging in the near future.
@author Jose Fonseca <jfonseca@vmware.com>
|
11179 |
u_dynarray.h |
A zero-initialized version of this is guaranteed to represent an
empty array.
Also, size <= capacity and data != 0 if and only if capacity != 0
capacity will always be the allocation size of data
|
7162 |
u_endian.h |
|
2952 |
u_math.c |
This is defined in pmmintrin.h, but it can only be included when -msse3 is
used, so just define it here to avoid further. |
3525 |
u_math.h |
Math utilities and approximations for common math functions.
Reduced precision is usually acceptable in shaders...
"fast" is used in the names of functions which are low-precision,
or at least lower-precision than the normal C lib functions.
|
17562 |
u_memory.h |
Memory functions
|
2735 |
u_queue.h |
Job queue with execution in a separate thread.
Jobs can be added from any thread. After that, the wait call can be used
to wait for completion of the job.
|
7707 |
u_string.h |
@file
Platform independent functions for string manipulation.
@author Jose Fonseca <jfonseca@vmware.com>
|
3038 |
u_thread.h |
pthread_np.h -> sys/param.h -> machine/param.h
- defines ALIGN which clashes with our ALIGN
|
6673 |
xxhash.h |
Notice extracted from xxHash homepage :
xxHash is an extremely fast Hash algorithm, running at RAM speed limits.
It also successfully passes all tests from the SMHasher suite.
Comparison (single thread, Windows Seven 32 bits, using SMHasher on a Core 2 Duo @3GHz)
Name            Speed       Q.Score   Author
xxHash          5.4 GB/s    10
CrapWow         3.2 GB/s    2         Andrew
MurmurHash 3a   2.7 GB/s    10        Austin Appleby
SpookyHash      2.0 GB/s    10        Bob Jenkins
SBox            1.4 GB/s    9         Bret Mulvey
Lookup3         1.2 GB/s    9         Bob Jenkins
SuperFastHash   1.2 GB/s    1         Paul Hsieh
CityHash64      1.05 GB/s   10        Pike & Alakuijala
FNV             0.55 GB/s   5         Fowler, Noll, Vo
CRC32           0.43 GB/s   9
MD5-32          0.33 GB/s   10        Ronald L. Rivest
SHA1-32         0.28 GB/s   10
Q.Score is a measure of quality of the hash function.
It depends on successfully passing SMHasher test set.
10 is a perfect score.
Note : SMHasher's CRC32 implementation is not the fastest one.
Other speed-oriented implementations can be faster,
especially in combination with PCLMUL instruction :
http://fastcompression.blogspot.com/2019/03/presenting-xxh3.html?showComment=1552696407071#c3490092340461170735
A 64-bit version, named XXH64, is available since r35.
It offers much better speed, but for 64-bit applications only.
Name    Speed on 64 bits   Speed on 32 bits
XXH64   13.8 GB/s          1.9 GB/s
XXH32   6.8 GB/s           6.0 GB/s
|
51451 |