| LICENSE |
|
1091 |
- |
| utf8_range.c |
This is a wrapper for the Google range-sse.cc algorithm, which checks whether
a sequence of bytes is valid UTF-8 and finds the longest valid prefix of that
sequence.
The key difference is that it first checks for as many ASCII symbols as
possible and then falls back to the range-sse.cc algorithm. The changes to the
algorithm are cosmetic, mostly to coax the clang compiler into producing
optimal code.
For the API, see the utf8_validity.h header.
|
6981 |
41 % |
| utf8_range.h |
|
562 |
- |
| utf8_range_neon.inc |
This code is almost the same as the SSE implementation; see
utf8_range_sse.inc for a detailed explanation.
The only difference is the range adjustment step, where the NEON code is more
straightforward.
|
3924 |
- |
| utf8_range_sse.inc |
This code checks that UTF-8 ranges are structurally valid, 16 bytes at a time,
using SIMD instructions.
The mapping between ranges of code points and their corresponding UTF-8
sequences is below.
|
11516 |
- |
| utf8_validity.h |
|
866 |
50 % |
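
The descriptions above mention two ideas: skipping ASCII bytes in bulk before doing full validation, and checking each multi-byte sequence against allowed byte ranges to find the longest valid prefix. The following is a minimal scalar sketch of both, not the code from these files. All function names (`sketch_skip_ascii`, `sketch_sequence_len`, `sketch_valid_prefix_len`) are hypothetical, the byte ranges are taken from the Unicode well-formed byte sequence table rather than from utf8_range_sse.inc, and the real implementation processes 16 bytes at a time with SSE or NEON instead of one sequence at a time.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: count leading ASCII bytes, testing 8 bytes per step. */
static size_t sketch_skip_ascii(const unsigned char *p, size_t len) {
  size_t i = 0;
  while (len - i >= 8) {
    uint64_t chunk;
    memcpy(&chunk, p + i, 8); /* alignment-safe load */
    if (chunk & UINT64_C(0x8080808080808080)) break; /* some byte >= 0x80 */
    i += 8;
  }
  while (i < len && p[i] < 0x80) i++; /* finish byte by byte */
  return i;
}

/* Hypothetical helper: length of the well-formed sequence starting at p,
 * or 0 if it is ill-formed or truncated. Requires len >= 1. */
static size_t sketch_sequence_len(const unsigned char *p, size_t len) {
  unsigned char b0 = p[0];
  unsigned char lo = 0x80, hi = 0xBF; /* allowed range for the second byte */
  size_t need;

  if (b0 < 0x80) return 1;
  if (b0 >= 0xC2 && b0 <= 0xDF) {
    need = 2;
  } else if (b0 >= 0xE0 && b0 <= 0xEF) {
    need = 3;
    if (b0 == 0xE0) lo = 0xA0; /* reject overlong 3-byte forms */
    if (b0 == 0xED) hi = 0x9F; /* reject surrogates U+D800..U+DFFF */
  } else if (b0 >= 0xF0 && b0 <= 0xF4) {
    need = 4;
    if (b0 == 0xF0) lo = 0x90; /* reject overlong 4-byte forms */
    if (b0 == 0xF4) hi = 0x8F; /* reject code points above U+10FFFF */
  } else {
    return 0; /* 0x80..0xC1 and 0xF5..0xFF never start a sequence */
  }

  if (len < need) return 0;
  if (p[1] < lo || p[1] > hi) return 0;
  for (size_t k = 2; k < need; k++) {
    if (p[k] < 0x80 || p[k] > 0xBF) return 0;
  }
  return need;
}

/* Hypothetical entry point: length in bytes of the longest valid prefix. */
size_t sketch_valid_prefix_len(const char *data, size_t len) {
  const unsigned char *p = (const unsigned char *)data;
  size_t i = 0;
  while (i < len) {
    i += sketch_skip_ascii(p + i, len - i); /* ASCII fast path */
    if (i == len) break;
    size_t n = sketch_sequence_len(p + i, len - i); /* range-checked path */
    if (n == 0) break;
    i += n;
  }
  return i;
}
```

In this sketch a buffer is valid UTF-8 exactly when `sketch_valid_prefix_len(data, len) == len`; any smaller return value is the offset of the first ill-formed or truncated sequence.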