Speed up verifying UTF-8

Title Speed up verifying UTF-8
Topic Performance
Created 2021-02-20 21:13:00
Last modified 2021-12-20 14:26:44
Latest email 2021-12-20 14:24:40
Status
2022-01: Committed
2021-11: Moved to next CF
2021-09: Moved to next CF
2021-07: Moved to next CF
2021-03: Moved to next CF
Target version
Authors John Naylor (john.naylor)
Reviewers Heikki Linnakangas (heikki), Amit Khandekar (amitdkhan)
Committer John Naylor (john.naylor)
Links
Emails
[POC] verifying UTF-8 using SIMD instructions
First at 2021-02-01 17:32:23 by John Naylor <john.naylor at enterprisedb.com>
Latest at 2021-12-20 14:24:40 by John Naylor <john.naylor at enterprisedb.com>
Latest attachment (v25-0001-Add-fast-path-for-validating-UTF-8-text.patch) at 2021-12-13 15:39:37 from John Naylor <john.naylor at enterprisedb.com>
    Attachment (v25-0001-Add-fast-path-for-validating-UTF-8-text.patch) at 2021-12-13 15:39:37 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v24-0001-Add-fast-path-for-validating-UTF-8-text.patch) at 2021-10-19 21:42:40 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v23-0001-Add-fast-paths-for-validating-UTF-8-text.patch) at 2021-08-26 15:35:54 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v22-addendum-32-bit-transitions.txt) at 2021-08-24 16:00:28 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v22-0001-Add-fast-paths-for-validating-UTF-8-text.patch) at 2021-08-04 11:22:57 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (mbverifystr-threshold.sql) at 2021-07-30 01:12:33 from John Naylor <john.naylor at enterprisedb.com> (Patch: No)
    Attachment (v21-0001-Add-fast-paths-for-validating-UTF-8-text.patch) at 2021-07-28 18:12:11 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v20-0001-Add-a-fast-path-for-validating-UTF-8-text.patch) at 2021-07-26 11:09:00 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v2-0001-XXX-Make-SIMD-code-more-platform-neutral.txt) at 2021-07-22 00:07:26 from Thomas Munro <thomas.munro at gmail.com> (Patch: Yes)
    Attachment (0001-XXX-Make-SIMD-code-more-platform-neutral.txt) at 2021-07-21 15:29:21 from Thomas Munro <thomas.munro at gmail.com> (Patch: Yes)
    Attachment (v19-rewrite-pg_verify_str-for-speed.patch) at 2021-07-20 21:24:33 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v18-0001-Use-pure-DFA.patch) at 2021-07-19 01:26:47 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v17-0001-Rewrite-pg_utf8_verifystr-for-speed.patch) at 2021-07-17 00:02:33 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v16gamma-Rewrite-pg_utf8_verifystr-for-speed.txt) at 2021-07-15 22:00:05 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v16alpha-Rewrite-pg_utf8_verifystr-for-speed.txt) at 2021-07-15 18:12:43 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v15-0001-Rewrite-pg_utf8_verifystr-for-speed.patch) at 2021-07-12 19:45:39 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v14-0001-Rewrite-pg_utf8_verifystr-for-speed.patch) at 2021-06-30 11:18:32 from Heikki Linnakangas <hlinnaka at iki.fi> (Patch: Yes)
    Attachment (v13-0001-Rewrite-pg_utf8_verifystr-for-speed.patch) at 2021-06-29 11:20:38 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v12-Rewrite-pg_utf8_verifystr-for-speed.patch) at 2021-06-10 12:45:01 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (Chinese62.txt) at 2021-06-07 12:39:40 from John Naylor <john.naylor at enterprisedb.com> (Patch: No)
    Attachment (v11-0001-Rewrite-pg_utf8_verifystr-for-speed.patch) at 2021-06-06 19:21:51 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v10-0001-Rewrite-pg_utf8_verifystr-for-speed.patch) at 2021-06-02 16:26:41 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v9-0001-Replace-pg_utf8_verifystr-with-two-faster-impleme.patch) at 2021-04-01 14:22:06 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v8-0001-Add-noError-argument-to-encoding-conversion-funct.patch) at 2021-03-19 19:24:06 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v7-0001-Add-noError-argument-to-encoding-conversion-funct.patch) at 2021-02-24 21:50:50 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v6-0001-Add-noError-argument-to-encoding-conversion-funct.patch) at 2021-02-24 16:25:49 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v4-0001-Add-noError-argument-to-encoding-conversion-funct.patch) at 2021-02-20 21:10:58 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (addendum-01-8-byte-stride.patch) at 2021-02-19 00:43:04 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v4-SSE4-with-autoconf-support.patch) at 2021-02-17 05:40:32 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v3-SSE4-with-autoconf-support.patch) at 2021-02-16 01:32:52 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v2-add-portability-stub-and-new-fallback.patch) at 2021-02-13 01:31:33 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (utf-sse42-demo.patch) at 2021-02-09 21:12:22 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (simdjson-utf8-hack.patch) at 2021-02-08 10:17:11 from Heikki Linnakangas <hlinnaka at iki.fi> (Patch: Yes)
    Attachment (v1-0001-Add-an-ASCII-fast-path-to-multibyte-encoding-veri.patch) at 2021-02-07 20:24:16 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v1-verify-utf8-sse-ascii.patch) at 2021-02-01 17:32:23 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
History
When Who What
2021-12-20 14:26:44 John Naylor (john.naylor) Closed in commitfest 2022-01 with status: Committed
2021-12-13 19:18:34 John Naylor (john.naylor) Added john.naylor as committer
2021-12-13 19:18:25 John Naylor (john.naylor) New status: Needs review
2021-12-08 18:12:16 John Naylor (john.naylor) New status: Waiting on Author
2021-12-02 11:32:46 Daniel Gustafsson (d_gustafsson) Closed in commitfest 2021-11 with status: Moved to next CF
2021-10-19 21:43:20 John Naylor (john.naylor) New status: Needs review
2021-10-18 14:57:44 John Naylor (john.naylor) New status: Waiting on Author
2021-10-06 13:16:44 Jaime Casanova (jcasanov) Closed in commitfest 2021-09 with status: Moved to next CF
2021-08-04 11:24:00 John Naylor (john.naylor) New status: Needs review
2021-08-03 02:46:29 Masahiko Sawada (masahikosawada) New status: Waiting on Author
2021-08-03 02:27:17 Masahiko Sawada (masahikosawada) Closed in commitfest 2021-07 with status: Moved to next CF
2021-07-26 11:09:34 John Naylor (john.naylor) New status: Needs review
2021-07-22 03:17:48 John Naylor (john.naylor) New status: Waiting on Author
2021-07-12 19:46:02 John Naylor (john.naylor) New status: Needs review
2021-07-04 20:02:41 John Naylor (john.naylor) New status: Waiting on Author
2021-06-30 12:06:37 Heikki Linnakangas (heikki) Added heikki as reviewer
2021-06-02 16:30:48 John Naylor (john.naylor) Changed name to Speed up verifying UTF-8
2021-04-08 15:44:06 David Steele (dsteele) Closed in commitfest 2021-03 with status: Moved to next CF
2021-03-03 14:07:19 Amit Khandekar (amitdkhan) Added amitdkhan as reviewer
2021-02-21 00:07:31 John Naylor (john.naylor) Changed authors to John Naylor (john.naylor)
2021-02-20 21:13:00 John Naylor (john.naylor) Attached mail thread CAFBsxsEV_SzH+OLyCiyon=iwggSyMh_eF6A3LU2tiWf3Cy2ZQg@mail.gmail.com
2021-02-20 21:13:00 John Naylor (john.naylor) Created patch record