Speed up verifying UTF-8

ID 2996
Title Speed up verifying UTF-8
Topic Performance
Created 2021-02-20 21:13:00
Last modified 2021-12-20 14:26:44
Latest email 2021-12-20 14:24:40
Status
2022-01: Committed
2021-11: Moved to next CF
2021-09: Moved to next CF
2021-07: Moved to next CF
2021-03: Moved to next CF
Target version
Authors John Naylor (john.naylor)
Reviewers Heikki Linnakangas (heikki), Amit Khandekar (amitdkhan)
Committer John Naylor (john.naylor)
Links CFbot results (CirrusCI) CFbot GitHub
Checkout latest CFbot patchset: go to your local checkout of the PostgreSQL repository and run:
git remote add commitfest https://github.com/postgresql-cfbot/postgresql.git
git fetch commitfest cf/2996
git checkout commitfest/cf/2996
Emails
[POC] verifying UTF-8 using SIMD instructions
First at 2021-02-01 17:32:23 by John Naylor <john.naylor at enterprisedb.com>
Latest at 2021-12-20 14:24:40 by John Naylor <john.naylor at enterprisedb.com>
Latest attachment (v25-0001-Add-fast-path-for-validating-UTF-8-text.patch) at 2021-12-13 15:39:37 from John Naylor <john.naylor at enterprisedb.com>
    Attachment (v25-0001-Add-fast-path-for-validating-UTF-8-text.patch) at 2021-12-13 15:39:37 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v24-0001-Add-fast-path-for-validating-UTF-8-text.patch) at 2021-10-19 21:42:40 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v23-0001-Add-fast-paths-for-validating-UTF-8-text.patch) at 2021-08-26 15:35:54 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v22-addendum-32-bit-transitions.txt) at 2021-08-24 16:00:28 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v22-0001-Add-fast-paths-for-validating-UTF-8-text.patch) at 2021-08-04 11:22:57 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (mbverifystr-threshold.sql) at 2021-07-30 01:12:33 from John Naylor <john.naylor at enterprisedb.com> (Patch: No)
    Attachment (v21-0001-Add-fast-paths-for-validating-UTF-8-text.patch) at 2021-07-28 18:12:11 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v20-0001-Add-a-fast-path-for-validating-UTF-8-text.patch) at 2021-07-26 11:09:00 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v2-0001-XXX-Make-SIMD-code-more-platform-neutral.txt) at 2021-07-22 00:07:26 from Thomas Munro <thomas.munro at gmail.com> (Patch: Yes)
    Attachment (0001-XXX-Make-SIMD-code-more-platform-neutral.txt) at 2021-07-21 15:29:21 from Thomas Munro <thomas.munro at gmail.com> (Patch: Yes)
    Attachment (v19-rewrite-pg_verify_str-for-speed.patch) at 2021-07-20 21:24:33 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v18-0001-Use-pure-DFA.patch) at 2021-07-19 01:26:47 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v17-0001-Rewrite-pg_utf8_verifystr-for-speed.patch) at 2021-07-17 00:02:33 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v16gamma-Rewrite-pg_utf8_verifystr-for-speed.txt) at 2021-07-15 22:00:05 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v16alpha-Rewrite-pg_utf8_verifystr-for-speed.txt) at 2021-07-15 18:12:43 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v15-0001-Rewrite-pg_utf8_verifystr-for-speed.patch) at 2021-07-12 19:45:39 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v14-0001-Rewrite-pg_utf8_verifystr-for-speed.patch) at 2021-06-30 11:18:32 from Heikki Linnakangas <hlinnaka at iki.fi> (Patch: Yes)
    Attachment (v13-0001-Rewrite-pg_utf8_verifystr-for-speed.patch) at 2021-06-29 11:20:38 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v12-Rewrite-pg_utf8_verifystr-for-speed.patch) at 2021-06-10 12:45:01 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (Chinese62.txt) at 2021-06-07 12:39:40 from John Naylor <john.naylor at enterprisedb.com> (Patch: No)
    Attachment (v11-0001-Rewrite-pg_utf8_verifystr-for-speed.patch) at 2021-06-06 19:21:51 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v10-0001-Rewrite-pg_utf8_verifystr-for-speed.patch) at 2021-06-02 16:26:41 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v9-0001-Replace-pg_utf8_verifystr-with-two-faster-impleme.patch) at 2021-04-01 14:22:06 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v8-0001-Add-noError-argument-to-encoding-conversion-funct.patch) at 2021-03-19 19:24:06 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v7-0001-Add-noError-argument-to-encoding-conversion-funct.patch) at 2021-02-24 21:50:50 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v6-0001-Add-noError-argument-to-encoding-conversion-funct.patch) at 2021-02-24 16:25:49 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v4-0001-Add-noError-argument-to-encoding-conversion-funct.patch) at 2021-02-20 21:10:58 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (addendum-01-8-byte-stride.patch) at 2021-02-19 00:43:04 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v4-SSE4-with-autoconf-support.patch) at 2021-02-17 05:40:32 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v3-SSE4-with-autoconf-support.patch) at 2021-02-16 01:32:52 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v2-add-portability-stub-and-new-fallback.patch) at 2021-02-13 01:31:33 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (utf-sse42-demo.patch) at 2021-02-09 21:12:22 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (simdjson-utf8-hack.patch) at 2021-02-08 10:17:11 from Heikki Linnakangas <hlinnaka at iki.fi> (Patch: Yes)
    Attachment (v1-0001-Add-an-ASCII-fast-path-to-multibyte-encoding-veri.patch) at 2021-02-07 20:24:16 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
    Attachment (v1-verify-utf8-sse-ascii.patch) at 2021-02-01 17:32:23 from John Naylor <john.naylor at enterprisedb.com> (Patch: Yes)
History
When Who What
2021-12-20 14:26:44 John Naylor (john.naylor) Closed in commitfest 2022-01 with status: Committed
2021-12-13 19:18:34 John Naylor (john.naylor) Added john.naylor as committer
2021-12-13 19:18:25 John Naylor (john.naylor) New status: Needs review
2021-12-08 18:12:16 John Naylor (john.naylor) New status: Waiting on Author
2021-12-02 11:32:46 Daniel Gustafsson (d_gustafsson) Closed in commitfest 2021-11 with status: Moved to next CF
2021-10-19 21:43:20 John Naylor (john.naylor) New status: Needs review
2021-10-18 14:57:44 John Naylor (john.naylor) New status: Waiting on Author
2021-10-06 13:16:44 Jaime Casanova (jcasanov) Closed in commitfest 2021-09 with status: Moved to next CF
2021-08-04 11:24:00 John Naylor (john.naylor) New status: Needs review
2021-08-03 02:46:29 Masahiko Sawada (masahikosawada) New status: Waiting on Author
2021-08-03 02:27:17 Masahiko Sawada (masahikosawada) Closed in commitfest 2021-07 with status: Moved to next CF
2021-07-26 11:09:34 John Naylor (john.naylor) New status: Needs review
2021-07-22 03:17:48 John Naylor (john.naylor) New status: Waiting on Author
2021-07-12 19:46:02 John Naylor (john.naylor) New status: Needs review
2021-07-04 20:02:41 John Naylor (john.naylor) New status: Waiting on Author
2021-06-30 12:06:37 Heikki Linnakangas (heikki) Added heikki as reviewer
2021-06-02 16:30:48 John Naylor (john.naylor) Changed name to Speed up verifying UTF-8
2021-04-08 15:44:06 David Steele (dsteele) Closed in commitfest 2021-03 with status: Moved to next CF
2021-03-03 14:07:19 Amit Khandekar (amitdkhan) Added amitdkhan as reviewer
2021-02-21 00:07:31 John Naylor (john.naylor) Changed authors to John Naylor (john.naylor)
2021-02-20 21:13:00 John Naylor (john.naylor) Attached mail thread CAFBsxsEV_SzH+OLyCiyon=iwggSyMh_eF6A3LU2tiWf3Cy2ZQg@mail.gmail.com
2021-02-20 21:13:00 John Naylor (john.naylor) Created patch record