|
|
|
|
|
by nprescott
1178 days ago
|
|
This reminded me of a prior discussion[0] on bulk data generation in SQLite with Rust (vs Python vs PyPy) which previously led me to trying out two different techniques using just SQLite[1]. The approach here is so similar I tried my prior solution on the slowest VPS I have access to (1 vCPU core 2.4GHz, 512Mi, $2.50/month from vultr): sqlite> create table users (id blob primary key not null, created_at text not null, username text not null);
sqlite> create unique index idx_users_on_id on users(id);
sqlite> pragma journal_mode=wal;
sqlite> .load '/tmp/uuid.c.so'
sqlite> .timer on
sqlite> insert into users(id, created_at, username)
select uuid(), strftime('%Y-%m-%dT%H:%M:%fZ'), 'hello'
from generate_series limit 100000;
Run Time: real 1.159 user 0.572631 sys 0.442133
where the UUID extension comes from the SQLite authors[2] and generate_series is compiled into the SQLite CLI. It is possible further pragma-tweaking might eke out further performance but I feel like this representative of the no-optimization scenario I typically find myself in.In the interest of finding where the bulk of the time is spent and on a hunch I tried swapping the UUID for plain auto-incrementing primary keys as well: sqlite> insert into users(created_at, username) select strftime('%Y-%m-%dT%H:%M:%fZ'), 'hello' from generate_series limit 100000;
Run Time: real 0.142 user 0.068090 sys 0.025507
Clearly UUIDs are not free![0]: https://news.ycombinator.com/item?id=27872575 [1]: https://idle.nprescott.com/2021/bulk-data-generation-in-sqli... [2]: https://sqlite.org/src/file/ext/misc/uuid.c |
|