With a more straightforward approach, the tool can be reproduced with just a few queries in ClickHouse.
1. Create a table with styles by authors:
CREATE TABLE hn_styles (name String, vec Array(UInt32)) ENGINE = MergeTree ORDER BY name
2. Calculate and insert style vectors (the insert takes 27 seconds):
INSERT INTO hn_styles WITH 128 AS vec_size,
cityHash64(arrayJoin(tokens(lower(decodeHTMLComponent(extractTextFromHTML(text)))))) % vec_size AS n,
arrayMap((x, i) -> i = n, range(vec_size), range(vec_size)) AS arr
SELECT by, sumForEach(arr) FROM hackernews_history GROUP BY by