Hacker News
by
ngrilly
3552 days ago
Did you store the plain text of each PDF in PostgreSQL, or just the tsvector derived from the plain text?
fatbird
3552 days ago
IIRC, I stored the plain text too, because the engine can return the plain text with the matches marked up in context once it has found a hit in the tsvector.
ngrilly
3552 days ago
You're right, PostgreSQL needs the plain text to highlight it with ts_headline, since the tsvector only holds normalized lexemes and their positions. It's similar to Elasticsearch keeping the original document in the _source field. Thanks!
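To illustrate why the plain text has to be kept, here is a rough Python sketch of the idea. It is a toy model, not PostgreSQL's implementation: `toy_tsvector` and `toy_headline` are hypothetical helpers, the stop-word list is made up, and there is no stemming. In PostgreSQL itself the highlighting is done by `ts_headline`, which takes the document text and a tsquery.

```python
import re
from collections import defaultdict

# Assumption: a tiny, made-up stop-word list for illustration only.
STOP_WORDS = {"a", "an", "the", "of", "in", "to", "and"}


def toy_tsvector(text):
    """Map each lowercased word to its 1-based positions, dropping stop words.

    Like a real tsvector, this loses case, punctuation, and stop words,
    so the original text cannot be rebuilt from it.
    """
    vector = defaultdict(list)
    for position, word in enumerate(re.findall(r"\w+", text), start=1):
        lexeme = word.lower()
        if lexeme not in STOP_WORDS:
            vector[lexeme].append(position)
    return dict(vector)


def toy_headline(text, query, start="<b>", stop="</b>"):
    """Wrap every word of the ORIGINAL text that matches a query term."""
    terms = {term.lower() for term in re.findall(r"\w+", query)}

    def mark(match):
        word = match.group(0)
        return f"{start}{word}{stop}" if word.lower() in terms else word

    return re.sub(r"\w+", mark, text)


document = "The Invoice of March, sent to the Client."
vector = toy_tsvector(document)                      # used for matching
snippet = toy_headline(document, "invoice client")   # needs the stored text
print(vector)
print(snippet)
```

The search index alone is enough to decide whether a document matches, but the marked-up snippet can only be produced from the stored text, which is the same reason Elasticsearch keeps the original document around.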