by LeifCarrotson
276 days ago
In case "bases with optional newlines" wasn't obvious to anyone else, a specific example (from Wikipedia) is:
;LCBO - Prolactin precursor - Bovine
MDSKGSSQKGSRLLLLLVVSNLLLCQGVVSTPVCPNGPGNCQVSLRDLFDRAVMVSHYIHDLSS
EMFNEFDKRYAQGKGFITMALNSCHTSSLPTPEDKEQAQQTHHEVLMSLILGLLRSWNDPLYHL
VTEVRGMKGAPDAILSRAIEIEEENKRLLEGMEMIFGQVIPGAKETEPYPVWSGLPSLQTKDED
ARYSAFYNLLHCLRRDSSKIDTYLKLLNCRIIYNNNC*
where "SS...EM", "HL...VT", or "ED...AR" may be common subsequences, but each is split by a newline because the plaintext file arbitrarily wraps at column 65 so that it renders nicely on a DEC VT100 terminal from the 70s.

Or, for an even simpler example:
; plaintext
GATTAC
AGATTA
CAGATT
ACCAGA
TTACAG
ATTACA
becomes, on disk, something like
; plaintext\r\nGATTAC\r\nAGATTA\r\nCAGATT\r\nACCAGA\r\nTTACAG\r\nATTACA\r\n
which is hard to compress, while
; plaintext\r\nGATTACAGATTACAGATTACCAGATTACAGATTACA
is roughly "; plaintext\r\n" + "GATTACA" * 5
and then, if you want, you can reflow the text when it's time to render to the screen.
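
If it helps, here is a rough Python sketch of that idea: strip the line breaks before storing, and re-wrap only for display. The helper names `unwrap` and `reflow` are made up for this example, and the input is a synthetic pure repeat rather than real sequence data, so treat the printed sizes as illustrative only.

```python
import zlib

def unwrap(text: str) -> tuple[str, str]:
    """Split a FASTA-like record into (header, sequence with line breaks removed)."""
    header, *rows = text.splitlines()
    return header, "".join(rows)

def reflow(header: str, seq: str, width: int, eol: str = "\r\n") -> str:
    """Re-wrap the sequence at a fixed width, e.g. for rendering to a terminal."""
    rows = [seq[i:i + width] for i in range(0, len(seq), width)]
    return eol.join([header, *rows]) + eol

# Synthetic data (an assumption for illustration): a pure repeat, wrapped at 6 columns.
header = "; plaintext"
seq = "GATTACA" * 60
wrapped = reflow(header, seq, width=6)

# Unwrapping and re-wrapping at the same width reproduces the original bytes.
h, s = unwrap(wrapped)
assert (h, s) == (header, seq)
assert reflow(h, s, width=6) == wrapped

# Compare how well each on-disk form compresses.
stored = header + "\r\n" + seq
print(len(zlib.compress(wrapped.encode())), "bytes compressed with line breaks")
print(len(zlib.compress(stored.encode())), "bytes compressed without")
```

The round trip only works because the wrap width is fixed and known; a file with irregular line lengths would need the original widths recorded somewhere if byte-exact reconstruction matters.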