by korbin
724 days ago
SHA-256 is designed such that the most data a single block can hold is 440 bits (55 bytes): the 512-bit block also has to fit the mandatory padding byte and the 64-bit length field. If you carefully place the nonce at the end and use all 55 bytes, you can pre-hash the first ~20/64 rounds of state and the first several rounds of W generation, then base every further iteration on that static value (this is known as a "midstate optimization").

> If you limit your variable portion to a base16 alphabet like A-P

The more nonce bits you decide to use, the less you can statically pre-hash. In FPGA, I am using 64-deep, 8-bit-wide memories to do the alphabet expansion. I am guessing in CUDA you could do something similar with `LOP3.LUT`.
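
As a rough illustration of the idea above, here is a minimal Python sketch (not the FPGA or CUDA code). Everything specific in it is an assumption made for the example: the 52-byte fixed prefix, the 3-character nonce drawn from a 64-entry alphabet table, and the helper names `precompute` and `finish`. With that layout only rounds 0 to 12 depend purely on fixed words, so the sketch skips 13 rounds per attempt; it does not try to precompute the early W-schedule terms that are also fixed.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF

def first_primes(count):
    primes, n = [], 2
    while len(primes) < count:
        if all(n % p for p in primes):
            primes.append(n)
        n += 1
    return primes

def icbrt(n):
    # Integer cube root by binary search (avoids float rounding).
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

PRIMES = first_primes(64)
# Standard SHA-256 constants, derived instead of typed out by hand.
H0 = [math.isqrt(p << 64) & MASK for p in PRIMES[:8]]
K = [icbrt(p << 96) & MASK for p in PRIMES]

def rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK

def expand(w16):
    # Message schedule: extend 16 words to 64.
    w = list(w16)
    for t in range(16, 64):
        s0 = rotr(w[t - 15], 7) ^ rotr(w[t - 15], 18) ^ (w[t - 15] >> 3)
        s1 = rotr(w[t - 2], 17) ^ rotr(w[t - 2], 19) ^ (w[t - 2] >> 10)
        w.append((w[t - 16] + s0 + w[t - 7] + s1) & MASK)
    return w

def rounds(state, w, start, stop):
    # Run compression rounds start..stop-1; round t reads only w[t].
    a, b, c, d, e, f, g, h = state
    for t in range(start, stop):
        s1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
        ch = (e & f) ^ (~e & g)
        t1 = (h + s1 + ch + K[t] + w[t]) & MASK
        s0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
        maj = (a & b) ^ (a & c) ^ (b & c)
        t2 = (s0 + maj) & MASK
        a, b, c, d, e, f, g, h = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
    return (a, b, c, d, e, f, g, h)

# Hypothetical 64-entry, 8-bit-wide lookup table for the nonce alphabet.
ALPHABET = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
PREFIX_LEN = 52    # assumed layout: W[0..12] fixed, nonce lives in W[13]
FIXED_ROUNDS = 13

def precompute(prefix):
    # Done once: hash the rounds that never see the nonce.
    assert len(prefix) == PREFIX_LEN
    fixed_words = list(struct.unpack(">13I", prefix))
    midstate = rounds(tuple(H0), fixed_words, 0, FIXED_ROUNDS)
    return fixed_words, midstate

def finish(fixed_words, midstate, nonce):
    # Done per attempt: expand an 18-bit nonce to 3 alphabet bytes,
    # then run only the remaining 51 rounds.
    chars = bytes(ALPHABET[(nonce >> shift) & 63] for shift in (12, 6, 0))
    w13 = int.from_bytes(chars + b"\x80", "big")   # nonce + padding byte
    w = expand(fixed_words + [w13, 0, 55 * 8])     # 64-bit length = 440 bits
    out = rounds(midstate, w, FIXED_ROUNDS, 64)
    digest = struct.pack(">8I", *((x + y) & MASK for x, y in zip(H0, out)))
    return chars, digest

if __name__ == "__main__":
    prefix = b"a" * PREFIX_LEN
    fixed_words, midstate = precompute(prefix)
    chars, digest = finish(fixed_words, midstate, 12345)
    print(chars, digest == hashlib.sha256(prefix + chars).digest())
```

Widening the nonce so it spills back into `W[12]` would drop `FIXED_ROUNDS` to 12, which is the trade-off described above: more nonce bits, less static pre-hash.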