* encoding detection from <meta> tags doesn't normalize encodings - Python doesn't use the same names as HTML;
* I'm still not sure encoding detection is correct, as it is unclear what are priorities in the current implementation. It should be 1) Content-Type header; 2) BOM marks; 3) encoding in meta tags (or xml declared encoding if you support it); 4) content-based guessing - chardet, etc., or just a default value. I.e. encoding in meta should have less priority than Content-Type header, but more priority than chardet, and if I understand it properly, response.text is decoded both using Content-Type header and chardet.
* lxml's fromstring handles XML (XHTML) encoding declarations, and it may fail in case of unicode data (http://lxml.de/parsing.html#python-unicode-strings), so passing response.text to fromstring is not good. At the same time, relying on lxml to detect encoding is not enough, as http headers should have a higher priority. In parsel we're re-encoding text to utf8, and forcing utf8 parser for lxml to solve it: https://github.com/scrapy/parsel/blob/f6103c8808170546ecf046....
* absolute_links doesn't look correct for base urls which contain path. It also may have issues with urls like tel:1122333, or mailto:.
For encoding detection we're using https://github.com/scrapy/w3lib/blob/c1a030582ec30423c40215f... in Scrapy. It works well overall; its weakness is that it doesn't require a HTML tree, and doesn't parse it, extracting meta information only from first 4Kb using a regex (4Kb limit is not good). Other than that, it does all the right things AFAIK.
With libraries like these, it's all about getting the API right first, them optimizing for perfection second. :)