by zzzcpan
2539 days ago
Do you even need to match Google's robots.txt parsing behavior? With fewer than 1000 lines of code, you can be pretty sure they are not doing it right and are breaking plenty of people's assumptions about it. Either way, you have to test it on real-world data.
That's why I'm saying there's no point trying to re-implement this. If you were going to re-implement this, there's probably already a library that will work well enough for you. The value here is solely in being exactly what Google uses; anything that is a "re-implementation" of this code but isn't exactly what Google uses is missing the point.
If they formalize it into a spec, others may then implement the spec, but they can and should do that by implementing the spec, not porting this code.