
by verst 2450 days ago

I get your perspective and skepticism, I really do. I have no incentive to defend Twitter. I cannot say whether this was done deliberately or not, but it absolutely could have been a mistake by a single engineer at Twitter.

The JIRA ticket was probably just something vague like "add support for phone number matching to tailored audience matching pipeline", likely created by a manager on the ads infra team. Context would already have been assumed. Given that these are simple data pipelines, there likely was no design document specifically calling out the fields to match against for this task.

At Twitter it was also possible to deploy these Hadoop jobs without checking in code. They had to be run as the main ads system service accounts, but most ads engineers should have had the ability to deploy such a job.

As I mentioned earlier, the fragility of this part of the ads infrastructure I observed in 2015 makes me believe that a mistake is entirely possible here.

Example: one Hadoop job writes some output files to HDFS, and a different job reads files from a particular location on HDFS and processes them. If no files exist, there must not have been anything to process, right? But it could just as well be that the first Hadoop job failed and nobody noticed.
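
To make that failure mode concrete, here is a rough sketch of how a downstream job could tell "nothing to process" apart from "upstream never finished". It is not Twitter's code. It uses a local directory as a stand-in for HDFS and relies on the `_SUCCESS` marker that Hadoop's file output committer normally writes when a job completes. The names `list_inputs` and `UpstreamNotFinished` are made up for illustration.

```python
from pathlib import Path
from typing import List

# Marker file that Hadoop's FileOutputCommitter writes by default on job success.
SUCCESS_MARKER = "_SUCCESS"


class UpstreamNotFinished(Exception):
    """Hypothetical error: the upstream output directory has no success marker."""


def list_inputs(output_dir: Path) -> List[Path]:
    """Return the data files in output_dir, but only if the upstream job succeeded.

    A local directory stands in for an HDFS path here (assumption for the sketch).
    """
    if not (output_dir / SUCCESS_MARKER).exists():
        # Missing marker: upstream failed or is still running. Do not treat
        # this as "no data", which is the silent failure described above.
        raise UpstreamNotFinished(f"no {SUCCESS_MARKER} marker in {output_dir}")
    # Skip marker and hidden files (names starting with "_" or ".").
    return sorted(
        p for p in output_dir.iterdir()
        if p.is_file() and not p.name.startswith(("_", "."))
    )
```

With a check like this, an empty list really means the upstream job ran and produced nothing, while a failed upstream job surfaces as an error instead of being silently skipped.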

Anyway, it could have been an engineer's mistake, an engineer trying to get promoted by increasing revenue numbers, or an action taken at the direction of management. Don't rule out the first option, though...