Hacker News
by friday99 2578 days ago
This is such a weird proposal that I think it indicates an unawareness of the advertising industry. The reason the internet tracking industry exists is entirely because the first party advertisers don't want to deal with attribution or anything like that. They are only interested in selling their product. They contract out to third party companies specifically to not have to deal with those attribution issues. The reason those third parties have to resort to tracking is because advertising fraud is rampant on the internet and the first party advertisers want some assurance that they aren't just throwing money away.

In the traditional advertising world, they can see the ad on TV or printed somewhere and know they aren't getting ripped off. There is absolutely no assurance on the internet that anyone is seeing your ad. So this proposal is saying that the advertiser should be in the business of assuring that their ads are being seen and delivering value, but it still doesn't solve the problem that the advertiser wants to be sure how many of their ads are being seen. Advertisers are paying for the number of times an ad is shown, not the number of times it is clicked (this isn't the year 2000), so this proposal gives the advertisers more work to do without actually solving their real problem. I don't foresee any significant adoption of this proposal from advertisers.

Now, if someone could come up with a privacy-preserving solution to advertisers' quality assurance problem of buying advertisements on the internet, that would be big business.

8 comments

I don't know enough about the advertising industry to say that this comment is wrong, but it triggers some warning bells for me that make me suspect it might be wrong.

In the podcasting industry, referral codes are used specifically so that advertisers can figure out how many conversions are coming from the podcast -- it's not enough to tell them, "X people heard the thing." They want to know whether or not it's worth advertising based on concrete user acquisition numbers.

AT&T recently put out an interview about their advertising strategy as a content company. A quote from that piece[0]:

> "Regardless of how you see a directed car ad, say, AT&T can then use geolocation data from your phone to see if you went to a dealership and possibly use data from the automaker to see if you signed up for a test-drive—and then tell the automaker, “Here’s the specific ROI on that advertising,” says Lesser. AT&T claims marketers are paying four times the usual rate for that kind of advertising."

And on a more fundamental level, it doesn't make sense to me why targeted ads would even be such a profitable industry in the first place if advertisers didn't care about increasing ROI per ad, and that means improving click-through rates and conversion rates. I don't see how I wouldn't want attribution stats per campaign if I were trying to improve my ad targeting, or doing A/B tests on different marketing styles.

It is entirely possible that I don't understand the advertising industry very well, but from everything I've seen, advertisers do seem to care about attribution, a lot. Maybe click-through isn't the best way to measure that, but I don't see strong evidence that brand recognition or exposure is a more valuable target for advertisers to be pursuing than direct revenue impact.

[0]: http://fortune.com/longform/att-media-company/

> In the podcasting industry, referral codes are used specifically so that advertisers can figure out how many conversions are coming from the podcast

The old-school method of this was to have unique response numbers. It's why you see phone numbers like "800-555-1212 extension 37." When you call the number and ask for extension 37, the person on the horn says, "Yes, this is extension 37..." and begins their pitch while noting that affiliate 37 gets the conversion credit.

It's similar to TV commercials you see that tell you to "Go to example.com/TV37."

Some lead generators are willing to splash out extra money on specific phone numbers without extensions because they believe they convert better. It's why toll-free phone numbers expanded so rapidly from just 800 and 888 to 877, 866, 855, and 844.
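
To make the response-code idea concrete, here is a minimal sketch of how conversions could be tallied per code. The order records and the codes ("TV37", "POD12") are made up for illustration, and `conversions_by_code` is a hypothetical helper, not any vendor's actual system.

```python
from collections import Counter

def conversions_by_code(orders):
    """Count conversions per response code.

    `orders` is an iterable of (order_id, code) pairs, where `code` is the
    extension, URL suffix, or referral code the customer used, or None if
    the order arrived without one (and so cannot be attributed).
    """
    return Counter(code for _order_id, code in orders if code)

# Hypothetical order log: two orders via the TV spot, one via a podcast.
orders = [(1, "TV37"), (2, "POD12"), (3, "TV37"), (4, None)]
credit = conversions_by_code(orders)
print(credit.most_common())
```

Orders with no code simply go uncredited, which is the gap that makes advertisers want finer-grained tracking in the first place.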

They do care about the whole attribution process since this is very relevant to incrementality (which is equivalent to 'they aren't just throwing money away'). But this doesn't necessarily mean that they want to deal with all the details themselves. This is my guess for the parent reply's intention.

> Advertisers are paying for the number of times an ad is shown not the number of times it is clicked (this isn't the year 2000), so this proposal gives the advertisers more work to do without actually solving their real problem.

I might be misreading this, but wasn't the AdWords model disruptive specifically because it charged per click and not per view? It's why, when you click on a search ad, you see ?gclid=... appended to the end of the URL, so that the ad you clicked gets credit. Am I confused, or missing the reason for your statement about impressions being more important than clicks?

Thanks!

I didn't know about "gclid=...", and "fbid=..." in URLs. After reading your post, I found an extension to remove them:

https://github.com/jparise/chrome-utm-stripper

With easily installable packages available for Chromium and Firefox:

https://chrome.google.com/webstore/detail/tracking-token-str...

https://addons.mozilla.org/en-US/firefox/addon/utm-tracking-...
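
For anyone curious what such a stripper does, here is a rough sketch using only the Python standard library. The parameter list is my own assumption (gclid, fbclid, and anything starting with utm_); the extensions linked above may well use a different set.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of tracking parameters, for illustration only.
TRACKING_PARAMS = {"gclid", "fbclid"}
TRACKING_PREFIXES = ("utm_",)

def strip_tracking(url: str) -> str:
    """Return `url` with known tracking query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
        and not key.lower().startswith(TRACKING_PREFIXES)
    ]
    # Rebuild the URL with only the parameters we decided to keep.
    return urlunsplit(parts._replace(query=urlencode(kept)))

print(strip_tracking("https://example.com/page?id=5&gclid=abc&utm_source=x"))
```

Note that this only cleans the URL you share or bookmark; if the landing page has already loaded, the click ID has already been sent to the site.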

There is also a feedback loop on AdWords and other CPC spends. Run ads that get few clicks and you get penalized in cost or impressions, in severe cases to the point of being taken offline. What business that runs CPC placements wants to show an ad with a lower click-through rate?

Also, maybe I'm missing something, but I don't see how this addresses fraud, and any advertising solution that doesn't address fraud is basically DOA. How do they ensure that clicks and conversions are legitimate? It's already a hard problem, despite the large amounts of data that legitimate browsing activities leak.

Do you know how/why it is such a hard problem?

I assume you have:

* Browser data (user-agent string, screen size, network speed, localStorage)

* IP data (therefore some proxy of geographic data)

* Third party data (eg. Google Analytics demographic data)

* Odd click patterns (eg. from the same IP, bursts within a short window)

* Finally you can see who is benefiting from the clicks (eg. certain publishers) and suspend their account

I feel like all this data would generate a substantial click "footprint" that you could run through an ML model. At worst, these third-party advertising companies can suspend whoever is benefiting from the clicks if they gather enough suspicious evidence.
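
As a toy version of the "odd click patterns" signal from the list above, here is a sketch that flags IPs with a burst of clicks inside a short window. The function name, the 10-second window, and the 5-click threshold are all arbitrary assumptions, and it is a plain heuristic rather than an ML model. Shared IPs (e.g. many users behind one NAT) would trip it, so treat it as one weak signal at best.

```python
from collections import defaultdict

def flag_bursty_ips(clicks, window_seconds=10.0, max_clicks=5):
    """Return the set of IPs with more than `max_clicks` clicks in any
    span of `window_seconds`.

    `clicks` is an iterable of (ip, timestamp_in_seconds) pairs.
    """
    times_by_ip = defaultdict(list)
    for ip, timestamp in clicks:
        times_by_ip[ip].append(timestamp)

    flagged = set()
    for ip, times in times_by_ip.items():
        times.sort()
        start = 0  # left edge of the sliding window
        for end, timestamp in enumerate(times):
            # Shrink the window until it spans at most window_seconds.
            while timestamp - times[start] > window_seconds:
                start += 1
            if end - start + 1 > max_clicks:
                flagged.add(ip)
                break
    return flagged

# Hypothetical click log: six clicks in five seconds from one address.
log = [("203.0.113.7", t) for t in range(6)]
log += [("198.51.100.2", 0), ("198.51.100.2", 100)]
print(flag_bursty_ips(log))
```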

There are multiple types of fraud. One is bots that give fake impressions, but another is fraudulent publishers that give improper ad placement (e.g. overlapping ads or invisible ads). In the second type, the user is legitimate, so you can't entirely rely on something that identifies illegitimate users. I think this is one reason why ads aren't always sandboxed in iframes since you need a way to detect if the ad is actually visible in the root frame.

Behavior tracking is difficult since it's hard to say that a legitimate user will never do something. E.g. large ISP NATs thwart IP tracking by giving many customers the same IP. Safari blocks 3rd party cookies.

Google has a somewhat well known bot countermeasure called botguard that does a decent job proving that you are probably running an entire browser, but that only marginally increases the cost of fraud to running a browser instance per-bot. Increasing per-impression cost for fraudsters can put them out of business, but increasing per-impression cost to detect fraudsters can put advertisers out of business.

Also, ad-targeting is often a realtime problem. You have to decide what ad, if any, to show within milliseconds. Do you never show ads to unrecognized users? How much turnaround time will you need before you can precompute a profile and start showing ads to a legitimate user? How much turnaround time do you need for detecting and blocking fraud?

Unfortunately, specific countermeasures aren't often publicly published since one of the greatest costs of ad fraud is figuring out and then circumventing countermeasures. E.g. you might have a hard time reverse engineering something faster than it's being engineered by 20 people at Google.

"Also, ad-targeting is often a realtime problem."

Surely Google are caching a queue of ads for each user and similarly for "random unknown user"? Why would this have to be real time?

Programmatic advertising is 100% a real-time, per-request bidding process. There is no queue of ads. Virtually all banner advertising on the web now is done this way.

Just because it's Google's code on the publisher page doesn't mean it's Google's customer's ad that shows up on the page. It's entirely possible a third party is willing to pay more than any of Google's own customers, so it's auctioned off to Google's customers and Google's partners (who auction it among their own customers).

Also advertisers often want to do dynamic stuff too. Or may be willing to pay more for the same user in different contexts. Or utterly unwilling to have their ad on sites with UGC. And you don't know where the user will show up next.
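
A very rough sketch of the per-request auction described above, assuming each bidder (a direct customer or a partner relaying its own customers' best offer) answers with a CPM bid and the highest bid at or above a floor wins. The `Bid` type, bidder names, and floor are invented for illustration; real exchanges add pricing rules, timeouts, and brand-safety filters that are not modeled here.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Bid:
    bidder: str   # hypothetical bidder name
    cpm: float    # offered price per thousand impressions

def run_auction(bids: list[Bid], floor_cpm: float = 0.0) -> Optional[Bid]:
    """Pick the highest bid that meets the floor, or None if nothing qualifies.

    This runs once per ad request, which is why no ad can be queued ahead
    of time: the set of bids is not known until the request arrives.
    """
    eligible = [bid for bid in bids if bid.cpm >= floor_cpm]
    return max(eligible, key=lambda bid: bid.cpm, default=None)

# One incoming ad request, three hypothetical responses.
bids = [
    Bid("direct_customer_a", 1.20),
    Bid("direct_customer_b", 0.80),
    Bid("partner_exchange", 1.75),  # partner's own customers bid more
]
winner = run_auction(bids, floor_cpm=0.50)
print(winner)
```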

I won't go into details, but you seem to be assuming specific, relatively unsophisticated methods. Also, not everything you mention is available or useful, and it's not close to enough for more sophisticated fraud.[1][2] Keep in mind that most ads are paid on a per-impression basis. The main reason to simulate clicks is that at some point people will notice if a specific site consumes a bunch of impressions but doesn't contribute any clicks. Ad-tech companies tend to be competent in ML since it's necessary for optimization, but fraud remains a hard problem.

[1] https://www.buzzfeednews.com/article/craigsilverman/porn-run...

[2] https://clearcode.cc/blog/rtb-online-advertising-fraud/

As soon as you have the ML model you have the method to train a fraud bot. Just keep iterating on it until it fools the model.

"In the traditional advertising world, they can see the ad on TV or printed somewhere and know they aren't getting ripped off. There is absolutely no assurance on the internet that anyone is seeing your ad."

If this is true, it sounds like the internet really isn't well-suited for advertising.[1]

How have certain companies become so enriched by selling something that has such a high risk of not delivering value? If what you say is true, it stands to reason that many clients are getting ripped off.

[1] I still find it interesting that the web/mobile ad industry almost always relies on web browsers/apps to make ads workable. These programs must process what is returned from a request for content, auto-load resources from third party hosts, and often interpret and execute JavaScript code to make additional resource requests. A user can successfully request the content from a web page with a single domain name, DNS lookup and HTTP request, the basic functionality of the internet and web, without using a web browser, but that alone does not suffice to deliver ads.

> The reason those third parties have to resort to tracking is because advertising fraud is rampant on the internet and the first party advertisers want some assurance that they aren't just throwing money away.

This does not sound correct to me. (I was in the digital ad industry.) The interest in conversion counting is not just because you can't trust views/clicks. It's because conversions are the most useful event to track. You might be paying for views, but your conversions are your best proxy for ROI on one ad campaign vs. another.

I agree with your prior point that the first party who is selling ad space doesn't want to think about it. That part is entirely true.

> Advertisers are paying for the number of times an ad is shown not the number of times it is clicked (this isn't the year 2000),

Aren't Facebook and Google ads cost per click rather than cost per impression?

There are many types of models that you can have on all these platforms. It's definitely not JUST based on impressions.

FB is CPM (generally); Google is both, depending on the ad product.

FB is not CPM, they charge for "reach" which is based on users, not impressions. It's kind of unique.

I'm sorry but this is false. They do charge based on CPM, which is why they can continue charging you for serving higher and higher frequencies to the same audience. If they charged based on reach, then once you went over frequency 1 they would stop charging, correct?

> They do charge based on CPM which is why they can continue charging you for serving higher and higher frequencies to the same audience.

They charge based on reach and will serve your ad to users as many times as they want until they elicit a reaction. There is some proprietary algo at work to determine how your spend gets distributed.

How do we know this? If my paid post gets a good response, it gets free impressions. If my paid post gets poor response, it gets few impressions and I get a warning about the post.

Facebook is incentivizing good content, and it's not charging you per impression. Impressions definitely come into play, but what is more important is users, frequency, and engagement. Unlike other platforms.

I'm sorry but again no. Your rationale is due to organic sharing of the ad rather than some sort of weird reach charging. Again, if they were charging by reach and I was hitting frequency 7 or something, surely they couldn't keep charging, because those people have already been reached, correct?

I'm well aware of Facebook's algorithm that prioritizes ads with a higher relevancy score (or rather the 3 categories of quality that they recently replaced relevancy score with). They punish bad ads by artificially raising CPMs, and reward quality ads by artificially lowering them.

So once again, yes they are charging you by impression. They just also make a distinction between paid impressions and earned impressions. Eg someone shares your ad and their friends see it = earned impressions.
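
The frequency argument is easy to see with made-up numbers. A sketch, assuming a flat $5 CPM and, for comparison, a purely hypothetical $5 per thousand unique users reached (neither figure is any platform's real pricing):

```python
def cpm_cost(reach: int, frequency: float, cpm: float) -> float:
    """Cost when billed per thousand impressions (impressions = reach * frequency)."""
    impressions = reach * frequency
    return impressions / 1000 * cpm

def reach_cost(reach: int, price_per_thousand_users: float) -> float:
    """Cost under a hypothetical model billed per thousand unique users."""
    return reach / 1000 * price_per_thousand_users

audience = 10_000
for frequency in (1, 3, 7):
    print(
        f"frequency {frequency}: "
        f"CPM billing ${cpm_cost(audience, frequency, 5.0):.2f}, "
        f"reach billing ${reach_cost(audience, 5.0):.2f}"
    )
```

Under impression billing the cost grows with frequency; under pure reach billing it would stay flat once everyone in the audience had been reached once, which is the point being argued.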

Here's a link where Facebook says they're explicitly charging you based on CPM or CPC: https://www.facebook.com/business/a/ad-bidding

Relevant quote: "Depending on the type of bid you choose, you only pay for clicks or impressions when you run ads. Your ads will be deployed evenly over time, and you'll never be charged over your budget."

I can tell you I have insight into 9 figures worth of pay-per-click advertising on Google, Bing and FB yearly. Pay per click is very much alive and well on these platforms.

If I were you, I would be embarrassed for posting a comment with such an authoritative voice on a topic you clearly know so little about. I've worked with more than 20 ad teams for a wide variety of companies, from seed stage to public ones worth billions. Everyone cares incredibly much about attribution and properly tracking ad spend to sales.

> The reason the internet tracking industry exists is entirely because the first party advertisers don't want to deal with attribution or anything like that.

Attribution exists because over the past few years there has been major pressure from advertisers to relate actual sales to digital media investments.

If you go some years back, a lot of major brands cut back on their digital media investment because they weren't seeing a return when compared to other media, like TV.

> They are only interested in selling their product. They contract out to third party companies specifically to not have to deal with those attribution issues.

They are interested in sales, branding and customer service. Some brands work hand-in-hand with media agencies - agencies who do the buying, manage and optimize all the media budget. These agencies have the technical know-how for campaign setup, automation, tracking, etc. It's not because they don't want to deal with it, it's because they would have to make a huge investment in human resources and tech. Some brands did/do move all of this in-house, or outsource it to agencies with exclusive, or with high FTAs.

> The reason those third parties have to resort to tracking is because advertising fraud is rampant on the internet and the first party advertisers want some assurance that they aren't just throwing money away.

Tracking is used to measure the performance of the campaigns - delivery itself, plus the results on the advertiser side. Ad fraud is indeed rampant, that's why there has been a lot of development to mitigate this issue (like viewability).

> In the traditional advertising world, they can see the ad on TV or printed somewhere and know they aren't getting ripped off. There is absolutely no assurance on the internet that anyone is seeing your ad. So this proposal is saying that the advertiser should be in the business of assuring that their ads are being seen and delivering value, but it still doesn't solve the problem that the advertiser wants to be sure how many of their ads are being seen.

There's no way to know if an ad was seen on any media. TV campaigns are measured with a sample of the population with a device that tracks what's being viewed - but you don't know if a person is using their smartphone when your ad is running. The problem with the internet is that there's no filtering between what's being done by a machine and by a human when it comes to the campaign delivery - and that's directly attached to the buying model for the advertiser. They wouldn't care about wastage if they didn't have to pay for it.

> Advertisers are paying for the number of times an ad is shown not the number of times it is clicked (this isn't the year 2000), so this proposal gives the advertisers more work to do without actually solving their real problem. I don't foresee any significant adoption of this proposal from advertisers.

Advertisers pay for a lot of things: for reach, for views, for clicks, for seconds of audio, for grids of outdoors, the list goes on. So you can't just say they aren't paying for clicks. Hell, Amazon is the new big boy in advertising, and there you pay per click.

The golden goose of attribution is to connect the point of contact in advertising all the way to post-sale (the so-called customer loyalty programs): being able to tell that someone saw an ad on YouTube, went to a shop, picked your product off the shelf (where you have a 15% share of shelf), bought it, and then went on social media to say how damn great that cookie is.

Any step closer to that is a win for advertisers.