by conradlee 4539 days ago
I'm one of the Synference co-founders, and we've thought a lot about how to make it easy to extract relevant information from users who land on your site for the first time.

Even for a user who has never been on your site, you can derive a lot of useful information from basic data available to every web server, such as the user's IP address and user agent: (1) the time of day and day of week they're visiting, which matters because, e.g., night or weekend users might be a different demographic; (2) information about the particular page they're visiting, such as the article category on Wikipedia; (3) operating system type and age; (4) device type (smartphone, tablet, or PC); (5) geographic location.
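
To make the list above concrete, here is a rough sketch of deriving a few of these features from a user agent string and a visit time. It is not Synference's parser: the function names (`device_type`, `os_family`, `extract_features`) are hypothetical, and the substring rules are a deliberately crude assumption that a real user agent library would replace.

```python
from datetime import datetime


def device_type(user_agent: str) -> str:
    """Crude device guess from user agent substrings (illustrative heuristic only)."""
    ua = user_agent.lower()
    if "ipad" in ua or "tablet" in ua:
        return "tablet"
    if "mobi" in ua or "iphone" in ua:
        return "smartphone"
    if "android" in ua:
        # Android tablets usually omit the "Mobile" token.
        return "tablet"
    return "pc"


def os_family(user_agent: str) -> str:
    """Crude OS guess. Order matters: iOS agents contain "like Mac OS X"
    and Android agents contain "Linux"."""
    ua = user_agent.lower()
    if "iphone" in ua or "ipad" in ua:
        return "ios"
    if "android" in ua:
        return "android"
    if "windows" in ua:
        return "windows"
    if "mac os x" in ua:
        return "macos"
    if "linux" in ua:
        return "linux"
    return "other"


def extract_features(user_agent: str, local_time: datetime) -> dict:
    """Combine time-of-visit and user agent features into one flat dict."""
    return {
        "hour": local_time.hour,
        "is_weekend": local_time.weekday() >= 5,  # Saturday=5, Sunday=6
        "device": device_type(user_agent),
        "os": os_family(user_agent),
    }


iphone_ua = ("Mozilla/5.0 (iPhone; CPU iPhone OS 7_0 like Mac OS X) "
             "AppleWebKit/537.51.1 (KHTML, like Gecko) Version/7.0 "
             "Mobile/11A465 Safari/9537.53")
features = extract_features(iphone_ua, datetime(2013, 11, 2, 23, 15))
print(features)
```

Keeping the output a flat dict of named attributes means nothing downstream has to know how each feature was derived.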

You can use location to infer a range of demographic features, such as income and education. Because location is so important, Synference adapts its granularity, zooming in where you have many users and coarsening where you have few. For example, if you have a lot of users from a certain city, then each user within that city will have a location whose granularity goes down to the neighborhood. But if you have relatively few users from Europe, then each European user's location might only be resolved to the country.
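
One way to picture this adaptive granularity is the sketch below. It is an assumption about how such a scheme could work, not a description of Synference's implementation: the three-level hierarchy, the `min_users` threshold, and the names `build_counts` and `coarsen` are all made up for illustration.

```python
from collections import Counter

# Location hierarchy from coarse to fine (hypothetical levels).
LEVELS = ("country", "city", "neighborhood")


def build_counts(users):
    """Count users at every prefix of the hierarchy, e.g. ("US",), ("US", "NYC"), ..."""
    counts = Counter()
    for user in users:
        for depth in range(1, len(LEVELS) + 1):
            counts[tuple(user[level] for level in LEVELS[:depth])] += 1
    return counts


def coarsen(user, counts, min_users):
    """Return the finest location prefix that still has at least min_users users.

    The country is always kept, even if it is sparsely populated.
    """
    key = (user["country"],)
    for depth in range(2, len(LEVELS) + 1):
        candidate = tuple(user[level] for level in LEVELS[:depth])
        if counts[candidate] < min_users:
            break  # too few users at this level: stop zooming in
        key = candidate
    return key


users = (
    [{"country": "US", "city": "NYC", "neighborhood": "SoHo"}] * 3
    + [{"country": "US", "city": "NYC", "neighborhood": "Harlem"}] * 2
    + [{"country": "DE", "city": "Berlin", "neighborhood": "Mitte"}]
)
counts = build_counts(users)
print(coarsen(users[0], counts, min_users=2))   # a SoHo user
print(coarsen(users[-1], counts, min_users=2))  # the lone Berlin user
```

With these toy counts the New York users keep their neighborhood, while the single Berlin user falls back to the country level.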

The Synference API makes it easy to extract this information. All you have to do is send the API the user's IP address and user agent, and the API parses those two pieces of information into most of the features I've just listed. We also add a timestamp (corrected for the user's timezone using the IP address's geographic location).
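
The timezone correction can be sketched as follows. This assumes the IP geolocation step has already produced a UTC offset for the user; that lookup is not shown, and `to_local_time` and its `utc_offset_hours` parameter are hypothetical names, not part of the Synference API.

```python
from datetime import datetime, timedelta, timezone


def to_local_time(utc_time: datetime, utc_offset_hours: float) -> datetime:
    """Shift a timezone-aware UTC timestamp into the user's local time.

    utc_offset_hours is assumed to come from an IP geolocation lookup.
    """
    if utc_time.tzinfo is None:
        raise ValueError("utc_time must be timezone-aware")
    return utc_time.astimezone(timezone(timedelta(hours=utc_offset_hours)))


# A request logged early Sunday morning in UTC...
server_time = datetime(2013, 11, 3, 4, 15, tzinfo=timezone.utc)
# ...is still late Saturday evening for a user at UTC-5.
local = to_local_time(server_time, utc_offset_hours=-5)
print(local.isoformat(), local.weekday())
```

A fixed offset ignores daylight saving time; a named timezone from the geolocation data would be more accurate if one is available.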

We don't need to split users into buckets: machine learning classifiers build models that automatically decide how a user's attributes can be used to predict the target metric.
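
As a toy illustration of learning from attributes instead of hand-made buckets, the sketch below trains a tiny logistic regression in pure Python. Logistic regression is a stand-in chosen for brevity, not necessarily the classifier Synference uses, and the attribute names, the `converted` target, and the training data are all invented.

```python
import math


def featurize(user: dict) -> dict:
    """One-hot encode each attribute=value pair and add a bias term."""
    feats = {"bias": 1.0}
    for name, value in user.items():
        feats[f"{name}={value}"] = 1.0
    return feats


def predict(weights: dict, user: dict) -> float:
    """Predicted probability that the target metric is 1 for this user."""
    z = sum(weights.get(f, 0.0) * v for f, v in featurize(user).items())
    return 1.0 / (1.0 + math.exp(-z))


def train(rows, epochs=200, lr=0.5) -> dict:
    """Fit weights with plain stochastic gradient ascent on the log-likelihood."""
    weights = {}
    for _ in range(epochs):
        for user, converted in rows:
            error = converted - predict(weights, user)
            for f, v in featurize(user).items():
                weights[f] = weights.get(f, 0.0) + lr * error * v
    return weights


# Invented data: in this toy set, smartphone users convert and PC users do not.
rows = [
    ({"device": "smartphone", "is_weekend": True}, 1),
    ({"device": "smartphone", "is_weekend": False}, 1),
    ({"device": "pc", "is_weekend": True}, 0),
    ({"device": "pc", "is_weekend": False}, 0),
]
weights = train(rows)
p_phone = predict(weights, {"device": "smartphone", "is_weekend": True})
p_pc = predict(weights, {"device": "pc", "is_weekend": True})
print(round(p_phone, 3), round(p_pc, 3))
```

Nobody told the model that device type matters more than the weekend flag; it picks that up from the data, which is the point of skipping manual buckets.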