Sunday, November 20, 2016

Cold Start Analysis

Increase the duration of the moving window.

Develop a hierarchy of keyword groups and calculate PTQS

Infer PTQS from partner attributes

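The hierarchical structure can handle both the cold-start problem and smoothing to some extent: a group with little data of its own can fall back on the estimate of its parent group.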
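One common way to get this fallback behaviour is to shrink each child's observed rate toward its parent's rate. The following is a minimal sketch under stated assumptions, not the method used here: `smoothed_rate`, `prior_strength`, and the two-level keyword group hierarchy with its numbers are all hypothetical.

```python
def smoothed_rate(conversions, clicks, parent_rate, prior_strength=50.0):
    """Shrink a child's observed rate toward its parent's rate.

    prior_strength is a hypothetical pseudo-count: the number of clicks at
    which the child's own data and the parent's rate carry equal weight.
    """
    return (conversions + prior_strength * parent_rate) / (clicks + prior_strength)


# Hypothetical two-level hierarchy: keyword group -> keyword.
group_rate = 0.04  # rate measured over the whole keyword group (made-up value)

# Cold start: no data yet, so the estimate is exactly the group rate.
new_keyword = smoothed_rate(conversions=0, clicks=0, parent_rate=group_rate)

# Sparse data: the noisy 1/10 observation is pulled toward the group rate.
sparse_keyword = smoothed_rate(conversions=1, clicks=10, parent_rate=group_rate)

# Plenty of data: the estimate stays close to the keyword's own 500/5000.
mature_keyword = smoothed_rate(conversions=500, clicks=5000, parent_rate=group_rate)
```

With no clicks the estimate equals the parent rate, and as clicks accumulate the keyword's own history dominates, which covers the cold-start case and the smoothing case with one formula.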

Isotonic regression in scikit-learn

http://tullo.ch/articles/speeding-up-isotonic-regression/

Isotonic regression is a useful non-parametric regression technique for fitting an increasing function to a given dataset.
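To make "fitting an increasing function" concrete, here is a minimal pure-Python sketch of the pool adjacent violators idea for equally weighted points. It is an illustration only, not the scikit-learn implementation, and the name `isotonic_fit` is hypothetical.

```python
def isotonic_fit(y):
    """Return the non-decreasing sequence closest to y in squared error."""
    blocks = []  # each block is [sum of values, number of values]
    for value in y:
        blocks.append([float(value), 1])
        # While the previous block's mean exceeds the last block's mean,
        # the two violate monotonicity, so pool them into one block.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    # Every point in a block is assigned the block mean.
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


fitted = isotonic_fit([1, 3, 2, 4, 3, 5])
# fitted is [1.0, 2.5, 2.5, 3.5, 3.5, 5.0]
```

Each out-of-order pair (3, 2) and (4, 3) is replaced by its average, which is what produces the characteristic step shape of an isotonic fit.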

A classic use is in improving the calibration of a probabilistic classifier. Say we have a set of 0/1 data-points (e.g. ad clicks), and we train a probabilistic classifier on this dataset.

Unfortunately, we find that our classifier is poorly calibrated - for cases where it predicts about 50% probability of a click, there is actually a 20% probability of a click, and so on.

With a trained isotonic regression model, our final output is the composition of the classifier's prediction with the isotonic regression function.
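The composition step can be sketched as follows. This assumes the isotonic fit has already been done on held-out data by some isotonic routine. The `scores` and `fitted` values are made-up, `raw_classifier` is a stand-in for a real model, and linear interpolation between fitted points is one reasonable choice rather than the only one.

```python
from bisect import bisect_left

# Hypothetical result of an isotonic fit on held-out data: sorted raw scores
# and the fitted, non-decreasing click rates observed at those scores.
scores = [0.1, 0.3, 0.5, 0.7, 0.9]
fitted = [0.02, 0.08, 0.20, 0.20, 0.45]


def calibrate(p):
    """Map a raw score to a calibrated probability by linear interpolation."""
    if p <= scores[0]:
        return fitted[0]   # clip below the fitted range
    if p >= scores[-1]:
        return fitted[-1]  # clip above the fitted range
    i = bisect_left(scores, p)  # scores[i - 1] < p <= scores[i]
    x0, x1 = scores[i - 1], scores[i]
    y0, y1 = fitted[i - 1], fitted[i]
    return y0 + (y1 - y0) * (p - x0) / (x1 - x0)


def raw_classifier(features):
    """Stand-in for a trained model; returns an uncalibrated score."""
    return features["score"]


def calibrated_classifier(features):
    # Final output = isotonic map composed with the classifier's prediction.
    return calibrate(raw_classifier(features))
```

With these made-up numbers, a raw score of 0.5 maps to roughly 0.20, mirroring the miscalibration example above. Because the map is non-decreasing, the ranking of predictions never reverses; only their scale changes.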

For an example of this usage, see Google's "Ad Click Prediction: A View from the Trenches" paper from KDD 2013, which covers this technique in section 7. The AdPredictor ICML paper also uses this technique for calibrating a Naive Bayes predictor.

We'll now detail how we made the scikit-learn implementation of isotonic regression more than 5,000x faster, while reducing the number of lines of code in the implementation.

The nature of a conversion event can vary widely across advertisers. A conversion event can be defined as the submission of a completed form, a purchase, a subscription to a service, etc. Each of these has a different intrinsic conversion rate.

A partner generates traffic from several websites, which may vary widely in traffic quality. The source tag may be a more natural granularity; however, source tags are susceptible to manipulation.

Classified and structured match

Product match



Domain match
