Algorithms and Technical Background
Parse.ly recommends articles to users based on several factors: statistical text profiles of the user’s past reading behavior, user taste clustering, and optional user preferences. These will be discussed in turn:
Text profiles: Every action a user performs on an article (such as reading, sharing or ignoring it) is used to train a document classifier based on Naive Bayesian inference. When new articles enter the system, P3 can boost their importance or filter them away – on a per-user basis – by leveraging the user’s existing “resonance profile”. The net effect is that the user sees more personally relevant articles and fewer articles of the kind that are known not to elicit a good response.
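To make the text-profile idea concrete, here is a minimal sketch of a per-user multinomial Naive Bayes classifier with add-one smoothing. It is an illustration only, not Parse.ly's implementation: the class name `ResonanceProfile`, the labels `liked` and `ignored`, and the sample headlines are all assumptions made for this example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class ResonanceProfile:
    """Hypothetical per-user Naive Bayes model (illustrative, not P3 code)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of articles
        self.vocabulary = set()

    def train(self, text, label):
        """Record one user action on an article, e.g. 'liked' or 'ignored'."""
        words = tokenize(text)
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocabulary.update(words)

    def log_score(self, text, label):
        """log P(label) + sum of log P(word | label), with add-one smoothing."""
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        denom = sum(self.word_counts[label].values()) + len(self.vocabulary)
        for word in tokenize(text):
            score += math.log((self.word_counts[label][word] + 1) / denom)
        return score

    def predict(self, text):
        """Return the label with the highest score for a new article."""
        return max(self.doc_counts, key=lambda label: self.log_score(text, label))


profile = ResonanceProfile()
profile.train("Senate passes new budget bill after long debate", "liked")
profile.train("Election polls show tight senate race", "liked")
profile.train("Celebrity gossip: star spotted at beach party", "ignored")
profile.train("Reality show star announces new fashion line", "ignored")

print(profile.predict("Senate debate on election budget"))
```

A production system would presumably turn the score into a boost or filter threshold rather than a hard label, but the training loop has the same shape: every user action becomes one labeled example.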
User taste clustering: Users across the entire Parse.ly Platform are continually clustered into taste profiles, allowing us to perform Amazon/Netflix-style recommendations based on their actions. In this way, P3 leverages not only your individual history, but also the collective intelligence of users with similar tastes across all Parse.ly partner sites.
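The source does not say which clustering method P3 uses, so the sketch below swaps in a simpler, well-known stand-in: user-based nearest-neighbor collaborative filtering with Jaccard similarity. Users whose positive actions overlap with yours "vote" for articles you have not seen, weighted by how similar they are. The function names, user ids and article ids are hypothetical.

```python
from collections import Counter


def jaccard(a, b):
    """Similarity of two sets: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, likes, top_n=3):
    """Suggest articles liked by users whose tastes overlap with the target's.

    `likes` maps a user id to the set of article ids that user responded to
    positively. All names and data here are illustrative assumptions.
    """
    seen = likes[target]
    scores = Counter()
    for user, articles in likes.items():
        if user == target:
            continue
        weight = jaccard(seen, articles)
        if weight == 0.0:
            continue  # no shared taste, so this user contributes nothing
        for article in articles - seen:
            scores[article] += weight
    return [article for article, _ in scores.most_common(top_n)]


likes = {
    "ann": {"budget-vote", "senate-race", "tax-reform"},
    "bob": {"budget-vote", "senate-race", "trade-deal"},
    "cat": {"beach-party", "fashion-line"},
    "dan": {"senate-race", "tax-reform", "court-ruling", "trade-deal"},
}

print(recommend("ann", likes))  # ['trade-deal', 'court-ruling']
```

Here "ann" shares articles with "bob" and "dan" but none with "cat", so only the first two influence her recommendations. A true clustering approach would group users ahead of time instead of comparing every pair on demand.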
Optional user preferences: This is a set of weighted interests, which are similar to topics/keywords. The interests are normalized and matched against articles to give the user direct control over what is recommended, similar to “taste preferences” in systems like Netflix. We recognize that only a small subset of users can be bothered to customize their experience and provide input to our system, so user preferences are entirely optional.
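As a rough illustration of "normalized and matched", the sketch below scales a user's raw interest weights so they sum to 1, then scores an article by adding up the weights of the interests that appear in its text. The exact normalization and matching rules used by P3 are not documented here, so both functions are assumptions, as is the restriction to single lowercase keywords.

```python
import re


def normalize(interests):
    """Scale raw interest weights so that they sum to 1."""
    total = sum(interests.values())
    if total <= 0:
        return {}
    return {topic: weight / total for topic, weight in interests.items()}


def interest_score(article_text, interests):
    """Sum the normalized weights of the interests found in the article text.

    Interests are assumed to be single lowercase keywords (hypothetical rule).
    """
    words = set(re.findall(r"[a-z]+", article_text.lower()))
    return sum(w for topic, w in normalize(interests).items() if topic in words)


# Hypothetical preferences: politics matters twice as much as the others.
interests = {"politics": 2, "sports": 1, "tech": 1}

print(interest_score("Tech startups lobby on the politics of privacy", interests))
print(interest_score("Local bakery wins regional award", interests))
```

Because preferences are optional, a user with no interests simply gets a score of 0 for every article, leaving the other two signals to drive recommendations.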
Finally, Parse.ly collects the data that powers our recommender using sophisticated real-time feed processing infrastructure. We plug a publisher’s feed into a PubSubHubbub hub, and apply a number of content cleaning and massaging steps in a distributed architecture hosted on our cluster of nodes at Rackspace Cloud. This yields a clean, high-fidelity reproduction of your content in P3, and makes content recommendations available in our widget and API within seconds of publication.
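For readers unfamiliar with PubSubHubbub, subscribing to a feed amounts to sending the hub a form-encoded POST that names the feed (the topic) and a callback URL where new entries should be pushed. The sketch below only builds that request body and does not contact any hub. It assumes the parameter names from the PubSubHubbub 0.4 specification (older 0.3 hubs also expect a `hub.verify` field), and the function name and URLs are placeholders, not Parse.ly endpoints.

```python
from urllib.parse import urlencode, parse_qs


def build_subscription_body(topic_url, callback_url, lease_seconds=None):
    """Form-encode a PubSubHubbub subscription request for a publisher's feed."""
    params = {
        "hub.mode": "subscribe",
        "hub.topic": topic_url,        # the publisher's feed URL
        "hub.callback": callback_url,  # where the hub pushes new entries
    }
    if lease_seconds is not None:
        params["hub.lease_seconds"] = str(lease_seconds)
    return urlencode(params)


# Placeholder URLs, for illustration only.
body = build_subscription_body(
    "https://publisher.example.com/feed.xml",
    "https://subscriber.example.com/push-callback",
)
print(body)
```

Once the hub has verified the subscription, it pushes new feed entries to the callback as they are published, which is what makes the within-seconds availability described above possible without polling.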