A type of algorithm used in recommendation systems. It uses probability to prioritize actions. I am still reading the material and working out exactly how it works. However, to quote the paper:

The idea behind posterior sampling algorithm is to force
optimism through probabilistic action. Specifically at each
time step t, we will make a recommendation j(t) based on
the probability that it is the best possible recommendation,
P(j(t) = j*(t)). However, this probability is inaccessible, so
instead the algorithm samples a model for the unknown feature
vectors based on the probability that they are the true feature
vectors (given the viewing history), and finds the optimal
recommendation should this be the true model. It can be
shown that this sampling technique is equivalent to sampling
a recommendation based on the probability it is optimal, and a
more detailed description of the algorithm and its motivations
can be seen in. Thus the algorithm proceeds to keep track
of the distribution of model parameters at each time step, and
updates them accordingly.
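
To make the loop in the quote concrete, here is a minimal sketch of posterior sampling in Python. It is not the paper's model: instead of unknown feature vectors it assumes each item has an independent, unknown click probability with a Beta(1, 1) prior, which is the simplest setting where the "sample a model, act optimally under it, update" cycle can be shown. The function name `posterior_sampling` and the parameter `true_click_rates` are made up for illustration.

```python
import random


def posterior_sampling(true_click_rates, n_steps, seed=0):
    """Beta-Bernoulli posterior sampling (a simplified stand-in for the paper's model).

    true_click_rates is hypothetical ground truth, used only to simulate feedback.
    Returns how often each item was recommended, plus the final Beta parameters.
    """
    rng = random.Random(seed)
    n_items = len(true_click_rates)
    alpha = [1] * n_items   # 1 + observed clicks per item (Beta(1, 1) prior)
    beta = [1] * n_items    # 1 + observed non-clicks per item
    counts = [0] * n_items  # number of times each item was recommended

    for _ in range(n_steps):
        # Sample one plausible model from the current posterior.
        sampled = [rng.betavariate(alpha[j], beta[j]) for j in range(n_items)]
        # Recommend the item that would be optimal if the sample were the truth.
        j = max(range(n_items), key=sampled.__getitem__)
        # Observe simulated feedback and update that item's posterior.
        reward = 1 if rng.random() < true_click_rates[j] else 0
        alpha[j] += reward
        beta[j] += 1 - reward
        counts[j] += 1

    return counts, alpha, beta


if __name__ == "__main__":
    counts, alpha, beta = posterior_sampling([0.1, 0.5, 0.9], n_steps=2000)
    print("recommendations per item:", counts)
```

Because an item is picked whenever its sampled value is the largest, each item is recommended with the posterior probability that it is the best one, which is the equivalence the quote refers to. Items with little data have wide posteriors and still get tried occasionally; as evidence accumulates, the recommendations concentrate on the item with the highest click rate.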