Measures of Similarity
This post is inspired by a recent preprint I released, ‘Leveraging a Cognitive Model to Measure Subjective Similarity of Human and GPT-4 Written Content’.
In this work, we develop a method for estimating humans’ subjective similarity judgments using a cognitive model. Similarity between stimuli has been a major theme in my research, and it is a major challenge for cognitive modeling in general. In this post I will detail the importance of similarity metrics that accurately reflect how humans perceive the similarity of the stimuli they are presented with. I will focus on Instance-Based Learning models; in a later post I will discuss how similarity is relevant for Reinforcement Learning models and how we apply Generative AI representations to the problem of determining similarity.
In Instance-Based Learning models, the similarity between two instances must be calculated to determine the activation of an instance in memory, using the equation:
\[A_i(t) = \ln \Bigg( \sum_{t' \in \mathcal{T}_i(t)} (t - t')^{-d}\Bigg) + \mu \sum_{j \in \mathcal{F}} \omega_j (S_{ij} - 1) + \sigma \xi\]
The important term in this equation is $S_{ij}$, which represents the similarity between the instance $i$ held in memory and the current instance being evaluated by the model. The index $j$ ranges over the features of the instance, also called attributes.
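To make the role of $S_{ij}$ concrete, here is a minimal sketch of this activation computation in plain Python (not pyIBL's implementation); the argument names and the illustrative values below are my own, chosen only to mirror the symbols in the equation:
import math
import random

def activation(occurrence_times, t, similarities, weights, d, mu, sigma):
    # Base-level term: recency and frequency of the instance's past occurrences.
    base_level = math.log(sum((t - t_prime) ** -d for t_prime in occurrence_times))
    # Partial-matching term: penalizes attributes whose values do not match the query.
    partial_match = mu * sum(w * (s - 1) for w, s in zip(weights, similarities))
    # Noise term; a Gaussian stand-in here for the logistic noise used in ACT-R/IBL.
    return base_level + partial_match + sigma * random.gauss(0, 1)

# Illustrative call: an instance seen at times 1 and 3, evaluated at time 5,
# matching the query on one attribute and mismatching on the other.
print(activation([1, 3], t=5, similarities=[1, 0], weights=[1, 1], d=0.5, mu=1.0, sigma=0.25))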
As a concrete example, consider a binary choice task with two attributes, shape and color. Suppose an agent has experienced selecting a blue-circle and observing a reward of 1, followed by selecting a red-square and observing a reward of 0. We then ask how this agent will evaluate the expected utility associated with a blue-square based on this experience. We can represent this agent’s memory as a list of tuples:
\[\mathcal{M} = \Big[ \big( \{ \text{shape:circle, color:blue} \} , 1\big), \big( \{ \text{shape:square, color:red} \} , 0\big) \Big]\]
Here the set of features is $\mathcal{F} = [\text{shape}, \text{color}]$, so when we calculate the activation of each instance we sum the weighted similarity terms over both of these attributes.
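Before turning to pyIBL, this memory can be mirrored directly as a plain Python data structure (ordinary dicts and tuples, not pyIBL's internal representation):
# Each entry pairs an instance's attribute values with the utility observed for it.
memory = [
    ({"shape": "circle", "color": "blue"}, 1),
    ({"shape": "square", "color": "red"}, 0),
]
features = ["shape", "color"]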
If we try to program this using the Python pyIBL package with the following code:
from pyibl import Agent

agent = Agent(name="Contextual Bandit", attributes=["shape","color"])
# Prepopulate the agent's memory with the two past experiences.
agent.populate({"shape":"circle","color":"blue"}, 1)
agent.populate({"shape":"square","color":"red"}, 0)
# Ask the agent to evaluate the novel blue-square option.
choice, details = agent.choose([{"shape":"square","color":"blue"}], details=True)
we get an error stating:
RuntimeError: No experience available for choice {'shape': 'square', 'color': 'blue'}
This is because the pyIBL agent has no similarity function defined that can relate the novel option to its past experience. Without one, the agent must either rely on a default utility, which is returned for any option the agent has no experience with, or it must experience an option before it can estimate its utility. To solve this issue with similarity functions, we can alter the above code as follows:
from pyibl import Agent
def identity_similarity(x, y):
    # Similarity of 1 when the attribute values match exactly, 0 otherwise.
    if x == y:
        return 1
    else:
        return 0
agent = Agent(name="Contextual Bandit", attributes=["shape","color"], mismatch_penalty=1)
agent.similarity(["shape", "color"], identity_similarity)
agent.populate({"shape":"circle","color":"blue"}, 1)
agent.populate({"shape":"square","color":"red"}, 0)
choice, details = agent.choose([{"shape":"square","color":"blue"}], details=True)
print(details[0]['blended_value'])
Now that we have defined a similarity metric for our attributes, we can estimate the value associated with the novel stimulus without an error. We additionally need to define the mismatch_penalty parameter for this agent, which corresponds to the value $\mu$ in the activation equation above. If we observe the details of this agent as it determines the value of the blue-square, we can see that it gives a blended value of 0.406. This value will change from run to run due to the random noise term $\sigma \xi$ in the activation equation. To make the blended value consistent, we can set the temperature and noise parameters of our IBL agent as follows:
agent = Agent(name="Contextual Bandit", attributes=["shape","color"], mismatch_penalty=1, temperature=1, noise=0)
This results in the agent predicting a blended value of exactly 0.5 every time the Python script is run. This makes intuitive sense given how we defined the similarity metric using the identity function. The model is estimating the value associated with a blue-square based on the past experience of a blue-circle that had a value of 1 and a red-square that had a value of 0. Both of these previous instances have a similarity of 0.5 in relation to the current instance we are evaluating, because $S_{1\text{color}} = 1$, $S_{1\text{shape}} = 0$, $S_{2\text{color}} = 0$, and $S_{2\text{shape}} = 1$. The first instance in memory matches with respect to color but not shape, and the second matches in terms of shape but not color. Thus, the activation will be equal for these two instances in memory, and so will their probabilities of retrieval $P_i$, meaning that when we calculate the expected utility of the novel option using:
\[V_k(t) = \sum_{i \in \mathcal{M}_k} P_i(t) u_i\]
it will be equal to $V_k(t) = (0.5 \times 1) + (0.5 \times 0) = 0.5$. Note that decay is not relevant here because we prepopulated these instances, but we could also have represented them as standard experiences and set decay to 0 to get the same static expected utility of 0.5.
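To check this arithmetic by hand, here is a small sketch that recomputes the blended value outside of pyIBL under the same assumptions as above (mismatch penalty of 1, temperature of 1, no noise, equal attribute weights, and the base-level term dropped since it is identical for both prepopulated instances), using a Boltzmann softmax over activations for the retrieval probabilities:
import math

# Per-attribute similarities of each stored instance to the blue-square query, plus its utility.
instances = [
    ({"shape": 0, "color": 1}, 1),   # blue-circle: color matches, shape does not
    ({"shape": 1, "color": 0}, 0),   # red-square: shape matches, color does not
]
mu, tau = 1.0, 1.0                   # mismatch penalty and temperature

# With equal base-level terms and no noise, activation reduces to the partial-matching term.
activations = [mu * sum(s - 1 for s in sims.values()) for sims, _ in instances]

# Retrieval probabilities via a Boltzmann softmax over the activations.
exps = [math.exp(a / tau) for a in activations]
probs = [e / sum(exps) for e in exps]

# Blended value: retrieval-probability-weighted average of the stored utilities.
blended = sum(p * u for p, (_, u) in zip(probs, instances))
print(probs, blended)                # prints [0.5, 0.5] 0.5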
It is important to note that this value depended heavily on the similarity metric we used, which weighted each of the features equally and resulted in an estimated utility of 0.5. But what if we consider an agent that pays attention only to color and ignores shape entirely? One way to represent such an agent is through similarity functions, as follows:
def always_zero(x, y):
    # Treat every pair of attribute values as a complete mismatch.
    return 0
agent = Agent(name="Contextual Bandit", attributes=["shape","color"], mismatch_penalty=1, decay=0, temperature=1, noise=0)
agent.similarity(["color"], identity_similarity)
agent.similarity(["shape"], always_zero)
agent.populate({"shape":"circle","color":"blue"}, 1)
agent.populate({"shape":"square","color":"red"}, 0)
choice, details = agent.choose([{"shape":"square","color":"blue"}], details=True)
print(details[0]['blended_value'])
This results in a blended value of 1, since, with shape ignored, the similarity between the current blue-square option and the instances in memory is 1 for the blue-circle and 0 for the red-square. Through this example we can see how crucial similarity metrics are, particularly when human decision makers may be using unintuitive metrics of similarity that differ from what we as cognitive modelers naively assume their personal subjective metric of similarity to be.
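Similarity functions also need not be all-or-nothing. For a numeric attribute, for example, a graded function is often more natural; the attribute name and scaling constant below are purely illustrative and would need to be chosen for the task at hand:
def linear_similarity(x, y, scale=10.0):
    # Graded similarity for numeric attributes: identical values give 1, and
    # similarity falls off linearly toward 0 as the values move apart.
    return max(0.0, 1.0 - abs(x - y) / scale)

# e.g. agent.similarity(["size"], linear_similarity), assuming an agent whose
# attributes include a hypothetical numeric "size" feature.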
This idea of measuring similarity becomes more complicated as we consider increasingly complex stimuli, and environments where many different attributes may interact with one another to determine an individual participant’s metric of similarity. In the next post on this topic I will further discuss the background of similarity functions as they are used in tabular reinforcement learning and deep reinforcement learning, and how we extend this notion of similarity to the representations of stimuli formed by generative AI models.