Fair problem class

A ‘fair’ problem class, in logical decision theory, is a problem class in which agents’ payoffs depend only on their choices and policies, not directly on their algorithms. For example, suppose my problem is “If an agent chooses X, I will give them $10, and if they choose Y, I will give them $0.” This problem is ‘fair’ because I made no mention of why the agent chose X or Y; only the agent’s actual choice matters. On the other hand, suppose I say, “I will give an agent $10 if it chooses X over Y because it is an alphabetizing agent and X comes before Y in alphabetical order; but if the agent chooses X for any other reason, like wanting money, I will give the agent $0.” This problem is not ‘fair’ because it rewards having a particular algorithm, apart from any output or other abstract behavior of that algorithm.
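The contrast between the two problems can be sketched in code. This is only an illustrative toy, not a formal definition: the names `alphabetizing_agent`, `money_seeking_agent`, `fair_payoff`, and `unfair_payoff` are hypothetical, and checking the agent's identity with `is` is an assumed stand-in for "inspecting the agent's algorithm." The point is the signature: the fair payoff sees only the choice, while the unfair payoff must be handed the agent itself.

```python
from typing import Callable

Choice = str
Agent = Callable[[], Choice]


def alphabetizing_agent() -> Choice:
    # Picks whichever option comes first alphabetically.
    return min("X", "Y")


def money_seeking_agent() -> Choice:
    # Picks whichever option it believes pays the most.
    believed_payoffs = {"X": 10, "Y": 0}
    return max(believed_payoffs, key=believed_payoffs.get)


def fair_payoff(choice: Choice) -> int:
    # Fair: the payoff is a function of the choice alone.
    return 10 if choice == "X" else 0


def unfair_payoff(agent: Agent) -> int:
    # Unfair: the payoff inspects *which algorithm* produced the choice.
    # (Identity comparison is a toy stand-in for reading the agent's source.)
    if agent is alphabetizing_agent and agent() == "X":
        return 10
    return 0
```

Under these assumptions, both agents choose X and receive $10 from `fair_payoff`, but `unfair_payoff` gives $10 only to `alphabetizing_agent` and $0 to `money_seeking_agent`, even though their outputs are identical.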