The page is currently under construction

Minimax Regret

The aim of this webpage is to present various concepts related to the notion of regret e.g., minmax regret, Hannan regret, and to provide interested readers with an extensive bibliography on regret.

MINIMAX REGRET

Minimax regret is a normative concept for decision making under uncertainty that has axiomatic foundations.

IMPLEMENTATION
There is no prior, instead it is a worst case analysis similar to the maximin criterion. After establishing the possible states and outcomes for each state and each action one transforms all outcomes into von Neumann Morgenstern utilities. Notice that a rational decision maker or Bayesian would proceed by assessing a probability distribution over the unknown states and then choosing the action that maximizes expected utlity. Under minimax regret the decision maker considers for each action in each state the difference between the highest utility in that state and the utility of the given action in that state. This difference is called regret, one thus derives a matrix of regrets. The choice is then made by finding the action that minimizes among all actions the maximum regret of the action across all states. Typically one allows for mixed actions in order to enable the decision maker to hedge "risk" across states.

EXAMPLE
To illustrate, assume that there are two action "bring" (your umbrella) and "not bring" (your umbrella) and two states "rain" and "shine" with outcomes as given in the following matrix and preference over outcomes given by wet < heavy < dry.

outcomes
rain
shine
bring
heavy
heavy
not bring
wet
dry

Tranferring this into von Neumann Morgenstern utilities means to assign utility 1 to the best outcome "dry" and 0 to the worst outcome "wet" and c with 0<c<1 to "heavy" such that the decsion maker is indifferent between "heavy" for sure and "dry" with probability c and "wet" with probability 1-c. [Aside: In this example it just so happens that if the dm would be a Bayesian and would believe that "shine" occurs with probabilty c then she would be indifferent between "bring" and "not bring".] This leads us to the matrix of utilities:


utilities
rain
shine
bring
c
c
not bring
0
1


Next we derive for each given action and each given state the regret:

regret
rain
shine
bring
0
1-c
not bring
c
0

Finally we select for minimax regret. Assume that the umbrella is sufficiently heavy so that c<0.5. Then "not bring" attains minimax regret among the set of pure actions and c*"bring"+(1-c)*"not bring" attains minimax regret among all pure and mixed actions. So if for instance c=0.4 then regret is bounded above in the minimax regret solution by 0.4 in the case of pure actions and by 0.24 in the case of mixed actions.


COMMENTS
Regret appears in the name of the criterion as it is as if the decision maker is worried about lossed opportunities, about finding out ex-post what the state was and hence "regretting" not having chosen the best action in that state. However such feedback is not necessary for an implementation of the criterion as everything is ex-ante. It is as if the decision maker anticipates ex-ante that she will learn the state ex-post and ex-ante chooses an action to minimize expost emotional distress due to regret. However, despite this story, the foundations of minimax regret are not behavioral but purely axiomatic.

FOUNDATIONS
Independence of irrelevant alternatives is relaxed. Instead one only requires that choice is not altered of some action is added that does not change the best outcome in any of the states. So this axiom tries to assess two decision problems as being similar if a decision maker who knows the state would not care which one she is in. As a consequence, the action selected will depend on the set of actions available. One says that this criterion is menu dependent.


RELATIVE MINIMAX

This criterion is very closely related to competitive ratio, it is also a normative concept for decision making under uncertatinty with axiomatic foundations. It is also a menu dependent criterion. The difference is that now outcomes in each state are measured in relative terms to the best outcome and not in absolute terms. The advantage is that one obtains a scale free notion as everything is measured in efficiency losses.

IMPLEMENTATION

Specify the worst outcome where this worst outcome must not be achievable by one of the actions, it must merely exist as it will be used as a common point of reference. Choice is made as follows. Proceed first as above, assigning von Neumann Morgenstern utilities, thereby assigning 1 to the best outcome in the matrix and utility 0 to the worst outcome. Next consider each state separately and divide all entries in that state by the highest utility in that state. Finally select the action that maximizes the minimum utility, with or without mixing.

EXAMPLE
Consider again the above example. Assume that the common worst outcome to both states is "stay home". Then we obtain the following matrix


utilities
rain
shine
bring
g
g
not bring
d
1

with 0<d<g<1 where actually g=c+(1-c)*d as the dm is indifferent between "bring" and "not bring" when shine occurs with probability c.

Next we divide all entries statewise by the highest element in that state, obtaining

relative utilities
rain
shine
bring
1
g
not bring
d/g
1

Finally, we select "bring" among the pure actions if g>d/g, for instance this is true if g=1/2 and d=1/6 (so c=2/5) in which case the value of relative minimax is equal to 1/2. In other words, at least 50% of the total utility achievable will be realized in each state. For these parameters we find 4/7*"bring"+3/7*"not bring" attaining relative minimax among the mixed actions. Here at least 5/7 (or 71%) efficiency can be guaranteed.

Note that the dm is indifferent beween "wet" and d/g*"heavy"+(1-d/g)*"stay home". So an alternative way to construct the final matrix is to set up the von Neumann Morgenstern utilities independently for each state, setting the best outcome in that state equal to 1 and the worst overall outcome equal to 0.


COMPETITIVE RATIO

TO BE ADDED


HANNAN REGRET

Hannan regret has been devoloped to evaluate learning over time. It differs from the above approaches in various ways. It is not based on an axiomatization. In contrast to the above concepts, it is defined only in terms of performance in each sequence of states and not in terms of expected performance across states. Looking back one compares own expected performance to how other rules would have performed given what happened. One does not compare to all possible alternative rules but only to a finite set of so-called experts. Typically one only considers how constant behavior, always choosing the same action, would have performed.

IMPLEMENTATION
Typically one assume that foregone payoffs are observable. This means that one observes ex-post what each action would have yielded. For instance, it could be that one observes the state of the world after making a choice. Moreover, it is best to assume that payoffs are von Neumann Morgenstern utilities as one would choose the action yielding the highest expected payoff if one would know the state of the world before making the choice. For any sequence of states of nature and for each action one then considers the difference between the payoffs that would have been attained when choosing that action in each round and the payoffs attained by the rule chosen by the dm. The maximal value of this difference over the set of all sequences (apart from a subset with measure zero) and over all actions is called Hannan regret. Notice that there is a resemblance to minimax regret in the sense that differences and not ratios are considered,

EXAMPLE
Consider the example above with c=1/2, here we have already tranformed outcomes into utilities.

utilities
rain
shine
bring
1/2
1/2
not bring
0
1


If the dm faces this decision only once then minimax regret coincides with minimizing maximum Hannan regret, maximal regret is equal to 1/4.

Now assume that the dm faces this decision twice where there is no restriction on how states change over time. Assume that total performance is determined by adding utilities over time. Assume that the decision maker first chooses "bring" with probability 1/2 and then chooses whatever was best in the first state with probability 3/4. Assume "rain" occurs twice. Then "bring" yields a total utility of 1 and "not bring" a total utility of 1. So both constant actions are equally good. Our rule on the other hand yields total expected utility of 1/2*1/2+3/4*1/2=5/8. Hence Hannan regret is equal to 1-5/8=3/8.

Alternatively one can work directly with the regret matix as defined when investigating minimax regret.


regret
rain
shine
bring
0
1/2
not bring
1/2
0

Now assume that first "rain" then "shine" occurs. Total regret of our rule is equal to 1/2*1/2+3/4*1/2=5/8 while total regret of any constant choice is equal to 1/2. Thus, Hannan regret is equal to 5/8-1/2=1/8. We need not analyze the other two case due to symmetry.

Thus we obtain that maximal Hannan regret is equal to 3/8.

Consider briefly minimax regret in this repeated choice. The value of minimax regret is 1/4+1/4=1/2. Nature can realize the states independently in which case there is no possibility of learning and hence it is as if the dm is facing two separate decisions. In contrast, under Hannan regret the experts (those making constant choices) cannot ensure no regret when "rain" "shine" occurs. Thus the regret of the dm does not weigh so heavily and taking advantage of this by focussing on learning when the same state occurs twice we find that the dm can achieve a maximal Hannan regret of 3/8 that is lower that the value 1/2 obtained for the case where all four experts are allowed.


PROBABILITISTIC REGRET

In this homepage we also include references to probabilistic regret as defined by Fishburn, Bell, Loomes and Sugden. Here the dm has a prior over the states of nature, however chooses to include an additive term that refers to regret of lost opportunities. ETC.



This homepage has been built after a workshop organized by Karl Schlag at the EUI March 2 and 3, 2007 on Minimax Regret and Related Concepts. The participants were Bernd Droge, Dean Foster, Paolo Giodani, Mark LeQuement, Ludovic Renou, Karl Schlag, Gilles Stoltz, Joerg Stoye, Daniele Terlizzese, Bjoern Uhl and Sanne Zwart.

Bibliography

Note: Most of these papers are available through ScienceDirect, Businness Premier or JSTOR. We didn't attempt to link papers to their pdf files.

Hannan Regret

Seminal papers on Hannan regret:

Survey of minimization of Hannan-regret in machine learning: From the game-theoretic viewpoint, Hannan regret minimizing strategies are sometimes termed universal consistent strategies:

Early references on the exponentially weighted average forecaster:

The key tool in the upper bounds: Hoeffding's inequality:

Lower bound on Hannan regret in full monitoring:

Partial monitoring: upper and lower bounds on the Hannan regret Other potential functions and links to approachability: Hannan regret with side information (i.e., with Stoye's notions of covariates):

Survey on internal (Hannan) regret:


Non-sorted items Minimax Regret – unsorted items Regret as part of utility Voting Bargaining Repeated Decsion Making  Other Applications
Statistical Decision Theory

Econometrics Foundations-Axiomatizations-Minmax Regret Foundations-Alternative Approaches
Purchasing Evidence Estimation (OR) Publications on Minimax Regret in Statistics Minimax Regret in Sampling Inspection
Minimax Regret in Micro and Macro Economics

List provided by Paolo Giodani