interpretability & causality

notes on causal & interpretable ML approaches

1. LIME (Local Interpretable Model-Agnostic Explanations)

  • Explains an individual prediction via a local (zoomed-in) approximation of the overall complex model
  • Fits a simple linear model on this zoomed-in region


  • \(G\) can be a collection of interpretable models \(\{\) linear regression, decision trees, logistic regression, …\(\}\)
  • \(\pi_{x}\) represents the locality (neighborhood) around \(x\)
  • Minimize \(\mathcal{L}(f, g, \pi_{x})\), which measures how poorly the simple model \(g\) approximates \(f\) within the neighborhood \(\pi_{x}\)
  • Minimize \(\Omega(g)\), which is a measure of the complexity of the interpretable model \(g\) (e.g., number of parameters in a linear regression, depth of a decision tree); the two terms are combined into a single objective below
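
Combining the two terms, the explanation for a query point \(x\) is the interpretable model that minimizes both (this is the objective from the original LIME paper):

\[
\xi(x) = \arg\min_{g \in G} \; \mathcal{L}(f, g, \pi_{x}) + \Omega(g)
\]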

Implementation:
Step 1. Calculate the first loss term \(\mathcal{L}(f, g, \pi_{x})\)


  • Within the local area of the query point, generate a set of randomly perturbed datapoints
  • Classify those datapoints using the complex model \(f\)
  • Use this new data and the labels predicted by \(f\) to train a simple model \(g\)
  • The loss is then the total deviation between the predictions of the complex model \(f\) and the simple model \(g\) over the generated datapoints
  • Additionally, weight each error by the datapoint's distance from the query point (closer points count more); see the sketch after Step 2

Step 2. Calculate the second loss term \(\Omega(g)\)

  • Ensures the model is as simple as possible
  • For a linear model, this can be a penalty like Lasso (sparsity) or ridge regression on the coefficients
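
A minimal sketch of Steps 1 and 2 in Python, assuming a fitted black-box classifier `f` with a scikit-learn-style `predict_proba` and a 1-D NumPy query point `x` (the helper name `lime_explain`, the perturbation scale, and the kernel width are illustrative choices, not part of the LIME library):

```python
import numpy as np
from sklearn.linear_model import Ridge

def lime_explain(f, x, n_samples=1000, kernel_width=0.75, alpha=1.0):
    """Explain f's prediction at x with a locally weighted, regularized linear model."""
    rng = np.random.default_rng(0)

    # Step 1: perturb the query point to get datapoints in its local area
    Z = x + rng.normal(scale=0.5, size=(n_samples, x.shape[0]))

    # Label the perturbed points with the complex model f
    y = f.predict_proba(Z)[:, 1]

    # Weight each point by its proximity to x (an exponential kernel plays the role of pi_x)
    dists = np.linalg.norm(Z - x, axis=1)
    weights = np.exp(-(dists ** 2) / (kernel_width ** 2))

    # Step 2: fit a simple linear model g with a ridge penalty (the Omega(g) term)
    g = Ridge(alpha=alpha)
    g.fit(Z, y, sample_weight=weights)

    # The coefficients of g are the local feature importances
    return g.coef_
```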

Summary: LIME lets you see which input features are most relevant for an output prediction, even if the model is a black box



2. SHAP (Shapley Additive Explanations)

  • Shapley values tell us the weighted average contribution of a feature to the output
  • Find the marginal contribution of a feature to each possible subset (coalition) of the other features (to account for interactions between features)
  • So we need to iterate over all possible combinations of features and then take a weighted average (a total of \(2^{n}\) subsets)


Example: 2 player game


Player 1 and Player 2 work together and generate $10,000
Player 1 alone can generate $7,500
Player 2 alone can generate $5,000

  • Shapley(Player 1) = [(total − P2 solo contribution) + P1 solo contribution] / 2
  • Shapley(Player 2) = [(total − P1 solo contribution) + P2 solo contribution] / 2 (worked out below)
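
Plugging in the numbers:

  • Shapley(Player 1) = [(10,000 − 5,000) + 7,500] / 2 = $6,250
  • Shapley(Player 2) = [(10,000 − 7,500) + 5,000] / 2 = $3,750

The two values sum to the full $10,000, as Shapley values always distribute the total payout.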

Example: 3 player game


  • Shapley(Player 1) = weighted average of Player 1's marginal contribution to every possible coalition of the other players
  • Weights –> make the sum an expected marginal contribution

How do we get the weights?

  • Weights = the fraction of all join orders in which Player 1 joins that particular coalition (i.e., the number of ways Player 1 can join the different coalitions, normalized); worked out for \(n = 3\) below
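
For \(n = 3\) players, a coalition \(S\) of the other two players gets weight \(\frac{|S|!\,(n - |S| - 1)!}{n!}\):

  • \(S = \emptyset\): \(0! \cdot 2! / 3! = 1/3\)
  • \(|S| = 1\): \(1! \cdot 1! / 3! = 1/6\) (for each of the two single-player coalitions)
  • \(|S| = 2\): \(2! \cdot 0! / 3! = 1/3\)

The weights sum to \(1/3 + 1/6 + 1/6 + 1/3 = 1\), so the result is a proper weighted average.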


General Shapley Value Formula
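
For a set of players \(N\) with \(|N| = n\) and value function \(v\), the Shapley value of player \(i\) is

\[
\phi_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(n - |S| - 1)!}{n!} \, \bigl( v(S \cup \{i\}) - v(S) \bigr)
\]

In the ML setting, players are features and \(v(S)\) is the model's output when only the features in \(S\) are used.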


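A minimal brute-force sketch of this formula in Python, using the 2-player game above as the value function (the function name `shapley_values` is illustrative; real SHAP implementations rely on approximations rather than full enumeration):

```python
from itertools import combinations
from math import factorial

def shapley_values(players, value):
    """Exact Shapley values by enumerating every coalition of the other players."""
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for size in range(n):
            for S in combinations(others, size):
                # Weight = fraction of join orders in which i joins exactly coalition S
                weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                # Marginal contribution of i to coalition S
                total += weight * (value(set(S) | {i}) - value(set(S)))
        phi[i] = total
    return phi

# Value function for the 2-player example above (payouts in dollars)
v = {frozenset(): 0, frozenset({1}): 7500, frozenset({2}): 5000, frozenset({1, 2}): 10000}
print(shapley_values([1, 2], lambda S: v[frozenset(S)]))
# {1: 6250.0, 2: 3750.0}
```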

3. Counterfactual Explanations

  • Person X has a 90% predicted risk of stroke. If they decrease their BMI to 25, the predicted stroke risk drops to 30%.
  • Counterfactual: the smallest change in input features that changes the prediction to another output

  • Find the minimum change from the original input \(x\) to the counterfactual \(x'\) s.t. the output class changes, i.e., \(f(x') = c\) for a target class \(c\) (formalized below)
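
A common way to write this as an optimization problem, with \(d\) a distance such as an L1 or L2 norm:

\[
x' = \arg\min_{x'} \; d(x, x') \quad \text{s.t.} \quad f(x') = c
\]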

Generating counterfactuals

  1. White box approach: if we have access to the model (model-specific)
  2. Black box approach: if we only have inputs and outputs (model-agnostic)


The output of a counterfactual explainer is which features need to change, and by how much, for the model to output a designated class
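
A minimal black-box (model-agnostic) sketch in Python: random search for the closest perturbation of \(x\) that the model assigns to the target class. It assumes a fitted classifier `f` with a scikit-learn-style `predict` and a 1-D NumPy input `x`; the helper name `find_counterfactual` and the search parameters are illustrative:

```python
import numpy as np

def find_counterfactual(f, x, target_class, n_samples=5000, max_radius=3.0):
    """Return the closest random perturbation of x predicted as target_class."""
    rng = np.random.default_rng(0)
    for radius in np.linspace(0.1, max_radius, 30):
        # Sample candidate points in a neighborhood of increasing size around x
        candidates = x + rng.normal(scale=radius, size=(n_samples, x.shape[0]))
        hits = candidates[f.predict(candidates) == target_class]
        if len(hits) > 0:
            # Smallest radius with a hit: keep the closest such point
            best = hits[np.argmin(np.linalg.norm(hits - x, axis=1))]
            return best, best - x   # the counterfactual and the per-feature change
    return None, None               # no counterfactual found within max_radius
```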


4. Causality and Graph Neural Networks



Def: Confounding factors are variables that influence both the independent and dependent variables. Their existence is a key reason why correlation != causation.

  • Let X be an independent variable and Y a dependent variable. We say that X and Y are confounded by Z if Z causally influences both X and Y.
  • Ex: Suppose we want to see if carb intake (X) affects cholesterol levels (Y). A confounding variable (Z) could be exercise level, since exercise influences both how many carbs people eat and their cholesterol.

Goal: Isolate causal effects by reducing confounding, i.e., by blocking confounding paths (do-calculus)
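
A small simulated example (the data-generating numbers are made up for illustration) showing how adjusting for a confounder Z recovers the true causal effect of X on Y, while the naive association is biased:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
Z = rng.normal(size=n)                        # confounder (e.g., exercise level)
X = 2.0 * Z + rng.normal(size=n)              # "treatment" (e.g., carb intake)
Y = 1.0 * X + 5.0 * Z + rng.normal(size=n)    # outcome; true effect of X on Y is 1.0

# Association only: regress Y on X, ignoring Z -- biased by the backdoor path X <- Z -> Y
naive = np.linalg.lstsq(np.c_[X, np.ones(n)], Y, rcond=None)[0][0]

# Backdoor adjustment: regress Y on X and Z, blocking the confounding path
adjusted = np.linalg.lstsq(np.c_[X, Z, np.ones(n)], Y, rcond=None)[0][0]

print(f"naive slope ~ {naive:.2f}, adjusted slope ~ {adjusted:.2f}")  # ~3.0 vs ~1.0
```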

Pearl's Causal Hierarchy and Do-Calculus

  1. Association (Seeing Patterns): Provides observational info and correlations
    • Ex: Observing students who study more get higher test scores
  2. Interventions (Doing Something): Estimating what happens if you actively change something, i.e., the effects of actions
    • Ex: Making people study more to observe the effect on scores
  3. Counterfactuals (Imagining What-Ifs): Imagining hypotheticals about what happens after-the-fact
    • Ex: Imagining what would happen if people did not study at all

www.bnlearn.com
www.pywhy.org

Neural Causal Models

  • The goal is to approximate the connections between variables by learning associational / interventional / counterfactual distributions


5. Bayesian Networks




A Bayesian network is a directed acyclic graph (DAG) with nodes and directed links that encodes the probabilistic relationships (and influences) between variables:

  • Node = feature (or multiple features)
  • Link = indicates one node directly influences another

Distributions:

  • Each node (which represents a feature, e.g., age, height, BMI) is assigned a distribution
  • Given a node X, a Bayesian network requires a conditional distribution \(P(X \mid \text{parents}(X))\), where parents(X) denotes the parent nodes of X
  • If X has no parents, then the distribution is just P(X), aka the “prior”
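
These local distributions define the full joint distribution through the factorization

\[
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \text{parents}(X_i)\bigr)
\]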

How Are Distributions Assigned?

  1. Learned from data
  2. Specified by experts
  3. Hybrid


  • Notice how, in the rain → wet ground → slip example, we don’t factor in the rain when calculating the probability of slipping. All we care about is whether the ground is wet (slipping is conditionally independent of rain given wet ground); see the sketch below.
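
A minimal sketch of this network in Python (the conditional probability table numbers are made-up assumptions), computing \(P(\text{slip})\) by brute-force enumeration of the joint and confirming that slipping depends on rain only through the wet ground:

```python
from itertools import product

# CPTs for the rain -> wet ground -> slip network (illustrative numbers)
P_rain = {True: 0.3, False: 0.7}
P_wet_given_rain = {True: {True: 0.9, False: 0.1},
                    False: {True: 0.2, False: 0.8}}
P_slip_given_wet = {True: {True: 0.6, False: 0.4},
                    False: {True: 0.05, False: 0.95}}

def joint(rain, wet, slip):
    # Bayesian network factorization: P(rain) * P(wet | rain) * P(slip | wet)
    return P_rain[rain] * P_wet_given_rain[rain][wet] * P_slip_given_wet[wet][slip]

# Inference by enumeration: P(slip) = sum of the joint over rain and wet
p_slip = sum(joint(r, w, True) for r, w in product([True, False], repeat=2))
print(f"P(slip) = {p_slip:.4f}")

# P(slip | wet, rain) equals P(slip | wet) = 0.6 -- rain adds nothing once wet is known
p_slip_given_wet_rain = joint(True, True, True) / sum(joint(True, True, s) for s in (True, False))
print(f"P(slip | wet, rain) = {p_slip_given_wet_rain:.2f}")
```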

Inference: Given a Bayesian network describing P(X, Y, Z), what is P(Y)?

  1. Enumeration (brute force)