We aim to find the discrete probability distribution \( \{p_i\}_{i=1}^n \) that maximizes Shannon entropy:
\[ H(p) = -\sum_{i=1}^n p_i \log p_i \]
Subject to the normalization and expected value constraints:
\[ \sum_{i=1}^n p_i = 1, \qquad \sum_{i=1}^n p_i x_i = M. \]
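Before working through the analytical solution, it can help to see this as a plain constrained optimization and solve it numerically. The sketch below uses `scipy.optimize.minimize` with the SLSQP method; the outcome values `x = 1..10` and the target mean `M = 3.0` are arbitrary illustrative choices, not anything fixed by the derivation that follows.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical setup: outcomes 1..10 and a target mean of 3.0
# (below the uniform average of 5.5, so mass shifts toward small x).
x = np.arange(1, 11)
M = 3.0
n = len(x)

def neg_entropy(p):
    p = np.clip(p, 1e-12, 1.0)    # avoid log(0) at the boundary
    return np.sum(p * np.log(p))  # this is -H(p)

constraints = [
    {"type": "eq", "fun": lambda p: np.sum(p) - 1.0},    # normalization
    {"type": "eq", "fun": lambda p: np.sum(p * x) - M},  # expected value
]

res = minimize(neg_entropy, x0=np.full(n, 1.0 / n),
               bounds=[(0.0, 1.0)] * n,
               constraints=constraints, method="SLSQP")

print("p* =", np.round(res.x, 4))
print("H(p*) =", -res.fun)
print("mean =", np.sum(res.x * x))
```

The Lagrange multiplier approach below recovers the same answer in closed form.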
The Lagrangian is:
\[ \mathcal{L}(p_1, \dots, p_n, \lambda, \theta) = -\sum p_i \log p_i - \lambda \left(\sum p_i - 1\right) - \theta \left(\sum p_i x_i - M\right) \]
Setting the partial derivative with respect to each \( p_i \) to zero gives
\[ \frac{\partial \mathcal{L}}{\partial p_i} = -\log p_i - 1 - \lambda - \theta x_i = 0, \]
so \( p_i = e^{-1-\lambda} e^{-\theta x_i} \). Absorbing \( e^{-1-\lambda} \) into a normalization constant \( Z \) leads to the solution:
\[ p_i = \frac{1}{Z} \exp(-\theta x_i), \quad \text{where} \quad Z = \sum_{j=1}^n \exp(-\theta x_j) \]
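As a quick sanity check on the algebra, a small SymPy sketch (SymPy is not used elsewhere in this post; it is just a convenient way to verify the stationarity condition for a single component) reproduces the exponential form:

```python
import sympy as sp

# Symbols for one component p_i, the two multipliers, and the value x_i.
p_i, lam, theta, x_i = sp.symbols("p_i lambda theta x_i", positive=True)

# The part of the Lagrangian that depends on p_i.
L_i = -p_i * sp.log(p_i) - lam * p_i - theta * p_i * x_i

# Stationarity: dL/dp_i = -log(p_i) - 1 - lambda - theta*x_i = 0.
stationarity = sp.Eq(sp.diff(L_i, p_i), 0)
solution = sp.solve(stationarity, p_i)[0]
print(solution)  # exp(-lambda - theta*x_i - 1), i.e. proportional to exp(-theta*x_i)
```

The script below applies the same result numerically: it finds \( \theta \) by root-finding on the moment constraint and compares the resulting maximum entropy distribution against a randomly drawn one.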
```python
import numpy as np
from scipy.optimize import root_scalar


def max_entropy_vs_random(n=10, M_target=5.5):
    """
    Compare the entropy of a random distribution with that of the
    maximum entropy distribution under an expected value constraint.
    """
    x = np.arange(1, n + 1)

    # Random distribution: normalized random weights as a baseline
    rand_vals = np.random.rand(n)
    p_random = rand_vals / np.sum(rand_vals)
    entropy_random = -np.sum(p_random * np.log(p_random))
    expected_random = np.sum(p_random * x)

    print("Random Distribution:")
    for i, pi in enumerate(p_random, start=1):
        print(f"  p{i} = {pi:.6f}")
    print(f"  Entropy = {entropy_random:.6f}, Mean = {expected_random:.4f}\n")

    # Maximum entropy distribution via the Lagrange multiplier theta:
    # find theta so that the Gibbs distribution has mean M_target.
    def moment_constraint(theta):
        Z = np.sum(np.exp(-theta * x))
        p = np.exp(-theta * x) / Z
        return np.sum(p * x) - M_target

    sol = root_scalar(moment_constraint, bracket=[-10, 10], method="brentq")
    theta = sol.root

    exp_vals = np.exp(-theta * x)
    Z = np.sum(exp_vals)
    p_opt = exp_vals / Z
    entropy_opt = -np.sum(p_opt * np.log(p_opt))
    expected_opt = np.sum(p_opt * x)
    lambda_val = -1 - np.log(1 / Z)  # lambda = log(Z) - 1, from p_i = exp(-1 - lambda - theta*x_i)

    print("Maximum Entropy Distribution:")
    for i, pi in enumerate(p_opt, start=1):
        print(f"  p{i} = {pi:.6f}")
    print(f"  theta = {theta:.6f}")
    print(f"  lambda = {lambda_val:.6f}")
    print(f"  sum(p_i) = {np.sum(p_opt):.6f}")
    print(f"  sum(p_i * x_i) = {expected_opt:.6f} (target M = {M_target})")
    print(f"  Entropy = {entropy_opt:.6f}")

    print("\nEntropy Comparison:")
    print(f"  Random Entropy  = {entropy_random:.6f}")
    print(f"  Maximum Entropy = {entropy_opt:.6f}")
    print(f"  Difference      = {entropy_opt - entropy_random:.6f}")
```
The principle of maximum entropy yields the most unbiased distribution consistent with the known constraints. Maximum entropy is the fairest model when you know only a few things and refuse to guess the rest.