Notes on Semiparametric Models

Feb 5, 2025 Probability, econometrics, causal inference

$$ \newcommand{\indep}{\mathrel{\perp\mkern-10mu\perp}} \newcommand{\P}{\mathbb{P}} \newcommand{\R}{\mathbb{R}} \newcommand{\E}{\mathbb{E}} \newcommand{\Var}{\operatorname{Var}} \newcommand{\Cov}{\operatorname{Cov}} \newcommand{\1}[1]{\mathbf{1}\\{#1\\}} $$

Motivation

Semiparametric models contain both a finite-dimensional parameter of interest ($\theta$) and an infinite-dimensional nuisance parameter ($\eta$).
The goal is to estimate $\theta$ as efficiently as possible while filtering out the impact of $\eta$.

Key Concepts

1. Tangent Space

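The tangent space is the closed linear span of the score functions of all regular parametric submodels passing through the true distribution $P$; informally, it collects every direction in which the model allows $P$ to be locally perturbed.
In parametric models, these directions are given by score functions (derivatives of the log-likelihood with respect to the parameters), so the tangent space is finite-dimensional. In semiparametric models, the tangent space is typically an infinite-dimensional subspace of $L^2(P)$ (the space of square-integrable functions), and every element has mean zero under $P$.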
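
As a concrete illustration (a standard textbook construction, not taken from this post), consider a one-dimensional submodel through $P$ with density $p$, indexed by a hypothetical scalar $t$ near $0$ and a bounded function $h$ with $\E_P[h] = 0$:
$$ p_t(z) = p(z)\,\bigl(1 + t\,h(z)\bigr), \qquad \frac{\partial}{\partial t} \log p_t(z) \Big|_{t=0} = h(z). $$
For small $|t|$ this is a valid density, and its score at $t = 0$ is $h$ itself. When the model places no restrictions on $P$, such functions $h$ are dense among the mean-zero elements of $L^2(P)$, so the tangent space is the whole set $\{ h \in L^2(P) : \E_P[h] = 0 \}$.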

2. Nuisance Tangent Space ($\mathcal{T}_{\eta}$)

This is the subspace of the full tangent space spanned by the scores of parametric submodels that vary the nuisance parameter $\eta$ while holding the parameter of interest $\theta$ fixed.
It represents all the directions in which the nuisance part of the model can change and potentially affect the estimation of $\theta$.

3. Orthogonal Complement of the Nuisance Tangent Space ($\mathcal{T}_{\eta}^\perp$)

Defined as:
$$ \mathcal{T}_{\eta}^\perp = \{ h \in L^2(P) : \langle h, g \rangle = 0 \quad \text{for all } g \in \mathcal{T}_{\eta} \} $$
where the inner product is $\langle h, g \rangle = \E_P[h g]$, which equals $\Cov(h, g)$ here because every score $g \in \mathcal{T}_{\eta}$ has mean zero.
This space contains directions that are “free” of the influence of the nuisance parameter. In other words, any variation in this space does not get “contaminated” by changes in $\eta$.
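
To make the projection concrete, here is a minimal finite-dimensional sketch. It assumes a hypothetical linear model $Y = \theta X + \eta W + \varepsilon$ with standard normal noise, where $\eta$ is a scalar, so this is only an analog of the infinite-dimensional case. The scores at the truth are $X\varepsilon$ and $W\varepsilon$; the helper name `project_out` and all simulation settings are illustrative choices, not part of the original post.

```python
import numpy as np

def project_out(s_theta, s_eta):
    """Residual of the empirical L2 projection of s_theta onto span{s_eta}."""
    coef = np.mean(s_theta * s_eta) / np.mean(s_eta ** 2)
    return s_theta - coef * s_eta

rng = np.random.default_rng(0)
n = 200_000
w = rng.normal(size=n)
x = 0.8 * w + 0.6 * rng.normal(size=n)  # X is correlated with W, Var(X) = 1
eps = rng.normal(size=n)                # Y - theta*X - eta*W at the truth

s_theta = x * eps   # score for the parameter of interest
s_eta = w * eps     # score for the (here scalar) nuisance parameter
s_eff = project_out(s_theta, s_eta)

inner_before = np.mean(s_theta * s_eta)  # nonzero: scores overlap
inner_after = np.mean(s_eff * s_eta)     # zero by construction
info_theta = np.mean(s_theta ** 2)       # information if eta were known
info_eff = np.mean(s_eff ** 2)           # information after projecting
```

By construction `inner_after` is zero up to floating-point error, while `info_eff` falls below `info_theta`. In the population the projection coefficient is 0.8 and the information drops from 1 to 0.36, which is the price of not knowing $\eta$.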

Why It Matters

Efficient Estimation

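In semiparametric estimation, the influence function of every regular asymptotically linear estimator of $\theta$ lies in $\mathcal{T}_{\eta}^\perp$; constructing an estimator with the smallest possible asymptotic variance (i.e., achieving the efficiency bound) means picking the right element of that space.
The influence function $\varphi$ describes how an estimator responds to small changes in the data distribution: for observations $Z_1, \dots, Z_n$, $\sqrt{n}(\hat\theta - \theta)$ behaves like $n^{-1/2} \sum_i \varphi(Z_i)$, so the asymptotic variance of $\hat\theta$ is $\Var(\varphi)$.
By projecting the score for $\theta$ onto $\mathcal{T}_{\eta}^\perp$, one removes the component that can be explained by the nuisance parameter; the result is the efficient score, and rescaling it by the inverse of its variance yields the efficient influence function.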
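
A standard way to write this construction, with notation assumed here rather than taken from the post ($S_\theta$ for the score with respect to $\theta$, and $\Pi(\cdot \mid \mathcal{T}_{\eta})$ for the projection onto the nuisance tangent space), is:
$$ S_{\text{eff}} = S_\theta - \Pi(S_\theta \mid \mathcal{T}_{\eta}), \qquad \varphi_{\text{eff}} = \E[S_{\text{eff}} S_{\text{eff}}^\top]^{-1} S_{\text{eff}}, \qquad \Var(\varphi_{\text{eff}}) = \E[S_{\text{eff}} S_{\text{eff}}^\top]^{-1}. $$
Here $S_{\text{eff}}$ is the efficient score, which lies in $\mathcal{T}_{\eta}^\perp$ by construction, and $\Var(\varphi_{\text{eff}})$ is the semiparametric efficiency bound, assuming $\E[S_{\text{eff}} S_{\text{eff}}^\top]$ is invertible.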

Practical Implication

This separation allows us to focus on the parameter of interest while systematically “filtering out” nuisance effects, leading to more precise (optimal) estimators.
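
One common way this filtering is put into practice is the residual-on-residual ("partialling-out") estimator used in double machine learning. The sketch below assumes a partially linear model $Y = \theta D + g(X) + \varepsilon$, which the post itself does not specify, and uses plain polynomial regression as a stand-in for any machine learning method. The function names, the data-generating process, and the two-fold cross-fitting are all illustrative assumptions.

```python
import numpy as np

def poly_fit_predict(x_train, y_train, x_test, degree=5):
    """Least-squares polynomial regression, standing in for any ML learner."""
    coefs = np.polyfit(x_train, y_train, degree)
    return np.polyval(coefs, x_test)

def partialling_out(y, d, x, n_folds=2, seed=0):
    """Cross-fitted residual-on-residual estimate of theta with a standard error."""
    n = len(y)
    folds = np.random.default_rng(seed).permutation(n) % n_folds
    y_res = np.empty(n)
    d_res = np.empty(n)
    for k in range(n_folds):
        test, train = folds == k, folds != k
        # Nuisance regressions E[Y|X] and E[D|X] are fit on the other folds only.
        y_res[test] = y[test] - poly_fit_predict(x[train], y[train], x[test])
        d_res[test] = d[test] - poly_fit_predict(x[train], d[train], x[test])
    theta_hat = np.sum(d_res * y_res) / np.sum(d_res ** 2)
    # Plug-in influence function values give an estimated standard error.
    psi = d_res * (y_res - theta_hat * d_res) / np.mean(d_res ** 2)
    return theta_hat, np.std(psi) / np.sqrt(n)

rng = np.random.default_rng(1)
n = 20_000
x = rng.uniform(-2, 2, size=n)
d = np.sin(x) + rng.normal(size=n)                     # D depends on X
y = 1.5 * d + np.cos(x) + x ** 2 + rng.normal(size=n)  # true theta = 1.5
theta_hat, se = partialling_out(y, d, x)
```

Because both $Y$ and $D$ are residualized against $X$ before the final regression, first-order errors in the nuisance fits do not pass through to `theta_hat`; this is the practical counterpart of working with a score orthogonal to the nuisance tangent space.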

Summary

Nuisance Tangent Space ($\mathcal{T}_{\eta}$): Captures all the directions of change due to the nuisance parameter $\eta$.
Orthogonal Complement ($\mathcal{T}_{\eta}^\perp$): Contains the directions orthogonal to every nuisance score; the influence functions of regular estimators of $\theta$ live here.
Efficient Influence Function: Obtained by projecting the score for $\theta$ onto $\mathcal{T}_{\eta}^\perp$ and rescaling by the inverse of its variance; the corresponding estimator achieves the semiparametric efficiency bound.

Chen Xing
