In mathematical analysis, Hölder's inequality, named after Otto Hölder, is a fundamental inequality between integrals and an indispensable tool for the study of <math>L^p</math> spaces.
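Let <math>(S,\Sigma,\mu)</math> be a measure space and let <math>p,q\in[1,\infty]</math> with <math>\tfrac{1}{p}+\tfrac{1}{q}=1</math>. Then for all measurable real- or complex-valued functions <math>f</math> and <math>g</math> on <math>S</math>,
:<math>\|fg\|_1 \le \|f\|_p \|g\|_q.</math>
The numbers <math>p</math> and <math>q</math> above are said to be Hölder conjugates of each other. The special case <math>p=q=2</math> gives a form of the Cauchy–Schwarz inequality. Hölder's inequality holds even if <math>\|fg\|_1</math> is infinite, the right-hand side also being infinite in that case. Conversely, if <math>f</math> is in <math>L^p(\mu)</math> and <math>g</math> is in <math>L^q(\mu)</math>, then the pointwise product <math>fg</math> is in <math>L^1(\mu)</math>.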
Hölder's inequality is used to prove the Minkowski inequality, which is the triangle inequality in the space <math>L^p(\mu)</math>, and also to establish that <math>L^q(\mu)</math> is the dual space of <math>L^p(\mu)</math> for <math>p\in[1,\infty)</math>.
Hölder's inequality (in a slightly different form) was first found by Leonard James Rogers (1888). Inspired by Rogers' work, Hölder (1889) gave another proof as part of a work developing the concept of convex and concave functions and introducing Jensen's inequality, which was in turn named for work of Johan Jensen building on Hölder's work.
Remarks
Conventions
The brief statement of Hölder's inequality uses some conventions.
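- In the definition of Hölder conjugates, <math>1/\infty</math> means zero.
- If <math>p,q\in[1,\infty)</math>, then <math>\|f\|_p</math> and <math>\|g\|_q</math> stand for the (possibly infinite) expressions
::<math>\begin{align}
&\left(\int_S |f|^p\,\mathrm{d}\mu\right)^{\frac{1}{p}} \\
&\left(\int_S |g|^q\,\mathrm{d}\mu\right)^{\frac{1}{q}}
\end{align}</math>
- If <math>p=\infty</math>, then <math>\|f\|_\infty</math> stands for the essential supremum of <math>|f|</math>, similarly for <math>\|g\|_\infty</math>.
- The notation <math>\|f\|_p</math> with <math>1\le p\le\infty</math> is a slight abuse, because in general it is only a norm of <math>f</math> if <math>\|f\|_p</math> is finite and <math>f</math> is considered as an equivalence class of <math>\mu</math>-almost everywhere equal functions. If <math>f\in L^p(\mu)</math> and <math>g\in L^q(\mu)</math>, then the notation is adequate.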
- On the right-hand side of Hölder's inequality, 0 × ∞ as well as ∞ × 0 means 0. Multiplying with ∞ gives ∞.
Estimates for integrable products
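As above, let <math>f</math> and <math>g</math> denote measurable real- or complex-valued functions defined on <math>S</math>. If <math>\|fg\|_1</math> is finite, then the pointwise products of <math>f</math> with <math>g</math> and its complex conjugate function are <math>\mu</math>-integrable, the estimate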
:<math>\biggl|\int_S f\bar g\,\mathrm{d}\mu\biggr|\le\int_S|fg|\,\mathrm{d}\mu =\|fg\|_1</math>
and the similar one for <math>fg</math> hold, and Hölder's inequality can be applied to the right-hand side. In particular, if <math>f</math> and <math>g</math> are in the Hilbert space <math>L^2(\mu)</math>, then Hölder's inequality for <math>p=q=2</math> implies
:<math>|\langle f,g\rangle| \le \|f\|_2 \|g\|_2,</math>
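where the angle brackets refer to the inner product of <math>L^2(\mu)</math>. This is also called the Cauchy–Schwarz inequality, but its statement requires that <math>\|f\|_2</math> and <math>\|g\|_2</math> are finite, to make sure that the inner product of <math>f</math> and <math>g</math> is well defined. We may recover the original inequality (for the case <math>p=2</math>) by using the functions <math>|f|</math> and <math>|g|</math> in place of <math>f</math> and <math>g</math>.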
Generalization for probability measures
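If <math>(S,\Sigma,\mu)</math> is a probability space, then <math>p,q\in[1,\infty]</math> just need to satisfy <math>\tfrac{1}{p}+\tfrac{1}{q}\le 1</math>, rather than being Hölder conjugates. A combination of Hölder's inequality and Jensen's inequality implies that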
:<math>\|fg\|_1 \le \|f\|_p \|g\|_q</math>
for all measurable real- or complex-valued functions <math>f</math> and <math>g</math> on <math>S</math>.
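The relaxed condition can be illustrated numerically on a finite probability space. The following is only a sketch: the function names, the weight vector <code>w</code> and the sample data are illustrative assumptions, not part of the statement above.

```python
def weighted_norm(f, w, p):
    """Return (sum_k w_k |f_k|^p)^(1/p) for weights w and 1 <= p < infinity."""
    return sum(wk * abs(fk) ** p for fk, wk in zip(f, w)) ** (1.0 / p)


def relaxed_holder_sides(f, g, w, p, q):
    """Return (||fg||_1, ||f||_p * ||g||_q) on a finite probability space.

    w holds the point masses; p and q only need to satisfy 1/p + 1/q <= 1.
    """
    if min(w) < 0 or abs(sum(w) - 1.0) > 1e-12:
        raise ValueError("w must be a probability vector")
    if 1.0 / p + 1.0 / q > 1.0 + 1e-12:
        raise ValueError("need 1/p + 1/q <= 1")
    lhs = sum(wk * abs(fk * gk) for fk, gk, wk in zip(f, g, w))
    rhs = weighted_norm(f, w, p) * weighted_norm(g, w, q)
    return lhs, rhs


# Hypothetical data: p = q = 3 are not Hölder conjugates, since 1/3 + 1/3 < 1.
w = [0.2, 0.3, 0.5]
f = [1.0, 4.0, 2.0]
g = [3.0, 1.0, 2.0]
lhs, rhs = relaxed_holder_sides(f, g, w, 3.0, 3.0)
print(lhs <= rhs)
```

The restriction to probability measures matters: for the counting measure on two points with <math>f=g=(1,1)</math> and <math>p=q=3</math>, the left-hand side is <math>2</math> while the right-hand side is <math>2^{2/3}<2</math>.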
Notable special cases
For the following cases assume that <math>p</math> and <math>q</math> are in the open interval <math>(1,\infty)</math> with <math>\tfrac{1}{p}+\tfrac{1}{q}=1</math>.
Counting measure
For the <math>n</math>-dimensional Euclidean space, when the set <math>S</math> is <math>\{1,\dots,n\}</math> with the counting measure, we have
:<math>\sum_{k=1}^n |x_k\,y_k| \le \left( \sum_{k=1}^n |x_k|^p \right)^{\frac{1}{p}} \left( \sum_{k=1}^n |y_k|^q \right)^{\frac{1}{q}}
\text{ for all }(x_1,\ldots,x_n),(y_1,\ldots,y_n)\in\mathbb{R}^n\text{ or }\mathbb{C}^n.</math>
Often the following practical form of this is used, for any <math>r,s\in\mathbb{R}_+</math>:
:<math>\left(\sum_{k=1}^n |x_k|^r\,|y_k|^s \right)^{r+s}\le \left( \sum_{k=1}^n |x_k|^{r+s} \right)^{r} \left( \sum_{k=1}^n |y_k|^{r+s} \right)^{s}.</math>
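Both finite-sum forms are easy to check numerically. The sketch below is illustrative only; the helper names and the sample vectors are assumptions, and the code merely evaluates the two sides of each inequality.

```python
def p_norm(v, p):
    """Return (sum_k |v_k|^p)^(1/p) for a finite sequence v and p > 0."""
    return sum(abs(x) ** p for x in v) ** (1.0 / p)


def holder_sides(x, y, p):
    """Return both sides of Hölder's inequality for finite sequences.

    p must lie in the open interval (1, infinity); q is its Hölder conjugate.
    """
    q = p / (p - 1.0)
    lhs = sum(abs(a * b) for a, b in zip(x, y))
    return lhs, p_norm(x, p) * p_norm(y, q)


def practical_form_sides(x, y, r, s):
    """Return both sides of the practical form with exponents r, s > 0."""
    lhs = sum(abs(a) ** r * abs(b) ** s for a, b in zip(x, y)) ** (r + s)
    rhs = (sum(abs(a) ** (r + s) for a in x) ** r
           * sum(abs(b) ** (r + s) for b in y) ** s)
    return lhs, rhs


# Hypothetical sample vectors; the entries may also be complex numbers.
x = [1.0, -2.0, 3.0]
y = [4.0, 0.5, -1.0]
for p in (1.5, 2.0, 3.0):
    lhs, rhs = holder_sides(x, y, p)
    print(p, lhs <= rhs)
```

Taking <code>x == y</code> with <code>p = 2</code> gives equality up to rounding, which is the equality case of the Cauchy–Schwarz inequality.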
For more than two sums, the following generalisation holds, with real positive exponents <math> \lambda_i </math> and <math> \lambda_a + \lambda_b+ \cdots +\lambda_z =1</math>:
:<math>\sum_{k=1}^n |a_k|^{\lambda_a}\,|b_k|^{\lambda_b} \cdots |z_k|^{\lambda_z} \le \left(\sum_{k=1}^n |a_k|\right)^{\lambda_a} \left(\sum_{k=1}^n |b_k|\right)^{\lambda_b} \cdots \left(\sum_{k=1}^n |z_k|\right)^{\lambda_z} .
</math>
Equality holds iff <math> |a_1|: |a_2|: \cdots : |a_n| =|b_1|: |b_2|: \cdots : |b_n| = \cdots = |z_1|: |z_2|: \cdots : |z_n| </math>.
If <math>S=\N</math> with the counting measure, then we get Hölder's inequality for sequence spaces:
:<math>\sum_{k=1}^{\infty} |x_k\,y_k| \le \left( \sum_{k=1}^{\infty} |x_k|^p \right)^{\frac{1}{p}} \left( \sum_{k=1}^{\infty} |y_k|^q \right)^{\frac{1}{q}}
\text{ for all }(x_k)_{k\in\mathbb N}, (y_k)_{k\in\mathbb N}\in\mathbb{R}^{\mathbb N}\text{ or }\mathbb{C}^{\mathbb N}.</math>
Lebesgue measure
If <math>S</math> is a measurable subset of <math>\R^n</math> with the Lebesgue measure, and <math>f</math> and <math>g</math> are measurable real- or complex-valued functions on <math>S</math>, then Hölder's inequality is
:<math>\int_S \bigl| f(x)g(x)\bigr| \,\mathrm{d}x \le\biggl(\int_S |f(x)|^p\,\mathrm{d}x\biggr)^{\frac{1}{p}} \biggl(\int_S |g(x)|^q\,\mathrm{d}x\biggr)^{\frac{1}{q}}.</math>
Probability measure
For the probability space <math>(\Omega, \mathcal{F}, \mathbb{P}),</math> let <math>\mathbb{E}</math> denote the expectation operator. For real- or complex-valued random variables <math>X</math> and <math>Y</math> on <math>\Omega,</math> Hölder's inequality reads
:<math>\mathbb{E}[|XY|] \leqslant \left (\mathbb{E}\bigl[ |X|^p\bigr]\right)^{\frac{1}{p}} \left(\mathbb{E}\bigl[|Y|^q\bigr]\right)^{\frac{1}{q}}.</math>
Let <math>1 < r < s < \infty</math> and define <math>p = \tfrac{s}{r}.</math> Then <math>q = \tfrac{p}{p-1}</math> is the Hölder conjugate of <math>p.</math> Applying Hölder's inequality to the random variables <math>|X|^r</math> and <math>1_{\Omega}</math> we obtain
:<math>\mathbb{E}\bigl[|X|^r\bigr]\leqslant \left(\mathbb{E}\bigl[|X|^s\bigr]\right)^{\frac{r}{s}}.</math>
In particular, if the <math>s</math>th absolute moment is finite, then the <math>r</math>th absolute moment is finite, too. (This also follows from Jensen's inequality.)
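A small numerical check of this moment comparison, for a discrete random variable, might look as follows. The function names and the three-point distribution are hypothetical and serve only as an illustration.

```python
def abs_moment(values, probs, r):
    """Return E|X|^r for a discrete random variable X."""
    return sum(p * abs(x) ** r for x, p in zip(values, probs))


def moment_sides(values, probs, r, s):
    """Return (E|X|^r, (E|X|^s)^(r/s)); the text above assumes 1 < r < s."""
    return abs_moment(values, probs, r), abs_moment(values, probs, s) ** (r / s)


# Hypothetical distribution: X takes the values -1, 0, 3.
values = [-1.0, 0.0, 3.0]
probs = [0.25, 0.5, 0.25]
lhs, rhs = moment_sides(values, probs, 2.0, 4.0)
print(lhs <= rhs)
```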
Product measure
For two σ-finite measure spaces <math>(S_1,\Sigma_1,\mu_1)</math> and <math>(S_2,\Sigma_2,\mu_2)</math> define the product measure space by
:<math>S=S_1\times S_2,\quad \Sigma=\Sigma_1\otimes\Sigma_2,\quad \mu=\mu_1\otimes\mu_2,</math>
where <math>S</math> is the Cartesian product of <math>S_1</math> and <math>S_2</math>, the σ-algebra <math>\Sigma</math> arises as the product σ-algebra of <math>\Sigma_1</math> and <math>\Sigma_2</math>, and <math>\mu</math> denotes the product measure of <math>\mu_1</math> and <math>\mu_2</math>. Then Tonelli's theorem allows us to rewrite Hölder's inequality using iterated integrals: If <math>f</math> and <math>g</math> are <math>\Sigma</math>-measurable real- or complex-valued functions on the Cartesian product <math>S</math>, then
:<math>\int_{S_1}\int_{S_2}|f(x,y)\,g(x,y)|\,\mu_2(\mathrm{d}y)\,\mu_1(\mathrm{d}x) \le\left(\int_{S_1}\int_{S_2}|f(x,y)|^p\,\mu_2(\mathrm{d}y)\,\mu_1(\mathrm{d}x)\right)^{\frac{1}{p}}\left(\int_{S_1}\int_{S_2}|g(x,y)|^q\,\mu_2(\mathrm{d}y)\,\mu_1(\mathrm{d}x)\right)^{\frac{1}{q}}.</math>
This can be generalized to more than two measure spaces.
Vector-valued functions
Let <math>(S,\Sigma,\mu)</math> denote a σ-finite measure space and suppose that <math>f=(f_1,\ldots,f_n)</math> and <math>g=(g_1,\ldots,g_n)</math> are <math>\Sigma</math>-measurable functions on <math>S</math>, taking values in the <math>n</math>-dimensional real- or complex Euclidean space. By taking the product with the counting measure on <math>\{1,\ldots,n\}</math>, we can rewrite the above product measure version of Hölder's inequality in the form
:<math> \int_S \sum_{k=1}^n|f_k(x)\,g_k(x)|\,\mu(\mathrm{d}x) \le \left(\int_S\sum_{k=1}^n|f_k(x)|^p\,\mu(\mathrm{d}x)\right)^{\frac{1}{p}}\left(\int_S\sum_{k=1}^n|g_k(x)|^q\,\mu(\mathrm{d}x)\right)^{\frac{1}{q}}.</math>
If the two integrals on the right-hand side are finite, then equality holds if and only if there exist real numbers <math>\alpha,\beta\ge 0</math>, not both of them zero, such that
:<math>\alpha \left (|f_1(x)|^p,\ldots,|f_n(x)|^p \right )= \beta \left (|g_1(x)|^q,\ldots,|g_n(x)|^q \right ),</math>
for <math>\mu</math>-almost all <math>x</math> in <math>S</math>.
This finite-dimensional version generalizes to functions <math>f</math> and <math>g</math> taking values in a normed space which could be for example a sequence space or an inner product space.
Proof of Hölder's inequality
There are several proofs of Hölder's inequality; the main idea in the following is Young's inequality for products.
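Alternative proof using Jensen's inequality: Recall Jensen's inequality for the convex function <math>x^p</math> (it is convex because <math>p\ge 1</math>):
:<math>\int h\,\mathrm{d}\nu \le \left(\int h^p\,\mathrm{d}\nu\right)^{\frac{1}{p}}</math>
where <math>\nu</math> is any probability distribution and <math>h</math> any <math>\nu</math>-measurable function. Let <math>\mu</math> be any measure, and <math>\nu</math> the distribution whose density w.r.t. <math>\mu</math> is proportional to <math>g^q</math>, i.e.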
:<math>\mathrm{d}\nu = \frac{g^q}{\int g^q\,\mathrm{d}\mu}\mathrm{d}\mu</math>
Hence we have, using <math>\frac{1}{p}+\frac{1}{q}=1</math>, hence <math>p(1-q)+q=0</math>, and letting <math>h=fg^{1-q}</math>,
:<math> \begin{align}\int fg\,\mathrm{d}\mu = & \left (\int g^q\,\mathrm{d}\mu \right )\int \underbrace{fg^{1-q}}_h\underbrace{\frac{g^q}{\int g^q\,\mathrm{d}\mu}\mathrm{d}\mu}_{\mathrm{d}\nu}\\
\leq & \left (\int g^q\mathrm{d}\mu \right ) \left (\int \underbrace{f^pg^{p(1-q)}}_{h^p}\underbrace{\frac{g^q}{\int g^q\,\mathrm{d}\mu}\,\mathrm{d}\mu}_{\mathrm{d}\nu} \right )^{\frac{1}{p}}\\
= & \left (\int g^q\,\mathrm{d}\mu \right ) \left (\int \frac{f^p}{\int g^q\,\mathrm{d}\mu}\,\mathrm{d}\mu \right )^{\frac{1}{p}}.
\end{align}</math>
Finally, we get
:<math>\int fg\,\mathrm{d}\mu \leq \left(\int f^p\,\mathrm{d}\mu \right )^{\frac{1}{p}} \left(\int g^q\,\mathrm{d}\mu \right )^{\frac{1}{q}}</math>
This assumes that <math>f, g</math> are real and non-negative, but the extension to complex functions is straightforward (use the modulus of <math>f, g</math>).
It also assumes that <math>\|f\|_p,\|g\|_q</math> are neither null nor infinity, and that <math>p,q > 1</math>: all these assumptions can also be lifted as in the proof above.
We could also bypass use of both Young's and Jensen's inequalities. The proof below also explains why and where the Hölder exponent comes in naturally.
As in the previous proof, it suffices to prove
:<math>\int_X |h|\,\mathrm{d}\nu \le \left(\int_X |h|^p\,\mathrm{d}\nu\right)^{\frac{1}{p}}</math>
where <math> \nu(X)=1 </math> and <math>h</math> is a <math>\nu</math>-measurable (real or complex) function on <math> X </math>. To prove this, we must bound <math>|h| </math> by <math> |h|^p </math>. There is no constant <math> C </math> that will make <math> |h(x)| ~\leq~ C|h(x)|^p </math> for all <math> x > 0 </math>. Hence, we seek an inequality of the form
:<math> |h(x)| ~\leq~ a'|h(x)|^p + b', \quad \text{for all} \quad x>0 </math>
for suitable choices of <math> a' </math> and <math> b' </math>.
We wish to obtain <math> A:=\|h\|_p </math> on the right-hand side after integrating this inequality. By trial and error, we see that the inequality we wish should have the form
:<math> |h(x)| ~\leq~ aA^{1-p}|h(x)|^p + bA, \quad \text{for all} \quad x>0, </math>
where <math> a, b </math> are non-negative and <math> a+b=1 </math>. Indeed, the integral of the right-hand side is precisely <math> A </math>. So, it remains to prove that such an inequality does hold with the right choice of <math>a,b.</math>
The inequality we seek would follow from:
:<math> \tfrac{y}{A} ~\leq~ a(\tfrac{y}{A})^p + b, \quad \text{for all} \quad y>0, </math>
which, in turn, is equivalent to
:<math> (*) \quad z ~\leq~ az^p + b, \quad \text{for all} \quad z>0. </math>
It turns out there is one and only one choice of <math> a, b </math>, subject to <math> a+b=1 </math>, that makes this true: <math> a=\tfrac{1}{p}</math> and, necessarily, <math> b=1-\tfrac{1}{p}</math>. (This is where the Hölder conjugate exponent is born!) This completes the proof of the inequality stated in the first paragraph of this proof. The proof of Hölder's inequality follows from this as in the previous proof. Alternatively, we can deduce Young's inequality and then resort to the first proof given above. Young's inequality follows from the inequality (*) above by choosing <math> z=\tfrac{a}{b^{q-1}}</math> and multiplying both sides by <math> b^{q} </math>.
Extremal equality
Statement
Assume that <math>1\le p<\infty</math> and let <math>q</math> denote the Hölder conjugate. Then for every <math>f\in L^p(\mu)</math>,
:<math>\|f\|_p = \max \left \{ \left| \int_S f g \, \mathrm{d}\mu \right | : g\in L^q(\mu), \|g\|_q \le 1 \right\},</math>
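where max indicates that there actually is a <math>g</math> maximizing the right-hand side. When <math>p=\infty</math> and if each set <math>A</math> in the σ-field <math>\Sigma</math> with <math>\mu(A)=\infty</math> contains a subset <math>B\in\Sigma</math> with <math>0<\mu(B)<\infty</math> (which is true in particular when <math>\mu</math> is σ-finite), then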
:<math>\|f\|_\infty = \sup \left\{ \left| \int_S f g \,\mathrm{d}\mu \right| : g\in L^1(\mu), \|g\|_1 \le 1 \right \}.</math>
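For the counting measure on a finite set and a real-valued <math>f</math>, a maximizer can be written down explicitly. The sketch below assumes <math>1<p<\infty</math> and uses the standard choice <math>g=\operatorname{sign}(f)\,|f|^{p-1}/\|f\|_p^{p-1}</math>, which is not spelled out in the statement above; the function names are illustrative.

```python
import math


def p_norm(v, p):
    """Return (sum_k |v_k|^p)^(1/p) for a finite real sequence v."""
    return sum(abs(x) ** p for x in v) ** (1.0 / p)


def extremal_g(f, p):
    """Return g = sign(f) |f|^(p-1) / ||f||_p^(p-1) for real f != 0, 1 < p < inf."""
    norm = p_norm(f, p)
    if norm == 0.0:
        raise ValueError("f must not vanish identically")
    scale = norm ** (p - 1.0)
    return [math.copysign(abs(x) ** (p - 1.0), x) / scale for x in f]


# Hypothetical example vector.
f = [3.0, -4.0, 0.0]
p = 3.0
q = p / (p - 1.0)
g = extremal_g(f, p)
pairing = sum(a * b for a, b in zip(f, g))
# Both checks hold up to rounding: ||g||_q = 1 and sum f*g = ||f||_p.
print(abs(p_norm(g, q) - 1.0) < 1e-9, abs(pairing - p_norm(f, p)) < 1e-9)
```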
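Proof of the extremal equality: By Hölder's inequality, <math>\left|\int_S fg\,\mathrm{d}\mu\right| \le \|f\|_p\|g\|_q \le \|f\|_p</math> for every <math>g</math> with <math>\|g\|_q\le 1</math>, so the right-hand side never exceeds <math>\|f\|_p</math>. For <math>1\le p<\infty</math> and <math>\|f\|_p>0</math>, the function
:<math>g = \frac{\overline{f}\,|f|^{p-2}}{\|f\|_p^{p-1}}\,1_{\{f\neq 0\}}</math>
satisfies <math>\|g\|_q = 1</math> and <math>\int_S fg\,\mathrm{d}\mu = \|f\|_p</math>, so the maximum is attained; for <math>\|f\|_p=0</math> any admissible <math>g</math> works.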
Remarks and examples
- The equality for <math>p = \infty</math> fails whenever there exists a set <math>A</math> of infinite measure in the <math>\sigma</math>-field <math>\Sigma</math> that has no subset <math>B \in \Sigma</math> satisfying <math>0 < \mu(B) < \infty</math> (the simplest example is the <math>\sigma</math>-field <math>\Sigma</math> containing just the empty set and <math>S,</math> and the measure <math>\mu</math> with <math>\mu(S) = \infty</math>). Then the indicator function <math>1_A</math> satisfies <math>\|1_A\|_{\infty} = 1,</math> but every <math>g \in L^1 (\mu)</math> has to be <math>\mu</math>-almost everywhere constant on <math>A,</math> because it is <math>\Sigma</math>-measurable, and this constant has to be zero, because <math>g</math> is <math>\mu</math>-integrable. Therefore, the above supremum for the indicator function <math>1_A</math> is zero and the extremal equality fails.
- For <math>p = \infty,</math> the supremum is in general not attained. As an example, let <math>S = \mathbb{N}, \Sigma = \mathcal{P}(\mathbb{N})</math> and <math>\mu</math> the counting measure. Define:
::<math>\begin{cases} f: \mathbb{N} \to \mathbb{R} \\ f(n) = \frac{n-1}{n} \end{cases}</math>
:Then <math>\|f\|_{\infty} = 1.</math> For <math>g \in L^1 (\mu, \mathbb{N})</math> with <math>0 < \|g\|_1 \leqslant 1,</math> let <math>m</math> denote the smallest natural number with <math>g(m) \neq 0.</math> Then
::<math>\left |\int_S fg\,\mathrm{d}\mu\right| \leqslant \frac{m-1}{m}|g(m)|+\sum_{n=m+1}^\infty|g(n)| = \|g\|_1-\frac{|g(m)|}m<1.</math>
Applications
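- The extremal equality is one of the ways of proving the triangle inequality <math>\|f_1+f_2\|_p\le\|f_1\|_p+\|f_2\|_p</math> for all <math>f_1</math> and <math>f_2</math> in <math>L^p(\mu)</math>, see Minkowski inequality.
- Hölder's inequality implies that every <math>f\in L^p(\mu)</math> defines a bounded (or continuous) linear functional <math>\kappa_f</math> on <math>L^q(\mu)</math> by the formula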
::<math>\kappa_f(g) = \int_S f g \, \mathrm{d}\mu,\qquad g\in L^q(\mu).</math>
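:The extremal equality (when true) shows that the norm of this functional <math>\kappa_f</math> as an element of the continuous dual space <math>L^q(\mu)^*</math> coincides with the norm of <math>f</math> in <math>L^p(\mu)</math> (see also the <math>L^p</math>-space article).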
Generalization with more than two functions
Statement
Assume that <math>r\in(0,\infty]</math> and <math>p_1,\ldots,p_n\in(0,\infty]</math> such that
:<math>\sum_{k=1}^n \frac1{p_k} = \frac1r</math>
where 1/∞ is interpreted as 0 in this equation, and <math>r=\infty</math> implies that <math>p_1,\ldots,p_n</math> are all equal to ∞. Then, for all measurable real- or complex-valued functions <math>f_1,\ldots,f_n</math> defined on <math>S</math>,
:<math>\left\|\prod_{k=1}^n f_k\right\|_r \le \prod_{k=1}^n \left\|f_k\right\|_{p_k}</math>
where we interpret any product with a factor of ∞ as ∞ if all factors are positive, but the product is 0 if any factor is 0.
In particular, if <math>f_k \in L^{p_k}(\mu)</math> for all <math>k \in \{ 1, \ldots, n \}</math> then <math>\prod_{k=1}^n f_k \in L^r(\mu).</math>
Note: For <math>r \in (0, 1),</math> contrary to the notation, <math>\|\cdot\|_r</math> is in general not a norm because it doesn't satisfy the triangle inequality.
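A numerical illustration for finitely many finite sequences is sketched below. It covers only finite exponents, and the function names and the data in the usage note are assumptions made for the example.

```python
import math


def p_norm(v, p):
    """Return (sum_k |v_k|^p)^(1/p); for 0 < p < 1 this is not a norm."""
    return sum(abs(x) ** p for x in v) ** (1.0 / p)


def generalized_holder_sides(fs, ps):
    """Return (||f_1 ... f_n||_r, prod_k ||f_k||_{p_k}) with 1/r = sum_k 1/p_k.

    fs is a list of equally long finite sequences, ps the finite exponents.
    """
    r = 1.0 / sum(1.0 / p for p in ps)
    products = [math.prod(column) for column in zip(*fs)]
    lhs = p_norm(products, r)
    rhs = math.prod(p_norm(f, p) for f, p in zip(fs, ps))
    return lhs, rhs


# Hypothetical data with three sequences; exponents (2, 2, 2) give r = 2/3.
fs = [[1.0, 2.0], [3.0, 1.0], [2.0, 2.0]]
lhs, rhs = generalized_holder_sides(fs, [2.0, 2.0, 2.0])
print(lhs <= rhs)
```

With the exponents <code>[2.0, 4.0, 4.0]</code> the reciprocals sum to 1, so <math>r=1</math> and the left-hand side is the plain sum of the absolute products.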
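Proof of the generalization: We use Hölder's inequality and mathematical induction. If <math>n=1</math> then the result is immediate. Let us now pass from <math>n-1</math> to <math>n.</math> Without loss of generality assume that <math>p_1 \le \cdots \le p_n.</math>
Case 1: If <math>p_n = \infty</math> then
:<math>\sum_{k=1}^{n-1} \frac1{p_k} = \frac1r.</math>
Pulling out the essential supremum of <math>|f_n|</math> and using the induction hypothesis, we get
:<math>\begin{align}
\left\|f_1\cdots f_n\right\|_r &\le \left\|f_1\cdots f_{n-1}\right\|_r \left\|f_n\right\|_{\infty} \\
&\le \left\|f_1\right\|_{p_1} \cdots \left\|f_{n-1}\right\|_{p_{n-1}} \left\|f_n\right\|_{\infty}.
\end{align}</math>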
Case 2: If <math>p_n < \infty</math> then necessarily <math>r < \infty</math> as well, and then
:<math>p := \frac{p_n}{p_n-r}, \qquad q := \frac{p_n}r</math>
are Hölder conjugates in <math>(1,\infty)</math>. Application of Hölder's inequality gives
:<math>\left \||f_1 \cdots f_{n-1}|^r\,|f_n|^r\right \|_1 \le \left \||f_1 \cdots f_{n-1}|^r\right\|_p\,\left \||f_n|^r\right \|_q.</math>
Raising to the power <math>1/r</math> and rewriting,
:<math>\|f_1 \cdots f_n\|_r \le \|f_1 \cdots f_{n-1}\|_{pr} \|f_n\|_{qr}.</math>
Since <math>q r = p_n</math> and
:<math>\sum_{k=1}^{n-1} \frac1{p_k} = \frac1r-\frac1{p_n} = \frac{p_n-r}{rp_n} = \frac1{pr},</math>
the claimed inequality now follows by using the induction hypothesis.
Interpolation
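Let <math>p_1,\ldots,p_n\in(0,\infty]</math> and let <math>\theta_1,\ldots,\theta_n\in(0,1)</math> denote weights with <math>\theta_1+\cdots+\theta_n=1</math>. Define <math>p</math> as the weighted harmonic mean, that is,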
:<math> \frac1p = \sum_{k=1}^n \frac{\theta_k}{p_k}.</math>
Given measurable real- or complex-valued functions <math>f_k</math> on <math>S</math>, the above generalization of Hölder's inequality gives
:<math>\left\| |f_1|^{\theta_1}\cdots |f_n|^{\theta_n}\right\|_p \le \left\||f_1|^{\theta_1}\right\|_{\frac{p_1}{\theta_1}}\cdots \left\| |f_n|^{\theta_n}\right\|_{\frac{p_n}{\theta_n}} = \|f_1\|_{p_1}^{\theta_1}\cdots \|f_n\|_{p_n}^{\theta_n}.</math>
In particular, taking <math>f_1 = \cdots = f_n=:f</math> gives
:<math>\|f\|_p \leqslant \prod_{k=1}^n \|f\|_{p_k}^{\theta_k}.</math>
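Specifying further <math>\theta_1 = \theta</math> and <math>\theta_2 = 1-\theta,</math> in the case <math>n = 2,</math> we obtain the interpolation result known as Littlewood's inequality: for <math>\theta\in(0,1)</math> and <math>\tfrac1p = \tfrac{\theta}{p_1}+\tfrac{1-\theta}{p_2},</math>
<math display="block">\|f\|_p \leqslant \|f\|_{p_1}^{\theta} \cdot \|f\|_{p_2}^{1-\theta}.</math>
An application of Hölder gives Lyapunov's inequality: if <math>p = (1-\theta)p_0 + \theta p_1</math> with <math>\theta\in(0,1),</math> then
<math display="block">\left\| |f_0|^{\frac{p_0(1-\theta)}{p}} \cdot |f_1|^{\frac{p_1 \theta}{p}}\right\|_p^p \le \|f_0\|_{p_0}^{p_0(1-\theta)} \|f_1\|_{p_1}^{p_1\theta}</math>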
and in particular
<math display="block">\|f\|_p^p \leqslant \|f\|_{p_0}^{p_0(1-\theta)} \cdot \|f\|_{p_1}^{p_1\theta}.</math>
Both Littlewood and Lyapunov imply that if <math>f \in L^{p_0}\cap L^{p_1}</math> then <math>f \in L^p</math> for all <math>p_0 < p < p_1.</math>
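The single-function bound <math>\|f\|_p^p \leqslant \|f\|_{p_0}^{p_0(1-\theta)} \|f\|_{p_1}^{p_1\theta}</math> can be checked numerically for the counting measure. The sketch below assumes <math>p=(1-\theta)p_0+\theta p_1</math> with <math>0<\theta<1</math>; the function names and the data are illustrative.

```python
def power_integral(f, p):
    """Return sum_k |f_k|^p, i.e. ||f||_p^p for the counting measure."""
    return sum(abs(x) ** p for x in f)


def lyapunov_sides(f, p0, p1, theta):
    """Return both sides of ||f||_p^p <= ||f||_{p0}^{p0(1-theta)} ||f||_{p1}^{p1 theta}.

    Here p = (1 - theta) * p0 + theta * p1 with 0 < theta < 1, and
    ||f||_{p0}^{p0(1-theta)} equals (sum |f|^p0)^(1-theta).
    """
    p = (1.0 - theta) * p0 + theta * p1
    lhs = power_integral(f, p)
    rhs = power_integral(f, p0) ** (1.0 - theta) * power_integral(f, p1) ** theta
    return lhs, rhs


# Hypothetical data: p0 = 1, p1 = 3 and theta = 1/2 give p = 2.
lhs, rhs = lyapunov_sides([1.0, 2.0, 3.0], 1.0, 3.0, 0.5)
print(lhs <= rhs)
```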
Reverse Hölder inequalities
Two functions
Assume that <math>p\in(1,\infty)</math> and that the measure space <math>(S,\Sigma,\mu)</math> satisfies <math>\mu(S)>0</math>. Then for all measurable real- or complex-valued functions <math>f</math> and <math>g</math> on <math>S</math> such that <math>g(s)\neq 0</math> for all <math>s\in S</math>,
:<math>\|fg\|_1\geqslant \|f\|_{\frac{1}{p}}\,\|g\|_{\frac{-1}{p-1}}.</math>
If
:<math>\|fg\|_1 < \infty \quad \text{and} \quad \|g\|_{\frac{-1}{p-1}} > 0, </math>
then the reverse Hölder inequality is an equality if and only if
:<math>\exists \alpha \geqslant 0 \quad |f| = \alpha|g|^{\frac{-p}{p-1}} \qquad \mu\text{-almost everywhere}.</math>
Note: The expressions:
<math> \|f\|_{\frac{1}{p}}</math> and <math>\|g\|_{\frac{-1}{p-1}},</math>
are not norms, they are just compact notations for
:<math>\left (\int_S|f|^{\frac{1}{p}}\,\mathrm{d}\mu\right)^{p} \quad \text{and} \quad \left (\int_S|g|^{\frac{-1}{p-1}}\,\mathrm{d}\mu\right)^{-(p-1)}.</math>
Proof of the reverse Hölder inequality: Note that <math>p</math> and <math>q := \tfrac{p}{p-1}</math> are Hölder conjugates. Application of Hölder's inequality gives
:<math>\begin{align}
\left \||f|^{\frac{1}{p}}\right \|_1 &= \left \||fg|^{\frac{1}{p}}\,|g|^{-\frac{1}{p}}\right \|_1\\
&\leqslant \left \| |fg|^{\frac{1}{p}} \right \|_p \left \| |g|^{-\frac{1}{p}}\right \|_q \\
&=\|fg\|_1^{\frac{1}{p}}\left \||g|^{\frac{-1}{p-1}}\right \|_1^{\frac{p-1}{p}}
\end{align}</math>
Raising to the power <math>p</math> gives us:
:<math>\left \||f|^{\frac{1}{p}}\right \|_1^p \leqslant \|fg\|_1 \left \||g|^{\frac{-1}{p-1}}\right \|_1^{p-1}.</math>
Therefore:
:<math>\left \||f|^{\frac{1}{p}}\right \|_1^p \left \||g|^{\frac{-1}{p-1}}\right \|_1^{-(p-1)} \leqslant \|fg\|_1 .</math>
Now we just need to recall our notation.
Since <math>g</math> is not almost everywhere equal to the zero function, we can have equality if and only if there exists a constant <math>\alpha\geqslant 0</math> such that <math>|fg| = \alpha|g|^{-\frac{q}{p}}</math> almost everywhere. Solving for the absolute value of <math>f</math> gives the claim.
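The reverse inequality and its equality case can be illustrated for the counting measure on a finite set. This is a sketch under the hypotheses stated above; the function name and the data are assumptions.

```python
def reverse_holder_sides(f, g, p):
    """Return (||fg||_1, ||f||_{1/p} * ||g||_{-1/(p-1)}) for p in (1, infinity).

    Every entry of g must be nonzero, matching the hypothesis on g above.
    """
    if any(b == 0 for b in g):
        raise ValueError("g must be nonzero everywhere")
    lhs = sum(abs(a * b) for a, b in zip(f, g))
    f_part = sum(abs(a) ** (1.0 / p) for a in f) ** p
    g_part = sum(abs(b) ** (-1.0 / (p - 1.0)) for b in g) ** (-(p - 1.0))
    return lhs, f_part * g_part


# Hypothetical data with p = 2: strict inequality in the first case.
lhs, rhs = reverse_holder_sides([1.0, 4.0], [1.0, 1.0], 2.0)
print(lhs >= rhs)

# Equality case: |f| is proportional to |g|^(-p/(p-1)) = |g|^(-2).
lhs_eq, rhs_eq = reverse_holder_sides([1.0, 0.25], [1.0, 2.0], 2.0)
print(abs(lhs_eq - rhs_eq) < 1e-9)
```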
Multiple functions
The Reverse Hölder inequality (above) can be generalized to the case of multiple functions if all but one conjugate is negative.
That is,
: Let <math>p_1,..., p_{m-1} < 0</math> and <math>p_m \in \mathbb{R}</math> be such that <math>\sum_{k=1}^{m} \frac{1}{p_k} = 1</math> (hence <math>0 < p_m < 1</math>). Let <math>f_k</math> be measurable functions for <math>k = 1,...,m</math>. Then
:<math>\left\|\prod_{k=1}^m f_k\right\|_1 \ge \prod_{k=1}^m \left\|f_k\right\|_{p_k}.</math>
This follows from the symmetric form of the Hölder inequality (see below).
Symmetric forms of Hölder inequality
It was observed by Aczél and Beckenbach that Hölder's inequality can be put in a more symmetric form, at the price of introducing an extra vector (or function):
Let <math>f = (f(1),\dots,f(m)) , g = (g(1),\dots, g(m)), h = (h(1),\dots,h(m))</math> be vectors with positive entries and such that <math>f(i) g(i) h(i) = 1</math> for all <math>i</math>. If <math>p,q,r</math> are nonzero real numbers such that <math>\frac{1}{p}+\frac{1}{q}+\frac{1}{r}=0</math>, then:
- <math>\|f\|_p \|g\|_q \|h\|_r \ge 1</math> if all but one of <math>p,q,r</math> are positive;
- <math>\|f\|_p \|g\|_q \|h\|_r \le 1</math> if all but one of <math>p,q,r</math> are negative.
The standard Hölder inequality follows immediately from this symmetric form (and in fact is easily seen to be equivalent to it). The symmetric statement also implies the reverse Hölder inequality (see above).
The result can be extended to multiple vectors:
Let <math>f_1, \dots, f_n</math> be <math>n</math> vectors in <math>\mathbb{R}^m</math> with positive entries and such that <math>f_1(i) \dots f_n(i) = 1</math> for all <math>i</math>. If <math>p_1,\dots,p_n</math> are nonzero real numbers such that <math>\frac{1}{p_1}+\dots+\frac{1}{p_n}=0</math>, then:
- <math>\|f_1\|_{p_1} \dots \|f_n\|_{p_n} \ge 1</math> if all but one of the numbers <math>p_i</math> are positive;
- <math>\|f_1\|_{p_1} \dots \|f_n\|_{p_n} \le 1</math> if all but one of the numbers <math>p_i</math> are negative.
As in the standard Hölder inequalities, there are corresponding statements for infinite sums and integrals.
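The three-vector symmetric form can be explored numerically as follows. The sketch takes positive vectors <math>f</math> and <math>g</math>, sets <math>h = 1/(fg)</math> entrywise so that the product condition holds, and evaluates <math>\|f\|_p\|g\|_q\|h\|_r</math>; the function names and data are illustrative assumptions.

```python
def power_mean_norm(v, p):
    """Return (sum_i v_i^p)^(1/p) for positive entries and any nonzero real p."""
    return sum(x ** p for x in v) ** (1.0 / p)


def symmetric_holder_product(f, g, p, q, r):
    """Return ||f||_p ||g||_q ||h||_r with h = 1 / (f g) entrywise.

    f and g must have positive entries and 1/p + 1/q + 1/r must vanish.
    """
    if abs(1.0 / p + 1.0 / q + 1.0 / r) > 1e-12:
        raise ValueError("need 1/p + 1/q + 1/r = 0")
    h = [1.0 / (a * b) for a, b in zip(f, g)]
    return power_mean_norm(f, p) * power_mean_norm(g, q) * power_mean_norm(h, r)


# Hypothetical positive vectors.
f = [1.0, 2.0, 3.0]
g = [2.0, 1.0, 4.0]
# All but one exponent positive: the product is at least 1.
print(symmetric_holder_product(f, g, 2.0, 2.0, -1.0) >= 1.0)
# All but one exponent negative: the product is at most 1.
print(symmetric_holder_product(f, g, -1.0, -1.0, 0.5) <= 1.0)
```

With the exponents <math>(2,2,-1)</math> the statement reduces to the Cauchy–Schwarz inequality, since <math>\|h\|_{-1} = \bigl(\sum_i f(i)g(i)\bigr)^{-1}</math>.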
Conditional Hölder inequality
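Let <math>(\Omega,\mathcal{F},\mathbb{P})</math> be a probability space, <math>\mathcal{G}\subset\mathcal{F}</math> a sub-σ-algebra, and <math>p,q\in(1,\infty)</math> Hölder conjugates, meaning that <math>\tfrac{1}{p}+\tfrac{1}{q}=1</math>. Then for all real- or complex-valued random variables <math>X</math> and <math>Y</math> on <math>\Omega</math>,
:<math>\mathbb{E}\bigl[|XY|\big|\,\mathcal{G}\bigr] \le \bigl(\mathbb{E}\bigl[|X|^p\big|\,\mathcal{G}\bigr]\bigr)^{\frac{1}{p}} \,\bigl(\mathbb{E}\bigl[|Y|^q\big|\,\mathcal{G}\bigr]\bigr)^{\frac{1}{q}}
\qquad\mathbb{P}\text{-almost surely.}</math>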
Remarks:
- If a non-negative random variable <math>Z</math> has infinite expected value, then its conditional expectation is defined by
::<math>\mathbb{E}[Z|\mathcal{G}] = \sup_{n\in\mathbb{N}}\,\mathbb{E}[\min\{Z,n\}|\mathcal{G}]\quad\text{a.s.}</math>
- On the right-hand side of the conditional Hölder inequality, 0 times ∞ as well as ∞ times 0 means 0. Multiplying with ∞ gives ∞.
Proof of the conditional Hölder inequality:
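Define the random variables
:<math>U=\bigl(\mathbb{E}\bigl[|X|^p\big|\,\mathcal{G}\bigr]\bigr)^{\frac{1}{p}},\qquad V=\bigl(\mathbb{E}\bigl[|Y|^q\big|\,\mathcal{G}\bigr]\bigr)^{\frac{1}{q}}</math>
and note that they are measurable with respect to the sub-σ-algebra <math>\mathcal{G}</math>. Since
:<math>\mathbb{E}\bigl[|X|^p1_{\{U=0\}}\bigr] = \mathbb{E}\bigl[1_{\{U=0\}}\underbrace{\mathbb{E}\bigl[|X|^p\big|\,\mathcal{G}\bigr]}_{=\,U^p}\bigr]=0,</math>
it follows that <math>|X|=0</math> a.s. on the set <math>\{U=0\}</math>. Similarly, <math>|Y|=0</math> a.s. on the set <math>\{V=0\}</math>, hence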
:<math>\mathbb{E}\bigl[|XY|\big|\,\mathcal{G}\bigr]=0\qquad\text{a.s. on }\{U=0\}\cup\{V=0\}</math>
and the conditional Hölder inequality holds on this set. On the set
:<math>\{U=\infty, V>0\}\cup\{U>0, V=\infty\}</math>
the right-hand side is infinite and the conditional Hölder inequality holds, too. Dividing by the right-hand side, it therefore remains to show that
:<math>\frac{\mathbb{E}\bigl[|XY|\big|\,\mathcal{G}\bigr]}{UV}\le1
\qquad\text{a.s. on the set }H:=\{0<U<\infty,\,0<V<\infty\}.</math>
This is done by verifying that the inequality holds after integration over an arbitrary
:<math>G\in\mathcal{G},\quad G\subset H.</math>
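Using the measurability of <math>U, V, 1_G</math> with respect to the sub-σ-algebra <math>\mathcal{G}</math>, the rules for conditional expectations, Hölder's inequality and <math>\tfrac{1}{p}+\tfrac{1}{q}=1</math>, we see that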
:<math>\begin{align}
\mathbb{E}\biggl[\frac{\mathbb{E}\bigl[|XY|\big|\,\mathcal{G}\bigr]}{UV}1_G\biggr]
&=\mathbb{E}\biggl[\mathbb{E}\biggl[\frac{|XY|}{UV}1_G\bigg|\,\mathcal{G}\biggr]\biggr]\\
&=\mathbb{E}\biggl[\frac{|X|}{U}1_G\cdot\frac{|Y|}{V}1_G\biggr]\\
&\le\biggl(\mathbb{E}\biggl[\frac{|X|^p}{U^p}1_G\biggr]\biggr)^{\frac{1}{p}}
\biggl(\mathbb{E}\biggl[\frac{|Y|^q}{V^q}1_G\biggr]\biggr)^{\frac{1}{q}}\\
&=\biggl(\mathbb{E}\biggl[\underbrace{\frac{\mathbb{E}\bigl[|X|^p\big|\,\mathcal{G}\bigr]}{U^p}}_{=\,1\text{ a.s. on }G}1_G\biggr]\biggr)^{\frac{1}{p}}
\biggl(\mathbb{E}\biggl[\underbrace{\frac{\mathbb{E}\bigl[|Y|^q\big|\,\mathcal{G}\bigr]}{V^q}}_{=\,1\text{ a.s. on }G}1_G\biggr]\biggr)^{\frac{1}{q}}\\
&=\mathbb{E}\bigl[1_G\bigr].
\end{align}</math>
Hölder's inequality for increasing seminorms
Let <math>S</math> be a set and let <math>F(S, \mathbb{C})</math> be the space of all complex-valued functions on <math>S</math>. Let <math>N</math> be an increasing seminorm on <math>F(S, \mathbb{C}),</math> meaning that, for all real-valued functions <math>f, g \in F(S, \mathbb{C})</math> we have the following implication (the seminorm is also allowed to attain the value ∞):
:<math> \forall s \in S \quad f(s) \geqslant g(s) \geqslant 0 \qquad \Rightarrow \qquad N(f) \geqslant N(g).</math>
Then:
:<math>\forall f, g \in F(S, \mathbb{C}) \qquad N(|fg|) \leqslant \bigl(N(|f|^p)\bigr)^{\frac{1}{p}} \bigl(N(|g|^q)\bigr)^{\frac{1}{q}},</math>
where the numbers <math>p</math> and <math>q</math> are Hölder conjugates.
Remark: If <math>(S,\Sigma,\mu)</math> is a measure space and <math>N(f)</math> is the upper Lebesgue integral of <math>|f|</math> then the restriction of <math>N</math> to all <math>\Sigma</math>-measurable functions gives the usual version of Hölder's inequality.
Distances based on Hölder inequality
Hölder's inequality can be used to define statistical dissimilarity measures between probability distributions. These Hölder divergences are projective: they do not depend on the normalization factor of the densities.
See also
- Cauchy–Schwarz inequality
- Minkowski inequality
- Jensen's inequality
- Young's inequality for products
- Clarkson's inequalities
- Brascamp–Lieb inequality
