1 Introduction

The electric conductivity describes the response of the current density \(j_i(t)\) toward a time-dependent electric field \(E_j(t)\). In the Fourier space, the linear optical conductivity \(\sigma _i^j(\omega )\) (ij are the spatial indices) is the proportionality constant connecting \(j_i(\omega )\) to \(E_i(\omega )\):

$$\begin{aligned} j_i(\omega )=\sum _j\sigma _i^j(\omega )E_j(\omega )+O(E^2). \end{aligned}$$
(1)

There has been a long history of studies on the general properties of \(\sigma _i^j(\omega )\). (See Ref. [1] and the references therein.) For example, the optical conductivity obeys the frequency-sum rule (f-sum rule) [1]; that is, the integral

$$\begin{aligned} \int _{-\infty }^{\infty }d\omega \sigma _i^{j}(\omega )~ \end{aligned}$$
(2)

is solely determined by an expectation value in the absence of the electric field. Furthermore, the optical conductivity is known to have the following generic structure:

$$\begin{aligned} \sigma _{i}^{j}(\omega )=\frac{i}{\omega +i\eta }\mathcal {D}_{i}^{j}+\sigma _{i\,(\text {regular})}^{j}(\omega ), \end{aligned}$$
(3)

where \(\mathcal {D}_i^j\) is called the Drude weight that characterizes the singular part of \(\sigma _i^j(\omega )\) around \(\omega =0\) and \(\sigma _{i\,(\text {regular})}^{j}(\omega )\) is the regular part that includes all other terms. The Drude weight is a useful measure distinguishing ideal conductors from insulators and non-ideal conductors [2]. More than a half century ago, Kohn [3] showed that the Drude weight at zero temperature is given by the curvature of the ground state energy \(\mathcal {E}_0(\vec {A})\) as a function of the vector potential \(\vec {A}\),

$$\begin{aligned}&\mathcal {D}_i^j=\frac{1}{V}\frac{\partial ^{2}\mathcal {E}_0(\vec {A})}{\partial A_i\partial A_j}\Big |_{\vec {A}=0}. \end{aligned}$$
(4)

This is nowadays known as the Kohn formula [1]. An extension to a finite temperature was achieved in Ref. [4].

Recently, two of us proposed [5] a generalization of the f-sum rule and the Kohn formula to the N-th order optical conductivity \(\sigma _{i}^{i_1\dots i_N}(\omega _1,\dots ,\omega _N)\) [defined by Eqs. (10) and (22)] through a heuristic argument utilizing extreme quantum processes in the quench or the adiabatic limit. Historically, however, the f-sum rule and the Kohn formula for the linear optical conductivity were derived using the concrete expression of \(\sigma _i^j(\omega )\) in terms of current-current correlation functions obtained from the linear response theory [6] as we review in Sec. 4 [1, 3, 4]. The main result of this work is to put forward this analysis to higher-order conductivities and provide a proof of the generalized Kohn formula [in Eq. (28) below] via the perturbation theory for the second- and third-order response.

On the way to achieve this goal, we also find it possible to further generalize the f-sum rule to arbitrary non-equilibrium states by focusing on the instantaneous response [see Eq. (17)], which may be seen as another result of this work. This idea was briefly sketched in Ref. [5] without concrete formulation. Our study extends earlier works [7,8,9,10] on the f-sum rule in several respects.

  • In Ref. [7], the f-sum rules were formulated for general nonlinear response functions at higher orders. Since the electric conductivity represents the response of the electric current to the applied electric field, the nonlinear f-sum rules discussed in this paper might appear to be a special case of such a general result. However, the formulation in Ref. [7] by itself does not directly give the f-sum rules for the optical conductivity, because the perturbed Hamiltonian in our problem is written in terms of the vector potential \(\vec {A}(t)\), not the electric field \(\vec {E}(t)\) for which the optical conductivity is defined.

  • In Refs. [7, 8], the electric field was described in terms of the scalar potential \(\phi =-\vec {E}(t)\cdot \sum _j{\hat{\vec {q}}}^j\) (i.e., the length gauge [11]). This cannot represent a uniform electric field under the periodic boundary condition because the position operator becomes ill-defined [12]. Thus the f-sum rules for a uniform electric field in a finite system with the periodic boundary condition cannot be derived in this way.

  • In Refs. [7,8,9] the f-sum rule was derived for, in addition to the equilibrium states, non-equilibrium steady states or non-equilibrium states prepared in a specific protocol although with a large degree of freedoms including the functional form of the time-dependence of the pump field.

  • In Refs. [7,8,9,10], the separation of the Hamiltonian into the kinetic term (which is quadratic in particle field operators) and potential terms is assumed. Furthermore, the Newtonian kinetic energy \({\hat{K}}=\sum _{j}\hat{\vec {p}}_j^2/2m\) was assumed in Refs. [7, 8]. While the f-sum rule for the nonlinear conductivities was discussed in Ref. [7], the assumed form of the kinetic energy leads to vanishing f-sum for nonlinear conductivities at all (second and higher) orders. On the other hand, in Refs. [9, 10], more general (but still quadratic in field operators) kinetic term is considered, but the f-sum rule was derived only for the linear optical conductivity.

In contrast, we derive the compact explicit form of the f-sum rule of optical conductivities at arbitrary orders for most general non-equilibrium states, without assuming any specific form of the Hamiltonian. In particular, we find non-trivial f-sum rules for the second and the higher order non-linear conductivities for general Hamiltonians beyond the Newtonian or quadratic kinetic term. Wherever overlaps, our result is consistent with the earlier works. Our analysis also clarifies that the f-sum rule is not a property specific to the class of non-equilibrium states which were studied in Refs. [7,8,9], but essentially a statement on the instantaneous response of any state (although the standard representation in terms of the frequency integral is applicable only to stationary states). We also describe the electric field by the time-dependent gauge field (i.e., the velocity gauge [11]), which can naturally represent a uniform electric field under any boundary condition including the periodic one. For this purpose, we will keep the discussion applicable to general time-dependent, non-equilibrium states as much as possible. Furthermore, our analysis of the f-sum rules clarifies the similarlty to, and the difference from, the (non-linear) Kohn formulas, which is the other main topic of the present paper.

2 Setup and Definitions

Here we explain the setting of our study and give the definition of the nonlinear optical conductivity and the nonlinear Drude weight.

2.1 Setup

We consider quantum many-body systems defined on the d-dimensional cubic lattice.Footnote 1 The system size V is kept finite with an arbitrary boundary condition. Let \({\hat{H}}_{0}(t)\) be the Hamiltonian of the system, which is allowed to explicitly depend on t. The Hamiltonian may contain arbitrary forms of kinetic terms and interactions, but all terms are required to be short-ranged and U(1) symmetric. If the initial density matrix is \({\hat{\rho }}_0\), the density matrix at a later time t is given by

$$\begin{aligned} {\hat{\rho }}_{0}(t)={\hat{S}}_0(t){\hat{\rho }}_0{\hat{S}}_0(t)^\dagger , \end{aligned}$$
(5)

where \({\hat{S}}_0(t)\) is the time-evolution operator

$$\begin{aligned} {\hat{S}}_0(t)=\mathcal {T}\exp \left( -i \int _0^tdt'{\hat{H}}_0(t')\right) . \end{aligned}$$
(6)

Since we have not put any restriction on the initial density matrix, \({\hat{\rho }}_{0}(t)\) can describe an arbitrary equilibrium and non-equilibrium state.

We perturb this system by applying a uniform electric field \(\vec {E}(t) \equiv d \vec {A}(t)/dt\) via the vector potential \(\vec {A}(t)=(A_x(t),A_y(t),\dots )\). (Note that our sign convention of \(\vec {A}(t)\) is opposite to the standard one. ) The perturbed Hamiltonian is denoted by \({\hat{H}}(t,\vec {A}(t))\). \(\vec {A}(t)\) is set to be \(\vec {0}\) for \(t\le 0\) and is turned on continuously at \(t=0\). The word “perturbation” in this work is exclusively used regarding the external field; the effect of many-body interactions and disorders, if any, can be fully taken into account in \({\hat{\rho }}_0(t)\). We assume that \({\hat{H}}(t,\vec {A})\) is analytic with respect to \(\vec {A}\) so that

$$\begin{aligned}&{\hat{H}}_{i_1\dots i_N}(t)\equiv \frac{\partial ^N{\hat{H}}(t,\vec {A})}{\partial A_{i_1}\dots \partial A_{i_N}}\Big |_{\vec {A}=\vec {0}} \end{aligned}$$
(7)

is well-defined for any integer \(N\ge 1\).

We are interested in the response of the current density toward the applied electric field. The current density operator is defined by

$$\begin{aligned} \hat{\vec {j}}(t,\vec {A})\equiv \frac{1}{V}\frac{\partial {\hat{H}}(t,\vec {A})}{\partial \vec {A}}. \end{aligned}$$
(8)

If \({\hat{\rho }}(t)\) is the perturbed density matrix fully including the effect of \(\vec {A}(t)\), the current expectation value at time t reads

$$\begin{aligned} j_i(t)\equiv \text {tr}\big ({\hat{j}}_i(t,\vec {A}(t)){\hat{\rho }}(t)\big )\quad (i=x,y,\dots ). \end{aligned}$$
(9)

The spontaneous current \(j_i(t)|_{\vec {A}=0}\) vanishes in the Gibbs states and in the ground state [13,14,15], while it may be nonzero in other more general states. The N-th order response of \(j_i(t)\) (\(N\ge 1\)) may be written as

$$\begin{aligned} j_{i}^{(N)}(t)&=\frac{1}{N!}\sum _{i_1,\dots ,i_N}\int _0^{t+\epsilon } dt_1\dots \int _0^{t+\epsilon } dt_N \sigma _{i}^{i_1\dots i_N}(t,t_1,\dots , t_N)\prod _{\ell =1}^NE_{i_\ell }(t_\ell ), \end{aligned}$$
(10)

which gives the definition of the N-th order conductivity. Here, the summation of \(i_\ell \)’s (\(\ell =1,\dots ,N\)) is taken over \(x,y,\dots \). An infinitesimal parameter \(\epsilon >0\) is included to properly treat possible \(\delta \)-functions. As \(\sigma _{i}^{i_1\dots i_N}(t,t_1,\dots , t_N)\) vanishes whenever \(t_\ell >t\) for any \(\ell =1,\dots ,N\) due to the causality, the value of \(\epsilon \) does not affect the integral. The nonlinear conductivity \(\sigma _{i}^{i_1\dots i_N}(t,t_1,\dots , t_N)\) is symmetric with respect to the exchange of any pair of \((i_\ell ,t_\ell )\) and \((i_{\ell '},t_{\ell '})\).

For our discussions below, we find it useful to introduce another set of response functions \(\phi _{i}^{i_1\dots i_N}(t,t_1,\dots , t_N)\). They are defined similarly to \(\sigma _{i}^{i_1\dots i_N}(t,t_1,\dots , t_N)\) in Eq. (10) but \(\vec {E}(t)\) there is replaced with \(\vec {A}(t)\):

$$\begin{aligned} j_{i}^{(N)}(t)&=\frac{1}{N!}\sum _{i_1,\dots ,i_N}\int _0^{t+\epsilon }dt_1\dots \int _0^{t+\epsilon } dt_N \phi _{i}^{i_1\dots i_N}(t,t_1,\dots , t_N)\prod _{\ell =1}^NA_{i_\ell }(t_\ell ). \end{aligned}$$
(11)

In terms of the response function \(\phi _{i}^{i_1\dots i_N}(t,t_1,\dots , t_N)\), the conductivity \(\sigma _{i}^{i_1\dots i_N}(t,t_1,\dots , t_N)\) is given by

$$\begin{aligned} \sigma _{i}^{i_1\dots i_N}(t,t_1,\dots , t_N)=\int _{t_1}^{t+\epsilon }dt_1' \dots \int _{t_N}^{t+\epsilon }dt_N' \phi _{i}^{i_1\dots i_N}(t,t_1',\dots , t_N'). \end{aligned}$$
(12)

This can be seen by expressing \(\vec {A}(t)\) as \(\vec {A}(t)=\int _{0}^{t}dt'\vec {E}(t')\) and by performing integration by parts one by one for \(t_1,\dots ,t_N\) in Eq. (11).

$$\begin{aligned}&\int _0^{t+\epsilon } dt_1\dots \int _0^{t+\epsilon } dt_N \phi _{i}^{i_1\dots i_N}(t,t_1,\dots , t_N)\prod _{\ell =1}^NA_{i_\ell }(t_\ell )\nonumber \\&\quad =\int _0^{t+\epsilon } dt_1\dots \int _0^{t+\epsilon } dt_N \left( \frac{d}{dt_1}\int _0^{t_1}dt_1' \phi _{i}^{i_1\dots i_N}(t,t_1',t_2,\dots , t_N)\right) \left( \int _0^{t_1} dt_1'' E_{i_1}(t_1'')\right) \prod _{\ell =2}^NA_{i_\ell }(t_\ell )\nonumber \\&\quad =\int _0^{t+\epsilon } dt_2\dots \int _0^{t+\epsilon } dt_N \int _0^{{t+\epsilon }}dt_1' \phi _{i}^{i_1\dots i_N}(t,t_1',t_2,\dots , t_N)\int _0^{{t+\epsilon }} dt_1'' E_{i_1}(t_1'')\prod _{\ell =2}^NA_{i_\ell }(t_\ell )\nonumber \\&\quad \quad -\int _0^{t+\epsilon } dt_1\dots \int _0^{t+\epsilon } dt_N \int _0^{t_1}dt_1' \phi _{i}^{i_1\dots i_N}(t,t_1',t_2,\dots , t_N) E_{i_1}(t_1)\prod _{\ell =2}^NA_{i_\ell }(t_\ell )\nonumber \\&\quad =\int _0^{t+\epsilon } dt_1\dots \int _0^{t+\epsilon } dt_N \left( \int _{t_1}^{{t+\epsilon }}dt_1' \phi _{i}^{i_1\dots i_N}(t,t_1',t_2,\dots , t_N)\right) E_{i_1}(t_1)\prod _{\ell =2}^NA_{i_\ell }(t_\ell )\nonumber \\&\quad =\dots \nonumber \\&\quad =\int _0^{t+\epsilon } dt_1\dots \int _0^{t+\epsilon } dt_N \left( \int _{t_1}^{{t+\epsilon }}dt_1'\dots \int _{t_N}^{{t+\epsilon }}dt_N' \phi _{i}^{i_1\dots i_N}(t,t_1',\dots , t_N')\right) \prod _{\ell =1}^NE_{i_\ell }(t_\ell ). \end{aligned}$$
(13)

Comparing the last line with Eq. (10), we obtain Eq. (12).

2.2 Instantaneous response

Let us define the instantaneous conductivity by

$$\begin{aligned} \mathcal {I}_{i}^{i_1\dots i_N}(t)&\equiv \lim _{t'\rightarrow t-0}\sigma _{i}^{i_1\dots i_N}(t,t',\dots , t'). \end{aligned}$$
(14)

In terms of \(\phi _{i}^{i_1\dots i_N}(t,t_1,\dots , t_N)\), we have

$$\begin{aligned} \mathcal {I}_{i}^{i_1\dots i_N}(t)=\lim _{t'\rightarrow t-0}\int _{t'}^{t+\epsilon }dt_1' \dots \int _{t'}^{t+\epsilon }dt_N' \phi _{i}^{i_1\dots i_N}(t,t_1',\dots , t_N'), \end{aligned}$$
(15)

which implies that \(\phi _{i}^{i_1\dots i_N}(t,t_1,\dots , t_N)\) contains a term of the form

$$\begin{aligned} \phi _{i\,\text {(inst)}}^{i_1\dots i_N}(t,t_1,\dots , t_N)\equiv \mathcal {I}_{i}^{i_1\dots i_N}(t)\prod _{\ell =1}^N\delta (t-t_\ell ). \end{aligned}$$
(16)

One of the main results of this work is the following formula that gives \(\mathcal {I}_{i}^{i_1\dots i_N}(t)\) as the expectation value of \({\hat{H}}_{i_1\dots i_N}(t)\) in Eq. (7):

$$\begin{aligned} \mathcal {I}_{i}^{i_1\dots i_N}(t)=\lim _{t'\rightarrow t-0}\sigma _{i}^{i_1\dots i_N}(t,t',\dots , t')=\frac{1}{V}\text {tr}\big ({\hat{H}}_{ii_1\dots i_N}(t){\hat{\rho }}_0(t)\big ). \end{aligned}$$
(17)

See Sect. 4.1 for the derivation. This relation can be interpreted as the generalized f-sum rule for non-equilibrium states as we discuss in Sect. 2.3.

2.3 Fourier Transformation for Stationary States

If the unperturbed Hamiltonian \({\hat{H}}_0\) lacks any time-dependence and if the initial density matrix \({\hat{\rho }}_0\) commutes with \({\hat{H}}_0\), the system becomes time-translation invariant. In such a case, \({\hat{\rho }}_0\) can be written as

$$\begin{aligned} {\hat{\rho }}_0=\sum _n\rho _n|n\rangle \langle n|, \end{aligned}$$
(18)

where \(|n\rangle \) is the N-th eigenstate of \({\hat{H}}_0\) with the energy eigenvalue \(\mathcal {E}_n\). In the stationary state, both \(\sigma _{i}^{i_1\dots i_N}(t,t_1,\dots , t_N)\) and \(\phi _{i}^{i_1\dots i_N}(t,t_1,\dots , t_N)\) depend only on the difference \(t-t_\ell \). We write

$$\begin{aligned} \sigma _{i}^{i_1\dots i_N}(t,t_1,\dots , t_N)&=\sigma _{i}^{i_1\dots i_N}(t-t_1,\dots , t-t_N), \end{aligned}$$
(19)
$$\begin{aligned} \phi _{i}^{i_1\dots i_N}(t,t_1,\dots , t_N)&=\phi _{i}^{i_1\dots i_N}(t-t_1,\dots , t-t_N), \end{aligned}$$
(20)

for which Eq. (12) becomes

$$\begin{aligned} \sigma _{i}^{i_1\dots i_N}(t_1,\dots , t_N)=\int _{-\epsilon }^{t_1}dt_1' \dots \int _{-\epsilon }^{t_N}dt_N' \phi _{i}^{i_1\dots i_N}(t_1',\dots , t_N'). \end{aligned}$$
(21)

In the presence of the time-translation symmetry, it is customary to work in the Fourier space. The Fourier transformation is defined by

$$\begin{aligned}&\sigma _i^{i_1\dots i_N}(\omega _1,\dots ,\omega _N)=\int _{-\epsilon }^{\infty }dt_1\dots \int _{-\epsilon }^{\infty }dt_N\sigma _i^{i_1\dots i_N}(t_1,\dots ,t_N)\prod _{\ell =1}^Ne^{(i\omega _\ell -\eta )t_\ell }, \end{aligned}$$
(22)
$$\begin{aligned}&\phi _i^{i_1\dots i_N}(\omega _1,\dots ,\omega _N)=\int _{-\epsilon }^{\infty }dt_1\dots \int _{-\epsilon }^{\infty }dt_N\phi _i^{i_1\dots i_N}(t_1,\dots ,t_N)\prod _{\ell =1}^Ne^{(i\omega _\ell -\eta )t_\ell }, \end{aligned}$$
(23)

where \(\eta >0\) is an infinitesimal parameter ensuring the convergence of the integrand in the \(t_\ell \rightarrow \infty \) limit. The Fourier component \(\sigma _i^{i_1\dots i_N}(\omega _1,\dots ,\omega _N)\) is called the N-th order optical conductivity. In the Fourier space, Eq. (21) reduces to

$$\begin{aligned} \sigma _i^{i_1\dots i_N}(\omega _1,\dots ,\omega _N)=\phi _i^{i_1\dots i_N}(\omega _1,\dots ,\omega _N)\prod _{\ell =1}^N\frac{i}{\omega _\ell +i\eta }. \end{aligned}$$
(24)

By the inverse Fourier transformation, the frequency integral of \(\sigma _{i}^{i_1\dots i_N}(\omega _1,\dots , \omega _N)\) can be related to the instantaneous conductivity in Eq. (17) that is time-independent in stationary states:

$$\begin{aligned}&\int _{-\infty }^{\infty }\frac{d\omega _1}{2\pi }\dots \int _{-\infty }^{\infty }\frac{d\omega _N}{2\pi }\sigma _i^{i_1\dots i_N}(\omega _1,\dots ,\omega _N)\nonumber \\&\quad =\frac{1}{2^N}\mathcal {I}_{i}^{i_1\dots i_N}=\frac{1}{2^N}\lim _{t'\rightarrow +0}\sigma _{i}^{i_1\dots i_N}(t',\dots , t')=\frac{1}{2^NV}\sum _n\rho _n\big \langle n\big |{\hat{H}}_{ii_1\dots i_N}\big |n\big \rangle .~ \end{aligned}$$
(25)

The factor \(2^{-N}\) originates from the discontinuity of \(\sigma _{i}^{i_1\dots i_N}(t_1,\dots , t_N)\) around \(t_\ell =0\). This is the generalized f-sum rule for non-linear conductivity proposed recently [5].

2.4 Drude Weight and Kohn Formula

The Drude weight, usually discussed for the linear optical conductivity as in Eq. (3), can be naturally extended to nonlinear responses. The N-th order Drude weight (\(N\ge 1\)) is the coefficient of the term proportional to \(\prod _{\ell =1}^Ni/(\omega _\ell +i\eta )\) in the N-th order optical conductivity [5]:

$$\begin{aligned} \sigma _{i\,\text {(Drude)}}^{i_1\dots i_N}(\omega _1,\dots ,\omega _N)=\mathcal {D}_{i}^{i_1\dots i_N}\prod _{\ell =1}^N\frac{i}{\omega _\ell +i\eta }. \end{aligned}$$
(26)

This is the most singular part of \(\sigma _{i}^{i_1\dots i_N}(\omega _1,\dots ,\omega _N)\) around \(\omega _1=\dots =\omega _N=0\) proportional to \(\prod _{i=1}^N\delta (\omega _\ell )\). One may think of \(\sigma _{i\,\text {(Drude)}}^{i_1\dots i_N}(\omega _1,\dots ,\omega _N)\) as the long-time average of \(\sigma _{i}^{i_1\dots i_N}(t_1,\dots ,t_N)\), in sharp contrast to the instantaneous value \(\mathcal {I}_{i}^{i_1\dots i_N}(t)\) defined in Eq. (14).

Comparing with Eq. (24), we find

$$\begin{aligned} \mathcal {D}_{i}^{i_1\dots i_N}= \phi _i^{i_1\dots i_N}(\omega _1=0,\dots ,\omega _N=0)=\int _{-\epsilon }^{\infty }dt_1\dots \int _{-\epsilon }^{\infty }dt_N\phi _i^{i_1\dots i_N}(t_1,\dots ,t_N)e^{-\eta \sum _{\ell =1}^Nt_\ell }. \end{aligned}$$
(27)

The generalized Kohn formula proposed in Ref. [5] reads

$$\begin{aligned}&\mathcal {D}_i^{i_1\dots i_N}=\frac{1}{V}\sum _n \rho _n\frac{\partial ^{N+1}\mathcal {E}_n(\vec {A})}{\partial A_{i}\partial A_{i_1}\dots \partial A_{i_N}}\Big |_{\vec {A}=\vec {0}}. \end{aligned}$$
(28)

Here, \(\mathcal {E}_n(\vec {A})\) is the energy eigenvalue of the (instantaneous) eigenstate \(|n(\vec {A})\rangle \) of \({\hat{H}}(\vec {A})\). The weight \(\rho _n\) appearing in \({\hat{\rho }}_0\) in Eq. (18) is kept independent of \(\vec {A}\). The energy eigenvalue \(\mathcal {E}_n(\vec {A})\) and the eigenstate \(|n(\vec {A})\rangle \) are assumed to be analytic around \(\vec {A}=0\), satisfying \(\mathcal {E}_n(\vec {0})=\mathcal {E}_n\) and \(|n(\vec {0})\rangle =|n\rangle \). The case of \(N=1\) for Gibbs states (i.e., \(\rho _n\propto e^{-\beta \mathcal {E}_n}\)) reduces to the finite-temperature extension of the original Kohn formula discussed in Ref. [4], and Eq. (28) is its generalization to the N-th order optical conductivity (\(N\ge 1\)) of general stationary states. In Sect. 4.2, we give a proof of Eq. (28) via the non-degenerate perturbation theory for \(N=2\) and 3.

As a by-product of this derivation, we obtain the following alternative formulas of the Drude weight up to the third-order response [see Eqs. (63), (64), and (67)]:

$$\begin{aligned} \mathcal {D}_i^{i_1}&=\frac{1}{V}\sum _n\rho _n\Big (\langle n| {\hat{H}}_{ii_1}|n\rangle -\langle n|{\hat{H}}_i\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_{n}}{\hat{H}}_{i_1}|n\rangle +\text {c.c.}\Big ), \end{aligned}$$
(29)
$$\begin{aligned} \mathcal {D}_i^{i_1i_2}&=\mathcal {S}_{ii_1i_2}\frac{1}{V}\sum _n\rho _n\Big [\langle n|{\hat{H}}_{ii_1i_2}|n\rangle -3\langle n|{\hat{H}}_i\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_{n}}{\hat{H}}_{i_1i_2}|n\rangle +\text {c.c.}\nonumber \\&\quad +6\langle n|{\hat{H}}_{i_1}\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_{n}}\delta _n{\hat{H}}_{i}\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_{n}}{\hat{H}}_{i_2}|n\rangle \Big ], \end{aligned}$$
(30)
$$\begin{aligned} \mathcal {D}_{i}^{i_1i_2i_3}&=\mathcal {S}_{ii_1i_2i_3}\frac{1}{V}\sum _n\rho _n\Big [\langle n|{\hat{H}}_{i i_1 i_2 i_3}|n\rangle -6\langle n|{\hat{H}}_{ii_1}\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_n}{\hat{H}}_{i_2i_3}|n\rangle \nonumber \\&\quad -4\langle n|{\hat{H}}_{i}\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_n}{\hat{H}}_{i_1i_2i_3} |n\rangle +\text {c.c.}\Big ]\nonumber \\&\quad +12\mathcal {S}_{ii_1i_2i_3}\frac{1}{V}\sum _n\rho _n\Big [ \langle n|{\hat{H}}_{i_3}\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_{n}}\delta _n{\hat{H}}_{i_1i_2}\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_{n}}{\hat{H}}_i|n\rangle \nonumber \\&\quad +\langle n|{\hat{H}}_{i_1i_2}\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_{n}}\delta _n{\hat{H}}_{i}\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_{n}}{\hat{H}}_{i_3}|n\rangle +\text {c.c.}\Big ]\nonumber \\&\quad -24\mathcal {S}_{ii_1i_2i_3}\frac{1}{V}\sum _n\rho _n\langle n|{\hat{H}}_i\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_{n}}\delta _n{\hat{H}}_{i_1}\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_{n}}\delta _n{\hat{H}}_{i_2}\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_{n}}{\hat{H}}_{i_3}|n\rangle \nonumber \\&\quad +24\mathcal {S}_{ii_1i_2i_3}\frac{1}{V}\sum _n\rho _n\langle n|{\hat{H}}_i\frac{{\hat{Q}}_n}{({\hat{H}}_0-\mathcal {E}_{n})^2}{\hat{H}}_{i_1}|n\rangle \langle n|{\hat{H}}_{i_2}\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_{n}}{\hat{H}}_{i_3}|n\rangle . \end{aligned}$$
(31)

Here, \({\hat{Q}}_n\equiv 1-|n\rangle \langle n|\) is the projector onto the compliment of the space spanned by \(|n\rangle \), \(\text {c.c.}\) represents the complex conjugation of the term right in front, \(\mathcal {S}_{ii_1\dots i_N}\) is the symmetrizing operation among i, \(i_1\), \(\dots \), and \(i_N\), and \(\delta _n{\hat{H}}_i\) is a short-hand notation for \({\hat{H}}_i-\langle n|{\hat{H}}_i|n\rangle \). The expression  (29) for the linear Drude weight has been used in the literature [1, 3, 4], and Eqs. (30) and (31) are its generalizations to the second- and the third order Drude weight. The advantage of these expressions is that the gauge field \(\vec {A}\) is set to be \(\vec {0}\) when diagonalizing the Hamiltonian \({\hat{H}}_0\) to find \(|n\rangle \) and \(\mathcal {E}_n\). As a price to pay, one needs to know all excited states to compute the correlation functions. This is in contrast to the Kohn formula (28) at zero temperature, which uses only the ground state energy as a function of \(\vec {A}\).

3 Kubo Theory

In this section, we derive the concrete expressions of \(\phi _i^{i_1\dots i_N}(t,t_1,\dots ,t_N)\) for \(N=1\), 2 and 3 by following Kubo’s work on the response theory [6]. This formulation naturally leads us to a proof of Eq. (17) on the instantaneous conductivity for a general \(N\ge 1\).

Let us begin by expanding the perturbed Hamiltonian \({\hat{H}}(t,\vec {A}(t))\) as

$$\begin{aligned} {\hat{H}}(t,\vec {A}(t))=\sum _{N=0}^\infty {\hat{H}}_{N}(t), \end{aligned}$$
(32)

where \({\hat{H}}_{N}(t)\) is the correction of all N-th order terms,

$$\begin{aligned}&{\hat{H}}_{N}(t)\equiv \frac{1}{N!}\sum _{i_1,\dots ,i_N}{\hat{H}}_{i_1\dots i_N}(t)\prod _{\ell =1}^NA_{i_\ell }(t). \end{aligned}$$
(33)

The perturbed density matrix \({\hat{\rho }}(t)\) can be accordingly written as

$$\begin{aligned}&{\hat{\rho }}(t)=\sum _{N=0}^\infty {\hat{\rho }}_{N}(t). \end{aligned}$$
(34)

Plugging Eqs. (32) and (34) into the von Neumann equation \(i\partial _t{\hat{\rho }}(t)=[{\hat{H}}(t,\vec {A}(t)),{\hat{\rho }}(t)]\), we find, order by order, that

$$\begin{aligned} \partial _t{\hat{\rho }}_{N}(t)&=\sum _{M=0}^N(-i)[{\hat{H}}_{N-M}(t),{\hat{\rho }}_M(t)]. \end{aligned}$$
(35)

We can solve this equation for \({\hat{\rho }}_N(t)\) by switching to the “interacting picture”

$$\begin{aligned} \hat{{\tilde{O}}}(t)\equiv {\hat{S}}_0^\dagger (t){\hat{O}}(t){\hat{S}}_0(t), \end{aligned}$$
(36)

where \({\hat{S}}_0(t)\) was defined in Eq. (6). Assuming \({\hat{\rho }}_N(0)=0\) for \(N\ge 1\), we find

$$\begin{aligned} \hat{{\tilde{\rho }}}_N(t)&=\sum _{M=0}^{N-1}\int _0^{t} dt'(-i)[\hat{{\tilde{H}}}_{N-M}(t'),\hat{{\tilde{\rho }}}_M(t')]\quad {(N\ge 1)}. \end{aligned}$$
(37)

Since the right-hand side contains only \(\hat{{\tilde{\rho }}}_M(t')\) with \(0\le M\le N-1\), one can find \(\hat{{\tilde{\rho }}}_N(t)\) inductively. For example, we have \(\hat{{\tilde{\rho }}}_0(t)={\hat{\rho }}_0\),

$$\begin{aligned} \hat{{\tilde{\rho }}}_1(t)&=\int _0^{t} dt'(-i)[\hat{{\tilde{H}}}_{1}(t'),{\hat{\rho }}_0], \end{aligned}$$
(38)
$$\begin{aligned} \hat{{\tilde{\rho }}}_2(t)&=\int _0^{t} dt'(-i)[\hat{{\tilde{H}}}_{2}(t'),{\hat{\rho }}_0]+\int _0^{t} dt'\int _0^{t'} dt''(-i)^2[\hat{{\tilde{H}}}_{1}(t'),[\hat{{\tilde{H}}}_{1}(t''),{\hat{\rho }}_0]], \end{aligned}$$
(39)

and

$$\begin{aligned} \hat{{\tilde{\rho }}}_3(t)&=\int _0^{t} dt'(-i)[\hat{{\tilde{H}}}_{3}(t'),{\hat{\rho }}_0]+\int _0^{t} dt'\int _0^{t'} dt''(-i)^2\big ([\hat{{\tilde{H}}}_{2}(t'),[\hat{{\tilde{H}}}_{1}(t''),{\hat{\rho }}_0]]\nonumber \\&\quad +[\hat{{\tilde{H}}}_{1}(t'),[\hat{{\tilde{H}}}_{2}(t''),{\hat{\rho }}_0]\big )\nonumber \\&\quad +\int _0^{t} dt'\int _0^{t'} dt''\int _0^{t''} dt'''(-i)^3[\hat{{\tilde{H}}}_{1}(t'),[\hat{{\tilde{H}}}_{1}(t''),[\hat{{\tilde{H}}}_{1}(t'''),{\hat{\rho }}_0]]]. \end{aligned}$$
(40)

Expressions for higher-order terms can be derived, at least formally, in the same way. However, when expressed in terms of \({\hat{H}}_{M}\) (\(0\le M\le N\)) and \({\hat{\rho }}_0\), the number of terms in \(\hat{{\tilde{\rho }}}_N(t)\) (\(N\ge 1\)) is \(2^{N-1}\), and it is practically not easy to keep track of them altogether. In what follows, we will mainly discuss corrections only up to the third order (\(N=3\)).

The current density operator in Eq. (8) depends on \(\vec {A}(t)\) explicitly, which can also be written as

$$\begin{aligned} {\hat{j}}_i(t,\vec {A}(t))&=\sum _{N=0}^\infty {\hat{j}}_{iN}(t), \end{aligned}$$
(41)
$$\begin{aligned} {\hat{j}}_{iN}(t)&\equiv \frac{1}{V}\sum _{N=0}^\infty \frac{1}{N!}\sum _{i_1,\dots ,i_N}{\hat{H}}_{ii_1\dots i_{N}}(t)\prod _{\ell =1}^{N}A_{i_\ell }(t). \end{aligned}$$
(42)

The N-th order correction to the current expectation value is the sum of \(N+1\) contributions:

$$\begin{aligned} j_{i}^{(N)}(t)=\sum _{M=0}^N\text {tr}\big (\hat{{\tilde{j}}}_{i N-M}(t)\hat{{\tilde{\rho }}}_M(t)\big ), \end{aligned}$$
(43)

which contains \(2^N\) terms in total. Using Eq. (38)–(40), we get

$$\begin{aligned} j_{i}^{(1)}(t)&=\langle \hat{{\tilde{j}}}_{i 1}(t)\rangle _0+\int _0^{t} dt'(-i)\langle [\hat{{\tilde{j}}}_{i 0}(t),\hat{{\tilde{H}}}_{1}(t')]\rangle _0, \end{aligned}$$
(44)
$$\begin{aligned} j_{i}^{(2)}(t)&=\langle \hat{{\tilde{j}}}_{i 2}(t)\rangle _0+\int _0^{t} dt'(-i)\langle [\hat{{\tilde{j}}}_{i 1}(t),\hat{{\tilde{H}}}_{1}(t')]\rangle _0\nonumber \\&\quad +\int _0^{t} dt'(-i)\langle [\hat{{\tilde{j}}}_{i 0}(t),\hat{{\tilde{H}}}_{2}(t')]\rangle _0+\int _0^{t} dt'\int _0^{t'} dt''(-i)^2\langle [[\hat{{\tilde{j}}}_{i 0}(t),\hat{{\tilde{H}}}_{1}(t')],\hat{{\tilde{H}}}_{1}(t'')]\rangle _0, \end{aligned}$$
(45)

and

$$\begin{aligned} j_{i}^{(3)}(t)&=\langle \hat{{\tilde{j}}}_{i 3}(t)\rangle _0+\int _0^{t} dt'(-i)\langle [\hat{{\tilde{j}}}_{i 2}(t),\hat{{\tilde{H}}}_{1}(t')]\rangle _0\nonumber \\&\quad +\int _0^{t} dt'(-i)\langle [\hat{{\tilde{j}}}_{i 1}(t),\hat{{\tilde{H}}}_{2}(t')]\rangle _0+\int _0^{t} dt'\int _0^{t'} dt''(-i)^2\langle [[\hat{{\tilde{j}}}_{i 1}(t),\hat{{\tilde{H}}}_{1}(t')],\hat{{\tilde{H}}}_{1}(t'')]\rangle _0\nonumber \\&\quad +\int _0^{t} dt'(-i)\langle [\hat{{\tilde{j}}}_{i 0}(t),\hat{{\tilde{H}}}_{3}(t')]\rangle _0+\int _0^{t} dt'\int _0^{t'} dt''(-i)^2\big (\langle [[\hat{{\tilde{j}}}_{i 0}(t),\hat{{\tilde{H}}}_{2}(t')],\hat{{\tilde{H}}}_{1}(t'')]\rangle _0\nonumber \\&\quad +\langle [[\hat{{\tilde{j}}}_{i 0}(t),\hat{{\tilde{H}}}_{1}(t')],\hat{{\tilde{H}}}_{2}(t'')]\rangle _0\big )\nonumber \\&\quad +\int _0^{t} dt'\int _0^{t'} dt''\int _0^{t''} dt'''(-i)^3\langle [[[\hat{{\tilde{j}}}_{i 0}(t),\hat{{\tilde{H}}}_{1}(t')],\hat{{\tilde{H}}}_{1}(t'')],\hat{{\tilde{H}}}_{1}(t''')]\rangle _0. \end{aligned}$$
(46)

Here and hereafter, we write

$$\begin{aligned} \langle {\hat{O}}\rangle _0\equiv \text {tr}\big ({\hat{O}}{\hat{\rho }}_0\big ) \end{aligned}$$
(47)

for any operator \({\hat{O}}\). Plugging Eqs. (33) and (42) into these results, we can read off \(\phi _{i}^{i_1\dots i_N}(t,t_1,\dots ,t_N)\):

$$\begin{aligned} V\phi _{i}^{i_1}(t,t_1)&=\delta (t-t_1)\langle \hat{{\tilde{H}}}_{i i_1}(t)\rangle _0\nonumber \\&\quad +\theta (t-t_1)(-i)\langle [\hat{{\tilde{H}}}_{i}(t),\hat{{\tilde{H}}}_{i_1}(t_1)]\rangle _0, \end{aligned}$$
(48)
$$\begin{aligned} V\phi _{i}^{i_1i_2}(t,t_1,t_2)&=\delta (t-t_1)\delta (t-t_2)\langle \hat{{\tilde{H}}}_{ii_1i_2}(t)\rangle _0\nonumber \\&\quad +2\mathcal {S}_{i_1i_2}\delta (t-t_1)\theta (t-t_2)(-i)\langle [\hat{{\tilde{H}}}_{i i_1}(t),\hat{{\tilde{H}}}_{i_2}(t_2)]\rangle _0\nonumber \\&\quad +\delta (t_1-t_2)\theta (t-t_1)(-i)\langle [\hat{{\tilde{H}}}_{i}(t),\hat{{\tilde{H}}}_{i_1i_2}(t_1)]\rangle _0\nonumber \\&\quad +2\mathcal {S}_{i_1i_2}\theta (t-t_1)\theta (t_1-t_2)(-i)^2\langle [[\hat{{\tilde{H}}}_{i}(t),\hat{{\tilde{H}}}_{i_1}(t_1)],\hat{{\tilde{H}}}_{i_2}(t_2)]\rangle _0, \end{aligned}$$
(49)

and

$$\begin{aligned}&V\phi _{i}^{i_1i_2i_3}(t,t_1,t_2,t_3) \nonumber \\&\quad =\delta (t-t_1)\delta (t-t_2)\delta (t-t_3)\langle \hat{{\tilde{H}}}_{i i_1 i_2 i_3}(t)\rangle _0\nonumber \\&\qquad +3\mathcal {S}_{i_1i_2i_3}\delta (t-t_1)\delta (t-t_2)\theta (t-t_3)(-i)\langle [\hat{{\tilde{H}}}_{ii_1i_2}(t),\hat{{\tilde{H}}}_{i_3}(t_3)]\rangle _0\nonumber \\&\qquad +3\mathcal {S}_{i_1i_2i_3}\delta (t-t_1)\delta (t_2-t_3)\theta (t-t_2)(-i)\langle [\hat{{\tilde{H}}}_{i i_1}(t),\hat{{\tilde{H}}}_{i_2i_3}(t_2)]\rangle _0\nonumber \\&\qquad +6\mathcal {S}_{i_1i_2i_3}\delta (t-t_1)\theta (t-t_2)\theta (t_2-t_3)(-i)^2\langle [[\hat{{\tilde{H}}}_{i i_1}(t),\hat{{\tilde{H}}}_{i_2}(t_2)],\hat{{\tilde{H}}}_{i_3}(t_3)]\rangle _0\nonumber \\&\qquad +\delta (t_1-t_2)\delta (t_2-t_3)\theta (t-t_1)(-i)\langle [\hat{{\tilde{H}}}_{i}(t),\hat{{\tilde{H}}}_{i_1i_2i_3}(t_1)]\rangle _0\nonumber \\&\qquad +3\mathcal {S}_{i_1i_2i_3}\delta (t_1-t_2)\theta (t-t_1)\theta (t_1-t_3)(-i)^2\langle [[\hat{{\tilde{H}}}_{i}(t),\hat{{\tilde{H}}}_{i_1i_2}(t_1)],\hat{{\tilde{H}}}_{i_3}(t_3)]\rangle _0\nonumber \\&\qquad +3\mathcal {S}_{i_1i_2i_3}\delta (t_2-t_3)\theta (t-t_1)\theta (t_1-t_2)(-i)^2\langle [[\hat{{\tilde{H}}}_{i}(t),\hat{{\tilde{H}}}_{i_1}(t_1)],\hat{{\tilde{H}}}_{i_2i_3}(t_2)]\rangle _0\nonumber \\&\qquad +6\mathcal {S}_{i_1i_2i_3}\theta (t-t_1)\theta (t_1-t_2)\theta (t_2-t_3)(-i)^3\langle [[[\hat{{\tilde{H}}}_{i}(t),\hat{{\tilde{H}}}_{i_1}(t_1)],\hat{{\tilde{H}}}_{i_2}(t_2)],\hat{{\tilde{H}}}_{i_3}(t_3)]\rangle _0. \end{aligned}$$
(50)

Here, \(\theta (t)\) is the Heaviside step function and \(\mathcal {S}_{i_1\dots i_N}\) refers to the symmetrizing operation averaging over all N! permutations of \((i_\ell ,t_\ell )\)’s. For example,

$$\begin{aligned} \mathcal {S}_{i_1i_2}f_{i_1i_2}(t_1,t_2)&=\frac{1}{2}\big [f_{i_1i_2}(t_1,t_2)+f_{i_2i_1}(t_2,t_1)\big ], \end{aligned}$$
(51)
$$\begin{aligned} \mathcal {S}_{i_1i_2i_3}f_{i_1i_2i_3}(t_1,t_2,t_3)&=\frac{1}{6}\big [f_{i_1i_2i_3}(t_1,t_2,t_3)+f_{i_2i_3i_1}(t_2,t_3,t_1)+f_{i_3i_1i_2}(t_3,t_1,t_2)\nonumber \\&\quad +f_{i_3i_2i_1}(t_3,t_2,t_1)+f_{i_1i_3i_2}(t_1,t_3,t_2)+f_{i_2i_1i_3}(t_2,t_1,t_3)\big ]. \end{aligned}$$
(52)

One can obtain the expression of the corresponding \(\sigma _{i}^{i_1\dots i_N}(t,t_1,\dots ,t_N)\) by using Eq. (12).

Let us switch back to the consideration for a general \(N\ge 1\). In general, \(j_{i}^{(N)}(t)\) (\(N\ge 0\)) contains \(\langle \hat{{\tilde{j}}}_{i N}(t)\rangle _0=\langle \hat{{\tilde{H}}}_{i i_1 \dots i_N}(t)\rangle _0/V\) that originates from the \(M=0\) contribution in Eq. (43). This term results in the instantaneous response

$$\begin{aligned} \phi _{i\,\text {(inst)}}^{i_1\dots i_N}(t,t_1,\dots ,t_N)=\frac{1}{V}\langle \hat{{\tilde{H}}}_{i i_1 \dots i_N}(t)\rangle _0\prod _{\ell =1}^N\delta (t-t_\ell )=\frac{1}{V}\text {tr}\big ({\hat{H}}_{i i_1 \dots i_N}(t){\hat{\rho }}_0(t)\big )\prod _{\ell =1}^N\delta (t-t_\ell ). \end{aligned}$$
(53)

All other contributions to \(j_{i}^{(N)}(t)\) from \(M\ge 1\) in Eq. (43) are retarded effect involving at least one temporal integration that can be traced back to Eq. (37). Comparing Eq. (53) with Eq. (16), we obtain our result in Eq. (17).

4 Direct Proof of the Generalized Kohn Formula

In this section we perform the Fourier transformation to obtain the explicit form of \(\phi _i^{i_1\dots i_N}(\omega _1,\dots ,\omega _N)\) for \(N=1\), 2 and 3. Then we use them to give a perturbative proof of the Kohn formula for the second- and third-order optical conductivity.

4.1 Optical Conductivity and f-Sum Rules

In the presence of the time-translation symmetry, Eqs. (48) and (49) reduce to

$$\begin{aligned} V\phi _i^{i_1}(t_1)&= \delta (t_1)\langle {\hat{H}}_{ii_1}\rangle _0\nonumber \\&\quad +\theta (t_1)(-i)\langle [{\hat{H}}_{i},e^{-i{\hat{H}}_0t_1}{\hat{H}}_{i_1}e^{i{\hat{H}}_0t_1}]\rangle _0 \end{aligned}$$
(54)

and

$$\begin{aligned} V\phi _{i}^{i_1i_2}(t_1,t_2)&=\delta (t_1)\delta (t_2)\langle {\hat{H}}_{ii_1i_2}\rangle _0\nonumber \\&\quad +2\mathcal {S}_{i_1i_2}\delta (t_1)\theta (t_2)(-i)\langle [{\hat{H}}_{ii_1},e^{-i{\hat{H}}_0t_2}{\hat{H}}_{i_2}e^{i{\hat{H}}_0t_2}]\rangle _0\nonumber \\&\quad +\delta (t_2-t_1)\theta (t_1)(-i)\langle [{\hat{H}}_i,e^{-i{\hat{H}}_0t_1}{\hat{H}}_{i_1i_2}e^{i{\hat{H}}_0t_1}]\rangle _0\nonumber \\&\quad +2\mathcal {S}_{i_1i_2}\theta (t_1)\theta (t_2-t_1)(-i)^2\langle [[{\hat{H}}_i,e^{-i{\hat{H}}_0t_1}{\hat{H}}_{i_1}e^{i{\hat{H}}_0t_1}],e^{-i{\hat{H}}_0t_2}{\hat{H}}_{i_2}e^{i{\hat{H}}_0t_2}]\rangle _0. \end{aligned}$$
(55)

We perform the Fourier transformation (23) assuming the form of the density matrix in Eq. (18). We find

$$\begin{aligned} V\phi _i^{i_1}(\omega _1)&=\langle {\hat{H}}_{ii_1}\rangle _0\nonumber \\&\quad +\sum _n\rho _n\langle n|\Big (\delta _n{\hat{H}}_i\frac{{\hat{Q}}_n}{\omega _1-{\hat{H}}_0+\mathcal {E}_{n}+i\eta }\delta _n{\hat{H}}_{i_1}-\delta _n{\hat{H}}_{i_1}\frac{{\hat{Q}}_n}{\omega _1+{\hat{H}}_0-\mathcal {E}_{n}+i\eta }\delta _n{\hat{H}}_{i}\Big )|n\rangle \end{aligned}$$
(56)

and

$$\begin{aligned}&V\phi _{i}^{i_1i_2}(\omega _1,\omega _2) \nonumber \\&\quad =\langle {\hat{H}}_{ii_1i_2}\rangle _0\nonumber \\&\quad \quad +2\mathcal {S}_{i_1i_2}\sum _n\rho _n\langle n|\Big (\delta _n{\hat{H}}_{ii_1}\frac{{\hat{Q}}_n}{\omega _2-{\hat{H}}_0+\mathcal {E}_{n}+i\eta }\delta _n{\hat{H}}_{i_2}-\delta _n{\hat{H}}_{i_2}\frac{{\hat{Q}}_n}{\omega _2+{\hat{H}}_0-\mathcal {E}_{n}+i\eta }\delta _n{\hat{H}}_{ii_1}\Big )|n\rangle \nonumber \\&\quad \quad +\sum _n\rho _n\langle n|\Big (\delta _n{\hat{H}}_i\frac{{\hat{Q}}_n}{\omega _1+\omega _2-{\hat{H}}_0+\mathcal {E}_{n}+2i\eta }\delta _n{\hat{H}}_{i_1i_2}\nonumber \\&\quad \quad -\delta _n{\hat{H}}_{i_1i_2}\frac{{\hat{Q}}_n}{\omega _1+\omega _2+{\hat{H}}_0-\mathcal {E}_{n}+2i\eta }\delta _n{\hat{H}}_{i}\Big )|n\rangle \nonumber \\&\quad \quad +2\mathcal {S}_{i_1i_2}\sum _n\rho _n\Big [\langle n|\delta _n{\hat{H}}_i\frac{{\hat{Q}}_n}{\omega _1+\omega _2-{\hat{H}}_0+\mathcal {E}_{n}+2i\eta }\delta _n{\hat{H}}_{i_1}\frac{{\hat{Q}}_n}{\omega _2-{\hat{H}}_0+\mathcal {E}_{n}+i\eta }\delta _n{\hat{H}}_{i_2}|n\rangle \nonumber \\&\quad \quad -\langle n|\delta _n{\hat{H}}_{i_1}\frac{{\hat{Q}}_n}{\omega _1+{\hat{H}}_0-\mathcal {E}_{n}+i\eta }\delta _n{\hat{H}}_i\frac{{\hat{Q}}_n}{\omega _2-{\hat{H}}_0+\mathcal {E}_{n}+i\eta }\delta _n{\hat{H}}_{i_2}|n\rangle \nonumber \\&\quad \quad +\langle n|\delta _n{\hat{H}}_{i_2}\frac{{\hat{Q}}_n}{\omega _2+{\hat{H}}_0-\mathcal {E}_{n}+i\eta }\delta _n{\hat{H}}_{i_1}\frac{{\hat{Q}}_n}{\omega _1+\omega _2+{\hat{H}}_0-\mathcal {E}_{n}+2i\eta }\delta _n{\hat{H}}_i|n\rangle \Big ]. \end{aligned}$$
(57)

Here,\(\delta _n{\hat{H}}_i\equiv {\hat{H}}_i-\langle n|{\hat{H}}_i|n\rangle \), \({\hat{Q}}_n\equiv 1-|n\rangle \langle n|\) is the projector onto the compliment of the space spanned by \(|n\rangle \), and \(\mathcal {S}_{i_1\dots i_N}\) is the symmetrization among \((i_\ell ,\omega _\ell )\)’s. The corresponding optical conductivity is then given by Eq. (24). The expression of \(\phi _{i}^{i_1i_2i_3}(t_1,t_2,t_3)\) and \(\phi _{i}^{i_1i_2i_3}(\omega _1,\omega _2,\omega _3)\) are too long to be presented here and are included in Appendix A.

More generally, the instantaneous contribution in Eq. (53) gives rise to

$$\begin{aligned}&\sigma _{i\,\text {(inst)}}^{i_1\dots i_N}(\omega _1,\dots ,\omega _N)=\frac{1}{V}\langle {\hat{H}}_{ii_1\dots i_N}\rangle _0\prod _{\ell =1}^N\frac{i}{\omega _\ell +i\eta } \end{aligned}$$
(58)

in the N-th other optical conductivity. All other terms in \(\sigma _i^{i_1\dots i_N}(\omega _1,\dots ,\omega _N)\) are suppressed by \(|\omega _\ell |^{-2}\) for large \(|\omega _\ell |\) for at least one \(1\le \ell \le N\) as a result of the temporal integration in Eq. (37). This observation immediately implies the f-sum rule (25). To see this, let us perform the frequency integration in Eq. (25) using techniques of complex analysis. Except for the instantaneous term in Eq. (58), all other terms in \(\sigma _i^{i_1\dots i_N}(\omega _1,\dots ,\omega _N)\) do not contribute to this integral. This is because, if a term is suppressed by \(|\omega _\ell |^{-2}\) for large \(|\omega _\ell |\), one can form a closed integration path by adding a large half circle in the upper half of the complex \(\omega _\ell \)-plane and apply the Cauchy’s integral theorem. All poles of \(\sigma _i^{i_1\dots i_N}(\omega _1,\dots ,\omega _N)\) are located in the lower-half of the complex plane and thus the integral vanishes. Therefore, taking into account only the contribution from the instantaneous term (58) using the formula

$$\begin{aligned} \int _{-\infty }^{\infty }\frac{d\omega }{2\pi i}\frac{i}{\omega +i\eta }=\theta (0)=\frac{1}{2}, \end{aligned}$$
(59)

we reproduce the f-sum rule (25).

4.2 Drude Weight

Here we derive the Kohn formula based on the concrete expressions of \(\phi _i^{i_1\dots i_N}(\omega _1,\dots ,\omega _N)\) obtained above. For the brevity of the presentation, let us write

$$\begin{aligned} |n_{i_1\dots i_N}\rangle \equiv \frac{\partial ^N|n(\vec {A})\rangle }{\partial A_{i_1}\dots \partial A_{i_N}}\Big |_{\vec {A}=\vec {0}}. \end{aligned}$$
(60)

The linear Drude weight is given by \(\mathcal {D}_i^{i_1}=\phi _i^{i_1}(\omega _1=0)\). Let us make this into the form of the Kohn formula following the discussions in Refs. [3, 4], and [1]. Using the standard formulas

$$\begin{aligned}&\langle n|n_{i_1}\rangle +\text {c.c.}=0, \end{aligned}$$
(61)
$$\begin{aligned}&{\hat{Q}}_n|n_{i_1}\rangle =-\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_n}{\hat{H}}_{i_1}|n\rangle \end{aligned}$$
(62)

in the first-order nondegenerate perturbation theory, we find,

$$\begin{aligned} V\mathcal {D}_i^{i_1}&=V\phi _i^{i_1}(\omega _1=0)\nonumber \\&=\sum _n\rho _n\Big (\langle n| {\hat{H}}_{ii_1}|n\rangle -\langle n|{\hat{H}}_i\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_{n}}{\hat{H}}_{i_1}|n\rangle +\text {c.c.}\Big )\nonumber \\&= \sum _n\rho _n\big (\langle n| {\hat{H}}_{ii_1}|n\rangle +\langle n|{\hat{H}}_i{\hat{Q}}_n|n_{i_1}\rangle +\text {c.c.}\big )\nonumber \\&= \sum _n\rho _n\big (\langle n| {\hat{H}}_{ii_1}|n\rangle +\langle n|{\hat{H}}_i|n_{i_1}\rangle +\text {c.c.}\big )\nonumber \\&=\sum _n\rho _n\frac{\partial }{\partial A_{i_1}}\Big \langle n(\vec {A})\Big |\frac{\partial {\hat{H}}(\vec {A})}{\partial A_{i}}\Big |n(\vec {A})\Big \rangle \Big |_{\vec {A}=\vec {0}}\nonumber \\&=\sum _n\rho _n\frac{\partial ^2\mathcal {E}_n(\vec {A})}{\partial A_i\partial A_{i_1}}\Big |_{\vec {A}=\vec {0}}. \end{aligned}$$
(63)

Here, \(\text {c.c.}\) represents the complex conjugation of the term right in front. Thus Eq. (28) for \(N=1\) is verified. Strictly speaking, this proof applies only to the cases where all energy levels \(\mathcal {E}_n\) with \(\rho _n\ne 0\) are non-degenerate. It was argued in Ref. [4] that the first-order degenerate perturbation theory lifts the degeneracy and the same procedure should work.

Let us do the same for the second-order Drude weight, given by \(\mathcal {D}_i^{i_1i_2}=\phi _i^{i_1i_2}(\omega _1=0,\omega _2=0)\). Knowing the answer, we find it easier to go backward:

$$\begin{aligned}&\sum _n\rho _n\frac{\partial ^3\mathcal {E}_n(\vec {A})}{\partial A_i\partial A_{i_1}\partial A_{i_2}}\Big |_{\vec {A}=\vec {0}}\nonumber \\&\quad =\sum _n\rho _n\frac{\partial ^2}{\partial A_{i_1}\partial A_{i_2}}\Big \langle n(\vec {A})\Big |\frac{\partial {\hat{H}}(\vec {A})}{\partial A_{i}}\Big |n(\vec {A})\Big \rangle \Big |_{\vec {A}=\vec {0}}\nonumber \\&\quad =\mathcal {S}_{i_1i_2}\sum _n\rho _n\big (\langle n| {\hat{H}}_{ii_1i_2}|n\rangle +2\langle n_{i_1}|{\hat{H}}_{i}|n_{i_2}\rangle \big )+\mathcal {S}_{i_1i_2}\sum _n\rho _n\big (2\langle n|{\hat{H}}_{ii_1}|n_{i_2}\rangle +\langle n|{\hat{H}}_i|n_{i_1i_2}\rangle \big )+\text {c.c.}\nonumber \\&\quad =\mathcal {S}_{i_1i_2}\sum _n\rho _n\big (\langle n| {\hat{H}}_{ii_1i_2}|n\rangle +2\langle n_{i_1}|{\hat{Q}}_n\delta _n{\hat{H}}_i{\hat{Q}}_n|n_{i_2}\rangle \big )\nonumber \\&\quad \quad +\mathcal {S}_{i_1i_2}\sum _n\rho _n\big (2\langle n|{\hat{H}}_{ii_1}{\hat{Q}}_n|n_{i_2}\rangle +\langle n|{\hat{H}}_i{\hat{Q}}_n|n_{i_1i_2}\rangle \big )+\text {c.c.}\nonumber \\&\quad \quad +2\mathcal {S}_{i_1i_2}\sum _n\rho _n\langle n_{i_1}|{\hat{Q}}_n{\hat{H}}_i|n\rangle \langle n|n_{i_2}\rangle +\text {c.c.}\nonumber \\&\quad =\mathcal {S}_{ii_1i_2}\sum _n\rho _n\Big [\langle n|{\hat{H}}_{ii_1i_2}|n\rangle -3\langle n|{\hat{H}}_i\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_{n}}{\hat{H}}_{i_1i_2}|n\rangle +\text {c.c.}\nonumber \\&\quad \quad +6\langle n|{\hat{H}}_{i_1}\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_{n}}\delta _n{\hat{H}}_{i}\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_{n}}{\hat{H}}_{i_2}|n\rangle \Big ]\nonumber \\&\quad =V\phi _i^{i_1i_2}(\omega _1=0,\omega _2=0)=V\mathcal {D}_i^{i_1i_2}. \end{aligned}$$
(64)

This reproduces Eq. (28) for \(N=2\). In the derivation, we used the standard formulas in Eqs. (61), (62), and

$$\begin{aligned}&\left( \langle n|n_{i_1i_2}\rangle +\langle n_{i_2}|n_{i_1}\rangle \right) +\text {c.c.}=0, \end{aligned}$$
(65)
$$\begin{aligned}&{\hat{Q}}_n|n_{i_1i_2}\rangle =-\mathcal {S}_{i_1i_2}\left[ \frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_n}{\hat{H}}_{i_1i_2}|n\rangle +2\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_n}\delta _n{\hat{H}}_{i_1}|n_{i_2}\rangle \right] . \end{aligned}$$
(66)

Finally, for the third-order response, we have

$$\begin{aligned}&\sum _n\rho _n\frac{\partial ^4\mathcal {E}_n(\vec {A})}{\partial A_i\partial A_{i_1}\partial A_{i_2}\partial A_{i_3}}\Big |_{\vec {A}=\vec {0}}\nonumber \\&\quad =\sum _n\rho _n\frac{\partial ^3}{\partial A_{i_1}\partial A_{i_2}\partial A_{i_3}}\Big \langle n(\vec {A})\Big |\frac{\partial {\hat{H}}(\vec {A})}{\partial A_{i}}\Big |n(\vec {A})\Big \rangle \Big |_{\vec {A}=\vec {0}}\nonumber \\&\quad =\mathcal {S}_{i_1i_2i_3}\sum _n\rho _n\big (\langle n| {\hat{H}}_{ii_1i_2i_3}|n\rangle +6\langle n_{i_2}|{\hat{H}}_{ii_1}|n_{i_3}\rangle \big )\nonumber \\&\quad \quad +\mathcal {S}_{i_1i_2i_3}\sum _n\rho _n\big (3\langle n|{\hat{H}}_{ii_1i_2}|n_{i_3}\rangle +3\langle n|{\hat{H}}_{ii_1}|n_{i_2i_3}\rangle +3\langle n_{i_1}|{\hat{H}}_{i}|n_{i_2i_3}\rangle +\langle n|{\hat{H}}_{i}|n_{i_1i_2i_3}\rangle \big )+\text {c.c.}\nonumber \\&\quad =\mathcal {S}_{i_1i_2i_3}\sum _n\rho _n\big (\langle n| {\hat{H}}_{ii_1i_2i_3}|n\rangle +6\langle n_{i_2}|{\hat{Q}}_n\delta _n{\hat{H}}_{ii_1}{\hat{Q}}_n|n_{i_3}\rangle \big )\nonumber \\&\quad \quad +\mathcal {S}_{i_1i_2i_3}\sum _n\rho _n\big (3\langle n|{\hat{H}}_{ii_1i_2}{\hat{Q}}_n|n_{i_3}\rangle +3\langle n|{\hat{H}}_{ii_1}{\hat{Q}}_n|n_{i_2i_3}\rangle \nonumber \\&\qquad +3\langle n_{i_1}|{\hat{Q}}_n\delta _n{\hat{H}}_{i}{\hat{Q}}_n|n_{i_2i_3}\rangle +\langle n|{\hat{H}}_{i}{\hat{Q}}_n|n_{i_1i_2i_3}\rangle \big )+\text {c.c.}\nonumber \\&\qquad +\mathcal {S}_{i_1i_2i_3}\sum _n\rho _n\big (6\langle n_{i_2}|n\rangle \langle n|{\hat{H}}_{ii_1}{\hat{Q}}_n|n_{i_3}\rangle +3\langle n_{i_1}|n\rangle \langle n|{\hat{H}}_{i}{\hat{Q}}_n|n_{i_2i_3}\rangle \nonumber \\&\qquad +3\langle n_{i_1}|{\hat{Q}}_n{\hat{H}}_{i}|n\rangle \langle n|n_{i_2i_3}\rangle \big )+\text {c.c.}\nonumber \\&\quad =\mathcal {S}_{ii_1i_2i_3}\sum _n\rho _n\Big [\langle n|{\hat{H}}_{i i_1 i_2 i_3}|n\rangle -6\langle n|{\hat{H}}_{ii_1}\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_n}{\hat{H}}_{i_2i_3}|n\rangle \nonumber \\&\qquad -4\langle n|{\hat{H}}_{i}\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_n}{\hat{H}}_{i_1i_2i_3} |n\rangle +\text {c.c.}\Big ]\nonumber \\&\quad \quad +12\mathcal {S}_{ii_1i_2i_3}\sum _n\rho _n\Big [ \langle n|{\hat{H}}_{i_3}\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_{n}}\delta _n{\hat{H}}_{i_1i_2}\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_{n}}{\hat{H}}_i|n\rangle \nonumber \\&\qquad +\langle n|{\hat{H}}_{i_1i_2}\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_{n}}\delta _n{\hat{H}}_{i}\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_{n}}{\hat{H}}_{i_3}|n\rangle +\text {c.c.}\Big ]\nonumber \\&\quad \quad -24\mathcal {S}_{ii_1i_2i_3}\sum _n\rho _n\langle n|{\hat{H}}_i\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_{n}}\delta _n{\hat{H}}_{i_1}\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_{n}}\delta _n{\hat{H}}_{i_2}\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_{n}}{\hat{H}}_{i_3}|n\rangle \nonumber \\&\quad \quad +24\mathcal {S}_{ii_1i_2i_3}\sum _n\rho _n\langle n|{\hat{H}}_i\frac{{\hat{Q}}_n}{({\hat{H}}_0-\mathcal {E}_{n})^2}{\hat{H}}_{i_1}|n\rangle \langle n|{\hat{H}}_{i_2}\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_{n}}{\hat{H}}_{i_3}|n\rangle \nonumber \\&\quad =V\phi _i^{i_1i_2i_3}(\omega _1=0,\omega _2=0,\omega _3=0)=V\mathcal {D}_{i}^{i_1i_2i_3}. \end{aligned}$$
(67)

In passing to the last line we set \(\omega _1=\omega _2=\omega _3=0\) in Eq. (A2). In the derivation, we used

$$\begin{aligned} \mathcal {S}_{i_1i_2i_3}\left( \langle n|n_{i_1i_2i_3}\rangle +3\langle n_{i_1}|n_{i_2i_3}\rangle \right) +\text {c.c.}=0 \end{aligned}$$
(68)

and

$$\begin{aligned} {\hat{Q}}_n|n_{i_1i_2i_3}\rangle&=\mathcal {S}_{i_1i_2i_3}\Big [-\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_n}{\hat{H}}_{i_1i_2i_3}|n\rangle -3\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_n}\delta _n{\hat{H}}_{i_1i_2}|n_{i_3}\rangle \nonumber \\&\quad -6\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_n}|n_{i_3}\rangle \langle n|{\hat{H}}_{i_1}\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_n}{\hat{H}}_{i_2}|n\rangle -3\frac{{\hat{Q}}_n}{{\hat{H}}_0-\mathcal {E}_n}\delta _n{\hat{H}}_{i_1}|n_{i_2i_3}\rangle \Big ], \end{aligned}$$
(69)

in addition to the above first- and second-order relations.

5 Tight-Binding Model

Let us demonstrate our results with a simple example of a tight-binding model. We diagonalize the unperturbed Hamiltonian as

$$\begin{aligned} {\hat{H}}_0&=\sum _{\varvec{k},n}\varepsilon _{\varvec{k}n}{\hat{\gamma }}_{\varvec{k}n}^\dagger {\hat{\gamma }}_{\varvec{k}n}. \end{aligned}$$
(70)

In this basis, \({\hat{H}}_{i_1\dots i_N}\) in Eq. (7) can be written as

$$\begin{aligned} {\hat{H}}_{i_1\dots i_N}&=\sum _{\varvec{k},m,n}{\hat{\gamma }}_{\varvec{k}m}^\dagger h_{i_1\dots i_N}^{\varvec{k}mn}{\hat{\gamma }}_{\varvec{k}n}. \end{aligned}$$
(71)

The response function \(\phi _i^{i_1}(\omega _1)\) and \(\phi _{i}^{i_1i_2}(\omega _1,\omega _2)\) are then given by

$$\begin{aligned} V\phi _i^{i_1}(\omega _1)&=\sum _{\varvec{k},n<0}h_{ii_1}^{\varvec{k}nn}+\sum _{\varvec{k},n<0,m>0}\left( \frac{h_i^{\varvec{k}nm}h_{i_1}^{\varvec{k}mn}}{\omega _1-(\varepsilon _{\varvec{k}m}-\varepsilon _{\varvec{k}n})+i\eta } -\frac{h_{i_1}^{\varvec{k}nm}h_i^{\varvec{k}mn}}{\omega _1+(\varepsilon _{\varvec{k}m}-\varepsilon _{\varvec{k}n})+i\eta }\right) \end{aligned}$$
(72)

and

$$\begin{aligned}&V\phi _{i}^{i_1i_2}(\omega _1,\omega _2)\nonumber \\&\quad =\sum _{\varvec{k},n<0}h_{ii_1i_2}^{\varvec{k}nn}+2\mathcal {S}_{i_1i_2}\sum _{\varvec{k},n<0,m>0}\left( \frac{h_{ii_1}^{\varvec{k}nm}h_{i_2}^{\varvec{k}mn}}{\omega _2-(\varepsilon _{\varvec{k}m}-\varepsilon _{\varvec{k}n})+i\eta } -\frac{h_{i_2}^{\varvec{k}nm}h_{ii_1}^{\varvec{k}mn}}{\omega _2+(\varepsilon _{\varvec{k}m}-\varepsilon _{\varvec{k}n})+i\eta }\right) \nonumber \\&\quad \quad +\sum _{\varvec{k},n<0,m>0}\left( \frac{h_i^{\varvec{k}nm}h_{i_1i_2}^{\varvec{k}mn}}{\omega _1+\omega _2-(\varepsilon _{\varvec{k}m}-\varepsilon _{\varvec{k}n})+2i\eta } -\frac{h_{i_1i_2}^{\varvec{k}nm}h_{i}^{\varvec{k}mn}}{\omega _1+\omega _2+(\varepsilon _{\varvec{k}m}-\varepsilon _{\varvec{k}n})+2i\eta }\right) \nonumber \\&\quad \quad +2\mathcal {S}_{i_1i_2}\sum _{\varvec{k},n,n'<0,m,m'>0}\frac{h_i^{\varvec{k}nm}(h_{i_1}^{\varvec{k}mm'}\delta _{n,n'}-h_{i_1}^{\varvec{k}n'n}\delta _{m,m'})h_{i_2}^{\varvec{k}m'n'}}{\big [\omega _1+\omega _2-(\varepsilon _{\varvec{k}m}-\varepsilon _{\varvec{k}n})+2i\eta \big ]\big [\omega _2-(\varepsilon _{\varvec{k}m'}-\varepsilon _{\varvec{k}n'})+i\eta \big ]}\nonumber \\&\quad \quad -2\mathcal {S}_{i_1i_2}\sum _{\varvec{k},n,n'<0,m,m'>0}\frac{h_{i_1}^{\varvec{k}nm}(h_i^{\varvec{k}mm'}\delta _{n,n'}-h_i^{\varvec{k}n'n}\delta _{m,m'})h_{i_2}^{\varvec{k}m'n'}}{\big [\omega _1+(\varepsilon _{\varvec{k}m}-\varepsilon _{\varvec{k}n})+i\eta \big ]\big [\omega _2-(\varepsilon _{\varvec{k}m'}-\varepsilon _{\varvec{k}n'})+i\eta \big ]}\nonumber \\&\quad \quad +2\mathcal {S}_{i_1i_2}\sum _{\varvec{k},n,n'<0,m,m'>0}\frac{h_{i_2}^{\varvec{k}nm}(h_{i_1}^{\varvec{k}mm'}\delta _{n,n'}-h_{i_1}^{\varvec{k}n'n}\delta _{m,m'})h_{i}^{\varvec{k}m'n'}}{\big [\omega _2+(\varepsilon _{\varvec{k}m}-\varepsilon _{\varvec{k}n})+i\eta \big ]\big [\omega _1+\omega _2+(\varepsilon _{\varvec{k}m'}-\varepsilon _{\varvec{k}n'})+2i\eta \big ]}. \end{aligned}$$
(73)
Fig. 1
figure 1

The linear and the second-order optical conductivities in the tight-binding model in Eq. (74). a The real-space illustration of the model. b The band structure \(\varepsilon _{nk_x}\) as a function of \(k_x\). The orange part is occupied in the ground state. c \(\sigma _x^{x}(\omega _1)\) as a function of \(\omega _1\in (-3,3)\). The gray curve is the fit by Eq. (75). d The zoom up of (c) for \(\omega _1\in (-0.2,0.2)\). e \(\sigma _x^{xx}(\omega _1,\omega _2)\) as a function of \(\omega _1, \omega _2\in (-3,3)\). f The zoom up of e

We use the following two-band model [16], illustrated in Fig. 1a, in \(d=1\) at zero temperature for the demonstration.

$$\begin{aligned} H_{k_x}=\begin{pmatrix} -\delta -2t_2\sin (k_x+A_x) &{} t_1(e^{iA_x/2}+e^{-i(k_x+A_x/2)})\\ t_1(e^{-iA_x/2}+e^{i(k_x+A_x/2)}) &{} \delta +2t_2\sin (k_x+A_x) \end{pmatrix}. \end{aligned}$$
(74)

This model breaks both the inversion symmetry and the time-reversal symmetry, resulting in a nonzero f-sum for the second-order conductivity. We set \(t_1=0.5\), \(\delta =0.5\), and \(t_2=0.125\). The crystal momentum \(k_x\) takes values \(2\pi i_x/L_x\) with \(i_x=1,2,\cdots L_x\) and \(L_x=501\). When the chemical potential is set to be \(-1\), the system is metallic and partially fills the lower band as shown in Fig. 1b. The optical conductivities \(\sigma _x^{x}(\omega _1)\) and \(\sigma _x^{xx}(\omega _1,\omega _2)\) are computed based on Eqs. (24), (72), and (73) with \(\eta =0.01\) for the range of \(\omega _1\) and \(\omega _2\) in \(|\omega _\ell |<10\). The frequencies are discretized with \(\Delta \omega =1/200\). The obtained optical conductivities as a function of \(\omega \) are shown in Fig. 1c and e.

Table 1 Numerical results for the tight-binding model in Eq. (74)

The Drude weights \(\mathcal {D}_x^x\) and \(\mathcal {D}_x^{xx}\) are then determined by fitting \(\text {Re}[\sigma _x^{x}(\omega _1)]\) and \(\text {Re}[\sigma _x^{xx}(\omega _1,\omega _2)]\) for small \(\omega _\ell \)’s [Fig. 1d and f] by

$$\begin{aligned} \text {Re}[\sigma _x^{x}(\omega _1)]=\frac{\eta }{\omega ^2+\eta ^2}\mathcal {D}_x^x,\quad \text {Re}[\sigma _x^{xx}(\omega _1,\omega _2)]=\frac{\eta ^2-\omega _1\omega _2}{(\omega _1^2+\eta ^2)(\omega _2^2+\eta ^2)}\mathcal {D}_x^{xx}\quad (|\omega _\ell |\ll 1). \end{aligned}$$
(75)

For this fit, we use frequencies in the range \(|\omega _\ell |<0.2\). The Drude weights \(\mathcal {D}_x^x\) and \(\mathcal {D}_x^{xx}\) obtained this way are compared with \(\partial ^2\mathcal {E}_0(A_x)/\partial A_x^2\) and \(\partial ^3\mathcal {E}_0(A_x)/\partial A_x^3\) computed separately. As summarized in Table 1, we observe good agreement, confirming the Kohn formula.

The frequency integration of \(\sigma _x^{x}(\omega _1)\) and \(\sigma _x^{xx}(\omega _1,\omega _2)\) in Eq. (25) are approximated by the Riemannian summation, i.e.,

$$\begin{aligned} \sum _{-10<\omega _1<10}\frac{\Delta \omega }{2\pi }\,\sigma _x^{x}(\omega _1),\quad \sum _{-10<\omega _1,\omega _2<10}\frac{(\Delta \omega )^2}{(2\pi )^2}\,\sigma _x^{xx}(\omega _1,\omega _2). \end{aligned}$$
(76)

The results are compared with \(\langle \partial ^2{\hat{H}}(A_x)/\partial A_x^2\rangle _0\) and \(\langle \partial ^3{\hat{H}}(A_x)/\partial A_x^3\rangle _0\) computed separately. Again, we find that they agree well, verifying the f-sum rule.

6 Conclusion

In this work, we studied the non-linear conductivity \(\sigma _i^{i_1 \dots i_N}(t_1,\dots ,t_N)\) with respect to the spatially uniform electric field. We provided the detailed discussion on the relation between the instantaneous response and the f-sum rule, hinted in Ref. [5]. We also derived explicit expressions of the nonlinear optical conductivity \(\sigma _i^{i_1 \dots i_N}(\omega _1,\dots ,\omega _N)\) of the orders \(N=2\) and 3 for general quantum many-body systems, in terms of correlation functions. Based on the explicit formulas obtained, we proved the nonlinear generalizations of the Kohn formula to the second- and third-order optical conductivity, confirming the results in Ref. [5] derived in a general framework but without explicit expressions of the non-linear conductivities. The obtained non-linear f-sum rules and the Kohn formulas are valid for any finite size, and thus also could be used in the thermodynamic limit with an appropriate procedure. The exact f-sum rules for the finite size systems would be useful for the benchmarking of numerical calculations.

As a demonstration, we applied our formulation to a simple tight-binding model in one dimension, and numerically verified the non-linear f-sum rule and Drude weight of the second order. While this demonstration was done for the non-interacting electrons, we emphasize that our formulation and results of the present paper is applicable to very general quantum many-body systems with interactions.

Since our discussion did not assume the translation invariance, both the generalized f-sum rules and the Kohn formulas should be valid even in the presence of disorder. For example, in the localized phases, all energy levels are insensitive to the the applied vector potential because the effect of vector potential can be converted to the twist of the boundary condition by a local gauge transformation. (In other words, only extended states can be affected by the vector potential.) This is consistent with our expectation that both the linear and nonlinear Drude weight should vanish in the localized phases. This discussion suggests the presence of residual Drude peak in weakly disordered phases. We leave the detailed investigation of such possibility as a future work.