The so-called First Principle of thermodynamics is just another way of expressing the universal conservation of energy, by stating that mechanical work and heat are equivalent forms of energy. However, the first definite statement of this eminently physical principle came in fact from medicine. Around 1840, Julius Robert Mayer, then a physician in Java, deduced the mechanical equivalent of heat after observing differences in venous blood color, which he attributed to different oxygen concentrations, and hence to different amounts of heat produced by the body. His empirical calculations led him to a value of 3.58 J/cal, not too far from the more accurate value of 4.16 J/cal measured by James Joule just a few years later by means of calorimetry experiments. The idea that “something” should be conserved during transformations involving mechanical and thermal processes had been around at least since James Watt’s work in the late XVIII century, and according to some science historians even earlier. In fact, Leibniz had already observed around 1692 that the *vis viva* (what we call kinetic energy) is not conserved in an inelastic collision, and that the apparent loss of this quantity is caused by its distribution among the small elements of the impacting bodies; further, in his *Essay on Dynamics* (1695) he wrote a sort of “Delphic” statement that may already resemble a principle of energy conservation: “There is neither more nor less power in an effect than in its cause.” The only problem: no one had a notion of energy in the year 1695.

For some 40 years, around the transition from the XVIII to the XIX century, Watt’s steam engine was the main industrial device for converting thermal power into mechanical work. Joule went one step further in 1845, while demonstrating the equivalence of mechanical work and heat in his experiments with the “paddle-wheel apparatus”. Apparently, he asked himself what was happening to the exchanged quantities between the time when work was added to the water and the time it was extracted as heat. His logic led him to the idea that an intermediate step must take place within the water, in some other form, related to the internal forces between the water particles. At the end of his paper, published in the *Philosophical Magazine* (26 S.3 p. 369), Joule suggests that the internal motion of “atoms” is at the origin of temperature, while the internal forces give rise to pressure, and he even seems to hint at possible energy transfers at fixed frequencies between matter and radiation: “With regard to the detail of the theory, much uncertainty at present exists. […] Most phenomena may be explained […] by the discovery of Faraday that each atomic element is associated with the same absolute quantity of electricity. Let us suppose that these ‘atmospheres of electricity’ revolve with vast velocity round their respective atoms; and that the velocity of rotation determines what we call temperature. […] The centrifugal force of the revolving atmospheres is then the sole cause of expansion, upon the removal of pressure. […] In order to apply it to radiation, we have only to admit that the revolving atmospheres of electricity possess the power of exciting isochronal undulations in the ether.”

It was Rudolf Clausius (a man with a strange beard and piercing eyes) who first wrote down the First Principle in a form similar to the one we learn today in first-year physics. In his 1850 communication to the Academy of Berlin (later collected in the *Poggendorfs Annalen*, and reprinted in *Phil. Mag.* **102** (1851) 1) he recognized the need to introduce a new state variable that would account for the “sensible heat” (kinetic energy) and the heat for “interior work”, but he did not name it. One year later, Kelvin called it “mechanical energy”, and later “intrinsic energy”. In 1865, and not without some hesitation, Clausius began calling his state function “energy”, and established that it can only depend on the temperature and no other quantity, such as density or pressure. In 1882 it was named *internal energy* by Helmholtz. If one considers only adiabatic processes, so that heat can be ignored, the concept of internal energy would hardly arise or be needed. However, for real systems exchanging both heat and work, the presence of internal forces provides the necessary link.

Once internal forces were a reality also in thermodynamics, Clausius made another key finding. A special property of physical systems governed by purely central forces can be obtained in a very general form, the *virial theorem*. This theorem differs peculiarly in character from the other theorems usually derived and discussed in mathematical physics, as it is of a *statistical* nature. In other words, it concerns time averages of mechanical quantities. In mechanics we are all familiar with the identity between the total force vector $\mathbf{F}_i$ acting on a particle and its momentum derivative, $\mathbf{F}_i = d\mathbf{p}_i/dt$. Consider the instantaneous quantity $W = \mathbf{p}_i \cdot \mathbf{r}_i$, obtained by summing the scalar product of momentum and position over all the particles in the system. Now, let us take its time derivative, which reads $dW/dt = \mathbf{p}_i \cdot (d\mathbf{r}_i/dt) + (d\mathbf{p}_i/dt) \cdot \mathbf{r}_i$ (sums over all the $i$ particles implied). We then observe that $\mathbf{p}_i = m\,(d\mathbf{r}_i/dt)$ and $\mathbf{v}_i = d\mathbf{r}_i/dt$, so that the first term is the sum of the $m v_i^2$ over all particles, that is, twice the kinetic energy, $2K$, of the system; therefore, we have that $dW/dt = \mathbf{F}_i \cdot \mathbf{r}_i + 2K$. The peculiar observation that turns this identity into a “theorem” is that, for a periodic motion, the integral of $dW/dt$ over a period $\tau$ vanishes, $[W(\tau) - W(0)]/\tau = 0$, so that the time average gives $\langle K \rangle = -\tfrac{1}{2} \langle \mathbf{F}_i \cdot \mathbf{r}_i \rangle$. The right-hand side is the “virial”, as defined by Clausius. But what if the motion is not periodic? In that case, we must hope that positions and velocities in our system do not diverge to infinity. If that is the case (and very often it is), $W$ stays bounded and the ratio $[W(\tau) - W(0)]/\tau$ can be made to vanish as well, for a sufficiently long $\tau$.
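The non-periodic but bounded case can be checked numerically. The following is only a minimal sketch under our own assumptions, not part of Clausius' argument: a single particle of unit mass in an anisotropic harmonic well with incommensurate frequencies (so the orbit never closes), integrated with the velocity-Verlet scheme. The function name `virial_check`, the spring constants and the initial conditions are illustrative choices.

```python
import numpy as np

def virial_check(n_steps=100_000, dt=2e-3):
    """Integrate one particle in U = (kx*x**2 + ky*y**2)/2 and return
    <K>, -1/2 <F.r> and the residual [W(tau) - W(0)]/tau."""
    m = 1.0
    k = np.array([1.0, np.sqrt(2.0)])  # assumed spring constants; frequency ratio is irrational
    r = np.array([1.0, 0.5])           # assumed initial position
    v = np.array([0.0, 0.3])           # assumed initial velocity
    f = -k * r                         # force at the starting point
    w0 = m * v @ r                     # W = p.r at t = 0
    k_sum = 0.0
    fr_sum = 0.0
    for _ in range(n_steps):
        # velocity-Verlet step
        v_half = v + 0.5 * dt * f / m
        r = r + dt * v_half
        f = -k * r
        v = v_half + 0.5 * dt * f / m
        # accumulate the two quantities whose time averages are compared
        k_sum += 0.5 * m * v @ v
        fr_sum += f @ r
    tau = n_steps * dt
    w_tau = m * v @ r
    return k_sum / n_steps, -0.5 * fr_sum / n_steps, (w_tau - w0) / tau

if __name__ == "__main__":
    k_avg, virial_avg, residual = virial_check()
    print(k_avg, virial_avg, residual)
```

The two averages should differ by half the residual, which shrinks as 1/τ because $W$ stays bounded while τ grows; lengthening the run (larger `n_steps`) tightens the agreement.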

This theorem has some important implications, first of all in kinetic theory. If we consider an ideal gas enclosed in a volume $V$ at temperature $T$, its kinetic energy at equilibrium is given by the equipartition law, $K = \tfrac{3}{2} RT$. On the other hand, in the absence of any collisions and interactions, the only forces on the particles come from the rebounds against the walls of the volume, where only the force components aligned with the normal $\mathbf{n}$ to the surface make up the gas pressure $P$. It is then straightforward to see that the integral of $\mathbf{n} \cdot \mathbf{r}_i$ over the surface is the same as the integral of $\nabla \cdot \mathbf{r}_i$ over the volume (Gauss’ theorem), and equal to $3V$. Hence, the virial is equal to $\tfrac{3}{2} PV$, and by equating it to $K$ the ideal gas law $PV = RT$ is recovered. All these considerations are exactly formulated, already in such very modern terms, in Clausius’ 1870 paper (*Phil. Mag.* **40** S.4, p. 122). The great value of the virial theorem in the framework of kinetic theory lies mostly in the fact that, when considering *real* gases, the forces between particles *do give* a contribution, and the calculation of $\mathbf{F}_i \cdot \mathbf{r}_i$ provides the deviation from the ideal gas. It was Kamerlingh Onnes, in 1902, who first obtained the corrections to the ideal gas behavior in the form of a “virial expansion” (*Proc. Kon. Ned. Ac. Wet.* **4**, 125), by expressing the ratio $PV/RT$ as a power series in the density, with the coefficients of the linear, quadratic, etc. terms being called the “virial coefficients”.
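Written out step by step, the ideal-gas argument and the virial expansion may be sketched as follows. The sign convention (the wall pushes the gas inward with a force $-P\,\mathbf{n}\,dS$ on each surface element) and the symbols $\rho$ for the molar density and $B_2$, $B_3$ for the virial coefficients are our own labels, not taken from the papers cited above.

```latex
\begin{align}
\Big\langle \sum_i \mathbf{F}_i \cdot \mathbf{r}_i \Big\rangle
  &= -P \oint_S \mathbf{n} \cdot \mathbf{r}\, dS
   = -P \int_V \nabla \cdot \mathbf{r}\, dV
   = -3PV, \\
\langle K \rangle
  &= -\tfrac{1}{2} \Big\langle \sum_i \mathbf{F}_i \cdot \mathbf{r}_i \Big\rangle
   = \tfrac{3}{2} PV = \tfrac{3}{2} RT
   \quad\Longrightarrow\quad PV = RT, \\
\frac{PV}{RT}
  &= 1 + B_2(T)\,\rho + B_3(T)\,\rho^2 + \dots
\end{align}
```

The first line uses $\nabla \cdot \mathbf{r} = 3$; the last line reduces to the ideal gas law when all the coefficients vanish, which is the case when the interparticle forces give no contribution to the virial.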

In our molecular dynamics simulations of atomistic systems, the direct calculation of $\mathbf{F}_i \cdot \mathbf{r}_i$ is routinely done, and gives as a useful byproduct the local stress tensor around a microscopic particle. The calculation can be carried out even for forces that are not purely central, although in that case some subtle problems do arise. This is due to the fact that the total force acting on a particle cannot be *uniquely* decomposed into individual terms contributed by each of the other particles. While this may not be important for the total force, which in the end is independent of how it is broken down into separate contributions, it affects the definition of the stress tensor. Moreover, forces beyond four-point interactions, such as the tensor force in nuclear physics, or the dihedral forces in proteins, cannot be decomposed into individual contributions without violating linear and angular momentum conservation, since for a group of five or more particles the number of internal degrees of freedom ($3N-6$) no longer matches the number of pair distances $N(N-1)/2$ that can be formed among them (already for $N=5$, nine degrees of freedom against ten distances).
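For the simple case of central pair forces, where the decomposition is unambiguous, the calculation can be sketched as below. This is only an illustration under stated assumptions: a Lennard-Jones pair force in reduced units, an isolated cluster without periodic boundaries, and hypothetical helper names (`lj_pair_force`, `virials`). It shows that the per-particle sum of $\mathbf{F}_i \cdot \mathbf{r}_i$ coincides with the sum over pairs, and accumulates the pair tensor whose trace is the virial; turning that tensor into a stress would further require a volume and a kinetic term, which are left out here.

```python
import numpy as np

def lj_pair_force(rij, eps=1.0, sigma=1.0):
    """Force on particle i due to particle j for a Lennard-Jones pair,
    with rij = r_i - r_j (reduced units, assumed for illustration)."""
    d2 = rij @ rij
    s6 = (sigma**2 / d2) ** 3
    return 24.0 * eps * (2.0 * s6**2 - s6) / d2 * rij

def virials(pos):
    """Return (sum_i F_i.r_i, sum_{i<j} f_ij.r_ij, sum_{i<j} r_ij (x) f_ij)."""
    n = len(pos)
    forces = np.zeros_like(pos)
    pair_virial = 0.0
    pair_tensor = np.zeros((3, 3))
    for i in range(n - 1):
        for j in range(i + 1, n):
            rij = pos[i] - pos[j]
            fij = lj_pair_force(rij)
            forces[i] += fij            # Newton's third law: equal and
            forces[j] -= fij            # opposite force on the partner
            pair_virial += fij @ rij
            pair_tensor += np.outer(rij, fij)
    atom_virial = np.sum(forces * pos)  # sum over particles of F_i . r_i
    return atom_virial, pair_virial, pair_tensor

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # eight particles on a slightly perturbed 2x2x2 cubic arrangement
    grid = np.array([[i, j, k] for i in range(2)
                     for j in range(2) for k in range(2)], dtype=float)
    pos = 1.2 * grid + 0.05 * rng.standard_normal(grid.shape)
    print(virials(pos)[:2])
```

Because the pair forces sum to zero, the per-particle form does not depend on the choice of origin for an isolated cluster; with periodic boundaries only the pair form, built on relative positions, remains well defined.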

The virial theorem has another subtle implication, for the rather common case in classical physics when the forces can be derived from a potential energy $U$. In that case, each $\mathbf{F}_i$ can be replaced by $-(dU/d\mathbf{r}_i)$, oriented along $\mathbf{r}_i$. If the potential is a power law of the particle distance, $U = A r^n$, the product $\mathbf{F}_i \cdot \mathbf{r}_i = -(dU/d\mathbf{r}_i) \cdot \mathbf{r}_i = -(A n r^{n-1})\, r = -nU$, and the virial theorem now reads $\langle K \rangle = \tfrac{n}{2} \langle U \rangle$, which for the case of, e.g., Coulomb or gravitational forces with $n = -1$ turns into $\langle K \rangle = -\tfrac{1}{2} \langle U \rangle$. That looks like an amazing and bold statement: the kinetic energy of a system is equal to (minus) half of its potential energy. So much from so little… Remember, we are talking about a theorem, so this statement must hold *universally*, for the systems to which it applies. Vladimir Fock extended the proof of the virial theorem to the realm of quantum mechanics (*Zeit. Phys.* **63** (1930) 855). He replaced the classical expression $W = \mathbf{p}_n \cdot \mathbf{r}_n$ by the commutator of the system Hamiltonian $H$ with the quantity $Q = P_n X_n$, that is, the sum over all the $n$ particles of the product of the corresponding momentum and position operators. This amounts to $(2\pi i/h)\,[H, Q] = 2T - X_n\,(dU/dX_n)$; then, noting that the left-hand side is just $dQ/dt$ in the Heisenberg representation, whose average vanishes for a periodic or a bounded motion, the quantum equivalent of the virial theorem is obtained. An interesting difference arises when considering relativistic systems, for which the kinetic energy is no longer simply proportional to $v^2$ but to $(\gamma - 1)$, with $\gamma$ the Lorentz factor: a simple calculation gives a velocity-dependent relation $\langle \alpha K \rangle = -\tfrac{1}{2} \langle U \rangle$, with the factor $\alpha = (\gamma + 1)/2\gamma$ varying between 1 for classical systems and 1/2 for ultra-relativistic systems.
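The gravitational case $n = -1$ lends itself to a quick numerical check. The sketch below makes its own assumptions: a single test particle on a bound planar orbit in the potential $U = -1/r$, in units where the product of the gravitational constant and the central mass, the particle mass and the semi-major axis are all equal to 1, so that the orbital period is $2\pi$. The function name `kepler_averages` and the default eccentricity are illustrative.

```python
import numpy as np

def kepler_averages(ecc=0.5, n_steps=20_000):
    """Average K and U over one period of a bound orbit in U = -1/r
    (assumed units: G*M = m = a = 1), using velocity Verlet."""
    period = 2.0 * np.pi                      # Kepler's third law for a = 1
    dt = period / n_steps
    r = np.array([1.0 - ecc, 0.0])            # start at pericentre
    v = np.array([0.0, np.sqrt((1.0 + ecc) / (1.0 - ecc))])
    a = -r / np.linalg.norm(r) ** 3           # acceleration from U = -1/r
    k_sum = 0.0
    u_sum = 0.0
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * a
        r = r + dt * v_half
        a = -r / np.linalg.norm(r) ** 3
        v = v_half + 0.5 * dt * a
        k_sum += 0.5 * v @ v
        u_sum += -1.0 / np.linalg.norm(r)
    return k_sum / n_steps, u_sum / n_steps

if __name__ == "__main__":
    k_avg, u_avg = kepler_averages()
    print(k_avg, u_avg, k_avg + 0.5 * u_avg)
```

In these units the analytic averages are 1/2 for the kinetic and -1 for the potential energy, whatever the eccentricity, so the last printed number should be close to zero within the discretization error. At any single instant along an eccentric orbit the two energies do not satisfy the relation; only their time averages do.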

The Swiss scientist Fritz Zwicky (actually born in Bulgaria to a Swiss father and a Czech mother) was a very peculiar character. He coined the term *supernova* (Baade and Zwicky, *PNAS* **20** (1934) 254) and discovered more than 100 examples of the kind; he introduced the concept of neutron star (*Astrophys. J.* **88** (1938) 522) well before Oppenheimer’s famous paper on the subject; and while discussing the methods to measure the masses of distant galaxies (nebulae, according to the old-style nomenclature), he introduced for the first time the virial theorem in the domain of astrophysics (*Helv. Phys. Acta* **6** (1933) 110). By rearranging the theorem *K* = –*U*/2 and using the definition of gravitational potential energy *U* = –*GM*^{2}/*R* for a self-gravitating, isolated body in the universe, the key information about its mass *M* can be obtained. The result is *M* = *v*^{2}*R*/*G*, where *v* is the mean velocity of stars in the galaxy, combining the rotation speed and velocity dispersion, *G* is Newton’s gravitational constant and *R* is the effective radius (size) of the galaxy. This equation is extremely important, as it relates two observable properties of galaxies (velocity dispersion and effective half-light radius) to a fundamental but unobservable property, the mass of the galaxy. Consequently, the virial theorem forms the root of many galaxy scaling relations. The comparison of mass estimates based on the virial theorem with estimates based on the luminosities of galaxies is today one of the main techniques used by astronomers to detect the presence of dark matter in galaxies and clusters of galaxies. It may surprise you that it was the very same Zwicky, in the very same 1933 paper on the virial theorem, who first introduced the concept of dark matter, or *dunkle Materie*. He noticed that in order to obtain an average Doppler shift corresponding to a velocity of 1000 km/s or more, as observed, the average density of stars in the *Coma* cluster would have to be at least 400 times greater than what is derived on the basis of observations of luminosity. If this were to be verified, he wrote, it would “*das überraschende Resultat ergeben, dass dunkle Materie in sehr viel grösserer Dichte vorhanden ist als leuchtende Materie*”: the surprising result would then follow that *dark matter* is present in very much greater density than luminous matter.
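To get a feeling for the orders of magnitude, the estimate *M* = *v*^{2}*R*/*G* can be evaluated in a few lines. This is a rough sketch, not Zwicky's own calculation: the velocity of 1000 km/s is the figure quoted above, while the radius of 1 Mpc is purely an illustrative assumption of ours, and numerical prefactors of order unity (which depend on the mass distribution) are ignored, as in the formula itself. The function name `virial_mass` is hypothetical.

```python
G = 6.674e-11      # Newton's constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30   # solar mass, kg
MPC = 3.0857e22    # one megaparsec, m

def virial_mass(v_kms, radius_mpc):
    """Virial mass estimate M = v^2 R / G, returned in solar masses.

    v_kms      : characteristic velocity in km/s
    radius_mpc : effective radius in megaparsecs (assumed value, see text)
    """
    v = v_kms * 1.0e3          # km/s -> m/s
    r = radius_mpc * MPC       # Mpc  -> m
    return v**2 * r / G / M_SUN

if __name__ == "__main__":
    # 1000 km/s as quoted in the text; 1 Mpc is an illustrative assumption
    print(f"{virial_mass(1000.0, 1.0):.2e} solar masses")
```

With these inputs the estimate comes out at a few times 10^14 solar masses. Since the mass scales with the square of the velocity, doubling the measured velocity quadruples the inferred mass, which is why precise Doppler measurements matter so much in this kind of argument.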

So much you can get from such a (deceptively) simple statement as *K*= – 1/2 *U.*