Calculating the Volume of a Sphere

In this lesson, we'll use the concept of a definite integral to calculate the volume of a sphere. First, we'll find the volume of a hemisphere by taking the infinite sum of infinitesimally skinny cylinders enclosed inside of the hemisphere. Then we'll multiply our answer by two and we'll be done.

Nuclear Fusion Engines

In this lesson, we'll give a friendly introduction to what nuclear fusion is and how it might be used by spacefaring civilizations.

Finding the integral of \(kx^n\)

In this lesson, we'll find the integral of any arbitrary function \(kx^n\) where \(k\) and \(n\) are any finite numbers such that \(n≠-1\).

Capacitance

Capacitance is defined as the amount of charge stored in a capacitor per volt across the capacitor. The value of the capacitance is a measure of how much charge (and therefore how much electric potential energy) a capacitor stores as it is charged up to a given voltage.

Calculating the amount of Electric Potential Energy Stored in a Capacitor

In this lesson, we'll determine the electric potential difference (also called voltage) across any arbitrary capacitor.

Quantum Mechanics: Math Interlude

Complex numbers

Any ket vector \(|\psi⟩\) can be multiplied by a number \(z\) (which, in general can be real or complex) to produce a new vector \(|\phi⟩\):

$$z|\psi⟩=|\phi⟩.$$

In general, \(z=x+iy\). Sometimes the number \(z\) will just be a real number with no imaginary part, but in general it can have both a real and an imaginary part. The complex conjugate of the number \(z\) is represented by \(z^*\) and is obtained by changing the sign of the imaginary part of \(z\); so \(z^*=x-iy\). Let’s look at some examples of complex numbers and their complex conjugates. Suppose that \(z\) is any real number with no imaginary part: \(z=x+(i·0)=x\). The complex conjugate of any real number is \(z^*=x-(i·0)=x\). In other words, taking the complex conjugate \(z^*\) of any real number \(z=x\) just gives the real number back. Suppose, however, that \(z=x+iy\) is any complex number and we take the complex conjugate twice. Let’s see what we get. \(z^*\) is just \(z^*=x-iy\) (a new complex number). If we take the “complex conjugate of the complex conjugate,” we have \((z^*)^*=x+iy=z\). For any complex number \(z\), \((z^*)^*=z\).

If we multiply any complex number \(z\) by its complex conjugate \(z^*\), we’ll get

$$zz^*=(x+iy)(x-iy)=x^2-ixy+ixy-i^2y^2=x^2+y^2.$$

Figure 1: Any complex number \(z\) can be represented in either Cartesian coordinates as \(x+iy\) or in polar coordinates as \(re^{iθ}\).

For any complex number \(z\) the product \(z^*z\) is always a real number that is greater than or equal to zero. This product is called the modulus squared and, in a very rough sense, represents the length squared of a vector in a complex space. We can write the modulus squared as \(|z|^2=zz^*\). From Figure 1 we can also see that any complex number can be represented by \(z=x+iy=r\cosθ+ir\sinθ=re^{iθ}\). The complex conjugate of this is \(z^*=x-iy=r\cosθ-ir\sinθ=re^{-iθ}\). The modulus squared of any vector in the complex plane is given by \(|z|^2=zz^*=(re^{iθ})(re^{-iθ})=r^2\). If \(|z|^2=r^2=1\) and hence \(|z|=r=1\), then the magnitude of the complex vector is 1 and the vector is called normalized.
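
To make these relations concrete, here is a quick numerical check (a minimal sketch using Python's built-in cmath module; the particular value \(z=3+4i\) is just an illustration):

```python
# A quick numerical check of the statements above, using only cmath.
import cmath

z = 3 + 4j                       # z = x + iy with x = 3, y = 4
z_conj = z.conjugate()           # z* = x - iy = 3 - 4j

print(z_conj.conjugate() == z)   # (z*)* = z  -> True
print((z * z_conj).real)         # zz* = x^2 + y^2 = 25.0 (imaginary part is 0)

# Polar form: z = r e^{i theta}
r, theta = cmath.polar(z)                           # r = |z| = 5.0
print(abs(r * cmath.exp(1j * theta) - z) < 1e-12)   # reconstructs z -> True
print(abs(r**2 - (z * z_conj).real) < 1e-12)        # |z|^2 = zz*  -> True
```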


Column and row vectors and matrices

Any vector \(|A⟩\) can be expressed as a column vector: \(\begin{bmatrix}A_1 \\⋮ \\A_N\end{bmatrix}\). To multiply \(|A⟩\) by any number \(z\) we simply multiply each of the components of the column vector by \(z\) to get \(z|A⟩=z\begin{bmatrix}A_1 \\⋮ \\A_N\end{bmatrix}=\begin{bmatrix}zA_1 \\⋮ \\zA_N\end{bmatrix}\). We can add two complex vectors \(|A⟩\) and \(|B⟩\) to get a new complex vector \(|C⟩\). Each of the new components of \(|C⟩\) is obtained by adding the components of \(|A⟩\) and \(|B⟩\) to get \(|A⟩+|B⟩=\begin{bmatrix}A_1 \\⋮ \\A_N\end{bmatrix}+\begin{bmatrix}B_1 \\⋮ \\B_N\end{bmatrix}=\begin{bmatrix}A_1+B_1 \\⋮ \\A_N+B_N\end{bmatrix}=|C⟩\). For every ket vector \(|A⟩=\begin{bmatrix}A_1 \\⋮ \\A_N\end{bmatrix}\) there is an associated bra vector which is the complex conjugate of \(|A⟩\) and is given by \(⟨A|=\begin{bmatrix}A_1^* &... &A_N^*\end{bmatrix}\). The inner product between any two vectors \(|A⟩\) and \(|B⟩\) is written as \(⟨B|A⟩\). The outer product between any two vectors \(|A⟩\) and \(|B⟩\) is written as \(|A⟩⟨B|\). The rule for taking the inner product between any two such vectors is

$$⟨B|A⟩=\begin{bmatrix}B_1^* &... &B_N^*\end{bmatrix}\begin{bmatrix}A_1 \\⋮ \\A_N\end{bmatrix}=B_1^*A_1\text{+ ... +}B_N^*A_N.$$

Whenever you take the inner product of a vector \(|A⟩\) with itself you get

$$⟨A|A⟩=\begin{bmatrix}A_1^* &... &A_N^*\end{bmatrix}\begin{bmatrix}A_1 \\⋮ \\A_N\end{bmatrix}=A_1^*A_1\text{+ ... +}A_N^*A_N.$$

We learned earlier that the product between any number \(z\) (which can be a real number but is in general a complex number) and its complex conjugate \(z^*\) (written as \(zz^*\)) is always a real number that is greater than or equal to zero. This means that each of the terms \(A_i^*A_i\) is greater than or equal to zero and, therefore, \(⟨A|A⟩\) will always equal a real number greater than or equal to zero.
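
Here is a minimal sketch of these inner-product rules in NumPy (the two-component vectors are arbitrary illustrations; np.vdot conjugates its first argument, which matches the bra):

```python
import numpy as np

A = np.array([1 + 2j, 3 - 1j])   # components of the ket |A>
B = np.array([0 + 1j, 2 + 2j])   # components of the ket |B>

# <B|A> = sum_i B_i^* A_i
inner_BA = np.vdot(B, A)
print(inner_BA)

# <A|A> = sum_i A_i^* A_i is real and >= 0
inner_AA = np.vdot(A, A)
print(inner_AA)                                             # (15+0j)
print(inner_AA.real >= 0 and abs(inner_AA.imag) < 1e-12)    # True
```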

Suppose we have some arbitrary matrix \(\textbf{M}\) whose elements are given by

$$\textbf{M}=\begin{bmatrix}m_{11} &... &m_{1N} \\⋮ &\ddots &⋮ \\m_{N1} &... &m_{NN}\end{bmatrix}.$$

To find the transpose of this matrix (written as \(\textbf{M}^T\)) we interchange the order of the two lower indices of each element. (Another way of thinking about it is that we “reflect” each element about the diagonal.) When we do this we get

$$\textbf{M}^T=\begin{bmatrix}m_{11} &... &m_{N1} \\⋮ &\ddots &⋮ \\m_{1N} &... &m_{NN}\end{bmatrix}.$$

The Hermitian conjugate of a matrix (represented by \(\textbf{M}^†\)) is obtained by first taking the transpose of the matrix and then taking the complex conjugate of each element to get

$$\textbf{M}^†=\begin{bmatrix}m_{11}^* &... &m_{N1}^* \\⋮ &\ddots &⋮ \\m_{1N}^* &... &m_{NN}^*\end{bmatrix}.$$
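
A short sketch of the transpose and Hermitian conjugate in NumPy (the matrix entries are arbitrary):

```python
import numpy as np

M = np.array([[1 + 1j, 2 - 3j],
              [0 + 4j, 5 + 0j]])

M_T = M.T              # transpose: reflect each entry about the diagonal
M_dag = M.conj().T     # Hermitian conjugate: transpose, then complex-conjugate each entry

print(M_dag)
# A matrix is Hermitian when M equals its Hermitian conjugate; this M is not:
print(np.allclose(M, M_dag))   # False
```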

We represent observables/measurables as linear Hermitian operators. In our electron spin example, the observable/measurable is given by the linear Hermitian operator \(\hat{σ}_r\).


This article is licensed under a CC BY-NC-SA 4.0 license.


Quantum Dynamics

In many situations, during the time interval in between a particle's initial and final states, the force acting on that particle can vary with time in a very complicated way where the details of the force over that time interval are unknown. The concept of energy enables us to think about systems which have very complicated forces acting on them. A particle could have a force acting on it described by a very general force function \(\vec{F}(\vec{r},t)\); but by thinking about the energy of that particle we can understand the effects \(\vec{F}(\vec{r},t)\) has on it.

In Newtonian mechanics, for any force function \(\vec{F}(\vec{r},t)\), the force is related to potential energy according to the equations

$$F_x=-\frac{∂V(\vec{r})}{∂x}$$

$$F_y=-\frac{∂V(\vec{r})}{∂y}$$

$$F_z=-\frac{∂V(\vec{r})}{∂z}.\tag{1}$$

We can express Equations (1) in terms of Newton's second law:

$$-\frac{∂V(\vec{r})}{∂x}=\frac{dp_x}{dt}$$

$$-\frac{∂V(\vec{r})}{∂y}=\frac{dp_y}{dt}$$

$$-\frac{∂V(\vec{r})}{∂z}=\frac{dp_z}{dt}.\tag{2}$$

Equations (2) represent the laws of dynamics governing any particle with potential energy \(V(\vec{r})\) due to the force \(\vec{F}(\vec{r})\).

According to de Broglie's equation \(p=\frac{h}{λ}\), the bigger the momentum of an object, the smaller the wavelength of its wavefunction. Particles which are very massive have a lot of momentum, and the wavefunction \(\psi(x,t)\) associated with them has a very small wavelength—meaning it is an extremely localized wavepacket with very little uncertainty involved in measuring the position \(x\) of the particle. If the potential energy \(V(x)\) of the particle does not vary much over the size of the wavepacket, then although the wavepacket will flatten out over time, it won't flatten out that much (especially for a very massive particle). Therefore, for a very massive particle with a very localized wavefunction where \(V(x)\) varies smoothly, \(⟨X⟩ ≈ x\). Since \(⟨X⟩ ≈ x\), there will be very little uncertainty in the quantity \(-\frac{dV(x)}{dx}\) and \(-⟨\frac{dV}{dx}⟩ ≈ -\frac{dV}{dx}\). For very massive particles with highly localized wavefunctions which aren't disturbed (by interacting with their environment via forces), Newton's second law is a good approximation of the dynamics governing their motion.

When a particle is hit by a photon (which happens during any measurement) or an atom, its potential energy function \(V(x)\) spikes. This causes the wavefunction to become very dispersed; when this happens there is a tremendous amount of uncertainty in measuring the position of the particle. Thus \(\frac{d⟨X⟩}{dt}\) becomes big during the interaction and \(⟨P⟩\) changes. After the interaction, \(⟨P⟩\) will remain roughly constant. Much of quantum dynamics can be understood from how \(\psi(x,t)\) is affected by \(V(x)\).

The Eigenvalues of any Observable \(\hat{L}\) must be Real

An eigenvector of any operator \(\hat{M}\) is defined as a special vector \(|λ_M⟩\) such that the only effect of \(\hat{M}\) acting on \(|λ_M⟩\) is to scale the vector by some constant \(λ_M\) (called the eigenvalue of \(\hat{M}\)). Any eigenvector \(|λ_M⟩\) and eigenvalue \(λ_M\) of \(\hat{M}\) are defined by

$$\hat{M}|λ_M⟩=λ_M|λ_M⟩.$$

Thus the eigenvectors and eigenvalues of any observable \(\hat{L}\) are defined by

$$\hat{L}|L⟩=L|L⟩.\tag{9}$$

Let’s take the Hermitian conjugate of both sides of Equation (9) to get

$$⟨L|\hat{L}^†=⟨L|L^*.\tag{10}$$

According to Principle 1, any observable \(\hat{L}\) is Hermitian. Since any Hermitian operator \(\hat{H}\) satisfies the equation \(\hat{H}=\hat{H}^†\), it follows that \(\hat{L}=\hat{L}^†\) and we can rewrite Equation (10) as

$$⟨L|\hat{L}=⟨L|L^*.\tag{11}$$

Let’s multiply both sides of Equation (9) by the bra \(⟨L|\) and both sides of equation (11) by the ket \(|L⟩\) to get

$$⟨L|\hat{L}|L⟩=L⟨L|L⟩\tag{12}$$

and

$$⟨L|\hat{L}|L⟩=L^*⟨L|L⟩.\tag{13}$$

Subtracting Equation (12) from Equation (13), we get

$$⟨L|\hat{L}|L⟩-⟨L|\hat{L}|L⟩=0=(L^*-L)⟨L|L⟩.\tag{14}$$

In general, \(⟨L|L⟩≠0\) (it is a real, positive number for any nonzero eigenvector). Therefore it follows that, in general, \(L^*-L=0\) and \(L^*=L\). In order for a number to equal its own complex conjugate, it must be real with no imaginary part. Thus, in general, the eigenvalue \(L\) of any observable \(\hat{L}\) is always real. This is a good thing; otherwise Principle 3 wouldn’t make any sense. Recall that Principle 3 states that the possible measured values of any quantity (i.e. charge, position, electric field, etc.) are the eigenvalues \(L\) of the corresponding observable \(\hat{L}\). But the measured values obtained from an experiment are always real; in order for Principle 3 to be consistent with this fact, the eigenvalues \(L\) had better be real too.
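
As a numerical illustration of this result, here is a sketch in which a randomly generated Hermitian matrix (the size and random seed are arbitrary) has eigenvalues whose imaginary parts are zero to machine precision:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = A + A.conj().T                 # H equals its own Hermitian conjugate by construction

eigenvalues = np.linalg.eigvals(H)
print(np.max(np.abs(eigenvalues.imag)))               # ~1e-15: only numerical noise
print(np.allclose(eigenvalues.imag, 0, atol=1e-12))   # True: the eigenvalues are real
```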

Pauli Matrices

The quantity \(\hat{σ}_n\) is called the 3-vector spin operator (or just the spin operator for short). This quantity can be represented as a 2×2 matrix as \(\hat{σ}_n=\begin{bmatrix}(σ_n)_{11} & (σ_n)_{12}\\(σ_n)_{21} & (σ_n)_{22}\end{bmatrix}\). The value of \(\hat{σ}_n\) (that is to say, the value of each one of its entries) depends on the direction \(\vec{n}\) that \(A\) is oriented along; in other words, it depends on which component of spin we’re measuring using \(A\). In order to measure a component of spin \(\hat{σ}_m\) in a different direction (say the \(\vec{m}\) direction) the apparatus \(A\) must be rotated; similarly the spin operator must also be “rotated” (mathematically) and, in general, \(\hat{σ}_m≠\hat{σ}_n\) and the values of their entries will be different.

We’ll start out by finding the values of the entries of \(\hat{σ}_z\)—the spin operator associated with the positive z-direction. The states \(|u⟩\) and \(|d⟩\) are eigenvectors of \(\hat{σ}_z\) with eigenvalues \(λ_u=σ_{z,u}=+1\) and \(λ_d=σ_{z,d}=-1\); or, written mathematically,

$$\hat{σ}_z|u⟩=σ_{z,u}|u⟩=|u⟩$$

$$\hat{σ}_z|d⟩=σ_{z,d}|d⟩=-|d⟩.\tag{15}$$

Recall that any ket vector can be represented as a column vector; in particular the eigenstates can be represented as \(|u⟩=\begin{bmatrix}1 \\0\end{bmatrix}\) and \(|d⟩=\begin{bmatrix}0 \\1\end{bmatrix}\). We can rewrite Equations (15) as

$$\begin{bmatrix}(σ_z)_{11} & (σ_z)_{12}\\(σ_z)_{21} & (σ_z)_{22}\end{bmatrix}\begin{bmatrix}1 \\0\end{bmatrix}=\begin{bmatrix}1 \\0\end{bmatrix}$$

$$\begin{bmatrix}(σ_z)_{11} & (σ_z)_{12}\\(σ_z)_{21} & (σ_z)_{22}\end{bmatrix}\begin{bmatrix}0 \\1\end{bmatrix}=-\begin{bmatrix}0 \\1\end{bmatrix}.\tag{16}$$

Using the first equation of Equations (16) we have

$$(σ_z)_{11}+(σ_z)_{12}·0=1⇒(σ_z)_{11}=1$$

and

$$(σ_z)_{21}+(σ_z)_{22}·0=0⇒(σ_z)_{21}=0.$$

Using the second equation from Equations (16) we have

$$(σ_z)_{11}·0+(σ_z)_{12}=0⇒(σ_z)_{12}=0$$

and

$$(σ_z)_{21}·0+(σ_z)_{22}=-1⇒(σ_z)_{22}=-1.$$

Therefore,

$$\hat{σ}_z=\begin{bmatrix}1 & 0\\0 & -1\end{bmatrix}.\tag{17}$$

To derive the spin operator \(\hat{σ}_x\) we’ll go through a similar procedure. The eigenvectors of \(\hat{σ}_x\) are \(|r⟩\) and \(|l⟩\) with eigenvalues \(λ_r=σ_{x,r}=+1\) and \(λ_l=σ_{x,l}=-1\):

$$\hat{σ}_x|r⟩=σ_{x,r}|r⟩=|r⟩$$

$$\hat{σ}_x|l⟩=σ_{x,l}|l⟩=-|l⟩.\tag{18}$$

The states \(|r⟩\) and \(|l⟩\) can be written as linear superpositions of \(|u⟩\) and \(|d⟩\) as

$$|r⟩=\frac{1}{\sqrt{2}}|u⟩+\frac{1}{\sqrt{2}}|d⟩$$

$$|l⟩=\frac{1}{\sqrt{2}}|u⟩-\frac{1}{\sqrt{2}}|d⟩.\tag{19}$$

Substituting \(|u⟩=\begin{bmatrix}1 \\0\end{bmatrix}\) and \(|d⟩=\begin{bmatrix}0 \\1\end{bmatrix}\) we get

$$|r⟩=\begin{bmatrix}\frac{1}{\sqrt{2}} \\0\end{bmatrix}+\begin{bmatrix}0 \\\frac{1}{\sqrt{2}}\end{bmatrix}=\begin{bmatrix}\frac{1}{\sqrt{2}} \\\frac{1}{\sqrt{2}}\end{bmatrix}$$

$$|l⟩=\begin{bmatrix}\frac{1}{\sqrt{2}} \\0\end{bmatrix}+\begin{bmatrix}0 \\-\frac{1}{\sqrt{2}}\end{bmatrix}=\begin{bmatrix}\frac{1}{\sqrt{2}} \\-\frac{1}{\sqrt{2}}\end{bmatrix}.$$

We can rewrite Equations (18), using these column representations, in matrix form as

$$\begin{bmatrix}(σ_x)_{11} & (σ_x)_{12}\\(σ_x)_{21} & (σ_x)_{22}\end{bmatrix}\begin{bmatrix}\frac{1}{\sqrt{2}} \\\frac{1}{\sqrt{2}}\end{bmatrix}=\begin{bmatrix}\frac{1}{\sqrt{2}} \\\frac{1}{\sqrt{2}}\end{bmatrix}$$

$$\begin{bmatrix}(σ_x)_{11} & (σ_x)_{12}\\(σ_x)_{21} & (σ_x)_{22}\end{bmatrix}\begin{bmatrix}\frac{1}{\sqrt{2}} \\-\frac{1}{\sqrt{2}}\end{bmatrix}=-\begin{bmatrix}\frac{1}{\sqrt{2}} \\-\frac{1}{\sqrt{2}}\end{bmatrix}.\tag{20}$$

By solving the four equations in Equations (20) we can find the values of each of the entries of \(\hat{σ}_x\) (just like we did for \(\hat{σ}_z\)) and obtain

$$\hat{σ}_x=\begin{bmatrix}0 & 1\\1 & 0\end{bmatrix}.\tag{21}$$

Lastly, we solve for \(\hat{σ}_y\) in exactly the same way that we solved for \(\hat{σ}_x\) to get

$$\hat{σ}_y=\begin{bmatrix}0 & -i\\i & 0\end{bmatrix}.\tag{22}$$

The three matrices associated with \(\hat{σ}_z\), \(\hat{σ}_x\), and \(\hat{σ}_y\) are called the Pauli matrices.
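
A quick NumPy check that these matrices really do act on the eigenvectors \(|u⟩\), \(|d⟩\), \(|r⟩\), and \(|l⟩\) with eigenvalues \(±1\), as in Equations (15)-(22):

```python
import numpy as np

sigma_z = np.array([[1, 0], [0, -1]])
sigma_x = np.array([[0, 1], [1, 0]])
sigma_y = np.array([[0, -1j], [1j, 0]])

u = np.array([1, 0])                  # |u>
d = np.array([0, 1])                  # |d>
r = np.array([1, 1]) / np.sqrt(2)     # |r>
l = np.array([1, -1]) / np.sqrt(2)    # |l>

print(np.allclose(sigma_z @ u, +1 * u))   # True
print(np.allclose(sigma_z @ d, -1 * d))   # True
print(np.allclose(sigma_x @ r, +1 * r))   # True
print(np.allclose(sigma_x @ l, -1 * l))   # True
print(np.linalg.eigvals(sigma_y))         # eigenvalues of sigma_y are also +1 and -1
```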

The Einstein Equivalence Principle

When Einstein first realized that someone standing inside a closed elevator at rest near Earth's surface would experience all the same effects as another person riding in a rocket ship accelerating at 9.8 meters per second squared, he described it as "the happiest thought of his life." He realized that all the laws of physics and any physical experiment done in either reference frame would be identical and completely indistinguishable. This is because the effects of a uniform gravitational field are identical to the effects of constant acceleration. This led Einstein to postulate that gravity and acceleration are equivalent. Analogous to how all the physical consequences of special relativity could be derived from the postulates of the constancy of the speed of light and the sameness of physical laws in all inertial reference frames, all of the physical consequences of general relativity are derived from the postulate that acceleration and gravity are equivalent and that the laws of physics are the same in all reference frames. This postulate has been called the Einstein Equivalence Principle. There are various different forms that this statement can take, but in this lesson we shall describe the strong version of the Einstein Equivalence Principle.

Fundamental Principles and Postulates of Quantum Mechanics

Principle 1: Whenever you measure any physical quantity \(L\), there is a Hermitian linear operator \(\hat{L}\) (called an observable) associated with that measurement.

Principle 2: Any arbitrary state of a quantum system is represented by a ket vector \(|\psi⟩\).

Principle 3: The possible measurable values of any quantity are the eigenvalues \(λ_L(=L)\) of \(\hat{L}\).

Principle 4: According to the Copenhagen interpretation of quantum mechanics, after measuring \(L\), the possible states a quantum system can end up in are the eigenvectors \(|λ_L⟩(=|L⟩)\) of \(\hat{L}\).

Principle 5: For any two states \(|\psi⟩\) and \(|\phi⟩\), the probability amplitude of the state changing from \(|\psi⟩\) to \(|\phi⟩\) is given by

$$\psi=⟨\phi|\psi⟩.\tag{1}$$

The probability \(P\) of the state changing from \(|\psi⟩\) to \(|\phi⟩\) can be calculated from the probability amplitude using the relationship

$$P=\psi^*\psi=⟨\phi|\psi⟩^*⟨\phi|\psi⟩=|\psi|^2.\tag{2}$$

From a purely mathematical point of view, any ket \(|\psi⟩\) in Hilbert space can be represented as a linear combination of basis vectors:

$$|\psi⟩=\sum_i{\psi_i|i⟩}.\tag{3}$$

The kets \(|1⟩\text{, ... ,}|n⟩\) represent any basis vectors and their coefficients \(\psi_1\text{, ... ,}\psi_n\) are, in general, complex numbers. We shall prove in the following sections that we can always find eigenvectors \(|L_1⟩\text{, ... ,}|L_n⟩\) of any observable \(\hat{L}\) that form a complete set of orthonormal basis vectors; therefore any state vector \(|\psi⟩\) can be represented as

$$|\psi⟩=\sum_i{\psi_i|L_i⟩}.\tag{4}$$

We’ll also prove that the collection of numbers \(\psi_i\) are given by

$$\psi_i=⟨L_i|\psi⟩\tag{5}$$

and represent the probability amplitude of a quantum system changing from the state \(|\psi⟩\) to one of the eigenstates \(|L_i⟩\) after a measurement of \(L\) is performed. The collection of all the probability amplitudes \(\psi_i\) is called the wavefunction. When the wavefunction \(\psi(L,t)\) associated with the state \(|\psi⟩\) becomes a continuous function of \(L\) (that is, the range of possible values of \(L\) becomes infinite and the number of probability amplitudes becomes infinite), we define \(|\psi|^2\) as the probability density. One example where \(\psi\) becomes continuous is for a particle which can have an infinite number of possible \(x\) positions. Then \(\psi\) becomes a continuous function of \(x\) (and, in general, also time \(t\)). Since \(|\psi(x,t)|^2\) is the probability density, the product \(|\psi(x,t)|^2dx\) is the probability of measuring the particle within the infinitesimal interval \(dx\) around the position \(x\) at the time \(t\). The probability of measuring anything at one exact location \(x\) is in general zero. A far more useful question to ask is: what is the probability \(P(x+Δx,t)\) of measuring the particle within the range of \(x\)-values from \(x\) to \(x+Δx\)? This is given by the following equation:
$$P(x+Δx,t)=\int_{x}^{x+Δx} |\psi(x,t)|^2dx.\tag{6}$$
According to the normalization condition, the total probability of measuring \(L\) over all possible values of \(x\) must satisfy
$$P(x,t)=\int_{-∞}^{∞} |\psi(x,t)|^2dx=1.\tag{7}$$
If \(\psi(L,t)\) is continuous, then the inner product \(⟨\phi|\psi⟩\) is defined as
$$\psi(L,t)=⟨\phi|\psi⟩=\int_{-∞}^{∞} \phi^*{\psi}\text{ dL}.\tag{8}$$
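
As a numerical sketch of Equations (6) and (7), here is a concrete (and purely illustrative) Gaussian wavepacket \(\psi(x)=π^{-1/4}e^{-x^2/2}\); the integrals are approximated by simple Riemann sums:

```python
import numpy as np

x = np.linspace(-10, 10, 200001)
dx = x[1] - x[0]
psi = np.pi**(-0.25) * np.exp(-x**2 / 2)   # assumed, normalized Gaussian wavepacket
density = np.abs(psi)**2                   # |psi(x)|^2 is the probability density

# Equation (7): the total probability over all x should be 1
print(np.sum(density) * dx)                # ~1.0

# Equation (6): the probability of finding the particle between x = 0 and x = 0.5
mask = (x >= 0) & (x <= 0.5)
print(np.sum(density[mask]) * dx)          # ~0.26
```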

Measuring the Spin of an Electron

Figure 1

We can think of the spin of an electron as a 3-vector attached to the electron which behaves like a bar magnet. If an electron whose spin is pointing in a direction at an angle \(θ\) to the vertical is placed in the magnetic field of an apparatus \(A\), its spin will align with the magnetic field of \(A\). In doing so the electron will emit radiation and lose all of its potential energy. As the angle \(θ\) increases, the electron’s potential energy increases. Therefore one might expect that the bigger \(θ\) is, the more energy the electron will radiate away. However, this is not what happens. Experimentally, it has been demonstrated that no matter what direction the spin is initially pointing in, one of two things will happen when the electron’s spin aligns with the magnetic field: either the electron will emit no radiation or it will emit exactly one photon whose energy \(E_γ\) equals the potential energy \(PE(180°)\) when the spin is pointing down. The electron’s initially prepared spin can be in any direction but, oddly, when you measure the spin it is always only up or down.

The apparatus \(A\) in Figure (1) is used to create a magnetic field. The field lines are straight and start at the north pole and end at the south pole as illustrated in Figure (1). To simplify things we’ll draw \(A\) as a box with an arrow on it where the arrow represents the direction of the magnetic field. If we imagine “turning \(A\) off,” then rotating \(A\) by \(90°\) without affecting the electron, and then turning \(A\) back on, we’ll be measuring the x-component of spin \(\hat{σ}_x\). If we do the same thing as before and turn \(A\) off, then rotate \(A\) by \(90°\) in the xy-plane, and turn \(A\) back on, we’ll be measuring the y-component of spin \(\hat{σ}_y\). If we follow the same procedure as before except rotate \(A\) by an arbitrary angle so that it is pointing in an arbitrary direction, the axis along which \(A\) is pointing will be the component of the measured spin which we’ll represent as \(\hat{σ}_r\).

There are two lights attached to the bottom of \(A\). Suppose that the spin of the electron is initially prepared in any arbitrary direction: if we measure \(\hat{σ}_z\) and no photon is emitted, we will have measured \(σ_z=+1\) and the spin is now up. If we measure \(\hat{σ}_z\) and a photon is emitted, then we will have measured \(σ_z=-1\) and the spin is now down. (It’ll take some time getting used to this notion, but before the component of spin \(\hat{σ}_z\) is measured the z-component of spin could be anything. But when you go to measure \(\hat{σ}_z\) there are only two possible values of the z-component of spin you can measure, which are \(σ_z=±1\).) This entire discussion applies to whenever we measure the components of spin \(\hat{σ}_x\), \(\hat{σ}_y\), and in general \(\hat{σ}_r\) along any arbitrary axis: before the component of spin with respect to some axis is measured it can be anything; but when we go to measure that component of spin it can only be \(σ_r=±1\).

Imagine repeating over and over again the following experiment:

1) Initially prepare the electron’s spin in any arbitrary direction.

2) Turn \(A\)’s magnetic field off and then rotate \(A\) until its “up arrow” is pointing along the z-axis as in Figure (1).

3) Turn \(A\) back on and measure the component of the electron’s spin \(\hat{σ}_z\).

4) Turn \(A\)’s magnetic field off and then, after that, prepare/reset the electron’s spin to be the same as it was in step 1) before measuring \(\hat{σ}_z\).

5) Repeat.


If you perform this procedure many times where the electron’s spin is initially prepared pointing entirely along the positive x-axis, then in 50% of the experiments the electron will emit no radiation and in the other 50% the electron will emit a photon. If you go through this procedure for \(θ=45°\), in 75% of the experiments the electron will emit no radiation (measured spin is up) and in 25% of the experiments the electron will emit a photon (measured spin is down). If you go through this procedure for \(θ=135°\), 75% of the time you will measure the electron’s spin to be pointing down and 25% of the time to be pointing up. There are, however, two special cases: if the electron’s spin is prepared at \(θ=0°\) the spin will be measured to be up 100% of the time, and if the electron’s spin is prepared at \(θ=180°\) the spin will be measured to be down 100% of the time.

Solving the FRW Equation for the Scaling Factor in different scenarios

In this lesson we'll solve the FRW equation (one of the EFEs) for the scaling factor \(a(t)\) to determine the expansion rate of the universe in two different idealized scenarios: a universe filled with only radiation and a universe filled with only matter. These two different scenarios are called a radiation dominated universe and a matter dominated universe, respectively.

Bentley's and Olbers' Paradoxes

After Isaac Newton published his law of gravitation, the scholar Richard Bentley and, much later, the astronomer Heinrich Olbers pointed out two paradoxes that arise from this law. The first, which is called Bentley's paradox, points out that if the universe is finite in size then, since the force of gravity is always attractive, all of the stars and galaxies in the universe should collapse in on themselves. The second, called Olbers' paradox, states that if the universe is infinite and if the distribution of stars in the universe is uniform, then the night sky should be filled with infinitely many stars and should therefore be blindingly bright.

Friedmann-Robertson-Walker (FRW) Equation

Derivation of the Friedmann-Robertson-Walker (FRW) Equation

We shall use Newton’s theory of gravity, one of his theorems from his Principia, and the conservation of energy to derive the FRW equation, which describes how the scaling factor \(a(t)\) changes with time. (Later on we’ll derive the FRW equation by substituting the metric \(g_{μν}\) into the Einstein Field Equations (EFEs).)

In Newtonian theory, what determines the motions of the galaxies is gravity. What is the effect on the motion of the mass \(m=m_j\) (Alice’s galaxy, say) with some initial outward velocity \(V=H_0D\) due to the expansion of space? Note that \(V\) is a variable depending on \(D\). Very large values of \(D\) (corresponding to galaxies very far away) correspond to enormous velocities \(V\), whereas small values of \(D\) (corresponding to nearby galaxies) correspond to very small velocities \(V\). Clearly, \(V\) is different for each galaxy of mass \(m\) due to \(D\) being different. We assume that the motion of \(m\) is only affected by the gravitational field due to \(M\) (in other words, the gravitational force \(F_{\text{M,m}}\) is the only influence acting on \(m\)).

One of Newton’s theorems states that the gravity due to all the masses surrounding \(m\) can be regarded as the gravity due to a point-mass \(M\) centered in the sphere of radius \(D\), where \(M\) is the sum of all the masses enclosed by that sphere.

The total energy \(E\) for each galaxy of mass \(m\) is constant and is given by \(E=KE+PE=\frac{1}{2}mV^2-\frac{MmG}{D}\). From this point onwards the steps we take might seem tedious, but really all we’ll be doing is a lot of algebra and making a lot of substitutions. First let’s multiply both sides of the energy-conservation equation by \(2/m\) (writing the constant \(2E/m\) simply as \(K\)) to get

$$V^2-2\frac{MG}{D}=\frac{2E}{m}=K.$$

Next let’s substitute \(V=dD/dt=(∆r)da/dt\) and \(D=a∆r\):

$$∆r^2\biggl(\frac{da}{dt}\biggr)^2-2\frac{MG}{a∆r}=\frac{2E}{m}=K.$$

Next we can multiply both sides by \(1/a^2\) to get

$$∆r^2\frac{(da/dt)^2}{a^2}-2\frac{MG}{a^3∆r}=\frac{2E}{a^2m}=\frac{K}{a^2}.$$

We will also use Newton’s theorem to show that only the mass \(M\) inside the sphere of radius \(D\) contributes to the gravitational effect on a galaxy a distance \(D\) away from the center. The mass is given by \(M=ρ(\frac{4}{3}πD^3)=ρ(\frac{4}{3}πa^3∆r^3)\). It follows that

$$∆r^2\frac{(da/dt)^2}{a^2}-2\frac{[ρ(\frac{4}{3}πa^3∆r^3)]G}{a^3∆r}=∆r^2\biggl(\frac{(da/dt)^2}{a^2}-\frac{8}{3}πρG\biggr)=\frac{K}{a^2}.$$

At some particular time \(t=\text{constant}\), the scaling factor is constant and does not vary with position (because we assumed the Universe is isotropic and homogeneous over large distances). I will now write the left- and right-hand sides of the equation in a way where you can clearly see that both sides are proportional to \(Δr^2\), which must be the case since they are equal:

$$Δr^2\biggl(\frac{(da/dt)^2}{a^2}-\frac{8}{3}πρG\biggr)=\frac{2}{ma^2}\biggl(\frac{1}{2}mV^2-\frac{MmG}{aΔr}\biggr).$$

Using the fact that \(V=Δr\,da/dt\) and \(M=ρ(\frac{4}{3}πa^3Δr^3)\), you can see that a \(Δr^2\) term can be factored out of the right-hand side of the equation and thus, by dividing both sides by \(Δr^2\) (and absorbing the leftover constant into \(K\)), we get

$$\frac{(da/dt)^2}{a^2}-\frac{8}{3}πGρ=\frac{K}{a^2}.$$

Next, we'll let \(K=-κ\) and substitute it into the equation above; then we'll add \(\frac{8}{3}πGρ\) to both sides to get the FRW equation which is given by

$$H^2(t)=\frac{(da/dt)^2}{a^2}=\frac{8}{3}πGρ-\frac{κ}{a^2}.\tag{6}$$


Some comments on the qualitative character of the FRW Equation

The FRW equation is a differential equation whose solution is the scaling factor \(a(t)\). The scaling factor \(a(t)\) that we obtain from this equation will be different based on what \(ρ\) and \(κ\) are. The value of \(κ\) is related to the curvature and geometry of the space of the universe. If \(κ=1\), then the space of the Universe is a closed and bounded 3-sphere; if \(κ=-1\), the space is an open, 3-dimensional hyperboloid; if \(κ=0\), then space is flat (non-curved) and open. We know, empirically, from observations of the Cosmic Microwave Background (CMB) that \(κ≈0\) to within an accuracy of about 1%. We will assume that \(κ=0\) which simplifies the FRW equation to

$$H^2(t)=(da/dt)^2/a^2=\frac{8}{3}πGρ.\tag{7}$$

If we live in a Universe with a flat space with no curvature where \(κ=0\), then the right-hand side of the equation is always positive (since \(ρ>0\)), and therefore \(H^2(t)=(da/dt)^2/a^2\) is always positive. This means that \(da/dt\) can never reach zero and change sign, which means that a Universe that is expanding must always keep expanding (it cannot turn around and start contracting).

For now we’ll just worry about studying the FRW equation when \(κ=0\) and we won’t worry about how \(a(t)\) changes based on the value of \(κ\). Given \((da/dt)^2/a^2=8πGρ/3\), \(a(t)\) will be different depending on what \(ρ\) is. For example, from the birth of the Universe until the Universe was about 10,000 years old, the energy content of the Universe was dominated by radiation and we can call the energy density in this primordial Universe \(ρ_r\). From when the Universe was 10,000 years old to a few billion years old, the distribution of energy was mostly in the form of masses (galaxies) at rest in the coordinates \(x^i\). We can call the energy density of the universe during this time period \(ρ_M\). \(ρ_M\) and \(ρ_r\) are different; as a consequence of this, the right-hand side of the FRW equation will be different for \(ρ_M\) and \(ρ_r\). Therefore, \(a(t)\) will be different for both cases.

In a Universe where there is only radiation, all of the energy is in the photons present. The energy of each photon is given by \(E_γ=hf=hc/λ=k/λ\). The energy \(E_γ\) of each photon depends on the size of the box (the comoving box described in the next paragraph). If the box expands, for example, as space expands, the wave associated with each photon will expand with it. As its wavelength \(λ\) gets stretched (which is to say redshifted), the energy \(E_γ=hc/λ\) of each photon will decrease. For photons traveling along the x-axis in the box, let the distance between points A and B (which can be two points on crests or troughs separated by one wavelength \(λ\) or, more generally, two points with the same phase \(Φ\) separated by \(λ\)) be \(D=a∆x\); then, \(D=a∆x=λ\). As the box grows, \(∆x\) stays the same while the scaling factor increases. Since \(∆x=\text{constant}\), we have \(a(\text{constant})=λ\) and thus \(λ∝a\). We can substitute this into \(E_γ\) to obtain \(E_γ=hc/λ=hc/a∆x=k/a\). Thus, \(E_γ∝1/a\).

We can choose any point in space to be the origin of a coordinate system \(x^i\) which “follows” the expansion/contraction of space. Suppose we draw a box whose edges are separated by 1 coordinate unit. As space either expands or contracts, the box will expand or contract with it. \(∆x^1\), \(∆x^2\), and \(∆x^3\) do not change with this expansion/contraction. The actual physical distances between the edges of the box are given by \(a∆x^1\), \(a∆x^2\), and \(a∆x^3\) which, in general, will change with time. The volume of the box is given by \(V=(a∆x^1)(a∆x^2)(a∆x^3)=a^3\) since \(∆x^1=∆x^2=∆x^3=1\). All of the photons will be traveling at the speed of light where some are going into the box and some are going out of the box, but we will assume that on average the total number of photons in the box is constant. The energy density in the box due to the radiation present is \(ρ_r=\frac{\text{total energy in box}}{\text{volume of box}}=E_{total}/V\). The energy of each photon is \(E_γ\) and if there are \(N\) (which we assume is constant) photons in the box then the total energy in the box is \(E_{total}=NE_γ\) (this is just the sum of the energies of each photon in the box). The energy density in the box is

$$ρ_r=E_{total}/V=NE_γ/a^3=N(k/a)/a^3=k/a^4.\tag{8}$$

so that \(ρ_r∝1/a^4\). In a Universe where space is flat (but can be expanding/contracting) with only radiation in it, the FRW equation simplifies to

$$H^2(t)=(da/dt)^2/a^2=8πGρ_r/3=k/a^4.\tag{9}$$
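
A minimal numerical sketch of Equation (9): with \(κ=0\) and \(ρ_r∝1/a^4\), the equation reduces to \(da/dt=\sqrt{C}/a\), whose solution grows as \(a(t)∝t^{1/2}\). The constant \(C\) and the initial value of \(a\) below are arbitrary choices of units:

```python
import numpy as np

C = 1.0          # lumped constant in (da/dt)^2 = C/a^2 (arbitrary units)
dt = 1e-5
a = 1e-3         # small initial scale factor just after t = 0
t_final = 1.0

t = 0.0
while t < t_final:
    a += np.sqrt(C) / a * dt     # Euler step of da/dt = sqrt(C)/a
    t += dt

# Analytic solution of a da = sqrt(C) dt (ignoring the tiny initial a): a(t) = sqrt(2*sqrt(C)*t)
print(a, np.sqrt(2 * np.sqrt(C) * t_final))   # both ~1.414, i.e. a grows like t^(1/2)
```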


This article is licensed under a CC BY-NC-SA 4.0 license.

References

1. Leonard Susskind. "Matter and radiation dominated universes". theoreticalminimum.com.


Surface of Last Scattering

When we look at the Andromeda galaxy, we are seeing photons that are 2 million years old (that is how long it took the photons, traveling at the speed of light, to reach us, since the distance between our galaxy and the Andromeda galaxy is 2 million light-years); since the scaling factor \(a(t)\) was smaller then, the photons were “hotter” and had more energy. If we look back at the light coming from galaxies 10 billion light-years away, we are seeing photons that are 10 billion years old and that are even hotter and more energetic. If we look back about 13.4 billion light-years, we are seeing photons that are 13.4 billion years old and that were buzzing along through space when the Universe was just 300,000 years old; we are seeing photons right at the time when the Universe transitioned from being opaque to being transparent. Since these photons were emitted by matter at a temperature of about 3,000 K, when we look up at the night sky it should be blindingly bright. Indeed, if we ignored the expansion of space due to \(ρ\), this would be the case. However, because these photons are traveling through a space that is expanding, they become redshifted to the microwave range.

Boltzmann’s relationship between the temperature \(T\) and energy \(E\) of radiation is given by \(E=kT⇒E∝T\). We showed earlier that the total energy \(E\) of a collection of photons is \(E=NE_γ=\frac{Nhc}{λ}=\frac{Nhc}{a∆r}=\frac{k}{a}\) and \(E∝\frac{1}{a}\). Using Boltzmann’s relationship, we also see that \(T∝\frac{1}{a}\). As the Universe expands and \(a\) grows with time \(t\) since the Big Bang, the wavelengths of photons become stretched, their energy decreases, and their temperature decreases. But if we imagine running the clocks backwards and watching the Universe as time runs backwards, the Universe is contracting and \(a\) is decreasing as we run \(t\) backwards; the wavelengths of photons decrease, their energy increases, and their temperature increases. Today we measure the temperature of this radiation to be \(~3K\). During some time period of the very young Universe, the Universe was so hot that light and photons couldn’t pass through it (they became scattered). The hydrogen and helium atoms were ionized; this means that the electrons and nuclei were split apart and buzzing past one another at tremendous speeds. Such a state of matter is called a plasma. During this time period, all of the matter in the Universe looked like a giant, glowing ball of plasma (in other words, it looked like a giant Sun). When the temperature of a collection of particles is above roughly \(3,000K\), it is opaque (which means that light and photons cannot pass through it). We will call this threshold temperature \(T_{\text{last scattering}}\).

The temperature \(T_{\text{last scattering}}≈3,000K\) is the “turning point,” so to speak; the time (which we will call \(t_{\text{last scattering}}\)) when atoms and radiation were at this temperature is the instant of time just before the Universe became transparent. For any \(t>t_{\text{last scattering}}\), the temperature \(T\) of radiation and atoms was \(T<T_{\text{last scattering}}≈3,000K\). We are interested in solving for the time \(t_{\text{last scattering}}\): that moment of time when the Universe transitioned from being opaque to transparent. To do this, we take advantage of the fact that the Universe was already matter dominated at \(t_{\text{last scattering}}\); assuming that throughout the entire time interval from \(t_{\text{last scattering}}\) all the way until our present time \(t\) the dominant form of energy density in the Universe was \(ρ_M\), the scaling factor is \(a(t)=Ct^{2/3}\) for this entire interval of time. (It is an enormous idealization and simplification to assume the only contribution to the energy density \(ρ\) throughout the Universe is \(ρ_M\) and that \(ρ≈ρ_M\); later on, we shall come up with a model from which we can derive a \(ρ\) that includes both radiation energy density \(ρ_r\) and mass-energy density \(ρ_M\).) At our present time (which we will call \(t_{today}≈10^{10}\text{ years}=10\text{ billion years}\)), the scaling factor is given by \(a(t_{today})=Ct^{2/3}_{today}\); at the time \(t_{\text{last scattering}}\), the scaling factor was \(a(t_{\text{last scattering}})=Ct^{2/3}_{\text{last scattering}}\). By taking the ratio of the two scaling factors and doing some algebra, we can find \(t_{\text{last scattering}}\):

\begin{equation}
\frac{a(t_{today})}{a(t_{\text{last scattering}})}=\frac{Ct^{2/3}_{today}}{Ct^{2/3}_{\text{last scattering}}}=1000.
\end{equation} Since \(T∝1/a\), if we are considering a time when \(T\) was 1,000 times greater than it is today, then the scaling factor was 1,000 times smaller back at that hotter time. It follows that

\begin{equation}
\frac{t_{today}}{t_{\text{last scattering}}}=1000^{3/2}⇒t_{\text{last scattering}}≈\frac{10^{10}\text{ years}}{1000^{3/2}}≈300,000\text{ years}.
\end{equation} For the first 300,000 years of the life of the Universe, the Universe was opaque. Just after a time of roughly 300,000 years since the Big Bang, the temperature \(T\) of the Universe became cool enough for atomic nuclei to bond with electrons and form electrically neutral hydrogen and helium atoms allowing the Universe to become transparent.
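
The arithmetic above can be checked in a couple of lines (using the rounded value \(t_{today}≈10^{10}\) years from the text):

```python
# Assumed: the rounded values used in the text (t_today ~ 10^10 years,
# a temperature/scale-factor ratio of 1000, and a(t) ~ t^(2/3)).
t_today_years = 1.0e10
ratio_of_scale_factors = 1000        # a(t_today) / a(t_last_scattering)

t_last_scattering = t_today_years / ratio_of_scale_factors ** (3 / 2)
print(t_last_scattering)             # ~3.2e5 years, i.e. roughly 300,000 years
```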

The graph of \(ρ(a)\) vs. \(a\) shows that as \(a\) increases with time (since \(κ=0\), \(a\) will keep growing with \(t\) forever) and the Universe expands, during the early interval \(0≤t≤t_0\) when the scaling factor was smaller than \(a(t_0)\), the dominant form of energy density was \(ρ_r\), since \(ρ_r>ρ_M\) and \(ρ_r>ρ_0\) over this period. During the time interval \(t_0≤t≤t_1\) when the scaling factor increased from \(a(t_0)\) to \(a(t_1)\), the dominant form of energy density was \(ρ_M\). For times \(t>t_1\) when \(a(t)>a(t_1)\) (which is the time period we live in today), the dominant form of energy density is \(ρ_0\). From the graph of \(a(t)\) vs. \(t\), we see that after a long enough time (when \(t\) becomes very large), \(a(t)=Ce^{\sqrt{\frac{8}{3}πGρ_0}t}\) will dominate and the Universe will keep expanding at an exponential (and thus accelerating) rate.

How \(a(t)\) changes with \(t\) in a Universe dominated by dark/vacuum energy only becomes significant over vast distance scales. Substituting \(a(t)\) into \(D=a(t)∆r\) and then substituting \(D\) into \(V=H_0D\), we can calculate that the recessional velocities between our galaxy and very nearby galaxies (small \(D\)) are very small (only hundreds of kilometers per second which, compared to the speed of light, is extremely small). For very enormous values of \(D\), the recessional velocities of distant galaxies are close to the speed of light. For example, distant quasars that are 10 billion light-years away move away from us at a speed of about half the speed of light. If the distance \(D\) a galaxy is away from us is big enough, it will be moving away from us at a speed greater than the speed of light. As the march of time progresses, the recessional velocity \(V=H(t)D\) of any given galaxy will continue to grow, since the scaling factor (and hence the distance \(D\) to that galaxy) will continue to increase with time due to dark energy; after an unimaginably long period of time (five billion years), the distances will have increased so dramatically that even the nearest galaxies only a few million light-years away will be receding from us faster than light speed. Lawrence Krauss once said that, for this reason, we are living in a very special time in the history of the universe, a time when we can still observe the CMBR and arrive at correct conclusions about the nature of the universe. Five billion years from now, all of the galaxies beyond the Milky Way, and also the CMBR, will be receding from us faster than the speed of light and will become undetectable.


This article is licensed under a CC BY-NC-SA 4.0 license.

References

1. Singh, Simon. Big Bang: The Origin of the Universe. New York: Harper Perennial, 2004. Print.

2. Wikipedia contributors. "Cosmic microwave background." Wikipedia, The Free Encyclopedia. Wikipedia, The Free Encyclopedia, 12 May. 2017. Web. 18 May. 2017.

3. Leonard Susskind. "Vacuum energy". theoreticalminimum.com.

Spectroscopy

Recessional velocities and compositions of stars

For the purposes of this discussion we can regard light as a wave without consideration of its particle character. Since it is a wave it has a wavelength \(λ\) which is defined as the distance between two crests or troughs of the light wave as shown in figure #. (More generally speaking, it is defined as the distance between two points on the wave with the same phase.) If we know what the wavelength \(λ\) of light is, then we can determine a great deal of information about that light such as its color (or, more precisely, which region of the EM spectrum the light is in), energy, temperature, and so on. Each particular color of light has a specific wavelength. For example red light has a wavelength of roughly \(λ=700nm\) whereas violet light has a wavelength of roughly \(λ=400nm\). There are also forms of light whose wavelength corresponds to regions in the EM spectrum which are imperceptible to human vision (but, nonetheless, can still be “seen” by our detectors) such as infrared light, microwaves, etc. Given the wavelength, then using Planck’s relationship \(E=hc/λ\), we can determine the energy of the light. Furthermore, from the Boltzmann relationship \(E=kT\), we can also relate the energy of this light to the temperature of the matter which emitted it. On average the atoms composing matter of temperature \(T\) will emit photons with an energy \(E=kT\), so that \(T∝1/λ\). Thus, given \(λ\) we can determine the temperature \(T\) of the matter. The wavelength \(λ\) of radiation emitted by the human body with temperature \(T≈98\text{ degrees F}\) corresponds to infrared radiation. If you consider hotter matter such as a piece of metal heated to \(500\text{ degrees C}\), it will emit wavelengths of light which correspond to the red region of the EM spectrum and the object will be glowing red. If you consider a still hotter object such as the filament in a light bulb with a temperature of \(3,000\text{ degrees C}\), it will emit white light because \(λ\) will have shifted to the middle of the visible part of the EM spectrum.
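
As a rough numerical sketch of the \(E=kT\), \(E=hc/λ\) estimate described above, applied to a body at human body temperature (the constants are standard values; a more careful treatment would use Wien's displacement law, but the conclusion—infrared, invisible to the eye—is the same):

```python
h = 6.626e-34        # Planck's constant, J s
c = 3.0e8            # speed of light, m/s
k_B = 1.381e-23      # Boltzmann's constant, J/K

T = 310.0            # K, roughly human body temperature
wavelength = h * c / (k_B * T)   # from E = k_B*T and E = h*c/lambda
print(wavelength * 1e6)          # ~46 micrometers: far-infrared radiation
```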

When light is emitted by a distant stellar object (i.e. a star or galaxy) with an initial wavelength \(λ_e\), by the time it reaches us its wavelength has become “stretched” and increased by an amount \(∆λ\). We say that the light was redshifted by an amount \(∆λ\). In practice the radial velocity \(V\) of recession of a distant stellar object a distance \(D\) away from us can be calculated using Doppler’s equation if we know what the redshift \(∆λ\) is. During the 1920s the astronomer Edwin Hubble performed this calculation for many different stellar objects and plotted the velocity \(V\) as a function of distance \(D\) on a graph. He then drew a line of regression through the data points (which was a straight line) and concluded that \(V∝D\). Then by calculating the slope he was able to determine Hubble’s constant \(H_0\). (Hubble’s initial calculation of \(H_0\) was off by a factor of 10 but later on this error was corrected.) It is very important to emphasize that these stellar objects are not moving through space; rather it is space itself that is “moving” and expanding. Therefore the redshift of these objects is due not to their motion relative to us but rather to the expansion of space (we will talk about this in detail later on). The redshift \(∆λ\) of such objects is measured using a device called a spectroscope, which is an instrument that contains a glass (or some other refractive material) prism. When light passes through the prism it “spreads out”; this makes it easier to distinguish between the different wavelengths of light. (For example, when white light (which is composed of all visible wavelengths) passes through a prism, a rainbow is produced and the different wavelengths are easier to distinguish.) This refracted light is then shone on a photographic plate (the detector) which records the “brightness” of each wavelength of light. The intensity as a function of wavelength, \(I(λ_r)\), is obtained from the detector and, roughly speaking, is a measure of the “brightness” or “dimness” of each wavelength \(λ_r\).

Figure 2

According to the laws of atomic physics (which are derived from quantum mechanics), a particular atom can emit and absorb only certain wavelengths of light. Take for example the sodium atoms which compose salt. Sodium atoms emit or absorb only two different wavelengths of photons (which correspond to the “yellow region” of the EM spectrum). If a light source shone light composed of all wavelengths in the EM spectrum through a bunch of sodium atoms, all wavelengths of light would pass right through it except for two specific wavelengths (which are in the “yellow region” of the EM spectrum). If this light, after passing through the sodium atoms, is shone through a spectroscope and onto a detector, two distinct dark bands (which lie in the yellow region) will appear on the detector as shown in Figure 3. Each different kind of atom leaves its own signature and produces different bands on the detector as shown in Figure 3. If light consisting of all wavelengths is shone through some object (that is composed of unknown kinds of atoms) and then through a spectroscope and onto a detector, we can determine what kinds of atoms it is made of by inspecting the dark lines (called spectral lines) on the detector. Throughout the early 20th century astronomers used this technique (which is called spectroscopy) to determine the composition of the Sun, other stars, our Milky Way galaxy, and other galaxies.

A star has an atmosphere made up of certain kinds of atoms, each of which absorbs only particular wavelengths. Stars generate their light through a process called nuclear fusion at their centers. When this light passes through the star’s atmosphere most wavelengths of light will go straight through it but certain wavelengths of light will get absorbed. After the light passes through the star’s atmosphere some of it will pass through an astronomer’s telescope and spectroscope and onto the detector. All wavelengths of light will appear on the detector except for the ones that got absorbed—these will appear as black spectral lines. Suppose that we had two identical stars made of the same atoms: the first is stationary relative to us, but the second is moving away from us due to the expansion of space. The spectral lines will look exactly the same for both stars with one exception: the spectral lines corresponding to the star moving away from us will be slightly redshifted. This measured redshift \(∆λ\) is what astronomers use to calculate the recessional velocity \(V\) of the star. The same exact argument applies to entire galaxies: most of a galaxy’s light is generated in the interiors of its stars; that light then passes through the stars’ atmospheres (some wavelengths getting absorbed) and some of it reaches the detectors of humanity’s most powerful telescopes. By examining the redshift in the spectral lines, astronomers like Vesto Slipher and Edwin Hubble were able to determine what the galaxies’ recessional velocities were.


Finding massive exoplanets

Spectroscopy also plays an important role in the discovery of massive exoplanets which are several times as massive as the Earth. To understand how this works, it is very useful to start out by thinking about the Jupiter-Sun system. As Jupiter orbits around the Sun, it exerts gravitational forces causing the Sun to oscillate about its equilibrium position with an amplitude of hundreds of thousands of miles. If we were living on a distant exoplanet "watching" the Sun with our telescopes, what would we see? Well, if we were at the appropriate point in our orbit we would be able to see the Sun either moving towards us or away from us. This relative motion will cause the light emitted from the Sun to be either blueshifted or redshifted. From there, we could just use Doppler's equations to determine the Sun's relative velocity. After determining the relative velocity, we could then work out the mass of Jupiter, its orbital period, and lastly how far Jupiter is away from the Sun.

The same discussion would also apply to an Earth-based observer watching distant stars with their telescopes. When astronomers see a star wobbling, this gives them a pretty good hunch that a massive exoplanet must be orbiting around it. Using the aforementioned techniques, astronomers could deduce the mass of the exoplanet, its orbital period, and its distance away from its parent star. Unfortunately, for smaller planets, this strategy of looking at the star's wobble doesn't work quite so well since small exoplanets (ones which are 0.1 to a few times the size of the Earth) exert very small tugs on their parent star, making it difficult to notice any wobbles. To find smaller exoplanets, astronomers use a different strategy: namely, they try to spot a "twinkle" as the exoplanet passes in front of its parent star.


This article is licensed under a CC BY-NC-SA 4.0 license.

References

1. Singh, Simon. Big Bang: The Origin of the Universe. New York: Harper Perennial, 2004. Print.

2. Wikipedia contributors. "Spectroscopy." Wikipedia, The Free Encyclopedia. Wikipedia, The Free Encyclopedia, 12 May. 2017. Web. 18 May. 2017.

A Brief Tour of our Milky Way Galaxy

The Milky Way galaxy—our home galaxy—is a grand assemblage of over one-hundred billion stars that spans one-hundred thousand light-years across space. But that isn't all that there is in our galaxy. Enormous clouds of gas and dust float in between the stars; this is called the interstellar medium. Mapping the Milky Way and understanding its size and composition was made possible by advances in astronomy. Techniques such as stellar parallax and the use of a class of stars known as Cepheid variable stars made it possible for astronomers to measure vast distances across space. Despite this, it was still very difficult to make some long-range distance measurements across the galaxy because the interstellar medium blocks out so much visible light; the advent of infrared astronomy, however, circumvented this issue and made it possible for 20th century astronomers to determine the size of the Milky Way.

Dark Matter

Suppose we have a very sparse distribution of particles/point-masses surrounding a very dense region of particles where most of the mass is. A familiar example of this is our solar system, where the vast majority of the mass is concentrated in the Sun and, as you move far away from the Sun, the distribution of mass is very sparse and small. If this is the way the mass is distributed, we can approximate and say that the gravitational force \(F_g\) acting on any one of the particles in the sparse and mostly empty region is due entirely to the central mass \(M\), where \(M\) can be treated as a single particle/point-mass. For example, when studying our solar system, under this approximation, we can think of the Earth as a speck and all of the other planets, moons, comets and asteroids as other specks which have no gravitational effect on the Earth; under this approximation, we can think of all the mass concentrated at the center making up the Sun as the only lump of mass in the solar system which has a gravitational effect on the Earth and we can treat that entire lump of mass \(M_{Sun}\) as a little point-particle. Hence, we can assume that the gravitational force acting on the Earth is \(F_{Sun,Earth}=G\frac{M_{Sun}m_{Earth}}{r^2}\). Another good approximation is that the Earth goes around the Sun in a circular orbit (although, in reality, it is an ellipse). Recall that Newton’s second law says that the net force \(F\) acting on an object (modeled as a particle/point-mass) of mass \(m\) equals \(ma\), where \(a\) is the acceleration of the object. In our example, the only force \(F\) acting on the Earth is the gravitational force \(F_{Sun,Earth}\) due to \(M_{Sun}\), so the net force acting on the Earth is just \(F_{Sun,Earth}\). Thus, \(m_{Earth}a=G\frac{M_{Sun}m_{Earth}}{r^2}\) and \(a=G\frac{M_{Sun}}{r^2}\). The acceleration \(a\) of any object going in a circular path due to a gravitational force \(F_g\) exerted by a point-mass at the center is \(a_c=\frac{v^2}{r}\). Thus,

$$\frac{v^2}{r}=G\frac{M_{Sun}}{r^2}⇒v=\sqrt{G\frac{M_{Sun}}{r}}.$$

(This equation says that the velocity \(v\) decreases with increasing \(r\).) When we look out through our telescopes at the distribution of stars in our galaxy, we see that the vast majority of the mass is at the center, where there are thousands of globular clusters and stars next to each other in a very dense and compact region. The rest of the stars on the outskirts of the galaxy away from the center are very sparsely distributed along the spiral arms. In fact, we have measured that most of the stars and visible matter and mass is at the center (so we know for sure this is where most of the visible matter and mass is). The stars orbit around the center of the galaxy approximately in circular orbits (so we can approximate and say that all the stars have circular orbits). In this example, we can make all the same assumptions that we made about our solar system and go through the same analysis to find that the velocity \(v\) of any star at distance \(r\) from the center of the galaxy is \(v=\sqrt{\frac{MG}{r}}\), where \(M\) is the total mass of all the visible matter (mainly stars) at the center. From the way that visible matter is distributed throughout our galaxy and the way that the stars go in roughly circular orbits around the center, we would expect that stars at greater distances \(r\) from the center will have smaller velocities \(v\). This is actually not the case. In the early 1970s an astronomer named Vera Rubin, along with her colleagues, measured the rotation rate (that is to say, the orbital speed of various stars) in our galaxy to obtain the curve in Figure #. She was awarded the Gold Medal of the Royal Astronomical Society for her pioneering work. We see from this graph that for increasing \(r\) as we move through the central bulge of the galaxy these measurements agree with our theoretical predictions; however, for larger values of \(r\) that are beyond the central bulge, the measured orbital speeds of stars and gas stay roughly constant (in stark contrast to what is predicted by theory). We shall see that this suggests that maybe the ordinary, visible mass (stars, planets, etc.) that we see in the night sky (and that interacts with light, allowing us to see it) isn’t the only kind of mass making up the galaxy.
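
As a quick sanity check of the formula \(v=\sqrt{\frac{M_{Sun}G}{r}}\) derived above, we can plug in standard rounded values for the Sun's mass and the Earth's orbital radius and recover the Earth's actual orbital speed (a minimal sketch):

```python
import numpy as np

G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
M_sun = 1.989e30       # mass of the Sun, kg
r_earth = 1.496e11     # Earth-Sun distance (1 AU), m

v = np.sqrt(G * M_sun / r_earth)
print(v / 1000)        # ~29.8 km/s, the Earth's actual orbital speed
```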

Let’s come up with a more general model for the mass distribution of the stars in our galaxy. Suppose that the only assumption we make is that the particles are distributed isotropically around a center point. Let’s consider the gravitational force \(F_g\) acting on a star at the outermost distances of our galaxy. According to Newton’s (shell) theorem, the gravitational force \(F_g\) acting on this star of mass \(m\) at a distance \(r\) from the center is due to all the mass \(M(r)\) enclosed by the sphere of radius \(r\). Thus,

$$F_g=G\frac{M(r)m}{r^2}=ma=\frac{mv^2}{r}⇒M(r)=\frac{v^2r}{G}.$$

Note that we can measure Newton’s gravitational constant \(G\), we can measure the radius \(R_{Galaxy}\) of our galaxy, and we can measure the orbital speed \(v\) (which is roughly constant) of the stars. Setting \(r=R_{Galaxy}\) and calculating \(M_{Galaxy}=\frac{v^2R_{Galaxy}}{G}\), we obtain a value for \(M_{Galaxy}\) which is about ten times bigger than the total mass of all the ordinary matter (stars, planets, etc.) in our galaxy. This suggests that maybe there is some form of invisible matter (called dark matter), distributed throughout the galaxy, that does not interact with light.
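To get a feel for the numbers, here is a minimal sketch that evaluates \(M_{Galaxy}=\frac{v^2R_{Galaxy}}{G}\); the orbital speed and galactic radius used below are assumed round values chosen only for illustration, not the measured figures the text refers to.

```python
# Rough, illustrative estimate of M(R_Galaxy) = v^2 * R_Galaxy / G.
# The values of v and R_galaxy below are assumed round numbers for this example.
G = 6.674e-11                  # m^3 kg^-1 s^-2
v = 220e3                      # assumed (flat) orbital speed, m/s
ly = 9.461e15                  # one light-year in meters
R_galaxy = 5.0e4 * ly          # assumed galactic radius, m

M_enclosed = v**2 * R_galaxy / G
M_sun = 1.989e30
print(f"M(R_Galaxy) ≈ {M_enclosed:.2e} kg ≈ {M_enclosed/M_sun:.1e} solar masses")
```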

In the 1930s astronomers measured the orbital speeds of galaxies in galaxy clusters. These galaxies move in roughly circular orbits around the galaxy cluster’s center of mass. An astrophysicist named Fritz Zwicky measured the orbital speeds of a few dozen galaxies in an enormous galaxy cluster called the Coma cluster. For any one of these galaxies orbiting (in a circle) around the Coma cluster’s center of mass with a radius of orbit \(r\), astronomers can measure (from the luminosities of the galaxies) the mass \(M_{OM}\) of all the ordinary matter enclosed in the sphere of radius \(r\) surrounding the center of mass. Using the equation \(v=\sqrt{\frac{M_{OM}G}{r}}\), we can calculate what the orbital speed of that galaxy “should” be if \(M_{OM}\) were the only mass there. Zwicky found that these galaxies had speeds greater than the escape speed given by \(v_{esc}=\sqrt{\frac{2M_{OM}G}{r}}\). Subsequent measurements in the decades following Zwicky’s work yielded analogous results for other galaxy clusters.
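To see why speeds above the escape speed are so striking, here is a minimal sketch comparing the circular-orbit speed \(v=\sqrt{\frac{M_{OM}G}{r}}\) with the escape speed \(v_{esc}=\sqrt{\frac{2M_{OM}G}{r}}\) for the same enclosed mass. The mass and radius below are placeholder values, not Zwicky’s actual measurements; the point is simply that \(v_{esc}\) exceeds the expected orbital speed by only a factor of \(\sqrt{2}\).

```python
import math

# Minimal sketch: circular orbital speed vs. escape speed for an enclosed mass M_OM
# at orbital radius r. All numbers are placeholders chosen only for illustration.
G = 6.674e-11          # m^3 kg^-1 s^-2
M_sun = 1.989e30
M_OM = 1.0e14 * M_sun  # assumed visible mass enclosed within r (placeholder)
Mpc = 3.086e22         # one megaparsec in meters
r = 1.0 * Mpc          # assumed orbital radius (placeholder)

v_circ = math.sqrt(G * M_OM / r)       # expected orbital speed if M_OM were all the mass
v_esc = math.sqrt(2 * G * M_OM / r)    # escape speed for the same M_OM and r
print(f"v_circ ≈ {v_circ/1000:.0f} km/s, v_esc ≈ {v_esc/1000:.0f} km/s (ratio = sqrt(2))")
```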

Based on the measured mass of all the ordinary matter in our galaxy (which we call \(M_{OM}\)), the gravitational force \(F_{OM,m}=G\frac{M_{OM}m}{R^2_{Galaxy}}\) exerted on a star located at the distant outskirts of our galaxy (where \(r=R_{Galaxy}\)) is too small to keep it in orbit at its measured orbital speed \(v\) (which, because \(v\) does not fall off as \(\frac{1}{\sqrt{r}}\), is still large at \(R_{Galaxy}\)). The formula \(v=\sqrt{\frac{MG}{r}}\) applies to any object in a circular orbit around a spherically symmetric mass distribution, where \(M\) is the mass enclosed within the orbital radius \(r\); since the stars really do go around the center of our galaxy in roughly circular orbits, the formula must apply. Given that the orbital speed \(v\) of the stars in our galaxy really is constant (as has been measured), it follows that \(\frac{M}{r}\) must be constant:

$$v^2=G\frac{M(r)}{r}⇒\frac{M(r)}{r}=\frac{v^2}{G}=k⇒M(r)=kr⇒M(r)∝r.$$
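The proportionality is easy to see numerically. The following sketch (with an assumed constant orbital speed, not a quoted measurement) tabulates the enclosed mass \(M(r)=\frac{v^2r}{G}\) at a few radii and shows it doubling whenever \(r\) doubles.

```python
# Minimal sketch: if the orbital speed v is the same at every radius, the enclosed
# mass M(r) = v^2 * r / G grows linearly with r. The speed is an assumed round value.
G = 6.674e-11          # m^3 kg^-1 s^-2
v = 220e3              # assumed constant orbital speed, m/s
ly = 9.461e15          # one light-year in meters

for r_kly in (10, 20, 40, 80):            # radii in thousands of light-years
    r = r_kly * 1e3 * ly
    M = v**2 * r / G                       # enclosed mass implied by the flat curve
    print(f"r = {r_kly:>2} kly -> M(r) ≈ {M:.2e} kg")   # doubles as r doubles
```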

There have been two proposed resolutions which modify our models to agree with experiment. The first is that Newton’s law of gravity must be modified over very large distances so that the force \(F_g\) does not diminish as rapidly as \(\frac{1}{r^2}\). The second hypothesis which would reconcile our models with observation is the claim that the mass \(M_{OM}\) due to all the ordinary matter in the galaxy (stars, planets, etc.) isn’t all the mass in the galaxy. This hypothesis is the claim that there is another kind of matter (which is given the name dark matter) which permeates the galaxy and doesn’t interact with light, but which has mass \(M_{DM}\). By Einstein’s mass-energy equivalence principle, it must also have energy \(E_{DM}\) and energy density \(ρ_{DM}\). Since dark matter has energy density \(ρ_{DM}=T^{00}\), it acts as a source of gravity and curves space and time according to the EFEs. Therefore, the path of a beam of light passing by a distribution of dark matter will be deflected from a straight line by an angle \(α\). Using General Relativity, we can derive a relationship between the angle of deflection \(α\) and the distribution of mass \(M_{DM}\). If dark matter could somehow be “isolated” from a galaxy, we could use this technique of gravitational lensing to prove or disprove the existence of dark matter.
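For the simplest case of a compact (point-like) mass, General Relativity gives the well-known deflection angle \(α=\frac{4GM}{c^2b}\), where \(b\) is the impact parameter of the light ray. The extended mass distributions discussed above require a more careful treatment, but as a minimal sketch the code below evaluates the point-mass result for a ray grazing the Sun, reproducing the classic deflection of roughly 1.75 arcseconds; the numerical constants are standard approximate values assumed for the example.

```python
import math

# Minimal sketch: light deflection by a compact mass M at impact parameter b,
# using the standard GR point-mass result alpha = 4GM/(c^2 b).
G = 6.674e-11          # m^3 kg^-1 s^-2
c = 2.998e8            # speed of light, m/s
M = 1.989e30           # mass of the Sun, kg
b = 6.96e8             # impact parameter ≈ solar radius, m

alpha = 4 * G * M / (c**2 * b)                     # deflection angle in radians
arcsec = alpha * 180 / math.pi * 3600
print(f"alpha ≈ {alpha:.2e} rad ≈ {arcsec:.2f} arcseconds")   # ≈ 1.75 arcsec
```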

Shortly after the Big Bang, the cosmic web was formed: a colossal structure made mostly of dark matter which spans the entire observable universe. Astronomers think that the cosmic web played an important role in the formation of galaxies and galaxy clusters. Some theories suggest that galaxy clusters sit where the filaments (strands) of the cosmic web intersect. Astronomers using the Hubble Space Telescope analyzed the deflection of galaxy light passing by an enormous filament extending 60 million light-years from the galaxy cluster MACS J0717; from the deflection of the light, they were able to determine where the filament is and what its distribution of mass looks like. Using additional data, astronomers were able to construct a three-dimensional model of the filamentary structure. (Click on this link for more details: https://www.youtube.com/watch?v=GjO0zdXqCU0)

We can also use this technique to determine the mass of one of the largest structures in the Universe: a supercluster of galaxies. By determining the masses of superclusters, we can then estimate the average mass and energy density of the Universe; it might then seem like a good idea to use the EFEs to determine the curvature of the entire Universe (not just local regions of it). The problem with this approach is that we are measuring the gravitational effects (namely, the deflection angles of light rays) caused by mostly visible objects (namely, galaxy superclusters) in order to determine the mass; but on cosmological scales there are great voids in the Universe which are totally dark. If there is dark matter hidden in these voids, our calculation of the total curvature of the Universe would be incorrect, since it would account only for the masses of galaxy superclusters and not for whatever dark matter might be hidden away in those voids.


This article is licensed under a CC BY-NC-SA 4.0 license.


Cosmic Microwave Background Radiation

The minuscule non-uniformities in the temperature (and therefore energy) of the photons coming from the CMBR tell us about the minuscule non-uniformities in matter density when the Universe was only about 300,000 years old. For roughly the first 300,000 years of the Universe’s life, the Cosmos was too hot for electrons to be captured by hydrogen and helium nuclei. This “soup” of electrons and atomic nuclei acted like an electrical conductor, and electrical conductors are opaque to light. Over the next (roughly) 100,000 years the Universe cooled enough for atomic nuclei to capture electrons, allowing photons, for the first time, to travel across long distances without being scattered. Around this time each photon scattered off of an electron for the last time; it would not interact with matter again for roughly another 13.5 billion years. The energy of each photon can be related to the strength of the gravitational field, and thus to the distribution of matter, in the region surrounding the last electron it scattered off of. When our detectors collect photons from the time of last scattering, they are “seeing” the CMBR. From the CMBR you’ll notice that some regions are cooler than others.

The cooler regions (bluer tint in Figure 3) correspond to collected photons with less energy and longer wavelengths. The region surrounding the electron that a “cold” photon last scattered off of has more matter present than the region surrounding the electron that a “hot” photon last scattered off of; the greater abundance of matter “robbed” the photon of more energy. Thus we can relate the non-uniformities in the CMBR to slight non-uniformities in matter density. The gravity exerted by regions of higher matter density on nearby particles “overpowered” the gravity exerted by regions of lower matter density, and over very long periods of time this slight imbalance led to the formation of galaxy clusters and superclusters. According to Newtonian gravity, if the matter density had been completely uniform, galaxy clusters would never have formed. Three centuries ago, Isaac Newton argued (using his law of gravity) that if the distribution of matter in the Universe were completely uniform, all of the matter would condense into a “great spherical mass”:

[include quote]

Therefore, according to Newtonian gravity, if the matter distribution had been completely uniform, structure (i.e. galaxy clusters, galaxies, stars, planets, etc.) would never have arisen.

The origin of the slight non-uniformity in matter density can be explained by quantum fluctuations at the beginning of the Universe. According to the time-energy uncertainty principle, there will always be particles randomly popping in and out of existence. Since particles randomly pop in and out of existence, at any instant of time there will always be slight non-uniformities in the distribution of matter and energy. Cosmologists speculate that during the interval when the Universe was between about \(10^{-37}\) seconds and \(10^{-35}\) seconds old (called the inflationary era), the fabric of space and time stretched apart faster than the speed of light. From the time-energy uncertainty principle we know that at the instant when the age of the Universe was \(t=10^{-37}\) seconds, the distribution of matter and energy was slightly non-uniform. Then, during the inflationary era, the space between particles expanded faster than the speed of light, and the particles became causally disconnected during this short period of time. You could imagine that inflation “blew up” and enlarged these non-uniformities while keeping the proportions of their separation distances the same. In the words of the cosmologist Max Tegmark, “When inflation stretched a subatomic region [of space] into what became our entire observable Universe, the density fluctuations that quantum mechanics [and the Uncertainty Principle in particular] had imprinted were stretched as well, to sizes of galaxies and beyond” (p. 107, Our Mathematical Universe).
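Returning to the statement that cooler regions of the CMBR correspond to photons with longer wavelengths, here is a minimal sketch using Wien’s displacement law; the present-day CMB temperature and the size of the fluctuation are standard textbook values assumed for illustration, not numbers quoted in this lesson.

```python
# Minimal sketch: a slightly cooler region of the CMBR has its blackbody spectrum
# peak at a slightly longer wavelength (Wien's law: lambda_peak = b / T).
# The temperature and fluctuation size below are typical textbook values (assumed).
b_wien = 2.898e-3          # Wien's displacement constant, m*K
T_cmb = 2.725              # present-day average CMB temperature, K
dT = 1e-5 * T_cmb          # a representative "one part in 100,000" fluctuation

for label, T in (("average spot", T_cmb), ("slightly cooler spot", T_cmb - dT)):
    lam = b_wien / T       # peak wavelength of the blackbody spectrum at temperature T
    print(f"{label}: T = {T:.6f} K, peak wavelength ≈ {lam*1e3:.6f} mm")
```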


This article is licensed under a CC BY-NC-SA 4.0 license.

References

1. Singh, Simon. Big Bang: The Origin of the Universe. New York: Harper Perennial, 2004. Print.

2. Wikipedia contributors. "Cosmic Microwave Background." Wikipedia, The Free Encyclopedia. Wikipedia, The Free Encyclopedia, 12 May 2017. Web. 18 May 2017.

3. Tegmark, Max. Our Mathematical Universe: My Quest for the Ultimate Nature of Reality. New York: Knopf, 2014. Print.