What is the logic/rationale behind the vector cross product?

I don’t think I ever understood the rationale behind this.

I get that the dot product a\cdot b=\|a\|\|b\|\cos\theta is derived from the cosine rule. (Do correct me if I’m wrong.)

However, I never really understood why a\times b=\|a\|\|b\|\sin\theta. Is there a proof for this?

In addition, how does the vector cross product lead to orthogonal vectors?

I remember learning that they do and I know how to solve equations using the respective formulas, but I never got why it is so.


Bye_World’s Second Treatise: On the Products of Vectors

Table of Contents

 The Dot Product
 The Cross Product
 The Real Skinny on the Cross Product
 The Wedge Product
 The Relationship between the Cross Product and Wedge Product
 A Quick Note on Further Products


This work is my completely insane attempt to try to fix whatever issue the downvoter to this answer had with my less comprehensive former answer. This answer is undoubtedly one of the longest on math.SE (actually, this is the 7th longest answer on math.SE as of this posting 😁) and goes far beyond what any reasonable person could want from an answer to the above question.

Enjoy. 😉


Our very first notions of vectors, the first objects we were taught in linear algebra, are tuples and oriented line segments. When we learned about these we very quickly were told that they are really the “same” objects. This facilitated our computations. The problem is, they really aren’t the same objects. In my opinion, just because we have a canonical way of associating with each tuple an oriented line segment and with each oriented line segment a tuple (once a basis is chosen for the oriented line segments) doesn’t mean that we don’t need to define our products on these objects individually. It does mean that after we do so, we need to confirm that the products are equivalent in their respective spaces. In this treatise, I will try to give both an algebraic and a geometric definition for each of our products.

Both the set of n-tuples and the set of oriented line segments in n-dimensional Euclidean space (along with the usual operations on those sets) are inner product spaces over the field of real numbers. That just means that they are vector spaces, with an inner product (called the dot product), and the scalars that we are allowed to multiply these vectors by are real numbers.

Now let’s go over exactly what tuples and oriented line segments are.


Tuples:

These are our purely algebraic vectors. An n-tuple is simply an ordered list of n real numbers. For instance, the space of 4-tuples, denoted \Bbb R^4, is the set of all objects of the sort (w,x,y,z) where w,x,y,z\in \Bbb R. This is a 4-dimensional vector space over the real numbers.

We add and scale tuples component-wise. That is, (x_1,x_2,\dots,x_n)+(y_1,y_2,\dots,y_n)=(x_1+y_1,x_2+y_2,\dots,x_n+y_n) and \alpha(x_1,\dots,x_n)=(\alpha x_1,\dots,\alpha x_n).

One of the interesting properties of \Bbb R^n is that it comes equipped with a natural orthonormal basis. For instance, in \Bbb R^4 that basis is \{(1,0,0,0),(0,1,0,0),(0,0,1,0),(0,0,0,1)\}. We will soon be able to see that the norm of any of these vectors is 1.

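Instead of writing n-tuples as ordered lists with parentheses and commas, often it is more convenient to write them as row or column matrices: (1,2,3) \leftrightarrow \begin{bmatrix} 1 & 2 & 3\end{bmatrix} \leftrightarrow \begin{bmatrix} 1 \\ 2 \\ 3\end{bmatrix}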

The way that we choose to write down the numbers doesn’t matter when it comes to scalar multiplication or vector addition. The benefit of writing tuples as matrices is that performing any linear transformation on a tuple is equivalent to multiplying a row (or column) matrix by a unique n\times m (or m\times n) matrix.

Oriented Line Segments:

Oriented line segments, also known as translation vectors, are the elements of Euclidean space. I’m going to call this space \Bbb L^n (L for “line segment”). This is not a standard notation; it’s just the one I like. These objects are characterized by a specific length and a specific direction.

Vector addition in this case is given by the parallelogram rule

[Image: the parallelogram rule for vector addition]

and scalar multiplication is done by scaling the length of the line segment, preserving its orientation if scaled by a number >0 or negating its orientation if scaled by a number <0

[Image: scalar multiplication of an oriented line segment]

Notice that I have not said anything here about the location of a vector. That is because oriented line segments do not have any intrinsic location, they just exist in space.

There are some interesting things about these vectors. For one thing, every vector is specified in part by its norm, also known as its length. Thus while in most inner product spaces the inner product induces a norm, in this space the norm exists without ever needing to specify an inner product, though we will define one in a bit: the dot product. Other interesting properties that are more fundamental than in less geometric vector spaces are the angles between vectors and the ideas of parallelness and perpendicularity. All of these things exist even without ever defining that dot product.

The Dot Product

The dot product is the inner product of \Bbb R^n and \Bbb L^n.

The Algebraic Definition:

The dot product on \Bbb R^n is defined as follows: given v,w\in \Bbb R^n where v=(v_1,\dots,v_n) and w=(w_1,\dots,w_n), the dot product, denoted v\cdot w, is defined by v\cdot w=\sum_{i=1}^n v_iw_i = v_1w_1+v_2w_2+\cdots+v_nw_n
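The algebraic definition translates almost word for word into code. Here is a minimal sketch in Python, assuming tuples of plain numbers; the function name `dot` is my own choice for illustration and not part of the definition.

```python
def dot(v, w):
    """Algebraic dot product of two n-tuples: the sum of v_i * w_i."""
    if len(v) != len(w):
        raise ValueError("both tuples must have the same length")
    return sum(vi * wi for vi, wi in zip(v, w))


# 1*4 + 2*5 + 3*6
print(dot((1, 2, 3), (4, 5, 6)))  # 32
```

The length check is there because the sum is only defined when both tuples live in the same \Bbb R^n.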

The idea here even generalizes pretty well to real-valued functions of a real variable.

The Geometric Definition:

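I’m going to define the dot product of two vectors \vec v,\vec w\in \Bbb L^n in a slightly nonstandard way. In my opinion, this definition is more intuitive geometrically.

I define the dot product, \vec v\cdot \vec w, by \vec v\cdot \vec w=\operatorname{sproj}_{\vec w}(\vec v)\,\|\vec w\|=\operatorname{sproj}_{\vec v}(\vec w)\,\|\vec v\|

Where the scalar projection operation, \operatorname{sproj}, is defined by \operatorname{sproj}_{\vec w}(\vec v)=\begin{cases} \|\operatorname{proj}_{\vec w}\vec v\|, & \text{the angle between } \vec v \text{ and } \vec w \text{ is } \le \frac{\pi}{2} \\ -\|\operatorname{proj}_{\vec w}\vec v\|, & \text{the angle between } \vec v \text{ and } \vec w \text{ is } \gt \frac{\pi}{2}\end{cases}

Here \|\vec v\| denotes the length of the vector \vec v and \operatorname{proj}_{\vec w}\vec v is the projection of the vector \vec v onto the subspace \operatorname{span}(\vec w). Thus I define the dot product in terms of length and orthogonal projection and not the other way around. (Note: How I define orthogonal projection for the particular space \Bbb L^n without making use of the dot product is beyond the scope of this answer.)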

[Image: the orthogonal projection of one vector onto another]

While I hope that my definition of the dot product on \Bbb L^n catches on and I start seeing it in all the new linear algebra texts ;), I should probably mention the standard definition. The standard definition of the dot product on \Bbb L^n is \vec v\cdot \vec w=\|\vec v\|\|\vec w\|\cos(\theta), where \theta is the angle between \vec v and \vec w. This can be seen to be equivalent to my definition because the signed length of \operatorname{proj}_{\vec w}\vec v is \|\vec v\|\cos(\theta). You can see this in this image

[Image: the projection of \vec v onto \vec w, with signed length \|\vec v\|\cos(\theta)]

Thus, one could certainly make the argument that the definition \vec v\cdot \vec w=\|\vec v\|\|\vec w\|\cos(\theta) is just a more compact version of my own definition. I make that concession, but in my mind a direct association of the dot product and projections is the most intuitive way to define it.
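A quick numerical sanity check that the geometric and algebraic definitions agree, at least in the plane. This is only a sketch under my own assumptions: vectors are given as coordinate pairs in an orthonormal basis, and the angle is read off from polar angles with `math.atan2`, so the dot product is never used to compute the angle. All function names are made up for illustration.

```python
import math


def norm(v):
    """Length of a plane vector given by its coordinates."""
    return math.hypot(v[0], v[1])


def angle_between(v, w):
    """Angle in [0, pi] between two nonzero plane vectors, from polar angles."""
    theta = abs(math.atan2(v[1], v[0]) - math.atan2(w[1], w[0]))
    return min(theta, 2 * math.pi - theta)


def sproj(v, w):
    """Signed length of the projection of v onto span(w)."""
    return norm(v) * math.cos(angle_between(v, w))


def dot_geometric(v, w):
    """Scalar projection of v onto w, times the length of w."""
    return sproj(v, w) * norm(w)


def dot_algebraic(v, w):
    """Sum of products of corresponding components."""
    return v[0] * w[0] + v[1] * w[1]


v, w = (-1.0, 2.0), (3.0, 1.0)
print(dot_geometric(v, w), dot_algebraic(v, w))
```

Up to floating-point rounding, the two printed values coincide, and swapping the roles of v and w in `dot_geometric` gives the same number.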

Algebraic Properties:

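Both of these dot products share some important algebraic properties. These all can be proven from the definitions. I will simply list them here. Given any vectors u,v,w in either \Bbb R^n or \Bbb L^n and any k\in \Bbb R:

\begin{array}{lcr} (1) & u \cdot v = v\cdot u & \left(\text{commutativity}\right) \\ (2) & u\cdot(v+w) = u\cdot v + u\cdot w & \left(\text{distributivity}\right) \\ (3) & k(u\cdot v) = (ku)\cdot v = u\cdot (kv) & \left(\begin{array}{c}\text{interacts well with} \\ \text{scalar multiplication}\end{array}\right)\end{array}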


One of the major mathematical benefits of defining an inner product is that it induces a norm on the vector space. As mentioned earlier, \Bbb L^n comes pre-equipped with a norm, but \Bbb R^n doesn’t. So in \Bbb R^n, the norm of a vector x is defined by \|x\|=\sqrt{x\cdot x}. I leave it as an exercise for the reader to confirm that the dot product on \Bbb R^n is positive definite, and thus we don’t need to worry about a negative under the radical.

Another major application is that the dot product provides a convenient way of defining orthogonality in \Bbb R^n and testing for orthogonality (perpendicularity) in \Bbb L^n. We define orthogonality, denoted v\ \bot\ w, in \Bbb R^n by v\ \bot\ w \iff v\cdot w=0
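With the norm and the orthogonality test in hand, we can check the earlier claim that the natural basis of \Bbb R^4 is orthonormal. A small sketch; the helper names and the tolerance parameter `tol` are my own additions for illustration.

```python
import math
from itertools import combinations


def dot(v, w):
    """Algebraic dot product of two n-tuples."""
    return sum(a * b for a, b in zip(v, w))


def norm(x):
    """Norm induced by the dot product: the square root of x . x."""
    return math.sqrt(dot(x, x))


def is_orthogonal(v, w, tol=1e-12):
    """v and w are orthogonal exactly when v . w = 0 (up to a tolerance)."""
    return abs(dot(v, w)) <= tol


basis = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]

print(all(norm(e) == 1.0 for e in basis))
print(all(is_orthogonal(e, f) for e, f in combinations(basis, 2)))
```

Both lines print `True`: every basis vector has norm 1 and every pair of distinct basis vectors is orthogonal.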

The dot product is useful in physics when you only want to know about the components of a vector in a specific direction. For instance, the work done along a straight line in a gravitational field is defined as W=\vec F\cdot \vec r. Here \vec F is just the force due to gravity, \vec F=mg\hat e_3, and \vec r is the vector which describes the straight line path a particle moves in. But when calculating work, we really only care about the projection of \vec r in the direction of the force (\hat e_3 in this case). Thus it makes sense that we’d use the dot product to define it.

The Cross Product

Algebraic Definition:

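The cross product is defined as the unique vector b\times c\in \Bbb R^3 such that a\cdot(b\times c)=\det(a,b,c) for all a\in \Bbb R^3, where \det(a,b,c) is the determinant of the matrix whose columns are a, b, and c. This is an implicit definition. However, it can be shown to be equivalent to b\times c= (b_2c_3-b_3c_2,\ b_3c_1-b_1c_3,\ b_1c_2-b_2c_1).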

Note: I don’t directly define b\times c as the vector (b_2c_3-b_3c_2,\ b_3c_1-b_1c_3,\ b_1c_2-b_2c_1) because (1) it’s harder to remember than my definition and (2) my definition immediately tells readers who are familiar with determinants several properties of the cross product, for instance that v\ \bot\ v\times w for all v,w\in \Bbb R^3 and v\times w = -w\times v for all v,w\in\Bbb R^3.

Explicit Formula for the Cross Product:

Uniqueness Lemma: If v and v' are two vectors in \Bbb R^3 such that a\cdot v = \det(a,b,c) = a\cdot v',\ \forall a\in \Bbb R^3, then v=v'.

Proof: Subtract a\cdot v' = \det(a,b,c) from a\cdot v = \det(a,b,c) to get a\cdot v-a\cdot v' = 0,\quad \forall a \\ \implies a\cdot(v-v')=0,\quad \forall a \\ \implies a\ \bot\ (v-v'),\quad \forall a

But the only vector orthogonal to a for all a is the zero vector. Thus v-v'=0 \\ \implies v=v'. Therefore if any vector v satisfies the equation a\cdot v= \det(a,b,c),\ \forall a, then it is unique.\ \ \ \square

Thus we only need to prove that b\times c = (b_2c_3-b_3c_2,\ b_3c_1-b_1c_3,\ b_1c_2-b_2c_1) is a vector that satisfies the definition to show that this is the unique cross product defined above.

Lemma: (a_1, a_2, a_3)\cdot (b_2c_3-b_3c_2,\ b_3c_1-b_1c_3,\ b_1c_2-b_2c_1) = \det\begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3\end{bmatrix}

Proof: On the LHS we get (a_1, a_2, a_3)\cdot (b_2c_3-b_3c_2,\ b_3c_1-b_1c_3,\ b_1c_2-b_2c_1) = a_1(b_2c_3-b_3c_2) + a_2(b_3c_1-b_1c_3) + a_3(b_1c_2-b_2c_1). On the RHS, expanding along the left column, we get \begin{align}\det\begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3\end{bmatrix} &= a_1\left|\begin{matrix} b_2 & c_2 \\ b_3 & c_3\end{matrix}\right| - a_2\left|\begin{matrix} b_1 & c_1 \\ b_3 & c_3\end{matrix}\right| + a_3\left|\begin{matrix} b_1 & c_1 \\ b_2 & c_2\end{matrix}\right| \\ &= a_1(b_2c_3-b_3c_2) - a_2(b_1c_3-b_3c_1) + a_3(b_1c_2-b_2c_1) \\&= a_1(b_2c_3-b_3c_2) + a_2(b_3c_1-b_1c_3) + a_3(b_1c_2-b_2c_1)\end{align}

This proves that b\times c= (b_2c_3-b_3c_2,\ b_3c_1-b_1c_3,\ b_1c_2-b_2c_1) is in fact a vector which satisfies our definition.\ \ \ \ \square

Therefore this is the unique cross product that satisfies the definition.
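For readers who prefer to see the lemma on concrete numbers, here is a small sketch that checks a\cdot(b\times c)=\det(a,b,c) for one choice of vectors. The function names are my own; `det3` is written out as the cofactor expansion along the left column used in the proof.

```python
def dot(a, b):
    """Algebraic dot product of two 3-tuples."""
    return sum(x * y for x, y in zip(a, b))


def cross(b, c):
    """Explicit component formula for b x c."""
    b1, b2, b3 = b
    c1, c2, c3 = c
    return (b2 * c3 - b3 * c2, b3 * c1 - b1 * c3, b1 * c2 - b2 * c1)


def det3(a, b, c):
    """Determinant of the 3x3 matrix with columns a, b, c (left-column expansion)."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    c1, c2, c3 = c
    return (a1 * (b2 * c3 - b3 * c2)
            - a2 * (b1 * c3 - b3 * c1)
            + a3 * (b1 * c2 - b2 * c1))


a, b, c = (1, 2, 3), (4, 5, 6), (7, 8, 10)
print(cross(b, c))                           # (2, 2, -3)
print(dot(a, cross(b, c)), det3(a, b, c))    # -3 -3
print(dot(b, cross(b, c)), dot(c, cross(b, c)))  # 0 0
```

The last line illustrates the orthogonality that the determinant definition promises: putting a=b or a=c gives a matrix with a repeated column, whose determinant is 0.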

Geometric Definition:

To define a product on \Bbb L^3, we simply need to define the length and orientation of the product in terms of the input vectors. Here is our definition:

Given two vectors \vec v, \vec w\in \Bbb L^3, we define a third vector \vec v \times \vec w as the vector whose length is given by the area of the parallelogram with sides \vec v and \vec w and whose direction is orthogonal to both \vec v and \vec w, as determined by the right-hand rule.

[Image: the cross product \vec v\times \vec w, orthogonal to both \vec v and \vec w]

Lemma: \|\vec a\times \vec b\|=\|\vec a\|\|\vec b\|\sin(\theta) where \theta is the angle between vectors \vec a, \vec b \in \Bbb L^3.

Proof: First we remember a fact from geometry: for constant cross sectional areas, the total area is given by “base times height”. This is a direct consequence of Cavalieri’s principle. Then from the following image:

[Image: the parallelogram with sides \vec a and \vec b, base \|\vec b\| and height \|\vec a\|\sin(\theta)]

we can see that the base is \|\vec b\| and the height is \|\vec a\|\sin(\theta). Therefore the area of the parallelogram — and thus the length of the vector \vec a \times \vec b — is \|\vec a\|\|\vec b\|\sin(\theta). \ \ \ \ \square
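We can also check numerically that the component formula for the cross product produces a vector of this length. A sketch under my own assumptions: the two vectors are built in the e_1,e_2 plane with a known angle \theta between them, so the angle is an input and not something derived from the product being tested.

```python
import math


def cross(a, b):
    """Explicit component formula for a x b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(v):
    """Euclidean length of a 3-tuple."""
    return math.sqrt(sum(x * x for x in v))


theta = math.pi / 6  # angle between a and b, chosen by construction
a = (2.0, 0.0, 0.0)                                      # length 2 along e_1
b = (3.0 * math.cos(theta), 3.0 * math.sin(theta), 0.0)  # length 3 at angle theta

area_from_cross = norm(cross(a, b))
area_from_formula = norm(a) * norm(b) * math.sin(theta)
print(math.isclose(area_from_cross, area_from_formula))
```

Here both numbers come out to 3 up to rounding, and the resulting vector points along e_3, orthogonal to the plane containing a and b.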

Algebraic Properties:

Both of these cross products share some important algebraic properties. These all can be proven from the definitions. I will simply list them here. Given any vectors u,v,w in either \Bbb R^3 or \Bbb L^3 and any k\in \Bbb R:

\begin{array}{lcr} (1) & u \times v = -v\times u & \left(\text{anticommutativity}\right) \\ (2) & u\times(v+w) = u\times v + u\times w & \left(\text{distributivity}\right) \\ (3) & k(u\times v) = (ku)\times v = u\times (kv) & \left(\begin{array}{c}\text{interacts well with} \\ \text{scalar multiplication}\end{array}\right) \\ (4) & u\times(v\times w) + v\times(w\times u) + w\times(u\times v) = 0 & \left(\text{Jacobi identity}\right)\end{array}

One more property, which is actually a consequence of the anticommutativity of the cross product (can you prove it?), is that for any vector v, we have v\times v=0.

One important property that the cross product doesn’t have is associativity. Consider the triple products u\cdot (v\cdot w) and u\times (v\times w). Without much effort we can see that u\cdot (v\cdot w) is undefined. This is because v\cdot w is a scalar and then the dot product of a vector and a scalar is undefined by our above definition. However, u\times (v\times w) is defined. Knowing it’s defined, our next question should be “are the parentheses necessary?” Yes. In general, u\times (v\times w) is not equal to (u\times v)\times w.
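A concrete counterexample makes the failure of associativity easy to see, and the same few lines can spot-check the Jacobi identity. This is just a sketch with integer vectors so that every comparison is exact; the helper names are mine.

```python
def cross(a, b):
    """Explicit component formula for a x b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def add(*vectors):
    """Component-wise sum of any number of 3-tuples."""
    return tuple(sum(components) for components in zip(*vectors))


e1, e2 = (1, 0, 0), (0, 1, 0)

# The parentheses matter: these two triple products differ.
print(cross(e1, cross(e1, e2)))  # (0, -1, 0)
print(cross(cross(e1, e1), e2))  # (0, 0, 0)

# Jacobi identity on one sample triple of vectors.
u, v, w = (1, 2, 3), (4, 5, 6), (7, 8, 10)
jacobi = add(cross(u, cross(v, w)),
             cross(v, cross(w, u)),
             cross(w, cross(u, v)))
print(jacobi)  # (0, 0, 0)
```

The first product is -e_2 while the second is the zero vector, because e_1\times e_1=0 kills everything before e_2 ever enters.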


The mathematical and physical significance of the cross product, v\times w, is that it provides a vector orthogonal to the plane \operatorname{span}(v,w).

For instance, using another physics example, we can experimentally determine that a charged particle moving through a constant magnetic field will instantaneously feel a force in a direction orthogonal to both the direction it is moving in at that instant and the direction of the magnetic field. So it shouldn’t surprise you that the definition of the magnetic force is \vec F_m = q(\vec v\times \vec B), which is just the cross product of the velocity vector \vec v (pointing in the direction the particle is moving) and the magnetic field (pseudo)vector \vec B, scaled by some number q.

The Real Skinny on the Cross Product

The cross product is actually a terrible product, though. Let’s list some of the reasons why.

  1. It’s not commutative, but because it’s anticommutative that’s not that big of a deal. Anticommutative things are actually pretty useful in mathematics (and physics).
  2. It’s not associative, but it does obey the Jacobi identity so I guess that’s sort of OK. It’s not great, though.
  3. The thing that we get from the cross product isn’t really a vector. It’s just a pretender. It’s an object that looks really, really similar to a vector, but doesn’t quite behave right under reflections. If you’re interested, ask your professor about this. The name for this type of object is pseudovector.
    Note: the fact that the cross product of two vectors isn’t a vector isn’t actually a problem. After all, the dot product isn’t either. The problem is that there is no standard notation which distinguishes pseudovectors from vectors. So you just have to keep in mind what type of object you’re working with.
  4. But the biggest problem, the most awful thing about the cross product is that it is only defined in 3 dimensions. That’s terrible. Linear algebra works in any finite dimension (infinite dimensional linear algebra is called functional analysis) but we have a product which only works in 3-dimensions? This is not a good product.

It honestly astounds me that we keep using it to this day. Really the cross product should be replaced by something else: the wedge product.

The Wedge Product

Let’s talk about the wedge product. One thing to note in this section is that I will not provide an algebraic definition. That’s not because there isn’t one, it simply requires slightly more math than I’m willing to believe OP will understand. And even more importantly, it’s not necessary. As long as we can determine the key algebraic properties of the wedge product, we’ll have all we need to work with it.

Geometric Motivation:

First, let’s go back over some of the properties of the vectors in \Bbb L^n. These elements are oriented line segments. This means that every vector \vec v\in \Bbb L^n

  1. has a specific length, denoted \|\vec v\|
  2. is parallel to a unique line through the origin (except the zero vector, but zero has weird properties in any set of objects)
  3. points in one of the two directions along that line

Vectors (line segments) can be scaled by numbers and added together with the parallelogram rule:

[Image: the parallelogram rule for vector addition]

We could make a similar definition for oriented plane segments. These bivectors would be elements of a space denoted \Lambda \Bbb L^3. A bivector B is an object that

  1. has a specific area, denoted \|B\|
  2. is parallel to a unique plane containing the origin (except the zero bivector)
  3. has one of two orientations that are a little harder to visualize than with line segments

Bivectors (which can be visualized as parallelograms in space) can be scaled by numbers and added together via a generalized version of the parallelogram rule:

[Image: addition of two bivectors]

I can’t find a picture of scalar multiplication of a bivector, but just imagine a parallelogram getting bigger (scaling by a number whose absolute value is \gt 1) or smaller (scaling by a number whose absolute value is \lt 1).

We can also imagine higher dimensional objects, like trivectors, etc. A trivector is just an oriented volume segment (a parallelepiped with an orientation). Here’s a little image to show you how these objects progress:

[Image: the progression from vector to bivector to trivector]

Geometric Definition:

The wedge product is the operation we use to make bivectors out of vectors.

Given two vectors \vec v,\vec w\in \Bbb L^n, we define the bivector \vec v \wedge \vec w\in \Lambda \Bbb L^n as the oriented plane segment whose area(/norm) is equal to the area of the parallelogram with sides \vec v and \vec w, whose direction is parallel to the plane \operatorname{span}(\vec v, \vec w) (if this is a plane, otherwise \vec v \wedge \vec w = 0), and whose orientation is given by the order of the factors. Here’s an image to help you visualize it:

[Image: the wedge product \vec v\wedge \vec w as an oriented parallelogram]

That’s all we need to uniquely define a bivector. Higher dimensional n-vectors are defined analogously.

Algebraic Properties:

This wedge product has some important algebraic properties. These all can be proven from the definition. I will simply list them here. Given any vectors u,v,w in either \Bbb R^n or \Bbb L^n and any k\in \Bbb R:

\begin{array}{lcr} (1) & u \wedge v = -v\wedge u & \left(\text{anticommutativity}\right) \\ (2) & u\wedge(v\wedge w) = (u\wedge v)\wedge w & \left(\text{associativity}\right) \\ (3) & u\wedge(v+w) = u\wedge v + u\wedge w & \left(\text{distributivity}\right) \\ (4) & k(u\wedge v) = (ku)\wedge v = u\wedge (kv) & \left(\begin{array}{c}\text{interacts well with} \\ \text{scalar multiplication}\end{array}\right)\end{array}

One more property, which is actually a consequence of the anticommutativity of the wedge product (can you prove it?), is that for any vector v, we have v\wedge v=0.

Also note that this product is defined in \Bbb L^n or \Bbb R^n for any n (yay!).


The major application of the wedge product, and the n-vectors it generates, is in representing subspaces of \Bbb R^n and \Bbb L^n as elements of \Lambda \Bbb R^n and \Lambda \Bbb L^n, respectively.

One of the consequences of this representational property is that we can define the determinant of a linear transformation f: \Bbb R^n \to \Bbb R^n as f(v_1) \wedge f(v_2) \wedge \cdots \wedge f(v_n) = \det(f)v_1\wedge v_2 \wedge \cdots \wedge v_n where v_1, \dots, v_n are n linearly independent vectors in \Bbb R^n. Intuitively this just says that the linear transformation f scales the (signed) volume of an n-dimensional parallelotope by a factor of \det(f).
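In the plane this determinant formula is easy to try out by hand or in code. The sketch below is my own illustration and only covers n=2, where a\wedge b has the single component a_1b_2-a_2b_1 on e_1\wedge e_2; the matrix and the vectors are arbitrary sample values.

```python
def wedge2(a, b):
    """Coefficient of e_1 wedge e_2 in a wedge b, for plane vectors a and b."""
    return a[0] * b[1] - a[1] * b[0]


def apply(matrix, v):
    """Apply the linear map given by a tuple of rows to the vector v."""
    return tuple(sum(entry * x for entry, x in zip(row, v)) for row in matrix)


def det_via_wedge(matrix, v1, v2):
    """Ratio of f(v1) wedge f(v2) to v1 wedge v2; v1 and v2 must be independent."""
    return wedge2(apply(matrix, v1), apply(matrix, v2)) / wedge2(v1, v2)


f = ((2, 1),
     (0, 3))  # sample map with determinant 2*3 - 1*0 = 6

print(det_via_wedge(f, (1, 0), (0, 1)))  # 6.0
print(det_via_wedge(f, (1, 1), (0, 2)))  # 6.0
```

The ratio is the same for both pairs of input vectors, which is the point of the formula: the scaling factor depends only on f, not on which parallelogram you feed it.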

We can also replace all of the instances of the cross product in physics with the wedge product (or a combination of the wedge product and a generalization of the dot product that is defined on n-vectors) — possibly with some modification — to get formulas which work not only in \Bbb R^3 but also in 4-dimensional Euclidean space (useful in advanced classical mechanics) and Minkowski space (useful in special/ general relativity). I also personally feel that these new formulas are more intuitive than the standard ones — once one has the necessary mathematics understood (example: magnetic bivector fields make way more sense to me than magnetic (pseudo)vector fields).

The Relationship Between the Cross Product and the Wedge Product

I told you that the cross product should be replaced by this wedge product but I haven’t really told you what the relationship between them is.

The exact relationship between them is something called duality (in \Bbb R^3 and \Bbb L^3 only), and it’d take even more math to explain that. As this is already a crazy long post, I’ll instead just show you a couple of ways in which they are related.

The first thing I want to point out is that we know the cross product a\times b has a length equal to the area of the parallelogram with sides a and b. But remember, we defined the area of the wedge product a\wedge b to also be the area of the parallelogram with sides a and b. Thus \|a\times b\| = \|a\wedge b\|.

But norm isn’t everything. Let’s look at the components. Remember that the components of the cross product (once you work it all out) are a\times b= (\color{red}{a_2b_3-a_3b_2})e_1 + (\color{purple}{a_3b_1-a_1b_3})e_2 +(\color{blue}{a_1b_2-a_2b_1})e_3 where \{e_1, e_2, e_3\} is an orthonormal basis for \Bbb R^3 or \Bbb L^3. Let’s work out the components of the wedge product using our above rules: \begin{align} a\wedge b &= (a_1e_1 + a_2e_2 + a_3e_3)\wedge (b_1e_1 + b_2e_2 + b_3e_3) \\ &= (a_1b_1)e_1\wedge e_1 + (a_1b_2)e_1\wedge e_2 + (a_1b_3)e_1\wedge e_3 + (a_2b_1)e_2\wedge e_1 + (a_2b_2)e_2\wedge e_2 + (a_2b_3)e_2\wedge e_3 + (a_3b_1)e_3\wedge e_1 + (a_3b_2)e_3\wedge e_2 + (a_3b_3)e_3\wedge e_3 \\ &= 0 + (a_1b_2)e_1\wedge e_2 + (-a_1b_3)e_3\wedge e_1 + (-a_2b_1)e_1\wedge e_2 + 0 + (a_2b_3)e_2\wedge e_3 + (a_3b_1)e_3\wedge e_1 + (-a_3b_2)e_2\wedge e_3 + 0 \\ &= (\color{red}{a_2b_3-a_3b_2})e_2\wedge e_3 + (\color{purple}{a_3b_1-a_1b_3})e_3\wedge e_1 + (\color{blue}{a_1b_2-a_2b_1})e_1\wedge e_2\end{align}

So you can see that in \Bbb R^3 and \Bbb L^3, the wedge product and the cross product have exactly the same components. Thus if you wanted to create the cross product out of the wedge product you’d just need to do the operation \pmatrix{e_2\wedge e_3 \\ e_3\wedge e_1 \\ e_1 \wedge e_2} \mapsto \pmatrix{e_1 \\ e_2 \\ e_3}
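The component computation above works in any dimension, and the relabeling of basis bivectors is a one-liner in \Bbb R^3. Here is a sketch of both. The representation is my own choice for illustration: a bivector is stored as a dictionary from index pairs (i, j) with i < j (0-based) to the coefficient of e_i\wedge e_j.

```python
from itertools import combinations


def wedge(a, b):
    """Components of a wedge b on the basis e_i wedge e_j, i < j (0-based)."""
    n = len(a)
    return {(i, j): a[i] * b[j] - a[j] * b[i]
            for i, j in combinations(range(n), 2)}


def cross_from_wedge(a, b):
    """Relabel e2^e3 -> e1, e3^e1 -> e2, e1^e2 -> e3 (3 dimensions only)."""
    w = wedge(a, b)
    # The stored key (0, 2) is e1^e3 = -(e3^e1), hence the sign flip.
    return (w[(1, 2)], -w[(0, 2)], w[(0, 1)])


a, b = (1, 2, 3), (4, 5, 6)
print(wedge(a, b))             # {(0, 1): -3, (0, 2): -6, (1, 2): -3}
print(cross_from_wedge(a, b))  # (-3, 6, -3)

# Unlike the cross product, the wedge is defined for any n:
print(len(wedge((1, 0, 2, 0), (0, 1, 0, 3))))  # 6 components in 4 dimensions
```

In 3 dimensions a bivector happens to have exactly 3 components, which is why it can be disguised as a vector; in 4 dimensions it has 6, and no such relabeling is available.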

A Quick Note on Further Products

The dot, cross, and wedge products are not the only products we can define on Euclidean vectors. Far from it.

For a more historically relevant precursor to the modern products, take a look at the Hamilton product of quaternions.

The two most important products on Euclidean vectors that I have not covered yet are the geometric product and the tensor product. Both the geometric and tensor products contain the wedge product as subproducts. However, these products require mathematical material that goes well beyond what I want to cover in this treatise and thus I will simply provide references.

For information on the geometric product, and the algebra (over the field of reals) that it creates, I’d recommend either of the books Linear and Geometric Algebra by Alan Macdonald or Clifford Algebra to Geometric Calculus by David Hestenes and Garret Sobczyk. Macdonald’s book is great if you’ve never taken a linear algebra course before. If you have, then you can probably handle the more advanced book by Hestenes and Sobczyk.

For information on the tensor product I’d recommend taking a look at the book Introduction to Vectors and Tensors, Volume I by Ray Bowen and C. Wang. The last couple chapters give a pretty good introduction to tensor algebra and if you decide to get the second volume you’ll have a good text on calculus on manifolds as well.

Source: Link, Question Author: Danxe, Answer Author: Community
