On “familiarity” (or How to avoid “going down the Math Rabbit Hole”?)

Anyone trying to learn mathematics on his/her own has had the experience of “going down the Math Rabbit Hole.”

For example, suppose you come across the novel term vector space, and want to learn more about it. You look up various definitions, and they all refer to something called a field. So now you’re off to learn what a field is, but it’s the same story all over again: all the definitions you find refer to something called a group. Off to learn about what a group is. Ad infinitum. That’s what I’m calling here “to go down the Math Rabbit Hole.”

Upon first encountering the situation described above one may think: “well, if that’s what it takes to learn about vector spaces, then I’ll have to toughen up, and do it.” I picked this particular example, however, because I’m sure that the course of action it envisions is one that is not just arduous: it is in fact utterly misguided.

I can say so with some confidence, for this particular case, thanks to some serendipitous personal experience. It turns out that, luckily for me, some kind calculus professor in college gave me the tip to take a course in linear algebra (something that I would have never thought of on my own), and therefore I had the luxury of learning about vector spaces without having to venture into the dreaded MRH. I did well in this class, and got a good intuitive grasp of vector spaces, but even after I had studied for my final exams (let alone the first day of class), I couldn’t have said what a field was. Therefore, from my experience, and that of pretty much all my fellow students in that class, I know that one does not need to know a whole lot about fields to get the hang of vector spaces. All one needs is a familiarity with some field (say $$\mathbb{R}$$).

Now, it’s hard to pin down more precisely what this familiarity amounts to. The only thing that I can say about it is that it is a state somewhere between, and quite distinct from, (a) the state right after reading and understanding the definition of whatever it is one wants to learn about (say, “vector spaces”), and (b) the state right after acing a graduate-level pure math course in that topic.

Even harder than defining this familiarity is coming up with an efficient way to attain it…

I’d like to ask all the math autodidacts reading this: how do you avoid falling into the Math Rabbit Hole? And more specifically, how do you efficiently attain enough familiarity with pre-requisite concepts to move on to the topics that you want to learn about?

PS: John von Neumann allegedly once said “Young man, in mathematics you don’t understand things. You just get used to them.” I think that this “getting used to things” is much of what I’m calling familiarity above. The problem of learning mathematics efficiently then becomes the problem of “getting used to things” quickly.

EDIT: Several answers and comments have suggested using textbooks rather than, say, Wikipedia, to learn math. But textbooks usually have the same problem. There are exceptions, such as Gilbert Strang’s books, which generally avoid technicalities and instead focus on the big picture. They are indeed ideal introductions to a subject, but they are exceedingly rare. For example, as I already mentioned in one comment, I’ve been looking for an intro book on homotopy theory that focuses on the big picture, to no avail; all the books I’ve found bristle with technicalities from the get-go: Hausdorff this, locally compact that, yadda yadda…

I’m sure that when one mathematician asks another for an introduction to some branch of math, the latter does not start spewing all these formal technicalities, but instead gives a big-picture account, based on simple examples. I wish authors of mathematics books sometimes wrote books in such an informal vein. Note that I’m not talking here about books written for math-phobes (in fact I detest it when a math book adopts a condescending “for-dummies”, “let’s-not-fry-our-little-brains-now” tone). Informal does not mean “dumbed down”. There’s a huge gap in the mathematics literature (at least in English), and I can’t figure out why.

(BTW, I’m glad that MJD brought up Strang’s Linear Algebra book, because it’s a concrete example that shows it’s not impossible to write a successful math textbook that stays on the big picture, and doesn’t fuss over technicalities. It goes without saying that I’m not advocating that all math books be written this way. Attention to such technical details, precision, and rigor are all essential to doing mathematics, but they can easily overwhelm an introductory exposition.)

Your example makes me think of graphs.

Imagine some nice, helpful fellow came along and made a big graph of every math concept ever, where each concept is one node and related concepts are connected by edges. Now you can take a copy of this graph and color a node green if you “know” that concept (unknowns stay grey).

How to define “know”? In this case, when somebody mentions that concept while talking about something, do you immediately feel confused and get the urge to look the concept up? If no, then you know it (funnily enough, you may be deluding yourself into thinking you know something that you completely misunderstand, and it would be classed as “knowing” based on this rule – but that’s fine and I’ll explain why in a bit). For purposes of determining whether you “know” it, try to assume that the particular thing the person is talking about isn’t some intricate argument that hinges on obscure details of the concept or bizarre interpretations – it’s just mentioned matter-of-factly, as a tangential remark.

When you are studying a topic, you are basically picking one grey node and trying to color it green. But you may discover that to do this, you must color some adjacent grey nodes first. So the moment you discover a prerequisite node, you go to color it right away, and put your original topic on hold. But this node also has prerequisites, so you put it on hold, and… What you are doing is known as a depth-first search (DFS). It’s natural for it to feel like a rabbit hole – you are trying to go as deep as possible. The hope is that sooner or later you will run into a wall of greens, which is when your long, arduous search will have borne fruit, and you will get to feel that unique rush of climbing back up the stack with your little jewel of a recursion-terminating return value.

Then you get back to coloring your original node and find out about the other prerequisite, so now you can do it all over again.
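To make the rabbit hole concrete, here is a minimal sketch of that depth-first habit. The prerequisite graph, the name `dfs_study_order`, and the choice of “set” as the only green node are all made up for illustration; real definitions depend on each other in messier ways.

```python
# Hypothetical prerequisite graph: each concept maps to the concepts
# its definition refers to. The entries are illustrative only.
PREREQS = {
    "vector space": ["field", "abelian group"],
    "field": ["abelian group", "ring"],
    "ring": ["abelian group"],
    "abelian group": ["group"],
    "group": ["set", "binary operation"],
    "binary operation": ["set", "function"],
    "function": ["set"],
    "set": [],
}


def dfs_study_order(topic, known, prereqs=PREREQS):
    """Chase every unknown prerequisite the moment it is discovered.

    Returns (order, max_depth): the order in which nodes turn green and
    the largest number of topics that were on hold at the same time.
    """
    green = set(known)  # nodes already colored green
    on_hold = []        # the stack of topics currently put on hold
    order = []
    max_depth = 0

    def visit(node):
        nonlocal max_depth
        if node in green or node in on_hold:
            return
        on_hold.append(node)
        max_depth = max(max_depth, len(on_hold))
        for prerequisite in prereqs.get(node, []):
            visit(prerequisite)  # drop everything and go deeper
        on_hold.pop()
        green.add(node)
        order.append(node)

    visit(topic)
    return order, max_depth


order, depth = dfs_study_order("vector space", known={"set"})
print(order)
print(depth)
```

In this toy graph the original topic is colored last, after six other nodes, and at the deepest point six topics are on hold at once. That stack of on-hold topics is the rabbit hole.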

DFS is suited for some applications, but it is bad for others. If your goal is to color the whole graph (i.e. learn all of math), any strategy will have you visit the same number of nodes, so it doesn’t matter as much. But if you are not seriously attempting to learn everything right now, DFS is not the best choice.

So, the solution to your problem is straightforward – use a more appropriate search algorithm!

Immediately obvious is breadth-first search (BFS). This means, when reading an article (or page, or book chapter), don’t rush off to look up every new term as soon as you see it. Circle it or make a note of it on a separate paper, but force yourself to finish your text even if it’s completely incomprehensible to you without knowing the new term. You will now have a list of prerequisite nodes, and can deal with them in a more organized manner.
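As a rough sketch of that note-taking discipline, the function below finishes one whole “layer” of reading before touching the next, and returns the unknown terms grouped by how far they are from the original topic. The graph, the function name `bfs_reading_list`, and the `max_depth` cutoff are assumptions made for the example.

```python
# Hypothetical prerequisite graph, for illustration only.
PREREQS = {
    "vector space": ["field", "abelian group"],
    "field": ["abelian group", "ring"],
    "ring": ["abelian group"],
    "abelian group": ["group"],
    "group": ["set", "binary operation"],
    "binary operation": ["set", "function"],
    "function": ["set"],
    "set": [],
}


def bfs_reading_list(topic, known, prereqs=PREREQS, max_depth=None):
    """Group unknown prerequisites by distance from `topic`.

    Level 1 holds the terms noted while reading about `topic` itself,
    level 2 the terms noted while reading about level 1, and so on.
    `max_depth` optionally stops the search after that many levels.
    """
    seen = {topic}
    frontier = [topic]
    levels = []
    while frontier:
        noted = []  # the "separate paper" for this pass
        for node in frontier:
            for prerequisite in prereqs.get(node, []):
                if prerequisite not in seen and prerequisite not in known:
                    seen.add(prerequisite)
                    noted.append(prerequisite)
        if noted:
            levels.append(noted)
        if max_depth is not None and len(levels) >= max_depth:
            break
        frontier = noted
    return levels


for distance, terms in enumerate(bfs_reading_list("vector space", {"set"}), 1):
    print(distance, terms)
```

Calling it with `max_depth=1` returns only the terms from the first reading, which is the list you would actually have on your separate paper after finishing the original text.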

Compared to your DFS, this already makes it much easier to avoid straying too far from your original area of interest. It also has another benefit which is not common in actual graph problems: often in math, and in general, understanding is cooperative. If you have a concept A which has prerequisite concepts B and C, you may find that B is very difficult to understand (it leads down a deep rabbit hole), but only if you don’t yet know the very easy topic C. Once you do know C, B becomes very easy to “get”, because you quickly figure out the salient and relevant points (or it may turn out that knowing either B or C is sufficient to learn A). In this case, you really don’t want a learning strategy that fails to make sure you do C before B!

BFS not only allows you to exploit cooperativities, but it also allows you to manage your time better. After your first pass, let’s say you ended up with a list of 30 topics you need to learn first. They won’t all be equally hard. Maybe 10 will take you 5 minutes of skimming Wikipedia to figure out. Maybe another 10 are so simple that the first Google Image diagram explains everything. Then there will be 1 or 2 which will take days or even months of work. You don’t want to get tripped up on the big ones while you have the small ones to take care of. After all, it may turn out that the big topic is not essential, but the small topic is. If that’s the case, you would feel very silly if you tried to tackle the big topic first! But if the small one proves useless, you haven’t really lost much energy or time.

Once you’re doing BFS, you might as well benefit from the other, very nice and clever twists on it, such as Dijkstra’s algorithm or A*. When you have the list of topics, can you order them by how promising they seem? Chances are you can, and chances are, your intuition will be right. Another thing to do – since ultimately, your aim is to link up with some green nodes, why not try to prioritize topics which seem like they would be getting closer to things you do know? The beauty of A* is that these heuristics don’t even have to be very correct – even “wrong” or “unrealistic” heuristics may end up making your search faster.
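Here is one way the “most promising first” idea could look, as a sketch only. It is a greedy priority-queue ordering in the spirit of those algorithms, not a faithful implementation of Dijkstra’s algorithm or A*. The graph, the effort guesses in `EST_MINUTES`, and the penalty of 10 minutes per still-unknown prerequisite are all invented numbers standing in for your own intuition.

```python
import heapq

# Hypothetical prerequisite graph and effort guesses (in minutes).
PREREQS = {
    "vector space": ["field", "abelian group"],
    "field": ["abelian group", "ring"],
    "ring": ["abelian group"],
    "abelian group": ["group"],
    "group": ["set", "binary operation"],
    "binary operation": ["set", "function"],
    "function": ["set"],
    "set": [],
}
EST_MINUTES = {
    "field": 30, "abelian group": 10, "ring": 20,
    "group": 15, "binary operation": 5, "function": 5,
}


def priority(topic, known, prereqs, est_minutes):
    """Lower is more promising: cheap topics that sit close to green nodes."""
    unknown = [p for p in prereqs.get(topic, []) if p not in known]
    # Assumed heuristic: guessed effort plus 10 minutes per unknown prerequisite.
    return est_minutes.get(topic, 60) + 10 * len(unknown)


def prioritized_order(topic, known, prereqs=PREREQS, est_minutes=EST_MINUTES):
    """Visit unknown prerequisites of `topic`, most promising first."""
    seen = {topic}
    heap = []

    def reveal(node):
        # Reading about `node` reveals its unknown prerequisites.
        for p in prereqs.get(node, []):
            if p not in seen and p not in known:
                seen.add(p)
                heapq.heappush(heap, (priority(p, known, prereqs, est_minutes), p))

    reveal(topic)
    order = []
    while heap:
        _, node = heapq.heappop(heap)
        order.append(node)
        reveal(node)
    return order


print(prioritized_order("vector space", known={"set"}))
```

With these made-up numbers, the expensive “field” node is postponed until the cheap nodes leading toward the green “set” node have been handled. Changing the guesses changes the order, which is the point: the estimates only need to be roughly right to be useful.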