I’m far from being an expert in the field of mathematical logic, but I’ve been reading about the academic work invested in the foundations of mathematics, both in a historical and objective sense; and I learned that it all seems to reduce to a proper axiomatic formulation of set theory.
It also seems that all set theories (even if those come in ontologically different flavours, such as the ones which pursue the “iterative approach” like ZFC, versus the “stratified approach” inspired by Russell’s and Whitehead’s type theory first formulated in their Principia– such as Quine’s NFU or Mendelson’s ST) are built as collections of axioms expressed in a common language, which invariably involves an underlying first order predicate logic augmented with the setmembership binary relation symbol. From this follows that FOL makes up the (necessary) “formal template” in mathematics, at least from a foundational perspective.
The justification of this very fact, is the reason behind this question. All the stuff I’ve read about the metalogical virtues of FOL and the properties of its “extensions” could be summarized as the statements below:
 FOL is complete (Gödel, 1929), compact and sound, and all its particular formalizations as deductive systems are equivalent (Lindström, 1969). That means that, given a (consistent) collection of axioms on top of a FOL deductive system, the set of all theorems which are syntactically provable, are semantically satisfied by a model of the axioms. The specification of the axioms absolutely entails all its consequences; and the fact that every first order deductive system is equivalent, suggests that FOL is a contextindependent (i.e. objective), formal structure.
 On the other hand, the Löwenheim–Skolem theorem implies that FOL cannot categorically characterize infinite structures, and so every first order theory satisfied by a model of a particular infinite cardinality, is also satisfied by multiple additional models of every other infinite cardinality. This noncategoricity feature is explained to be caused by the lack of expressive power of FOL.
 The categoricity results that FOLbased theories cannot achieve, can be obtained in a Second Order Logic (SOL) framework. Examples abound in ordinary mathematics, such as the Least Upper Bound axiom, which allows the definition of the real number system up to isomorphism. Nevertheless, SOL fails to verify an analog to the completeness results of FOL, and so there is no general match between syntactic provability and semantic satisfiability (in other words, it doesn’t admit a complete proof calculus). That means that, even if a chosen collection of axioms is able to categorically characterize an infinite mathematical structure, there is an infinite set of wff’s satisfied by the unique model of the axioms which cannot be derived through deduction.
 The syntacticsemantic schism in SOL also implies that there is no such a thing as an equivalent formulation of potential deductive systems, as is the case in FOL and stated by Lindström’s theorem. One of the results of this fact is that the domain over which second order variables range must be specified, otherwise being illdefined. If the domain is allowed to be the full set of subsets of the domain of first order variables, the corresponding standard semantics involve the formal properties stated above (enough expressive power to establish categoricity results, and incompleteness of potential, nonequivalent deductive systems). On the other hand, through an appropriate definition of second order domains for second order variables to range over, the resultant logic exhibits nonstandard semantics (or Henkin semantics) which can be shown to be equivalent to manysorted FOL; and as singlesorted FOL, it verifies the same metalogical properties stated at the beginning (and of course, its lack of expressive power).
 The quantification extension over variables of successive superior orders can be formalized, or even eliminate the distinction between individual (first order) variables and predicates; in each case, is obtained for every N an Nth Order Logic (NOL), and Higher Order Logic (HOL), respectively. Nevertheless, it can be shown (Hintikka, 1955) that any sentence in any logic over FOL with standard semantics to be equivalent (in an effective manner) to a sentence in full SOL, using manysorting.
 All of this points to the fact that the fundamental distinction, in logical terms, lies between FOL (be it singlesorted or manysorted) and SOL (with standard semantics). Or what seems to be the case, the logical foundations of every mathematical theory must be either noncategorical or lack a complete proof calculus, with nothing in between that tradeoff.
Why, so, is FOL invariably chosen as the underlying logic on top of which the set theoretical axioms are established, in any potentially foundational formalization of mathematics?
As I’ve said, I’m not an expert in this topic, and I just happen to be interested in these themes. What I wrote here is a summary of what I assume I understood of what I read (even though I’m personally inclined against the people who speaks about what they don’t fully understand). In this light, I’d be very pleased if any answer to this question involves a rectification of any assertion which happened to be wrong.
P.S. : this is an exact repost of the question I originally asked at Philosophy .SE, because I assumed this to be an overly philosophical matter, and so it wouldn’t be well received by the mathematics community. The lack of response there (be it because I wrong, and this actually makes up a question which can only be answered with a technical background on the subject, or because it’s of little philosophical interest) is the reason why I decided to ask it here. Feel free to point out if my original criteria was actually correct, and of course, I’ll take no offense if any moderator takes actions because of the probable unsuitability of the question in this site.
Answer
For the history of firstorder logic I strongly recommend “The Road to Modern LogicAn Interpretation”, José Ferreirós, Bull. Symbolic Logic v.7 n.4, 2001, 441484. The abstract at least is freely available.
Apart from first order logic and higher order logic there are several less well known logics that I can mention:

Constructive logics used to formalize intuitionism and related areas of constructive mathematics. One key example is MartinLöf’s intuitionistic type theory, which is also very relevant to theoretical computer science.

Modal logics used to formalize modalities, primarily “possibility” and “necessity”. This is of great interest in philosophy, but somehow has not drawn much interest in ordinary mathematics.

Paraconsistent logics, which allow for some inconsistencies without the problem of explosion present in most classical and constructive logics. Again, although this is of great philosophical interest, it has not drawn much attention in ordinary math.

Linear logic, which is more obscure but can be interpreted as a logic where the number of times that an hypothesis is used actually matters, unlike classical logic.
Attribution
Source : Link , Question Author : Mono , Answer Author : Carl Mummert