Mathematicians by and large don’t believe in chance. Yet, after an improbable happenstance resulted in a professional paper, Connecticut College math professors Christopher Hammond and Warren Johnson may be changing their minds.
Asked how he would have reacted two years ago if told he would publish a professional paper based on a baseball statistic, Hammond invoked a common, if non-mathematical, expression of probability: “I would have laughed at you.”
However improbable, “The James Function” was published in the February edition of “Mathematics Magazine” with an important assist from Professor Steven Miller of Williams College. The article provides a rigorous mathematical treatment of the well-known “log5 method” for determining the likelihood that one team will defeat another, based solely on their winning percentages.
The project’s fortuitous journey began in the fall of 2013, when the Boston Red Sox beat the St. Louis Cardinals four games to two to win their third World Series in 10 years, after a drought that dated back to the days of Babe Ruth.
A senior moment
At the beginning of his senior year, Caleb Garza ’14 was searching for a topic worthy of the senior math seminar presentation required of all majors.
Now 24, the Ocean City, N.J., native wanted to demonstrate mathematics’ use outside the classroom through applications in the wide world of sports.
Baseball “is not one of my favorite sports,” confessed Garza, who prefers soccer and squash, the latter of which he played at the varsity level with the Camels. But since most Americans know the bases are 90 feet apart and the pitching rubber 60 feet, 6 inches from the plate, he figured the game would serve his mathematical purpose.
Johnson, who by chance was the adviser for that fall’s seminar group, sensed Garza’s motivation and joined him in his quest for a specific topic that would meet the quality standard the seminar shares with a sacrifice fly: Both must be of sufficient depth.
Garza got to first base with an Internet search hit on “Pythagoras at the Bat,” a paper Miller had written with some students.
The paper describes the Pythagorean Theorem of Baseball created by Bill James, a pioneer in the advanced baseball statistics field called sabermetrics, a term James coined from the acronym of the Society for American Baseball Research. James famously formulated the theorem to predict a team’s winning percentage based on two simple and readily available statistics: The average number of runs that team scores and the average number of runs it allows.
The reference to Pythagoras, a first-ballot shoo-in for any mathematics hall of fame, involves the theorem the Greek power thinker was best known for: A squared plus B squared equals C squared, which expresses the relationship among the three sides of a right triangle. His name was attached to James’s formula because it, too, uses three squares to predict a team’s winning percentage: the square of the number of runs the team scores, divided by the sum of the squares of the number of runs it scores and the number it allows.
Others later reduced the exponent from two to roughly 1.8, a decimal point shaving scheme that improved the formula’s accuracy.
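For readers who want to see the arithmetic, the formula described above can be sketched in a few lines of Python. This is only an illustration: the function name `pythagorean_win_pct` and the run totals are made up for the example, and 1.8 stands in for the refined exponent mentioned above.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Estimate a winning percentage from runs scored and runs allowed.

    With exponent=2.0 this is the "three squares" form described in the
    article; a smaller exponent gives the later refinement.
    """
    if runs_scored < 0 or runs_allowed < 0:
        raise ValueError("run totals cannot be negative")
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    if scored + allowed == 0:
        raise ValueError("at least one run total must be positive")
    return scored / (scored + allowed)


if __name__ == "__main__":
    # Hypothetical season totals, invented for this example.
    runs_for, runs_against = 800, 700

    classic = pythagorean_win_pct(runs_for, runs_against)
    refined = pythagorean_win_pct(runs_for, runs_against, exponent=1.8)

    # Convert each percentage into expected wins over a 162-game season.
    print("exponent 2.0:", round(classic * 162, 1), "wins")
    print("exponent 1.8:", round(refined * 162, 1), "wins")
```

For a team that outscores its opponents, the smaller exponent pulls the estimate slightly back toward .500.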
James’s formula became the cornerstone calculation for the phenomenon called “Moneyball,” most famously used by Oakland Athletics General Manager Billy Beane.
As Miller and his colleagues explain in “Pythagoras at the Bat,” the James Formula “helps solve some of the most important economic decisions a team faces: How much a player is worth, which players should be pursued, and how much money they should be offered.”
Johnson says that though James, whom “Time” magazine named in 2006 one of its 100 most influential people, had little formal training in mathematics, his insight into formulas and what makes them tick is better than that of many mathematicians and as good as the best managers’ knowledge of players: “He understands how to make a formula do what he wants it to do, and he understands what can possibly go wrong.”
And most things with his Pythagorean Theorem of Baseball went right.
It typically predicts a team’s record to within about four wins over a 162-game season.
It does it with a simplicity of form that mathematicians liken to poetry—even mathematicians like Hammond, who expresses his opinion of baseball in four short words: “I don’t hate it.”
In contrast, Hammond’s faculty webpage says he is “delighted whenever he can find connections between mathematics and the arts.” It notes that he has lectured “on the mathematical imagery” in Dante’s “Divine Comedy” and adds that he takes great pleasure when his work in operator theory can be applied to classic complex problems in a way that “greatly enriches the aesthetic dimension of the discipline.”
Ever the professional, though, Hammond did not let his view of James’s formula as “very pretty, very simple [and] very elegant” stop him from proving that James’s assumption about it was very wrong.
A rookie mistake
Hammond would never have given it a thought, however, had not Lady Luck smiled on the James project in two ways he calls “very fortuitous” and that non-mathematicians would call lottery-scale luck.
The first was that Garza, though well-versed in mathematics, had not taken courses in statistics. If he had, his project on James, who is essentially a baseball statistics guru, surely would have struck out in a different direction.
Because he had not, Johnson searched out another avenue of inquiry, and even that ran into a roadblock when the best source material seemed nearly as hard to find as a kind word about Ty Cobb.
The only known source was James’s 1981 Baseball Abstract, which was not only out of print but had never been published commercially in the first place. Fortunately for Garza, a longtime baseball fan came through in the clutch.
“I happened to have one,” said Johnson, whose brief bio included in the paper reveals that he “grew up rooting for the Phillies … became a Twins fan while attending the University of Minnesota …. [and] received his Ph.D. from the University of Wisconsin despite failing to become a Brewers fan.”
At Johnson’s suggestion, Garza focused on teasing out the logic in the axioms underlying James’s so-called log5 formulas.
Garza’s remarks on them at his seminar presentation caught Hammond’s attention.
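The article does not print the log5 formula itself. In the form commonly attributed to James, the chance that a team with winning percentage a beats a team with winning percentage b is a(1 - b) divided by a(1 - b) + b(1 - a). The Python sketch below assumes that form. The function name `log5` and the sample percentages are chosen for illustration, and the three checks at the end are ordinary sanity checks, not a reproduction of James’s own list of conditions.

```python
def log5(a, b):
    """Chance that a team with winning percentage a beats one with b.

    Assumes the commonly cited form of James's log5 formula:
        a(1 - b) / (a(1 - b) + b(1 - a))
    """
    if not (0.0 <= a <= 1.0 and 0.0 <= b <= 1.0):
        raise ValueError("winning percentages must lie between 0 and 1")
    numerator = a * (1.0 - b)
    denominator = numerator + b * (1.0 - a)
    if denominator == 0.0:
        # Happens only when both teams are at 0 or both are at 1.
        raise ValueError("undefined when both teams are 0 or both are 1")
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical matchup: a .600 team against a .400 team.
    print("P(.600 beats .400) =", round(log5(0.6, 0.4), 3))

    # Evenly matched teams should split their games.
    print("P(.700 beats .700) =", round(log5(0.7, 0.7), 3))

    # Against a .500 opponent, a team should win at its usual rate.
    print("P(.600 beats .500) =", round(log5(0.6, 0.5), 3))

    # The two teams' chances should add up to 1.
    print("sum =", round(log5(0.6, 0.4) + log5(0.4, 0.6), 3))
```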
Bowling for scholars
Mathematical axioms are assertions or propositions that underlie a mathematical system. Euclid famously spelled out five axioms as the basis of geometry.
James’s original theory had six conditions, which Hammond and Johnson then refined and reduced to five.
“If Bill James had known calculus,” explained Johnson, “he would have had a different set of axioms.”
With those in place, Hammond stepped up to the plate. Miller proved that the James Function actually does what James said it did; Hammond would prove that James’s conditions alone do not single it out as the only function that does.
“What you hear from James is this argument: The right formula satisfies these six conditions; his formula satisfies the conditions; therefore, it must be the only one.”
The last assertion is called the James Conjecture. Sensing it was false, Hammond started his search for one additional formula that would meet James’s conditions and disprove the conjecture. He found it in a bowling alley.
“I came up with the idea at a third-grader’s birthday party,” said Hammond, who doesn’t consider that at all unusual. “I think every mathematician tells a story like that,” he said, though he personally has had better luck in the open air of grocery store parking lots than in domed bowling alleys.
The winning formula “is difficult to describe precisely in layman’s terms,” Hammond said, but “was specifically designed to possess many of the properties of the original James Function.”
Hammond kept a particularly close eye on his function’s “level curves.”
Just as level curves on topographical maps show constant elevations on a changing landscape, mathematical level curves help describe the geometrical behavior of certain complicated functions. The function Hammond imagined in the bowling alley largely conformed to the contours of the original James Function.
From there, he said, “It was a matter of working out the details.”
Although doing that was akin to backing a full-sized van packed with overexcited Little Leaguers into a tight space without the help of a rear-view camera, Hammond found it refreshing because it involved more general math skills than he typically uses in his highly specialized field of operator theory.
Even professors in his department “don’t necessarily understand one another’s research,” Hammond said. And that makes it difficult for them to collaborate in publishing.
Said Hammond, “I think this is the first time two mathematics professors from Connecticut College have worked on a paper together.”
But to Hammond, it meant much more than that. Because the paper involved baseball, “I could actually talk to people at parties” about the work. And that was like coming out of a dugout and into the sunshine.
'Who's on First?'
If Hammond is a specialist, the mathematical equivalent of a closer, Johnson might be described as a polymath or a utility player.
“I sort of dabble in several different areas,” he said, including “a fair amount of stuff in the history of mathematics,” particularly from 1730 to 1900.
Miller, on the other hand, might be considered nearly promiscuous in his mathematics. Although his specialty is probability theory, which has obvious connections to James, he also has written about mathematics in the fields of accounting, computer science, economics and geology.
“I look at relationships,” he said, and how formulas used in one field of study might be used in other seemingly unrelated fields.
It’s unlikely that he’ll ever find a formula to fit the relationships that led to “The James Function.”
Johnson said that without Garza’s project, which turned up Miller’s paper, neither he nor Hammond would even have explored the James Function—and that, without one another’s help, “it would never have gone anywhere.”
Asked for an assessment of where the endeavor will go from here, Johnson invoked the name of the guy on third in the legendary Abbott and Costello baseball skit “Who’s on First?”
“I Don’t Know,” he said.
He’s not alone.
Although academics in the field of statistics had examined James’s work before, Hammond is still shocked that no mathematician had taken a serious look.
“It’s like a valley people should have passed through, but they didn’t take that path.”
And, in contrast with his field of specialty, which is mapped well enough that researchers know where to explore next, knowledge of so-called Jamesian functions—functions that satisfy the same five conditions as the original James Function—is so thin that a map of them lacks the equivalent of a “You Are Here” sign identifying the trailhead.
Although the formulas have some similarities to a well-established “theory of paired comparisons” associated with researchers Ralph Allan Bradley and Milton E. Terry, Hammond said, “The philosophical rationale is almost exactly reversed.
“One can say that the James Function belongs to the general area of probability,” he added, “but besides that it really defies categorization.”
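The similarity Hammond mentions can be checked numerically. In the paired-comparison model usually credited to Bradley and Terry, each competitor has a positive strength, and the chance that one beats another is its strength divided by the sum of the two strengths. The Python sketch below assumes the commonly cited log5 formula and takes each team’s strength to be its odds of winning, a / (1 - a). Both of those choices, and all the function names, are assumptions made for the example. The code shows only that the two formulas then agree; it says nothing about the differing rationale Hammond describes.

```python
def log5(a, b):
    """Commonly cited form of James's log5 formula (assumed here)."""
    return a * (1.0 - b) / (a * (1.0 - b) + b * (1.0 - a))


def odds(p):
    """Convert a winning percentage strictly between 0 and 1 into odds."""
    return p / (1.0 - p)


def paired_comparison(strength_i, strength_j):
    """Bradley-Terry style probability that competitor i beats j."""
    return strength_i / (strength_i + strength_j)


if __name__ == "__main__":
    # Hypothetical winning percentages, kept strictly between 0 and 1.
    percentages = [0.3, 0.45, 0.5, 0.6, 0.75]

    for a in percentages:
        for b in percentages:
            from_log5 = log5(a, b)
            from_strengths = paired_comparison(odds(a), odds(b))
            # The two computations should match up to rounding error.
            assert abs(from_log5 - from_strengths) < 1e-12

    print("log5 and the odds-based paired comparison agree on this grid")
```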
Hammond and Johnson have made inroads exploring how Jamesian functions can be applied to sports in which multiple teams compete at the same time, as in a bowling tournament, track meet or crew competition.
In sticking to the athletic theme, Johnson says they’ve confirmed a consistent aesthetic theme as well.
“There are all these beautiful formulas.”
The probability seems slim that Jamesian functions will become their mathematical field of dreams; but given the number of times Lady Luck has smiled on them so far, Hammond and Johnson plan to push ahead. Like Cubs legends Tinker and Evers, they’re willing to do their part as best they can and leave the rest in the hands of Chance.
*Note: Caleb Garza is now director of academics for StreetSquash, a New York City nonprofit that provides afterschool academic support and services for disadvantaged students in sixth grade through college. This summer, to encourage his students’ interest in the field, he offered a one-week seminar about mathematics in baseball.