A New Statistic For Magic: The Gathering
By: Jeff Holt
When we were told that we could do a project on just about anything related to mathematics, I immediately thought of games, and my favorite game is Magic: The Gathering. I know that a lot of math and statistics already goes into MTG, but I wanted to add something to it. My objective is to find a new way to analyze decks and an easy way to make them better.
Where to Start
I started with some research and data gathering. I found some articles and information on counting and probabilities, which are huge when it comes to playing the game. As in any card game, people make judgments based on what they have seen and what they currently hold. Magic is no different. Because Magic is a competitive game, what you've seen, what you have, and what you know about become even more crucial. Your opponent can play anything their deck allows, so playing the probabilities well is a skill worth mastering. Even the best players get burned by misjudging the probabilities. Sometimes players have to take risks to win, and many times those risks cost them the game.
So what can we do about it? Given the time I had to work on this objective, I came up with an idea that relates to economic analysis. I thought about game theory and decision making to come up with a statistic that is, currently, a loose measure of how well a deck can perform based purely on color, cost, and probability. The probability that an opening hand of seven cards contains a particular combination of cards can be found with the hypergeometric distribution, but what about the total probability for a whole deck? In the game, there is this idea of “curving out” a deck, or more specifically, an opening hand along with the cards you would hope to draw. Sometimes you need luck, sometimes you don’t.
An ideal opening hand would include 3-4 lands, maybe 2-3 spells that those lands can cast, and maybe 1 late-game spell. Of course, as decks change, the perfect opening hand changes too. I attempted to develop a system that can test a deck’s ability to curve out properly. There were obstacles in getting a single statistic that seemed to work, and limited resources made a perfect outcome impossible. Despite this, I came up with a statistic that I call the “Curve Efficiency Rating”. This statistic adds up, over the cards in a deck, the probability of drawing each card along with being able to play it as soon as possible.
Inspiration and Details
One of the articles I read discussed the usefulness of playing fetch lands in Magic. “Fetch lands” is a nickname for a set of lands that are played like a normal land, but can be sacrificed to search for another land that produces the colors you need to cast the spells in your hand.
This was part of a study by Garrett Johnson that examined the effectiveness of using fetch lands in a deck. The probabilities of drawing lands in your opening hand change with how many lands you expect to start with. (The legend in the graph represents the number of fetches in a deck).
On average, a 60-card deck with 24 lands will start with 2.8 lands in each opening hand. This kind of simple statistic was what I wanted to try to find.
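The 2.8 figure can be checked directly. The sketch below is only an illustration, assuming a 60-card deck with 24 lands, a 7-card hand, and no fetch-land effects; the function name `land_count_distribution` is made up for this example. It builds the hypergeometric distribution of lands in the opening hand and takes its mean.

```python
from math import comb

def land_count_distribution(deck_size=60, lands=24, hand_size=7):
    """Return P(exactly k lands in the opening hand) for k = 0..hand_size."""
    total_hands = comb(deck_size, hand_size)
    return [
        comb(lands, k) * comb(deck_size - lands, hand_size - k) / total_hands
        for k in range(hand_size + 1)
    ]

dist = land_count_distribution()
# The mean of the distribution is the average number of lands per opening hand.
expected_lands = sum(k * p for k, p in enumerate(dist))
print(round(expected_lands, 2))  # 2.8, the same as 7 * 24 / 60
```

The same list also gives the chance of any specific land count, for example `dist[3] + dist[4]` for a 3-4 land hand.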
The Curve Efficiency Rating is a sum of probabilities over the cards in a deck. To start, I looked at 20 of the most popular decks in the Standard format (this format allows players to construct decks using cards from the last 2 blocks that have been released, with no more than 4 copies of a single card in a deck, besides basic lands). When gathering data, I wanted to focus on land types, converted mana costs (the total amount of mana needed to cast a spell), and the colors of the spells. This is a basic summary of the decks I looked at:
| Deck Name | Total Lands | Total Creatures | Total Non-Creatures | Total Multi-Colored Spells |
|---|---|---|---|---|
| Jund | 25 | 6 | 17 | 12 |
| Reanimator | 23 | 22 | 9 | 6 |
| Junk Crats | 25 | 13 | 11 | 11 |
| Bant Hex | 22 | 11 | 13 | 14 |
| UWR Mid | 25 | 15 | 13 | 7 |
| Naya | 24 | 12 | 3 | 21 |
| R/G Aggro | 23 | 16 | 10 | 12 |
| UWR Flash | 26 | 10 | 10 | 14 |
| Esper Control | 27 | 5 | 16 | 12 |
| B/G Midrange | 24 | 14 | 18 | 4 |
| Jund Aggro | 23 | 12 | 3 | 26 |
| Prime Bant | 24 | 14 | 9 | 13 |
| Naya Blitz | 21 | 32 | 0 | 7 |
| B/G Zombies | 22 | 12 | 10 | 16 |
| Act 2 | 24 | 8 | 11 | 17 |
| UWR Geist | 24 | 9 | 10 | 17 |
| Bant Flash | 25 | 6 | 11 | 18 |
| Naya Humans | 20 | 32 | 2 | 6 |
| 4-C Rites | 23 | 8 | 19 | 10 |
| BWR Mid | 25 | 3 | 9 | 23 |
Each deck has 60 cards, except for R/G Aggro (which has 61) and Jund Aggro (64). The actual dataset contains all cards broken down into these categories, and then further broken down by color and mana cost. What the Curve Efficiency Rating calculates is the sum, over the deck, of the probability of drawing a particular spell together with the lands needed to cast it.
Each card's probability of being drawn is calculated together with the chance of drawing the lands needed to play it on the earliest possible turn. There are a few things that are taken into consideration when making these calculations:
- Start with a 7 card hand, and be on the play. This means that you start with 7 cards, and do not draw until the second turn.
- There are no other effects that take place. Mana ramp is not calculated, neither are the plays of an opponent. It is just the raw casting cost and being able to cast it.
- The first turn means turn one, so we want to play a one-cost card on turn one.
- All spells with an X in the casting cost have an X of 3. So a card like Sphinx’s Revelation, which costs WUUX (one white mana, two blue mana, and X additional mana), will cost 6 mana total in this study.
- Multi-colored spells are generalized and grouped together.
- Each spell is considered its own “game”: we consider each spell individually, and its chance of being played is purely its own, without being affected by anything else.
Let’s look into the first deck: Jund. Jund plays black, green, and red spells. It has 25 lands: 8 green/black, 7 black/red, and 7 green/red dual lands, and 3 other lands that only produce 1 colorless mana each turn. Consider a spell that costs 1 black mana to play. What are the chances that we have that spell and the mana to cast it on turn one? There are 15 black-mana sources, so we need one of the black sources and the spell itself. If there are 4 copies of the spell, then the hypergeometric equation will give us the probability of having the cards we need to play that spell on turn one.
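$$P(X = k) = \frac{\binom{m}{k}\binom{N-m}{n-k}}{\binom{N}{n}}$$

Where:
N = the deck size
m = the number of copies of the card we want
n = the number of cards drawn (so n - k is the remaining number of cards drawn)
k = the number of those copies we need (always 1)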
We take the basic form of the equation and multiply in a factor for each thing we need. We need 1 of the 4 copies of the card to play it, plus 1 of the 15 black sources to cast it on turn one. The hand also holds 5 more cards, which can be any of the 41 cards that are neither, so the equation for this spell looks like this:

$$\frac{\binom{15}{1}\binom{4}{1}\binom{41}{5}}{\binom{60}{7}}$$
and so the probability of being able to play this spell on turn one is equal to 11.64%. I’ve calculated the probability for each card in each deck by using this same formula. The only difference from card to card is multiplying in more mana factors as they are needed (for example, when a spell needs more mana in total or mana of different colors). Then I add the probabilities together for every card in a deck to get the Curve Efficiency Rating.
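As a quick check of the 11.64% figure, here is a minimal sketch of the turn-one calculation for the Jund example. The function name `turn_one_probability` is made up for this illustration, and, like the formula it mirrors, it counts hands with exactly one black source and exactly one copy of the spell.

```python
from math import comb

def turn_one_probability(deck_size, sources, copies, hand_size=7):
    """P(exactly one usable land and exactly one copy of the spell in the hand)."""
    # Cards that are neither a usable land nor a copy of the spell.
    others = deck_size - sources - copies
    favorable = comb(sources, 1) * comb(copies, 1) * comb(others, hand_size - 2)
    return favorable / comb(deck_size, hand_size)

# Jund: 60 cards, 15 black sources, 4 copies of a one-mana black spell.
p = turn_one_probability(deck_size=60, sources=15, copies=4)
print(f"{p:.2%}")  # 11.64%
```

Summing a value like `p` over every card in a deck is the idea behind the rating, although the exact factors used for more expensive spells are not reproduced here.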
Analyzing the CER
It’s quite simple: the higher the number, the more efficient the deck is. Decks that are controlling (usually meaning blue decks) have a higher rating because their spells are more expensive. This means they have more time to draw the spells, and the lands needed, to play them on the earliest turn.
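To illustrate why later turns help, the sketch below computes the chance of having seen at least one copy of a 4-of by a given turn when on the play. This is a simplification chosen for this example, not the exact calculation behind the rating (which also requires the right lands), and `seen_at_least_one` is a hypothetical helper name.

```python
from math import comb

def seen_at_least_one(copies, turn, deck_size=60, hand_size=7):
    """P(at least one copy among the cards seen by `turn`, on the play)."""
    # On the play there is no draw on turn one, so turn t has seen 7 + (t - 1) cards.
    cards_seen = hand_size + (turn - 1)
    p_none = comb(deck_size - copies, cards_seen) / comb(deck_size, cards_seen)
    return 1 - p_none

for turn in (1, 3, 5):
    print(turn, round(seen_at_least_one(copies=4, turn=turn), 3))
```

The probability rises with each turn, which is consistent with expensive spells scoring higher than one-drops.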
| Deck Name | Curve Efficiency Rating |
|---|---|
| Jund | 2.2839 |
| Reanimator | 1.37678 |
| Junk Crats | 1.13647 |
| Bant Hex | 1.03669 |
| UWR Mid | 2.29767 |
| Naya | 1.48565 |
| R/G Aggro | 1.69614 |
| UWR Flash | 3.02176 |
| Esper Control | 3.08138 |
| B/G Midrange | 2.19277 |
| Jund Aggro | 1.25698 |
| Prime Bant | 1.87719 |
| Naya Blitz | 0.6847 |
| B/G Zombies | 0.93898 |
| Act 2 | 1.64747 |
| UWR Geist | 4.16444 |
| Bant Flash | 2.13506 |
| Naya Humans | 1.75087 |
| 4-C Rites | 1.25076 |
| BWR Mid | 2.70770 |
If we understand deck compositions, we see that aggressive decks have low ratings, and this is because many of their spells are low-cost. The midrange decks have spells that are mostly in the 3-5 mana cost range, and the blue decks (of course) have the best ratings because they have higher-cost cards.
What This Leads To
Economic analysis uses regression models to predict financial outcomes for demographic groups within a population. Using a similar process, I have built 4 regression models to help deck-builders analyze their composition. Running small regressions in MS Excel, I found these models based on land, creature spells, non-creature spells, and multicolored spells:
| Land | Coefficients |
|---|---|
| Intercept | -0.72739 |
| U/W | -0.0299 |
| W/G | 0.186317 |
| G/B | 0.143789 |
| B/R | 0.309508 |
| R/U | 0.293637 |
| R/W | 0.22147 |
| W/B | -0.05554 |
| B/U | 0.565639 |
| U/G | 0.13486 |
| G/R | -0.06993 |
| W | -0.96358 |
| U | 0.365084 |
| B | 0.090531 |
| G | -0.02711 |
| R | 0.201999 |
| Land | -0.02208 |
This is the model for lands. It includes dual lands, basics, and non-basic, non-dual lands that only produce colorless mana. Each coefficient is multiplied by the number of lands of that type in the deck, and the results are added together with the intercept value. Land types with positive coefficients add to the CER, while those with negative coefficients lower it.
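Applying the land model is just an intercept plus a weighted sum. The sketch below uses the coefficients from the land table and the Jund mana base described earlier (8 green/black, 7 black/red, 7 green/red, 3 colorless). Mapping the colorless lands to the "Land" variable is an assumption made for this example, and `land_model_score` is a hypothetical helper name.

```python
# Land-model coefficients copied from the land table.
LAND_INTERCEPT = -0.72739
LAND_COEFFICIENTS = {
    "U/W": -0.0299, "W/G": 0.186317, "G/B": 0.143789, "B/R": 0.309508,
    "R/U": 0.293637, "R/W": 0.22147, "W/B": -0.05554, "B/U": 0.565639,
    "U/G": 0.13486, "G/R": -0.06993,
    "W": -0.96358, "U": 0.365084, "B": 0.090531, "G": -0.02711, "R": 0.201999,
    "Land": -0.02208,  # assumed here to mean colorless-only lands
}

def land_model_score(land_counts):
    """Intercept plus coefficient * count for each land type in the deck."""
    return LAND_INTERCEPT + sum(
        LAND_COEFFICIENTS[land_type] * count
        for land_type, count in land_counts.items()
    )

jund_lands = {"G/B": 8, "B/R": 7, "G/R": 7, "Land": 3}
jund_score = land_model_score(jund_lands)
print(round(jund_score, 4))
```

The result is only the land model's output for this mana base; the creature, non-creature, and multi-colored models would be evaluated the same way with their own tables.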
| Creature | Coefficients |
|---|---|
| Intercept | 2.837139 |
| 1-W | -0.06215 |
| 1-B | -0.26636 |
| 1-G | -0.17733 |
| 1-R | -0.08536 |
| 2-W | -0.28486 |
| 2-U | -0.18203 |
| 2-B | -0.18532 |
| 2-G | -0.11453 |
| 2-R | -0.04863 |
| 3-W | 0.036418 |
| 3-B | 0.058173 |
| 3-R | 0.82755 |
| 4-W | 0.349107 |
| 4-B | 0.04141 |
| 4-R | 0 |
| 5-G | -0.28138 |
This is the creature model. When a variable says “1-W”, that means it is a white spell that requires one mana to cast (thus, that one mana must be white). A “2-U” creature is a blue spell that requires two mana (one being blue), and so on. The coefficients are applied to each variable in the same way as before.
| Non-creature | Coefficients |
|---|---|
| Intercept | 1.419468 |
| 1-W | 0 |
| 1-U | -0.20993 |
| 1-B | 0.446506 |
| 1-G | 0.191473 |
| 1-R | 0.445932 |
| 2-W | -2.65702 |
| 2-U | -0.33504 |
| 2-B | -0.37775 |
| 2-G | -0.08625 |
| 2-R | -0.10049 |
| 2-C | 0 |
| 3-W | -0.45677 |
| 3-U | 0.575053 |
| 3-B | 0.269095 |
| 3-C | -0.34872 |
| 4-U | 0.493799 |
This is the model for non-creature spells, and it follows the same logic as creature spells.
| Multi-colored spells | Coefficients |
|---|---|
| Intercept | 1.937248 |
| MC-1 | -0.05796 |
| MC-2 | -0.12051 |
| MC-3 | 0.00901 |
| MC-4 | 0.040001 |
| MC-5 | 0.069495 |
| MC-6 | 0.335003 |
Finally, this is the multi-colored spell model. It follows the same logic as the previous models. A deck-builder can use these models to estimate their deck’s efficiency at casting the cards in the deck.
Problems
For one, it can’t predict games. That sort of variance is impossible to calculate on a game-to-game basis. There is too much that goes on in each game, and that’s what makes it so fun to play.
Secondly, the regression model needs tuning. I did not have access to the best tools to run a 50+ variable regression, and had to break it up into smaller models. While the numbers may not line up to the total rating, it’s the same idea that the higher the total number between the four models, the better the deck will be at casting cards as soon as possible.
Thirdly, calculating individual multi-colored spells would provide a more accurate model.
Where Could It Go From Here
I’d love to work more with the kind of math involved in the game. Everyone knows of sabermetrics in baseball and the rise of better analytical statistics in basketball. Working with these kinds of numbers would be so much fun for me, and I’d get to be involved with a game that is a big part of my personal (and now somewhat academic) life. If I put more time into this sort of work, then I’m sure it’s possible to break down all sorts of ratings for cards, rather than using hours of play-testing to figure out if one spell should be in a deck over another. These kinds of statistics can help show that one card is better than another, although the natural fun of picking cards and playing decks will never be replaced by number-crunching. I’d love to expand this same kind of work to other formats, and have an eternal database for Legacy, Modern, and the more popular EDH decks (since those are formats that do not change nearly as often as Standard does). As for this project, I couldn’t have thought of a more enjoyable topic to research and study. Getting to watch games, read articles, and study the calculations behind this game is complete bliss. I suppose a distant dream would be to do this sort of work as a part-time career, but I’ve enjoyed the entire process of what I’ve done so far.
Thank you for reading.
Sources: