Sunday, July 7, 2013

A New Statistic For Magic: The Gathering (Capstone Project)

By: Jeff Holt

When we were told that we could do a project on just about anything related to mathematics, I immediately thought of games, and my favorite game is Magic: The Gathering. I know a lot of math and statistics already goes into MTG, but I wanted to add something to it. This is my objective: to find a new way to analyze decks and an easy way to make them better.

Where to Start

I started with some research and data gathering. I found some articles and information on counting and probabilities, which are huge when it comes to playing the game. As in any card game, people make judgements based on what they have seen and what they currently hold. Magic is no different. Because Magic is a competitive game, what you’ve seen, what you have, and what you know about become even more crucial. Your opponent can play anything their deck allows, so playing the probabilities is a skill worth mastering. Even the best players get burned by playing the probabilities incorrectly. Sometimes players have to take risks to win, and many times they lose.

So what can we do about it? Given how much time I had to work on this objective, I came up with an idea that relates to economic analysis. I thought about game theory and decision making to come up with a statistic that is, currently, a loose analyzer of how well a deck can perform based purely on color, cost, and probability. I found that we can calculate the probability that an opening hand of seven cards includes a particular combination of cards. This can be done with the hypergeometric distribution, but what about the total probability for a whole deck? In the game, there is this idea of “curving out” a deck, or more specifically, an opening hand along with the cards you would hope to draw. Sometimes you need luck, sometimes you don’t.

An ideal opening hand would include 3-4 lands, and maybe 2-3 playable spells from those lands, and maybe 1 late-game spell. Of course, as decks change, an opening hand that is perfect changes. I attempted to develop a system that can test a deck’s ability to curve out properly. There were obstacles in getting a single statistic that seemed to work, and some limited resources made getting a perfect outcome impossible. Despite this, I came up with a statistic that I call the “Curve Efficiency Rating”. This statistic calculates the total probability of cards that can be drawn, along with being able to play them as soon as possible. 

Inspiration and Details

One of the articles I read discussed the usefulness of playing fetch lands in Magic. “Fetch lands” is a nickname for a set of lands that are played like a normal land, but can be sacrificed to search your deck for another land that produces the colors you need to cast the spells in your hand.



This was part of a study by Garrett Johnson that found the effectiveness of using fetch lands in a deck. The probabilities of drawing lands in your opening hand change by how many lands you expect to start with. (The legend in the graph represents the number of fetches in a deck). 



On average, a 60-card deck with 24 lands will start with 2.8 lands in each opening hand (7 cards × 24/60). This kind of simple statistic was what I wanted to try to find.
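The 2.8 figure can be checked with a few lines of Python. This is only a minimal sketch: the function name `land_count_distribution` is made up for illustration, and it assumes a plain 60-card deck with 24 lands and no fetch-land effects. It lists the chance of each possible land count in a 7-card hand and then averages them.

```python
from math import comb

def land_count_distribution(deck_size=60, lands=24, hand_size=7):
    """Return P(exactly k lands in the opening hand) for k = 0..hand_size."""
    total_hands = comb(deck_size, hand_size)
    return [
        comb(lands, k) * comb(deck_size - lands, hand_size - k) / total_hands
        for k in range(hand_size + 1)
    ]

probs = land_count_distribution()

# The average land count is each count weighted by its probability.
expected_lands = sum(k * p for k, p in enumerate(probs))
print(f"expected lands: {expected_lands:.2f}")

for k, p in enumerate(probs):
    print(f"{k} lands: {p:.1%}")
```

The average should come out to 2.8, matching the shortcut 7 × 24/60.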

The Curve Efficiency Rating is a sum of probabilities over the cards in a deck. To start, I looked at 20 of the most popular decks in the Standard format. (This format allows players to construct decks using cards from the last 2 blocks that have been released, with no more than 4 copies of a single card in a deck, besides basic lands.) When gathering data, I wanted to focus on land types, converted mana costs (the total amount of mana needed to cast a spell), and the colors of the spells. This is a basic summary of the decks I looked at:

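| Deck Name | Total Lands | Total Creatures | Total Non-Creatures | Total Multi-Colored Spells |
|---|---|---|---|---|
| Jund | 25 | 6 | 17 | 12 |
| Reanimator | 23 | 22 | 9 | 6 |
| Junk Crats | 25 | 13 | 11 | 11 |
| Bant Hex | 22 | 11 | 13 | 14 |
| UWR Mid | 25 | 15 | 13 | 7 |
| Naya | 24 | 12 | 3 | 21 |
| R/G Aggro | 23 | 16 | 10 | 12 |
| UWR Flash | 26 | 10 | 10 | 14 |
| Esper Control | 27 | 5 | 16 | 12 |
| B/G Midrange | 24 | 14 | 18 | 4 |
| Jund Aggro | 23 | 12 | 3 | 26 |
| Prime Bant | 24 | 14 | 9 | 13 |
| Naya Blitz | 21 | 32 | 0 | 7 |
| B/G Zombies | 22 | 12 | 10 | 16 |
| Act 2 | 24 | 8 | 11 | 17 |
| UWR Geist | 24 | 9 | 10 | 17 |
| Bant Flash | 25 | 6 | 11 | 18 |
| Naya Humans | 20 | 32 | 2 | 6 |
| 4-C Rites | 23 | 8 | 19 | 10 |
| BWR Mid | 25 | 3 | 9 | 23 |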

Each deck has 60 cards, except for R/G Aggro (which has 61) and Jund Aggro (64). The actual dataset contains every card broken down into these categories, and then further by color and mana cost. What the Curve Efficiency Rating calculates is the sum, over every spell in the deck, of the probability of drawing that spell together with the lands needed to cast it.

Each card’s probability of being drawn is calculated together with the chance of drawing the lands needed to play it on the earliest possible turn. There are a few things that are taken into consideration when making these calculations:

  1. Start with a 7 card hand, and be on the play. This means that you start with 7 cards, and do not draw until the second turn. 
  2. There are no other effects that take place. Mana ramp is not calculated, neither are the plays of an opponent. It is just the raw casting cost and being able to cast it. 
  3. The first turn means turn one, so we want to play a one-cost card on turn one.
  4. All spells with an X in the casting cost have an X of 3. So a card like Sphinx’s Revelation, which costs WUUX (one white mana, two blue mana, and X additional mana), will cost 6 mana total in this study.
  5. Multi-colored spells are generalized and grouped together.
  6. Each spell is considered its own “game”: we look at each spell individually, and only at its own chance of being played, without it being affected by anything else.
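Assumptions 1 and 4 can be written down as two tiny helpers. This is a rough sketch rather than the spreadsheet logic used for the study: the function names are invented here, and the cost parser assumes costs are typed as simple strings such as "WUUX" or "2R" with single-digit generic mana.

```python
def cards_seen_on_play(turn, hand_size=7):
    """Cards seen by a given turn when on the play (no draw on turn one)."""
    if turn < 1:
        raise ValueError("turn must be 1 or greater")
    return hand_size + (turn - 1)

def total_mana_cost(cost, x_value=3):
    """Total mana for a cost written as symbols, e.g. "WUUX" or "2R".

    Digits count as that much generic mana (single digits only),
    X counts as x_value, and every other symbol counts as one mana.
    """
    total = 0
    for symbol in cost.upper():
        if symbol.isdigit():
            total += int(symbol)
        elif symbol == "X":
            total += x_value
        else:
            total += 1
    return total

print(cards_seen_on_play(1))    # 7: the opening hand only
print(cards_seen_on_play(3))    # 9: opening hand plus draws on turns two and three
print(total_mana_cost("WUUX"))  # 6: Sphinx's Revelation with X = 3
```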

Let’s look into the first deck: Jund. Jund plays black, green, and red spells. It has 25 lands: 8 green/black, 7 black/red, and 7 green/red dual lands, and 3 other lands that only produce 1 colorless mana each turn. Consider a spell that costs 1 black mana to play. What are the chances that we have that spell and the mana to cast it on turn one? There are 15 black-mana sources, so we need one of the black sources and the spell itself. If there are 4 copies of the spell, then the hypergeometric equation will give us the probability of having the cards we need to play that spell on turn one.
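For reference, the standard hypergeometric probability, which is presumably the equation meant here, can be written with the symbols defined just below, taking n as the total number of cards drawn:

```latex
P(X = k) = \frac{\binom{m}{k}\,\binom{N-m}{\,n-k\,}}{\binom{N}{n}}
```

The numerator counts the hands that contain exactly k of the m copies, and the denominator counts every possible hand of n cards.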



Where:
N = the deck size
m = the number of copies of the card we want in the deck
n = the number of cards drawn
k = the number of those copies we need among the cards drawn (always 1)


We take the basic form of the equation and multiply in a factor for each thing we need. We need 1 of the 4 copies of the spell, plus 1 of the 15 black sources to cast it on turn one. The other 5 cards in the hand can be anything else, meaning any of the 41 cards that are neither a black source nor a copy of the spell, so the equation for this spell looks like this:

$$P = \frac{\binom{15}{1}\binom{4}{1}\binom{41}{5}}{\binom{60}{7}}$$

and so the probability of being able to play this spell on turn one is equal to 11.64%. I’ve calculated the probability for each card in each deck by using this same formula. The only difference from card to card is multiplying by more mana factors as they are needed (for example, when a spell needs more mana or mana of several colors). Then I add the probabilities for every card in a deck together to get the Curve Efficiency Rating.
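The turn-one example can be reproduced in a few lines of Python. Treat this as a sketch of the calculation as described above, not the original spreadsheet: the function name `exact_draw_probability` is an assumption, and, like the formula, it counts only hands with exactly the wanted number of cards from each group, with every other card coming from outside those groups.

```python
from math import comb

def exact_draw_probability(deck_size, cards_seen, groups):
    """Probability of drawing exactly the wanted number from each group.

    groups is a list of (copies_in_deck, copies_wanted) pairs; the groups
    are assumed not to overlap. All remaining draws must come from cards
    outside every group.
    """
    in_groups = sum(copies for copies, _ in groups)
    wanted = sum(want for _, want in groups)
    ways = comb(deck_size - in_groups, cards_seen - wanted)
    for copies, want in groups:
        ways *= comb(copies, want)
    return ways / comb(deck_size, cards_seen)

# Jund: 15 black sources and 4 copies of a one-mana black spell, 7-card hand.
p = exact_draw_probability(60, 7, [(15, 1), (4, 1)])
print(f"{p:.2%}")  # 11.64%
```

A deck’s Curve Efficiency Rating would then be the sum of these per-card probabilities over every spell in the deck.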

Analyzing the CER

It’s quite simple: the higher the number, the more efficient the deck is. Controlling decks (usually blue decks) have a higher rating because their spells are more expensive. This means they have more time to draw those spells, and the lands needed to cast them, before the earliest turn they can be played.


| Deck Name | Curve Efficiency Rating |
|---|---|
| Jund | 2.2839 |
| Reanimator | 1.37678 |
| Junk Crats | 1.13647 |
| Bant Hex | 1.03669 |
| UWR Mid | 2.29767 |
| Naya | 1.48565 |
| R/G Aggro | 1.69614 |
| UWR Flash | 3.02176 |
| Esper Control | 3.08138 |
| B/G Midrange | 2.19277 |
| Jund Aggro | 1.25698 |
| Prime Bant | 1.87719 |
| Naya Blitz | 0.6847 |
| B/G Zombies | 0.93898 |
| Act 2 | 1.64747 |
| UWR Geist | 4.16444 |
| Bant Flash | 2.13506 |
| Naya Humans | 1.75087 |
| 4-C Rites | 1.25076 |
| BWR Mid | 2.70770 |


If we understand deck compositions, we see that aggressive decks have low ratings, and this is because many of their spells are cheap. The midrange decks have spells that are mostly in the 3-5 mana cost range, and the blue decks (of course) have the best ratings because they have more expensive cards.

What This Leads To

Economic analysis uses regression models to predict financial outcomes for demographic groups within a population. Using a similar process, I have built 4 regression models to help deck-builders analyze their deck’s composition. Running regressions in MS Excel on a small number of variables at a time, I found these models based on lands, creature spells, non-creature spells, and multi-colored spells:


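| Land | Coefficient |
|---|---|
| Intercept | -0.72739 |
| U/W | -0.0299 |
| W/G | 0.186317 |
| G/B | 0.143789 |
| B/R | 0.309508 |
| R/U | 0.293637 |
| R/W | 0.22147 |
| W/B | -0.05554 |
| B/U | 0.565639 |
| U/G | 0.13486 |
| G/R | -0.06993 |
| W | -0.96358 |
| U | 0.365084 |
| B | 0.090531 |
| G | -0.02711 |
| R | 0.201999 |
| Land | -0.02208 |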



This is the model for lands. It includes dual lands, basic lands, and the other lands that only produce colorless mana. Each coefficient is multiplied by the number of lands of that type in the deck, and the results are added together with the intercept value. Lands with positive coefficients add to the predicted CER, while lands with negative coefficients lower it.
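As a rough illustration of how the land model is applied, here is a short Python sketch using the coefficients from the table and the Jund land counts given earlier (8 green/black, 7 black/red, 7 green/red, and 3 colorless lands). The names `LAND_COEFFICIENTS` and `predict_from_lands` are made up for this example, and it assumes that the “Land” row is the one for colorless-only lands.

```python
# Coefficients copied from the land model table.
LAND_INTERCEPT = -0.72739
LAND_COEFFICIENTS = {
    "U/W": -0.0299, "W/G": 0.186317, "G/B": 0.143789, "B/R": 0.309508,
    "R/U": 0.293637, "R/W": 0.22147, "W/B": -0.05554, "B/U": 0.565639,
    "U/G": 0.13486, "G/R": -0.06993,
    "W": -0.96358, "U": 0.365084, "B": 0.090531, "G": -0.02711, "R": 0.201999,
    "Land": -0.02208,  # assumed here to mean colorless-only lands
}

def predict_from_lands(land_counts):
    """Intercept plus each land count multiplied by its coefficient."""
    return LAND_INTERCEPT + sum(
        LAND_COEFFICIENTS[land_type] * count
        for land_type, count in land_counts.items()
    )

# Jund's lands as described in the article.
jund_lands = {"G/B": 8, "B/R": 7, "G/R": 7, "Land": 3}
jund_prediction = predict_from_lands(jund_lands)
print(f"{jund_prediction:.4f}")  # 2.0337
```

The other three models can be applied the same way, with their own intercepts and coefficient tables.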


| Creature | Coefficient |
|---|---|
| Intercept | 2.837139 |
| 1-W | -0.06215 |
| 1-B | -0.26636 |
| 1-G | -0.17733 |
| 1-R | -0.08536 |
| 2-W | -0.28486 |
| 2-U | -0.18203 |
| 2-B | -0.18532 |
| 2-G | -0.11453 |
| 2-R | -0.04863 |
| 3-W | 0.036418 |
| 3-B | 0.058173 |
| 3-R | 0.82755 |
| 4-W | 0.349107 |
| 4-B | 0.04141 |
| 4-R | 0 |
| 5-G | -0.28138 |

This is the creature model. When a variable says “1-W”, that means it is a white spell that requires one mana to cast (thus, that one mana must be white). A “2-U” creature is a blue spell that requires two mana (one being blue), and so on. The coefficients are applied to each variable in the same way as before.


| Non-creature | Coefficient |
|---|---|
| Intercept | 1.419468 |
| 1-W | 0 |
| 1-U | -0.20993 |
| 1-B | 0.446506 |
| 1-G | 0.191473 |
| 1-R | 0.445932 |
| 2-W | -2.65702 |
| 2-U | -0.33504 |
| 2-B | -0.37775 |
| 2-G | -0.08625 |
| 2-R | -0.10049 |
| 2-C | 0 |
| 3-W | -0.45677 |
| 3-U | 0.575053 |
| 3-B | 0.269095 |
| 3-C | -0.34872 |
| 4-U | 0.493799 |

This is the model for non-creature spells, and it follows the same logic as creature spells. 


| Multi-colored spells | Coefficient |
|---|---|
| Intercept | 1.937248 |
| MC-1 | -0.05796 |
| MC-2 | -0.12051 |
| MC-3 | 0.00901 |
| MC-4 | 0.040001 |
| MC-5 | 0.069495 |
| MC-6 | 0.335003 |

Finally, this is the multi-colored spell model. It follows the same logic as the previous models. A deck-builder can use these models to estimate how efficiently their deck casts the cards in it.

Problems

For one, it can’t predict games. That sort of variance is impossible to calculate on a game-to-game basis. There is too much that goes on in each game, and that’s what makes it so fun to play. 

Secondly, the regression model needs tuning. I did not have access to the best tools to run a 50+ variable regression, and had to break it up into smaller models. While the numbers may not add up to the total rating, the idea is the same: the higher the combined total of the four models, the better the deck will be at casting cards as soon as possible.

Thirdly, calculating individual multi-colored spells would provide a more accurate model. 

Where Could It Go From Here

I’d love to work more with this kind of math involved with the game. Everyone knows of sabermetrics for baseball, and the rise of more advanced statistics in basketball. Working with these kinds of numbers would be so much fun for me, and I’d get to be involved with a game that is a big part of my personal (and now somewhat academic) life. If I put more time into this sort of work, then I’m sure it’s possible to break down all sorts of ratings for cards, rather than using hours of play-testing to figure out if one spell should be in a deck over another. These kinds of statistics can help show that one card is better than another, although the natural fun of picking cards and playing decks will never be replaced by number-crunching. I’d love to expand this same kind of work to other formats, and have a lasting database for Legacy, Modern, and the more popular EDH decks (since those are formats that do not change nearly as often as Standard does). As for a project now, I couldn’t have thought of a more enjoyable topic to research and study. Getting to watch games, read articles, and study the calculations behind this game is complete bliss. I suppose a distant dream would be to do this sort of work as a part-time career, but I’ve enjoyed the entire process of what I’ve done so far.

Thank you for reading.  



Sources:

2 comments:

  1. Can't seem to find a way to contact you on this blog. I'm working on a digital fantasy card game right now that I'm interested in applying statistical analysis in the spirit of sabermetrics to - for the purpose of balancing the game. Be nice to chat some time.

    You can contact me at minimallyexceptional@gmail.com
