If you are interested in data science in video games and eSports, check out my article From Zero to Heroes Never Die, which analyses player performance in Overwatch.
Crusader Kings 2 was the last PC game that I bought on disk before surrendering fully to the Steam Gods. I played much of it in the days when I had a severely limited internet connection, which is why I have "only" 190 hours clocked up in game. Unlike other Paradox titles such as Stellaris or Europa Universalis 4, or other strategy titles like the Civilization series, where you play as the abstract notion of a "nation", CK2 puts you at the head of a landed dynasty in the medieval world. Time passes day by day; the character you are playing marries, has children, fights wars, contracts diseases, has affairs, appoints councils and eventually dies. The titles this character held pass to the next in line through the often complicated succession laws of the time, and you take control of that character’s heir, who may have inherited some or all of their predecessor’s land. And so the game continues. You lose the game if your starting dynasty loses all its counties or all members of the dynasty die out.
The long form game lasts from the year 769 until 1453. There are over 1300 individual counties in the game, all ruled by a Count, and these merge in various ways into Duchies which merge into Kingdoms which merge into Empires, all of which rise and fall and are fought over throughout the 700 or so years that the game lasts. The starting conditions are true to real world history; if you take the 1066 start then William the Bastard is Duke of Normandy and can be chosen by the player. Once the game starts and runs for a few years things often diverge dramatically from real world history and there is even one expansion for the game in which the Aztecs invade Europe and North Africa from the West. The game has its own subreddit which is full of awesome images and stories that players have experienced, and lots of memes.
Like all Paradox titles the game contains a massive amount of data. Clicking on a character in game allows you to see portraits representing their parents, spouse, siblings and children. Clicking on any of these allows you to do the same for them. You can click on any county and get a chronological list of every character that held it and the same goes for Duchies, Kingdoms and Empires. I realised that the game must be storing these details in the save file and in a relational format that could be used to build a network. I had already done a project in which I pulled data from a Caves of Qud save file and I wanted to see if the same could be done here.
A non-Ironman, uncompressed CK2 save file can be opened with a text editor. Each character is represented by an entry that includes a unique identifier and, where they exist, the unique identifiers of their father, mother and dynasty. It also includes details such as how the character died, who was responsible if they were murdered or died in battle, and their religion and culture. There are details on all dynasties, such as their culture and religion, and for each title there are details on who held the title, how they got it (inheritance, election, conquest etc.) and when their reign started.
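As a rough illustration of what reading one of these entries involves, the sketch below pulls key=value pairs out of a simplified character block. The sample text is made up, and every key in it other than fat (the save file's name for the father) is an assumption for illustration. Real entries are longer and contain nested blocks that this regex does not handle, so treat it as a starting point rather than the script I actually used.

```python
import re

# Hypothetical, simplified character entry. Key names other than "fat"
# are assumptions; real entries also contain nested { } blocks.
SAMPLE_ENTRY = '''
1001=
{
    bn="Cobthach"
    fat=1000
    mot=999
    dnt=500
    rel="catholic"
    cul="irish"
}
'''

# One flat key=value pair per line; values are quoted strings or bare tokens.
PAIR = re.compile(r'^[ \t]*(\w+)=("[^"]*"|[^\s{}]+)[ \t]*$', re.MULTILINE)
# The entry header is a numeric id followed by "=" on its own line.
HEADER = re.compile(r'^[ \t]*(\d+)=[ \t]*$', re.MULTILINE)


def parse_character(entry):
    """Return (character_id, fields) for one flat character entry."""
    char_id = int(HEADER.search(entry).group(1))
    fields = {}
    for key, raw in PAIR.findall(entry):
        value = raw.strip('"')
        # Identifiers are stored as integers, everything else as text.
        fields[key] = int(value) if value.isdigit() else value
    return char_id, fields


char_id, fields = parse_character(SAMPLE_ENTRY)
```

Each parsed entry is then a plain dictionary, which is a convenient shape for inserting into a document store such as MongoDB.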
I wrote a script that would pull out this data and store it in MongoDB. This script and the ones used for the below data can be found on my GitHub account. I allowed a full game to play out in observer mode from 769 but the game doesn’t end at 1453 in this mode so I stopped on the 1st of January 1460. You can click here to download a zip of the save file I used to make these networks. The save file is from version 2.7.2, the version before the Jade Dragon expansion released in November. I decided to use the previous version of the game as with Jade Dragon China tends to expand quickly and creates a lot of dynasties in the Western Protectorate that tend to die out in a generation or two.
I was able to build 2 main types of networks from the save game data: marriage networks and kill networks. Links to interactive versions of all of the networks, built with linkurious.js, can be found here. Due to the size of these networks they can take a few seconds to load.
Marriage is a very important tool in CK2 for a number of different reasons, mainly to provide legitimate children to succeed your current ruler and to secure alliances with your in-laws. Unlike in other games, where you can sign treaties with those you are on good terms with, in CK2 you can often only form alliances with those you have close marriage ties with. Marrying members of your immediate family into the powerful dynasty threatening your border may make them like you more, and forming an alliance may stave off an invasion and provide you with an ally to attack your rivals.
I wrote some code that took all characters in the game who had a dynasty and a spouse and built a network of all the dynasties with weighted edges existing between those where a character from each had married each other. Clusters form in the network around geography and religion and the below image shows the network colored using the modularity statistic in Gephi. The Indian subcontinent is isolated in the top right and Europe is in the bottom left in orange with Italy splitting out into a cluster of its own just above it. The brown cluster above this is the Greek dynasties of Byzantium which converted early in the game to Islam and expanded massively. The light blue cluster represents the dynasties of the Middle East and North Africa, the dark green cluster to the right is Spain and West Africa and the light green cluster in the middle is the pagan dynasties of Eastern Europe and the Steppes. An interactive version of this network can be found here.
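The edge-building step can be sketched roughly as follows. The records and field names (id, dynasty, spouse) are hypothetical stand-ins for the data pulled from the save file, and skipping marriages inside a single dynasty is my assumption here, not necessarily what the original script did.

```python
from collections import Counter

# Hypothetical character records; field names are assumptions.
characters = [
    {"id": 1, "dynasty": "A", "spouse": 2},
    {"id": 2, "dynasty": "B", "spouse": 1},
    {"id": 3, "dynasty": "A", "spouse": 4},
    {"id": 4, "dynasty": "B", "spouse": 3},
    {"id": 5, "dynasty": "C", "spouse": 6},
    {"id": 6, "dynasty": "A", "spouse": 5},
    {"id": 7, "dynasty": None, "spouse": 8},  # no dynasty, so ignored
    {"id": 8, "dynasty": "C", "spouse": 7},
]


def marriage_edges(characters):
    """Count marriages between each unordered pair of dynasties."""
    by_id = {c["id"]: c for c in characters}
    seen_couples = set()
    weights = Counter()
    for c in characters:
        spouse = by_id.get(c.get("spouse"))
        if spouse is None or not c["dynasty"] or not spouse["dynasty"]:
            continue
        # Each marriage appears on both partners, so count it only once.
        couple = frozenset((c["id"], spouse["id"]))
        if couple in seen_couples:
            continue
        seen_couples.add(couple)
        if c["dynasty"] != spouse["dynasty"]:
            pair = tuple(sorted((c["dynasty"], spouse["dynasty"])))
            weights[pair] += 1
    return weights


edges = marriage_edges(characters)
```

The resulting (dynasty, dynasty) to weight mapping can be turned into (u, v, weight) tuples and handed to networkx's Graph.add_weighted_edges_from before exporting to Gephi.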
This is the largest connected component in the network and consists of almost 10,000 of the 29,000 dynasties that appear in the game. There are many smaller components consisting of only a handful of dynasties and they do not appear here. Dynasties will cluster closer together if there are more edges (marriages) between them and will be further apart if there are fewer. While there may be few marriages between Irish dynasties (on the right of the bottom orange cluster) and the Italian/Lombard dynasties (the smaller orange cluster to the left), they both marry into French and German dynasties and are therefore closer to each other than they are to the dynasties of distant India in the top right. There are also interactive versions of this network colored by culture and by religion.
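For anyone unfamiliar with the idea, pulling out the largest connected component amounts to a breadth-first search from each unvisited node, keeping the biggest group found. Below is a dependency-free sketch of that idea on a toy edge list; it is a plain-Python stand-in, not the networkx call used in the project.

```python
from collections import defaultdict, deque


def largest_component(edges):
    """Return the set of nodes in the largest connected component.

    edges is an iterable of (node_a, node_b) pairs, treated as undirected.
    """
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    best = set()
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from start.
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in component:
                    component.add(neighbour)
                    queue.append(neighbour)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


toy_edges = [("A", "B"), ("B", "C"), ("D", "E")]
main_cluster = largest_component(toy_edges)
```

With a networkx Graph G the same result comes from max(nx.connected_components(G), key=len).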
The above network shows all marriages between all dynasties but I wanted to see how the most powerful dynasties acted. I wrote code to find every character who had held a Kingdom or Empire level title and then got all their children. I then built a marriage network for them; an interactive version can be found here, showing dynasties that held titles in orange and those that didn’t in blue. Again Europe and India are separated on the left and right. The nodes are sized by PageRank, a measure of a node’s popularity. The large orange nodes have clusters of blue nodes connected to them, but there are no large blue nodes that often provide spouses to the reigning dynasties. There are few blue nodes which sit between the larger clusters, but there are a number of marriages between ruling dynasties that cross religious divides. In the central column of clusters the pagan dynasties are at the top, in the middle Greece is on the left and the Middle East on the right, and at the bottom are Spain and North Africa.
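To give a feel for what PageRank measures, here is a minimal power-iteration sketch on a weighted, undirected toy network. It is a generic textbook version with an assumed damping factor of 0.85 and made-up dynasty names, not the library routine behind the interactive networks.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank for an undirected weighted graph.

    edges maps (node_a, node_b) -> weight. Every node has at least one
    edge, so there are no dangling nodes to handle.
    """
    neighbours = {}
    for (a, b), weight in edges.items():
        neighbours.setdefault(a, {})[b] = weight
        neighbours.setdefault(b, {})[a] = weight

    nodes = list(neighbours)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    strength = {node: sum(neighbours[node].values()) for node in nodes}

    for _ in range(iterations):
        # Every node gets a small baseline share of rank...
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        # ...plus a weighted share of each neighbour's current rank.
        for node in nodes:
            for other, weight in neighbours[node].items():
                new_rank[other] += damping * rank[node] * weight / strength[node]
        rank = new_rank
    return rank


# Hypothetical example: one ruling dynasty marrying into three minor ones.
toy = {("Royal", "X"): 1, ("Royal", "Y"): 1, ("Royal", "Z"): 1}
ranks = pagerank(toy)
```

In this toy example the ruling dynasty ends up with the largest score because every marriage runs through it, which mirrors the large orange nodes surrounded by small blue ones.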
Characters can also be killed in the game. Battle, poison, a lone arrow or a basement full of manure and a lit match can all bring a life to an early end. When assassinations are carried out there’s a chance that the character ordering the killing will go undiscovered. Even if this happens the save file still registers, in the victim’s data, the unique identifier of their killer. The kill networks are built using this data. An edge exists between two dynasties where a member of one killed a member of the other. While it would make more sense for this to be a directed graph, indicating which dynasty did the killing and which suffered it, this caused difficulty when trying to pull out the largest connected component using networkx. What I should have done is build the network undirected, take out the largest component and then rebuild a directed network using only these nodes, and this is something I might do in future.
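The undirected-then-directed approach described above could look something like this sketch. It uses plain Python and made-up dynasty pairs, and is only an outline of the idea, not code from the project.

```python
from collections import Counter, defaultdict


def directed_largest_component(kills):
    """Keep only directed kill edges inside the largest component.

    kills is an iterable of (killer_dynasty, victim_dynasty) pairs.
    Returns a Counter mapping each directed pair to its kill count.
    """
    kills = list(kills)

    # Step 1: undirected adjacency, ignoring who killed whom.
    adjacency = defaultdict(set)
    for killer, victim in kills:
        adjacency[killer].add(victim)
        adjacency[victim].add(killer)

    # Step 2: find the largest connected component with a depth-first walk.
    seen, largest = set(), set()
    for start in adjacency:
        if start in seen:
            continue
        component, stack = {start}, [start]
        while stack:
            for neighbour in adjacency[stack.pop()]:
                if neighbour not in component:
                    component.add(neighbour)
                    stack.append(neighbour)
        seen |= component
        if len(component) > len(largest):
            largest = component

    # Step 3: rebuild directed, weighted edges using only those nodes.
    return Counter((k, v) for k, v in kills if k in largest and v in largest)


toy_kills = [("A", "B"), ("A", "B"), ("B", "A"), ("B", "C"), ("D", "E")]
kill_edges = directed_largest_component(toy_kills)
```

networkx also provides weakly_connected_components for directed graphs, which may be a shorter route to the same result.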
The kill networks (colored by cluster, by culture or by religion) follow along the same lines as the marriage networks. India is isolated in the top right, Europe in the bottom left. Greece is to its right and many of the pagan dynasties are above it to the left. Spain is in the centre, the Middle East is above it and North Africa is to the right. These networks take in all killings, and a lot of killings are carried out due to rivalry or spite. Religious leaders will demand that their liege burn heretics alive, and often prisoners’ constitutions will not prove up to the task of surviving prison. Here is the code I used to focus just on those killed in battle. During a battle there is a chance that a named character on one side will kill a named character on the other side. On occasion kings will fall in battle, as happened with Conlang De Vannes, the founder of the Kingdom of Ireland. I built a network showing the connections between dynasties who killed a member of another dynasty in battle (again, this should be directed to make more sense) and took the largest component. Click here to see the network colored by culture, here by religion.
India forms a closed circle in the top right while small clusters form around the rest of the network, mainly consisting of multiple religions. Religious wars break out when a member of a religion declares a Holy War to take land belonging to a different faith. Members of both religions can flock to the aid of the attacker or defender. As the name of the game suggests the Pope can call Crusades for the conquest of lands not in Catholic control and it is in this fashion that the Republic of Greece, after electing an Islamic leader, was conquered by England in this game. Members of the same religion will also go to war with each other to enforce claims.
There are generally only a small number of characters involved in a battle and only a small chance of a killing happening, so it is rare to rack up multiple kills during a military career. Here I pulled out all characters who had been killed in battle and grouped them by their killer. Both Mahipala Mahipalid and Samir Samirid had 4 kills during their lives. The Ayudha of northern India had the most kills of any dynasty with 18 but suffered heavy losses, along with the southern Vengi Chalukya dynasty. There is also a list of all the knights who killed and were then killed themselves in battle; a number of them achieved 2 kills before falling.
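The grouping step is simple enough to sketch with a Counter. The death records and their field names (victim, killer, cause) are hypothetical; the real data lives in MongoDB and uses the save file's own keys.

```python
from collections import Counter

# Hypothetical death records; field names and values are assumptions.
deaths = [
    {"victim": 11, "killer": 1, "cause": "battle"},
    {"victim": 12, "killer": 1, "cause": "battle"},
    {"victim": 13, "killer": 2, "cause": "battle"},
    {"victim": 2, "killer": 3, "cause": "battle"},
    {"victim": 14, "killer": 1, "cause": "dungeon"},   # not a battle death
    {"victim": 15, "killer": None, "cause": "battle"},  # killer unknown
]


def battle_kills(deaths):
    """Count battle kills per killer, skipping deaths with no named killer."""
    return Counter(
        d["killer"]
        for d in deaths
        if d["cause"] == "battle" and d["killer"] is not None
    )


kill_counts = battle_kills(deaths)

# Characters with at least one battle kill who also died in battle.
# A dead character cannot kill, so their kills must have come first.
battle_victims = {d["victim"] for d in deaths if d["cause"] == "battle"}
killed_after_killing = sorted(k for k in kill_counts if k in battle_victims)
```

Calling kill_counts.most_common() then gives the leaderboard, and summing the counts per dynasty instead of per character gives the dynasty totals.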
While building out the kill networks I also looked at the top killers in the game. Jochi Jochid, the Emperor of the Mongol Empire, was responsible for the deaths of 36 people, most of them dying in his prison. Not to be outdone, his son and successor Bilge finished off another 42! The save file also contains data on a character’s father and "rfat", or real father, in the event that they were born outside of marriage. That sort of thing was important back in the days of hereditary titles. If a character has both a fat and an rfat value it means the person bringing them up, their father, isn’t their real father. In this notebook I got details on which characters had the most children. Uways Abbasid was bringing up 17 children who had an rfat value, meaning he wasn’t their father. On the other hand Abdul-Razzaq Hasan had 22 children with married women and the children were being brought up as belonging to the husband. Amaneus de Carcassonne had 19 children, either with unmarried women or with women who were married but the affair was discovered. In total Muhammad Aleppo was the father of 44 children with 34 mothers and Angilbert Bouvinid had 41 children with 35 different mothers. It must be hard to remember all those birthdays!
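The fat and rfat logic can be sketched like this. The records are invented, mot as the mother key is an assumption, and the sketch only covers the two cases described above (fat alone, or fat together with rfat).

```python
from collections import Counter, defaultdict

# Hypothetical records: "fat" is the legal father, "rfat" the real father
# when the two differ. "mot" as the mother key is an assumption.
characters = [
    {"id": 1, "fat": 100, "mot": 200},
    {"id": 2, "fat": 100, "rfat": 101, "mot": 200},
    {"id": 3, "fat": 102, "rfat": 101, "mot": 201},
    {"id": 5, "fat": 101, "mot": 203},
]


def fatherhood_stats(characters):
    """Return (children fathered, mothers per father, children raised for another)."""
    fathered = Counter()
    mothers = defaultdict(set)
    raised_for_another = Counter()
    for c in characters:
        legal = c.get("fat")
        # With no rfat value, the legal father is also the real father.
        real = c.get("rfat", legal)
        if real is not None:
            fathered[real] += 1
            if c.get("mot") is not None:
                mothers[real].add(c["mot"])
        if legal is not None and real != legal:
            raised_for_another[legal] += 1
    return fathered, mothers, raised_for_another


fathered, mothers, raised_for_another = fatherhood_stats(characters)
```

Here character 101 is the real father of three children by three mothers, while 100 and 102 are each raising one child who is not theirs.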
Women don’t tend to have as many children as men but Piratamatevi Ay found time to have 10 children in between conquering south India and establishing the Ay as a major power. Amalfrida Liutprandingi had 6 children and was married to the Bishop of Oderzo. Chlotsuintha Lambertingi married the Mayor of Genoa and had 6 children. In both cases the husbands were not the fathers.
Finally, I’ll end where I began. At the start I aimed to focus mainly on the Irish dynasties in the game (and did up marriage networks of them). When I was writing out the code with only a few years’ worth of data there weren’t any powerful Irish rulers and so I used Charlemagne to test if what I was trying to do was even possible. There are 2 notebooks based around using MongoDB’s $graphLookup on Charlemagne. The second one takes a full count of all his direct descendants through the male and female lines. By the end of the game almost 50,000 characters have been born who are descendants of Charlemagne. Over 700 of them are members of the Italian Alachisling dynasty, 613 are members of the Greek Isauros that ruled Byzantium before its conversion to Islam and 506 are members of the Penikis dynasty that controls Finland, Rus and much of North Eastern Europe. There are also a few cells of code that find all descendants in common between two characters. I chose Charlemagne and Cobthach of the Eóganacht-Locha Léin dynasty, count of Thomand (Clare/Limerick) until his death in 789. Cobthach has over 14,000 descendants himself and he and Charlemagne share 10,000 in common.
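$graphLookup performs this kind of recursive traversal inside MongoDB. As a plain-Python analogue of the same idea, and not the pipeline from the notebooks, the sketch below walks parent links in reverse to collect descendants through both lines and then intersects two descendant sets. The records are invented, and mot as the mother key is an assumption.

```python
from collections import defaultdict, deque

# Hypothetical family tree: characters 1 and 2 are the two ancestors.
characters = [
    {"id": 1},
    {"id": 2},
    {"id": 3, "fat": 1},
    {"id": 4, "fat": 2},
    {"id": 5, "fat": 4, "mot": 3},  # descends from both 1 and 2
    {"id": 6, "fat": 5},
]


def build_children_index(characters):
    """Map each parent id to the ids of their children (both lines)."""
    children = defaultdict(set)
    for c in characters:
        for key in ("fat", "mot"):
            parent = c.get(key)
            if parent is not None:
                children[parent].add(c["id"])
    return children


def descendants(children, ancestor):
    """Breadth-first walk down the tree from one ancestor."""
    found = set()
    queue = deque([ancestor])
    while queue:
        for child in children.get(queue.popleft(), ()):
            if child not in found:
                found.add(child)
                queue.append(child)
    return found


index = build_children_index(characters)
from_first = descendants(index, 1)
from_second = descendants(index, 2)
shared = from_first & from_second
```

Tracking visited ids matters on the real data, since the same character can be reached through many different lines of descent.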
Crusader Kings 2 has a wealth of information that can be used for building networks and finding out interesting statistics about a game. While I mainly let the game run away on its own in observer mode, I did watch the progress of the Kingdom of Italy and the Knights Hospitaller, who controlled most of Germany. The networks and statistics pulled from the game suggest that there was an interesting story going on in India.
I’m going to amend the code in the notebooks shortly to work with Jade Dragon as well. I’m also going to see about rebuilding the kill and battle networks as directed networks. I started this project on the 2nd of October so I have been working on it for a long time and am glad to be finished. The Social Network of the 1916 Rising project I did before this was a great help and I reused some of the code for getting the centrality measures.
If you have any suggestions for improvements, or for other measures or statistics that could be drawn from the data, please let me know. I’d love to hear about other video games that have save files which allow for data science to be applied in a similar way.