MAS

A Kwartet simulation using logic for multiple agents

written by Diego Cabo & Tanja de Vries & Ruben Kip

Content

Introduction
The rules of Kwartet
Example game loop
Architecture
Actions
Results
Discussion
Conclusion
Improvements

Introduction

Kwartet is a card game that requires the players to use logic to win. In this game the goal is to collect sets of cards of the same group, also called a kwartet, by asking other players for cards. The game requires knowledge of your own cards, the card set and the previous asked cards to figure out which player has the cards you require for completing a Kwartet.

In this project we try to make a working simulation of kwartet using logic rules for a multi-agent system. The system is played by computer generated agents which make decision based on their model. This article will explain our decisions, steps, results and some theory behind the game. The simulation is programmed in Python 3 and uses only the Python standard library. The code can be found here. The simulation is tracked by logging steps from the simulation to a log file. Furthermore we added graph generation of the indivdual agent models, the models can be found in the "knowledge-graphs" directory, created upon running the program.

Besides our goal of making a working kwartet simulation using logic rules we also have set ourselves the extra goal of finding out the best kwartet strategy and become kwartet masters! We also thought it would be a nice addition if you could join the agents in a game of kwaret. So there are 2 modes in this kwartet, one were you can let agents play against each other and one where you can play against other agents using the terminal.

The rules of Kwartet

Before we can let our agents play kwartet and apply a strategy to win, we need to know the rules of kwartet a bit better.

The game of Kwartet is played in an unusual order. That is, after a player finishes his turn, the player last asked a card starts his turn. On your turn, you can ask for a card to any other player. If the other player does not have the card, your turn is over and his will start. If the other player does have the card, he has to give the card to you and you may ask another question to any of the players. Your turn ends at the moment you get no as an answer. The game has two more rules. Firstly, you can only ask for a card if you have a card of the same group. So, you are only allowed to ask for ‘a fat cow’ if you own at least one of the other ‘farm animals’ cards. Secondly, which may be the most important rule of Kwartet, you are not allowed to lie. If someone asks you for a card that you own, you have to give him the card.

Every group consists of 4 cards. If you have collected all cards from a group, you earn 1 point and the group is out of the game. The game ends if all groups are collected and there are no cards left in the game. The player with the most points, i.e. the player that collected the most groups is the winner.

An example game loop

Suppose Diego, Ruben and Tanja are playing a game of Kwartet. Like John Snow, Diego knows nothing, starting the game by asking a random card to a random player. In this case he asked Tanja for the 'smelly goat' card from the 'farm animals' group. Even before Tanja gives an answer, the knowledge of the players changes. To see what happens with the knowledge we go through the first questions of this game step by step.

Diego: “Tanja, do you have from ‘farm animals’ the ‘smelly goat’?”

Based on this announcement it is common knowledge that Diego has a card of the group ‘farm animals’, otherwise he would not be allowed to ask this question. Moreover, it is common knowledge that he does not have the ‘smelly goat’ card.

Tanja: “No.”

From this point it is common knowledge that Tanja does not have the ‘smelly goat’ card, so everyone now removes the possiblity in their model that Tanja has a ‘smelly goat’ card.

Diego’s turn is over, it is Tanja's turn.

Tanja: “Diego, do you have from ‘farm animals’ the ‘crazy pig’?”

This time again it becomes common knowledge that Tanja has a card of the ‘farm animals’, but she does not have the ‘crazy pig’.

Diego: “Yes.”

Diego gives the ‘crazy pig’ card to Tanja.

Now it is common knowledge that Tanja has the ‘crazy pig’.
It is common knowledge Diego & Ruben do not have the ‘crazy pig’.
Moreover, Ruben and Tanja no longer know whether Diego has a ‘farm animals’ card.

Architecture

To model the structure of a Kripke model, we use python dictionaries. Keys in the dictionaries are representative of propositional atoms in each world, with exception of the agent key which defines the agent relation. Keys are nested; by following the structure the keys are added to define the values of the propositional atoms of a world. The final value is an Operator of the K language, [M,K,not]; here not can be seen as removed from the model. A visualisation of the structure can be found in Figure 1. The reasoning behind the key ordering in our data structure is that, a possible group is know by an agent before a single card is known. Furthermore a card group is of higher order than a card, optimizing the search for possible options. If a card is known, the group is known, so there is no option of a known card in a M or deleted group.

While playing different strategies we require to look at different observables, as an example we need to look not only at cards but also at card groups (Figure 2).In our model we require a seperate model for each of these observables, this is because of a limitation in our datastructure. We cannot "remove" a world in the model without affecting worlds that should still be accesble from other worlds. As an example we cannot remove a group for one agent without removing the group for all agents.

To make the relation with Kripke models more specific, we show in Figure 3 a Kripke model with the corresponding data structure that we use in Figure 4. In this example we have three players and one group of cards. Player B starts with two cards and player A and C start with one card. Player A then asks player C for card 1 after which player C announces she does not have this card. The data structure is only shown for player A, we have similar data structures for player B and C.

Fig.3 - Kripke model, where the double circle is the true world. A1 means that player A has card 1.

Fig.4 - Kripke model data structure for player A.

The model described so far can only store first-order logic. However, in a game of kwartet one also uses the knowledge of the other players. To store second-order, an extra layer, the observer, is added between group and player. If the observer is the player himself, then we have in that branch the model as it was before. However, if the observer is one of the opponents, a player stores what he knows about the opponents' knowledge. We could not think of kwartet strategies using third or even higher order logic. If you out-think us, note that higher order logic can be implemented similar to second order logic.

Actions

There are 5 actions that influence the knowledge structure (model):

(Private) Seeing own cards: By the basic step of looking at which cards an agent has in his hand, we can update the model by removing worlds where these cards are owned by other players. This step is private for each individual agent.

(Announcement) Requesting a card: According to the rules, requesting a card can only be done when you have a card from the group. This is a public announcement, so for each agent we update the model, replacing M(requested_group) because we have K(requested_group) for that agent.

(Announcement) Giving a card: Giving a card is one of the most important announcements. After this action we update the model for each agent. Every agent knows that the player that was asked for the card does not have it any more. Every agent also knows that the player that asked for the card, has it now. Furthermore, they know that the agent that was asked for the card might not have cards of the mentioned group any longer. So, unless a player knows with certainty that the player who was requested a card has other cards of that group, he will set the group state for that player to M(group).

(Announcement) Not giving a card: We update the model of each agent by removing the world were the asked agent has this card from the model.

(Announcement) Kwartet!: We update each agent model by removing the group entirely from the model.

Furthermore we have 1 action that does not change the model but uses second order logic:

(Private) Assess model (advanced thinking, not applied for all agents): By individually looking at what other agents know, the agent tries a different strategy of asking cards in the model.

Results

We tested implemented strategies by running the game 100 times using 4 agents and 24 cards. We choose 4 agents so exclusion does not play a direct role. Keep in mind that in kwartet there always plays a certain 'luck' factor, created by the random division of cards and questioning. We tried to decrease the effect by the increasing in player count, card number and of course number of games. The resulting win percentages using the different strategies are below:

Strategy	Agent 1	Agent 2	Agent 3	Agent 4
All random (worst case)	20.9%	28.2%	23.6%	27.2%
A1 1st order, rest random	83.5%	6.3%	2.8%	7.3%
All 1st order, A1 advanced	32.8%	22%	24.5%	20.7%
A1 1st order, A2 2nd order, rest random	36.3%	57.2%	2.7%	3.8%
A1 1st order, A2 2nd order + advanced (best case), rest random	30.6%	62.5%	3.5%	3.3%

We also tested perfomance, as the model grows exponentionally it is intresting to know what the limits in cards, strategy and agent count is. Keep in mind depending on how great the difference in quality between an agent and his opponents playing rounds and time might differ a lot, as these factors are affected if a 'good' or 'bad' playing agent is on play a lot. Also keep in mind that performance might be different based on factors like hardware. We will use these numbers only for indication purpose not for conclusions. These are the performance results for 1 game of kwartet:

Strategy	Average play rounds	Player count	Card count	Time (seconds)
All random (worst case)	340	10	24	130
A1 first order, rest random	100	10	24	32
All 1st order, A1 advanced	90	10	24	50
A1 2nd order, rest 1st order	85	10	24	25
All 2nd order	80	10	24	30

Discussion

Looking at the results between the different strategies, we can already see the 'luck' factor playing a role in that the win percentages are not equal over the agents. This effect is further strenghtened by the fact that the python Random library as with all random functions is a Pseudo-Random function, this means the randomization is deterministic. We will however act like the randomization in card division and unknown card choice as fully random. This because a better strategy is still visible when agents play different strategies.
Not suprisingly we can clearly see that the first order logic already almost always beats the random agents. The random agents in principle only use Deleted and M, by adding K agent 1 can ask all cards when they are known without them being asked back as quickly. The same holds for counting cards which seems to improve win ratio a small percentage. What is more intresting is that we see that second logic order actually already has a significant win ratio over the first order agent.
When implementing different strategies, it is depended on what strategies the other agents play to have a bigger chance at winning. Finding the best and most resilient strategy was not our goal, but might be very good continuation of this project.
We have chosen a model per agent instead of one full model containing all worlds and agent relations. In our opinion a per agent model makes more sense, as in real life situations agents are isolated, therefore an agent cannot easily rely on a model that is shared with multiple agents. A model per agent gives the extra flexibility of fully isolated decision making.

Conclusion

The simulation works as expected, however there is still a lot of improvements possible as we will mention in the improvements section. Our simulation does not manage to solve higher logic then second order, but for the game of Kwartet this is not a very important skill. The second order logic can be useful in cases like, when an agent uses the fact that another agent knows his card, trying to get a Kwartet for this card as soon as possible. Performance wise the model is able to handle an acceptable growing amount of players and cards. From the strategies we implemented, we concluded that the best strategy is to actually hide cards from your card set, playing known known cards as quickly as possible. This strategy is requires the 2nd order logic model. Less suprisingly is that the advanced thinking also improves win chance, counting cards makes a difference. Although there are some clear results, as mentioned earlier in the discussion, we still think it is very dependent on what your opponents do to be succesfull. As final conclusion, we can definitely say that adding second order logic is usefull for agents to win games, as can be seen in the results. So if you are an aspiring pro kwartet player, here is some free advice, count your cards and think about what your opponents are thinking.

Improvements

In our implementation of this simulation of Kwartet there are still things to improve in play style and in performance. We will list a few points:

Our simulation has no implementation for drawing cards as is normal in regular Kwartet. This means however that on different announcements and actions the model changes differently as it might be that cards are still in the deck.

In our data structure all worlds are initialized, creating the largest possible Kripke model. The behaviour of a Kripke model is simulated by setting elements to be removed when we make announcements. Dynamic world creation and removal in a data structure might be more memory efficient and faster to process.

To create a more human like simulation, the simulation could add agents forgetting cards, this can be done by once in a while setting propositionals to M with a probability P in the model.

There are still different alternative strategies or even dynamic strategies. We only implemented the mentioned strategies.