The complete Python code and the Exploratory Data Analysis notebook are available on my GitHub profile.
Pokémon is a Japanese media franchise that began as Pokémon Red and Pokémon Green, a pair of video games for the original Nintendo Game Boy, developed by Game Freak and published by Nintendo in February 1996. Pokémon has since become the highest-grossing media franchise of all time, with $90 billion in total franchise revenue.
Pokémon, Wikipedia article
Pokémon games, especially those from the first and second generations (Pokémon Red, Blue, Yellow, Gold, Silver and Crystal), left a real mark on the entire generation born in the 90s, myself included. Recently I was browsing datasets on Kaggle and found a post whose author had shared datasets with Pokémon and in-game Pokémon combat data, and proposed a challenge: build a model able to predict the outcome of a battle between two Pokémon, based on the given data.
Merging my childhood hobbies with my current ones? Sounds good, so I'm game!
Datasets
The Weedle's Cave post on Kaggle provides the following datasets:
Pokémon Dataset | 800 entries, 12 features
This dataset contains data on all the Pokémon released up to the 6th generation:
#: Pokémon index (not the same as the one used in the franchise's official lists, as Mega Evolutions are listed right after the Pokémon's normal form);
Name: Pokémon's name;
Type 1: Pokémon's primary type;
Type 2: Pokémon's secondary type (all Pokémon have a primary type, but not necessarily a secondary one);
HP: Health Points base stat;
Attack: Physical Attack base stat;
Defense: Physical Defense base stat;
Sp. Atk: Special Attack base stat;
Sp. Def: Special Defense base stat;
Speed: Speed base stat;
Generation: Which generation of the franchise games this Pokémon belongs to;
Legendary: Indicates whether the Pokémon belongs to the legendary group;

Combats Dataset | 50,000 entries, 3 features
This dataset contains data from Pokémon combats, listing the two Pokémon involved by their index (#) and which one ended the battle as the winner:
First_Pokemon: Trainer A's Pokémon # (index). The Pokémon that attacked first is always listed as "First_Pokemon";
Second_pokemon: Trainer B's Pokémon # (index);
Winner: # (index) of the Pokémon that won the battle;
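As a minimal sketch of how the two tables relate, the snippet below resolves the indices of a combat into Pokémon names and derives a "first Pokémon won" flag. It is plain Python over a tiny hand-made sample: the index values and the combat rows are invented for illustration and are not taken from the Kaggle files, and the helper name `describe` is hypothetical.

```python
# Hand-made sample keyed by the "#" index (index values are illustrative).
POKEMON = {
    1: {"Name": "Bulbasaur", "Type 1": "Grass", "Type 2": "Poison"},
    5: {"Name": "Charmander", "Type 1": "Fire", "Type 2": None},
    10: {"Name": "Squirtle", "Type 1": "Water", "Type 2": None},
}

# Each row is (First_Pokemon, Second_pokemon, Winner); invented rows.
COMBATS = [(5, 1, 5), (5, 10, 10), (1, 10, 1)]


def describe(combat):
    """Replace the three indices of a combat row by Pokémon names."""
    first, second, winner = combat
    return {
        "first": POKEMON[first]["Name"],
        "second": POKEMON[second]["Name"],
        "winner": POKEMON[winner]["Name"],
        "first_won": winner == first,  # True when the first attacker won
    }


rows = [describe(c) for c in COMBATS]
for row in rows:
    print(row)
```

The same lookup maps naturally onto a dataframe join on the `#` column when working with the full files.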

How does a Pokémon Battle work?
Before moving into the Exploratory Data Analysis (EDA), it's important to understand how single Pokémon battles work (double battles exist as well, but none are recorded in the dataset).
Pokémon combats follow the usual RPG model: battles are turn-based, each participant takes one action per turn (not simultaneously), and the first to reduce the opponent's Health Points to zero is the winner. In the franchise games, the actions are taken by Pokémon Trainers, who can choose to attack the foe's Pokémon, use an item, or switch their own Pokémon for another one on the team (trainers are allowed to carry up to 6 Pokémon with them).
As mentioned above, the metric used to determine the winner is Health Points, or HP for short. The amount of HP each Pokémon has is given by the following equation:
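For reference, the HP formula used by the main series games from Generation III onward, as documented on Bulbapedia, can be written as follows. This is a transcription from that reference, where Base is the HP Base Stat, and it may differ in layout from the figure originally shown here.

```latex
\[
HP = \left\lfloor \frac{\left(2 \times Base + IV + \left\lfloor \frac{EV}{4} \right\rfloor\right) \times Level}{100} \right\rfloor + Level + 10
\]
```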

The given datasets only provide the HP Base Stat (highlighted in the equation). All the other variables (Individual Values, IV; Effort Values, EV; and Level) are in-game mechanics and won't be considered in the analysis, as there's no way to recover their values for the combats recorded in the dataset. However, as the equation shows, total HP grows linearly with the HP Base Stat. In other words, the higher the base stat, the more HP a Pokémon can reach when fully trained.
In order to reduce the opponent's HP to zero and win the battle, Pokémon use Battle Moves: special moves that inflict damage, boost stats, or apply effects to the opponent or even to the environment. For offensive Battle Moves, those that inflict damage and reduce the foe's HP, the damage dealt is calculated using the following equation:
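For reference, the damage formula as documented on Bulbapedia for recent generations has the general form below. This is a transcription under the assumption that the figure originally shown here used the same form; Power is the base power of the move used.

```latex
\[
Damage = \left( \frac{\left( \frac{2 \times Level}{5} + 2 \right) \times Power \times \frac{A}{D}}{50} + 2 \right) \times Modifier
\]
```

Modifier is itself a product of several multipliers, including the Type factor, the same-type attack bonus (STAB), critical hits and a random factor.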

Where:

A/D is the ratio between the effective Attack stat of the attacking Pokémon and the effective Defense stat of the target Pokémon. "Effective" reflects the fact that there are two kinds of damage: Physical and Special (as a simple analogy, think of Special as something like "magical"). Each kind uses specific stats for the A/D factor: Physical moves use the attacker's Attack stat and the target's Defense stat, while Special moves use the attacker's Sp. Attack stat and the target's Sp. Defense stat.
Like HP, these stats are given by an equation, and the final value grows with the respective Base Stat, which is the only information the Pokémon Dataset gives for every Pokémon.
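To make the role of the Base Stat concrete, here is a small sketch of both stat formulas (HP and the other five stats) as documented on Bulbapedia for Generation III onward. The function names and the default arguments (IV 31, EV 252, level 100, meaning a fully trained Pokémon) are my own assumptions for illustration; the nature bonus is passed as an integer percentage (90, 100 or 110) to keep the arithmetic exact.

```python
def hp_stat(base, iv=31, ev=252, level=100):
    """Final HP from the HP Base Stat (integer division plays the role of floor)."""
    return (2 * base + iv + ev // 4) * level // 100 + level + 10


def other_stat(base, iv=31, ev=252, level=100, nature_pct=100):
    """Final Attack, Defense, Sp. Atk, Sp. Def or Speed from its Base Stat."""
    raw = (2 * base + iv + ev // 4) * level // 100 + 5
    return raw * nature_pct // 100


# Charmander has base HP 39 and base Speed 65.
print(hp_stat(39))
print(other_stat(65))
```

With everything else held fixed, a higher Base Stat always yields a higher final stat, which is why the Base Stats can stand in for the real stats in this analysis.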

The other piece of information we have for building our analysis is the Type factor. Long story short: every Pokémon has a Type, which is something like an essential element of its body composition. Every Pokémon has at least one Type and may have up to two: a primary and a secondary one.
Every Type has specific interactions with the others: it can be strong, weak, neutral or immune. Also, every Battle Move is associated with exactly one type, so the Type factor in the damage equation is the damage multiplier of the attacking move's type against the target's type. (Don't panic, we are getting there soon.)
Hence, for this analysis we will treat damage as directly proportional to the product of the A/D factor and the Type damage multiplier, which are the pieces of information available in the datasets.
Sounds good so far, right? Although we don't have all the information needed for a complete and fully accurate calculation, knowing how to compute the Type damage multiplier gives us enough to build some good intuition about the maths behind Pokémon battles. So that is the very next step!
How to calculate the Type damage multiplier factor?
Now that we know what Pokémon types are, and that they matter a lot for the damage calculation, it's time to see which types exist in the Pokémon world:

Each of these types has an offensive and a defensive relation to all the others, and this information is given by Game Freak after every new generation release. I've taken the most recent Type chart from Bulbapedia and created a new dataframe containing exactly the same information shown in the figure below:

This chart gives the direct relation of each type to all the others, offensively (reading horizontally) and defensively (reading vertically).
For example, Electric moves have a damage multiplier of 2 when attacking a Water type, 0.5 against a Grass type, 1 against a Fire type and, finally, 0 against a Ground type. We can then say that Electric moves:
Are Super Effective against Water types;
Are Not Very Effective against Grass types;
Are Neutral against Fire types;
Do not affect Ground types;
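The Electric example above can be sketched as a simple lookup. The dictionary holds only the non-neutral entries of the Electric row of the type chart; the names `ELECTRIC_VS`, `LABELS` and `electric_multiplier` are hypothetical and chosen just for this illustration.

```python
# Non-neutral entries of the offensive Electric row of the type chart.
ELECTRIC_VS = {
    "Water": 2.0, "Flying": 2.0,                     # super effective
    "Grass": 0.5, "Electric": 0.5, "Dragon": 0.5,    # not very effective
    "Ground": 0.0,                                   # no effect
}

LABELS = {2.0: "super effective", 1.0: "neutral",
          0.5: "not very effective", 0.0: "no effect"}


def electric_multiplier(target_type):
    """Damage multiplier of an Electric move against a single target type."""
    return ELECTRIC_VS.get(target_type, 1.0)  # unlisted types are neutral


for target in ("Water", "Grass", "Fire", "Ground"):
    factor = electric_multiplier(target)
    print(target, factor, LABELS[factor])
```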
Given these four possible interactions between types, and the Type Chart, it should now be possible to calculate the Type Damage Multiplier for all the Pokémon combats, right?
Not yet!
As I've mentioned before, a Pokémon can have a secondary type. In fact, up to the 6th generation of games, which is what the Pokémon Dataset covers, only 48.25% of all Pokémon have just a primary type.
As the majority of Pokémon have two types, and the interaction factors are cumulative, we need a way to calculate the resulting interaction across multiple types. Since the Combats Dataset gives no information about which actions were taken during each battle, I'm making two assumptions to simplify the analysis:
Pokémon only use Battle Moves of their own types during battle. For example, a Fire/Flying Pokémon such as Charizard can only use Fire moves or Flying moves (each Battle Move has exactly one type);
Pokémon trainers always choose the best Battle Move option (when applicable) for their Pokémon, in order to cause as much damage as possible to the enemy's Pokémon.
Author's Note: It's absolutely possible and normal for Pokémon to know Battle Moves of types other than their own, and more experienced players always teach their Pokémon moves of types that offer some coverage for their weaknesses (a Pokémon can know up to 4 moves at the same time). However, this analysis won't consider this fact due to the lack of information in the combat dataset;
To understand how to calculate the Type Damage Multiplier factor, a simple example helps.
Imagine that two Pokémon trainers meet and decide to settle who's the best through a battle. Trainer #1 has a Charmander, and Trainer #2 has a Shuckle. Since Charmander has the higher Speed stat, Trainer #1 acts first.
Note: The Speed stat decides which Pokémon moves first in battle. The standard rule is quite simple: the fastest moves first. The Speed stat is given by the same equation shown before for stats other than HP, using the Speed base stat. Just for the record, there are in-game mechanics that override the standard Speed rule for selecting which Pokémon moves first, such as Priority Moves or Held Items. Once again, they are outside the scope of this analysis, since there's no data on these combat details.

As Charmander has only one type, Fire, there's no choice of move type here. It will certainly use a Fire move;
On the other hand, Shuckle has two types: Bug and Rock. So we need to consider how Fire moves interact with each of these two types and merge the interactions into a final number;
Using a Special move is the best option, as it yields the higher A/D factor;
Knowing the Type Damage Multiplier factor and the best A/D factor, the complete damage equation shown earlier in this article tells us that the output damage is directly proportional to the product of these two numbers.

From Shuckle's perspective as the attacker, there's an essential difference: Shuckle has two types, so its trainer must decide which move type inflicts the most damage on the foe's Charmander;
For Shuckle, physical attacks are the best option, as its Physical A/D factor is higher than its Special A/D factor;
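The Charmander versus Shuckle example can be reproduced with a few lines of Python. This is a sketch under the two simplifying assumptions above (own-type moves only, best option always chosen); `CHART` holds only the four chart entries this example needs, the base stats are the in-game values, and the helper names are hypothetical.

```python
# Only the (attacking type, defending type) pairs needed here; others are neutral.
CHART = {("Fire", "Bug"): 2.0, ("Fire", "Rock"): 0.5,
         ("Bug", "Fire"): 0.5, ("Rock", "Fire"): 2.0}

CHARMANDER = {"types": ("Fire",), "Attack": 52, "Defense": 43,
              "Sp. Atk": 60, "Sp. Def": 50}
SHUCKLE = {"types": ("Bug", "Rock"), "Attack": 10, "Defense": 230,
           "Sp. Atk": 10, "Sp. Def": 230}


def type_multiplier(move_type, target_types):
    """Multiply the single-type factors over all of the target's types."""
    result = 1.0
    for target_type in target_types:
        result *= CHART.get((move_type, target_type), 1.0)
    return result


def best_attack(attacker, target):
    """Best own-type move and best A/D factor for the attacker."""
    move_type = max(attacker["types"],
                    key=lambda t: type_multiplier(t, target["types"]))
    physical = attacker["Attack"] / target["Defense"]
    special = attacker["Sp. Atk"] / target["Sp. Def"]
    category = "physical" if physical >= special else "special"
    return (move_type, type_multiplier(move_type, target["types"]),
            category, max(physical, special))


print(best_attack(CHARMANDER, SHUCKLE))   # Fire move, Special A/D
print(best_attack(SHUCKLE, CHARMANDER))   # Rock move, Physical A/D
```

Fire is super effective against Bug but not very effective against Rock, so the two factors cancel out to 1 for Charmander, while Shuckle gets a multiplier of 2 by picking Rock over Bug.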
Considering how an attacking type interacts with the target's two types, there are six possible values for the Type Damage Multiplier factor:
Type Advantage
4: Both of the target's types are weak against the attacking type;
2: One of the target's types is weak and the other is neutral against the attacking type;
Neutral Interaction
1: Either both of the target's types are neutral to the attacking type, or one of them is strong and the other weak against it;
Type Disadvantage
0.5: One of the target's types is strong and the other is neutral against the attacking type;
0.25: Both of the target's types are strong against the attacking type;
0: At least one of the target's types is immune to the attacking type;
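The six values listed above are simply all the products of two single-type factors. A short check, assuming the four single-type multipliers 0, 0.5, 1 and 2 (a single-type target is treated as having a neutral second factor):

```python
from itertools import product

# Possible multipliers of one attacking type against one defending type.
SINGLE = (0.0, 0.5, 1.0, 2.0)

# Every combination of two single-type factors, multiplied together.
combined = sorted({a * b for a, b in product(SINGLE, repeat=2)})
print(combined)
```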
With all of that said, we now have a good idea of which features to analyze in order to understand which factors matter most for a Pokémon to end a combat as the winner.
However, none of this valuable combat information is given directly: the Pokémon Dataset only holds individual Pokémon characteristics, and the Combats Dataset only tells us which Pokémon were battling and which one won. So it was necessary to build an ETL pipeline: extract the necessary information from the three datasets, transform the data by calculating the features needed to understand the characteristics of combat winners (based on the knowledge of how Pokémon battles work), and load everything into a new dataset, which I named the Pokémon Combats Dataset.
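The transform step can be sketched for a single combat as below. This is not the pipeline from the project repo, only an illustration of the kind of features described in this article; the function name, the feature names and the choice to pass the type multipliers in as arguments are all assumptions made for the example.

```python
STATS = ("HP", "Attack", "Defense", "Sp. Atk", "Sp. Def", "Speed")


def build_features(first, second, type_mult_1, type_mult_2, first_won):
    """Derive combat features from the base stats of both Pokémon."""
    row = {f"diff_{s}": first[s] - second[s] for s in STATS}
    sides = (("1", first, second, type_mult_1),
             ("2", second, first, type_mult_2))
    for tag, attacker, target, type_mult in sides:
        physical = attacker["Attack"] / target["Defense"]
        special = attacker["Sp. Atk"] / target["Sp. Def"]
        row[f"ad_physical_{tag}"] = physical
        row[f"ad_special_{tag}"] = special
        # Proportional damage: best A/D factor times the type multiplier.
        row[f"prop_damage_{tag}"] = max(physical, special) * type_mult
        row[f"damage_hp_{tag}"] = row[f"prop_damage_{tag}"] / target["HP"]
    # Flag combats where the slower Pokémon (by base Speed) attacked first.
    row["priority"] = first["Speed"] < second["Speed"]
    row["first_won"] = first_won
    return row


charmander = dict(zip(STATS, (39, 52, 43, 60, 50, 65)))
shuckle = dict(zip(STATS, (20, 10, 230, 10, 230, 5)))
features = build_features(charmander, shuckle, 1.0, 2.0, first_won=True)
print(features["prop_damage_1"], features["prop_damage_2"])
```

Applying such a function to every row of the combats table, after looking up both Pokémon by index, yields one feature row per recorded battle.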

The complete code for the ETL pipeline, the full data analysis and the resulting Pokémon Combats Dataset are all available in the project's repo on my GitHub profile.
This new dataset is composed of 35 features:
First_Pokemon: Trainer A's Pokémon # (index). The Pokémon that attacked first is always listed as "First_Pokemon";
Second_pokemon: Trainer B's Pokémon # (index);
Winner: # (index) of the Pokémon that won the battle;
Base Stats Differences: Difference between the base stats of both Pokémon (6 features, one per stat: HP, Attack, Defense, Sp. Attack, Sp. Defense, Speed);
Differences Between Effective Stats: Difference between the effective stats used in the damage calculation for both Pokémon (8 features; differences between Attack and HP, Attack and Defense, Sp. Attack and HP, Sp. Attack and Sp. Defense, for the two combinations of Attacker / Target);
A/D Factors: Effective A/D factors used in the damage equation (4 features: the two possible A/D factors, Physical and Special, for the two combinations of Attacker / Target);
Type Damage Multipliers: Type Damage Multiplier factor (2 features, one for each combination of Attacker / Target);
Proportional Damage Factor: Product of the highest A/D factor and the Type Damage Multiplier factor (2 features, one for each combination of Attacker / Target);
Damage / HP Factor: Ratio between the Proportional Damage Factor and the target's HP Base Stat (2 features, one for each combination of Attacker / Target);
Pokémon Types: Primary and secondary (if any) type of each Pokémon (4 features);
Legendary Pokémon: Flags indicating whether each of the Pokémon involved in the combat is a legendary one (2 features);
Priority: Flag indicating that the Pokémon with the lowest Speed Base Stat moved first;
Pokémon 1 has won: Flag indicating that the Pokémon that struck first ended up winning the combat (this is going to be used as the target variable for the predictive model);
Now, based on this new dataset, combined with the individual Pokémon data from the Pokémon Dataset, it's possible to perform an Exploratory Data Analysis to better understand the most relevant factors that lead to winning Pokémon battles.
Exploratory Data Analysis
Quick recap here:
- We know that battles are won by inflicting damage on the enemy's Health Points until they are reduced to zero;
- Damage is a function of many variables, but it is directly proportional to the product of two factors: the A/D factor and the Type Damage Multiplier;
- The A/D factor is the ratio of the Pokémon's effective stats, which in turn grow with the respective Base Stats available to us in the datasets;
- The Type Damage Multiplier can be calculated using the procedure shown earlier in this article;
Hence, given the Pokémon battle dynamics and the data available, the analysis of which characteristics are most valuable for a Pokémon to win combats will be based on the two factors that compose damage: Base Stats (and all the proportional factors one might derive from them) and Type Interactions.
Base Stats
Which stat is the most relevant for winning combats? In general, is it worth using Pokémon with defensive characteristics during your journey, or is being offensive more effective?
The best way to answer these questions is to look at the difference in base stats between the Pokémon that won their battles and the ones that were defeated:
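The quantity behind these plots is the winner's base stat minus the loser's, computed per combat and per stat. A minimal sketch, assuming a lookup of base stats by index; the combat rows and index values are invented for illustration and `winner_minus_loser` is a hypothetical helper name.

```python
STATS = ("HP", "Attack", "Defense", "Sp. Atk", "Sp. Def", "Speed")

# Base stats by index (Bulbasaur, Charmander, Squirtle); indices are illustrative.
BASE = {
    1: dict(zip(STATS, (45, 49, 49, 65, 65, 45))),
    5: dict(zip(STATS, (39, 52, 43, 60, 50, 65))),
    10: dict(zip(STATS, (44, 48, 65, 50, 64, 43))),
}

# (First_Pokemon, Second_pokemon, Winner); invented rows.
COMBATS = [(5, 1, 5), (10, 5, 10)]


def winner_minus_loser(combat, base):
    """Base stat differences (winner minus loser) for one combat."""
    first, second, winner = combat
    loser = second if winner == first else first
    return {s: base[winner][s] - base[loser][s] for s in STATS}


diffs = [winner_minus_loser(c, BASE) for c in COMBATS]
for diff in diffs:
    print(diff)
```

Positive values mean the winner had the higher base stat; plotting the distribution of each column over all combats gives one curve per stat.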

These distribution plots give us some valuable information:
In general, as intuition suggests, Pokémon with higher Base Stats than their opponent end up winning the combat;
Higher offensive base stats (Attack and Sp. Attack) seem to be more relevant than higher defensive stats (HP, Defense and Sp. Defense);
While the distributions of all the other stats are almost symmetric, only slightly displaced and skewed to the right, the Speed Base Stat distribution is severely skewed to the right, showing itself as by far the most relevant base stat for winning combats;
This result is really interesting, as the Speed stat is basically only used to define which Pokémon moves first in a turn. As we might recall from the data description at the beginning of this article, the Combats Dataset lists as Pokémon #1 the one that struck first.
However, it is not true that Pokémon #1 always has a higher Speed base stat than Pokémon #2, and there are several reasons for this:
- We are looking at Base Stats, not the stat itself;
- There are in-game mechanics that might override the standard Speed rule for striking first (such as Priority Moves, Held Items, item usage, Pokémon switches and environment effects), but the datasets give no details about their usage;
So, during the ETL pipeline, a feature called Prio (the Priority flag listed above) was created to mark the combats in which the Pokémon with the lower Speed base stat struck first.

Looking again at the base stat difference between winner and loser Pokémon, one can see that, apart from a small difference, a higher Speed stat is still more valuable, even in combats where the Pokémon with the lower Speed base stat moved first (Priority = true).
Note that striking first on the first turn of the battle does not mean that Pokémon #1 struck first on all the other turns. In fact, the data shows that having the advantage of moving first every turn under the standard Speed rule is the most valuable characteristic for a Pokémon to win combats.
Hence, the data says that fast and offensive Pokémon are the best choice for your team composition when playing through an adventure in the actual Pokémon games.
This covers one side of the damage equation we've seen before. How about type interactions, the other proportional factor? How do they affect combat results?
Proportional Damage Factors and Type Interactions
Looking at the base stats alone, offensive characteristics have generally proven more impactful than defensive ones. But is that always true? How do type interactions affect it?
As we've seen in the introduction to Pokémon battle dynamics, there are three categories of Type Interaction, from either the offensive or the defensive point of view:
- Type Advantage: Offensive multipliers 4x or 2x | Defensive multipliers 0x, 0.25x or 0.5x;
- Neutral Interaction: Offensive and defensive multipliers 1x;
- Type Disadvantage: Offensive multipliers 0x, 0.25x or 0.5x | Defensive multipliers 2x or 4x;
How impactful are these interactions on combats? And do the most valuable stat characteristics change across type interaction scenarios, or is being offensive always the best choice?

The plot above shows that type advantages, both offensive and defensive, are directly correlated with a higher win rate: the higher the offensive damage multiplier, the higher the win rate, and the opposite holds for the defensive multiplier.
This confirms the intuitive idea that type interactions are strongly relevant to winning combats, as they are directly proportional to damage. However, it doesn't say much about the most valuable Pokémon characteristics in the different type interaction scenarios.
To better understand the relation between Pokémon characteristics and favorable or unfavorable type match ups, the box plots below may be more useful. They show the winner Pokémon's A/D factors, Proportional Damage Factor (A/D * Type Damage Multiplier), and the ratio between the Proportional Damage Factor and the enemy's HP Base Stat (multiplied by 50 to fit the plot better);


Most of these features had huge outliers, so it was necessary to apply the IQR (interquartile range) method to remove them before proceeding with the analysis. The complete code and the data pre-processing method are available in the original project notebook.
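For readers unfamiliar with the IQR method, here is a minimal sketch using only the standard library. It assumes the common rule of dropping values more than 1.5 times the IQR outside the quartiles; the factor 1.5, the quartile method and the sample values are my assumptions and may differ from what the project notebook does.

```python
from statistics import quantiles


def iqr_filter(values, k=1.5):
    """Keep only values within [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Invented A/D factors; the last one mimics an extreme outlier.
ad_factors = [0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 23.0]
cleaned = iqr_filter(ad_factors)
print(cleaned)
```

Ratios such as A/D are prone to extreme values whenever the denominator is a very low Defense stat, which is why this kind of filtering matters for these features.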
These box plots reveal something really interesting:
Offensive characteristics are even more valuable for winning battles in type advantage scenarios (either offensive or defensive) than in the general situation (analysis of the entire set);
However, this is not true for type disadvantage situations. Indeed, the very opposite happens: defensive characteristics become the most desirable ones in this kind of match up;
Neutral interactions, on the other hand, are quite similar to the general situation;
Merging all the information so far, we have that:
- Type advantage significantly increases the probability of winning combats;
- Offensive characteristics are the most valuable ones in neutral and favorable type match ups;
Combining this information with the insights from the Base Stats analysis, one may conclude that:
The best team composition must be based not only on fast and offensive Pokémon, but also on Pokémon whose types have a higher number of favorable, or at least neutral, match ups.
But which are the best types by these metrics?
Types Match Ups
Which types offer the highest number of favorable match ups?
The best way to start investigating this is to count the number of favorable and unfavorable type match ups directly from the types dataset.
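Counting match ups from the type chart amounts to counting entries per row. A small sketch for the offensive side, using only four rows of the chart (the super effective targets of each attacking type); the structure and names are hypothetical and the full analysis would cover all eighteen types.

```python
# Types each attacking type is super effective against (four rows of the chart).
SUPER_EFFECTIVE = {
    "Fighting": {"Normal", "Ice", "Rock", "Dark", "Steel"},
    "Ground": {"Fire", "Electric", "Poison", "Rock", "Steel"},
    "Ice": {"Grass", "Ground", "Flying", "Dragon"},
    "Normal": set(),
}

# Number of favorable offensive match ups per attacking type.
favorable_counts = {attacker: len(targets)
                    for attacker, targets in SUPER_EFFECTIVE.items()}

for attacker, count in sorted(favorable_counts.items(), key=lambda kv: -kv[1]):
    print(attacker, count)
```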

The barplot above shows that the numbers of favorable and unfavorable match ups are far from evenly balanced across the eighteen Pokémon types.
This could be the only plot needed for this analysis if the number of Pokémon per type were evenly distributed. But is that true?
This is the next point to check:

It is clear that one can't use the type match up counts plot directly as a reference to understand which types offer the best coverage, as the number of Pokémon per type is heavily unbalanced.
Hence, instead of using the count of favorable or unfavorable type match ups directly, one should use the percentage of Pokémon these match ups represent:
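One way to turn match ups into a coverage percentage is sketched below for Ice as the attacking type. It assumes a Pokémon counts as covered when the combined multiplier over its types is greater than 1; this definition, the five-Pokémon roster and the names used are assumptions for illustration, and the exact rule used in the project may differ.

```python
# Non-neutral entries of the offensive Ice row of the type chart.
ICE_VS = {"Grass": 2.0, "Ground": 2.0, "Flying": 2.0, "Dragon": 2.0,
          "Fire": 0.5, "Water": 0.5, "Ice": 0.5, "Steel": 0.5}

# Tiny invented roster: (name, types).
ROSTER = [
    ("Bulbasaur", ("Grass", "Poison")),
    ("Charizard", ("Fire", "Flying")),
    ("Dragonite", ("Dragon", "Flying")),
    ("Squirtle", ("Water",)),
    ("Pikachu", ("Electric",)),
]


def combined_multiplier(chart_row, target_types):
    """Product of the single-type factors over the target's types."""
    result = 1.0
    for target_type in target_types:
        result *= chart_row.get(target_type, 1.0)
    return result


covered = [name for name, types in ROSTER
           if combined_multiplier(ICE_VS, types) > 1.0]
coverage_pct = 100 * len(covered) / len(ROSTER)
print(covered, coverage_pct)
```

Charizard illustrates why counting types alone is misleading: its Flying weakness to Ice is cancelled by its Fire resistance, so it is not counted as covered.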

Offensive Type Advantage
Fighting and Ground are the best types in terms of offensive advantage match up count, with 5 favorable match ups each. However, in terms of the percentage of Pokémon under a favorable offensive match up, Ice is the best one, being super effective against 39.2% of the Pokémon in this dataset;
As the Normal type has no offensive advantages, its coverage percentage is zero (0%) as well, making it the worst type on both metrics;
Defensive Type Advantage
Steel is the best type on both metrics, with 11 favorable defensive match ups, covering 67.38% of Pokémon;
Ice and Normal are each resistant to only 1 type. However, Ice sits alone at the bottom of the list in terms of percentage, with only 4.75%, as it resists only itself and is the rarest type in the Pokémon world (considering these datasets);
Offensive Type Disadvantage
Grass is the worst type on both metrics, with 7 unfavorable offensive match ups, being offensively weak against 61.62% of the Pokémon;
Dragon, Ghost and Fairy are the best ones in terms of match up count, being offensively weak against 2 types each, while Dragon stands alone at the top in terms of percentage, being offensively weak against only 11.85% of the Pokémon;
Defensive Type Disadvantage
Rock is the worst type only in terms of unfavorable defensive match up count, with 5 weaknesses. In terms of the percentage of Pokémon a type is defensively weak against, the type at the very bottom is Grass: 47.5%;
Electric and Normal have only 1 weakness each. However, Electric holds the top spot alone on the percentage metric, being weak against only 6.25% of all Pokémon;
As we've seen, favorable type match ups increase the probability of winning a Pokémon combat. So, when looking at the win ratio per type, one expects the types with the best match up percentages to be the ones with the highest win ratios, as long as the combat dataset is really random and the frequency of Pokémon types in the recorded combats follows the same distribution as the number of Pokémon per type.

The barplot above shows that the types of Pokémon involved in the 50,000 recorded combats follow the same distribution as the number of Pokémon per type, which indicates that the combats were indeed chosen at random. One might therefore expect the types mentioned above to be the ones with the highest win rates.
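A check like this one could be sketched with pandas as shown here. The column names (`#`, `Type 1`, `First_pokemon`, `Second_pokemon`) are assumed to match the Kaggle CSV files, and the two small frames are made-up stand-ins for the real data.

```python
import pandas as pd

# Tiny stand-ins for the Pokémon and combats datasets (hypothetical rows).
pokemon = pd.DataFrame({
    "#": [1, 2, 3, 4],
    "Type 1": ["Grass", "Fire", "Water", "Water"],
})
combats = pd.DataFrame({
    "First_pokemon": [1, 3, 4, 2],
    "Second_pokemon": [3, 2, 1, 4],
})

# Lookup from Pokémon index to primary type.
type_by_id = pokemon.set_index("#")["Type 1"]

# Share of each type among all Pokémon in the dataset.
dataset_share = pokemon["Type 1"].value_counts(normalize=True)

# Share of each type among all combat participants (both sides).
participants = pd.concat([combats["First_pokemon"], combats["Second_pokemon"]])
combat_share = participants.map(type_by_id).value_counts(normalize=True)

# Side-by-side comparison; similar columns suggest random sampling.
comparison = pd.DataFrame(
    {"dataset": dataset_share, "combats": combat_share}
).fillna(0.0)
print(comparison)
```

If the two columns diverge noticeably for some type, win rates for that type would be estimated from a skewed sample and should be read with more caution.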
Win Rate per Type
Are the types that showed the best match-up coverage really the ones with the highest win ratios?
To check, let's have a look at a barplot showing the win rate per type.
As Pokémon have Primary and Secondary types, it is also interesting to see how this characteristic may affect win ratios. Additionally, it might be useful to understand whether some types are stronger on their own, while others work better as a complement.
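The win rate per type can be computed along these lines. This is a sketch under assumptions: the column names (`#`, `Type 1`, `First_pokemon`, `Second_pokemon`, `Winner`) are taken to match the Kaggle files, the rows are invented, and only the primary type is used. Repeating the same steps with `Type 2` would give the secondary-type view.

```python
import pandas as pd

# Hypothetical miniature versions of the two datasets.
pokemon = pd.DataFrame({"#": [1, 2, 3], "Type 1": ["Grass", "Fire", "Water"]})
combats = pd.DataFrame({
    "First_pokemon":  [1, 2, 3, 1],
    "Second_pokemon": [2, 3, 1, 3],
    "Winner":         [2, 3, 1, 3],
})

type_by_id = pokemon.set_index("#")["Type 1"]

# Number of combats each type took part in (either side).
appearances = (
    pd.concat([combats["First_pokemon"], combats["Second_pokemon"]])
    .map(type_by_id)
    .value_counts()
)

# Number of combats each type won.
wins = combats["Winner"].map(type_by_id).value_counts()

# Win rate = wins / appearances; types that never won get a rate of 0.
win_rate = (
    wins.reindex(appearances.index, fill_value=0) / appearances
).sort_values(ascending=False)
print(win_rate)
```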


The result is quite surprising: higher win rates are not strongly correlated with higher percentages of favorable type match-ups. So which factor is the most relevant for a type to have a higher win rate?
Following the initial idea of this analysis, if type interactions are not the most impactful factor for win rates, the next thing to check is the base stats. Hence, the next step is to investigate the average base stats of each type and find out whether types have specific stat characteristics embedded in them.
Average Base Stat per Type
Do Pokémon types have specific base stat characteristics embedded in them? Are some types more focused on Offensive stats, while others are more focused on Defensive ones?
We are going to start investigating stat characteristics per type by finding out which types are best at each stat (in terms of average base stats), and comparing them visually through bar plots:







The seven barplots above show the average base stat value per type for each of the six stats (HP, Attack, Defense, Sp. Attack, Sp. Defense and Speed), and also for the total base stats, which is the sum of all six base stats.
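The averages behind such barplots come down to a group-by on the type column. The sketch below assumes the Kaggle column names (`Type 1`, `HP`, `Attack`, `Defense`, `Sp. Atk`, `Sp. Def`, `Speed`) and uses four invented rows rather than the real data.

```python
import pandas as pd

STATS = ["HP", "Attack", "Defense", "Sp. Atk", "Sp. Def", "Speed"]

# Hypothetical rows: two Pokémon per type, values invented for illustration.
pokemon = pd.DataFrame({
    "Type 1":  ["Bug", "Bug", "Dragon", "Dragon"],
    "HP":      [40, 60, 80, 100],
    "Attack":  [30, 50, 100, 120],
    "Defense": [40, 40, 90, 90],
    "Sp. Atk": [20, 40, 100, 110],
    "Sp. Def": [30, 50, 80, 100],
    "Speed":   [50, 70, 80, 100],
})

# Total base stats is the row-wise sum of the six individual stats.
pokemon["Total"] = pokemon[STATS].sum(axis=1)

# Average of every stat (and the total) per primary type.
avg_stats = pokemon.groupby("Type 1")[STATS + ["Total"]].mean()

# Ranking for a single stat, e.g. the data behind a Speed barplot.
fastest_first = avg_stats["Speed"].sort_values(ascending=False)
print(avg_stats)
```

Each column of `avg_stats` can then be passed to a bar plot, for example with `avg_stats["Speed"].sort_values().plot.barh()`.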
From a visual check of these plots, it is not easy to identify base stat characteristics or patterns per type. However, it is possible to notice something really interesting: the types with the highest win rates are the ones with the highest average Speed base stats, and the types with the worst win rates are, on average, the slowest ones. The two rankings don't follow the exact same order, but this is an important clue for determining the most important factors for a type to have a higher win rate.

To find patterns in the average base stats per type, one might want to look at a different kind of plot, one in which it is possible to visualize the numerical values of all six stats at the same time. Radar plots fit this role quite well.
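A radar plot of this kind can be drawn with matplotlib's polar projection. The following is a sketch only: the stat values for the two types are invented, and the styling is not taken from the original notebook.

```python
import math

import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is required
import matplotlib.pyplot as plt

STATS = ["HP", "Attack", "Defense", "Sp. Atk", "Sp. Def", "Speed"]

# Hypothetical average base stats per type (illustrative values only).
avg_stats = {
    "Bug":    [50, 40, 40, 30, 40, 60],
    "Dragon": [90, 110, 90, 105, 90, 90],
}

# One axis per stat, evenly spaced around the circle.
angles = [2 * math.pi * i / len(STATS) for i in range(len(STATS))]
# Repeat the first angle so each polygon closes on itself.
closed_angles = angles + angles[:1]

fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
for type_name, values in avg_stats.items():
    closed_values = values + values[:1]
    ax.plot(closed_angles, closed_values, label=type_name)
    ax.fill(closed_angles, closed_values, alpha=0.2)

ax.set_xticks(angles)
ax.set_xticklabels(STATS)
ax.legend(loc="upper right")
```

Calling `fig.savefig("radar.png")` (file name chosen here for illustration) writes the figure to disk. Drawing several types on the same axes makes differences in overall stat levels easy to see, while one subplot per type makes each type's shape easier to read.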


The Radar Plots above make it clear that there are indeed stat characteristics embedded in each Pokémon type:
Fighting types, for example, tend to be focused on physical attacks (high average Attack base stats, while the other stats are at lower levels);
Psychic types, on the other hand, have high average Sp. Attack and Sp. Defense base stats, while the others are not so high;
The plots also better illustrate the discrepancy between the average base stats of some types. For example, by comparing the Bug and Dragon radar plots, it's possible to notice the huge superiority of Dragon Pokémon over Bug ones on every single stat.
As mentioned above, it's easy to notice that a higher average Speed base stat is highly correlated with a higher win rate. However, although the top and bottom of these two rankings are populated by the same Pokémon types, the order in which they appear is not the same. Hence, there must be other relevant factors for a type to have a high win rate in battles.

The type features most correlated with Win Rate in battles are:
Average Speed Base Stat: correlation 0.96;
Average Sp. Attack Base Stat: correlation 0.53;
Average Total Stats: correlation 0.51;
Average HP Base Stat: correlation 0.41;
Average Attack Base Stat: correlation 0.35;
Offensive Type Advantage (Percentage): correlation 0.30;
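A ranking like the one above can be produced by correlating each per-type feature with the per-type win rate. The sketch below uses pandas' default Pearson correlation, which is an assumption, since the text does not say which coefficient was used. The four rows and the column names (`win_rate`, `avg_speed`, `avg_attack`) are hypothetical.

```python
import pandas as pd

# One row per type; all values are invented for illustration.
type_features = pd.DataFrame(
    {
        "win_rate":   [0.60, 0.55, 0.45, 0.40],
        "avg_speed":  [85, 80, 62, 60],
        "avg_attack": [90, 70, 75, 65],
    },
    index=["type_a", "type_b", "type_c", "type_d"],
)

# Pearson correlation of every feature column with the win rate,
# sorted from most to least correlated.
correlations = (
    type_features.drop(columns="win_rate")
    .corrwith(type_features["win_rate"])
    .sort_values(ascending=False)
)
print(correlations)
```

With only 18 types, each coefficient rests on 18 data points, so small differences between neighboring entries in the ranking should not be over-interpreted.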
As said before, the best team composition must be based not only on fast and offensive Pokémon, but must also include Pokémon whose types have a higher number of favorable, or at least neutral, match-ups.
However, we now know that favorable base stats (especially Speed, Sp. Attack, HP and Attack) and high total base stats are more valuable than type advantages if one is looking to build an efficient Pokémon team for battles. Still, favorable offensive type match-ups have also shown their importance in the data, so one should also focus on having wide offensive type coverage while building the team.
Building a Pokémon Team using the knowledge acquired from data
Now we're going to use all the information gathered during the data analysis to build a Pokémon team, looking for the highest efficiency at winning combats.
As the first generation of Pokémon (the one which brought Pikachu, Charizard, Blastoise and Venusaur) might be the most remarkable one for most people, especially those born during the 90's, I'm going to limit the scope to Pokémon released in this generation. Neither Legendary Pokémon nor Mega Evolutions will be included, as they normally have base stats way higher than “ordinary” Pokémon.
Based on the Exploratory Data Analysis using all the datasets mentioned in this article, the criteria for choosing the team members are, listed in order of priority:
#1: Pokémon with High Speed Base Stats;
#2: Pokémon with Offensive Characteristics (either Attack or Sp. Attack);
#3: The type combination of all Pokémon on the team shall have a wide range of favorable offensive type match-ups (one shall avoid repeating types as much as possible);
As is usual in all the Pokémon games, our team is going to be composed of six Pokémon.
Running a simple query on the Pokémon Dataset, we have:
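The query itself is not reproduced in the text, so the following is only a guess at its shape. It assumes the Kaggle column names (`Name`, `Generation`, `Legendary`, `Attack`, `Sp. Atk`, `Speed`), assumes Mega Evolutions can be recognized by "Mega " in the name, and uses a handful of illustrative rows. The helper column `Best Offense` is introduced here and is not part of the dataset.

```python
import pandas as pd

# Illustrative rows only; the real dataset has 800 entries.
pokemon = pd.DataFrame({
    "Name": ["Alakazam", "AlakazamMega Alakazam", "Arcanine",
             "Mewtwo", "Typhlosion"],
    "Type 1": ["Psychic", "Psychic", "Fire", "Psychic", "Fire"],
    "Attack": [50, 50, 110, 110, 84],
    "Sp. Atk": [135, 175, 100, 154, 109],
    "Speed": [120, 150, 95, 130, 100],
    "Generation": [1, 1, 1, 1, 2],
    "Legendary": [False, False, False, True, False],
})

# Scope: first generation, no Legendary Pokémon, no Mega Evolutions.
candidates = pokemon[
    (pokemon["Generation"] == 1)
    & ~pokemon["Legendary"]
    & ~pokemon["Name"].str.contains("Mega ", regex=False)
].copy()

# Criterion #2: the better of the two offensive stats (hypothetical helper).
candidates["Best Offense"] = candidates[["Attack", "Sp. Atk"]].max(axis=1)

# Criterion #1 first (Speed), then criterion #2 as a tie-breaker.
ranked = candidates.sort_values(["Speed", "Best Offense"], ascending=False)
print(ranked[["Name", "Type 1", "Speed", "Best Offense"]])
```

Criterion #3 (avoiding repeated types) is easier to apply by hand while walking down the ranked list than to encode in the filter itself.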

Strictly following the established criteria, one ends up with the following Pokémon team:







Image credits: Type badges; Pokémon Sprites
Would you try this Pokémon team on your next game run?
All Pokémon included on this list have solid Speed Base Stats (Arcanine being the worst, with a value of 95), also have solid Offensive Base Stats (Physical, Special or even both, as with Arcanine) and do not overlap types (except for Starmie's secondary type, which overlaps with Alakazam's), reaching the mark of 13 favorable offensive type match-ups out of a total of 18. Sounds promising, right?
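The coverage count for a team is the size of the union of the types its members hit super effectively. The sketch below covers only the three team members named in the text (Arcanine, Starmie and Alakazam), because the other three appear only as images, so it does not reproduce the figure of 13. The chart excerpt lists only the attacking types those three members have.

```python
# Excerpt of the type chart: attacking type -> types hit super effectively.
SUPER_EFFECTIVE = {
    "Fire": {"Grass", "Ice", "Bug", "Steel"},
    "Water": {"Fire", "Ground", "Rock"},
    "Psychic": {"Fighting", "Poison"},
}

# Only the members named in the text; the full team has six Pokémon.
team = {
    "Arcanine": ["Fire"],
    "Starmie": ["Water", "Psychic"],
    "Alakazam": ["Psychic"],
}


def offensive_coverage(team_types, chart):
    """Return the set of defending types at least one member hits super effectively."""
    covered = set()
    for types in team_types.values():
        for attacking_type in types:
            covered |= chart.get(attacking_type, set())
    return covered


covered = offensive_coverage(team, SUPER_EFFECTIVE)
print(f"{len(covered)} of 18 types covered: {sorted(covered)}")
```

Because the result is a set union, a repeated type such as Starmie's and Alakazam's shared Psychic typing adds nothing to the count, which is the reason behind criterion #3.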
Next Steps
Machine Learning model to predict winners on Pokémon Battles
As the original challenge in the Kaggle post was to build a machine learning model able to predict which Pokémon would win a combat, given the two combatant Pokémon, this will be the next step!
We'll use the new Dataset derived from the ETL Pipeline described here, along with all the knowledge developed about Pokémon Combats, to model and train a Machine Learning Model. All the details will be fully described in a future article.
Additionally, once the model is fully trained, it will also be possible to check the efficiency of this Pokémon Team built using Data Analysis!
Author's Note: For all the Pokéfans out there: it's important to bear in mind that all the combats used to build this analysis are probably in-game battles, recorded during a run, battling wild Pokémon and NPCs (there's no information about the origin and source of the combats). Competitive Pokémon scenarios are completely different, and are not the focus here.
Translated from: https://towardsdatascience.com/your-favorite-game-from-a-different-point-of-view-93732173adf7