import { thesisMediaPath } from '../../4_utils/functions';
import styles from '../../5_assets/styles/ThesisTimeline.module.css';
import CodeBlock from '../CodeBlock';
import React from 'react';

function CreateAI() {
	return (
		<details className={styles.thesisTimeLine}>
			<summary style={{ fontSize: '2em', fontWeight: 'bold' }}>Create AI</summary>
			<p>The progress of the thesis can be separated into two parts. The first part is now over: a fully functional game that satisfies the requirements. The second part is making an AI that can reach a high score in it, despite the difficulties that build up the longer the ball survives, such as the changing amplitude and the increasing speed.</p>
			<p>There are lots of algorithms that can be used to develop a neural network capable of this task, one that changes its weights and biases without me interfering. I feel like I'm going over lots of expressions here that need to be clarified first.</p>
			<h3>AI 101</h3>
			<p>The way AI works is by trying to mimic the way the human brain works, but this means that the human brain itself is going to develop an intelligence that overcomes it?! That raises some red flags, but the closest it can get (at least on my humble machine) is a neural network capable of solving only one task. That is totally different from the human brain, which can do multiple tasks in a short period of time, like taking input from the senses and turning it into action, not to mention controlling your breathing and heartbeat without you thinking about them (now you are thinking about them?).</p>
			<p>
				To simplify how a human brain learns: since you were a child, you have learned that something is dangerous or safe, right or wrong, by trying and then learning from your own mistakes. That is called <strong>Reinforcement learning</strong> &nbsp;
				<a className="animated-link" target="_blank" rel="noreferrer" href="https://intellipaat.com/blog/supervised-learning-vs-unsupervised-learning-vs-reinforcement-learning/">
					types of AI
				</a>
				, and this type is the main point in making the AI.
			</p>
			<h4>Life example</h4>
			<p>
				Take the example of a child trying to kick a ball. This is the first time the child sees a ball, and they don't know <strong>yet</strong> what to do with it. The child here is called the agent, and the football is the environment, the problem the agent is trying to solve or, in this case, kick as far as possible (which the child doesn't know yet). The agent tries touching the ball, but that isn't the goal they are seeking (they keep trying over time). After some time of hitting it harder, they realise that it won't hurt them (a punishment) and that scoring a goal is good (a reward). &nbsp;
			</p>
			<p>That is the same way the AI learns in this game. The ball (the agent) tries to survive as long as it can between the two waves (the environment), but it doesn't know whether touching either wave will end it or not. The ball tries one of the three controls it has (left, centre, right): if it goes all the way left, it dies, same as right, but if it stays in the middle, it survives the longest, at least for now.</p>
			<p>
				As mentioned in &nbsp;
				<a className="animated-link" href="#display-score">
					counter part
				</a>
				, there is a counter in the game, and it will be used as what is called a &quot;fitness function&quot;. A &quot;fitness function&quot; measures how well a genome is doing at turning the ball's input into a proper movement output, so the ball survives as long as it can. And here is another unknown expression.
			</p>
			<h4>Fitness function</h4>
			<p>There must be a way to reward or punish the AI according to its behaviour during the game. The longer the ball survives the wave, the higher its fitness score gets (the better its quality). Meanwhile, if it dies at an early stage, its fitness score is reduced drastically as a punishment, and it ranks lower in the hierarchy of genomes in the generation. The chance for a low-fitness genome to be carried over (mutated) into a future generation is lowered.</p>
			<p>Applying this method in the game, there is one way of rewarding: the ball is rewarded for staying in the middle between the two waves, so that if there is a sudden change in the wave curve, the ball is less vulnerable to hitting it.</p>

			<CodeBlock
				language="Python"
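				code={`# left and right distance values, as whole numbers
leftDis = int(output[0])
rightDis = int(output[2])
# only compare when both distances have three digits or more (hundreds)
if len(str(leftDis)) > 2 and len(str(rightDis)) > 2:
    # equal after rounding to the nearest ten: the ball is in the middle
    if round(leftDis, -1) == round(rightDis, -1):
        fitness += 2`}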
			/>
			<p>Explained from the inside out: the fitness is increased by 2 if the rounded left distance equals the rounded right distance (the ball is in the middle). Before that, it checks that both distances have more than two digits, so the rounding (to the nearest ten) is only applied when the numbers are in the hundreds.</p>
			<p>As for the punishment: if the genome dies right where it started, without making any movement to survive, the fitness score is reduced. The point here is to find a punishment that is fair and not overdone, so that reward and punishment stay balanced.</p>
			<pre>
				<code className="language-python" lang="python">
					if(fitness &lt; 300): fitness -= 20
				</code>
			</pre>
			<p>There was an early implementation that added a reward whenever the ball chose the output of staying in the middle instead of wiggling between right and left only. But the ball managed to get around it and wiggled anyway (it sought a change in the input more than the reward it would get), so the other way was to make an equal distance on the left and right sides the reward.</p>
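			<p>
				Put together, the reward and the punishment can be sketched as two small helpers. This is only an illustration and not the exact thesis code: the function names and the <code>floor</code> and <code>penalty</code> parameters are made up, while the numbers (2, 300 and 20) are the ones quoted above.
			</p>
			<CodeBlock
				language="Python"
				code={`def reward_centre(fitness, left_dis, right_dis):
    """Add 2 to the fitness when the ball sits in the middle of the waves."""
    left_dis, right_dis = int(left_dis), int(right_dis)
    # only compare distances that have three digits or more (hundreds)
    if len(str(left_dis)) > 2 and len(str(right_dis)) > 2:
        # round both to the nearest ten before comparing them
        if round(left_dis, -1) == round(right_dis, -1):
            fitness += 2
    return fitness


def punish_early_death(fitness, floor=300, penalty=20):
    """Subtract the penalty when a genome ends below the floor score."""
    if fitness < floor:
        fitness -= penalty
    return fitness


print(reward_centre(0, 204, 198))   # both round to 200, so the ball is centred
print(punish_early_death(100))      # died early, so the score is reduced`}
			/>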
			<h4>Neural network</h4>
			<p>
				In order to make a decision, there is an input, then some processing in between, and finally an output. This is what is called a neural network: it consists of layers that modify the input along the way to decide what it will be at the end, as the output. The small fundamental building block of an NN is called a <strong>neuron</strong>, same as in the human brain (but not the same amount!). The input neurons are found in the first layer; their values are then processed in the second layer and modified in the third one, with the summation ending up in the output layer.
			</p>
			<p>
				<img style={{ maxWidth: '50vw' }} className={styles.timeLineMedia} src={thesisMediaPath('nn.png')} alt="Diagram of a neural network"></img>
			</p>
			<h4>Weight and biases</h4>
			<p>
				In the early stages, when the ball is learning how to control its movement, its first steps are going all the way to either the left or the right, which doesn't really pay off in getting far. Each neuron has to tweak the values it gets from the one before it, and the weights and biases are the ones responsible for this. Going back to the child example: the brain consists of little neurons that are responsible for every decision the child makes, and it changes over time. That is the case here too. The brain changes its decision a little and tries a different approach to the problem: one time hitting the ball hard to the left side, another time with little power to the right side. &quot;
				<strong>Weights</strong> control the signal (or the strength of the connection) between two neurons. In other words, a weight decides how much influence the input will have on the output.&quot; &nbsp;
				<a className="animated-link" rel="noreferrer" target="_blank" href="https://machine-learning.paperspace.com/wiki/weights-and-biases">
					ref
				</a>
			</p>
			<h4 id="activation-function">Activation function</h4>
			<p>The connections between neurons have weights and biases that alter the values passing through them. Let's say there is one neuron whose value isn't important right now and only becomes useful when the input is different, as if it didn't exist at all, but only for one specific neuron in the layer after it. If it were removed altogether, it would change the value of all the neurons after it.</p>
			<p>That is what the activation function is for: it changes the value of one specific neuron and decides whether it will have more (or less) impact on the neuron after it, through the connection in between. As the name &quot;activation&quot; states, it decides how active each neuron is. With a function such as the sigmoid, the value is between 0 and 1, while ReLU (the one used here) gives 0 for a negative value and passes a positive value through unchanged.</p>
			<p>
				You might ask &quot;why use an activation function, isn't it changing the numbers the same way the weights and biases do?&quot; Yes and no. The weights and biases change the number <strong>that is going from one neuron</strong> to another, but the activation function <strong>changes the effect</strong> of the neuron on the one after it.
			</p>
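			<p>
				As a rough sketch of how weights, biases and the activation function fit together (not the thesis code; the names <code>neuron_output</code>, <code>relu</code> and <code>sigmoid</code> are made up for illustration), a single neuron can be written as a weighted sum of its inputs plus a bias, passed through an activation function:
			</p>
			<CodeBlock
				language="Python"
				code={`import math


def relu(x):
    """0 for negative values, the value itself otherwise."""
    return max(0.0, x)


def sigmoid(x):
    """Squashes any value into the range between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(inputs, weights, bias, activation=relu):
    """Weighted sum of the inputs plus the bias, then the activation."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return activation(total)


# same inputs and weights, two different activation functions
print(neuron_output([1.0, 2.0], [0.5, -1.0], 0.25))
print(neuron_output([1.0, 2.0], [0.5, -1.0], 0.25, activation=sigmoid))`}
			/>
			<p>The weights and the bias decide which number reaches the neuron, and the activation function decides how much of it is passed on to the next layer.</p>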
			<h4>Genome</h4>
			<p>All the info from the previous parts can be collected into one thing here: the genome. It works as a neural network that relies on weights and biases to turn the input into an output, and it is scored by a fitness function. One population consists of a pre-defined number of genomes, which can also be called the children of the generation.</p>
			<p>&nbsp;</p>
			<h3>Tweak AI</h3>
			<p>
				After the base of the algorithm was working properly, some tweaks had to be made to it. The algorithm itself takes its input from the <code>config.txt</code> file, and some expressions in that file need to be explained to know what to change.
			</p>
			<CodeBlock
				language="Python"
				code={`[NEAT]
fitness_criterion     = max
fitness_threshold     = 1000000
pop_size              = 30
reset_on_extinction   = False`}
			/>
			<p>These are the most important values to look for in this file. They define when the algorithm will stop and how big the population is.</p>
			<ol>
				<li>
					<code>fitness_criterion</code> states when the algorithm will stop with regard to the threshold. There are three values for it: <code>min</code>, <code>max</code>, and <code>mean</code>. If it is <code>max</code>, the algorithm stops once at least one genome manages to reach the threshold, defines it as the winner, and then terminates the process.
				</li>
				<li>
					<code>fitness_threshold</code> is the high score that the algorithm is learning to reach. The value written in the file equals 5000 points in the game.
				</li>
				<li>
					<code>pop_size</code> is how many genomes there are in one population. The larger it is, the longer the learning time of the algorithm.
				</li>
			</ol>
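			<p>
				To make the stopping rule concrete, here is a small sketch in plain Python. It is not the library's own code: <code>reached_threshold</code> and <code>CRITERIA</code> are made-up names, and the fitness values are invented. It only shows how the three <code>fitness_criterion</code> values give different answers for the same generation.
			</p>
			<CodeBlock
				language="Python"
				code={`from statistics import mean

# the three allowed values of fitness_criterion, mapped to plain functions
CRITERIA = {"min": min, "max": max, "mean": mean}


def reached_threshold(fitnesses, criterion, threshold):
    """True when the chosen summary of the fitness scores hits the threshold."""
    return CRITERIA[criterion](fitnesses) >= threshold


generation = [120.0, 1000000.0, 75.5]   # invented fitness scores
print(reached_threshold(generation, "max", 1000000))    # one genome is enough
print(reached_threshold(generation, "mean", 1000000))   # the average must reach it
print(reached_threshold(generation, "min", 1000000))    # every genome must reach it`}
			/>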
			<CodeBlock
				language="Python"
				code={`[DefaultGenome]
# node activation options
activation_default      = relu
activation_mutate_rate  = 1.0
activation_options      = relu`}
			/>

			<p>
				Going to the activation function, the one that defines the effect of one neuron on another: here it is <code>relu</code>, which outputs 0 for negative values and the value itself otherwise. More info about it in the <a href="#activation-function">Activation function</a> section. &nbsp;
			</p>
			<h3>Observation</h3>
			<p>The first time the algorithm ran, the ball's input vision of the wave was measured from the centre of the ball to each side of the wave, as a single point. That made it hard for the ball to find its way (or to have a vision of the future, you could say). There would be a change in the wave curve and the ball couldn't detect it, so the idea came up of giving the ball more range to see. It would calculate the distance from the ball's rectangle to the left or right side of the wave at more than one point.</p>
			<p>To get more into it with numbers: the ball noticeably had a weird time getting to know its &quot;new&quot; sense of a wider vision diameter. It would take about 20 generations just to start moving left and right more randomly. This was first tried when the ball only had the vision of its 24-pixel diameter (12px radius), then with an extra step of plus 50 pixels. In a nutshell:</p>
			<ul>
				<li>One point vision: good as a start and better CPU-wise.</li>
				<li>Diameter vision + 20: best score yet (117 points in generation 89).</li>
				<li>Diameter + 50 points: no learning even after nearly 300 generations.</li>
			</ul>
			<p>The reason for increasing the ball's vision (even though it was working fine) is that I wanted to test how long it would take the ball to get used to the new (increased) number of lines. I can tell you it took long enough.</p>
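			<p>
				The multi-point vision can be pictured with a short sketch. Everything in it is an assumption made for illustration, not the game's real distance code: <code>vision_inputs</code>, <code>left_edge</code>, <code>right_edge</code> and <code>extra</code> are hypothetical names, and the wave edges are modelled as functions that return an x position for a given height.
			</p>
			<CodeBlock
				language="Python"
				code={`def vision_inputs(ball_x, ball_y, radius, extra, left_edge, right_edge, samples=3):
    """Distances to both wave edges, sampled at several heights around the ball.

    left_edge and right_edge are hypothetical callables that return the
    x position of each wave edge at a given y. samples must be 2 or more.
    """
    reach = radius + extra              # how far above and below the centre to look
    step = 2 * reach / (samples - 1)    # gap between two sample heights
    distances = []
    for k in range(samples):
        y = ball_y - reach + k * step
        distances.append(ball_x - left_edge(y))    # gap to the left wave
        distances.append(right_edge(y) - ball_x)   # gap to the right wave
    return distances


# a straight corridor between x = 100 and x = 300, ball in the middle
print(vision_inputs(200, 0, 12, 20, lambda y: 100, lambda y: 300))`}
			/>
			<p>Each extra sample adds two more numbers to the input of the neural network, which fits the longer learning time seen with the wider vision.</p>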
			<p>At this point, I implemented a 4x increase in the wave speed, which improved the learning speed. At the normal fps, it would take 8 hours 15 min for 37 generations, but the new one (limited to 4 times the speed, not more) takes 5 hours 15 seconds for 100 generations. That is 4 times the normal pace at which a human plays the game. I then set the generation threshold to 300 generations instead of 100 and let the laptop run as long as it needed. It took more than 15 hours to finish 294 generations and 11 genomes, and when I went back to check the log, none of them had managed to pass a fitness score of 3000. That means none of them had a good intuition of the lines to move left or right and overcome at least one curve in the wave. From this: more input numbers = more training time.</p>
			<video className={styles.timeLineMedia} style={{ maxWidth: '20vw' }} preload="auto" autoPlay muted loop>
				<source src={thesisMediaPath('AI(12.28).webm')} type="video/webm" />
			</video>
			<p style={{ textAlign: 'center', fontSize: '13px', fontStyle: 'italic' }}>from 2022.12.28 second training session</p>
			<p>
				There is a small box shown around the ball. It is called ballRect and is mentioned a lot in the <a href="#count-distance">Count Distance</a> and <a href="#collision">Collision</a> sections. ballRect is shown to check whether the genome was terminated by an actual collision or because it reached the threshold, as is the case in this video.
			</p>
			<p>In order to save as much CPU power as possible during the learning process, the box is shown only when the fitness is over 50. The same goes for some extra visuals in the game, such as the particles behind the ball, but all of them can be shown again with a key for each one:</p>
			<ul>
				<li>
					<code>v key</code> to show the <strong>v</strong>ision
				</li>
				<li>
					<code>b key</code> to show the <strong>b</strong>allRect
				</li>
				<li>
					<code>p key</code> to show the <strong>p</strong>articles
				</li>
			</ul>
			<h3>Explain the log</h3>
			<p>During an AI training session, the algorithm displays statistics at the end of every generation. To give a better insight into what is going on behind the curtains, the log here is taken from the training session on the 2nd of Jan 2023.</p>
			<CodeBlock
				language="plaintext"
				code={`****** Running generation 99 ****** 
    14 reached the threshold
    genome number 15 = 57S with fitness 82
    highest fitness now is 20206 generation 37 genome 1
    Population's average fitness: 3039.20000 stdev: 4602.73985
    Best fitness: 20201.00000 - size: (4, 7) - species 1 - id 2564
    Average adjusted fitness: 0.140
    Mean genetic distance 1.803, standard deviation 0.759
    Population of 30 members in 2 species:
       ID   age  size  fitness  adj fit  stag
      ====  ===  ====  =======  =======  ====
         1   99    16  20201.0    0.146    62
         2   99    14   8545.0    0.135    78
    Total extinctions: 0
    Generation time: 324.005 sec (364.069 average)
    Saving checkpoint to neat-checkpoint-99 `}
			/>
			<p>
				Only <strong>three</strong> lines in this log are ones that I added to be printed during the process. They make it easier to check the genome behaviour against the recorded video, and they are:
			</p>
			<ul>
				<li>line 2: says that in this generation, only genome number 14 managed to reach the threshold, which was 101 points.</li>
				<li>line 3: there is an if statement that outputs the running time of genomes that exceed a specific score (it was 50).</li>
				<li>line 4: stores the highest fitness of any genome since the beginning of the session. It got the first threshold, so I can tell whether my input is being optimized with every generation or not.</li>
			</ul>
			<p>
				Starting from line 5, everything that follows is printed by the reporters in the algorithm's source code (its <code>reporting</code> module). What is written here tries to explain every part of it (and is heavily taken from &nbsp;
				<a className="animated-link" rel="noreferrer" target="_blank" href="https://neat-python.readthedocs.io/en/latest/glossary.html">
					Glossary — NEAT-Python 0.92 documentation
				</a>
				):
			</p>
			<ul>
				<li>
					line 5 =&gt; <code>stdev</code>: the standard deviation of the genomes' fitness around the mean fitness of the generation, so the higher it is, the more the genomes differ from each other &nbsp;
					<a className="animated-link" rel="noreferrer" target="_blank" href="https://www.youtube.com/watch?v=WVx3MYd-Q9w">
						How to Calculate Standard Deviation - YouTube
					</a>
					, &nbsp;
					<a className="animated-link" rel="noreferrer" target="_blank" href="https://www.youtube.com/watch?v=MRqtXL2WX2M">
						Standard Deviation - Explained and Visualized - YouTube
					</a>{' '}
					&nbsp;
				</li>
				<li>
					line 6 =&gt; <code>species 1</code>: &quot;Subdivisions of the population into groups of similar (by the &nbsp;
					<a className="animated-link" rel="noreferrer" target="_blank" href="https://neat-python.readthedocs.io/en/latest/glossary.html#term-genomic-distance">
						genomic distance
					</a>
					&nbsp; measure) individuals (
					<a className="animated-link" rel="noreferrer" target="_blank" href="https://neat-python.readthedocs.io/en/latest/glossary.html#term-genome">
						genomes
					</a>
					), which compete among themselves but share fitness relative to the rest of the population. This is, among other things, a mechanism to try to avoid the quick elimination of high-potential topological mutants that have an initial poor fitness prior to smaller “tuning” changes&quot;.
				</li>
				<li>
					line 8 =&gt; <code>Mean genetic distance</code>: a measurement of how different the genomes of this generation are from one another (the average genomic distance over the genomes that were compared while the population was split into species). The differences come from the tweaks made to the genomes relative to their parents, who might not be exactly from the previous generation.
				</li>
				<li>
					line 16 =&gt; <code>Saving checkpoint to neat-checkpoint-99</code>: so I can come back and get a live feed from the same generation. It is only valid with the same settings that were used when it first ran; any change to the configuration makes it unusable. That is why recording the sessions with OBS was useful.
				</li>
			</ul>
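			<p>
				To get a feeling for the average and <code>stdev</code> numbers on line 5, here is a tiny worked example with invented fitness scores. It assumes the population form of the standard deviation (dividing by the number of genomes), and the variable names are only for illustration.
			</p>
			<CodeBlock
				language="Python"
				code={`from statistics import mean, pstdev

fitnesses = [2, 4, 4, 4, 5, 5, 7, 9]   # invented fitness scores for one generation

average = mean(fitnesses)    # the average fitness of the generation
spread = pstdev(fitnesses)   # how far the scores sit from that average

print(f"average fitness: {average:.5f} stdev: {spread:.5f}")`}
			/>
			<p>A small spread means the genomes score close to each other; a spread larger than the average itself, as in the log above, means a few genomes are far ahead of the rest.</p>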
		</details>
	);
}

export default CreateAI;
