CITS3001 Algorithms and Artificial Intelligence

1 Project Overview

In this project, you will develop AI agents to control the iconic character Mario in the classic game Super Mario Bros using the gym-super-mario-bros environment. The main objective is
to implement at least two distinct AI algorithms/methods and compare their performance, strengths, and weaknesses in the context of playing the game.
You may undertake this project in self-selected teams of 2; if you are looking for a partner, please reach out to your lab demonstrators by email.
1.1 Requirements
1.1.1 Gym-Super-Mario-Bros Environment Setup

• Set up the gym-super-mario-bros [2] environment on your local machine or any designated platform; see 4.2.1 Environment Creation.

As gym-super-mario-bros requires a graphics environment, it is not possible to run it through WSL, so a native Windows Python installation will be required. macOS and Linux users should have no issues.
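For reference, a minimal sketch of environment creation with a random agent, assuming the classic Gym step API (a 4-tuple return from step; newer Gym/Gymnasium releases return 5-tuples and split reset differently):

```python
import gym_super_mario_bros
from nes_py.wrappers import JoypadSpace
from gym_super_mario_bros.actions import SIMPLE_MOVEMENT

# Create the environment and restrict the action space to simple moves
env = gym_super_mario_bros.make('SuperMarioBros-v0')
env = JoypadSpace(env, SIMPLE_MOVEMENT)

state = env.reset()
done = False
while not done:
    action = env.action_space.sample()  # random agent as a placeholder
    state, reward, done, info = env.step(action)
    env.render()
env.close()
```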

1.1.2 AI Algorithm Implementations
• Choose and experiment with at least two AI algorithms. You may wish to consider the following (a minimal Q-learning sketch appears after this list):
– Reinforcement Learning: Q-learning [3], TD(λ) [4]
– Rule-Based AI: logic and heuristics.
– Monte Carlo Tree Search (MCTS) [5]
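For reference, a minimal sketch of the tabular Q-learning update mentioned above, assuming the raw image observations have already been reduced to hashable states; the number of actions and all hyperparameters are illustrative, not part of the specification:

```python
import numpy as np
from collections import defaultdict

N_ACTIONS = 7                            # e.g. len(SIMPLE_MOVEMENT); illustrative
ALPHA, GAMMA, EPSILON = 0.1, 0.99, 0.1   # illustrative hyperparameters

Q = defaultdict(lambda: np.zeros(N_ACTIONS))  # state -> action-value estimates

def choose_action(state):
    """Epsilon-greedy policy over the current Q estimates."""
    if np.random.random() < EPSILON:
        return np.random.randint(N_ACTIONS)
    return int(np.argmax(Q[state]))

def q_update(state, action, reward, next_state, done):
    """One-step Q-learning: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    target = reward if done else reward + GAMMA * np.max(Q[next_state])
    Q[state][action] += ALPHA * (target - Q[state][action])
```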
You are welcome to use more advanced algorithms that utilise deep learning, such as DQNs [6] or Proximal Policy Optimisation (PPO), but these are not covered in the unit and lab facilitators may not be able to assist with your implementations. These algorithms will also have to be referenced in your project report.
You are allowed to use existing implementations such as those from Stable Baselines; however, you are required to implement at least one of the algorithms yourself.
For example, the following combinations of algorithms would be allowed (this list is not exhaustive):
• Hand-implemented rule-based agent and PPO from Stable Baselines
• Hand-implemented TD(λ) and hand-implemented DQN

but comparing PPO and DQN, both from Stable Baselines, would not be allowed.
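To make the first combination concrete, here is a minimal sketch of what a hand-implemented rule-based agent might look like, assuming the SIMPLE_MOVEMENT action indices from the earlier snippet and the x_pos field that gym-super-mario-bros reports in its info dict; the stall threshold is illustrative:

```python
# Indices into SIMPLE_MOVEMENT: 1 = ['right'], 2 = ['right', 'A'] (run + jump)
RIGHT, RIGHT_JUMP = 1, 2

class RuleBasedAgent:
    """Heuristic: keep moving right; jump when rightward progress stalls."""
    def __init__(self):
        self.last_x = 0
        self.stuck_frames = 0

    def act(self, info):
        x = info.get('x_pos', 0)
        # Count consecutive frames without rightward progress
        self.stuck_frames = self.stuck_frames + 1 if x <= self.last_x else 0
        self.last_x = x
        # Jump if Mario appears to be blocked by an obstacle or enemy
        return RIGHT_JUMP if self.stuck_frames > 10 else RIGHT
```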

You are welcome to use utilities and libraries from Stable Baselines in your own implementations; just make note of them in your report. An algorithm from Stable Baselines can be counted as hand-implemented if sufficient fine-tuning, adjustments, or optimisations have been made for the Super Mario Bros environment, but you will have to make note of these in your report, and you may be required to explain them in an interview (see 2).
For example, if you were to take the DQN from Stable Baselines, define your own custom policy, add custom image preprocessing, and add internal replay, it could count as hand-implemented.
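As an illustration of this kind of customisation (a sketch, not a reference solution), the following shows a custom image-preprocessing wrapper fed into Stable Baselines3's DQN; the wrapper, the hyperparameters, and the make_mario_env helper are all hypothetical:

```python
import cv2
import gym
import numpy as np
from stable_baselines3 import DQN

class GrayscaleResize(gym.ObservationWrapper):
    """Custom image preprocessing: grayscale + downsample frames to 84x84."""
    def __init__(self, env):
        super().__init__(env)
        self.observation_space = gym.spaces.Box(
            low=0, high=255, shape=(84, 84, 1), dtype=np.uint8)

    def observation(self, obs):
        gray = cv2.cvtColor(obs, cv2.COLOR_RGB2GRAY)
        small = cv2.resize(gray, (84, 84), interpolation=cv2.INTER_AREA)
        return small[:, :, None]

# env = GrayscaleResize(make_mario_env())  # make_mario_env() is hypothetical
# model = DQN("CnnPolicy", env, buffer_size=100_000, learning_rate=1e-4)
# model.learn(total_timesteps=1_000_000)
```

On its own, a wrapper like this would not be enough; it is the combination of such adjustments, documented and justified in your report, that can make a Stable Baselines algorithm count as hand-implemented.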