Starter-kit

Participant will get a starter-kit. It contains:

# The examples of agent
agent/agent.py
agent/agent_MP.py
agent/agent_rllib.py
agent/checkpoint-25

# CBEngine config file
agent/gym_cfg.py

# To customize CBEngine interfaces
agent/CBEngine_round3.py

# sample traffic flow data and road network data
data/flow_round3.txt
...
data/roadnet_round3.txt
...

# demo script for generating sample traffic flow data
data/traffic_generator.py

# where you store your model
model/

# scoring script for single flow
evaluate.py

# evaluation and scoring script
evaluate.sh

# rllib train example
rllib_train.py

# example script for using rllib_train.py
train.sh

# rllib testing example
rllib_test.py

# script for parallel evaluating the model
rllib_evaluate.sh

# a simple demo to check your simulation environment. Note that **only** `observation` and `reward` could be modified. Please make sure that the dimension of `observation` is aligned with ``gym_cfg.py``. You could continue using the `observations` defined in the qualification phase, but the previous `reward` can't be used in `rllib` because `rllib` requires that each agent to be assigned with a `reward`. We provide 2 demo `rewards` definitions, "pressure" and "queue length", along with the old `reward` in the comment of default `CBEngine_round3.py``.
demo.py

Participants should implement their algorithm in agent.py. In the final phase, custom CBEngine_round3 is available. Participants can only revise the observation and reward of its agent.

Participants should submit their own CBEngine_round3 for training or evaluation. Note that only observation and reward could be modified. Please make sure that the dimension of observation is aligned with gym_cfg.py. You could continue using the observations defined in the qualification phase, but the previous reward can’t be used in rllib because rllib requires that each agent to be assigned with a reward of single value. We provide 2 demo rewards definitions, “pressure” and “queue length”, along with the old reward in the comment of default CBEngine_round3.py.
Now the current step is not included in observation by default. It is now included in obs['info']['step']
The observation format is modified to align with rllib api. For more information, please refer to the observation
Now the keys (i.e. agent_id) of actions, reward, observation, dones are str instead of int.
Now env.reset return a dict: observation.
Now route and t_ff are removed from “vehicle_info” in final phase.