ABKGroup/CT-UCSD

This repository provides the scripts and testcases that we have used to run Circuit Training (CT) at UCSD. We copied the original Circuit Training repository and made the modifications necessary to run our experiments. To run CT as we have, on our testcases, follow the instructions below.

Building the Docker Environment

To build the Docker environment, use the following commands:

cd tools_ours
./docker_image_build.sh
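
If the build succeeds, the image used in the next step should be visible locally. As a quick sanity check (this assumes the build script tags the image with the name used below):

docker images circuit_training:corepy39cu12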

Starting the Docker Container

To start the Docker container, use the command below:

## You may need to update the user ID or run as root.
docker run --gpus all -u 1031:1032 --network=host -it --rm -v $(pwd):/workspace --workdir /workspace circuit_training:corepy39cu12 bash
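
The -u 1031:1032 value above is specific to our machines. The standard id utility prints your own user and group IDs, which can also be substituted inline:

## Print your user and group IDs:
id -u
id -g
## Or pass them directly:
docker run --gpus all -u $(id -u):$(id -g) --network=host -it --rm -v $(pwd):/workspace --workdir /workspace circuit_training:corepy39cu12 bash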

Training From Scratch

To launch the training from scratch, execute the following commands:

## Update ./run_script/set_envs.sh (Lines: 38, 39 and 41)
source ./run_script/set_envs.sh
./run_script/reverb_train_eval.sh

## Collector Server 1
## Update ./run_script/set_envs_collect.sh (Lines: 38, 39, 41 and 54)
source ./run_script/set_envs_collect.sh
./run_script/run_collect1.sh

## Collector Server 2
## Update ./run_script/set_envs_collect.sh (Lines: 38, 39, 41 and 54)
source ./run_script/set_envs_collect.sh
./run_script/run_collect2.sh

## Collector Server 3
## Update ./run_script/set_envs_collect.sh (Lines: 38, 39, 41 and 54)
source ./run_script/set_envs_collect.sh
./run_script/run_collect3.sh

## Collector Server 4
## Update ./run_script/set_envs_collect.sh (Lines: 38, 39, 41 and 54)
source ./run_script/set_envs_collect.sh
./run_script/run_collect4.sh

## Collector Server 5
## Update ./run_script/set_envs_collect.sh (Lines: 38, 39, 41 and 54)
source ./run_script/set_envs_collect.sh
./run_script/run_collect5.sh

The above example runs 256 collector jobs across five servers, each with 96 CPU threads, and runs one training job, one evaluation job, and a Reverb server on the main server. The main server is equipped with eight NVIDIA V100 GPUs and 96 CPU threads. Note that, as described in Figure 6 of the ISPD22 paper, the number of GPUs used influences the training outcome.
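
If the five collector hosts are reachable over SSH, with this repository mounted at the same path and set_envs_collect.sh already edited on each host, a loop like the following can launch every collector script from one place. This is a hypothetical convenience sketch (collect1 through collect5 are placeholder host names), not part of our scripts:

## Hypothetical: launch each collector script on its own host.
for i in 1 2 3 4 5; do
  ssh collect$i "cd /workspace && source ./run_script/set_envs_collect.sh && ./run_script/run_collect$i.sh" &
done
wait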


Fine-tuning the Pre-trained Model

First, download the pre-trained model released by the Circuit Training authors from here. To fine-tune this pre-trained "AlphaChip" model, execute the following commands:
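
We do not reproduce set_envs_FT.sh here; lines 47 and 48 are the additions relative to set_envs.sh and presumably point the run at the downloaded model. A minimal sketch of that edit, with hypothetical variable names and paths (the actual names are in the script):

## Hypothetical variable names and paths; see lines 47 and 48 of
## set_envs_FT.sh for the actual ones.
export PRETRAINED_CKPT_DIR=/workspace/pretrained_model/checkpoints
export PRETRAINED_POLICY_DIR=/workspace/pretrained_model/policies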

## Update ./run_script/set_envs_FT.sh (Lines: 38, 39, 41, 47 and 48)
source ./run_script/set_envs_FT.sh
./run_script/reverb_train_eval_FT.sh

## Collector Server 1
## Update ./run_script/set_envs_collect.sh (Lines: 38, 39, 41 and 54)
source ./run_script/set_envs_collect.sh
./run_script/run_collect1.sh

## Collector Server 2
## Update ./run_script/set_envs_collect.sh (Lines: 38, 39, 41 and 54)
source ./run_script/set_envs_collect.sh
./run_script/run_collect2.sh

## Collector Server 3
## Update ./run_script/set_envs_collect.sh (Lines: 38, 39, 41 and 54)
source ./run_script/set_envs_collect.sh
./run_script/run_collect3.sh

## Collector Server 4
## Update ./run_script/set_envs_collect.sh (Lines: 38, 39, 41 and 54)
source ./run_script/set_envs_collect.sh
./run_script/run_collect4.sh

## Collector Server 5
## Update ./run_script/set_envs_collect.sh (Lines: 38, 39, 41 and 54)
source ./run_script/set_envs_collect.sh
./run_script/run_collect5.sh

The job distribution is the same as for training from scratch: 256 collector jobs across five servers (each with 96 CPU threads), plus one training job, one evaluation job, and a Reverb server on the main server (eight NVIDIA V100 GPUs and 96 CPU threads). As before, the number of GPUs used influences the training outcome (Figure 6 of the ISPD22 paper).


Pre-training Model with MemPoolGroup-NG45

The following example shows pre-training the model with seven variants of MemPoolGroup-NG45: x-flip, y-flip, xy-flip, shift, shift-x-flip, shift-y-flip, shift-xy-flip.

## Update ./run_script/run_script_pt_mpg/set_envs_PT.sh (Lines: 61-89 and 101)
source ./run_script/run_script_pt_mpg/set_envs_PT.sh
./run_script/run_script_pt_mpg/reverb_train_eval_PT.sh

## Collector Server 1
## Update ./run_script/run_script_pt_mpg/set_envs_collect_PT.sh (Lines: 61-89 and 100)
source ./run_script/run_script_pt_mpg/set_envs_collect_PT.sh
./run_script/run_script_pt_mpg/run_collect1_PT.sh

## Collector Server 2
## Update ./run_script/run_script_pt_mpg/set_envs_collect_PT.sh (Lines: 61-89 and 100)
source ./run_script/run_script_pt_mpg/set_envs_collect_PT.sh
./run_script/run_script_pt_mpg/run_collect2_PT.sh

## Collector Server 3
## Update ./run_script/run_script_pt_mpg/set_envs_collect_PT.sh (Lines: 61-89 and 100)
source ./run_script/run_script_pt_mpg/set_envs_collect_PT.sh
./run_script/run_script_pt_mpg/run_collect3_PT.sh

## Collector Server 4
## Update ./run_script/run_script_pt_mpg/set_envs_collect_PT.sh (Lines: 61-89 and 100)
source ./run_script/run_script_pt_mpg/set_envs_collect_PT.sh
./run_script/run_script_pt_mpg/run_collect4_PT.sh

## Collector Server 5
## Update ./run_script/run_script_pt_mpg/set_envs_collect_PT.sh (Lines: 61-89 and 100)
source ./run_script/run_script_pt_mpg/set_envs_collect_PT.sh
./run_script/run_script_pt_mpg/run_collect5_PT.sh

The above example runs 252 collector jobs (36 for each of the seven MemPoolGroup-NG45 variants) across five servers, each with 96 CPU threads. In addition, it runs one training job, one evaluation job, and a Reverb server on the main server, which is equipped with eight NVIDIA V100 GPUs and 96 CPU threads.

Once pre-training is complete, you can use the checkpoint and policy from the /workspace/logs/run_YOUR_PRETRAIN_MODEL/<seed>/policies directory to fine-tune the model on the target design.
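
For example, a fine-tuning run would point whatever variable set_envs_FT.sh actually uses (sketched hypothetically in the fine-tuning section above) at this directory:

## Hypothetical variable name; substitute the actual one from set_envs_FT.sh.
export PRETRAINED_POLICY_DIR=/workspace/logs/run_YOUR_PRETRAIN_MODEL/<seed>/policies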


Pre-training Model with Scaled Versions of CT-Ariane

The following example demonstrates pre-training the model using both x-flip and y-flip variants of CT-Ariane, CT-Ariane-X2, and CT-Ariane-X4 (i.e., six netlists).

## Update ./run_script/run_script_pt_ariane_x4/set_envs_PT.sh (Lines: 61-86 and 98)
source ./run_script/run_script_pt_ariane_x4/set_envs_PT.sh
./run_script/run_script_pt_ariane_x4/reverb_train_eval_PT.sh

## Collector Server 1
## Update ./run_script/run_script_pt_ariane_x4/set_envs_collect_PT.sh (Lines: 61-86 and 97)
source ./run_script/run_script_pt_ariane_x4/set_envs_collect_PT.sh
./run_script/run_script_pt_ariane_x4/run_collect1_PT.sh

## Collector Server 2
## Update ./run_script/run_script_pt_ariane_x4/set_envs_collect_PT.sh (Lines: 61-86 and 97)
source ./run_script/run_script_pt_ariane_x4/set_envs_collect_PT.sh
./run_script/run_script_pt_ariane_x4/run_collect2_PT.sh

## Collector Server 3
## Update ./run_script/run_script_pt_ariane_x4/set_envs_collect_PT.sh (Lines: 61-86 and 97)
source ./run_script/run_script_pt_ariane_x4/set_envs_collect_PT.sh
./run_script/run_script_pt_ariane_x4/run_collect3_PT.sh 

## Collector Server 4
## Update ./run_script/run_script_pt_ariane_x4/set_envs_collect_PT.sh (Lines: 61-86 and 97)
source ./run_script/run_script_pt_ariane_x4/set_envs_collect_PT.sh
./run_script/run_script_pt_ariane_x4/run_collect4_PT.sh

## Collector Server 5
## Update ./run_script/run_script_pt_ariane_x4/set_envs_collect_PT.sh (Lines: 61-86 and 97)
source ./run_script/run_script_pt_ariane_x4/set_envs_collect_PT.sh
./run_script/run_script_pt_ariane_x4/run_collect5_PT.sh

The above example runs 252 collector jobs (42 for each of the six netlists) across five servers, each with 96 CPU threads. In addition, it runs one training job, one evaluation job, and a Reverb server on the main server, which is equipped with eight NVIDIA V100 GPUs and 96 CPU threads.

Once pre-training is complete, you can use the checkpoint and policy from the /workspace/logs/run_YOUR_PRETRAIN_MODEL/<seed>/policies directory to fine-tune the model on the target design.


Testcases

Listed below are the open testcases on which we have run CT from scratch and fine-tuned the pre-trained model released by the authors of the Circuit Training paper. Note that we set "MACRO_COUNT" to the number of macros in the design plus one; a hypothetical set_envs.sh excerpt is sketched after the list.

  1. CT-Ariane:

    • DESIGN_NAME: ariane
    • MACRO_COUNT: 134
    • Use ./run_scripts/cong_tsmc7.sh to update the routing resource.
  2. CT-Ariane-X2:

    • DESIGN_NAME: ariane_X2
    • MACRO_COUNT: 267
    • Use ./run_scripts/cong_tsmc7.sh to update the routing resource.
  3. CT-Ariane-X4:

    • DESIGN_NAME: ariane_X4_xflip_yflip
    • MACRO_COUNT: 533
    • Use ./run_scripts/cong_tsmc7.sh to update the routing resource.
  4. Ariane-NG45:

    • DESIGN_NAME: ariane133_ng45
    • MACRO_COUNT: 134
    • Use ./run_scripts/cong_ng45.sh to update the routing resource.
  5. Ariane-ASAP7:

    • DESIGN_NAME: ariane_asap7
    • MACRO_COUNT: 134
    • Use ./run_scripts/cong_asap7.sh to update the routing resource.
  6. BlackParrot-NG45:

    • DESIGN_NAME: bp_ng45
    • MACRO_COUNT: 221
    • Use ./run_scripts/cong_ng45.sh to update the routing resource.
  7. BlackParrot-ASAP7:

    • DESIGN_NAME: bp_asap7
    • MACRO_COUNT: 221
    • Use ./run_scripts/cong_asap7.sh to update the routing resource.
  8. MemPoolGroup-NG45:

    • DESIGN_NAME: mempool_group_ng45
    • MACRO_COUNT: 325
    • Use ./run_scripts/cong_ng45.sh to update the routing resource.
  9. MemPoolGroup-ASAP7:

    • DESIGN_NAME: mempool_group_asap7
    • MACRO_COUNT: 325
    • Use ./run_scripts/cong_asap7.sh to update the routing resource.
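
As an illustration of how one testcase maps onto the environment setup, here is a hypothetical excerpt of the per-design edits to set_envs.sh for Ariane-NG45 (the exact variable names and line numbers are in the script itself):

## Hypothetical excerpt for Ariane-NG45; check set_envs.sh for the
## actual variable names.
export DESIGN_NAME=ariane133_ng45
export MACRO_COUNT=134   ## 133 macros in the design, plus one
## Update the routing resources:
./run_scripts/cong_ng45.sh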

If you want to generate these testcases, please refer to the MacroPlacement repository here.

