CollectiveHLS: A Collaborative Approach to High-Level Synthesis Design Optimization

This repository hosts CollectiveHLS, an ultra-fast, knowledge-driven approach for optimizing FPGA designs through High-Level Synthesis (HLS). CollectiveHLS relies on two additional projects: HLSAnalysisTools, which generates a feature vector from the original source code of the target application, and GenHLSOptimizer, which collects design latency (in milliseconds) along with BRAM%, DSP%, FF%, and LUT% utilization for various directive combinations on a specific FPGA at a target frequency. The CollectiveHLS knowledge base is built from applications in Machsuite, RodiniaHLS, and various publicly available GitHub repositories. It targets the AMD UltraScale+ MPSoC ZCU104 FPGA at a clock frequency of 300 MHz (a target clock period of 3.33 ns).

To apply CollectiveHLS to unseen applications, you must first use HLSAnalysisTools to extract the necessary source code information and generate the source code feature vector as detailed in the original manuscript.
To expand CollectiveHLS' knowledge base, you will also need to use GenHLSOptimizer to build the Quality of Result (QoR) metrics database and identify the Pareto frontier of the designs.
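
As a rough illustration of what identifying the Pareto frontier over the QoR metrics means, the sketch below keeps only the designs that no other design beats on every metric at once. It is not GenHLSOptimizer's implementation: the record layout, the field names, and the `pareto_frontier` helper are assumptions made for this example. The first two records reuse the GEMM NCubed figures shown in the Run section; the third is made up.

```python
from typing import Dict, List

# Assumed record layout: latency in msec plus utilization percentages.
METRICS = ("latency_ms", "bram", "dsp", "ff", "lut")


def dominates(a: Dict[str, float], b: Dict[str, float]) -> bool:
    """Return True if a is no worse than b on every metric and better on at least one."""
    no_worse = all(a[m] <= b[m] for m in METRICS)
    strictly_better = any(a[m] < b[m] for m in METRICS)
    return no_worse and strictly_better


def pareto_frontier(designs: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Keep the designs that are not dominated by any other design (lower is better)."""
    return [d for d in designs if not any(dominates(o, d) for o in designs)]


designs = [
    {"name": "A", "latency_ms": 0.01646019, "bram": 0, "dsp": 22, "ff": 354, "lut": 262},
    {"name": "B", "latency_ms": 0.61709229, "bram": 0, "dsp": 0, "ff": 15, "lut": 12},
    # Hypothetical design: slower and larger than B, so B dominates it.
    {"name": "C", "latency_ms": 0.70000000, "bram": 0, "dsp": 5, "ff": 20, "lut": 15},
]

for design in pareto_frontier(designs):
    print(design["name"])
```

Designs A and B trade latency against resources, so neither dominates the other and both stay on the frontier. A real knowledge base would presumably also discard designs that exceed the device resources before this step.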

This repository contains the source code for the core components of CollectiveHLS and offers a brief demo showcasing its basic functionalities using two applications: the KNN Pipeline from RodiniaHLS and GEMM NCubed from Machsuite.

Getting Started

These instructions will get you a copy of the project on your local machine.

Prerequisites

This project was tested on Ubuntu 20.04 LTS (GNU/Linux 5.4.0-187-generic x86_64) with Python 3.8.10 and the Vitis 2022.1 suite installed.

In addition, the following libraries are needed:

psutil (5.9.2)
numpy (1.23.2)
pandas (1.4.3)
seaborn (0.12.0)
matplotlib (3.5.3)
scikit-learn (1.1.2)

These can be installed with the following command:

python3 -m pip install -r requirements.txt
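
To confirm that the environment matches the versions listed above, a small check along the following lines may help. This is a sketch that is not part of the repository: the `PINNED` table simply restates the list above, and `check_versions` is a hypothetical helper. It relies on `importlib.metadata`, which is available from Python 3.8 onward.

```python
from importlib import metadata
from typing import Dict, Optional, Tuple

# Versions restated from the Prerequisites list.
PINNED = {
    "psutil": "5.9.2",
    "numpy": "1.23.2",
    "pandas": "1.4.3",
    "seaborn": "0.12.0",
    "matplotlib": "3.5.3",
    "scikit-learn": "1.1.2",
}


def check_versions(pinned: Dict[str, str]) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each package to (wanted version, installed version or None if missing)."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed)
    return report


if __name__ == "__main__":
    for name, (wanted, installed) in check_versions(PINNED).items():
        status = "ok" if installed == wanted else "mismatch"
        print(f"{name:<14} wanted {wanted:<8} installed {installed}  [{status}]")
```

A mismatch does not necessarily mean that CollectiveHLS will fail; the listed versions are only the ones the project was tested with.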

Run

After installing the software listed in the Prerequisites section, clone this repository to your local machine.

Optimize Application with CollectiveHLS

python3 CollectiveHLS.py --APPLICATION_TO_BE_OPTIMIZED <ApplicationName>

Example 1: Optimize the KNN Pipeline from RodiniaHLS using CollectiveHLS

python3 CollectiveHLS.py --APPLICATION_TO_BE_OPTIMIZED RodiniaHLS-KNN-Pipeline

Output

*******************************************************
*                   CollectiveHLS                     *
*******************************************************
* Number of PCs         = 3
* Number of Clusters    = 5
* Probability Threshold = 0.1
* VitisHLS Opt.         = False
* Re-propose Directives = False
* FPGA Id               = xczu7ev-ffvc1156-2-e
* Target Clock Period   = 3.33
*******************************************************
* Optimized App. = RodiniaHLS-KNN-Pipeline
*******************************************************
* Design Latency = 10.69353243 msec
* BRAM %         = 0 %
* DSP %          = 58 %
* FF %           = 46 %
* LUT %          = 82 %
* Synthesis Time = 691.6875653266907 sec
*******************************************************

Example 2: Optimize the GEMM NCubed application from Machsuite using CollectiveHLS with the re-propose directives feature disabled

python3 CollectiveHLS.py --APPLICATION_TO_BE_OPTIMIZED Machsuite-GEMM-NCubed --REPROPOSE_DIRECTIVES False

Output

*******************************************************
*                   CollectiveHLS                     *
*******************************************************
* Number of PCs         = 3
* Number of Clusters    = 5
* Probability Threshold = 0.1
* VitisHLS Opt.         = False
* Re-propose Directives = False
* FPGA Id               = xczu7ev-ffvc1156-2-e
* Target Clock Period   = 3.33
*******************************************************
* Optimized App. = Machsuite-GEMM-NCubed
*******************************************************
* Design Latency = 0.01646019 msec
* BRAM %         = 0 %
* DSP %          = 22 %
* FF %           = 354 %
* LUT %          = 262 %
* Synthesis Time = 1081.106892824173 sec
* The design proposed by CollectiveHLS was synthesizable but not feasible.
*******************************************************
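
The "synthesizable but not feasible" message above coincides with FF and LUT utilization far beyond 100%. The sketch below parses a report in this format and applies that reading of feasibility. It is an assumption based on the printed output, not the check CollectiveHLS itself performs, and `parse_utilization` and `is_feasible` are hypothetical helpers.

```python
import re
from typing import Dict

# Matches report lines such as "* FF %           = 354 %".
UTIL_LINE = re.compile(r"^\*\s*(BRAM|DSP|FF|LUT) %\s*=\s*([\d.]+)\s*%\s*$")


def parse_utilization(report: str) -> Dict[str, float]:
    """Extract the BRAM/DSP/FF/LUT utilization percentages from a report."""
    util = {}
    for line in report.splitlines():
        match = UTIL_LINE.match(line.strip())
        if match:
            util[match.group(1)] = float(match.group(2))
    return util


def is_feasible(util: Dict[str, float], limit: float = 100.0) -> bool:
    """Assumed rule: a design fits only if every resource stays at or below the limit."""
    return bool(util) and all(value <= limit for value in util.values())


GEMM_REPORT = """\
* Design Latency = 0.01646019 msec
* BRAM %         = 0 %
* DSP %          = 22 %
* FF %           = 354 %
* LUT %          = 262 %
"""

gemm_util = parse_utilization(GEMM_REPORT)
print(gemm_util, is_feasible(gemm_util))
```

Under the same rule, the KNN Pipeline result from Example 1 (0%, 58%, 46%, 82%) counts as feasible.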

Example 3: Optimize the GEMM NCubed application from Machsuite using CollectiveHLS with the re-propose directives feature enabled

python3 CollectiveHLS.py --APPLICATION_TO_BE_OPTIMIZED Machsuite-GEMM-NCubed --REPROPOSE_DIRECTIVES True

Output

*******************************************************
*                   CollectiveHLS                     *
*******************************************************
* Number of PCs         = 3
* Number of Clusters    = 5
* Probability Threshold = 0.1
* VitisHLS Opt.         = False
* Re-propose Directives = True
* FPGA Id               = xczu7ev-ffvc1156-2-e
* Target Clock Period   = 3.33
*******************************************************
* Optimized App. = Machsuite-GEMM-NCubed
*******************************************************
* Design Latency = 0.01646019 msec
* BRAM %         = 0 %
* DSP %          = 22 %
* FF %           = 354 %
* LUT %          = 262 %
* Synthesis Time = 1080.2370615005493 sec
* The design proposed by CollectiveHLS was synthesizable but not feasible.
*******************************************************

# Re-propose Directives Iteration 1

*******************************************************
*                   CollectiveHLS                     *
*******************************************************
* Number of PCs         = 3
* Number of Clusters    = 5
* Probability Threshold = 0.1
* VitisHLS Opt.         = False
* Re-propose Directives = True
* FPGA Id               = xczu7ev-ffvc1156-2-e
* Target Clock Period   = 3.33
*******************************************************
* Optimized App. = Machsuite-GEMM-NCubed
*******************************************************
* Design Latency = 0.01646019 msec
* BRAM %         = 0 %
* DSP %          = 22 %
* FF %           = 354 %
* LUT %          = 262 %
* Synthesis Time = 1107.0174641609192 sec
* The design proposed by CollectiveHLS was synthesizable but not feasible.
*******************************************************

# Re-propose Directives Iteration 2

*******************************************************
*                   CollectiveHLS                     *
*******************************************************
* Number of PCs         = 3
* Number of Clusters    = 5
* Probability Threshold = 0.1
* VitisHLS Opt.         = False
* Re-propose Directives = True
* FPGA Id               = xczu7ev-ffvc1156-2-e
* Target Clock Period   = 3.33
*******************************************************
* Optimized App. = Machsuite-GEMM-NCubed
*******************************************************
* Design Latency = 0.61709229 msec
* BRAM %         = 0 %
* DSP %          = 0 %
* FF %           = 15 %
* LUT %          = 12 %
* Synthesis Time = 83.9670057296753 sec
*******************************************************
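
Judging only from the output above, the re-propose feature appears to repeat the propose-and-synthesize step until a design fits on the device. The sketch below mimics that control flow with canned results taken from Example 3. It is a guess at the behavior, not the repository's code: `synthesize`, `fits`, and the `max_reproposals` cap are all hypothetical.

```python
from typing import Callable, Dict, Tuple

Utilization = Dict[str, float]


def fits(util: Utilization, limit: float = 100.0) -> bool:
    """Assumed feasibility rule: every resource at or below 100% utilization."""
    return all(value <= limit for value in util.values())


def optimize_with_repropose(
    synthesize: Callable[[int], Utilization],
    max_reproposals: int = 3,
) -> Tuple[int, Utilization]:
    """Run the initial proposal, then re-propose until the design fits or the cap is hit."""
    util = synthesize(0)  # initial proposal
    reproposals = 0
    while not fits(util) and reproposals < max_reproposals:
        reproposals += 1
        util = synthesize(reproposals)  # stand-in for a new directive proposal
    return reproposals, util


# Canned utilization results copied from the three reports in Example 3.
CANNED = [
    {"BRAM": 0, "DSP": 22, "FF": 354, "LUT": 262},  # initial proposal
    {"BRAM": 0, "DSP": 22, "FF": 354, "LUT": 262},  # re-propose iteration 1
    {"BRAM": 0, "DSP": 0, "FF": 15, "LUT": 12},     # re-propose iteration 2
]

iterations, final = optimize_with_repropose(lambda i: CANNED[i])
print(iterations, final)
```

The trade-off is visible in the reports: the feasible design from iteration 2 uses far fewer resources but has a higher latency (0.61709229 msec versus 0.01646019 msec).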

Publications

If you find our project useful, please consider citing our papers:

@ARTICLE{10310220,
  author   = {Ferikoglou, Aggelos and Kakolyris, Andreas and Kypriotis, Vasilis and Masouros, Dimosthenis and Soudris, Dimitrios and Xydis, Sotirios},
  journal  = {IEEE Embedded Systems Letters}, 
  title    = {CollectiveHLS: Ultrafast Knowledge-Based HLS Design Optimization}, 
  year     = {2024},
  volume   = {16},
  number   = {2},
  pages    = {235-238},
  keywords = {Source coding;Optimization;Knowledge based systems;Field programmable gate arrays;Measurement;Sociology;Pareto optimization;Collective;data-driven;design space exploration (DSE);field programmable gate array (FPGA);high-level synthesis (HLS)},
  doi      = {10.1109/LES.2023.3330610}
}

@inproceedings{ferikoglou2024data,
  title     = {Data-driven HLS optimization for reconfigurable accelerators},
  author    = {Ferikoglou, Aggelos and Kakolyris, Andreas and Kypriotis, Vasilis and Masouros, Dimosthenis and Soudris, Dimitrios and Xydis, Sotirios},
  booktitle = {Proceedings of the 61st ACM/IEEE Design Automation Conference},
  pages     = {1--6},
  year      = {2024}
}

@article{ferikoglou2024collectivehls,
  title     = {CollectiveHLS: A Collaborative Approach to High-Level Synthesis Design Optimization},
  author    = {Ferikoglou, Aggelos and Kakolyris, Andreas and Masouros, Dimosthenis and Soudris, Dimitrios and Xydis, Sotirios},
  journal   = {ACM Transactions on Reconfigurable Technology and Systems},
  volume    = {18},
  number    = {1},
  pages     = {1--32},
  year      = {2024},
  publisher = {ACM New York, NY}
}
