Contribute Kernel to FlagGems
This use case shows how to contribute the KernelGen-generated Kernel, CUDA Implementation, Correctness Test, and Speedup Ratio Test code to the FlagGems GitHub repository.
The process is as follows:
Generate the Kernel, CUDA Implementation, Correctness Test, and Speedup Ratio Test code. For more information, see Generate Kernels through your operator definitions.
Save each piece of code as a separate file.
Convert the files.
Contribute the converted files to the FlagGems GitHub repository.
Note
The predefined ReLU use case in KernelGen is used as an example. In this use case, we assume that the ReLU operator is a new operator that you have just generated through KernelGen but have not yet contributed to the FlagGems GitHub repository.
Convert files generated from KernelGen
To convert the files generated from KernelGen:
Rename the four files as follows:
relu_triton.py: This file includes the Kernel code.
relu_baseline.py: This file includes the CUDA Implementation code.
test_relu_accuracy.py: This file includes the Correctness Test code.
test_relu_performance.py: This file includes the Speedup Ratio Test code.
Clone the FlagGems GitHub repository.
git clone https://github.com/flagos-ai/FlagGems
Clone the KernelGen GitHub repository.
git clone https://github.com/flagos-ai/KernelGen
Navigate to the tools directory of the KernelGen project.
cd /your/project/KernelGen/tools
Run the following script to convert these files into two FlagGems-compatible files.
python kernelgen_to_flaggems.py \
    ./tests \
    ./output \
    relu
The converted files are as follows:
/tmp/output/relu.py: Includes the Kernel code. This code is identical to the Kernel code generated in KernelGen.
/tmp/output/relu_test.py: Includes the converted Correctness Test and Speedup Ratio Test code.
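The file naming used in this procedure can be sketched in Python. This is a minimal illustration, not part of KernelGen or FlagGems: the helper names are hypothetical, and it assumes that the relu naming pattern shown above (for example, relu_triton.py and relu_test.py) carries over to other operators by swapping the operator name. The keys of the `saved` mapping and the original file names in the demo are also made up.

```python
import tempfile
from pathlib import Path


def kernelgen_input_names(op: str) -> dict[str, str]:
    """File names expected before conversion, keyed by the kind of code.

    Assumption: other operators follow the same pattern as the relu example.
    """
    return {
        "kernel": f"{op}_triton.py",
        "cuda_implementation": f"{op}_baseline.py",
        "correctness_test": f"test_{op}_accuracy.py",
        "speedup_ratio_test": f"test_{op}_performance.py",
    }


def flaggems_output_names(op: str) -> list[str]:
    """Names of the two converted files (same pattern assumption as above)."""
    return [f"{op}.py", f"{op}_test.py"]


def rename_saved_files(directory: Path, saved: dict[str, str], op: str) -> list[Path]:
    """Rename the files you saved to the names listed in the rename step.

    `saved` maps each kind of code to the name you originally saved it under.
    """
    targets = kernelgen_input_names(op)
    renamed = []
    for kind, current_name in saved.items():
        target = directory / targets[kind]
        (directory / current_name).rename(target)
        renamed.append(target)
    return renamed


def missing_outputs(output_dir: Path, op: str) -> list[str]:
    """Return the converted files that are not present in `output_dir`."""
    return [n for n in flaggems_output_names(op) if not (output_dir / n).is_file()]


if __name__ == "__main__":
    # Demo in a throwaway directory; the original file names are hypothetical.
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp)
        saved = {
            "kernel": "kernel.py",
            "cuda_implementation": "cuda_impl.py",
            "correctness_test": "correctness.py",
            "speedup_ratio_test": "speedup.py",
        }
        for name in saved.values():
            (work / name).write_text("# placeholder\n")
        print([p.name for p in rename_saved_files(work, saved, "relu")])
        print("missing converted files:", missing_outputs(work, "relu"))
```

Calling `missing_outputs` on the output directory after the conversion script has run is a quick way to confirm that both converted files were produced before you open a pull request.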
Create a pull request in FlagGems GitHub
You can now create a pull request in FlagGems GitHub.
Ensure the two converted files are placed properly as follows:
Place the relu.py file at: src/flag_gems/experimental_ops
Place the relu_test.py file at: src/flag_gems/experimental_ops/exp_tests
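The placement above can be sketched as a small Python helper that copies the two converted files into a local FlagGems checkout. This is an illustrative sketch only: the function name `place_converted_files` is hypothetical, the paths you pass in are placeholders, and the helper assumes both destination directories already exist in your checkout.

```python
import shutil
from pathlib import Path

# Destination directories inside the FlagGems repository, as listed above.
KERNEL_DIR = Path("src/flag_gems/experimental_ops")
TEST_DIR = KERNEL_DIR / "exp_tests"


def place_converted_files(output_dir: Path, flaggems_root: Path, op: str) -> list[Path]:
    """Copy <op>.py and <op>_test.py into the FlagGems checkout.

    Hypothetical helper: it checks everything first so that a missing file
    or directory does not leave a half-finished copy behind.
    """
    plan = [
        (output_dir / f"{op}.py", flaggems_root / KERNEL_DIR),
        (output_dir / f"{op}_test.py", flaggems_root / TEST_DIR),
    ]
    for source, dest_dir in plan:
        if not source.is_file():
            raise FileNotFoundError(f"converted file not found: {source}")
        if not dest_dir.is_dir():
            raise FileNotFoundError(f"destination directory not found: {dest_dir}")
    placed = []
    for source, dest_dir in plan:
        placed.append(Path(shutil.copy2(source, dest_dir / source.name)))
    return placed


if __name__ == "__main__":
    # Placeholder paths; adjust them to your own output directory and checkout.
    for path in place_converted_files(Path("/tmp/output"), Path("FlagGems"), "relu"):
        print("placed", path)
```

After the files are in place, you would commit them on a branch of your checkout and open the pull request through the usual GitHub workflow.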