WIP: Add support for kmpc_parallel for pragma target teams distribute parallel do #51
Open
DominikAdamski wants to merge 18 commits into ROCm:aomp-dev from
Done:
Added declaration of kmpc_parallel_51
Set constant args
Added logic for setting the size of the array which will contain target symbols
Calling kmpc_parallel_51
Moved deinit function just after kmpc_parallel_51
Not done:
Passing symbols to target array
Creating a separate function which will reflect the kernel code
Signed-off-by: Dominik Adamski <Dominik.Adamski@amd.com>
kmpc_parallel_51 requires that offloaded symbols are passed as addresses inside a pointer array.
Signed-off-by: Dominik Adamski <Dominik.Adamski@amd.com>
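The calling convention described in this commit can be illustrated with a minimal C sketch. It is not the code that Flang generates: `mock_parallel`, `outlined_fn`, `body`, and `scale` are hypothetical names, and the mock models only the "array of addresses plus count" part of the interface, not the full parameter list of the real runtime entry point.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type of an outlined parallel region: every captured
   variable arrives as an address stored in a pointer array. */
typedef void (*outlined_fn)(void **args, int64_t nargs);

/* Hypothetical mock of the runtime entry point. The real
   kmpc_parallel_51 takes more parameters; only the args/nargs
   convention is modelled here. */
static void mock_parallel(outlined_fn fn, void **args, int64_t nargs) {
    fn(args, nargs);
}

/* Outlined body: unpack the addresses and update the variable in place. */
static void body(void **args, int64_t nargs) {
    assert(nargs == 2);
    int32_t *n = (int32_t *)args[0];
    double *x = (double *)args[1];
    *x = *x * (double)(*n);
}

/* Caller side: store the address of each offloaded symbol in the array,
   then hand the array and its length to the (mock) runtime. */
static double scale(int32_t n, double x) {
    void *args[2];
    args[0] = &n;
    args[1] = &x;
    mock_parallel(body, args, 2);
    return x;
}
```

Calling `scale(3, 2.0)` goes through the pointer array and returns the updated value of `x`. Passing addresses rather than values keeps the array element type uniform regardless of the type of each symbol.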
Flang generates new SPMD kernels which use the kmpc_parallel_51 function.
Signed-off-by: Dominik Adamski <Dominik.Adamski@amd.com>
Signed-off-by: Dominik Adamski <Dominik.Adamski@amd.com>
Do not assign a new dtype value for device symbols via the get_type function. Use the macros PASSBYVALP and PASSBYREFP instead. Remove the hack for load/store of the last symbol.
Signed-off-by: Dominik Adamski <Dominik.Adamski@amd.com>
Fixed passing scalars whose type is different from int64. Fixed passing allocatable arrays.
Signed-off-by: Dominik Adamski <Dominik.Adamski@amd.com>
If we generate an initialization function for SPMD kernels, we need to store the addresses of the arguments before we call the kmpc_parallel_51 function. We use the ptrtoint instruction for scalar variables. Before this patch, the LLVM IR was generated incorrectly for complex variables:

    void kernel_func(<float, float> *Arg_c)
      //some code
      ptrtoint i64* %Arg_c //error Arg_c was declared as pair of floats
      //some code
      call kmpc_parallel_51()
      //some code

With this patch, the LLVM IR contains the correct ptrtoint instruction:

    void kernel_func(<float, float> *Arg_c)
      //some code
      ptrtoint <float, float>* %Arg_c //ok, Arg_c was declared as pair of floats
      //some code
      call kmpc_parallel_51()
      //some code

Signed-off-by: Dominik Adamski <Dominik.Adamski@amd.com>
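As a rough C analogue of the fix above, the pointer-to-integer conversion keeps the operand's declared pointer type, and only the result is an integer. The sketch is an illustration, not generated code: `complex_pair` and the helper functions are hypothetical, and `uintptr_t` stands in for the integer type that receives the address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the "<float, float>" pair that represents
   a complex variable in the pseudocode above. */
typedef struct {
    float re;
    float im;
} complex_pair;

/* C analogue of ptrtoint: the operand keeps its declared type
   (complex_pair *), and the result is an integer holding the address. */
static uintptr_t address_of_pair(complex_pair *arg_c) {
    return (uintptr_t)arg_c;
}

/* C analogue of the reverse conversion on the receiving side. */
static complex_pair *pair_from_address(uintptr_t addr) {
    return (complex_pair *)addr;
}

/* Store the address as an integer, recover it, and read both parts. */
static float roundtrip_sum(float re, float im) {
    complex_pair c = { re, im };
    uintptr_t slot = address_of_pair(&c);
    complex_pair *p = pair_from_address(slot);
    return p->re + p->im;
}
```

The integer slot carries no type information, so the side that recovers the pointer has to know the declared type of the original variable.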
Fortran objects should be passed as i64*.
Signed-off-by: Dominik Adamski <Dominik.Adamski@amd.com>
Signed-off-by: Dominik Adamski <Dominik.Adamski@amd.com>
Signed-off-by: Dominik Adamski <Dominik.Adamski@amd.com>
Signed-off-by: Dominik Adamski <Dominik.Adamski@amd.com>
Clang uses the kmpc_parallel_51 function for handling the target parallel for pragma. Flang should use the same functions as Clang for pragma target parallel.
Signed-off-by: Dominik Adamski <Dominik.Adamski@amd.com>
Pragma target teams distribute parallel for calls the kmpc_parallel_51 function differently from pragma target parallel. Pragma target parallel calls kmpc_parallel_51 just after the initialization procedure. Pragma target teams distribute parallel for calls kmpc_parallel_51 in the body of the outer loop.
Signed-off-by: Dominik Adamski <Dominik.Adamski@amd.com>
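The difference in call placement can be sketched in C by counting how often a mock runtime entry is invoked. This is only an illustration: `mock_parallel`, `region`, and the two kernel functions are hypothetical, and the init and deinit steps are reduced to comments.

```c
#include <assert.h>

static int launches; /* counts calls to the mock runtime entry */

/* Hypothetical mock of the runtime entry: records the call and runs
   the region once. */
static void mock_parallel(void (*fn)(int), int arg) {
    launches++;
    fn(arg);
}

/* Placeholder for the code executed by the threads. */
static void region(int i) {
    (void)i;
}

/* Shape of a "target parallel" kernel: init, one launch, deinit. */
static int target_parallel_kernel(void) {
    launches = 0;
    /* kernel initialization would go here */
    mock_parallel(region, 0);
    /* kernel deinitialization would go here */
    return launches;
}

/* Shape of a "target teams distribute parallel for" kernel: the launch
   sits in the body of the outer loop. */
static int teams_distribute_parallel_kernel(int outer_iters) {
    launches = 0;
    /* kernel initialization would go here */
    for (int i = 0; i < outer_iters; i++) {
        mock_parallel(region, i);
    }
    /* kernel deinitialization would go here */
    return launches;
}
```

In this model the first kernel launches the parallel region exactly once, while the second launches it once per outer-loop iteration.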
Pragma target teams distribute parallel do consists of two levels of parallelism (teams and threads). Threads from one team should be launched in parallel via the __kmpc_parallel_51 function. The code which is executed by the threads should be outlined to a separate function.
Signed-off-by: Dominik Adamski <dominik.adamski@amd.com>
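A sequential C sketch of the two levels may help when reading this commit. Everything in it is an assumption made for illustration: the names, the chunked distribution of the outer loop, the fixed sizes (2 teams, 4 threads, chunk of 8), and the mock runtime that simply runs the outlined function once per thread id.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout of the argument array for the outlined function. */
enum { ARG_DATA = 0, ARG_LO = 1, ARG_HI = 2, NARGS = 3 };

/* Outlined inner loop. thread_id and num_threads stand in for values
   that the runtime would provide on a real device. */
static void outlined_inner(int thread_id, int num_threads, void **args) {
    int32_t *data = (int32_t *)args[ARG_DATA];
    int lo = *(int *)args[ARG_LO];
    int hi = *(int *)args[ARG_HI];
    for (int i = lo + thread_id; i < hi; i += num_threads) {
        data[i] = 2 * i;
    }
}

/* Sequential mock of the runtime: run the outlined function once per
   thread id instead of launching real threads. */
static void mock_parallel(int num_threads,
                          void (*fn)(int, int, void **), void **args) {
    for (int t = 0; t < num_threads; t++) {
        fn(t, num_threads, args);
    }
}

/* First level: one team walks its chunks of the outer loop and launches
   the threads of the team for each chunk. */
static void team_body(int team_id, int num_teams, int num_threads,
                      int32_t *data, int n, int chunk) {
    for (int lo = team_id * chunk; lo < n; lo += num_teams * chunk) {
        int hi = (lo + chunk < n) ? lo + chunk : n;
        void *args[NARGS] = { data, &lo, &hi };
        mock_parallel(num_threads, outlined_inner, args);
    }
}

/* Driver: 2 teams, 4 threads per team, chunk size 8; returns the sum of
   the elements written by the kernel. */
static int32_t run(int n) {
    int32_t data[64] = { 0 };
    int32_t sum = 0;
    assert(n >= 0 && n <= 64);
    for (int team = 0; team < 2; team++) {
        team_body(team, 2, 4, data, n, 8);
    }
    for (int i = 0; i < n; i++) {
        sum += data[i];
    }
    return sum;
}
```

Each element is written by exactly one (team, thread) pair, so `run(n)` returns `n * (n - 1)`. Outlining the inner loop gives the runtime a single function pointer it can hand to every thread of the team.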
bcdf266 to 852a46c
Signed-off-by: Dominik Adamski <dominik.adamski@amd.com>