Split VORTEXM4 from VORTEX target and fix SGEMM_DIRECT support for SME-capable targets #5423
Open: martin-frbg wants to merge 35 commits into OpenMathLib:develop from martin-frbg:issue5414
+247
−98
ca22e28 Rename sgemm_direct_sme1.S to sgemm_direct_sme1_2VLx2VL.S (martin-frbg)
22c6607 Use ASMNAME to get symbol name from build system; leave x18 unused as… (martin-frbg)
89898fc Add sgemm_direct_performant for switching between direct and regular … (martin-frbg)
08a0032 Build symbol name from build system variables (martin-frbg)
53d3bb5 Get symbol name from build system; change b.first to b.mi for AppleCl… (martin-frbg)
731f4dd Add VORTEXM4 settings (martin-frbg)
e82bcd2 Update ARM64 sgemm_direct object generation (martin-frbg)
0203657 Add sgemm_direct_performant for ARM64 (martin-frbg)
de91afd Move SGEMM_DIRECT after the CBLAS parameter check and add sgemm_direc… (martin-frbg)
202a7a0 Separate VORTEXM4 from VORTEX and ARMV9SME (martin-frbg)
e76c390 Add sgemm_direct_performant for ARM64 (martin-frbg)
ef0b883 Add sgemm_direct_performant for ARM64 (martin-frbg)
ccfd017 Enable SME on MacOS and add VORTEXM4 to DYNAMIC_ARCH list (martin-frbg)
b0a00fb Add minimal compiler flags for VORTEXM4 (martin-frbg)
3097046 Add VORTEXM4 target (martin-frbg)
4e2a8c1 Split VORTEXM4 from VORTEX target due to SME support (martin-frbg)
18f9582 Add VORTEXM4 (martin-frbg)
ca542f3 Add VORTEXM4 (martin-frbg)
a4f5fec Add compiler options for VORTEXM4 (martin-frbg)
c794d0a Add VORTEXM4 (martin-frbg)
4328c91 relax requirements in compiler SME capability check (martin-frbg)
426b5f2 Add compiler options for VORTEXM4 (martin-frbg)
0bc19a1 Update SME kernel details (martin-frbg)
bf98e44 Add VORTEXM4 to DYNAMIC_ARCH list (martin-frbg)
4609732 Relax version number requirement for AppleClang (martin-frbg)
05dbb54 Delete misplaced file (martin-frbg)
107c883 Update SME-related kernels (martin-frbg)
501728a adjust register 20 accesses to 21 after moving x18 (martin-frbg)
edaa73f Hide the local 2VLx2VL symbol as static is insufficient for this with… (martin-frbg)
1ee8879 Add VORTEXM4 (martin-frbg)
7f89c6f smh-based direct sgemm currently requires leading dimensions to be sa… (martin-frbg)
8e50b8d Add d8 to d15 to clobber lists as the code does not expressly save them (martin-frbg)
b4fc09e Add registers d8 to d15 to clobber lists as the code does not express… (martin-frbg)
1b88c9c remove debugging printouts (martin-frbg)
2b5d8c7 remove debugging printout (martin-frbg)
@@ -111,6 +111,7 @@ THUNDERX2T99
 TSV110
 THUNDERX3T110
 VORTEX
+VORTEXM4
 A64FX
 ARMV8SVE
 ARMV9SME
For RowMajor, shouldn't the leading dimension check be (lda==k && ldb==n && ldc==n) ?
Normally yes, but the arguments have already been reshuffled at this point (I think; I'll recheck when I get back to this later this week).
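
To make the exchange above concrete, here is a minimal sketch of why a row-major check and a check made after reshuffling can agree. It is not the code from this PR. It assumes the usual CBLAS convention in which a row-major product is passed to a column-major kernel as C^T = B^T * A^T (swap m with n and A with B), and it assumes the column-major form of the check is "each leading dimension equals the row count". The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical argument bundle for illustration; not OpenBLAS's own struct. */
typedef struct {
    int m, n, k;       /* C is m x n, A is m x k, B is k x n (no transposes) */
    int lda, ldb, ldc; /* leading dimensions as passed by the caller */
} gemm_dims;

/* Row-major check exactly as written in the review comment. */
static bool rowmajor_contiguous(gemm_dims d) {
    return d.lda == d.k && d.ldb == d.n && d.ldc == d.n;
}

/* Assumed column-major check: leading dimension == number of rows. */
static bool colmajor_contiguous(gemm_dims d) {
    return d.lda == d.m && d.ldb == d.k && d.ldc == d.m;
}

/* Assumed reshuffle for row-major input: swap m<->n and A<->B so that a
   column-major kernel computes C^T = B^T * A^T on the same memory. */
static gemm_dims reshuffle_rowmajor(gemm_dims d) {
    gemm_dims s = d;
    s.m = d.n;
    s.n = d.m;
    s.lda = d.ldb;
    s.ldb = d.lda;
    return s;
}

/* Exhaustively confirm, on small sizes, that the column-major check applied
   after the reshuffle matches the row-major check applied before it. */
static void check_equivalence(void) {
    for (int m = 1; m <= 3; m++)
    for (int n = 1; n <= 3; n++)
    for (int k = 1; k <= 3; k++)
    for (int lda = 1; lda <= 4; lda++)
    for (int ldb = 1; ldb <= 4; ldb++)
    for (int ldc = 1; ldc <= 4; ldc++) {
        gemm_dims d = { m, n, k, lda, ldb, ldc };
        assert(rowmajor_contiguous(d) ==
               colmajor_contiguous(reshuffle_rowmajor(d)));
    }
}
```

Under these assumptions the two forms are equivalent, so which one is correct in the source depends only on whether the check runs before or after the swap, which is the point the reply says still needs rechecking.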