aboutsummaryrefslogtreecommitdiff
path: root/src/cmd/compile/internal/ssa/_gen/PPC64latelower.rules
AgeCommit message (Collapse)Author
9 dayscmd/compile: improve uint8/uint16 logical immediates on PPC64Jayanth Krishnamurthy jayanth.krishnamurthy@ibm.com
Logical ops on uint8/uint16 (AND/OR/XOR) with constants sometimes materialized the mask via MOVD (often as a negative immediate), even when the value fit in the UI-immediate range. This prevented the backend from selecting andi. / ori / xori forms. This CL makes: UI-immediate truncation is performed only at the use-site of logical-immediate ops, and only when the constant does not fit in the 8- or 16-bit unsigned domain (m != uint8(m) / m != uint16(m)). This avoids negative-mask materialization and enables correct emission of UI-form logical instructions. Arithmetic SI-immediate instructions (addi, subfic, etc.) and other use-patterns are unchanged. Codegen tests are added to ensure the expected andi./ori/xori patterns appear and that MOVD is not emitted for valid 8/16-bit masks. Change-Id: I9fcdf4498c4e984c7587814fb9019a75865c4a0d Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10,gotip-linux-ppc64_power8,gotip-linux-ppc64le_power8,gotip-linux-ppc64le_power9,gotip-linux-ppc64le_power10 Reviewed-on: https://go-review.googlesource.com/c/go/+/704015 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Paul Murphy <paumurph@redhat.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Mark Freeman <markfreeman@google.com>
2024-08-26cmd/compile: intrinsify math.MulUintptr on PPC64Paul E. Murphy
This can be done efficiently with few instructions. This also adds MULHDUCC for further codegen improvement. Change-Id: I06320ba4383a679341b911a237a360ef07b19168 Reviewed-on: https://go-review.googlesource.com/c/go/+/605975 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Archana Ravindar <aravinda@redhat.com> Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-05-22cmd/compile/internal/ssa: reintroduce ANDconst opcode on PPC64Paul E. Murphy
This allows more effective conversion of rotate and mask opcodes into their CC equivalents, while simplifying the first lowering pass. This was removed before the latelower pass was introduced to fold more cases of compare against zero. Add ANDconst to push the conversion of ANDconst to ANDCCconst into latelower with the other CC opcodes. This also requires introducing RLDICLCC to prevent regressions when ANDconst is converted to RLDICL then to RLDICLCC and back to ANDCCconst when possible. Change-Id: I9e5f9c99fbefa334db18c6c152c5f967f3ff2590 Reviewed-on: https://go-review.googlesource.com/c/go/+/586160 Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Carlos Amedee <carlos@golang.org>
2023-11-13cmd/compile/internal/ssa: on PPC64, merge (CMPconst [0] (op ...)) more ↵Paul E. Murphy
aggressively Generate the CC version of many opcodes whose result is compared against signed 0. The approach taken here works even if the opcode result is used in multiple places too. Add support for ADD, ADDconst, ANDN, SUB, NEG, CNTLZD, NOR conversions to their CC opcode variant. These are the most commonly used variants. Also, do not set clobberFlags of CNTLZD and CNTLZW, they do not clobber flags. This results in about 1% smaller text sections in kubernetes binaries, and no regressions in the crypto benchmarks. Change-Id: I9e0381944869c3774106bf348dead5ecb96dffda Reviewed-on: https://go-review.googlesource.com/c/go/+/538636 Run-TryBot: Paul Murphy <murp@ibm.com> TryBot-Result: Gopher Robot <gobot@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Jayanth Krishnamurthy <jayanth.krishnamurthy@ibm.com> Reviewed-by: Heschi Kreinick <heschi@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
2023-10-18cmd/compile/internal/ssa: on PPC64, generate large constant paddiPaul E. Murphy
This is only supported power10/linux/PPC64. This generates smaller, faster code by merging a pli + add into paddi. Change-Id: I1f4d522fce53aea4c072713cc119a9e0d7065acc Reviewed-on: https://go-review.googlesource.com/c/go/+/531717 Run-TryBot: Paul Murphy <murp@ibm.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Than McIntosh <thanm@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
2023-10-18cmd/compile: avoid ANDCCconst on PPC64 if condition not neededLynn Boger
In the PPC64 ISA, the instruction to do an 'and' operation using an immediate constant is only available in the form that also sets CR0 (i.e. clobbers the condition register.) This means CR0 is being clobbered unnecessarily in many cases. That affects some decisions made during some compiler passes that check for it. In those cases when the constant used by the ANDCC is a right justified consecutive set of bits, a shift instruction can be used which has the same effect if CR0 does not need to be set. The rule to do that has been added to the late rules file after other rules using ANDCCconst have been processed in the main rules file. Some codegen tests had to be updated since ANDCC is no longer generated for some cases. A new test case was added to verify the ANDCC is present if the results for both the AND and CR0 are used. Change-Id: I304f607c039a458e2d67d25351dd00aea72ba542 Reviewed-on: https://go-review.googlesource.com/c/go/+/531435 Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com> Reviewed-by: Paul Murphy <murp@ibm.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Carlos Amedee <carlos@golang.org> Reviewed-by: Jayanth Krishnamurthy <jayanth.krishnamurthy@ibm.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2023-09-06cmd/compile/internal/ssa: improve masking codegen on PPC64Paul E. Murphy
Generate RLDIC[LR] instead of MOVD mask, Rx; AND Rx, Ry, Rz. This helps reduce code size, and reduces the latency caused by the constant load. Similarly, for smaller-than-register values, truncate constants which exceed the range of the value's type to avoid needing to load a constant. Change-Id: I6019684795eb8962d4fd6d9585d08b17c15e7d64 Reviewed-on: https://go-review.googlesource.com/c/go/+/515576 Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Run-TryBot: Paul Murphy <murp@ibm.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com>
2023-02-06cmd/compile: add rules to emit SETBC/R instructions on power10Archana R
This CL adds rules that replaces instances of ISEL that produce a boolean result based on a condition register by SETBC/SETBCR operations. On Power10 these are convereted to SETBC/SETBCR instructions that use one register instead of 3 registers conventionally used by ISEL and hence reduces register pressure. On loops written specifically to exercise such instances of ISEL extensively, a performance improvement of 2.5% is seen on Power10. Also added verification tests to verify correct generation of SETBC/SETBCR instructions on Power10. Change-Id: Ib719897f09d893de40324440a43052dca026e8fa Reviewed-on: https://go-review.googlesource.com/c/go/+/449795 Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Run-TryBot: Archana Ravindar <aravind5@in.ibm.com> Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2022-11-14cmd/compile: merge zero constant ISEL in PPC64 lateLower passPaul E. Murphy
Add a new SSA opcode ISELZ, similar to ISELB to represent a select of value or 0. Then, merge candidate ISEL opcodes inside the late lower pass. This avoids complicating rules within the the lower pass. Change-Id: I3b14c94b763863aadc834b0e910a85870c131313 Reviewed-on: https://go-review.googlesource.com/c/go/+/442596 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Run-TryBot: Paul Murphy <murp@ibm.com> Reviewed-by: Joedian Reid <joedian@golang.org>
2022-10-24cmd/compile: enable lateLower pass on PPC64Paul E. Murphy
This allows new rules to be added which would otherwise greatly overcomplicate the generic rules, like CC opcode conversion or zero register simplification. Change-Id: I1533f0fa07815aff99ed8ab890077bd22a3bfbf5 Reviewed-on: https://go-review.googlesource.com/c/go/+/442595 Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com> Run-TryBot: Paul Murphy <murp@ibm.com> Reviewed-by: David Chase <drchase@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Bryan Mills <bcmills@google.com>