| Age | Commit message | Author |
|
There is an enormous amount of code moving around in this CL,
but the code is the same, and it is invoked in the same ways.
This CL is preparation for the new linker structure, not the new
structure itself.
The new library's definition is in include/link.h.
The main change is the use of a Link structure to hold all the
linker-relevant state, replacing the smattering of global variables.
The Link structure should both make it clearer which state must
be carried around and make it possible to parallelize more easily
later.
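As a rough illustration of the idea only, the sketch below gathers a few pieces of formerly global state into one context structure that every routine receives explicitly. The names `LinkCtx`, `newctx`, `emit`, and all fields are hypothetical; the actual Link definition is the one in include/link.h.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical context; none of these names come from liblink. */
typedef struct LinkCtx LinkCtx;
struct LinkCtx {
	int  debug;    /* would otherwise be a global debug flag */
	long pc;       /* would otherwise be a global program counter */
	int  nerrors;  /* would otherwise be a global error count */
};

/* Allocate a zeroed context; the caller owns and frees it. */
LinkCtx*
newctx(void)
{
	LinkCtx *ctxt = calloc(1, sizeof *ctxt);
	assert(ctxt != NULL);  /* a sketch: real code would report the failure */
	return ctxt;
}

/* Reserve size bytes and return the address they were placed at. */
long
emit(LinkCtx *ctxt, int size)
{
	long at = ctxt->pc;
	ctxt->pc += size;
	return at;
}
```

Because each call names its context, two independent contexts can be advanced side by side without interfering, which is the property that would make later parallelization easier.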
The main body of the linker has moved into the architecture-independent
cmd/ld directory. That includes the list of known header types, so the
distinction between Hplan9x32 and Hplan9x64 is removed (no other
header type distinguished 32- and 64-bit formats), and code for unused
formats such as ipaq kernels has been deleted.
The code being deleted from 5l, 6l, and 8l reappears in liblink or in ld.
Because multiple files are being merged in the liblink directory,
it is not possible to show the diffs nicely in hg.
The Prog and Addr structures have been unified into an
architecture-independent form and moved to link.h, where they will
be shared by all tools: the assemblers, the compilers, and the linkers.
The unification makes it possible to write architecture-independent
traversal of Prog lists, among other benefits.
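A minimal sketch of what architecture-independent traversal could look like, assuming a pared-down instruction record with only an opcode and a next pointer. The struct layout and the function `countprogs` are illustrative assumptions, not the definitions in link.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced instruction record. */
typedef struct MiniProg MiniProg;
struct MiniProg {
	int       as;    /* opcode number; its meaning is architecture-specific */
	MiniProg *link;  /* next instruction in the list */
};

/* Count instructions without knowing anything about the architecture. */
int
countprogs(const MiniProg *first)
{
	int n = 0;
	for (const MiniProg *p = first; p != NULL; p = p->link)
		n++;
	assert(n >= 0);
	return n;
}
```

The loop never inspects `as`, so the same walker would serve any architecture that shares the list shape.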
The Sym structures cannot be unified: they are too fundamentally
different between the linker and the compilers. Instead, liblink defines
an LSym - a linker Sym - to be used in the Prog and Addr structures,
and the linker now refers exclusively to LSyms. The compilers will
keep using their own syms but will fill out the corresponding LSyms in
the Prog and Addr structures.
Although code from 5l, 6l, and 8l is now in a single library, the
code has been arranged so that only one architecture needs to
be linked into a particular program: 5l will not contain the code
needed for x86 instruction layout, for example.
The object file writing code in liblink/obj.c is from cmd/gc/obj.c.
Preparation for golang.org/s/go13linker work.
This CL does not build by itself. It depends on 35740044
and will be submitted at the same time.
R=iant
CC=golang-dev
https://golang.org/cl/35790044
|
|
Fixes #5764.
R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/13441051
|
|
external linking
This CL is an aggregate of 10271047, 10499043, 9733044. Descriptions of each follow:
10499043
runtime,cmd/ld: Merge TLS symbols and teach 5l about ARM TLS
This CL prepares for external linking support on ARM.
The pseudo-symbols runtime.g and runtime.m are merged into a single
runtime.tlsgm symbol. When linking externally, the offset of a thread-local
variable is stored at a memory location instead of being embedded in the offset
of an ldr instruction. With a single runtime.tlsgm symbol for both g and m, only
one such offset is needed.
The larger part of this CL moves the TLS code from gcc-compiled to internally
compiled code. The TLS code now uses the modern MRC instruction, and 5l is taught
about TLS fallbacks in case the instruction is not available or appropriate.
10271047
This CL adds support for -linkmode external to 5l.
For 5l itself, use addrel to allow for D_CALL relocations to be handled by the
host linker. Of the cases listed in rsc's comment in issue 4069, only cases 5 and
63 needed an update. One of the TODO: addrel cases was since replaced, and the
rest of the cases are either covered by indirection through addpool (cases with
LTO or LFROM flags) or stubs (case 74). The addpool cases are covered because
addpool emits AWORD instructions, which in turn are handled by case 11.
In the runtime, change the argv argument in the rt0* functions slightly to be a
pointer to the argv list, instead of relying on a particular location of argv.
9733044
The -shared flag to 6l outputs a shared library, implemented in Go
and callable from non-Go programs such as C.
The main part of this CL changes the thread local storage model.
Go uses the fastest and least general mode, local exec. TLS data in shared
libraries normally requires at least the local dynamic mode; however, this CL
instead opts for the initial exec mode. Initial exec mode is faster than
local dynamic mode and can be used on Linux since the linker has reserved a
limited amount of TLS space for performance-sensitive TLS code.
Initial exec mode requires an extra load from the GOT table to determine the
TLS offset. This penalty will not be paid if ld is not in -shared mode, since
TLS accesses will be reduced to local exec.
The ELF sections .init_array and .rela.init_array are added to register the Go
runtime entry with cgo at library load time.
The "hidden" attribute is added to Cgo functions called from Go, since Go
does not generate calls through the GOT table, and adding non-GOT relocations for
a global function is not supported by gcc. Cgo symbols don't need to be global,
and avoiding the GOT table is also faster.
The changes to 8l only remove code relevant to the old -shared mode, where
internal linking was used.
This CL only addresses the low-level linker work. It can be submitted by itself,
but to be useful, the runtime changes in CL 9738047 are also needed.
Design discussion at
https://groups.google.com/forum/?fromgroups#!topic/golang-nuts/zmjXkGrEx6Q
Fixes #5590.
R=rsc
CC=golang-dev
https://golang.org/cl/12871044
|
|
Avoid generating TLS relocations on OpenBSD.
R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/7641055
|
|
This CL was written by rsc. I just tweaked 8l.
This CL adds TLS relocation to the ELF .o file we write during external linking,
so that the host linker (gcc) can decide the final location of m and g.
Similar relocations are not necessary on OS X because we use an alternate
program start-time mechanism to acquire thread-local storage.
Similar relocations are not necessary on ARM or Plan 9 or Windows
because external linking mode is not yet supported on those systems.
On almost all ELF systems, the references we use are like %fs:-0x4 or %gs:-0x4,
which we write in 6a/8a as -0x4(FS) or -0x4(GS). On Linux/ELF, however,
Xen's lack of support for this mode forced us long ago to use a two-instruction
sequence: first we load %gs:0x0 into a register r, and then we use -0x4(r).
(The ELF program loader arranges that %gs:0x0 contains a regular pointer to
that same memory location.) In order to relocate those -0x4(r) references,
the linker must know where they are. This CL adds the equivalent notation
-0x4(r)(GS*1) for this purpose: it assembles to the same encoding as -0x4(r)
but the (GS*1) indicates to the linker that this is one of those thread-local
references that needs relocation.
Thanks to Elias Naur for reminding me about this missing piece and
also for writing the test.
R=r
CC=golang-dev
https://golang.org/cl/7891047
|
|
Instructions for use in AES hashing. See CL#7543043
R=rsc
CC=golang-dev
https://golang.org/cl/7548043
|
|
Added the -shared flag to 5l/6l to output a PIC executable with the required
dynamic relocations and RIP-relative addressing in machine code.
Added dummy support to 8l to avoid compilation errors.
See also:
https://golang.org/cl/6822078
https://golang.org/cl/7064048
and
https://groups.google.com/d/topic/golang-nuts/P05BDjLcQ5k/discussion
R=rsc, iant
CC=golang-dev
https://golang.org/cl/6926049
|
|
The 8l linker automatically inserts XCHG instructions
to support otherwise impossible byte registers
(only available on AX, BX, CX, DX).
Sometimes AX or DX is needed (for MUL and DIV) so
we need to avoid clobbering them.
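One way to picture the constraint is a helper that picks a byte-capable register to swap with while skipping registers the instruction still needs. This is only a sketch under assumed names: the register numbering, the busy mask, and `pickswap` are hypothetical and do not reflect 8l's actual code.

```c
#include <assert.h>

/* Illustrative register numbering; not 8l's constants. */
enum { REG_AX, REG_CX, REG_DX, REG_BX, REG_SP, REG_BP, REG_SI, REG_DI, REG_NONE = -1 };

/* On 386, only AX, CX, DX, and BX have byte-addressable low halves. */
static int
isbytereg(int r)
{
	return r == REG_AX || r == REG_CX || r == REG_DX || r == REG_BX;
}

/*
 * Pick a byte-capable register to exchange with, skipping any register
 * whose bit is set in busy (for example AX and DX around MUL or DIV).
 * Returns REG_NONE if every candidate is busy.
 */
int
pickswap(unsigned busy)
{
	assert((busy >> 8) == 0);  /* only eight registers are modeled */
	for (int r = REG_AX; r <= REG_DI; r++)
		if (isbytereg(r) && !(busy & (1u << r)))
			return r;
	return REG_NONE;
}
```

With AX and DX marked busy, the helper falls through to CX, so the inserted XCHG would leave the MUL or DIV operands intact.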
R=golang-dev, dave, iant, iant, rsc
CC=golang-dev
https://golang.org/cl/6846057
|
|
This CL adds support for these 7 new instructions to 6a/6l in
preparation for the upcoming CL for AES-NI accelerated crypto/aes:
AESENC, AESENCLAST, AESDEC, AESDECLAST, AESIMC, AESKEYGENASSIST,
and PSHUFD.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/5970055
|
|
Saving the code in case we improve things enough that
it matters later, but at least right now it is not worth doing.
R=ken2
CC=golang-dev
https://golang.org/cl/6248071
|
|
Added handler for:
MOVQ xmm_reg, xmm_reg/mem64
MOVQ xmm_reg/mem64, xmm_reg
using the native MOVQ encoding (it takes precedence over REX.W MOVD).
I don't understand the 6l code well enough to be sure that my small changes
didn't break it, but 6l now works with MOVQ xmm_reg, xmm_reg and
all.bash reports "0 unexpected bugs".
Test assembly source:
MOVQ X0, X1
MOVQ AX, X1
MOVQ X1, AX
MOVQ xxx+8(FP), X2
MOVQ X2, xxx+8(FP)
and generated code (gdb disassemble /r):
0x000000000040f112 <+0>: f3 0f 7e c8 movq %xmm0,%xmm1
0x000000000040f116 <+4>: 66 48 0f 6e c8 movq %rax,%xmm1
0x000000000040f11b <+9>: 66 48 0f 7e c8 movq %xmm1,%rax
0x000000000040f120 <+14>: f3 0f 7e 54 24 10 movq 0x10(%rsp),%xmm2
0x000000000040f126 <+20>: 66 0f d6 54 24 10 movq %xmm2,0x10(%rsp)
Fixes #2418.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/5316076
|
|
R=golang-dev
CC=golang-dev, rsc
https://golang.org/cl/4950050
|
|
R=rsc
CC=golang-dev, vcc.163
https://golang.org/cl/4963044
|
|
6a/a.h:
. Dropped <u.h> and <libc.h>.
. Made definition of EOF conditional.
6a/a.y:
. Added <u.h> and <libc.h>.
6a/lex.c:
. Added <u.h> and <libc.h>.
. Dropped <ctype.h> (now in <u.h>).
6c/gc.h:
. Added varargck pragma for "lD".
6c/swt.c:
. Dropped unused "thestring" argument in Bprint() calls.
6l/Makefile:
. Dropped unneeded directory prefix.
6l/l.h:
. Dropped unneeded directory prefix.
. Added varargck pragma for "I" and "i".
6l/obj.c:
. Dropped unneeded assignment.
. Dropped unreachable goto statement.
6l/pass.c:
. Dropped assignments flagged as unused.
6l/prof.c:
. Replaced "#if 0" with "#ifdef NOTDEF".
6l/span.c:
. Dropped unused incrementation.
. Added USED() as required.
. Dropped unreachable "return" statement.
R=golang-dev
CC=golang-dev, rsc
https://golang.org/cl/4747044
|
|
Using the CRC32 instruction speeds up the Castagnoli computation by
about 20x on a modern Intel CPU.
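For reference, the checksum that the hardware instruction accelerates can be written bit by bit in portable C. The sketch below is a plain software CRC-32C using the reflected Castagnoli polynomial 0x82F63B78; it is not the code from this CL, and the function name `crc32c` is chosen here for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78. */
uint32_t
crc32c(const unsigned char *buf, size_t len)
{
	assert(buf != NULL || len == 0);
	uint32_t crc = 0xFFFFFFFFu;  /* conventional initial value */
	for (size_t i = 0; i < len; i++) {
		crc ^= buf[i];
		for (int k = 0; k < 8; k++) {
			/* mask is all ones when the low bit is set, else zero */
			uint32_t mask = 0u - (crc & 1u);
			crc = (crc >> 1) ^ (0x82F63B78u & mask);
		}
	}
	return crc ^ 0xFFFFFFFFu;    /* conventional final inversion */
}
```

This loop does eight shift-and-xor steps per input byte, which is the work a single CRC32 instruction replaces.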
R=rsc
CC=golang-dev
https://golang.org/cl/4650072
|
|
R=ken2, ken3
CC=golang-dev
https://golang.org/cl/3505041
|
|
Sub-symbols are laid out inside a larger symbol
but can be addressed directly.
Use to make Mach-O pointer array not a special case.
Will use later to describe ELF sections.
Glimpses of the beginning of ELF loading.
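A minimal sketch of the sub-symbol idea, assuming each symbol records either an absolute address or an offset inside an enclosing symbol. The struct `SubSym`, its fields, and `symaddr` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical symbol record. */
typedef struct SubSym SubSym;
struct SubSym {
	const char   *name;
	long          value;  /* address if top-level, else offset within outer */
	const SubSym *outer;  /* enclosing symbol, or NULL for a top-level symbol */
};

/* Resolve a symbol's address by summing offsets up the chain of containers. */
long
symaddr(const SubSym *s)
{
	assert(s != NULL);
	long addr = 0;
	for (; s != NULL; s = s->outer)
		addr += s->value;
	return addr;
}
```

A sub-symbol stays directly addressable by name while its storage is laid out as part of the larger symbol, so the container needs no special casing.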
R=ken2
CC=golang-dev
https://golang.org/cl/2623043
|
|
That is, move the pc/ln table and the symbol table
into the read-only data segment. This eliminates
the need for a special load command to map the
symbol table into memory, which makes the
information available on systems that couldn't handle
the magic load to 0x99000000, like NaCl and ARM QEMU
and Linux without config_highmem=y. It also
eliminates an #ifdef and some clumsy code to
find the symbol table on Windows.
The bad news is that the binary appears to be bigger
than it used to be. This is not actually the case, though:
the same amount of data is being mapped into memory
as before, and the tables are still read-only, so they're
still shared across multiple instances of the binary as
they were before. The difference is just that the tables
aren't squirreled away in some section that "size" doesn't
know to look at.
This is a checkpoint.
It probably breaks Windows and breaks NaCl more
than it used to be broken, but those will be fixed.
The logic involving -s needs to be revisited too.
Fixes #871.
R=ken2
CC=golang-dev
https://golang.org/cl/2587041
|
|
Lay out code before data.
R=ken2
CC=golang-dev
https://golang.org/cl/2490043
|
|
R=ken2
CC=golang-dev
https://golang.org/cl/2481042
|
|
Also change the span-dependent jump algorithm
to use fewer iterations:
* resolve forward jumps at their targets (comefrom list)
* mark jumps as small or big and only do small->big
* record whether a jump failed to be encodable
These changes mean that a function with only small
jumps can be laid out in a single iteration, and the
vast majority of functions take just two iterations.
I was seeing a maximum of 5 iterations before; the
max now is 3 and there are fewer that get even that far.
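The small-to-big rule can be sketched as a simple fixed-point loop. This is a simplified stand-in, not the comefrom-list algorithm described above: it recomputes every address on each pass, and the encodings (2-byte short jump, 5-byte long jump) and all names are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

enum { SMALL = 2, BIG = 5 };  /* assumed sizes: rel8 jump vs rel32 jump */

typedef struct Insn Insn;
struct Insn {
	int  size;    /* current encoded size in bytes */
	int  target;  /* index of the jump target, or -1 if not a jump */
	long pc;      /* filled in by layout */
};

/*
 * Lay out n instructions. Jumps start SMALL and are only ever widened
 * to BIG, so the loop terminates. Returns the number of passes made.
 */
int
layout(Insn *prog, int n)
{
	assert(prog != NULL && n >= 0);
	int passes = 0, changed;
	do {
		changed = 0;
		passes++;
		long pc = 0;
		for (int i = 0; i < n; i++) {
			prog[i].pc = pc;
			pc += prog[i].size;
		}
		for (int i = 0; i < n; i++) {
			if (prog[i].target < 0 || prog[i].size == BIG)
				continue;
			/* displacement is measured from the end of the jump */
			long disp = prog[prog[i].target].pc - (prog[i].pc + prog[i].size);
			if (disp < -128 || disp > 127) {
				prog[i].size = BIG;
				changed = 1;
			}
		}
	} while (changed);
	return passes;
}
```

Even in this simplified form, a function whose jumps all fit the short encoding finishes in one pass, and one that needs a widening finishes in two.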
R=ken2
CC=golang-dev
https://golang.org/cl/2537041
|
|
The old code said
	if(x) {
		handle a
		return
	}
	aa = *a
	rewrite aa to make x true
	recursivecall(&aa)
The new code says
	params = copy out of a
	if(!x) {
		rewrite params to make x true
	}
	handle params
but it's hard to see that in the Rietveld diffs because
it gets confused by changes in indentation.
Avoiding the recursion makes other changes easier.
R=ken2
CC=golang-dev
https://golang.org/cl/2533041
|
|
Using explicit relocations internally, we can
represent the data for a particular symbol as
an initialized block of memory instead of a
linked list of ADATA instructions. The real
goal here is to be able to hand off some of the
relocations to the dynamic linker when interacting
with system libraries, but a pleasant side effect is
that the memory image is much more compact
than the ADATA list, so the linkers use less memory.
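To make the representation concrete, here is a small sketch in which a symbol's data is a flat byte array and each relocation names a 4-byte slot to patch once addresses are known. The record layout, the little-endian 32-bit slot, and the names `Rel` and `applyrelocs` are assumptions for illustration, not the linker's actual structures.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical relocation record. */
typedef struct Rel Rel;
struct Rel {
	int      off;      /* byte offset of the slot within the data block */
	uint32_t symaddr;  /* final address of the referenced symbol */
	int32_t  add;      /* constant addend */
};

/* Patch each 4-byte little-endian slot named by a relocation. */
void
applyrelocs(unsigned char *data, int len, const Rel *r, int nr)
{
	for (int i = 0; i < nr; i++) {
		assert(r[i].off >= 0 && r[i].off + 4 <= len);
		uint32_t v = r[i].symaddr + (uint32_t)r[i].add;
		data[r[i].off + 0] = v & 0xFF;
		data[r[i].off + 1] = (v >> 8) & 0xFF;
		data[r[i].off + 2] = (v >> 16) & 0xFF;
		data[r[i].off + 3] = (v >> 24) & 0xFF;
	}
}
```

Relocations the linker cannot resolve itself could be left in the list and handed to the dynamic linker instead of being applied here.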
R=ken2
CC=golang-dev
https://golang.org/cl/2512041
|
|
* Maintain Sym* list for text with individual
prog lists instead of using one huge list and
overloading p->pcond.
* Comment what each file is for.
* Move some output code from span.c to asm.c.
* Move profiling into prof.c, symbol table into symtab.c.
* Move mkfwd to ld/lib.c.
* Throw away dhog dynamic loading code.
* Throw away Alef become.
* Fix printing of WORD instructions in 5l -a.
Goal here is to be able to handle each piece of text or data
as a separate piece, both to make it easier to load the
occasional .o file and also to make it possible to split the
work across multiple threads.
R=ken2, r, ken3
CC=golang-dev
https://golang.org/cl/2335043
|
|
This is entirely adding and removing tabs.
It looks weird but will make the diffs for the
next change easier to read.
R=ken2
CC=golang-dev
https://golang.org/cl/2490041
|
|
R=ken2
CC=golang-dev
https://golang.org/cl/2373043
|
|
R=r
CC=golang-dev
https://golang.org/cl/2221042
|
|
Makes binaries work with 6cov again.
R=ken2
CC=golang-dev
https://golang.org/cl/2192041
|
|
Changing 5g and 5l too, but it doesn't work yet.
R=ken2
CC=golang-dev
https://golang.org/cl/2136047
|
|
Returns R14 and R15 to the available register pool.
Plays more nicely with ELF ABI C code.
In particular, our signal handlers will no longer crash
when a signal arrives during execution of a cgo C call.
Fixes #720.
R=ken2, r
CC=golang-dev
https://golang.org/cl/1847051
|
|
5l, 6l, 8l: change ELF header so that strip doesn't destroy binary
Fixes #261.
R=iant, r
CC=golang-dev
https://golang.org/cl/994044
|
|
hopefully no change
R=rsc
http://go/go-review/1017035
|
|
enough to make nm and oprofile work.
R=r
http://go/go-review/1017016
|
|
R=r
DELTA=3214 (904 added, 2260 deleted, 50 changed)
OCL=35425
CL=35427
|
|
better mach binaries.
cgo working on darwin+linux amd64+386.
eliminated context switches - pi is 30x faster.
add libcgo to build.
on snow leopard:
- non-cgo binaries work; all tests pass.
- cgo binaries work on amd64 but not 386.
R=r
DELTA=2031 (1316 added, 626 deleted, 89 changed)
OCL=35264
CL=35304
|
|
if first function in file was dead code, it was being
discarded along with the file name information for that file.
leave the functions in the master function list longer:
let xfol take the dead code out of the code list,
and let span skip the unreachable functions during output.
before
throw: sys·mapaccess1: key not in map
panic PC=0x2e7b20
throw+0x33 /Users/rsc/go/src/pkg/runtime/runtime.c:65
throw(0x5834f, 0x0)
sys·mapaccess1+0x73 /Users/rsc/go/src/pkg/runtime/hashmap.c:769
sys·mapaccess1(0x2b9bd0, 0x0)
gob·*Encoder·Encode+0x16b /Users/rsc/go/src/pkg/fmt/print.go:2926
gob·*Encoder·Encode(0x2bb440, 0x0, 0x558b0, 0x0, 0x2e4be0, ...)
main·walk+0x331 :1603
main·walk(0x33a480, 0x0)
main·walk+0x271 :1596
main·walk(0x300640, 0x0)
main·walk+0x271 :1596
main·walk(0x300520, 0x0)
main·walk+0x271 :1596
main·walk(0x300240, 0x0)
main·walk+0x271 :1596
main·walk(0x678f8, 0x0)
main·main+0x22 :1610
main·main()
after
throw: sys·mapaccess1: key not in map
panic PC=0x2e7b20
throw+0x33 /Users/rsc/go/src/pkg/runtime/runtime.c:65
throw(0x5834f, 0x0)
sys·mapaccess1+0x73 /Users/rsc/go/src/pkg/runtime/hashmap.c:769
sys·mapaccess1(0x2b9bd0, 0x0)
gob·*Encoder·Encode+0x16b /Users/rsc/go/src/pkg/gob/encoder.go:319
gob·*Encoder·Encode(0x2bb3c0, 0x0, 0x558b0, 0x0, 0x2e4be0, ...)
main·walk+0x331 /Users/rsc/dir.go:121
main·walk(0x2f6ab0, 0x0)
main·walk+0x271 /Users/rsc/dir.go:114
main·walk(0x301640, 0x0)
main·walk+0x271 /Users/rsc/dir.go:114
main·walk(0x301520, 0x0)
main·walk+0x271 /Users/rsc/dir.go:114
main·walk(0x301240, 0x0)
main·walk+0x271 /Users/rsc/dir.go:114
main·walk(0x678f8, 0x0)
main·main+0x22 /Users/rsc/dir.go:128
main·main()
mainstart+0xe /Users/rsc/go/src/pkg/runtime/amd64/asm.s:55
mainstart()
goexit /Users/rsc/go/src/pkg/runtime/proc.c:133
goexit()
R=r
DELTA=46 (20 added, 25 deleted, 1 changed)
OCL=34094
CL=34103
|
|
* remove now-unused D_SBIG (was for typestrings)
* rename elf64.[ch] to elf.[ch]
* pull in elf headers from FreeBSD instead of writing our own
* emit non-header ELF data in data section
* stub out a few more ELF sections needed for dynamic loading
R=r
DELTA=1928 (1237 added, 635 deleted, 56 changed)
OCL=33642
CL=33658
|
|
do not emit unreachable data symbols.
R=austin
DELTA=103 (71 added, 4 deleted, 28 changed)
OCL=33325
CL=33622
|
|
archive size +70%
binary size +30%
old
wreck.mtv=; ls -l /Users/rsc/bin/{godoc,gofmt}
-rwxr-xr-x 1 rsc eng 1487922 Aug 13 13:21 /Users/rsc/bin/godoc
-rwxr-xr-x 1 rsc eng 995995 Aug 13 13:21 /Users/rsc/bin/gofmt
wreck.mtv=; du -sh $GOROOT/pkg/
9.5M /home/rsc/go/pkg/
wreck.mtv=;
new
wreck.mtv=; ls -l /Users/rsc/bin/{godoc,gofmt}
-rwxr-xr-x 1 rsc eng 2014390 Aug 13 14:25 /Users/rsc/bin/godoc
-rwxr-xr-x 1 rsc eng 1268705 Aug 13 14:25 /Users/rsc/bin/gofmt
wreck.mtv=; du -sh $GOROOT/pkg
16M /home/rsc/go/pkg
wreck.mtv=;
R=ken
OCL=33217
CL=33220
|
|
no types yet.
R=ken
OCL=33142
CL=33146
|
|
character string to machine address.
not filled in, just carved out.
R=austin
DELTA=77 (11 added, 34 deleted, 32 changed)
OCL=33122
CL=33124
|
|
make endianness explicit when writing values.
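A sketch of what explicit endianness can mean in practice: helpers that store a value byte by byte in a stated order, so the output does not depend on the host. The names `put32le` and `put32be` are hypothetical and not taken from this change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store v at p, least significant byte first, on any host. */
void
put32le(unsigned char *p, uint32_t v)
{
	assert(p != NULL);
	p[0] = v & 0xFF;
	p[1] = (v >> 8) & 0xFF;
	p[2] = (v >> 16) & 0xFF;
	p[3] = (v >> 24) & 0xFF;
}

/* Store v at p, most significant byte first, on any host. */
void
put32be(unsigned char *p, uint32_t v)
{
	assert(p != NULL);
	p[0] = (v >> 24) & 0xFF;
	p[1] = (v >> 16) & 0xFF;
	p[2] = (v >> 8) & 0xFF;
	p[3] = v & 0xFF;
}
```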
R=rsc
DELTA=129 (37 added, 7 deleted, 85 changed)
OCL=31826
CL=31854
|
|
editing the firstp list was ineffective,
because follow rebuilds it from the textp list.
the symbols for dead code were being dropped
from the binary but the code was all still there.
text for fmt.Printf("hello, world\n") drops
from 143945 to 128650.
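The underlying technique can be sketched as a reachability mark followed by emitting only marked functions. This is a generic illustration, not the linker's code: the `Func` record, the fixed-size call list, and the functions `mark` and `livesize` are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { MAXCALLS = 4 };

/* Hypothetical function record. */
typedef struct Func Func;
struct Func {
	const char *name;
	int         reachable;
	Func       *calls[MAXCALLS];  /* NULL-terminated unless full */
	int         size;             /* text size in bytes */
};

/* Depth-first mark of everything reachable from f. */
void
mark(Func *f)
{
	if (f == NULL || f->reachable)
		return;
	f->reachable = 1;
	for (int i = 0; i < MAXCALLS && f->calls[i] != NULL; i++)
		mark(f->calls[i]);
}

/* Total text size that survives once unmarked functions are dropped. */
int
livesize(Func **all, int n)
{
	assert(all != NULL && n >= 0);
	int total = 0;
	for (int i = 0; i < n; i++)
		if (all[i]->reachable)
			total += all[i]->size;
	return total;
}
```

The point mirrored from the message above is that the filter has to be applied to the list the output is actually generated from; dropping only the symbols leaves the code bytes in place.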
R=r,ken
DELTA=22 (20 added, 0 deleted, 2 changed)
OCL=28255
CL=28290
|
|
to match traditional c linkers.
R=r
DELTA=42 (8 added, 12 deleted, 22 changed)
OCL=28101
CL=28115
|
|
throwing away dead code at end of file.
also fix an uninitialized memory error
found by valgrind.
R=r
DELTA=7 (5 added, 2 deleted, 0 changed)
OCL=23991
CL=23994
|
|
* add gotype string to symbol table
* fill in gotype in 6l for known funcs/vars
* print gotype with nm -t
* load symbol and pc/ln tables into memory at magic address 0x99<<32.
* add sys.symdat() to retrieve raw bytes of symbol table
and pc/ln table.
most of this should be considered experimental
and subject to change.
R=r
DELTA=157 (128 added, 0 deleted, 29 changed)
OCL=19746
CL=19750
|
|
these guys really really want long to be 32-bits,
so ,s/long/int32/ (and then manual fixup).
still passes all tests.
(i started out looking for just those longs that
needed to be int32 instead, and it was just too hard
to track them down one by one.)
the longs were rare enough that i don't think
it will cause integration problems.
R=ken
OCL=13787
CL=13789
|
|
SVN=123521
|
|
tacked above each TEXT entry
SVN=123496
|
|
SVN=121164
|