Releases: llvm-mos/llvm-mos-sdk
SDK v0.11.6
New targets
- The new atari8-stdcart target now allows compiling to a standard 8 KiB or 16 KiB Atari 8-bit cartridge. Thanks, @cwedgwood!
Assembler improvements
- INC A, INA, DEA, and DEC A spellings are now supported for INC and DEC. Thanks, @pfusik!
Optimizations
- The compiler now spends more time on Loop Strength Reduction, the most important loop optimization pass. Loop quality is particularly important on the 6502, especially relative to overall program size, so we can afford to spend longer on it. In practice, compile times do not seem to increase much, and some benchmarks show a sizable gain.
Cleanup
- The compiler now uses the .zeropage directive to introduce imaginary registers when generating assembly output. This means that uses of imaginary registers are no longer wrapped with mos8(...), which makes assembly listings considerably easier to read.
SDK v0.11.5
New features
- Added a .zeropage <symbol> directive to allow marking external symbols as belonging to the zero page, much like cc65's .importzp. This is more ergonomic than wrapping each usage with mos8(...), and it paves the way to more readable compiler-generated assembly as well.
Bug fixes
- The BRK() macro in 6502.h now contains a nop after the brk. This allows the brk to return successfully to the program afterwards, which makes this more useful when temporarily stopping the program to debug it.
- Fixed a stray write to the zero page in the MMC1 bank handler. A symbol was accidentally placed in absolute memory rather than the zero page, but the memory was referenced using just the low byte.
SDK v0.11.4
Bug fixes
- Fixed transposition of the x and y arguments in neslib's scroll().
- Fixed MMC1 issue where use of certain banking routines might not register the corresponding NMI handler.
- Pinned Ubuntu and Windows versions to the previous release.
  - Generally, system libraries offer backwards, but not forwards, compatibility, so a binary built against newer system libraries may not run on an older system. Accordingly, we now build against the oldest supported Ubuntu and Windows versions offered by GitHub Actions.
- Prevent post-link tools from running when using -Wl,--lto-emit-asm. llvm-mos/llvm-mos#256
- Fix crash in specific increment/decrement optimization scenario. llvm-mos/llvm-mos#257
- Don't allocate variables covered by #pragma clang section to the zero page. This would break later, as the implicit section would override the zero page tag, but the code would still use 8-bit references.
- Don't link against GitHub Actions' libxml and zlib. These aren't reliably available on all target platforms, and they're used for niche functionality in LLVM.
SDK v0.11.2
New Features
- Added -fnonreentrant, -freentrant, __attribute__((nonreentrant)), and __attribute__((reentrant)). The -fnonreentrant Clang flag and __attribute__((nonreentrant)) function attribute directly tell the compiler that it can safely assume that no more than one instance of a given function can simultaneously be active, whether via recursion or interrupt handling. The compiler flag applies to every function in the given module. The -freentrant flag and __attribute__((reentrant)) attribute produce the current default behavior: the compiler cannot assume nonreentrancy, but it may still be able to prove that a given function is safe. The function attributes take precedence over the compiler flags.
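As a minimal sketch of how the function attribute might be applied, the snippet below marks one function as nonreentrant. The NONREENTRANT macro and the checksum function are hypothetical names introduced for illustration; the guard uses the compiler's `__has_attribute` check so the same source still builds on compilers that do not know the attribute.

```c
#include <stdint.h>

/* Expand to the attribute only where the compiler recognizes it;
   elsewhere the macro expands to nothing. */
#if defined(__has_attribute)
#  if __has_attribute(nonreentrant)
#    define NONREENTRANT __attribute__((nonreentrant))
#  endif
#endif
#ifndef NONREENTRANT
#  define NONREENTRANT
#endif

/* Hypothetical helper: assumed never to be called recursively or from an
   interrupt handler, so at most one activation exists at a time. */
NONREENTRANT uint16_t checksum(const uint8_t *buf, uint8_t len) {
    uint16_t sum = 0;
    for (uint8_t i = 0; i < len; ++i)
        sum += buf[i];
    return sum;
}
```

Passing -fnonreentrant instead would make the same promise for every function in the module, so the attribute is the narrower tool when only some functions are known to be safe.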
Optimizations
- Fixed llvm-mos/llvm-mos#244. This can prevent spilling and filling of immediate loads, preferring instead to reissue the load.
- Fixed llvm-mos/llvm-mos#245, a minor load elision optimization.
SDK v0.11.1
New targets
- The NES MMC3 target is now complete and tested.
Bug fixes
- #89 - Check for the batch file version for mos-clang platforms
  - This fixed an issue that prevented successful builds using the CMake toolchain config on Windows. Thanks, @jroweboy!
- llvm-mos/llvm-mos#243 - Inline asm branch to wrong location
  - This fixed an issue where the assembler would fail to emit a relocation for unknown branch targets, which could cause invalid code generation. Only affects assembler and inline assembly.
Optimizations
- Small functions in the NES neslib and nesdoug libraries were rewritten in C. Combined with LTO, this allows them to be inlined. Given that a number of functions are essentially accessors and mutators for global state, this can dramatically improve the code sequences that involve them.
Misc
- Integrated changes from upstream LLVM.
- llvm-objdump now prints instruction arguments in hexadecimal by default.
SDK v0.11.0
Breaking changes
- .init priorities have been rescaled from 10, 20 to 100, 200, to give more room between init events.
- Famitone2 no longer automatically initializes sounds using sounds_data and music using music_data. Users should explicitly call initialization functions before using the library. This allows banked use cases to work, since they depend on initializing the library more than once.
Bug fixes
- Fixed a pair of issues (llvm-mos/llvm-mos#236, llvm-mos/llvm-mos#237) which caused codegen to hang on 128-bit operations. This affected the rust-mos standard library.
- Fixed llvm-mos/llvm-mos#131, which caused Mac builds to fail when using -DBUILD_SHARED_LIBS=ON. Thanks, @pfusik!
- Fixed an issue, also found by @pfusik, where a comparison with zero would be incorrectly elided with a preceding inline assembly fragment, which may not have set N and Z to the expected values.
Optimizations
- The neslib library has been broken up somewhat so that NMI and init functionality for unused portions can be better GCed away.
Library features
- Famitone2 now includes a library that provides fixed-bank wrappers, for when the library is in a bank. It also now supports setting the banks for music and data, for when these are in a different bank from the library.
- NES libraries now dynamically build an NMI routine based on what's actually linked into the program. The user can override this default NMI routine and provide their own. To facilitate this, all NMI routines provided by libraries are accessible via a simple JSR.
- NES targets now have a generic .chr_rom section that allows completely controlling the layout of CHR-ROM banks. This is useful if you want to use a single binary file to set all banks.
- The cbm_k functions have been added to cbm.h for all Commodore targets. #87. Thanks, @cnelson20!
- getchar() has now been implemented for Atari8. #88. Thanks, @pfusik!
- There are now BRK(), CLI(), and SEI() macros that generate the respective instructions, available in 6502.h.
SDK v0.10.1
Bug Fixes
- Fixed llvm-mos/llvm-mos#238 - Linkage to non-existent libzstd.1.dylib on MacOS: macOS binaries were broken by an unintentional dependency on the Homebrew ZSTD library, which is present on GitHub Actions builders but not in the stock macOS image.
- Fixed llvm-mos/llvm-mos#235 - @llvm.smul.with.overflow.i64 hangs compiler
- Fixed llvm-mos/llvm-mos#234 - @llvm.fshl hangs compiler: This fixes an issue where detecting an integer rotation pattern produced an infinite loop in the backend.
SDK v0.10.0
Breaking changes
- Removed stub PPU functions from NES target.
New targets
- Added NES-CNROM target.
New functionality
- Added padlib and zaplib functions to nesdoug library.
- Added banking library to MMC1's libc.
- The third byte of PRG VMAs is now the bank number.
- Neslib's NMI handler and famitone2 now support PRG and CHR banking.
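To make the PRG VMA layout above concrete, here is a small sketch that splits a VMA into its bank number and CPU address. It assumes that "third byte" means bits 16 to 23 and that the low two bytes are the CPU-visible address; the helper names are hypothetical and not part of the SDK.

```c
#include <stdint.h>

/* Assumed layout: byte 2 (bits 16..23) holds the bank number,
   bytes 0..1 hold the 16-bit CPU address. */
static inline uint8_t vma_bank(uint32_t vma) {
    return (uint8_t)((vma >> 16) & 0xFF);
}

static inline uint16_t vma_cpu_addr(uint32_t vma) {
    return (uint16_t)(vma & 0xFFFF);
}
```

Under these assumptions, the VMA 0x028000 would denote CPU address 0x8000 in bank 2.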
Bug fixes
- Unhandled addresses are now reported to Mesen as "register labels".
- Various bug-fixes to neslib, nesdoug, and famitone2 libraries.
Optimizations
- The legalizer now uses known bits information to select 8-bit addressing whenever it can prove that the high 8 bits of a pointer offset are all zero.
- Callee-saved registers are no longer saved or restored for functions that can be proven never to return. (Thanks Anshil!)
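The known-bits check described in the first item can be sketched in plain C. This is an illustrative model only, not LLVM's actual KnownBits API: the struct and function names are hypothetical, and a real analysis tracks bits across many more operations.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of known-bits information for a 16-bit value:
   a set bit in known_zero means that bit is proven to be 0,
   a set bit in known_one means that bit is proven to be 1. */
typedef struct {
    uint16_t known_zero;
    uint16_t known_one;
} known_bits16;

/* An offset fits 8-bit addressing if all high 8 bits are proven zero. */
static inline bool fits_8bit_offset(known_bits16 kb) {
    return (kb.known_zero & 0xFF00) == 0xFF00;
}

/* Known bits of a value produced by zero-extending an 8-bit quantity:
   the high byte is known zero, the low byte is unknown. */
static inline known_bits16 known_bits_zext8(void) {
    known_bits16 kb = {0xFF00, 0x0000};
    return kb;
}
```

A zero-extended 8-bit index passes the check, while a value about which nothing is known does not, so the latter would keep its 16-bit offset.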
SDK v0.9.2
Bug fixes
- The linker will now emit errors for relative branches that are out of range. Previously, this would result in a miscompile. Note that the compiler should never generate these; they should only be possible via hand-written assembly.
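The range condition behind this error can be written down directly. The sketch below is not the linker's code; it is a hypothetical helper that applies the 6502 rule that a relative branch occupies two bytes and its signed 8-bit displacement is measured from the address of the following instruction. Address wraparound at the ends of the 64 KiB space is ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if a branch located at branch_addr can reach target.
   The displacement is relative to branch_addr + 2 and must fit in a
   signed 8-bit value. */
static inline bool branch_in_range(uint16_t branch_addr, uint16_t target) {
    int32_t disp = (int32_t)target - ((int32_t)branch_addr + 2);
    return disp >= -128 && disp <= 127;
}
```

For a branch at 0x8000, the reachable targets under this rule run from 0x7F82 to 0x8081 inclusive.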
New features
- The compiler now supports the .hword, .word, .dword, and .xword directives, for 1, 2, 4, and 8 byte values, respectively.
- The NES target now includes a port of the Famitone2 library. The SDK also includes an llvm-mos port of the included assembly generation utilities. The library has been modified to be placeable anywhere in memory by the linker, and no hand-edits are required to use it.
- The NES target has a new .dpcm section, which places DPCM data with the alignment (64 bytes) and address (>=$c000) necessary for the APU. It also generates a __dpcm_offset symbol to the start of the section in the format the APU expects (addr >> 6). The famitone2 port picks up samples from this section using usual linker relocation mechanisms; no hand edits are necessary.
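The placement constraints and the offset format for the .dpcm section can be checked with a little arithmetic. The helpers below are hypothetical and only model the calculation; in the SDK the linker generates the symbol itself. Truncating the result to 8 bits is an assumption, based on the APU sample address register being one byte wide and encoding the address as $C000 plus 64 times the register value.

```c
#include <stdbool.h>
#include <stdint.h>

/* DPCM data must start at or above $C000 and be 64-byte aligned. */
static inline bool dpcm_addr_valid(uint16_t addr) {
    return addr >= 0xC000 && (addr & 0x3F) == 0;
}

/* Offset in the addr >> 6 form, truncated to the 8 bits the APU
   register can hold (assumption, see the paragraph above). */
static inline uint8_t dpcm_offset(uint16_t addr) {
    return (uint8_t)(addr >> 6);
}
```

For valid addresses the truncation simply drops the two constant top bits contributed by $C000, so $C000 maps to 0x00, $C040 to 0x01, and $FFC0 to 0xFF.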
SDK v0.9.1
New libraries
- The nesdoug library has been added to the NES target and is accessible via -lnesdoug. It is likely buggy and should be considered alpha quality.
Bug fixes
- Added C++ extern "C" declarations to the neslib library.
- The neslib crt0 additions now clear memory before copying .data, not after. Previously, clearing afterwards had the effect of zeroing all data segments.
- Misc small cleanups and bug fixes to the neslib port.
- BSS and Data section zero and copy routines no longer run if a section name merely happens to begin with the prefix. Instead, the section name must either be exactly the prefix or continue with a dot followed by another name. For example, a .data_ptr section will no longer trigger data copying, while .data.ptr and .data both still would.
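The section-name rule in the last item can be expressed as a short predicate. This is a sketch of the matching logic as described, not the SDK's linker script or implementation; the function name is hypothetical, and rejecting a bare trailing dot (such as ".data.") is an assumption drawn from the wording "followed by another name".

```c
#include <stdbool.h>
#include <string.h>

/* Returns true if name is exactly prefix, or prefix followed by a dot
   and at least one more character. */
bool section_has_prefix(const char *name, const char *prefix) {
    size_t n = strlen(prefix);
    if (strncmp(name, prefix, n) != 0)
        return false;               /* does not start with the prefix */
    if (name[n] == '\0')
        return true;                /* exact match, e.g. ".data" */
    return name[n] == '.' && name[n + 1] != '\0';
}
```

With the prefix ".data", this accepts ".data" and ".data.ptr" and rejects ".data_ptr", which matches the example given in the release note.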
Optimizations
- The register allocator will no longer place a value into a register such that the only uses of that register are copies out of it to physical registers. It will instead prefer to split the live range into something that can be assigned to the destination physical register. This has the particular effect of rematerializing constant loads used as function arguments to right before the call, rather than stashing them in the zero page.
- Multi-byte comparisons against zero no longer consider bytes that are statically known to be zero.
- Sums where one addend is either -1 or 1, depending on control flow, are split into separate increment and decrement operations.
- An expensive copy optimization pass was added right before copy elimination. This helps to remove some of the worst excesses of the register allocator, bringing our CoreMark score up to 0.089 from 0.088. (A rare occurrence!)
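The item above about sums with a -1 or 1 addend describes a source pattern that is easy to show in C. The two functions below are hypothetical examples of the "sum" form and the "split" form. They are written to be equivalent at the source level and say nothing about the exact code the compiler emits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sum form: control flow selects the addend, then a single add runs. */
static inline uint8_t step_sum(uint8_t x, bool up) {
    int8_t delta = up ? 1 : -1;
    return (uint8_t)(x + delta);
}

/* Split form: each path performs its own increment or decrement,
   which maps naturally onto the 6502's INC/DEC style instructions. */
static inline uint8_t step_split(uint8_t x, bool up) {
    if (up)
        ++x;
    else
        --x;
    return x;
}
```

Both forms wrap around identically at the ends of the 8-bit range, so the transformation preserves behavior for every input.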